Environment Variable (Env) Settings#

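By default, MegEngine works normally without any changes to environment variables.

If you do need to change them, make sure you fully understand the possible consequences. 👻 👻 👻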

Precautions#

Note

The environment variables used by MegEngine fall into two categories, dynamic and non-dynamic:

  • Dynamic environment variables are read dynamically, so a modification takes effect immediately during code execution.

  • Non-dynamic environment variables are read only once during the lifetime of the process, and subsequent modifications to them have no effect.

Non-dynamic environment variables are marked "Only the first time setting takes effect".
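The difference between the two categories can be illustrated with a small, self-contained sketch. It does not use MegEngine at all: `read_dynamic`, `read_once`, and `DEMO_VAR` are hypothetical names introduced only for this illustration, and MegEngine's actual reading logic may differ.

```python
import os

_cache = {}  # values captured by read_once() on first access


def read_dynamic(name, default=None):
    """Look the variable up on every call, so later changes are visible."""
    return os.environ.get(name, default)


def read_once(name, default=None):
    """Look the variable up on the first call only and reuse that value."""
    if name not in _cache:
        _cache[name] = os.environ.get(name, default)
    return _cache[name]


os.environ["DEMO_VAR"] = "first"
dynamic_before = read_dynamic("DEMO_VAR")
once_before = read_once("DEMO_VAR")

os.environ["DEMO_VAR"] = "second"
dynamic_after = read_dynamic("DEMO_VAR")  # reflects the new value
once_after = read_once("DEMO_VAR")        # still the value seen first
```

A variable handled like `read_once` behaves the way the "Only the first time setting takes effect" label describes.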

Only the first time setting takes effect#

Warning

Note that some environment variables are read while import megengine runs.

Therefore, for environment variables that are read only once, set them in one of the following two ways:

  • Set them temporarily with export in the shell, then run the code;

  • Set them in the code to be run, placing the relevant lines at the very beginning:

    import os

    os.environ['MGE_ENV_VAR'] = "value"  # Put this line before import megengine

    import megengine  # Some environment variables are read here
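One way to guard against setting a read-once variable too late is to check whether the module has already been imported. The helper below is a minimal sketch: `set_before_import` is a hypothetical name, not a MegEngine API, and it only detects that the import has happened, not which variables were actually read.

```python
import os
import sys
import warnings


def set_before_import(name, value, module="megengine"):
    """Set an environment variable and report whether it was set in time.

    Returns True if `module` has not been imported yet, False otherwise.
    """
    in_time = module not in sys.modules
    if not in_time:
        warnings.warn(
            f"{module} is already imported; {name} may be ignored "
            "if it is only read once."
        )
    os.environ[name] = str(value)
    return in_time


# MGE_ENV_VAR is the placeholder name used in the text above.
set_before_import("MGE_ENV_VAR", "value")
# import megengine would follow here
```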
    

Compilation related#

Note

  • The basic MegEngine option configuration can be found in the CMakeLists.txt file;

  • The CMake configuration files for third-party libraries can be found in the cmake directory.

Device related#

MGE_DEFAULT_DEVICE (Only the first time setting takes effect)

Sets the default computing device. See set_default_device.

MEGENGINE_HOST_COMPUTE (Only the first time setting takes effect)

Whether to allow simple computations to run on the host even when a GPU device is present. Enabled by default.

Set to 0 to turn off and 1 to turn on.

Log related#

MEGENGINE_LOGGING_LEVEL (Only the first time setting takes effect)

Sets the log level. Choose from INFO, DEBUG, and ERROR.

RUNTIME_OVERRIDE_LOG_LEVEL (Only the first time setting takes effect)

Sets the runtime log level. The default is ERROR. The value must be given as a number: DEBUG = 0, INFO = 1, WARN = 2, ERROR = 3, NO_LOG = 4.
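Because RUNTIME_OVERRIDE_LOG_LEVEL expects a number, a small mapping from level names to the numbers listed above can help avoid mistakes. This is only a sketch: `RuntimeLogLevel` and `runtime_log_level_value` are hypothetical helpers, not part of MegEngine.

```python
import os
from enum import IntEnum


class RuntimeLogLevel(IntEnum):
    """Numeric values as listed for RUNTIME_OVERRIDE_LOG_LEVEL."""
    DEBUG = 0
    INFO = 1
    WARN = 2
    ERROR = 3
    NO_LOG = 4


def runtime_log_level_value(name):
    """Translate a level name such as "warn" into the string to export."""
    return str(int(RuntimeLogLevel[name.upper()]))


# Must run before import megengine, since the variable is read only once.
os.environ["RUNTIME_OVERRIDE_LOG_LEVEL"] = runtime_log_level_value("warn")
```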

MGB_DEBUG_VAR_SANITY_CHECK_LOG

Prints the log of the GPU memory out-of-bounds check for the varnode with the specified ID. Empty by default, meaning disabled.

MGB_LOG_TRT_MEM_ALLOC

Prints the GPU memory requested by TensorRT. Empty by default, meaning disabled.

Distributed related#

MGE_PLASMA_MEMORY

The size of the DataLoader shared memory, in bytes by default.

0 means shared memory is not used, 100000000 means 100 MB, 4000000000 means 4 GB, and so on.
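The examples above use decimal units (100 MB = 100000000 bytes). A minimal sketch of that arithmetic follows; `to_bytes` is a hypothetical helper introduced here, and whether MegEngine accepts any form other than a plain byte count is not stated in this document.

```python
import os

# Decimal units, matching the examples in the text (100 MB = 100000000 B).
_UNITS = {"B": 1, "KB": 10**3, "MB": 10**6, "GB": 10**9}


def to_bytes(amount, unit):
    """Convert an amount in B, KB, MB or GB to a whole number of bytes."""
    return int(amount * _UNITS[unit.upper()])


os.environ["MGE_PLASMA_MEMORY"] = str(to_bytes(100, "MB"))
```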

MGE_DATALOADER_PLASMA_DEBUG

Whether to display the output and error messages of the DataLoader shared memory.

Set to 0 to turn off and 1 to turn on.

MGE_MM_OPR_DEBUG (Only the first time setting takes effect)

Whether to output debug information for multi-machine operators. Disabled by default.

Set to 0 to turn off and 1 to turn on.

GPU memory related#

MEGENGINE_INPLACE_UPDATE

Whether to update model parameters in place to avoid fragmentation of dynamically allocated GPU memory. This can usually save twice the amount of GPU memory.

Set to 0 to turn off and 1 to turn on.

MGB_CUDA_RESERVE_MEMORY (Only the first time setting takes effect)

Whether to reserve all CUDA GPU memory, which may improve GPU memory allocation.

Set to 0 to turn off and 1 to turn on.

MGB_ROCM_RESERVE_MEMORY (Only the first time setting takes effect)

Whether to reserve all ROCm GPU memory, which may improve GPU memory allocation to a certain extent.

Set to 0 to turn off and 1 to turn on.

Sublinear related#

Note

See the SublinearMemoryConfig API for more information.

MGB_SUBLINEAR_MEMORY_THRESH_NR_TRY

The number of samples used both when searching in linear space and when searching around the current threshold in sublinear memory optimization. The default is 10.

MGB_SUBLINEAR_MEMORY_GENETIC_NR_ITER

The number of iterations the genetic algorithm uses to find the optimal segmentation strategy. The default is 0.

MGB_SUBLINEAR_MEMORY_GENETIC_POOL_SIZE

The number of samples used for crossover selection in the genetic optimization algorithm. The default is 20.

MGB_SUBLINEAR_MEMORY_LOWER_BOUND_MB

The lower bound (in MB) of the bottleneck size for sublinear memory optimization, which can be used to trade off memory against speed manually. The default is 0.

MGB_SUBLINEAR_MEMORY_WORKERS

The number of threads used when searching for the optimal segmentation strategy in sublinear memory optimization. The default is half the number of CPUs in the current system.
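The defaults listed above can be collected into one mapping and selectively overridden. This is a sketch under stated assumptions: `half_of_cpus`, `SUBLINEAR_DEFAULTS`, and `sublinear_env` are hypothetical names, and how MegEngine rounds "half the number of CPUs" is not specified here, so the integer division and the floor of 1 are guesses.

```python
import os


def half_of_cpus():
    """Half the CPU count; rounding down with a floor of 1 is an assumption."""
    return max(1, (os.cpu_count() or 1) // 2)


# Defaults as documented above, stored as the strings an environment needs.
SUBLINEAR_DEFAULTS = {
    "MGB_SUBLINEAR_MEMORY_THRESH_NR_TRY": "10",
    "MGB_SUBLINEAR_MEMORY_GENETIC_NR_ITER": "0",
    "MGB_SUBLINEAR_MEMORY_GENETIC_POOL_SIZE": "20",
    "MGB_SUBLINEAR_MEMORY_LOWER_BOUND_MB": "0",
    "MGB_SUBLINEAR_MEMORY_WORKERS": str(half_of_cpus()),
}


def sublinear_env(**overrides):
    """Return the defaults with the given variables overridden."""
    unknown = set(overrides) - set(SUBLINEAR_DEFAULTS)
    if unknown:
        raise KeyError(f"unknown variables: {sorted(unknown)}")
    env = dict(SUBLINEAR_DEFAULTS)
    env.update({key: str(value) for key, value in overrides.items()})
    return env


# Example: run 5 genetic iterations, keep everything else at its default.
env = sublinear_env(MGB_SUBLINEAR_MEMORY_GENETIC_NR_ITER=5)
```

The resulting dictionary can be merged into `os.environ` or passed as the `env` argument of `subprocess.run`.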

MEGENGINE_INPUT_NODE_USE_STATIC_SHAPE

Enables a static shape mode for InputNode, so that more var_nodes have a static shape and use static storage, which allows a larger batch_size. Disabled by default.

Set to 0 to turn off and 1 to turn on.

DTR related#

MEGENGINE_ENABLE_SWAP (Only the first time setting takes effect)

Whether to enable the swap policy. Disabled by default.

Set to 0 to turn off and 1 to turn on.

MEGENGINE_ENABLE_DROP (Only the first time setting takes effect)

Whether to enable the drop policy. Disabled by default.

Set to 0 to turn off and 1 to turn on.

MEGENGINE_DTR_AUTO_DROP (Only the first time setting takes effect)

Whether to enable the automatic drop policy. Disabled by default.

Set to 0 to turn off and 1 to turn on.

Note

The environment variables above are meant to be controlled through enable and disable.

MEGENGINE_DTR_EVICTION_THRESHOLD

The upper limit of GPU memory for DTR. Once it is exceeded, the automatic drop strategy is applied. See eviction_threshold.

MEGENGINE_DTR_EVICTEE_MINIMUM_SIZE

The minimum size a tensor must reach for the DTR strategy to apply to it. The default is 1048576 B (1 MB). See evictee_minimum_size.
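Note that the default of 1048576 B corresponds to a binary megabyte (1024 × 1024 bytes). The sketch below builds both DTR size variables from binary units. `dtr_env` is a hypothetical helper, the 5 GiB threshold is an arbitrary example, and expressing MEGENGINE_DTR_EVICTION_THRESHOLD as a plain byte count is an assumption that this document does not confirm.

```python
KIB = 1024
MIB = 1024 * KIB  # 1048576, the documented default minimum size
GIB = 1024 * MIB


def dtr_env(eviction_threshold_bytes, evictee_minimum_size_bytes=MIB):
    """Build the DTR size variables as strings (byte counts are assumed)."""
    return {
        "MEGENGINE_DTR_EVICTION_THRESHOLD": str(int(eviction_threshold_bytes)),
        "MEGENGINE_DTR_EVICTEE_MINIMUM_SIZE": str(int(evictee_minimum_size_bytes)),
    }


env = dtr_env(5 * GIB)  # arbitrary example threshold
```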

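Graph mechanism related#

MEGENGINE_INTERP_ASYNC_LEVEL (Only the first time setting takes effect)

The degree of execution parallelism for dynamic graphs: 0 is fully serial, 1 makes computation asynchronous, and 2 makes both user code and computation asynchronous (default). Setting it to 0 makes MegEngine's upper-level task queue execute synchronously: each time Python calls an Op, the C++ layer executes that Op, and the Python layer does not move on to the next statement until it has finished. This makes it easier to locate where an error is raised in the Python layer, but it reduces speed.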
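Since this variable is read only once, a convenient way to debug with level 0 is to launch the script in a child process whose environment already contains it. The following is a minimal sketch: `env_with_async_level` and `run_serially` are hypothetical helpers, and the script path passed to `run_serially` is whatever script you want to debug.

```python
import os
import subprocess
import sys

# Meanings as described in the text above.
ASYNC_LEVELS = {
    0: "fully serial",
    1: "asynchronous computation",
    2: "asynchronous user code and computation (default)",
}


def env_with_async_level(level, base_env=None):
    """Copy an environment and set MEGENGINE_INTERP_ASYNC_LEVEL in the copy."""
    if level not in ASYNC_LEVELS:
        raise ValueError(f"level must be one of {sorted(ASYNC_LEVELS)}")
    env = dict(os.environ if base_env is None else base_env)
    env["MEGENGINE_INTERP_ASYNC_LEVEL"] = str(level)
    return env


def run_serially(script_path):
    """Run a script with fully serial execution to localize Python errors."""
    return subprocess.run(
        [sys.executable, script_path],
        env=env_with_async_level(0),
        check=False,
    )
```

Setting the variable in the child's environment is equivalent to the export approach described earlier and leaves the parent process untouched.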

MEGENGINE_ASYNC_QUEUE_SIZE (Only the first time setting takes effect)

The size of the asynchronous queue. The default is 10000. After increasing this value you may see memory usage keep growing, which is normal.

MEGENGINE_CATCH_WORKER_EXEC (Only the first time setting takes effect)

Whether to catch exceptions raised by the dynamic graph worker. Enabled by default; it can be disabled while debugging.

Set to 0 to turn off and 1 to turn on.

MEGENGINE_RECORD_COMPUTING_PATH (Only the first time setting takes effect)

Whether to record the historical computing path of tensors. Disabled by default.

Set to 0 to turn off and 1 to turn on.

MEGENGINE_EXECUTION_STRATEGY (Only the first time setting takes effect)

Sets the kernel selection strategy (fast-run), which affects running speed, reproducibility, and compilation:

  • HEURISTIC: select the kernel heuristically

  • PROFILE: select the kernel according to profiled time

  • REPRODUCEABLE: use reproducible algorithms

  • OPTIMIZED: use optimized algorithms

The default is HEURISTIC. See set_execution_strategy for more information.

MGB_CONV_PROFILING_TIMEOUT (Only the first time setting takes effect)

The profiling timeout threshold. When it is exceeded, the kernel being run is killed directly. The default is 0, which means no limit.

MGB_PROFILE_ONLY_WAIT (Only the first time setting takes effect)

When profiling, only operators with wait behavior are selected. Empty by default, meaning disabled.

CUDA_BIN_PATH (Only the first time setting takes effect)

Sets the path of the CUDA compiler nvcc, which is used to compile fused kernels.

By default, nvcc is searched for through the PATH and LIBRARY_PATH environment variables. You can also specify the path manually, such as "/data/opt/cuda/bin/".
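To check whether nvcc is discoverable before relying on the default search, the standard library can be used. This sketch covers only the PATH lookup (via `shutil.which`); it does not reproduce MegEngine's own search, including the LIBRARY_PATH part, and `resolve_cuda_bin_path` is a hypothetical helper.

```python
import os
import shutil


def resolve_cuda_bin_path(manual_dir=None):
    """Return a directory for CUDA_BIN_PATH, or None if nvcc is not found.

    A manually given directory wins; otherwise the directory of the first
    nvcc found on PATH is used.
    """
    if manual_dir is not None:
        return manual_dir
    nvcc = shutil.which("nvcc")
    return os.path.dirname(nvcc) if nvcc else None


cuda_bin = resolve_cuda_bin_path()
if cuda_bin is not None:
    # Must happen before import megengine, since it is read only once.
    os.environ.setdefault("CUDA_BIN_PATH", cuda_bin)
```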

MGB_JIT_BACKEND

The compilation backend for JIT fused kernels. It can be set to HALIDE, NVRTC, or MLIR.

MGB_JIT_KEEP_INTERM (Only the first time setting takes effect)

Whether to keep the temporary files generated by JIT. Empty by default, meaning they are not kept.

MGB_JIT_WORKDIR (Only the first time setting takes effect)

The directory for temporary files generated by JIT. The default is /tmp/mgbjit-XXXXXX.

MGB_DUMP_INPUT (Only the first time setting takes effect)

Whether to export the input values of each operator when dumping. Empty by default, meaning disabled.

MGE_FASTRUN_CACHE_TYPE (Only the first time setting takes effect)

How the fastrun cache is stored. It can be set to FILE (the contents are saved to a file when the process exits) or MEMORY (nothing is saved).

MGE_FASTRUN_CACHE_DIR (Only the first time setting takes effect)

Changes the storage path of the fastrun cache. The default is ~/.cache/megengine/persistent_cache.
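The two fastrun cache variables can be prepared together, with the storage type validated against the two documented values. This is a sketch: `fastrun_cache_env` is a hypothetical helper, and expanding "~" in Python before exporting is a precaution rather than something this document says MegEngine requires.

```python
import os

VALID_CACHE_TYPES = ("FILE", "MEMORY")

# The documented default location, with "~" expanded for the current user.
DEFAULT_CACHE_DIR = os.path.expanduser("~/.cache/megengine/persistent_cache")


def fastrun_cache_env(cache_type, cache_dir=None):
    """Build the fastrun cache variables; cache_dir is optional."""
    if cache_type not in VALID_CACHE_TYPES:
        raise ValueError(f"cache_type must be one of {VALID_CACHE_TYPES}")
    env = {"MGE_FASTRUN_CACHE_TYPE": cache_type}
    if cache_dir is not None:
        env["MGE_FASTRUN_CACHE_DIR"] = os.path.abspath(
            os.path.expanduser(cache_dir)
        )
    return env


# Must be applied before import megengine, since both are read only once.
os.environ.update(fastrun_cache_env("MEMORY"))
```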

Debugging related#

MGB_WAIT_TERMINATE (Only the first time setting takes effect)

When MegEngine crashes, it enters a waiting state so that you can attach gdb to debug it. Empty by default, meaning disabled.

MGB_THROW_ON_FORK (Only the first time setting takes effect)

Whether to throw an exception in a forked process. Empty by default, meaning disabled.

MGB_THROW_ON_SCALAR_IDX (Only the first time setting takes effect)

Whether to throw an exception when subtensor is used with a scalar as the Tensor index. Disabled by default.

Set to 0 to turn off and 1 to turn on.