megenginelite.network

class LiteOptions [source]

The inference options that can optimize network forwarding performance.

Variables
  • weight_preprocess – optimize inference performance by processing the weights of the network ahead of time

  • fuse_preprocess – fuse preprocessing patterns, such as astype + pad_channel + dimshuffle

  • fake_next_exec – whether only to perform non-computing tasks (like memory allocation and queue initialization) for next exec. This will be reset to false when the graph is executed.

  • var_sanity_check_first_run – disable the var sanity check on the first run. The var sanity check is enabled on the first-time execution by default and can be used to find potential memory access errors in operators

  • const_shape – used to reduce memory usage and improve performance, since some static inference data structures can be omitted and some operators can be computed before forwarding

  • force_dynamic_alloc – force dynamic memory allocation for all vars

  • force_output_dynamic_alloc – force dynamic memory allocation for output tensors that are used as the input of the CallbackCaller operator

  • no_profiling_on_shape_change – do not re-profile to select the best implementation algorithm when the input shape changes (use the previous algorithm)

  • jit_level

    Execute supported operators with JIT (MLIR and NVRTC are supported). Can only be used on NVIDIA GPUs and x86 CPUs; this value indicates the JIT level:

    level 1: JIT execution of basic elemwise operators

    level 2: JIT execution of elemwise and reduce operators

  • record_level

    flags to optimize inference performance by recording the kernel tasks during the first run; afterwards, all the inference needs to do is execute the recorded tasks.

    level = 0 means normal inference

    level = 1 means use record-based inference

    level = 2 means record-based inference with the extra memory freed

  • graph_opt_level

    network optimization level:

    0: disable

    1: level-1: inplace arith transformations during graph construction

    2: level-2: level-1, plus global optimization before graph compiling

    3: also enable JIT

  • async_exec_level

    level of dispatch on separate threads for different comp_node.

    0: do not perform async dispatch

    1: dispatch async if there is more than one comp node with a limited queue

    mask 0b10: async if there are multiple comp nodes with

    mask 0b100: always async

Examples

from megenginelite import *
options = LiteOptions()
options.weight_preprocess = True
options.record_level = 1
options.fuse_preprocess = True
async_exec_level

Structure/Union member

comp_node_seq_record_level

Structure/Union member

const_shape

Structure/Union member

enable_nchw32

Structure/Union member

enable_nchw4

Structure/Union member

enable_nchw44

Structure/Union member

enable_nchw44_dot

Structure/Union member

enable_nchw64

Structure/Union member

enable_nchw88

Structure/Union member

enable_nhwcd4

Structure/Union member

fake_next_exec

Structure/Union member

force_dynamic_alloc

Structure/Union member

force_output_dynamic_alloc

Structure/Union member

force_output_use_user_specified_memory

Structure/Union member

fuse_preprocess

Structure/Union member

graph_opt_level

Structure/Union member

jit_level

Structure/Union member

no_profiling_on_shape_change

Structure/Union member

var_sanity_check_first_run

Structure/Union member

weight_preprocess

Structure/Union member

class LiteConfig(device_type=LiteDeviceType.LITE_CPU, option=None) [source]

Configuration used when loading and compiling a network.

Variables
  • has_compression – flag of whether the model is compressed; the compression method is stored in the model

  • device_id – configure the device id of a network

  • device_type – configure the device type of a network

  • backend – configure the inference backend of the network; currently only MegEngine is supported

  • bare_model_cryption_name – the name of the encryption method of a bare model; a bare model is not packed with JSON information, and this name is needed to decrypt an encrypted bare model

  • options – the LiteOptions configuration

  • auto_optimize_inference – lite will detect the device information and set the options heuristically

Examples

from megenginelite import *
config = LiteConfig()
config.has_compression = False
config.device_type = LiteDeviceType.LITE_CPU
config.backend = LiteBackend.LITE_DEFAULT
config.bare_model_cryption_name = "AES_default".encode("utf-8")
config.auto_optimize_inference = False
auto_optimize_inference

Structure/Union member

backend

Structure/Union member

property bare_model_cryption_name

device_id

Structure/Union member

device_type

Structure/Union member

has_compression

Structure/Union member

options

Structure/Union member

class LiteIO(name, is_host=True, io_type=LiteIOType.LITE_IO_VALUE, layout=None) [source]

Configure a network input or output item; the input and output tensor information is described here.

Variables
  • name – the tensor name in the graph corresponding to the IO

  • is_host – used to mark where the input tensor comes from and where the output tensor will be copied to; if is_host is True, the input comes from the host and the output is copied to the host, otherwise they stay on the device. Sometimes the input comes from the device and the output does not need to be copied to the host. Default is True.

  • io_type – the IO type, which can be SHAPE or VALUE; when SHAPE is set, the input or output tensor value is invalid and only the shape will be set. Default is VALUE.

  • config_layout – the layout configured by the user. If another layout is set before forwarding, this layout is bypassed; if no other layout is set before forwarding, this layout takes effect; if this layout is not set at all, the model forwards with its original layout. For an output, it is used for checking.

Note

if another layout is set on the input tensor before forwarding, this layout will not take effect

if no layout is set before forwarding, the model forwards with its original layout

if a layout is set on an output tensor, it is used to check whether the layout computed by the network is correct

Examples

from megenginelite import *
io = LiteIO(
    "data2",
    is_host=True,
    io_type=LiteIOType.LITE_IO_SHAPE,
    layout=LiteLayout([2, 4, 4]),
)
config_layout

Structure/Union member

io_type

Structure/Union member

is_host

Structure/Union member

property name

get the name of the IO item

class LiteNetworkIO(inputs=None, outputs=None) [source]

The input and output information used when loading the network; the NetworkIO will remain in the network until the network is destroyed.

Variables
  • inputs – all the input tensor information that will be configured to the network

  • outputs – all the output tensor information that will be configured to the network

Examples

from megenginelite import *
input_io = LiteIO("data", is_host=False, io_type=LiteIOType.LITE_IO_VALUE)
io = LiteNetworkIO()
io.add_input(input_io)
output_io = LiteIO("out", is_host=True, layout=LiteLayout([1, 1000]))
io.add_output(output_io)
add_input(obj, is_host=True, io_type=LiteIOType.LITE_IO_VALUE, layout=None) [source]

add input information into LiteNetworkIO

add_output(obj, is_host=True, io_type=LiteIOType.LITE_IO_VALUE, layout=None) [source]

add output information into LiteNetworkIO
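As a sketch of the constructor variant, the inputs and outputs can also be supplied at construction time; passing lists of LiteIO objects is an assumption based on the signature above, and the tensor names are placeholders taken from the example:

```python
from megenginelite import *

# build the same IO configuration in one step via the constructor
input_io = LiteIO("data", is_host=False, io_type=LiteIOType.LITE_IO_VALUE)
output_io = LiteIO("out", is_host=True, layout=LiteLayout([1, 1000]))
io = LiteNetworkIO(inputs=[input_io], outputs=[output_io])
```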

class LiteNetwork(config=None, io=None) [source]

The network used to load a model and run forwarding.

Examples

from megenginelite import *
config = LiteConfig()
config.device_type = LiteDeviceType.LITE_CPU
network = LiteNetwork(config)
network.load("model_path")

input_name = network.get_input_name(0)
input_tensor = network.get_io_tensor(input_name)
output_name = network.get_output_name(0)
output_tensor = network.get_io_tensor(output_name)

input_tensor.set_data_by_copy(input_data)

network.forward()
network.wait()
async_with_callback(async_callback) [source]

set the network to forward in async mode and set the AsyncCallback callback function

Parameters

async_callback – the callback to set for the network
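A minimal sketch of asynchronous forwarding. "model_path" is a placeholder as in the class example above, and the no-argument callback signature is an assumption:

```python
from megenginelite import *

network = LiteNetwork()
network.load("model_path")  # placeholder path

done = []

def on_finish():
    # assumed no-argument callback, invoked when forwarding completes
    done.append(True)

network.async_with_callback(on_finish)
# ... fill the input tensor here ...
network.forward()
network.wait()  # returns after forwarding (and the callback) has finished
```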

property device_id

get the device id

Returns

the device id used by the current network

dump_layout_transform_model(model_file) [source]

dump the network to the specified path after the global layout transform optimization

Parameters

model_file – the file path to dump the model to

enable_cpu_inplace_mode() [source]

set CPU forwarding to inplace mode, in which CPU forwarding creates only one thread

Note

this must be set before the network is loaded

enable_global_layout_transform() [source]

enable the global layout transform optimization for the network; it automatically determines the layout of every operator in the network by profiling, and thus can improve forwarding performance
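The layout transform is typically paired with dump_layout_transform_model so the profiling cost is paid only once; a hedged sketch with placeholder paths (whether enabling must precede load() is an assumption):

```python
from megenginelite import *

network = LiteNetwork()
# assumed ordering: enable before load() so profiling can run during compilation
network.enable_global_layout_transform()
network.load("model_path")  # placeholder path
# ... fill the input tensor here ...
network.forward()
network.wait()
# save the transformed model so later loads skip the profiling cost
network.dump_layout_transform_model("optimized_model_path")  # placeholder path
```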

enable_profile_performance(profile_file) [source]

profile the network performance and save the profiled information into the given file

Parameters

profile_file – the file to save the profiling information to

extra_configure(extra_config) [source]

Extra Configuration to the network.

forward() [source]

forward the network with the filled input data and fill the output data into the output tensors

get_all_input_name() [source]

get all the input tensor names in the network

Returns

the names of all input tensors in the network

get_all_output_name() [source]

get all the output tensor names in the network

Returns

the names of all output tensors in the network

get_input_name(index) [source]

get the input name by its index in the network

Parameters

index – the index of the input name

Returns

the name of the input tensor with the given index

get_io_tensor(name, phase=LiteTensorPhase.LITE_IO) [source]

get an input or output tensor by its name

Parameters
  • name – the name of the io tensor

  • phase – the phase of the LiteTensor; this is useful to distinguish input from output tensors with the same name

Returns

the tensor with the given name and type

get_output_name(index) [source]

get the output name by its index in the network

Parameters

index – the index of the output name

Returns

the name of the output tensor with the given index

get_static_memory_alloc_info(log_dir='logs/test') [source]

get the static peak memory information shown by graph visualization

Parameters

log_dir – the directory to save the information log

io_bin_dump(bin_dir) [source]

dump all input/output tensors of all operators to the output file in binary format; users can use this function to debug compute errors

Parameters

bin_dir – the binary file directory

io_txt_dump(txt_file) [source]

dump all input/output tensors of all operators to the output file in txt format; users can use this function to debug compute errors

Parameters

txt_file – the txt file

is_cpu_inplace_mode() [source]

whether the network runs in CPU inplace mode

Returns

True if inplace mode is used, otherwise False

load(path) [source]

load network from given path

set_finish_callback(finish_callback) [source]

when the network finishes forwarding, the callback will be called; the finish_callback receives a parameter mapping each LiteIO to its corresponding LiteTensor

Parameters

finish_callback – the callback to set for the network
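A sketch of a finish callback that inspects the outputs; the dict-style mapping parameter and the integer return value are assumptions, and "model_path" is a placeholder:

```python
from megenginelite import *

def finish_callback(io_tensor_map):
    # assumed shape: io_tensor_map maps each LiteIO to its LiteTensor
    for io, tensor in io_tensor_map.items():
        print(io.name, tensor.layout)
    return 0  # assumed convention: 0 for success

network = LiteNetwork()
network.load("model_path")  # placeholder path
network.set_finish_callback(finish_callback)
# ... fill the input tensor here ...
network.forward()
network.wait()
```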

set_network_algo_policy(policy, shared_batch_size=0, binary_equal_between_batch=False) [source]

set the network algorithm search policy for fast-run

Parameters
  • shared_batch_size – the batch size used by fast-run. A non-zero value means fast-run uses this batch size regardless of the model's batch size; zero means fast-run uses the model's batch size

  • binary_equal_between_batch – if the content of each input batch is binary equal, whether the content of each output batch is promised to be equal as well
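A hedged sketch of enabling profile-based algorithm selection; the enum name LiteAlgoSelectStrategy.LITE_ALGO_PROFILE, the call ordering relative to load(), and "model_path" are all assumptions:

```python
from megenginelite import *

network = LiteNetwork()
# assumed enum: profile every candidate algorithm and pick the fastest
network.set_network_algo_policy(
    LiteAlgoSelectStrategy.LITE_ALGO_PROFILE,
    shared_batch_size=1,             # fast-run profiles with batch size 1
    binary_equal_between_batch=False,
)
network.load("model_path")  # placeholder path
```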

set_network_algo_workspace_limit(size_limit) [source]

set the operator workspace limitation in the target network; some operators may use a large workspace to achieve good performance, and limiting the workspace can save memory but may affect performance

Parameters

size_limit – the byte size of the workspace limitation

set_start_callback(start_callback) [source]

when the network starts forwarding, the callback will be called; the start_callback receives a parameter mapping each LiteIO to its corresponding LiteTensor

Parameters

start_callback – the callback to set for the network

share_runtime_memroy(src_network) [source]

share runtime memory with the source network

Parameters

src_network – the network to share runtime memory with

share_weights_with(src_network) [source]

share weights with the loaded network

Parameters

src_network – the network to share weights with
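A sketch of running two networks over one copy of the weights; "model_path" is a placeholder, and whether the destination network needs its own load() call is an assumption (here it is created from config only):

```python
from megenginelite import *

config = LiteConfig()
config.device_type = LiteDeviceType.LITE_CPU

# load the model once
src_network = LiteNetwork(config)
src_network.load("model_path")  # placeholder path

# the second network reuses the first network's weights instead of
# holding its own copy, saving memory when both run the same model
dst_network = LiteNetwork(config)
dst_network.share_weights_with(src_network)
```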

property stream_id

get the stream id

Returns

the stream id set for the network

property threads_number

get the thread number of the network

Returns

the number of threads set in the network

use_tensorrt() [source]

use TensorRT

Note

this must be set before the network is loaded

wait() [source]

wait until forwarding finishes in sync mode