megenginelite.network

class LiteOptions [source]

The inference options that can optimize network forwarding performance.

Variables
  • weight_preprocess – optimize inference performance by processing the weights of the network ahead of time

  • fuse_preprocess – fuse preprocessing patterns, such as astype + pad_channel + dimshuffle

  • fake_next_exec – whether only to perform non-computing tasks (like memory allocation and queue initialization) for next exec. This will be reset to false when the graph is executed.

  • var_sanity_check_first_run – disable the var sanity check on the first run. The var sanity check is enabled on the first-time execution by default and can be used to find potential memory access errors in operators

  • const_shape – used to reduce memory usage and improve performance, since some static inference data structures can be omitted and some operators can be computed before forwarding

  • force_dynamic_alloc – force dynamic memory allocation for all vars

  • force_output_dynamic_alloc – force dynamic memory allocation for output tensors that are used as the input of the CallbackCaller operator

  • no_profiling_on_shape_change – do not re-profile to select the best implementation algorithm when the input shape changes (use the previous algorithm)

  • jit_level

    Execute supported operators with JIT (MLIR and NVRTC are supported). Can only be used on NVIDIA GPUs and x86 CPUs; this value indicates the JIT level:

    level 1: JIT execution of basic elemwise operators

    level 2: JIT execution of elemwise and reduce operators

  • record_level

    flags to optimize inference performance by recording the kernel tasks during the first run; afterwards, all the inference needs to do is execute the recorded tasks.

    level = 0 means normal inference

    level = 1 means use record-based inference

    level = 2 means record-based inference with the extra memory freed

  • graph_opt_level

    network optimization level:

    0: disable

    1: level-1: inplace arith transformations during graph construction

    2: level-2: level-1, plus global optimization before graph compiling

    3: also enable JIT

  • async_exec_level

    level of dispatch on separate threads for different comp_node.

    0: do not perform async dispatch

    1: dispatch async if there is more than one comp node with a limited queue

    mask 0b10: async if there are multiple comp nodes with

    mask 0b100: always async

Examples

from megenginelite import *
options = LiteOptions()
options.weight_preprocess = True
options.record_level = 1
options.fuse_preprocess = True
async_exec_level

Structure/Union member

comp_node_seq_record_level

Structure/Union member

const_shape

Structure/Union member

enable_nchw32

Structure/Union member

enable_nchw4

Structure/Union member

enable_nchw44

Structure/Union member

enable_nchw44_dot

Structure/Union member

enable_nchw64

Structure/Union member

enable_nchw88

Structure/Union member

enable_nhwcd4

Structure/Union member

fake_next_exec

Structure/Union member

force_dynamic_alloc

Structure/Union member

force_output_dynamic_alloc

Structure/Union member

force_output_use_user_specified_memory

Structure/Union member

fuse_preprocess

Structure/Union member

graph_opt_level

Structure/Union member

jit_level

Structure/Union member

no_profiling_on_shape_change

Structure/Union member

var_sanity_check_first_run

Structure/Union member

weight_preprocess

Structure/Union member

class LiteConfig(device_type=LiteDeviceType.LITE_CPU, option=None) [source]

Configuration used when loading and compiling a network.

Variables
  • has_compression – flag of whether the model is compressed; the compression method is stored in the model

  • device_id – configure the device id of a network

  • device_type – configure the device type of a network

  • backend – configure the inference backend of the network; currently only MegEngine is supported

  • bare_model_cryption_name – the name of the encryption method of a bare model; a bare model is not packed with JSON information, and this name is needed to decrypt an encrypted bare model

  • options – the LiteOptions configuration

  • auto_optimize_inference – lite will detect the device information and set the options heuristically

Examples

from megenginelite import *
config = LiteConfig()
config.has_compression = False
config.device_type = LiteDeviceType.LITE_CPU
config.backend = LiteBackend.LITE_DEFAULT
config.bare_model_cryption_name = "AES_default".encode("utf-8")
config.auto_optimize_inference = False
auto_optimize_inference

Structure/Union member

backend

Structure/Union member

property bare_model_cryption_name

device_id

Structure/Union member

device_type

Structure/Union member

has_compression

Structure/Union member

options

Structure/Union member

class LiteIO(name, is_host=True, io_type=LiteIOType.LITE_IO_VALUE, layout=None) [source]

Configure a network input or output item; the input and output tensor information is described here.

Variables
  • name – the tensor name in the graph corresponding to the IO

  • is_host – used to mark where the input tensor comes from and where the output tensor will be copied to; if is_host is True, the input comes from the host and the output is copied to the host, otherwise they stay on the device. Sometimes the input comes from the device and the output does not need to be copied to the host. Default is True.

  • io_type – the IO type, which can be SHAPE or VALUE; when SHAPE is set, the input or output tensor value is invalid and only the shape will be set. Default is VALUE.

  • config_layout – the layout configured by the user. If another layout is set before forwarding, this layout is bypassed; if no other layout is set before forwarding, this layout takes effect; if this layout is not set at all, the model forwards with its original layout. For an output, it is used for checking.

Note

if another layout is set on the input tensor before forwarding, this layout will not take effect

if no layout is set before forwarding, the model forwards with its original layout

if a layout is set on an output tensor, it is used to check whether the layout computed by the network is correct

Examples

from megenginelite import *
io = LiteIO(
    "data2",
    is_host=True,
    io_type=LiteIOType.LITE_IO_SHAPE,
    layout=LiteLayout([2, 4, 4]),
)
config_layout

Structure/Union member

io_type

Structure/Union member

is_host

Structure/Union member

property name

get the name of the IO item

class LiteNetworkIO(inputs=None, outputs=None) [source]

The input and output information used when loading the network; the NetworkIO will remain in the network until the network is destroyed.

Variables
  • inputs – all the input tensor information that will be configured to the network

  • outputs – all the output tensor information that will be configured to the network

Examples

from megenginelite import *
input_io = LiteIO("data", is_host=False, io_type=LiteIOType.LITE_IO_VALUE)
io = LiteNetworkIO()
io.add_input(input_io)
output_io = LiteIO("out", is_host=True, layout=LiteLayout([1, 1000]))
io.add_output(output_io)
add_input(obj, is_host=True, io_type=LiteIOType.LITE_IO_VALUE, layout=None) [source]

add input information into LiteNetworkIO

add_output(obj, is_host=True, io_type=LiteIOType.LITE_IO_VALUE, layout=None) [source]

add output information into LiteNetworkIO
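As a sketch of the constructor variant, the inputs and outputs can also be supplied at construction time; passing lists of LiteIO objects is an assumption based on the signature above, and the tensor names are placeholders taken from the example:

```python
from megenginelite import *

# build the same IO configuration in one step via the constructor
input_io = LiteIO("data", is_host=False, io_type=LiteIOType.LITE_IO_VALUE)
output_io = LiteIO("out", is_host=True, layout=LiteLayout([1, 1000]))
io = LiteNetworkIO(inputs=[input_io], outputs=[output_io])
```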

class LiteNetwork(config=None, io=None) [source]

The network used to load a model and run forwarding.

Examples

from megenginelite import *
config = LiteConfig()
config.device_type = LiteDeviceType.LITE_CPU
network = LiteNetwork(config)
network.load("model_path")

input_name = network.get_input_name(0)
input_tensor = network.get_io_tensor(input_name)
output_name = network.get_output_name(0)
output_tensor = network.get_io_tensor(output_name)

input_tensor.set_data_by_copy(input_data)

network.forward()
network.wait()
async_with_callback(async_callback) [source]

set the network to forward in async mode and set the AsyncCallback callback function

Parameters

async_callback – the callback to set for the network
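A minimal sketch of asynchronous forwarding. "model_path" is a placeholder as in the class example above, and the no-argument callback signature is an assumption:

```python
from megenginelite import *

network = LiteNetwork()
network.load("model_path")  # placeholder path

done = []

def on_finish():
    # assumed no-argument callback, invoked when forwarding completes
    done.append(True)

network.async_with_callback(on_finish)
# ... fill the input tensor here ...
network.forward()
network.wait()  # returns after forwarding (and the callback) has finished
```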

property device_id

get the device id

Returns

the device id used by the current network

dump_layout_transform_model(model_file) [source]

dump the network to the specified path after the global layout transform optimization

Parameters

model_file – the file path to dump the model to

enable_cpu_inplace_mode() [source]

set CPU forwarding to inplace mode, in which CPU forwarding creates only one thread

Note

this must be set before the network is loaded

enable_global_layout_transform() [source]

enable the global layout transform optimization for the network; it automatically determines the layout of every operator in the network by profiling, and thus can improve forwarding performance
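The layout transform is typically paired with dump_layout_transform_model so the profiling cost is paid only once; a hedged sketch with placeholder paths (whether enabling must precede load() is an assumption):

```python
from megenginelite import *

network = LiteNetwork()
# assumed ordering: enable before load() so profiling can run during compilation
network.enable_global_layout_transform()
network.load("model_path")  # placeholder path
# ... fill the input tensor here ...
network.forward()
network.wait()
# save the transformed model so later loads skip the profiling cost
network.dump_layout_transform_model("optimized_model_path")  # placeholder path
```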

enable_profile_performance(profile_file) [source]

profile the network performance and save the profiled information into the given file

Parameters

profile_file – the file to save the profiling information to

extra_configure(extra_config) [source]

Extra Configuration to the network.

forward() [source]

forward the network with the filled input data and fill the output data into the output tensors

get_all_input_name() [source]

get all the input tensor names in the network

Returns

the names of all input tensors in the network

get_all_output_name() [source]

get all the output tensor names in the network

Returns

the names of all output tensors in the network

get_input_name(index) [source]

get the input name by its index in the network

Parameters

index – the index of the input name

Returns

the name of the input tensor with the given index

get_io_tensor(name, phase=LiteTensorPhase.LITE_IO) [source]

get an input or output tensor by its name

Parameters
  • name – the name of the io tensor

  • phase – the phase of the LiteTensor; this is useful to distinguish input from output tensors with the same name

Returns

the tensor with the given name and type

get_output_name(index) [source]

get the output name by its index in the network

Parameters

index – the index of the output name

Returns

the name of the output tensor with the given index

get_static_memory_alloc_info(log_dir='logs/test') [source]

get the static peak memory information shown by graph visualization

Parameters

log_dir – the directory to save the information log

io_bin_dump(bin_dir) [source]

dump all input/output tensors of all operators to the output file in binary format; users can use this function to debug compute errors

Parameters

bin_dir – the binary file directory

io_txt_dump(txt_file) [source]

dump all input/output tensors of all operators to the output file in txt format; users can use this function to debug compute errors

Parameters

txt_file – the txt file

is_cpu_inplace_mode() [source]

whether the network runs in CPU inplace mode

Returns

True if inplace mode is used, otherwise False

load(path) [source]

load network from given path

set_finish_callback(finish_callback) [source]

when the network finishes forwarding, the callback will be called; the finish_callback receives a parameter mapping each LiteIO to its corresponding LiteTensor

Parameters

finish_callback – the callback to set for the network
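A sketch of a finish callback that inspects the outputs; the dict-style mapping parameter and the integer return value are assumptions, and "model_path" is a placeholder:

```python
from megenginelite import *

def finish_callback(io_tensor_map):
    # assumed shape: io_tensor_map maps each LiteIO to its LiteTensor
    for io, tensor in io_tensor_map.items():
        print(io.name, tensor.layout)
    return 0  # assumed convention: 0 for success

network = LiteNetwork()
network.load("model_path")  # placeholder path
network.set_finish_callback(finish_callback)
# ... fill the input tensor here ...
network.forward()
network.wait()
```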

set_network_algo_policy(policy, shared_batch_size=0, binary_equal_between_batch=False) [source]

set the network algorithm search policy for fast-run

Parameters
  • shared_batch_size – the batch size used by fast-run. A non-zero value means fast-run uses this batch size regardless of the model's batch size; zero means fast-run uses the model's batch size

  • binary_equal_between_batch – if the content of each input batch is binary equal, whether the content of each output batch is promised to be equal as well
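A hedged sketch of enabling profile-based algorithm selection; the enum name LiteAlgoSelectStrategy.LITE_ALGO_PROFILE, the call ordering relative to load(), and "model_path" are all assumptions:

```python
from megenginelite import *

network = LiteNetwork()
# assumed enum: profile every candidate algorithm and pick the fastest
network.set_network_algo_policy(
    LiteAlgoSelectStrategy.LITE_ALGO_PROFILE,
    shared_batch_size=1,             # fast-run profiles with batch size 1
    binary_equal_between_batch=False,
)
network.load("model_path")  # placeholder path
```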

set_network_algo_workspace_limit(size_limit) [source]

set the operator workspace limitation in the target network; some operators may use a large workspace to achieve good performance, and limiting the workspace can save memory but may affect performance

Parameters

size_limit – the byte size of the workspace limitation

set_start_callback(start_callback) [source]

when the network starts forwarding, the callback will be called; the start_callback receives a parameter mapping each LiteIO to its corresponding LiteTensor

Parameters

start_callback – the callback to set for the network

share_runtime_memroy(src_network) [source]

share runtime memory with the source network

Parameters

src_network – the network to share runtime memory with

share_weights_with(src_network) [source]

share weights with the loaded network

Parameters

src_network – the network to share weights with
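A sketch of running two networks over one copy of the weights; "model_path" is a placeholder, and whether the destination network needs its own load() call is an assumption (here it is created from config only):

```python
from megenginelite import *

config = LiteConfig()
config.device_type = LiteDeviceType.LITE_CPU

# load the model once
src_network = LiteNetwork(config)
src_network.load("model_path")  # placeholder path

# the second network reuses the first network's weights instead of
# holding its own copy, saving memory when both run the same model
dst_network = LiteNetwork(config)
dst_network.share_weights_with(src_network)
```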

property stream_id

get the stream id

Returns

the stream id set for the network

property threads_number

get the thread number of the network

Returns

the number of threads set in the network

use_tensorrt() [source]

use TensorRT

Note

this must be set before the network is loaded

wait() [source]

wait until forwarding finishes in sync mode