megengine.utils.network.Network.optimize_for_inference
- Network.optimize_for_inference(dest_vars, **kwargs)
- Optimize the network so that it performs better at inference time.
- Parameters
- dest_vars – list of output vars in the operator graph.
- Keyword Arguments:
- enable_io16xc32 – whether to use float16 for I/O between oprs while keeping float32 as the internal computation precision. Note that the output var is changed to float16.
- enable_ioc16 – whether to use float16 for both I/O and computation precision. 
- enable_hwcd4 – whether to use the NHWCD4 data layout. This is faster on some OpenCL backends.
- enable_nchw88 – whether to use the NCHW88 data layout, currently used in the x86 AVX backend.
- enable_nchw44 – whether to use the NCHW44 data layout, currently used in the ARM backend.
- enable_nchw44_dot – whether to use the NCHW44_dot data layout, currently used in the ARMv8.2 + dotprod backend.
- enable_nchw4 – whether to use the NCHW4 data layout, currently used in the NVIDIA backend (based on cuDNN).
- enable_nchw32 – whether to use the NCHW32 data layout, currently used in the NVIDIA backend with Tensor Cores (based on cuDNN).
- enable_chwn4 – whether to use the CHWN4 data layout, currently used in the NVIDIA backend with Tensor Cores.
- enable_nchw64 – whether to use the NCHW64 data layout, used for fast int4 support on NVIDIA GPUs.
- enable_fuse_conv_bias_nonlinearity – whether to fuse conv + bias + nonlinearity into one opr.
- enable_fuse_conv_bias_with_z – whether to fuse conv_bias with the z input for inference on the NVIDIA backend. Note that this optimization pass causes the output precision to differ between training and inference.
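
The keyword arguments above are plain boolean flags, so one way to use them is to assemble a dict and unpack it into the call. The following is a minimal sketch only: the backend labels, `LAYOUT_FLAG_BY_BACKEND`, `build_inference_kwargs`, and `optimize` are hypothetical names introduced for illustration and are not part of MegEngine. The mapping simply restates the backend notes from the list above, and the sketch assumes one layout flag is chosen per call.

```python
# Hypothetical mapping from an illustrative backend label to the layout flag
# that the keyword list above associates with that backend.
LAYOUT_FLAG_BY_BACKEND = {
    "opencl": "enable_hwcd4",
    "x86_avx": "enable_nchw88",
    "arm": "enable_nchw44",
    "armv8.2_dotprod": "enable_nchw44_dot",
    "nvidia_cudnn": "enable_nchw4",
    "nvidia_tensorcore": "enable_nchw32",
}


def build_inference_kwargs(backend, fuse_conv_bias=True, io16xc32=False):
    """Build a kwargs dict for optimize_for_inference (hypothetical helper)."""
    if backend not in LAYOUT_FLAG_BY_BACKEND:
        raise ValueError(f"unknown backend label: {backend!r}")
    # Assumption: exactly one layout flag is enabled per call.
    kwargs = {LAYOUT_FLAG_BY_BACKEND[backend]: True}
    if fuse_conv_bias:
        kwargs["enable_fuse_conv_bias_nonlinearity"] = True
    if io16xc32:
        # Per the note above, this changes the output var to float16.
        kwargs["enable_io16xc32"] = True
    return kwargs


def optimize(network, dest_vars, backend):
    """Forward the flags to any object exposing optimize_for_inference."""
    return network.optimize_for_inference(
        dest_vars, **build_inference_kwargs(backend)
    )
```

With a loaded `Network` instance `net` and a list of its output vars `dest_vars`, the call would look like `optimize(net, dest_vars, "arm")`. Keeping the flag selection in a separate function makes it easy to inspect or log the exact options before the graph is modified.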