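.. _quantization-basic-concept:

===================================
Principles of Quantization Schemes
===================================

As mentioned earlier, quantization converts a model based on floating-point data types into one that computes with fixed-point numbers. The core questions are how to represent the floating-point values in the model with fixed-point values, and how to express each floating-point operation with fixed-point operations.

Take converting float32 to uint8 as an example. The simplest method is to drop the fractional part of the float32 value, keep only the integer part, and represent any value outside the range [0, 255] as 0 or 255.
This scheme is clearly unsuitable. In deep neural networks, the outputs of intermediate layers after BN mostly have zero mean and unit variance, so dropping the fractional part loses a large amount of precision. In addition, because everything outside [0, 255] is clipped to 0 or 255, negative values and values larger than 255 incur huge errors.

This scheme is close to the type conversion logic of common programming languages, so we call it the type conversion scheme. The analysis above shows that it loses considerable precision for data that is either too large or too small.

Most mainstream float-to-fixed-point schemes today use uniform quantization, because it is more inference-friendly. It maps a floating-point value, according to its value range, uniformly onto the representable range of a fixed-point type.

Uniform quantization
~~~~~~~~~~~~~~~~~~~~

Suppose a floating-point value $x$ has the value range $[x_{min}, x_{max}]$. The formulas for converting it to an 8-bit fixed-point number with the representable range [0, 255] are

$$x_{int} = round(x/s) + z$$
$$x_{Q} = clamp(0,255,x_{int})$$

Here $s$ is the scale, also called the step size, and is a floating-point number. $z$ is the zero point, that is, the fixed-point value that represents floating-point 0.

$$s = (x_{max} - x_{min}) / 255$$
$$z = round(-x_{min} / s)$$

Uniform quantization therefore performs reasonably well for any value range. It avoids both problems of the type conversion scheme: losing precision on small ranges and being unable to represent large ranges.
The cost is two extra variables, the zero point $z$ and the scale $s$. Uniform quantization still loses precision because of the $round$ and $clamp$ operations, so it does affect model accuracy.
Reducing the precision lost when converting data from floating point to fixed point is the focus of quantization research as a whole.

.. note::

   The zero point matters. Operations such as padding and relu are sensitive to 0, so 0 must be quantized exactly for the converted fixed-point computation to be correct. When the floating-point value range does not contain 0, the range has to be extended so that it does.

The corresponding dequantization formula is

$$x_{float} = (x_{Q} - z) * s$$

After quantization and dequantization, the resulting floating-point value differs from the original by some error; the figure below illustrates the difference. Quantization discretizes the parameters of the network, and how much this affects the final accuracy of the model depends on how far the model's own parameter distribution is from a uniform distribution.

(Figure to be inserted here.)

Next, let us see how a fixed-point convolution on quantized values can represent the original floating-point convolution.

.. math::
   :nowrap:

   \begin{aligned}
   conv(x, w) &= conv((x_{Q} - z_{x}) * s_{x}, (w_{Q} - z_{w}) * s_{w}) \\
   &= s_{x}s_{w} conv(x_{Q} - z_{x},w_{Q} - z_{w} ) \\
   &= s_{x}s_{w} (conv(x_{Q}, w_{Q}) - z_{w} \sum_{k,l,m}x_{Q} - z_{x}\sum_{k,l,m}w_{Q} + N z_{x}z_{w})
   \end{aligned}

For a single output element, $k, l$ traverse the kernel window ($kernel\\_size$), $m$ traverses $input\\_channel$, and $N$ is the number of terms in each sum. The weight sum is evaluated separately for each $output\\_channel$ index $n$. When the zero points of both the convolution input and the weights are 0, the floating-point convolution simplifies to

$$ conv(x, w) = s_{x}s_{w} (conv(x_{Q}, w_{Q})) $$

That is, the fixed-point convolution result differs from the actual output only by a scale factor, which greatly simplifies the fixed-point computation.
For this reason we use symmetric uniform quantization in most cases.

Symmetric uniform quantization fixes the $zero\\_point$ at integer 0. Taking int8 as an example (int8 is chosen only because it looks more symmetric; uint8 would also work), the quantization formulas are

.. math::
   :nowrap:

   \begin{aligned}
   s &= max(abs(x_{min}), abs(x_{max})) / 127 \\
   x_{int} &= round(x/s) \\
   x_{Q} &= clamp(-128,127,x_{int})
   \end{aligned}

To allow a faster SIMD implementation, the fixed-point range of convolution weights is restricted to [-127, 127]. The corresponding dequantization is

$$ x_{float} = x_{Q}*s $$

So both quantization and dequantization are simpler under symmetric uniform quantization.

Other methods exist, such as stochastic uniform quantization. Since symmetric uniform quantization is used in most cases, they are not covered here.

.. note::

   When implementing quantization with SIMD instructions, some MegEngine kernels use a 16-bit accumulator to hold a*b+c*d (the product accumulated once).
   Here a, b, c, and d are all qint8. This value can overflow if and only if a, b, c, and d are all -128, so avoiding that case rules out overflow.
   Because two of a, b, c, and d are always weights, the conventional approach is to define the weight quantization range as [-127, 127].

Value range statistics
~~~~~~~~~~~~~~~~~~~~~~

The key quantities in uniform quantization are $scale$ and $zero\\_point$, and both are determined by the value range of the floating-point data. There are generally two ways to determine the value range of each tensor in the network that needs to be quantized:

* Set the value range manually, based on experience. This is an option when data is scarce, or for some intermediate features.
* Run a small batch of data through the network and set the range from the collected statistics. Which statistic to use depends on the characteristics of the data.

Quantization-aware training
~~~~~~~~~~~~~~~~~~~~~~~~~~~

As noted in the uniform quantization section, the error introduced by quantization depends mainly on how far the distributions of the model's parameters and activations are from a uniform distribution. For a quantization-friendly model, it is enough to obtain the value ranges through value range statistics and then apply the corresponding quantization scheme. For a model that is not quantization-friendly, quantizing directly produces errors large enough to make the final accuracy too low to be usable. Is there a way to make the model more quantization-friendly during training?

There is. During training, we apply quantization followed by dequantization to the parameters to be quantized, which introduces the precision loss that quantization causes. Training then lets the network gradually adapt to this perturbation, so that the network performs the same after real quantization as it did in training. This is called quantization-aware training, or QAT.

Note that the quantization operation is not differentiable, so training uses an approximation: the gradient from the upstream layer skips the quantize-dequantize operation and is passed directly to the current parameter. This approximation is commonly known as the straight-through estimator.

Inference flow of a quantized network
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The sections above describe the form convolution takes in fixed point; deriving the fixed-point form of the relu activation is left as an exercise.
As for BN, most networks fold it into the preceding convolution, so it can be merged into conv.

For an existing network, we can add quantization and dequantization operations before and after each convolution layer, which replaces the floating-point computation with fixed-point computation.
Going further, we can maintain the scale of every quantized variable throughout inference, so the whole network runs without any dequantization. At the cost of a very small amount of extra scale computation, all floating-point operations in the network are converted to their fixed-point counterparts. The figure below shows the flow.

.. image:: ../../../_static/images/quantization-inference.jpg
   :align: center

Most of the operations involved in value range statistics and quantization-aware training happen during training. MegEngine provides wrappers for both, so they do not need to be implemented by hand.

This completes a brief introduction to converting a network to fixed point and to the computation scheme used after conversion.

References: https://arxiv.org/pdf/1806.08342.pdf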
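As a rough illustration of the uniform and symmetric quantization formulas in this document, the following NumPy sketch may help. It is not MegEngine's implementation: the helper names (``affine_qparams``, ``symmetric_scale``, ``quantize``, ``dequantize``) and the sample data are hypothetical, and the zero point is computed as ``round(-x_min / s)`` on the assumption that ``x_min`` should map to the fixed-point value 0.

```python
import numpy as np


def affine_qparams(x_min, x_max, qmin=0, qmax=255):
    """Scale and zero point for asymmetric uniform quantization (hypothetical helper)."""
    # Extend the range so that floating-point 0 is exactly representable.
    x_min, x_max = min(x_min, 0.0), max(x_max, 0.0)
    scale = (x_max - x_min) / (qmax - qmin)
    if scale == 0.0:  # degenerate range [0, 0]; any positive scale works
        scale = 1.0
    zero_point = int(round(-x_min / scale)) + qmin
    return scale, zero_point


def symmetric_scale(x_min, x_max, qmax=127):
    """Scale for symmetric uniform quantization; the zero point is fixed at 0."""
    return max(abs(x_min), abs(x_max)) / qmax


def quantize(x, scale, zero_point=0, qmin=0, qmax=255):
    """x_Q = clamp(qmin, qmax, round(x / s) + z)."""
    x_int = np.round(np.asarray(x, dtype=np.float64) / scale) + zero_point
    return np.clip(x_int, qmin, qmax).astype(np.int32)


def dequantize(x_q, scale, zero_point=0):
    """x_float = (x_Q - z) * s."""
    return (np.asarray(x_q, dtype=np.float64) - zero_point) * scale


# Sample data, chosen only for illustration.
x = np.array([-1.0, -0.5, 0.0, 0.5, 1.0, 3.0])

# Asymmetric (uint8-style) quantization over the observed range.
scale, zero_point = affine_qparams(x.min(), x.max())
x_q = quantize(x, scale, zero_point, 0, 255)
x_hat = dequantize(x_q, scale, zero_point)

# Symmetric (int8-style) quantization with the weight range [-127, 127].
s_sym = symmetric_scale(x.min(), x.max())
x_q_sym = quantize(x, s_sym, 0, -127, 127)
x_hat_sym = dequantize(x_q_sym, s_sym)

print("asymmetric:", scale, zero_point, x_q, np.abs(x - x_hat).max())
print("symmetric: ", s_sym, x_q_sym, np.abs(x - x_hat_sym).max())
```

The quantize-then-dequantize round trip (``x_hat``) is also the kind of perturbation that quantization-aware training introduces in the forward pass. In the symmetric case no zero point has to be stored or subtracted, which is the simplification the text describes.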