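.. _megengine-basics:

========================
MegEngine Basic Concepts
========================

.. admonition:: What this tutorial covers
   :class: note

   * An introduction to MegEngine's basic data structure :class:`~.Tensor` and the basic operations in the :mod:`~.functional` module;
   * The concepts behind the computing graph, and the basic deep learning workflow of forward propagation, back propagation and parameter updates;
   * Building on the above, a simple line-fitting task completed once with NumPy and once with MegEngine.

.. admonition:: Avoid perfectionism while learning
   :class: warning

   Treat finishing the tutorial as your first priority. The MegEngine tutorials are full of
   extended explanations and links, and most of them are not essential.
   They are meant either for readers with spare capacity or for readers whose background is still weak.
   If something is unclear, try reading the whole tutorial and running all the code first,
   then come back to fill in whatever knowledge you still need.

Basic Data Structure: Tensor
----------------------------

>>> from numpy import array
>>> from megengine import Tensor

.. figure:: ../../_static/images/ndim-axis-shape.png

MegEngine provides a data structure called the "tensor" ( :py:class:`~.Tensor` ).
Unlike the mathematical definition, the concept is closer to :py:class:`~numpy.ndarray`
in NumPy_ :footcite:p:`harris2020array`, that is, a multi-dimensional array.
Much of the unstructured data in the real world, such as text, images, audio and video,
can be abstracted and expressed as a Tensor.
The Tensor concept is usually a generalization of other, more specific concepts:

===================== ===================== ====================== =======================
Mathematics           Computer science      Geometric concept      Concrete example
===================== ===================== ====================== =======================
Scalar                Number                Point                  A score, a probability
Vector                Array                 Line                   A list
Matrix                2-D array             Plane                  An Excel spreadsheet
3-D tensor            3-D array             Volume                 An RGB image
...                   ...                   ...                    ...
n-D tensor            n-D array             High-dimensional space
===================== ===================== ====================== =======================

Taking a 2x3 matrix (a 2-D tensor) as an example, a Tensor can be initialized in MegEngine from a nested Python list:

>>> Tensor([[1, 2, 3], [4, 5, 6]])
Tensor([[1 2 3]
 [4 5 6]], dtype=int32, device=xpux:0)

Its basic attributes include :attr:`.Tensor.ndim`, :attr:`.Tensor.shape`, :attr:`.Tensor.dtype` and :attr:`.Tensor.device`.

* We can perform all kinds of scientific computing on top of the Tensor data structure;
* Tensor is also the main data structure in neural network programming: the inputs, outputs and transformations of a network are all represented as Tensors.

.. _Numpy: https://numpy.org

.. note::

   Unlike NumPy, MegEngine can also use GPU devices for more efficient computation.
   When both GPU and CPU devices are available, MegEngine uses the GPU as the default
   compute device, with no manual configuration required.
   MegEngine also supports features such as automatic differentiation (Autodiff),
   which are introduced at suitable points in later tutorials.

.. admonition:: If you have no NumPy experience at all
   :class: warning

   * See the introduction in :ref:`tensor-guide`, or read the official NumPy_ documentation and tutorials first;
   * Another good supplementary resource is the CS231n
     `Python Numpy Tutorial (with Jupyter and Colab) <https://cs231n.github.io/python-numpy-tutorial/>`_.

Tensor Operations and Computation
---------------------------------

As with NumPy's multi-dimensional arrays, the standard arithmetic operators perform
element-wise addition, subtraction, multiplication and division on Tensors:

>>> a = Tensor([[2, 4, 2], [2, 4, 2]])
>>> b = Tensor([[2, 4, 2], [1, 2, 1]])
>>> a + b
Tensor([[4 8 4]
 [3 6 3]], dtype=int32, device=xpux:0)
>>> a - b
Tensor([[0 0 0]
 [1 2 1]], dtype=int32, device=xpux:0)
>>> a * b
Tensor([[ 4 16  4]
 [ 2  8  2]], dtype=int32, device=xpux:0)
>>> a / b
Tensor([[1. 1. 1.]
 [2. 2. 2.]], device=xpux:0)

The :class:`~.Tensor` class provides a number of common methods, such as :meth:`.Tensor.reshape`,
which changes the shape of a Tensor (the operation changes neither the total number of elements nor their values):

>>> a = Tensor([[1, 2, 3], [4, 5, 6]])
>>> b = a.reshape((3, 2))
>>> print(a.shape, b.shape)
(2, 3) (3, 2)

In practice, however, we usually go through the functional interface (see :ref:`functional-guide`),
for example using :func:`.functional.reshape` to change the shape:

>>> import megengine.functional as F
>>> b = F.reshape(a, (3, 2))
>>> print(a.shape, b.shape)
(2, 3) (3, 2)

.. warning::

   A common misconception among beginners is that calling ``a.reshape()`` changes the shape of ``a`` itself.
   It does not. The vast majority of operations in MegEngine are not in-place,
   which means that calling these interfaces normally returns a new Tensor and leaves the original Tensor unchanged.

.. seealso::

   The :mod:`~.functional` module provides many more operators, organized into namespaces by use case.
   For now the most basic operators are all we need; operators dedicated to neural network programming come later.

Understanding the Computing Graph
---------------------------------

.. note::

   * MegEngine is a deep neural network learning framework based on the computing graph;
   * In deep learning, any deep neural network model, however complex, can in essence be represented as a computing graph.

We use the simple mathematical expression :math:`y=w*x+b` as an example to introduce the basic concepts of the computing graph:

.. figure:: ../../_static/images/computing_graph.png

   In MegEngine, Tensors are the data nodes and Operators are the computing nodes

The dependencies between nodes on the way from input data to output data form a directed acyclic graph (DAG), which contains:

* Data nodes: the input data :math:`x`, the parameters :math:`w` and :math:`b`, the intermediate result :math:`p`, and the final output :math:`y`;
* Computing nodes: the :math:`*` and :math:`+` in the figure stand for the multiplication and addition operators, which compute an output from the given inputs;
* Directed edges: these show the direction of data flow and capture the ordering dependencies between data nodes and computing nodes.

With the computing graph as a representation, forward propagation and back propagation become much more intuitive.

.. dropdown:: Forward propagation

   The output is obtained by running the forward computation defined by the model. In the example above:

   #. The input data :math:`x` and the parameter :math:`w` are multiplied to give the intermediate result :math:`p`;
   #. The intermediate result :math:`p` and the parameter :math:`b` are added to give the output :math:`y`;
   #. For more complex computing graphs, the dependency order of the forward computation is in essence a topological sort.

.. dropdown:: Back propagation

   Given the target to be optimized (here we simply assume it is :math:`y`), the chain rule is used
   to obtain the gradient of every parameter in the model. In the example above this means computing :math:`\nabla y(w, b)`,
   which consists of the partial derivatives :math:`\frac{\partial y}{\partial w}` and :math:`\frac{\partial y}{\partial b}`.

   This subsection relies on some calculus, which you can learn or review quickly with material available online:

   3Blue1Brown -
   `Essence of calculus [Bilibili] <https://space.bilibili.com/88461692/channel/seriesdetail?sid=1528931>`_ /
   `Essence of calculus [YouTube] <https://youtube.com/playlist?list=PLZHQObOWTQDMsr9K-rj53DwVRMYO3t5Yr>`_

   For example, to obtain the partial derivative of :math:`y` with respect to the parameter :math:`w` in the figure above,
   back propagation proceeds as shown below:

   .. figure:: ../../_static/images/back_prop.png

   #. First, :math:`y=p+b`, so :math:`\frac{\partial y}{\partial p}=1`;
   #. Propagating further back, :math:`p=w*x`, so :math:`\frac{\partial p}{\partial w}=x`;
   #. By the chain rule, :math:`\frac{\partial y}{\partial w}=\frac{\partial y}{\partial p} \cdot \frac{\partial p}{\partial w}=1 \cdot x`,
      so the partial derivative of :math:`y` with respect to :math:`w` is :math:`x`.

   The resulting gradient is itself a Tensor and is used in the next step, parameter optimization.

.. dropdown:: Parameter optimization

   A common approach is to update the parameters with gradient descent, which in the example above means updating the values of :math:`w` and :math:`b`.

   An analogy helps convey the core idea of gradient descent. Suppose you are lost in the mountains
   and need to find an inhabited village. The target is the lowest point on the plain
   (where the chance of finding people is highest). Following a gradient descent strategy,
   we look around at every step to see which direction is steepest,
   then take one step downhill along the negative gradient, and repeat. We expect this to get us down the mountain faster.

   Each completed parameter update counts as one iteration, and training a model usually takes many iterations.

   If it is not yet clear what gradient descent can achieve, do not worry: the end of this tutorial contains a more intuitive hands-on task.
   You can also find plenty of material online that explains the gradient descent algorithm.

.. code-block:: python

   w, x, b = Tensor(3.), Tensor(2.), Tensor(-1.)

   # Forward pass
   p = w * x
   y = p + b

   # Manual backward pass via the chain rule
   dydp = Tensor(1.)
   dpdw = x
   dydw = dydp * dpdw

>>> dydw
Tensor(2.0, device=xpux:0)

Automatic Differentiation and Parameter Optimization
----------------------------------------------------

With the chain rule in hand, computing gradients is not difficult. But the computing graph shown above is a very simple operation.
With a complex model the abstracted computing graph becomes more complex as well, and computing gradients by hand with the chain rule
becomes extremely tedious. It is also unforgiving of carelessness: nobody wants a single miscalculated step to lead to a long debugging session.
Another feature of MegEngine as a deep learning framework is its support for automatic differentiation,
which carries out the chain-rule derivation of parameter gradients during back propagation automatically.
It also provides convenient interfaces for parameter optimization.

Tensor Gradients and the Gradient Manager
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

In MegEngine every :class:`~.Tensor` has a :attr:`.Tensor.grad` attribute, short for gradient.

>>> print(w.grad, b.grad)
None None

The usage above is not correct, though: by default, gradient information is neither computed nor recorded during Tensor computation.

We need the gradient manager :class:`~.GradManager` to do this:

* Use :meth:`.GradManager.attach` to bind the parameters whose gradients are needed;
* Use the ``with`` keyword to record the whole forward computation and form the computing graph;
* Call :meth:`.GradManager.backward` to run back propagation automatically (automatic differentiation happens in this step).

.. code-block:: python

   from megengine.autodiff import GradManager

   w, x, b = Tensor(3.), Tensor(2.), Tensor(-1.)
   gm = GradManager().attach([w, b])
   with gm:
       y = w * x + b
       gm.backward(y)

The gradients of the parameters :math:`w` and :math:`b` are now available
(we computed :math:`\frac{\partial y}{\partial w} = x = 2.0` earlier):

>>> w.grad
Tensor(2.0, device=xpux:0)

.. warning::

   Note that :meth:`.GradManager.backward` accumulates gradients rather than replacing them. If we then run:

   >>> # Note that `w.grad` is `2.0` now, not `None`
   >>> with gm:
   ...     y = w * x + b
   ...     gm.backward(y)  # w.grad += backward_grad_for_w
   >>> w.grad
   Tensor(4.0, device=xpux:0)

   the gradient of :math:`w` is now 4 rather than 2, because the new gradient was added to the old one.

.. seealso::

   For more details, see :ref:`autodiff-guide`.

Parameters
~~~~~~~~~~

You may have noticed a detail: so far we have referred to :math:`w` and :math:`b` as parameters.
Unlike the input data :math:`x`, they are variables that are optimized and updated during model training.
MegEngine has a dedicated :class:`~.Parameter` class to distinguish them from :class:`~.Tensor`, but a Parameter is in essence a special kind of tensor.
The gradient manager therefore also maintains gradient information for :class:`~.Parameter` objects during computation.

.. code-block:: python

   from megengine import Parameter

   x = Tensor(2.)
   w, b = Parameter(3.), Parameter(-1.)

   gm = GradManager().attach([w, b])
   with gm:
       y = w * x + b
       gm.backward(y)

>>> w
Parameter(3.0, device=xpux:0)

>>> w.grad
Tensor(2.0, device=xpux:0)

.. note::

   The difference between :class:`~.Parameter` and :class:`~.Tensor` matters mainly in the parameter optimization step, which is introduced in the next subsection.

Given the parameter :math:`w` and its gradient ``w.grad``, the logic of a single gradient descent step is very simple:

.. math::

   w \leftarrow w - lr \cdot w.grad

The same operation is applied to every parameter. This introduces a hyperparameter, the learning rate (``lr``),
which controls the size of each parameter update.
Different parameters can use different learning rates, and the learning rate of a given parameter can change from one update to the next.
To keep things easy to follow at this stage, the tutorial uses a single learning rate throughout.

Optimizers
~~~~~~~~~~

MegEngine's :mod:`~.optimizer` module provides optimizers based on various common optimization strategies, such as :class:`~.SGD` and :class:`~.Adam`.
Their base class is :class:`~.Optimizer`. :class:`~.SGD` corresponds to the stochastic gradient descent algorithm and is the optimizer used in this tutorial.

.. code-block:: python

   import megengine.optimizer as optim

   x = Tensor(2.)
   w, b = Parameter(3.), Parameter(-1.)

   gm = GradManager().attach([w, b])
   optimizer = optim.SGD([w, b], lr=0.01)

   with gm:
       y = w * x + b
       gm.backward(y)
       optimizer.step().clear_grad()

Calling :meth:`.Optimizer.step` performs one parameter update, and calling :meth:`.Optimizer.clear_grad` clears :attr:`.Tensor.grad`.

>>> w
Parameter(2.98, device=xpux:0)

Many beginners forget to clear the gradients before the next round of parameter updates and end up with incorrect results.

.. warning::

   The inputs accepted by :class:`~.Optimizer` must be of type :class:`~.Parameter`, not :class:`~.Tensor`; otherwise an error is raised.

   .. code-block:: shell

      TypeError: optimizer can only optimize Parameters, but one of the params is ...

.. seealso::

   For more details, see :ref:`optimizer-guide`.

.. admonition:: Choosing the optimization target

   To improve a model's predictions, we need a suitable optimization target.

   Note, however, that the expression used above serves only to explain the computing graph.
   Its output :math:`y` is usually not the thing that actually needs to be optimized.
   It is merely the output of the model, and optimizing that value alone is meaningless.

   So how do we judge how well a model predicts? The core principle is: **the fewer the mistakes, the better the performance.**

   In general, the target we optimize is called the loss, which measures the difference between the model's output and the actual result.
   The lower we can drive the loss, the better the model predicts on the current data.
   For now we can assume that a model that performs well on the current dataset will also predict reasonably well on new input data.

   This description may feel abstract, so let us make it concrete through practice.

Exercise: Fitting a Straight Line
---------------------------------

Suppose you are given a dataset :math:`\mathcal{D}=\{ (x_i, y_i) \}`, where :math:`i \in \{1, \ldots, 100 \}`,
and you want to predict a suitable value of :math:`y` for any future input :math:`x`.

.. dropdown:: Source code of get_point_examples()

   The following code randomly generates these data points:

   .. code-block:: python

      import numpy as np

      np.random.seed(20200325)

      def get_point_examples(w=5.0, b=2.0, nums_example=100, noise=5):
          """Sample points around the line y = w * x + b with uniform noise."""
          x = np.zeros((nums_example,))
          y = np.zeros((nums_example,))

          for i in range(nums_example):
              x[i] = np.random.uniform(-10, 10)
              y[i] = w * x[i] + b + np.random.uniform(-noise, noise)

          return x, y

   As you can see, the data points are generated from the line :math:`y = 5.0 * x + 2.0` plus some random noise.

   In this tutorial, however, we should assume that we do not have this omniscient view.
   All we have are the coordinates of the data points. We do not know the ideal :math:`w=5.0` and :math:`b=2.0`,
   and can only update the parameters iteratively from the data we have.
   We judge the quality of the final model (for example, how well the line fits) through the loss or other means;
   later tutorials show more rigorous ways of doing this.

>>> x, y = get_point_examples()
>>> print(x.shape, y.shape)
(100,) (100,)

.. figure:: ../../_static/images/point-data.png

Visualizing the data (see the figure above) shows that the points are well suited to being fitted with a straight line :math:`f(x) = w * x + b`.

>>> def f(x):
...     return w * x + b

Passing the abscissa :math:`x` of each sample point through our model yields a predicted output :math:`\hat{y} = f(x)`.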
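The model above can be applied to every sample in one call. The following is a minimal sketch, assuming NumPy arrays as input. The values of ``w`` and ``b`` and the three sample points are made up for illustration and are not the tutorial's data:

.. code-block:: python

   import numpy as np

   # Hypothetical parameter values, chosen only for illustration.
   w, b = 1.5, -0.5

   def f(x):
       # w and b are scalars, so they are broadcast across every element of x.
       return w * x + b

   x = np.array([-2.0, 0.0, 4.0])  # three made-up sample abscissas
   y_hat = f(x)                    # one prediction per sample, no Python loop

   print(y_hat.shape)     # (3,)
   print(y_hat.tolist())  # [-3.5, -0.5, 5.5]

Because ``w`` and ``b`` are scalars, broadcasting yields one prediction per entry of ``x`` without an explicit loop.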
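The gradient descent strategy used in this tutorial is batch gradient descent:
in every iteration, the losses of the predictions on all data points are accumulated into an overall loss and averaged,
and this average is the optimization target used to compute gradients and update the parameters.
The benefit is that the influence of noisy data points is reduced, so each update moves in a direction that is more balanced overall.
From the point of view of computational efficiency, it also lets us take full advantage of **vectorization** to save time
(verified in the further reading section).

Designing and Implementing the Loss Function
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

For a model like this, how do we measure the loss :math:`l` between the output :math:`\hat{y} = f(x)` and the true value :math:`y`?
Follow the line of reasoning below:

#. The most obvious approach is to compute the error directly: for each :math:`(x_i, y_i)` and :math:`\hat{y_i}`,
   let :math:`l_i = l(\hat{y_i},y_i) = \hat{y_i} - y_i`.
#. This idea is natural, but for a regression problem the loss :math:`l_i` obtained this way can be positive or negative.
   When we compute the average loss :math:`l = \frac{1}{n} \sum_{i=1}^{n} (\hat{y_i} - y_i)`, positive and negative values cancel out.
   For example, with :math:`y_1 = 50, \hat{y_1} = 100` and :math:`y_2 = 50, \hat{y_2} = 0`,
   the average loss is :math:`l = \frac{1}{2} \big( (100 - 50) + (0 - 50) \big) = 0`, which is not what we want.

   We want the errors on individual samples to accumulate, so each must be non-negative, while remaining convenient for later computation.
#. One possible improvement is the mean absolute error (MAE): :math:`l = \frac{1}{n} \sum_{i=1}^{n} |\hat{y_i} - y_i|`

   But note that we optimize the model with gradient descent, which calls for an objective function (the loss function)
   that is as smooth and differentiable as possible, and easy to differentiate and compute.
   The loss function more commonly used in regression problems is therefore the mean squared error (MSE):

   .. math::

      l = \frac{1}{n} \sum_{i=1}^{n} (\hat{y_i} - y_i)^2

.. note::

   * Some machine learning courses multiply the loss by :math:`\frac{1}{2}` so that the factor of 2 produced by differentiating the square cancels.
     This tutorial does not do so (MegEngine supports automatic differentiation, and the result can be compared with the hand-derived code);
   * We can also explain the choice of MSE as the loss function from a probabilistic point of view:
     if the errors are assumed to follow a normal distribution with mean :math:`\mu = 0`, then minimizing the MSE is the maximum likelihood estimate of the parameters.
     For a detailed explanation see the CS229 `lecture notes <https://see.stanford.edu/materials/aimlcs229/cs229-notes1.pdf>`_.

   If these details are unfamiliar, do not worry: they do not affect completing the task in this tutorial.

Suppose the model has produced predictions ``pred`` on 4 samples. We now compute the MSE loss between them and the true values ``real``:

>>> pred = np.array([3., 3., 3., 3.])
>>> real = np.array([2., 8., 6., 1.])
>>> np_loss = np.mean((pred - real) ** 2)
>>> np_loss
9.75

MegEngine also wraps common loss functions. Here we can use :func:`~.nn.square_loss`:

>>> mge_loss = F.nn.square_loss(Tensor(pred), Tensor(real))
>>> mge_loss
Tensor(9.75, device=xpux:0)

Note: because the loss function is a concept that comes from deep learning, the related interfaces should be called through :mod:`.functional.nn`.

.. seealso::

   * If the operations above are unclear, see :ref:`element-wise-operations` or browse the corresponding API documentation;
   * More common loss functions can be found in :ref:`loss-functions`.

Complete Implementation
~~~~~~~~~~~~~~~~~~~~~~~

We give a NumPy implementation and a MegEngine implementation side by side for comparison:

* The NumPy implementation requires deriving :math:`\frac{\partial l}{\partial w}` and :math:`\frac{\partial l}{\partial b}` by hand,
  whereas in MegEngine a call to ``gm.backward(loss)`` is enough;
* The input data :math:`x` is a vector (1-D array) of shape :math:`(100,)`.
  When it is combined with the scalars :math:`w` and :math:`b`, the scalars are broadcast to the same shape before the computation.
  This exploits vectorization and is more efficient; for details see :ref:`tensor-broadcasting`.

.. panels::
   :container: +full-width
   :card:

   NumPy
   ^^^^^

   .. code-block:: python

      import numpy as np

      x, y = get_point_examples()

      w = 0.0
      b = 0.0

      def f(x):
          return w * x + b

      nums_epoch = 5
      for epoch in range(nums_epoch):

          # optimizer.clear_grad()
          w_grad = 0
          b_grad = 0

          # forward and calculate loss
          pred = f(x)
          loss = ((pred - y) ** 2).mean()

          # backward(loss)
          w_grad += (2 * (pred - y) * x).mean()
          b_grad += (2 * (pred - y)).mean()

          # optimizer.step()
          lr = 0.01
          w = w - lr * w_grad
          b = b - lr * b_grad

          print(f"Epoch = {epoch}, w = {w:.3f}, b = {b:.3f}, loss = {loss:.3f}")

   ---

   MegEngine
   ^^^^^^^^^

   .. code-block:: python

      import megengine.functional as F
      from megengine import Tensor, Parameter
      from megengine.autodiff import GradManager
      import megengine.optimizer as optim

      x, y = get_point_examples()
      # Convert the NumPy data to Tensors once, before training starts.
      x, y = Tensor(x), Tensor(y)

      w = Parameter(0.0)
      b = Parameter(0.0)

      def f(x):
          return w * x + b

      gm = GradManager().attach([w, b])
      optimizer = optim.SGD([w, b], lr=0.01)

      nums_epoch = 5
      for epoch in range(nums_epoch):
          with gm:
              pred = f(x)
              loss = F.nn.square_loss(pred, y)
              gm.backward(loss)
              optimizer.step().clear_grad()

          print(f"Epoch = {epoch}, w = {w.item():.3f}, b = {b.item():.3f}, loss = {loss.item():.3f}")

The two implementations should produce the same output.

Because we use batch gradient descent, every iteration is based on the average loss and gradient computed over all of the data.
To run several iterations we repeat the training for several epochs (one complete pass over the data is called one epoch of training).
Under batch gradient descent the parameters are updated only once per epoch. Later we will meet cases where one epoch contains many iterations.
These terms are very common in deep learning discussions and come up repeatedly in later tutorials.

After 5 epochs of training (5 iterations), the loss keeps decreasing and the parameters :math:`w` and :math:`b` keep changing.

.. code-block:: shell

   Epoch = 0, w = 3.486, b = -0.005, loss = 871.968
   Epoch = 1, w = 4.508, b = 0.019, loss = 86.077
   Epoch = 2, w = 4.808, b = 0.053, loss = 18.446
   Epoch = 3, w = 4.897, b = 0.088, loss = 12.515
   Epoch = 4, w = 4.923, b = 0.123, loss = 11.888

Visualizing the result shows that our line fits the data quite well.

.. figure:: ../../_static/images/line.png

This is one small step on our MegEngine journey: we have successfully used MegEngine to complete the line-fitting task!

.. seealso::

   Source code for this tutorial: :docs:`examples/beginner/megengine-basic-fit-line.py`

Summary: Simple Linear Regression
---------------------------------

Let us try to state this in technical terms. Regression analysis that involves only two variables is called simple regression analysis.
If there is only one independent variable :math:`X`, and the relationship between the dependent variable :math:`Y` and :math:`X` is approximately linear,
we can set up a simple linear regression equation and predict the value of :math:`Y` from the value of :math:`X`. This is simple linear regression prediction.
The simple linear regression model :math:`y_{i}=\alpha+\beta x_{i}+\varepsilon_{i}` is the simplest machine learning model and is well suited to beginners.
The random disturbance term :math:`\varepsilon_{i}` is a random variable that cannot be observed directly; it is the noise we introduced when generating the data.
By observing the existing data points we learned :math:`w` and :math:`b`,
and obtained the sample regression equation :math:`\hat{y}_{i}= wx_{i}+b` as our simple linear regression prediction model.

The parameters of a simple linear regression equation are usually estimated with the least squares method,
solving the normal equations to obtain a closed-form solution. This tutorial does not cover that approach.
We chose instead to tune the parameters iteratively with gradient descent,
first to demonstrate basic MegEngine functionality such as :class:`~.GradManager` and :class:`~.Optimizer`,
and second so that we can later move naturally to optimizing the parameters of nonlinear models such as neural networks, where least squares no longer applies.

This is a good moment to recall the definition of "machine learning" given by Tom Mitchell in
`Machine Learning <http://www.cs.cmu.edu/~tom/mlbook.html>`_ :footcite:p:`10.5555/541177`:

   A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P,
   if its performance at tasks in T, as measured by P, improves with experience E.

Put simply, a program has learned from experience E if that experience improves its performance P on some class of tasks T.

In this tutorial, the task T is fitting a straight line and the experience E comes from the data points we already have.
Based on the distribution of the data points, we naturally chose a simple linear model to predict the output.
To evaluate the model (performance P) we used the MSE loss as the objective function, and we optimized the loss with gradient descent.

The next tutorial introduces the multiple linear regression model and gives a deeper understanding of machine learning concepts.
Before that, you may want to spend some time digesting what has been covered so far, and practice.

.. admonition:: Tasks, models and optimization algorithms

   Machine learning has a great many kinds of models, and gradient descent is not the only optimization algorithm.
   Later tutorials cover the multiple linear regression model and linear classification models,
   and move from linear models to the fully connected neural network models of deep learning.
   Different models suit different machine learning tasks, so model selection matters.

   The models used in deep learning are called neural networks. Part of their appeal is that
   they can be applied to many tasks and sometimes perform far better than traditional machine learning models.
   Yet their structure is not complicated, and the process of optimizing them is much the same as in this tutorial.
   Recall that any neural network model can be expressed as a computing graph, and we have already had a first glimpse of how that works.

.. admonition:: Try tuning the hyperparameters

   We mentioned a concept called the hyperparameter. Hyperparameters are parameters that must be set by hand and usually cannot be learned by the model itself.

   You may have noticed that we used the same learning rate for :math:`w` and :math:`b` in every iteration.
   After 5 iterations, :math:`w` is already close to its ideal value, while :math:`b` still needs further updates.
   Try changing the value of ``lr``, or increasing the number of training epochs, and see whether the loss can be reduced further.

.. admonition:: Does a lower loss always mean a better model?

   Since we chose the loss as the optimization target, ideally our model should fit as many of the existing data points as possible to reduce the loss.

   The limitation is that the points we have are always training data. In a machine learning task
   we may train the model on dataset A and then use it on dataset B from the real world.
   In that situation, pushing the training loss to its extreme can actually lead to overfitting.

   .. figure:: ../../_static/images/overfitting.png

      Christopher M Bishop `Pattern Recognition and Machine Learning
      <https://www.microsoft.com/en-us/research/uploads/prod/2006/01/Bishop-Pattern-Recognition-and-Machine-Learning-2006.pdf>`_
      :footcite:p:`10.5555/1162264` - Figure 1.4

   The data points in the figure above come from a trigonometric function plus some noise. We choose a polynomial regression model and optimize it,
   hoping that the polynomial curve fits the data points as closely as possible. When the model is allowed to fit the training points too closely
   (in Bishop's figure this happens as the polynomial order grows), we end up in the situation shown in the last panel.
   The fit on the existing data points is then perfect (the loss is 0), but on new input data
   the predictions may be worse than those of an earlier, less closely fitted model. The training loss alone therefore cannot serve as the measure of model performance.

   Later tutorials present more rigorous solutions.

Further Reading
---------------

.. dropdown:: :fa:`eye,mr-1` A simple check that vectorization beats the for loop

   In NumPy, vectorized operations are faster than for loops, which is easy to verify:

   .. code-block:: python

      import time

      import numpy as np

      n = 1000000
      a = np.random.rand(n)
      b = np.random.rand(n)
      c1 = np.zeros(n)

      # Element-by-element multiplication in a Python loop
      time_start = time.time()
      for i in range(n):
          c1[i] = a[i] * b[i]
      time_end = time.time()
      print('For loop version:', str(1000 * (time_end - time_start)), 'ms')

      # The same multiplication as a single vectorized expression
      time_start = time.time()
      c2 = a * b
      time_end = time.time()
      print('Vectorized version:', str(1000 * (time_end - time_start)), 'ms')

      print(c1 == c2)

   .. code-block:: shell

      For loop version: 460.2222442626953 ms
      Vectorized version: 3.6432743072509766 ms
      [ True  True  True ...  True  True  True]

   Behind this is data parallelism through SIMD, applied inside compiled loops rather than the Python interpreter.
   Many blog posts explain it in detail. Recommended reading:

   * `Why is vectorization, faster in general, than loops?
     <https://stackoverflow.com/questions/35091979/why-is-vectorization-faster-in-general-than-loops>`_
   * `Nuts and Bolts of NumPy Optimization Part 1: Understanding Vectorization and Broadcasting
     <https://blog.paperspace.com/numpy-optimization-vectorization-and-broadcasting/>`_

   Likewise, vectorized code in MegEngine is faster than the equivalent for loop, especially when GPU parallelism is used.

.. dropdown:: :fa:`eye,mr-1` Scikit-learn documentation: underfitting and overfitting

   Scikit-learn is a very well-known Python machine learning library that implements many classic machine learning algorithms.
   Its model selection documentation includes code that illustrates underfitting and overfitting:

   https://scikit-learn.org/stable/auto_examples/model_selection/plot_underfitting_overfitting.html

   Interested readers can use this as a way to get to know Scikit-learn. The next tutorial uses the dataset interfaces it provides.

References
----------

.. footbibliography::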