GPU Memory Optimization Strategies: PDF Download
Reposted from: http://www.python222.com/article/1214
Main content:
I. What is the gradient accumulation approach to GPU memory optimization?
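Normally the gradient is computed and the parameters are updated after every batch. With gradient accumulation,
gradients are summed over a specified number of batches and the parameters are updated once for the whole group.
This raises the effective batch_size even when the per-step batch_size is small, which makes it a technique for
reducing GPU memory usage. As models and datasets keep growing and GPU memory becomes tight, batch_size often has
to be set very small; gradient accumulation restores a larger effective batch_size. (A very small batch_size
tends to make training unstable and convergence slower.)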
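The relationship between the per-step batch size and the effective batch size is simple arithmetic. The sketch below is only an illustration; the function names and the example sizes (64 and 8) are assumptions, not values from the original article.

```python
import math


def accumulation_steps(target_batch_size: int, micro_batch_size: int) -> int:
    """Number of small batches to accumulate to reach at least the target size."""
    if target_batch_size <= 0 or micro_batch_size <= 0:
        raise ValueError("batch sizes must be positive")
    return math.ceil(target_batch_size / micro_batch_size)


def effective_batch_size(micro_batch_size: int, accum_steps: int) -> int:
    """Samples that contribute to a single parameter update."""
    return micro_batch_size * accum_steps


# Hypothetical case: a batch of 64 is wanted, but only 8 samples fit in GPU memory.
steps = accumulation_steps(target_batch_size=64, micro_batch_size=8)
print(steps, effective_batch_size(8, steps))
```

When the target is not a multiple of the small batch size, rounding up gives an effective batch size slightly above the target.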
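Gradient accumulation is a deep learning training technique that accumulates the gradients of several
mini-batches across successive backward passes (backpropagation) and then updates the model parameters once.
Its main purpose is to train effectively with large batches when memory is limited, which can improve model
performance. The details are explained in the numbered points below.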
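Gradient accumulation has the advantages described in this section, but it also has some caveats. A large number
of accumulation steps lowers the update frequency, which can slow training. Accumulation may also hurt the
behavior of some optimization algorithms, because updating the parameters only once per group can affect how
quantities such as momentum and the learning rate are computed. In short, gradient accumulation is an effective
way to train deep learning models with large batches when memory is limited, which can improve performance and
efficiency. When using it, the parameters need to be set and tuned for the specific situation.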
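To make the update-frequency caveat concrete, the sketch below counts parameter updates per epoch for a few accumulation step values. The dataset size, batch size, and function name are hypothetical, and the sketch assumes that a trailing incomplete group still triggers an update.

```python
import math


def optimizer_updates_per_epoch(num_samples: int, micro_batch_size: int, accum_steps: int) -> int:
    """Parameter updates in one pass over the data (assumes a partial last group still updates)."""
    micro_batches = math.ceil(num_samples / micro_batch_size)
    return math.ceil(micro_batches / accum_steps)


# Hypothetical dataset of 10,000 samples processed 8 at a time.
for accum_steps in (1, 4, 16):
    print(accum_steps, optimizer_updates_per_epoch(10_000, 8, accum_steps))
```

Anything that is scheduled per update, such as learning rate warmup or decay, is therefore best counted in parameter updates rather than in small batches.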
In the conventional update scheme, the loss is computed and the parameters are updated for every batch.
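The original code for this step is not present in the text, so the following is a minimal framework-free sketch of a conventional loop rather than the author's implementation. It assumes a one-parameter model y ≈ w * x with a mean squared error loss; the data, learning rate, and function names are illustrative.

```python
def batch_gradient(w, batch):
    """Gradient of the mean squared error of y ≈ w * x over one batch."""
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)


def train_per_batch(data, batch_size, lr, epochs):
    w = 0.0
    for _ in range(epochs):
        for start in range(0, len(data), batch_size):
            batch = data[start:start + batch_size]
            grad = batch_gradient(w, batch)  # loss gradient for this batch only
            w -= lr * grad                   # parameters change after every batch
    return w


# Toy data generated from y = 3x, so training should recover w close to 3.
data = [(float(x), 3.0 * x) for x in range(1, 9)]
w = train_per_batch(data, batch_size=2, lr=0.01, epochs=50)
print(w)
```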
1. Background: Deep learning models are usually trained with mini-batch stochastic gradient descent
(Mini-batch Stochastic Gradient Descent, SGD for short). A gradient is computed for each mini-batch and used to
update the model parameters. In some cases, however, GPU memory limits make it impossible to process a large
batch in one pass. This caps the batch size and can in turn hurt training efficiency and model performance.
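2. How gradient accumulation works: The basic idea is to accumulate the gradients of several mini-batches and
then update the model parameters once. Concretely, the gradient of each mini-batch is computed and added to a
running total. Only when the number of accumulated mini-batches reaches a set count (usually called the
accumulation steps) is a parameter update performed.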
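3. Benefits: The main benefits of gradient accumulation are the following:
a. Memory efficiency: Gradient accumulation makes it possible to train with a larger batch when memory is
limited. Although the gradient of every mini-batch is accumulated, the accumulation itself needs no extra
memory, because the gradients are summed into the same buffers. Compute resources can therefore be used fully
and training efficiency improves.
b. Stability: A large batch may carry more complete and richer information and reduces the variance of the
gradient. This gives a more stable gradient signal during training and helps the model converge faster to a
good state.
c. Control over update frequency: The number of accumulation steps controls how often the parameters are
updated. It can be adjusted flexibly during training to fit different hardware limits and training needs.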
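The accumulation procedure in point 2 can be sketched as follows. This is an illustration under stated assumptions, not the author's code: it uses a one-parameter model y ≈ w * x with a mean squared error loss, equally sized mini-batches, and hypothetical names and hyperparameters. Each mini-batch gradient is divided by the accumulation steps so that the accumulated value equals the gradient of one large batch.

```python
def batch_gradient(w, batch):
    """Gradient of the mean squared error of y ≈ w * x over one batch."""
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)


def train(data, micro_batch_size, accum_steps, lr, epochs):
    """Update w once every `accum_steps` mini-batches (accum_steps=1 is the conventional loop)."""
    w = 0.0
    for _ in range(epochs):
        grad_buffer = 0.0  # a single buffer, reused for every mini-batch
        pending = 0        # mini-batches accumulated since the last update
        for start in range(0, len(data), micro_batch_size):
            micro = data[start:start + micro_batch_size]
            # Scale so the accumulated sum is the mean gradient over the whole group.
            grad_buffer += batch_gradient(w, micro) / accum_steps
            pending += 1
            if pending == accum_steps:
                w -= lr * grad_buffer  # one parameter update for the whole group
                grad_buffer = 0.0
                pending = 0
        # Simplification: an incomplete group left at the end of an epoch is dropped.
    return w


data = [(float(x), 3.0 * x) for x in range(1, 9)]

# One batch of 8 versus 4 accumulated mini-batches of 2: same effective batch size.
w_large = train(data, micro_batch_size=8, accum_steps=1, lr=0.01, epochs=3)
w_accum = train(data, micro_batch_size=2, accum_steps=4, lr=0.01, epochs=3)
print(abs(w_large - w_accum) < 1e-9)
```

The two runs agree up to floating-point rounding because the mini-batches are the same size. With unequal mini-batches, each gradient would need to be weighted by its share of the samples instead of by 1 / accum_steps.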