Large Models (LLMs) GPU Memory Interview Questions: PDF Download
Reposted from: http://www.python222.com/article/1219
Main content:
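1. Roughly how big are large models, and how big are the model files?
Released model files are generally stored in fp16. For a model with n B (n billion) parameters, the model file takes about 2n GB, and loading it into GPU memory in fp16 for inference also takes about 2n GB. In public announcements, such a model is described as having n billion parameters (written in Chinese as 10n 亿, since 1 B = 10 亿 = 10^9).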
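The 2n GB rule is just the parameter count multiplied by the bytes per parameter. A minimal sketch of that arithmetic follows; the function name and the precision table are illustrative rather than taken from any library, and 1 GB is taken as 10^9 bytes so that 1 B parameters at 1 byte each is 1 GB.

```python
# Bytes used to store one parameter at each precision (illustrative table).
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_size_gb(n_billion: float, dtype: str = "fp16") -> float:
    """Approximate size in GB of the weights of a model with n_billion parameters.

    Assumes 1 GB = 1e9 bytes and ignores file-format overhead.
    """
    return n_billion * BYTES_PER_PARAM[dtype]


if __name__ == "__main__":
    for n in (7, 13, 65):
        print(f"{n}B model in fp16: about {weight_size_gb(n)} GB")
```

The same number applies to the file on disk and to the weights once loaded for inference; activations and the KV cache come on top of it.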
2. Can vicuna 65b be trained on 4 * v100 32G?
No.
• First, the llama 65b weights alone need 5 * v100 32G before they can be fully loaded onto the GPUs.
• Second, vicuna uses flash-attention to speed up training, which does not currently support the v100; it requires a GPU of the turing architecture or newer.
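The first point can be checked by dividing the fp16 weight size by the memory of one card. This is a rough sketch with a hypothetical helper name; it counts weights only, so gradients, optimizer states and activations would push the real requirement for training much higher.

```python
import math


def min_gpus_for_weights(n_billion: float, gpu_mem_gb: float,
                         bytes_per_param: float = 2.0) -> int:
    """Smallest number of GPUs whose combined memory can hold the weights.

    Assumes the weights can be sharded evenly and that nothing else
    occupies GPU memory, which is optimistic.
    """
    weights_gb = n_billion * bytes_per_param
    return math.ceil(weights_gb / gpu_mem_gb)


if __name__ == "__main__":
    # 65B parameters in fp16 is 130 GB; four 32 GB cards offer only 128 GB.
    print(min_gpus_for_weights(65, 32))
```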
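3. What if you just want to try a 65b model but do not have much GPU memory?
You need roughly 50 GB of GPU memory at minimum. You can apply LoRA[6] on top of the llama-65b-int4 (gptq) model, although customized versions of the various libraries have to be installed.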
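To see why about 50 GB can be enough, it helps to separate the frozen int4 base weights from the small set of trainable LoRA parameters. The sketch below is an estimate under stated assumptions, not a measurement: the LoRA rank of 8 and the choice to adapt two attention projections per layer are illustrative choices, the hidden size 8192 and 80 layers are the published LLaMA-65B dimensions, and the 16 bytes per trainable parameter comes from the training rule in question 5.

```python
def lora_param_count(d_in: int, d_out: int, rank: int, n_matrices: int) -> int:
    """LoRA adds two low-rank factors (d_in x rank and rank x d_out) per adapted matrix."""
    return rank * (d_in + d_out) * n_matrices


HIDDEN = 8192          # LLaMA-65B hidden size
LAYERS = 80            # LLaMA-65B transformer layers
RANK = 8               # assumption: a commonly used LoRA rank
ADAPTED_PER_LAYER = 2  # assumption: adapt two attention projections per layer

# Frozen base model stored at 4 bits (0.5 bytes) per parameter.
base_weights_gb = 65 * 0.5

# Trainable adapter parameters and their training footprint at 16 bytes each.
adapter_params = lora_param_count(HIDDEN, HIDDEN, RANK, LAYERS * ADAPTED_PER_LAYER)
adapter_train_gb = adapter_params * 16 / 1e9

if __name__ == "__main__":
    print(f"int4 base weights: {base_weights_gb} GB")
    print(f"LoRA parameters:   {adapter_params}")
    print(f"adapter training:  {adapter_train_gb:.2f} GB")
```

Under these assumptions the adapters cost well under 1 GB, so most of the remaining budget goes to activations and quantization overhead rather than to trainable weights.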
4. How much GPU memory does inference with an nB model require?
Assuming all model parameters are fp16, 2n GB of GPU memory is enough to load the model.
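5. How much GPU memory does training an nB model require?
Base GPU memory: model parameters + gradients + optimizer states, 16n GB in total.
Activations also occupy GPU memory; how much depends on max len and batch size.
Explanation: the optimizer part must use fp32 (fp16 apparently makes training unstable), so the per-parameter cost in bytes is 2+2+12=16; see the ZeRO paper.
Note: the arithmetic above is not very intuitive, so here is an example:
A 7B vicuna can barely be trained under fsdp with 160 GB of GPU memory in total. (By the calculation above, 7*16=112 GB is the base GPU memory.)
So for full-parameter training, 20n GB of GPU memory is roughly the minimum requirement, unless host memory is plentiful, in which case a GPU memory shortfall can be covered by offloading to host memory.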
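The 2+2+12 breakdown and the 7B example can be reproduced with a few lines of Python. This is a rough sketch: the function names are hypothetical, the split of the 12 optimizer bytes into an fp32 weight copy plus two Adam states follows the mixed-precision accounting described in the ZeRO paper, and 1 GB is taken as 10^9 bytes.

```python
# Bytes per parameter for mixed-precision training with Adam.
FP16_WEIGHTS = 2
FP16_GRADS = 2
FP32_OPTIMIZER = 4 + 4 + 4  # fp32 weight copy + Adam momentum + Adam variance


def training_base_gb(n_billion: float) -> float:
    """Base training memory in GB: weights + gradients + optimizer states."""
    return n_billion * (FP16_WEIGHTS + FP16_GRADS + FP32_OPTIMIZER)


def full_training_budget_gb(n_billion: float, bytes_per_param: float = 20.0) -> float:
    """Rule-of-thumb minimum from the text: about 20 bytes per parameter."""
    return n_billion * bytes_per_param


if __name__ == "__main__":
    base = training_base_gb(7)
    budget = full_training_budget_gb(7)
    # In the 160 GB example, whatever exceeds the base is left for
    # activations, buffers and fragmentation.
    left_over = 160 - base
    print(f"base: {base} GB, 20n budget: {budget} GB, left of 160 GB: {left_over} GB")
```

The gap between the 112 GB base and the 160 GB that barely works is what activations and runtime overhead consume, which is why the budget depends on max len and batch size.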
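6. How do you estimate the RAM a model needs?
First, we need to understand how to estimate the approximate RAM a model needs from its parameter count, which is an important reference in practice. We use this estimate to set batch_size, set the model precision, and choose the fine-tuning method, the parameter distribution method, and so on.
Next, we take the LLaMA-6B model as an example and estimate roughly how much memory it needs.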