Commit Graph

19 Commits

Author SHA1 Message Date
王晶 f6b12f4631 !760 Support the Mixtral 8x7B MoE model
Merge pull request !760 from 王晶/modellink
2024-03-04 10:38:57 +00:00
guhangsong 9b4153e4cf !480 Support instruction fine-tuning
Merge pull request !480 from guhangsong/sft
2024-01-30 02:13:31 +00:00
liuyanghan 76e9d607e1 modellink implements distributed LoRA fine-tuning based on Megatron 2024-01-08 19:51:37 +08:00
yangcheng 96b56395e4 llama2_7b 2024-01-05 10:27:02 +08:00
fengliangjun 15963528c8 rename ascendspeed to modellink 2023-12-21 11:02:07 +08:00
mhh001 ad6f867d94 Upgrade the transformer framework 2023-11-14 20:44:46 +08:00
g00841271 981428f3b0 (Part Ⅳ) replace assert with check_* 2023-10-27 16:25:13 +08:00
Zhihong Chen 5bf57ecb87 change dict[] to dict.get() 2023-10-19 10:23:32 +08:00
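mhh001 e3aeab62e7 (Part 2) Upgrade tensor parallel from v1 to v3. This commit migrates the entire ascendspeed framework to the new TP, deletes the original mpu directory, updates the affected references, and adapts some code to the new version. 2023-10-13 10:45:41 +08:00
g00841271 3483bddb1c (Part Ⅰ) replace assert with check_* 2023-10-11 15:25:19 +08:00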
fengliangjun 7a21f0bf58 up 2023-09-26 14:30:47 +08:00
fengliangjun 260e8eea8f create megatron core 2023-07-24 15:00:57 +08:00
wiyr 6304cab765 remove useless code 2023-06-20 16:54:12 +08:00
machangjun 2d8c6fee9d add bloom st and adapt new data load method
modify bloom st run

modify times

add new pretrain_bloom.py

add st
2023-06-17 17:36:17 +08:00
wiyr 2f826f7351 can run with bloom7b and pass ci 2023-06-12 14:42:29 +08:00
chenzomi 37cc0b949d change megatron to ascendspeed 2023-06-10 21:26:01 +08:00
fengliangjun 106a415556 initial AscendSpeed 2023-06-09 16:15:23 +08:00
wangyixian d55d341fe1 Adapt the bloom 7.1b model to the AscendSpeed framework, which is jointly completed by liulinfeng and wangyixian 2023-06-06 22:30:19 +08:00
chenzomi e4a120a662 fork megatron-deepspeed code. 2023-05-25 14:49:59 +08:00