Qwen 【Model contributed by Ascend】

Simplified Chinese | English

Contents

- Qwen-7B
- Qwen-14B
- Qwen-72B

Qwen-7B

Training

Here's a hardware summary of pre-training Qwen-7B:

| Hardware | Value           |
|----------|-----------------|
| NPU      | 8 x Ascend NPUs |

Script

  1. Clone the repository to your local server:

    git clone https://gitee.com/ascend/ModelLink.git
    git clone https://github.com/NVIDIA/Megatron-LM.git
    cd Megatron-LM
    git checkout -f bcce6f
    cp -r megatron ../ModelLink/
    cd ..
    cd ModelLink
    git checkout 1.0
    mkdir logs
    mkdir model_from_hf
    mkdir dataset
    mkdir ckpt
    
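    A quick sanity check that step 1 left the copied megatron package in place (a sketch; run from inside ModelLink):

    # the cp -r above should have produced ModelLink/megatron
    ls megatron/__init__.py
    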
  2. Build environment

    # python3.8
    conda create -n test python=3.8
    conda activate test
    
    # install torch and torch_npu
    pip install torch-2.1.0-cp38-cp38-manylinux2014_aarch64.whl
    pip install torch_npu-2.1.0*-cp38-cp38-linux_aarch64.whl
    pip install apex-0.1_ascend*-cp38-cp38-linux_aarch64.whl
    
    # install MindSpeed
    git clone https://gitee.com/ascend/MindSpeed.git
    cd MindSpeed
    git checkout 224ae35e8fc96778f957029d1371ddb623452a50
    pip install -r requirements.txt
    pip install -e .
    cd ..
    
    # install other packages
    pip install -r requirements.txt
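
    Optionally verify the install before moving on; a minimal sketch (torch.npu is the device namespace registered by torch_npu):

    # should print True and the number of visible NPUs
    python -c "import torch, torch_npu; print(torch.npu.is_available(), torch.npu.device_count())"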
    
  3. Prepare pretrained weights and tokenizer. Download the Qwen-7B checkpoint from Hugging Face (https://huggingface.co/Qwen/Qwen-7B):

    mkdir ./model_from_hf/Qwen-7B/
    cd ./model_from_hf/Qwen-7B/
    wget https://huggingface.co/Qwen/Qwen-7B/resolve/main/cache_autogptq_cuda_256.cpp
    wget https://huggingface.co/Qwen/Qwen-7B/resolve/main/cache_autogptq_cuda_kernel_256.cu
    wget https://huggingface.co/Qwen/Qwen-7B/resolve/main/config.json
    wget https://huggingface.co/Qwen/Qwen-7B/resolve/main/configuration_qwen.py
    wget https://huggingface.co/Qwen/Qwen-7B/resolve/main/cpp_kernels.py
    wget https://huggingface.co/Qwen/Qwen-7B/resolve/main/generation_config.json
    wget https://huggingface.co/Qwen/Qwen-7B/resolve/main/model-00001-of-00008.safetensors
    wget https://huggingface.co/Qwen/Qwen-7B/resolve/main/model-00002-of-00008.safetensors
    wget https://huggingface.co/Qwen/Qwen-7B/resolve/main/model-00003-of-00008.safetensors
    wget https://huggingface.co/Qwen/Qwen-7B/resolve/main/model-00004-of-00008.safetensors
    wget https://huggingface.co/Qwen/Qwen-7B/resolve/main/model-00005-of-00008.safetensors
    wget https://huggingface.co/Qwen/Qwen-7B/resolve/main/model-00006-of-00008.safetensors
    wget https://huggingface.co/Qwen/Qwen-7B/resolve/main/model-00007-of-00008.safetensors
    wget https://huggingface.co/Qwen/Qwen-7B/resolve/main/model-00008-of-00008.safetensors
    wget https://huggingface.co/Qwen/Qwen-7B/resolve/main/model.safetensors.index.json
    wget https://huggingface.co/Qwen/Qwen-7B/resolve/main/modeling_qwen.py
    wget https://huggingface.co/Qwen/Qwen-7B/resolve/main/qwen.tiktoken
    wget https://huggingface.co/Qwen/Qwen-7B/resolve/main/qwen_generation_utils.py
    wget https://huggingface.co/Qwen/Qwen-7B/resolve/main/tokenization_qwen.py
    wget https://huggingface.co/Qwen/Qwen-7B/resolve/main/tokenizer_config.json
    cd ../../
    

    Modify line 39 of the downloaded modeling_qwen.py file, changing:

    SUPPORT_FP16 = SUPPORT_CUDA and torch.cuda.get_device_capability(0)[0] >= 7
    

    to

    SUPPORT_FP16 = True
    
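    The same edit can be scripted; a hedged one-liner sketch (the path assumes the download directory above, and sed -i.bak keeps a backup copy):

    sed -i.bak 's/^SUPPORT_FP16 = .*/SUPPORT_FP16 = True/' ./model_from_hf/Qwen-7B/modeling_qwen.py
    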
  4. Convert weights

    Convert weights from HuggingFace format to Megatron format (this scenario is generally used to train open-source HuggingFace models on Megatron):

    # modify the script according to your own ascend-toolkit path
    source /usr/local/Ascend/ascend-toolkit/set_env.sh
    
    python tools/checkpoint/util.py \
        --model-type GPT \
        --loader qwen_hf \
        --saver megatron \
        --target-tensor-parallel-size 8 \
        --load-dir ./model_from_hf/Qwen-7B/ \
        --save-dir ./model_weights/Qwen-7B-v0.1-tp8-pp1/ \
        --tokenizer-model ./model_from_hf/Qwen-7B/qwen.tiktoken \
        --add-qkv-bias
    
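    A quick look at the output directory is a useful check after conversion; a sketch (the exact layout can vary with the Megatron version):

    # expect per-tensor-parallel-rank subdirectories (mp_rank_00 ... mp_rank_07)
    # under the saved iteration, plus latest_checkpointed_iteration.txt
    ls -R ./model_weights/Qwen-7B-v0.1-tp8-pp1/ | head
    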

    Any Megatron weights with parallel slicing strategy --> HuggingFace weights (this scenario is generally used to convert a trained Megatron model back to the HuggingFace format):

    # Modify the ascend-toolkit path
    source /usr/local/Ascend/ascend-toolkit/set_env.sh
    python tools/checkpoint/util.py \
        --model-type GPT \
        --loader megatron \
        --saver megatron \
        --save-model-type save_huggingface_qwen \
        --load-dir ./model_weights/Qwen-7B-v0.1-tp8-pp1/ \
        --target-tensor-parallel-size 1 \
        --target-pipeline-parallel-size 1 \
        --add-qkv-bias \
        --save-dir ./model_from_hf/Qwen-7B/   # Fill in the original HF model path here, new weights will be saved in ./model_from_hf/Qwen-7B/mg2hg/
    
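    A hedged smoke test for the re-exported weights (trust_remote_code is required because Qwen ships custom modeling code; loading happens on CPU and needs sufficient host memory):

    python -c "from transformers import AutoModelForCausalLM; \
    AutoModelForCausalLM.from_pretrained('./model_from_hf/Qwen-7B/mg2hg/', trust_remote_code=True)"
    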
  5. Prepare dataset

    Download the Alpaca dataset for Qwen-7B from Hugging Face (tatsu-lab/alpaca):

    # download datasets
    cd ./dataset
    wget https://huggingface.co/datasets/tatsu-lab/alpaca/resolve/main/data/train-00000-of-00001-a09b74b3ef9c3b56.parquet
    cd ..
    
    # process datasets  
    mkdir ./dataset/Qwen-7B/
    python ./tools/preprocess_data.py \
        --input ./dataset/train-00000-of-00001-a09b74b3ef9c3b56.parquet \
        --tokenizer-name-or-path ./model_from_hf/Qwen-7B/ \
        --output-prefix ./dataset/Qwen-7B/alpaca \
        --tokenizer-type PretrainedFromHF \
        --seq-length 8192 \
        --workers 4 \
        --log-interval 1000
    
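    Preprocessing writes an indexed binary dataset; a quick check (a sketch):

    # expect alpaca_text_document.bin and alpaca_text_document.idx, matching the
    # DATA_PATH prefix used by the training script below
    ls ./dataset/Qwen-7B/
    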
  6. Pre-training

    Config Qwen-7B pre-training script: examples/qwen/pretrain_qwen_7b_ptd.sh

     # modify the script according to your own ascend-toolkit path
     source /usr/local/Ascend/ascend-toolkit/set_env.sh 
    
     # modify config according to your own actual situation
     CKPT_SAVE_DIR="./ckpt/Qwen-7B/"
     TOKENIZER_MODEL="./model_from_hf/Qwen-7B/"  #tokenizer path
     DATA_PATH="./dataset/Qwen-7B/alpaca_text_document"  #processed dataset
     CKPT_LOAD_DIR="./model_weights/Qwen-7B-v0.1-tp8-pp1/"
    

    Launch Qwen-7B pre-training script: examples/qwen/pretrain_qwen_7b_ptd.sh

     bash examples/qwen/pretrain_qwen_7b_ptd.sh 
    

    Note: For multi-machine training without shared data storage across the machines, add the parameter --no-shared-storage. With this parameter, non-master nodes decide from the distributed arguments whether they need to load data, and check the corresponding cache and generated data.
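
    A hedged multi-node sketch (the variable names mirror those set at the top of the example scripts; the values are placeholders):

    # edit at the top of examples/qwen/pretrain_qwen_7b_ptd.sh on every node
    MASTER_ADDR=192.168.0.1   # master node IP (placeholder)
    NNODES=2                  # total number of machines
    NODE_RANK=0               # 0 on the master node, 1..NNODES-1 on the others
    # without shared storage, also append --no-shared-storage to the training arguments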

Performance

Machine performance

The performance of Qwen-7B on Ascend NPUs, compared with the reference:

| Device    | Model   | Throughput (tokens/s/p) |
|-----------|---------|-------------------------|
| NPUs      | Qwen-7B | 2499                    |
| Reference | Qwen-7B | 2867                    |

Inference

Config qwen-7b inference script: tasks/inference/generate_qwen_7b_ptd.sh

# ascend-toolkit path
source /usr/local/Ascend/ascend-toolkit/set_env.sh 
 
# modify script model path and tokenizer path
CHECKPOINT="./model_weights/Qwen-7B-v0.1-tp8-pp1/"
TOKENIZER_PATH="./model_from_hf/Qwen-7B/"

Launch qwen-7b inference script: tasks/inference/generate_qwen_7b_ptd.sh

bash tasks/inference/generate_qwen_7b_ptd.sh

Some inference samples are shown in the inference screenshot in the original repository.

Evaluation

We use the CEval benchmark and MMLU benchmark to evaluate our model.

Config qwen-7b evaluation script: tasks/evaluation/evaluate_qwen_7b_ptd.sh

# ascend-toolkit path
source /usr/local/Ascend/ascend-toolkit/set_env.sh

# Modify the model parameter path and vocabulary path
TOKENIZER_PATH="./model_from_hf/Qwen-7B/"  # vocabulary path
CHECKPOINT="./model_weights/Qwen-7B-v0.1-tp8-pp1/"  # parameter path

# Configure the task type and dataset path
DATA_PATH="./mmlu/data/test/"  # "./ceval/val/" for ceval task
TASK="mmlu"  # "ceval" for ceval task

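The MMLU test data is expected under ./mmlu/data/test/; a hedged sketch for fetching it (the tarball is the standard Hendrycks release and extracts to data/{test,dev,val}):

wget https://people.eecs.berkeley.edu/~hendrycks/data.tar
mkdir -p ./mmlu && tar -xf data.tar -C ./mmlu/
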
Launch qwen-7b evaluation

bash ./tasks/evaluation/evaluate_qwen_7b_ptd.sh

| Task  | Subset | Questions | OpenSource | NPU  |
|-------|--------|-----------|------------|------|
| CEval | 52     | 1346      | 63.5       | 62.5 |
| MMLU  | 57     | 14042     | 58.2       | 58.1 |

Qwen-14B

Training

Here's a hardware summary of pre-training Qwen-14B:

| Hardware | Value           |
|----------|-----------------|
| NPU      | 8 x Ascend NPUs |

Script

  1. Clone the repository to your local server:

    git clone https://gitee.com/ascend/ModelLink.git
    git clone https://github.com/NVIDIA/Megatron-LM.git
    cd Megatron-LM
    git checkout -f bcce6f
    cp -r megatron ../ModelLink/
    cd ..
    cd ModelLink
    git checkout 1.0
    mkdir logs
    mkdir model_from_hf
    mkdir dataset
    mkdir ckpt
    
  2. Build environment

    # python3.8
    conda create -n test python=3.8
    conda activate test
    
    # install torch and torch_npu
    pip install torch-2.1.0-cp38-cp38-manylinux2014_aarch64.whl
    pip install torch_npu-2.1.0*-cp38-cp38-linux_aarch64.whl
    pip install apex-0.1_ascend*-cp38-cp38-linux_aarch64.whl
    
    # install MindSpeed
    git clone https://gitee.com/ascend/MindSpeed.git
    cd MindSpeed
    git checkout 224ae35e8fc96778f957029d1371ddb623452a50
    pip install -r requirements.txt
    pip install -e .
    cd ..
    
    # install other packages
    pip install -r requirements.txt
    
  3. Prepare pretrained weights and tokenizer. Download the Qwen-14B checkpoint from Hugging Face (https://huggingface.co/Qwen/Qwen-14B):

    mkdir ./model_from_hf/Qwen-14B/
    cd ./model_from_hf/Qwen-14B/
    wget https://huggingface.co/Qwen/Qwen-14B/resolve/main/cache_autogptq_cuda_256.cpp
    wget https://huggingface.co/Qwen/Qwen-14B/resolve/main/cache_autogptq_cuda_kernel_256.cu
    wget https://huggingface.co/Qwen/Qwen-14B/resolve/main/config.json
    wget https://huggingface.co/Qwen/Qwen-14B/resolve/main/configuration_qwen.py
    wget https://huggingface.co/Qwen/Qwen-14B/resolve/main/cpp_kernels.py
    wget https://huggingface.co/Qwen/Qwen-14B/resolve/main/generation_config.json
    wget https://huggingface.co/Qwen/Qwen-14B/resolve/main/model-00001-of-00015.safetensors
    wget https://huggingface.co/Qwen/Qwen-14B/resolve/main/model-00002-of-00015.safetensors
    wget https://huggingface.co/Qwen/Qwen-14B/resolve/main/model-00003-of-00015.safetensors
    wget https://huggingface.co/Qwen/Qwen-14B/resolve/main/model-00004-of-00015.safetensors
    wget https://huggingface.co/Qwen/Qwen-14B/resolve/main/model-00005-of-00015.safetensors
    wget https://huggingface.co/Qwen/Qwen-14B/resolve/main/model-00006-of-00015.safetensors
    wget https://huggingface.co/Qwen/Qwen-14B/resolve/main/model-00007-of-00015.safetensors
    wget https://huggingface.co/Qwen/Qwen-14B/resolve/main/model-00008-of-00015.safetensors
    wget https://huggingface.co/Qwen/Qwen-14B/resolve/main/model-00009-of-00015.safetensors
    wget https://huggingface.co/Qwen/Qwen-14B/resolve/main/model-00010-of-00015.safetensors
    wget https://huggingface.co/Qwen/Qwen-14B/resolve/main/model-00011-of-00015.safetensors
    wget https://huggingface.co/Qwen/Qwen-14B/resolve/main/model-00012-of-00015.safetensors
    wget https://huggingface.co/Qwen/Qwen-14B/resolve/main/model-00013-of-00015.safetensors
    wget https://huggingface.co/Qwen/Qwen-14B/resolve/main/model-00014-of-00015.safetensors
    wget https://huggingface.co/Qwen/Qwen-14B/resolve/main/model-00015-of-00015.safetensors
    wget https://huggingface.co/Qwen/Qwen-14B/resolve/main/model.safetensors.index.json
    wget https://huggingface.co/Qwen/Qwen-14B/resolve/main/modeling_qwen.py
    wget https://huggingface.co/Qwen/Qwen-14B/resolve/main/qwen.tiktoken
    wget https://huggingface.co/Qwen/Qwen-14B/resolve/main/qwen_generation_utils.py
    wget https://huggingface.co/Qwen/Qwen-14B/resolve/main/tokenization_qwen.py
    wget https://huggingface.co/Qwen/Qwen-14B/resolve/main/tokenizer_config.json
    cd ../../
    

    Modify line 39 of the downloaded modeling_qwen.py file, changing:

    SUPPORT_FP16 = SUPPORT_CUDA and torch.cuda.get_device_capability(0)[0] >= 7
    

    to

    SUPPORT_FP16 = True
    
  4. Convert weights

    Convert weights from HuggingFace format to Megatron format (this scenario is generally used to train open-source HuggingFace models on Megatron):

    # modify the script according to your own ascend-toolkit path
    source /usr/local/Ascend/ascend-toolkit/set_env.sh
    
    python tools/checkpoint/util.py \
        --model-type GPT \
        --loader qwen_hf \
        --saver megatron \
        --target-tensor-parallel-size 8 \
        --load-dir ./model_from_hf/Qwen-14B/ \
        --save-dir ./model_weights/Qwen-14B-v0.1-tp8-pp1/ \
        --tokenizer-model ./model_from_hf/Qwen-14B/qwen.tiktoken \
        --add-qkv-bias
    

    Any Megatron weights with parallel slicing strategy --> HuggingFace weights (this scenario is generally used to convert a trained Megatron model back to the HuggingFace format):

    # Modify the ascend-toolkit path
    source /usr/local/Ascend/ascend-toolkit/set_env.sh
    python tools/checkpoint/util.py \
        --model-type GPT \
        --loader megatron \
        --saver megatron \
        --save-model-type save_huggingface_qwen \
        --load-dir ./model_weights/Qwen-14B-v0.1-tp8-pp1/ \
        --target-tensor-parallel-size 1 \
        --target-pipeline-parallel-size 1 \
        --add-qkv-bias \
        --save-dir ./model_from_hf/Qwen-14B/   # Fill in the original HF model path here, new weights will be saved in ./model_from_hf/Qwen-14B/mg2hg/
    
  5. Prepare dataset

    Download the Alpaca dataset for Qwen-14B from Hugging Face (tatsu-lab/alpaca):

    # download datasets
    cd ./dataset
    wget https://huggingface.co/datasets/tatsu-lab/alpaca/resolve/main/data/train-00000-of-00001-a09b74b3ef9c3b56.parquet
    cd ..
    
    # process datasets  
    mkdir ./dataset/Qwen-14B/
    python ./tools/preprocess_data.py \
        --input ./dataset/train-00000-of-00001-a09b74b3ef9c3b56.parquet \
        --tokenizer-name-or-path ./model_from_hf/Qwen-14B/ \
        --output-prefix ./dataset/Qwen-14B/alpaca \
        --tokenizer-type PretrainedFromHF \
        --seq-length 2048 \
        --workers 4 \
        --log-interval 1000
    
  6. Pre-training

    Config Qwen-14B pre-training script: examples/qwen/pretrain_qwen_14b_ptd.sh

     # modify the script according to your own ascend-toolkit path
     source /usr/local/Ascend/ascend-toolkit/set_env.sh 
    
     # modify config according to your own actual situation
     CKPT_SAVE_DIR="./ckpt/Qwen-14B/"
     TOKENIZER_MODEL="./model_from_hf/Qwen-14B/"  #tokenizer path
     DATA_PATH="./dataset/Qwen-14B/alpaca_text_document"  #processed dataset
     CKPT_LOAD_DIR="./model_weights/Qwen-14B-v0.1-tp8-pp1/"
    

    Launch Qwen-14B pre-training script: examples/qwen/pretrain_qwen_14b_ptd.sh

     bash examples/qwen/pretrain_qwen_14b_ptd.sh 
    
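    To keep a log archive, the script can also be launched in the background; a sketch (the logs/ directory was created in step 1):

    nohup bash examples/qwen/pretrain_qwen_14b_ptd.sh > logs/train_qwen_14b.log 2>&1 &
    tail -f logs/train_qwen_14b.log
    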

Performance

Machine performance

The performance of Qwen-14B on Ascend NPUs, compared with the reference:

| Device    | Model    | Throughput (tokens/s/p) |
|-----------|----------|-------------------------|
| NPUs      | Qwen-14B | 1560                    |
| Reference | Qwen-14B | 1578                    |

Inference

Config qwen-14b inference script: tasks/inference/generate_qwen_14b_ptd.sh

# ascend-toolkit path
source /usr/local/Ascend/ascend-toolkit/set_env.sh 
 
# modify script model path and tokenizer path
CHECKPOINT="./model_weights/Qwen-14B-v0.1-tp8-pp1/"
TOKENIZER_PATH="./model_from_hf/Qwen-14B/"

Launch qwen-14b inference script: tasks/inference/generate_qwen_14b_ptd.sh

bash tasks/inference/generate_qwen_14b_ptd.sh

Some inference samples are shown in the inference screenshot in the original repository.

Evaluation

We use the CEval benchmark and MMLU benchmark to evaluate our model.

Config qwen-14b evaluation script: tasks/evaluation/evaluate_qwen_14b_ptd.sh

# ascend-toolkit path
source /usr/local/Ascend/ascend-toolkit/set_env.sh

# Modify the model parameter path and vocabulary path
TOKENIZER_PATH="./model_from_hf/Qwen-14B/"  # vocabulary path
CHECKPOINT="./model_weights/Qwen-14B-v0.1-tp8-pp1/"  # parameter path

# Configure the task type and dataset path
DATA_PATH="./mmlu/data/test/"  # "./ceval/val/" for ceval task
TASK="mmlu"  # "ceval" for ceval task

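The CEval validation data is expected under ./ceval/val/; a hedged sketch for fetching it (the archive layout is assumed from the DATA_PATH above):

# the zip should contain val/, dev/ and test/ subdirectories
wget https://huggingface.co/datasets/ceval/ceval-exam/resolve/main/ceval-exam.zip
mkdir -p ./ceval && unzip ceval-exam.zip -d ./ceval/
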
Launch qwen-14b evaluation

bash ./tasks/evaluation/evaluate_qwen_14b_ptd.sh

| Task  | Subset | Questions | OpenSource | NPU  |
|-------|--------|-----------|------------|------|
| CEval | 52     | 1346      | 72.1       | 71.1 |
| MMLU  | 57     | 14042     | 66.3       | 66.1 |

Qwen-72B

Training

Here's a hardware summary of pre-training Qwen-72B:

| Hardware | Seq-length | Value             |
|----------|------------|-------------------|
| NPU      | 8k         | 64 x Ascend NPUs  |
| NPU      | 32k        | 320 x Ascend NPUs |

Script

  1. Clone the repository to your local server:

    git clone https://gitee.com/ascend/ModelLink.git
    git clone https://github.com/NVIDIA/Megatron-LM.git
    cd Megatron-LM
    git checkout -f bcce6f
    cp -r megatron ../ModelLink/
    cd ..
    cd ModelLink
    git checkout 1.0
    mkdir logs
    mkdir model_from_hf
    mkdir dataset
    mkdir ckpt
    
  2. Build environment

    # python3.8
    conda create -n test python=3.8
    conda activate test
    
    # install torch and torch_npu
    pip install torch-2.1.0-cp38-cp38-manylinux2014_aarch64.whl
    pip install torch_npu-2.1.0*-cp38-cp38-linux_aarch64.whl
    pip install apex-0.1_ascend*-cp38-cp38-linux_aarch64.whl
    
    # install MindSpeed
    git clone https://gitee.com/ascend/MindSpeed.git
    cd MindSpeed
    git checkout 224ae35e8fc96778f957029d1371ddb623452a50
    pip install -r requirements.txt
    pip install -e .
    cd ..
    
    # install other packages
    pip install -r requirements.txt
    
  3. Prepare pretrained weights and tokenizer. Download the Qwen-72B checkpoint from Hugging Face (https://huggingface.co/Qwen/Qwen-72B):

    mkdir ./model_from_hf/Qwen-72B/
    cd ./model_from_hf/Qwen-72B/
    wget https://huggingface.co/Qwen/Qwen-72B/resolve/main/cache_autogptq_cuda_256.cpp
    wget https://huggingface.co/Qwen/Qwen-72B/resolve/main/cache_autogptq_cuda_kernel_256.cu
    wget https://huggingface.co/Qwen/Qwen-72B/resolve/main/config.json
    wget https://huggingface.co/Qwen/Qwen-72B/resolve/main/configuration_qwen.py
    wget https://huggingface.co/Qwen/Qwen-72B/resolve/main/cpp_kernels.py
    wget https://huggingface.co/Qwen/Qwen-72B/resolve/main/generation_config.json
    wget https://huggingface.co/Qwen/Qwen-72B/resolve/main/model-00001-of-000082.safetensors
    ...
    cd ../../
    
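    Because Qwen-72B ships dozens of weight shards, downloading them one by one is tedious; a hedged alternative using the huggingface_hub CLI (install with pip install -U huggingface_hub):

    # fetch the full repository snapshot into the same directory
    huggingface-cli download Qwen/Qwen-72B --local-dir ./model_from_hf/Qwen-72B/
    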

    Modify line 39 of the downloaded modeling_qwen.py file, changing:

    SUPPORT_FP16 = SUPPORT_CUDA and torch.cuda.get_device_capability(0)[0] >= 7
    

    to

    SUPPORT_FP16 = True
    
  4. Convert weights

    Convert weights from HuggingFace format to Megatron format (this scenario is generally used to train open-source HuggingFace models on Megatron):

    # modify the script according to your own ascend-toolkit path
    source /usr/local/Ascend/ascend-toolkit/set_env.sh
    
    python tools/checkpoint/util.py \
        --model-type GPT \
        --loader qwen_hf \
        --saver megatron \
        --target-tensor-parallel-size 8 \
        --load-dir ./model_from_hf/Qwen-72B/ \
        --save-dir ./model_weights/Qwen-72B-v0.1-tp8-pp1/ \
        --tokenizer-model ./model_from_hf/Qwen-72B/qwen.tiktoken \
        --add-qkv-bias
    

    Any Megatron weights with parallel slicing strategy --> HuggingFace weights (this scenario is generally used to convert a trained Megatron model back to the HuggingFace format):

    # Modify the ascend-toolkit path
    source /usr/local/Ascend/ascend-toolkit/set_env.sh
    python tools/checkpoint/util.py \
        --model-type GPT \
        --loader megatron \
        --saver megatron \
        --save-model-type save_huggingface_qwen \
        --load-dir ./model_weights/Qwen-72B-v0.1-tp8-pp1/ \
        --target-tensor-parallel-size 1 \
        --target-pipeline-parallel-size 1 \
        --add-qkv-bias \
        --save-dir ./model_from_hf/Qwen-72B/    # Fill in the original HF model path here, new weights will be saved in ./model_from_hf/Qwen-72B/mg2hg/
    
  5. Prepare dataset

    Download the Alpaca dataset for Qwen-72B from Hugging Face (tatsu-lab/alpaca):

    # download datasets
    cd ./dataset
    wget https://huggingface.co/datasets/tatsu-lab/alpaca/resolve/main/data/train-00000-of-00001-a09b74b3ef9c3b56.parquet
    cd ..
    
    
    # process datasets  
    mkdir ./dataset/Qwen-72B/
    python ./tools/preprocess_data.py \
        --input ./dataset/train-00000-of-00001-a09b74b3ef9c3b56.parquet \
        --tokenizer-name-or-path ./model_from_hf/Qwen-72B/ \
        --output-prefix ./dataset/Qwen-72B/alpaca \
        --tokenizer-type PretrainedFromHF \
        --seq-length 8192 \
        --workers 4 \
        --log-interval 1000
    
  6. Pre-training

    Config Qwen-72B pre-training script: examples/qwen/pretrain_qwen_72b_ptd.sh

        # modify the script according to your own ascend-toolkit path
        source /usr/local/Ascend/ascend-toolkit/set_env.sh 
    
        # modify config according to your own actual situation
        CKPT_SAVE_DIR="./ckpt/Qwen-72B/"
        TOKENIZER_MODEL="./model_from_hf/Qwen-72B/"  #tokenizer path
        DATA_PATH="./dataset/Qwen-72B/alpaca_text_document"  #processed dataset
        CKPT_LOAD_DIR="./model_weights/Qwen-72B-v0.1-tp8-pp1/"
    

    To use a 32K sequence length, enable recomputation (which trades extra forward compute for activation memory) and change the value of seq-length to 32768. The parameter configuration is as follows:

    --recompute-granularity full \
    --recompute-method block \
    --recompute-num-layers 80 \

    
    Launch Qwen-72B pre-training script: examples/qwen/pretrain_qwen_72b_ptd.sh

    bash examples/qwen/pretrain_qwen_72b_ptd.sh
    

Performance

Machine performance

The performance of Qwen-72B on Ascend NPUs, compared with the reference:

| Device    | Model    | Throughput (tokens/s/p, 8k) | Throughput (tokens/s/p, 32k) |
|-----------|----------|-----------------------------|------------------------------|
| NPUs      | Qwen-72B | 285                         | --                           |
| Reference | Qwen-72B | 345                         | --                           |

Inference

Config qwen-72b inference script: tasks/inference/generate_qwen_72b_ptd.sh

# ascend-toolkit path
source /usr/local/Ascend/ascend-toolkit/set_env.sh 
 
# modify script model path and tokenizer path
CHECKPOINT="./model_weights/Qwen-72B-v0.1-tp8-pp1/"
TOKENIZER_PATH="./model_from_hf/Qwen-72B/"

Launch qwen-72b inference script: tasks/inference/generate_qwen_72b_ptd.sh

bash tasks/inference/generate_qwen_72b_ptd.sh

Some inference samples are shown in the inference screenshot in the original repository.

Evaluation

We use the CEval benchmark and MMLU benchmark to evaluate our model.

Config qwen-72b evaluation script: tasks/evaluation/evaluate_qwen_72b_ptd.sh

# ascend-toolkit path
source /usr/local/Ascend/ascend-toolkit/set_env.sh

# Modify the model parameter path and vocabulary path
TOKENIZER_PATH="./model_from_hf/Qwen-72B/"  # vocabulary path
CHECKPOINT="./model_weights/Qwen-72B-v0.1-tp8-pp1/"  # parameter path

# Configure the task type and dataset path
DATA_PATH="./mmlu/data/test/"  # "./ceval/val/" for ceval task
TASK="mmlu"  # "ceval" for ceval task

Launch qwen-72b evaluation

bash ./tasks/evaluation/evaluate_qwen_72b_ptd.sh

| Task  | Subset | Questions | OpenSource | NPU  |
|-------|--------|-----------|------------|------|
| CEval | 52     | 1346      | 83.3       | 81.8 |
| MMLU  | 57     | 14042     | 77.4       | 74.6 |