Update README.md

2023-05-12 12:01:43 +08:00 · 2023-05-12 12:01:43 +08:00 · 8c58a91788
parent e4b0b04b6f
commit 8c58a91788
1 changed files with 46 additions and 5 deletions
--- a/README.md
+++ b/README.md
@ -1,9 +1,10 @@
-# AI-LM
+# 人工智能大模型汇总

-## LLM
-大型语言模型(LLM)已经席卷了NLP社区和人工智能社区。下面是一个关于大型语言模型的列表，持续更新！
+## 关于LLM的里程碑论文列表

-| 日期  |       关键词       |   组织    | 文章/博客                                                                                                                                                                               | 出版 |
+大型语言模型(LLM)已经席卷了NLP社区和人工智能社区。下面是一个关于大型语言模型的里程碑式论文列表：
+
+| 日期  |       关键词       |   组织    | 文章                                                                                                                                                                               | 出版 |
 | :-----: | :------------------: | :--------------: | :---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | :---------: |
 | 2017-06 |     Transformers     |      Google      | [Attention Is All You Need](https://arxiv.org/pdf/1706.03762.pdf)                                                                                                                      |   NeurIPS   |
 | 2018-06 |       GPT 1.0       |      OpenAI      | [Improving Language Understanding by Generative Pre-Training](https://www.cs.ubc.ca/~amuham01/LING530/papers/radford2018improving.pdf)                                                 |            |
@ -49,4 +50,44 @@
 | 2023-03 | GPT 4 | OpenAI | [GPT-4 Technical Report](https://openai.com/research/gpt-4)||
 | 2023-04 | Pythia | EleutherAI et al. | [Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling](https://arxiv.org/abs/2304.01373)|ICML|
 | 2023-05 | Dromedary | CMU et al. | [Principle-Driven Self-Alignment of Language Models from Scratch with Minimal Human Supervision](https://arxiv.org/abs/2305.03047)||
-| 2023-05 | PaLM 2 | Google | [PaLM 2 Technical Report](https://ai.google/static/documents/palm2techreport.pdf)||
+| 2023-05 | PaLM 2 | Google | [PaLM 2 Technical Report](https://ai.google/static/documents/palm2techreport.pdf)||
+                                                        
+   
+## 面向代码的LLM开放模型 
+
+| Language Model | Release Date | Checkpoints | Paper/Blog | Params (B) | Context Length                                                                         | Licence |
+| --- | --- | --- | --- | --- |----------------------------------------------------------------------------------------| --- |
+| SantaCoder | TODO | [santacoder](https://huggingface.co/bigcode/santacoder) |[SantaCoder: don't reach for the stars!](https://arxiv.org/abs/2301.03988) | 1.1 | [2048](https://huggingface.co/bigcode/santacoder/blob/main/README.md#model-summary)                | [OpenRAIL-M v1](https://huggingface.co/spaces/bigcode/bigcode-model-license-agreement) |
+| StarCoder | TODO | [starcoder](https://huggingface.co/bigcode/starcoder) | [StarCoder: A State-of-the-Art LLM for Code](https://huggingface.co/blog/starcoder), [StarCoder: May the source be with you!](https://drive.google.com/file/d/1cN-b9GnWtHzQRoE7M7gAEyivY0kl4BYs/view) | 15 | [8192](https://huggingface.co/bigcode/starcoder#model-summary)                         | [OpenRAIL-M v1](https://huggingface.co/spaces/bigcode/bigcode-model-license-agreement) |
+| StarChat Alpha | TODO | [starchat-alpha](https://huggingface.co/HuggingFaceH4/starchat-alpha) | [Creating a Coding Assistant with StarCoder](https://huggingface.co/blog/starchat-alpha) | 16 | [8192](https://huggingface.co/bigcode/starcoder#model-summary)  | [OpenRAIL-M v1](https://huggingface.co/spaces/bigcode/bigcode-model-license-agreement) |
+| Replit Code | TODO | [replit-code-v1-3b](https://huggingface.co/replit/replit-code-v1-3b) | [Training a SOTA Code LLM in 1 week and Quantifying the Vibes — with Reza Shabani of Replit](https://www.latent.space/p/reza-shabani#details) | 2.7 | [infinity? (ALiBi)](https://huggingface.co/replit/replit-code-v1-3b#model-description) | CC BY-SA-4.0 |
+ | CodeGen2 | TODO | [codegen2 1B-16B](https://github.com/salesforce/CodeGen2) | [CodeGen2: Lessons for Training LLMs on Programming and Natural Languages](https://arxiv.org/abs/2305.02309) | 1 - 16 | [2048](https://arxiv.org/abs/2305.02309)                                               | Apache 2.0 |
+
+## 面向预训练的开放LLM数据集   
+
+| Name | Release Date | Paper/Blog | Dataset | Tokens (T) | License |
+| --- | --- | --- | --- | --- | ---- | 
+| starcoderdata | 2023/05 | [StarCoder: A State-of-the-Art LLM for Code](https://huggingface.co/blog/starcoder) | [starcoderdata](https://huggingface.co/datasets/bigcode/starcoderdata) | ? | Apache 2.0 |
+| RedPajama | 2023/04 | [RedPajama, a project to create leading open-source models, starts by reproducing LLaMA training dataset of over 1.2 trillion tokens](https://www.together.xyz/blog/redpajama) | [RedPajama-Data](https://github.com/togethercomputer/RedPajama-Data) | 1.2 | Apache 2.0 |
+
+## 面向指令调优的开放LLM数据集
+
+| Name | Release Date |  Paper/Blog | Dataset | Samples (K) | License |
+| --- | --- | --- | --- | --- | ---- | 
+| MPT-7B-Instruct | 2023/05 | [Introducing MPT-7B: A New Standard for Open-Source, Commercially Usable LLMs](https://www.mosaicml.com/blog/mpt-7b) | [dolly_hhrlhf](https://huggingface.co/datasets/mosaicml/dolly_hhrlhf) | 59 | CC BY-SA-3.0 |
+| databricks-dolly-15k | 2023/04 | [Free Dolly: Introducing the World's First Truly Open Instruction-Tuned LLM](https://www.databricks.com/blog/2023/04/12/dolly-first-open-commercially-viable-instruction-tuned-llm) |  [databricks-dolly-15k](https://huggingface.co/datasets/databricks/databricks-dolly-15k) | 15 |  CC BY-SA-3.0 |
+| OIG (Open Instruction Generalist)   | 2023/03 | [THE OIG DATASET](https://laion.ai/blog/oig-dataset/) | [OIG](https://huggingface.co/datasets/laion/OIG) | 44,000 | Apache 2.0 |
+
+## 面向指对齐调优的开放LLM数据集
+
+| Name | Release Date |  Paper/Blog | Dataset | Samples (K) | License |
+| --- | --- | --- | --- | --- | ---- |
+| OpenAssistant Conversations Dataset | 2023/04 | [OpenAssistant Conversations - Democratizing Large Language Model Alignment](https://drive.google.com/file/d/10iR5hKwFqAKhL3umx8muOWSRm7hs5FqX/view) | [oasst1](https://huggingface.co/datasets/OpenAssistant/oasst1) | 161 | Apache 2.0 |
+
+## 开放LLMs的评估工具
+
+- [Leaderboard by lmsys.org](https://chat.lmsys.org/?leaderboard)
+- [Evals by MosaicML](https://twitter.com/jefrankle/status/1654631746506301441)
+- [Holistic Evaluation of Language Models (HELM)](https://crfm.stanford.edu/helm/latest/?groups=1)
+- [LLM-Leaderboard](https://github.com/LudwigStumpp/llm-leaderboard)
+- [TextSynth Server Benchmarks](https://bellard.org/ts_server/)