# Summary of AI Large Models

## LLM

Large language models (LLMs) have taken the NLP and AI communities by storm. Below is a continuously updated list of milestone papers on large language models:

## Milestone Papers on LLMs

| Date | Keywords | Organization | Paper | Venue |
| :-----: | :------: | :------: | :--- | :-----: |
| 2017-06 | Transformers | Google | [Attention Is All You Need](https://arxiv.org/pdf/1706.03762.pdf) | NeurIPS |
| 2018-06 | GPT 1.0 | OpenAI | [Improving Language Understanding by Generative Pre-Training](https://www.cs.ubc.ca/~amuham01/LING530/papers/radford2018improving.pdf) | |
| 2023-03 | GPT 4 | OpenAI | [GPT-4 Technical Report](https://openai.com/research/gpt-4)||
| 2023-04 | Pythia | EleutherAI et al. | [Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling](https://arxiv.org/abs/2304.01373)|ICML|
| 2023-05 | Dromedary | CMU et al. | [Principle-Driven Self-Alignment of Language Models from Scratch with Minimal Human Supervision](https://arxiv.org/abs/2305.03047)||
| 2023-05 | PaLM 2 | Google | [PaLM 2 Technical Report](https://ai.google/static/documents/palm2techreport.pdf)||
## Open LLMs for Code

| Language Model | Release Date | Checkpoints | Paper/Blog | Params (B) | Context Length | License |
| --- | --- | --- | --- | --- | --- | --- |
| SantaCoder | TODO | [santacoder](https://huggingface.co/bigcode/santacoder) |[SantaCoder: don't reach for the stars!](https://arxiv.org/abs/2301.03988) | 1.1 | [2048](https://huggingface.co/bigcode/santacoder/blob/main/README.md#model-summary) | [OpenRAIL-M v1](https://huggingface.co/spaces/bigcode/bigcode-model-license-agreement) |
| StarCoder | TODO | [starcoder](https://huggingface.co/bigcode/starcoder) | [StarCoder: A State-of-the-Art LLM for Code](https://huggingface.co/blog/starcoder), [StarCoder: May the source be with you!](https://drive.google.com/file/d/1cN-b9GnWtHzQRoE7M7gAEyivY0kl4BYs/view) | 15 | [8192](https://huggingface.co/bigcode/starcoder#model-summary) | [OpenRAIL-M v1](https://huggingface.co/spaces/bigcode/bigcode-model-license-agreement) |
| StarChat Alpha | TODO | [starchat-alpha](https://huggingface.co/HuggingFaceH4/starchat-alpha) | [Creating a Coding Assistant with StarCoder](https://huggingface.co/blog/starchat-alpha) | 16 | [8192](https://huggingface.co/bigcode/starcoder#model-summary) | [OpenRAIL-M v1](https://huggingface.co/spaces/bigcode/bigcode-model-license-agreement) |
| Replit Code | TODO | [replit-code-v1-3b](https://huggingface.co/replit/replit-code-v1-3b) | [Training a SOTA Code LLM in 1 week and Quantifying the Vibes — with Reza Shabani of Replit](https://www.latent.space/p/reza-shabani#details) | 2.7 | [infinity? (ALiBi)](https://huggingface.co/replit/replit-code-v1-3b#model-description) | CC BY-SA-4.0 |
| CodeGen2 | TODO | [codegen2 1B-16B](https://github.com/salesforce/CodeGen2) | [CodeGen2: Lessons for Training LLMs on Programming and Natural Languages](https://arxiv.org/abs/2305.02309) | 1 - 16 | [2048](https://arxiv.org/abs/2305.02309) | Apache 2.0 |
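
Most of the checkpoints above live on the Hugging Face Hub, so they can usually be tried directly with the `transformers` library. The snippet below is a minimal sketch rather than part of the original list: it assumes `transformers` and a PyTorch backend are installed, and note that some checkpoints (e.g. `bigcode/starcoder`) are gated behind the OpenRAIL-M license agreement on the Hub.

```python
# Minimal sketch: load a code checkpoint from the table and complete a prompt.
# Assumes `pip install transformers torch`; generation settings are illustrative.
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "bigcode/santacoder"  # any entry from the Checkpoints column
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
# trust_remote_code is needed for checkpoints that ship custom model code.
model = AutoModelForCausalLM.from_pretrained(checkpoint, trust_remote_code=True)

inputs = tokenizer("def fibonacci(n):", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
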
## Open LLM Datasets for Pre-training

| Name | Release Date | Paper/Blog | Dataset | Tokens (T) | License |
| --- | --- | --- | --- | --- | ---- |
| starcoderdata | 2023/05 | [StarCoder: A State-of-the-Art LLM for Code](https://huggingface.co/blog/starcoder) | [starcoderdata](https://huggingface.co/datasets/bigcode/starcoderdata) | ? | Apache 2.0 |
| RedPajama | 2023/04 | [RedPajama, a project to create leading open-source models, starts by reproducing LLaMA training dataset of over 1.2 trillion tokens](https://www.together.xyz/blog/redpajama) | [RedPajama-Data](https://github.com/togethercomputer/RedPajama-Data) | 1.2 | Apache 2.0 |
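
These pre-training corpora are terabyte-scale, so streaming is the usual way to peek at them without downloading everything. Below is a minimal sketch with the `datasets` library; the `data_dir="python"` subset name and the `content` field are assumptions taken from the starcoderdata dataset card, so verify them there.

```python
# Minimal sketch: stream a slice of a large pre-training corpus.
# Assumes `pip install datasets`; subset and field names per the dataset card.
from datasets import load_dataset

ds = load_dataset(
    "bigcode/starcoderdata",
    data_dir="python",   # one language subset (assumed name; see dataset card)
    split="train",
    streaming=True,      # iterate lazily instead of downloading the corpus
)

for example in ds:
    print(example["content"][:200])  # raw source-code text of one file
    break
```
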
## Open LLM Datasets for Instruction Tuning

| Name | Release Date | Paper/Blog | Dataset | Samples (K) | License |
| --- | --- | --- | --- | --- | ---- |
| MPT-7B-Instruct | 2023/05 | [Introducing MPT-7B: A New Standard for Open-Source, Commercially Usable LLMs](https://www.mosaicml.com/blog/mpt-7b) | [dolly_hhrlhf](https://huggingface.co/datasets/mosaicml/dolly_hhrlhf) | 59 | CC BY-SA-3.0 |
| databricks-dolly-15k | 2023/04 | [Free Dolly: Introducing the World's First Truly Open Instruction-Tuned LLM](https://www.databricks.com/blog/2023/04/12/dolly-first-open-commercially-viable-instruction-tuned-llm) | [databricks-dolly-15k](https://huggingface.co/datasets/databricks/databricks-dolly-15k) | 15 | CC BY-SA-3.0 |
| OIG (Open Instruction Generalist) | 2023/03 | [THE OIG DATASET](https://laion.ai/blog/oig-dataset/) | [OIG](https://huggingface.co/datasets/laion/OIG) | 44,000 | Apache 2.0 |
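
Instruction-tuning sets like databricks-dolly-15k are small enough to load whole. The sketch below shows one illustrative way to flatten a row into a single training prompt; the `instruction`/`context`/`response` field names follow the dataset card, and the prompt template itself is just an example, not a prescribed format.

```python
# Minimal sketch: turn dolly-15k rows into prompt/response training text.
# Assumes `pip install datasets`; the template below is illustrative only.
from datasets import load_dataset

ds = load_dataset("databricks/databricks-dolly-15k", split="train")

def to_prompt(row: dict) -> str:
    ctx = f"\n\nContext:\n{row['context']}" if row["context"] else ""
    return f"Instruction:\n{row['instruction']}{ctx}\n\nResponse:\n{row['response']}"

print(to_prompt(ds[0]))
```
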
## Open LLM Datasets for Alignment Tuning

| Name | Release Date | Paper/Blog | Dataset | Samples (K) | License |
| --- | --- | --- | --- | --- | ---- |
| OpenAssistant Conversations Dataset | 2023/04 | [OpenAssistant Conversations - Democratizing Large Language Model Alignment](https://drive.google.com/file/d/10iR5hKwFqAKhL3umx8muOWSRm7hs5FqX/view) | [oasst1](https://huggingface.co/datasets/OpenAssistant/oasst1) | 161 | Apache 2.0 |
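
Unlike flat instruction datasets, oasst1 stores individual messages linked into conversation trees rather than ready-made prompt/response pairs. The sketch below reconstructs one thread by walking parent links; the field names (`message_id`, `parent_id`, `role`, `text`) are taken from the dataset card and should be double-checked there.

```python
# Minimal sketch: rebuild one oasst1 conversation thread from parent links.
# Assumes `pip install datasets`; field names per the dataset card.
from datasets import load_dataset

ds = load_dataset("OpenAssistant/oasst1", split="train")
by_id = {row["message_id"]: row for row in ds}

def thread(message_id: str) -> list[tuple[str, str]]:
    """Walk parent links from a message back to the root prompt."""
    msgs = []
    node = by_id.get(message_id)
    while node is not None:
        msgs.append((node["role"], node["text"]))
        node = by_id.get(node["parent_id"]) if node["parent_id"] else None
    return list(reversed(msgs))

# Print the thread that ends at the first assistant reply we find.
reply = next(row for row in ds if row["role"] == "assistant")
for role, text in thread(reply["message_id"]):
    print(f"{role}: {text[:80]}")
```
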
## Evaluation Tools for Open LLMs

- [Leaderboard by lmsys.org](https://chat.lmsys.org/?leaderboard)
- [Evals by MosaicML](https://twitter.com/jefrankle/status/1654631746506301441)
- [Holistic Evaluation of Language Models (HELM)](https://crfm.stanford.edu/helm/latest/?groups=1)
- [LLM-Leaderboard](https://github.com/LudwigStumpp/llm-leaderboard)
- [TextSynth Server Benchmarks](https://bellard.org/ts_server/)