Model | Arch | License | Params | Seq Len | FP Format | VRAM Infer | Lib | Tokenizer | Comments | Other flavours |
---|---|---|---|---|---|---|---|---|---|---|
bigcode/starcoder | GPT | OpenRAIL-Mv1 | 15B | 8k | fp32 | 60Gb; ~30Gb fp/bf16, ~16Gb 8-bit, ~8Gb 4-bit | Megatron-LM fork | GPT2Tokenizer 49152 | FlashAttn, MQA, FIM, 1T=250Bx4 tokens of StarCoderData | starcoderplus, starchat-beta (HumanEval 📉) |
Salesforce/codegen2 | GPT (J) | Apache 2.0 | 1B, 3.7B, 7B, 16B | 2k | 7b fp32 28Gb | JaxFormer | GPT2Tokenizer 51200 | RoPE, FIM, The Stack dedup v1.1 | -instruct research-only |
Salesforce/codegen2.5 | LLaMA | Apache 2.0 | 7B | 2k | fp32 28Gb | JaxFormer | Tiktoken 51200 | FlashAttn, Triton, FIM, 1.4T=300Bx4+ tokens of StarCoderData | -mono Python, -instruct research-only |
Salesforce/xgen-7b-8k-base | LLaMA | Apache 2.0 | 7B | 8k | | | | Tiktoken | | -inst research-only |
OpenLLaMA v2 | LLaMA | Apache 2.0 | 3B, 7B, 13B | 2k | | 7b fp16 14Gb | PyTorch/HF, JAX/EasyLM | HF (fast) tokenizer 32k | 1T tokens of RedPajama + StarCoderData + Falcon | |
LLaMA 2 | LLaMA | Fb CLA (<700M MAU, no knowledge distillation) | 7B, 13B, 70B | 4k | | 7b fp16 14Gb | PyTorch/HF | SentencePiece 32k + digits (LlamaTokenizer) | RoPE, grouped-query attention (GQA) in 70B, 2T tokens | -chat |
CodeLlama | LLaMA | Fb CLA | 7B, 13B, 34B | 4k-16k (HF) | | 7b bf16 14Gb | PyTorch, HF (partial from 4.33) | SentencePiece 32k + FIM (LlamaCodeTokenizer) | RoPE (+scaling), FIM | -Python, -Instruct |
Mistral 7b v0.1 | LLaMA | Apache 2.0 | 7B | 8k | | | PyTorch/HF, xFormers | SentencePiece 32k (LlamaTokenizer) | GQA, SWA, FlashAttn 2 | -Instruct |
replit-code-v1-3b | MPT | CC BY-SA 4.0 | 2.7B | 2k | fp32 10Gb | Mosaic LLM Foundry | SentencePiece 32768 | FlashAttn, Triton, ALiBi, FasterTransformer, The Stack dedup v1.2 | |
MPT | MPT | Apache 2.0 | 7B, 30B | 8k | | | LLM Foundry | | FlashAttn, Triton, ALiBi, FT, 1T tokens | -instruct CC-BY-SA-3.0, -chat CC-BY-NC-SA-4.0 |
Falcon | GPT (RW) | Apache 2.0 | 7B, 40B | 2k | bf16 | 7B ~15Gb, 40B ~90Gb | | TokenizerFast 65024 | 1T RefinedWeb, GPTQ | -instruct Apache 2.0 (!) |
StableLM | GPT (NeoX) | CC-BY-SA-4.0 | 3B, 7B | 4k | fp32 16Gb | 3b fp16 12Gb, 7b 24Gb | GPT-NeoX | | RoPE, 1.5T tokens of The Pile | -tuned CC-BY-NC-SA-4.0 |
RedPajama-INCITE-Base-7B-v0.1 | GPT (NeoX) | Apache 2.0 | 7B | 2k | fp16 | 16Gb | | 50432 | RoPE | base, instruct, chat |
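The VRAM figures above follow from simple arithmetic: parameter count times bytes per parameter, plus some headroom for the KV cache and activations. A minimal sketch (the helper name is my own, not from any library), using StarCoder's 15B as the worked example:

```python
def weights_gb(n_params: float, bytes_per_param: float) -> float:
    """Rough lower bound on inference VRAM: weights only, no KV cache/activations."""
    return n_params * bytes_per_param / 1e9

# StarCoder 15B, matching the table's ballpark numbers:
for fmt, b in [("fp32", 4), ("fp16/bf16", 2), ("8-bit", 1), ("4-bit", 0.5)]:
    print(f"{fmt}: ~{weights_gb(15e9, b):.0f} GB")  # 60, 30, 15, 8 GB
```

The table's ~16Gb (8-bit) and ~8Gb (4-bit) entries are slightly above these weight-only bounds, which is the expected overhead.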
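Several of these models (StarCoder, CodeGen2/2.5, CodeLlama) support fill-in-the-middle (FIM): the prompt is reordered so the model generates the gap conditioned on code both before and after it. A sketch of the prefix-suffix-middle (PSM) layout using StarCoder's sentinel tokens; note CodeLlama uses its own, different sentinels:

```python
def fim_prompt(prefix: str, suffix: str) -> str:
    # PSM ordering: the model generates the "middle" after <fim_middle>,
    # conditioned on the code before (<fim_prefix>) and after (<fim_suffix>) the gap.
    return f"<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>"

prompt = fim_prompt("def add(a, b):\n    return ", "\n\nprint(add(1, 2))")
```

The string produced here would be tokenized and fed to the model as-is; generation stops at the end-of-text token and the emitted middle is spliced back between prefix and suffix.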
Last active: October 11, 2023
Commercial-friendly, permissively licensed Open Source Large Language Models