GPT4All Performance Benchmarks

Model | BoolQ | PIQA | HellaSwag | WinoGrande | ARC-e | ARC-c | OBQA | Avg |
---|---|---|---|---|---|---|---|---|
GPT4All-J 6B v1.0 | 73.4 | 74.8 | 63.4 | 64.7 | 54.9 | 36.0 | 40.2 | 58.2 |
GPT4All-J v1.1-breezy | 74.0 | 75.1 | 63.2 | 63.6 | 55.4 | 34.9 | 38.4 | 57.8 |
GPT4All-J v1.2-jazzy | 74.8 | 74.9 | 63.6 | 63.8 | 56.6 | 35.3 | 41.0 | 58.6 |
GPT4All-J v1.3-groovy | 73.6 | 74.3 | 63.8 | 63.5 | 57.7 | 35.0 | 38.8 | 58.1 |
GPT4All-J LoRA 6B | 68.6 | 75.8 | 66.2 | 63.5 | 56.4 | 35.7 | 40.2 | 58.1 |
GPT4All LLaMA LoRA 7B | 73.1 | 77.6 | 72.1 | 67.8 | 51.1 | 40.4 | 40.2 | 60.3 |
GPT4All 13B snoozy | 83.3 | 79.2 | 75.0 | 71.3 | 60.9 | 44.2 | 43.4 | 65.3 |
GPT4All Falcon | 77.6 | 79.8 | 74.9 | 70.1 | 67.9 | 43.4 | 42.6 | 65.2 |
Nous-Hermes | 79.5 | 78.9 | 80.0 | 71.9 | 74.2 | 49.2 | 46.4 | 68.6 |
Dolly 6B | 68.8 | 77.3 | 67.6 | 63.9 | 62.9 | 38.7 | 41.2 | 60.1 |
Dolly 12B | 56.7 | 75.4 | 71.0 | 62.2 | 64.6 | 38.5 | 40.4 | 58.4 |
Alpaca 7B | 73.9 | 77.2 | 73.9 | 66.1 | 59.8 | 43.3 | 43.4 | 62.5 |
Alpaca LoRA 7B | 74.3 | 79.3 | 74.0 | 68.8 | 56.6 | 43.9 | 42.6 | 62.8 |
GPT-J 6.7B | 65.4 | 76.2 | 66.2 | 64.1 | 62.2 | 36.6 | 38.2 | 58.4 |
LLaMA 7B | 73.1 | 77.4 | 73.0 | 66.9 | 52.5 | 41.4 | 42.4 | 61.0 |
LLaMA 13B | 68.5 | 79.1 | 76.2 | 70.1 | 60.0 | 44.6 | 42.2 | 63.0 |
Pythia 6.7B | 63.5 | 76.3 | 64.0 | 61.1 | 61.3 | 35.2 | 37.2 | 56.9 |
Pythia 12B | 67.7 | 76.6 | 67.3 | 63.8 | 63.9 | 34.8 | 38.0 | 58.9 |
FastChat T5 | 81.5 | 64.6 | 46.3 | 61.8 | 49.3 | 33.3 | 39.4 | 53.7 |
FastChat Vicuña 7B | 76.6 | 77.2 | 70.7 | 67.3 | 53.5 | 41.2 | 40.8 | 61.0 |
FastChat Vicuña 13B | 81.5 | 76.8 | 73.3 | 66.7 | 57.4 | 42.7 | 43.6 | 63.1 |
StableVicuña RLHF | 82.3 | 78.6 | 74.1 | 70.9 | 61.0 | 43.5 | 44.4 | 65.0 |
StableLM Tuned | 62.5 | 71.2 | 53.6 | 54.8 | 52.4 | 31.1 | 33.4 | 51.3 |
StableLM Base | 60.1 | 67.4 | 41.2 | 50.1 | 44.9 | 27.0 | 32.0 | 46.1 |
Koala 13B | 76.5 | 77.9 | 72.6 | 68.8 | 54.3 | 41.0 | 42.8 | 62.0 |
Open Assistant Pythia 12B | 67.9 | 78.0 | 68.1 | 65.0 | 64.2 | 40.4 | 43.2 | 61.0 |
Mosaic MPT-7B | 74.8 | 79.3 | 76.3 | 68.6 | 70.0 | 42.2 | 42.6 | 64.8 |
Mosaic MPT-instruct | 74.3 | 80.4 | 77.2 | 67.8 | 72.2 | 44.6 | 43.0 | 65.6 |
Mosaic MPT-chat | 77.1 | 78.2 | 74.5 | 67.5 | 69.4 | 43.3 | 44.2 | 64.9 |
Wizard 7B | 78.4 | 77.2 | 69.9 | 66.5 | 56.8 | 40.5 | 42.6 | 61.7 |
Wizard 7B Uncensored | 77.7 | 74.2 | 68.0 | 65.2 | 53.5 | 38.7 | 41.6 | 59.8 |
Wizard 13B Uncensored | 78.4 | 75.5 | 72.1 | 69.5 | 57.5 | 40.4 | 44.0 | 62.5 |
GPT4-x-Vicuna-13B | 81.3 | 75.0 | 75.2 | 65.0 | 58.7 | 43.9 | 43.6 | 63.2 |
Falcon 7B | 73.6 | 80.7 | 76.3 | 67.3 | 71.0 | 43.3 | 44.4 | 65.2 |
Falcon 7B instruct | 70.9 | 78.6 | 69.8 | 66.7 | 67.9 | 42.7 | 41.2 | 62.5 |
text-davinci-003 | 88.1 | 83.8 | 83.4 | 75.8 | 83.9 | 63.9 | 51.0 | 75.7 |
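
The seven task columns are standard zero-shot commonsense-reasoning benchmarks (BoolQ, PIQA, HellaSwag, WinoGrande, ARC-easy, ARC-challenge, OpenBookQA), and the scores read as accuracy percentages of the kind produced by evaluation suites such as EleutherAI's lm-evaluation-harness; the gist does not state the evaluation setup, so treat that as an assumption. The Avg column appears to be the unweighted mean of the seven task scores, rounded to one decimal place. The minimal Python sketch below recomputes it for two rows copied from the table:

```python
# Minimal sketch (assumed convention, not stated in the gist): the "Avg" column
# looks like the unweighted mean of the seven per-task scores, rounded to one decimal.

TASKS = ["BoolQ", "PIQA", "HellaSwag", "WinoGrande", "ARC-e", "ARC-c", "OBQA"]

# Two rows copied from the table above.
SCORES = {
    "GPT4All 13B snoozy": [83.3, 79.2, 75.0, 71.3, 60.9, 44.2, 43.4],
    "text-davinci-003":   [88.1, 83.8, 83.4, 75.8, 83.9, 63.9, 51.0],
}

def table_average(row):
    """Unweighted mean over the seven tasks, rounded as in the table."""
    return round(sum(row) / len(row), 1)

for model, row in SCORES.items():
    # Expected from the table: 65.3 for GPT4All 13B snoozy, 75.7 for text-davinci-003.
    print(f"{model}: Avg = {table_average(row)}")
```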