Model | AGIEval | GPT4All | TruthfulQA | Bigbench |
---|---|---|---|---|
recoilme-gemma-2-9B-v0.3 | 40.71 | Error: File does not exist | 58.62 | Error: File does not exist |
| Task | Version | Metric | Value (%) | Stderr |
|---|---|---|---|---|
| agieval_aqua_rat | 0 | acc | 22.44 | ± 2.62 |
| | | acc_norm | 24.41 | ± 2.70 |
| agieval_logiqa_en | 0 | acc | 36.56 | ± 1.89 |
| | | acc_norm | 37.33 | ± 1.90 |
| agieval_lsat_ar | 0 | acc | 22.17 | ± 2.75 |
| | | acc_norm | 21.74 | ± 2.73 |
| agieval_lsat_lr | 0 | acc | 44.90 | ± 2.20 |
| | | acc_norm | 41.96 | ± 2.19 |
| agieval_lsat_rc | 0 | acc | 65.43 | ± 2.91 |
| | | acc_norm | 61.34 | ± 2.97 |
| agieval_sat_en | 0 | acc | 77.67 | ± 2.91 |
| | | acc_norm | 76.70 | ± 2.95 |
| agieval_sat_en_without_passage | 0 | acc | 30.10 | ± 3.20 |
| | | acc_norm | 27.67 | ± 3.12 |
| agieval_sat_math | 0 | acc | 35.91 | ± 3.24 |
| | | acc_norm | 34.55 | ± 3.21 |

AGIEval average: 40.71%

GPT4All average: not available (Error: File does not exist)
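
The AGIEval average of 40.71% matches the unweighted mean of the eight `acc_norm` values in the AGIEval table. This is inferred from the numbers, not stated by the report itself. A minimal sketch of that check, with the values copied from the table and the variable names chosen only for illustration:

```python
from statistics import mean

# acc_norm values (in %) copied from the AGIEval table.
# Assumption: the reported average is the plain mean of these eight numbers.
acc_norm = {
    "agieval_aqua_rat": 24.41,
    "agieval_logiqa_en": 37.33,
    "agieval_lsat_ar": 21.74,
    "agieval_lsat_lr": 41.96,
    "agieval_lsat_rc": 61.34,
    "agieval_sat_en": 76.70,
    "agieval_sat_en_without_passage": 27.67,
    "agieval_sat_math": 34.55,
}

average = mean(acc_norm.values())
print(f"AGIEval average: {average:.2f}%")
```

The same calculation on the `acc` column gives a different figure, so the report's average appears to be based on `acc_norm` rather than `acc`.
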
| Task | Version | Metric | Value (%) | Stderr |
|---|---|---|---|---|
| truthfulqa_mc | 1 | mc1 | 40.02 | ± 1.72 |
| | | mc2 | 58.62 | ± 1.51 |

TruthfulQA average: 58.62%

Bigbench average: not available (Error: File does not exist)
Average score: Not available due to errors
Elapsed time: 03:33:24