Model | AGIEval | GPT4All | TruthfulQA | Bigbench |
---|---|---|---|---|
Anthropic_RLFH_ORDP_40k | 30.55 | Error: File does not exist | 45.38 | 36.75 |

Task | Version | Metric | Value | | Stderr |
---|---|---|---|---|---|
agieval_aqua_rat | 0 | acc | 21.26 | ± | 2.57 |
agieval_aqua_rat | 0 | acc_norm | 22.83 | ± | 2.64 |
agieval_logiqa_en | 0 | acc | 28.11 | ± | 1.76 |