GitHub Gists by Maxime Labonne (mlabonne)
mlabonne / finetune_llama2.py
Last active January 22, 2025 15:02
Easy Llama 2 fine-tuning script (📝 Article: https://tinyurl.com/finetunellama2)
```python
# Based on younesbelkada/finetune_llama_v2.py
# Install the following libraries:
# pip install accelerate==0.21.0 peft==0.4.0 bitsandbytes==0.40.2 transformers==4.31.0 trl==0.4.7 scipy
from dataclasses import dataclass, field
from typing import Optional
import torch
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,   # completion of the truncated import; these are
    BitsAndBytesConfig,     # the classes a QLoRA script of this kind needs
    HfArgumentParser,
    TrainingArguments,
)
```
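The preview above stops at the imports, so here is a minimal QLoRA sketch of the same recipe. The dataset name and hyperparameters follow the linked article where known and are otherwise illustrative assumptions; the calls match the pinned library versions above (trl 0.4.7, peft 0.4.0), not current APIs.

```python
import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    TrainingArguments,
)
from trl import SFTTrainer

# 1k-sample Guanaco subset used in the linked article
dataset = load_dataset("mlabonne/guanaco-llama2-1k", split="train")

# 4-bit NF4 quantization (QLoRA): the frozen base model is loaded in 4 bits
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")
tokenizer.pad_token = tokenizer.eos_token

# Only the low-rank LoRA matrices are trained; values are illustrative
peft_config = LoraConfig(
    r=64, lora_alpha=16, lora_dropout=0.1, bias="none", task_type="CAUSAL_LM"
)

trainer = SFTTrainer(
    model=model,
    train_dataset=dataset,
    peft_config=peft_config,
    dataset_text_field="text",  # samples pre-formatted in the Llama 2 template
    max_seq_length=512,
    tokenizer=tokenizer,
    args=TrainingArguments(
        output_dir="./results",
        per_device_train_batch_size=4,
        learning_rate=2e-4,
        logging_steps=25,
        max_steps=250,
    ),
)
trainer.train()
trainer.model.save_pretrained("./qlora-out")  # saves the adapter only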
mlabonne / merge_peft.py
Last active May 29, 2025 13:58
Merge base model and peft adapter and push it to HF hub
```python
# Example usage:
# python merge_peft.py --base_model=meta-llama/Llama-2-7b-hf --peft_model=./qlora-out --hub_id=alpaca-qlora
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch
import argparse

def get_args():
    # Completion of the truncated preview: flags mirror the example usage above
    parser = argparse.ArgumentParser()
    parser.add_argument("--base_model", type=str, required=True)
    parser.add_argument("--peft_model", type=str, required=True)
    parser.add_argument("--hub_id", type=str, required=True)
    return parser.parse_args()
```
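A condensed sketch of the rest of the merge flow, using the standard PEFT pattern (merge_and_unload() followed by push_to_hub()) rather than quoting the gist verbatim; the identifier values come from the example usage above.

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Values from the example usage above
base_model, peft_model, hub_id = "meta-llama/Llama-2-7b-hf", "./qlora-out", "alpaca-qlora"

base = AutoModelForCausalLM.from_pretrained(base_model, torch_dtype=torch.float16)
model = PeftModel.from_pretrained(base, peft_model)
model = model.merge_and_unload()  # fold the LoRA weights into the base model

tokenizer = AutoTokenizer.from_pretrained(base_model)
model.push_to_hub(hub_id)       # upload merged weights to the Hugging Face Hub
tokenizer.push_to_hub(hub_id)
```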
mlabonne / EvolCodeLlama-7b.yaml
Last active March 10, 2024 04:53
Axolotl config file to train mlabonne/EvolCodeLlama-7b (https://huggingface.co/mlabonne/EvolCodeLlama-7b)
```yaml
base_model: codellama/CodeLlama-7b-hf
base_model_config: codellama/CodeLlama-7b-hf
model_type: LlamaForCausalLM
tokenizer_type: LlamaTokenizer
is_llama_derived_model: true
hub_model_id: EvolCodeLlama-7b
load_in_8bit: false
load_in_4bit: true
strict: false
```
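This 4-bit setup (`load_in_8bit: false`, `load_in_4bit: true`) corresponds to QLoRA-style training. As a usage sketch, Axolotl runs are typically launched through Accelerate, e.g. `accelerate launch -m axolotl.cli.train EvolCodeLlama-7b.yaml`; the exact entry point can vary between Axolotl versions, so treat the command as an assumption rather than the one used for the published model.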
mlabonne / YALL - Yet Another LLM Leaderboard.md
Last active November 9, 2025 19:21
Leaderboard made with 🧐 LLM AutoEval (https://github.com/mlabonne/llm-autoeval) using the Nous benchmark suite.
| Model | AGIEval | GPT4All | TruthfulQA | Bigbench | Average |
| --- | --- | --- | --- | --- | --- |
| zephyr-7b-alpha | 38 | 72.24 | 56.06 | 40.57 | 51.72 |

AGIEval

| Task | Version | Metric | Value | Stderr |
| --- | --- | --- | --- | --- |
| agieval_aqua_rat | 0 | acc | 20.47 | ± 2.54 |
| | | acc_norm | 19.69 | ± 2.50 |
| agieval_logiqa_en | 0 | acc | 31.49 | ± 1.82 |

| Model | AGIEval | GPT4All | TruthfulQA | Bigbench | Average |
| --- | --- | --- | --- | --- | --- |
| dolphin-2.2.1-mistral-7b | 38.64 | 72.24 | 54.09 | 39.22 | 51.05 |

AGIEval

| Task | Version | Metric | Value | Stderr |
| --- | --- | --- | --- | --- |
| agieval_aqua_rat | 0 | acc | 23.23 | ± 2.65 |
| | | acc_norm | 21.26 | ± 2.57 |
| agieval_logiqa_en | 0 | acc | 35.48 | ± 1.88 |

| Model | AGIEval | GPT4All | TruthfulQA | Bigbench | Average |
| --- | --- | --- | --- | --- | --- |
| Mistral-7B-Instruct-v0.2 | 38.5 | 71.64 | 66.82 | 42.29 | 54.81 |

AGIEval

| Task | Version | Metric | Value | Stderr |
| --- | --- | --- | --- | --- |
| agieval_aqua_rat | 0 | acc | 23.62 | ± 2.67 |
| | | acc_norm | 22.05 | ± 2.61 |
| agieval_logiqa_en | 0 | acc | 36.10 | ± 1.88 |

| Model | AGIEval | GPT4All | TruthfulQA | Bigbench | Average |
| --- | --- | --- | --- | --- | --- |
| MistralTrix-v1 | 44.98 | 76.62 | 71.44 | 47.17 | 60.05 |

AGIEval

| Task | Version | Metric | Value | Stderr |
| --- | --- | --- | --- | --- |
| agieval_aqua_rat | 0 | acc | 25.59 | ± 2.74 |
| | | acc_norm | 24.80 | ± 2.72 |
| agieval_logiqa_en | 0 | acc | 37.48 | ± 1.90 |

| Model | AGIEval | GPT4All | TruthfulQA | Bigbench | Average |
| --- | --- | --- | --- | --- | --- |
| zephyr-7b-beta | 37.33 | 71.83 | 55.1 | 39.7 | 50.99 |

AGIEval

| Task | Version | Metric | Value | Stderr |
| --- | --- | --- | --- | --- |
| agieval_aqua_rat | 0 | acc | 21.26 | ± 2.57 |
| | | acc_norm | 20.47 | ± 2.54 |
| agieval_logiqa_en | 0 | acc | 33.33 | ± 1.85 |

| Model | AGIEval | GPT4All | TruthfulQA | Bigbench | Average |
| --- | --- | --- | --- | --- | --- |
| openchat_3.5 | 42.67 | 72.92 | 47.27 | 42.51 | 51.34 |

AGIEval

| Task | Version | Metric | Value | Stderr |
| --- | --- | --- | --- | --- |
| agieval_aqua_rat | 0 | acc | 24.02 | ± 2.69 |
| | | acc_norm | 24.80 | ± 2.72 |
| agieval_logiqa_en | 0 | acc | 38.86 | ± 1.91 |
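For reference, the Average column in each summary row is the unweighted mean of the four benchmark scores: for zephyr-7b-alpha, (38 + 72.24 + 56.06 + 40.57) / 4 ≈ 51.72. A minimal sketch of that aggregation (scores copied from the rows above):

```python
rows = {
    "zephyr-7b-alpha": [38.0, 72.24, 56.06, 40.57],
    "openchat_3.5": [42.67, 72.92, 47.27, 42.51],
}
for name, scores in rows.items():
    avg = sum(scores) / len(scores)
    print(f"{name}: {avg:.2f}")  # 51.72 and 51.34, matching the tables above
```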