CultriX CultriX-Github

  • Netherlands
@CultriX-Github
CultriX-Github / Generate-QA-Dataset.py
Created June 3, 2025 17:24
Script for QA-style dataset generation from custom data:
#!/usr/bin/env python3
"""
Refactored Q&A Dataset Generation Script
========================================
Features:
- Separate configuration for generator vs. judge (API keys, endpoints, and models).
- Environment-variable and CLI-driven configuration.
- Consistent use of pathlib for file paths.
- Modular logging with debug mode.
CultriX-Github / Tally-Multi-Vote Dataset Generation.py
Last active January 27, 2025 22:13
Tally-Multi-Vote Dataset Generation.
import os
import requests
import random
import logging
import re
import time
import json
import matplotlib
matplotlib.use('Agg') # Set the backend to 'Agg' before importing pyplot
import matplotlib.pyplot as plt
#!/bin/bash
# Functions
install_basic_packages() {
    echo "Installing basic packages..."
    apt update -y && apt install -y screen nano git git-lfs speedometer htop libaio-dev || {
        echo "Failed to install basic packages" >&2
        exit 1
    }

| Model | AGIEval | GPT4All | TruthfulQA | Bigbench |
|---|---|---|---|---|
| Llama-3.2-3B | 25.76 | Error: File does not exist | 39.22 | 34.61 |

AGIEval

| Task | Version | Metric | Value | Stderr |
|---|---|---|---|---|
| agieval_aqua_rat | 0 | acc | 20.87 | ± 2.55 |
| | | acc_norm | 23.23 | ± 2.65 |
| agieval_logiqa_en | 0 | acc | 23.96 | ± 1.67 |

| Model | AGIEval | GPT4All | TruthfulQA | Bigbench |
|---|---|---|---|---|
| Llama-3.2-3B-DPO | 27.06 | Error: File does not exist | 58.93 | 34.96 |

AGIEval

| Task | Version | Metric | Value | Stderr |
|---|---|---|---|---|
| agieval_aqua_rat | 0 | acc | 18.90 | ± 2.46 |
| | | acc_norm | 20.87 | ± 2.55 |
| agieval_logiqa_en | 0 | acc | 26.11 | ± 1.72 |

| Model | AGIEval | GPT4All | TruthfulQA | Bigbench |
|---|---|---|---|---|
| Llama3-8B-function-calling-uncensored-dareties | 39.15 | Error: File does not exist | 54.99 | 42.52 |

AGIEval

| Task | Version | Metric | Value | Stderr |
|---|---|---|---|---|
| agieval_aqua_rat | 0 | acc | 24.41 | ± 2.70 |
| | | acc_norm | 23.23 | ± 2.65 |
| agieval_logiqa_en | 0 | acc | 34.56 | ± 1.87 |

| Model | AGIEval | GPT4All | TruthfulQA | Bigbench |
|---|---|---|---|---|
| Llama3-8B-function-calling-dpo-slerp | 39.52 | Error: File does not exist | 56.01 | 42.8 |

AGIEval

| Task | Version | Metric | Value | Stderr |
|---|---|---|---|---|
| agieval_aqua_rat | 0 | acc | 25.98 | ± 2.76 |
| | | acc_norm | 23.62 | ± 2.67 |
| agieval_logiqa_en | 0 | acc | 38.25 | ± 1.91 |

| Model | AGIEval | GPT4All | TruthfulQA | Bigbench |
|---|---|---|---|---|
| Hermes-3-Llama-3.1-8B | 41.51 | Error: File does not exist | 58.61 | 43.08 |

AGIEval

| Task | Version | Metric | Value | Stderr |
|---|---|---|---|---|
| agieval_aqua_rat | 0 | acc | 26.38 | ± 2.77 |
| | | acc_norm | 25.20 | ± 2.73 |
| agieval_logiqa_en | 0 | acc | 39.02 | ± 1.91 |

| Model | AGIEval | GPT4All | TruthfulQA | Bigbench |
|---|---|---|---|---|
| Llama3-8B-DPO | 41.87 | Error: File does not exist | 71.38 | 44.5 |

AGIEval

| Task | Version | Metric | Value | Stderr |
|---|---|---|---|---|
| agieval_aqua_rat | 0 | acc | 21.65 | ± 2.59 |
| | | acc_norm | 20.47 | ± 2.54 |
| agieval_logiqa_en | 0 | acc | 40.71 | ± 1.93 |
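
| Model | AGIEval | GPT4All | TruthfulQA | Bigbench | Average |
|---|---|---|---|---|---|
| Phi-3-mini-4k-instruct | 44.44 | 71.88 | 57.77 | 41.9 | 54 |

AGIEval

| Task | Version | Metric | Value | Stderr |
|---|---|---|---|---|
| agieval_aqua_rat | 0 | acc | 29.13 | ± 2.86 |
| | | acc_norm | 28.74 | ± 2.85 |
| agieval_logiqa_en | 0 | acc | 42.86 | ± 1.94 |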
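
The Average value in the Phi-3-mini-4k-instruct summary row appears to be the arithmetic mean of the four benchmark scores; the listing itself does not say how it was computed, so that is an assumption. A minimal sketch under that assumption, which also returns no average when a score is missing (as with the "Error: File does not exist" entries in the other summaries), might look like this. The function name `average_score` and the dictionary layout are hypothetical and not taken from any of the gists.

```python
from typing import Mapping, Optional, Union

# A score is either a number or an error string such as
# "Error: File does not exist" (as seen in the GPT4All column).
Score = Union[float, str]


def average_score(scores: Mapping[str, Score]) -> Optional[float]:
    """Return the mean of all benchmark scores, rounded to two decimals.

    Assumption: the Average column is a plain arithmetic mean.
    Returns None if any score is missing or non-numeric.
    """
    numeric = [v for v in scores.values() if isinstance(v, (int, float))]
    if not numeric or len(numeric) != len(scores):
        return None
    return round(sum(numeric) / len(numeric), 2)


# Values copied from the summary rows in the listing.
phi3 = {"AGIEval": 44.44, "GPT4All": 71.88, "TruthfulQA": 57.77, "Bigbench": 41.9}
llama32 = {
    "AGIEval": 25.76,
    "GPT4All": "Error: File does not exist",
    "TruthfulQA": 39.22,
    "Bigbench": 34.61,
}

print(average_score(phi3))     # mean of the four Phi-3 scores
print(average_score(llama32))  # None, because the GPT4All score is an error
```

For the Phi-3-mini-4k-instruct row the mean is 215.99 / 4 = 53.9975, which is consistent with the reported 54.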