Graham Neubig neubig

@neubig
neubig / get_citations.py
Created October 21, 2021 22:39
Comparing citations between EMNLP 2020 and EMNLP 2020 findings
import requests
import sys
import time
sleep_time = 20
def query_api(url, session):
    global sleep_time
    time.sleep(sleep_time / 1000.0)
    r = session.get(url)
    while r.status_code == 429:
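The preview stops at the retry loop, so its body is not shown. As a rough sketch of one common way to handle HTTP 429 (rate limit) responses, and not the gist's actual code, the delay can be doubled before each retry. The names `get_with_backoff`, `max_retries`, and the injectable `sleep` parameter are hypothetical and introduced only for illustration.

```python
import time


def get_with_backoff(url, session, sleep_ms=20, max_retries=5, sleep=time.sleep):
    """Fetch url, doubling the delay after each HTTP 429 response.

    Hypothetical helper: session only needs a .get(url) method returning an
    object with a .status_code attribute (for example requests.Session).
    """
    for _ in range(max_retries):
        sleep(sleep_ms / 1000.0)  # pause before every request, as in the preview
        r = session.get(url)
        if r.status_code != 429:
            return r
        sleep_ms *= 2  # rate limited: wait twice as long before retrying
    raise RuntimeError(f"still rate limited after {max_retries} attempts: {url}")
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting; in normal use the default `time.sleep` applies.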
neubig / dispatch_openai_requests.py
Last active April 17, 2026 15:23
A simple script to get results from the OpenAI Asynchronous API
# NOTE:
# You can find an updated, more robust and feature-rich implementation
# in Zeno Build
# - Zeno Build: https://github.com/zeno-ml/zeno-build/
# - Implementation: https://github.com/zeno-ml/zeno-build/blob/main/zeno_build/models/providers/openai_utils.py
import openai
import asyncio
from typing import Any
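The rest of the script is not shown in the preview. As a generic illustration of the pattern the description suggests (sending many requests concurrently and collecting the results in order), here is a minimal sketch using only `asyncio`. It makes no OpenAI calls: `dispatch_requests`, `call_model`, `max_concurrency`, and `fake_model` are hypothetical names, and `fake_model` stands in for whatever real asynchronous API call would be used.

```python
import asyncio
from typing import Any, Awaitable, Callable


async def dispatch_requests(
    prompts: list[str],
    call_model: Callable[[str], Awaitable[Any]],
    max_concurrency: int = 4,
) -> list[Any]:
    """Run call_model on every prompt concurrently; results keep prompt order."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def limited(prompt: str) -> Any:
        async with semaphore:  # cap the number of in-flight requests
            return await call_model(prompt)

    return await asyncio.gather(*(limited(p) for p in prompts))


async def fake_model(prompt: str) -> str:
    """Stand-in for a real API call (assumption: no network access here)."""
    await asyncio.sleep(0.01)
    return prompt.upper()
```

A call such as `asyncio.run(dispatch_requests(["hello", "world"], fake_model))` runs both requests concurrently. `asyncio.gather` returns results in the order the awaitables were passed, regardless of which finishes first.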
neubig / openscholar_summary.txt
Created December 9, 2024 17:23
OpenScholar is a retrieval-augmented language model that assists researchers in synthesizing scientific literature. The system uses a database of 45 million open-access papers to provide citation-backed responses to queries, accurately identifying relevant passages and generating reliable answers across multiple scientific domains. This approach addresses the growing challenge of keeping up with rapidly expanding scientific literature.
The researchers developed ScholarQABench, a multi-domain benchmark for evaluating literature search capabilities, with 2,967 expert-written queries and 208 detailed answers across computer science, physics, neuroscience, and biomedicine. In testing, OpenScholar-8B outperformed GPT-4o by 5% and PaperQA2 by 7% in correctness metrics, despite being a smaller, open model.
Citation accuracy stands as a key strength of OpenScholar. While GPT-4o shows concerning citation hallucination rates of 78-90%, OpenScholar matches human expert-level accuracy in citation verification. The syst
neubig / create_naacl2025_calendar_files.py
Last active April 16, 2025 12:33
Create ics format calendar files for NAACL 2025
# Export your NAACL 2025 events into ics format
# 1. Go to the NAACL schedule and download it as a csv: https://docs.google.com/spreadsheets/d/1SXIF0ovLudQ4UvR0nTyagDcgnn9zdulhUY578mvQpRk/edit?gid=1679189789#gid=1679189789
# 2. Change MY_NAME below and run the program
# 3. Go to "Settings" in Google Calendar, click Import/Export, and import the file to your calendar
MY_NAME = "Neubig"
import csv
import os
from datetime import datetime
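The conversion code itself is cut off in the preview. For orientation, a minimal sketch of how events might be serialized to the ics (iCalendar) format is given below. It is not the gist's implementation: `make_ics`, `escape_text`, the `(title, start, end)` tuple layout, and the `PRODID`/`UID` values are assumptions for illustration, and the real spreadsheet columns are not shown. Line folding for long lines is not handled.

```python
from datetime import datetime, timezone


def escape_text(value):
    """Escape characters that are special in iCalendar TEXT values."""
    return (value.replace("\\", "\\\\").replace(";", "\\;")
                 .replace(",", "\\,").replace("\n", "\\n"))


def make_ics(events, now=None):
    """Build an iCalendar string from (title, start, end) tuples.

    start and end are naive datetimes written as floating local times;
    now (optional, timezone-aware) is used for the DTSTAMP field.
    """
    stamp = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    stamp_text = stamp.strftime("%Y%m%dT%H%M%SZ")
    fmt = "%Y%m%dT%H%M%S"
    lines = ["BEGIN:VCALENDAR", "VERSION:2.0", "PRODID:-//example//ics-sketch//EN"]
    for i, (title, start, end) in enumerate(events):
        lines += [
            "BEGIN:VEVENT",
            f"UID:event-{i}@example.invalid",  # placeholder unique id
            f"DTSTAMP:{stamp_text}",
            f"DTSTART:{start.strftime(fmt)}",
            f"DTEND:{end.strftime(fmt)}",
            f"SUMMARY:{escape_text(title)}",
            "END:VEVENT",
        ]
    lines.append("END:VCALENDAR")
    return "\r\n".join(lines) + "\r\n"  # iCalendar lines end with CRLF
```

Writing the returned string to a `.ics` file (opened with `newline=""` so the CRLF endings are preserved) gives a file that calendar applications can import.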
neubig / windows-openhands-setup.md
Created May 23, 2025 17:28

Running OpenHands GUI on Windows

This guide provides step-by-step instructions for running OpenHands on a Windows machine without using WSL or Docker.

Prerequisites

  1. Windows 10/11 - A modern Windows operating system
  2. PowerShell 5.1 or PowerShell 7+ - Windows PowerShell comes pre-installed on Windows 10/11, but PowerShell 7+ is recommended for better compatibility
  3. Python 3.12 - Python 3.12 is required (Python 3.14 is not supported due to pythonnet compatibility)
  4. Git - For cloning the repository and version control
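
Some of the prerequisites above can be checked before installing anything. The following is a small sketch, not part of the guide: `is_supported_python` and `missing_tools` are hypothetical helper names, and the check covers only the Python version and tools that should be on `PATH`.

```python
import shutil
import sys


def is_supported_python(version_info=sys.version_info):
    """Return True only for Python 3.12.x, the version the guide requires."""
    return tuple(version_info[:2]) == (3, 12)


def missing_tools(tools=("git",)):
    """Return the names of required command-line tools not found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    if not is_supported_python():
        print(f"Python 3.12 is required, found {sys.version.split()[0]}")
    for tool in missing_tools():
        print(f"Missing prerequisite: {tool}")
```

Other executables can be added to the check by passing their names, for example `missing_tools(("git", "pwsh"))` where `pwsh` is the PowerShell 7+ executable.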
neubig / neubig_prs_analysis.py
Created August 15, 2025 13:58
A script to analyze the number of PRs created by `neubig` and the number that openhands contributed to
import os, csv, requests, datetime as dt, calendar, time, random
from collections import OrderedDict
import argparse
import matplotlib.pyplot as plt
BASE_SEARCH_URL = 'https://api.github.com/search/issues'
def parse_link_header(value: str):
    links = {}
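The body of `parse_link_header` is cut off in the preview. As a hedged sketch of what parsing a paginated API's `Link` response header can look like (not necessarily how the gist does it), the function below maps each `rel` name to its URL. The name `parse_link_header_sketch` is hypothetical, and the regular expression assumes the simple `<url>; rel="name"` form that GitHub uses for pagination.

```python
import re

# Matches one entry of the form: <https://...>; rel="next"
_LINK_ENTRY = re.compile(r'<([^>]*)>\s*;\s*rel="([^"]*)"')


def parse_link_header_sketch(value):
    """Map each rel name to its URL, e.g. {"next": "https://..."}."""
    links = {}
    for match in _LINK_ENTRY.finditer(value or ""):
        url, rel = match.group(1), match.group(2)
        links[rel] = url
    return links
```

A caller can then follow `links.get("next")` until it is `None` to walk through every page of results. Matching entries with `finditer` instead of splitting on commas avoids breaking URLs that themselves contain commas.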
neubig / classify_affiliations.py
Created October 18, 2025 21:12
Plot affiliations of people publishing at ICLR/ICML/NeurIPS
# Get the data from here: https://github.com/martenlienen/icml-neurips-iclr-dataset
import pandas as pd
import re
from collections import Counter
import matplotlib.pyplot as plt
def classify_affiliation(affiliation):
    """Classify an affiliation into university, industry, or other."""
    if pd.isna(affiliation) or affiliation == 'None' or affiliation.strip() == '':
neubig / fusion_harness_example.py
Created June 29, 2026 23:28
Fusion Harness: How to combine a more expensive main model and a sidekick model
"""Fusion-style delegation harness built with the OpenHands SDK.

Install:
    uv pip install openhands-sdk openhands-tools

Run:
    export LLM_API_KEY="..."  # or export OPENHANDS_API_KEY="..."
    export MAIN_MODEL="openhands/gpt-5.5"
    export SIDEKICK_MODEL="openhands/minimax-m2.7"
    uv run python fusion_harness_example.py "Find and fix the failing tests in this repo."