🚀

Jordan Lazzaro JordanLazzaro

  • The Code Dungeon
@JordanLazzaro
JordanLazzaro / grpo_demo.py
Created October 30, 2025 05:50 — forked from willccbb/grpo_demo.py
GRPO Llama-1B
# train_grpo.py
#
# See https://github.com/willccbb/verifiers for ongoing developments
#
"""
citation:
@misc{brown2025grpodemo,
title={Granular Format Rewards for Eliciting Mathematical Reasoning Capabilities in Small Language Models},
author={Brown, William},
@JordanLazzaro
JordanLazzaro / simplempt.ipynb
Last active March 14, 2024 16:16
simplempt.ipynb
@JordanLazzaro
JordanLazzaro / transformersinanutshell.ipynb
Created November 14, 2022 00:37
TransformersInANutshell.ipynb