Skip to content

Instantly share code, notes, and snippets.

@dannguyen
dannguyen / README.openai-structured-output-demo.md
Last active October 2, 2025 18:53
A basic test of OpenAI's Structured Output feature against financial disclosure reports and a newspaper's police blotter. Code examples use the Python SDK and pydantic for the schema definition.

Extracting financial disclosure reports and police blotter narratives using OpenAI's Structured Output

tl;dr this demo shows how to call OpenAI's gpt-4o-mini model, provide it with URL of a screenshot of a document, and extract data that follows a schema you define. The results are pretty solid even with little effort in defining the data — and no effort doing data prep. OpenAI's API could be a cost-efficient tool for large scale data gathering projects involving public documents.

OpenAI announced Structured Outputs for its API, a feature that allows users to specify the fields and schema of extracted data, and guarantees that the JSON output will follow that specification.

For example, given a Congressional financial disclosure report, with assets defined in a table like this:

@rain-1
rain-1 / GPT-4 Reverse Turing Test.md
Last active July 10, 2025 23:15
GPT-4 Reverse Turing Test

The reverse turing test

I asked GPT-4 to come up with 10 questions to determine if the answerer was AI or human.

I provided my own answers for these questions and I also asked ChatGPT to answer them.

The result is that GPT-4 was able to correctly differentiate between AI and Human.

@aparrish
aparrish / spacy_intro.ipynb
Last active March 14, 2025 21:43
NLP Concepts with spaCy. Code examples released under CC0 https://creativecommons.org/choose/zero/, other text released under CC BY 4.0 https://creativecommons.org/licenses/by/4.0/
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
@syllog1sm
syllog1sm / gist:10343947
Last active August 1, 2025 16:40
A simple Python dependency parser
"""A simple implementation of a greedy transition-based parser. Released under BSD license."""
from os import path
import os
import sys
from collections import defaultdict
import random
import time
import pickle
SHIFT = 0; RIGHT = 1; LEFT = 2;
@arjunvenkat
arjunvenkat / scraper_lab_p2.rb
Created December 12, 2012 17:48
scrape multiple pages using Nokogiri and Mechanize
# nokogiri requires open-uri
require 'nokogiri'
require 'open-uri'
# csv will be used to export data
require 'csv'
require 'mechanize'
# pp is useful to display mechanize objects
require 'pp'