Skip to content

Instantly share code, notes, and snippets.

View kventil's full-sized avatar
🐌

Robert Hibbeler kventil

🐌
View GitHub Profile
@dannguyen
dannguyen / README.openai-structured-output-demo.md
Last active November 5, 2024 01:58
A basic test of OpenAI's Structured Output feature against financial disclosure reports and a newspaper's police blotter. Code examples use the Python SDK and pydantic for the schema definition.

Extracting financial disclosure reports and police blotter narratives using OpenAI's Structured Output

tl;dr this demo shows how to call OpenAI's gpt-4o-mini model, provide it with URL of a screenshot of a document, and extract data that follows a schema you define. The results are pretty solid even with little effort in defining the data — and no effort doing data prep. OpenAI's API could be a cost-efficient tool for large scale data gathering projects involving public documents.

OpenAI announced Structured Outputs for its API, a feature that allows users to specify the fields and schema of extracted data, and guarantees that the JSON output will follow that specification.

For example, given a Congressional financial disclosure report, with assets defined in a table like this:

anonymous
anonymous / GAME_MASTER_v0_1.protobuf
Created July 16, 2016 16:31
Pokemon Go decoded GAME_MASTER protobuf file v0.1
Result: 1
Items {
TemplateId: "BADGE_BATTLE_ATTACK_WON"
Badge {
BadgeType: BADGE_BATTLE_ATTACK_WON
BadgeRanks: 4
Targets: "\nd\350\007"
}
}
Items {
@kventil
kventil / Things to ask
Last active January 27, 2016 09:03
Interview Fragen an das Unternehmen
Fragen an alle:
1. Was habt ihr gestern gemacht? -> Was habt ihr umgesetzt?
2. Wann ist euer nächstes großes Release?
3. Dokumentation?
4. Was ist das größte Problem an dem ihr gerade arbeitet?
5. Bisher Coolstes/Arbeitsintensivstes Feature?
6. Wie steht ihr zu Büro-Hunden? / Hat jemand eine Hundeallergie?
7. Wie sahen eure Firmenevents bisher aus?
8. Gröster Durchbruch bisher?