Skip to content

Instantly share code, notes, and snippets.

@dannguyen
dannguyen / README.openai-structured-output-demo.md
Last active April 28, 2025 03:31
A basic test of OpenAI's Structured Output feature against financial disclosure reports and a newspaper's police blotter. Code examples use the Python SDK and pydantic for the schema definition.

Extracting financial disclosure reports and police blotter narratives using OpenAI's Structured Output

tl;dr this demo shows how to call OpenAI's gpt-4o-mini model, provide it with URL of a screenshot of a document, and extract data that follows a schema you define. The results are pretty solid even with little effort in defining the data — and no effort doing data prep. OpenAI's API could be a cost-efficient tool for large scale data gathering projects involving public documents.

OpenAI announced Structured Outputs for its API, a feature that allows users to specify the fields and schema of extracted data, and guarantees that the JSON output will follow that specification.

For example, given a Congressional financial disclosure report, with assets defined in a table like this:

@Impact123
Impact123 / Check Space.md
Last active May 5, 2025 15:47
Check Space

How to check HAOS space usage

The goal of this Article is to teach you how to find out what takes how much space on your HAOS system.
At the end of this article you should have an interactive way to explore your storage similar to this.
Animation

To check the recorder database size/stats I recommend the DBStats addon/container.

Table of contents