Skip to content

Instantly share code, notes, and snippets.

View reichaves's full-sized avatar
🎯
Focusing

Reinaldo Chaves reichaves

🎯
Focusing
View GitHub Profile
@sergiospagnuolo
sergiospagnuolo / funcao_robots.R
Created October 29, 2024 16:09
Identifica presença de parâmetros de agentes de IA em arquivos robots.txt
library(httr)
library(stringr)
# Agentes de AI, mapeados do nytimes.com/robots.txt
ai_keywords <- c(
"GPTBot", "ChatGPT-User", "PerplexityBot", "Amazonbot", "ClaudeBot",
"Omgilibot", "FacebookBot", "Applebot", "Applebot-Extended", "anthropic-ai", "Bytespider",
"Claude-Web", "YouBot", "CCBot", "Google-Extended", "Quora-Bot", "Meta-ExternalAgent"
)
@dannguyen
dannguyen / README.openai-structured-output-demo.md
Last active October 2, 2025 18:53
A basic test of OpenAI's Structured Output feature against financial disclosure reports and a newspaper's police blotter. Code examples use the Python SDK and pydantic for the schema definition.

Extracting financial disclosure reports and police blotter narratives using OpenAI's Structured Output

tl;dr this demo shows how to call OpenAI's gpt-4o-mini model, provide it with URL of a screenshot of a document, and extract data that follows a schema you define. The results are pretty solid even with little effort in defining the data — and no effort doing data prep. OpenAI's API could be a cost-efficient tool for large scale data gathering projects involving public documents.

OpenAI announced Structured Outputs for its API, a feature that allows users to specify the fields and schema of extracted data, and guarantees that the JSON output will follow that specification.

For example, given a Congressional financial disclosure report, with assets defined in a table like this:

@jsoma
jsoma / Code.gs
Created March 6, 2024 11:36
Tiny little script to help you validate LLM responses in Google Sheets
function onOpen() {
const ui = SpreadsheetApp.getUi();
// Adds a custom menu to the Google Sheets UI
ui.createMenu('Checking helper')
.addItem('Create Sample', 'showStratificationPrompt')
.addToUi();
}
function showStratificationPrompt() {
const ui = SpreadsheetApp.getUi();
@jkeefe
jkeefe / access.json
Last active May 3, 2024 19:22
Access One Bucket Policy
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"s3:ListAllMyBuckets"
],
"Resource": "arn:aws:s3:::*"
},
@jkeefe
jkeefe / cors.json
Last active October 25, 2023 22:38
Open CORS policy for AWS S3
[
{
"AllowedHeaders": [
"*"
],
"AllowedMethods": [
"GET",
"HEAD"
],
"AllowedOrigins": [
@turicas
turicas / Transcrição de textos em Português com whisper (OpenAI).ipynb
Last active September 10, 2025 18:40
Transcrição de textos em Português com whisper (OpenAI)
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
@johnymontana
johnymontana / 0_import.cypher
Last active April 12, 2023 15:33
NICAR 2023 Neo4j Workshop
CREATE CONSTRAINT IF NOT EXISTS FOR (n:Parcel) REQUIRE n.neo4jImportId IS UNIQUE;
CREATE CONSTRAINT IF NOT EXISTS FOR (n:Subject) REQUIRE n.neo4jImportId IS UNIQUE;
CREATE CONSTRAINT IF NOT EXISTS FOR (n:Bill) REQUIRE n.neo4jImportId IS UNIQUE;
CREATE CONSTRAINT IF NOT EXISTS FOR (n:Committee) REQUIRE n.neo4jImportId IS UNIQUE;
CREATE CONSTRAINT IF NOT EXISTS FOR (n:Legislator) REQUIRE n.neo4jImportId IS UNIQUE;
CREATE CONSTRAINT IF NOT EXISTS FOR (n:Trip) REQUIRE n.neo4jImportId IS UNIQUE;
CREATE CONSTRAINT IF NOT EXISTS FOR (n:Organization) REQUIRE n.neo4jImportId IS UNIQUE;
CREATE CONSTRAINT IF NOT EXISTS FOR (n:Destination) REQUIRE n.neo4jImportId IS UNIQUE;
CALL apoc.import.json("https://cdn.neo4jlabs.com/data/landgraph/landgraph.json");
@fernandobarbalho
fernandobarbalho / get_cofog_data.r
Created July 14, 2021 15:02
Extração de dados do cofog diretamente da base de dados abertos do Tesouro Transparente
library(readxl)
library(ckanr)
library(purrr)
ckanr::package_search()
package<- ckanr::package_show(id= "22d13d17-bf69-4a1a-add2-25cc1e25f2d7",
url= "https://www.tesourotransparente.gov.br/ckan") #busca todos os dados do dataset que se refere aos dados de COFOG
@svavassori
svavassori / powerbi_extractor.py
Last active September 2, 2025 15:12
PowerBi Extractor in Python
import sys
import csv
import json
# Converts the JSON output of a PowerBI query to a CSV file
def extract(input_file, output_file):
input_json = read_json(input_file)
data = input_json["results"][0]["result"]["data"]
dm0 = data["dsr"]["DS"][0]["PH"][0]["DM0"]
import geopandas as gpd
import folium
import matplotlib.pyplot as plt
# read date
url = 'https://geoservicos.pbh.gov.br/geoserver/wfs?service=WFS&version=1.0.0&request=GetFeature&typeName=ide_bhgeo:BAIRRO&srsName=EPSG:31983&outputFormat=application%2Fjson'
gdf = gpd.read_file(url)
# check if data is right
fig, ax = plt.subplots(figsize=(10,10))