Kyle Avery kyleavery

kyleavery / pdf_to_md.py
Created December 1, 2024 23:53
PDF to Markdown
import os
import base64
from concurrent.futures import ThreadPoolExecutor, as_completed
import openai
from pdf2image import convert_from_path
from PIL import Image
kyleavery / scrape.py
Created September 28, 2024 19:22
URLs to Markdown (Jina AI)
import requests
import re
import json
import time
INFILE = "url_list.txt"
OUTFILE = "html_content.jsonl"
LOGFILE = "failed_urls.txt"
DELAY = 5
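
The preview above stops after the constants, so the rest of the script is not shown. As a rough sketch of how such constants are typically used (not the author's actual code), the loop below walks a list of URLs, appends each successful result to a JSONL file, records failures in a log file, and pauses between requests. The `scrape` function, its `fetch` parameter, and the `url`/`content` record keys are hypothetical names introduced here for illustration.

```python
import json
import time

# Constants copied from the gist preview.
INFILE = "url_list.txt"
OUTFILE = "html_content.jsonl"
LOGFILE = "failed_urls.txt"
DELAY = 5


def scrape(urls, fetch, outfile=OUTFILE, logfile=LOGFILE, delay=DELAY):
    """Hypothetical helper: fetch each URL and record the outcome.

    `fetch` is any callable that takes a URL and returns text, raising an
    exception on failure. Successes are appended to `outfile` as one JSON
    object per line; failed URLs are appended to `logfile`.
    Returns a (succeeded, failed) tuple of counts.
    """
    succeeded = 0
    failed = 0
    with open(outfile, "a", encoding="utf-8") as out, \
            open(logfile, "a", encoding="utf-8") as log:
        for index, url in enumerate(urls):
            if index > 0:
                # Wait between requests to stay polite to the remote service.
                time.sleep(delay)
            try:
                content = fetch(url)
            except Exception:
                log.write(url + "\n")
                failed += 1
                continue
            # Assumed record layout: the original schema is not visible.
            out.write(json.dumps({"url": url, "content": content}) + "\n")
            succeeded += 1
    return succeeded, failed
```

In practice `fetch` could be a thin wrapper around `requests.get` that raises on a non-2xx status, and the URL list could be read from `INFILE` one line at a time. Passing the fetcher in as a parameter keeps the loop testable without network access.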

Keybase proof

I hereby claim:

  • I am kyleavery on github.
  • I am kyleavery (https://keybase.io/kyleavery) on keybase.
  • I have a public key ASBCemRfWsmURNQxCq7h3bwpduOcmF-oZmtsHU3RfljxFgo

To claim this, I am signing this object: