Yan Lyesin yanlesin

yanlesin / python_sec_13f_list.py
Created October 21, 2024 00:08
Python parsing of SEC 13(f) list
from poppler import load_from_file, PageRenderer
import pandas as pd
import polars as pl
import numpy as np
pdf_document = load_from_file("/Users/yanlyesin/Downloads/13flist2024q3.pdf")
df_from_pdf = pd.DataFrame(columns=['text', 'page', 'cusip', 'issuer_name', 'issuer_description', 'status', 'text_length'])
for page in range(2, pdf_document.pages):
    text_from_pdf = pdf_document.create_page(page).text().split("\n")
    df_text = pd.DataFrame({'text': text_from_pdf, 'page': [page] * len(text_from_pdf)})
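The preview above stops before the `cusip`, `issuer_name`, `issuer_description`, and `status` columns are filled in. As a rough illustration only, one way to split a single extracted text line into those fields might look like the sketch below. The line layout (a CUSIP printed as `XXXXXX XX X`, an optional `*` flag, then fields separated by two or more spaces) is an assumption, not something the gist confirms, and `parse_line` and the sample line are hypothetical.

```python
import re

# ASSUMPTION: each data line starts with a CUSIP printed as "XXXXXX XX X",
# optionally followed by "*", then fields separated by 2+ spaces.
# This layout is not confirmed by the gist; adjust the pattern to the real PDF text.
LINE_RE = re.compile(
    r"^(?P<cusip>[0-9A-Z]{6} [0-9A-Z]{2} [0-9A-Z])\s*(?P<flag>\*?)\s+(?P<rest>.+)$"
)


def parse_line(line):
    """Hypothetical helper: return a dict of fields, or None for non-data lines."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None  # headers, page footers, blank lines
    # Split the remainder on runs of two or more spaces.
    fields = re.split(r"\s{2,}", match.group("rest").strip())
    return {
        "cusip": match.group("cusip").replace(" ", ""),
        "issuer_name": fields[0],
        "issuer_description": fields[1] if len(fields) > 1 else "",
        "status": fields[2] if len(fields) > 2 else "",
    }


# Made-up sample line, for illustration only (not taken from the SEC list).
row = parse_line("123456 10 9  EXAMPLE CORP  COM  ADDED")
```

Lines that do not begin with a CUSIP-shaped token return `None`, so they can be filtered out before the results are collected into a DataFrame.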
yanlesin / 13F_list_2018_Q2.R
Created October 15, 2018 16:54
Parsing list of securities for 13F Report
library(pdftools)
library(stringr)
library(dplyr)
library(openxlsx)
download.file("https://www.sec.gov/divisions/investment/13f/13flist2018q2.pdf",'13flist2018q2.pdf',mode='wb')
file <- "13flist2018q2.pdf"
text <- pdf_text(file)
pages <- length(text)

Keybase proof

I hereby claim:

  • I am yanlesin on github.
  • I am yanlyesin (https://keybase.io/yanlyesin) on keybase.
  • I have a public key ASCAsfLfenGHhinbpo_y-rMPX5-QEFJyT1d7kVFQvw6plgo

To claim this, I am signing this object: