Adrian Gao mgao6767

@mgao6767
mgao6767 / hist_hq_state_zipcode_from_8k.py
Created March 19, 2023 22:42
Script to get a firm's historical headquarters state and ZIP code from 8-K filings. See https://mingze-gao.com/posts/firm-historical-headquarter-state-from-10k/
import sqlite3, sys, os, pathlib, logging
import pandas as pd
# fmt: off
states = ['AK', 'AL', 'AR', 'AZ', 'CA', 'CO', 'CT', 'DC', 'DE', 'FL', 'GA',
          'HI', 'IA', 'ID', 'IL', 'IN', 'KS', 'KY', 'LA', 'MA', 'MD', 'ME',
          'MI', 'MN', 'MO', 'MS', 'MT', 'NC', 'ND', 'NE', 'NH', 'NJ', 'NM',
          'NV', 'NY', 'OH', 'OK', 'OR', 'PA', 'RI', 'SC', 'SD', 'TN', 'TX',
          'UT', 'VA', 'VT', 'WA', 'WI', 'WV', 'WY']
# fmt: on
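
The preview above stops before any parsing logic, so the following is only a minimal sketch of one way to pull the headquarters state and ZIP code out of a filing header. It assumes the plain-text EDGAR header layout in which a `BUSINESS ADDRESS:` block carries `STATE:` and `ZIP:` lines; the function name `business_address_state_zip` and the sample header (a made-up company) are hypothetical and are not taken from the gist.

```python
import re


def business_address_state_zip(header_text):
    """Return (state, zip_code) from the BUSINESS ADDRESS block, or (None, None)."""
    # Capture everything after "BUSINESS ADDRESS:" up to a blank line,
    # the next "MAIL ADDRESS:" block, or the end of the text.
    block = re.search(
        r"BUSINESS ADDRESS:(.*?)(?:\n\s*\n|MAIL ADDRESS:|\Z)",
        header_text,
        flags=re.DOTALL,
    )
    if block is None:
        return None, None
    # "STATE:" must start the field, so "STATE OF INCORPORATION:" is not matched.
    state = re.search(r"^\s*STATE:\s*(\S+)", block.group(1), flags=re.MULTILINE)
    zip_code = re.search(r"^\s*ZIP:\s*(\S+)", block.group(1), flags=re.MULTILINE)
    return (
        state.group(1) if state else None,
        zip_code.group(1) if zip_code else None,
    )


# Hypothetical header text for illustration only.
SAMPLE_HEADER = "\n".join([
    "COMPANY DATA:",
    "\tCOMPANY CONFORMED NAME:\tEXAMPLE CORP",
    "\tSTATE OF INCORPORATION:\tDE",
    "",
    "BUSINESS ADDRESS:",
    "\tSTREET 1:\t1 MAIN ST",
    "\tCITY:\tSPRINGFIELD",
    "\tSTATE:\tIL",
    "\tZIP:\t62701",
    "",
    "MAIL ADDRESS:",
    "\tSTREET 1:\tPO BOX 1",
    "\tCITY:\tNEW YORK",
    "\tSTATE:\tNY",
    "\tZIP:\t10001",
])

state, zip_code = business_address_state_zip(SAMPLE_HEADER)
```

The returned state could then be checked against the `states` list above before it is stored.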
mgao6767 / Shumway.sas
Last active August 27, 2024 21:46
Original SAS code in Bharath and Shumway (2008 RFS) for Merton DD. See comment below for some minor issues.
/* This SAS program calculates the distance to default using the KMV-Merton model
with the iterated estimate of the volatility of firm value. Many of the results
of Bharath and Shumway (2004) are generated by this program. The program
requires the data described below, and it generates a permanent sas data file
called ssd.kmv which contains distances to default. The program calculates
monthly distances to default every year from 1980 to 1990 as it is currently written*/
/* This program requires two datasets:
ssd.comp, which contains monthly observations taken from the quarterly compustat
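
As a rough illustration of the iterated volatility estimate that the header comment describes, here is a minimal Python sketch. It is not a translation of the SAS program: the function names, the bisection solver, the 252 trading periods per year, and the convergence tolerance are all assumptions made for this example.

```python
import math


def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def equity_value(V, F, r, sigma_v, T=1.0):
    """Merton model: equity as a call option on firm assets V with strike F."""
    d1 = (math.log(V / F) + (r + 0.5 * sigma_v ** 2) * T) / (sigma_v * math.sqrt(T))
    d2 = d1 - sigma_v * math.sqrt(T)
    return V * norm_cdf(d1) - F * math.exp(-r * T) * norm_cdf(d2)


def implied_asset_value(E, F, r, sigma_v, T=1.0):
    """Solve equity_value(V) = E for V by bisection."""
    # The call value is at most V and at least V - F*exp(-r*T),
    # so the root lies between E and E + F*exp(-r*T).
    lo, hi = E, E + F * math.exp(-r * T)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if equity_value(mid, F, r, sigma_v, T) < E:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def annualised_stats(values, periods_per_year):
    """Annualised mean and standard deviation of log returns."""
    rets = [math.log(b / a) for a, b in zip(values, values[1:])]
    mean = sum(rets) / len(rets)
    var = sum((x - mean) ** 2 for x in rets) / (len(rets) - 1)
    return mean * periods_per_year, math.sqrt(var * periods_per_year)


def distance_to_default(equity, F, r, T=1.0, periods_per_year=252,
                        tol=1e-3, max_iter=100):
    """Iterate on asset volatility, then return (DD, sigma_v, latest asset value)."""
    _, sigma_e = annualised_stats(equity, periods_per_year)
    sigma_v = sigma_e * equity[-1] / (equity[-1] + F)  # starting guess
    for _ in range(max_iter):
        assets = [implied_asset_value(E, F, r, sigma_v, T) for E in equity]
        mu, new_sigma_v = annualised_stats(assets, periods_per_year)
        converged = abs(new_sigma_v - sigma_v) < tol
        sigma_v = new_sigma_v
        if converged:
            break
    V = assets[-1]
    dd = (math.log(V / F) + (mu - 0.5 * sigma_v ** 2) * T) / (sigma_v * math.sqrt(T))
    return dd, sigma_v, V


# Synthetic equity series, face value of debt, and risk-free rate (made up).
equity = [100.0 * math.exp(0.02 * math.sin(i)) for i in range(60)]
dd, sigma_v, V = distance_to_default(equity, F=80.0, r=0.03)
```

Each pass backs out an asset value series from equity values at the current volatility guess, re-estimates volatility from that series, and stops once the guess changes by less than `tol`.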
mgao6767 / shared_memory_test.py
Created June 8, 2020 13:23
An example of multiprocessing.shared_memory
from multiprocessing.shared_memory import SharedMemory
from multiprocessing.managers import SharedMemoryManager
from concurrent.futures import ProcessPoolExecutor, as_completed
from multiprocessing import current_process, cpu_count, Process
from datetime import datetime
import numpy as np
import pandas as pd
import tracemalloc
import time
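
Only the imports are visible in this preview, so the following is a minimal sketch of the basic `multiprocessing.shared_memory` pattern rather than the gist's own code. The helper names `create_shared_array`, `scale_in_place`, and `demo` are hypothetical. The worker function is called directly here to keep the example short; it only needs the block's name, shape, and dtype, so the same function could be handed to a `Process` or a `ProcessPoolExecutor`.

```python
import numpy as np
from multiprocessing.shared_memory import SharedMemory


def create_shared_array(data):
    """Copy a NumPy array into a new shared memory block and return the block."""
    shm = SharedMemory(create=True, size=data.nbytes)
    shared = np.ndarray(data.shape, dtype=data.dtype, buffer=shm.buf)
    shared[:] = data
    del shared  # drop the view so the block can be closed later
    return shm


def scale_in_place(name, shape, dtype, factor):
    """Attach to an existing block by name and modify it without copying."""
    shm = SharedMemory(name=name)
    try:
        arr = np.ndarray(shape, dtype=dtype, buffer=shm.buf)
        arr *= factor
        del arr  # release the buffer before closing
    finally:
        shm.close()


def demo():
    data = np.arange(5, dtype=np.float64)
    shm = create_shared_array(data)
    try:
        scale_in_place(shm.name, data.shape, data.dtype, 2.0)
        view = np.ndarray(data.shape, dtype=data.dtype, buffer=shm.buf)
        result = view.copy()  # copy out before the block is released
        del view
    finally:
        shm.close()
        shm.unlink()  # free the block once no process needs it
    return result


result = demo()
```

Every process that attaches should call `close()`, and exactly one of them should call `unlink()` when the data is no longer needed.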
mgao6767 / Bitcoin Address Generator in Obfuscated Python.py
Created February 24, 2019 04:43
Bitcoin Address Generator in Obfuscated Python
_ =r"""A(W/2,*M(3*G
*G*V(2*J%P),G,J,G)+((M((J-T
)*V((G-S)%P),S,T,G)if(S@(G,J))if(
W%2@(S,T)))if(W@(S,T);H=2**256;import&h
ashlib&as&h,os,re,bi nascii&as&k;J$:int(
k.b2a_hex(W),16);C$:C (W/ 58)+[W%58]if(W@
[];X=h.new("rip em d160");Y$:h.sha25
6(W).digest();I$ d=32:I(W/256,d-1)+
chr(W%256)if(d>0@""; U$:J(k.a2b_base
64(W));f=J(os.urando m(64)) %(H-U("AUVRIxl
mgao6767 / discretionary accruals.sas
Last active March 28, 2024 00:23
Compute 5 measures of firm-year discretionary accruals.
/* Use Jackknife method to compute discretionary accruals */
/* see https://mingze-gao.com/posts/compute-jackknife-coefficient-estimates-in-sas/ */
/* UseHribarCollinsTotalAccruals:
- true: use Hribar-Collins Cashflow Total Accruals
- false: use normal method */
%let UseHribarCollinsTotalAccruals = false;
/* Include %array and %do_over */
filename do_over url "https://mingze-gao.com/utils/do_over.sas";
mgao6767 / Computing Jackknifed Coefficient Estimates in SAS.md
Last active June 10, 2020 04:23
Computing Jackknifed Coefficient Estimates in SAS

Background

In some settings, we want to estimate a model's parameters for each observation using the sample with that observation excluded. This can be done by re-estimating the model on every leave-one-out sample, but doing so is very inefficient. If we instead estimate the model once on the full sample, the coefficient estimates will be biased, because each observation is then part of its own estimation sample. Thankfully, the jackknife method corrects for this bias and produces the (jackknifed) coefficient estimates for each observation.
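
For ordinary least squares, one standard way to obtain all leave-one-out coefficient vectors from a single full-sample fit is the identity β₍₋ᵢ₎ = β̂ − (X′X)⁻¹xᵢeᵢ / (1 − hᵢᵢ), where eᵢ is the full-sample residual and hᵢᵢ the leverage of observation i. The NumPy sketch below illustrates that identity on simulated data; it assumes an OLS model, and the function name `jackknife_coefficients` is hypothetical, not taken from the gist.

```python
import numpy as np


def jackknife_coefficients(X, y):
    """Row i holds the OLS coefficients estimated with observation i left out."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y              # full-sample estimate
    resid = y - X @ beta                  # full-sample residuals e_i
    A = X @ XtX_inv                       # row i is x_i' (X'X)^-1
    leverage = np.einsum("ij,ij->i", A, X)  # h_ii = x_i' (X'X)^-1 x_i
    # beta_(-i) = beta - (X'X)^-1 x_i e_i / (1 - h_ii)
    return beta - A * (resid / (1.0 - leverage))[:, None]


# Simulated data: intercept plus two regressors.
rng = np.random.default_rng(0)
n = 30
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(scale=0.3, size=n)

loo = jackknife_coefficients(X, y)

# Brute-force check: refit the model n times, dropping one row each time.
brute = np.array([
    np.linalg.lstsq(np.delete(X, i, axis=0), np.delete(y, i), rcond=None)[0]
    for i in range(n)
])
```

The brute-force loop is there only to confirm the shortcut; in practice the single full-sample fit replaces the n separate regressions.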

Variable Definition

Let's start with some variable definitions to help with the explanation.

Variable Definition
mgao6767 / common financial ratios.sas
Created January 27, 2019 04:24
Computes a broad range of financial ratios at both the firm and industry levels using the Fama-French industry classification.
/* ********************************************************************************* */
/* ******************** W R D S   R E S E A R C H   M A C R O S ******************** */
/* ********************************************************************************* */
/* WRDS Macro: INDRATIOS */
/* Summary : Computes a broad range of financial ratios aggregated at */
/* the industry level using Fama-French industry classification */
/* Date : Apr, 2009 */
/* Modified : Nov, 2010 */
/* Author : Denys Glushkov, WRDS */
/* Parameters: */
mgao6767 / Industry Classification.sas
Created January 27, 2019 00:10
Constructs four industry classifications based on the SIC, NAICS, GICS, and Fama-French schemes.
/* ********************************************************************************* */
/* ******************** W R D S   R E S E A R C H   M A C R O S ******************** */
/* ********************************************************************************* */
/* WRDS Macro: INDCLASS */
/* Summary : Constructs 4 different industry classifications based on SIC, NAICS, */
/* GICS and Fama-French industry classifications */
/* */
/* Date : Feb, 2010 */
/* Author : Denys Glushkov, WRDS */
/* Variables : */
mgao6767 / winsorize.sas
Created January 26, 2019 23:56
WRDS Macros: WINSORIZE
/* ********************************************************************************* */
/* ******************** W R D S   R E S E A R C H   M A C R O S ******************** */
/* ********************************************************************************* */
/* WRDS Macro: WINSORIZE */
/* Summary : Winsorizes or Trims Outliers */
/* Date : April 14, 2009 */
/* Author : Rabih Moussawi, WRDS */
/* Variables : - INSET and OUTSET are input and output datasets */
/* - SORTVAR: sort variable used in ranking */
/* - VARS: variables to trim and winsorize */
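
To make the difference between winsorizing and trimming concrete, here is a minimal NumPy sketch. It is not a port of the WRDS macro: the function name `winsorize_or_trim` and the 1st/99th percentile defaults are assumptions, and it ignores any by-group processing the macro may perform.

```python
import numpy as np


def winsorize_or_trim(values, lower_pct=1.0, upper_pct=99.0, trim=False):
    """Cap outliers at the percentile bounds, or set them to NaN when trim=True."""
    x = np.asarray(values, dtype=float)
    lo, hi = np.nanpercentile(x, [lower_pct, upper_pct])
    if trim:
        # Trimming discards outliers (marked here as missing values).
        return np.where((x < lo) | (x > hi), np.nan, x)
    # Winsorizing keeps every observation but pulls outliers to the bounds.
    return np.clip(x, lo, hi)


data = np.arange(1, 101)
winsorized = winsorize_or_trim(data)
trimmed = winsorize_or_trim(data, trim=True)
```

Winsorizing preserves the sample size, while trimming leaves missing values that later steps have to handle.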