Winny de Jong winnydejong

@winnydejong
winnydejong / jsons_in_pandas.py
Last active December 24, 2020 14:26
Handling jsons with Pandas
# A shameless copy-paste from the lovely Zufanka,
# since I always forget to look here:
# https://gist.github.com/zufanka/39b8a55d707b3b4a2a4d369694739561#handling-jsons
import json
import requests
from pandas import json_normalize  # importing it from pandas.io.json is deprecated since pandas 1.0
r = requests.get("https://api.tenders.exposed/networks/58d77f85-bbc6-447d-a292-c3f17b7936b0/").text
data = json.loads(r)
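The preview stops before json_normalize is used. As a rough sketch of the flattening step, here is how a nested payload could be turned into a flat dataframe. The sample records below are made up for illustration; the structure of the real API response is not shown in the gist.

```python
import json

import pandas as pd

# Hypothetical sample payload, NOT the real tenders.exposed response.
raw = """
[
  {"id": 1, "buyer": {"name": "City A", "country": "NL"}, "value": 1000},
  {"id": 2, "buyer": {"name": "City B", "country": "CZ"}, "value": 2500}
]
"""
data = json.loads(raw)

# Nested dicts are flattened into dotted column names such as "buyer.name".
df = pd.json_normalize(data)
print(df.columns.tolist())
```

When the response comes straight from requests, `requests.get(url).json()` does the same job as calling json.loads on the `.text` attribute.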
winnydejong / regex_filter_dataframe.py
Last active November 19, 2020 21:03
Regex filter: select rows if column matches regex
import re
import pandas as pd
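# Returns rows where the given column matches the regex pattern, ignoring case.
# str.match anchors at the start of the string; flags must be passed by keyword,
# because the second positional parameter of str.match is `case`.
df[df.COLUMNNAME.str.match('REGEXPATTERN', flags=re.IGNORECASE)]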
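A small runnable sketch of the same filter on made-up data. The column name "city" and its values are placeholders. Two details are worth knowing: str.match only matches at the start of each string (str.contains searches anywhere), and regex flags go in the flags keyword, since the second positional parameter of str.match is case.

```python
import re

import pandas as pd

# Hypothetical data; "city" stands in for COLUMNNAME.
df = pd.DataFrame({"city": ["Amsterdam", "amstelveen", "Rotterdam", None]})

# match() anchors at the start of each string; na=False treats missing values as non-matches.
starts_with_am = df[df["city"].str.match(r"am", flags=re.IGNORECASE, na=False)]

# contains() finds the pattern anywhere in the string.
contains_dam = df[df["city"].str.contains(r"dam", flags=re.IGNORECASE, na=False)]

print(starts_with_am["city"].tolist())
print(contains_dam["city"].tolist())
```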
winnydejong / dates_for_archive.py
Last active January 5, 2021 09:51
Since I always forget how to set the dates as I use them in all my filenames, I made myself this gist. Will thank me later, surely. -_-
import datetime as dt
now = dt.datetime.now().strftime('%y%m%d %H.%M')
today = dt.datetime.now().strftime('%y%m%d')
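A possible way to use these strings in filenames, sketched with invented names ("export.csv" and "run.log" are placeholders). Taking a single timestamp first keeps both strings in sync if the clock ticks over between calls.

```python
import datetime as dt

# Take one timestamp so both strings describe the same moment.
stamp = dt.datetime.now()
now = stamp.strftime('%y%m%d %H.%M')
today = stamp.strftime('%y%m%d')

# Hypothetical filenames; the yymmdd prefix makes files sort chronologically by name.
archive_name = f"{today}_export.csv"
log_name = f"{now} run.log"
print(archive_name)
print(log_name)
```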
winnydejong / Search Jupyter Notebooks in Terminal
Created August 19, 2019 09:03
Code to search multiple Jupyter Notebooks for a specific piece of code in the terminal
# Thanks Grant Nestor, https://groups.google.com/d/msg/jupyter/Qi9b7z_sgRU/9npQA1zlAgAJ
grep --include='*.ipynb' --exclude-dir='.ipynb_checkpoints' -rliw . -e 'search query'
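For cases where grep is not available, a rough Python equivalent could look like the sketch below. The function name search_notebooks is made up for this example. Unlike grep -w, it does a plain case-insensitive substring match rather than a whole-word match, and it only looks at cell sources, not at outputs or metadata.

```python
import json
from pathlib import Path


def search_notebooks(root, query):
    """Return sorted paths of notebooks under root with a cell containing query (case-insensitive)."""
    hits = []
    for nb_path in Path(root).rglob("*.ipynb"):
        # Skip Jupyter's autosave copies, like --exclude-dir in the grep command.
        if ".ipynb_checkpoints" in nb_path.parts:
            continue
        try:
            notebook = json.loads(nb_path.read_text(encoding="utf-8"))
        except (OSError, ValueError):
            continue  # unreadable file or invalid JSON
        if not isinstance(notebook, dict):
            continue
        for cell in notebook.get("cells", []):
            source = cell.get("source", "")
            # The notebook format stores source as a list of lines or as one string.
            text = "".join(source) if isinstance(source, list) else source
            if query.lower() in text.lower():
                hits.append(nb_path)
                break
    return sorted(hits)
```

Calling `search_notebooks('.', 'search query')` mirrors the grep command run from the current directory.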
winnydejong / list files based on filetype.py
Last active August 14, 2019 09:39
Python function to list all files with a certain extension
from glob import glob
from os import path

# function that lists files based on filetype
def listFiles(dr, ext):
    return glob(path.join(dr, "*.{}".format(ext)))

# example of function: when you want to list all pdfs in working dir
pdfs = listFiles('', 'pdf')
# example of function: when you want to list all txts in home dir
txts = listFiles('/Users/Name', 'txt')
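The same idea can be sketched with pathlib, which also makes it easy to search subdirectories. The name list_files and the recursive parameter are additions for this example, not part of the original gist.

```python
from pathlib import Path


def list_files(directory, ext, recursive=False):
    """Return sorted file paths in directory with extension ext (given without the dot)."""
    base = Path(directory or ".")  # an empty string means the working directory
    pattern = f"*.{ext}"
    matches = base.rglob(pattern) if recursive else base.glob(pattern)
    # Keep regular files only, so a directory named e.g. "old.pdf" is ignored.
    return sorted(p for p in matches if p.is_file())
```

For instance, `list_files('', 'pdf')` lists PDFs in the working directory, and `list_files('/Users/Name', 'txt', recursive=True)` would also descend into subfolders.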
winnydejong / dataExplorer.py
Created February 25, 2019 14:30
Helpful function to get datatype, count of nulls, and count of unique values for every column in a Pandas dataframe
# Helpful function to look through the columns of a Pandas dataframe
# By Roland Jeannier, https://medium.com/@rtjeannier/pandas-101-fbb5bf86a9bc
def eda_helper(df):
    dict_list = []
    for col in df.columns:
        data = df[col]
        dict_ = {}
        # The null count for a column.
        dict_.update({"null_count" : data.isnull().sum()})
        # Counting the unique values in a column
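The preview cuts off here, so the rest of the original function is not shown. As a separate sketch (not Roland Jeannier's implementation), the three statistics named in the description can also be collected with vectorised pandas calls. The name column_summary and the sample data are made up for this example.

```python
import pandas as pd


def column_summary(df):
    """Return one row per column with its dtype, null count and unique count (hypothetical helper)."""
    return pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "null_count": df.isnull().sum(),
        # nunique() ignores missing values by default.
        "unique_count": df.nunique(),
    })


# Made-up sample data for illustration.
sample = pd.DataFrame({"name": ["a", "b", None], "score": [1.0, 1.0, 2.5]})
summary = column_summary(sample)
print(summary)
```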
winnydejong / parseXML.py
Created July 25, 2018 08:41
Parsing XMLs with XPath and ElementTree XML API
"""
The ElementTree documentation shows how to parse XML using XPath:
https://docs.python.org/3.4/library/xml.etree.elementtree.html#example
"""
import xml.etree.ElementTree as ET
root = ET.fromstring(countrydata)
# Top-level elements
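The snippet assumes a countrydata string that is not shown in the preview. The sketch below uses a small stand-in modelled on the example in the Python documentation, so the element names and values are assumptions, and shows a few of the XPath forms that ElementTree supports.

```python
import xml.etree.ElementTree as ET

# Small stand-in for `countrydata`; the real data is not part of the gist preview.
countrydata = """
<data>
    <country name="Liechtenstein">
        <rank>1</rank>
        <neighbor name="Austria" direction="E"/>
    </country>
    <country name="Singapore">
        <rank>4</rank>
        <neighbor name="Malaysia" direction="N"/>
    </country>
</data>
"""

root = ET.fromstring(countrydata)

# Direct children of the root element.
country_names = [country.get("name") for country in root.findall("./country")]

# Elements at any depth that carry a given attribute value.
northern_neighbors = [n.get("name") for n in root.findall(".//neighbor[@direction='N']")]

# Elements with a child whose text equals a given value.
rank_four = [c.get("name") for c in root.findall(".//country[rank='4']")]

print(country_names, northern_neighbors, rank_four)
```

ElementTree implements only a subset of XPath, so more advanced expressions may need a library such as lxml.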