Winny de Jong winnydejong

Data journalist and Pythonista. Uses the interwebs as her university.

winnydejong / jsons_in_pandas.py

Last active December 24, 2020 14:26

Handling jsons with Pandas

	# A shameless copy-paste from the lovely Zufanka,
	# since I always forget to look here:
	# https://gist.github.com/zufanka/39b8a55d707b3b4a2a4d369694739561#handling-jsons

	import json
	import requests
	from pandas import json_normalize  # pandas >= 1.0; the pandas.io.json import path is deprecated

	r = requests.get("https://api.tenders.exposed/networks/58d77f85-bbc6-447d-a292-c3f17b7936b0/").text
	data = json.loads(r)
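
The preview stops before `json_normalize` is used. As a minimal sketch of the next step, assuming the response contains a list of nested records (the `data` dict below is a made-up stand-in, not the real API response), nested dictionaries can be flattened into dotted column names:

```python
import pandas as pd

# Hypothetical sample standing in for the parsed API response
data = {
    "nodes": [
        {"id": 1, "label": "A", "meta": {"country": "NL"}},
        {"id": 2, "label": "B", "meta": {"country": "CZ"}},
    ]
}

# One row per record; nested dicts become columns such as "meta.country"
nodes = pd.json_normalize(data["nodes"])
print(nodes)
```

Which key holds the records depends on the actual API, so `"nodes"` is only an assumption here.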

winnydejong / regex_filter_dataframe.py

Last active November 19, 2020 21:03

Regex filter: select rows if column matches regex

	import re
	import pandas as pd

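	# returns rows where the column matches the regex pattern (case-insensitive);
	# flags must be passed by keyword, since the second positional argument is `case`
	df[df.COLUMNNAME.str.match('REGEXPATTERN', flags=re.IGNORECASE)]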
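
A small sketch of the same filter on made-up data. The column name `name` and the pattern are assumptions for illustration. `str.match` anchors the pattern at the start of each string, and `na=False` treats missing values as non-matches so the mask stays boolean:

```python
import re
import pandas as pd

# Hypothetical sample data
df = pd.DataFrame({"name": ["Amsterdam", "amstelveen", "Rotterdam", None]})

# Case-insensitive match at the start of the string; missing values count as False
mask = df["name"].str.match(r"ams", flags=re.IGNORECASE, na=False)
matches = df[mask]
print(matches)
```

For a match anywhere in the string rather than only at the start, `str.contains` takes the same `flags` and `na` arguments.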

winnydejong / dates_for_archive.py

Last active January 5, 2021 09:51

Since I always forget how to set the dates as I use them in all my filenames, I made myself this gist. Will thank me later, surely. -_-

	import datetime as dt

	now = dt.datetime.now().strftime('%y%m%d %H.%M')
	today = dt.datetime.now().strftime('%y%m%d')
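
To show what these format strings produce, here is a sketch with a fixed moment instead of `now()`, so the result is reproducible. The filename pattern is a hypothetical example, not taken from the gist:

```python
import datetime as dt

# Fixed moment for a reproducible result; use dt.datetime.now() in practice
moment = dt.datetime(2021, 1, 5, 9, 51)

today = moment.strftime('%y%m%d')        # two-digit year, month, day
now = moment.strftime('%y%m%d %H.%M')    # same, plus hours and minutes

# Hypothetical filename using the date stamp as a prefix
filename = f"{today}_scrape.csv"
print(today, now, filename)
```

Because the stamp starts with the year, filenames sort chronologically when sorted alphabetically.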

winnydejong / Search Jupyter Notebooks in Terminal

Created August 19, 2019 09:03

Terminal command to search multiple Jupyter Notebooks for a specific piece of code

	# Thanks Grant Nestor, https://groups.google.com/d/msg/jupyter/Qi9b7z_sgRU/9npQA1zlAgAJ
	grep --include='*.ipynb' --exclude-dir='.ipynb_checkpoints' -rliw . -e 'search query'
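
For use from inside Python, a rough equivalent can be sketched with `pathlib` and `re`. The function name `search_notebooks` is made up for this example, and `\b` word boundaries only approximate grep's `-w` option:

```python
import re
from pathlib import Path

def search_notebooks(root, query):
    """Return notebook paths under root whose raw text contains query as a whole word."""
    # Case-insensitive, whole-word search (roughly grep -i -w)
    pattern = re.compile(r"\b" + re.escape(query) + r"\b", re.IGNORECASE)
    hits = []
    for nb in Path(root).rglob("*.ipynb"):
        # Skip autosave copies, like --exclude-dir='.ipynb_checkpoints'
        if ".ipynb_checkpoints" in nb.parts:
            continue
        if pattern.search(nb.read_text(encoding="utf-8", errors="ignore")):
            hits.append(nb)
    return sorted(hits)
```

Like the grep command, this searches the raw notebook JSON, so queries containing quotes or line breaks may not match the way they appear in a cell.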

winnydejong / list files based on filetype.py

Last active August 14, 2019 09:39

Python function to list all files with a certain extension

	from glob import glob
	from os import path

	# function that lists files based on filetype
	def listFiles(dr, ext):
	    return glob(path.join(dr, "*.{}".format(ext)))

	# example of function: when you want to list all pdfs in working dir
	pdfs = listFiles('','pdf')

	# example of function: when you want to list all txts in home dir
	txts = listFiles('/Users/Name','txt')

winnydejong / dataExplorer.py

Created February 25, 2019 14:30

Helpful function to get datatype, count of nulls, and count of unique values for every column in a Pandas dataframe

	# Helpful function to look through the columns of a Pandas dataframe
	# By Roland Jeannier, https://medium.com/@rtjeannier/pandas-101-fbb5bf86a9bc
	def eda_helper(df):
	    dict_list = []
	    for col in df.columns:
	        data = df[col]
	        dict_ = {}
	        # The null count for a column.
	        dict_.update({"null_count" : data.isnull().sum()})
	        # Counting the unique values in a column
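
The preview is cut off here. As a sketch of the summary the description promises (datatype, null count and unique count per column), and not a reconstruction of the original function, the same table can be built from pandas' own column-wise methods. The name `eda_summary` and the sample frame are assumptions:

```python
import pandas as pd

def eda_summary(df):
    """Per-column dtype, null count and unique count (hypothetical variant)."""
    return pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "null_count": df.isnull().sum(),
        "unique_count": df.nunique(),  # missing values are not counted as a value
    })

# Made-up sample data
sample = pd.DataFrame({"a": [1, 2, None], "b": ["x", "x", "y"]})
summary = eda_summary(sample)
print(summary)
```

Each row of the result describes one column of the input frame.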

winnydejong / parseXML.py

Created July 25, 2018 08:41

Parsing XMLs with XPath and ElementTree XML API

	"""
	The ElementTree documentation shows how to parse XML using XPath:
	https://docs.python.org/3.4/library/xml.etree.elementtree.html#example
	"""

	import xml.etree.ElementTree as ET

	root = ET.fromstring(countrydata)

	# Top-level elements
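
The preview ends before any XPath query and never defines `countrydata`. A self-contained sketch, using a small made-up XML string in the shape of the documentation's country example, might look like this:

```python
import xml.etree.ElementTree as ET

# Made-up sample in the shape of the documentation's country data
countrydata = """<data>
    <country name="Liechtenstein"><rank>1</rank><neighbor name="Austria"/></country>
    <country name="Singapore"><rank>4</rank><neighbor name="Malaysia"/></country>
</data>"""

root = ET.fromstring(countrydata)

# Direct children of the root element
names = [c.get("name") for c in root.findall("./country")]

# All neighbor elements anywhere below the root
neighbors = [n.get("name") for n in root.findall(".//neighbor")]

# rank children of any element whose name attribute is "Singapore"
singapore_rank = [r.text for r in root.findall(".//*[@name='Singapore']/rank")]

print(names, neighbors, singapore_rank)
```

ElementTree supports only a subset of XPath, so more complex expressions may need a library such as lxml.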