#!/bin/python3
# Copy contents of an XSPF music playlist to a target folder
#
# requires two arguments: path to xspf file, target path
# requires Python >= 3.8
#
# author: Markus Konrad <[email protected]>
import os.path
import sys
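The preview above ends after the imports. As a rough sketch of what the header comment describes, the following reads the `<location>` entries of an XSPF playlist and copies the files that exist locally into a target folder. The function names `playlist_files` and `copy_playlist` are hypothetical and not taken from the original script, and the sketch assumes that tracks are referenced by `file://` URLs or plain paths.

```python
import shutil
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET
from pathlib import Path

# XSPF playlists place their elements in this XML namespace
XSPF_NS = {"x": "http://xspf.org/ns/0/"}


def playlist_files(xspf_path):
    """Return the local paths of all track locations in an XSPF playlist."""
    tree = ET.parse(xspf_path)
    paths = []
    for loc in tree.getroot().iterfind(".//x:track/x:location", XSPF_NS):
        url = urllib.parse.urlparse((loc.text or "").strip())
        # assumption: only file:// URLs and plain paths point to local files
        if url.scheme in ("", "file"):
            paths.append(Path(urllib.request.url2pathname(url.path)))
    return paths


def copy_playlist(xspf_path, target_dir):
    """Copy every existing playlist file into target_dir; return the new paths."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    for src in playlist_files(xspf_path):
        if src.is_file():  # silently skip entries that no longer exist
            copied.append(Path(shutil.copy2(src, target / src.name)))
    return copied
```

With the two command line arguments mentioned in the header, a call could look like `copy_playlist(sys.argv[1], sys.argv[2])`. Files with identical names overwrite each other in this sketch.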
"""
Transfer all GitLab projects from the user authenticated with a supplied personal access token (PAT) to a new
namespace (i.e. a group with a group ID).
To generate a PAT, log in to your GitLab account and go to "User settings > Access tokens".
To find out the ID of a group to which you want to transfer the projects, go to the group's page. The group ID is shown
under the title of the group.
Requirements: Python 3 with requests package installed (tested with Python 3.8 and requests 2.27.1).
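The preview stops inside the docstring, so the actual implementation is not visible here. The following is a minimal sketch of how such a transfer might look with the GitLab REST API (v4): it lists the projects owned by the authenticated user and sends a transfer request for each. The function name `transfer_owned_projects`, the `API_ROOT` constant and the `session` parameter are assumptions made for illustration; `session` is expected to behave like a `requests.Session` that already carries the `PRIVATE-TOKEN` header.

```python
# assumption: projects live on gitlab.com; a self-hosted instance has another root URL
API_ROOT = "https://gitlab.com/api/v4"


def transfer_owned_projects(session, group_id, api_root=API_ROOT):
    """
    Transfer all projects owned by the authenticated user to namespace `group_id`.

    Returns a list of (project_id, HTTP status code) tuples, one per project.
    """
    # collect all projects first, so that pagination is not disturbed
    # by projects disappearing from the list while they are transferred
    projects = []
    page = 1
    while True:
        resp = session.get(
            f"{api_root}/projects",
            params={"owned": "true", "per_page": 100, "page": page},
        )
        resp.raise_for_status()
        batch = resp.json()
        if not batch:
            break
        projects.extend(batch)
        page += 1

    results = []
    for proj in projects:
        resp = session.put(
            f"{api_root}/projects/{proj['id']}/transfer",
            data={"namespace": group_id},
        )
        results.append((proj["id"], resp.status_code))
    return results


# Possible usage (not executed here; token and group ID are placeholders):
#   import requests
#   session = requests.Session()
#   session.headers["PRIVATE-TOKEN"] = "<your PAT>"
#   print(transfer_owned_projects(session, group_id=1234))
```

Passing the session in as an argument keeps the token handling outside the function and allows a dry run with a stand-in object before touching real projects.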
"""
Voronoi regions of schools in East Germany.
An example using the geovoronoi package (https://pypi.org/project/geovoronoi/).
Feb. 2021
Markus Konrad <[email protected]>
"""
import os
"""
Sample scripts for blog post "Robust data collection via web scraping and web APIs"
(https://datascience.blog.wzb.eu/2020/12/01/robust-data-collection-via-web-scraping-and-web-apis/).
Script 1. Starting point – baseline (unreliable) web scraping script.
December 2020, Markus Konrad <[email protected]>
"""
from datetime import datetime, timedelta
import numpy as np


def word_cooccurrence(dtm):
    """
    Calculate the co-document frequency (aka word co-occurrence) matrix for a document-term matrix `dtm`, i.e. how often
    each pair of tokens occurs together at least once in the same document.
    :param dtm: (sparse) document-term-matrix of size NxM (N docs, M is vocab size) with raw term counts.
    :return: co-document frequency (aka word co-occurrence) matrix with shape MxM
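The function body is cut off in this preview. One common way to obtain the matrix described in the docstring is to binarize the document-term matrix and multiply its transpose with itself. The sketch below does this for a dense NumPy array; the name `word_cooccurrence_sketch` is hypothetical, and the original function may handle sparse matrices or the diagonal differently.

```python
import numpy as np


def word_cooccurrence_sketch(dtm):
    """Co-document frequency matrix (MxM) for a dense NxM document-term matrix."""
    dtm = np.asarray(dtm)
    # binarize: 1 if the term occurs at least once in the document, else 0
    present = (dtm > 0).astype(int)
    # entry (i, j) counts the documents that contain both term i and term j;
    # the diagonal holds each term's document frequency
    return present.T @ present


# three documents, vocabulary of three terms
example_dtm = [[2, 1, 0],
               [1, 0, 0],
               [0, 3, 1]]
print(word_cooccurrence_sketch(example_dtm))
```

In the example, terms 0 and 1 share exactly one document (the first), while terms 0 and 2 never occur together.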
def str_multisplit(s, sep):
    """
    Split string `s` by all characters/strings in `sep`.
    :param s: a string to split
    :param sep: sequence or set of characters to use for splitting
    :return: list of split string parts
    """
    if not isinstance(s, (str, bytes)):
        raise ValueError('`s` must be of type `str` or `bytes`')
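The preview ends right after the type check, so the splitting logic itself is not shown. A simple approach that matches the docstring is to apply the separators one after another and flatten the intermediate results. The function `multisplit_sketch` below is a hypothetical stand-in, not the original implementation.

```python
def multisplit_sketch(s, sep):
    """Split `s` on every separator in `sep`, one separator after another."""
    if not isinstance(s, (str, bytes)):
        raise ValueError('`s` must be of type `str` or `bytes`')
    parts = [s]
    for sep_item in sep:
        # split each current part by this separator and flatten the result
        parts = [piece for part in parts for piece in part.split(sep_item)]
    return parts


print(multisplit_sketch("a,b;c d", [",", ";", " "]))
```

Consecutive separators produce empty strings in the result, just as `str.split` does with an explicit separator.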
# Source for blog post "Zooming in on maps with sf and ggplot2"
# URL: https://datascience.blog.wzb.eu/2019/04/30/zooming-in-on-maps-with-sf-and-ggplot2/
#
# Markus Konrad <[email protected]>
# Wissenschaftszentrum Berlin für Sozialforschung
# April 30, 2019
#
#### world map ####
# Plot a network graph of nodes with geographic coordinates on a map.
#
# Author: Markus Konrad <[email protected]>
# May 2018
#
# This script shows three ways of plotting a network graph on a map.
# The following information should be visualized (with the respective
# aesthetics added):
#
# * graph nodes with:
"""
Runtime optimization through vectorization and parallelization.
Script 3: Parallel and vectorized calculation of haversine distance.
Please note that this might be slower than the single-core vectorized version because of the overhead that is caused
by multiprocessing.
January 2018
Markus Konrad <[email protected]>
"""
# Create a "balloon plot" as alternative to a heatmap with ggplot2
#
# January 2017
# Author: Markus Konrad <[email protected]>, WZB Berlin Social Science Center
library(dplyr)
library(tidyr)
library(ggplot2)
# define the variables that will be displayed in the columns