import numpy as np


def word_cooccurrence(dtm):
    """
    Calculate the co-document frequency (aka word co-occurrence) matrix for a document-term matrix `dtm`, i.e. how often
    each pair of tokens occurs together at least once in the same document.

    :param dtm: (sparse) document-term-matrix of size NxM (N docs, M is vocab size) with raw term counts.
    :return: co-document frequency (aka word co-occurrence) matrix with shape MxM
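
The preview above cuts off before the function body. As a minimal sketch of the idea the docstring describes (not necessarily the author's implementation), the counts can be binarized to "term occurs in document" flags and multiplied with their own transpose. The function name `cooccurrence_sketch` and the dense NumPy input are assumptions made for illustration; a sparse matrix would need the equivalent sparse operations.

```python
import numpy as np


def cooccurrence_sketch(dtm):
    """Hypothetical helper: co-document frequency for a dense NxM count matrix."""
    dtm = np.asarray(dtm)
    # 1 where a term occurs at least once in a document, 0 otherwise
    bin_dtm = (dtm > 0).astype(int)
    # entry (i, j) counts the documents that contain both term i and term j
    return bin_dtm.T @ bin_dtm


# three documents, vocabulary of three terms (made-up counts)
dtm = np.array([[2, 0, 1],
                [0, 3, 1],
                [1, 1, 0]])
cooc = cooccurrence_sketch(dtm)
print(cooc)
```

In this formulation the diagonal holds each term's document frequency, and the result is symmetric because co-occurrence does not depend on the order of the pair.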
"""
Sample scripts for blog post "Robust data collection via web scraping and web APIs"
(https://datascience.blog.wzb.eu/2020/12/01/robust-data-collection-via-web-scraping-and-web-apis/).

Script 1. Starting point – baseline (unreliable) web scraping script.

December 2020, Markus Konrad <[email protected]>
"""

from datetime import datetime, timedelta
"""
Voronoi regions of schools in East Germany.

An example using the geovoronoi package (https://pypi.org/project/geovoronoi/).

Feb. 2021
Markus Konrad <[email protected]>
"""

import os
"""
Transfer all GitLab projects from the user authenticated with a supplied private access token (PAT) to a new
namespace (i.e. a group with a group ID).

To generate a PAT, log in to your GitLab account and go to "User settings > Access tokens".

To find out the ID of a group to which you want to transfer the projects, go to the group's page. The group ID is shown
under the title of the group.

Requirements: Python 3 with requests package installed (tested with Python 3.8 and requests 2.27.1).
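
The preview ends before any code is shown. As a rough sketch of the transfer step described above, the following builds (but does not send) a request against GitLab's project transfer endpoint, `PUT /projects/:id/transfer` with a `namespace` parameter, authenticated through the `PRIVATE-TOKEN` header. It deliberately uses the standard library's `urllib` instead of the `requests` package the script names, so that it runs without extra dependencies. The base URL, the function name `build_transfer_request`, and the example IDs and token are assumptions for illustration only.

```python
import urllib.parse
import urllib.request

# assumption: the public gitlab.com instance; a self-hosted instance has its own base URL
API_BASE = "https://gitlab.com/api/v4"


def build_transfer_request(project_id, namespace_id, token, api_base=API_BASE):
    """Hypothetical helper: prepare the request that moves one project to a group."""
    url = f"{api_base}/projects/{project_id}/transfer"
    # the target group is passed as the form-encoded "namespace" parameter
    data = urllib.parse.urlencode({"namespace": namespace_id}).encode()
    return urllib.request.Request(
        url, data=data, method="PUT", headers={"PRIVATE-TOKEN": token}
    )


# made-up project ID, group ID and token
req = build_transfer_request(123, 42, "example-token")
print(req.get_method(), req.full_url)
# sending it would be: urllib.request.urlopen(req)
```

Transferring every project of the authenticated user would then amount to listing the user's own projects first and calling such a helper once per project ID.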
#!/bin/python3
# Copy contents of an XSPF music playlist to a target folder
#
# requires two arguments: path to xspf file, target path
# requires Python >= 3.8
#
# author: Markus Konrad <[email protected]>

import os.path
import sys
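
The rest of the script is not shown in the preview. One way the described copy operation could work, sketched here under assumptions rather than taken from the author's code: read the `<location>` entry of each track in the XSPF file, keep those that point to local files, and copy them to the target folder. The helper names `playlist_locations` and `copy_tracks` and the sample playlist are hypothetical.

```python
import shutil
import xml.etree.ElementTree as ET
from urllib.parse import unquote, urlparse

# XML namespace used by XSPF playlists
XSPF_NS = {"xspf": "http://xspf.org/ns/0/"}


def playlist_locations(xspf_text):
    """Hypothetical helper: return local file paths of all tracks in an XSPF document."""
    root = ET.fromstring(xspf_text)
    paths = []
    for loc in root.findall("./xspf:trackList/xspf:track/xspf:location", XSPF_NS):
        parsed = urlparse((loc.text or "").strip())
        # keep plain paths and file:// URIs, skip remote streams
        if parsed.scheme in ("", "file"):
            paths.append(unquote(parsed.path))
    return paths


def copy_tracks(paths, target_dir):
    """Hypothetical helper: copy each track into the existing folder target_dir."""
    for path in paths:
        shutil.copy(path, target_dir)


# made-up playlist with one local file and one remote stream
example = """<playlist version="1" xmlns="http://xspf.org/ns/0/">
  <trackList>
    <track><location>file:///music/My%20Song.mp3</location></track>
    <track><location>http://example.com/stream.mp3</location></track>
  </trackList>
</playlist>"""

locations = playlist_locations(example)
print(locations)
```

Percent-decoding the path matters because XSPF locations are URIs, so spaces and non-ASCII characters in file names arrive escaped.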