https://tilburgsciencehub.com

Hannes Datta hannesdatta

🚩
https://tilburgsciencehub.com
View GitHub Profile
@hannesdatta
hannesdatta / download-list-of-repositories-from-github.sh
Last active April 22, 2022 08:12
mass-clone GitHub repositories and download them to separate folders
while IFS= read -r p; do git clone "https://github.com/$p" "${p%/*}"; done < files.txt
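The loop above appears to expect `files.txt` to hold one `owner/repo` entry per line, and it clones each repository into a folder named after the owner. A slightly more defensive variant might look like the sketch below. The helper names `clone_target` and `clone_all` are hypothetical, and skipping comment lines and existing folders is an assumption about what is wanted, not part of the original gist.

```shell
#!/usr/bin/env bash
# Sketch only: clone_target and clone_all are hypothetical helper names.

# Derive the destination folder (the owner part) from an "owner/repo" entry,
# mirroring ${p%/*} in the one-liner above.
clone_target() {
  printf '%s\n' "${1%/*}"
}

# Read "owner/repo" entries from a list file and clone each one.
clone_all() {
  local list="$1" entry dest
  # The extra test keeps a final line that lacks a trailing newline.
  while IFS= read -r entry || [ -n "$entry" ]; do
    # Assumption: blank lines and lines starting with '#' should be ignored.
    case "$entry" in ''|'#'*) continue ;; esac
    dest="$(clone_target "$entry")"
    # Assumption: an existing folder means the repository was already cloned.
    if [ -e "$dest" ]; then
      echo "skipping $entry: $dest already exists" >&2
      continue
    fi
    git clone "https://github.com/$entry" "$dest"
  done < "$list"
}

# Usage (assumes files.txt exists in the current directory):
# clone_all files.txt
```

Because the target folder is the owner name, two repositories from the same owner would collide; the existence check at least makes that visible instead of letting `git clone` fail mid-run.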
@hannesdatta
hannesdatta / extract.sh
Last active January 16, 2023 11:33
Extract/unzip content of many compressed zip files to separate directories (zip/rar)
for FILE in *.zip; do unzip -j -d "${FILE%.*}" "$FILE"; done
# For rar, first install it (e.g., `brew install rar` on a Mac) and give the app permission to launch
for FILE in *.rar; do unrar X "$FILE" -op"${FILE%.*}" -ep; done
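The zip loop above can be wrapped in a small function that also copes with a folder containing no archives. This is a minimal sketch, not the gist author's method: `extract_dir_for` and `extract_all_zips` are hypothetical names, and creating the target folder up front is an added assumption.

```shell
#!/usr/bin/env bash
# Sketch only: extract_dir_for and extract_all_zips are hypothetical names.

# Folder name for an archive: the file name without its last extension,
# mirroring ${FILE%.*} in the loop above.
extract_dir_for() {
  printf '%s\n' "${1%.*}"
}

# Unzip every .zip file in the current directory into its own folder.
extract_all_zips() {
  local file dest
  for file in *.zip; do
    # Without matches the glob stays literal ("*.zip"), so check the file exists.
    [ -e "$file" ] || continue
    dest="$(extract_dir_for "$file")"
    mkdir -p "$dest"
    # -j drops the directory structure stored in the archive, as in the loop above.
    unzip -j -d "$dest" "$file"
  done
}

# Usage: run from the folder that holds the archives.
# extract_all_zips
```

Quoting `"$file"` and `"$dest"` keeps archives with spaces in their names from being split into several arguments.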
@hannesdatta
hannesdatta / varcovar.R
Created April 11, 2022 13:37
From a variance-covariance matrix to a Cholesky decomposition (and back)
# From a variance-covariance matrix to a Cholesky decomposition (and back)
## Let's first define our var-covar matrix
varcovar <- diag(3)
varcovar[1,2] <- .5
varcovar[2,1] <- .5
## Let's get our (upper) Cholesky decomposition
decomp <- chol(varcovar)
@hannesdatta
hannesdatta / example.ipynb
Created March 15, 2022 14:09
Scraping: Clicking on a page + separating collection of seeds from actually scraping product data
@hannesdatta
hannesdatta / attributes-tables.ipynb
Created March 15, 2022 11:44
Use webscraping to flexibly parse contents of changing product attributes (e.g., when attributes change from one product category to another; application: mediamarkt.nl)
@hannesdatta
hannesdatta / data_storage_scraping.ipynb
Created March 15, 2022 09:00
data_storage_for_scraping.ipynb
@hannesdatta
hannesdatta / src\make_files.R
Created March 11, 2022 07:12
Simple make workflow ("Data Prep. and Workflow Mngmt", dprep.hannesdatta.com)
dir.create('../gen/test', recursive = TRUE)
sink('../gen/test/test.txt')
cat('This is test 1')
sink()
dir.create('../gen/test2', recursive = TRUE)
sink('../gen/test2/test2.txt')
cat('This is test 2')
sink() # close the second sink so later output returns to the console
@hannesdatta
hannesdatta / makefile
Created March 10, 2022 06:29
Use R in a makefile to clean-up (temporary) directories
# Cleanup, wiping the (sub)directories output/temp/audit
wipe:
	R -e "unlink('../../gen/data-preparation/output/*.*')"
	R -e "unlink('../../gen/data-preparation/temp/*.*')"
	R -e "unlink('../../gen/data-preparation/audit/*.*')"
@hannesdatta
hannesdatta / inside_airbnb.R
Created February 17, 2022 15:15
R code to download multiple data sets from InsideAirBnB
# Looping
for (i in 1:10) {
  print(i)
}
# Using looping with return values (i.e., to "save" stuff to carry on working)
results = lapply(1:10, function(x) x*2)
# Demo to download all of Europe's listing data to R
library(googledrive)
@hannesdatta
hannesdatta / models.R
Created January 11, 2022 11:11
Handling memory issues in R: terminal use, arguments, storing model objects by regular expressions
# HOW TO MANAGE MEMORY ISSUES IN R?
# A common problem of data- and computation-intensive projects
# in R is memory management.
# Suppose you would like to estimate a series of models,
# but estimating all of them would exceed your available
# memory.
#
# One solution could be to have individual R scripts