Hannes Datta hannesdatta

https://tilburgsciencehub.com

Associate Professor at Tilburg University (Quantitative Marketing)

Tilburg University
Tilburg, The Netherlands
https://hannesdatta.com
https://orcid.org/0000-0002-8723-6002

hannesdatta / hashes.ipynb

Created March 6, 2024 10:34

Anonymizing usernames for web scraping projects
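
The notebook itself is not shown here, so the following is only a minimal sketch of one common way to anonymize usernames, assuming a keyed hash (HMAC with SHA-256) is acceptable for the project. The function name `anonymize_username`, the key `SECRET_KEY`, and the sample usernames are hypothetical and not taken from the gist.

```python
import hashlib
import hmac


def anonymize_username(username: str, secret_key: bytes) -> str:
    """Return a stable pseudonym for a username using HMAC-SHA256."""
    # Normalize so that "Alice" and "alice " map to the same pseudonym
    normalized = username.strip().lower()
    digest = hmac.new(secret_key, normalized.encode("utf-8"), hashlib.sha256)
    # Keep the first 16 hex characters as a shorter identifier
    return digest.hexdigest()[:16]


# Hypothetical key: in a real project, store it outside the repository
SECRET_KEY = b"replace-with-a-private-random-key"

usernames = ["alice", "Bob", "alice "]
pseudonyms = [anonymize_username(u, SECRET_KEY) for u in usernames]
```

The same username always yields the same pseudonym, so scraped records can still be linked per user, while the original name cannot be looked up without the key.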

hannesdatta / script.R

Created January 31, 2024 10:42

Code from the first session of dPrep 2024 (https://dprep.hannesdatta.com)

	# use cases

	# as a calculator
	1 + 1

	# to assign variables
	x = 5

	# calculation w/ variables
	x + 5

hannesdatta / commands.txt

Created October 13, 2023 13:44

Starting up R from the command line/terminal

	R --vanilla < "filename.R" # you see output on screen
	Rscript filename.R # commands are not echoed; only their output (e.g., printed values) is shown
	R -e "unlink(.)" # executes one R command
	R -e "rmarkdown::render('filename.Rmd', output_file='../paper/output/filename.pdf')"

hannesdatta / scraper.py

Last active September 14, 2023 13:42

Web Scraping Mistakes: Handling Lists in Python: Code for https://youtu.be/RV9WOlqmL3E

	# FINAL CODE
	import requests
	from bs4 import BeautifulSoup

	# Define the URL and user-agent header
	url = 'https://www.coolblue.nl/tweedekans-product/2191236'
	headers = {
	    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 '
	                  '(KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36'
	}

hannesdatta / scripts.R

Created March 2, 2023 11:29

A data cleaning script for demonstration of a setup-input-transformation-output building block

	# Setup/initialization
	library(tidyverse)

	## Wipe any downloaded files before
	unlink('*.zip')
	unlink('*.csv')

	## Download raw data
	download.file('https://github.com/hannesdatta/course-dprep/raw/master/content/docs/tutorials/data-preparation/data_without_duplicates.zip', 'data.zip')
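
The preview above stops after the download step. As a rough illustration of the setup-input-transformation-output structure named in the description, here is a minimal sketch in Python using only the standard library. The file names, the tiny inline dataset, and the cleaning rule (dropping rows with a missing value) are assumptions for illustration and do not come from the gist.

```python
import csv
import tempfile
from pathlib import Path

# Setup: work in a fresh scratch directory
workdir = Path(tempfile.mkdtemp())
raw_path = workdir / "raw.csv"
clean_path = workdir / "clean.csv"

# Stand-in for the download step: write a small made-up raw file
raw_path.write_text("id,value\n1,10\n2,\n3,30\n", encoding="utf-8")

# Input: read the raw data
with raw_path.open(newline="", encoding="utf-8") as f:
    rows = list(csv.DictReader(f))

# Transformation: drop rows with a missing value
cleaned = [row for row in rows if row["value"] != ""]

# Output: write the cleaned data
with clean_path.open("w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["id", "value"])
    writer.writeheader()
    writer.writerows(cleaned)
```

Keeping the four stages visibly separate makes it easy to rerun the script from scratch and to see which step produced which file.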

hannesdatta / exercises.Rmd

Created February 16, 2023 11:37

dprep-exercises-2023-02-16

	---
	title: "dPrep Tutorial"
	output: html_document
	date: "2023-02-16"
	---

	```{r setup, include=FALSE}
	knitr::opts_chunk$set(echo = TRUE)
	```

hannesdatta / books_to_scrape.ipynb

Created September 20, 2022 13:58

Getting product descriptions and unique product category links from books.toscrape.com
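
The notebook is not displayed, so this is only a sketch of how unique category links might be collected, assuming category links can be recognized by `category/` in their `href`. It uses the standard-library HTML parser rather than whatever library the notebook uses; `LinkCollector`, `unique_category_links`, and the sample HTML are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def unique_category_links(html: str, base_url: str) -> list:
    """Return absolute category links, de-duplicated, in page order."""
    parser = LinkCollector()
    parser.feed(html)
    seen = set()
    links = []
    for href in parser.hrefs:
        if "category/" not in href:
            continue  # skip product pages and other links
        absolute = urljoin(base_url, href)
        if absolute not in seen:
            seen.add(absolute)
            links.append(absolute)
    return links


# Made-up snippet for illustration; the real page markup may differ
SAMPLE_HTML = """
<a href="catalogue/category/books/travel_2/index.html">Travel</a>
<a href="catalogue/category/books/mystery_3/index.html">Mystery</a>
<a href="catalogue/category/books/travel_2/index.html">Travel</a>
<a href="catalogue/a-light-in-the-attic_1000/index.html">A Light in the Attic</a>
"""

links = unique_category_links(SAMPLE_HTML, "https://books.toscrape.com/")
```

Tracking seen links in a set while appending to a list removes duplicates but keeps the order in which the links appear on the page.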

hannesdatta / exercise_3.9.ipynb

Created September 2, 2022 11:24

Solution to exercise 3.9 in my Python Bootcamp Tutorial (https://odcm.hannesdatta.com/docs/tutorials/pythonbootcamp/)

hannesdatta / script.R

Created September 1, 2022 09:29

R scripts written in the intro class to R

	# This is the R Bootcamp - demo (written by Hannes)

	1+1
	cat("Hello!")

	name <- 'Hannes'

	dir.create('data')
	dir.create('data_output')
	dir.create('documents')

hannesdatta / scrape_reddit.py

Created May 11, 2022 09:48

Searching reddit and saving search results with a web scraper

	# Setup

	# Make selenium and chromedriver work

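	from selenium import webdriver
	from selenium.webdriver.chrome.options import Options
	from selenium.webdriver.chrome.service import Service
	from webdriver_manager.chrome import ChromeDriverManager

	#driver = webdriver.Chrome()
	# Selenium 4 expects the driver path wrapped in a Service object
	driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))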
