Skip to content

Instantly share code, notes, and snippets.

@pr130
Created October 10, 2020 13:37
Show Gist options
  • Save pr130/5165ca90088ea1c7b873edf7ddb951e3 to your computer and use it in GitHub Desktop.
Save pr130/5165ca90088ea1c7b873edf7ddb951e3 to your computer and use it in GitHub Desktop.
Clean csvs exported from DKB online banking
---
title: "DKB Export Cleanup"
author: "Frie"
date: "10/10/2020"
output: html_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
library(purrr)
library(dplyr)
library(stringr)
library(lubridate)
library(tidyr)
library(readr)
library(janitor)
```
```{r}
# we have data from october 2017 until today as csv :tada:
dkb_credit <- read.delim("data/dkb/credit/2017-10-2020-10.csv", sep = ";", skip = 7, fileEncoding = "latin1", stringsAsFactors = FALSE)
dkb_debit <- read.delim("data/dkb/giro/2017-10-2020-10.csv", sep = ";", skip = 6, fileEncoding = "latin1", stringsAsFactors = FALSE)
```
```{r dkb-credit}
dkb_credit <- dkb_credit %>%
clean_names() %>%
rename(date_billed = belegdatum,
date_booked = wertstellung,
not_included_in_saldo = umsatz_abgerechnet_und_nicht_im_saldo_enthalten,
what = beschreibung,
amount_eur = betrag_eur,
amount_orig = ursprunglicher_betrag) %>%
select(-x) %>%
mutate(across(starts_with("date_"), lubridate::dmy)) %>%
mutate(across(starts_with("amount"), function(col) readr::parse_number(col, locale = readr::locale(decimal_mark = ",")))) %>%
mutate(konto = "credit")
```
```{r}
dkb_debit <- dkb_debit %>%
clean_names() %>%
rename(date_billed = buchungstag,
date_booked = wertstellung,
booking_type = buchungstext,
what = verwendungszweck,
amount_eur = betrag_eur,
who_to_whom = auftraggeber_begunstigter,
kontonr = kontonummer,
blz = blz) %>%
select(-glaubiger_id, -kundenreferenz, -x) %>%
mutate(across(starts_with("date_"), lubridate::dmy)) %>%
mutate(across(starts_with("amount"), function(col) readr::parse_number(col, locale = readr::locale(decimal_mark = ",")))) %>%
mutate(konto = "debit")
```
```{r}
# optional: bind the two data frames together
# this gives a bit of a messy data frame because some columns are unique to girokonto or kreditkarte
dkb <- bind_rows(dkb_credit, dkb_debit)
dkb <- dkb %>%
select(konto, date_billed, date_booked, what, amount_eur, amount_orig, everything())
dkb %>% write_csv("data/cleaned/dkb.csv")
```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment