import dateparser
import re
import requests
from bs4 import BeautifulSoup
import csv
from collections import defaultdict

url = ".../program.html"
track_url = ".../tracks.html"
css_url = "..../program.css"
# see https://github.com/sparklemotion/sqlite3-ruby/issues/383#issuecomment-1473000796
SQLITE_EXTENSIONS_DIR=/opt/sqlite-extensions # must exist
cd /tmp
git clone https://github.com/sqlite/sqlite.git
cd sqlite
gcc -g -fPIC -shared ./ext/misc/spellfix.c -o spellfix.o
sudo mv spellfix.o "$SQLITE_EXTENSIONS_DIR"
cd /tmp && rm -rf sqlite # clean up (leave the checkout before removing it)
#!/usr/bin/env bash
# run as root user
apt install -y git ruby-dev curl
# RUBY
cd ~
git clone https://github.com/rbenv/rbenv.git ~/.rbenv
echo 'eval "$(~/.rbenv/bin/rbenv init - bash)"' >> ~/.bashrc
library(tidyverse)
library(tidytext)
library(xts)
library(tstools)
base_dir = "C:\\Users\\Boulanger\\ownCloud\\Abteilung 3\\Projekte\\Legal Theory Graph"
corpus_dirs = c()
for (dir in list.dirs(base_dir)) {
  if (dir != base_dir) {
    corpus_dirs = c(corpus_dirs, dir)
ref | 4 Giordano, in: Mecklenburg (Hrsg.), Handbuch deutscher Rechtsextremismus, 1996,
    | S. 14; auch Hoffmann-Riem, NJW 2004, 2777 (2778) wirft vor, die deutsche Gesell-
    | schaft habe „bei der Aufarbeitung dieser Vergangenheit weithin versagt“; Falk, Ent-
    | nazifizierung und Kontinuität, 2017, S. 14 spricht von einer „Verdrängung von all
    | dem, was mit der NS-Zeit negativ in Verbindung gebracht werden konnte.“; zu den
    | Schwierigkeiten bei der Entnazifizierung auch Görtemaker/Safferling, Die Akte Ro-
    | senburg, 2016, S. 63 ff.; vgl. auch Bauer, Die Wurzeln faschistischen und nationalso-
    | zialistischen Handelns, 1961, S. 7.
    | 5 Dazu die akribische Aufbereitung von Will, Ephorale Verfassung, 2017.
    | 6 Salzborn, Rechtsextremismus, 2. Aufl. 2015, S. 36 f.
signal_phrases = [[
    "see", "cf.", "e.g.", "compare", "on", "for example",
    "on the contrary", "on the other hand",
    "generally", "similarly", "alternatively",
    "the discussion in", "regarding", "on this, see",
    "also", "although", "as discussed in", "detailed", "described", "cited in", "by",
    "first published as", "reprinted in"
], [
    "siehe", "s.", "vgl.", "vgl",
    "vgl. nur", "zum ganzen", "so z.b.", "so auch", "bei",
<text>1 Über 50% der Deutschen verbringen jeden Abend und über 70% jedes Wochenende zu Hause, vgl. </text><ref>Michael Andritzky/Gert Selle (Hrsg.), Lernbereich Wohnen Band 1, Reinbek 1979, S. 13</ref>.
<text>2 Der Anteil der Mietwohnungen betrug 1978 63 % (</text><ref>Lothar Herberger und Mitarbeiter, Bestand und Struktur der Gebäude und Wohnungen, in: Wirtschaft und Statistik 1980, S. 283-291, 286</ref><text>), in Großstädten sind sogar mehr als 80 % der Wohnungen vermietet (</text><ref>Rudi Ulbrich, a.a.O., Anm. 3, S. 18</ref><text>).</text>
<text>3 Die 1 %-Wohnungsstichprobe 1978 hat ergeben, daß einer Zahl von 24,3 Mio. Haushalten nur 23,4 Mio. Wohnungen gegenüberstehen. Hiervon stehen knapp 700000 leer und etwa 200000 dienen als Zweitwohnungen, vgl. hierzu ausführlich </text><ref>Rudi Ulbrich, Die Wohnungsversorgung im Spiegel der Statistik, in: Joachim Brech (Hrsg.), Wohnen zur Miete, Weinheim 1981, S. 16-21</ref>
<text>6 </text><ref>Bericht der Bundesregierung über die Auswirkunge |
require 'anystyle'
require 'active_graph'
require 'serrano'
# connect to Neo4j
url = 'neo4j+s://4dcc21ca.databases.neo4j.io'
# read the password from the environment instead of hardcoding it;
# the variable name NEO4J_PASSWORD is an assumption, not part of the original script
auth = Neo4j::Driver::AuthTokens.basic('neo4j', ENV.fetch('NEO4J_PASSWORD'))
ActiveGraph::Base.driver = Neo4j::Driver::GraphDatabase.driver(url, auth, encryption: false)
# setup models
from gensim.models.phrases import Phrases, Phraser, ENGLISH_CONNECTOR_WORDS
from gensim.parsing.preprocessing import remove_stopwords, preprocess_string, strip_tags, strip_punctuation, strip_numeric
import regex as re
import pandas as pd
# both dataframes have "title", "abstract", "published" columns
df_ngram = pd.read_pickle("journal-ngram-corpus.pkl")
df_analysis = pd.read_pickle("journal-analysis.pkl")
import {default as fetch} from 'node-fetch';
const { pdf } = require("pdf-to-img");
import {tmpdir} from "os";
import {createWriteStream, createReadStream} from 'fs';
import * as fsp from 'fs/promises';
import * as archiver from 'archiver';
import {ArchiverError} from "archiver";
import * as path from "path";
import {Parser, Builder} from "xml2js";