@mathiasfls
mathiasfls / old-school.ipynb
Created October 15, 2024 22:44 — forked from palewire/old-school.ipynb
"Old School" Machine Learning Classifier
mathiasfls / new-school.ipynb
Created October 15, 2024 22:44 — forked from palewire/new-school.ipynb
"New School" LLM Classifier
mathiasfls / script.sh
Created July 12, 2021 17:35 — forked from bplmp/script.sh
Extract IBAMA infraction notices from HTML to CSV
# download the data
curl http://dadosabertos.ibama.gov.br/dados/SIFISC/auto_infracao/auto_infracao/auto_infracao.html > autos.html
# generate the CSV
# (the first step is to split the lines, which speeds up processing)
cat autos.html \
| sed -e 's|<\/tr>|\n|g' \
| sed -e 's/.*<thead\sbgcolor\=\"\#808080\">//' \
-e 's|\salign\=\"center\"||g' \
-e 's|<\/th>|\|\|\||g' \
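
The preview above is cut off partway through the sed pipeline. As a rough alternative sketch (not the gist author's method), the same HTML-table-to-CSV idea can be tried with Python's standard library alone. The class name `TableRows`, the function `html_table_to_csv`, and the sample rows are hypothetical; only the `<thead bgcolor="#808080">` and `align="center"` attributes come from the sed expressions above.

```python
import csv
import io
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every <th>/<td> cell, grouped by <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows
        self._row = None   # cells of the row being read, None outside <tr>
        self._cell = None  # text fragments of the cell being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("th", "td") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def _end_cell(self):
        # Flush the current cell (if any) into the current row.
        if self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None

    def handle_endtag(self, tag):
        if tag in ("th", "td"):
            self._end_cell()
        elif tag == "tr" and self._row is not None:
            self._end_cell()
            self.rows.append(self._row)
            self._row = None


def html_table_to_csv(html: str) -> str:
    """Return the table rows found in `html` as CSV text."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(parser.rows)
    return out.getvalue()


# Made-up sample data, shaped like the table the sed script targets.
sample = (
    '<table><thead bgcolor="#808080"><tr>'
    '<th align="center">id</th><th align="center">valor</th>'
    "</tr></thead>"
    "<tr><td>1</td><td>1.500,00</td></tr></table>"
)
print(html_table_to_csv(sample))
```

A parser-based approach is slower to write than a sed one-liner, but it quotes cells that contain commas and does not depend on the exact attribute spelling in the markup.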
mathiasfls / json2csv.R
Created February 26, 2021 23:09 — forked from retrography/json2csv.R
JSON to CSV converter. Uses the `jsonlite` R package, flattens the entire hierarchical structure, and converts any remaining lists/arrays into strings.
#!/usr/bin/Rscript
if (!require(jsonlite, quietly = TRUE, warn.conflicts = FALSE)) {
  stop("This program requires the `jsonlite` package. Please install it by running `install.packages('jsonlite')` in R and then try again.")
}
args <- commandArgs(trailingOnly = TRUE)
if (length(args) == 0) {
  input <- readLines(file("stdin"))
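
The preview stops before the flattening step. The idea in the description (flatten nested objects, turn leftover lists into strings) can be sketched in Python with the standard library. This is an illustration under assumptions, not a port of the R script: the function names `flatten` and `json_to_csv`, the dotted column names, and the sample records are all hypothetical.

```python
import csv
import io
import json


def flatten(record, prefix=""):
    """Flatten nested dicts into dotted column names; stringify lists."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}."))
        elif isinstance(value, list):
            # Remaining arrays are kept as their JSON text.
            flat[name] = json.dumps(value)
        else:
            flat[name] = value
    return flat


def json_to_csv(text: str) -> str:
    """Convert a JSON object or array of objects to CSV text."""
    records = json.loads(text)
    if isinstance(records, dict):
        records = [records]
    rows = [flatten(r) for r in records]
    # Union of all column names, in first-seen order.
    columns = list(dict.fromkeys(col for row in rows for col in row))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=columns, restval="", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


# Made-up sample input.
sample = (
    '[{"id": 1, "user": {"name": "ana"}, "tags": ["a", "b"]},'
    ' {"id": 2, "user": {"name": "bo", "age": 30}}]'
)
print(json_to_csv(sample))
```

Records that lack a column get an empty cell, so objects with different shapes can share one header row.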
mathiasfls / FacebookFromR.r
Created December 8, 2020 15:24 — forked from epijim/FacebookFromR.r
Scrape Facebook from R. Based off
###############################################################################################
## ##
## Setup ##
## ##
###############################################################################################
# install.packages("Rfacebook") # from CRAN
# install.packages("Rook") # from CRAN
# install.packages("igraph") # from CRAN
mathiasfls / webserver.py
Created November 18, 2020 11:50 — forked from jph00/webserver.py
Minimal web server demo in Python (requires fastcore: `pip install fastcore`)
from fastcore.utils import *
host = 8888,'localhost'
sock = start_server(*host)
print(f'Serving on {host}...')
while True:
    conn,addr = sock.accept()
    with conn:
        data = conn.recv(1024)
        print(data.decode())
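
The preview ends before any reply is sent to the client. A minimal sketch of the full accept, read, respond cycle using only the standard `socket` module (so it does not use fastcore's `start_server`) might look like the following. The helper names `open_listener`, `serve_once`, and `fetch`, the response body, and the choice of `127.0.0.1` with an OS-assigned port are assumptions made for the example.

```python
import socket
import threading

RESPONSE = (
    b"HTTP/1.0 200 OK\r\n"
    b"Content-Type: text/html\r\n\r\n"
    b"<html><body>Hello</body></html>\n"
)


def open_listener(port=0, host="127.0.0.1"):
    """Bind and listen; port 0 asks the OS for any free port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    sock.listen()
    return sock


def serve_once(sock):
    """Accept one connection, print the request, send a fixed reply."""
    conn, addr = sock.accept()
    with conn:
        data = conn.recv(1024)
        print(data.decode())
        conn.sendall(RESPONSE)


def fetch(port, host="127.0.0.1"):
    """Send a bare GET request and read until the server closes."""
    with socket.create_connection((host, port)) as client:
        client.sendall(b"GET / HTTP/1.0\r\n\r\n")
        chunks = []
        while chunk := client.recv(1024):
            chunks.append(chunk)
    return b"".join(chunks)


# Serve a single request on a background thread and fetch it locally.
listener = open_listener()
port = listener.getsockname()[1]
worker = threading.Thread(target=serve_once, args=(listener,))
worker.start()
reply = fetch(port)
worker.join()
listener.close()
print(reply.split(b"\r\n")[0].decode())
```

Wrapping `serve_once` in `while True:` gives the same endless loop as the snippet above; serving one request keeps the example self-terminating.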
mathiasfls / index.js
Created June 20, 2020 13:32 — forked from bcks/index.js
tableau-covid-scraping
const URL = 'https://public.tableau.com/views/PPV_15924847800480/ppv_db?%3Aembed=y&%3AshowVizHome=no&%3Adisplay_count=y&%3Adisplay_static_image=n&%3AbootstrapWhenNotified=true&%3Alanguage=en&:embed=y&:showVizHome=n&:apiID=host0';
const puppeteer = require('puppeteer');
// Below, largely cribbed from Thomas Dondorf at https://stackoverflow.com/questions/52969381/how-can-i-capture-all-network-requests-and-full-response-data-when-loading-a-pag
(async () => {
  const browser = await puppeteer.launch();
  const [page] = await browser.pages();
<!DOCTYPE qgis_style>
<qgis_style version="2">
<symbols>
<symbol name="qartoon" force_rhr="0" clip_to_extent="1" alpha="1" type="fill">
<layer class="SimpleFill" enabled="1" locked="0" pass="0">
<prop v="3x:0,0,0,0,0,0" k="border_width_map_unit_scale"/>
<prop v="0,0,0,255" k="color"/>
<prop v="round" k="joinstyle"/>
<prop v="1,1" k="offset"/>
<prop v="3x:0,0,0,0,0,0" k="offset_map_unit_scale"/>
mathiasfls / cheerio.R
Created April 11, 2020 00:59 — forked from jeroen/cheerio.R
V8 cheerio rvest example
# Proof of concept of using V8 to parse HTML in R
# Example taken from rvest readme
# Jeroen Ooms, 2015
library(V8)
stopifnot(packageVersion("V8") >= "0.4")
# Get Document
html <- paste(readLines("http://www.imdb.com/title/tt1490017/"), collapse="\n")
mathiasfls / passive.R
Created August 19, 2019 21:55 — forked from almogsi/passive.R
This code snippet takes a vector of strings and calculates the percentage of passive voice in the input text. It uses the Stanford NLP tools and the coreNLP package for R.
library(rJava)
library(coreNLP)
initCoreNLP()
# In this case, 'test' is a data frame with a column named 'text'
for (i in 1:dim(test)[1]) {
  cat(paste0(i / dim(test)[1] * 100, '% completed'))
  ann <- annotateString(paste(test$text[i]), format = c("obj"), outputFile = NA, includeXSL = FALSE)
  gd <- getDependency(ann)