@gagliardetto
gagliardetto / ngram-cleaner.sh
Last active February 19, 2016 11:54
Terminal commands to download and clean Google ngram datasets with a set of regular expressions.
#!/bin/bash
cat lista.txt | while read -r file; do wget "$file" && gzip -dc "${file##*/}" | \
tr '[:upper:]' '[:lower:]' | sed \
-e 's/\(_\?noun_\?\)\|\(_\?adp_\?\)\|\(_\?adv_\?\)\|\(_\?det_\?\)\|\(_\?verb_\?\)\|\(_\?adj_\?\)\|\(_\?end_\?\)\|\(_\?conj_\?\)\|\(S\?_\?pron_\?\)\|\(_\?num_\?\)/ /g' \
-e "s/\([^a-zàéèìòù '.]\)/ /g" \
-e "s/\s\{2,\}/ /g" \
-e "s/\.\{2,\}/./g" \
-e "s/[^a-zàéèìòù.]\{1,\}\.\{1,\}//g" \
-e "s/\s\{2,\}/ /g" \
@gagliardetto
gagliardetto / google-ngrams_2grams-to-5grams_without-punctuation.txt
Last active October 4, 2017 23:23
Calculate the uncompressed size of remote gzip files by downloading them, decompressing to stdout and counting bytes.
https://storage.googleapis.com/books/ngrams/books/googlebooks-ita-all-2gram-20120701-a_.gz
https://storage.googleapis.com/books/ngrams/books/googlebooks-ita-all-2gram-20120701-aa.gz
https://storage.googleapis.com/books/ngrams/books/googlebooks-ita-all-2gram-20120701-ab.gz
https://storage.googleapis.com/books/ngrams/books/googlebooks-ita-all-2gram-20120701-ac.gz
https://storage.googleapis.com/books/ngrams/books/googlebooks-ita-all-2gram-20120701-ad.gz
https://storage.googleapis.com/books/ngrams/books/googlebooks-ita-all-2gram-20120701-ae.gz
https://storage.googleapis.com/books/ngrams/books/googlebooks-ita-all-2gram-20120701-af.gz
https://storage.googleapis.com/books/ngrams/books/googlebooks-ita-all-2gram-20120701-ag.gz
https://storage.googleapis.com/books/ngrams/books/googlebooks-ita-all-2gram-20120701-ah.gz
https://storage.googleapis.com/books/ngrams/books/googlebooks-ita-all-2gram-20120701-ai.gz
@gagliardetto
gagliardetto / golang_job_queue.md
Created August 5, 2016 07:26 — forked from harlow/golang_job_queue.md
Job queues in Golang
package main
import (
"bufio"
"encoding/csv"
"encoding/json"
"fmt"
"io"
"os"
"path/filepath"

1. Clone your fork:

git clone git@github.com:YOUR-USERNAME/YOUR-FORKED-REPO.git

2. Add remote from original repository in your forked repository:

cd into/cloned/fork-repo
git remote add upstream https://github.com/ORIGINAL-DEV-USERNAME/REPO-YOU-FORKED-FROM.git
git fetch upstream
@gagliardetto
gagliardetto / client.go
Created July 29, 2017 16:08 — forked from jzelinskie/client.go
grpc bidirectional streams in golang
package main
import (
"log"
"time"
"golang.org/x/net/context"
"google.golang.org/grpc"
pb "github.com/jzelinskie/grpc/simple"
APIs:
-
method: GET
url: http://idowhcjymo.apps.wk-0.us-east-1.aws.dev.magalix.cloud:8001/rpush/guestbook/Hello%20World
body: ~
PATTERNS:
-
cron: "0 1 * * * *" # Run every hh:01 (00:01, 01:01, 02:01, ...)
concurrent_calls: 10
@gagliardetto
gagliardetto / gob.go
Created November 8, 2017 14:27 — forked from whyrusleeping/gob.go
golang gob interface example
package main
import (
"bytes"
"encoding/gob"
"fmt"
)
type MyFace interface {
A()
@gagliardetto
gagliardetto / client.go
Created March 24, 2018 16:31 — forked from spikebike/client.go
TLS server and client
package main
import (
"crypto/tls"
"crypto/x509"
"fmt"
"io"
"log"
)
package main
import "fmt"
func main() {
user := User{}
// Set the group:
groupSetter("admin", user)