@jorandradefig
Last active September 22, 2018 00:00
OONI Probe Data Mining
## Install AWS Command Line Interface (CLI)
sudo pip install awscli
## Install LZ4 lossless compression algorithm
brew install lz4

### List, download, and decompress
aws --no-sign-request s3 ls s3://ooni-data/autoclaved/jsonl.tar.lz4/2018-09-20/
aws --no-sign-request s3 cp s3://ooni-data/autoclaved/jsonl.tar.lz4/2018-09-20/20180911T164715Z-US-AS6167-web_connectivity-20180911T164717Z_AS6167_xi0qFbLUII9Y68PhuOc7nZDOU2GtFKBdnEOZjw6hoUfZwAuw8W-0.2.0-probe.json.lz4 ./20180911T164715Z-US-AS6167-web_connectivity-20180911T164717Z_AS6167_xi0qFbLUII9Y68PhuOc7nZDOU2GtFKBdnEOZjw6hoUfZwAuw8W-0.2.0-probe.json.lz4
lz4 20180911T164715Z-US-AS6167-web_connectivity-20180911T164717Z_AS6167_xi0qFbLUII9Y68PhuOc7nZDOU2GtFKBdnEOZjw6hoUfZwAuw8W-0.2.0-probe.json.lz4

aws --no-sign-request s3 ls s3://ooni-data/autoclaved/jsonl/2018-09-20/
aws --no-sign-request s3 cp s3://ooni-data/autoclaved/jsonl/2018-09-20/20180916T112129Z-US-AS36850-telegram-20180916T112205Z_AS36850_2ZhPYFe3UwaNCSPCzPYm3sNOtxWAbhtexakvzddV29ZLPU4pF2-0.2.0-probe.json ./20180916T112129Z-US-AS36850-telegram-20180916T112205Z_AS36850_2ZhPYFe3UwaNCSPCzPYm3sNOtxWAbhtexakvzddV29ZLPU4pF2-0.2.0-probe.json

aws --no-sign-request s3 sync s3://ooni-data/autoclaved/jsonl/2018-09-20/ ./2018-09-20/

aws --no-sign-request s3 ls s3://ooni-data/autoclaved/jsonl.tar.lz4/ --recursive | grep -e "-MX-"

aws --no-sign-request s3 cp s3://ooni-data/autoclaved/jsonl.tar.lz4/ ./ooni-data/ --exclude "*" --include "*-MX-*" --recursive
find ./ooni-data/ -name \*.lz4 -exec lz4 {} \;

aws --no-sign-request s3 cp s3://ooni-data/autoclaved/jsonl.tar.lz4/ ./ooni-data/ --exclude "*" --include "*-MX-*" --recursive && find ./ooni-data/ -name \*.lz4 -exec lz4 {} \;
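
The download-then-decompress step above re-runs `lz4` on every archive each time. A minimal sketch of a re-runnable variant follows; `decompress_tree` is a hypothetical helper name, not part of any tool, and it assumes each output file should sit next to its archive with the `.lz4` suffix removed. It passes `-d` to make decompression explicit.

```shell
# Sketch: decompress every .lz4 file under a directory, skipping files
# that were already unpacked. decompress_tree is a hypothetical name.
decompress_tree() {
  dir=${1:-./ooni-data}
  find "$dir" -type f -name '*.lz4' | while IFS= read -r src; do
    dest=${src%.lz4}            # strip the .lz4 suffix for the output name
    [ -e "$dest" ] && continue  # already decompressed on a previous run
    lz4 -d "$src" "$dest"       # -d forces decompression
  done
}
```

Usage: `decompress_tree ./ooni-data/` after the `aws s3 cp ... --recursive` command has finished.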

aws --no-sign-request s3 cp s3://ooni-data/autoclaved/jsonl/ ./ooni-data/ --exclude "*" --include "*-MX-*" --recursive
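
The `*-MX-*` filter works because the report file names shown above start with dash-separated fields. A rough sketch of splitting those fields in the shell is below. `ooni_fields` is a hypothetical helper, and the field layout (timestamp, country code, ASN, test name) is inferred from the example file names in this gist, not from an official specification; it also assumes test names contain no dashes.

```shell
# Sketch: print the leading fields of an OONI report file name.
# Layout assumed from the examples above:
#   <timestamp>-<country>-<ASN>-<test_name>-<rest...>
# ooni_fields is a hypothetical helper name.
ooni_fields() {
  name=$(basename "$1")
  # Split on "-" into the first four fields; the remainder goes to _rest.
  IFS=- read -r ts cc asn test_name _rest <<EOF
$name
EOF
  printf '%s %s %s %s\n' "$ts" "$cc" "$asn" "$test_name"
}
```

Usage: `for f in ./ooni-data/*/*; do ooni_fields "$f"; done | sort | uniq -c` gives a quick overview of what was downloaded, assuming the files sit one directory level below `./ooni-data/`.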