@doppiomacchiatto
doppiomacchiatto / .bash_profile
Created April 9, 2016 18:01 — forked from natelandau/.bash_profile
Mac OSX Bash Profile
# ---------------------------------------------------------------------------
#
# Description: This file holds all my BASH configurations and aliases
#
# Sections:
# 1. Environment Configuration
# 2. Make Terminal Better (remapping defaults and adding functionality)
# 3. File and Folder Management
# 4. Searching
# 5. Process Management
# http://wiki.apache.org/solr/FAQ#How_can_I_delete_all_documents_from_my_index.3F
# http://wiki.apache.org/solr/UpdateXmlMessages#Updating_a_Data_Record_via_curl
curl http://index.websolr.com/solr/a0b1c2d3/update -H "Content-Type: text/xml" --data-binary '<delete><query>*:*</query></delete>'
doppiomacchiatto / s3bucketsize.py
Created October 26, 2016 16:07 — forked from robinkraft/s3bucketsize.py
Simple python script to calculate size of S3 buckets
import boto
s3 = boto.connect_s3(aws_id, aws_secret_key)
# based on http://www.quora.com/Amazon-S3/What-is-the-fastest-way-to-measure-the-total-size-of-an-S3-bucket
def get_bucket_size(bucket_name):
    bucket = s3.lookup(bucket_name)
    total_bytes = 0
    n = 0
    for key in bucket:
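The preview stops at the loop over the bucket's keys. A minimal sketch of how such a loop could total the sizes, assuming each key object exposes a `size` attribute in bytes (as boto's `Key` does). `FakeKey` and `total_size` are hypothetical names, used here so the example runs without AWS credentials; this is not the original script's remaining code.

```python
from collections import namedtuple

# Hypothetical stand-in for a boto Key; only the `size` attribute (bytes) is used.
FakeKey = namedtuple("FakeKey", ["name", "size"])


def total_size(keys):
    """Return (total_bytes, key_count) for an iterable of objects with a `size` attribute."""
    total_bytes = 0
    n = 0
    for key in keys:
        total_bytes += key.size
        n += 1
    return total_bytes, n


if __name__ == "__main__":
    keys = [FakeKey("a.txt", 1024), FakeKey("b/c.bin", 2048)]
    total, count = total_size(keys)
    print("%d keys, %.2f KiB" % (count, total / 1024.0))
```

With a real bucket, the object returned by `s3.lookup(bucket_name)` could presumably be passed to `total_size` directly, since iterating it yields keys.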
doppiomacchiatto / gist:c9bfb52200d9fbaa914c198986904079
Created January 22, 2017 23:53 — forked from tott/gist:3895832
create cpu load in python
#!/usr/bin/env python
"""
Produces load on all available CPU cores
"""
from multiprocessing import Pool
from multiprocessing import cpu_count
def f(x):
    while True:
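The preview is cut off inside the busy loop. A minimal sketch of the same idea with a time limit, so the load stops on its own instead of spinning forever. `burn` and `run_load` are hypothetical names, not taken from the original gist.

```python
import time
from multiprocessing import Pool, cpu_count


def burn(seconds):
    """Spin in a tight loop for roughly `seconds` and return the iteration count."""
    deadline = time.time() + seconds
    iterations = 0
    while time.time() < deadline:
        iterations += 1
    return iterations


def run_load(seconds, workers=None):
    """Run one bounded busy loop per worker process (default: one per CPU core)."""
    workers = workers or cpu_count()
    with Pool(workers) as pool:
        return pool.map(burn, [seconds] * workers)
```

Call `run_load(2.0)` from under an `if __name__ == "__main__":` guard, which `multiprocessing` needs on platforms that spawn rather than fork worker processes.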
doppiomacchiatto / ls.py
Created February 24, 2017 17:38 — forked from jbeezley/ls.py
Recursively list files in s3
#!/usr/bin/env python
import sys
import json
from boto.s3.connection import S3Connection
from boto.s3.prefix import Prefix
from boto.s3.key import Key
bucketname = sys.argv[1]
delimiter = '/'
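The preview ends after setting the delimiter. As an illustration of what a `/` delimiter does to a flat S3 key space, here is a small pure-Python sketch that groups key names into "directories" (common prefixes) and files, then walks them recursively. `list_level` and `walk` are hypothetical helpers that work on plain strings; they do not call boto and are not the original script's code.

```python
def list_level(keys, prefix="", delimiter="/"):
    """Split flat key names under `prefix` into (common_prefixes, files)."""
    prefixes, files = set(), []
    for name in keys:
        if not name.startswith(prefix):
            continue
        rest = name[len(prefix):]
        head, sep, _ = rest.partition(delimiter)
        if sep:
            # Something follows the delimiter, so this key lives in a sub-prefix.
            prefixes.add(prefix + head + delimiter)
        else:
            files.append(name)
    return sorted(prefixes), sorted(files)


def walk(keys, prefix="", delimiter="/"):
    """Recursively yield every file name, descending one common prefix at a time."""
    prefixes, files = list_level(keys, prefix, delimiter)
    for name in files:
        yield name
    for sub in prefixes:
        for name in walk(keys, sub, delimiter):
            yield name
```

For example, `list_level(["a.txt", "d/b.txt"])` separates the top-level file `a.txt` from the prefix `d/`, which `walk` then descends into.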
doppiomacchiatto / role_arn_to_session.py
Created March 28, 2017 12:55 — forked from gene1wood/role_arn_to_session.py
Simple python function to assume an AWS IAM Role from a role ARN and return a boto3 session object
import boto3
def role_arn_to_session(**args):
    """
    Usage :
        session = role_arn_to_session(
            RoleArn='arn:aws:iam::012345678901:role/example-role',
            RoleSessionName='ExampleSessionName')
        client = session.client('sqs')
    """
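The function body is not shown in the preview. One way such a function could work, sketched under the assumption that it calls STS `assume_role` and feeds the temporary credentials into a `boto3.Session`. The helper names below are hypothetical, and the original implementation may differ.

```python
def session_kwargs_from_assume_role(response):
    """Map an STS assume_role response to boto3.Session keyword arguments."""
    creds = response["Credentials"]
    return {
        "aws_access_key_id": creds["AccessKeyId"],
        "aws_secret_access_key": creds["SecretAccessKey"],
        "aws_session_token": creds["SessionToken"],
    }


def assume_role_session(**kwargs):
    """Assume the role described by kwargs (RoleArn, RoleSessionName, ...) and return a session."""
    import boto3  # imported lazily so the helper above works without boto3 installed

    response = boto3.client("sts").assume_role(**kwargs)
    return boto3.Session(**session_kwargs_from_assume_role(response))
```

The credentials returned by STS are temporary, so a session built this way stops working once they expire.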
doppiomacchiatto / checkDockerDisks.sh
Created June 23, 2017 15:44 — forked from robsonke/checkDockerDisks.sh
This Bash script loops through all running Docker containers on a host and lists the disk usage per mount. If usage on any mount exceeds 65%, it emails you.
#!/bin/bash
# get all running docker container names
containers=$(sudo docker ps | awk '{if(NR>1) print $NF}')
host=$(hostname)
# loop through all containers
for container in $containers
do
  echo "Container: $container"
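The preview ends before the disk check itself. The threshold test from the description can be sketched in Python as follows, assuming the per-container usage is available as POSIX-format `df -P` output. `mounts_over_threshold` is a hypothetical helper, not part of the original script, and it ignores mount points that contain spaces.

```python
def mounts_over_threshold(df_output, threshold=65):
    """Return [(mount_point, used_percent)] for rows of `df -P` output above `threshold`."""
    flagged = []
    for line in df_output.strip().splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 6:
            continue
        pct = fields[4].rstrip("%")
        if not pct.isdigit():
            continue  # e.g. "-" for pseudo filesystems
        used = int(pct)
        if used > threshold:
            flagged.append((fields[5], used))
    return flagged
```

A non-empty result is the condition under which the script's email step would fire.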
doppiomacchiatto / ecs.json
Created June 28, 2017 02:00 — forked from caevyn/ecs.json
ecs definition
{
  "taskDefinitionArn": "arn:aws:ecs:us-west-2:<scc number>:task-definition/build-blog:3",
  "revision": 3,
  "containerDefinitions": [
    {
      "volumesFrom": [],
      "portMappings": [],
      "command": [],
      "environment": [
        {

Count stats for twitter stream and store in Cassandra

cd $SPARK_HOME

./bin/spark-submit --packages TargetHolding/pyspark-cassandra:0.3.5 /Users/drehman/Apps/workspace/spark_cassandra_stream_example.py

python twitter_rolling_count.py -q data -d data 2>&1 | nc -lk 10.0.0.235 9999
doppiomacchiatto / sentiment_classification.py
Created July 19, 2017 19:54 — forked from bonzanini/sentiment_classification.py
Sentiment analysis with scikit-learn
# You need to install scikit-learn:
# sudo pip install scikit-learn
#
# Dataset: Polarity dataset v2.0
# http://www.cs.cornell.edu/people/pabo/movie-review-data/
#
# Full discussion:
# https://marcobonzanini.wordpress.com/2015/01/19/sentiment-analysis-with-python-and-scikit-learn