# SSH into your AWS instance and set up s3fs:
sudo yum install git gcc libstdc++-devel gcc-c++ fuse fuse-devel curl-devel libxml2-devel openssl-devel mailcap automake
git clone https://github.com/s3fs-fuse/s3fs-fuse.git
cd s3fs-fuse/
./autogen.sh
./configure --prefix=/usr
make
@alexwoolford
alexwoolford / stakoverflowPythonMySQLquandry.py
Last active August 29, 2015 14:07
Stackoverflow: why-when-add-where-clause-in-sql-statement-query-return-no-data
#http://stackoverflow.com/questions/26500406/why-when-add-where-clause-in-sql-statement-query-return-no-data
"""
create table tb_test
(
id integer,
name varchar(255),
date date
);
insert into tb_test (id, name, date) values (0,'Mike','2014-04-24');
alexwoolford / parseOutValidIPs.py
Created October 25, 2014 19:03
How to parse valid IPs from a log file (Stack Overflow answer)
"""
http://stackoverflow.com/questions/26564513/python-valid-ips-from-each-line-on-a-text-file/26564920#26564920
iplog.txt:
Host : 75.75.75.75 , DNS : resolved dns , Location : USA
Host : 266.266.266.266 , DNS : resolved dns , Location : USA
Host : 10.0.1.1 , DNS : resolved dns , Location : USA
ipclear.txt:
75.75.75.75
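A minimal sketch of one way to do the filtering described above, assuming Python 3 and the standard-library `ipaddress` module. It is not necessarily the approach taken in the linked answer. The names `CANDIDATE`, `valid_ips`, and `sample` are illustrative, and the sample lines are copied from the iplog.txt excerpt.

```python
import ipaddress
import re

# Loose pattern: four dot-separated groups of 1-3 digits. Range checking is
# left to ipaddress, which rejects octets above 255.
CANDIDATE = re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b")


def valid_ips(lines):
    """Yield each syntactically valid IPv4 address found in the given lines."""
    for line in lines:
        for candidate in CANDIDATE.findall(line):
            try:
                ipaddress.IPv4Address(candidate)
            except ipaddress.AddressValueError:
                continue  # e.g. 266.266.266.266
            yield candidate


# Sample lines taken from the iplog.txt excerpt above.
sample = [
    "Host : 75.75.75.75 , DNS : resolved dns , Location : USA",
    "Host : 266.266.266.266 , DNS : resolved dns , Location : USA",
    "Host : 10.0.1.1 , DNS : resolved dns , Location : USA",
]

for ip in valid_ips(sample):
    print(ip)
```

With this check, 10.0.1.1 is kept because it is syntactically valid, even though it is a private address. Filtering out private ranges would need an extra test such as `ipaddress.IPv4Address(candidate).is_private`.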
alexwoolford / sendFromGmail.py
Last active August 29, 2015 14:08
Send from Gmail
#!/usr/bin/env python
# suggested gmail code for StackOverflow question: http://stackoverflow.com/questions/26772416/how-to-write-a-program-to-check-website-for-a-string
import smtplib
from email.mime.text import MIMEText
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart
def sendEmail(fileName, emailTo):
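The preview stops at the function signature, so the body is not shown. The following is only a sketch of how a function with these imports might build and send a message with an attachment. It assumes Python 3, Gmail's SMTP submission endpoint (smtp.gmail.com, port 587, STARTTLS), and credentials supplied by the caller. The names `build_message` and `send_via_gmail`, the subject line, and the body text are hypothetical and do not come from the gist.

```python
import os
import smtplib
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_message(file_name, email_from, email_to, subject="File attached"):
    """Assemble a multipart message with a short text body and one attachment."""
    msg = MIMEMultipart()
    msg["From"] = email_from
    msg["To"] = email_to
    msg["Subject"] = subject
    msg.attach(MIMEText("See the attached file.", "plain"))

    # Read the file as bytes and attach it under its base name.
    with open(file_name, "rb") as handle:
        part = MIMEApplication(handle.read())
    part.add_header(
        "Content-Disposition", "attachment", filename=os.path.basename(file_name)
    )
    msg.attach(part)
    return msg


def send_via_gmail(msg, username, password):
    """Send msg through Gmail over STARTTLS (assumed endpoint: smtp.gmail.com:587)."""
    with smtplib.SMTP("smtp.gmail.com", 587) as server:
        server.starttls()
        server.login(username, password)
        server.send_message(msg)
```

Building the message separately from sending it makes the construction testable without network access. Google accounts generally need an app password rather than the account password for SMTP login.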
alexwoolford / nasdaqCalendar188Rows.py
Last active August 29, 2015 14:08
Nasdaq Calendar StackOverflow question
# http://stackoverflow.com/questions/26793632/beautifulsoup-unable-to-to-read-the-complete-html-table
# coding: utf-8
# In[1]:
import urllib2
from bs4 import BeautifulSoup
alexwoolford / listComprehensionAndSets.py
Created November 7, 2014 04:34
List comprehension and sets
# coding: utf-8
# http://stackoverflow.com/questions/26794029/how-to-remove-duplicate-letters-in-a-comma-separated-cell
# In[1]:
text = """A,B,B,C
G,G,A,T
G,A,A
T,T"""
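A sketch of how the duplicates in each row might be removed, assuming the goal is to keep the first occurrence of every letter in its original order. A set tracks what has already been seen, and a list comprehension applies the helper to each line. The `text` value is repeated from the snippet above so the block runs on its own; `dedupe_cell` and `deduped` are illustrative names.

```python
text = """A,B,B,C
G,G,A,T
G,A,A
T,T"""


def dedupe_cell(cell):
    """Drop repeated letters from one comma-separated cell, keeping first-seen order."""
    seen = set()
    kept = []
    for letter in cell.split(","):
        if letter not in seen:
            seen.add(letter)
            kept.append(letter)
    return ",".join(kept)


# One deduplicated string per input line.
deduped = [dedupe_cell(line) for line in text.splitlines()]
print(deduped)  # ['A,B,C', 'G,A,T', 'G,A', 'T']
```

If order does not matter, `",".join(sorted(set(cell.split(","))))` is shorter, but it sorts the letters instead of preserving their original sequence.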

Keyless SSH with Ansible

First, generate a set of SSH keys:

ssh-keygen

Set up ~/.ssh/config so that, by default, we log in as a specific user (in this case, root):

Host hadoop01

alexwoolford / mapr-installer.log
Last active August 29, 2015 14:15
output from /opt/mapr-installer/var/mapr-installer.log
awoolford@hadoop01:~$ cat /opt/mapr-installer/var/mapr-installer.log
2015-02-12 10:37:03,968 mapr-install 139 [INFO]:
2015-02-12 10:37:03,968 mapr-install 140 [INFO]: ================================
2015-02-12 10:37:03,968 mapr-install 141 [INFO]: Installer Version: 4.0.2.136 started
2015-02-12 10:37:03,971 common 398 [INFO]: Now querying package python-pycurl
2015-02-12 10:37:03,989 common 403 [INFO]: Package: python-pycurl
Status: install ok installed
Priority: optional
Section: python
Installed-Size: 215
alexwoolford / hive-site.xml
Last active August 29, 2015 14:15
/opt/mapr/hive/hive-0.13/conf/hive-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
alexwoolford / hadoop_conf-details_print-all-effective-properties.xml
Created February 20, 2015 05:59
The output from "hadoop conf-details print-all-effective-properties"
<?xml version="1.0" encoding="UTF-8" standalone="no"?><configuration>
<property><name>mapreduce.job.ubertask.enable</name><value>false</value><source>mapred-default.xml</source></property>
<property><name>yarn.resourcemanager.delayed.delegation-token.removal-interval-ms</name><value>30000</value><source>yarn-default.xml</source></property>
<property><name>yarn.resourcemanager.max-completed-applications</name><value>10000</value><source>yarn-default.xml</source></property>
<property><name>io.bytes.per.checksum</name><value>512</value><source>core-default.xml</source></property>
<property><name>yarn.timeline-service.leveldb-timeline-store.read-cache-size</name><value>104857600</value><source>yarn-default.xml</source></property>
<property><name>mapreduce.client.submit.file.replication</name><value>10</value><source>mapred-default.xml</source></property>
<property><name>mapreduce.shuffle.connection-keep-alive.enable</name><value>false</value><source>mapred-default.xml</source></property>
<property><name>yarn.node