@jnatkins
jnatkins / pom.xml
Created August 29, 2012 18:56
A sample POM for setting up a basic Maven project for CDH application development
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<!-- Replace the group ID with your group ID -->
<groupId>com.mycompany.hadoopproject</groupId>
<!-- Replace the artifact ID with the name of your project -->
<artifactId>my-hadoop-project</artifactId>
<version>1.0-SNAPSHOT</version>
<packaging>jar</packaging>
jnatkins / tweet.json
Created September 14, 2012 21:56
Sample Tweet
{
   "retweeted_status": {
      "contributors": null,
      "text": "#Crowdsourcing – drivers already generate traffic data for your smartphone to suggest alternative routes when a road is clogged. #bigdata",
      "geo": null,
      "retweeted": false,
      "in_reply_to_screen_name": null,
      "truncated": false,
      "entities": {
         "urls": [],
jnatkins / gist:3744233
Created September 18, 2012 16:48
Query Results
created_at | entities | text | user
Mon Sep 10 21:19:23 +0000 2012 | {"urls":[],"user_mentions":[{"screen_name":"ScottOstby","name":"Scott Ostby"}],"hashtags":[{"text":"Crowdsourcing"}]} | RT @ScottOstby: #Crowdsourcing – drivers already generate traffic data for your smartphone to suggest alternative routes when a road is ... | {"screen_name":"ParvezJugon","name":"Parvez Jugon","friends_count":299,"followers_count":70,"statuses_count":1294,"verified":false,"utc_offset":null,"time_zone":null}
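The created_at value in the query results above uses Twitter's timestamp layout (for example, Mon Sep 10 21:19:23 +0000 2012). As a minimal sketch, assuming the goal is to turn that string into a java.util.Date and print it in UTC, the standard SimpleDateFormat class can parse it. The class and method names below (CreatedAtParser, parse, toIsoUtc) are hypothetical and are not taken from the original gists.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper, not part of the original gists.
public class CreatedAtParser {
    // Layout of the created_at field, e.g. "Mon Sep 10 21:19:23 +0000 2012".
    private static final String TWITTER_PATTERN = "EEE MMM dd HH:mm:ss Z yyyy";

    // Parses a created_at string; Locale.US keeps the English day and month
    // names working regardless of the JVM's default locale.
    public static Date parse(String createdAt) {
        // SimpleDateFormat is not thread-safe, so a new instance is created per call.
        SimpleDateFormat in = new SimpleDateFormat(TWITTER_PATTERN, Locale.US);
        try {
            return in.parse(createdAt);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Unparseable created_at: " + createdAt, e);
        }
    }

    // Formats the parsed instant as an ISO-8601 style string in UTC.
    public static String toIsoUtc(String createdAt) {
        SimpleDateFormat out = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss'Z'", Locale.US);
        out.setTimeZone(TimeZone.getTimeZone("UTC"));
        return out.format(parse(createdAt));
    }

    public static void main(String[] args) {
        // prints 2012-09-10T21:19:23Z
        System.out.println(toIsoUtc("Mon Sep 10 21:19:23 +0000 2012"));
    }
}
```

Wrapping the checked ParseException in an IllegalArgumentException is only one possible design choice; code that must tolerate malformed rows might prefer to return null or skip the record instead.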
package com.cloudera.nile.etl;
import java.io.IOException;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Iterator;
import java.util.TimeZone;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
jnatkins / KijiAndHBaseTestLoader.java
Created March 4, 2013 19:06
A loader for generating a small set of sample data for a Kiji and HBase table
package org.kiji.examples.importers;
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;
jnatkins / DeleteTest.java
Created March 7, 2013 20:25
Test deleting a single HBase cell, while avoiding masking of earlier versions.
package org.kiji.examples.importers;
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Delete;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;