package example
import org.apache.avro.Schema.Parser
import java.io.{DataInput, DataOutput, File}
import org.apache.avro.generic.GenericData.Record
import org.apache.avro.generic.{GenericRecord, GenericDatumWriter}
import org.apache.avro.file.DataFileWriter
import org.apache.spark.SparkContext
import org.apache.avro.mapreduce.AvroKeyInputFormat
import org.apache.avro.mapred.AvroKey
@laserson
laserson / gist:1d1185b412b41057810b
Last active August 29, 2015 14:02
Running custom Spark build on a YARN cluster (for PySpark)

Building Spark for PySpark use on top of YARN

Build Spark on the local machine (required only if using PySpark; otherwise, building on a remote machine works). See http://spark.apache.org/docs/latest/building-with-maven.html

export MAVEN_OPTS="-Xmx2g -XX:MaxPermSize=512M -XX:ReservedCodeCacheSize=512m"
mvn -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.0 -DskipTests clean package

Copy the assembly/target/scala-2.10/...jar to the corresponding directory on the cluster node and also into a location in HDFS.

@swarminglogic
swarminglogic / watchfile.sh
Last active January 1, 2025 19:07
watchfile - monitor file(s) and execute a command when files are changed
#!/bin/bash
version=1.0.1
versionDate="2014-02-14"
function showHelp() {
echo "watchfile - monitor file(s)/command and perform action when changed
Possible ways of usage
----------------------------------------
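The "monitor file(s) and execute a command when files are changed" idea can be sketched as a small polling loop. This is a minimal illustration, not necessarily how watchfile.sh itself works: the function names `fingerprint` and `watch_and_run`, the argument order, and the use of `cksum` for change detection are all assumptions made for this example.

```shell
#!/bin/bash

# Hypothetical helper: print a content fingerprint for the given files.
# cksum is POSIX; a missing file simply yields no line for that file.
fingerprint() {
    cksum "$@" 2>/dev/null || true
}

# Hypothetical helper: poll the files and run a command when they change.
# Arguments: <interval-seconds> <max-polls, 0 = forever> <command string> <file>...
watch_and_run() {
    local interval=$1 max_polls=$2 command=$3
    shift 3
    local last current polls=0
    last=$(fingerprint "$@")
    while [ "$max_polls" -eq 0 ] || [ "$polls" -lt "$max_polls" ]; do
        sleep "$interval"
        current=$(fingerprint "$@")
        if [ "$current" != "$last" ]; then
            # Remember the new state first so a slow command is not re-triggered.
            last=$current
            bash -c "$command"
        fi
        polls=$((polls + 1))
    done
}
```

For example, `watch_and_run 1 0 'make' main.c util.c` would re-run `make` about once per second whenever either file's contents change. Polling a checksum is simple and portable, at the cost of re-reading the files on every poll.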
@clemsos
clemsos / gensim_workflow.py
Last active February 22, 2022 11:09
How to calculate TF-IDF similarity matrix of a complete corpus with Gensim
#!/usr/bin/env python
# -*- coding: utf-8 -*-
'''
This script just shows the basic workflow to compute a TF-IDF similarity matrix with Gensim
OUTPUT :
@lisamelton
lisamelton / convert-mp4-to-mkv.sh
Last active May 24, 2024 17:41
Convert MP4 video file into Matroska format without transcoding.
#!/bin/bash
#
# convert-video.sh
#
# Copyright (c) 2013-2014 Don Melton
#
about() {
cat <<EOF
$program 2.0 of December 3, 2014
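Converting "without transcoding" means copying the existing audio and video streams into a new container rather than re-encoding them. The sketch below shows that idea with `ffmpeg` and its stream-copy option; it is an assumption for illustration, not a reproduction of what convert-video.sh does, and the function names `mkv_name` and `remux_to_mkv` are hypothetical.

```shell
#!/bin/bash

# Hypothetical helper: derive the output name by swapping the
# final extension for .mkv (movie.mp4 -> movie.mkv).
mkv_name() {
    printf '%s\n' "${1%.*}.mkv"
}

# Hypothetical helper: remux one file into Matroska.
# "-c copy" copies the selected streams without re-encoding them.
# Assumes ffmpeg is on PATH; refuses to overwrite an existing output.
remux_to_mkv() {
    local input=$1 output
    output=$(mkv_name "$input")
    if [ -e "$output" ]; then
        echo "output already exists: $output" >&2
        return 1
    fi
    ffmpeg -i "$input" -c copy "$output"
}
```

Calling `remux_to_mkv movie.mp4` would write `movie.mkv` next to the input. Files that carry MP4-specific subtitle tracks may need extra handling, which this sketch does not attempt.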
@lisamelton
lisamelton / detect-crop.sh
Last active May 24, 2024 17:42
Detect crop values for video file to use with `mplayer` and `transcode-video.sh` (a wrapper script for `HandBrakeCLI`).
#!/bin/bash
#
# detect-crop.sh
#
# Copyright (c) 2013-2015 Don Melton
#
about() {
cat <<EOF
$program 3.3 of January 22, 2015
@lisamelton
lisamelton / transcode-video.sh
Last active April 29, 2025 20:17
Transcode video file (works best with Blu-ray or DVD rip) into MP4 (or optionally Matroska) format, with configuration and at bitrate similar to popular online downloads.
#!/bin/bash
#
# transcode-video.sh
#
# Copyright (c) 2013-2015 Don Melton
#
about() {
cat <<EOF
$program 5.13 of April 8, 2015
@rxaviers
rxaviers / gist:7360908
Last active July 19, 2025 01:42
Complete list of github markdown emoji markup

People

:bowtie: :bowtie: 😄 :smile: 😆 :laughing:
😊 :blush: 😃 :smiley: ☺️ :relaxed:
😏 :smirk: 😍 :heart_eyes: 😘 :kissing_heart:
😚 :kissing_closed_eyes: 😳 :flushed: 😌 :relieved:
😆 :satisfied: 😁 :grin: 😉 :wink:
😜 :stuck_out_tongue_winking_eye: 😝 :stuck_out_tongue_closed_eyes: 😀 :grinning:
😗 :kissing: 😙 :kissing_smiling_eyes: 😛 :stuck_out_tongue:
@paulgribble
paulgribble / gist:7291469
Created November 3, 2013 15:30
getting afni working on mac osx 10.9 mavericks with home-brew
Getting afni working on Mac OSX 10.9 Mavericks, using Homebrew
Briefly: parts of afni (e.g. uber_subject.py) require the pyqt library ... and pyqt requires qt ... but the homebrew install of qt is currently broken on osx mavericks 10.9 ... but there is a patch ... but pyqt also requires sip ... and there is a problem where pyqt doesn't like the version of sip, in particular with a newer haswell CPU macbook pro ... but there is a fix ... and we need to add lines to our .bash_profile for afni ... and be careful if you have other python things installed since afni expects to find python things in certain locations
Here is a step-by-step:
1. install Xcode command line tools
xcode-select --install