Skip to content

Instantly share code, notes, and snippets.

@snowindy
snowindy / spark-create-rdd-from-s3-parallel.scala
Last active September 24, 2020 12:25
This code allows parallel loading of data from S3 to Spark RDD. Support multiple paths to load from. Based on http://tech.kinja.com/how-not-to-pull-from-s3-using-apache-spark-1704509219
val s3Paths = "s3://yourbucket/path/to/file1.txt,s3://yourbucket/path/to/directory"
val pageLength = 100
val key = "YOURKEY"
val secret = "YOUR_SECRET"
import com.amazonaws.services.s3._, model._
import com.amazonaws.auth.BasicAWSCredentials
import com.amazonaws.services.s3.model.ObjectListing
import scala.collection.JavaConverters._
import scala.io.Source
@mrtns
mrtns / gist:78d15e3263b2f6a231fe
Last active January 19, 2025 08:02
Upgrade Chrome from Command Line on Ubuntu
# Install
# via http://askubuntu.com/questions/510056/how-to-install-google-chrome
wget -q -O - https://dl-ssl.google.com/linux/linux_signing_key.pub | sudo apt-key add -
sudo sh -c 'echo "deb http://dl.google.com/linux/chrome/deb/ stable main" >> /etc/apt/sources.list.d/google-chrome.list'
sudo apt-get update
sudo apt-get install google-chrome-stable
# Update
@ashrithr
ashrithr / FIleSystemOperations.java
Created February 26, 2015 01:27
HDFS FileSystems API example
package com.cloudwick.mapreduce.FileSystemAPI;
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
@ktheory
ktheory / dd.log
Last active November 10, 2023 23:41
EC2 EBS-SSD vs instance-store performance on an EBS-optimized m3.2xlarge
# /tmp/test = EBS-SSD
# /mnt/test = instance-store
root@ip-10-0-2-6:~# dd bs=1M count=256 if=/dev/zero of=/tmp/test
256+0 records in
256+0 records out
268435456 bytes (268 MB) copied, 3.26957 s, 82.1 MB/s
root@ip-10-0-2-6:~# dd bs=1M count=256 if=/dev/zero of=/tmp/test
256+0 records in
256+0 records out
@staltz
staltz / introrx.md
Last active May 15, 2025 10:37
The introduction to Reactive Programming you've been missing
@danielecook
danielecook / LCR_region.sh
Last active March 7, 2017 08:24
Generate Low Complexity Region (LCR) bedfile of masked regions from UCSC repeatmasker data and its complement for use with bcftools
#!/bin/bash
wget 'http://hgdownload.soe.ucsc.edu/goldenPath/ce10/database/rmsk.txt.gz' -O LCR_rmsk.txt.gz
gunzip -kfc LCR_rmsk.txt.gz | grep 'Low_complexity' | cut -f 6,7,8 > LCR_ce10_rmsk.bed
rm LCR_rmsk.txt.gz
# Generate the set of regions complementary (e.g. NOT low complexity)
# Download c. elegans chromosome information
mysql --user=genome --host=genome-mysql.cse.ucsc.edu -A -e "select chrom, size from ce10.chromInfo" > ce10.genome
bedtools complement -i LCR_ce10_rmsk.bed -g ce10.genome | sort -k 1,1 -k2,2n > LCR_complement_ce10.bed
@kencharos
kencharos / StreamUtil.java
Created March 31, 2014 02:44
zip in java8
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.BiFunction;
import java.util.stream.LongStream;
import java.util.stream.Stream;
/**
*
*/
@pkuczynski
pkuczynski / LICENSE
Last active March 14, 2025 14:12
Read YAML file from Bash script
MIT License
Copyright (c) 2014 Piotr Kuczynski
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWAR

Virtual DOM and diffing algorithm

There was a [great article][1] about how react implements it's virtual DOM. There are some really interesting ideas in there but they are deeply buried in the implementation of the React framework.

However, it's possible to implement just the virtual DOM and diff algorithm on it's own as a set of independent modules.

@jbenet
jbenet / simple-git-branching-model.md
Last active May 3, 2025 18:07
a simple git branching model

a simple git branching model (written in 2013)

This is a very simple git workflow. It (and variants) is in use by many people. I settled on it after using it very effectively at Athena. GitHub does something similar; Zach Holman mentioned it in this talk.

Update: Woah, thanks for all the attention. Didn't expect this simple rant to get popular.