- If values are integers in [0, 255], Parquet will automatically compress them to 1-byte unsigned integers, decreasing the size of the saved DataFrame by up to a factor of 8 (for example, relative to 8-byte integers).
- Partition DataFrames to have evenly-distributed, ~128MB partition sizes (empirical finding). Always err on the higher side w.r.t. number of partitions.
- Pay particular attention to the number of partitions when using `flatMap`, especially if the following operation will result in high memory usage. The `flatMap` op usually results in a DataFrame with a [much] larger number of rows, yet the number of partitions will remain the same. Thus, if a subsequent op causes a large expansion of memory usage (e.g., converting a DataFrame of indices to a DataFrame of large Vectors), the memory usage per partition may become too high. In this case, it is beneficial to repartition the output of `flatMap` to a number of partitions that will safely allow for appropriate partition memory sizes, based upon the
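
The partition-sizing advice in the list above can be turned into simple arithmetic. The following is a minimal sketch in plain Python, not Spark code: the helper name `suggest_num_partitions`, the row count, and the bytes-per-row figure are all assumptions made for illustration, and the caller is expected to supply rough estimates of its own.

```python
import math

# ~128MB per partition, the empirical target mentioned in the notes above.
TARGET_PARTITION_BYTES = 128 * 1024 * 1024


def suggest_num_partitions(num_rows, bytes_per_row, current_partitions,
                           target_bytes=TARGET_PARTITION_BYTES):
    """Return a partition count that keeps each partition near target_bytes.

    num_rows and bytes_per_row are rough, caller-supplied estimates
    (hypothetical inputs for this sketch). The result never drops below
    current_partitions, so it errs on the higher side.
    """
    estimated_bytes = num_rows * bytes_per_row
    needed = math.ceil(estimated_bytes / target_bytes)
    return max(needed, current_partitions, 1)


# Assumed scenario: 50 million rows come out of flatMap, and the next op
# expands each row into a 1,000-element vector of 8-byte doubles
# (roughly 8,000 bytes per row).
n = suggest_num_partitions(num_rows=50_000_000, bytes_per_row=8_000,
                           current_partitions=200)
print(n)  # 2981
```

In Spark, a number like this would typically be passed to `repartition` on the output of `flatMap`, before the memory-heavy operation runs.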
A unit test helper library for App Engine.
Note that this is currently COMPLETELY UNTESTED. Consider it demo code only.
This library aims to make it easier to unit-test App Engine apps and libraries
by handling the creation and registration of service stubs and so forth for you.
It also provides a custom implementation of the Capability service that allows
you to specify what capabilities you want it to report as disabled, and it wraps
all stubs in a wrapper that will throw a CapabilityDisabledError if you attempt
to use a disabled service or method.
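
The wrapping idea described above might look roughly like the sketch below. It is not the library's code and does not use the App Engine API: `CapabilityCheckingStub`, `FakeDatastoreStub`, and the locally defined `CapabilityDisabledError` are hypothetical stand-ins introduced only to show the pattern of intercepting calls to a disabled service or method.

```python
class CapabilityDisabledError(Exception):
    """Local stand-in for the error raised when a disabled capability is used."""


class CapabilityCheckingStub:
    """Hypothetical wrapper that blocks calls to disabled methods of a stub."""

    def __init__(self, service_name, stub, disabled):
        self._service_name = service_name
        self._stub = stub
        # `disabled` holds (service, method) pairs; method "*" disables
        # every method of that service.
        self._disabled = set(disabled)

    def __getattr__(self, method_name):
        # Only reached for names not found on the wrapper itself.
        target = getattr(self._stub, method_name)
        if not callable(target):
            return target

        def checked(*args, **kwargs):
            if ((self._service_name, "*") in self._disabled
                    or (self._service_name, method_name) in self._disabled):
                raise CapabilityDisabledError(
                    "%s.%s is disabled" % (self._service_name, method_name))
            return target(*args, **kwargs)

        return checked


class FakeDatastoreStub:
    """Made-up stub used only for this illustration."""

    def get(self, key):
        return {"key": key}

    def put(self, key, value):
        return True


wrapped = CapabilityCheckingStub(
    "datastore_v3", FakeDatastoreStub(), disabled=[("datastore_v3", "put")])
```

With this setup, `wrapped.get("a")` passes through to the stub, while `wrapped.put("a", 1)` raises `CapabilityDisabledError`, which lets a test check how calling code behaves when a service is reported as unavailable.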
#!/usr/bin/env python
import random
import sys

import pylibmc
import redis

# Redis client on a non-default port
r = redis.Redis(host='localhost', port=6389)
# Memcached client on a non-default port
mc = pylibmc.Client(['localhost:11222'])
from fabric.api import env, local, require


def deploy():
    """fab [environment] deploy"""
    require('environment')
    maintenance_on()
    push()
    syncdb()
    migrate()
import os

import factory
from django.contrib.sites.models import Site
from django.core.files import File
from django.template.defaultfilters import slugify
from taggit.models import Tag

from .models import Photo

# Location of the media files used by the tests
TEST_MEDIA_PATH = os.path.join(os.path.dirname(__file__), 'tests', 'test_media')
TEST_PHOTO_PATH = os.path.join(TEST_MEDIA_PATH, 'test_photo.png')
$ brew update
$ brew install hive
# iOS
app_identifier "com.myapp.app" # The bundle identifier of your app
apple_id "[email protected]" # Your Apple email address
team_id "1234ABCD" # Developer Portal Team ID

# Android
json_key_file "./google-play-api-secret.json" # Path to the JSON secret file. Follow https://github.com/fastlane/supply#setup to get one
package_name "com.myapp.app" # Your Android app package
Tested with Cloudera 5.12.0 Quickstart VM (https://www.cloudera.com/downloads/quickstart_vms/5-12.html)
Library | Version |
---|---|
JanusGraph | 0.3.0-SNAPSHOT |
TinkerPop | 3.3.0 |
Spark | 2.2.0 |
HBase | 1.2.0 |
Cassandra | 2.2.11 |
Java | 1.8.0_151 |
// This can be imported via ./bin/gremlin.sh -i describe.groovy
// A variable 'graph' must be defined with a JanusGraph graph
// Run it as a plugin command ':schema'
// :schema describe
//
import org.janusgraph.graphdb.database.management.MgmtLogType
import org.codehaus.groovy.tools.shell.Groovysh
import org.codehaus.groovy.tools.shell.CommandSupport