Chris Rawles (crawles)

crawles / bq_load_tsv.sh
Created June 13, 2018 16:40
How to load a TSV file into BigQuery
# Works for other delimiters as well; here, the tab delimiter.
bq load --source_format=CSV --field_delimiter=tab \
    --skip_leading_rows=1 <destination_table> <source> \
    <schema>
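The same load can also be run from Python with the google-cloud-bigquery client. A minimal sketch, assuming the client library is installed; the dataset, table, and gs:// URI below are hypothetical placeholders:

from google.cloud import bigquery

client = bigquery.Client()
job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.CSV,  # TSV is CSV with a tab delimiter
    field_delimiter='\t',
    skip_leading_rows=1,
    autodetect=True)  # or supply an explicit schema instead
# 'my_dataset.my_table' and the gs:// URI are placeholders.
load_job = client.load_table_from_uri(
    'gs://my-bucket/data.tsv', 'my_dataset.my_table', job_config=job_config)
load_job.result()  # Block until the load job completes.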
from setuptools import find_packages
from setuptools import setup

REQUIRED_PACKAGES = ['docopt']

setup(
    name='my-package',
    version='0.1',
    author='Chris Rawles',
    author_email='[email protected]',
    install_requires=REQUIRED_PACKAGES,
    packages=find_packages())
TRAIN_DATA_PATHS=path/to/training/data
OUTPUT_DIR=path/to/output/location
JOBNAME=my_ml_job_$(date -u +%y%m%d_%H%M%S)
REGION='us-central1'
BUCKET='my-bucket'

# Arguments after the bare -- separator are forwarded to trainer.task.
gcloud ml-engine jobs submit training $JOBNAME \
    --package-path=$PWD/my_model_package/trainer \
    --module-name=trainer.task \
    --region=$REGION \
    --staging-bucket=gs://$BUCKET \
    -- \
    --train_data_paths=$TRAIN_DATA_PATHS \
    --output_dir=$OUTPUT_DIR
"""Run a training job on Cloud ML Engine for a given use case.
Usage:
trainer.task --train_data_paths <train_data_paths> --output_dir <outdir>
[--batch_size <batch_size>] [--hidden_units <hidden_units>]
Options:
-h --help Show this screen.
--batch_size <batch_size> Integer value indiciating batch size [default: 150]
--hidden_units <hidden_units> CSV seperated integers indicating hidden layer
sizes. For a fully connected model.', [default: 100]
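Because docopt builds the parser from the docstring itself, trainer.task only needs to hand __doc__ to docopt. A minimal sketch; the casts are needed because docopt returns all option values as strings:

from docopt import docopt

if __name__ == '__main__':
    arguments = docopt(__doc__)  # Parse argv against the Usage section above.
    batch_size = int(arguments['--batch_size'])
    hidden_units = [int(u) for u in arguments['--hidden_units'].split(',')]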
crawles / simple_task.py
Last active May 31, 2018 23:03
Simple task.py
import argparse

import model  # Your model.py file.

if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    # Input arguments
    parser.add_argument(
        '--train_data_paths',
        help='GCS or local path to training data')
    args = parser.parse_args()  # Parsed args are handed on to the training code in model.py.
import tensorflow as tf


def dnn_custom_estimator(features, labels, mode, params):
    in_training = mode == tf.estimator.ModeKeys.TRAIN
    use_batch_norm = params['batch_norm']

    net = tf.feature_column.input_layer(features, params['features'])
    for i, n_units in enumerate(params['hidden_units']):
        # build_fully_connected is a helper defined alongside this model_fn.
        net = build_fully_connected(net, n_units=n_units, training=in_training,
                                    batch_normalization=use_batch_norm,
                                    activation=params['activation'],
                                    name='hidden_layer' + str(i))
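build_fully_connected is not shown in this excerpt. A plausible sketch under the TF 1.x layers API (with tensorflow imported as tf above), assuming activation is a callable such as tf.nn.relu; the helper in the original gist may differ:

def build_fully_connected(inputs, n_units, training, batch_normalization,
                          activation, name):
    # Linear transform first; batch norm, when enabled, sits between the
    # matmul and the nonlinearity, as is conventional.
    net = tf.layers.dense(inputs, units=n_units, activation=None, name=name)
    if batch_normalization:
        net = tf.layers.batch_normalization(net, training=training)
    return activation(net)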
# Define a helper function that submits the training job to ML Engine.
submitMLEngineJob() {
    gcloud ml-engine jobs submit training $JOBNAME \
        --package-path=$(pwd)/mnist_classifier/trainer \
        --module-name trainer.task \
        --region $REGION \
        --staging-bucket=gs://$BUCKET \
        --scale-tier=BASIC \
        --runtime-version=1.4 \
        -- \
        --output_dir=gs://$BUCKET/$JOBNAME  # example user arg forwarded to trainer.task
}
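The function assumes JOBNAME, REGION, and BUCKET are already set in the calling shell; anything after the bare -- separator is not parsed by gcloud and goes straight through to trainer.task.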
def train_and_evaluate(output_dir):
    features = [tf.feature_column.numeric_column(key='image_data', shape=(28*28))]
    classifier = tf.estimator.Estimator(model_fn=dnn_custom_estimator,
                                        model_dir=output_dir,
                                        params={'features': features,
                                                'batch_norm': USE_BATCH_NORMALIZATION,
                                                'activation': ACTIVATION,
                                                'hidden_units': HIDDEN_UNITS,
                                                'learning_rate': LEARNING_RATE})
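The excerpt ends here. One way the function would typically continue, using tf.estimator.train_and_evaluate (available since TF 1.4); the input functions and TRAIN_STEPS below are assumptions, not taken from the gist:

    train_spec = tf.estimator.TrainSpec(input_fn=train_input_fn,  # assumed input fn
                                        max_steps=TRAIN_STEPS)    # assumed constant
    eval_spec = tf.estimator.EvalSpec(input_fn=eval_input_fn)     # assumed input fn
    tf.estimator.train_and_evaluate(classifier, train_spec, eval_spec)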