Sid Anand r39132

from airflow import configuration as conf
from airflow import DAG
from airflow.operators import BashOperator
from datetime import datetime
# build DAG
default_args = {
    'owner': 'jrideout',
    'pool': 'ep_generate_spoofs',
    'depends_on_past': False,
}
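
A minimal sketch of how defaults like these are typically wired into a DAG with a BashOperator; the start date, schedule, and command below are assumptions, not part of the original gist:

# Hypothetical wiring of the defaults above into a DAG with a single task.
default_args['start_date'] = datetime(2016, 1, 1)  # assumed start date

dag = DAG('ep_generate_spoofs', default_args=default_args,
          schedule_interval='@hourly')  # assumed schedule

generate_spoofs = BashOperator(
    task_id='generate_spoofs',
    bash_command='echo "generate spoofs for {{ ds }}"',  # placeholder command
    dag=dag)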
import os
import yaml
from airflow import configuration as conf
from airflow import DAG
from airflow.operators import BashOperator, PostgresOperator
from datetime import datetime
from pprint import pprint
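
The yaml import above suggests the DAG's tasks are driven by a config file. A minimal sketch of that pattern, where the file name, keys, connection id, and SQL are all hypothetical:

# Hypothetical illustration: build PostgresOperator tasks from a YAML config.
dag = DAG('ep_yaml_driven_example', default_args={'owner': 'jrideout'},
          start_date=datetime(2016, 1, 1), schedule_interval='@hourly')

with open(os.path.join(conf.get('core', 'dags_folder'), 'pipeline.yaml')) as f:
    pipeline = yaml.safe_load(f)

for table in pipeline.get('tables', []):
    PostgresOperator(
        task_id='analyze_%s' % table,
        postgres_conn_id='ep_postgres',   # assumed connection id
        sql='ANALYZE %s;' % table,        # placeholder SQL
        dag=dag)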
check process airflow-webserver with pidfile /home/deploy/airflow/pids/airflow-webserver.pid
group airflow
start program "/bin/sh -c '( HISTTIMEFORMAT="%d/%m/%y %T " TMP=/data/tmp AIRFLOW_HOME=/home/deploy/airflow PATH=/bin:/sbin:/usr/bin:/usr/sbin:/usr/local/bin airflow webserver -p 8080 2>&1 & echo $! > /home/deploy/airflow/pids/airflow-webserver.pid ) | logger -p local7.info'"
as uid deploy and gid deploy
stop program "/bin/sh -c 'PATH=/bin:/sbin:/usr/bin:/usr/sbin pkill -TERM -P `cat /home/deploy/airflow/pids/airflow-webserver.pid` && rm -f /home/deploy/airflow/pids/airflow-webserver.pid'"
as uid deploy and gid deploy
from datetime import datetime
from airflow.models import DAG
from airflow.operators import BashOperator, ShortCircuitOperator
import logging
def skip_to_current_job(ds, **kwargs):
    # Short-circuit: only let downstream tasks run for the schedule window containing "now".
    now = datetime.now()
    left_window = kwargs['dag'].following_schedule(kwargs['execution_date'])
    right_window = kwargs['dag'].following_schedule(left_window)
    logging.info('Left window: %s, right window: %s', left_window, right_window)
    return left_window < now <= right_window
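
A sketch of how such a callable is typically attached with ShortCircuitOperator; the dag id, schedule, and downstream task are assumptions:

# Hypothetical usage: gate a downstream task behind the check above.
dag = DAG('skip_to_current_example', start_date=datetime(2016, 1, 1),
          schedule_interval='@hourly')

skip_check = ShortCircuitOperator(
    task_id='skip_to_current_job',
    python_callable=skip_to_current_job,
    provide_context=True,   # pass execution_date and dag into kwargs
    dag=dag)

run_job = BashOperator(task_id='run_job', bash_command='echo run', dag=dag)
skip_check.set_downstream(run_job)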
now = datetime.now()
now_to_the_hour = now.replace(hour=now.time().hour, minute=0, second=0, microsecond=0)
START_DATE = now_to_the_hour + timedelta(hours=-3)
DAG_NAME = 'ep_telemetry_v2'
ORG_IDS = get_active_org_ids_string()
default_args = {
'owner': 'sanand',
'depends_on_past': True,
'pool': 'ep_data_pipeline',
sid-as-mbp:ep siddharth$ terraform plan --target=aws_kinesis_stream.scored_output
var.im_ami
Enter a value: 1
Refreshing Terraform state prior to plan...
aws_s3_bucket.agari_stage_ep_scored_output_firehose_bucket: Refreshing state... (ID: agari-stage-ep-scored-output-firehose)
aws_iam_role.firehose_role: Refreshing state... (ID: collector_ingest_firehose_role)
aws_kinesis_firehose_delivery_stream.scored_output_firehose: Refreshing state... (ID: arn:aws:firehose:us-west-2:118435376172:deliverystream/agari-stage-ep-scored-output-firehose)
aws_kinesis_stream.scored_output: Refreshing state... (ID: arn:aws:kinesis:us-west-2:118435376172:stream/agari-stage-ep-scored-output)
Elasticsearch 2.3 --> 5.1 migration: new IP fields do not support IPv6
I recently migrated from AWS ES 2.3 to 5.1.
I followed the instructions at http://docs.aws.amazon.com/elasticsearch-service/latest/developerguide/es-version-migration.html
TL;DR: I snapshotted my 2.3 ES cluster to S3 and then restored it to the new 5.1 cluster from S3.
However, I ran into a problem. I added an *ip* field to my indexes, which included indexes brought over from 2.3 as well as new indexes created on 5.1. Here's a sample mapping:
curl -XPUT "localhost:80/cars/_mapping/transactions" -d'
{
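
A minimal sketch of an ip-typed mapping of this form, with a hypothetical sender_ip field standing in for the real one:

curl -XPUT "localhost:80/cars/_mapping/transactions" -d'
{
  "properties": {
    "sender_ip": { "type": "ip" }
  }
}'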
"""
### Example HTTP operator and sensor
"""
from airflow import DAG
from airflow.operators.http_operator import SimpleHttpOperator
from airflow.operators.sensors import HttpSensor
from datetime import datetime, timedelta
import json
seven_days_ago = datetime.combine(datetime.today() - timedelta(7),
                                  datetime.min.time())  # midnight, seven days ago
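
A minimal sketch of how these operators are typically used together; the dag id, connection id, endpoint, and payload are assumptions:

# Hypothetical usage: POST to an endpoint, then poll it with a sensor.
dag = DAG('http_example', default_args={'owner': 'airflow', 'start_date': seven_days_ago},
          schedule_interval=timedelta(days=1))

post_op = SimpleHttpOperator(
    task_id='post_op',
    http_conn_id='http_default',             # assumed connection id
    endpoint='api/v1.0/nodes',               # placeholder endpoint
    data=json.dumps({'priority': 5}),
    headers={'Content-Type': 'application/json'},
    dag=dag)

sensor = HttpSensor(
    task_id='http_sensor_check',
    http_conn_id='http_default',
    endpoint='api/v1.0/nodes',
    poke_interval=5,
    dag=dag)

post_op.set_downstream(sensor)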
from airflow import DAG, utils
from airflow.operators.dummy_operator import DummyOperator
from datetime import date, datetime, time, timedelta
today = datetime.today()
# Round to align with the schedule interval
START_DATE = today.replace(minute=0, second=0, microsecond=0)
DAG_NAME = 'clear_task_bug_dag_1.0'
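
A minimal sketch of a DAG built from these pieces; the task layout below is an assumption, just to show how DummyOperator tasks would be chained:

# Hypothetical task layout for the DAG named above.
args = {'owner': 'airflow', 'start_date': START_DATE}
dag = DAG(DAG_NAME, default_args=args, schedule_interval=timedelta(hours=1))

upstream = DummyOperator(task_id='upstream_task', dag=dag)
downstream = DummyOperator(task_id='downstream_task', dag=dag)
upstream.set_downstream(downstream)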
PayPal currently supports 4+ generations of software stacks in production and runs 2k+ distinct microservices, which together provide customers with the fast and seamless user experience they expect. To maintain high quality while keeping developers happy and productive in such an environment, self-service tools with a high degree of automation under the hood are paramount. In this talk, I will tell the story of how PayPal moved from a "PayPal on a box" test environment, to VM-based environments, and finally to a delivery pipeline leveraging our container platform. I will describe how our pipeline delivers containers to fly-away test environments for automated integration testing and how that paradigm shift impacted our engineering teams and their workflows. We will share what worked well for us and how those lessons can be applied at other companies.