@brianv0
brianv0 / formatted.sql
Last active April 5, 2016 22:44
presto-formatted.sql
SELECT
(CASE WHEN ("x" = 1) THEN "y" ELSE "z" END)
, "a"
FROM
"II/295/SSTGC"
, "II/293/glimpse"
WHERE ((1 = "CONTAINS"("POINT"('ICRS', "II/295/SSTGC"."RAJ2000", "II/295/SSTGC"."DEJ2000"), "BOX"('GALACTIC', 0, 0, (30 / 60.0), (10 / 60.0)))) AND (1 = "CONTAINS"("POINT"('ICRS', "II/295/SSTGC"."RAJ2000", "II/295/SSTGC"."DEJ2000"), "CIRCLE"('ICRS', "II/293/glimpse"."RAJ2000", "II/293/glimpse"."DEJ2000", (2 / 3600.0)))))
LIMIT 10
brianv0 / ivoa_summary.md
Last active June 3, 2016 20:20
IVOA Summary

IVOA Meeting Summary

Data Interfaces

LSST has several interests in IVOA interfaces and their implementations. First, it is imperative that we support standardized interfaces for data access so that our users can reuse community-standard tools, such as TOPCAT. Second, HTTP interfaces are an extremely flexible way of providing and managing data access both inside a datacenter and across the internet, and the IVOA has several standards we could leverage where they make sense. Finally, it is in our best interest to reuse existing data access interface implementations and code where we can, both to reduce our workload and, ideally, to improve the existing body of work.

When evaluating the IVOA standards, the data access team has two specific use cases to accommodate. The first is to provide the SUIT team with the interfaces it needs to access catalog data and images. The second is to support standard tools such as TOPCAT.

brianv0 / airflow-help.txt
Last active June 7, 2018 13:20
Airflow discovery
I'm not sure the celery worker option would scale in my case. I think what I actually need is potentially a local executor proxy which talks to a remote executor proxy that does job submission.
For our use case (astrophysics, e.g. Monte Carlo simulations), it's not uncommon to have more than 10k jobs running in parallel per site (~8 jobs per host over 1k hosts), and globally approaching 30k concurrent jobs, so it's a big no-no to open up database connections from the jobs in this scenario unless you have some sort of global semaphore (e.g. ZooKeeper, etcd) to manage connection counts. Typically, most job communication is handled through a message queue (RabbitMQ, email, HTTP, or some site-local proxy which feeds those).
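To make the connection-count concern concrete, here is a minimal single-process sketch. It uses `threading.BoundedSemaphore` purely as a local stand-in for a global semaphore (in a real deployment that role would fall to something like ZooKeeper or etcd, as mentioned above). `MAX_CONNECTIONS`, `NUM_JOBS`, and `run_job` are hypothetical names chosen for illustration, and the "database connection" is simulated with a short sleep.

```python
import threading
import time

MAX_CONNECTIONS = 4   # hypothetical cap on concurrent database connections
NUM_JOBS = 32         # hypothetical number of jobs competing for a connection

# Local stand-in for a global semaphore (e.g. one held in ZooKeeper or etcd).
connection_slots = threading.BoundedSemaphore(MAX_CONNECTIONS)

lock = threading.Lock()
active = 0   # simulated connections currently open
peak = 0     # highest number of simultaneous connections observed


def run_job(job_id):
    """Simulate a job that needs a database connection for a short time."""
    global active, peak
    with connection_slots:          # blocks until a connection slot is free
        with lock:
            active += 1
            peak = max(peak, active)
        time.sleep(0.01)            # pretend to talk to the database
        with lock:
            active -= 1


threads = [threading.Thread(target=run_job, args=(i,)) for i in range(NUM_JOBS)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(f"peak concurrent connections: {peak} (cap {MAX_CONNECTIONS})")
```

However many jobs are started, the observed peak never exceeds the cap. The same idea applies across hosts only if the semaphore itself is shared globally, which is what a coordination service would provide.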
One option is to have a system which just launches pilot jobs for an entire host and actually just runs the Mesos agent, but the problem there is that most batch systems, especially those on supercomputers, typically have a maximum wallclock time of 24 hours and you'd need to factor that into the mesos exec
brianv0 / client.py
Created June 22, 2016 21:27
SLAC simple batch interface with kerberos authentication
import requests
from requests_kerberos import HTTPKerberosAuth
import os
from os.path import join # Used to build url, assumes non-windows
# If this is a service, Start the process with
# /usr/local/bin/pagsh k5start -t -f [/etc/path/to/keytab.keytab] [[email protected]] python
# os.environ is a mapping, not a callable; a missing variable raises KeyError
URL = os.environ["SLATCH_HOST"] # e.g. "http://scalnx-v01.slac.stanford.edu:5000"
PRINCIPAL = os.environ["SLATCH_PRINCIPAL"]
brianv0 / Dockerfile
Created August 25, 2016 19:52
Stack Dockerfile
FROM ubuntu:16.04
MAINTAINER Brian Van Klaveren <[email protected]>
ENV INSTALL_MINICONDA N
ENV LSST_STACK_DIR /lsst/stack
ENV LSST_USER qserv
ENV LSST_GROUP qserv
ENV LSST_WORKSPACE /qserv
ENV LSST_PRODUCT qserv_distrib
ENV LSST_TAG_OPT -t qserv_latest
brianv0 / Dockerfile
Last active November 11, 2016 18:51
DESC LSST Dockerfile
FROM ubuntu:16.04
MAINTAINER Brian Van Klaveren <[email protected]>
RUN apt-get --yes update && \
apt-get --yes install bash \
bison \
bzip2 \
cmake \
curl \
flex \
brianv0 / devtools.txt
Last active November 19, 2016 00:12
Software Development Tools - Applications and Services
Continuous Integration and Deployment (Build Automation)
Jenkins, Travis, CircleCI, Bamboo
Build Tools[1]
Make, Maven (Gradle, sbt, etc...), CMake, SCons, Other: (TeamCity, Bazel, Buck, Waf)
Code Quality/Static Analysis
Coverity, SonarQube. Typically integrated with CI: PMD, FindBugs, Cppcheck, Clang Static Analyzer. Other: (CI Plugins, Language tools)
Code Review
PASSWORD=somePassword
docker volume create airflow-data
docker run --rm \
--name airflow-db \
-e POSTGRES_PASSWORD=$PASSWORD \
-v airflow-data:/var/lib/postgresql/data \
-d postgres:9.5
brianv0 / alt2.sql
Last active February 24, 2017 21:10
alternative
SELECT
.
.
.
FROM pss_job_activitiy a
LEFT OUTER JOIN pss_activity_detail d ON (a.id = d.activity_id)
WHERE a.job_id > 250
AND (
a.activity_type_id = 'check_in'
OR a.activity_type_id = 'delivered'