Ping Zhang pingzh

@pingzh
pingzh / gist:6f7e86861abd611a5d8b42e031bc93f9
Created January 7, 2022 07:12 — forked from chanks/gist:7585810
Turning PostgreSQL into a queue serving 10,000 jobs per second

RDBMS-based job queues have been criticized recently for being unable to handle heavy loads. And they deserve it, to some extent, because the queries used to safely lock a job have been pretty hairy. SELECT FOR UPDATE followed by an UPDATE works fine at first, but then you add more workers, and each is trying to SELECT FOR UPDATE the same row (and maybe throwing NOWAIT in there, then catching the errors and retrying), and things slow down.

On top of that, they have to actually update the row to mark it as locked, so the rest of your workers are sitting there waiting while one of them propagates its lock to disk (and the disks of however many servers you're replicating to). QueueClassic got some mileage out of the novel idea of randomly picking a row near the front of the queue to lock, but I still can't seem to get more than an extra few hundred jobs per second out of it under heavy load.
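
To make the contention concrete, here is a minimal in-memory sketch rather than real PostgreSQL code. It assumes a `threading.Lock` per job as a stand-in for a row lock, and the names `Job`, `claim_head`, and `claim_near_front` are hypothetical. Shuffling the first few jobs and taking the first free one is a simplification of the "pick a row near the front" idea, not QueueClassic's actual query.

```python
import random
import threading


class Job:
    """One queue row; the Lock is a stand-in for a PostgreSQL row lock."""

    def __init__(self, job_id):
        self.job_id = job_id
        self.lock = threading.Lock()


def claim_head(jobs):
    """Try to lock the first job without waiting (roughly what NOWAIT does)."""
    if jobs and jobs[0].lock.acquire(blocking=False):
        return jobs[0]
    return None  # someone else holds the head; the caller has to retry


def claim_near_front(jobs, window, rng):
    """Try the first `window` jobs in random order and lock the first free one."""
    candidates = jobs[:window]  # slicing copies, so the queue order is untouched
    rng.shuffle(candidates)
    for job in candidates:
        if job.lock.acquire(blocking=False):
            return job
    return None


jobs = [Job(i) for i in range(5)]
first = claim_head(jobs)    # locks job 0
second = claim_head(jobs)   # job 0 is already taken, so this worker gets nothing
third = claim_near_front(jobs, window=3, rng=random.Random(0))  # finds a free job
```

Every worker that goes for the head collides on the same row, while spreading attempts over a small window lets the next worker find something free. That reduces the collisions, but it does nothing about the cost of writing the lock to disk.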

So, many developers have started going straight t

pingzh / test_ti_creation.py
Last active January 6, 2022 19:32
for airflow perf test for ti creation inside the dag_run verify_integrity. The test is against a database without other traffic
import time
import logging
from airflow.utils.db import create_session
from airflow.utils import timezone
from airflow.models import TaskInstance
from airflow.models.serialized_dag import SerializedDagModel
logger = logging.getLogger(__name__)
out_hdlr = logging.FileHandler('./log.txt')
# ==================
# Top-level metadata
# ==================
%global pybasever 3.9
# pybasever without the dot:
%global pyshortver 39
Name: python%{pybasever}
version: "2.1"
services:
  wireguard:
    image: linuxserver/wireguard
    container_name: wireguard
    cap_add:
      - NET_ADMIN
      - SYS_MODULE
    environment:
      - PUID=1000
pingzh / run_alembic_command_with_code.py
Created February 10, 2021 06:54
Run alembic command with code
import logging
from alembic.config import Config
from alembic import command

LOGGER = logging.getLogger(__name__)

def run_db_migrations(dsn: str) -> None:
    """Upgrade the database at `dsn` to the latest Alembic revision."""
    LOGGER.info('Running DB migrations on %r', dsn)
    alembic_cfg = Config('alembic.ini')
    alembic_cfg.set_main_option('sqlalchemy.url', dsn)
    command.upgrade(alembic_cfg, 'head')
import time
from airflow.configuration import conf
from airflow.utils.log.logging_mixin import LoggingMixin
class JobDispatcherExecutor(LoggingMixin):
    def __init__(self, celery_executor, kubernetes_executor):
        """
        """
pingzh / stream_pod.py
Created August 24, 2020 06:30
An example of streaming pod events from k8s for a namespace
from datetime import datetime
from kubernetes import client, config, watch
config.load_kube_config()
# Create a configuration object
with_ssl_disabled_config = client.Configuration()
with_ssl_disabled_config.verify_ssl = False
api_client = client.ApiClient(with_ssl_disabled_config)
v1 = client.CoreV1Api(api_client)
{
  "documents": [
    {
      "id": "1",
      "language": "en",
      "text": "I had a wonderful experience! The rooms were wonderful and the staff was helpful."
    },
    {
      "id": "2",
      "language": "en",
pingzh / provision_docker_in_aws.sh
Created January 11, 2020 05:12
Provision aws with docker
### ubuntu ###
sudo apt update -y
sudo apt install -y docker docker.io git net-tools vim tmux
sudo usermod -aG docker ubuntu
sudo curl -L https://github.com/docker/compose/releases/download/1.22.0/docker-compose-$(uname -s)-$(uname -m) -o /usr/local/bin/docker-compose
sudo chmod +x /usr/local/bin/docker-compose
sudo systemctl start docker
sudo systemctl enable docker
pingzh / headless.md
Created November 20, 2017 01:49 — forked from addyosmani/headless.md
So, you want to run Chrome headless.

Update May 2017

Eric Bidelman has documented some of the common workflows possible with headless Chrome over in https://developers.google.com/web/updates/2017/04/headless-chrome.

Update

If you're looking at this in 2016 and beyond, I strongly recommend investigating real headless Chrome: https://chromium.googlesource.com/chromium/src/+/lkgr/headless/README.md

Windows and Mac users might find using Justin Ribeiro's Docker setup useful here while full support for these platforms is being worked out.