robie lee liwh

-- certain endpoints are always blocked
if nginx_uri == "/_access_token" or nginx_uri == "/_me" then
  ngx.exit(403)
end
-- import requirements
local cjson = require "cjson"
-- setup some app-level vars
local app_id = "APP_ID"
[core]
# The home folder for airflow, default is ~/airflow
airflow_home = /Users/p1nox/airflow
# The folder where your airflow pipelines live, most likely a
# subfolder in a code repository
dags_folder = /Users/p1nox/airflow/dags
# The folder where airflow should store its log files. This location
base_log_folder = /Users/p1nox/airflow/logs
liwh / elasticsearch_best_practices.txt
Created March 21, 2018 03:07 — forked from duydo/elasticsearch_best_practices.txt
Elasticsearch - Index best practices from Shay Banon
If you want, I can try to help with pointers on how to improve the indexing speed you get. It's quite easy to increase it considerably by following some simple guidelines, for example:
- Use create in the index API (assuming you can).
- Relax the real-time aspect from 1 second to something a bit higher (index.engine.robin.refresh_interval).
- Increase the indexing buffer size (indices.memory.index_buffer_size); it defaults to 10%, meaning 10% of the heap.
- Increase the number of dirty operations that trigger an automatic flush (so the translog won't get really big, even though it's FS based) by setting index.translog.flush_threshold (defaults to 5000).
- Increase the memory allocated to the elasticsearch node. By default it's 1g.
- Start with a lower replica count (even 0), and then once the bulk loading is done, increase it to the value you want using the update_settings API. This will improve things, as possibly fewer shards will be allocated to each machine.
- Increase the number of machines you have so
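
The replica tip above (start low, raise the count after bulk loading) can be sketched in Go as follows. This is only an illustration under stated assumptions: the index name my_index and the target replica count of 1 are hypothetical, the helper replicaSettings is not part of any Elasticsearch client, and the program only prints the request bodies instead of sending them. The index.number_of_replicas setting and the _settings endpoint are the ones used by the update settings API.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// replicaSettings builds the JSON body for an update-settings request
// that sets the replica count of an index. Hypothetical helper, for
// illustration only.
func replicaSettings(replicas int) (string, error) {
	body := map[string]map[string]int{
		"index": {"number_of_replicas": replicas},
	}
	b, err := json.Marshal(body)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	// Hypothetical index name; replace with your own.
	const index = "my_index"

	// Before bulk loading: drop to zero replicas.
	before, err := replicaSettings(0)
	if err != nil {
		panic(err)
	}

	// After bulk loading: restore the desired count (1 is an assumption).
	after, err := replicaSettings(1)
	if err != nil {
		panic(err)
	}

	// Print the requests one would send; nothing is sent over the network.
	fmt.Printf("PUT /%s/_settings %s\n", index, before)
	fmt.Printf("PUT /%s/_settings %s\n", index, after)
}
```

Sending the first body before the bulk load and the second one afterwards mirrors the order described in the tip; the other settings in the list are left out here because their names have changed across Elasticsearch versions.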
liwh / avazu_ftrl_concurrent.go
Created June 11, 2020 02:11 — forked from jack281291/avazu_ftrl_concurrent.go
Kaggle Avazu Challenge: FTRL-Proximal with L1 & L2 implemented in Go (Concurrent/Multi-threaded)
// Based on tinrtgu's Python script here:
// https://www.kaggle.com/c/avazu-ctr-prediction/forums/t/10927/beat-the-benchmark-with-less-than-1mb-of-memory
package main
import (
"encoding/csv"
"os"
"strconv"
"hash/fnv"
"math"