Wong Jiang Fung kakarukeys

kakarukeys / gist:51551cd1ad38bb77b0a849d929b7844c
Created September 9, 2021 01:56
summary email harvesting
import re
from pymongo import MongoClient
from settings import DEST_DATABASE_URL
if __name__ == '__main__':
    with open("CREDENTIALS") as f:
        credentials = f.read().strip()
@kakarukeys
kakarukeys / stream_from_gz.py
Last active October 21, 2021 05:56
stream from gz
import io
import time
from gzip import GzipFile
import pandas as pd
# https://stackoverflow.com/a/20260030/496852
def iterable_to_stream(iterable, buffer_size=io.DEFAULT_BUFFER_SIZE):
    """
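The preview above is cut off at the docstring, so the body of `iterable_to_stream` is not shown. As a minimal sketch of the general technique (wrapping an iterable of byte chunks in a file-like object so `GzipFile` can decompress it incrementally), something like the following could work. The class name `IterStream`, the helper `gz_chunks`, and the sample payload are illustrative assumptions, not the gist's actual code.

```python
import gzip
import io


def iterable_to_stream(iterable, buffer_size=io.DEFAULT_BUFFER_SIZE):
    """Wrap an iterable of bytes chunks in a read-only buffered stream."""
    iterator = iter(iterable)

    class IterStream(io.RawIOBase):  # hypothetical name, for illustration
        def __init__(self):
            super().__init__()
            self.leftover = b""

        def readable(self):
            return True

        def readinto(self, b):
            # Pull chunks until one is non-empty or the iterator is exhausted.
            while not self.leftover:
                try:
                    self.leftover = next(iterator)
                except StopIteration:
                    return 0  # signals end of stream
            n = min(len(b), len(self.leftover))
            b[:n] = self.leftover[:n]
            self.leftover = self.leftover[n:]
            return n

    return io.BufferedReader(IterStream(), buffer_size=buffer_size)


def gz_chunks(payload, chunk_size=7):
    """Hypothetical source: yield a gzip-compressed payload in small pieces."""
    compressed = gzip.compress(payload)
    for i in range(0, len(compressed), chunk_size):
        yield compressed[i:i + chunk_size]


stream = iterable_to_stream(gz_chunks(b"a,b\n1,2\n3,4\n"))
with gzip.GzipFile(fileobj=stream, mode="rb") as gz:
    text = gz.read().decode()
```

Because the result is an ordinary binary file object, the `GzipFile` wrapper could presumably be handed to a reader such as `pd.read_csv` instead of being read in one go.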
@kakarukeys
kakarukeys / gist:5cdb111c1ed9cb26c423434abf59ee75
Last active March 5, 2022 15:39
Frontend Engineer Coding Challenge
## Exercise
You are tasked with creating a UI that allows users to filter and view a company dataset, subject to the following requirements.
1. The app is a React app based on the structure of https://github.com/react-boilerplate/react-boilerplate.
2. The app connects to an HTTP API:
https://faker-companies.dk-dev.leadbook.com/api/v1/industries/
https://faker-companies.dk-dev.leadbook.com/api/v1/companies/
https://faker-companies.dk-dev.leadbook.com/api/v1/companies/?company_location=BA
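The endpoint URLs above suggest that filters are passed as query-string parameters on the `companies/` resource. A small sketch of how such URLs could be assembled follows; it is written in Python purely to illustrate the URL structure (the exercise itself targets React). The helper `companies_url` is a hypothetical name, and `company_location` is the only filter parameter that appears in the listing.

```python
from urllib.parse import urlencode

# Base path taken from the URLs listed in the exercise.
BASE_URL = "https://faker-companies.dk-dev.leadbook.com/api/v1"


def companies_url(**filters):
    """Build the companies endpoint URL, appending any non-empty filters."""
    query = urlencode({k: v for k, v in filters.items() if v is not None})
    url = f"{BASE_URL}/companies/"
    return f"{url}?{query}" if query else url


unfiltered = companies_url()
filtered = companies_url(company_location="BA")
```

Skipping `None` values means an unset filter in the UI leaves the parameter out of the URL rather than sending an empty value.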
@kakarukeys
kakarukeys / robust_parallel_bulk.py
Created July 7, 2022 03:21
robust_parallel_bulk.py
import logging
from enum import IntEnum
from datetime import datetime
from typing import List, Iterable, Dict, Any, Tuple, Literal, NamedTuple, Type
import tenacity as tn
from elasticsearch.helpers import parallel_bulk
from app.elasticsearch.client import es_client
from app.db.models import BaseModel