YJ Song kalihman

Feature Store

Uber Michelangelo

https://eng.uber.com/michelangelo/

Finding good features is often the hardest part of machine learning and we have found that building and managing data pipelines is typically one of the most costly pieces of a complete machine learning solution.

A platform should provide standard tools for building data pipelines to generate feature and label data sets for training (and re-training) and feature-only data sets for predicting. These tools should have deep integration with the company’s data lake or warehouses and with the company’s online data serving systems. The pipelines need to be scalable and performant, incorporate integrated monitoring for data flow and data quality, and support both online and offline training and predicting. Ideally, they should also generate the features in a way that is shareable across teams to reduce duplicate work and increase data quality. They should also provide strong guard rails and controls to encourage and empower users to adop

@480
480 / gist:3b41f449686a089f34edb45d00672f28
Last active January 23, 2025 16:11
MacOS X + oh my zsh + powerline fonts + visual studio code terminal settings

MacOS X + oh my zsh + powerline fonts + visual studio code (vscode) terminal settings

Thank you, everybody. Your comments make it better.

Install oh my zsh

http://ohmyz.sh/

sh -c "$(curl -fsSL https://raw.github.com/ohmyzsh/ohmyzsh/master/tools/install.sh)"
@lifthrasiir
lifthrasiir / inquiry.md
Last active July 30, 2019 13:15
Opinion on the article "Google 'brands' websites that have not adopted 'https' as unsafe places"

The email below was sent at 2017-02-12 21:43 (all times are Korea Standard Time) as an opinion on a Hankyoreh article, to the email address of reporter Kim Jae-seop given in the article. Any factual or other errors in the email are entirely my own mistakes.

Added at 2017-02-13 14:53: There is no longer any reason to keep this gist private, so I have made it public. I received a reply to this email, but it contains no rebuttal important enough to publish, and I did not ask whether I could publish it, so I am not publishing it. The text below contains various ungrammatical sentences and typos, but please understand that I decided not to edit it at all in order to preserve what was originally sent.

Added at 2017-02-13 19:00: As a follow-up to this article, a piece on a press briefing by Google Korea has been posted. There is nothing notable about the new article, so no comment. I also changed the article link above to point to the Hankyoreh website instead of Media Daum.

Original text

Hello, I am writing to raise an opinion on the article you wrote (of course, I cannot verify its authorship, but it is at least credited that way). This email is my personal opinion, and, just in case, let me state in advance that it in no way represents the views of the company or any organization that employs me.

@loleg
loleg / iotcam.py
Created November 7, 2015 01:26
Detects barcodes from a webcam stream using Python, zbar and CV2
from picamera.array import PiRGBArray
from picamera import PiCamera
import time
import sys
import cv2
import zbar
from PIL import Image  # the bare "import Image" form only works with the legacy PIL package
# Debug mode
DEBUG = False
@PurpleBooth
PurpleBooth / README-Template.md
Last active May 13, 2025 15:10
A template to make good README.md

Project Title

One Paragraph of project description goes here

Getting Started

These instructions will get you a copy of the project up and running on your local machine for development and testing purposes. See deployment for notes on how to deploy the project on a live system.

Prerequisites

@datagrok
datagrok / README.md
Last active February 24, 2025 09:53
What happens when you cancel a Jenkins job

When you cancel a Jenkins job

Unfinished draft; do not use until this notice is removed.

We were seeing some unexpected behavior in the processes that Jenkins launches when the Jenkins user clicks "cancel" on their job. Unexpected behaviors like:

  • apparently stale lockfiles and pidfiles
  • overlapping processes
  • jobs apparently ending without performing cleanup tasks
  • jobs continuing to run after being reported "aborted"
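One common source of the first and third symptoms is a process that is terminated by a signal before its cleanup code runs. The following is a minimal Python sketch of a job that removes its lockfile even when terminated. It assumes the canceled process receives SIGTERM, which the text above does not state. The lockfile path and the function names are hypothetical and chosen only for illustration.

```python
import os
import signal
import sys
import tempfile

# Hypothetical lockfile path, chosen only for this illustration.
LOCKFILE = os.path.join(tempfile.gettempdir(), "example-job.lock")


def acquire_lock(path=LOCKFILE):
    """Create the lockfile atomically; raise FileExistsError if it already exists."""
    fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    with os.fdopen(fd, "w") as handle:
        handle.write(str(os.getpid()))


def release_lock(path=LOCKFILE):
    """Remove the lockfile if present; safe to call more than once."""
    try:
        os.remove(path)
    except FileNotFoundError:
        pass


def handle_sigterm(signum, frame):
    """Turn SIGTERM into SystemExit so that `finally` blocks still run."""
    sys.exit(128 + signum)


def run_job(work, path=LOCKFILE):
    """Run `work()` while holding the lockfile, releasing it on every exit path."""
    # Assumption: cancellation arrives as SIGTERM. A SIGKILL cannot be caught,
    # so no in-process handler can clean up in that case.
    signal.signal(signal.SIGTERM, handle_sigterm)
    acquire_lock(path)
    try:
        return work()
    finally:
        release_lock(path)
```

Without the handler, Python's default reaction to SIGTERM is to terminate immediately without running `finally` blocks, which would leave the lockfile behind. Child processes started by the job would need their own handling, which this sketch does not cover.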
@Chaser324
Chaser324 / GitHub-Forking.md
Last active May 13, 2025 18:32
GitHub Standard Fork & Pull Request Workflow

Whether you're trying to give back to the open source community or collaborating on your own projects, knowing how to properly fork and generate pull requests is essential. Unfortunately, it's quite easy to make mistakes or not know what you should do when you're initially learning the process. I know that I certainly had considerable initial trouble with it, and I found a lot of the information on GitHub and around the internet to be rather piecemeal and incomplete - part of the process described here, another there, common hangups in a different place, and so on.

In an attempt to collate this information for myself and others, this short tutorial is what I've found to be fairly standard procedure for creating a fork, doing your work, issuing a pull request, and merging that pull request back into the original project.

Creating a Fork

Just head over to the GitHub page and click the "Fork" button. It's just that simple. Once you've done that, you can use your favorite git client to clone your repo or j

@rxaviers
rxaviers / gist:7360908
Last active May 15, 2025 03:32
Complete list of github markdown emoji markup

People

:bowtie: :bowtie: 😄 :smile: 😆 :laughing:
😊 :blush: 😃 :smiley: ☺️ :relaxed:
😏 :smirk: 😍 :heart_eyes: 😘 :kissing_heart:
😚 :kissing_closed_eyes: 😳 :flushed: 😌 :relieved:
😆 :satisfied: 😁 :grin: 😉 :wink:
😜 :stuck_out_tongue_winking_eye: 😝 :stuck_out_tongue_closed_eyes: 😀 :grinning:
😗 :kissing: 😙 :kissing_smiling_eyes: 😛 :stuck_out_tongue:
@sloria
sloria / bobp-python.md
Last active April 27, 2025 07:06
A "Best of the Best Practices" (BOBP) guide to developing in Python.

The Best of the Best Practices (BOBP) Guide for Python

A "Best of the Best Practices" (BOBP) guide to developing in Python.

In General

Values

  • "Build tools for others that you want to be built for you." - Kenneth Reitz
  • "Simplicity is always better than functionality." - Pieter Hintjens
@hellerbarde
hellerbarde / latency.markdown
Created May 31, 2012 13:16 — forked from jboner/latency.txt
Latency numbers every programmer should know

Latency numbers every programmer should know

L1 cache reference ......................... 0.5 ns
Branch mispredict ............................ 5 ns
L2 cache reference ........................... 7 ns
Mutex lock/unlock ........................... 25 ns
Main memory reference ...................... 100 ns
Compress 1K bytes with Zippy ............. 3,000 ns  =   3 µs
Send 2K bytes over 1 Gbps network ....... 20,000 ns  =  20 µs
SSD random read ........................ 150,000 ns  = 150 µs

Read 1 MB sequentially from memory ..... 250,000 ns  = 250 µs
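
The figures are easier to compare as ratios. This short Python sketch uses the values from the table above to express one operation as a multiple of another and to convert nanoseconds to microseconds. The helper names `relative_cost` and `to_microseconds` are illustrative and not part of the original list.

```python
# Latencies in nanoseconds, copied from the table above.
LATENCY_NS = {
    "L1 cache reference": 0.5,
    "Branch mispredict": 5,
    "L2 cache reference": 7,
    "Mutex lock/unlock": 25,
    "Main memory reference": 100,
    "Compress 1K bytes with Zippy": 3_000,
    "Send 2K bytes over 1 Gbps network": 20_000,
    "SSD random read": 150_000,
    "Read 1 MB sequentially from memory": 250_000,
}


def relative_cost(operation, baseline="L1 cache reference"):
    """Return how many times slower `operation` is than `baseline`."""
    return LATENCY_NS[operation] / LATENCY_NS[baseline]


def to_microseconds(nanoseconds):
    """Convert a latency from nanoseconds to microseconds."""
    return nanoseconds / 1_000


if __name__ == "__main__":
    for name, ns in LATENCY_NS.items():
        print(f"{name}: {ns} ns = {to_microseconds(ns)} µs, "
              f"{relative_cost(name):,.0f}x an L1 cache reference")
```

For example, `relative_cost("SSD random read", "Main memory reference")` shows that one SSD random read takes as long as 1,500 main memory references.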