Sigrid Jin (ง'̀-'́)ง oO sigridjineth
@sigridjineth
sigridjineth / test-tune.py
Created May 24, 2025 00:40
Tuning Test
import argparse
import os
import pandas as pd
import vertexai
from google.cloud.aiplatform import pipeline_jobs
from google.cloud.aiplatform.models import Model # For type hinting deployed_model
from vertexai.language_models import TextEmbeddingInput, TextEmbeddingModel
from google.oauth2 import service_account # <--- [modified] module import
# --- Configuration (can be overridden by argparse) ---
@sigridjineth
sigridjineth / toss-frontend-rules.mdc
Created May 9, 2025 17:10 — forked from toy-crane/toss-frontend-rules.mdc
Cursor rule based on the Toss frontend guidelines
# Frontend Design Guideline
This document summarizes key frontend design principles and rules, showcasing
recommended patterns. Follow these guidelines when writing frontend code.
# Readability
Improving the clarity and ease of understanding code.
import pytorch_lightning as pl
import numpy as np
import torch
from torch.nn import MSELoss
from torch.optim import Adam
from torch.utils.data import DataLoader, Dataset
import torch.nn as nn
class SimpleDataset(Dataset):
@sigridjineth
sigridjineth / reranker.md
Last active May 27, 2024 01:41
Thoughts on rerankers

Reranker

  • A reranker is a model that performs a binary classification task.
  • Binary classification assigns a given input to one of two categories.
  • Rerankers are usually trained by applying a sigmoid function and then a BCE (Binary Cross-Entropy) loss.
  • The sigmoid function maps its input to a probability between 0 and 1.
  • The BCE loss measures the model's error by computing the gap between the predicted probability and the true label.
  • On the threshold: taken purely as an output value, 0.5 has no special meaning of its own, but given how the sigmoid function and the BCE loss behave during training, 0.5 can be read as the point that separates the two categories.
  • Hard negative mining: fine-tune the model on a dataset generated by the first-stage model. This dataset contains negative samples that the model finds hard to tell apart (hard negatives), which helps improve model performance.
  • Group-wise training: set the group size to 16 and keep the positive:negative ratio inside each group at 1:15 during training. This lets the model learn from a wider variety of negative samples. Group CCE (Categorical Cross-Entropy) or LCE (List-wise Cross-Entropy) can serve as the objective function and evaluation metric.
  • The maximum input sequence length is set to 512, and mixed precision (FP16) is used to speed up training. For data augmentation, questions and passages are augmented at random, and gradient checkpointing is used to reduce memory usage. The data augmentation techniques could be tested further.
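The two objectives in the notes above can be sketched in a few lines of plain Python. This is only an illustration under assumptions: the function names are hypothetical, the scores are made-up logits rather than model outputs, and the group loss is written as an ordinary softmax cross-entropy over one positive and fifteen negatives, which is one common reading of "Group CCE".

```python
import math


def sigmoid(logit):
    """Map a raw relevance score to a value in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-logit))


def bce_with_logit(logit, label):
    """Binary cross-entropy on a raw logit, in a numerically stable form.

    Equivalent to -(y*log(p) + (1-y)*log(1-p)) with p = sigmoid(logit).
    """
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))


def group_cce(logits, positive_index=0):
    """Softmax cross-entropy over one group of candidates for a query.

    `logits` holds one score per candidate; exactly one is the positive.
    """
    peak = max(logits)
    log_partition = peak + math.log(sum(math.exp(s - peak) for s in logits))
    return log_partition - logits[positive_index]


# Hypothetical group: 1 positive followed by 15 negatives (group size 16).
group_scores = [2.0] + [0.0] * 15
pointwise_loss = bce_with_logit(group_scores[0], 1.0)
groupwise_loss = group_cce(group_scores, positive_index=0)
```

A logit of 0 maps to a sigmoid output of 0.5, and at that point the BCE loss is the same for either label, which is why 0.5 can be read as the boundary between the two categories.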
@sigridjineth
sigridjineth / embed_raft.py
Created March 1, 2024 20:42
raft embedding
import cupy as cp
import numpy as np
from pylibraft.distance import pairwise_distance
from pylibraft.knn import brute_force_knn
def main(num_elements, dim):
    # Generate and normalize the data (using CuPy)
    cp.random.seed(42)
    data = cp.random.random((num_elements, dim)).astype(cp.float32)
    norm_data = data / cp.linalg.norm(data, axis=1, keepdims=True)
@sigridjineth
sigridjineth / hnswlib.py
Created March 1, 2024 20:38
hnswlIb_langcon
import argparse
import hnswlib
import numpy as np
def main(num_elements, dim):
    # Generate and normalize the data
    np.random.seed(42)
    data = np.random.random((num_elements, dim)).astype(np.float32)
    norm_data = data / np.linalg.norm(data, axis=1)[:, None]
@sigridjineth
sigridjineth / tcopl_deepmind.md
Created February 11, 2024 02:29
Training Compute-Optimal Large Language Models - DeepMind

Part 2: What is the right ratio between 'model size' and 'number of training tokens'? | NeurIPS 2022 | 김택민

With a compute budget as large as Gopher's, training a 63B model on 1.4T tokens is said to be optimal. In practice, Approach 2 is somewhat more realistic: the GPUs have already been committed before pretraining starts, so the compute budget is fixed. The graph then lets you conclude how to adjust the model size to get the best accuracy. Andrej Karpathy also mentioned in nanoGPT that he likes Approach 2.
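The fixed-compute reasoning can be checked with a little arithmetic. The sketch below assumes the common approximation that training costs about 6 FLOPs per parameter per token, and it takes Gopher's size (280B parameters, 300B training tokens) from the Gopher paper rather than from these notes. The function names are hypothetical.

```python
def train_flops(n_params, n_tokens):
    """Approximate training compute, assuming C ~ 6 * N * D."""
    return 6.0 * n_params * n_tokens


def tokens_for_budget(flops, n_params):
    """Tokens that a fixed compute budget affords for a given model size."""
    return flops / (6.0 * n_params)


# Assumed Gopher configuration: 280B parameters, 300B tokens.
gopher_flops = train_flops(280e9, 300e9)

# Spend the same budget on a 63B model instead.
tokens_63b = tokens_for_budget(gopher_flops, 63e9)
print(f"{tokens_63b:.3e} tokens")
```

Under this approximation the 63B model gets roughly 1.33T tokens, the same ballpark as the 1.4T figure quoted above.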

Figure 3 is easier to understand than Figure 2.

This is proposed as a way to better interpret the data gathered under Approach 1 and Approach 2. The idea is: we have collected 400 data points, so why not build a model that predicts the final loss? Sam Altman also said in an internal interview that a small amount of initial training is enough to know the final loss. With a good loss-prediction model, you can predict how large the model should be and how much data is needed. It could also make GPU scheduling easier, for example deciding how many GPUs to lend to a given person. The authors suggest that the loss-prediction model might take that form, and say its parameters can be filled in by fitting it with the L-BFGS algorithm so as to minimize a Huber loss.
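The notes refer to the form of the loss-prediction model without reproducing it. In the paper it is the parametric function L(N, D) = E + A / N^alpha + B / D^beta, fitted in log space with a Huber loss (delta = 1e-3). The sketch below writes out that objective in plain Python. The function names are hypothetical, the "runs" are synthetic points generated from the paper's reported constants rather than real training results, and the optimizer itself is left out.

```python
import math


def predicted_loss(n_params, n_tokens, E, A, B, alpha, beta):
    """Parametric final loss: irreducible term plus model and data terms."""
    return E + A / n_params**alpha + B / n_tokens**beta


def huber(residual, delta=1e-3):
    """Huber loss: quadratic near zero, linear in the tails."""
    r = abs(residual)
    if r <= delta:
        return 0.5 * r * r
    return delta * (r - 0.5 * delta)


def fit_objective(params, runs, delta=1e-3):
    """Sum of Huber losses between predicted and observed log-loss.

    params = (a, b, e, alpha, beta) with A = exp(a), B = exp(b), E = exp(e).
    runs   = iterable of (n_params, n_tokens, observed_final_loss).
    """
    a, b, e, alpha, beta = params
    total = 0.0
    for n_params, n_tokens, loss in runs:
        terms = [a - alpha * math.log(n_params),
                 b - beta * math.log(n_tokens),
                 e]
        peak = max(terms)  # log-sum-exp, shifted for numerical stability
        log_pred = peak + math.log(sum(math.exp(t - peak) for t in terms))
        total += huber(log_pred - math.log(loss), delta)
    return total


# Constants reported in the paper (not in these notes).
PAPER = dict(E=1.69, A=406.4, B=410.7, alpha=0.34, beta=0.28)

# Synthetic stand-in for the collected data points.
runs = [(n, d, predicted_loss(n, d, **PAPER))
        for n in (1e8, 1e9, 1e10) for d in (1e10, 1e11)]

true_params = (math.log(PAPER["A"]), math.log(PAPER["B"]),
               math.log(PAPER["E"]), PAPER["alpha"], PAPER["beta"])
objective_at_truth = fit_objective(true_params, runs)
```

A quasi-Newton routine such as `scipy.optimize.minimize(..., method="L-BFGS-B")` could then minimize `fit_objective` from several starting points; that step is omitted here to keep the sketch dependency-free.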

Akka

  1. Akka is built on the actor model and consists of several components.
  2. Dispatcher: schedules all code that runs inside the Actor System, and adjusts each actor's throughput and time share so that every actor gets a fair share of resources.
  3. Mailbox: a message queue that holds the messages coming in from the Dispatcher. Each actor has its own mailbox, and the actor consumes messages in the order they arrived.
  4. Actor: a kind of behavioral object that makes up the system. It is the consumer that actually needs the messages, and it also sends and receives them.
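The relationship between the three components can be sketched as a toy in plain Python. This is not Akka's API: the class and method names are hypothetical, and a single worker thread per actor stands in for the dispatcher. It only shows that each actor owns a mailbox and consumes messages in arrival order.

```python
import queue
import threading

_STOP = object()  # sentinel that tells an actor to shut down


class Actor:
    """Toy actor: one mailbox (FIFO queue) drained by one worker thread."""

    def __init__(self):
        self.mailbox = queue.Queue()
        self.received = []
        # The worker thread plays the dispatcher's role for this actor.
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def tell(self, message):
        """Enqueue a message without waiting for it to be processed."""
        self.mailbox.put(message)

    def receive(self, message):
        """Behavior of the actor; subclasses would override this."""
        self.received.append(message)

    def _run(self):
        while True:
            message = self.mailbox.get()
            if message is _STOP:
                break
            self.receive(message)

    def stop(self):
        """Process everything already queued, then stop the worker."""
        self.mailbox.put(_STOP)
        self._worker.join()


logger = Actor()
for text in ("first", "second", "third"):
    logger.tell(text)
logger.stop()
print(logger.received)
```

Because the sender only enqueues, it never touches the actor's state directly; all state changes happen on the thread that drains the mailbox.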

How are actors structured to communicate?

  • Hierarchy
  • The Akka actor hierarchy can be represented as a tree.
@sigridjineth
sigridjineth / llm_history.md
Last active February 1, 2024 01:50
Let's read LLM history backwards together (강재욱), session 1

Part 1: Let's read LLM history backwards together! | 강재욱

Pre-training for completion

  • Encoding the statistical information of language into a neural network
  • So the pre-training stage is about teaching the model statistical knowledge of language
  • next token prediction (or language modeling)
    • Making the model complete text over and over
    • This is the act of injecting statistical knowledge of language
  • The model has the abilities people ask of it because of pre-training
@sigridjineth
sigridjineth / mvt_inverss.md
Last active January 12, 2024 01:52
The converse of the mean value theorem

The converse of the mean value theorem

Basic properties

The mean value theorem taught in high school reads as follows.

"If a function f is continuous on the closed interval [a, b] and differentiable on the open interval (a, b), then there exists a real number c in the open interval (a, b) such that f(b)-f(a)=f'(c)(b-a)."

The mean value theorem says that there is some point at which the slope of the tangent equals the average rate of change over the given interval. What, then, about the following statement?

"If a function f is continuous on the closed interval [a, b] and differentiable on the open interval (a, b), then for any c in the open interval (a, b) there exist real numbers s, t in the open interval (a, b) such that f(s)-f(t)=f'(c)(s-t), s<c<t."
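
One standard way to test this statement is a worked counterexample. The one below is a common textbook choice and is not necessarily the example the note goes on to use. Take $f(x) = x^3$ on $[-1, 1]$ and $c = 0$:

$$
f'(0) = 0, \qquad f(s) - f(t) = s^3 - t^3 < 0 \quad \text{for all } -1 < s < 0 < t < 1 .
$$

The right-hand side $f'(0)(s - t)$ is $0$, while the left-hand side is strictly negative because $f$ is strictly increasing. So no pair $s < c < t$ satisfies the equation, and the statement fails at $c = 0$ even though $f$ meets both hypotheses.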