:octocat:
Learn, unlearn and relearn.

Sayak Paul sayakpaul

@sararob
sararob / pipeline-runner.py
Last active August 21, 2022 11:09
This Cloud Function will kick off a Vertex Pipeline run whenever a specified amount of new BigQuery data is available for training. Deploy it as an HTTP function and set up a Cloud Scheduler job to automate running it on a recurring basis. See this blog post for details: https://cloud.google.com/blog/topics/developers-practitioners/lets-get-it-s…
# Copyright 2021 Google LLC.
# SPDX-License-Identifier: Apache-2.0
import kfp
import json
import time
from google.cloud import bigquery
from google.cloud.exceptions import NotFound
from kfp.v2.google.client import AIPlatformClient
client = bigquery.Client()
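
The preview above stops before the trigger logic, so here is a minimal sketch of the "enough new data" check that the description implies. The function name, its parameters, and the idea of comparing against a row count saved at the last run are all assumptions made for illustration; they are not taken from the gist.

```python
def should_trigger_pipeline(current_rows, rows_at_last_run, min_new_rows):
    """Return True when enough new rows have arrived since the last training run.

    All three arguments are hypothetical. In a deployed function, current_rows
    could be read from table metadata, for example
    bigquery.Client().get_table(table_id).num_rows, and rows_at_last_run would
    have to be persisted somewhere between invocations.
    """
    if min_new_rows <= 0:
        raise ValueError("min_new_rows must be positive")
    new_rows = current_rows - rows_at_last_run
    return new_rows >= min_new_rows


# Example: 1,000 rows at the last run, 1,500 now, threshold of 500 new rows.
if should_trigger_pipeline(1500, 1000, 500):
    print("Enough new data: submit the pipeline run here.")
```

Keeping the decision in a small pure function makes it easy to test separately from the BigQuery and Vertex Pipelines calls.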

TF-Hub text embedding modules for underrepresented languages

Mentors:

  • Morgan Roff
  • Sayak Paul
  • jaeyounkim

This is a summary of my GSoC 2021 project, in which I set out to train text embedding modules for underrepresented languages such as Arabic and Swahili and publish them on tfhub.dev.

@simonster
simonster / attention_distance.py
Last active April 24, 2025 11:48
Mean attention distance
# Copyright 2022 Google LLC.
# SPDX-License-Identifier: Apache-2.0
# Author: Maithra Raghu <[email protected]>
import numpy as np


def compute_distance_matrix(patch_size, num_patches, length):
    """Helper function to compute distance matrix."""
    distance_matrix = np.zeros((num_patches, num_patches))
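
The preview is cut off after the matrix is allocated, so the following is only a sketch of how a mean attention distance could be computed, not the gist's own code. It assumes patches lie on a square grid indexed in row-major order, that distances are Euclidean and scaled by the patch size in pixels, and that each attention row sums to 1. The names `patch_distance_matrix`, `mean_attention_distance`, and `grid_len` are hypothetical, and plain Python lists stand in for NumPy arrays to keep the example dependency-free.

```python
import math


def patch_distance_matrix(patch_size, grid_len):
    """Pixel distance between every pair of patches on a grid_len x grid_len grid."""
    n = grid_len * grid_len
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        row_i, col_i = divmod(i, grid_len)  # row-major patch index -> grid position
        for j in range(n):
            row_j, col_j = divmod(j, grid_len)
            dist[i][j] = patch_size * math.hypot(row_i - row_j, col_i - col_j)
    return dist


def mean_attention_distance(attn, dist):
    """Attention-weighted distance per query patch, averaged over all queries.

    attn[i][j] is the weight query patch i places on key patch j.
    """
    n = len(attn)
    per_query = [sum(a * d for a, d in zip(attn[i], dist[i])) for i in range(n)]
    return sum(per_query) / n


# Usage: a 2x2 grid of 16-pixel patches with uniform attention.
dist = patch_distance_matrix(patch_size=16, grid_len=2)
uniform = [[0.25] * 4 for _ in range(4)]
print(mean_attention_distance(uniform, dist))
```

A head that attends only to its own patch yields a distance of zero, while a head that spreads attention evenly yields the average pairwise patch distance.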
@gau-nernst
gau-nernst / flux_infer.py
Last active January 24, 2025 06:33
FLUX CPU offload
import torch
from diffusers import FluxPipeline
from torch import nn
class ModelOffloaderV2:
    def __init__(self, model: nn.Module, record_stream: bool = False):
        # move model to pinned memory. keep a model copy in CPU pinned memory.
        for p in model.parameters():
            p.data = p.data.cpu().pin_memory()