Nadav Timor keyboardAnt

@keyboardAnt
keyboardAnt / gist:9dbf094123c818cd68bd986b52af9dd5
Last active September 11, 2016 11:16
ES watch: core dumps query
PUT _watcher/watch/core_dumps_slack
{
  "trigger" : {
    "schedule" : {"interval" : "1m"}
  },
  "input" : {
    "search" : {
      "request" : {
        "body" : {
          "query": {
@keyboardAnt
keyboardAnt / lm_format_enforcer_vllm_integration.ipynb
Last active November 1, 2023 19:29
lm_format_enforcer_vllm_integration.ipynb
@keyboardAnt
keyboardAnt / copy-of-colab_vllm_integration.ipynb
Last active November 3, 2023 03:30
copy-of-colab_vllm_integration.ipynb
@keyboardAnt
keyboardAnt / sd.py
Last active November 11, 2023 11:48
# Reference: https://static.sched.com/hosted_files/pytorch2023/c0/Accelerating%20Generative%20AI%20PTC%20%282%29.pdf?page=41
import torch
def speculative_decode(
    model: LLaMA,
    draft_model: LLaMA,
    cur_token: torch.Tensor,
    input_pos: int,
import numpy as np
# Placeholder functions to be implemented
def get_lora_adapter(username: str) -> callable:
    """Returns a callable function for the LoRA adapter."""
    pass
def get_reft_adapter(username: str) -> callable:
    """Returns a callable function for the ReFT adapter."""