Jakub Bartczuk lambdaofgod

@lambdaofgod
lambdaofgod / chatgpt_cost.org
Last active March 2, 2023 09:45
ChatGPT API monthly cost estimate

chatgpt_experiment


Assume ChatGPT is used 30 days a month, at 100 queries per day.

The mean tokens per query is an upper bound (estimated from a sample of queries); in practice it would be lower, especially since responses can be bounded by asking ChatGPT to answer in a fixed number of sentences.

api_cost_per_token = 2e-6
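Under the assumptions above, the estimate is a single multiplication. A minimal sketch, where `mean_tokens_per_query = 1000` is an illustrative assumption (the gist only says it estimated an upper bound, without giving the number):

```python
# Back-of-the-envelope monthly cost, using the gist's per-token price.
api_cost_per_token = 2e-6        # i.e. $0.002 per 1K tokens
mean_tokens_per_query = 1000     # assumed for illustration, not from the gist
queries_per_day = 100
days_per_month = 30

monthly_cost = (api_cost_per_token * mean_tokens_per_query
                * queries_per_day * days_per_month)
# -> 6.0, i.e. about $6/month under these assumptions
```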
@lambdaofgod
lambdaofgod / test_tokenization.py
Created January 29, 2023 13:04
Test Rust tokenization
from rust_functions import tokenize_python_code


def test_simple_input():
    example_code = """
def foo():
    return x + 1
"""
    expected_tokens = ["def", "foo", "(", ")", "return", "x", "+", "1"]
    assert tokenize_python_code(example_code.strip()) == expected_tokens
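`rust_functions` is the author's Rust extension module, so the test above isn't runnable without building it. For reference, Python's stdlib `tokenize` module produces a comparable token stream — a hypothetical pure-Python counterpart (note it also emits the `:` token, which the gist's expected list omits):

```python
# Hypothetical pure-Python counterpart to the Rust tokenizer, built on the
# stdlib tokenize module. Unlike the gist's expected output, it also
# yields the ":" token.
import io
import tokenize


def tokenize_python_code_py(code):
    wanted = {tokenize.NAME, tokenize.OP, tokenize.NUMBER}
    readline = io.StringIO(code + "\n").readline
    return [tok.string for tok in tokenize.generate_tokens(readline)
            if tok.type in wanted]


tokens = tokenize_python_code_py("def foo():\n    return x + 1")
# tokens -> ['def', 'foo', '(', ')', ':', 'return', 'x', '+', '1']
```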
@lambdaofgod
lambdaofgod / gpt_chat_bm25_search.py
Created December 3, 2022 15:14
search app generated with ChatGPT
"""
Query:
Write a gradio app that shows results of searching in a list of texts:
- tokenizing texts with nltk
- using rank_bm25 library
- displaying results as dataframe under the search box
Response:
Here is an example of a gradio app that uses the rank_bm25 library to search through a list of texts, tokenizes the texts and queries using nltk, and shows the results as a dataframe under the search box:
"""
sudo apt-get install -y libgl1-mesa-glx libegl1-mesa libxrandr2 libxrandr2 libxss1 libxcursor1 libxcomposite1 libasound2 libxi6 libxtst6 icu-devtools libicu-dev
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh
sudo apt-get install npm
sudo npm i elasticdump
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
sudo add-apt-repository \
   "deb [arch=amd64] https://download.docker.com/linux/ubuntu \
   $(lsb_release -cs) stable"
import ktrain

# zero-shot classifier backed by a pruned BERT model fine-tuned on MNLI
zsl = ktrain.text.ZeroShotClassifier('huggingface/prunebert-base-uncased-6-finepruned-w-distil-mnli')
from tensorflow.keras import layers


def build_segmentation_model(
    input_shape,
    n_classes,
    base_block_size=BASE_BLOCK_SIZE,
    base_dropout_rate=BASE_DROPOUT_RATE,
    activation=ACTIVATION,
):
    # Build U-Net segmentation model
    inputs = layers.Input(input_shape)
    # center pixel values around zero
    s = layers.Lambda(lambda x: x - 0.5)(inputs)
apt-get install vim git wget
# Example NeoMutt config file for the index-color feature.
# Entire index line
color index white black '.*'
# Author name, %A %a %F %L %n
# Give the author column a dark grey background
color index_author default color234 '.*'
# Highlight a particular from (~f)
color index_author brightyellow color234 '~fRay Charles'
# Message flags, %S %Z
import torch
import ot
from sklearn import metrics

# load pretrained RoBERTa-large via torch.hub / fairseq
roberta = torch.hub.load('pytorch/fairseq', 'roberta.large')
roberta.eval()  # disable dropout (or leave in train mode to finetune)


def get_roberta_features(text):
    # encode text to BPE token ids, then extract final-layer features
    # (completion of the truncated gist using fairseq's RoBERTa API)
    tokens = roberta.encode(text)
    return roberta.extract_features(tokens)
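The `ot` import (the POT library) alongside `sklearn.metrics` suggests the truncated gist goes on to compute an optimal-transport distance between RoBERTa features, presumably via a pairwise cost matrix and a solver such as `ot.emd2`. As a self-contained illustration of the objective only, here is the closed-form one-dimensional special case (this is an assumption about the gist's intent, not its actual code):

```python
# Stdlib-only sketch: exact Wasserstein-1 distance between two equal-size,
# equal-weight 1-D samples. For equal weights in 1-D the optimal coupling
# is the monotone (sorted) matching, so no OT solver is needed.
def wasserstein_1d(xs, ys):
    assert len(xs) == len(ys)
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)


d = wasserstein_1d([0, 1, 2], [1, 2, 3])
# d -> 1.0
```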