lewtun

🤫

LLM whispering

LLM Research and Engineering @ Hugging Face

lewtun / dialogue_template.py

Last active October 6, 2023 12:57

Dialogue template

	# coding=utf-8
	# Copyright 2023 The HuggingFace Team. All rights reserved.
	#
	# Licensed under the Apache License, Version 2.0 (the "License");
	# you may not use this file except in compliance with the License.
	# You may obtain a copy of the License at
	#
	# http://www.apache.org/licenses/LICENSE-2.0
	#
	# Unless required by applicable law or agreed to in writing, software

lewtun / m4_inference.py

Created July 4, 2023 15:28

M4 inference

	import torch
	from m4.training.packing import image_attention_mask_for_packed_input_ids, incremental_to_binary_attention_mask
	from m4.training.utils import build_image_transform
	from io import BytesIO
	from PIL import Image
	import requests
	from transformers import AutoTokenizer, AutoModelForCausalLM


	MAX_SEQ_LEN=2048

lewtun / hf-endpoints-inference.ipynb

Created June 10, 2023 07:53

Demo of synchronous and token streaming text generation

lewtun / dataset-sharding.ipynb

Created February 12, 2023 15:22

Sharding dataset subsets

lewtun / metrics.jsonl

Created December 2, 2022 09:47

metric.jsonl

	{"id":"628dfaf7554de818ab126e2d","dataset":{"name":"glue","type":"glue","config":"sst2","split":"validation"},"metric":{"type":"accuracy","value":0.8967889908256881,"name":"Accuracy"}}
	{"id":"628dfaf7554de818ab126e2d","dataset":{"name":"glue","type":"glue","config":"sst2","split":"validation"},"metric":{"type":"precision","value":0.8898678414096917,"name":"Precision"}}
	{"id":"628dfaf7554de818ab126e2d","dataset":{"name":"glue","type":"glue","config":"sst2","split":"validation"},"metric":{"type":"recall","value":0.9099099099099099,"name":"Recall"}}
	{"id":"628dfaf7554de818ab126e2d","dataset":{"name":"glue","type":"glue","config":"sst2","split":"validation"},"metric":{"type":"auc","value":0.9672186789593331,"name":"AUC"}}
	{"id":"628dfaf7554de818ab126e2d","dataset":{"name":"glue","type":"glue","config":"sst2","split":"validation"},"metric":{"type":"f1","value":0.8997772828507795,"name":"F1"}}
	{"id":"628dfaf7554de818ab126e2d","dataset":{"name":"glue","type":"glue","config":"sst2","split":"validation"},"metric":{"ty
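Each line in the preview above is a standalone JSON object, i.e. the file follows the JSON Lines layout. As a minimal sketch of how such records could be consumed, the snippet below parses two records copied from the preview and collects the metric values by type. The helper name `load_metrics` is hypothetical and not part of the gist; a real script would iterate over the file rather than an inline string.

```python
import json

# Two records copied verbatim from the preview above. In practice these
# would be read line by line from the .jsonl file.
SAMPLE = """\
{"id":"628dfaf7554de818ab126e2d","dataset":{"name":"glue","type":"glue","config":"sst2","split":"validation"},"metric":{"type":"accuracy","value":0.8967889908256881,"name":"Accuracy"}}
{"id":"628dfaf7554de818ab126e2d","dataset":{"name":"glue","type":"glue","config":"sst2","split":"validation"},"metric":{"type":"f1","value":0.8997772828507795,"name":"F1"}}
"""


def load_metrics(lines):
    """Hypothetical helper: map each metric type to its value."""
    metrics = {}
    for line in lines:
        line = line.strip()
        if not line:  # tolerate blank lines between records
            continue
        record = json.loads(line)
        metrics[record["metric"]["type"]] = record["metric"]["value"]
    return metrics


metrics = load_metrics(SAMPLE.splitlines())
print(metrics)
```

Keying on `metric.type` assumes one record per metric type for a given model and dataset, which holds for the lines visible in the preview.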

lewtun / format_spaces_urls.ipynb

Created November 22, 2022 12:30

[HF Course] Format Gradio URLs

lewtun / update-label-mappings.py

Created July 15, 2022 18:21

Update label mappings in config.json

	import json

	import datasets
	import transformers
	from datasets import ClassLabel, load_dataset
	from huggingface_hub import (
	    HfFolder,
	    ModelFilter,
	    hf_hub_download,
	    list_models,

lewtun / page_334.py

Created January 21, 2022 09:27

Correction to page 334

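	def get_grouped_params(model, no_decay=["bias", "LayerNorm.weight"]):
	    # Parameters whose names match an entry in `no_decay` get no weight decay.
	    # `args` is not defined in this snippet; it is assumed to come from the
	    # surrounding script and to expose `weight_decay`.
	    params_with_wd, params_without_wd = [], []
	    for n, p in model.named_parameters():
	        if any(nd in n for nd in no_decay):
	            params_without_wd.append(p)
	        else:
	            params_with_wd.append(p)
	    return [{'params': params_with_wd, 'weight_decay': args.weight_decay},
	            {'params': params_without_wd, 'weight_decay': 0.0}]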
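To illustrate what the grouping above produces, here is a small self-contained sketch. It is not the gist's code: `FakeModel` is a hypothetical stand-in that only mimics `named_parameters()`, the parameter values are placeholder strings rather than tensors, and `weight_decay` is passed as an argument instead of being read from a global `args`.

```python
class FakeModel:
    """Hypothetical stand-in exposing named_parameters() like a torch.nn.Module."""

    def named_parameters(self):
        # Names mimic a transformer layer; values are placeholders, not tensors.
        yield "encoder.dense.weight", "W"
        yield "encoder.dense.bias", "b"
        yield "encoder.LayerNorm.weight", "g"


def get_grouped_params(model, weight_decay, no_decay=("bias", "LayerNorm.weight")):
    """Split parameters into a decayed group and a non-decayed group."""
    params_with_wd, params_without_wd = [], []
    for n, p in model.named_parameters():
        if any(nd in n for nd in no_decay):
            params_without_wd.append(p)
        else:
            params_with_wd.append(p)
    return [
        {"params": params_with_wd, "weight_decay": weight_decay},
        {"params": params_without_wd, "weight_decay": 0.0},
    ]


groups = get_grouped_params(FakeModel(), weight_decay=0.1)
print(groups)
```

With a real PyTorch model, a list of this shape can be passed as the first argument to an optimizer such as `torch.optim.AdamW`, which accepts per-group options like `weight_decay`.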

lewtun / chapter10_page334.py

Created January 20, 2022 11:00

	if any(nd in n for nd in no_decay):
	    params_without_wd.append(p)
	else:
	    params_with_wd.append(p)

lewtun / chapter06_codeblock01.py

Created January 20, 2022 08:31

Chapter 6 - Improve codeblock for summaries

	from tqdm import tqdm
	import torch

	device = "cuda" if torch.cuda.is_available() else "cpu"

	def chunks(list_of_elements, batch_size):
	    """Yield successive batch-sized chunks from list_of_elements."""
	    for i in range(0, len(list_of_elements), batch_size):
	        yield list_of_elements[i : i + batch_size]
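As a quick illustration of the batching behaviour, the sketch below repeats the `chunks` generator and applies it to a short list. The sample `articles` list is made up for the example; the preview does not show how the gist goes on to use the batches.

```python
def chunks(list_of_elements, batch_size):
    """Yield successive batch-sized chunks from list_of_elements."""
    for i in range(0, len(list_of_elements), batch_size):
        yield list_of_elements[i : i + batch_size]


# Hypothetical input: five items split into batches of two.
articles = ["a1", "a2", "a3", "a4", "a5"]
batches = list(chunks(articles, batch_size=2))
print(batches)
```

The final batch is simply shorter when the list length is not a multiple of `batch_size`, so no items are dropped.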
