Mehdi Cherti mehdidc

model	gpu	env	cl	infer_samples_per_sec	infer_step_time	infer_batch_size	train_samples_per_sec	train_step_time	train_batch_size	param_count	img_size
efficientnet_b0	rtx3090	ngc2102	True	7179.22	0.139	512	1628.51	0.609	256	5.29	224
efficientnet_b0	rtx3090	ngc2012	True	6527.77	0.153	512	1504.58	0.654	256	5.29	224
efficientnet_b0	v100_32	ngc2102	True	6496.56	0.154	512	1556.66	0.638	512	5.29	224
efficientnet_b0	rtx3090	1.7.1cu11.0	True	6020.3	0.166	512	1266.03	0.785	512	5.29	224
efficientnet_b0	rtx3090	1.8cu11.1	True	5979.7	0.167	512	1286.76	0.775	512	5.29	224
efficientnet_b0	v100_32	ngc2012	True	5666.05	0.176	512	1459.05	0.676	512	5.29	224
efficientnet_b0	v100_32	1.8cu11.1	True	5529.09	0.181	512	1444.02	0.688	512	5.29	224
efficientnet_b0	v100_32	1.7.1cu11.0	True	5526.07	0.181	512	1425.38	0.691	512	5.29	224
efficientnet_b0	titanrtx	ngc2102	True	5118.38	0.195	512	1156.83	0.862	512	5.29	224
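
To compare rows of the table above, a minimal sketch that ranks configurations by one throughput column and expresses each as a fraction of the fastest. Only three rows (the ngc2102 ones) and three columns are copied here; the function name `relative_throughput` is made up for illustration.

```python
import csv
import io

# Three rows copied from the table above (tab-separated, subset of columns).
TSV = (
    "gpu\tenv\tinfer_samples_per_sec\n"
    "rtx3090\tngc2102\t7179.22\n"
    "v100_32\tngc2102\t6496.56\n"
    "titanrtx\tngc2102\t5118.38\n"
)


def relative_throughput(tsv_text, column="infer_samples_per_sec"):
    """Return (gpu, env, value / best) tuples, fastest configuration first."""
    rows = list(csv.DictReader(io.StringIO(tsv_text), delimiter="\t"))
    best = max(float(r[column]) for r in rows)
    ranked = sorted(rows, key=lambda r: float(r[column]), reverse=True)
    return [(r["gpu"], r["env"], float(r[column]) / best) for r in ranked]


for gpu, env, ratio in relative_throughput(TSV):
    print(f"{gpu}\t{env}\t{ratio:.2f}")
```

The same function should work on the full table if the whole TSV is passed in, since it only looks up columns by name.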

First follow https://huggingface.co/docs/huggingface_hub/how-to-upstream#push-files-with-git-lfs

Create a new dataset repository at https://huggingface.co/new-dataset, then:
git clone https://huggingface.co/datasets/user/dataset
cd dataset
# put your files in the current folder (probably a mv)
git lfs track *
git add .
git commit -m "dataset"
git push
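
The track/add/commit/push steps above can also be scripted. A rough sketch, assuming the repository is already cloned and the files are already in place; `build_push_commands` and `run_commands` are hypothetical helper names, not part of any Hugging Face tooling. One detail worth noting: in the shell, the unquoted `*` in `git lfs track *` is expanded to the visible files in the folder, while `subprocess` does no globbing, so the files are listed explicitly here.

```python
import os
import subprocess


def build_push_commands(folder, message="dataset"):
    """Return the git commands from the steps above as argv lists.

    Hidden entries (.git, .gitattributes) are skipped, which mirrors what
    the shell does when it expands an unquoted `*`.
    """
    files = sorted(f for f in os.listdir(folder) if not f.startswith("."))
    return [
        ["git", "lfs", "track", *files],
        ["git", "add", "."],
        ["git", "commit", "-m", message],
        ["git", "push"],
    ]


def run_commands(commands, cwd, dry_run=True):
    """Print each command; execute it inside `cwd` only when dry_run is False."""
    for argv in commands:
        print(" ".join(argv))
        if not dry_run:
            subprocess.run(argv, cwd=cwd, check=True)
```

Calling `run_commands(build_push_commands("dataset"), cwd="dataset")` only prints the commands; pass `dry_run=False` to actually run them.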

Install

git clone https://github.com/mlfoundations/open_clip.git
cd open_clip
python3.8 -m venv .env
source .env/bin/activate
pip install -U pip
pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu113
pip install -e .
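
After the install, a quick sanity check can confirm that the packages are visible from the activated venv. This is only a sketch: `missing_packages` is a made-up helper, and the import names (`torch`, `torchvision`, `torchaudio`, `open_clip`) are assumed to match the packages installed above. It looks modules up without importing them, so it says nothing about whether CUDA actually works.

```python
import importlib.util


def missing_packages(names):
    """Return the subset of top-level module names not found in this environment."""
    return [n for n in names if importlib.util.find_spec(n) is None]


if __name__ == "__main__":
    # Import names assumed for the packages installed in the steps above.
    missing = missing_packages(["torch", "torchvision", "torchaudio", "open_clip"])
    print("missing:", missing or "none")
```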

Some remarks on Large Language Models

Yoav Goldberg, January 2023

Audience: I assume you have heard of ChatGPT, maybe played with it a little, and were impressed by it (or tried very hard not to be). And that you have also heard that it is "a large language model". And maybe that it "solved natural language understanding". Here is a short personal perspective on this (and similar) models, and on where we stand with respect to language understanding.

Intro

Around 2014-2017, right within the rise of neural-network based methods for NLP, I was giving a semi-academic-semi-popsci lecture, revolving around the story that achieving perfect language modeling is equivalent to being as intelligent as a human. Somewhere around the same time I was also asked in an academic panel "what would you do if you were given infinite compute and no need to worry about labour costs" to which I cockily responded "I would train a really huge language model, just to show that it doesn't solve everything!". We

The Verge: "Meta’s powerful AI language model has leaked online — what happens now?"

Could you confirm that you downloaded the LLaMA series from 4chan? Were you able to get it running yourself or did you just repackage the download? (I was a bit confused, reading your tweets, about what exactly you'd done there, so if you're able to explain that, it'd be great.)

I downloaded it from Facebook, actually. You can find some details here.

Basically, the sequence of events was:

Reinforcement Learning for Language Models

Yoav Goldberg, April 2023.

Why RL?

With the release of the ChatGPT model and follow-up large language models (LLMs), there was a lot of discussion of the importance of "RLHF training", that is, "reinforcement learning from human feedback". I was puzzled for a while as to why RL (Reinforcement Learning) is better than learning from demonstrations (a.k.a. supervised learning) for training language models. Shouldn't learning from demonstrations (or, in language model terminology, "instruction fine-tuning": learning to imitate human-written answers) be sufficient? I came up with a theoretical argument that was somewhat convincing. But I came to realize there is an additional argument which not only supports the case for RL training, but also requires it, in particular for models like ChatGPT. This additional argument is spelled out in (the first half of) a talk by John Schulman from OpenAI. This post pretty much

	import torch
	import torch.distributed as dist
	import torch.nn as nn
	import torch.multiprocessing as mp

	from torch.nn.parallel import DistributedDataParallel as DDP
	from fairscale.nn.data_parallel import ShardedDataParallel as ShardedDDP
	from fairscale.optim.oss import OSS
	from fairscale.nn.data_parallel import FullyShardedDataParallel as FSDP
	import os

	"""
	stable diffusion dreaming
	creates hypnotic moving videos by smoothly walking randomly through the sample space

	example way to run this script:

	$ python stablediffusionwalk.py --prompt "blueberry spaghetti" --name blueberry

	to stitch together the images, e.g.:
	$ ffmpeg -r 10 -f image2 -s 512x512 -i blueberry/frame%06d.jpg -vcodec libx264 -crf 10 -pix_fmt yuv420p blueberry.mp4

	import numpy as np
	from scipy.fftpack import dct


	def hash_algo(pil_img, size=10):
	    """
	    Get perceptual hash of the input image.

	    Args:
	        image_array: numpy array that corresponds to the image.

	import types
	from typing import Union, List, Optional, Callable

	import diffusers
	import torch

	from diffusers.pipelines.stable_diffusion.pipeline_stable_diffusion import StableDiffusionPipelineOutput


	@torch.inference_mode()