Yassine Alouini (yassineAlouini)
⚙️ PyTorch Exploration...

@3outeille
3outeille / pipeline_parallel.py
Last active April 28, 2025 00:54
Self-contained example of how pipeline parallelism works (AFAB and 1F1B) in 200 LOC
#VERBOSE=0 torchrun --nproc_per_node 3 self_contained_pp_LOC.py
import os, random, numpy as np, torch, torch.nn as nn, torch.distributed as dist, torch.nn.functional as F
from torch.optim import AdamW
from torch.utils.data import DataLoader, DistributedSampler
from datasets import load_dataset
from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer
STEP, local_rank, world_size, verbose = 0, int(os.environ["LOCAL_RANK"]), int(os.environ["WORLD_SIZE"]), os.environ.get("VERBOSE", "0") == "1"
def set_all_seed(seed):
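The preview cuts off at set_all_seed; as a minimal sketch (the gist's actual body is not shown here), a typical implementation seeds every RNG the script uses, relying on the random/np/torch imports above:

def set_all_seed(seed):
    # seed Python, NumPy and PyTorch RNGs so every rank starts from the same state
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)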
import math
import os
from collections import defaultdict
from pathlib import Path
from huggingface_hub import CommitOperationAdd, preupload_lfs_files, create_commit
# fast transfers using a Rust library, `pip install hf-transfer`
os.environ["HF_HUB_ENABLE_HF_TRANSFER"] = "1"
@shawwn
shawwn / JAX_compliation_cache.md
Last active January 2, 2024 15:46
JAX persistent compilation cache

JAX released a persistent compilation cache for TPU VMs! When enabled, the cache writes compiled JAX computations to disk so they don’t have to be re-compiled the next time you start your JAX program. This can save startup time if any of y’all have long compilation times.

First upgrade to the latest jax release:

pip install -U "jax[tpu]>=0.2.18" -f https://storage.googleapis.com/jax-releases/libtpu_releases.html

Then use the following to enable the cache in your jax code:

from jax.experimental.compilation_cache import compilation_cache as cc
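The snippet is cut off after the import; a minimal sketch of enabling the cache (the cache directory is an arbitrary choice, not from the gist):

# point JAX at a directory where compiled computations are stored and reused across runs
cc.initialize_cache("/tmp/jax_cache")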
@meseta
meseta / quora_dump_parser.py
Last active June 16, 2021 21:53
Converts Quora dumps that you can request Quora to send you (multiple zip files) into parsed JSON and markdown
""" Converts quora dumps that you can request Quora to send you (multiple zip files) into parsed json and markdown
To use:
1. install dateparser, beautifulsoup4, markdownify
2. copy the zips you've received to ./data, make sure to keep only the ones containing answers (some will
contain comments, blog posts, and other metadata)
3. run this script
4. see ./output folder
License (MIT):
Copyright 2021 Yuan Gao
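A minimal sketch of the core conversion step the steps above describe, assuming the answers are stored as HTML inside the zips (the file layout and names here are assumptions, not taken from the script):

import zipfile
from bs4 import BeautifulSoup
from markdownify import markdownify

with zipfile.ZipFile("data/answers_1.zip") as zf:  # hypothetical zip name under ./data
    for name in zf.namelist():
        if name.endswith(".html"):
            html = zf.read(name).decode("utf-8")
            text = BeautifulSoup(html, "html.parser").get_text()  # plain-text version
            md = markdownify(html)                                # markdown version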
import torch
import cv2
import h5py
import numpy as np
from scipy.io import loadmat
import torch.utils.data as data
import torch.nn.functional as F
from torchvision.transforms import Compose
@kiyoon
kiyoon / ffmpeg_nvidia_conda_install.sh
Last active May 6, 2025 08:15
Install NVIDIA-accelerated ffmpeg in a conda environment.
git clone https://git.videolan.org/git/ffmpeg/nv-codec-headers.git
cd nv-codec-headers
vi Makefile # change the first line to PREFIX = ${CONDA_PREFIX}
make install
cd ..
git clone https://git.ffmpeg.org/ffmpeg.git
cd ffmpeg
git checkout n4.2.2
conda install nasm
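The preview stops before the build itself; a hedged sketch of how the build typically continues inside the conda environment (the configure flags are assumptions, not copied from the gist):

# configure ffmpeg to install into the conda prefix with NVENC/NVDEC support
./configure --prefix=$CONDA_PREFIX --enable-nvenc --enable-nvdec --enable-cuvid
make -j"$(nproc)"
make install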
@Mlawrence95
Mlawrence95 / confusion_matrix.py
Last active March 26, 2024 10:25
Python: create a confusion matrix from two categorical columns of a Pandas dataframe
import pandas as pd
def confusion_matrix(df: pd.DataFrame, col1: str, col2: str):
"""
Given a dataframe with at least two categorical columns,
create a confusion matrix of the cross-counts of the two columns.
Use like:
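The function body is cut off in the preview; a minimal self-contained sketch of the same idea using pandas' built-in cross tabulation (not necessarily the gist's exact implementation):

import pandas as pd

def confusion_matrix(df: pd.DataFrame, col1: str, col2: str) -> pd.DataFrame:
    """Cross-tabulate two categorical columns, e.g. confusion_matrix(df, "truth", "pred")."""
    return pd.crosstab(df[col1], df[col2])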
@leandrobmarinho
leandrobmarinho / yolo.py
Last active December 11, 2021 09:37
An example in Python using YOLO with OpenCV.
import cv2
import numpy as np
scale = 0.00392
classes_file = "coco.names"
weights = "yolov2.weights"
config_file = "yolov2.cfg"
# read class names from text file
classes = None
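The preview stops right after the configuration variables; a hedged sketch of the usual next steps with OpenCV's DNN module (the call pattern is typical, not copied from the gist, and the input image name is hypothetical):

# load class names and the pretrained Darknet weights/config into OpenCV's DNN module
with open(classes_file) as f:
    classes = [line.strip() for line in f]
net = cv2.dnn.readNet(weights, config_file)
# feed the network a 416x416 blob, scaled by `scale` (~1/255)
image = cv2.imread("example.jpg")
blob = cv2.dnn.blobFromImage(image, scale, (416, 416), (0, 0, 0), True, crop=False)
net.setInput(blob)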
@zblz
zblz / mlflow_plugin_system_proposal.md
Last active August 17, 2020 09:13
Proposal for plugin system in MLflow

Proposal for a plugin system in MLflow

Motivation

MLflow has an internally pluggable architecture to enable using different backends for both the tracking store and the artifact store. This makes it easy to add new backends in the mlflow package, but does not allow for other packages to provide new handlers for new backends.
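As a hedged illustration of what such an external hook could look like (the entry-point group and class names below are hypothetical, not part of MLflow's API at the time of this proposal), a third-party package might register its store via setuptools entry points:

# setup.py of a hypothetical third-party package providing a new tracking backend
from setuptools import setup

setup(
    name="mlflow-mybackend",
    packages=["mlflow_mybackend"],
    entry_points={
        # group name and target are illustrative only
        "mlflow.tracking_store": [
            "mybackend=mlflow_mybackend.store:MyBackendStore",
        ],
    },
)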

This would be useful for several reasons:

@jeremyjordan
jeremyjordan / sgdr.py
Last active December 4, 2023 13:41
Keras Callback for implementing Stochastic Gradient Descent with Restarts
from keras.callbacks import Callback
import keras.backend as K
import numpy as np
class SGDRScheduler(Callback):
'''Cosine annealing learning rate scheduler with periodic restarts.
# Usage
```python
schedule = SGDRScheduler(min_lr=1e-5,
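The usage snippet is truncated mid-call; as a hedged sketch (the function and parameter names are assumptions, not the gist's exact signature), the cosine-annealed learning rate at the heart of such a callback is typically computed as:

import numpy as np

def cosine_annealed_lr(min_lr, max_lr, fraction_of_cycle):
    """fraction_of_cycle runs from 0 to 1 within each restart cycle."""
    return min_lr + 0.5 * (max_lr - min_lr) * (1 + np.cos(np.pi * fraction_of_cycle))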