Pratyay Banerjee (Neilblaze)
╰( ▀ ͜͞ʖ▀)つ──☆*:・゚
@wojtekmaj
wojtekmaj / add-missing-import-extensions.txt
Last active July 29, 2025 16:36
Visual Studio Code-compatible regular expression to add all missing import extensions
# Find
import\s(([^;]|\n)*)\sfrom\s(['"])(\.{1,2}\/.*)(?<!\.js)(?<!\.(css|pdf|png|jpg|jsx|mjs|mp3|mp4|svg|ttf))(?<!\.(avif|json|webm|webp|woff))(?<!\.woff2)(['"]);
# Replace with
import $1 from $3$4.js$7;
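
For illustration, the find/replace pair above turns relative imports like

import foo from './foo';
import { bar } from '../lib/bar';

into

import foo from './foo.js';
import { bar } from '../lib/bar.js';

while leaving imports that already end in .js (or in one of the excluded asset extensions such as .css or .woff2) untouched, as well as bare package imports that don't start with ./ or ../. (The file names here are made up for the example.)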
@Birch-san
Birch-san / code-assist.md
Last active March 4, 2024 19:32
Local VSCode AI code assistance via starcoder + 4-bit quantization in ~11GB VRAM

Install the HF Code Autocomplete VS Code plugin.

We are not going to set an API token; instead we will specify an API endpoint, and deploy that API ourselves so that our own GPU provides the code assistance.

We will use bigcode/starcoder, a 15.5B param model.
We will use NF4 4-bit quantization to fit this into 10787MiB VRAM.
Unquantized it would require 23767MiB VRAM, which still fits on a 4090 (24564MiB)!

Setup API

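The deployment steps are truncated in this capture, but a minimal sketch of the NF4 load described above, using transformers + bitsandbytes, might look like this (the model id is from the gist; the prompt and generation parameters are illustrative assumptions, and the starcoder weights are gated on the Hub, so a `huggingface-cli login` may be required):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# NF4 4-bit quantization, as described above
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained("bigcode/starcoder")
model = AutoModelForCausalLM.from_pretrained(
    "bigcode/starcoder",
    quantization_config=bnb_config,
    device_map="auto",  # place the quantized weights on the available GPU
)

prompt = "def fibonacci(n):"  # illustrative prompt
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(out[0], skip_special_tokens=True))
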
@veekaybee
veekaybee / normcore-llm.md
Last active October 13, 2025 07:06
Normcore LLM Reads

Anti-hype LLM reading list

Goals: Add links that are reasonable and good explanations of how stuff works. No hype and no vendor content if possible. Practical first-hand accounts of models in prod eagerly sought.

Foundational Concepts

Pre-Transformer Models

@UdaraJay
UdaraJay / Apps.jsx
Created October 15, 2023 14:43
Animated card stack
import styles from './Apps.module.scss';
import { useEffect, useState } from 'react';
import Link from 'next/link';

const APPS = [
  {
    title: 'APP',
    hero: 'Lorem ipsum dolor sit amet',
    description:
      'Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do.',
@gd3kr
gd3kr / embeddings.py
Created February 15, 2024 20:35
compute embeddings for tweets in tweets.json
"""
a simple script that reads tweets inside a json file, uses openai to compute embeddings and creates two files, metadata.tsv and output.tsv, which cam be used to visualise the tweets and their embeddings in TensorFlow Projector (https://projector.tensorflow.org/)
"""
# obtain tweets.json from https://gist.github.com/gd3kr/948296cf675469f5028911f8eb276dbc
import pandas as pd
import json
from openai import OpenAI
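The rest of the script is cut off here; a minimal sketch of the embedding step it describes (the embedding model and the tweets.json field names are assumptions, not taken from the gist):

import json
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

with open("tweets.json") as f:
    tweets = json.load(f)

texts = [t["full_text"] for t in tweets]  # assumed field name
resp = client.embeddings.create(model="text-embedding-ada-002", input=texts)

# TensorFlow Projector expects tab-separated vectors (output.tsv)
# plus one label per line (metadata.tsv).
with open("output.tsv", "w") as out, open("metadata.tsv", "w") as meta:
    for text, item in zip(texts, resp.data):
        out.write("\t".join(str(x) for x in item.embedding) + "\n")
        meta.write(text.replace("\t", " ").replace("\n", " ") + "\n")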
@sayakpaul
sayakpaul / inference_with_torchao_serialized.py
Last active August 29, 2025 11:42
Shows how to run Flux schnell under 17GB without bells and whistles. It additionally shows how to serialize the quantized checkpoint and load it back.
import torch
from huggingface_hub import hf_hub_download
from diffusers import FluxTransformer2DModel, DiffusionPipeline
dtype, device = torch.bfloat16, "cuda"
ckpt_id = "black-forest-labs/FLUX.1-schnell"
with torch.device("meta"):
config = FluxTransformer2DModel.load_config(ckpt_id, subfolder="transformer")
model = FluxTransformer2DModel.from_config(config).to(dtype)
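The snippet stops before the quantization itself; once real weights are loaded into `model`, a torchao-style continuation for quantizing and serializing might look like this (the quantization scheme and file name are assumptions, and the gist's exact choices may differ):

from torchao.quantization import quantize_, int8_weight_only

quantize_(model, int8_weight_only())  # quantize the linear layers in place
torch.save(model.state_dict(), "flux_schnell_int8.pt")  # serialized quantized checkpoint

# Later, rebuild the model skeleton the same way and load the
# quantized state dict back with assign=True:
# model.load_state_dict(torch.load("flux_schnell_int8.pt"), assign=True)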
@sayakpaul
sayakpaul / inference.md
Last active June 5, 2025 05:04
Not so rigorously validated FP8 training of Flux (dev) DreamBooth LoRA
from diffusers import AutoPipelineForText2Image
import torch
pipeline = AutoPipelineForText2Image.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
).to("cuda")
pipeline.load_lora_weights("sayakpaul/yarn_art_lora_flux", weight_name="pytorch_lora_weights.safetensors")
image = pipeline("a puppy in a pond, yarn art style", guidance_scale=3.5, height=768).images[0]
image.save("yarn.png")
@karpathy
karpathy / add_to_zshrc.sh
Created August 25, 2024 20:43
Git Commit Message AI
# -----------------------------------------------------------------------------
# AI-powered Git Commit Function
# Copy-paste this gist into your ~/.bashrc or ~/.zshrc to gain the `gcm` command. It:
# 1) gets the current staged diff
# 2) sends it to an LLM to write the git commit message
# 3) lets you easily accept, edit, regenerate, or cancel
# But - just read and edit the code however you like
# the `llm` CLI util is awesome, can get it here: https://llm.datasette.io/en/stable/
gcm() {
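  # --- The gist's function body is truncated in this capture. What follows is
  # --- a minimal sketch of the flow the comments above describe, assuming the
  # --- `llm` CLI (https://llm.datasette.io) is installed; the real gist also
  # --- offers edit and regenerate options.
  local diff msg ans
  diff=$(git diff --cached)
  if [ -z "$diff" ]; then
    echo "No staged changes to commit." >&2
    return 1
  fi
  msg=$(echo "$diff" | llm "Write a concise, one-line git commit message for this diff")
  printf 'Proposed commit message:\n%s\n' "$msg"
  printf 'Commit with this message? [y/N] '
  read ans
  [ "$ans" = "y" ] && git commit -m "$msg"
}
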
Begin by enclosing all thoughts within <thinking> tags, exploring multiple angles and approaches.
Break down the solution into clear steps within <step> tags. Start with a 20-step budget, requesting more for complex problems if needed.
Use <count> tags after each step to show the remaining budget. Stop when reaching 0.
Continuously adjust your reasoning based on intermediate results and reflections, adapting your strategy as you progress.
Regularly evaluate progress using <reflection> tags. Be critical and honest about your reasoning process.
Assign a quality score between 0.0 and 1.0 using <reward> tags after each reflection. Use this to guide your approach:
0.8+: Continue current approach
0.5-0.7: Consider minor adjustments
Below 0.5: Seriously consider backtracking and trying a different approach