GPT-4 is a breakthrough language model with an estimated 400 billion to 2 trillion parameters. It powers Bing Chat, whose safeguards can be bypassed fairly easily to make it generate code, but usage is capped at 15 responses per conversation and 150 responses per day. I can get it to answer hard questions - especially when it uses its internet search capability - but it cannot tear through a massive CUDA code base in a porting workflow. That would require an open-source, locally executed model such as LLaMA (leaked from Meta via 4chan). The 33-billion-parameter variant looks like the ideal tradeoff between output quality and the ability to fit into a GPU's memory. It could be fine-tuned with LoRA, a technique that attaches cheaply trained low-rank "knowledge modules" to each transformer layer while leaving the base weights frozen.
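To make the "knowledge module" idea concrete, here is a minimal sketch of a LoRA-wrapped linear layer in PyTorch. The class name, rank, and scaling values are illustrative assumptions, not the recipe of any particular fine-tune; in practice you would wrap the attention projections of every transformer layer and train only the adapter parameters.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A frozen linear layer plus a small trainable low-rank update.

    Effective weight: W + (alpha / r) * B @ A, where only A and B are trained.
    Rank r and alpha are illustrative assumptions, not values from the text.
    """
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        self.base.weight.requires_grad_(False)   # original weights stay frozen
        if self.base.bias is not None:
            self.base.bias.requires_grad_(False)
        self.scale = alpha / r
        # Low-rank factors: A projects down to rank r, B projects back up.
        self.lora_A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, r))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Frozen path plus the cheap trainable correction.
        return self.base(x) + self.scale * (x @ self.lora_A.T @ self.lora_B.T)

# Hypothetical usage on one layer's query projection:
# layer.attn.q_proj = LoRALinear(layer.attn.q_proj, r=8, alpha=16)
```

Because only the small A and B matrices receive gradients, the adapter is cheap to train and to store, which is what makes local fine-tuning of a 33B model plausible on a single GPU.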
_This text was slightly modified to improve code formatting. Otherwise, all tokens are pasted from GPT-4 verbatim. You can prove this by trying it yourself, which is free but requires account creation. Use the purple "Creative" or "I