Philip Turner philipturner

//
// main.swift
// OpenMM_M1_GPU
//
// Created by Philip Turner on 1/6/23.
//
import Foundation
import PythonKit
import Metal
//
// main.swift
// BLASBenchmarks
//
// Created by Philip Turner on 1/20/23.
//
import Accelerate.vecLib
import QuartzCore
import PythonKit
//
// main.swift
// LLMs
//
// Created by Philip Turner on 3/11/23.
//
import MetalPerformanceShadersGraph
// Generate and print random 4-bit weights, 32-bit scales and zeroes.

Bing Conversation

GPT-4 is a breakthrough language model with an estimated 400 billion to 2 trillion parameters. It powers Bing Chat, and its safeguards can be easily bypassed to make it generate code. Bing Chat has a usage limit of 15 responses/chat and 150 responses/day. I can get it to answer hard questions, especially using its internet search capabilities, but it cannot tear through massive CUDA code bases in a porting workflow. That would require an open-source, locally executed AI like LLaMA (leaked from Meta on 4chan). The 33 billion parameter variant seems like an ideal tradeoff between quality and ability to fit into a GPU's memory. It could be fine-tuned using LoRA, an algorithm that lets you attach cheaply trained "knowledge modules" onto each transformer layer.
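
As a rough illustration of the LoRA idea mentioned above, the sketch below adds a small trainable low-rank update to a frozen weight matrix, following the usual formulation `y = W x + (alpha / rank) * B (A x)`. It is not code from this gist: the type `LoRALinear`, its fields, and the helper `matVec` are hypothetical names, and plain Swift arrays stand in for the GPU tensors a real Metal implementation would use.

```swift
// Hypothetical sketch of a LoRA-adapted linear layer (names are illustrative only).
struct LoRALinear {
    let weight: [[Float]]  // frozen base weight W, shape [outDim][inDim]
    var a: [[Float]]       // trainable down-projection A, shape [rank][inDim], rank >= 1
    var b: [[Float]]       // trainable up-projection B, shape [outDim][rank]
    let alpha: Float       // scaling hyperparameter

    // Multiply a row-major matrix by a vector.
    static func matVec(_ m: [[Float]], _ v: [Float]) -> [Float] {
        m.map { row -> Float in
            zip(row, v).reduce(Float(0)) { $0 + $1.0 * $1.1 }
        }
    }

    // y = W x + (alpha / rank) * B (A x)
    func forward(_ x: [Float]) -> [Float] {
        let base = Self.matVec(weight, x)  // frozen path
        let down = Self.matVec(a, x)       // project into the low-rank space
        let up = Self.matVec(b, down)      // project back to the output size
        let scale = alpha / Float(a.count) // a.count is the rank
        return zip(base, up).map { $0 + scale * $1 }
    }
}

let layer = LoRALinear(
    weight: [[1, 0], [0, 1]],
    a: [[1, 1]],
    b: [[0], [0]],  // B starts at zero, so the adapter is initially a no-op
    alpha: 1
)
print(layer.forward([2, 3]))  // prints "[2.0, 3.0]"
```

Only `a` and `b` would be updated during fine-tuning, which is why the adapter is cheap to train: for a layer with `outDim x inDim` weights it adds just `rank * (inDim + outDim)` parameters.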

_This text was slightly modified to improve code formatting. Otherwise, all tokens are pasted from GPT-4 verbatim. You can prove this by trying it yourself, which is free but requires account creation. Use the purple "Creative" or "I

//
// main.swift
// ARHKConversion
//
// Created by Philip Turner on 4/22/23.
//
import Metal
import QuartzCore // contains the CACurrentMediaTime() function
CCV_NNC_GEMM_FORWARD [1]: [3] -> [1] (0)
|-> 1. 0x1438bd420 (0x285d90fc0:0) [2x320] 0.517578 0.953613 -0.921875 ..
|-> 2. 0x1438bd570 (0x285d841c0:0) [1280x320] -0.001888 0.001598 0.001110 ..
|-> 3. 0x1438bd5e0 (0x285d84280:0) [1280] -0.019775 0.008278 0.010788 ..
|<- 1. 0x1438a0000 (0x285da5600:0) [2x1280] 0.044556 -0.020798 0.078064 ..
CCV_NNC_SWISH_FORWARD [2]: [1] -> [1] (0)
|-> 1. 0x1438a0000 (0x285da5600:0) [2x1280] 0.044556 -0.020798 0.078064 ..
|<- 1. 0x1438a0000 (0x285da5600:0) [2x1280] 0.022781 -0.010292 0.040558 ..
CCV_NNC_GEMM_FORWARD [3]: [3] -> [1] (0)
|-> 1. 0x1438a0000 (0x285da5600:0) [2x1280] 0.022781 -0.010292 0.040558 ..
philipturner / LLaMA.swift
Last active December 8, 2023 19:13
Port of Facebook's LLaMA model in Swift/Metal
//
// LLaMA.swift
// MetalFlashAttention
//
// Created by Philip Turner on 5/24/23.
//
import MetalPerformanceShadersGraph
import simd
//
// main.swift
// ZAPCoordinates
//
// Created by Philip Turner on 5/30/23.
//
import Foundation
import simd
philipturner / CalculateDiffusion.swift
Last active July 20, 2025 10:28
Calculate the number of floating-point operations in Stable Diffusion, and how those operations are distributed among layers
//
// main.swift
// CalculateDiffusion
//
// Created by Philip Turner on 6/2/23.
//
import Foundation
import QuartzCore
import MetalPerformanceShadersGraph