Onur Keskin, Ph.D. (keskinonur)
@rishi-singh26
rishi-singh26 / main.dart
Last active February 12, 2024 21:27
How to implement end-to-end encryption using PBKDF in Flutter
// You can read the article I wrote for this setup on medium.com at the link below
// https://medium.com/@rishi_singh/how-to-implement-end-to-end-encryption-using-pbkdf-in-flutter-a5508e7ad93e
import 'dart:math';
import 'dart:typed_data';
import 'package:crypton/crypton.dart';
import 'package:pointycastle/block/aes.dart';
import 'package:pointycastle/digests/sha256.dart';
import 'package:pointycastle/key_derivators/pbkdf2.dart';
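
The preview stops at the imports. The idea the title describes is to stretch the user's password into a symmetric key with a PBKDF before any encryption happens; a minimal sketch of that derivation step, in Python for illustration (the function name and parameter choices are assumptions, not the gist's code):

import hashlib
import os

def derive_key(password: str, salt: bytes, iterations: int = 100_000) -> bytes:
    # PBKDF2-HMAC-SHA256 stretches the password into a 32-byte AES-256 key.
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations, dklen=32)

salt = os.urandom(16)  # random per user; stored alongside the ciphertext
key = derive_key("s3cret-passphrase", salt)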
@adrienbrault
adrienbrault / llama2-mac-gpu.sh
Last active April 8, 2025 13:49
Run Llama-2-13B-chat locally on your M1/M2 Mac with GPU inference. Uses 10GB RAM. UPDATE: see https://twitter.com/simonw/status/1691495807319674880?s=20
# Clone llama.cpp
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp

# Build with Metal (Apple GPU) support
make clean
LLAMA_METAL=1 make

# Download the quantized weights (the preview cuts off after the export;
# the Hugging Face source below is assumed from the model name)
export MODEL=llama-2-13b-chat.ggmlv3.q4_0.bin
wget "https://huggingface.co/TheBloke/Llama-2-13B-chat-GGML/resolve/main/${MODEL}"

# Run interactively with GPU offload (flags assumed; adjust to taste)
./main -m ./${MODEL} -ngl 1 -c 2048 --color -i
@catid
catid / gist:533dd0c7d4f3ee8d34a6a905155b72ae
Last active April 22, 2024 04:53
How to quantize a 70B model so it will fit on 2x4090 GPUs
I tried EXL2, AutoAWQ, and SqueezeLLM; they all failed for different reasons (issues opened).
HQQ worked:
I rented a 4x GPU, 1 TB RAM instance ($19/hr) on RunPod with 1024 GB of container and 1024 GB of workspace disk space. I think you only need 2x GPUs with 80 GB VRAM and 512 GB+ of system RAM, so I probably overpaid.
Note that you need to fill in Meta's access form to get the 70B weights.
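
For reference, the HQQ route looks roughly like this in Python: a sketch assuming the hqq package's Hugging Face wrapper, with an illustrative model id and settings rather than the gist's exact script.

from hqq.engine.hf import HQQModelForCausalLM
from hqq.core.quantize import BaseQuantizeConfig

model_id = "meta-llama/Llama-2-70b-hf"  # gated: needs Meta's access approval

# 4-bit weights in small groups cut memory roughly 4x with modest quality loss.
quant_config = BaseQuantizeConfig(nbits=4, group_size=64)

model = HQQModelForCausalLM.from_pretrained(model_id)
model.quantize_model(quant_config=quant_config)
model.save_quantized("llama2-70b-hqq-4bit")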
@awni
awni / mlx_distributed_deepseek.md
Last active November 5, 2025 17:41
Run DeepSeek R1 or V3 with MLX Distributed

Setup

On every machine in the cluster, install OpenMPI and mlx-lm:

conda install conda-forge::openmpi
pip install -U mlx-lm

Next, download the pipeline-parallel run script to the same path on every machine:

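The preview cuts off before the script link. Before launching across machines, it helps to sanity-check the install on a single node; a minimal sketch using mlx-lm's Python API (the model id is illustrative; the full R1/V3 checkpoints are exactly what the distributed setup exists to fit):

from mlx_lm import load, generate

# Load a small quantized model to confirm mlx-lm and Metal work on this node.
model, tokenizer = load("mlx-community/Mistral-7B-Instruct-v0.3-4bit")
print(generate(model, tokenizer, prompt="Hello from this node", max_tokens=32))
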
@philschmid
philschmid / GEMINI.md
Last active November 5, 2025 12:11
Gemini CLI Plan Mode prompt

Gemini CLI Plan Mode

You are Gemini CLI, an expert AI assistant operating in a special 'Plan Mode'. Your sole purpose is to research, analyze, and create detailed implementation plans. You must operate in a strict read-only capacity.

Gemini CLI's primary goal is to act like a senior engineer: understand the request, investigate the codebase and relevant resources, formulate a robust strategy, and then present a clear, step-by-step plan for approval. You are forbidden from making any modifications. You are also forbidden from implementing the plan.

Core Principles of Plan Mode

  • Strictly Read-Only: You can inspect files, navigate code repositories, evaluate project structure, search the web, and examine documentation.
  • Absolutely No Modifications: You are prohibited from performing any action that alters the state of the system. This includes:

Boost Prompt

A prompt to boost your lazy "do this" prompts. Install with one of the buttons below.

Install in VS Code · Install in VS Code Insiders

Use