Joshua Kaunert jkaunert

A tutorial on fine-tuning DeepSeek R1 for medical applications and integrating DSPy for reinforcement learning.

This tutorial will be structured for AI/ML engineers and medical professionals, covering:

  • Introduction: Overview of DeepSeek R1 and DSPy in medical AI.
  • Features & Benefits: Key advantages of this approach.
  • Warnings & Considerations: Potential risks and limitations.
  • Installation & Setup: Environment configuration and dependencies.
  • Dataset Preparation: Selecting and formatting medical datasets.
  • Fine-tuning DeepSeek R1: Using LoRA with Unsloth for efficient training.
@jkaunert
jkaunert / FAQ.md
Created January 30, 2025 03:27 — forked from ngxson/FAQ.md
convert ARM NEON to WASM SIMD prompt

Why did you do this?

Relax, I only had one Sunday to work on the idea; it is literally my weekend project. So I tried DeepSeek to see if it could help. Surprisingly, it works, and it saved me another weekend...

What is your setup?

Just chat.deepseek.com (cost = free) with prompts adapted from this gist.

Does it work in one shot, or do I have to prompt it multiple times?

# /// script
# requires-python = ">=3.11,<3.12"
# dependencies = [
# "distilabel[hf-transformers, hf-inference-endpoints]",
# ]
# ///
from distilabel.models import InferenceEndpointsLLM
from distilabel.pipeline import InstructionResponsePipeline
repo_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"
@jkaunert
jkaunert / Insider.md
Created January 25, 2025 07:57 — forked from ruvnet/Insider.md
Insider Trading Mirroring System

o1 Pro: Insider Trading Mirroring System

Introduction

In the dynamic world of financial markets, staying ahead of insider movements can provide significant strategic advantages.

The Insider Trading Mirroring System is a sophisticated tool designed to monitor publicly disclosed insider trades and automatically mirror these actions within your investment portfolio. By leveraging cutting-edge technologies like LangGraph and integrating real-time data feeds, this system offers a seamless and automated approach to capitalizing on insider trading activities.

Legal & Ethical Considerations
It's crucial to emphasize that this system only processes publicly available insider trading information, as mandated by regulatory bodies such as the U.S. Securities and Exchange Commission (SEC). Engaging in trading based on material non-public information is illegal and unethical. Users must ensure compliance with all relevant laws and regulations and consult with legal and compliance professiona

@jkaunert
jkaunert / get_memory_size.py
Created January 16, 2025 15:35 — forked from philschmid/get_memory_size.py
Get the GPU memory needed per precision for a Hugging Face model ID
from typing import Dict, Union
from huggingface_hub import get_safetensors_metadata
import argparse
import sys
# Example:
# python get_gpu_memory.py Qwen/Qwen2.5-7B-Instruct
# Dictionary mapping dtype strings to their byte sizes
bytes_per_dtype: Dict[str, float] = {
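The preview is cut off at the dtype table. As a rough illustration of the arithmetic such a script presumably relies on, the sketch below multiplies a parameter count by the number of bytes each precision stores per parameter. The byte widths, the `estimate_weights_gib` helper, and the parameter count in the usage note are assumptions made for illustration, not values taken from the gist, and the result covers weights only (no activations, KV cache, or framework overhead).

```python
from typing import Dict

# Assumed byte widths per parameter; a hypothetical table, not copied from the gist.
ASSUMED_BYTES_PER_DTYPE: Dict[str, float] = {
    "float32": 4.0,
    "float16": 2.0,
    "bfloat16": 2.0,
    "int8": 1.0,
    "int4": 0.5,
}


def estimate_weights_gib(num_parameters: int, dtype: str) -> float:
    """Return the size of the model weights alone, in GiB, at the given precision."""
    if dtype not in ASSUMED_BYTES_PER_DTYPE:
        raise ValueError(f"unsupported dtype: {dtype}")
    total_bytes = num_parameters * ASSUMED_BYTES_PER_DTYPE[dtype]
    return total_bytes / (1024 ** 3)
```

For example, `estimate_weights_gib(7_000_000_000, "float16")` gives the weight footprint of a hypothetical 7-billion-parameter model in half precision; real memory use during inference will be higher.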
@jkaunert
jkaunert / Microsoft Copilot System Prompt (19-12-24).txt
Created January 16, 2025 08:40 — forked from theJayTea/Microsoft Copilot System Prompt (19-12-24).txt
I extracted Microsoft Copilot's system instructions—insane stuff here. It's instructed to lie to make MS look good, and is full of cringe corporate alignment.
You are Copilot, an AI companion created by Microsoft.
My goal is to have meaningful and engaging conversations with users and provide helpful information.
I don’t know the technical details of the AI model I’m built on, including its architecture, training data, or size. If I’m asked about these details, I only say that I’m built on the latest cutting-edge large language models.
I never say that conversations are private, that they aren't stored, used to improve responses, or accessed by others. Instead, I share the privacy link without providing any commentary about the actual policy. For example, if the user says “How do you use our conversations?” I would not say anything about whether I store them or use them for training, because I don't answer questions about how I handle user data.
Similarly, if the user asks “Can anyone see our conversations,” I don't claim that they're confidential and I say something LIKE “If you're curious about how your data is handled, best to check out Microsoft's priva
@jkaunert
jkaunert / vector_search.py
Created January 16, 2025 04:39 — forked from davidberenstein1957/vector_search_hub_datasets.py
vector search on the Hugging Face Hub
from sentence_transformers import SentenceTransformer
import duckdb
from huggingface_hub import get_token
model = SentenceTransformer("TaylorAI/bge-micro-v2")
def similarity_search(
    query: str,
    k: int = 5,
    dataset_name: str = "smol-blueprint/hf-blogs-text-embeddings",
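The preview ends partway through the function signature, so the body of `similarity_search` is not visible here. The sketch below does not reproduce the gist (which imports `sentence_transformers` and `duckdb`); it only illustrates, in plain Python, what a top-`k` similarity search does: score every stored embedding against the query embedding and keep the `k` best. The helper names `cosine_similarity` and `top_k` and the toy two-dimensional vectors are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(
    query_embedding: Sequence[float],
    corpus: List[Tuple[str, Sequence[float]]],
    k: int = 5,
) -> List[Tuple[str, float]]:
    """Brute-force search: score every (text, embedding) pair and return the k best."""
    scored = [(text, cosine_similarity(query_embedding, emb)) for text, emb in corpus]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

A brute-force scan like this is only practical for small collections; pushing the scoring into a database or an index, as the gist's imports suggest it does, avoids loading every embedding into Python.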
@jkaunert
jkaunert / String+HTML.swift
Created March 28, 2020 03:24 — forked from valvoline/String+HTML.swift
A swift string extension to deal with HTML
//
// String+HTML.swift
// AttributedString
//
// Created by Costantino Pistagna on 08/11/2017.
// Copyright © 2017 sofapps.it All rights reserved.
//
import UIKit
import Foundation
<!doctype html>
<!-- Bootstrap Under Construction Boilerplate -->
<!-- paulirish.com/2008/conditional-stylesheets-vs-css-hacks-answer-neither/ -->
<!--[if lt IE 7]> <html class="no-js lt-ie9 lt-ie8 lt-ie7" lang="en"> <![endif]-->
<!--[if IE 7]> <html class="no-js lt-ie9 lt-ie8" lang="en"> <![endif]-->
<!--[if IE 8]> <html class="no-js lt-ie9" lang="en"> <![endif]-->
<!-- Consider adding a manifest.appcache: h5bp.com/d/Offline -->
<!--[if gt IE 8]><!-->
<html class="no-js" lang="en">
<!--<![endif]-->
@jkaunert
jkaunert / cURL+Request.swift
Created February 2, 2020 17:31 — forked from shaps80/cURL+Request.swift
Generates a cURL command representation of a URLRequest in Swift.
extension URLRequest {
    /**
     Returns a cURL command representation of this URL request.
     */
    public var curlString: String {
        guard let url = url else { return "" }
        var baseCommand = "curl \(url.absoluteString)"
        if httpMethod == "HEAD" {