Skip to content

Instantly share code, notes, and snippets.

View ceshine's full-sized avatar

CeShine Lee ceshine

View GitHub Profile
@nbroad1881
nbroad1881 / deberta_mlm.py
Last active August 17, 2022 22:47
Implementation that makes use of the pretrained weights for Deberta for Masked Language Modeling.
from typing import Any, Optional, Union, Tuple
import torch
from torch import nn
from transformers.activations import ACT2FN
from transformers.models.deberta.modeling_deberta import (
DebertaPreTrainedModel,
DebertaModel,
)
from transformers.models.deberta_v2.modeling_deberta_v2 import (
@veekaybee
veekaybee / normcore-llm.md
Last active May 12, 2025 06:56
Normcore LLM Reads

Anti-hype LLM reading list

Goals: Add links that are reasonable and good explanations of how stuff works. No hype and no vendor content if possible. Practical first-hand accounts of models in prod eagerly sought.

Foundational Concepts

Screenshot 2023-12-18 at 10 40 27 PM

Pre-Transformer Models

@jrknox1977
jrknox1977 / ollama_dspy.py
Created February 9, 2024 18:06
ollama+DSPy using OpenAI APIs.
# install DSPy: pip install dspy
import dspy
# Ollam is now compatible with OpenAI APIs
#
# To get this to work you must include `model_type='chat'` in the `dspy.OpenAI` call.
# If you do not include this you will get an error.
#
# I have also found that `stop='\n\n'` is required to get the model to stop generating text after the ansewr is complete.
# At least with mistral.