
Malcolm Greaves malcolmgreaves

@stefan-it
stefan-it / run_ner.py
Last active April 2, 2022 13:28
NER fine-tuning with PyTorch-Transformers (heavily based on https://github.com/kamalkraj/BERT-NER)
from __future__ import absolute_import, division, print_function
import argparse
import glob
import logging
import os
import random
import numpy as np
import torch
@maxweisspoker
maxweisspoker / dockerfile-command
Last active November 2, 2025 11:42
Bash alias to print "Dockerfile" for image by using sed with "docker history"
alias dockerfile='script.sh'
script.sh:
#!/bin/bash
echo "FROM scratch"
docker history --no-trunc "$@" | tac | tr -s ' ' | cut -d " " -f 5- | sed 's,^/bin/sh -c #(nop) ,,g' | sed 's,^/bin/sh -c,RUN,g' | sed 's, && , \\\n & ,g' | sed 's,\s*[0-9]*[\.]*[0-9]*\s*[kMG]*B\s*$,,g' | head -n -1
export CONTAINER_URI="gcr.io/deeplearning-platform-release/experimental.theia.1-7"
export INSTANCE_NAME=...
export PROJECT_NAME=...
export IMAGE_PROJECT="deeplearning-platform-release"
export IMAGE_FAMILY="theia-container-experimental"
export MACHINE_TYPE=... #"n1-standard-4"
export ZONE=... #"us-central1-a"
gcloud compute instances create "${INSTANCE_NAME}" \
--project="${PROJECT_NAME}" \
--zone="${ZONE}" \
export CONTAINER_URI="gcr.io/deeplearning-platform-release/experimental.theia.1-7"
export INSTANCE_NAME=...
export PROJECT_NAME=...
export IMAGE_PROJECT="deeplearning-platform-release"
export IMAGE_FAMILY="theia-container-experimental"
export MACHINE_TYPE=... #"n1-standard-4"
export ZONE=.... #"us-central1-a"
gcloud notebooks instances create "${INSTANCE_NAME}" \
--project="${PROJECT_NAME}" \
--location="${ZONE}" \

Reinforcement Learning for Language Models

Yoav Goldberg, April 2023.

Why RL?

With the release of the ChatGPT model and follow-up large language models (LLMs), there was a lot of discussion of the importance of "RLHF training", that is, "reinforcement learning from human feedback". I was puzzled for a while as to why RL (reinforcement learning) is better than learning from demonstrations (a.k.a. supervised learning) for training language models. Shouldn't learning from demonstrations (or, in language model terminology, "instruction fine-tuning", learning to imitate human-written answers) be sufficient? I came up with a theoretical argument that was somewhat convincing. But I came to realize there is an additional argument which not only supports the case for RL training, but also requires it, in particular for models like ChatGPT. This additional argument is spelled out in (the first half of) a talk by John Schulman from OpenAI. This post pretty much

@arianvp
arianvp / SSH_MACOS_SECURE_ENCLAVES.md
Last active December 1, 2025 20:22
Native Secure Enclave backed ssh keys on MacOS

Native Secure Enclave backed ssh keys on MacOS

It turns out that MacOS Tahoe can generate and use secure-enclave backed SSH keys! This replaces projects like https://github.com/maxgoedjen/secretive

There is a shared library, /usr/lib/ssh-keychain.dylib, that has traditionally been used to add smartcard support to ssh by implementing the PKCS11Provider interface. Recently, however, it also started implementing SecurityKeyProvider, which supports loading keys directly from the Secure Enclave! SecurityKeyProvider is what is normally used to talk to FIDO2 devices (e.g. libfido2 can be used to talk to your Yubikey). However, you can now use it to talk to your Secure Enclave instead!
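
As a rough sketch of what pointing ssh at this library might look like: OpenSSH's `ssh_config` has a `SecurityKeyProvider` option that takes a path to a provider library. The fragment below assumes that setting it for all hosts is what you want, and that a Secure Enclave backed key has already been created (key creation is not shown here). Treat it as an illustration, not a verified setup.

```
# ~/.ssh/config (illustrative fragment; the "Host *" scope is an assumption)
Host *
    # Load security-key-style keys through Apple's keychain library
    # instead of the default built-in FIDO/USB HID support.
    SecurityKeyProvider /usr/lib/ssh-keychain.dylib
```

Scoping the option to specific `Host` entries instead of `Host *` would keep ordinary FIDO2 keys working through the default provider for other hosts.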