Jagatheesan Jag jagandecapri

  • Kuala Lumpur, Malaysia
@JoaoLages
JoaoLages / RLHF.md
Last active October 21, 2024 06:06
Reinforcement Learning from Human Feedback (RLHF) - a simplified explanation

Maybe you've heard about this technique but you haven't completely understood it, especially the PPO part. This explanation might help.

We will focus on text-to-text language models 📝, such as GPT-3, BLOOM, and T5. Models like BERT, which are encoder-only, are not addressed.

Reinforcement Learning from Human Feedback (RLHF) has been successfully applied in ChatGPT, hence its major increase in popularity. 📈

RLHF is especially useful in two scenarios 🌟:

  • You can’t create a good loss function
    • Example: how do you calculate a metric to measure if the model’s output was funny?
  • You want to train with production data, but you can’t easily label your production data
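import torch
from torch.utils.data import TensorDataset, random_split

torch.manual_seed(42)
# x and y are NumPy arrays defined earlier in the original snippet (not shown in this preview)
x_tensor = torch.from_numpy(x).float()
y_tensor = torch.from_numpy(y).float()
# Builds dataset with ALL data
dataset = TensorDataset(x_tensor, y_tensor)
# Splits randomly into train and validation datasets (the lengths must sum to len(dataset), here 100)
train_dataset, val_dataset = random_split(dataset, [80, 20])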
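
The random 80/20 split above can be illustrated without PyTorch. The following is a minimal plain-Python sketch of the general idea (shuffle the indices with a seeded generator, then cut them into consecutive chunks); it is not PyTorch's implementation, and the helper name `split_indices` is hypothetical.

```python
import random


def split_indices(n_items, lengths, seed=42):
    """Shuffle range(n_items) with a seeded RNG and cut it into consecutive chunks.

    Hypothetical helper for illustration only; it mimics the idea behind a
    random train/validation split, not any library's exact behavior.
    """
    if sum(lengths) != n_items:
        raise ValueError("lengths must sum to the dataset size")
    rng = random.Random(seed)      # local RNG so the split is reproducible
    indices = list(range(n_items))
    rng.shuffle(indices)           # random permutation of all sample indices
    splits, start = [], 0
    for length in lengths:
        splits.append(indices[start:start + length])
        start += length
    return splits


# 100 samples split into 80 training and 20 validation indices
train_idx, val_idx = split_indices(100, [80, 20])
```

Because the generator is seeded, calling `split_indices(100, [80, 20])` again yields the same two index lists, which is the role `torch.manual_seed(42)` plays in the snippet above.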
@martinsotir
martinsotir / conda_4.6_powershell.md
Last active October 7, 2024 15:34
Enable conda in powershell

Enabling conda in Windows Powershell

First, in an administrator command prompt, enable unrestricted PowerShell script execution (see About Execution Policies):

set-executionpolicy unrestricted

Then make sure that the conda Scripts directory is in your Path.

@0xjac
0xjac / private_fork.md
Last active November 18, 2024 00:09
Create a private fork of a public repository

The repository for the assignment is public, and GitHub does not allow the creation of private forks for public repositories.

The correct way of creating a private fork by duplicating the repo is documented here.

For this assignment the commands are:

  1. Create a bare clone of the repository. (This is temporary and will be removed so just do it wherever.)

git clone --bare git@github.com:usi-systems/easytrace.git

@iros
iros / force.reqanimframe.js
Created April 15, 2016 19:49
force layout with d3.timer instead of tick loop
var force = d3.layout.force()
.charge(-150)
.linkDistance(30)
.size([width, height]);
d3.json("assets/500nodes.json", function(error, graph) {
if (error) throw error;
// Task 2:
// Connect the force layout to the nodes and links in our dataset
@sebsto
sebsto / gist:19b99f1fa1f32cae5d00
Created August 8, 2014 15:53
Install Maven with Yum on Amazon Linux
sudo wget http://repos.fedorapeople.org/repos/dchen/apache-maven/epel-apache-maven.repo -O /etc/yum.repos.d/epel-apache-maven.repo
sudo sed -i s/\$releasever/6/g /etc/yum.repos.d/epel-apache-maven.repo
sudo yum install -y apache-maven
mvn --version
@crosbymichael
crosbymichael / mesos-ubuntu-install.sh
Created July 9, 2014 18:43
Install mesos on ubuntu 14.04
#!/bin/bash
set -e
apt-get install -y curl python-setuptools python-pip python-dev python-protobuf
# zookeeper
apt-get install -y zookeeperd
echo 1 | dd of=/var/lib/zookeeper/myid
@jessepollak
jessepollak / instaInterval.js
Last active February 11, 2018 12:52
setInterval + Immediately Invoked Function Expressions == instaInterval.
setInterval((function interval() {
// do something instantly then every 5 seconds
console.log('This is a better version of setInterval');
return interval;
})(), 5000);
@hofmannsven
hofmannsven / README.md
Last active November 10, 2024 13:48
Git CLI Cheatsheet
@bryhal
bryhal / gist:4129042
Created November 22, 2012 02:08
MYSQL: Generate Calendar Table
DROP TABLE IF EXISTS time_dimension;
CREATE TABLE time_dimension (
id INTEGER PRIMARY KEY, -- year*10000+month*100+day
db_date DATE NOT NULL,
year INTEGER NOT NULL,
month INTEGER NOT NULL, -- 1 to 12
day INTEGER NOT NULL, -- 1 to 31
quarter INTEGER NOT NULL, -- 1 to 4
week INTEGER NOT NULL, -- 1 to 52/53
day_name VARCHAR(9) NOT NULL, -- 'Monday', 'Tuesday'...