
@karpathy
karpathy / add_to_zshrc.sh
Created August 25, 2024 20:43
Git Commit Message AI
# -----------------------------------------------------------------------------
# AI-powered Git Commit Function
# Copy-paste this gist into your ~/.bashrc or ~/.zshrc to gain the `gcm` command. It:
# 1) gets the diff of the currently staged changes
# 2) sends it to an LLM to write the git commit message
# 3) lets you easily accept, edit, regenerate, or cancel
# But just read and edit the code however you like.
# The `llm` CLI util is awesome; get it here: https://llm.datasette.io/en/stable/
gcm() {
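
The preview cuts off at the function header above. A minimal sketch of how such a function might look, assuming only the `llm` CLI is installed; the actual gist's implementation differs:

gcm() {
  local diff msg choice
  diff="$(git diff --cached)"
  if [ -z "$diff" ]; then
    echo "gcm: no staged changes" >&2
    return 1
  fi
  while true; do
    # Pipe the staged diff to the LLM and ask for a one-line message
    msg="$(printf '%s' "$diff" | llm 'Write a concise one-line git commit message for this diff:')"
    printf 'Proposed message:\n%s\n' "$msg"
    printf '(a)ccept, (e)dit, (r)egenerate, (c)ancel? '
    read -r choice
    case "$choice" in
      a) git commit -m "$msg"; return ;;
      e) git commit -e -m "$msg"; return ;;   # open editor prefilled with the message
      r) ;;                                   # loop again for a fresh message
      *) echo "cancelled"; return 1 ;;
    esac
  done
}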
@ynott
ynott / multipass-on-bridged-network.md
Last active December 29, 2024 03:29
Instructions for running Multipass on a bridged network

1. Environment information

  • OS: Ubuntu 20.04.2 LTS (GNU/Linux 5.8.0-59-generic x86_64)
  • Network: 192.168.xxx.0/24
  • Ubuntu Multipass host machine IP: 192.168.xxx.yyy (static IP)
  • NIC: enp2s0 (host NIC attached to the bridge)
  • Bridge NIC: br0

2. Prerequisites
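
The preview cuts off here. A sketch of the usual prerequisite setup on a netplan-managed Ubuntu host, reusing enp2s0 and br0 from the environment above; the gateway/DNS address 192.168.xxx.1 is an assumption, adjust to your LAN:

# /etc/netplan/01-br0.yaml (sketch; addresses follow the environment above)
network:
  version: 2
  ethernets:
    enp2s0:
      dhcp4: false
  bridges:
    br0:
      interfaces: [enp2s0]
      addresses: [192.168.xxx.yyy/24]
      gateway4: 192.168.xxx.1        # assumed gateway
      nameservers:
        addresses: [192.168.xxx.1]   # assumed DNS

Then apply the configuration and attach an instance to the bridge (the --network flag needs Multipass 1.6 or later, and on Linux it may require the LXD driver):

sudo netplan apply
multipass launch --network br0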

@mda590
mda590 / stress_test.py
Created June 8, 2018 14:07
Python script useful for stress testing systems
"""
Produces load on all available CPU cores.
Requires system environment var STRESS_MINS to be set.
"""
from multiprocessing import Pool
from multiprocessing import cpu_count
import time
import os
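
The preview stops at the imports. A minimal sketch of the busy-loop approach the docstring describes; the function name `burn` and the exact loop body are illustrative, not the gist's actual code:

def burn(_):
    # Spin until STRESS_MINS minutes have elapsed, keeping one core busy
    end = time.time() + float(os.environ['STRESS_MINS']) * 60
    while time.time() < end:
        pass

if __name__ == '__main__':
    # One busy worker per available core
    with Pool(cpu_count()) as pool:
        pool.map(burn, range(cpu_count()))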
@allenyllee
allenyllee / reverse_sshfs.sh
Created November 8, 2017 03:53
reverse sshfs
#!/bin/bash
##/*
## * @Author: AllenYL
## * @Date: 2017-11-08 11:37:31
## * @Last Modified by: [email protected]
## * @Last Modified time: 2017-11-08 11:37:31
## */
#
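
The script body is cut off in this preview. The usual reverse-sshfs trick mounts a local directory on the remote host over a single SSH connection, using sshfs in slave mode plus dpipe from the vde2 package; a sketch, where user@remote and both paths are placeholders:

# Serve the local sftp-server over ssh and mount it on the remote side
dpipe /usr/lib/openssh/sftp-server = \
  ssh user@remote sshfs -o slave ":/path/on/local" /path/on/remote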
@michaellihs
michaellihs / tmux-cheat-sheet.md
Last active June 19, 2025 10:22
tmux Cheat Sheet
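
The cheat sheet itself is not shown in this preview; a few of the basics any tmux cheat sheet covers (all standard default bindings):

tmux new -s main        # start a new session named "main"
tmux attach -t main     # reattach to it later
# The default prefix is Ctrl-b:
#   prefix %   split pane left/right
#   prefix "   split pane top/bottom
#   prefix d   detach from the session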
@kashif
kashif / cem.md
Last active September 18, 2024 21:33
Cross Entropy Method

How do we solve the policy optimization problem, that is, maximizing the total reward given some parametrized policy?

Discounted future reward

To begin with, the total reward for an episode is the sum of all its rewards. If our environment is stochastic, we can never be sure we will get the same rewards the next time we perform the same actions, so the further we look into the future, the more the total future reward may diverge. For that reason it is common to use the discounted future reward, where the discount factor is between 0 and 1.
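
In symbols, with discount factor $\gamma \in [0, 1]$, the discounted return from step $t$ is

$$R_t = r_t + \gamma r_{t+1} + \gamma^2 r_{t+2} + \cdots = \sum_{k=0}^{\infty} \gamma^k r_{t+k}.$$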

A good strategy for an agent would be to always choose the action that maximizes the (discounted) future reward; in other words, we want to maximize the expected discounted reward per episode.

# Setup: imports and the CartPole environment used in the examples
import gym
import numpy as np
import torch
import torch.nn as nn
import matplotlib.pyplot as plt
from torch.optim import Adam
from torch.distributions import Categorical
from collections import namedtuple

env = gym.make('CartPole-v0')

@simme
simme / Install_tmux
Created October 19, 2011 07:55
Install and configure tmux on Mac OS X
# First install tmux
brew install tmux
# For mouse support (for switching panes and windows)
# Only needed if you are using Terminal.app (iTerm has mouse support)
# Install SIMBL: http://www.culater.net/software/SIMBL/SIMBL.php
# Then install MouseTerm: https://bitheap.org/mouseterm/
# More on mouse support: http://floriancrouzat.net/2010/07/run-tmux-with-mouse-support-in-mac-os-x-terminal-app/
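
On modern tmux (2.1 and later) mouse support is built in and the SIMBL/MouseTerm steps are no longer needed; one line in ~/.tmux.conf suffices:

# ~/.tmux.conf
set -g mouse on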