
TengdaHan / ddp_notes.md
Last active November 1, 2024 05:58

Multi-node-training on slurm with PyTorch

What's this?

  • A simple note on how to start multi-node training on a SLURM scheduler with PyTorch (a minimal initialization sketch follows this list).
  • Especially useful when the scheduler is so busy that you cannot get multiple GPUs allocated on a single node, or when you need more than 4 GPUs for a single job.
  • Requirement: you have to use PyTorch DistributedDataParallel (DDP) for this purpose.
  • Warning: you might need to refactor your own code.
  • Warning: you might be secretly condemned by your colleagues for using too many GPUs.
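
A minimal initialization sketch (not the gist's full script), assuming one task per GPU is launched with `srun` and that `MASTER_ADDR`/`MASTER_PORT` are exported in the sbatch script (e.g. via `scontrol show hostnames $SLURM_JOB_NODELIST | head -n 1`):

```python
# Sketch only: initialize PyTorch DDP from SLURM environment variables.
# Assumes one srun task per GPU and MASTER_ADDR/MASTER_PORT set in the job script.
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def setup_distributed():
    rank = int(os.environ["SLURM_PROCID"])         # global rank across all nodes
    world_size = int(os.environ["SLURM_NTASKS"])   # total number of processes
    local_rank = int(os.environ["SLURM_LOCALID"])  # rank within this node

    dist.init_process_group(
        backend="nccl",
        init_method="env://",  # reads MASTER_ADDR / MASTER_PORT from the environment
        world_size=world_size,
        rank=rank,
    )
    torch.cuda.set_device(local_rank)
    return rank, local_rank, world_size

if __name__ == "__main__":
    rank, local_rank, world_size = setup_distributed()
    model = torch.nn.Linear(10, 10).cuda(local_rank)  # toy model for illustration
    model = DDP(model, device_ids=[local_rank])
    if rank == 0:
        print(f"initialized {world_size} processes")
    dist.destroy_process_group()
```

The same pattern carries over to a real training loop: wrap the model in DDP once per process and give each DataLoader a DistributedSampler so every rank sees a distinct shard of the data.
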
danielgross / specific_gpt.py
Created January 23, 2023 15:23
A chat interface that drives GPT-3 towards more specific answers.
"""Stream a response from the OpenAI completion API."""
import os
import re
import sys
import time
import random
import openai
openai.api_key = open(os.path.expanduser("~/.openai")).read().strip()
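
The gist continues beyond this excerpt. As a rough sketch of the streaming call it builds on, using the pre-1.0 `openai` library imported above (the model name and prompt handling here are illustrative assumptions, not the gist's actual code):

```python
# Sketch only: stream tokens from the legacy Completion endpoint (openai < 1.0).
# The model name and prompt are placeholders, not values from the gist.
import sys
import openai

def stream_completion(prompt: str) -> str:
    pieces = []
    for chunk in openai.Completion.create(
        model="text-davinci-003",  # assumption: any completion-capable model
        prompt=prompt,
        max_tokens=256,
        stream=True,               # yields partial completions as they arrive
    ):
        piece = chunk["choices"][0]["text"]
        sys.stdout.write(piece)
        sys.stdout.flush()
        pieces.append(piece)
    return "".join(pieces)
```
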
nmwsharp / printarr
Last active August 15, 2024 01:43
Pretty print tables summarizing properties of tensor arrays in numpy, pytorch, jax, etc. Now on pip! `pip install arrgh` (https://github.com/nmwsharp/arrgh)
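
A hedged usage sketch: the `arrgh` import below is an assumption based on the pip package name (the gist's filename suggests the underlying function is `printarr`), so check the repo README for the exact entry point:

```python
# Usage sketch only -- the import name is an assumption based on the pip
# package name; the gist's filename suggests the function is `printarr`.
import numpy as np
import torch
from arrgh import arrgh  # pip install arrgh

x = np.random.rand(128, 3)                    # numpy array
y = torch.zeros(16, 16, dtype=torch.float16)  # torch tensor
z = 42.0                                      # plain scalar

# Prints one table row per argument: name, dtype, shape, device, min/max, etc.
arrgh(x, y, z)
```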