- A simple note on how to start multi-node training on the Slurm scheduler with PyTorch.
- Especially useful when the scheduler is so busy that you cannot get multiple GPUs allocated, or when you need more than 4 GPUs for a single job.
- Requirement: you have to use PyTorch DistributedDataParallel (DDP) for this purpose.
- Warning: you might need to refactor your own code.
- Warning: you might be secretly condemned by your colleagues for using too many GPUs.
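As a rough illustration of the DDP requirement above, the sketch below maps the per-task variables that Slurm sets (`SLURM_PROCID`, `SLURM_NTASKS`, `SLURM_LOCALID`) to the rank layout DDP expects. It is not the note's own code. The helper names `slurm_dist_env` and `wrap_model_in_ddp` are hypothetical, and the sketch assumes one Slurm task per GPU and that the job script exports `MASTER_ADDR` and `MASTER_PORT`.

```python
import os


def slurm_dist_env(environ=None):
    """Map Slurm's per-task variables to the rank layout DDP expects."""
    env = os.environ if environ is None else environ
    return {
        "rank": int(env["SLURM_PROCID"]),         # global rank of this task
        "world_size": int(env["SLURM_NTASKS"]),   # total tasks across all nodes
        "local_rank": int(env["SLURM_LOCALID"]),  # task index within its node
    }


def wrap_model_in_ddp(model, environ=None):
    """Initialise the process group and wrap `model` in DDP.

    Assumption: one Slurm task per GPU, launched with srun.
    """
    # Imported lazily so the helper above works without PyTorch installed.
    import torch
    import torch.distributed as dist
    from torch.nn.parallel import DistributedDataParallel as DDP

    cfg = slurm_dist_env(environ)
    # env:// reads MASTER_ADDR and MASTER_PORT, assumed exported by the job script.
    dist.init_process_group(
        backend="nccl",
        init_method="env://",
        rank=cfg["rank"],
        world_size=cfg["world_size"],
    )
    torch.cuda.set_device(cfg["local_rank"])
    model = model.cuda(cfg["local_rank"])
    return DDP(model, device_ids=[cfg["local_rank"]])
```

Each task started by `srun` would call `wrap_model_in_ddp(model)` once before the training loop. Existing single-GPU code usually also needs a `DistributedSampler` on its data loader, which is part of the refactoring the warning above refers to.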
import torch
import torch.nn as nn
import torch.nn.functional as F

class SpatialSoftArgmax(nn.Module):
    """Spatial softmax as defined in [1].
    Concretely, the spatial softmax of each feature
    map is used to compute a weighted mean of the pixel
if args.enable_torques:
    print("ASSUMING THE WITNESS POINTS ARE DEFINED WITH RESPECT TO THE CENTER OF MASS!!! CHANGE BODY FRAME ORIGIN IF THIS IS NOT TRUE")
    # for each contact point, setup the product of binary-continuous for w*b
    # sum all the torques for a single object
    # setup obj-obj first
    for i in range(config.num_internal_bodies):
        net_torque_to_object_i = np.zeros(3)
        for j in range(config.num_internal_bodies):
            net_torque_to_i_from_j = np.zeros(3)
import os
import torch
from tqdm import tqdm
import time

# declare which gpu device to use
cuda_device = '0'

def check_mem(cuda_device):
    devices_info = os.popen('"/usr/bin/nvidia-smi" --query-gpu=memory.total,memory.used --format=csv,nounits,noheader').read().strip().split("\n")
import pyglet
import pyglet.gl as gl
import numpy as np
import os
import tempfile
import subprocess
import collections
There is a longstanding issue (a missing feature or a bug) with sockets in Docker on macOS. It may never work, so for the foreseeable future you'll need to use a network connection between Docker containers and X11 on macOS.
I started from this gist and made some adjustments:
- the volume mappings aren't relevant/used, due to the socket issue above.
- this method only allows X11 connections from your Mac, not the entire local network, which would include everyone on the café/airport WiFi.
- updated to use the host.docker.internal name for the container host instead.
- you have to restart XQuartz after the config change.
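To make the `host.docker.internal` point concrete, here is a minimal sketch that builds a `docker run` command pointing the container's `DISPLAY` at the Mac over the network instead of a mounted socket. The helper `docker_x11_command`, the image name, and display number 0 are assumptions for illustration, not taken from the original gist. It also assumes XQuartz has already been configured to accept network connections from the local machine.

```python
def docker_x11_command(image, display_number=0, extra_args=()):
    """Build a `docker run` argv that sends X11 traffic to the macOS host.

    `image` is a hypothetical image name; display 0 is an assumption.
    """
    # host.docker.internal resolves to the host from inside the container.
    display = f"host.docker.internal:{display_number}"
    return ["docker", "run", "--rm", "-e", f"DISPLAY={display}", *extra_args, image]


if __name__ == "__main__":
    # Print the command rather than running it, so the sketch has no side effects.
    print(" ".join(docker_x11_command("my-gui-image")))
```

No volume mapping for the X11 socket appears in the command, in line with the first adjustment listed above.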
from multiprocessing import Pool
from functools import partial

def _pickle_method(method):
    func_name = method.im_func.__name__
    obj = method.im_self
    cls = method.im_class
    if func_name.startswith('__') and not func_name.endswith('__'):  # deal with mangled names
        cls_name = cls.__name__.lstrip('_')
        func_name = '_' + cls_name + func_name
#!/usr/bin/python
# -*- coding:utf-8 -*-
"""
Idea and code was taken from stackoverflow().
This sample illustrates how to
+ how to pass method of instance method
to multiprocessing(idea and code was introduced
at http://goo.gl/tRHN1D by torek).
from graphviz import Digraph
import torch
from torch.autograd import Variable, Function

def iter_graph(root, callback):
    queue = [root]
    seen = set()
    while queue:
        fn = queue.pop()
        if fn in seen:
import open3d
import numpy as np

def main():
    vis = open3d.visualization.Visualizer()
    vis.create_window("Pose Visualizer")
    vis.get_render_option().line_width = 10.0
    obb = open3d.geometry.OrientedBoundingBox(center=np.array([0.0,0.0,0.0]), R=np.eye(3), extent=np.array([1.0, 1.0, 1.0]))