Build colmap with CUDA support

Instructions are for Ubuntu 22.04, with some notes for Ubuntu 24.04 but I didn't test it on 24.04, only on 22.04.

We'll install the following versions:

Ubuntu 22.04 with nvidia-driver-580 and cuda 12.9.1 (from nvidia apt repository)
cmake 3.28.4 (installed via pip)
boost 1.74.0 (that comes from apt)
googletest 1.16.0 (cloned in ceres-solver/third_party)
abseil-cpp 20250127.1 (cloned in ceres-solver/third_party)
ceres-solver master (unreleased 2.3.0)
eigen 3.4.0 (from apt)
libcudss 0.6.0 (from nvidia apt repository)
colmap 3.13.0.dev0 (main branch, see exact commit in the build instructions below)
glomap with colmap/glomap#201

Install nvidia drivers

Install nvidia latest drivers with

sudo apt update
sudo ubuntu-drivers autoinstall

This installed nvidia-driver-580. Reboot the machine.

Install cuda and cudss

Remove previous cuda versions:

sudo apt remove --purge nvidia-cuda-* cuda-*

Install cuda and cudss following instructions on https://developer.nvidia.com/cuda-12-9-1-download-archive (Linux, x86_64, Ubuntu, 22.04, deb (network)) and https://developer.nvidia.com/cudss-downloads

You end up with the commands:

wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-keyring_1.1-1_all.deb
sudo dpkg -i cuda-keyring_1.1-1_all.deb
sudo apt-get update
sudo apt-get -y install cuda-toolkit-12-9
sudo apt-get -y install cuda-drivers
sudo apt-get -y install cudss

This installed cuda-toolkit-12-9, cuda-drivers, cuda-drivers-580 and cudss 0.6.0.

Install cmake

Install a more recent version of cmake only on Ubuntu 22.04 (Ubuntu 22.04 ships with cmake 3.22 that is too old, and we need cmake >= 3.28 for glomap, Ubuntu 24.04 ships with 3.28.3)

sudo apt remove --purge cmake cmake-curses-gui cmake-data
sudo pip install cmake==3.28.4

Configure environment variables to use gcc 10.5.0 (Ubuntu 22.04 only)

To make that documentation reproducible for others, I deactivated conda environment with conda deactivate otherwise colmap build started to use boost 1.82 from the conda environement that I got for an old nerfstudio install. It should use boost 1.74 from the system.

It was reported by ichsan2895 that boost 1.85.0 also works, and 1.87 fails.

Only needed for Ubuntu 22.04, use gcc 10.5.0, that's used to build everything from now on for this terminal session:

sudo apt-get install gcc-10 g++-10
export CC=/usr/bin/gcc-10
export CXX=/usr/bin/g++-10
export CUDAHOSTCXX=/usr/bin/g++-10
conda deactivate

Install ceres-solver

Instructions from http://ceres-solver.org/installation.html but here is what I executed:

git clone https://github.com/ceres-solver/ceres-solver
cd ceres-solver
git checkout 93e66f0d9480ea1d6022793f073d682717d85897 # Sun Aug 17 10:23:50 2025, latest at the time
sudo apt-get install libgoogle-glog-dev libgflags-dev libeigen3-dev libsuitesparse-dev

I didn't install libatlas-base-dev to be sure the build uses MKL for BLAS & LAPACK

The submodule in third_party includes googletest v1.16.x branch and abseil-cpp lts_2025_01_27 branch (but a commit before 20250127.1), see https://github.com/ceres-solver/ceres-solver/commit/8a545eb46b6aae9c91861bb1104c7cdc530487ee remove them and explicitly get latest compatible releases:

cd third_party/
rm -rf abseil-cpp googletest
git clone -b 20250127.1 https://github.com/abseil/abseil-cpp.git --recursive
git clone -b v1.16.0 https://github.com/google/googletest --recursive
cd ..

Build:

mkdir build
cd build
cmake .. -DCMAKE_CUDA_ARCHITECTURES:STRING=all-major -DCMAKE_CUDA_COMPILER:FILEPATH=/usr/local/cuda-12/bin/nvcc -Dcudss_DIR=/usr/lib/x86_64-linux-gnu/libcudss/12/cmake/cudss/

This should output:

-- The C compiler identification is GNU 10.5.0
-- The CXX compiler identification is GNU 10.5.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/gcc-10 - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/g++-10 - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Performing Test HAVE_BIGOBJ
-- Performing Test HAVE_BIGOBJ - Failed
-- Looking for pow in m
-- Looking for pow in m - found
-- Detected Ceres version: 2.3.0 from /home/vincentfretin/ceres-solver/include/ceres/version.h
-- Using the version of abseil in ceres-solver/third_party/abseil-cpp with version 20250127
-- Performing Test ABSL_INTERNAL_AT_LEAST_CXX17
-- Performing Test ABSL_INTERNAL_AT_LEAST_CXX17 - Success
-- Performing Test ABSL_INTERNAL_AT_LEAST_CXX20
-- Performing Test ABSL_INTERNAL_AT_LEAST_CXX20 - Failed
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE
-- Using the version of googletest in ceres-solver/third_party/googletest
-- Found Eigen version 3.4.0: /usr/share/eigen3/cmake
-- Enabling use of Eigen as a sparse linear algebra library.
-- Found CUDA version 12.9.86 installed in: /usr/local/cuda
-- The CUDA compiler identification is NVIDIA 12.9.86
-- Detecting CUDA compiler ABI info
-- Detecting CUDA compiler ABI info - done
-- Check for working CUDA compiler: /usr/local/cuda-12/bin/nvcc - skipped
-- Detecting CUDA compile features
-- Detecting CUDA compile features - done
-- Setting CUDA Architecture to 50;60;70;75;80;90
-- Detected and included cudss-static-targets.cmake
-- Found cudss: (Version:0.6.0
               CMakePackageDir:/usr/lib/x86_64-linux-gnu/libcudss/12/cmake/cudss
               IncludeDir:/usr/include
               LibraryDir:/usr/lib/x86_64-linux-gnu/libcudss/12/.
               ComponentsFound:[cudss;cudss_static])
-- Found LAPACK library: /usr/lib/x86_64-linux-gnu/libmkl_intel_lp64.so;/usr/lib/x86_64-linux-gnu/libmkl_intel_thread.so;/usr/lib/x86_64-linux-gnu/libmkl_core.so;/usr/lib/x86_64-linux-gnu/libiomp5.so;-lm;-ldl;-lm;-ldl
-- Found CHOLMOD headers in: /usr/include/suitesparse
-- Found CHOLMOD library: /usr/lib/x86_64-linux-gnu/libcholmod.so
-- Found SPQR headers in: /usr/include/suitesparse
-- Found SPQR library: /usr/lib/x86_64-linux-gnu/libspqr.so
-- Found Config headers in: /usr/include/suitesparse
-- Found Config library: /usr/lib/x86_64-linux-gnu/libsuitesparseconfig.so
-- Found AMD headers in: /usr/include/suitesparse
-- Found AMD library: /usr/lib/x86_64-linux-gnu/libamd.so
-- Found CAMD headers in: /usr/include/suitesparse
-- Found CAMD library: /usr/lib/x86_64-linux-gnu/libcamd.so
-- Found CCOLAMD headers in: /usr/include/suitesparse
-- Found CCOLAMD library: /usr/lib/x86_64-linux-gnu/libccolamd.so
-- Found COLAMD headers in: /usr/include/suitesparse
-- Found COLAMD library: /usr/lib/x86_64-linux-gnu/libcolamd.so
-- Found Intel Thread Building Blocks (TBB) library (2021.5.0). Assuming SuiteSparseQR was compiled with TBB.
-- Looking for shm_open in rt
-- Looking for shm_open in rt - found
-- Adding librt to SuiteSparse_config libraries (required on Linux & Unix [not OSX] if SuiteSparse is compiled with timing).
-- Found METIS: /usr/include (found version "5.1.0")
-- Looking for cholmod_metis
-- Looking for cholmod_metis - found
-- Found SuiteSparse: /usr/include/suitesparse (found suitable version "5.10.1", minimum required is "4.5.6") found components: CHOLMOD SPQR Partition Config AMD CAMD CCOLAMD COLAMD
-- Found SuiteSparse 5.10.1, building with SuiteSparse.
-- Building without Apple's Accelerate sparse support.
-- Failed to find Google benchmark library, disabling build of benchmarks.
-- Building Ceres as a static library.
-- No build type specified; defaulting to CMAKE_BUILD_TYPE=Release.

Continue:

make -j$(nproc)
sudo make install

Install colmap

Note

It installs /usr/local/include/absl /usr/local/include/gtest /usr/local/include/ceres and you actually may already have /usr/include/gtest /usr/include/ceres from ubuntu apt if you have libgtest-dev libgmock-dev libceres-dev installed, but colmap build later will correctly uses the one from /usr/local/include/

But I explicitly don't install those later.

Note

I found the cudcss path with

dpkg -l|grep cudss
dpkg -L libcudss0-dev-cuda-12

About the CUDA_ARCHITECTURES option: https://cmake.org/cmake/help/latest/prop_tgt/CUDA_ARCHITECTURES.html#prop_tgt:CUDA_ARCHITECTURES

We follow the instructions on https://colmap.github.io/install.html but we removed cmake libgtest-dev libgmock-dev libceres-dev from the list. And don't install nvidia-cuda-toolkit, we already installed cuda via nvidia instructions.

sudo apt-get remove libgtest-dev libgmock-dev libceres-dev libceres2 googletest
sudo apt-get install git ninja-build build-essential libboost-program-options-dev libboost-graph-dev libboost-system-dev libeigen3-dev libfreeimage-dev libmetis-dev libgoogle-glog-dev libsqlite3-dev libglew-dev qtbase5-dev libqt5opengl5-dev libcgal-dev libcurl4-openssl-dev libssl-dev libmkl-full-dev

I answered No to the question "Do you want to use libmkl_rt.so as an alternative by default to BLAS/LAPACK?"

git clone https://github.com/colmap/colmap.git
cd colmap
git checkout 98277ba2a42a398de98dda0c36d7bde1940cb974 # 2025-08-29 07:58
mkdir build
cd build
cmake .. -GNinja -DBLA_VENDOR=Intel10_64lp -DCMAKE_CUDA_ARCHITECTURES:STRING=all-major -DCMAKE_CUDA_COMPILER:FILEPATH=/usr/local/cuda-12/bin/nvcc -Dcudss_DIR=/usr/lib/x86_64-linux-gnu/libcudss/12/cmake/cudss/
ninja
sudo ninja install

Install pycolmap with CUDA support

The pre-built wheel doesn't have CUDA support, you need to build pycolmap from source to have CUDA support as described in https://github.com/colmap/colmap/tree/main/python

When building pycolmap, I got that error, it seems MKL version on the system is too old:

Building stubs with /usr/bin/python3 to /tmp/tmpdp21adh7/build
INTEL MKL ERROR: /lib/x86_64-linux-gnu/libmkl_def.so: undefined symbol: mkl_sparse_optimize_bsr_trsm_i8.
Intel MKL FATAL ERROR: Cannot load libmkl_def.so.
ninja: build stopped: subcommand failed.

so using a newer version of MKL in a conda environment fixed that but I got another error after that because libstdc++.so.6 in conda environment is too old:

ImportError: ~/miniconda3/lib/libstdc++.so.6: version `GLIBCXX_3.4.30' not found (required by /tmp/tmppuev2s0r/build/_core.cpython-313-x86_64-linux-gnu.so)
ninja: build stopped: subcommand failed.

If you check symbols with strings ~/miniconda3/lib/libstdc++.so.6 | grep GLIBCXX it's indeed missing GLIBCXX_3.4.30 and strings /usr/lib/x86_64-linux-gnu/libstdc++.so.6 | grep GLIBCXX have it.

so the fix was to link to system libstdc++.so.6 that is a more recent version.

So I ended up with the following that worked.

Install miniconda from instructions in https://www.anaconda.com/docs/getting-started/miniconda/install#linux-2 that is:

mkdir -p ~/miniconda3
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O ~/miniconda3/miniconda.sh
bash ~/miniconda3/miniconda.sh -b -u -p ~/miniconda3
rm ~/miniconda3/miniconda.sh

source ~/miniconda3/bin/activate

conda init --all

Install MKL in this conda environment:

conda install mkl mkl-include
# Successfully installed intel-cmplr-lib-ur-2025.2.1 intel-openmp-2025.2.1 mkl-2025.2.0 mkl-include-2025.2.0 tbb-2022.2.0 tcmlib-1.4.0 umf-0.11.0
cd ~/miniconda3/lib
mv libstdc++.so.6 libstdc++.so.6.old
ln -s /usr/lib/x86_64-linux-gnu/libstdc++.so.6 libstdc++.so.6

and build pycolmap:

cd ~/colmap
conda activate  # in case you're rebuilding colmap and pycolmap and already have miniconda installed, be sure to activate it
CMAKE_ARGS="-DBLA_VENDOR=Intel10_64lp -DCMAKE_CUDA_ARCHITECTURES:STRING=all-major -DCMAKE_CUDA_COMPILER:FILEPATH=/usr/local/cuda-12/bin/nvcc -Dcudss_DIR=/usr/lib/x86_64-linux-gnu/libcudss/12/cmake/cudss/" pip install .

Install other dependencies for the panorama_sfm.py script:

pip install opencv-python Pillow scipy tqdm py360convert
# Successfully installed Pillow-11.3.0 numpy-2.2.6 opencv-python-4.12.0.88 py360convert-1.0.4 scipy-1.16.1

Example of using the panorama_sfm.py script

The panorama_sfm.py script is using the rig feature in colmap that was released in colmap 3.12.0 (announcement on X), creating virtual cameras from an equirectangular image.

Example extraction of 360 images from video, from start time 00:03:04 to 00:04:04 (60 seconds), it extracts 4 equirectangular images for each second.

ffmpeg -ss 00:03:04 -to 00:04:04 -i input.mp4 -qscale:v 1 -qmin 1 -vf fps=4 ~/gaussiansplats/insta360_one_r/images360/%06d.jpg

instead of using the ffmpeg command you can also use a Python script that do that ffmpeg command and select the sharpest frames, get the script:

cd ~
git clone [email protected]:SharkWipf/nerf_dataset_preprocessing_helper.git

If you need to add additional filters to the ffmpeg command you can edit the 01_filter_raw_data.py file, for example:

diff --git a/01_filter_raw_data.py b/01_filter_raw_data.py
index 43a88b9..6ad8850 100644
--- a/01_filter_raw_data.py
+++ b/01_filter_raw_data.py
@@ -14,11 +14,13 @@ def extract_frames(input_vid, output_path):
         os.makedirs(output_path)
     cmd = [
         'ffmpeg',
+        '-ss', '00:03:04', '-to', '00:04:04',
         '-i', input_vid,
         '-vsync', 'vfr',
         '-v', 'quiet',
         '-stats',
         '-q:v', '1',  # Use highest quality for JPEG
+        '-vf', 'scale=2*ih:ih',
         os.path.join(output_path, 'frame%05d.jpg')
     ]
     subprocess.run(cmd)

There is a web (very slow to extract frames) and Windows app of a similar tool where you can specify the video start and end: https://sharp-frames.reflct.app/

and run it:

python ~/nerf_dataset_preprocessing_helper/01_filter_raw_data.py --input_path input.mp4 --output_path ~/gaussiansplats/insta360_one_r/images360 --target_count 240

target_count 240 here is the same as fps=4 for a 1 minute video.

Here is a patch panorama_sfm.py.patch against colmap main branch I generated with git diff > panorama_sfm.py.patch

See the patch

diff --git a/python/examples/panorama_sfm.py b/python/examples/panorama_sfm.py
index 58282c2c..33ff403b 100644
--- a/python/examples/panorama_sfm.py
+++ b/python/examples/panorama_sfm.py
@@ -14,6 +14,7 @@ import cv2
 import numpy as np
 import PIL.ExifTags
 import PIL.Image
+import py360convert
 from scipy.spatial.transform import Rotation
 from tqdm import tqdm

@@ -87,7 +88,7 @@ def spherical_img_from_cam(image_size, rays_in_cam: np.ndarray) -> np.ndarray:
     return np.stack([u, v], -1) * image_size


-def get_virtual_rotations(
+def old_get_virtual_rotations(
     num_steps_yaw: int, pitches_deg: Sequence[float]
 ) -> Sequence[np.ndarray]:
     """Get the relative rotations of the virtual cameras w.r.t. the panorama."""
@@ -104,6 +105,34 @@ def get_virtual_rotations(
     return cams_from_pano_r


+pitch_yaw_pairs = [
+    (0, 90), #Reference Pose
+    (0, 0),
+    (42, 0),
+    (-30, -10), # (-30, 0) originally
+    (0, 42),
+    (0, -42),
+    (5, 180), # (0, 180) originally
+    (42, 180),
+#     (-30, 180), # we have the person on this camera
+    (0, 222),
+    (10, 138), # (0, 138) originally
+]
+
+
+def get_virtual_rotations(
+    num_steps_yaw: int, pitches_deg: Sequence[float]
+) -> Sequence[np.ndarray]:
+    """Custom virtual camera rotations defined by exact pitch/yaw angles."""
+    cams_from_pano_r = []
+    for pitch_deg, yaw_deg in pitch_yaw_pairs:
+        cam_from_pano_r = Rotation.from_euler(
+            "XY", [-pitch_deg, -yaw_deg], degrees=True
+        ).as_matrix()
+        cams_from_pano_r.append(cam_from_pano_r)
+    return cams_from_pano_r
+
+
 def create_pano_rig_config(
     cams_from_pano_rotation: Sequence[np.ndarray], ref_idx: int = 0
 ) -> pycolmap.RigConfig:
@@ -201,17 +230,31 @@ class PanoProcessor:

         for cam_idx, cam_from_pano_r in enumerate(self.cams_from_pano_rotation):
             rays_in_pano = self._rays_in_cam @ cam_from_pano_r
-            xy_in_pano = spherical_img_from_cam(self._pano_size, rays_in_pano)
-            xy_in_pano = xy_in_pano.reshape(
-                self._camera.width, self._camera.height, 2
-            ).astype(np.float32)
-            xy_in_pano -= 0.5  # COLMAP to OpenCV pixel origin.
-            image = cv2.remap(
+            # xy_in_pano = spherical_img_from_cam(self._pano_size, rays_in_pano)
+            # xy_in_pano = xy_in_pano.reshape(
+            #     self._camera.width, self._camera.height, 2
+            # ).astype(np.float32)
+            # xy_in_pano -= 0.5  # COLMAP to OpenCV pixel origin.
+            # image = cv2.remap(
+            #     pano_image,
+            #     *np.moveaxis(xy_in_pano, [0, 1, 2], [2, 1, 0]),
+            #     cv2.INTER_LINEAR,
+            #     borderMode=cv2.BORDER_WRAP,
+            # )
+
+            # Get pitch/yaw from original list
+            r = pitch_yaw_pairs[cam_idx]
+            pitch, yaw = r[0], r[1]
+
+            # Run projection
+            image = py360convert.e2p(
                 pano_image,
-                *np.moveaxis(xy_in_pano, [0, 1, 2], [2, 1, 0]),
-                cv2.INTER_LINEAR,
-                borderMode=cv2.BORDER_WRAP,
+                fov_deg=(self.render_options.hfov_deg, self.render_options.vfov_deg),
+                u_deg=yaw,
+                v_deg=pitch,
+                out_hw=(self._camera.height, self._camera.width),
             )
+
             # We define a mask such that each pixel of the panorama has its
             # features extracted only in a single virtual camera.
             closest_camera = np.argmax(
@@ -302,7 +345,7 @@ def run(args):
     pycolmap.extract_features(
         database_path,
         image_dir,
-        reader_options={"mask_path": mask_dir},
+#        reader_options={"mask_path": mask_dir},
         camera_mode=pycolmap.CameraMode.PER_FOLDER,
     )

@@ -315,7 +358,7 @@ def run(args):
     # verification using rig constraints.
     matching_options.rig_verification = True
     # The images within a frame do not have overlap due to the provided masks.
-    matching_options.skip_image_pairs_in_same_frame = True
+    # matching_options.skip_image_pairs_in_same_frame = True
     if args.matcher == "sequential":
         pycolmap.match_sequential(
             database_path,
@@ -338,6 +381,7 @@ def run(args):
         logging.fatal(f"Unknown matcher: {args.matcher}")

     opts = pycolmap.IncrementalPipelineOptions(
+        ba_use_gpu=1,
         ba_refine_sensor_from_rig=False,
         ba_refine_focal_length=False,
         ba_refine_principal_point=False,

I also shared the modified file here

Be sure to change the pitch_yaw_pairs list for your own dataset, the camera number is the index in this list, starting with 0. Here I modified camera 3, 6, 10 and removed camera 8 by commenting it to not have the cameraman in the images. By commenting it that means camera 9 is the pano_camera8 created directory and camera 9 is the pano_camera8 directory if you look at the drawing below.

If you're interested in speeding things up. You can constrain the SIFT max_num_features option with 768 for example. I found that tip here. The point cloud will be less dense and maybe the camera poses less accurate, so that may impact the GS training results later. You could also try another value like 1024, 2048 or 4096 for example. From the colmap source code the maximum is set to 8192.

With:

    pycolmap.extract_features(
        database_path,
        image_dir,
        extraction_options={
            "sift": {"max_num_features": 768}
        },
        camera_mode=pycolmap.CameraMode.PER_FOLDER,
    )

Instead of :

I0831 19:55:33.409615 1004552 incremental_pipeline.cc:434] Registering image #1921 (num_reg_frames=239)
I0831 19:55:33.409633 1004552 incremental_pipeline.cc:437] => Image sees 1918 / 4231 points
I0831 20:18:09.502020 1004552 timer.cc:90] Elapsed time: 145.866 [minutes]
I0831 20:18:10.992136 1004552 panorama_sfm.py:run:397] #0 Reconstruction:
        num_rigs = 1
        num_cameras = 10
        num_frames = 240
        num_reg_frames = 240
        num_images = 2400
        num_points3D = 666538
        num_observations = 7852289
        mean_track_length = 11.7807
        mean_observations_per_image = 3271.79
        mean_reprojection_error = 0.642201

real    175m4,626s
666538 points in 175 minutes

you will have something like this:

I0831 12:06:32.387889 356469 incremental_pipeline.cc:434] Registering image #1201 (num_reg_frames=239)
I0831 12:06:32.387907 356469 incremental_pipeline.cc:437] => Image sees 448 / 704 points
I0831 12:08:13.885629 356469 timer.cc:90] Elapsed time: 16.576 [minutes]
I0831 12:08:14.083770 356469 panorama_sfm.py:run:397] #0 Reconstruction:
        num_rigs = 1
        num_cameras = 10
        num_frames = 240
        num_reg_frames = 240
        num_images = 2400
        num_points3D = 105632
        num_observations = 1193920
        mean_track_length = 11.3026
        mean_observations_per_image = 497.467
        mean_reprojection_error = 1.00961

real    22m40,152s
105632 points in 22 minutes

so it can be 8 times faster!

Note

ChatGPT about the mean_reprojection_error metric: 0.5–1 px is typical for 1080 images, and for 2–4k images good reconstructions are often down around 0.2–0.6 px. It’s the average distance (in pixels) between where COLMAP projects a 3D point back into an image vs. where the feature was actually detected.

Imagine the same physical error on the scene (say, the point is 1 mm off in 3D).
On a 1080p image (1920×1080), 1 px might cover ~0.05° of field of view.
On a 4K image (3840×2160), 1 px covers ~0.025° (half as wide).
That means the same reprojection error in pixels corresponds to a smaller angular/physical error at higher resolution.

Thanks Don'tTellHarry for the initial pitch_yaw_pairs values he shared here on Discord. There was also another change to translate all cameras, you can find it here on Discord with the fixed rotation but it didn't give a good result when I tested it, the splats were flickering in the brush trained gaussian splats.

You can save the patch in the colmap repo and apply it with patch -p1 < panorama_sfm.py.patch

The changes enable GPU for colmap mapper BA, remove usage of the masks as we use overlapping cameras on both pitch/yaw, each camera is overlapping by half another camera, and we need all extracted features we can get. It also change the image creation from cv2 implementation (that gaves some vertical lines artifacts on some images) to py360convert e2p. Default 12 cameras config was 3 rows (pitch at -35, 0, -35) of 4 non overlapping on yaw (horizontally) cameras every 90°. The new set of cameras here is 11 cameras that are a reference camera (camera 0), and 5 cameras for each lens avoiding the stitching areas. The images/pano_camera0 directory that is on the stitching area can be removed before GS training.

The camera projections looks like this (link to the Python matplotlib script on discord)

To run the panorama_sfm.py script from this dataset:

cd ~/colmap/python/examples
time python panorama_sfm.py --input_image_path ~/gaussiansplats/insta360_one_r/images360/ --output_path ~/gaussiansplats/insta360_one_r/dataset/

You can visualize the camera poses with colmap gui

colmap gui --image_path ~/gaussiansplats/insta360_one_r/dataset2/images --database_path ~/gaussiansplats/insta360_one_r/dataset2/database.db --import_path ~/gaussiansplats/insta360_one_r/dataset2/sparse/0/

Note

If you don't see all images, you may have several components (set of camera poses), meaning in the sparse directory you have 0, 1, 2, 3. I had that issue using the default panorama_sfm.py script that uses masks and skip_image_pairs_in_same_frame = True on a specific dataset, my guess is there weren't enough feature points.

See colmap gui documentation how to move around the gui. To reduce the size of the red planes, you can use Alt+scroll down, that kind of just zoom out (is it the same on Windows?), but then you zoom in again with scroll up without Alt and the red planes will be smaller. Do it again for it to be even smaller.

For brush later use OUTPUTDIR=~/gaussiansplats/insta360_one_r/dataset

Masking persons

If you need to create masks to hide the persons, look at SAM2/YOLO. gradeeterna shared a video showing this and he's working on a gradio ui (comment on Discord), mainly detecting persons on fisheyes images.

Also mentionned by SharkWipf: Grounded SAM 2: Ground and Track Anything in Videos with Grounding DINO, Florence-2 and SAM 2

https://github.com/ngoclinhle/frame_extractor has a YOLO integration. It uses the yolov8x model to detect persons and draw a black box on them you can use as a mask for GS training. It also creates a mask for vegetation to ignore the feature points on it when estimating poses. It kind of works on equirectangular images. It can mask the cameraman if he's visible fully on one side, but it's not working if the cameraman is split into two with a part on the left and a part on the right. And this doesn't mask the shadow of people, that's not possible with that model alone.

If you want to give it a try:

git clone [email protected]:ngoclinhle/frame_extractor.git
cd frame_extractor
conda create -n frame_extractor
conda activate frame_extractor
pip install -r requirements.txt

From the author jimmykun8520 on Discord: Originally I wrote it to extract frame from drone footage. There was a lot of trees and Metashape struggled a lot. Simply masking all the green pixels improved the pose estimation a lot.

Alternatives for splitting the 360 images

Other tools mentionned on Discord

py360convert mentioned by jonstephens85 that fixes the vertical artifacts we see on some images with the cv2 implementation in panoram_sfm.py
ffmpeg with the v360 filter, see lileaLab's script example that was taken from that other article Note from Vincent: that's exactly the same quality as with py360convert, with py360convert you can use the 360 images already extracted and selected from the video.
360-to-planer-images (Python)
Equirect2Cube (Python)

Muhammad Ichsan: nerfstudio supports equirectangular images as input for ns-process-data images. It uses torch. opencv, and colmap under the hood.

ns-process-data images \
    --data input_equirectangular \
    --output-dir NsProcess_Equirect8 \
    --camera-type equirectangular \
    --images-per-equirect 8 --verbose

split4splat by CinematicToBe (Windows)
360 Video - Stills Prep Tool
360 Video Still Cropper (Windows), example on X by Mojon
Olli Huttunen's Blender add-on, must watch:
- Check out This 3D Scanner Build made from three Insta360 cameras
- How to Extract images from 360 videos with this Blender add-on

Renaming image files to train with Postshot

Tavius shared on Discord and X how he used the panorama_sfm.py and then do the training with Postshot. He shared two Python scripts to be be run after the panorama_sfm.py script to rename the images for Postshot that need unique filenames. It is not needed for brush. The first update_image_names.py script renames in place the images outputed by the panorama_sfm.py script with the 8 hard coded cameras (that was with 3.12.5, now on main branch the default is 12 so be sure to modify the script, if the last directory in the images directory is pano_camera11, then use range(12) in the script) transforming

images
  pano_camera0
    image0001.png
  pano_camera1
    image0001.png
  ...
  pano_camera7
    image0001.png

images
  pano_camera0
    image0001_0.png
  pano_camera1
    image0001_1.png
  ...
  pano_camera7
    image0001_7.png

and the second script update_txt_names.py updates the file path in colmap images.txt

You can use those commands to convert between colmap TXT and BIN formats:

mkdir sparse_text
colmap model_converter --input_path sparse/0 --output_path sparse_text --output_type TXT
# do your changes in sparse_text/
colmap model_converter --input_path sparse_text --output_path sparse/0 --output_type BIN

Install glomap

git clone [email protected]:colmap/glomap.git
cd glomap
git fetch origin pull/201/head:pr-201
git checkout pr-201
mkdir build
cd build

# sudo apt-get install libflann-dev libboost-filesystem-dev # because it's currently using old version of colmap, shouldn't be needed with https://github.com/colmap/glomap/pull/201

conda deactivate
cmake .. -GNinja -DBLA_VENDOR=Intel10_64lp -DCMAKE_CUDA_ARCHITECTURES:STRING=all-major -DCMAKE_CUDA_COMPILER:FILEPATH=/usr/local/cuda-12/bin/nvcc -Dcudss_DIR=/usr/lib/x86_64-linux-gnu/libcudss/12/cmake/cudss/
ninja
sudo ninja install

Docker image (TODO)

Base the docker image from https://github.com/colmap/colmap/tree/main/docker and adapt to build latest ceres-solver with the instructions above.

Thanks

Special thanks to Muhammad Ichsan (ichsan2895 on GitHub) for sharing his notes about compiling colmap with CUDA support. I adapted his instructions with other sources and using latest versions.

Relevant comments that helped writting this documentation:

Creating your own dataset

phone

With Pixel 9 Pro, you can take 4032x3024 12MP photos with the UltraWide camera (zoom 0.5). I used Manual Camera DSLR Pro to be able to fix the focus, white balance, shutter speed, iso and auto take photos every 1s.

360 cameras

At the time (Sept 2025), the best consumer cameras are

Qoocam 3 Ultra
Insta360 X5
DJI Osmo 360

gradeeterna did this scan (youtube) from a 8k video shot with Qoocam 3 Ultra in K-Log. Each circular fisheye is 4k, he broke those up in multiple perspectives and trained on those close to original res. (comment shared on Discord)

What's my current process to 360 footage?

That would be recording with an Insta360 X5 in 8k with Manual settings to lock the white balance (5000K), shutter speed (1/200 or higher 1/250 ... to avoid motion blur, iso 100 or higher). I switch to Auto to see what the image should look like and go to Manual to lock shutter speed and iso. In Insta360 Studio, cut the beginning and end of the video you don't want (you can also do that later with ffmpeg), export the video with Direction Lock checked so the cameraman is always at the same place (you have a similar option for DJI Studio for the OSMO 360).

Find the correct pitch/yaw values from Insta360 Studio of some of the virtual cameras (6 8 9 10 in the drawing if the cameraman is at the bottom right) that will be needed for the panorama_sfm.py script. You need to click on the circle plus icon Add Keyframe to have that, set FOV control to 90, Distortion Control to 0, Pan Angle (yaw) value and find the correct Tilt Angle (pitch) where you don't the person's head anymore.

Execute ffmpeg to extract the frames 2fps or 4fps depending of the walk speed of the cameraman.

Full frame DSLR camera and lens

About camera, lens, shutter speed, iso, aperture, some insights in this Discord comment and discussion above it.

Mixing several sources

I asked ChatGPT to produce those three paragraphs based on COLMAP documentation and that good comment

When using COLMAP/Glomap, the choice between --ImageReader.single_camera, --ImageReader.single_camera_per_folder, or nothing determines how intrinsics are shared across images. By default (no flag), each image can have its own calibration, which is the most flexible but also the least stable since the solver has to optimize many more parameters. This is useful if zoom or focal length changes, since each image can then adapt its own intrinsics.

--ImageReader.single_camera 1 enforces a single calibration for all images, which is efficient and stable if they truly come from the same camera settings (fixed lens, no zoom). But if the dataset mixes different focal lengths, devices, or zoom levels, forcing a single camera model will lead to distortions: COLMAP will try to compensate for focal changes by incorrectly moving cameras closer or farther from the scene.

--ImageReader.single_camera_per_folder 1 strikes a balance: all images within a folder share intrinsics, but different folders can represent different cameras or zoom levels. This is recommended if you know which subsets of images share the same calibration (e.g., DSLR vs. drone, or wide vs. zoom shots). Organizing images this way reduces overfitting while still respecting intrinsic differences where they exist, leading to more accurate and stable reconstructions.

Training with brush on planar images

Brush currently uses a MCMC/ADC hybrid algorithm to train the splats on undistorted images only.

Don't use the web version because sadly training in browser is still a bit slower (can't do some GPU optimizations, can't do threaded data loading), and is limited to 3GB datasets (32 bits wasm). For some stuff it works fine tho! (from Arthur, brush maintainer)

Use the native version straight from the main branch, limited installation instructions here. For those not familiar with rust development tools, here is a more detailed installation procedure.

Install rust with

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

If you previously installed rust, you can update it with

rustup self update
rustup update

Get brush:

git clone [email protected]:ArthurBrussee/brush.git
cd brush

Run the gui with

cargo run --release

Later if you want to update, just use git pull and run it again. If you did that several times, you may remove the target folder to start fresh to free some disk space because the builds accumulate over time. You may want to note what commit you previously used, you can find it with git log.

In the gui, select your colmap directory that contains images, masks (optional), sparse (with the 0 directory in it that includes the .bin files), adjust the options if needed.

You can also run it on the command line with or without --with-viewer, example:

OUTPUTDIR=~/gaussiansplats/insta360_one_r/dataset
cargo run --release -- --with-viewer --export-path ${OUTPUTDIR}_results --sh-degree 0 --total-steps 40000 --max-splats 7000000 --growth-select-fraction 0.2 $OUTPUTDIR

I had better results with --growth-select-fraction 0.2 (default is 0.1) to grow the splats more quickly.

Other useful options I used:

--max-resolution 1008 or --max-resolution 2016 (the default is 1920) to train on a lower resolution of my 4032x3024 photos taken with my Pixel 9 Pro. The splats trained with 1008 is okay, and with 2016 it's only a bit better in quality. Training on the full resolution is really slow to train on my nvidia 3070 8GB so I didn't try it.

Follow the brush development on the MrNeRF & Brush discord, link in https://x.com/janusch_patas bio.

Exporting from Reality Scan (RS) to train with brush

See comments on Discord

Training directly on fisheyes images

gradeeterna: Trained directly on Insta360 X5 fisheyes with gsplat's 3DGUT https://www.youtube.com/watch?v=7AOcPGJCOns

You can watch How to Use 3DGUT with gsplat on Windows (Full Tutorial) also how to use colmap gui at 20:26 (I usually only use it on the command line, I don't have a clue how to do the steps on the gui)

Or with LichtFeld Studio and --gut option.

If you record with an Insta360 camera, you can rename the .insv file to .mp4 to view the fisheyes recording. On X3 you have two files, one for each lens. On X5 you have a single .insv file but two video tracks in it. See Olli Huttunen: 360 File Hacks - Handling Insta360 File Types

Training directly on equirectangular images

https://junboli-cn.github.io/spags/

Useful commands and tools

Check GPU usage on Ubuntu

Thanks SharkWipf for this tip on Discord: Instead of watch nvidia-smi (that calls nvidia-smi command every 2 seconds), use nvidia-smi -l, which remains connected to the GPU (allowing it to properly adjust/sleep), instead of reconnecting and being forcibly woken up every time via watch. Another nice tool to vizualize the stats is nvtop, you can install it with sudo apt install nvtop

Resize all images by half with imagemagick

mogrify -path ./images_2 -resize 50% ./images/*.jpg

Remove floaters, rotate splats

Supersplat editor

https://superspl.at/editor

Here is my personal workflow to clean up the ply file, fix the rotation and move splats the origin, eventually remove some floaters:

use mouse and double click to find the zone of interest
select sphere in tool, double click where you want to place it, increase radius, move it
click Set, move again or increase radius if needed and reclick Set, in menu click Selection and Invert and press del key
hide splats with spacebar
set position to 0, 0, 0
modify rotation x, move y position, and try to match the grid, and verify it with rotation left right with the mouse
maybe modify rotation z as well
tune position y, and then x and z to place the blue/red line intersection where you want to spawn

Resources

https://zenn.dev/lilealab/books/how-to-photogrammety

vincentfretin/training_install_colmap.md

Build colmap with CUDA support

Install nvidia drivers

Install cuda and cudss

Install cmake

Configure environment variables to use gcc 10.5.0 (Ubuntu 22.04 only)

Install ceres-solver

Install colmap

Install pycolmap with CUDA support

Example of using the panorama_sfm.py script

Masking persons

Alternatives for splitting the 360 images

Renaming image files to train with Postshot

Install glomap

Docker image (TODO)

Thanks

Creating your own dataset

phone

360 cameras

What's my current process to 360 footage?

Full frame DSLR camera and lens

Mixing several sources

Training with brush on planar images

Exporting from Reality Scan (RS) to train with brush

Training directly on fisheyes images

Training directly on equirectangular images

Useful commands and tools

Check GPU usage on Ubuntu

Resize all images by half with imagemagick

Remove floaters, rotate splats

Supersplat editor

Kiri Engine blender addon

Splats formats conversion

Web viewers

Resources

vincentfretin commented Aug 25, 2025

Uh oh!