@mre
mre / bitonic_sort.cu
Last active March 3, 2025 11:51
Bitonic sort on CUDA. In a quick benchmark it ran 10x faster than the CPU version.
/*
 * Parallel bitonic sort using CUDA.
 * Compile with
 *   nvcc -arch=sm_11 bitonic_sort.cu
 * Based on http://www.tools-of-computing.com/tc/CS/Sorts/bitonic_sort.htm
 * License: BSD 3
 */
#include <stdlib.h>
#include <stdio.h>
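The preview above stops at its includes, so the kernel itself is not shown. As a rough illustration of the compare-and-swap network that a bitonic sort is built on, here is a minimal host-side C++ sketch. It is not the gist's CUDA code: the function name `bitonic_sort_host` is hypothetical, the element type is assumed to be `float`, and the input length is assumed to be a power of two.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper (not from the gist): sorts `values` in ascending order
// using the bitonic sorting network, executed sequentially on the CPU.
// Assumption: values.size() is a power of two.
void bitonic_sort_host(std::vector<float>& values) {
    const std::size_t n = values.size();
    assert(n != 0 && (n & (n - 1)) == 0);

    // k is the length of the bitonic sequences being merged in this stage.
    for (std::size_t k = 2; k <= n; k <<= 1) {
        // j is the distance between the two elements of each comparison.
        for (std::size_t j = k >> 1; j > 0; j >>= 1) {
            // Every iteration of this loop is independent of the others,
            // which is why a GPU version can give each index its own thread.
            for (std::size_t i = 0; i < n; ++i) {
                const std::size_t partner = i ^ j;
                if (partner > i) {
                    // Blocks of length k alternate between ascending and
                    // descending order; the last stage (k == n) is ascending.
                    const bool ascending = (i & k) == 0;
                    const bool out_of_order = ascending
                        ? values[i] > values[partner]
                        : values[i] < values[partner];
                    if (out_of_order) {
                        std::swap(values[i], values[partner]);
                    }
                }
            }
        }
    }
}
```

The two outer loops would typically stay on the host in a GPU version, with one kernel launch per `(k, j)` pair, while the innermost loop is the part that maps onto threads.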
@donny-dont
donny-dont / aligned_allocator.cpp
Created December 13, 2011 09:11
An aligned allocator for placing SIMD types in std::vector
#ifdef _WIN32
#include <malloc.h>
#endif
#include <cstdint>
#include <vector>
#include <iostream>
/**
* Allocator for aligned data.
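The allocator's body is cut off in this preview. For orientation only, the following is a minimal sketch of how an aligned allocator for `std::vector` might look. It swaps in C++17 aligned `operator new` rather than the platform-specific calls that the `<malloc.h>` include suggests, and the name `AlignedAllocatorSketch` is hypothetical, not the gist's class.

```cpp
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical sketch (not the gist's implementation): a minimal allocator
// that returns storage aligned to `Alignment` bytes.
// Assumptions: C++17, Alignment is a power of two and >= alignof(T).
template <typename T, std::size_t Alignment>
struct AlignedAllocatorSketch {
    static_assert(Alignment != 0 && (Alignment & (Alignment - 1)) == 0,
                  "Alignment must be a power of two");

    using value_type = T;

    // An explicit rebind is required because the template takes a
    // non-type parameter, which allocator_traits cannot rebind by itself.
    template <typename U>
    struct rebind {
        using other = AlignedAllocatorSketch<U, Alignment>;
    };

    AlignedAllocatorSketch() noexcept = default;

    template <typename U>
    AlignedAllocatorSketch(const AlignedAllocatorSketch<U, Alignment>&) noexcept {}

    T* allocate(std::size_t n) {
        // Guard against overflow in n * sizeof(T).
        if (n > static_cast<std::size_t>(-1) / sizeof(T)) {
            throw std::bad_array_new_length();
        }
        void* p = ::operator new(n * sizeof(T), std::align_val_t(Alignment));
        return static_cast<T*>(p);
    }

    void deallocate(T* p, std::size_t) noexcept {
        // Must be paired with the aligned form of operator delete.
        ::operator delete(p, std::align_val_t(Alignment));
    }
};

// The allocator is stateless, so any two instances compare equal.
template <typename T, typename U, std::size_t A>
bool operator==(const AlignedAllocatorSketch<T, A>&,
                const AlignedAllocatorSketch<U, A>&) noexcept {
    return true;
}

template <typename T, typename U, std::size_t A>
bool operator!=(const AlignedAllocatorSketch<T, A>&,
                const AlignedAllocatorSketch<U, A>&) noexcept {
    return false;
}
```

With this sketch, a declaration such as `std::vector<float, AlignedAllocatorSketch<float, 32>> data(100);` should yield a buffer whose `data()` pointer is a multiple of 32, which is the kind of guarantee aligned SIMD loads rely on.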
@goldsborough
goldsborough / conv.cu
Last active February 2, 2025 09:14
Convolution with cuDNN
#include <cudnn.h>
#include <cassert>
#include <cstdlib>
#include <iostream>
#include <opencv2/opencv.hpp>
#define checkCUDNN(expression) \
{ \
cudnnStatus_t status = (expression); \
if (status != CUDNN_STATUS_SUCCESS) { \