@sonots
Last active September 2, 2019 02:39
Investigation of cudaMalloc alignment => aligned to at least **512** bytes
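#include <cuda_runtime.h>
#include <stdint.h>
#include <stdio.h>

// Allocate two buffers of `size` bytes back to back and print the distance
// in bytes between the two device addresses.
void test(size_t size)
{
    float *d1, *d2;
    cudaMalloc(&d1, size);
    cudaMalloc(&d2, size);
    // Compare the raw addresses: subtracting pointers that belong to
    // different allocations is undefined behavior.
    uintptr_t a1 = (uintptr_t)d1;
    uintptr_t a2 = (uintptr_t)d2;
    printf("Alignment: %zu\n", (size_t)(a2 > a1 ? a2 - a1 : a1 - a2));
    cudaFree(d1);
    cudaFree(d2);
}

int main(int argc, char **argv)
{
    // set up device
    int dev = 0;
    cudaDeviceProp deviceProp;
    cudaGetDeviceProperties(&deviceProp, dev);
    printf("Using Device %d: %s\n", dev, deviceProp.name);
    cudaSetDevice(dev);

    // request 1 byte so the spacing reflects the allocator, not the payload
    test(1);
    return 0;
}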
leofang commented Sep 2, 2019

I was also looking into this property. My theory for the 512-byte alignment is that a device pointer returned by cudaMalloc may be used to back texture memory, which requires the data to be aligned to textureAlignment (= 512 bytes on V100). See, e.g., the cudaResourceDesc section in the CUDA Runtime API documentation.
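One way to test this theory is to look at the address itself rather than the spacing between two allocations. The following is a minimal host-side sketch, not part of the gist: the helper names `address_alignment` and `is_aligned_to` are hypothetical, and the code is plain C++ that should also compile unchanged inside a `.cu` file.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Largest power of two that divides the address, i.e. the alignment the
// pointer actually has. Returns 0 for a null pointer.
// (Hypothetical helper, introduced here for illustration.)
std::size_t address_alignment(const void *p)
{
    std::uintptr_t a = reinterpret_cast<std::uintptr_t>(p);
    return static_cast<std::size_t>(a & (~a + 1));  // isolate the lowest set bit
}

// True if the address is a multiple of `alignment`, which must be non-zero.
// (Hypothetical helper, introduced here for illustration.)
bool is_aligned_to(const void *p, std::size_t alignment)
{
    assert(alignment != 0);
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}
```

In the gist's program, one could pass `d1` from `cudaMalloc` together with `deviceProp.textureAlignment` to `is_aligned_to`, and print `address_alignment(d1)` next to the spacing. If the theory holds on a given device, the check should return true for every allocation.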
