// -*- compile-command: "nvcc -m 32 -arch sm_50 -Xptxas=-v,-abi=no -cubin int_mul.cu" ; -*-
#include <stdint.h>
//
//
//
#define KERNEL_QUALIFIERS __global__
#define KERNEL_QUALIFIERS_EXTERN extern KERNEL_QUALIFIERS
// Compiles fine until ptxas, which reports invalid arguments, so at least the f16x2-modified atom instruction is recognized?
// Note: the PTX ISA spells this instruction atom.add.noftz.f16x2 and requires sm_60 or higher.
u32
atomf16x2(u32 a, u32 b)
{
  u32 d;
  asm("atom.global.add.f16x2 %0, [%1], %2;" : "=r"(d) : "r"(a), "r"(b));
  //atom.global.add.u32 %r5, [%rd2], 10;
  //asm("mul.wide.s16 %0, %1, %2;" : "=r"(d) : "h"(a), "h"(b));
# Approximate recipe for compiling and running OpenSWR on an AWS instance (Ubuntu 14.04)
# Questions regarding this recipe: @__rej__
# http://openswr.org
# prerequisites
sudo apt-get update
sudo apt-get install git
sudo apt-get install build-essential
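# A minimal sketch for checking that the prerequisites above ended up on PATH.
# The function name missing_tools is hypothetical and not part of the recipe;
# it assumes git comes from the git package and gcc, g++ and make from build-essential.

```shell
# Hypothetical helper (not part of the original recipe): print every
# command name from the argument list that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
        fi
    done
}

# Example call (prints nothing when everything is installed):
# missing_tools git gcc g++ make
```

# Empty output means every listed tool was found; any printed name still needs installing.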
# Prerequisites
- Enable DRI3 as described in https://vulkan.lunarg.com/app/docs/v1.0.3.1/getting_started_linux
- Install a library for SHA, e.g. sudo apt-get install libgcrypt11-dev (if not already present)
- Otherwise the driver may fail with an error mentioning "_mesa_sha1_compute" when loading SPIR-V shaders
# Building
- Clone Mesa master: git clone git://anongit.freedesktop.org/mesa/mesa -b master
- cd mesa
- autoreconf -vfi
- ./configure --with-dri-drivers=i965 --with-gallium-drivers= --with-sha1= --with-vulkan-drivers=intel
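
The build steps listed so far could be wrapped in a small script with a dry-run mode, so the commands can be reviewed before anything is cloned or configured. This is only a sketch: run and build_mesa are hypothetical names, the branch name is assumed to be the lowercase master (git branch names are case-sensitive), and the configure flags are copied exactly as they appear above, including the empty --with-sha1= value.

```shell
# Sketch only: run() and build_mesa() are hypothetical names, not part of Mesa.
# With DRY_RUN=1 each command is printed with a "+ " prefix instead of executed.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# The four steps from the list above, stopping at the first failure.
build_mesa() {
    run git clone git://anongit.freedesktop.org/mesa/mesa -b master &&
    run cd mesa &&
    run autoreconf -vfi &&
    run ./configure --with-dri-drivers=i965 --with-gallium-drivers= --with-sha1= --with-vulkan-drivers=intel
}
```

Calling `DRY_RUN=1 build_mesa` prints the four commands; calling `build_mesa` without DRY_RUN executes them in order.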
#include <wrl/client.h>
#include <winml.h>
#include <stdio.h>
using Microsoft::WRL::ComPtr;
#include "cnpy.h"
#define PRINTDBG
int main(int argc, char** argv)
directml.dll metacommands found:
copytensor
reduce
gemm
pooling
roipooling
convolution
normalization
mvn
rnn
name of display: :0
display: :0  screen: 0
direct rendering: Yes
server glx vendor string: SGI
server glx version string: 1.4
server glx extensions:
    GLX_ARB_create_context, GLX_ARB_create_context_no_error,
    GLX_ARB_create_context_profile, GLX_ARB_create_context_robustness,
    GLX_ARB_fbconfig_float, GLX_ARB_framebuffer_sRGB, GLX_ARB_multisample,
    GLX_EXT_create_context_es2_profile, GLX_EXT_create_context_es_profile,
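
For output like the listing above, a quick way to test for one extension is to split the text into tokens and look for an exact match. The sketch below assumes the glxinfo-style text arrives on stdin; has_glx_extension is a hypothetical helper name, not a glxinfo feature.

```shell
# Hypothetical helper: read glxinfo-style text on stdin and succeed (exit 0)
# only if the extension named in $1 appears as a whole token.
has_glx_extension() {
    # Turn commas and spaces into newlines, then require an exact,
    # fixed-string, whole-line match so prefixes do not count.
    tr ', ' '\n\n' | grep -qxF -- "$1"
}

# Example:
# glxinfo | has_glx_extension GLX_ARB_multisample && echo "supported"
```

Matching whole tokens avoids false positives such as GLX_ARB_create_context matching GLX_ARB_create_context_profile.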
==========
VULKANINFO
==========
Vulkan Instance Version: 1.1.82
WARNING: [Loader Message] Code 0 : loader_scanned_icd_add: Using deprecated ICD interface of 'vkGetInstanceProcAddr' instead of 'vk_icdGetInstanceProcAddr' for ICD talvos-vulkan.dll
Instance Extensions: