Keren Zhou Jokeren

  • George Mason University
  • Fairfax
@Jokeren
Jokeren / main.ptx
Last active April 4, 2023 23:20
PTX undefined behavior
//
// Generated by LLVM NVPTX Back-End
//
.version 8.0
.target sm_80
.address_size 64
// .globl triton__0d1d2d3d4d56d7d89d1011d1213d1415d1617d1819d2021d2223d2425d2627d2829d3031d3233d3435d3637d3839d4041d42d
.extern .shared .align 1 .b8 global_smem[];
Jokeren / Instruction.md
Created March 1, 2023 06:13
fp16 mov reproducer

Install

git clone https://github.com/openai/triton.git;
cd triton/python;
pip install cmake; # build time dependency
pip install -e .
pip uninstall pytorch-triton -y

Expected result (-0.1250)

Jokeren / ptx
Created February 28, 2023 03:29
bug.ptx
//
// Generated by LLVM NVPTX Back-End
//
.version 8.0
.target sm_80
.address_size 64
// .globl triton__0d1d2d3d
.visible .entry triton__0d1d2d3d(
.param .u64 triton__0d1d2d3d_param_0,
.param .u64 triton__0d1d2d3d_param_1,
Jokeren / main.cc
Last active August 16, 2021 04:58
ld_preload + api
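#include <dlfcn.h>
#include "tool.h"
int main() {
  //void *handle = dlopen("./tool.so", RTLD_NOW);
  // RTLD_NEXT finds the next "print" after this object in search order,
  // e.g. one provided by a library loaded through LD_PRELOAD.
  print_t func = (print_t)dlsym(RTLD_NEXT, "print");
  if (func == nullptr) {
    return 1;  // symbol not found; calling it would crash
  }
  func();
  return 0;
}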
Jokeren / waka
Created October 16, 2020 22:38
waka
waka
Jokeren / discussion.md
Last active March 24, 2024 10:47
Discussion

Multiplexer

  1. How many operations are still pending?

  2. How does a thread ensure its activities are transferred before it ends? Because metrics are stored in thread-local maps, the thread must collect all of its outstanding activities and attribute them before it exits.

     opencl-api.c
    
     device_finalizer_register vs thread finalizer
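
One way to picture the thread-finalizer option in question 2 is a thread-local object whose destructor flushes its map into a process-wide store when the thread exits. This is only a minimal sketch under stated assumptions: `GlobalMetrics`, `ThreadMetrics`, `record`, and `run_workers` are hypothetical names made up for illustration, and nothing here is taken from opencl-api.c or from `device_finalizer_register`.

```cpp
#include <cstdint>
#include <map>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical process-wide store that outlives every worker thread.
struct GlobalMetrics {
  std::mutex lock;
  std::map<std::string, std::uint64_t> totals;

  void merge(const std::map<std::string, std::uint64_t> &local) {
    std::lock_guard<std::mutex> guard(lock);
    for (const auto &entry : local) {
      totals[entry.first] += entry.second;
    }
  }
};

GlobalMetrics &global_metrics() {
  static GlobalMetrics instance;
  return instance;
}

// Hypothetical per-thread map. Its destructor acts as the thread finalizer:
// it runs when the owning thread exits and hands the remaining metrics
// to the global store.
struct ThreadMetrics {
  std::map<std::string, std::uint64_t> local;
  ~ThreadMetrics() { global_metrics().merge(local); }
};

// Record a metric without taking any lock; only the owning thread touches it.
void record(const std::string &name, std::uint64_t value) {
  thread_local ThreadMetrics metrics;
  metrics.local[name] += value;
}

// Start workers, let each record some metrics, and return the merged total
// once every worker has exited and flushed its map.
std::uint64_t run_workers(int num_threads, int records_per_thread) {
  std::vector<std::thread> workers;
  for (int t = 0; t < num_threads; ++t) {
    workers.emplace_back([records_per_thread] {
      for (int i = 0; i < records_per_thread; ++i) {
        record("kernel_time", 1);
      }
    });
  }
  for (auto &worker : workers) {
    worker.join();
  }
  std::lock_guard<std::mutex> guard(global_metrics().lock);
  return global_metrics().totals["kernel_time"];
}
```

The sketch only covers metrics the thread has already recorded. Activities that the device delivers after the thread has exited would still need a device-level finalizer or a similar mechanism, which is the trade-off the note above raises.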
    

pending_operations
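
For question 1, a pending-operations count could be kept as a single atomic counter that is incremented when an operation is submitted and decremented when it completes. The class below is a hypothetical illustration under that assumption (the names `PendingOperations`, `on_submit`, `on_complete`, and `drained` are invented here), not the actual `pending_operations` design.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical counter of operations that were submitted but have not
// completed yet. Safe to update from several threads at once.
class PendingOperations {
 public:
  // Call when an operation is enqueued.
  void on_submit() { count_.fetch_add(1); }

  // Call from the completion callback of an operation.
  void on_complete() { count_.fetch_sub(1); }

  // Number of operations still in flight.
  std::uint64_t pending() const { return count_.load(); }

  // True once every submitted operation has completed.
  bool drained() const { return pending() == 0; }

 private:
  std::atomic<std::uint64_t> count_{0};
};
```

A finalizer could poll `drained()` before it attributes the remaining metrics, so that it does not run while operations are still in flight.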

Jokeren / week
Created October 12, 2020 01:47
week
week
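#include <iostream>
#include <chrono>
// noinline keeps the compiler from inlining the loop into the caller.
void __attribute__ ((noinline)) init(int *arr, size_t length) {
  // Use size_t for the index so it matches the type of length.
  for (size_t i = 0; i < length; ++i) {
    arr[i] = 1;
  }
}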
Jokeren / segment tree 2d.cpp
Created October 27, 2019 23:29
segment tree 2d
class NumMatrix {
 private:
  std::vector<std::vector<int> > tree;
  void columnInit(std::vector<int> &tree_column, std::vector<int> &matrix_column) {
    size_t columns = matrix_column.size();
    for (size_t i = columns; i < tree_column.size(); ++i) {
      tree_column[i] = matrix_column[i - columns];
    }
    for (size_t i = columns - 1; i > 0; --i) {