https://devtalk.nvidia.com/default/topic/933827/cuda-programming-and-performance/fast-256-bin-histogram/
http://www.cse.uconn.edu/~zshi/course/cse5302/ref/chhugani08sorting.pdf
http://link.springer.com/chapter/10.1007/978-3-642-23397-5_16
http://arxiv.org/abs/1008.2849 Faster Radix Sort via Virtual Memory and Write-Combining, Jan Wassenberg, Peter Sanders
https://devtalk.nvidia.com/default/topic/378826/cuda-programming-and-performance/my-speedy-sgemm/post/2703033/#2703033
https://devtalk.nvidia.com/default/topic/390366/cuda-programming-and-performance/instruction-latency/post/2768197/#2768197
https://devtalk.nvidia.com/default/topic/913832/cuda-programming-and-performance/sum-reduction-working-in-fermi-kepler-and-maxwell/
https://devtalk.nvidia.com/default/topic/776043/cuda-programming-and-performance/whats-new-in-maxwell-sm_52-gtx-9xx-/1
https://devtalk.nvidia.com/default/topic/690631/cuda-programming-and-performance/so-whats-new-about-maxwell-/post/4305310/#4305310
#include "cuda_runtime.h"
#include "device_launch_parameters.h"
#include <stdio.h>
#define DATA_SIZE (1 << 29)
#define DATA_ACCESSES (1 << 8)
#define BLOCK_SIZE 128
#define BLOCKS_COUNT 1024
template<int COUNT, int PAGE_SIZE, typename T>
#include <stdio.h>
#include <stdlib.h>
#include <cuda.h>
#include <omp.h>
#include <helper_functions.h>
#include <helper_cuda.h>
#include <cuda_runtime.h>
void fill_matrix(int *A, int fac, int m, int n)
{
/*
 * srcoder.c for the OpenBWT project
 * Copyright (c) 2008-2010 Yuta Mori. All Rights Reserved.
 *
 * Permission is hereby granted, free of charge, to any person
 * obtaining a copy of this software and associated documentation
 * files (the "Software"), to deal in the Software without
 * restriction, including without limitation the rights to use,
 * copy, modify, merge, publish, distribute, sublicense, and/or sell
 * copies of the Software, and to permit persons to whom the
template< int Mode=0 >
struct MTF {
  enum{ CNUM=256 };
  typedef short ranktypeF;
  typedef byte ranktypeB;
  ranktypeF RankF[CNUM];
  ranktypeB RankB[CNUM];
C:\!FreeArc\public\FARSH\SMHasher>a xxh64
-------------------------------------------------------------------------------
--- Testing XXH64 (xxHash, 64-bit result)                     --- Testing XXH64 (xxHash, 64-bit result)
[[[ Speed Tests ]]]                                           [[[ Speed Tests ]]]
Bulk speed test - 262144-byte keys                            Bulk speed test - 262144-byte keys
Alignment 0 - 4.441 bytes/cycle - 12706.75 MiB/sec @ 3 ghz    Alignment 0 - 3.516 bytes/cycle - 10058.15 MiB/sec @ 3 ghz
Alignment 1 - 4.408 bytes/cycle - 12610.07 MiB/sec @ 3 ghz    Alignment 1 - 4.390 bytes/cycle - 12559.72 MiB/sec @ 3 ghz
#include <stdio.h>
#include <math.h>
#include <algorithm>
int main (int argc, char **argv)
{
  if(argc!=2) {printf("Usage: prime N\n Prints prime number higher or equal to N\n"); return 0;}
  unsigned long long N, N0;
  sscanf (argv[1], "%llu", &N); N0=N;
  unsigned m = std::min((unsigned long long)(sqrt(N))+10, N);
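The preview above is cut off before the search itself, so the author's method is not visible here. As a rough illustration of what the usage string promises (the smallest prime greater than or equal to N), here is a minimal sketch using plain trial division. The names `is_prime` and `next_prime` are hypothetical and do not come from the original file, and the sketch assumes N is small enough that a prime is reached before 64-bit overflow.

```cpp
#include <cstdint>

// Hypothetical helper (not from the original file):
// trial division by 2 and by odd numbers up to sqrt(n).
bool is_prime(std::uint64_t n) {
    if (n < 2) return false;
    if (n < 4) return true;            // 2 and 3 are prime
    if (n % 2 == 0) return false;
    // The condition d <= n / d is equivalent to d * d <= n but cannot overflow.
    for (std::uint64_t d = 3; d <= n / d; d += 2)
        if (n % d == 0) return false;
    return true;
}

// Hypothetical helper: smallest prime p with p >= n.
// Assumption: such a prime exists below 2^64, so ++n never wraps around.
std::uint64_t next_prime(std::uint64_t n) {
    while (!is_prime(n)) ++n;
    return n;
}
```

Calling `next_prime(N)` after parsing N with `sscanf` as in the snippet would give the value to print. Trial division costs about sqrt(N) steps per candidate, which is adequate for a one-off command-line query.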
/* GHC_PACKAGES base rts
*/
#include "Stg.h"
#include "HsBase.h"
START_MOD_INIT(__stginit_Main,__stginit_Main_)
EF_(__stginit_SystemziEnvironment_);
EF_(__stginit_SystemziIOziUnsafe_);
EF_(__stginit_DataziChar_);
EF_(__stginit_Prelude_);
REGISTER_IMPORT(__stginit_SystemziEnvironment_);
---------------------------------------------------------------------------------------------------
---- "Communicating Sequential Processes", as described in Hoare's book.                       ----
---------------------------------------------------------------------------------------------------
-- |
-- Module : Process
-- Copyright : (c) Bulat Ziganshin <[email protected]>
-- License : Public domain
--
-- Maintainer : [email protected]
-- Stability : experimental