Steve Bronder SteveBronder

benchmark               time      cpu       iterations
add_constref_bench/2    64.5 ns   64.4 ns   10927660
add_constref_bench/2    65.8 ns   65.8 ns   10927660
add_constref_bench/2    64.6 ns   64.6 ns   10927660
add_constref_bench/2    63.8 ns   63.8 ns   10927660
add_constref_bench/2    66.2 ns   66.2 ns   10927660
add_constref_bench/2    66.2 ns   66.2 ns   10927660
add_constref_bench/2    65.0 ns   65.0 ns   10927660
add_constref_bench/2    65.5 ns   65.5 ns   10927660
add_constref_bench/2    65.0 ns   65.0 ns   10927660
// Code generated by %%NAME%% %%VERSION%%
#include <stan/model/model_header.hpp>
namespace sparse_model_namespace {
template <typename T, typename S>
std::vector<T> resize_to_match__(std::vector<T> &dst,
                                 const std::vector<S> &src) {
  dst.resize(src.size());
  return dst;
library(cmdstanr)
library(data.table)
library(ggplot2)
library(ggridges)
library(rstan)
library(posterior)
set_cmdstan_path("~/your_cmdstan_path")
# statcomp/benchmarks is working directory
gp_lst = rstan::read_rdump("./gp_pois_regr/gp_pois_regr.data.R")
---
high_dim_gauss
---
[turing]
Benchmark results
Compilation time: 3.0011221113333337 (approximately)
Running time: 2.4342513616666666 +/- 0.14597526149472717 (3 runs)
Forward time: 0.000411603
Gradient time (forwarddiff): 0.526029922
Gradient time (reversediff): 0.00010086
library(data.table)
library(ggplot2)
mat_mul_dt = fread("./mat_mul_bench.csv")
mat_mul_dt[1:25, bench := "old"]
mat_mul_dt[26:50, bench := "new"]
mm_strs = t(mat_mul_dt[, strsplit(Benchmark, "/")])
mat_mul_dt[, benchmark := mm_strs[, 1]]
mat_mul_dt[, rows := as.numeric(mm_strs[, 2])]
mat_mul_dt[, cols := as.numeric(mm_strs[, 3])]
#include <cstdio>  // for puts

struct Arg {};
// Base class for RVO checking: every special member logs when it runs
struct S {
  S() { puts("\t\tDefault Constructor"); }
  S(Arg) { puts("\t\tValue Constructor"); }
  explicit S(int) { puts("\t\tExplicit Value Constructor (1)"); }
  explicit S(int, int) { puts("\t\tExplicit Value Constructor (2)"); }
  ~S() { puts("\t\tDestruct"); }
  S(const S&) { puts("\t\tCopy construct"); }
library(gh)
library(data.table)
# Cycles through the PR pages grabbing useful info
# Preallocate: 12 pages of up to 30 PRs each
all_prs = vector("list", 30 * 12)
stan_math_prs = vector("list", 12)
for (i in 1:12) {
  stan_math_prs[[i]] = gh("GET https://api.github.com/repos/stan-dev/math/pulls?state=closed", type = "public", page = i)
}
for (i in 1:12) {
benchmark            size   stat     time      cpu       iters
multi_init_var_vec   128    mean     644 ns    644 ns    30
multi_init_var_vec   128    median   622 ns    622 ns    30
multi_init_var_vec   128    stddev   64.0 ns   64.0 ns   30
multi_init_var_vec   256    mean     1468 ns   1468 ns   30
multi_init_var_vec   256    median   1493 ns   1493 ns   30
multi_init_var_vec   256    stddev   49.4 ns   49.4 ns   30
multi_init_var_vec   512    mean     2705 ns   2705 ns   30
multi_init_var_vec   512    median   2711 ns   2710 ns   30
multi_init_var_vec   512    stddev   18.9 ns   18.9 ns   30
/**
* \ingroup opencl
* \defgroup opencl_kernel_generator OpenCL Kernel Generator
*
* The OpenCL kernel generator is used to combine multiple matrix operations into a
* single OpenCL kernel. This is much simpler than writing multi-operation kernels by
* hand.
*
* Because global GPU memory loads and stores are relatively slow compared to
* calculations in a kernel, using one kernel for multiple operations is faster than using one kernel
USER_OPTIM_FLAGS= -pipe -fPIC -O3 -mtune=native -march=native
# I couldn't tell whether the OpenCL headers we use exist in `StanHeaders` on CRAN,
# so I clone rstan with
# git clone --recursive https://github.com/stan-dev/rstan
# and then include the OpenCL headers from that checkout
USER_OPENCL_FLAGS= -I"/path_to_rstan/rstan/StanHeaders/inst/include/mathlib/lib/opencl_2.1.0"
# You can get the device and platform IDs with `clinfo -l`
USER_OPENCL_FLAGS+= -DSTAN_OPENCL=1 -DOPENCL_DEVICE_ID=1 -DOPENCL_PLATFORM_ID=2 -lOpenCL
# Some extra bits we need
USER_OPENCL_FLAGS+= -DCL_HPP_TARGET_OPENCL_VERSION=120 -DCL_HPP_MINIMUM_OPENCL_VERSION=120 -DCL_HPP_ENABLE_EXCEPTIONS -Wno-ignored-attributes