### Draws a figure illustrating change detection in the distribution of synthetic data.
### Each dot represents a single time period with 1000 samples. Before the change,
### the data is sampled from a unit normal distribution. After the change, 20 samples
### in each time period are taken from N(3,1). Comparing counts with a chi^2 test that
### is robust to small expected counts reliably detects this shift.

### log-likelihood ratio test for multinomial data
llr = function(k) {
  2 * sum(k) * (H(k) - H(rowSums(k)) - H(colSums(k)))
}
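### The llr function above calls a helper H that this excerpt does not show.
### A minimal sketch of one way it might look, assuming H(k) is sum(p * log(p))
### over the normalized counts p = k / sum(k), with 0 * log(0) taken as 0.
### Under that assumption llr(k) equals the usual G statistic, 2 * sum(k * log(k / E)).
### The example table and the names g and p.value are hypothetical illustrations.

```r
## Assumed helper: sum of p * log(p) over normalized counts.
## Adding (p == 0) inside the log turns log(0) into log(1) = 0 for empty cells.
H = function(k) {
  p = k / sum(k)
  sum(p * log(p + (p == 0)))
}

## llr repeated from the excerpt so that this block runs on its own
llr = function(k) {
  2 * sum(k) * (H(k) - H(rowSums(k)) - H(colSums(k)))
}

## Hypothetical 2x2 table of counts: rows are time periods, columns are bins
k = matrix(c(990, 10, 970, 30), nrow=2, byrow=TRUE)
g = llr(k)

## Refer the statistic to a chi^2 distribution with (rows-1)*(cols-1) = 1 degree of freedom
p.value = pchisq(g, df=1, lower.tail=FALSE)
```

### A table whose rows are exactly proportional gives a statistic of (numerically) zero,
### which is a quick sanity check on the sign convention assumed for H.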
### This is a demonstration of a Monte Carlo Expectation Maximization
### algorithm that can recover the mean and standard deviation of
### truncated normally distributed data. We draw 10,000 samples from
### a unit normal distribution, but every sample below 0.5 is truncated
### to that value. Every sample above 2.5 is truncated to that value.
### These choices were made to get quick and visually appealing convergence,
### but the algorithm still converges for any choice. Convergence
### could be very, very slow if there is little information in the samples,
### and the final answer could have substantial uncertainty. For instance,
### if we truncated at 4 and 6, almost all samples would be piled up at |
### This code builds a simple physical model of the range of an 85 kWh Tesla Model S and
### compares it to real data. The data here is digitized from
### https://www.tesla.com/blog/model-s-efficiency-and-range
### The model here accounts for aerodynamic drag, viscous drag, constant
### friction and constant power drain

### First the digitized data
x = read.csv(text="v,range
10.22976354700292, 393.9005561997566 |
# you can run this script with the following R command:
# source('https://gist.githubusercontent.com/tdunning/badb88043d41d916a3148c669f2fb0cd/raw/8d3289fdbf2a7999bd5d9687002488b904e1d82f/viewpoints.r')
set.seed(1)
noise = matrix(nrow=2000, ncol=8, data=rnorm(4*8*500))
offsets = matrix(
  c(rep(-1,1000), rep(1,1000),
    rep(-1, 500), rep(1, 500), rep(-1, 500), rep(1, 500)),
  ncol=2)
xy = rbind(matrix(nrow=2000, ncol=2, data=rnorm(2*2000))) + offsets * 8 |
package com.tdunning.tdigest.quality;

import com.google.common.collect.ImmutableList;
import com.google.common.io.Resources;
import com.tdunning.math.stats.MergingDigest;
import com.tdunning.math.stats.TDigest;
import org.junit.Test;

import java.io.File;
import java.io.IOException; |
public class MomentSketchOffsetTest {
    @Test
    public void testOffsetUniform() throws Exception {
        MomentSketch ms = new MomentSketch(1e-10);
        ms.setSizeParam(7);
        ms.initialize();
        double[] data = TestDataSource.getUniform(20e1, 20e1 + 1, 1_000_000);
        ms.add(data);
public class HighDynamicRangeQuantile {
    private final long[] counts;
    private double minimum = Double.POSITIVE_INFINITY;
    private double maximum = Double.NEGATIVE_INFINITY;
    private long underFlowCount = 0;
    private long overFlowCount = 0;
    private final double factor;
    private final double offset;
    private final double minExpectedQuantileValue;
### x is either a vector of numbers or a data frame with sums and weights. digest is a data frame.
merge = function(x, digest, compression=100) {
  ## Force the digest to be a data.frame, possibly empty
  if (!is.data.frame(digest) && is.na(digest)) {
    digest = data.frame(sum=c(), weight=c())
  }
  ## and coerce the incoming data likewise ... a plain vector of points gets a default weight of 1
  if (!is.data.frame(x)) {
    x = data.frame(sum=x, weight=1)
  }
#define _GNU_SOURCE
#include <time.h>
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <sys/uio.h>
#include <sys/file.h>
#include <unistd.h>
#include <errno.h> |