Skip to content

Instantly share code, notes, and snippets.

View dmbates's full-sized avatar

Douglas Bates dmbates

View GitHub Profile
@dmbates
dmbates / SimpleGibbs.md
Last active May 27, 2021 01:37
Simple Gibbs sampler in Julia

The Gibbs sampler discussed on Darren Wilkinson's blog and also on Dirk Eddelbuettel's blog has been implemented in several languages, the first of which was R.

The task is to create a Gibbs sampler for the unscaled density

 f(x,y) = x x^2 \exp(-xy^2 - y^2 + 2y - 4x)

using the conditional distributions

 x|y \sim Gamma(3, y^2 +4)

y|x \sim Normal(\frac{1}{1+x}, \frac{1}{2(1+x)})

@dmbates
dmbates / dls.jl
Created May 14, 2012 21:47
Least squares on distributed arrays using the Cholesky decomposition
for (potrs, elty) in (("dpotrs_", :Float64), ("spotrs_", :Float32),
("zpotrs_", :Complex128), ("cpotrs_", :Complex64))
@eval begin
function _jl_lapack_potrs(uplo, n, nrhs, A::StridedMatrix{$elty}, lda, B, ldb)
info = Array(Int32, 1)
ccall(dlsym(_jl_liblapack, $potrs), Void,
(Ptr{Uint8}, Ptr{Int32}, Ptr{Int32}, Ptr{Float64}, Ptr{Int32},
Ptr{Float64}, Ptr{Int32}, Ptr{Int32}),
uplo, &n, &nrhs, A, &lda, B, &ldb, info)
return info[1]
@dmbates
dmbates / JuliaGitPullRequest.md
Created May 16, 2012 16:49
Setting up git for creating pull requests to JuliaLang

These notes are based on Patrick O'Leary's video that he kindly created when I was wingeing about problems with git. It is best to watch the video first then go back through these notes to recall the steps.

Create a clone of the JuliaLang repository

git clone git://github.com/JuliaLang/julia.git

or

git clone git://github.com/JuliaLang/julia.git myjulia
@dmbates
dmbates / gist:2844387
Created May 31, 2012 16:00
Cross-tabulation in Julia

I would like to be able to cross-tabulate vectors or, more generally, factors in Julia. It may be that the capability already exists but, if so, I haven't been able to discover the function name.

Assuming that the capability needs to be created, I started by defining the types

abstract table
type crosstab <: table
    levels::Tuple
    table::Array{Uint32}
end
@dmbates
dmbates / rrssf.md
Created October 23, 2012 15:26
Evaluating sums of squared residuals on a grid

Background

In revising our 1988 Wiley book Nonlinear Regression Analysis and Its Applications I have taken advantage of advances in computing to evaluate projections of residual sum-of-squares contours for 3-parameter models. By evaluating the residual sum-of-squares on a 3-dimensional grid I can create the projections onto the 2-dimensional faces and plot contours. I also hope to use 3-D graphics to plot the 2-dimensional boundary. Any pointers on how to go about this would be appreciated. I presume I would need to generate a mesh for the surface from the grid values and then use something like OpenGL for the

@dmbates
dmbates / Intro.md
Created December 18, 2012 20:16
Use of the "regular sparse column-oriented" representation for mixed-effects models

Regular sparse matrices

In previous versions of the lme4 package for R, the random-effects model matrix, Z, or its transpose, Zt, have been stored as compressed sparse column-oriented matrices, which is a general sparse matrix representation. As implemented in the Matrix package for R, an m by n dgCMatrix matrix with nnz non-zeros has three vector slots: p, which is the length n + 1 vector of column pointers, i, which is the length nnz vector of row indices, and x, which is the length

@dmbates
dmbates / gist:5303570
Last active December 15, 2015 18:28
Formula, ModelFrame and ModelMatrix types

Formula, ModelFrame and ModelMatrix types

In R, the functions for fitting statistical models based on a linear predictor usually allow for a formula/data specification of the model. The data part is a data.frame object in R. In Julia the corresponding type is a DataFrame as implemented in the DataFrames package.

A formula object in R is an expression whose top-level operator is a unary or binary tilde, ~. The interpretation of many operators, including +, *, ^, \ and : are overridden in the formula language. See the result of

?formula

in R for the details.

@dmbates
dmbates / casestudy.md
Last active December 17, 2015 22:19
Handling Genome-Wide Association study data in Julia

Objectives

The dimensions of data on DNA variation such as single nucleotide polymorphisms or SNPs can be very large, involving thousands or millions of SNPs, measured on potentially thousands of individuals. Typical genotyping platforms may examine from 50K(K=thousand) to 2.5M (M= millions) SNPs. Some platforms could be even denser. There are 2 nucleotides (A, C, G or T) at each position (one on each chromosome). If the genotyping read is not sufficiently good, a missing value could be recorded in one or both chromosomes for that position/SNP. A frequently used re-codification of the nocleotide data is to replace the characters (i.e. alleles) by the count of the allele with the lower frequency in the sample, or according to a pre-specified allele as determined in the genotyping platform and software. Thus, instead of storing a pair of nucleotides (e.g., AA, AG, GG), researchers store the individual’s genotype as either 0,1,2, or NA. In thi

@dmbates
dmbates / Problem104.ipynb
Created October 3, 2013 19:54
Project Euler problem 104 in Julia.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
@dmbates
dmbates / intro.md
Created March 3, 2014 23:30
Environment-based lme4

Background

The current version of the (lme4)[https://github.com/lme4/lme4] package for (R)[http://www.r-project.org) is based on a rather questionable practice of storing external pointers from C++ objects in an R reference class object. The idea was that when the reference class object was created the corresponding C++ class instance would be created. All changes to this object would be made through the external pointer in the C++ code.

This is risky because any changes to the R objects in the reference class could cause changes in the location of its values, after which all bets are off because the R object and the C++ object are referring to different memory locations.

It turns out that this does happen in the development version of R, which will become R-3.1.0, if the LAZY_DUPLICATE_OK flag is set. The symptom is that deepcopy methods are not copying and I suspect this is because the lazy duplication doesn't duplicate before the external pointer is constructed. We were living outsid