Data on DNA variation, such as single nucleotide polymorphisms (SNPs), can be very large, involving thousands or millions of SNPs measured on potentially thousands of individuals. Typical genotyping platforms may examine from 50K (K = thousand) to 2.5M (M = million) SNPs, and some platforms may be even denser. Each individual carries two nucleotides (each one of A, C, G, or T) at each position, one on each chromosome. If the genotyping read is of insufficient quality, a missing value may be recorded for one or both chromosomes at that position/SNP. A frequently used recoding of the nucleotide data replaces the characters (i.e., alleles) with the count of the allele that has the lower frequency in the sample, or with the count of a pre-specified allele as determined by the genotyping platform and software. Thus, instead of storing a pair of nucleotides (e.g., AA, AG, GG), researchers store the individual’s genotype as 0, 1, 2, or NA. In thi