This is complete taken from Ryan D on biostar: http://www.biostars.org/p/2909/#16419
It takes a region in a format like "chr2:1234-3456". If an rs number is specified after the region, it will output the R^2 for every SNP in that region with the requested SNP; otherwise, it is all-vs-all.
It requires plink, vcftools, and tabix
Also requires toolshed for python which can be installed with:
sudo pip install toolshed
or
sudo easy_install toolshed
Call like:
python linkage.py chr11:1240203-1247497 > link.txt
and output looks like:
CHR_A BP_A SNP_A CHR_B BP_B SNP_B R2
11 1240338 rs2672792 11 1240338 rs2672792 1
11 1240338 rs2672792 11 1240381 rs112249530 0.00675058
11 1240338 rs2672792 11 1240435 rs115815572 0.0066929
11 1240338 rs2672792 11 1240439 rs146776493 0.00099445
11 1240338 rs2672792 11 1240485 rs72636989 0.0270542
11 1240338 rs2672792 11 1240533 rs189679987 3.71248e-05
11 1240338 rs2672792 11 1240626 rs192732186 0.00099445
11 1240338 rs2672792 11 1240628 rs185161871 5.51292e-05
11 1240338 rs2672792 11 1240713 rs150416385 0.000905582
where the final column is the R^2 value.
with the output like that, we can calculate "blocks" in a simple way by finding an R2 value to seed a block and then a distance to search for more values. To do this, the command is:
python linkage-blocks.py muc5b-linkage.txt > blocks.bed
The parameters including distance to search, seed and number of SNPs required to be a valid region are all hard-coded in the script.
If you use plink2 (aka plint 1.9), you can potentially simplify this code to accept the VCF directly :) Nevertheless, this is very nice and simple! Thanks for the code