@icaoberg
Forked from arq5x/make-master-hmm.sh
Last active February 3, 2020 23:21
[bedtools] For Gemini: Create a master ChromHMM track from the 9 distinct cell types.
*.txt
*.bedg
# icaoberg - this example is a fork that runs bedtools from a Singularity container
CONTAINER=../../singularity-bedtools.simg
echo "http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeBroadHmm/wgEncodeBroadHmmGm12878HMM.bed.gz
http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeBroadHmm/wgEncodeBroadHmmH1hescHMM.bed.gz
http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeBroadHmm/wgEncodeBroadHmmHepg2HMM.bed.gz
http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeBroadHmm/wgEncodeBroadHmmHmecHMM.bed.gz
http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeBroadHmm/wgEncodeBroadHmmHsmmHMM.bed.gz
http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeBroadHmm/wgEncodeBroadHmmHuvecHMM.bed.gz
http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeBroadHmm/wgEncodeBroadHmmK562HMM.bed.gz
http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeBroadHmm/wgEncodeBroadHmmNhekHMM.bed.gz
http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeBroadHmm/wgEncodeBroadHmmNhlfHMM.bed.gz" \
> chromhmm-files.txt
# download each URL listed in chromhmm-files.txt
while read -r remote
do
wget "$remote"
done < chromhmm-files.txt
# uncompress (gunzip deletes each .gz itself once it has been extracted)
for zip in *.gz
do
gunzip -f "$zip"
done
# bed+ -> ~bedgraph: keep only chrom, start, end, and the state label
for bed in *.bed
do
cut -f 1-4 "$bed" > "$bed.bedg"
done
# union of all intervals across all 9 cell types
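if [ -f "$CONTAINER" ]; then
# match wg*.bedg so a master.chromhmm.bedg left by an earlier run is not read as input
singularity run --app bedtools "$CONTAINER" unionbedg -i wg*.bedg > master.chromhmm.bedg
else
echo "container not found: $CONTAINER" >&2
fi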
rm -f wg*
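
The "bed+ -> ~bedgraph" step above keeps only the first four tab-separated columns of each ChromHMM file. A minimal sketch of that trimming on a single row, assuming the input has nine tab-separated columns; the row values and the variable names `row` and `trimmed` are made up for illustration and are not part of the original script:

```shell
# Hypothetical nine-column row (values invented for illustration); fields are tab-separated.
row=$(printf 'chr1\t10000\t10600\t1_Active_Promoter\t0\t.\t10000\t10600\t255,0,0')

# Keep chrom, start, end, and the state label, as the script's cut step does.
trimmed=$(printf '%s\n' "$row" | cut -f 1-4)
printf '%s\n' "$trimmed"
```

Only the four retained columns reach the unionbedg step, so the score, strand, and display columns of the source files do not appear in master.chromhmm.bedg.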