@srynobio
Last active October 26, 2020 17:14

UCGD Reprocessed Phase3 WGS Overview.

This is an overview of the Phase3 WGS 1000 Genomes data (3,202 individuals) and where to find it. The data was downloaded from S3 and realigned to GRCh38 using the standard UCGD Pipeline.

Data can be accessed here:

  • redwood servers.
  • Mosaic (To be populated)

CRAM Files

Individual CRAM & index files for all 3,202 individuals.

/scratch/ucgd/lustre/common/data/1000G_Phase3_SV_Crams/UCGD/GRCh38/Data/PolishedCrams/

GVCF Files

Individual GVCF & index files for all 3,202 individuals.

/scratch/ucgd/lustre/UCGD_Datahub/Repository/AnalysisData/2019/A624/19-05-07_VAR-UCGD-1KG_Phase3_WGS/UCGD/GRCh38/VCF/GVCFs

Final VCF

Single VCF & index file with all 3,202 individuals joint-called.

/scratch/ucgd/lustre/UCGD_Datahub/Repository/AnalysisData/2019/A624/19-05-07_VAR-UCGD-1KG_Phase3_WGS/UCGD/GRCh38/VCF/Complete

Smoove File

Single merged VCF & index file for all 3,202 individuals.

/scratch/ucgd/lustre/UCGD_Datahub/Repository/AnalysisData/2019/A624/19-05-07_VAR-UCGD-1KG_Phase3_WGS/UCGD/GRCh38/VCF/Smoove

Manta Files

Individual Manta calls for all 3,202 individuals (not joint-called).

/scratch/ucgd/lustre/UCGD_Datahub/Repository/AnalysisData/2019/A624/19-05-07_VAR-UCGD-1KG_Phase3_WGS/UCGD/GRCh38/VCF/Manta
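
As a convenience, the locations listed above can be collected in one place and checked from a redwood session. The following is a minimal sketch, not part of the UCGD Pipeline: the names `DATA_DIRS` and `summarize` are hypothetical, and the script only counts files in each directory because the file naming inside those directories is not described here.

```python
from pathlib import Path

# Directory paths copied verbatim from the sections above.
VCF_BASE = Path(
    "/scratch/ucgd/lustre/UCGD_Datahub/Repository/AnalysisData/2019/A624/"
    "19-05-07_VAR-UCGD-1KG_Phase3_WGS/UCGD/GRCh38/VCF"
)

# Hypothetical lookup table: the keys are illustrative labels only.
DATA_DIRS = {
    "cram": Path(
        "/scratch/ucgd/lustre/common/data/1000G_Phase3_SV_Crams/"
        "UCGD/GRCh38/Data/PolishedCrams"
    ),
    "gvcf": VCF_BASE / "GVCFs",
    "final_vcf": VCF_BASE / "Complete",
    "smoove": VCF_BASE / "Smoove",
    "manta": VCF_BASE / "Manta",
}


def summarize(dirs=DATA_DIRS):
    """Map each label to the number of regular files directly inside its
    directory, or to None if the directory is not visible on this host."""
    summary = {}
    for name, path in dirs.items():
        if path.is_dir():
            summary[name] = sum(1 for entry in path.iterdir() if entry.is_file())
        else:
            summary[name] = None
    return summary


if __name__ == "__main__":
    for name, count in summarize().items():
        status = "not found on this host" if count is None else f"{count} files"
        print(f"{name:10s} {DATA_DIRS[name]}  ({status})")
```

A label that reports "not found on this host" most likely means the script was run somewhere the lustre scratch space is not mounted.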

Data Timeline:

We plan to keep all CRAM files until Jan 1, 2021!

The following will be kept indefinitely as part of the 19-05-07_VAR-UCGD-1KG_Phase3_WGS project:

  • 3,202 GVCFs
  • 1 Joint-called Final VCF
  • 1 Smoove joint-called file.
  • 1 Manta merged file (to be generated).