This page gives an overview of the Phase 3 WGS 1000 Genomes data (3,202 individuals) and where to find it. The data was downloaded from S3 and realigned to GRCh38 using the standard UCGD Pipeline.
Data can be accessed here:
- redwood servers
- Mosaic (to be populated)
Individual CRAM & index files for all 3,202 individuals.
/scratch/ucgd/lustre/common/data/1000G_Phase3_SV_Crams/UCGD/GRCh38/Data/PolishedCrams/
Individual GVCF & index files for all 3,202 individuals.
/scratch/ucgd/lustre/UCGD_Datahub/Repository/AnalysisData/2019/A624/19-05-07_VAR-UCGD-1KG_Phase3_WGS/UCGD/GRCh38/VCF/GVCFs
Single VCF & index file with all 3,202 individuals joint-called.
/scratch/ucgd/lustre/UCGD_Datahub/Repository/AnalysisData/2019/A624/19-05-07_VAR-UCGD-1KG_Phase3_WGS/UCGD/GRCh38/VCF/Complete
Single Smoove joint-called VCF & index file for all 3,202 individuals.
/scratch/ucgd/lustre/UCGD_Datahub/Repository/AnalysisData/2019/A624/19-05-07_VAR-UCGD-1KG_Phase3_WGS/UCGD/GRCh38/VCF/Smoove
Individual Manta calls for all 3,202 individuals (not joint-called).
/scratch/ucgd/lustre/UCGD_Datahub/Repository/AnalysisData/2019/A624/19-05-07_VAR-UCGD-1KG_Phase3_WGS/UCGD/GRCh38/VCF/Manta
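The per-individual directories above hold thousands of files, so a small helper for pulling out one sample's files may be handy. The following is a minimal sketch, not a documented tool: it assumes that file names contain the sample ID, which this page does not confirm. The function name `find_sample_files` and the sample ID `HG00096` are illustrative only.

```python
from pathlib import Path

# Directory copied from the CRAM listing above.
CRAM_DIR = Path(
    "/scratch/ucgd/lustre/common/data/1000G_Phase3_SV_Crams/"
    "UCGD/GRCh38/Data/PolishedCrams/"
)


def find_sample_files(directory, sample_id, suffixes=(".cram", ".crai")):
    """Return sorted files in `directory` for one sample.

    ASSUMPTION: file names contain the sample ID. The actual naming
    scheme is not documented on this page, so adjust as needed.
    """
    directory = Path(directory)
    if not directory.is_dir():
        # Directory missing or not mounted on this host.
        return []
    return sorted(
        p
        for p in directory.iterdir()
        if p.is_file()
        and sample_id in p.name
        and p.name.endswith(tuple(suffixes))
    )


if __name__ == "__main__":
    # HG00096 is used purely as an example sample ID.
    for path in find_sample_files(CRAM_DIR, "HG00096"):
        print(path)
```

The same function can be pointed at the GVCF or Manta directories by passing a different `directory` and a matching `suffixes` tuple. Which suffixes those files actually use is likewise an assumption to check against the directory contents.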
We plan to keep all CRAM files until Jan 1, 2021!
The following will be kept indefinitely as part of the 19-05-07_VAR-UCGD-1KG_Phase3_WGS project:
- 3,202 GVCFs
- 1 joint-called final VCF
- 1 Smoove joint-called file
- 1 Manta merged file (to be generated)