Created
January 7, 2016 20:34
Save konrad/a00b96b1d84c2f9b5e97 to your computer and use it in GitHub Desktop.
# Problem: You have an NCBI GEO accession and would like to get the URL of the SRA file that contains the sequencing data.
# The sed command that removes the last character of the string is essential: the line ends with an invisible character
# that otherwise breaks the downstream steps.
GEO_ACCESSION="GSM1655353" # set your GEO accession here
# Fetch the GEO record as text, keep the line with the SRA FTP link, and cut off the leading field label.
SRA_FTP_URL=$(curl "http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=${GEO_ACCESSION}&targ=self&form=text&view=brief" 2>/dev/null | grep ftp-trace.ncbi.nlm.nih.gov | cut -c 32- | sed 's/.$//')
# List the FTP directory to find the sub-folder, then list the sub-folder to find the SRA file (assumes one entry in each).
FTP_SUB_FOLDER=$(ncftpls "${SRA_FTP_URL}/")
SRA_FILE=$(ncftpls "${SRA_FTP_URL}/${FTP_SUB_FOLDER}/")
echo "$GEO_ACCESSION" "${SRA_FTP_URL}/${FTP_SUB_FOLDER}/${SRA_FILE}"
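
The cut/sed step above depends on a fixed column offset and on exactly one trailing invisible character. A possible alternative, sketched below, is to delete carriage returns explicitly and match the URL by pattern. This assumes the invisible character is a carriage return from Windows-style line endings, which the gist does not confirm. The function name `extract_ftp_url` and the sample input line are hypothetical and only serve the illustration.

```shell
# Hypothetical helper (not part of the original gist): read GEO text output
# on stdin and print the first ftp-trace URL found in it.
extract_ftp_url() {
    # Drop carriage returns (assumed to be the invisible character),
    # then keep only the URL itself, wherever it starts on the line.
    tr -d '\r' \
        | grep -o 'ftp://ftp-trace\.ncbi\.nlm\.nih\.gov[^[:space:]]*' \
        | head -n 1
}

# Made-up sample line with a CRLF ending, for illustration only.
printf '!Sample_supplementary_file_1 = ftp://ftp-trace.ncbi.nlm.nih.gov/example/path\r\n' \
    | extract_ftp_url
```

In the script, the function would replace the `grep | cut | sed` part of the pipeline, with the curl output piped into it. Matching by pattern avoids truncating the URL if the label in front of it ever changes length.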