@tomkinsc
Last active February 13, 2025 16:30
#!/bin/bash
set -ex
if [ $# -eq 0 ]; then
echo "This script can be used to transfer all files from a DNAnexus project to a GS bucket (for Terra, etc.)"
echo "Usage: $0 DNAnexus_project-id:/path/to/recurse gs://bucket/path [grep_pattern_to_match; e.g. '.tar.gz']"
echo " Before running, be sure to log in to both Google Cloud and DNAnexus via:"
echo " dx login"
echo " gcloud auth login"
echo " The CLI toolkits for Google Cloud and DNAnexus (dx) can be found here and must be installed first:"
echo " https://documentation.dnanexus.com/downloads"
echo " https://cloud.google.com/sdk/docs/install"
echo ""
exit 1
fi
dxprojectpath="$1"
bucketpath="$2"
match_pattern="$3"
dx_project_id="$(echo "${dxprojectpath}" | cut -d':' -f1)"
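# Filter on the optional third argument (an empty pattern matches every file).
# Read one path per line so that file names containing spaces are not word-split.
dx find data --path "${dxprojectpath}" --delimiter=',' | cut -f4 -d',' | grep -- "${match_pattern}" | while IFS= read -r dx_file_path; do
echo "Transferring ${dx_file_path}"
# Redirect stdin from /dev/null so dx cannot consume the list of paths feeding the loop.
dx download --output - "${dx_project_id}:${dx_file_path}" </dev/null | gcloud storage cp - "${bucketpath}${dx_file_path}"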
echo ""
done