
@pcolazurdo
Last active March 16, 2020 19:16
AWS Transcribe sample
YT_FILE=poX6BEzVmdE
BUCKET_NAME="Set this to the name of the S3 bucket that will receive the output (bucket name only, no s3:// prefix or path)"
# Capture the downloaded file name from the "[download] Destination: <file>" line
FILE_OUTPUT=$(youtube-dl -o '%(id)s.%(ext)s' --restrict-filenames "https://www.youtube.com/watch?v=${YT_FILE}" | sed -n 's/^\[download\] Destination: //p')
aws s3 cp ${FILE_OUTPUT} s3://${BUCKET_NAME}/${YT_FILE}
aws transcribe start-transcription-job --transcription-job-name ${YT_FILE} --language-code en-GB --media MediaFileUri=s3://${BUCKET_NAME}/${YT_FILE} --output-bucket-name ${BUCKET_NAME}
sleep 120 # or poll get-transcription-job until the job status is COMPLETED
aws s3 cp s3://${BUCKET_NAME}/${YT_FILE}.json .
jq -r '.results.transcripts[].transcript' "${YT_FILE}.json"
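The fixed `sleep 120` in the script is a guess at how long the job takes. A minimal sketch of a polling loop that could replace it is shown here. The function name `wait_for_transcription_job`, its default interval and its retry limit are assumptions for illustration and are not part of the original gist. It relies on `aws transcribe get-transcription-job` and the `TranscriptionJobStatus` field of its response.

```shell
# Hypothetical helper: poll an Amazon Transcribe job until it finishes.
# Arguments: job name, seconds between polls (default 15), maximum polls (default 40).
# Returns 0 on COMPLETED, 1 on FAILED, 2 if the polls run out first.
wait_for_transcription_job() {
  job_name="$1"
  interval="${2:-15}"
  max_polls="${3:-40}"
  polls=0
  while [ "$polls" -lt "$max_polls" ]; do
    # Ask only for the status field, printed as plain text
    status=$(aws transcribe get-transcription-job \
      --transcription-job-name "$job_name" \
      --query 'TranscriptionJob.TranscriptionJobStatus' \
      --output text)
    case "$status" in
      COMPLETED) return 0 ;;
      FAILED) echo "Transcription job $job_name failed" >&2; return 1 ;;
    esac
    polls=$((polls + 1))
    sleep "$interval"
  done
  echo "Timed out waiting for $job_name" >&2
  return 2
}
```

With this defined, the sleep line could become `wait_for_transcription_job "${YT_FILE}" || exit 1`, so the later `aws s3 cp` only runs once the JSON output exists.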