@robinsloan
Last active August 29, 2015 14:22
Voice memo transcriber engine. This is a very simple, highly inflexible script that probably won't be useful to many other people, but hey, you never know!
=begin README
Here's what this script does:
1. checks a Gmail inbox
2. finds attachments (which it expects to be WAV files)
3. pipes them through Google's voice transcription service
4. emails you the results
I use it with the Instacorder iPhone app for a super-fast, push-to-talk
voice memo transcription service, with no steps/confirmations along the way.
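Step 3 leans on pulling the transcript out of whatever the service sends back.
As a rough sketch only, and assuming the response is a series of JSON objects,
one per line, each with a "result" list of "alternative" entries (a shape this
README does not confirm), the extraction could be done with the json library.
The helper name first_transcript is made up for illustration:

```ruby
require "json"

# Hypothetical helper, not part of the script below.
# Assumption: the response is line-delimited JSON shaped like
#   {"result":[{"alternative":[{"transcript":"..."}]}]}
def first_transcript(raw_response)
  raw_response.each_line do |line|
    line = line.strip
    next if line.empty?

    begin
      parsed = JSON.parse(line)
    rescue JSON::ParserError
      next # skip anything that is not valid JSON
    end
    next unless parsed.is_a?(Hash)

    results = parsed["result"]
    next unless results.is_a?(Array)

    results.each do |result|
      alternatives = result.is_a?(Hash) ? result["alternative"] : nil
      next unless alternatives.is_a?(Array)

      alternatives.each do |alternative|
        transcript = alternative.is_a?(Hash) ? alternative["transcript"] : nil
        return transcript if transcript
      end
    end
  end
  nil # nothing usable came back
end
```

Returning nil when nothing is found lets the caller decide what to do with a
recording that produced no transcript, instead of crashing on it.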
A few important things to know:
1. Google's transcription API maxes out at around 60 seconds. Longer recordings
~fail silently~.
2. I have this script running as a cron job on a desktop computer. You could
just as easily run it manually, put it on a server, etc.
3. I use a dedicated email account for voice memos, so the script naively plows
through EVERYTHING. It wouldn't be hard to modify it to discriminate between
emails/attachments -- maybe most easily using some key in the subject line?
3a. But, absent such modification: don't run this script on your everyday Gmail
account!!
4. To get your GOOGLE_KEY for the Speech API, you'll need to follow the
directions here, including the part where you sign up for the chromium-dev
group: http://www.chromium.org/developers/how-tos/api-keys
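Points 1 and 3 above could each be handled with a small guard before any
transcription happens. What follows is only a sketch: the helper names and the
"[memo]" subject key are made up, and the duration estimate assumes 16-bit mono
PCM at 16000 Hz (the rate the script sends to the API) behind a plain 44-byte
WAV header.

```ruby
# Hypothetical helpers; none of these names appear in the script itself.

WAV_HEADER_BYTES = 44 # assumption: canonical PCM header with no extra chunks

# Estimate the length of a PCM recording from its size on disk.
# Assumption: 16-bit mono samples unless told otherwise.
def estimated_wav_seconds(byte_size, sample_rate: 16_000, bytes_per_sample: 2, channels: 1)
  audio_bytes = [byte_size - WAV_HEADER_BYTES, 0].max
  audio_bytes.to_f / (sample_rate * bytes_per_sample * channels)
end

# True when a recording is probably over the roughly 60 second limit.
def too_long_to_transcribe?(byte_size, limit_seconds: 60)
  estimated_wav_seconds(byte_size) > limit_seconds
end

# True when the subject line carries the agreed key, ignoring case.
def voice_memo_subject?(subject, key: "[memo]")
  subject.to_s.downcase.include?(key.downcase)
end
```

A recording flagged by too_long_to_transcribe? (called with something like
File.size(path)) could be skipped or answered with a "too long" email, and an
email that fails voice_memo_subject? could simply be left unread.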
That's it!
=end
require "rubygems"
require "mail"
require "gmail"
require "json"
# also requires: curl on the command line
# CONST #
GMAIL_POWERED_ADDRESS = "[email protected]" # also works with google apps for your domain
PASSWORD = "password"
GOOGLE_KEY = "key_with_speech_api_enabled" # see: http://www.chromium.org/developers/how-tos/api-keys
TRANSCRIBED_EMAIL_TO = "[email protected]"
TRANSCRIBED_EMAIL_SUBJECT = "Transcribed voice memo"
THIS_DIR = __dir__
AUDIO_DIR = "#{THIS_DIR}/audio" # you must create this directory
LOGFILE = "log.txt" # you must create this file
SPEW_LOG_TO_CONSOLE = false # false = save to logfile instead
# GLOBAL #
$gmail = Gmail.connect!(GMAIL_POWERED_ADDRESS, PASSWORD)
# METHODS #
def transcribe_audio(full_path_to_audio_file)
  log "transcribing #{full_path_to_audio_file}..."
  cmd = "curl -X POST --data-binary @'#{full_path_to_audio_file}' \
    --header 'Content-Type: audio/l16; rate=16000;' \
    'https://www.google.com/speech-api/v2/recognize?output=json&lang=en-us&key=#{GOOGLE_KEY}'"
  raw_response = `#{cmd}`
  # scan returns one array per match, each holding that match's capture groups,
  # so the first transcript lives at [0][0]
  first_match = raw_response.scan(/"transcript":"(.+?)"/).first
  # no match at all (e.g. a recording that was too long) would otherwise crash here
  best_transcript = first_match ? first_match[0] : "(no transcript returned)"
  log "transcribed as: #{best_transcript}..."
  best_transcript
end
def log(msg)
  if SPEW_LOG_TO_CONSOLE
    puts msg
  else
    # use the LOGFILE constant rather than repeating the filename
    File.open(File.join(THIS_DIR, LOGFILE), "a") do |log_file|
      log_file.puts msg
    end
  end
end
def log_end
  log("---\n\n")
  log_path = File.join(THIS_DIR, LOGFILE)
  # if logfile exists and is larger than 10 megabytes...
  if File.exist?(log_path) && File.size(log_path) > 10 * 1024 * 1024
    $gmail.deliver do
      to TRANSCRIBED_EMAIL_TO
      from GMAIL_POWERED_ADDRESS
      subject "ALERT: your transcriber.rb logfile is getting large"
      body "that's all :)"
    end
  end
end
#####################
# #
# BEGIN ZE SCRIPT #
# #
#####################
if $gmail.inbox.count(:unread) <= 0
  exit
end
log "found unread message(s) at #{Time.now}"
$gmail.inbox.find(:unread).each do |email|
  email.message.attachments.each do |attachment|
    # basename strips any directory components from the attachment's filename
    full_path = File.join(AUDIO_DIR, File.basename(attachment.filename))
    # audio is binary data, so write it in binary mode
    File.binwrite(full_path, attachment.body.decoded)
    transcription_email = $gmail.compose do
      to TRANSCRIBED_EMAIL_TO
      from GMAIL_POWERED_ADDRESS
      subject TRANSCRIBED_EMAIL_SUBJECT
      body transcribe_audio(full_path)
    end
    begin
      transcription_email.deliver!
      email.read!
      log "transcription sent!"
    rescue => e
      log "well, something went wrong (#{e.message}), so I didn't mark the email as read..."
    end
  end
end
log_end
$gmail.logout