@mherwig
Forked from anonymous/gs_sp_to_txt.sh
Last active December 11, 2015 23:18
Simple voice control demonstration using Google's speech-api. For demonstration purposes, the Unix commands 'ls' and 'whoami' are mapped to the spoken phrases "List directory" and "Who am I?"
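#!/bin/bash
#
# Author: Mike Herwig
# Description:
# Simple voice control demonstration using Google's speech-api
# Dependencies: arecord, flac, wget

# Named SPEECH_LANG rather than LANG so the script does not override the locale variable.
SPEECH_LANG="en"
API="http://www.google.com/speech-api/v1/recognize?lang=$SPEECH_LANG"
CMD_LIST_DIRECTORY="list directory"
CMD_WHOAMI="who am i"

# Record 3 seconds of audio at 16 kHz, encode it to out.flac, then POST that file to the API.
JSON=$(arecord -f cd -t wav -d 3 -r 16000 | flac - -f --best --sample-rate 16000 -o out.flac; \
    wget -O - -o /dev/null --post-file out.flac --header="Content-Type: audio/x-flac; rate=16000" "$API")

# Strip the braces, split the response on commas, take the third field of the
# third entry, and remove the surrounding quotes.
# $JSON is left unquoted here, so the shell folds the response onto a single line.
UTTERANCE=$(echo $JSON \
    | sed -e 's/[{}]//g' \
    | awk '{n=split($0,a,","); for (i=1; i<=n; i++) print a[i]; exit }' \
    | awk -F: 'NR==3 { print $3; exit }' \
    | sed -e 's/"//g')
echo "utterance: $UTTERANCE"
echo ""

# Run the command whose phrase matches the utterance (case-insensitive, whole line).
if echo "$UTTERANCE" | grep -qi "^$CMD_LIST_DIRECTORY$"; then
    ls .
elif echo "$UTTERANCE" | grep -qi "^$CMD_WHOAMI$"; then
    whoami
fi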
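
The sed/awk chain above depends on the exact order of the fields in the response. As a rough sketch of an alternative, the function below looks up the first "utterance" value by name instead of by position. The sample response is an assumption about the response shape (inferred from the field positions the script relies on), not output captured from a real request, and `extract_utterance` is a hypothetical helper name.

```shell
#!/bin/bash
# Hypothetical helper: print the first "utterance" value found in a JSON string.
# Assumes the value itself contains no escaped double quotes.
extract_utterance() {
    printf '%s\n' "$1" \
        | grep -o '"utterance":"[^"]*"' \
        | head -n 1 \
        | sed -e 's/^"utterance":"//' -e 's/"$//'
}

# Assumed response shape, for illustration only.
SAMPLE='{"status":0,"id":"abc","hypotheses":[{"utterance":"list directory","confidence":0.9}]}'
extract_utterance "$SAMPLE"
```

With a helper like this, the assignment in the script could become `UTTERANCE=$(extract_utterance "$JSON")`. If the response contains no "utterance" field, the function prints nothing.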
#!/bin/bash
#
# Author: Mike Herwig
# Description:
# Updated voice control demonstration using Google's speech-api.
# The main difference from the previous script I uploaded to my Gist is that it now uses sox for recording
# and only writes your voice to disk once sox detects sound.
# Dependencies: sox, wget

# Named SPEECH_LANG rather than LANG so the script does not override the locale variable.
SPEECH_LANG="en"
API="http://www.google.com/speech-api/v1/recognize?lang=$SPEECH_LANG"
CMD_LIST_DIRECTORY="list directory"
CMD_WHOAMI="who am i"

# Start recording once sound stays above 3% for 0.1 s; stop after 3.0 s below 3%.
function waitForCommand {
    rec /tmp/cmdrecording.flac rate 32k silence 1 0.1 3% 1 3.0 3%
}

# POST the recording to the API and keep the raw response in JSON.
function speechToJSON {
    JSON=$(wget -O - -o /dev/null --post-file /tmp/cmdrecording.flac --header="Content-Type: audio/x-flac; rate=32000" "$API")
}

# Extract the recognized phrase from JSON into UTTERANCE.
# $JSON is left unquoted here, so the shell folds the response onto a single line.
function getUtterance {
    UTTERANCE=$(echo $JSON \
        | sed -e 's/[{}]//g' \
        | awk '{n=split($0,a,","); for (i=1; i<=n; i++) print a[i]; exit }' \
        | awk -F: 'NR==3 { print $3; exit }' \
        | sed -e 's/"//g')
    echo "utterance: $UTTERANCE"
    echo ""
}

while true; do
    # If any step fails, skip this round so a stale UTTERANCE is not acted on again.
    waitForCommand && speechToJSON && getUtterance || continue
    if echo "$UTTERANCE" | grep -qi "^$CMD_LIST_DIRECTORY$"; then
        ls .
    elif echo "$UTTERANCE" | grep -qi "^$CMD_WHOAMI$"; then
        whoami
    fi
    #sleep 1
done
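
The if/elif chain in the loop grows by one branch for every new command. A `case` statement is one possible way to keep the phrase-to-command mapping in a single place. The sketch below lower-cases the phrase first so matching stays case-insensitive. `run_command` is a hypothetical name, and the fallback branch for unknown phrases is an addition that the script does not have.

```shell
#!/bin/bash
# Hypothetical dispatcher: map a recognized phrase to a command.
run_command() {
    # Lower-case the phrase so that matching ignores case.
    phrase=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    case "$phrase" in
        "list directory") ls . ;;
        "who am i")       whoami ;;
        *)                echo "unrecognized: $1" >&2; return 1 ;;
    esac
}

run_command "Who am I"
```

Inside the loop, the whole if/elif block could then be replaced by `run_command "$UTTERANCE"`.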

mherwig commented Feb 13, 2014

You might like to tweak the parameters of the silence effect:
silence [-l] above-periods [duration threshold[d|%] [below-periods duration threshold[d|%]]]

For more information on SoX, visit:
http://sox.sourceforge.net/sox.html
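
As a small sketch of how that tweaking could be made easier, the values passed to the silence effect can be pulled out into variables. The variable names and the `silence_args` helper are hypothetical. The defaults are the values used in the script: recording starts after 0.1 s of sound above 3% and stops after 3.0 s below 3%.

```shell
#!/bin/bash
# Tunable values for the SoX silence effect; the defaults are the script's values.
START_DURATION="0.1"   # seconds of sound needed before recording starts
STOP_DURATION="3.0"    # seconds of quiet that end the recording
THRESHOLD="3%"         # level that separates sound from silence

# Hypothetical helper: print the effect arguments so they can be inspected.
silence_args() {
    echo "silence 1 $START_DURATION $THRESHOLD 1 $STOP_DURATION $THRESHOLD"
}

# The full rec command line is only printed here, not executed.
echo "rec /tmp/cmdrecording.flac rate 32k $(silence_args)"
```

To record for real, drop the outer echo and call `rec /tmp/cmdrecording.flac rate 32k $(silence_args)`, leaving the substitution unquoted so the arguments split into separate words.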

Further ideas:
You might not want to send every sound in the room to Google, so you could wire a hardware button to the Pi that 'mutes' the script. You could also add a line that switches on a (red) LED so that you know when it's listening.
I will add that later.
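
Until then, here is a rough sketch of what the mute check and the LED switch might look like. It assumes the legacy sysfs GPIO interface is available on the Pi, that both pins have already been exported and configured, and that the mute switch reads 1 when it is on. The pin numbers, variable names, and function names are all placeholders.

```shell
#!/bin/bash
# Assumed paths: sysfs GPIO value files for a mute switch and an LED.
# Pin numbers 17 and 27 are placeholders, not taken from any real wiring.
MUTE_VALUE_FILE="${MUTE_VALUE_FILE:-/sys/class/gpio/gpio17/value}"
LED_VALUE_FILE="${LED_VALUE_FILE:-/sys/class/gpio/gpio27/value}"

# Succeeds when the switch reads 1 (muted); an unreadable file counts as not muted.
is_muted() {
    [ "$(cat "$MUTE_VALUE_FILE" 2>/dev/null)" = "1" ]
}

# Write 1 (on) or 0 (off) to the LED pin; errors are ignored if the pin is missing.
set_led() {
    { echo "$1" > "$LED_VALUE_FILE"; } 2>/dev/null || true
}

# In the main loop, the check could look like this:
#   if is_muted; then set_led 0; sleep 1; continue; fi
#   set_led 1
#   waitForCommand && speechToJSON && getUtterance
```

Treating an unreadable switch as "not muted" keeps the script usable without the hardware. Reversing that default would be the safer choice if privacy matters more.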
