Simple voice control demonstration using Google's speech-api. For demonstration purposes, I added the Unix commands 'ls' and 'whoami', which map to the spoken phrases "List directory" and "Who am I?".
#!/bin/bash
#
# Author: Mike Herwig
# Description:
# Simple voice control demonstration using Google's speech-api
# Dependencies: arecord, flac, wget

# Named SPEECH_LANG rather than LANG so the shell's locale variable is not overwritten
SPEECH_LANG="en"
API="http://www.google.com/speech-api/v1/recognize?lang=$SPEECH_LANG"
CMD_LIST_DIRECTORY="list directory"
CMD_WHOAMI="who am i"

# Record 3 seconds from the microphone and encode the audio as FLAC
arecord -f cd -t wav -d 3 -r 16000 | flac - -f --best --sample-rate 16000 -o out.flac
# Post the recording to the API and keep the raw JSON reply
JSON=$(wget -O - -o /dev/null --post-file out.flac --header="Content-Type: audio/x-flac; rate=16000" "$API")

# Strip braces, split on commas, take the third value of the third field, drop quotes
UTTERANCE=$(echo "$JSON" | tr -d '\n' \
    | sed -e 's/[{}]//g' \
    | awk '{n=split($0,a,","); for (i=1; i<=n; i++) print a[i]; exit }' \
    | awk -F: 'NR==3 { print $3; exit }' \
    | sed -e 's/["]//g')
echo "utterance: $UTTERANCE"
echo ""

# Run the command whose phrase matches the utterance (case-insensitive)
if echo "$UTTERANCE" | grep -qi "^$CMD_LIST_DIRECTORY$"; then
    ls .
elif echo "$UTTERANCE" | grep -qi "^$CMD_WHOAMI$"; then
    whoami
fi
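To see what the sed/awk pipeline above does, it can help to run it against a canned reply instead of a live recording. The sketch below assumes the API answers with a single-line JSON object shaped like the hypothetical `SAMPLE_JSON`; that shape is inferred from the field positions the script picks out, not taken from API documentation. `extract_utterance` is a helper name introduced here for illustration only.

```shell
#!/bin/bash
# Hypothetical sample reply; the real response format is an assumption.
SAMPLE_JSON='{"status":0,"id":"abc123","hypotheses":[{"utterance":"list directory","confidence":0.92}]}'

# Same steps as the script: strip braces, split on commas,
# take the 3rd comma-separated field, print its 3rd colon-separated part,
# then drop the double quotes.
extract_utterance() {
    echo "$1" \
        | sed -e 's/[{}]//g' \
        | awk '{n=split($0,a,","); for (i=1; i<=n; i++) print a[i]; exit }' \
        | awk -F: 'NR==3 { print $3; exit }' \
        | sed -e 's/["]//g'
}

extract_utterance "$SAMPLE_JSON"   # prints: list directory
```

Because the parsing is purely positional, a reply with its fields in a different order would break it. If jq happens to be installed, `jq -r '.hypotheses[0].utterance'` would be a sturdier way to read the same field, under the same assumed shape.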
#!/bin/bash
#
# Author: Mike Herwig
# Description:
# Updated voice control demonstration using Google's speech-api
# The main difference from the previous script I uploaded to my Gist is that it now uses sox for recording
# and only records your voice to disk when sox detects sound
# Dependencies: sox, wget

# Named SPEECH_LANG rather than LANG so the shell's locale variable is not overwritten
SPEECH_LANG="en"
API="http://www.google.com/speech-api/v1/recognize?lang=$SPEECH_LANG"
CMD_LIST_DIRECTORY="list directory"
CMD_WHOAMI="who am i"

# Block until sox detects sound (0.1 s above 3% volume), then record until 3 s of silence
function waitForCommand {
    rec /tmp/cmdrecording.flac rate 32k silence 1 0.1 3% 1 3.0 3%
}

# Post the recording to the API and keep the raw JSON reply
function speechToJSON {
    JSON=$(wget -O - -o /dev/null --post-file /tmp/cmdrecording.flac --header="Content-Type: audio/x-flac; rate=32000" "$API")
}

# Pull the recognized text out of the JSON reply
function getUtterance {
    UTTERANCE=$(echo "$JSON" | tr -d '\n' \
        | sed -e 's/[{}]//g' \
        | awk '{n=split($0,a,","); for (i=1; i<=n; i++) print a[i]; exit }' \
        | awk -F: 'NR==3 { print $3; exit }' \
        | sed -e 's/["]//g')
    echo "utterance: $UTTERANCE"
    echo ""
}

while true; do
    # Skip this round if any step fails, so a stale utterance is not acted on again
    waitForCommand && speechToJSON && getUtterance || continue
    if echo "$UTTERANCE" | grep -qi "^$CMD_LIST_DIRECTORY$"; then
        ls .
    elif echo "$UTTERANCE" | grep -qi "^$CMD_WHOAMI$"; then
        whoami
    fi
    #sleep 1
done
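As more phrases are added, the if/elif chain in the loop grows quickly. One possible alternative, sketched below, is a `case` statement on a lower-cased utterance. The function name `run_command` and the use of `tr` for lower-casing are choices made for this illustration and are not part of the original script.

```shell
#!/bin/bash
# Hypothetical dispatcher: maps a recognized phrase to a shell command.
run_command() {
    # Lower-case the utterance so matching is case-insensitive, like grep -i.
    local phrase
    phrase=$(echo "$1" | tr '[:upper:]' '[:lower:]')

    case "$phrase" in
        "list directory") ls . ;;
        "who am i")       whoami ;;
        *)                echo "unknown command: $1" >&2; return 1 ;;
    esac
}

run_command "Who am I"
```

Inside the main loop this would replace the whole if/elif block with a single `run_command "$UTTERANCE"` call, and each new phrase becomes one extra line in the `case`.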
You might like to tweak silence [-l] above-periods [duration threshold[d|%] [below-periods duration threshold[d|%]]]
For more information on SoX visit:
http://sox.sourceforge.net/sox.html
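To make that tweaking easier, the values passed to the silence effect could be pulled out into variables. The sketch below is one way to do it; the variable names and the `silence_args` helper are hypothetical, and the defaults simply mirror the numbers used in the script (start after 0.1 s of sound above 3%, stop after 3.0 s below 3%).

```shell
#!/bin/bash
# Hypothetical tunables; the defaults mirror the values used in the script.
START_DURATION="0.1"   # seconds of sound needed before recording starts
STOP_DURATION="3.0"    # seconds of silence that end the recording
THRESHOLD="3%"         # volume level below which audio counts as silence

# Print the arguments for sox's silence effect built from the tunables.
silence_args() {
    echo "silence 1 $START_DURATION $THRESHOLD 1 $STOP_DURATION $THRESHOLD"
}

# Usage inside waitForCommand (word splitting of the output is intended):
#   rec /tmp/cmdrecording.flac rate 32k $(silence_args)
silence_args   # prints: silence 1 0.1 3% 1 3.0 3%
```

A noisy room would probably call for a higher `THRESHOLD`, while a shorter `STOP_DURATION` makes the script react sooner after you stop speaking.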
Further ideas:
You might not want to send private sounds to Google, so you could wire a hardware button to the Pi that 'mutes' the script. You could also add a line that switches on a (red) LED so that you know when it's listening.
I will add that later.
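Until then, here is a rough sketch of how the mute button and LED might hook into the loop. It assumes the button and LED are exposed as files holding "0" or "1", for example through the legacy sysfs GPIO interface with the pins already exported and configured. The pin numbers, variable names and helper functions are all hypothetical.

```shell
#!/bin/bash
# Assumption: button and LED are readable/writable as "0"/"1" value files.
# The GPIO pin numbers below are placeholders, not taken from the original.
BUTTON_VALUE_FILE="${BUTTON_VALUE_FILE:-/sys/class/gpio/gpio17/value}"
LED_VALUE_FILE="${LED_VALUE_FILE:-/sys/class/gpio/gpio27/value}"

# Succeeds when the mute button reads "1".
is_muted() {
    [ "$(cat "$BUTTON_VALUE_FILE" 2>/dev/null)" = "1" ]
}

# Switch the LED on ("1") or off ("0").
set_led() {
    echo "$1" > "$LED_VALUE_FILE"
}

# One listening attempt with muting added. The recording step is passed in
# as a command so this sketch stays independent of the rest of the script.
listen_once() {
    if is_muted; then
        set_led 0
        return 1
    fi
    set_led 1
    "$@"
}
```

In the main loop, `waitForCommand` would then be called as `listen_once waitForCommand`. While muted, that call fails immediately, so a short `sleep` in the loop would keep it from spinning.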