Created December 20, 2016 05:17
Simple, fast, reasonably accurate speech-to-text processing for audio recordings of speech.
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Lightweight Speech-to-Text (STT) with PocketSphinx\n",
    "\n",
    "More info:\n",
    "https://github.com/cmusphinx/pocketsphinx\n",
    "\n",
    "Adapted from:\n",
    "https://github.com/Uberi/speech_recognition/blob/master/examples/audio_transcribe.py\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "## Dependencies:\n",
    "\n",
    "## pip install pocketsphinx\n",
    "## pip install SpeechRecognition"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "import os\n",
    "import speech_recognition as sr\n",
    "\n",
    "# Work from the Desktop directory; the WAV file is downloaded here\n",
    "\n",
    "os.chdir(os.path.expanduser('~/Desktop/'))\n",
    "\n",
    "# Uncomment to download a sample WAV file into the current directory\n",
    "#!wget http://www.stephenmclaughlin.net/hipstas/wgbh_temp/wav/cpb-aacip-15-p26pz51t8d__CBSNS730426X_.h264.wav\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "stores over the floodwaters was over guns constitutes a leopold because it is really a coleman the same way that go goals famous work good souls was titled by google in russian flamboyantly on the title page that's also a pall of russia in that sense i may regard myself as a poet i do not regard myself the poet is a bullet in the sense that shall we regarded himself was a poet when he wrote the stove and home to the skylark i do not regard myself as being close to john keats as opposed i don't even regardless of who's been close that alexander popov support than either famously this miserable throb revolt and robert frost was my idea of a kind of poetry or was interested him however i did regard sock produces decided what that i'm a goner is as a possibility that socrates and the tinge line on most of the evidence would can gianni was a surprise to provence bought this organism that confronted the rug on\n"
     ]
    }
   ],
   "source": [
    "# Use a local WAV file as the STT audio source (path is relative to the working directory set above)\n",
    "\n",
    "wav_pathname = \"PUA_Kaldi_tests_161106/Antin-David_Studio111-Q_A_UPenn_3-16-04_1min.wav\"\n",
    "\n",
    "r = sr.Recognizer()\n",
    "with sr.AudioFile(wav_pathname) as source:\n",
    "    audio = r.record(source)  # read the entire audio file into memory\n",
    "\n",
    "# Recognize speech using PocketSphinx (runs locally, no network request)\n",
    "\n",
    "try:\n",
    "    print(r.recognize_sphinx(audio))\n",
    "except sr.UnknownValueError:\n",
    "    print(\"Sphinx could not understand audio\")\n",
    "except sr.RequestError as e:\n",
    "    print(\"Sphinx error; {0}\".format(e))\n"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 2",
   "language": "python",
   "name": "python2"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 2
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython2",
   "version": "2.7.12"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}
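
The notebook above reads the whole recording into memory with `r.record(source)` before recognition, which is fine for a one-minute clip. For longer recordings, one option is to transcribe fixed-length windows one at a time. The sketch below is not part of the original notebook: `wav_duration_seconds`, `chunk_windows`, and `transcribe_in_chunks` are hypothetical helper names, and the 30-second window is an arbitrary assumption. It relies on the `duration` and `offset` arguments of `Recognizer.record` from the SpeechRecognition library. Words that straddle a window boundary may be cut, so treat the joined text as a rough draft.

```python
import contextlib
import wave


def wav_duration_seconds(path):
    """Return the length of a PCM WAV file in seconds (stdlib only)."""
    with contextlib.closing(wave.open(path, "rb")) as wav_file:
        return wav_file.getnframes() / float(wav_file.getframerate())


def chunk_windows(total_seconds, chunk_seconds=30.0):
    """Split [0, total_seconds) into consecutive (offset, duration) windows."""
    if chunk_seconds <= 0:
        raise ValueError("chunk_seconds must be positive")
    windows = []
    offset = 0.0
    while offset < total_seconds:
        # The last window is shortened so it stops at the end of the file.
        windows.append((offset, min(chunk_seconds, total_seconds - offset)))
        offset += chunk_seconds
    return windows


def transcribe_in_chunks(path, chunk_seconds=30.0):
    """Hypothetical helper: run recognize_sphinx on one window at a time."""
    # Imported lazily so the helpers above work without the pip packages.
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    pieces = []
    for offset, duration in chunk_windows(wav_duration_seconds(path), chunk_seconds):
        # Re-open the file for each window; offset is measured from the start.
        with sr.AudioFile(path) as source:
            audio = recognizer.record(source, offset=offset, duration=duration)
        try:
            pieces.append(recognizer.recognize_sphinx(audio))
        except sr.UnknownValueError:
            pieces.append("")  # nothing recognizable in this window
    return pieces
```

With `pocketsphinx` and `SpeechRecognition` installed, `" ".join(transcribe_in_chunks(wav_pathname))` would give a single transcript string for the file used in the notebook.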