Last active
November 22, 2021 17:21
curl commands to use the Speech to Text service
#!/bin/sh
# This script calls the IBM Watson Speech to Text service using a session.
USERNAME="<SERVICE_USERNAME>"
PASSWORD="<SERVICE_PASSWORD>"
SESSION_ID="<SESSION_ID>" # you will get this after running step 1
# 1. Create a session:
curl -X POST -b cookies.txt -c cookies.txt -u "$USERNAME:$PASSWORD" -d "{}" "https://stream.watsonplatform.net/speech-to-text/api/v1/sessions"
# This returns a session URL. Note that the client must support cookies for sessions to work.
# 2. Send a GET request to fetch the interim transcription results:
curl -b cookies.txt -c cookies.txt -u "$USERNAME:$PASSWORD" \
  "https://stream.watsonplatform.net/speech-to-text/api/v1/sessions/$SESSION_ID/observer_result?interim_results=true"
# This request waits until audio is submitted, then returns interim results as they become available.
# 3. POST the audio to the session recognize URL.
# Here a FLAC file is sent; the audio can also be sent in real time, as it is being recorded from the system microphone.
curl -X POST -b cookies.txt -c cookies.txt -u "$USERNAME:$PASSWORD" \
  --header "Content-Type: audio/flac" --header "Transfer-Encoding: chunked" \
  --data-binary @pcm0003.flac \
  "https://stream.watsonplatform.net/speech-to-text/api/v1/sessions/$SESSION_ID/recognize?continuous=true"
# At this point you can continue submitting requests and observing interim results.
# 4. When you are done, close the session:
curl -X DELETE -b cookies.txt -c cookies.txt -u "$USERNAME:$PASSWORD" \
  "https://stream.watsonplatform.net/speech-to-text/api/v1/sessions/$SESSION_ID"
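The script expects SESSION_ID to be copied by hand from the step 1 response. A minimal sketch of extracting it automatically follows. It assumes the response is JSON with a string field named "session_id"; that field name, the sample response, and the helper name `extract_session_id` are all assumptions made for illustration and are not confirmed by the script above.

```shell
#!/bin/sh
# Hypothetical helper: print the value of the "session_id" field from a JSON
# body read on stdin. Assumes the value is a plain string with no escaped quotes.
extract_session_id() {
  sed -n 's/.*"session_id"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Illustrative response only; the real service output may look different.
SAMPLE_RESPONSE='{"session_id": "abc123", "new_session_uri": "https://example.invalid/sessions/abc123"}'

SESSION_ID=$(printf '%s\n' "$SAMPLE_RESPONSE" | extract_session_id)
echo "$SESSION_ID"
```

If the assumption holds, the curl command from step 1 could be piped into the helper in place of the sample response, so that the later steps pick up SESSION_ID without manual editing.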
The comment on line 19 says that audio from a live mic can be sent, but the code seems to be calling for a .flac file. Is there an arecord option that writes out to a file that this monitors, or can you show me how to have live audio from the mic sent in the example above?
I changed lines 21 and 22 to something like:
arecord -D sysdefault:CARD=Device | curl -X POST -b cookies.txt -c cookies.txt -u $USERNAME:$PASSWORD \
"https://stream.watsonplatform.net/speech-to-text/api/v1/sessions/$SESSION_ID/recognize?continuous=true" --header "Content-Type: audio/flac" --header "Transfer-Encoding: chunked" --data-binary @-
It seems to be waiting for stdin, but it just keeps waiting and nothing is returned. I suppose I am not ending the transmission, so I never get a response back. Is this on the right track? Can I use cURL to send live mic audio and get a constant stream of text back, or do we have to send something, get something back, and then send some more to get more back?
thank you in advance.
--jd
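One likely contributor to the hang described above is that curl reads everything given to `--data-binary @-` into memory before it sends the request, so an endless arecord stream is never transmitted. The sketch below uses `-T -` instead, which makes curl upload stdin as it arrives. It is an untested sketch under stated assumptions: raw 16 kHz mono 16-bit PCM from arecord, a service that accepts the `audio/l16; rate=16000` content type, and the ALSA device name from the comment above. The helper names `recognize_url` and `stream_stdin_to_session` are hypothetical.

```shell
#!/bin/sh
# Sketch only: stream raw microphone audio to the session recognize URL.
# Assumptions (not from the gist): 16 kHz mono 16-bit PCM, the
# "audio/l16; rate=16000" content type, and the ALSA device name.
BASE_URL="https://stream.watsonplatform.net/speech-to-text/api/v1"

# Hypothetical helper: build the recognize URL for a given session ID.
recognize_url() {
  printf '%s/sessions/%s/recognize?continuous=true\n' "$BASE_URL" "$1"
}

# Hypothetical helper: upload stdin as it arrives. "-T -" makes curl stream
# stdin, whereas "--data-binary @-" buffers all input before sending.
# "-N" turns off curl's output buffering so any response text shows up promptly.
stream_stdin_to_session() {
  curl -X POST -N -b cookies.txt -c cookies.txt -u "$USERNAME:$PASSWORD" \
    --header "Content-Type: audio/l16; rate=16000" \
    -T - "$(recognize_url "$1")"
}

# Record only when explicitly asked, so loading this file has no side effects.
if [ "${1:-}" = "--record" ]; then
  arecord -q -D sysdefault:CARD=Device -f S16_LE -r 16000 -c 1 -t raw |
    stream_stdin_to_session "$SESSION_ID"
fi
```

With this approach the interim text would still be read from the `observer_result` request in step 2, run in a second terminal while the recording is in progress.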