
Speech-to-text transcription from audio and video using Python

Transcribing audio or video files and URLs is simple with Haven OnDemand’s Speech Recognition API. All you need to do is install the official Python API wrapper, POST a publicly facing URL or a local file to the Speech Recognition API, and retrieve the result.

 

Code

Completed code

 

First, install the official Haven OnDemand Python API wrapper:

 

pip install havenondemand

 

Next, open up the file you will write your code in and import Haven OnDemand:

 

 

from havenondemand.hodclient import *
client = HODClient('APIKEY', 'v1')

 

 

Replace “APIKEY” with your API key, which can be found here after signing up.

 

If you’re operating behind a firewall, you’ll have to pass a proxy configuration when creating the client:

 

 

from havenondemand.hodclient import *
proxyDict = {
  "http"  : "http://user:pass@proxy.server.com:3128",
  "https" : "http://user:pass@proxy.server.com:3128",
  # "ftp"   : ftp_proxy
}
client = HODClient('APIKEY', 'v1', **proxyDict)
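
Hard-coding proxy credentials in a script makes them easy to leak. As a minimal sketch, the proxy dictionary could instead be assembled from environment variables. The helper name `build_proxy_dict` and the use of the conventional `HTTP_PROXY` / `HTTPS_PROXY` variable names are assumptions for illustration, not part of the Haven OnDemand wrapper.

```python
import os

def build_proxy_dict(env=None):
    """Collect proxy URLs from environment-style variables (hypothetical helper)."""
    env = os.environ if env is None else env
    proxies = {}
    for scheme in ("http", "https"):
        # Accept both upper- and lower-case names, e.g. HTTP_PROXY or http_proxy.
        value = env.get(scheme.upper() + "_PROXY") or env.get(scheme + "_proxy")
        if value:
            proxies[scheme] = value
    return proxies

# Demo with an explicit mapping; call build_proxy_dict() to read os.environ.
proxyDict = build_proxy_dict({"HTTPS_PROXY": "http://proxy.server.com:3128"})
print(proxyDict)
```

The resulting dictionary can be passed to `HODClient` with `**proxyDict` in the same way as the hard-coded one above.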

 

 

Next, you’ll call the Speech Recognition API by submitting either a local file or a publicly facing URL. Be sure to review the supported file inputs. Because this is an asynchronous request, you will also define a function below the call that checks the transcription status and ultimately prints the transcribed speech. (For more information on asynchronous and synchronous requests, see here.)

 

 

import time

# params = {'file': 'path/to/file.mp3'}  # uncomment to transcribe a local file instead
params = {'url': 'https://www.havenondemand.com/sample-content/videos/hpnext.mp4'}  # publicly facing URL
# Note: `async` is a reserved word from Python 3.7 onward, so this call as written targets Python 2.
response_async = client.post_request(params, HODApps.RECOGNIZE_SPEECH, async=True)
jobID = response_async['jobID']

def getJobAsyncJobStatus(jobID):
    # Poll in a loop rather than recursing, pausing between checks.
    while True:
        print('Checking job status...')
        response = client.get_job_status(jobID)
        if response is not None:  # done transcribing
            break
        time.sleep(2)  # still transcribing; wait before checking again
    transcription = response['document'][0]['content']
    print(transcription)

getJobAsyncJobStatus(jobID)
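
The status check above keeps polling until a result arrives, with no upper bound on how long it waits, and it assumes the first response always contains a `jobID`. The following is a minimal sketch of the same polling idea with a timeout and a clearer error when the job ID is missing. The helpers `require_job_id` and `poll_until_done` are hypothetical names, and `fake_get_job_status` is a stand-in for `client.get_job_status` so the example runs without an API key. Like the code above, it assumes the status call returns `None` while the job is still running.

```python
import time

def require_job_id(response_async):
    """Return the job ID, or raise a readable error if the response has none."""
    if not isinstance(response_async, dict) or 'jobID' not in response_async:
        raise ValueError('No jobID in response: %r' % (response_async,))
    return response_async['jobID']

def poll_until_done(fetch_status, job_id, interval=2.0, timeout=300.0):
    """Call fetch_status(job_id) until it returns something other than None."""
    deadline = time.time() + timeout
    while True:
        response = fetch_status(job_id)
        if response is not None:
            return response
        if time.time() >= deadline:
            raise RuntimeError('Job %s did not finish within %s seconds' % (job_id, timeout))
        time.sleep(interval)

# Stand-in for client.get_job_status: "not ready" twice, then a result.
_replies = [None, None, {'document': [{'content': 'hello world'}]}]

def fake_get_job_status(job_id):
    return _replies.pop(0)

job_id = require_job_id({'jobID': 'demo-job'})
result = poll_until_done(fake_get_job_status, job_id, interval=0)
print(result['document'][0]['content'])
```

With a real client, the equivalent call would be `poll_until_done(client.get_job_status, require_job_id(response_async))`, under the same assumption about what the status call returns.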

 



When you run this file, it prints a status message to the terminal while the file is being processed and then, once the job completes, the transcribed speech. It will look like the following:

 

 

Checking job status...
we want to hear from you let's get the conversation started about what's next for Hewlett Packard this is HP next this matters
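
The transcription is read from `response['document'][0]['content']`, which raises an error if the response is empty or shaped differently. As a minimal sketch, a small hypothetical helper (`extract_transcript`) can pull the text out more defensively. The response shape is taken from the code above; the possibility of more than one entry under `document` is an assumption.

```python
def extract_transcript(response):
    """Join the 'content' of every entry under 'document' (shape assumed from the article)."""
    documents = (response or {}).get('document') or []
    parts = [doc.get('content', '') for doc in documents if isinstance(doc, dict)]
    # Drop empty pieces and normalise surrounding whitespace before joining.
    return ' '.join(part.strip() for part in parts if part and part.strip())

sample = {'document': [
    {'content': 'we want to hear from you '},
    {'content': "let's get the conversation started"},
]}
print(extract_transcript(sample))
```

An empty or missing response yields an empty string rather than an exception, which makes it easier to tell "no speech recognised" apart from a crash.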

 

Learn more about speech processing here.





Comments
Rio
| ‎04-22-2017 11:21

I got this following error when I tried to execute the sample code given:

 

Traceback (most recent call last):
File "D:\FYP\python code\HODtest.py", line 7, in <module>
jobID = response_async['jobID']
KeyError: 'jobID'

 

I don't understand what's wrong, I already replaced the "APIKEY" with my current API key.

Please help

darkwarrrias
Monday

Traceback (most recent call last):
File "C:\Python27\p14.py", line 3, in <module>
response_async = client.post_request(params, HODApps.RECOGNIZE_SPEECH, async=True)
NameError: name 'client' is not defined
