Speech Recognition examples with Python

Speech recognition is the process of converting spoken words to text. Python has libraries that support many speech recognition engines and APIs, including Google Speech Recognition, Google Cloud Speech API, Microsoft Bing Voice Recognition and IBM Speech to Text.

In this tutorial we will use the Google Speech Recognition engine with Python.

Installation

The “SpeechRecognition” library provides this functionality. It is best installed inside an isolated environment created with pyenv, pipenv or virtualenv. You can also install it system-wide:

pip install SpeechRecognition

The SpeechRecognition module needs PyAudio for microphone input; you can install it from your package manager.
On Manjaro Linux the packages are called “python-pyaudio” and “python2-pyaudio”; they may have another name on your system.
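
Before moving on, it can help to confirm that both packages are visible to the Python interpreter you plan to use. The following is a minimal sketch using only the standard library. The helper name missing_modules is made up for this example, and it assumes the usual import names: speech_recognition for SpeechRecognition and pyaudio for PyAudio.

```python
import importlib.util


def missing_modules(module_names):
    """Return the names from module_names that cannot be found for import."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]


# SpeechRecognition is imported as "speech_recognition", PyAudio as "pyaudio".
missing = missing_modules(["speech_recognition", "pyaudio"])
if missing:
    print("Not installed: " + ", ".join(missing))
else:
    print("All required modules are installed")
```

If a name is reported as missing, install it into the same environment that runs this check.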

Speech Recognition demo
You can test the speech recognition module with the command:

python -m speech_recognition

The module listens to your microphone and prints what it recognizes in the terminal.

Speech Recognition with Google
The example below uses the Google Speech Recognition engine, which I’ve tested for the English language.

For testing purposes, it uses the default API key.
To use another API key, pass it as the key argument:

`r.recognize_google(audio, key="GOOGLE_SPEECH_RECOGNITION_API_KEY")`
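
If you use your own key, you may prefer not to hard-code it in the script. Here is a minimal sketch that reads it from an environment variable instead. The helper google_key_from_env and the variable name GOOGLE_SPEECH_RECOGNITION_API_KEY are assumptions made for this example, not part of the library.

```python
import os


def google_key_from_env(var_name="GOOGLE_SPEECH_RECOGNITION_API_KEY", environ=None):
    """Return the API key stored in the environment, or None if it is unset or empty."""
    # `environ` can be replaced by a plain dict, which makes the helper easy to test.
    environ = os.environ if environ is None else environ
    value = environ.get(var_name, "").strip()
    return value or None
```

With this helper the call becomes r.recognize_google(audio, key=google_key_from_env()). When the helper returns None, recognize_google uses the default key, because None is the default value of its key parameter.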

Copy the code below and save the file as speechtest.py.
Run it with Python 3.

#!/usr/bin/env python3

import speech_recognition as sr

# get audio from the microphone
r = sr.Recognizer()
with sr.Microphone() as source:
    print("Speak:")
    audio = r.listen(source)

# send the recorded audio to Google and print the transcription
try:
    print("You said " + r.recognize_google(audio))
except sr.UnknownValueError:
    # the engine could not make sense of the audio
    print("Could not understand audio")
except sr.RequestError as e:
    # the request failed (for example no network, or a bad API key)
    print("Could not request results; {0}".format(e))
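
In the script above, r.listen(source) has no timeout, so it keeps waiting until it detects a phrase. If the program seems stuck after printing "Speak:", a variant like the one below may help. It is only a sketch: the helper name capture_phrase and the values 1, 5 and 10 seconds are arbitrary choices for this example. It relies on adjust_for_ambient_noise, the timeout and phrase_time_limit arguments of listen, and the WaitTimeoutError exception from the SpeechRecognition library.

```python
def capture_phrase(recognizer, source, calibrate_seconds=1, timeout=5, phrase_time_limit=10):
    """Calibrate for background noise, then record one phrase from source.

    The library raises WaitTimeoutError if no speech starts within `timeout` seconds.
    """
    # Sample the background noise so the energy threshold fits the room.
    recognizer.adjust_for_ambient_noise(source, duration=calibrate_seconds)
    print("Speak:")
    # Give up after `timeout` seconds of silence; stop recording after `phrase_time_limit`.
    return recognizer.listen(source, timeout=timeout, phrase_time_limit=phrase_time_limit)


def main():
    # Imported here so capture_phrase can be reused without the library loaded.
    import speech_recognition as sr

    r = sr.Recognizer()
    with sr.Microphone() as source:
        try:
            audio = capture_phrase(r, source)
        except sr.WaitTimeoutError:
            print("No speech detected before the timeout")
            return

    try:
        print("You said " + r.recognize_google(audio))
    except sr.UnknownValueError:
        print("Could not understand audio")
    except sr.RequestError as e:
        print("Could not request results; {0}".format(e))
```

Add a call to main() at the end of the file to try it. Stay quiet during the first second so the calibration only hears background noise.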

Leave a Reply:

Elvira • Sat, 29 Jul 2017

It works quite well for me

Esteban Montesinos • Sat, 11 Apr 2020

It doesn't run for me; it just prints "Speak:" on the console and doesn't do anything else.

dev • Sat, 11 Apr 2020

Check which microphone inputs are available and set the right input. Also try different speech engines: https://github.com/Uberi/sp...

กำธร ตันติวิทยาทันต์ • Tue, 28 Apr 2020

Excellent!

griffinadelmann • Fri, 15 Jan 2021

How can I change the microphone settings? It doesn't seem to pick up my mic, which is attached via USB.

dev • Fri, 15 Jan 2021

Make sure you have PyAudio installed. Then you can change the id of the microphone like so: Microphone(device_index=3). Change the device_index to the microphone id you need. You can list the microphones this way:

import speech_recognition as sr
for index, name in enumerate(sr.Microphone.list_microphone_names()):
    print('Microphone with name "{1}" found for `Microphone(device_index={0})`'.format(index, name))
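
Device indexes can change when USB devices are plugged in or removed, so picking the microphone by name may be more convenient than hard-coding a number. Below is a minimal sketch of that idea. The helpers find_microphone_index and open_microphone are made up for this example; only Microphone.list_microphone_names() and Microphone(device_index=...) come from the library.

```python
def find_microphone_index(names, keyword):
    """Return the index of the first name containing keyword (case-insensitive), or None."""
    keyword = keyword.lower()
    for index, name in enumerate(names):
        # Skip entries without a usable name.
        if name and keyword in name.lower():
            return index
    return None


def open_microphone(keyword):
    """Return an sr.Microphone for the first device whose name contains keyword."""
    # Imported here so find_microphone_index can be used without the library loaded.
    import speech_recognition as sr

    index = find_microphone_index(sr.Microphone.list_microphone_names(), keyword)
    if index is None:
        raise LookupError("No microphone name contains " + repr(keyword))
    return sr.Microphone(device_index=index)
```

For example, with open_microphone("usb") as source: would select the first device with "usb" in its name, assuming the USB microphone reports such a name in the list printed above.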