Speech Recognition examples with Python

Speech recognition has improved considerably, and converting spoken language into text is now straightforward. Python has several libraries for the task. The SpeechRecognition library used in this tutorial wraps a number of engines and APIs, among them Google Speech Recognition, Google Cloud Speech API, Microsoft Bing Voice Recognition, and IBM Speech to Text.

In this tutorial, we’ll use Python with Google’s Speech Recognition engine.

Setting Up the Development Environment

To get started, install the SpeechRecognition library. It works inside environments managed by pyenv, pipenv, or virtualenv. For a global installation, run the command below:

pip install SpeechRecognition

SpeechRecognition needs PyAudio for microphone input (PyAudio is not required when you only work with audio files). How you install PyAudio depends on the operating system. For example, Manjaro Linux provides the packages “python-pyaudio” and “python2-pyaudio”. Choose the package that matches your OS and Python version.
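
Before running any microphone code, it can help to confirm that PyAudio is importable from the Python interpreter you plan to use. The following is a minimal sketch using only the standard library; the helper name has_module is made up for this example, and “pyaudio” is the import name of the PyAudio package.

```python
import importlib.util


def has_module(name):
    """Return True if a top-level module with this name can be found."""
    return importlib.util.find_spec(name) is not None


# "pyaudio" is the import name of the PyAudio package
if has_module("pyaudio"):
    print("PyAudio found: microphone input should be available.")
else:
    print("PyAudio not found: install it before using the microphone.")
```

If the second message appears even though you installed the package, the package was probably installed for a different Python interpreter or environment.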

Introduction to Speech Recognition

To try the speech recognition module right away, run the command below. It starts a short demo that listens on the default microphone and prints what it recognizes in the terminal:

python -m speech_recognition

A Closer Look at Google’s Speech Recognition Engine

Now let’s look more closely at Google’s Speech Recognition engine. This tutorial uses English, but the engine supports many other languages, which you can select through the language argument of recognize_google.

Note that this walkthrough uses the library’s default Google API key. To use your own API key, pass it as the key argument:

r.recognize_google(audio, key="YOUR_GOOGLE_SPEECH_RECOGNITION_API_KEY")
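
The key and language arguments can be combined in one call. Below is a small sketch of a wrapper that does this. The function name recognize_with_options is hypothetical, and the defaults are only illustrative. It relies on recognize_google accepting the keyword arguments language (a language tag such as "en-US" or "fr-FR") and key.

```python
def recognize_with_options(recognizer, audio, language="en-US", api_key=None):
    """Transcribe audio with recognize_google, choosing language and key.

    recognizer is expected to be an sr.Recognizer instance and audio the
    object returned by its listen() or record() method.
    """
    options = {"language": language}
    # Only pass a key when one is supplied; otherwise the library default is used
    if api_key is not None:
        options["key"] = api_key
    return recognizer.recognize_google(audio, **options)
```

For example, recognize_with_options(r, audio, language="fr-FR") would ask the engine to treat the recording as French, with r and audio obtained as in the microphone script in this tutorial.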

To see the engine in action, save the code below as “speechtest.py” and run it with Python 3:

#!/usr/bin/env python3

import speech_recognition as sr

# Capture audio input from the microphone
r = sr.Recognizer()
with sr.Microphone() as source:
    print("Kindly voice out your command!")
    audio = r.listen(source)

try:
    print("Interpreted as: " + r.recognize_google(audio))
except sr.UnknownValueError:
    print("Apologies, the audio wasn't clear enough.")
except sr.RequestError as e:
    print("There was an issue retrieving results. Error: {0}".format(e))
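
If the script seems to hang after printing the prompt, one possible cause is that listen() is still waiting for speech to begin, for example because background noise confuses its energy threshold. The sketch below calibrates for ambient noise first and then limits how long it waits. The names listen_once and demo are made up for this example and the time values are arbitrary; adjust_for_ambient_noise, the timeout and phrase_time_limit arguments of listen, and sr.WaitTimeoutError come from the SpeechRecognition library.

```python
def listen_once(recognizer, source, timeout=5, phrase_time_limit=10,
                calibrate_seconds=1):
    """Calibrate for background noise, then record a single phrase."""
    # Sample the background noise so the energy threshold fits the room
    recognizer.adjust_for_ambient_noise(source, duration=calibrate_seconds)
    # timeout: seconds to wait for speech to start before giving up
    # phrase_time_limit: maximum seconds of speech to record
    return recognizer.listen(source, timeout=timeout,
                             phrase_time_limit=phrase_time_limit)


def demo():
    """Run the helper against the default microphone (needs PyAudio)."""
    # Imported here so that listen_once itself has no hard dependency
    import speech_recognition as sr

    r = sr.Recognizer()
    with sr.Microphone() as source:
        print("Kindly voice out your command!")
        try:
            audio = listen_once(r, source)
        except sr.WaitTimeoutError:
            print("No speech detected before the timeout.")
            return
    try:
        print("Interpreted as: " + r.recognize_google(audio))
    except sr.UnknownValueError:
        print("Apologies, the audio wasn't clear enough.")
    except sr.RequestError as e:
        print("There was an issue retrieving results. Error: {0}".format(e))
```

Call demo() to try it. With a timeout set, the script exits with a message instead of waiting indefinitely when the microphone picks up nothing.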

Leave a Reply:

Elvira • Sat, 29 Jul 2017

It works quite well for me

Esteban Montesinos • Sat, 11 Apr 2020

It doesn't run for me. It just prints "speak:" on the console and doesn't do anything else.

dev • Sat, 11 Apr 2020

Check which microphone inputs are available and set the right input. Also try with different speech engines https://github.com/Uberi/sp...

กำธร ตันติวิทยาทันต์ • Tue, 28 Apr 2020

Excellent!

griffinadelmann • Fri, 15 Jan 2021

How can I change the microphone settings? It doesn't seem to pick it up for me; the mic is attached via USB.

dev • Fri, 15 Jan 2021

Make sure you have PyAudio installed. Then you can change the id of the microphone like so: Microphone(device_index=3). Change the device_index to the microphone id you need. You can list the microphones this way:

import speech_recognition as sr
for index, name in enumerate(sr.Microphone.list_microphone_names()):
    print('Microphone with name "{1}" found for `Microphone(device_index={0})`'.format(index, name))