Speech recognition is the process of converting spoken words to text. Python libraries support many speech recognition engines and APIs, including Google Speech Recognition, Google Cloud Speech API,
Microsoft Bing Voice Recognition and IBM Speech to Text.
In this tutorial we will use the Google Speech Recognition engine with Python.
The library we will use is named “SpeechRecognition”. Ideally, install it in an isolated environment created with pyenv, pipenv or virtualenv. You can also install it system-wide:
pip install SpeechRecognition
The SpeechRecognition module depends on PyAudio for microphone input; you can install it from your package manager.
On Manjaro Linux the packages are called “python-pyaudio” and “python2-pyaudio”; they may have a different name on your system.
Speech Recognition demo
You can test the speech recognition module with its built-in demo, which listens on the default microphone:
python -m speech_recognition
Results show in the terminal.
Speech Recognition with Google
The example below uses the Google Speech Recognition engine, which I’ve tested with English.
For testing purposes, it uses the default API key.
To use another API key, pass it as the key argument of the recognizer's recognize_google method.
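As a rough sketch of the API key case: recognize_google accepts an optional key argument, and leaving it as None falls back to the library's default key. The helper name recognize_with_key and the placeholder key string in the usage note are illustrative assumptions, not part of the library.

```python
def recognize_with_key(recognizer, audio, api_key=None, language="en-US"):
    """Transcribe `audio` with Google, optionally using your own API key.

    `recognizer` is expected to be a speech_recognition.Recognizer and
    `audio` the AudioData it captured. With api_key=None the library
    falls back to its built-in default key.
    """
    return recognizer.recognize_google(audio, key=api_key, language=language)
```

With a recognizer r and a captured audio object, this would be called as recognize_with_key(r, audio, api_key="YOUR_API_KEY").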
Copy the code below and save the file as speechtest.py.
Run it with Python 3.
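A minimal sketch of what such a speechtest.py might look like, assuming SpeechRecognition and PyAudio are installed and a working microphone is the default input device. The function names transcribe and listen_and_transcribe are illustrative choices, not part of the library, and the default API key is used.

```python
def transcribe(recognizer, audio, language="en-US"):
    """Return the recognized text, or None if recognition failed."""
    # Imported inside the function so this helper can be loaded and
    # exercised even on a machine without the audio stack.
    import speech_recognition as sr  # third-party: pip install SpeechRecognition

    try:
        # No key argument is passed, so the library's default API key is used.
        return recognizer.recognize_google(audio, language=language)
    except sr.UnknownValueError:
        # The service answered but could not make out any words.
        print("Google Speech Recognition could not understand the audio")
    except sr.RequestError as error:
        # The service could not be reached or rejected the request.
        print(f"Could not request results from Google Speech Recognition: {error}")
    return None


def listen_and_transcribe():
    """Record one phrase from the default microphone and print the result."""
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.Microphone() as source:  # requires PyAudio
        print("Say something!")
        audio = recognizer.listen(source)  # blocks until a phrase has ended

    text = transcribe(recognizer, audio)
    if text is not None:
        print("You said: " + text)
    return text
```

Adding a final line that calls listen_and_transcribe() makes the file runnable with python3 speechtest.py. Keeping the network call in its own function means the error handling can be checked without a microphone.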