You can transcribe an audio file automatically with Python.
Given an audio file with spoken words, the program outputs a transcription of that speech.
This example uses English as the input language, but any language can be used as long as the speech recognition engine supports it.
Start off by creating an audio file with some speech. This can be any audio file with English words. Save the file as transcript.mp3.
If you are unsure where to get a spoken-word audio file, you can use Bluemix to generate one.
To run the app you need several things installed:
- Python 3
- the module pydub
- the program ffmpeg
- the module SpeechRecognition
You can install the Python modules with pip, for example pip install pydub SpeechRecognition. ffmpeg, which pydub relies on to decode mp3 files, can be installed with your package manager (apt-get, emerge, yum, pacman).
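Before going further, it can help to confirm that the requirements listed above are actually present. The following is a minimal sketch using only the standard library. The function name missing_requirements is made up for this example, and it assumes the modules import as pydub and speech_recognition and that the ffmpeg executable is on the PATH under that name.

```python
import importlib.util
import shutil


def missing_requirements(modules=("pydub", "speech_recognition"),
                         programs=("ffmpeg",)):
    """Return the names of required modules and programs that cannot be found.

    Hypothetical helper for this tutorial, not part of any library.
    """
    missing = []
    for name in modules:
        # find_spec returns None when a top-level module is not importable.
        if importlib.util.find_spec(name) is None:
            missing.append(name)
    for name in programs:
        # shutil.which returns None when the executable is not on the PATH.
        if shutil.which(name) is None:
            missing.append(name)
    return missing


# Prints the list of missing items, or a short message if nothing is missing.
print(missing_requirements() or "All requirements found")
```

If the list is not empty, install the named items with pip or your package manager before continuing.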
Audio transcription works in a few steps:
- mp3 to wav conversion,
- loading the audio file,
- feeding the audio file to a speech recognition system.
Copy the program below and save it as transcribe.py:
import speech_recognition as sr
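One way the three steps could fit together in transcribe.py is sketched below. This is a sketch under assumptions, not a definitive implementation: it assumes pydub's AudioSegment for the mp3 to wav conversion, SpeechRecognition's Recognizer and AudioFile for loading, and the Google Web Speech API backend (recognize_google) as the recognition engine, which needs network access. The names wav_path_for and transcribe_mp3 are made up for this example.

```python
import os


def wav_path_for(mp3_path):
    """Return the .wav path that sits next to the given .mp3 file."""
    root, _ext = os.path.splitext(mp3_path)
    return root + ".wav"


def transcribe_mp3(mp3_path, language="en-US"):
    """Convert an mp3 to wav, load it, and return the recognized text.

    Hypothetical helper for this tutorial. The language code is passed
    to the recognition engine, so other languages work only if the
    engine supports them.
    """
    # Third-party imports are kept inside the function so the helper above
    # can be used even when pydub or SpeechRecognition is not installed.
    from pydub import AudioSegment  # needs ffmpeg to decode mp3 input
    import speech_recognition as sr

    # Step 1: mp3 to wav conversion.
    wav_path = wav_path_for(mp3_path)
    AudioSegment.from_mp3(mp3_path).export(wav_path, format="wav")

    # Step 2: load the whole wav file into memory as audio data.
    recognizer = sr.Recognizer()
    with sr.AudioFile(wav_path) as source:
        audio = recognizer.record(source)

    # Step 3: feed the audio to a speech recognition system.
    # Assumption: Google's Web Speech API backend, which needs network access.
    return recognizer.recognize_google(audio, language=language)
```

Adding print(transcribe_mp3("transcript.mp3")) at the bottom of the file would print the transcription when the file is run. The wav file is written next to the mp3, so transcript.mp3 produces transcript.wav in the same directory.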
Run the program with:
python3 transcribe.py
It will output the transcription of the original audio file.