shell+python scripts that convert the audio track in a video to a subtitle track, using Google's speech recognition
voice2subs
a quick and dirty shell+python combo that uses the Google Cloud Speech-to-Text service to convert the audio in a video file to subtitles.
usage
pass one or more video files to voice2subs.sh on the cli:
$ ./voice2subs.sh test.mp4
Processing 'test.mp4'...
------------========----
extracting audio...
converting audio to text...
Waiting for operation [operations/8540494017153580661] to complete...done.
converting google yaml data to subtitle data...
Finished, result is in: 'test_with_subs.mkv'
pre-reqs
this code requires the following to run:
- ffmpeg
- gcloud cli tool, configured with a gcs project
- python3-yaml
- python3-srt
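a quick way to check these before a run is sketched below. it is not part of the repo: `missing_prereqs` is a hypothetical helper, and the module names `yaml` and `srt` are assumed to be what the python3-yaml and python3-srt packages install. it only checks that the tools are on the PATH and the modules are importable, not that gcloud is configured with a project.

```python
import importlib.util
import shutil


def missing_prereqs(tools=("ffmpeg", "gcloud"), modules=("yaml", "srt")):
    """Return the names of required tools and modules that cannot be found."""
    # Command line tools must be reachable through PATH.
    missing = [tool for tool in tools if shutil.which(tool) is None]
    # Python modules must be importable (top-level names only).
    missing += [mod for mod in modules if importlib.util.find_spec(mod) is None]
    return missing


if __name__ == "__main__":
    problems = missing_prereqs()
    print("missing: " + ", ".join(problems) if problems else "all pre-reqs found")
```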
design
ml2srt.py
a small python script that expects the output of a google ml command like `gcloud -q --format yaml ml speech recognize-long-running --include-word-time-offsets`, and converts it into an SRT format subtitles file.
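the conversion can be sketched roughly as follows. this is not the code in ml2srt.py: it uses only the standard library, while the real script depends on python3-yaml and python3-srt. it assumes the parsed output has the shape `results -> alternatives -> words`, with each word carrying `word`, `startTime` and `endTime`, and offsets written like `1.300s`. the function names and the 7-words-per-cue grouping are made up for illustration.

```python
from datetime import timedelta


def parse_offset(value: str) -> timedelta:
    """Turn an offset such as '1.300s' into a timedelta (assumed format)."""
    return timedelta(seconds=float(value.rstrip("s")))


def format_timestamp(t: timedelta) -> str:
    """Render a timedelta as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = round(t.total_seconds() * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def words_to_srt(response: dict, max_words: int = 7) -> str:
    """Group recognized words into cues of at most max_words words each."""
    words = []
    for result in response.get("results", []):
        alternatives = result.get("alternatives", [])
        if alternatives:
            # Only the first alternative of each result is used here.
            words.extend(alternatives[0].get("words", []))

    cues = []
    for i in range(0, len(words), max_words):
        chunk = words[i:i + max_words]
        start = format_timestamp(parse_offset(chunk[0]["startTime"]))
        end = format_timestamp(parse_offset(chunk[-1]["endTime"]))
        text = " ".join(w["word"] for w in chunk)
        # An SRT cue is: index, time range, text, then a blank line.
        cues.append(f"{len(cues) + 1}\n{start} --> {end}\n{text}\n")
    return "\n".join(cues)
```

with python3-yaml installed, the dict passed to `words_to_srt` could come from `yaml.safe_load` applied to the gcloud output.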
voice2subs.sh
a small shell script that does the following:
- rips the audio track from a video file
- processes the audio track with `gcloud ml speech`, per above
- calls `ml2srt.py` to convert the google output to a subtitle file
- remuxes the original video and the subtitle file into a new file, dropping audio
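the steps above map onto three external commands, sketched here in python for illustration. this is an assumption-heavy outline, not a copy of voice2subs.sh: `build_pipeline` is a hypothetical helper that only builds the argument lists and runs nothing. the intermediate file names, the mono 16 kHz FLAC settings and the `en-US` language code are guesses. only the gcloud flags quoted earlier and the `_with_subs.mkv` output name come from this README. for longer recordings gcloud may need the audio in a Cloud Storage bucket, and any upload step is not shown.

```python
from pathlib import Path


def build_pipeline(video: str, language: str = "en-US") -> dict:
    """Build (but do not run) the commands for one video file."""
    src = Path(video)
    audio = src.with_suffix(".flac")  # assumed intermediate audio file
    subs = src.with_suffix(".srt")    # assumed subtitle file name
    output = src.with_name(f"{src.stem}_with_subs.mkv")
    return {
        # Drop the video stream (-vn) and write mono 16 kHz audio.
        "extract": ["ffmpeg", "-y", "-i", str(src), "-vn",
                    "-ac", "1", "-ar", "16000", str(audio)],
        # Ask for word-level timestamps in YAML, as ml2srt.py expects.
        "recognize": ["gcloud", "-q", "--format", "yaml", "ml", "speech",
                      "recognize-long-running", str(audio),
                      "--language-code", language,
                      "--include-word-time-offsets"],
        # Keep only the video from input 0 and the subtitles from input 1.
        "remux": ["ffmpeg", "-y", "-i", str(src), "-i", str(subs),
                  "-map", "0:v", "-map", "1:0",
                  "-c:v", "copy", "-c:s", "srt", str(output)],
        "output": [str(output)],
    }


if __name__ == "__main__":
    for step, argv in build_pipeline("test.mp4").items():
        print(step, " ".join(argv))
```

each list could be handed to `subprocess.run(argv, check=True)`, with the output of the recognize step captured and passed to the converter before the remux.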