# voice2subs

a quick and dirty shell+python combo which uses the google cloud speech to text service to convert audio in a video file to subtitles.

## usage

provide one or more video files to process to `voice2subs.sh` on the cli:

```sh
$ ./voice2subs.sh test.mp4
Processing 'test.mp4'...
------------========----
extracting audio...
converting audio to text...
Waiting for operation [operations/8540494017153580661] to complete...done.
converting google yaml data to subtitle data...
Finished, result is in: 'test_with_subs.mkv'
```

## pre-reqs

this code requires the following to run:

* ffmpeg
* gcloud cli tool, configured with a gcs project
* python3-yaml
* python3-srt

## design

### ml2srt.py

a small python script that expects the output of a google ml command like `gcloud -q --format yaml ml speech recognize-long-running --include-word-time-offsets`, and converts it into an [SRT format](https://en.wikipedia.org/wiki/SubRip) subtitles file.

### voice2subs.sh

a small shell script that does the following:

* rips audio track from a video file
* processes the audio track with `gcloud ml speech`, per above
* calls `ml2srt.py` to convert the google output to a subtitle file
* remuxes the original video and the subtitle file into a new file, dropping audio
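
to illustrate the kind of conversion `ml2srt.py` performs, here is a minimal sketch, not the script itself. it assumes the yaml has already been parsed into a dict (for example with `yaml.safe_load` from python3-yaml), that the response has the shape `results -> alternatives -> words`, and that each word carries `startTime`, `endTime` and `word`, with times written like `1.300s`. the function names and the fixed `words_per_cue` grouping are hypothetical choices made for this example, and the SRT text is built by hand with the standard library rather than with python3-srt.

```python
from datetime import timedelta


def parse_offset(value):
    """Turn a duration string such as '1.300s' into a timedelta."""
    return timedelta(seconds=float(value.rstrip("s")))


def format_timestamp(delta):
    """Format a timedelta as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = round(delta.total_seconds() * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def response_to_srt(response, words_per_cue=7):
    """Group recognized words into fixed-size cues and render SRT text.

    `response` is assumed to be the parsed recognition output; the
    field names used below are assumptions about that output.
    """
    words = []
    for result in response.get("results", []):
        alternatives = result.get("alternatives") or []
        if alternatives:
            # Only the first alternative of each result is used here.
            words.extend(alternatives[0].get("words", []))

    cues = []
    for index, start in enumerate(range(0, len(words), words_per_cue), start=1):
        chunk = words[start:start + words_per_cue]
        begin = format_timestamp(parse_offset(chunk[0]["startTime"]))
        end = format_timestamp(parse_offset(chunk[-1]["endTime"]))
        text = " ".join(w["word"] for w in chunk)
        # An SRT cue: sequence number, time range, then the text.
        cues.append(f"{index}\n{begin} --> {end}\n{text}\n")
    # Cues are separated by a blank line.
    return "\n".join(cues)
```

grouping by a fixed word count keeps the sketch short; splitting on pauses between words or on a maximum cue duration would likely read better on screen.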