voice2subs/README.md

# voice2subs

a quick and dirty shell+python combo that uses the google cloud
speech-to-text service to turn the audio in a video file into subtitles.

## usage
pass one or more video files to `voice2subs.sh` on the command line:
```sh
$ ./voice2subs.sh test.mp4
Processing 'test.mp4'...
------------========----
extracting audio...
converting audio to text...
Waiting for operation [operations/8540494017153580661] to complete...done.
converting google yaml data to subtitle data...
Finished, result is in: 'test_with_subs.mkv'
```

## pre-reqs
this code requires the following to run:
* ffmpeg
* `gcloud` cli tool, configured with a google cloud project
* python3-yaml
* python3-srt
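
a quick way to check the list above before running anything is sketched below. this is only a sketch, not part of the repo: `missing_prereqs` is a hypothetical helper, and it assumes the `python3-yaml` and `python3-srt` packages install modules named `yaml` and `srt`.

```python
import importlib.util
import shutil


def missing_prereqs(tools=("ffmpeg", "gcloud"), modules=("yaml", "srt")):
    """Return the names of command-line tools and python modules that are not found.

    The module names are an assumption about what python3-yaml and
    python3-srt install; adjust them if your distribution differs.
    """
    # shutil.which returns None when the executable is not on PATH
    missing = [tool for tool in tools if shutil.which(tool) is None]
    # find_spec returns None when a top-level module cannot be imported
    missing += [mod for mod in modules if importlib.util.find_spec(mod) is None]
    return missing


if __name__ == "__main__":
    absent = missing_prereqs()
    print("missing: " + ", ".join(absent) if absent else "all pre-reqs found")
```

note that this only checks that `gcloud` is installed, not that it is configured with a project.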

## design
### ml2srt.py
a small python script that reads the yaml output of a `gcloud ml speech` command such as
`gcloud -q --format yaml ml speech recognize-long-running --include-word-time-offsets`
and converts it into a [SubRip (SRT)](https://en.wikipedia.org/wiki/SubRip)
subtitle file.
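
the conversion can be sketched roughly as follows. this is not the actual `ml2srt.py`: it uses only the standard library instead of `python3-srt`, works on an already-parsed response (what `yaml.safe_load` would return), and assumes the response has `results` / `alternatives` / `words` entries whose `startTime` and `endTime` look like `1.300s`. the function names and the fixed `words_per_cue` grouping are made up for illustration.

```python
from datetime import timedelta


def parse_offset(value):
    """Turn an offset such as '1.300s' into a timedelta (assumed gcloud format)."""
    return timedelta(seconds=float(str(value).rstrip("s")))


def srt_timestamp(delta):
    """Format a timedelta as the HH:MM:SS,mmm timestamp used by SRT."""
    total_ms = round(delta.total_seconds() * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def response_to_srt(response, words_per_cue=8):
    """Group recognized words into fixed-size cues and render them as SRT text."""
    cues = []
    for result in response.get("results", []):
        alternatives = result.get("alternatives") or []
        if not alternatives:
            continue
        # only the first alternative of each result is used
        words = alternatives[0].get("words", [])
        for i in range(0, len(words), words_per_cue):
            chunk = words[i:i + words_per_cue]
            start = parse_offset(chunk[0]["startTime"])
            end = parse_offset(chunk[-1]["endTime"])
            cues.append((start, end, " ".join(w["word"] for w in chunk)))
    # an SRT cue is: index line, time range line, text, then a blank line
    blocks = [
        f"{n}\n{srt_timestamp(s)} --> {srt_timestamp(e)}\n{text}\n"
        for n, (s, e, text) in enumerate(cues, start=1)
    ]
    return "\n".join(blocks)


if __name__ == "__main__":
    sample = {"results": [{"alternatives": [{"words": [
        {"startTime": "0s", "endTime": "0.500s", "word": "hello"},
        {"startTime": "0.500s", "endTime": "1.300s", "word": "world"},
    ]}]}]}
    print(response_to_srt(sample))
```

splitting on a fixed word count is the simplest grouping that works; splitting on pauses between words or on a maximum cue duration would likely read better.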

### voice2subs.sh
a small shell script that does the following:
* rips audio track from a video file
* processes the audio track with `gcloud ml speech`, per above
* calls `ml2srt.py` to convert the google output to a subtitle file
* remuxes the original video and the subtitle file into a new file, dropping audio
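
the steps above map onto external commands roughly like this. it is a sketch in python rather than the real shell script, and the exact flags are assumptions: mono flac audio, an `en-US` language code, and the stream mapping are illustrative choices, and the helper names are hypothetical. the `ml2srt.py` step is left out because its command-line interface is not described here.

```python
from pathlib import Path


def output_path(video):
    """'test.mp4' -> 'test_with_subs.mkv', matching the usage example."""
    video = Path(video)
    return video.with_name(f"{video.stem}_with_subs.mkv")


def extract_audio_cmd(video, audio):
    """ffmpeg: drop video (-vn), downmix to mono, encode as flac (assumed format)."""
    return ["ffmpeg", "-y", "-i", str(video), "-vn", "-ac", "1",
            "-c:a", "flac", str(audio)]


def recognize_cmd(audio, language="en-US"):
    """gcloud: long-running recognition with per-word timestamps, yaml output."""
    return ["gcloud", "-q", "--format", "yaml", "ml", "speech",
            "recognize-long-running", str(audio),
            "--language-code", language, "--include-word-time-offsets"]


def remux_cmd(video, subs, out):
    """ffmpeg: keep only the video stream (audio is dropped) and add the subtitles."""
    return ["ffmpeg", "-y", "-i", str(video), "-i", str(subs),
            "-map", "0:v", "-map", "1:0",
            "-c:v", "copy", "-c:s", "srt", str(out)]


if __name__ == "__main__":
    video = Path("test.mp4")
    # print the commands instead of running them; subprocess.run(cmd, check=True)
    # would execute each one
    for cmd in (
        extract_audio_cmd(video, video.with_suffix(".flac")),
        recognize_cmd(video.with_suffix(".flac")),
        remux_cmd(video, video.with_suffix(".srt"), output_path(video)),
    ):
        print(" ".join(cmd))
```

building each command as a list keeps filenames with spaces intact when the list is handed to `subprocess.run`.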

first pass, seems to work 2021-11-05 06:03:00 +00:00