YouTube's auto-translated captions get spottier the further you translate away from English. This sucks if we rely on captions to understand videos that aren’t in our language, or if YouTube is missing captions altogether. We also had some fun (it was more like "self-induced pain") by deciding not to call OpenAI’s hosted Whisper API and instead hosting Whisper ourselves on Google App Engine (repo here).
Extension repo: https://github.com/mkandan/dubdubs
2 edge functions: https://github.com/mkandan/dubdubs/tree/c9d61a3e4dee8167f0c6d0b6f9c51aba10fa3544/supabase
For example, try clicking the captions button on the video below:
https://www.youtube.com/watch?v=u7j--YMXZtA&list=RDu7j--YMXZtA
Just use the freaking paid API/model 😂
We tried wrangling our way out of using OpenAI’s hosted models. We wanted to avoid making users supply their own API tokens and get “charged” for it. The spirit and hard work were there, but we couldn’t build a scalable solution that was anywhere near as quick.
There is a price for convenience. We wanted to challenge ourselves and see how far we could go, but that came at the cost of an incomplete project.