This repo contains all the code for the project, which is built on Whisper and MoviePy. It includes a version that runs in an IPython Notebook, and I have also released an application version compatible with Gradient Deployments. This was only my second time working on a front end with HTML, so it's all still a little rudimentary for now.
It works by first generating translated speech-to-text values at labeled timestamps with Whisper. MoviePy then scales the captions to the size of the input video and overlays them at the (mostly) correct timeslots.
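That scaling-and-timing step can be sketched as a small helper. This is an illustrative function, not the repo's actual code: it takes Whisper-style segments (dicts with `start`, `end`, and `text` keys, as returned in the `segments` field of Whisper's transcription result) and produces caption specs sized for the target video, which could then be fed to MoviePy `TextClip`s. The `base_w` and `base_font` reference values are assumptions for the sketch.

```python
def segments_to_captions(segments, video_w, video_h, base_w=1280, base_font=40):
    """Turn Whisper-style segments into caption specs sized for a video.

    Each returned dict holds the text, its start/end times in seconds,
    a font size scaled to the video width, and a y-position near the
    bottom of the frame.
    """
    # Scale the font proportionally to the video width, with a floor
    # so captions stay legible on very small videos.
    fontsize = max(12, int(base_font * video_w / base_w))
    captions = []
    for seg in segments:
        captions.append({
            "text": seg["text"].strip(),
            "start": seg["start"],
            "end": seg["end"],
            "fontsize": fontsize,
            # Place captions in the lower part of the frame.
            "y": int(video_h * 0.85),
        })
    return captions
```

Each spec could then be turned into an overlay with something like `TextClip(spec["text"], fontsize=spec["fontsize"]).set_start(spec["start"]).set_end(spec["end"])` and composited onto the source clip.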
In this tutorial, I walk through how Whisper can be used with MoviePy to automatically generate and overlay translated subtitles on any video sample. Be sure to check out the GitHub repo as well, linked at the bottom.
This project was based on Whisper, MoviePy, and Flask.
The intersection of architecture and 3D design with ML is an area I haven't seen much work in, but it would be fascinating to see what comes from it.
In our newest article, we discuss autoencoders and convolutional autoencoders in the context of image data. We then show how to write custom autoencoders of our own with PyTorch, train them, and view our results in a Gradient Notebook.
Check out the full guide here: blog.paperspace.com/dreambooth-stable-diffusion-tutorial-1/