Update: I turned my open-source Wav2Lip tool into a native Desktop App (PyQt6). No more OOM crashes on 8GB cards + High-Res Face Patching.

Hi everyone,

I posted here a while ago about Reflow, a tool I’m building to chain TTS, RVC (Voice Cloning), and Wav2Lip locally.

Back then, it was a bit of a messy web-UI script that crashed a lot. I’ve spent the last few weeks completely rewriting it into a Native Desktop Application.

v0.5.5 is out, and here is what changed:

No More Browser UI: I ditched Gradio. It’s now a proper dark-mode desktop app (built with PyQt6) that handles window management and file drag-and-drop natively.

8GB VRAM Optimization: I implemented dynamic batch sizing. It now runs comfortably on RTX 3060/4060 cards without hitting CUDA Out Of Memory errors during the GAN pass.

Smart Resolution Patching: The old version blurred faces on HD video. The new engine surgically crops the face, processes it at 96x96, and pastes it back onto the 1080p/4K master frame to preserve original quality.

Integrity Doctor: It auto-detects and downloads missing dependencies (like torchcrepe or corrupted .pth models) so you don’t have to hunt for files.
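For anyone wondering what dynamic batch sizing can look like in practice, here is a minimal sketch. It is not taken from the project's code. The helper name `run_adaptive_batches` and its parameters are hypothetical, and it catches Python's built-in `MemoryError` so it runs without a GPU. With PyTorch you would probably pass `torch.cuda.OutOfMemoryError` as the error type instead.

```python
from typing import Callable, List, Sequence, Tuple, Type, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def run_adaptive_batches(
    items: Sequence[T],
    process_batch: Callable[[Sequence[T]], List[R]],
    start_batch_size: int = 128,
    min_batch_size: int = 1,
    oom_errors: Tuple[Type[BaseException], ...] = (MemoryError,),
) -> List[R]:
    """Process items in batches, halving the batch size on out-of-memory errors.

    Hypothetical helper for illustration; names and defaults are assumptions.
    """
    if min_batch_size < 1 or start_batch_size < min_batch_size:
        raise ValueError("need start_batch_size >= min_batch_size >= 1")

    results: List[R] = []
    batch_size = start_batch_size
    position = 0
    while position < len(items):
        batch = items[position:position + batch_size]
        try:
            output = process_batch(batch)
        except oom_errors:
            if batch_size <= min_batch_size:
                raise  # already at the smallest allowed batch, give up
            # Halve the batch and retry from the same position.
            batch_size = max(min_batch_size, batch_size // 2)
            continue
        results.extend(output)
        position += len(batch)
    return results


if __name__ == "__main__":
    # Stand-in for a GPU pass that only fits 16 frames at a time.
    def fake_gan_pass(frames: Sequence[int]) -> List[int]:
        if len(frames) > 16:
            raise MemoryError("simulated CUDA OOM")
        return [frame * 2 for frame in frames]

    print(run_adaptive_batches(list(range(40)), fake_gan_pass, start_batch_size=64))
```

The batch size only ever shrinks in this sketch, so one bad spike costs throughput for the rest of the run. A real tool might also probe free VRAM up front to pick the starting size.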
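The crop, process and paste-back idea behind the resolution patching can be sketched like this. It is an illustration under assumptions, not the project's implementation. The frame is a plain list of grayscale rows, the face box is assumed to come from some detector, `model` stands in for the lip-sync network, and `patch_face` and `resize_nearest` are hypothetical names. A real pipeline would most likely use NumPy/OpenCV arrays and a better interpolation than nearest-neighbour.

```python
from typing import Callable, List, Tuple

Image = List[List[int]]          # rows of grayscale pixel values (illustration only)
Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1), with x1 and y1 exclusive

MODEL_SIZE = 96  # working resolution mentioned in the post


def resize_nearest(img: Image, width: int, height: int) -> Image:
    """Nearest-neighbour resize of a row-major image."""
    src_h, src_w = len(img), len(img[0])
    return [
        [img[y * src_h // height][x * src_w // width] for x in range(width)]
        for y in range(height)
    ]


def patch_face(frame: Image, box: Box, model: Callable[[Image], Image]) -> Image:
    """Run `model` on the face region only and paste the result into a copy of the frame."""
    x0, y0, x1, y1 = box
    if not (0 <= x0 < x1 <= len(frame[0]) and 0 <= y0 < y1 <= len(frame)):
        raise ValueError("face box must be non-empty and inside the frame")

    crop = [row[x0:x1] for row in frame[y0:y1]]
    small = resize_nearest(crop, MODEL_SIZE, MODEL_SIZE)  # down to model resolution
    synced = model(small)                                 # stand-in for the lip-sync pass
    restored = resize_nearest(synced, x1 - x0, y1 - y0)   # back to the crop's size

    out = [row[:] for row in frame]  # pixels outside the box stay untouched
    for dy, patch_row in enumerate(restored):
        out[y0 + dy][x0:x1] = patch_row
    return out
```

Only the pixels inside the box pass through the 96x96 bottleneck, so the rest of the frame keeps its original resolution.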

It’s still 100% free and open-source. I’d love for you to stress-test the new GUI and let me know if it feels snappier.

🔗 GitHub: https://github.com/ananta-sj/ReFlow-Studio

💬 Discussion r/StableDiffusion (5 points, 2 comments)
