Made with 🐸CoquiTTS #2602
Replies: 7 comments 7 replies
-
👀 EPUB reader using 🐸TTS by @knochenhans -> #2580
-
I've been making text-to-speech audiobooks for a long time, using the Zira voice from Microsoft. I took it as a personal challenge to see if I could make something you could actually listen to for long periods. I improved the experience by fixing spelling mistakes, teaching the engine to pronounce new words properly, adding pauses where appropriate, slowing down italicized text for emphasis, and, most importantly, giving dialogue a higher pitch than narration so you always know when someone is talking.

I've been experimenting with coqui-tts for a little while now, and over the past few weeks I have been working on a program that outputs text of any size as chapter-separated mp3s with a great voice. On top of that, I managed to incorporate all the extra features: pauses, pitch, and rate changes!

I can't begin to describe how much tweaking and experimentation went into the project to get something I was happy with. There were so many little gotchas where the model I was using would error out randomly, or just completely skip some text in the prompt. And then there was tweaking every little thing for quality of use. Anyway, if anyone is interested, I've posted a showcase of the output on YouTube that I'll be using as a basis for making the next audiobook. Feel free to check it out:
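A minimal sketch of the dialogue-versus-narration idea described above. It is not the poster's actual program: the `Segment` class, the `STYLE` table, and the pitch and rate values are all hypothetical, and it only tags text so that a later synthesis step could apply different settings to each piece. It assumes dialogue is wrapped in straight or curly double quotes.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    text: str
    kind: str           # "dialogue" or "narration"
    pitch_shift: float  # hypothetical unit: semitones relative to the base voice
    rate: float         # hypothetical unit: speed multiplier


# Hypothetical settings: dialogue is pitched higher than narration.
STYLE = {
    "dialogue": (2.0, 1.0),
    "narration": (0.0, 1.0),
}

# Matches a complete quoted span in straight or curly double quotes.
QUOTE_RE = re.compile(r'("[^"]*"|“[^”]*”)')


def split_segments(paragraph: str) -> List[Segment]:
    """Split a paragraph into alternating narration and dialogue segments."""
    segments = []
    # Because the pattern has a capturing group, re.split keeps the quoted spans.
    for piece in QUOTE_RE.split(paragraph):
        piece = piece.strip()
        if not piece:
            continue
        kind = "dialogue" if QUOTE_RE.fullmatch(piece) else "narration"
        pitch_shift, rate = STYLE[kind]
        segments.append(Segment(piece, kind, pitch_shift, rate))
    return segments


if __name__ == "__main__":
    for seg in split_segments('She smiled. "Hello there," she said. "Come in."'):
        print(f"{seg.kind:9} pitch={seg.pitch_shift:+.1f} rate={seg.rate:.1f}  {seg.text}")
```

Each segment could then be synthesized separately and the clips joined per chapter. Unmatched quotes are treated as narration here, which is a simplification.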
-
This uses Coqui TTS to speak. She is a social chatbot in the game "VRChat". More info can be found here, and there is a 'howsheworks' page that describes the design ideas behind her (no source code, though). She uses a fine-tuned LJSpeech, currently with 13,100 samples, usually 2-10 seconds long. I plan to double this soon to capture some words she doesn't express well.
-
Introducing Loqui: A Shiny app for Creating Automated Courses

Loqui is an open source Shiny application that enables the creation of automated courses using ari, an R package for generating videos from text and images. Loqui takes as input a Google Slides URL, extracts the speaker notes from the slides, and converts them into an audio file using Coqui TTS. Then, it converts the Google Slides to images and ultimately generates an mp4 video file where each image is presented with its corresponding audio. Any feedback is much appreciated!
-
We support exporting VITS models from Coqui to ONNX and running them with sherpa-onnx. sherpa-onnx supports both text-to-speech and speech-to-text, and it runs on
and provides various APIs for different languages, e.g.,
We are working on WebAssembly support. The following Colab notebook shows how to convert VITS models from Coqui to sherpa-onnx. You can also try the exported models by visiting the following Hugging Face space. We also have pre-built Android APKs for the VITS English models from Coqui.
-
It might just be a TTS plugin, but I like to use it because of its speed and quality. It would be the default TTS if its dependency conflicts did not crash everything. 😬 I made Whispering Tiger. It can translate and transcribe audio, text, and text in images using a variety of AI models, and output the result using different TTS models. It all runs completely locally, so unless you use a plugin that requires an API, it works completely offline (once you have let it download the AI models). There are also more plugins available, like RVC voice conversion, which can even be used together with the Coqui TTS plugin to get state-of-the-art voice conversion.
-
Seamless Speech to Speech Translation with Voice Replication (S3TVR)

Hey everyone! 🌟 I'm thrilled to share Seamless Speech to Speech Translation with Voice Replication (S3TVR), an AI model I developed for live translation and voice cloning. It uses Coqui's XTTS_V2 as the Text-to-Speech (TTS) model.

Try the Demo

Curious to see it in action? Check out the demo here: S3TVR Demo 😊

Run Locally for Better Performance

For a more optimized experience, you can run the whole pipeline locally. Everything you need is in this repo: S3TVR GitHub Repo.

Let's Connect!

I'm always up for collaboration or just chatting more about this project. Your support and feedback mean a lot to me! Feel free to reach out if you're interested in learning more or working together. 🤗 Thanks for stopping by!
-
Let us know what you do with 🐸CoquiTTS, and feel free to post it here.
I see many cool projects, so gathering them up would be good.