pnnbao97/VieNeu-TTS

Jan 11, 2026

Vietnamese TTS with instant voice cloning • On-device • Real-time CPU inference • 24kHz audio quality

Duration: 00:01:57

eduardolat/kokoro-web

Jan 10, 2026

Kokoro Web: Free AI text-to-speech, online or self-hosted, OpenAI compatible!

Duration: 00:01:32

ekwek1/soprano

Jan 09, 2026

Soprano: Instant, Ultra-Realistic Text-to-Speech

Duration: 00:01:51

diodiogod/TTS-Audio-Suite

Jan 08, 2026

A ComfyUI custom node integration for multi-engine multi-language Text-to-Speech and Voice Conversion. Supports: RVC, Cozy Voice 3, Step Audio EditX, IndexTTS-2, Chatterbox (classic and multilingual 23-lang), F5-TTS, Higgs Audio 2 and Microsoft VibeVoice with unlimited text length, SRT timing, Character support, and many audio tools

Duration: 00:01:43

ddPn08/rvc-webui

Jan 07, 2026

liujing04/Retrieval-based-Voice-Conversion-WebUI reconstruction project

Duration: 00:01:50

huakunyang/SummerTTS

Jan 06, 2026

SummerTTS is a standalone C++ speech synthesis (TTS) project for Chinese and English. It runs locally without a network connection, has no extra dependencies, and is ready for Chinese and English synthesis after a one-step build.

Duration: 00:01:51

shibing624/parrots

Jan 05, 2026

Automatic Speech Recognition (ASR) and Text-To-Speech (TTS) engine. Chinese and English speech recognition, multi-speaker speech synthesis, multilingual support, high accuracy.

Duration: 00:01:42

modelscope/KAN-TTS

Jan 04, 2026

KAN-TTS is a speech-synthesis training framework. Please try the demos we have posted at https://modelscope.cn/models?page=1&tasks=text-to-speech

Duration: 00:01:51

mbailey/voicemode

Jan 03, 2026

VoiceMode MCP brings natural conversations to Claude Code

Duration: 00:01:57

gotev/android-speech

Jan 02, 2026

Android speech recognition and text to speech made easy

Duration: 00:01:52

see2023/Bert-VITS2-ext

Jan 01, 2026

Expression and animation testing based on Bert-VITS2.

Duration: 00:01:44

google/tacotron

Dec 31, 2025

Audio samples accompanying publications related to Tacotron, an end-to-end speech synthesis model.

Duration: 00:01:45

StarmoonAI/Starmoon

Dec 30, 2025

A conversational, AI device + software framework for companionship, entertainment, education, healthcare, IoT applications, and DIY robotics. Built with Python, NextJS, Arduino, ESP32, LLMs (GPT-4o), Deepgram STT and Azure TTS

Duration: 00:01:37

p0p4k/vits2_pytorch

Dec 29, 2025

Unofficial VITS2 TTS implementation in PyTorch

Duration: 00:01:55

wildminder/ComfyUI-VibeVoice

Dec 28, 2025

ComfyUI custom node for the VibeVoice TTS. Expressive, long-form, multi-speaker conversational audio

Duration: 00:01:52

daswer123/xtts-api-server

Dec 27, 2025

A simple FastAPI Server to run XTTSv2

Duration: 00:01:48

domesticatedviking/TextyMcSpeechy

Dec 26, 2025

Easily create Piper text-to-speech models in any voice. Make a text-to-speech model with your own voice recordings, or use thousands of RVC voices. Works offline on a Raspberry Pi. Rapidly record custom datasets for any metadata.csv file and listen to your model as it is training.

Duration: 00:01:50

zuoban/tts

Dec 25, 2025

TTS service

Duration: 00:01:53

alan-ai/alan-sdk-reactnative

Dec 24, 2025

The Self-Coding System for Your App — Alan AI SDK for React Native

Duration: 00:01:51

vilassn/whisper_android

Dec 23, 2025

Offline Speech Recognition with OpenAI Whisper and TensorFlow Lite for Android

Duration: 00:01:56

oscie57/tiktok-voice

Dec 22, 2025

Simple Python script to interact with the TikTok TTS API

Duration: 00:01:46

gexgd0419/NaturalVoiceSAPIAdapter

Dec 21, 2025

Make Azure natural TTS voices accessible to any SAPI 5-compatible application.

Duration: 00:02:06

geekwenjie/SmartJavaAI

Dec 20, 2025

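Free offline AI algorithm toolkit for Java, supporting face recognition, liveness detection, facial expression recognition, object detection, instance segmentation, pedestrian detection, OCR, license plate recognition, table recognition, ASR+TTS, machine translation, and more. Usable directly via a Maven dependency. Supports PyTorch and TensorFlow, and integrates mainstream models including Mtcnn, InsightFace, SeetaFace6, YOLOv8~v12, PaddleOCR (PPOCRv5), and Whisper.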

Duration: 00:02:13

AIDC-AI/Pixelle-Video

Dec 19, 2025

AI Fully Automated Short Video Engine

Duration: 00:01:54

lucasnewman/f5-tts-mlx

Dec 17, 2025

Implementation of F5-TTS in MLX

Duration: 00:01:46

madroidmaq/mlx-omni-server

Dec 16, 2025

MLX Omni Server is a local inference server powered by Apple's MLX framework, specifically designed for Apple Silicon (M-series) chips. It implements OpenAI-compatible API endpoints, enabling seamless integration with existing OpenAI SDK clients while leveraging the power of local ML inference.

Duration: 00:01:45

zai-org/GLM-TTS

Dec 15, 2025

GLM-TTS: Controllable & Emotion-Expressive Zero-shot TTS with Multi-Reward Reinforcement Learning

Duration: 00:01:38

daniilrobnikov/vits2

Dec 14, 2025

VITS2: Improving Quality and Efficiency of Single-Stage Text-to-Speech with Adversarial Learning and Architecture Design

Duration: 00:01:37

superstarryeyes/lue

Dec 13, 2025

Terminal eBook Reader with Audiobook-Quality Text-to-Speech — Supports EPUB, PDF, DOCX, HTML, RTF, TXT, and MD.

Duration: 00:01:47

devnen/Chatterbox-TTS-Server

Dec 12, 2025

Self-host the powerful Chatterbox TTS model. This server offers a user-friendly Web UI, flexible API endpoints (incl. OpenAI compatible), predefined voices, voice cloning, and large audiobook-scale text processing. Runs accelerated on NVIDIA (CUDA), AMD (ROCm), and CPU.

Duration: 00:01:39

seungwonpark/melgan

Dec 11, 2025

MelGAN vocoder (compatible with NVIDIA/tacotron2)

Duration: 00:01:50

HA6Bots/Automatic-Youtube-Reddit-Text-To-Speech-Video-Generator-and-Uploader

Dec 10, 2025

A series of 3 programs that automatically receive scripts from Reddit, let the user edit them, and send them to a video generator, after which the videos are uploaded to YouTube automatically.

Duration: 00:02:02

lucasjinreal/Kokoros

Dec 09, 2025

Kokoro in Rust. https://huggingface.co/hexgrad/Kokoro-82M Insanely fast, real-time, high-quality TTS.

Duration: 00:01:51

thorstenMueller/Thorsten-Voice

Dec 08, 2025

Thorsten-Voice: A free-to-use, offline, high-quality German TTS voice should be available for every project without any licensing struggles.

Duration: 00:01:54

google/voice-builder

Dec 07, 2025

An open-source text-to-speech (TTS) voice building tool

Duration: 00:02:00

soobinseo/Transformer-TTS

Dec 06, 2025

A PyTorch Implementation of "Neural Speech Synthesis with Transformer Network"

Duration: 00:01:58

lobehub/lobe-tts

Dec 05, 2025

Lobe TTS - A high-quality & reliable TTS/STT library for Server and Browser

Duration: 00:01:44

jaywalnut310/glow-tts

Dec 04, 2025

A Generative Flow for Text-to-Speech via Monotonic Alignment Search

Duration: 00:01:54

dlutton/flutter_tts

Dec 03, 2025

Flutter Text to Speech package

Duration: 00:01:51

C-Loftus/QuickPiperAudiobook

Dec 02, 2025

With one command, create a natural-sounding audiobook from a variety of input formats (epub, mobi, txt, PDF, HTML and more!)

Duration: 00:01:51

markovka17/dla

Dec 01, 2025

Deep learning for audio processing

Duration: 00:01:54

stepfun-ai/Step-Audio-EditX

Nov 30, 2025

A powerful 3B-parameter, LLM-based reinforcement learning audio editing model that excels at editing emotion, speaking style, and paralinguistics, and features robust zero-shot text-to-speech

Duration: 00:00:27

coqui-ai/TTS-papers

Nov 29, 2025

Collection of TTS papers

Duration: 00:01:47

cboard-org/cboard

Nov 28, 2025

Augmentative and Alternative Communication (AAC) system with text-to-speech for the browser

Duration: 00:01:52

wangzongming/esp-ai

Nov 27, 2025

The simplest and lowest-cost AI integration solution. If you like this project, please give it a Star~

Duration: 00:02:02

VRCWizard/TTS-Voice-Wizard

Nov 26, 2025

Speech to Text to Speech. Song now playing. Sends text as OSC messages to VRChat to display on avatar. (STTTS) (Speech to TTS) (VRC STT System) (VTuber TTS)

Duration: 00:01:57

NVIDIA-AI-Blueprints/pdf-to-podcast

Nov 25, 2025

Transform PDFs into AI podcasts for engaging on-the-go audio content.

Duration: 00:01:48

supertone-inc/supertonic

Nov 24, 2025

Lightning-fast, on-device TTS — running natively via ONNX.

Duration: 00:01:45

BandarLabs/gitpodcast

Nov 23, 2025

Convert any git repository into an engaging podcast

Duration: 00:01:46

hujingshuang/MTrans

Nov 22, 2025

Multi-source Translation

Duration: 00:01:49

Tomiinek/Multilingual_Text_to_Speech

Nov 21, 2025

An implementation of Tacotron 2 that supports multilingual experiments with parameter-sharing, code-switching, and voice cloning.

Duration: 00:01:43

lobehub/lobe-vidol

Nov 20, 2025

Lobe Vidol - Making Virtual Idols Accessible for Everyone

Duration: 00:01:50

OpenBMB/VoxCPM

Nov 19, 2025

VoxCPM: Tokenizer-Free TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning

Duration: 00:02:03

PABannier/bark.cpp

Nov 17, 2025

Suno AI's Bark model in C/C++ for fast text-to-speech generation

Duration: 00:01:45

daswer123/xtts-webui

Nov 16, 2025

Web UI for using and fine-tuning XTTS

Duration: 00:01:54

GetStream/Vision-Agents

Nov 15, 2025

Open Vision Agents by Stream. Build Vision Agents quickly with any model or video provider. Uses Stream's edge network for ultra-low latency.

Duration: 00:01:36

Spr-Aachen/Easy-Voice-Toolkit

Nov 14, 2025

A user-friendly AI audio toolkit for voice recognition, voice transcription, voice conversion, etc.

Duration: 00:01:58

aedocw/epub2tts

Nov 13, 2025

Turn an epub or text file into an audiobook

Duration: 00:01:56

lmnt-com/diffwave

Nov 12, 2025

DiffWave is a fast, high-quality neural vocoder and waveform synthesizer.

Duration: 00:01:47

FireRedTeam/FireRedTTS

Nov 11, 2025

An Open-Sourced LLM-empowered Foundation TTS System

Duration: 00:01:52

gabrielmittag/NISQA

Nov 10, 2025

NISQA - Non-Intrusive Speech Quality and TTS Naturalness Assessment

Duration: 00:01:50

High-Logic/Genie-TTS

Nov 09, 2025

GPT-SoVITS ONNX Inference Engine & Model Converter

Duration: 00:02:01

mosecorg/mosec

Nov 08, 2025

A high-performance ML model serving framework that offers dynamic batching and CPU/GPU pipelines to fully exploit your compute machine

Duration: 00:01:35

nazdridoy/kokoro-tts

Nov 07, 2025

A CLI text-to-speech tool using the Kokoro model, supporting multiple languages, voices (with blending), and various input formats including EPUB books and PDF documents.

Duration: 00:01:38

joey-zhou/xiaozhi-esp32-server-java

Nov 06, 2025

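Enterprise-grade Java management platform for Xiaozhi ESP32, providing an integrated front-end, back-end, and server solution for device monitoring, voice customization, role switching, and conversation history management.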

Duration: 00:01:42

athena-team/athena

Nov 05, 2025

An open-source implementation of a sequence-to-sequence based speech processing engine

Duration: 00:01:48

wxxxcxx/ms-ra-forwarder

Nov 04, 2025

Free online text-to-speech API

Duration: 00:02:01

ardha27/AI-Waifu-Vtuber

Nov 03, 2025

AI Vtuber for Streaming on Youtube/Twitch

Duration: 00:02:03

Azure-Samples/Cognitive-Speech-TTS

Nov 02, 2025

Microsoft Text-to-Speech API sample code in several languages, part of Cognitive Services.

Duration: 00:01:41

NATSpeech/NATSpeech

Nov 01, 2025

A Non-Autoregressive Text-to-Speech (NAR-TTS) framework, including official PyTorch implementation of PortaSpeech (NeurIPS 2021) and DiffSpeech (AAAI 2022)

Duration: 00:01:46

ttop32/MouseTooltipTranslator

Oct 31, 2025

Mouseover Translate Any Language At Once - Chrome Extension: PDF Translator, EBOOK, EPUB, OCR, TTS, NETFLIX, YOUTUBE DUAL SUBTITLES, GOOGLE DOCS, AI, VIEWER, GMAIL, WRITING, IMAGE, DUAL SUBS, MANGA, HOVER, DICTIONARY, WEBTOON, EDGE, JAPANESE, ENGLISH

Duration: 00:01:55

Artrajz/vits-simple-api

Oct 30, 2025

A simple VITS HTTP API, developed by extending Moegoe with additional features.

Duration: 00:01:47

Edresson/YourTTS

Oct 29, 2025

YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone

Duration: 00:01:46

janvarev/Irene-Voice-Assistant

Oct 28, 2025

Irene: a Russian voice assistant for offline use. Supports skills via plugins.

Duration: 00:01:49

Enemyx-net/VibeVoice-ComfyUI

Oct 27, 2025

A comprehensive ComfyUI integration for Microsoft's VibeVoice text-to-speech model, enabling high-quality single and multi-speaker voice synthesis directly within your ComfyUI workflows.

Duration: 00:01:59

PriesiaMioShirakana/DragonianVoice

Oct 26, 2025

C++ inference library for multiple SVC/TTS models

Duration: 00:01:46

semperai/amica

Oct 25, 2025

Amica is an open source interface for interactive communication with 3D characters with voice synthesis and speech recognition.

Duration: 00:02:11

Stypox/dicio-android

Oct 24, 2025

Dicio assistant app for Android

Duration: 00:01:50

Henry-23/VideoChat

Oct 23, 2025

Real-time voice-interactive digital human, supporting end-to-end voice solutions (GLM-4-Voice - THG) and cascaded solutions (ASR-LLM-TTS-THG). Customizable appearance and voice with no training required, supporting voice cloning, with first-packet latency as low as 3s.

Duration: 00:01:48

spring-media/TransformerTTS

Oct 21, 2025

Transformer TTS: Implementation of a non-autoregressive Transformer based neural network for text to speech.

Duration: 00:01:38

Kyubyong/dc_tts

Oct 20, 2025

A TensorFlow Implementation of DC-TTS: yet another text-to-speech model

Duration: 00:01:54

ictnlp/StreamSpeech

Oct 19, 2025

StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis.

Duration: 00:02:01

bawangxx/XZVoice

Oct 18, 2025

Free and open source text-to-speech software

Duration: 00:01:53

hgneng/ekho

Oct 17, 2025

Chinese text-to-speech engine

Duration: 00:01:52

gitmylo/audio-webui

Oct 15, 2025

A webui for different audio related Neural Networks

Duration: 00:01:49

PlayVoice/vits_chinese

Oct 14, 2025

Best practice TTS based on BERT and VITS with some Natural Speech Features Of Microsoft; Support ONNX streaming out!

Duration: 00:01:37

haoheliu/voicefixer

Oct 13, 2025

General Speech Restoration

Duration: 00:02:07

sauravpanda/BrowserAI

Oct 12, 2025

Run local LLMs like llama, deepseek-distill, kokoro and more inside your browser

Duration: 00:02:04

R3gm/SoniTranslate

Oct 11, 2025

Synchronized Translation for Videos. Video dubbing

Duration: 00:01:50

travisvn/openai-edge-tts

Oct 10, 2025

Free, high-quality text-to-speech API endpoint to replace OpenAI, Azure, or ElevenLabs

Duration: 00:02:07

lenML/Speech-AI-Forge

Oct 09, 2025

Speech-AI-Forge is a project developed around TTS generation model, implementing an API Server and a Gradio-based WebUI.

Duration: 00:01:54

coqui-ai/open-speech-corpora

Oct 08, 2025

A list of accessible speech corpora for ASR, TTS, and other Speech Technologies

Duration: 00:01:42

LuckyHookin/edge-TTS-record

Oct 07, 2025

A Windows tool that records Microsoft Edge browser text-to-speech (TTS) output and saves it as .wav audio.

Duration: 00:01:56

edwko/OuteTTS

Oct 06, 2025

Interface for OuteTTS models.

Duration: 00:01:43

voice-cloning-app/Voice-Cloning-App

Oct 05, 2025

A Python/PyTorch app for easily synthesising human voices

Duration: 00:01:52

wwbin2017/bailing

Oct 04, 2025

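Bailing is a GPT-4o-like voice conversation bot built on ASR+LLM+TTS. It integrates strong large models such as DeepSeek R1, achieves latency as low as 800ms, runs on low-spec machines such as a Mac, and supports interruption.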

Duration: 00:01:47

neural-maze/ava-whatsapp-agent-course

Oct 03, 2025

Meet Ava, the WhatsApp Agent

Duration: 00:01:46

cosin2077/easyVoice

Oct 02, 2025

Open-source text-to-speech tool supporting very long text and multi-character voiceover

Duration: 00:01:52

kan-bayashi/ParallelWaveGAN

Oct 01, 2025

Unofficial Parallel WaveGAN (+ MelGAN & Multi-band MelGAN & HiFi-GAN & StyleMelGAN) with PyTorch

Duration: 00:01:48

AlexxIT/YandexStation

Sep 30, 2025

Control Yandex.Station and other smart home devices with Alice from Home Assistant

Duration: 00:01:44

Kana & Mari’s SoundRepos
