Export kokoro to sherpa-onnx #1713

Merged: 4 commits, Jan 15, 2025
122 changes: 122 additions & 0 deletions .github/workflows/export-kokoro.yaml
@@ -0,0 +1,122 @@
name: export-kokoro-to-onnx

on:
  push:
    branches:
      - export-kokoro

  workflow_dispatch:

concurrency:
  group: export-kokoro-to-onnx-${{ github.ref }}
  cancel-in-progress: true

jobs:
  export-kokoro-to-onnx:
    if: github.repository_owner == 'k2-fsa' || github.repository_owner == 'csukuangfj'
    name: export kokoro
    runs-on: ${{ matrix.os }}
    strategy:
      fail-fast: false
      matrix:
        os: [ubuntu-latest]
        python-version: ["3.10"]

    steps:
      - uses: actions/checkout@v4

      - name: Setup Python ${{ matrix.python-version }}
        uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}

      - name: Install Python dependencies
        shell: bash
        run: |
          pip install -q "numpy<=1.26.4" onnx==1.16.0 onnxruntime==1.17.1 librosa soundfile piper_phonemize -f https://k2-fsa.github.io/icefall/piper_phonemize.html

      - name: Run
        shell: bash
        run: |
          # espeak-ng-data is unpacked at the repository root and reused by later steps
          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/espeak-ng-data.tar.bz2
          tar xf espeak-ng-data.tar.bz2
          rm espeak-ng-data.tar.bz2
          cd scripts/kokoro
          ./run.sh

      - name: Collect results
        shell: bash
        run: |
          src=scripts/kokoro

          # Assemble the release directory and pack it as $d.tar.bz2
          d=kokoro-en-v0_19
          mkdir $d
          cp -a LICENSE $d/LICENSE
          cp -a espeak-ng-data $d/
          cp -v $src/kokoro-v0_19_hf.onnx $d/model.onnx
          cp -v $src/voices.bin $d/
          cp -v $src/tokens.txt $d/
          cp -v $src/README-new.md $d/README.md
          ls -lh $d/
          tar cjfv $d.tar.bz2 $d
          rm -rf $d

          ls -h $d.tar.bz2

      - name: Publish to huggingface
        env:
          HF_TOKEN: ${{ secrets.HF_TOKEN }}
        uses: nick-fields/retry@v3
        with:
          max_attempts: 20
          timeout_seconds: 200
          shell: bash
          command: |
            git config --global user.email "[email protected]"
            git config --global user.name "Fangjun Kuang"

            rm -rf huggingface
            export GIT_LFS_SKIP_SMUDGE=1
            export GIT_CLONE_PROTECTION_ACTIVE=false

            git clone https://csukuangfj:[email protected]/csukuangfj/kokoro-en-v0_19 huggingface
            cd huggingface
            rm -rf ./*
            git fetch
            git pull

            git lfs track "cmn_dict"
            git lfs track "ru_dict"
            git lfs track "*.wav"

            cp -a ../espeak-ng-data ./
            mkdir -p test_wavs

            cp -v ../scripts/kokoro/kokoro-v0_19_hf.onnx ./model.onnx

            cp -v ../scripts/kokoro/kokoro-v0_19_hf-*.wav ./test_wavs/

            cp -v ../scripts/kokoro/tokens.txt .
            cp -v ../scripts/kokoro/voices.bin .
            cp -v ../scripts/kokoro/README-new.md ./README.md
            cp -v ../LICENSE ./

            git lfs track "*.onnx"
            git add .

            ls -lh

            git status

            git commit -m "add models"
            git push https://csukuangfj:[email protected]/csukuangfj/kokoro-en-v0_19 main || true

      - name: Release
        uses: svenstaro/upload-release-action@v2
        with:
          file_glob: true
          file: ./*.tar.bz2
          overwrite: true
          repo_name: k2-fsa/sherpa-onnx
          repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}
          tag: tts-models
3 changes: 3 additions & 0 deletions scripts/kokoro/.gitignore
@@ -0,0 +1,3 @@
voices.json
voices.bin
README-new.md
10 changes: 10 additions & 0 deletions scripts/kokoro/README.md
@@ -0,0 +1,10 @@
# Introduction

This folder contains scripts that add metadata to the Kokoro ONNX models released at
https://github.com/thewh1teagle/kokoro-onnx/releases/tag/model-files

See also
https://huggingface.co/hexgrad/Kokoro-82M/tree/main
and
https://huggingface.co/spaces/hexgrad/Kokoro-TTS
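
The metadata added by `add-meta-data.py` is stored as strings, since every value is passed through `str()` before it is written. The following is a minimal sketch of how a consumer might turn those strings back into typed values. The helper name `parse_kokoro_meta` and the example values are hypothetical and are not part of sherpa-onnx; the input dict is assumed to be built from the model's `metadata_props` entries.

```python
from typing import Dict, List, Tuple, Union

Value = Union[int, str, List[str], Tuple[int, ...]]


def parse_kokoro_meta(props: Dict[str, str]) -> Dict[str, Value]:
    """Convert string metadata values back into typed values.

    `props` is assumed to be something like
    {p.key: p.value for p in onnx.load(path).metadata_props}.
    """
    meta: Dict[str, Value] = dict(props)

    # Keys that add-meta-data.py writes from integer values.
    for key in ("sample_rate", "n_speakers", "version", "has_espeak"):
        if key in props:
            meta[key] = int(props[key])

    # style_dim is written as a comma-separated shape.
    if "style_dim" in props:
        meta["style_dim"] = tuple(int(x) for x in props["style_dim"].split(","))

    # speaker_names is written as a comma-separated list.
    if "speaker_names" in props:
        meta["speaker_names"] = props["speaker_names"].split(",")

    return meta


# Hypothetical values, for illustration only.
example = {
    "sample_rate": "24000",
    "n_speakers": "2",
    "style_dim": "4,1,8",
    "speaker_names": "spk_a,spk_b",
}
parsed = parse_kokoro_meta(example)
print(parsed["sample_rate"], parsed["style_dim"], parsed["speaker_names"])
```

Keys that the helper does not know about are passed through unchanged as strings.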

107 changes: 107 additions & 0 deletions scripts/kokoro/add-meta-data.py
@@ -0,0 +1,107 @@
#!/usr/bin/env python3
# Copyright 2025 Xiaomi Corp. (authors: Fangjun Kuang)


import argparse
import json
from pathlib import Path

import numpy as np
import onnx


def get_args():
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--model", type=str, required=True, help="input and output onnx model"
    )

    parser.add_argument("--voices", type=str, required=True, help="Path to voices.json")
    return parser.parse_args()


def load_voices(filename):
    """Load voices.json and convert each speaker's style to a float32 array."""
    with open(filename) as f:
        voices = json.load(f)
    for key in voices:
        voices[key] = np.array(voices[key], dtype=np.float32)
    return voices


def get_vocab():
    """Return a dict mapping each symbol to its integer ID."""
    _pad = "$"
    _punctuation = ';:,.!?¡¿—…"«»“” '
    _letters = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
    _letters_ipa = "ɑɐɒæɓʙβɔɕçɗɖðʤəɘɚɛɜɝɞɟʄɡɠɢʛɦɧħɥʜɨɪʝɭɬɫɮʟɱɯɰŋɳɲɴøɵɸθœɶʘɹɺɾɻʀʁɽʂʃʈʧʉʊʋⱱʌɣɤʍχʎʏʑʐʒʔʡʕʢǀǁǂǃˈˌːˑʼʴʰʱʲʷˠˤ˞↓↑→↗↘'̩'ᵻ"
    symbols = [_pad] + list(_punctuation) + list(_letters) + list(_letters_ipa)
    # If a symbol occurs more than once, its last index wins.
    dicts = {}
    for i in range(len(symbols)):
        dicts[symbols[i]] = i
    return dicts


def generate_tokens():
    """Write tokens.txt with one "<symbol> <id>" pair per line."""
    token2id = get_vocab()
    with open("tokens.txt", "w", encoding="utf-8") as f:
        for s, i in token2id.items():
            f.write(f"{s} {i}\n")


def main():
    args = get_args()
    print(args.model, args.voices)

    model = onnx.load(args.model)
    voices = load_voices(args.voices)

    if Path("./tokens.txt").is_file():
        print("./tokens.txt exists, skip generating it")
    else:
        generate_tokens()

    keys = list(voices.keys())
    print(",".join(keys))

    if Path("./voices.bin").is_file():
        print("./voices.bin exists, skip generating it")
    else:
        # Raw float32 data for every speaker, concatenated in the order of `keys`
        with open("voices.bin", "wb") as f:
            for k in keys:
                f.write(voices[k].tobytes())

    meta_data = {
        "model_type": "kokoro",
        "language": "English",
        "has_espeak": 1,
        "sample_rate": 24000,
        "version": 1,
        "voice": "en-us",
        "style_dim": ",".join(map(str, voices[keys[0]].shape)),
        "n_speakers": len(keys),
        "speaker_names": ",".join(keys),
        "model_url": "https://github.com/thewh1teagle/kokoro-onnx/releases/tag/model-files",
        "see_also": "https://huggingface.co/spaces/hexgrad/Kokoro-TTS",
        "see_also_2": "https://huggingface.co/hexgrad/Kokoro-82M",
        "maintainer": "k2-fsa",
    }

    print(model.metadata_props)

    # Drop any existing metadata before adding the new entries
    while len(model.metadata_props):
        model.metadata_props.pop()

    for key, value in meta_data.items():
        meta = model.metadata_props.add()
        meta.key = key
        meta.value = str(value)
    print("--------------------")

    print(model.metadata_props)

    # The input model is overwritten in place
    onnx.save(model, args.model)

    print(f"Please see {args.model}, ./voices.bin, and ./tokens.txt")


if __name__ == "__main__":
    main()
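
The script above writes `voices.bin` as the raw float32 bytes of every speaker, concatenated in the order of `speaker_names` and with no header. Under that layout, speaker `i` starts at byte offset `i * prod(style_dim) * 4`. The following is a minimal standard-library sketch of reading one speaker back. The helper name `read_style` and the toy shape are hypothetical, and it assumes the file is read on a machine with the same byte order as the one that wrote it.

```python
import array
import io
import math
from typing import BinaryIO, List, Sequence


def read_style(f: BinaryIO, speaker_index: int, style_dim: Sequence[int]) -> List[float]:
    """Return the flat float32 values of one speaker from a voices.bin-like stream."""
    count = math.prod(style_dim)  # number of floats per speaker
    itemsize = 4  # bytes per float32
    f.seek(speaker_index * count * itemsize)
    data = f.read(count * itemsize)
    if len(data) != count * itemsize:
        raise ValueError("speaker index out of range")
    values = array.array("f")  # native byte order, 4 bytes per item
    values.frombytes(data)
    return values.tolist()


# Toy data: two speakers with a hypothetical style_dim of (2, 1, 3).
style_dim = (2, 1, 3)
spk0 = array.array("f", [0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
spk1 = array.array("f", [10.0, 11.0, 12.0, 13.0, 14.0, 15.0])
blob = io.BytesIO(spk0.tobytes() + spk1.tobytes())

second = read_style(blob, 1, style_dim)
print(second)
```

The returned list is flat; a caller that needs the original shape would reshape it according to `style_dim`.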
50 changes: 50 additions & 0 deletions scripts/kokoro/run.sh
@@ -0,0 +1,50 @@
#!/usr/bin/env bash
# Copyright 2025 Xiaomi Corp. (authors: Fangjun Kuang)

set -ex

cat > README-new.md <<EOF
# Introduction

Files in this folder are from
https://github.com/thewh1teagle/kokoro-onnx/releases/tag/model-files

Please see also
https://huggingface.co/hexgrad/Kokoro-82M
and
https://huggingface.co/hexgrad/Kokoro-82M/discussions/14
EOF

files=(
  kokoro-v0_19_hf.onnx
  # kokoro-v0_19.onnx
  # kokoro-quant.onnx
  # kokoro-quant-convinteger.onnx
  voices.json
)

# Download each file only if it is not already present
for f in "${files[@]}"; do
  if [ ! -f "./$f" ]; then
    curl -SL -O "https://github.com/thewh1teagle/kokoro-onnx/releases/download/model-files/$f"
  fi
done

models=(
  # kokoro-v0_19
  # kokoro-quant
  # kokoro-quant-convinteger
  kokoro-v0_19_hf
)

# Add metadata in place; this also generates voices.bin and tokens.txt
for m in "${models[@]}"; do
  ./add-meta-data.py --model "$m.onnx" --voices ./voices.json
done

ls -l
echo "----------"
ls -lh

for m in "${models[@]}"; do
  ./test.py --model "$m.onnx" --voices-bin ./voices.bin --tokens ./tokens.txt
done
ls -lh
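
`run.sh` passes `./tokens.txt` to `test.py`, which is not shown here. The following is a minimal sketch of how such a file could be read back, assuming the `<symbol> <id>` line format written by `generate_tokens()` in `add-meta-data.py`. Because the space character is itself a symbol, its line starts with two spaces, so the sketch splits on the last space rather than on arbitrary whitespace. The helper name `parse_tokens` and the sample lines are hypothetical.

```python
from typing import Dict, Iterable


def parse_tokens(lines: Iterable[str]) -> Dict[str, int]:
    """Parse "<symbol> <id>" lines into a symbol-to-ID dict."""
    token2id: Dict[str, int] = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        # Split on the last space only, so a symbol that is a space survives.
        symbol, idx = line.rsplit(" ", 1)
        token2id[symbol] = int(idx)
    return token2id


# Sample lines for illustration; the third one is the space symbol.
sample = ["$ 0\n", "; 1\n", "  16\n", "A 17\n"]
token2id = parse_tokens(sample)
print(token2id)
```

With a real file, the same function can be called on an open handle, for example `parse_tokens(open("tokens.txt", encoding="utf-8"))`.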