Wavify is a collection of small speech models and a runtime that is blazingly fast and runs anywhere.
- Speech-to-text
- Wake-word detection
- Speech-to-intent

Bindings:
- Python
- Kotlin
- Swift
- Rust
- Flutter
- C++
Additional foreign language bindings can be developed externally and we welcome contributions to list them here.
Function signatures are available in `lib/wavify_core.h`.
aarch64-apple-ios
aarch64-linux-android
aarch64-unknown-linux-gnu
aarch64-apple-darwin
x86_64-pc-windows-gnu
x86_64-unknown-linux-gnu
Running speech-to-text on `assets/samples_jfk.wav` on a Raspberry Pi 5:
| Engine | Size | Time | Real-time factor |
|---|---|---|---|
| Whisper.cpp | 75MB (Whisper tiny) | 4.91s | 0.45 |
| Wavify | 45MB | 2.21s | 0.20 |
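
The real-time factor in the table is the processing time divided by the duration of the audio, so values below 1 mean faster than real time. Below is a minimal sketch of that arithmetic. It assumes the sample is about 11 seconds long; that duration is not stated here and is inferred from the reported times and factors.

```python
def real_time_factor(processing_seconds, audio_seconds):
    """Processing time divided by audio duration (lower is faster)."""
    return processing_seconds / audio_seconds


# Assumption: the benchmark clip is roughly 11 seconds long.
AUDIO_SECONDS = 11.0

for engine, seconds in [("Whisper.cpp", 4.91), ("Wavify", 2.21)]:
    print(f"{engine}: RTF = {real_time_factor(seconds, AUDIO_SECONDS):.2f}")
# Whisper.cpp: RTF = 0.45
# Wavify: RTF = 0.20
```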
Speech-to-text models for supported languages are available here. Each filename includes the ISO 639-1 code of the language the model supports.
We provide the example wake-word model model-wakeword-alexa.bin. For custom models please contact us at [email protected].
You'll also need an API key, which is free. You can get it from your dashboard once you are signed in.
```bash
pip install wavify
```
```python
import os

from wavify.stt import SttEngine

# The API key is read from the WAVIFY_API_KEY environment variable.
engine = SttEngine("path/to/your/model", os.getenv("WAVIFY_API_KEY"))
result = engine.stt_from_file("/path/to/your/file")
```
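
`os.getenv` returns `None` when the variable is unset, so a missing key may only surface later as a less obvious error. Below is a small sketch of failing early instead. `require_api_key` is a hypothetical helper and is not part of the `wavify` package.

```python
import os


def require_api_key(name="WAVIFY_API_KEY"):
    """Return the API key from the environment or raise a clear error."""
    key = os.environ.get(name, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {name} environment variable to your Wavify API key."
        )
    return key
```

The returned string can be passed as the second argument to the engine constructor in place of `os.getenv("WAVIFY_API_KEY")`.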
```python
import os

from wavify.wakeword import WakeWordEngine

engine = WakeWordEngine("path/to/your/model", os.getenv("WAVIFY_API_KEY"))
# Audio must be 2 seconds long, sampled at 16 kHz, 16-bit linear PCM, single channel.
audio = ...
result = engine.detect(audio)
```
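
The audio format requirement above can be checked before calling `detect`. The sketch below uses only Python's standard `wave` module to validate a WAV file and return its first 2 seconds as raw 16-bit PCM bytes. `load_wakeword_window` is a hypothetical helper. The container type that `detect` expects (bytes, list, or array) is not stated here, so convert the result as needed.

```python
import wave

SAMPLE_RATE = 16_000   # 16 kHz, as required above
WINDOW_SECONDS = 2     # length of the detection window


def load_wakeword_window(path):
    """Return the first 2 seconds of a 16 kHz, 16-bit, mono WAV as PCM bytes."""
    with wave.open(path, "rb") as wav:
        if wav.getframerate() != SAMPLE_RATE:
            raise ValueError(f"expected 16 kHz audio, got {wav.getframerate()} Hz")
        if wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit samples")
        if wav.getnchannels() != 1:
            raise ValueError("expected single-channel audio")
        frames_needed = SAMPLE_RATE * WINDOW_SECONDS
        if wav.getnframes() < frames_needed:
            raise ValueError("audio is shorter than 2 seconds")
        # Each frame is one 16-bit sample, so this returns 64,000 bytes.
        return wav.readframes(frames_needed)
```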
```bash
cargo add wavify
```
```rust
use std::env;

use anyhow::Result; // requires the `anyhow` crate
use wavify::SttEngine;

fn main() -> Result<()> {
    // The API key is read from the WAVIFY_API_KEY environment variable.
    let engine = SttEngine::new("/path/to/your/model", &env::var("WAVIFY_API_KEY")?)?;
    let result = engine.stt_from_file("/path/to/your/file")?;
    Ok(())
}
```
Kotlin bindings and an example app showcasing the integration of Wavify are available in `android/`.
```kotlin
import dev.wavify.SttEngine
import java.io.File

val modelPath = File(applicationContext.filesDir, "/your/model").absolutePath
val apiKey = BuildConfig.WAVIFY_API_KEY
val appName = applicationContext.packageName
val engine = SttEngine.create(modelPath, apiKey, appName)
val audioFloats = floatArrayOf(0.0f, 0.1f, 0.2f) // Replace with actual audio data
val result = engine.stt(audioFloats)
```
Swift bindings and an example app showcasing the integration of Wavify are available in `ios/`.
```swift
import Foundation

guard let modelPath = Bundle.main.path(forResource: "/your/model", ofType: "bin") else {
    fatalError("Failed to find model file.")
}
guard let apiKey = Bundle.main.object(forInfoDictionaryKey: "WAVIFY_API_KEY") as? String else {
    fatalError("No API key found.")
}
let engine = SttEngine(modelPath: modelPath, apiKey: apiKey)!
let audioFloats: [Float] = [3.14, 2.71, 1.61] // Replace with actual audio data
// convertDataToFloatArray is a helper that is not defined in this snippet.
engine.recognizeSpeech(from: convertDataToFloatArray(data: audioFloats.withUnsafeBufferPointer { Data(buffer: $0) }))
```
Contributions to `wavify` are welcome.
- Please report bugs as GitHub issues.
- Questions via GitHub issues are welcome!
- To build from source, check the contributing page.
For specialized solutions, such as custom models optimized for your use case, or to discuss how Wavify can be adapted to your requirements, contact our team directly at [email protected].