Releases: mobiusml/faster-whisper
mobius-faster-whisper 1.1.0
This release is based on the original faster-whisper repo (release v1.0.3) and adds the following features:
Turbo model support (#33)
- Based on openai/whisper#2363
Batching Support, Speed Boosts, and Quality Enhancements (Based on SYSTRAN#856)
- Batching support
- Faster feature extraction with torch STFT
- Quality improvements:
  - Multi-segment language detection
  - Code-switching support
  - Consistency across runs
  - Reduced hallucinations
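To illustrate the idea behind multi-segment language detection, here is a minimal sketch. It assumes that each audio segment yields a dictionary of language probabilities and that the per-segment scores are simply summed; the release notes do not state the actual aggregation rule, so the function name `detect_language_multisegment` and the input format are hypothetical.

```python
from collections import defaultdict


def detect_language_multisegment(segment_probs):
    """Pick one language from several per-segment probability dicts.

    segment_probs is a list of {language_code: probability} dicts, one per
    audio segment. This input format is an assumption for illustration only.
    Returns (language, mean probability of that language across segments).
    """
    if not segment_probs:
        raise ValueError("need at least one segment")
    totals = defaultdict(float)
    for probs in segment_probs:
        for lang, p in probs.items():
            totals[lang] += p  # sum the evidence for each language
    best = max(totals, key=totals.get)
    return best, totals[best] / len(segment_probs)


# Made-up scores: the first segment alone would suggest English,
# but the later segments outweigh it.
segments = [
    {"en": 0.55, "de": 0.45},
    {"en": 0.30, "de": 0.70},
    {"en": 0.20, "de": 0.80},
]
language, confidence = detect_language_multisegment(segments)
```

Looking at several segments rather than only the opening one makes the decision less sensitive to an unrepresentative start, such as an intro in a different language.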
Bug Fixes
Latest release before merging to FW
Changes include replacing the default feature extraction (FE) method with the torch-based method, restructuring components for reliability, and reducing redundancy.
faster whisper v1.0.1 with mobiusml additions
Primer before faster-whisper PR:
- Complied with the CONTRIBUTING guidelines.
- Added tests for batched transcription and multi-segment language detection.
- Added a VAD model by default for batched transcription.
- Minor fixes in the code and requirements.
faster whisper v0.10.0 with mobius additional capabilities
This version builds explicitly on faster_whisper 0.10.0 and has the following additional capabilities:
- All Mobius features that were present in the previous release.
- Support for batched inference (assuming VAD segments are fed as inputs) in streaming and batched output modes.
- Support for more accurate multi-segment language detection.
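The difference between the streaming and batched output modes can be sketched as follows. This is an illustration only, not the project's implementation: the `(start, end)` segment format, the helper names, and the stand-in `fake_transcribe_batch` function are all hypothetical, and no real model is called.

```python
from typing import Iterator, List, Tuple

# A VAD segment as (start_seconds, end_seconds); hypothetical format.
Segment = Tuple[float, float]


def iter_batches(segments: List[Segment], batch_size: int) -> Iterator[List[Segment]]:
    """Yield consecutive groups of at most batch_size segments."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for i in range(0, len(segments), batch_size):
        yield segments[i:i + batch_size]


def fake_transcribe_batch(batch: List[Segment]) -> List[str]:
    """Stand-in for a model call that processes one batch at a time."""
    return [f"<text {start:.1f}-{end:.1f}>" for start, end in batch]


def transcribe_streaming(segments: List[Segment], batch_size: int) -> Iterator[str]:
    """Streaming mode: yield results as soon as each batch is done."""
    for batch in iter_batches(segments, batch_size):
        yield from fake_transcribe_batch(batch)


def transcribe_batched(segments: List[Segment], batch_size: int) -> List[str]:
    """Batched output mode: return all results once every batch is done."""
    return list(transcribe_streaming(segments, batch_size))


vad_segments = [(0.0, 4.2), (5.0, 9.1), (10.3, 15.0), (16.0, 18.5), (20.0, 27.4)]
batches = list(iter_batches(vad_segments, batch_size=2))
results = transcribe_batched(vad_segments, batch_size=2)
```

In this sketch both modes share the same batching step; they differ only in whether the caller receives results incrementally or as one list.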
faster whisper v0.9.0 with additional features for mobius ASR v2.1
This release is based on the latest faster-whisper release (v0.9.0). Further changes include:
- Adding multilingual support (major)
- Fixing the seed for consistent results
- Reducing hallucinations by skipping ambiguous transcription segments
- Adding numpy requirements
faster whisper 0.6.0 with multilingual capability, seed and fixes
Faster Whisper v0.6.0 with additional capabilities:
- Multilingual support: an optional flag to support multilingual videos. The default output language is English, with an option to use the code-switched language as the output language instead.
- Setting the seed for the ctranslate2 model: useful for consistent results across runs.
- Skipping a segment if its avg_log_prob is too low: this option also checks no_speech_prob and ignores music and noise segments.
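The segment-skipping rule can be sketched as a simple filter. This is a hedged illustration, not the release's code: the `Segment` class, the `keep_segment` helper, the threshold values, and the choice to drop a segment when either check fails are all assumptions, since the notes do not specify how the two checks are combined.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """Hypothetical container for one transcribed segment."""
    text: str
    avg_log_prob: float    # mean token log-probability (higher is better)
    no_speech_prob: float  # estimated probability that the segment has no speech


def keep_segment(seg, log_prob_threshold=-1.0, no_speech_threshold=0.6):
    """Return False for segments that look like noise or low-confidence text.

    Both thresholds are illustrative values, not taken from the release.
    """
    if seg.no_speech_prob > no_speech_threshold:
        return False  # likely music or noise
    if seg.avg_log_prob < log_prob_threshold:
        return False  # decoder was not confident in this text
    return True


candidates = [
    Segment("Hello and welcome.", avg_log_prob=-0.25, no_speech_prob=0.02),
    Segment("la la la", avg_log_prob=-0.40, no_speech_prob=0.91),
    Segment("thank you thank you", avg_log_prob=-1.70, no_speech_prob=0.10),
]
kept = [seg.text for seg in candidates if keep_segment(seg)]
```

With these made-up scores only the first segment survives: the second is rejected by the no-speech check and the third by the log-probability check.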