Some scripts for convenient voice transcription. At present we use OpenAI Whisper, since its quality is so high.
- Transcribes your voice, and copies it to the clipboard.
- Can run `whisper` each time you record yourself.
- Can run our little whisper server (in the `server-flask/` dir) for faster use (model stays loaded).
- UI buttons: `ui/whisper-buttons` may be run for on-screen buttons.
- Hotkeys: You can assign hotkeys to `whisper-auto` and `whisper-kill-rec`.
- Desktop Shortcuts: Drag the scripts to your desktop (preferably as .desktop entries -- I'm not sure what they'll do as links, because `whisper-auto` tries to find its own location).
Click to play video (remember, the button UI is optional. Hotkeys can be assigned to the scripts, or they can be used as icons on the desktop):
- Donate to patreon.com/jaggz
- Or paypal me at jaggz.h {at} gmail.com
This was initially designed for the desktop-icons method of execution. It updates/renames icons on the desktop to provide status changes (like, it'll rename them to WH-Ready, WH-Rec, ...). This doesn't work on all X11 desktops (the renaming doesn't update the desktop entries). I worked on this for some time but gave up and focused on the little GTK UI (`ui/whisper-buttons`).
We have two methods of using whisper for our purposes:
- No-Server Method: Runs whisper each time you want a transcription (the model must be loaded each run)
- Server Method: Run a separate little server (in `server-flask/`) which keeps the model loaded, so transcriptions are faster
(A venv is a Python virtual environment; in this case it stores all the whisper-related stuff.)
- Make venv:
  - Make venv base dir: `mkdir -p ~/venv` (or wherever you want it. Adjust the following commands to match this base directory.)
  - Create whisper venv: `python -m venv ~/venv/whisper`
- Activate venv: `. ~/venv/whisper/bin/activate`
- Install whisper stuff in venv (a consolidated version of these commands follows this list):
  - `pip install openai-whisper`
  - OR, you can use the versions I'm currently using, which I placed in `requirements-whisper.txt` (to just use the scripts or UI) and `server-flask/requirements-server.txt` (to run the server, which needs openai-whisper AND flask).
  - `pip install flask` <--- used only if you want to run our server
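If you prefer to copy and paste, the same setup looks like this (assuming the `~/venv` base directory used above):

```sh
# Create and activate the venv, then install the dependencies.
mkdir -p ~/venv
python -m venv ~/venv/whisper
. ~/venv/whisper/bin/activate

# Either install whisper (and optionally flask) directly...
pip install openai-whisper
pip install flask   # only if you'll run the server

# ...or use the pinned versions shipped with this repo:
# pip install -r requirements-whisper.txt
# pip install -r server-flask/requirements-server.txt
```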
To use the no-server method:
- Edit `whisper-auto`:
  - Change `use_server=1` to `use_server=0`.
  - If you used a venv other than `~/venv/whisper`:
    - You'll need to change the `envact=` line to point to it,
    - and the `envpattern=` line, which is used to detect if we're in the venv already.
  - (A sketch of these settings follows this list.)
- Run `./whisper-auto` (it will begin recording).
- Hit ctrl-c to stop recording, and whisper should begin processing.
  - Alternatively, use the `whisper-kill-rec` script to end the recording task.
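For reference, here is a rough sketch of what those settings might look like near the top of `whisper-auto`. The variable names come from the steps above, but the exact lines and values in the script may differ, so treat this as illustrative only:

```sh
# Illustrative sketch only -- check whisper-auto for the real lines and values.
use_server=0                               # 0 = run whisper directly; 1 = use the flask server
envact="$HOME/venv/whisper/bin/activate"   # how to activate your whisper venv (adjust if yours differs)
envpattern="venv/whisper"                  # pattern used to detect whether the venv is already active
```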
In the server mode, we run a separate little python app that keeps the model loaded. The server in this project is a modified version of the ultra-easy flask server from https://blog.deepgram.com/how-to-build-an-openai-whisper-api/. Thanks Adam! Check out his blog post.
- In a separate term, go to the `server-flask` folder.
- `flask run` should run it fine.
- It uses port 5000 by default. Edit `app.py` and `whisper-auto` to set the `server_port=` if you change it. (These server steps are consolidated in the snippet after this list.)
- Run `./whisper-auto` (it will begin recording).
- Hit ctrl-c to stop recording, and whisper should begin processing.
  - Alternatively, use the `whisper-kill-rec` script to end the recording task.
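Put together, starting the server looks roughly like this (assuming the `~/venv/whisper` venv from the setup section; the server needs both openai-whisper and flask installed):

```sh
# In its own terminal: activate the venv and start the transcription server.
. ~/venv/whisper/bin/activate
cd server-flask
flask run   # serves on http://127.0.0.1:5000 by default
# If you change the port, remember to update server_port= in app.py and whisper-auto.
```

Then, in another terminal, run `./whisper-auto` as described above.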
To build the optional button UI (and set up hotkeys/desktop shortcuts):
1. Go to the `ui/` folder and type `make`.
1. I've not built out a list of dependencies -- you'll have to figure it out yourself (it uses GTK). Submit a pull-request with dependencies added here if you want :)
1. Assign two hotkeys, one to `whisper-auto` and one to `whisper-kill-rec`.
1. Drag both to your desktop, or otherwise add them.
   Note: They should probably be links or .desktop entries, not copies, so updates will be handled when you `git pull` updates. (A sample .desktop entry is sketched below.)
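If you go the .desktop-entry route, a minimal launcher might look like the sketch below. The path under `Exec=` is just a placeholder for wherever you cloned this repo -- adjust it, and make a matching entry for `whisper-kill-rec`:

```sh
# Hypothetical example -- replace /path/to/your/clone with your actual checkout.
cat > ~/Desktop/whisper-auto.desktop <<'EOF'
[Desktop Entry]
Type=Application
Name=Whisper Auto
Comment=Record, transcribe with whisper, and copy to clipboard
Exec=/path/to/your/clone/whisper-auto
Terminal=false
EOF
chmod +x ~/Desktop/whisper-auto.desktop
```

Some desktops also require you to right-click the new icon and allow launching before it becomes clickable; alternatively, binding the two scripts to hotkeys through your desktop environment's keyboard-shortcut settings works just as well.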