OpenAI Whisper (fully offline and MIT licensed)
This software was previously mentioned at: https://askubuntu.com/a/1378514/52975 but I wanted to provide a minimal runnable example.
Tested on Ubuntu 24.04:
sudo apt install ffmpeg
pipx install openai-whisper==20231117
Sample usage with this video: https://commons.wikimedia.org/wiki/File:Goldstone_Apple_Valley_Radio_Telescope_(GAVRT)_Solar_Patrol_(SVS14530).webm
wget -O gavrt.webm https://upload.wikimedia.org/wikipedia/commons/4/45/Goldstone_Apple_Valley_Radio_Telescope_%28GAVRT%29_Solar_Patrol_%28SVS14530%29.webm
time whisper gavrt.webm
The video is 1:16 long and features a woman speaking clear American English about a technical subject, with light background music throughout.
The terminal now contains:
/home/ciro/.local/pipx/venvs/openai-whisper/lib/python3.12/site-packages/whisper/transcribe.py:115: UserWarning: FP16 is not supported on CPU; using FP32 instead
warnings.warn("FP16 is not supported on CPU; using FP32 instead")
Detecting language using up to the first 30 seconds. Use `--language` to specify the language
Detected language: English
[... transcript ...]
real 0m32.569s
user 3m40.130s
sys 0m8.885s
and the current working directory now contains, among other files, gavrt.srt:
1
00:00:00,000 --> 00:00:05,080
The Gavart Solar Patrol program is a heliophysics program aimed at citizen
2
00:00:05,080 --> 00:00:09,120
scientists and K through 12 students both locally, nationally and throughout the
3
00:00:09,120 --> 00:00:14,480
world. The goal of Gavart Solar Patrol is to monitor active regions on the Sun in
4
00:00:14,480 --> 00:00:18,120
order to understand how they're connected to explosive events that we
5
00:00:18,120 --> 00:00:23,100
categorize under space weather. Participants can remote in and actually
6
00:00:23,100 --> 00:00:26,480
control the telescope themselves. So a common observing mode with the Gavart
7
00:00:26,480 --> 00:00:30,400
Solar Patrol is that we'll have classrooms actually operate the
8
00:00:30,400 --> 00:00:34,760
telescope themselves, collect some data and then generate maps of what the Sun
9
00:00:34,760 --> 00:00:39,560
looks like at radio frequencies. They're gaining a really unique experience that I
10
00:00:39,560 --> 00:00:43,960
think is really special to the Gavart program and that's the ability to walk
11
00:00:43,960 --> 00:00:48,280
through the scientific process from the very beginning, from the steps of
12
00:00:48,280 --> 00:00:52,120
collecting the data themselves, all the way to reducing that data and
13
00:00:52,120 --> 00:00:56,880
interpreting scientific results from their studies. I get excited anytime I
14
00:00:56,880 --> 00:01:00,040
get to operate a radio telescope and so I really enjoy it when other people get
15
00:01:00,040 --> 00:01:04,800
to have that same opportunity and that same learning process.
Amazing! The transcription is perfect or nearly so, and both installation and usage were seamless.
Benchmarked on a Lenovo ThinkPad P14s AMD laptop.
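The log above suggests `--language`, and since this video is known to be in English, the detection step can be skipped. The following is only a sketch: the wrapper name `transcribe_en` is hypothetical, and the `base` model is an arbitrary choice that was not benchmarked here. The flags themselves (`--language`, `--model`, `--output_format`, `--output_dir`) are part of the whisper CLI.

```shell
# transcribe_en FILE: hypothetical wrapper around the whisper CLI.
# --language en        skips the language-detection step seen in the log above
# --model base         picks a model size (arbitrary choice, not benchmarked here)
# --output_format srt  writes only the .srt file instead of every format
# --output_dir .       puts the result in the current directory
transcribe_en() {
  whisper "$1" --language en --model base --output_format srt --output_dir .
}

# Usage: transcribe_en gavrt.webm
```

Smaller models generally run faster at some cost in accuracy, so the timing and transcript quality shown above would not necessarily carry over to a different `--model` value.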
vosk-transcriber
This is a convenient CLI for Vosk. Tested on Ubuntu 24.04, you can install it together with the English model with:
pipx install vosk==0.3.45
mkdir -p ~/var/lib/vosk
cd ~/var/lib/vosk
wget https://alphacephei.com/vosk/models/vosk-model-en-us-0.22.zip
unzip vosk-model-en-us-0.22.zip
cd -
and then use it as:
time vosk-transcriber -m ~/var/lib/vosk/vosk-model-en-us-0.22 -i gavrt.webm -o gavrt.srt -t srt
it took:
real 0m26.538s
user 0m22.677s
sys 0m5.617s
and gavrt.srt contains:
1
00:00:00,690 --> 00:00:03,030
the gabbert solar patrol program is a
2
00:00:03,030 --> 00:00:05,760
helium physics program aimed at citizen scientists
3
00:00:05,760 --> 00:00:07,620
and keep you told students both locally
4
00:00:07,620 --> 00:00:10,260
nationally and throughout the world the goal
5
00:00:10,260 --> 00:00:12,960
of gabbert solar control is to monitor
6
00:00:12,990 --> 00:00:14,850
active regions on the sun in order
7
00:00:14,850 --> 00:00:17,610
to understand how they're connected to explosive
8
00:00:17,610 --> 00:00:20,160
events that we categorize under space weather
9
00:00:20,610 --> 00:00:23,400
participants can remote in and actually control
10
00:00:23,400 --> 00:00:25,680
the telescope themselves so a common observing
11
00:00:25,680 --> 00:00:27,870
mode with the gabbert solo patrol is
12
00:00:27,870 --> 00:00:30,420
that we'll have classrooms actually operate the
13
00:00:30,420 --> 00:00:33,330
telescope themselves collect some data and then
14
00:00:33,360 --> 00:00:35,010
generate maps of what the sun looks
15
00:00:35,010 --> 00:00:37,920
like at radio frequencies they're gaining a
16
00:00:37,920 --> 00:00:39,900
really unique experience that i think is
17
00:00:39,990 --> 00:00:40,320
is real
18
00:00:40,320 --> 00:00:42,390
early special to the gabbert program and
19
00:00:42,390 --> 00:00:44,370
that's the ability to walk through the
20
00:00:44,370 --> 00:00:47,430
scientific process from the very beginning from
21
00:00:47,430 --> 00:00:49,830
the steps of collecting the data themselves
22
00:00:50,160 --> 00:00:51,990
all the way to producing that data
23
00:00:52,080 --> 00:00:55,380
and interpreting scientific results from their studies
24
00:00:55,770 --> 00:00:57,120
i get excited anytime i get to
25
00:00:57,120 --> 00:00:58,890
operate a radio telescope and i really
26
00:00:58,890 --> 00:01:00,240
enjoy it when other people get to
27
00:01:00,240 --> 00:01:00,450
have
28
00:01:00,570 --> 00:01:03,060
seeing opportunity and that same learning process
29
00:01:04,080 --> 00:01:08,010
and
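So it is clearly less accurate than Whisper: for example, it produced "helium physics" instead of "heliophysics" and "keep you told" instead of "K through 12", and its output has no punctuation or capitalization.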
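To put a rough number on the difference, the two transcripts can be reduced to bare words and compared with `diff`. This is a minimal sketch, not a proper word error rate: it assumes the two outputs were saved under the hypothetical names `whisper.srt` and `vosk.srt` (the commands above both write `gavrt.srt`, so the second run overwrites the first), and the helper names `srt_words` and `count_word_diffs` are made up for illustration.

```shell
# srt_words FILE: print the spoken words of an SRT file, one per line,
# lowercased and without punctuation, so that two transcripts can be diffed.
# Cue numbers, blank lines and timestamp lines are dropped (a subtitle line
# made only of digits would be dropped too).
srt_words() {
  tr -d '\r' < "$1" \
    | grep -v -e '-->' -e '^[0-9]*$' \
    | tr '[:upper:]' '[:lower:]' \
    | tr -c "[:alnum:]'\n" ' ' \
    | tr -s ' ' '\n' \
    | grep -v '^$'
}

# count_word_diffs A.srt B.srt: print how many words diff marks as removed
# from A or added in B (0 means the wording is identical).
count_word_diffs() {
  _a=$(mktemp) _b=$(mktemp)
  srt_words "$1" > "$_a"
  srt_words "$2" > "$_b"
  diff "$_a" "$_b" | grep -c '^[<>]' || true
  rm -f "$_a" "$_b"
}

# Usage (hypothetical file names): count_word_diffs whisper.srt vosk.srt
```

Dropping the `grep -c` and reading the raw `diff` output shows exactly which words differ, which is often more informative than the count.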
Benchmarked on a Lenovo ThinkPad P14s AMD laptop.
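The wall-clock timings can also be expressed as a speed factor relative to the length of the video (1:16, i.e. 76 seconds). A small sketch, where the helper name `speed_factor` is hypothetical and the inputs are the `real` times reported above:

```shell
# speed_factor AUDIO_SECONDS WALL_SECONDS
# Prints how many seconds of audio were processed per second of wall-clock time.
speed_factor() {
  awk -v audio="$1" -v wall="$2" 'BEGIN { printf "%.2f\n", audio / wall }'
}

speed_factor 76 32.569   # Whisper: prints 2.33
speed_factor 76 26.538   # vosk-transcriber: prints 2.86
```

This only reflects elapsed time. The `user` times tell a different story about total CPU work: 3m40s for Whisper versus about 23 seconds for vosk-transcriber.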