Home Assistant - Enabling CUDA GPU support for Wyoming Whisper Docker container
Update: ab-tools has created a Docker image with CUDA support baked in. You should be able to skip the instructions below and use it instead. Thanks ab-tools! https://hub.docker.com/r/abtools/wyoming-whisper-cuda
When trying to use the built-in CUDA support in the Wyoming Whisper Docker container, I received the error: "RuntimeError: CUDA failed with error CUDA driver version is insufficient for CUDA runtime version." Here is how I resolved it.
Prerequisites:
- A working Docker/NVIDIA environment. See: https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html
- Docker Compose
- Wyoming Whisper/Home Assistant configured. See https://www.home-assistant.io/integrations/whisper/ and https://community.home-assistant.io/t/run-whisper-on-external-server/567449
- I used Ubuntu 22.04; other distros should work as well. If you are also using Ubuntu, run the following to install the required packages: apt install libcublas11 libcudnn8
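Before touching the compose file, it can help to confirm the host side is healthy. A quick sanity check might look like the following (this is a sketch assuming the NVIDIA Container Toolkit is installed; the CUDA image tag is only an example):

```shell
# Confirm the GPU and driver are visible from inside a container
docker run --rm --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi

# Confirm the cuDNN/cuBLAS libraries from libcublas11/libcudnn8 are registered on the host
ldconfig -p | grep -E 'libcudnn_(ops|cnn)_infer|libcublas'
```

If either command fails, fix the host environment first; the container changes below will not help otherwise.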
Here is an example of the full error received after attempting to use the CUDA support in Wyoming Whisper:
docker-compose logs -f
whisper | Traceback (most recent call last):
whisper | File "/usr/lib/python3.9/runpy.py", line 197, in _run_module_as_main
whisper | return _run_code(code, main_globals, None,
whisper | File "/usr/lib/python3.9/runpy.py", line 87, in _run_code
whisper | exec(code, run_globals)
whisper | File "/usr/local/lib/python3.9/dist-packages/wyoming_faster_whisper/__main__.py", line 135, in <module>
whisper | asyncio.run(main())
whisper | File "/usr/lib/python3.9/asyncio/runners.py", line 44, in run
whisper | return loop.run_until_complete(main)
whisper | File "/usr/lib/python3.9/asyncio/base_events.py", line 642, in run_until_complete
whisper | return future.result()
whisper | File "/usr/local/lib/python3.9/dist-packages/wyoming_faster_whisper/__main__.py", line 112, in main
whisper | whisper_model = WhisperModel(
whisper | File "/usr/local/lib/python3.9/dist-packages/wyoming_faster_whisper/faster_whisper/transcribe.py", line 58, in __init__
whisper | self.model = ctranslate2.models.Whisper(
whisper | RuntimeError: CUDA failed with error CUDA driver version is insufficient for CUDA runtime version
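The message points to a mismatch between the host NVIDIA driver and the CUDA runtime the container expects. Before changing anything, it is worth checking what the host actually provides (a diagnostic sketch; output varies per system):

```shell
# The header of nvidia-smi shows the driver version and the highest
# CUDA runtime version that driver supports
nvidia-smi

# Confirm Docker knows about the nvidia runtime
docker info | grep -i runtimes
```

If the driver's supported CUDA version is older than what the container's CUDA runtime needs, a driver upgrade on the host is the usual fix.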
I was using a configuration that I normally use for any of my other containers that leverage my GPU:
deploy:
  resources:
    reservations:
      devices:
        - capabilities: [gpu]
The following docker-compose.yml is what eventually got things in working order.
services:
  whisper:
    container_name: whisper
    image: rhasspy/wyoming-whisper:latest
    command: --model small-int8 --language en --beam-size 5 --device cuda
    volumes:
      - /home/whisper/whisper-data:/data
      - /usr/lib/x86_64-linux-gnu/libcudnn_ops_infer.so.8:/usr/lib/x86_64-linux-gnu/libcudnn_ops_infer.so.8:ro
      - /usr/lib/x86_64-linux-gnu/libcudnn_cnn_infer.so.8:/usr/lib/x86_64-linux-gnu/libcudnn_cnn_infer.so.8:ro
      - /usr/lib/x86_64-linux-gnu/libcublasLt.so.11:/usr/lib/x86_64-linux-gnu/libcublasLt.so.12:ro
      - /usr/lib/x86_64-linux-gnu/libcublas.so.11:/usr/lib/x86_64-linux-gnu/libcublas.so.12:ro
    restart: always
    ports:
      - 10300:10300
    runtime: nvidia
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
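Once the container is up, you can double-check from inside it that the library mounts resolved correctly (a sketch using the `whisper` container name from the compose file above):

```shell
# Verify the mounted cuDNN/cuBLAS libraries are visible inside the container
docker exec whisper ls -l /usr/lib/x86_64-linux-gnu/ | grep -E 'libcudnn|libcublas'

# Watch GPU memory and utilization on the host while sending a voice command
nvidia-smi
```

If the `ls` shows the expected `.so.8` and `.so.12` entries, the bind mounts are in place and the container should be able to load them at startup.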
Notice that a few host libraries are mounted into the container. Without these, I would get errors such as the following:
docker-compose logs -f
whisper | INFO:__main__:Ready
whisper | Could not load library libcudnn_ops_infer.so.8. Error: libcudnn_ops_infer.so.8: cannot open shared object file: No such file or directory
whisper | Please make sure libcudnn_ops_infer.so.8 is in your library path!
whisper | Aborted (core dumped) python3 -m wyoming_faster_whisper --uri 'tcp://0.0.0.0:10300' --data-dir /data --download-dir /data "$@"
In the end, things are working as expected. Response times went from ~10 seconds on CPU down to ~1 second on GPU (GTX 1060).
docker-compose logs -f
whisper | INFO:__main__:Ready
whisper | INFO:wyoming_faster_whisper.handler: Turn on living room lights.