@refabr1k
Last active August 16, 2024 02:06
openai-whisper installation (with Torch CUDA and the dependency issue caused by a missing fbgemm.dll)

Setting up Torch with CUDA (if you have an NVIDIA graphics card)

  1. Install the CUDA Toolkit. Pick a CUDA version that PyTorch supports; at the time of writing, CUDA 12.4 is the latest. Download and install it from https://developer.nvidia.com/cuda-12-4-0-download-archive

  2. Check that the CUDA Toolkit is installed: run nvcc --version and confirm that the output reports the CUDA version you installed.
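
The version check can also be scripted. The sketch below is not part of the original notes: it pulls the release number out of nvcc --version output so it can be compared with the "Compute Platform" chosen later. The helper names are illustrative, and the regular expression assumes the banner contains a "release X.Y" fragment, which may not hold for every toolkit version.

```python
import re
import shutil
import subprocess
from typing import Optional


def parse_nvcc_release(output: str) -> Optional[str]:
    """Return the CUDA release (for example '12.4') found in nvcc output, if any."""
    match = re.search(r"release\s+(\d+\.\d+)", output)
    return match.group(1) if match else None


def installed_cuda_release() -> Optional[str]:
    """Run nvcc if it is on PATH; return its release, or None if unavailable."""
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        return None  # toolkit not installed, or its bin folder is not on PATH
    result = subprocess.run([nvcc, "--version"], capture_output=True, text=True)
    return parse_nvcc_release(result.stdout)
```

Calling installed_cuda_release() returns None when nvcc cannot be found, which usually means the toolkit's bin folder is not on PATH yet.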

Installing Torch

  1. Navigate to https://pytorch.org/get-started/locally/ and select the appropriate options for your system (remember to choose the same CUDA version you installed under "Compute Platform").
  2. Run the install command that the page generates.
  3. Test that Torch is installed and can see the GPU from a Python prompt:
>>> import torch

>>> torch.cuda.is_available()
True

>>> torch.cuda.device_count()
1

>>> torch.cuda.current_device()
0

>>> torch.cuda.device(0)
<torch.cuda.device at 0x7efce0b03be0>

>>> torch.cuda.get_device_name(0)
'GeForce GTX 950M'

  4. If any of these checks fail, see the section below (reference: pytorch/pytorch#131662).
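
The interactive checks from step 3 can be bundled into one function that never raises, which is handy when the import itself is what fails. This is a minimal sketch, not the author's method; the function name cuda_report and the dictionary keys are made up for illustration. It uses only torch.cuda.is_available, torch.cuda.device_count, and torch.cuda.get_device_name, the same calls shown above.

```python
import importlib.util


def cuda_report() -> dict:
    """Collect the same facts as the interactive checks, without raising."""
    report = {
        "torch_installed": False,
        "import_error": None,
        "cuda_available": False,
        "devices": [],
    }
    # find_spec looks the package up without importing it.
    if importlib.util.find_spec("torch") is None:
        return report
    report["torch_installed"] = True
    try:
        import torch
    except (OSError, ImportError) as exc:
        # On Windows, a DLL that cannot be loaded surfaces here as an OSError.
        report["import_error"] = str(exc)
        return report
    report["cuda_available"] = torch.cuda.is_available()
    if report["cuda_available"]:
        report["devices"] = [
            torch.cuda.get_device_name(i)
            for i in range(torch.cuda.device_count())
        ]
    return report
```

A non-empty import_error that mentions fbgemm.dll or WinError 126 points to the dependency issue described next.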

Torch dependency issue (missing fbgemm.dll): OSError: [WinError 126] The specified module could not be found.

Solution here: pytorch/pytorch#131662 (comment)

  1. Install Visual Studio Community 2022.
  2. In Visual Studio, open Tools > Get Tools and Features.
  3. On the Individual Components tab, select VS 2022 C++ ... (latest) and install it.
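
Before and after applying the fix, it can help to confirm whether the DLL named in the error is present and whether Windows can load it. The following is a rough sketch under two assumptions: that fbgemm.dll sits in the lib folder of the installed torch package, and that loading it directly with ctypes behaves similarly to how torch loads it. The helper names torch_lib_dir and check_dll are hypothetical.

```python
import ctypes
import importlib.util
import sys
from pathlib import Path
from typing import Optional


def torch_lib_dir() -> Optional[Path]:
    """Locate the torch package's lib folder without importing torch."""
    spec = importlib.util.find_spec("torch")
    if spec is None or spec.origin is None:
        return None
    return Path(spec.origin).parent / "lib"


def check_dll(lib_dir: Path, name: str = "fbgemm.dll") -> str:
    """Return 'missing', 'present' (non-Windows), 'loads', or 'load-failed: ...'."""
    dll = lib_dir / name
    if not dll.is_file():
        return "missing"
    if sys.platform != "win32":
        return "present"  # loading a Windows DLL only makes sense on Windows
    try:
        ctypes.WinDLL(str(dll))
    except OSError as exc:
        # The file exists, so a failure here suggests a dependency problem.
        return f"load-failed: {exc}"
    return "loads"
```

If check_dll reports a load failure while the file exists, the problem is more likely one of the DLL's own dependencies than the file itself, which is consistent with the fix being an extra Visual Studio component rather than a Torch reinstall.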


Install whisper

https://github.com/openai/whisper

whisper ".\New Recording 50.m4a" --model large-v3 --language=en --threads=4 
# [00:00.320 --> 00:05.520]  what is the you know the highest cost or all the all the cost that is causing all the
# [00:05.520 --> 00:11.760]  expensiveness like and see whether from the industrial solution we choose uh you know shed


whisper ".\New Recording 40.m4a" --language Chinese --model large-v3 --threads 4 --output_format txt
# [00:00.000 --> 00:02.160] 嗨,今天天气很好

# To steer the transcription toward Simplified Chinese characters, pass an --initial_prompt written in Simplified Chinese
whisper ".\New Recording 40.m4a" --language Chinese --model large-v3 --threads 4 --output_format txt --initial_prompt '以下是普通话的句子'
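
When several recordings need the same options, the command lines above can be generated from Python. This is a sketch under assumptions, not part of the whisper project: the helper names whisper_command and transcribe_all are invented here, and only the flags already used above (--model, --language, --threads, --output_format, --initial_prompt) are emitted. Passing the arguments as a list avoids shell quoting problems with spaces in file names.

```python
import subprocess
from typing import Iterable, List, Optional


def whisper_command(audio: str, language: str, model: str = "large-v3",
                    threads: int = 4, output_format: Optional[str] = None,
                    initial_prompt: Optional[str] = None) -> List[str]:
    """Build the argument list for one whisper CLI invocation."""
    cmd = ["whisper", audio, "--model", model,
           "--language", language, "--threads", str(threads)]
    if output_format:
        cmd += ["--output_format", output_format]
    if initial_prompt:
        cmd += ["--initial_prompt", initial_prompt]
    return cmd


def transcribe_all(paths: Iterable[str], language: str, **options) -> None:
    """Run whisper once per file; raises CalledProcessError if a run fails."""
    for path in paths:
        subprocess.run(whisper_command(path, language, **options), check=True)
```

For example, transcribe_all(["New Recording 40.m4a"], "Chinese", output_format="txt", initial_prompt="以下是普通话的句子") would issue the same command as the last line above, provided the whisper executable is on PATH.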