Last active
April 11, 2023 18:48
How good is the RTX 3090 for machine learning, AI, and video rendering tasks: full review and comparison with the RTX 3060
video link : https://www.youtube.com/watch?v=lgP1LNnaUaQ
ffmpeg with cuda support : https://github.com/GyanD/codexffmpeg/releases
check ffmpeg cuda support : ffmpeg -hwaccels
check ffmpeg version : ffmpeg -version
used whisper commit : b5851c6c40e753606765ac45b85b298e3ae9e00d
used automatic1111 web ui commit : 22bcc7be428c94e9408f589966c2040187245d81
used dreambooth extension commit : 926ae204ef5de17efca2059c334b6098492a0641
whisper "C:\rtx 3090 review\whisper_test.mp3" --model large-v1 --language en --initial_prompt "Welcome to the Software Engineering Courses channel." --best_of 10 --beam_size 10 --output_dir "C:\rtx 3090 review\result" --device cuda:1
cpu-z : https://www.cpuid.com/softwares/cpu-z.html
gpu-z : https://www.techpowerup.com/gpuz/
core temp : https://www.alcpu.com/CoreTemp/
nvidia driver download : https://www.nvidia.com/Download/index.aspx?lang=en-us
uninstall any older torch before installing, just in case : pip uninstall torch
pytorch 2 : pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
pytorch 1.13 : pip3 install torch==1.13 torchvision torchaudio --index-url https://download.pytorch.org/whl/cu117
xformers compiled wheels : https://github.com/facebookresearch/xformers/actions/workflows/wheels.yml
xformers for pytorch 1.13 and python 3.10 : pip install https://huggingface.co/MonsterMMORPG/SECourses/resolve/main/xformers-0.0.18.dev494-cp310-cp310-win_amd64.whl
latest cuda download (if it requires a login, create a new account; it is free) : https://developer.nvidia.com/cuda-downloads
latest cudnn download (if it requires a login, create a new account; it is free) : https://developer.nvidia.com/rdp/cudnn-download
use only the second gpu in the current cmd window : set CUDA_VISIBLE_DEVICES=1
for the full scripts, see the files below
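The `ffmpeg -hwaccels` check listed above can also be scripted. The following is a minimal sketch and not part of the original scripts: the helper names `parse_hwaccels` and `ffmpeg_has_cuda` are made up for illustration, and it assumes that `ffmpeg` is on PATH and prints a header line ending in a colon followed by one method name per line.

```python
import shutil
import subprocess
from typing import List


def parse_hwaccels(output: str) -> List[str]:
    """Return the method names printed by `ffmpeg -hwaccels`."""
    lines = [line.strip() for line in output.splitlines()]
    # Skip blank lines and the header line, which ends with a colon
    return [line for line in lines if line and not line.endswith(":")]


def ffmpeg_has_cuda() -> bool:
    """True if an ffmpeg binary on PATH lists the cuda hwaccel method."""
    if shutil.which("ffmpeg") is None:
        return False
    result = subprocess.run(
        ["ffmpeg", "-hide_banner", "-hwaccels"],
        capture_output=True, text=True, check=False,
    )
    return "cuda" in parse_hwaccels(result.stdout)


if __name__ == "__main__":
    print("ffmpeg CUDA hwaccel available:", ffmpeg_has_cuda())
```

If this prints False even though a CUDA-capable card is installed, the ffmpeg build itself is the likely cause, which is why the notes point to a build with CUDA support.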
import torch
from torch.utils.cpp_extension import CUDA_HOME

print(f"CUDA Version: {torch.version.cuda}")
print(f"Torch Version: {torch.__version__}")
# CUDA_HOME is the CUDA toolkit path detected by PyTorch (None if no toolkit was found)
print(f"CUDA_HOME: {CUDA_HOME}")
cudnn_version = torch.backends.cudnn.version()
print(f"The version of cuDNN DLL file used by PyTorch is {cudnn_version}.")
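One detail worth keeping in mind with the `--device cuda:1` flag and the `set CUDA_VISIBLE_DEVICES=1` line used in these notes: once CUDA_VISIBLE_DEVICES is set, the visible GPUs are renumbered from zero inside that process, so with CUDA_VISIBLE_DEVICES=1 the second card appears as cuda:0. The sketch below only simulates that renumbering in plain Python. `visible_device_map` is a hypothetical helper, it handles integer indices only (not GPU UUIDs), and it assumes the documented CUDA behaviour that an invalid entry hides itself and every entry after it.

```python
from typing import Dict


def visible_device_map(cuda_visible_devices: str, physical_count: int) -> Dict[int, int]:
    """Map the in-process device index to the physical GPU index."""
    mapping: Dict[int, int] = {}
    for entry in cuda_visible_devices.split(","):
        entry = entry.strip()
        # Stop at the first entry that is not a valid physical index
        if not entry.isdigit() or int(entry) >= physical_count:
            break
        # Visible devices are numbered 0, 1, 2, ... in the order listed
        mapping[len(mapping)] = int(entry)
    return mapping


if __name__ == "__main__":
    # Two cards installed, only the second one exposed to the process
    print(visible_device_map("1", physical_count=2))
```

Under these assumptions, a program started in a cmd window where the variable is set to 1 would address the second card as cuda:0, and cuda:1 would not exist there.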
# pip install py3nvml
import py3nvml.py3nvml as nvml


def get_gpu_info():
    nvml.nvmlInit()
    device_count = nvml.nvmlDeviceGetCount()
    for i in range(device_count):
        device_handle = nvml.nvmlDeviceGetHandleByIndex(i)
        pci_info = nvml.nvmlDeviceGetPciInfo(device_handle)
        # Get GPU name
        gpu_name = nvml.nvmlDeviceGetName(device_handle)
        # Get GPU memory amount
        memory_info = nvml.nvmlDeviceGetMemoryInfo(device_handle)
        memory_amount = memory_info.total / (1024**2)  # Convert bytes to MiB
        # Get currently active PCIe Generation
        try:
            pcie_gen = nvml.nvmlDeviceGetCurrPcieLinkGeneration(device_handle)
        except nvml.NVMLError as error:
            print(f"Error getting currently active PCIe generation for device {i}: {error}")
            pcie_gen = "Unknown"
        # Get currently active PCIe Link Width
        try:
            pcie_width = nvml.nvmlDeviceGetCurrPcieLinkWidth(device_handle)
        except nvml.NVMLError as error:
            print(f"Error getting currently active PCIe link width for device {i}: {error}")
            pcie_width = "Unknown"
        print(f"GPU {i}:")
        print(f"  Name: {gpu_name}")
        print(f"  Memory Amount: {memory_amount:.2f} MiB")
        print(f"  PCI Bus ID: {pci_info.busId.decode('utf-8')}")
        print(f"  Currently Active PCIe Generation: {pcie_gen}")
        print(f"  Currently Active PCIe Link Width: {pcie_width} lanes")
    nvml.nvmlShutdown()


if __name__ == "__main__":
    get_gpu_info()
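The PCIe generation and link width reported by the script above can be turned into an approximate bandwidth figure. This is a rough sketch, not part of the original script: `pcie_bandwidth_gb_s` is a hypothetical helper, and the numbers are theoretical one-direction maxima derived from the per-lane transfer rate and line encoding of each generation, so real transfers will be slower.

```python
# Per-lane transfer rate in GT/s and line-encoding efficiency per PCIe generation
PCIE_GENERATIONS = {
    1: (2.5, 8 / 10),
    2: (5.0, 8 / 10),
    3: (8.0, 128 / 130),
    4: (16.0, 128 / 130),
    5: (32.0, 128 / 130),
}


def pcie_bandwidth_gb_s(generation: int, lanes: int) -> float:
    """Theoretical one-direction bandwidth in GB/s (10**9 bytes per second)."""
    rate_gt_s, efficiency = PCIE_GENERATIONS[generation]
    # GT/s times efficiency gives usable Gbit/s per lane; divide by 8 for bytes
    return rate_gt_s * efficiency / 8 * lanes


if __name__ == "__main__":
    for gen, lanes in [(3, 16), (4, 16), (4, 8)]:
        print(f"PCIe Gen{gen} x{lanes}: {pcie_bandwidth_gb_s(gen, lanes):.2f} GB/s")
```

By this estimate, a Gen4 link at x8 has the same theoretical bandwidth as a Gen3 link at x16. The currently active generation may also read lower than the slot supports while the card is idle, so it is probably best checked under load.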
rem Hide the first GPU in this particular cmd window so that ffmpeg uses the second card
set CUDA_VISIBLE_DEVICES=1
ffmpeg -hwaccel cuda -hwaccel_output_format cuda -i "C:\\rtx 3090 review\\encode_test.mp4" -c:v hevc_nvenc -preset slow -rc vbr -b:v 100M -rc-lookahead 60 -maxrate 200M -bufsize 300M -vf "scale_cuda=w=7680:h=4320" -c:a copy "C:\\rtx 3090 review\\encode_test_upscaled.mp4"
This is a command for FFmpeg, a popular open-source software suite for handling multimedia files. The command transcodes a video file and uses NVIDIA hardware acceleration for both decoding and encoding to improve performance. Here's a breakdown of the parameters:
-hwaccel cuda: This enables hardware-accelerated decoding through NVIDIA's CUDA support. Decoding is offloaded from the CPU to the GPU, which speeds up processing.
-hwaccel_output_format cuda: This keeps the decoded frames in GPU memory as CUDA frames, so the CUDA scaler and the NVENC encoder can use them without copying them through system RAM.
-i "C:\\rtx 3090 review\\encode_test.mp4": This indicates the input file, which is a video file located in the "C:\rtx 3090 review" folder and named "encode_test.mp4".
-c:v hevc_nvenc: This sets the video codec to NVIDIA's hardware-accelerated HEVC (H.265) encoder, hevc_nvenc.
-preset slow: This selects the NVENC "slow" preset, which favours quality over encoding speed. NVENC has its own preset names (for example fast, medium, slow, and p1 through p7 in newer builds); names such as ultrafast and veryslow belong to the software encoders libx264 and libx265. Run ffmpeg -h encoder=hevc_nvenc to list the presets your build supports.
-rc vbr: This sets the rate control mode to variable bitrate (VBR), which allows the bitrate to vary with the complexity of the video frames. At the same average bitrate, this can give better overall quality than constant bitrate (CBR).
-b:v 100M: This sets the target (average) video bitrate to 100 Mbit/s. The suffix should be an uppercase M: FFmpeg's SI suffixes are case-sensitive, and a lowercase m means milli.
-rc-lookahead 60: This sets the number of frames the encoder analyzes ahead of the current one, allowing it to make better decisions on bitrate allocation. A higher value can improve quality but may increase encoding time and memory use.
-maxrate 200M: This sets the maximum video bitrate to 200 Mbit/s.
-bufsize 300M: This sets the rate control buffer to 300 Mbit. It is a size, not a rate; together with -maxrate it limits how far the bitrate can swing above the target.
-vf "scale_cuda=w=7680:h=4320": This applies a video filter (-vf) that scales the video to 7680x4320 (8K) using the CUDA-based scaler.
-c:a copy: This copies the audio stream from the input file to the output file without re-encoding it.
"C:\\rtx 3090 review\\encode_test_upscaled.mp4": This specifies the output file, which will be saved in the "C:\rtx 3090 review" folder and named "encode_test_upscaled.mp4".
Other options you may consider:
-threads: This parameter sets the number of threads. For example, -threads 8 would use 8 threads. It mainly matters for software codecs and filters, since NVENC encoding runs on the GPU.
-crf: This parameter sets the Constant Rate Factor (CRF) for the software encoders libx264 and libx265, which controls the trade-off between quality and file size. Lower values yield better quality but larger files. hevc_nvenc does not use -crf; its closest equivalent is -cq.
-c:v libx264 or -c:v libx265: These options use the software-based H.264 and H.265 encoders, respectively. They are usually slower than the hardware-accelerated encoders but can provide better quality at the same bitrate in some cases.
-c:a aac: This sets the audio codec to AAC for encoding the audio stream (use it instead of -c:a copy). You can also set the audio bitrate with -b:a, for example, -b:a 128k for 128 kbps.
-ss and -t: These options can be used to trim the video. -ss sets the start time, and -t sets the duration. For example, `-ss 00:01:00 -t 00:02:00` would trim the video to start at 1 minute and last for 2 minutes.
-vf "fps=30": This applies a video filter to change the output frame rate. In this example, the output video will have a frame rate of 30 frames per second.
-af "volume=2": This applies an audio filter to adjust the volume. In this example, the amplitude is doubled (about +6 dB). Audio filters require re-encoding, so this cannot be combined with -c:a copy.
-vf "transpose=1": This applies a video filter to rotate the video by 90 degrees clockwise.
-vf "crop=w=1920:h=1080:x=0:y=0": This applies a video filter to crop the video frame to the specified width, height, and position (x and y).
-map 0:v:0 -map 0:a:0: These options can be used to select specific video and audio streams from the input file. In this example, the first video stream (0:v:0) and first audio stream (0:a:0) are selected.
Only one -vf can be given per output stream, so several video filters are chained with commas inside a single -vf, for example -vf "fps=30,transpose=1".
These are just a few examples of the many options available in FFmpeg. For a comprehensive list of options and their explanations, refer to the official FFmpeg documentation: https://ffmpeg.org/ffmpeg.html.
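When the upscale command is launched from a script, passing the arguments as a list avoids quoting problems with paths that contain spaces, such as "C:\rtx 3090 review". The sketch below is one possible way to assemble it, together with some rough arithmetic on the rate-control numbers. All function names (`build_upscale_cmd`, `video_size_gb`, `buffer_seconds`) are hypothetical, the bitrates are assumed to mean 100 Mbit/s target, 200 Mbit/s maximum, and a 300 Mbit buffer, and the size estimate ignores audio and container overhead.

```python
from typing import List, Optional


def build_upscale_cmd(src: str, dst: str, width: int = 7680, height: int = 4320,
                      start: Optional[str] = None,
                      duration: Optional[str] = None) -> List[str]:
    """Assemble the NVENC upscale command as an argument list."""
    cmd = ["ffmpeg", "-hwaccel", "cuda", "-hwaccel_output_format", "cuda"]
    if start is not None:
        cmd += ["-ss", start]      # before -i: seek in the input
    cmd += ["-i", src]
    if duration is not None:
        cmd += ["-t", duration]    # after -i: limit the output duration
    cmd += [
        "-c:v", "hevc_nvenc", "-preset", "slow", "-rc", "vbr",
        "-b:v", "100M", "-rc-lookahead", "60",
        "-maxrate", "200M", "-bufsize", "300M",
        "-vf", f"scale_cuda=w={width}:h={height}",
        "-c:a", "copy", dst,
    ]
    return cmd


def video_size_gb(bitrate_mbps: float, duration_s: float) -> float:
    """Approximate video stream size in GB (10**9 bytes) at an average bitrate."""
    return bitrate_mbps * 1_000_000 * duration_s / 8 / 1_000_000_000


def buffer_seconds(bufsize_mbit: float, maxrate_mbps: float) -> float:
    """Seconds of video at the maximum rate that fit into the rate-control buffer."""
    return bufsize_mbit / maxrate_mbps


if __name__ == "__main__":
    cmd = build_upscale_cmd(r"C:\rtx 3090 review\encode_test.mp4",
                            r"C:\rtx 3090 review\encode_test_upscaled.mp4")
    print(cmd)
    # subprocess.run(cmd, check=True) would start the actual encode
    print(f"60 s at 100 Mbit/s: about {video_size_gb(100, 60):.2f} GB")
    print(f"Buffer at max rate: {buffer_seconds(300, 200):.1f} s")
```

With these assumed values, one minute of output comes to roughly 0.75 GB, and the buffer holds about 1.5 seconds of video at the maximum rate.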