# Instructions for kernel 4.14 and CUDA 9.1
# If upgrading from 4.13 and CUDA 9.0, purge the old packages first:
$ sudo apt-get purge --auto-remove 'libcud*'
$ sudo apt-get purge --auto-remove 'cuda*'
$ sudo apt-get purge --auto-remove 'nvidia*'
# Also remove the leftover install directory at /usr/local/cuda-9.0/
# Important libs required with kernel 4.14.x and CUDA 9.x
$ sudo apt install libelf1 libelf-dev
# Install Intel Graphics Patch Firmwares (this should reboot your system):
bash -c "$(curl -fsSL http://bit.ly/IGFWL-install)"
# Update to the 4.14 kernel. nvidia-384 compiles fine with this.
cd /tmp
wget http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.14.13/linux-headers-4.14.13-041413_4.14.13-041413.201801101001_all.deb
wget http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.14.13/linux-headers-4.14.13-041413-generic_4.14.13-041413.201801101001_amd64.deb
wget http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.14.13/linux-image-4.14.13-041413-generic_4.14.13-041413.201801101001_amd64.deb
# Install only the three kernel packages downloaded above, not every .deb in /tmp.
sudo dpkg -i linux-headers-4.14.13-*.deb linux-image-4.14.13-*.deb
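# After rebooting, it is worth confirming that the new kernel is the one actually running.
# A minimal sketch, assuming GNU sort (for -V); the helper name version_ge is made up for this example.

```shell
# version_ge A B: succeeds if version A is greater than or equal to version B.
# Relies on GNU sort's -V (version sort) option.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Strip the "-041413-generic" style suffix from the running kernel release.
running="$(uname -r)"
if version_ge "${running%%-*}" "4.14"; then
  echo "running kernel $running is 4.14 or newer"
else
  echo "running kernel $running is older than 4.14; reboot into the new kernel"
fi
```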
# Add the Nvidia driver repository
sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt update
# Install via ubuntu-drivers
sudo ubuntu-drivers autoinstall
# <optional for ML folks> Install CUDA 9 (if you're interested in using the GPU for ML) ~> requires the Nvidia stable drivers
# This package acts as a local apt repo, so keep it somewhere safe and do not delete it.
# CUDA is now available to registered members only; download it from https://developer.nvidia.com/cuda-release-candidate-download
sudo dpkg -i cuda-repo-ubuntu1604-9-0-local-rc_9.0.103-1_amd64.deb
sudo apt-key add /var/cuda-repo-<version>/7fa2af80.pub
sudo apt-get update
sudo apt-get install cuda
# Update PATH to include /usr/local/cuda/bin, in case you're using dotfiles.
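# One way to do the PATH update is a small snippet in ~/.bashrc or ~/.zshrc.
# This is a sketch, assuming CUDA lives under the default /usr/local/cuda symlink; add_to_path is a made-up helper name.

```shell
# add_to_path DIR: prepend DIR to PATH unless it is already present,
# so that sourcing the rc file repeatedly does not grow PATH.
add_to_path() {
  case ":$PATH:" in
    *":$1:"*) ;;               # already on PATH, nothing to do
    *) PATH="$1:$PATH" ;;
  esac
  export PATH
}

add_to_path "/usr/local/cuda/bin"
```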
# <optional for ML folks> Install cuDNN for running things like TensorFlow
# Download cuDNN 7 for CUDA 9 from Nvidia: https://developer.nvidia.com/cudnn
sudo dpkg -i libcudnn7_7.0.2.43-1+cuda9.0_amd64.deb
# Testing CUDA.
$ sudo prime-select intel
$ nvidia-smi
# This should give an error: no drivers found.
# Now try this:
$ sudo prime-select nvidia
$ nvidia-smi
# Displays CUDA info.
# At this point the Nvidia drivers work well enough. Search for "nvidia x server settings"
# in your applications menu and you can switch between the Intel and Nvidia PRIME profiles.
# But the Nvidia card is still ON (bbswitch will report it off, but the battery consumption is 20 +/- 5 W).
# You can stop here if you're not worried about battery life; if you are, continue with the steps below.
# Install powertop
sudo apt install powertop
# TLP interferes with Bluetooth; better not to install it, or to remove it completely.
sudo apt remove --purge tlp
# Run powertop:
sudo powertop
# You should see a battery discharge rate of around 20 W +/- 5 W; this drains my battery 4 times faster.
# Add kernel command-line params:
sudo nano /etc/default/grub
# Make the following line look like this, do not ask why.
GRUB_CMDLINE_LINUX_DEFAULT='pcie_port_pm=off acpi_backlight=none acpi_osi=Linux acpi_osi=! acpi_osi="Windows 2009"'
sudo update-grub2
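# The new parameters only take effect after a reboot. A quick way to check that they were picked up
# is to look at /proc/cmdline. A minimal sketch; has_kernel_param is a made-up helper name.

```shell
# has_kernel_param CMDLINE PARAM: succeeds if PARAM appears as a whole
# space-separated word in CMDLINE.
has_kernel_param() {
  case " $1 " in
    *" $2 "*) return 0 ;;
    *) return 1 ;;
  esac
}

# Report on two of the parameters set in /etc/default/grub above.
if [ -r /proc/cmdline ]; then
  cmdline="$(cat /proc/cmdline)"
  for param in pcie_port_pm=off acpi_backlight=none; do
    if has_kernel_param "$cmdline" "$param"; then
      echo "$param: active"
    else
      echo "$param: missing"
    fi
  done
fi
```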
# Install bumblebee. This is the danger zone: this software has not been updated in a while, and I am not sure when an update will be available.
# Avoid updating your system if you're fine with this.
sudo add-apt-repository ppa:bumblebee/testing
sudo apt update
sudo apt install bumblebee bumblebee-nvidia
# At the time of writing, the latest is nvidia-381, but CUDA 8 requires the stable driver, which is nvidia-375.
# Add the driver to the bumblebee config file.
sudo nano /etc/bumblebee/bumblebee.conf
# Change 'Driver=' to 'Driver=nvidia'
# Change all occurrences of 'nvidia-current' to 'nvidia-xxx' (xxx = your installed driver version)
# Change KernelDriver to 'KernelDriver=nvidia-384'
# Save and run:
sudo service bumblebeed restart
# This should report that the daemon is already running:
sudo bumblebeed
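# If you would rather not edit bumblebee.conf by hand, the same changes can be sketched with sed.
# Assumptions: the stock file still contains an empty 'Driver=' line and 'nvidia-current' paths, and
# the driver version (384 here) matches what you installed. patch_bumblebee_conf is a made-up helper
# name; it only prints the result, so you can review it before copying it into place.

```shell
# patch_bumblebee_conf FILE VERSION: print FILE with the manual edits applied:
#   - 'Driver=...'      becomes 'Driver=nvidia'
#   - 'nvidia-current'  becomes 'nvidia-VERSION' everywhere (this also covers KernelDriver)
patch_bumblebee_conf() {
  sed -e 's/^Driver=.*/Driver=nvidia/' \
      -e "s/nvidia-current/nvidia-$2/g" \
      "$1"
}

# Example (review the output before replacing the real file):
#   patch_bumblebee_conf /etc/bumblebee/bumblebee.conf 384 > /tmp/bumblebee.conf.new
#   sudo cp /tmp/bumblebee.conf.new /etc/bumblebee/bumblebee.conf
```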
# Since loading the driver will now be handled by bumblebee, we need to stop the OS from loading it.
sudo nano /etc/modprobe.d/bumblebee.conf
# Make the following section look like this (the drm line will be added):
#387
blacklist nvidia-387
blacklist nvidia-387-drm
blacklist nvidia-387-updates
blacklist nvidia-experimental-387
# Once that is done, you'll need the bbswitch DKMS module.
sudo apt-get install bbswitch-dkms
# Load it at boot.
sudo nano /etc/modules-load.d/modules.conf
# Add the following:
i915
bbswitch
# Now make sure nvidia-settings has the Nvidia PRIME profile selected.
# So what actually happened:
# Control over switching between graphics cards has moved from Nvidia's driver to bumblebee. This helps
# maximize battery life because you can now choose which graphics card each application uses. To give
# an application access to the Nvidia GPU, run it using optirun.
# e.g. if you want to run Steam on the Nvidia GPU, run something like: $ optirun steam
# If you're using the GPU for ML tasks, just run them with optirun and they will work just fine.
# Additional notes
# TLP is known to interfere with bumblebee; avoid using it. See https://wiki.archlinux.org/index.php/Talk:Bumblebee#Bumblebee_and_TLP_interferening
# Run powertop to check that battery consumption is under control: 10 W +/- 5 W
# Testing bumblebee
cat /proc/acpi/bbswitch   # Output: 0000:01:00.0 OFF
optirun glxgears -info    # Runs the Gears demo
optirun nvidia-smi        # Should give an error
sudo prime-select nvidia  # Should select Nvidia hardware for CUDA
optirun nvidia-smi        # Outputs:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 387.34                 Driver Version: 387.34                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 1050    Off  | 00000000:01:00.0 Off |                  N/A |
| N/A   29C    P0    N/A /  N/A |      5MiB /  4041MiB |      2%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0     23052      G   /usr/lib/xorg/Xorg                             5MiB |
+-----------------------------------------------------------------------------+
cat /proc/acpi/bbswitch   # Still outputs 0000:01:00.0 OFF, which means we're using the Nvidia hardware only when we run applications using optirun
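# If you want to check the card's power state from a script (for example before and after an optirun
# session), the bbswitch status line can be parsed. A minimal sketch; bbswitch_state is a made-up
# helper name, and the "<pci address> <ON|OFF>" format is taken from the output shown above.

```shell
# bbswitch_state LINE: print the power state (ON or OFF) from a bbswitch
# status line such as "0000:01:00.0 OFF".
bbswitch_state() {
  printf '%s\n' "$1" | awk '{ print $2 }'
}

# Only meaningful on a machine where the bbswitch module is loaded.
if [ -r /proc/acpi/bbswitch ]; then
  echo "discrete GPU is $(bbswitch_state "$(cat /proc/acpi/bbswitch)")"
fi
```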
# Getting mouse freezes or random misses?
dmesg -w | grep psmouse # Check whether your trackpad frequently goes out of sync
# If it does, add this boot flag:
"psmouse.resetafter=0"
# The system boots to a blank screen with just a cursor?
# I found that if nvidia is selected when the system restarts, it needs the Nvidia driver again at boot,
# but unfortunately we disable Nvidia at startup. A simple fix is to add these aliases to your ~/.(bash|zsh)rc
# I always restart my system from the terminal, so this makes sure I always switch back to Intel before restarting.
alias reboot="sudo prime-select intel; sudo reboot now"
alias shutdown="sudo prime-select intel; sudo shutdown -h now"
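# For the trackpad fix mentioned earlier, the psmouse.resetafter=0 boot flag goes on the same
# GRUB_CMDLINE_LINUX_DEFAULT line in /etc/default/grub as the other parameters. A sketch of how that
# line might look, assuming you kept the parameters from the GRUB step unchanged:

```shell
# /etc/default/grub (fragment): the earlier parameters plus the psmouse flag at the end.
GRUB_CMDLINE_LINUX_DEFAULT='pcie_port_pm=off acpi_backlight=none acpi_osi=Linux acpi_osi=! acpi_osi="Windows 2009" psmouse.resetafter=0'
```

# Run sudo update-grub2 afterwards and reboot for the flag to take effect.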
# Other helpful links:
http://en.community.dell.com/techcenter/os-applications/f/4613/t/19629103
https://karlgrz.com/dell-xps-15-ubuntu-tweaks/
https://hemenkapadia.github.io/blog/2016/05/07/Ubuntu-with-Nvidia-Bumblebee.html
https://askubuntu.com/questions/879856/nvidia-prime-cant-switch-to-intel/885487
http://www.webupd8.org/2016/08/how-to-install-and-configure-bumblebee.html
# Benchmarks
==================================
GpuTest 0.7.0
http://www.geeks3d.com
Module: TessMark X64
Score: 5243 points (FPS: 87)
Settings:
- 1920x1080 windowed
- antialiasing: Off
- duration: 60000 ms
Renderer:
- GeForce GTX 1050/PCIe/SSE2
- OpenGL: 4.5.0 NVIDIA 384.98
==================================
==================================
GpuTest 0.7.0
http://www.geeks3d.com
Module: Plot3D
Score: 11340 points (FPS: 189)
Settings:
- 1920x1080 windowed
- antialiasing: Off
- duration: 60000 ms
Renderer:
- GeForce GTX 1050/PCIe/SSE2
- OpenGL: 4.5.0 NVIDIA 384.98
==================================
==================================
GpuTest 0.7.0
http://www.geeks3d.com
Module: FurMark
Score: 2670 points (FPS: 44)
Settings:
- 1920x1080 windowed
- antialiasing: Off
- duration: 60000 ms
Renderer:
- GeForce GTX 1050/PCIe/SSE2
- OpenGL: 4.5.0 NVIDIA 384.98
==================================
@whizzzkid Thank you. After trying dozens of tutorials all over the internet, yours is the only one that worked reliably. I am so thankful to you for finally getting my GPU running selectively. I had very nearly given up on Linux and moved back to Windows before your gist solved everything.
I found that after I did this with 4.11 it worked fine, but once I rebooted I got AHCI errors and can't use anything but my onboard graphics. Any suggestions?
I am tempted to give 4.13 a go.
Further Details:
Ubuntu 16.04
Kernel: 4.11.2-041102-generic
Error on boot (doesn't stop booting)
ACPI Error: Namespace lookup failure, AE_NOT_FOUND.
Then when I attempt to shut down I get:
NMI watchdog: BUG: soft lockup - CPU#1 stuck....
I am keen to know what BIOS version everyone has. I am running v1.2.4 and I'm not sure whether I need to upgrade or whether that will cause me more headaches.
@mark-bucknell I am getting the same errors; please let me know if you find a solution. I have also tried Manjaro, Solus, Ubuntu 17.10 and Elementary, and I get the same errors on every one.
@bajubullet No such luck, but I am keeping an eye on this thread.
Sorry guys, I was away. I also noticed there was an update for nvidia-384 yesterday; I am not sure what it's supposed to do, but it did not mess anything up.
Let me respond to you one by one:
@hlavki
Here is the output:
$ optirun glxgears -fullscreen
1035 frames in 5.0 seconds = 206.857 FPS
953 frames in 5.0 seconds = 190.546 FPS
1017 frames in 5.0 seconds = 203.210 FPS
1010 frames in 5.0 seconds = 201.939 FPS
1012 frames in 5.0 seconds = 202.285 FPS
@rjcrystal
During my initial setup I started by installing Mint, and it had too much stuff missing; Ubuntu sucked too. KDE Neon looked like the best contender as it had most of the stuff working out of the box. I am not sure what the error here is, but do post if you find a solution. By the way, primus should have been installed when installing bumblebee.
@Moulick
thanks :)
@mark-bucknell
As far as I remember I was having problems with 4.11, so I moved to 4.13 already. I am on 4.13.0-041300-generic and on BIOS 1.4, planning to go to 1.5 soon.
So I just did the Dell BIOS upgrade from v1.2.4 to 1.4.0 and I followed everything down to line 23. All looks good. I am now able to support 3 monitors (just) and the error message on shutdown has gone. I still have an error message showing on startup, but I know that is something to do with the Linux kernel and ACPI, so I'll wait for updates that will fix that.
@whizzzkid thank you for maintaining this gist
Hi,
Just to let you know, I followed this tutorial a month ago when I received my laptop.
I am using it daily with Kubuntu 16.04 + backports to get Plasma 5.08, BIOS 1.5, Linux 4.13.8, Nvidia 387, Bumblebee...
I have about 7 W in powertop at idle.
Options in GRUB:
acpi_rev_override=1 pcie_port_pm=off
Everything is fine, so thank you so much!
@mark-bucknell: good for you!
@romainreignier: cheers :) I am still on Driver Version: 384.98 because of CUDA 9.
I seek advice from you wise folks. I am having a hard time making this installation work with nvidia 384 (with CUDA 8.0 ga_2 and libcudnn 6.0 on KDE Neon LTS 16.04). It seems KDE Neon does not like it at the moment because of this bug [1].
I tried rolling back to nvidia 375, but apt-get won't let me because 375 is a transitional version and automatically force-enables nvidia 384.
I tried installing nvidia 381 instead of 384, but the CUDA 8.0 deb installation forces nvidia 384.
Do you have any advice for a workaround I may try?
Also a dumb question: is updating the kernel to 4.13 required for getting nvidia 384 to work?
[1] https://blog.martin-graesslin.com/blog/2017/08/warning-nvidia-driver-384-69-seems-to-be-broken-with-qtquick/
@dhfromkorea With this setup you will not be loading the Nvidia drivers during normal use of the system. That bug is a problem when you're explicitly loading the Nvidia drivers at boot. Since this setup does not do that, and instead loads them only when running a program with optirun, it shouldn't be a problem.
This is probably a dumb question, but what does bumblebee do? It sounds like it lets you optionally use the Nvidia card, i.e. take the output of the Nvidia card and route it to the Intel graphics unit. Why would you want to do this? Just for power savings, or is there something else?
Thanks
Great gist @whizzzkid! Many thanks.
After a lot of effort, black screens with no proper display adapter, and reinstalls, I finally managed to get it working properly. For me, what was missing from the whole gist and caused trouble was sudo apt-get install primus. Without it I got an error, something like "proper bridge missing".
Also, very useful for ML folks: since you can run only what you need on your GPU, TensorFlow and other ML and DL libraries are now faster than ever 👍
Just type optirun jupyter notebook and you are ready to go!
Thanks for this excellent tutorial and for keeping it up to date! I had to do a complete reinstall of my laptop today but I had problems with the first step:
# Install Intel Graphics Patch Firmwares (This should reboot your system):
bash -c "$(curl -fsSL http://bit.ly/IGFWL-install)"
I solved it by making a local copy of that script and adding cd /tmp at the beginning of the script.
For nvidia-387 and kernel 3.14 I had to install libelf-dev before dkms would build a kernel module.
@jasonbeach bumblebee keeps the Nvidia card off most of the time. If you need to run an application that requires access to CUDA, you can simply run it using $ optirun [any app]
@littlewine You're welcome :)
@saroele did you have curl installed?
@geertw updated the gist with kernel 4.14.13 and CUDA 9.1
Hi @whizzzkid,
I managed to get a working install a few months ago following your gist, but I've since reinstalled my system and can't get anything working now. It seems I've repeatedly landed on kernel and driver versions that don't work together. Do you have a definitive list of combinations that actually work? I've observed poor X performance with kernels >= 4.13, so I'm running 4.10 (which doesn't work with nvidia-384).
Thanks for your work, it's been a real help.
Great stuff! Especially the boot config (GRUB_CMDLINE_LINUX_DEFAULT)
@paines For those stuck with a failed Xorg server: I managed to recover by uninstalling all the packages mentioned above and running ubuntu-drivers autoinstall from recovery mode.
It's a nice guide. I got as far as having CUDA running, then gave up after three attempts to fix the rest. Could it be that it doesn't work with GNOME?
Thanks for this!
I did this because my laptop had heating issues from the Nvidia card always being on. Now it stays really cool!
Also, in my case, using primusrun instead of optirun gave me better performance for Steam.
@whizzzkid, my setup has suddenly stopped working. I had an abrupt shutdown while Steam was running with primusrun.
Then, when I try to start Steam again, it just runs the updater and quits.
$ primusrun steam
/usr/bin/primusrun: line 41: warning: command substitution: ignored null byte in input
Running Steam on ubuntu 17.10 64-bit
STEAM_RUNTIME is enabled automatically
Pins up-to-date!
[2018-04-07 21:44:12] Startup - updater built Apr 2 2018 15:23:43
Looks like steam didn't shutdown cleanly, scheduling immediate update check
[2018-04-07 21:44:13] Checking for update on startup
[2018-04-07 21:44:13] Checking for available updates...
[2018-04-07 21:44:14] Download skipped: /client/steam_client_ubuntu12 version 1522709999, installed version 1522709999
[2018-04-07 21:44:14] Nothing to do
[2018-04-07 21:44:14] Verifying installation...
[2018-04-07 21:44:14] Performing checksum verification of executable files
[2018-04-07 21:44:15] Verification complete
$
This doesn't happen when running steam on Intel or with optirun, but the performance is just not there.
I am also seeing some weird dmesg output.
[ 4265.046654] ldconfig.real[17704]: segfault at 338 ip 000000000049c5a7 sp 00007ffd43442710 error 4 in ldconfig.real[400000+e2000]
[ 4265.193360] bbswitch: enabling discrete graphics
[ 4265.407318] nvidia-nvlink: Nvlink Core is being initialized, major device number 239
[ 4265.407797] NVRM: loading NVIDIA UNIX x86_64 Kernel Module 390.48 Thu Mar 22 00:42:57 PDT 2018 (using threaded interrupts)
[ 4266.463078] nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms 390.48 Wed Mar 21 23:48:34 PDT 2018
[ 4266.479341] nvidia-uvm: Loaded the UVM driver in 8 mode, major device number 238
[ 4266.509646] nvidia-modeset: Allocated GPU:0 (GPU-e67ebdca-1dc6-cc63-e15c-c63297995ddf) @ PCI:0000:01:00.0
[ 4266.509799] nvidia-modeset: Freed GPU:0 (GPU-e67ebdca-1dc6-cc63-e15c-c63297995ddf) @ PCI:0000:01:00.0
[ 4266.568571] nvidia-modeset: Allocated GPU:0 (GPU-e67ebdca-1dc6-cc63-e15c-c63297995ddf) @ PCI:0000:01:00.0
[ 4266.568720] nvidia-modeset: Freed GPU:0 (GPU-e67ebdca-1dc6-cc63-e15c-c63297995ddf) @ PCI:0000:01:00.0
[ 4268.457375] steam[17728]: segfault at a8 ip 00000000f75c8f83 sp 00000000ffd95e10 error 4 in libGL.so.390.48[f7547000+c5000]
[ 4268.582475] nvidia-modeset: Unloading
[ 4268.611549] nvidia-uvm: Unloaded the UVM driver in 8 mode
[ 4268.639389] nvidia-nvlink: Unregistered the Nvlink Core, major device number 239
[ 4268.697497] bbswitch: disabling discrete graphics
[ 4268.714983] pci 0000:01:00.0: Refused to change power state, currently in D0
[ 4273.194150] bbswitch: enabling discrete graphics
[ 4273.417666] nvidia-nvlink: Nvlink Core is being initialized, major device number 239
[ 4273.418106] NVRM: loading NVIDIA UNIX x86_64 Kernel Module 390.48 Thu Mar 22 00:42:57 PDT 2018 (using threaded interrupts)
[ 4274.474290] nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms 390.48 Wed Mar 21 23:48:34 PDT 2018
[ 4274.491723] nvidia-uvm: Loaded the UVM driver in 8 mode, major device number 238
[ 4274.523783] nvidia-modeset: Allocated GPU:0 (GPU-e67ebdca-1dc6-cc63-e15c-c63297995ddf) @ PCI:0000:01:00.0
[ 4274.523930] nvidia-modeset: Freed GPU:0 (GPU-e67ebdca-1dc6-cc63-e15c-c63297995ddf) @ PCI:0000:01:00.0
[ 4274.575560] nvidia-modeset: Unloading
[ 4274.640395] nvidia-uvm: Unloaded the UVM driver in 8 mode
[ 4274.659580] nvidia-nvlink: Unregistered the Nvlink Core, major device number 239
[ 4274.693555] bbswitch: disabling discrete graphics
[ 4274.711057] pci 0000:01:00.0: Refused to change power state, currently in D0
Welcome everyone
I updated to 390.48 and the 4.15.18 kernel a couple of days ago; the watts are oddly satisfying at idle.
@fayazkhan Try reinstalling; I am clueless.
Cheers :)
@whizzzkid It actually wasn't a setup issue, but a Steam issue that has a fix here: ValveSoftware/steam-for-linux#5428 (comment)
This command solves it:
primusrun steam -steamos
Hello! Thank you for taking the time to provide clear instructions.
I am running Ubuntu 16.04 on my Dell XPS 15 9560 (UHD screen, 16 GB RAM, 512 GB SSD etc.). After following the guide, I ended up with a black screen (no cursor) upon reboot. I have a feeling the error has something to do with one of these factors:
- When running sudo bumblebeed , the output was something to the effect of "nvidia" not being found.
- When I ran cat /proc/acpi/bbswitch, the output was 0000:01:00.0 ON
- I installed the V4.14.13 kernel, but sudo ubuntu-drivers autoinstall installed the version 396 driver.
Do you perhaps have any advice how I could get this working properly? At the moment, having removed all nvidia related components, powerstat indicates an average power draw of 25W. I would very much like to get this down to more reasonable figures.
My CPU temperature sits around 50 degrees - which is hotter than the 40 degrees I have read online others are able to achieve. Furthermore, the GPU temperature seems to be at around 60 degrees.
I finally installed Manjaro Arch Linux and suddenly everything works perfectly out of the box, zero configuration required!
When I run optirun it says
[ 200.660549] [ERROR]No bridge found. Try installing primus or virtualgl.
I am using Linux Mint 18 with an Nvidia 940MX and kernel 4.13.