CUDA Installation

Introduction

At long last, I have managed to get a CUDA installation working reliably alongside our ROS environments without breaking things (so far). Here are the steps you need to take:

Prep

First, make sure that you have the essential build tools:

sudo apt-get install build-essential

Before we begin, purge all existing NVIDIA drivers from your computer. This is important so that no residual libraries are left behind. The pattern is quoted so that the shell passes it to apt-get unchanged:

sudo apt-get purge 'nvidia-*'

Download the new drivers: go to the CUDA download area and get the CUDA 8.0 '.run' installer for Debian/Ubuntu. The filename should look like:

cuda_8.0.61_375.26_linux.run

Note that from this point forward we assume the file is called cuda_8.0.61_375.26_linux.run. Substitute your own filename wherever it appears.
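The filename itself tells you which versions you downloaded, which helps later when you need to recognize the extracted driver installer. As a small sketch, assuming the runfile follows the cuda_<toolkit version>_<driver version>_linux.run pattern shown above (the function name runfile_versions is made up for illustration):

```shell
# Hypothetical helper: print the toolkit and driver versions embedded in a runfile name.
# Assumes the name looks like cuda_<toolkit>_<driver>_linux.run.
runfile_versions() {
    rest=${1#cuda_}       # strip the leading "cuda_"
    toolkit=${rest%%_*}   # keep the text up to the next underscore
    rest=${rest#*_}       # drop the toolkit field
    driver=${rest%%_*}    # keep the text up to the next underscore
    echo "$toolkit $driver"
}

runfile_versions cuda_8.0.61_375.26_linux.run   # prints: 8.0.61 375.26
```

The second number is the driver version you should expect to see in the NVIDIA-Linux-x86_64 installer name after extraction.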

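Extract the CUDA and NVIDIA driver installers into a separate directory. The extraction path is written with $HOME because the shell does not expand ~ in the middle of an argument, and the runfile is started with sh in case the download is not marked executable:

mkdir ~/Downloads/nvidia_installers
cd ~/Downloads
sh cuda_8.0.61_375.26_linux.run -extract=$HOME/Downloads/nvidia_installers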
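Before rebooting into a text console, it may be worth confirming that all three installers actually landed in the target directory. The following is a minimal sketch, not part of the original procedure: check_installers is a hypothetical helper, and the filename patterns are assumptions based on the installer names used later in this guide.

```shell
# Hypothetical helper: report which of the expected installers exist in a directory.
# The glob patterns are assumptions taken from the filenames used in this guide.
check_installers() {
    dir="$1"
    missing=0
    for pattern in 'NVIDIA-Linux-x86_64-*.run' 'cuda-linux64-rel-*.run' 'cuda-sample*-linux-*.run'; do
        # $pattern is deliberately unquoted so the shell expands it as a glob
        if ls "$dir"/$pattern >/dev/null 2>&1; then
            echo "found:   $pattern"
        else
            echo "missing: $pattern"
            missing=1
        fi
    done
    return "$missing"
}
```

Call it as check_installers "$HOME/Downloads/nvidia_installers". A non-zero exit status means at least one installer was not found and the extraction should be repeated.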

"Blacklisting" the nouveau driver leaves it installed, so nothing goes unstable from its removal, but ensures it is never loaded while you have an NVIDIA driver. To do this, create a file called /etc/modprobe.d/blacklist-nouveau.conf and fill it with the two lines blacklist nouveau and options nouveau modeset=0. You can do this with the following commands:

sudo -i
> /etc/modprobe.d/blacklist-nouveau.conf
echo "blacklist nouveau" >> /etc/modprobe.d/blacklist-nouveau.conf
echo "options nouveau modeset=0" >> /etc/modprobe.d/blacklist-nouveau.conf
exit

Regenerate the initramfs so that the blacklist takes effect at the next boot:

sudo update-initramfs -u

For all commands from this point forward we will be operating in headless mode, meaning a text-only console with no graphical desktop. Please print these instructions or pull them up on another machine.

Installing the drivers and CUDA

Restart your computer. Nothing should have changed, but your login screen may look distorted with nouveau blacklisted and no NVIDIA driver currently installed. Press Ctrl+Alt+F1 to change to headless mode. You should be presented with a tty terminal. Log in with your usual username and password.

Note: depending on the state of your drivers beforehand, booting to Ubuntu normally may result in a black screen. Do not fret; this is likely because at some earlier point you cleared out all of your backup graphics drivers. Restart your computer in Safe Mode (Ubuntu's recovery mode) and get to a root shell, then type mount -o rw,remount /, which remounts your file system read-write. Then switch to your user with su <username> and change directory to home. If you needed this step, it is likely that you will not have any lightdm to turn off.

Once you are logged in, navigate to the directory we created:

cd ~/Downloads/nvidia_installers
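At this point it can help to confirm that the blacklist actually took effect, since the NVIDIA installer tends to complain if nouveau is still in use. This is a hedged sketch rather than part of the original procedure: module_loaded is a hypothetical helper that looks for a module name in the first column of /proc/modules (the same list that lsmod prints).

```shell
# Hypothetical helper: succeed if the named kernel module appears in a
# /proc/modules-style listing, where the module name is the first field of each line.
# The second argument is optional and defaults to /proc/modules.
module_loaded() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }' "${2:-/proc/modules}"
}

if module_loaded nouveau; then
    echo "nouveau is still loaded; recheck the blacklist file and rerun update-initramfs"
else
    echo "nouveau is not loaded"
fi
```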

Turn off display manager:

sudo service lightdm stop

This next step is incredibly important. Running the driver installer with --no-opengl-files is necessary for this to work, as far as I have found.

Run the driver script (where the asterisks should be the numbers of the driver in this folder; tab autocomplete should find it):

sudo ./NVIDIA-Linux-x86_64-3**.**.run --no-opengl-files

Next install the toolkit:

sudo ./cuda-linux64-rel-*.*.**-********.run
sudo ./cuda-samples-linux-*.*.**-********.run

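Add the CUDA directories to PATH and LD_LIBRARY_PATH for the current session, then append the same exports to your .bashrc so they persist. The echo lines use single quotes so that $PATH and $LD_LIBRARY_PATH are written to .bashrc literally, instead of being frozen at their current values:

export PATH=/usr/local/cuda-8.0/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib64:$LD_LIBRARY_PATH
echo 'export PATH=/usr/local/cuda-8.0/bin:$PATH' >> ~/.bashrc
echo 'export LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib64:$LD_LIBRARY_PATH' >> ~/.bashrc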
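Running the echo commands more than once leaves duplicate lines in .bashrc. One way to avoid that, sketched here with a hypothetical append_once helper that is not part of the original guide, is to append a line only when that exact line is not already present:

```shell
# Hypothetical helper: append a line to a file only if that exact line is absent.
append_once() {
    line="$1"
    file="$2"
    touch "$file"
    # -F: fixed string, -x: match the whole line, -q: no output
    grep -Fxq -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Demonstration on a scratch file so nothing real is modified
demo=$(mktemp)
append_once 'export PATH=/usr/local/cuda-8.0/bin:$PATH' "$demo"
append_once 'export PATH=/usr/local/cuda-8.0/bin:$PATH' "$demo"   # second call changes nothing
cat "$demo"   # the export line appears once, not twice
rm -f "$demo"
```

To use it for real, pass ~/.bashrc as the second argument. The single quotes around the export line matter for the same reason as in a plain echo: they keep $PATH from being expanded when the line is written.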

Verify that your driver was installed correctly:

cat /proc/driver/nvidia/version

Verify the CUDA version:

nvcc --version
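If you would rather have the check done for you than read the output by eye, something like the following could work. It is only a sketch: nvcc_release is a hypothetical helper, and it assumes nvcc prints its usual "Cuda compilation tools, release 8.0, V8.0.61" style line.

```shell
# Hypothetical helper: read `nvcc --version` text on stdin and print the release number.
# Assumes a line of the form "Cuda compilation tools, release 8.0, V8.0.61".
nvcc_release() {
    sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

if command -v nvcc >/dev/null 2>&1; then
    found=$(nvcc --version | nvcc_release)
    if [ "$found" = "8.0" ]; then
        echo "CUDA 8.0 toolkit is on PATH"
    else
        echo "unexpected CUDA release: $found"
    fi
else
    echo "nvcc not found; check that /usr/local/cuda-8.0/bin is on your PATH"
fi
```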

At this point you can restart your display manager:

sudo service lightdm start

Now build and run a sample for testing:

cd /usr/local/cuda/samples/1_Utilities/deviceQuery
sudo make
./deviceQuery
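The deviceQuery output is long, so the pass or fail verdict is easy to miss. As a hedged sketch, assuming the sample finishes with a "Result = PASS" line as the CUDA 8.0 version does, a hypothetical device_query_passed helper can pick it out:

```shell
# Hypothetical helper: read deviceQuery output on stdin and succeed only if it reports PASS.
# Assumes the sample prints a "Result = PASS" line on success.
device_query_passed() {
    grep -q 'Result = PASS'
}

# Only run the check if the sample has been built in the current directory
if [ -x ./deviceQuery ]; then
    if ./deviceQuery | device_query_passed; then
        echo "deviceQuery passed"
    else
        echo "deviceQuery did not report PASS"
    fi
fi
```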

If this works and shows a CUDA-capable device, you are done!

FAQ:

"Ahhhh!! I broke everything! What happened?"

As you may or may not know, both NVIDIA's graphics drivers and CUDA are super guilty of causing this all of the time. If you are suffering from any of the following:

Login loop (where you try to log in and it either restarts you or redirects you back to the login screen)
Black screen
Errors when you try to use apt and other core functionality

I would suggest checking out the links in the References section to see if they can be helpful to you. They have been to many of us. If you have your own specific error with a solution that worked for you, please add it here! Don't forget to detail any differences in your setup from the standard HLP-R setup, for example if you are running 16.04 instead of 14.04.

References:

Excellent up-to-date overview of a bunch of errors: https://www.linkedin.com/pulse/installing-nvidia-cuda-80-ubuntu-1604-linux-gpu-new-victor
Guide procedure largely sourced from: http://kislayabhi.github.io/Installing_CUDA_with_Ubuntu/
Fixed one particularly nasty error (specifically using the quirks-handler in the first solution): https://askubuntu.com/questions/526571/apt-get-ends-up-with-errors-after-nvidia-331-installation
