CUDA Installation
At long last, I have managed to get a CUDA installation working reliably alongside our ROS environments without breaking things (so far). Here are the steps you need to take:
First, make sure that you have the build essentials:
sudo apt-get install build-essential
Before we begin, purge all existing nVidia drivers from your computer. This is important so that no residual libraries are left behind:
sudo apt-get purge nvidia-*
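To confirm the purge left nothing behind, the package list can be filtered for leftover entries. This is a minimal sketch that assumes the standard `dpkg -l` column layout (status in the first column, package name in the second); `remaining_nvidia_packages` is a hypothetical helper name, not part of any tool.

```shell
# Hypothetical helper: reads `dpkg -l` output on stdin and prints the status
# and name of every package whose name starts with "nvidia".
remaining_nvidia_packages() {
    awk '$2 ~ /^nvidia/ { print $1, $2 }'
}

# Usage (prints nothing when no nvidia packages remain):
#   dpkg -l | remaining_nvidia_packages
```

Any line it prints is a package worth purging by name before continuing.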
Go to the CUDA download area and download the CUDA 8.0 '.run' installer for Debian/Ubuntu. The file name should look like:
cuda_8.0.61_375.26_linux.run
From this point forward we will assume that the file is called cuda_8.0.61_375.26_linux.run, with the understanding that you will fill in your own filename.
Extract the CUDA and nVidia driver installers into their own directory:
mkdir ~/Downloads/nvidia_installers;
cd ~/Downloads
sh cuda_8.0.61_375.26_linux.run -extract=$HOME/Downloads/nvidia_installers
The extract path is written with $HOME because the shell does not expand ~ after the = sign, and the installer is run through sh because a freshly downloaded file is not marked executable.
"Blacklisting" the nouveau driver leaves it installed, so nothing goes unstable from its removal, but keeps it from being loaded while you have an nVidia driver. To do this, create a file called /etc/modprobe.d/blacklist-nouveau.conf containing the two lines blacklist nouveau and options nouveau modeset=0. You can do this with the following commands:
sudo -i
> /etc/modprobe.d/blacklist-nouveau.conf
echo "blacklist nouveau" >> /etc/modprobe.d/blacklist-nouveau.conf
echo "options nouveau modeset=0" >> /etc/modprobe.d/blacklist-nouveau.conf
exit
Rebuild the initramfs so that the blacklist takes effect on the next boot:
sudo update-initramfs -u
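Before rebooting, it may be worth checking that the blacklist file contains exactly the two expected lines, since a misspelled option is silently ignored. The sketch below assumes the kernel parameter spelling `modeset`; `check_blacklist` is a hypothetical helper name.

```shell
# Hypothetical helper: succeeds only if the given modprobe config file
# contains both expected lines, each matched as a whole line.
check_blacklist() {
    conf="${1:-/etc/modprobe.d/blacklist-nouveau.conf}"
    grep -qx 'blacklist nouveau' "$conf" &&
        grep -qx 'options nouveau modeset=0' "$conf"
}

# Usage:
#   check_blacklist && echo "blacklist file looks right"
# After the reboot, `lsmod | grep nouveau` should print nothing.
```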
For all commands from this point forward we will be working from a text console (tty) rather than the graphical desktop. Please print these instructions or pull them up on another machine.
Restart your computer. Nothing should have changed, but your login screen may look distorted with nouveau blacklisted and no nVidia driver installed yet. Press Ctrl+Alt+F1 to switch to a text console. You should be presented with a tty login prompt. Log in with your usual username and password, and then navigate to the directory we created.
- Depending on the state of your drivers beforehand, booting to Ubuntu normally may result in a black screen. Do not fret; this is likely because at some earlier point you cleared out all of your backup graphics drivers. Restart your computer in recovery mode (Ubuntu's safe mode), then type mount -o rw,remount /, which remounts your root file system read-write. Then switch to your user with su <username> and change directory to home. If you need this step, you will likely not have any lightdm running to turn off.
cd ~/Downloads/nvidia_installers
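Since the remaining steps run without a desktop, it can help to confirm that the directory really holds all three extracted installers before going further. This is a minimal sketch: `check_installers` is a hypothetical helper, the file name patterns are taken from the commands in this guide, and the samples pattern is deliberately loosened to accept either `cuda-sample-` or `cuda-samples-`.

```shell
# Hypothetical helper: reports which of the three extracted installers are
# present in a directory; returns non-zero if any of them is missing.
check_installers() {
    dir="$1"
    missing=0
    for pattern in 'NVIDIA-Linux-x86_64-*.run' 'cuda-linux64-rel-*.run' 'cuda-sample*-linux-*.run'; do
        # $pattern is left unquoted on purpose so the shell expands the glob.
        if ls "$dir"/$pattern >/dev/null 2>&1; then
            echo "found:   $pattern"
        else
            echo "missing: $pattern"
            missing=1
        fi
    done
    return "$missing"
}

# Usage from inside the directory:
#   check_installers .
```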
Turn off display manager:
sudo service lightdm stop
This next step is incredibly important. As far as I have found, installing with --no-opengl-files is necessary for this to work.
Run the driver script (where the asterisks should be the numbers of the driver in this folder; tab autocomplete should find it):
sudo ./NVIDIA-Linux-x86_64-3**.**.run --no-opengl-files
Next install the toolkit and the samples:
sudo ./cuda-linux64-rel-*.*.**-********.run
sudo ./cuda-samples-linux-*.*.**-********.run
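Add the CUDA directories to your environment variables for the current shell, then append the same lines to your .bashrc. Use single quotes with echo so that $PATH and $LD_LIBRARY_PATH are written literally instead of being expanded to their current values:
export PATH=/usr/local/cuda-8.0/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib64:$LD_LIBRARY_PATH
echo 'export PATH=/usr/local/cuda-8.0/bin:$PATH' >> ~/.bashrc
echo 'export LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib64:$LD_LIBRARY_PATH' >> ~/.bashrc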
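Running the echo lines more than once appends duplicate entries to .bashrc. A small guard can make the step safe to repeat; the sketch below is one way to do it, and `append_once` is a hypothetical helper name.

```shell
# Hypothetical helper: appends a line to a file only if that exact line
# is not already present.
append_once() {
    line="$1"
    file="$2"
    touch "$file"
    # -x matches whole lines, -F treats the line as a fixed string.
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Usage (single quotes keep $PATH literal until .bashrc is read):
#   append_once 'export PATH=/usr/local/cuda-8.0/bin:$PATH' ~/.bashrc
#   append_once 'export LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib64:$LD_LIBRARY_PATH' ~/.bashrc
```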
Verify that your driver was installed correctly:
cat /proc/driver/nvidia/version
Verify the CUDA version:
nvcc --version
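If you want to check the version in a script rather than by eye, the release number can be pulled out of the nvcc output. This sketch assumes the usual final line of the form `Cuda compilation tools, release 8.0, V8.0.61`; `nvcc_release` is a hypothetical helper name.

```shell
# Hypothetical helper: reads `nvcc --version` output on stdin and prints
# only the release number (for example 8.0).
nvcc_release() {
    sed -n 's/.*release \([0-9][0-9.]*\),.*/\1/p'
}

# Usage:
#   [ "$(nvcc --version | nvcc_release)" = "8.0" ] && echo "CUDA 8.0 toolkit found"
```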
At this point you can restart your display manager:
sudo service lightdm start
Now build and run a sample for testing:
cd /usr/local/cuda/samples/1_Utilities/deviceQuery
sudo make
./deviceQuery
If this works and shows a CUDA-capable device, you are done!
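The deviceQuery sample ends its output with a `Result = PASS` line when it finds a usable device, so the check can also be scripted. A minimal sketch, with `device_query_passed` as a hypothetical helper name:

```shell
# Hypothetical helper: reads deviceQuery output on stdin and succeeds only
# if it contains the final "Result = PASS" line.
device_query_passed() {
    grep -q '^Result = PASS'
}

# Usage:
#   ./deviceQuery | device_query_passed && echo "CUDA device detected"
```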
"Ahhhh!! I broke everything! What happened?"
As you may or may not know, both NVIDIA's graphics drivers and CUDA are super guilty of causing this all of the time. If you are suffering from
- Login Loop (where you try to log in and it either restarts you or redirects you back to the login screen)
- Black screen
- Errors when you try to use apt and other core functionality
I would suggest checking out the links in the References section to see if they can be helpful to you. They have been to many of us. If you have your own specific error with a solution that worked for you, please add it here! Don't forget to detail any differences between your setup and the standard HLP-R setup, for example if you are running 16.04 instead of 14.04.
References
- Excellent up-to-date overview of a bunch of errors: https://www.linkedin.com/pulse/installing-nvidia-cuda-80-ubuntu-1604-linux-gpu-new-victor
- Guide procedure largely sourced from: http://kislayabhi.github.io/Installing_CUDA_with_Ubuntu/
- Fixed one particularly nasty error (specifically using the quirks-handler in the first solution): https://askubuntu.com/questions/526571/apt-get-ends-up-with-errors-after-nvidia-331-installation