22.04 (Jammy) with latest kernel 6.1: NVIDIA RTX 2060 - fail

With the latest kernel, none of the 'additional drivers' I tried (510, 515, 525, etc) from the official repository worked.

~$ nvidia-smi
# NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.

The kernel picks up the device, but no driver is bound to it (there is no "Kernel driver in use" line for the NVIDIA card below); maybe the nvidia DKMS build isn't doing its job?

~$ lspci -k | grep -EA3 'VGA|3D|Display'
# 01:00.0 VGA compatible controller: NVIDIA Corporation TU106M [GeForce RTX 2060 Max-Q] (rev a1)
#	Subsystem: ASUSTeK Computer Inc. TU106M [GeForce RTX 2060 Max-Q]
#	Kernel modules: nvidiafb, nouveau
# 01:00.1 Audio device: NVIDIA Corporation TU106 High Definition Audio Controller (rev a1)
# --
# 05:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Renoir (rev c5)
#	Subsystem: ASUSTeK Computer Inc. Renoir
#	Kernel driver in use: amdgpu
#	Kernel modules: amdgpu

This was working with the 5.15.0-58 kernel package.
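For anyone comparing their own output: the telling detail in the lspci listing above is that the NVIDIA card has candidate "Kernel modules" but no "Kernel driver in use" line. A minimal sketch of that check follows. The function name nvidia_bound_driver is made up for illustration, and the parsing assumes the default lspci -k layout shown above (slot at the start of each device line, details indented below it).

```shell
#!/usr/bin/env bash
# Reads `lspci -k` output on stdin and prints "<slot> <driver>" for each
# NVIDIA VGA/3D/Display device, or "<slot> none" when no driver is bound.
# (nvidia_bound_driver is a hypothetical helper, not part of any package.)
nvidia_bound_driver() {
    awk '
        # A device header line starts with a PCI slot such as 01:00.0
        /^[0-9a-f]+:[0-9a-f]+\.[0-9a-f]/ {
            if (slot != "") print slot, driver
            slot = ""; driver = "none"
            if ($0 ~ /NVIDIA/ && $0 ~ /VGA|3D|Display/) slot = $1
            next
        }
        # Indented detail line naming the driver that is actually bound
        slot != "" && /Kernel driver in use:/ { driver = $NF }
        END { if (slot != "") print slot, driver }
    '
}

# Only query the real hardware when lspci is available.
if command -v lspci >/dev/null 2>&1; then
    lspci -k | nvidia_bound_driver
fi
```

With the listing from this post the function reports the card at 01:00.0 with "none"; once the driver loads it should report "nvidia" instead.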


Answer

I never figured out what the problem was. I am now able to install the 525 driver and get the correct output from nvidia-smi.

I went down several rabbit holes, including the NVIDIA driver installer. There were several hurdles and I was ultimately unsuccessful, but it did alert me to some potential problems.

I then stumbled upon something that did work. After I undid most of the changes from trying the NVIDIA installer, I ran apt install nvidia-driver-525 again; the only difference in my system was a switch to gcc-12, e.g.

sudo apt install gcc-12
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-12 12
# I had also manually installed the 6.1 headers; I'm not sure whether this step is necessary
sudo apt install linux-headers-$(uname -r)
sudo apt install nvidia-driver-525

Then I rebooted, and finally nvidia-smi came up with the goods.
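Since the only change was the gcc version, one thing worth checking is whether the default gcc differs from the compiler the running kernel was built with. The sketch below is not from the original post: the function names are hypothetical, and the parsing assumes the /proc/version layout of recent kernels, where the compiler appears as "(... gcc... (distro version) X.Y.Z, GNU ld ...)". Older kernels or clang-built kernels will simply produce no match.

```shell
#!/usr/bin/env bash
# Extract the gcc major version from a /proc/version-style string.
# Prints nothing if the string does not match the assumed layout.
kernel_gcc_major() {
    printf '%s\n' "$1" | sed -nE 's/.*gcc[^,]*\) ([0-9]+).*/\1/p'
}

# Major version of the default gcc, or nothing if gcc is not installed.
default_gcc_major() {
    if command -v gcc >/dev/null 2>&1; then
        gcc -dumpversion | cut -d. -f1
    fi
}

kernel_major=$(kernel_gcc_major "$(cat /proc/version 2>/dev/null)")
gcc_major=$(default_gcc_major)

if [ -n "$kernel_major" ] && [ "$kernel_major" != "$gcc_major" ]; then
    echo "kernel built with gcc $kernel_major, default gcc is ${gcc_major:-not installed}"
fi
```

If this prints a mismatch, installing the matching gcc (as in the commands above) before installing the driver is a reasonable thing to try; it does not prove the mismatch is the cause.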

I then switched back to gcc-11 as a default:

sudo update-alternatives --config gcc
There are 2 choices for the alternative gcc (providing /usr/bin/gcc).

  Selection    Path             Priority   Status
------------------------------------------------------------
* 0            /usr/bin/gcc-12   12        auto mode
  1            /usr/bin/gcc-11   11        manual mode
  2            /usr/bin/gcc-12   12        manual mode

Press <enter> to keep the current choice[*], or type selection number: 1

But nvidia-driver-525 still installed successfully after that: I uninstalled it, confirmed the switch to gcc-11 with gcc --version, reinstalled, and rebooted, and nvidia-smi still came up with the goods (unlike before).

If you want to try the NVIDIA installer

For the record the closest I got with the NVIDIA installer was via the following. Start in any ol' terminal and disable the nouveau driver...

# disable nouveau nvidia driver
sudo bash -c "echo blacklist nouveau > /etc/modprobe.d/blacklist-nvidia-nouveau.conf"
sudo bash -c "echo options nouveau modeset=0 >> /etc/modprobe.d/blacklist-nvidia-nouveau.conf"
sudo update-initramfs -u
sudo reboot
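After that reboot it may be worth confirming that the blacklist took effect before going further. A small sketch, assuming the usual /proc/modules format where the module name is the first field on each line; module_loaded is a made-up helper name:

```shell
#!/usr/bin/env bash
# Succeeds if module $1 appears in /proc/modules-style input on stdin.
# (module_loaded is a hypothetical helper for illustration.)
module_loaded() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }'
}

if [ -r /proc/modules ] && module_loaded nouveau < /proc/modules; then
    echo "nouveau is still loaded; the blacklist did not take effect"
else
    echo "nouveau is not loaded"
fi
```

This is equivalent in spirit to running lsmod | grep nouveau and checking for empty output.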

When the login screen appears, hit Ctrl+Alt+F2 and log in to the terminal, then...

# need a gcc that matches the one used to build the kernel
sudo apt install gcc-12
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-12 12
# get 6.1 headers
sudo apt install linux-headers-$(uname -r)
# stop the X server
sudo service lightdm stop
# run the NVIDIA installer - I answered 'yes' to everything, but I think
# nvidia-xconfig messed things up at the end
chmod +x NVIDIA-Linux-x86_64-525.85.05.run
sudo ./NVIDIA-Linux-x86_64-525.85.05.run

Removing NVIDIA installer attempt

If you also find that the NVIDIA installer messed up the X settings, hit Ctrl+Alt+F2, log in, and run

sudo nvidia-xconfig --restore-original-backup
sudo ./NVIDIA-Linux-x86_64-525.85.05.run --uninstall

Then remove /etc/modprobe.d/blacklist-nvidia-nouveau.conf, run sudo update-initramfs -u, and reboot.
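Those last cleanup steps can be sketched as a small script. The dry-run wrapper (run, DRY_RUN) and the function name are additions for illustration, not part of the original steps; by default the script only prints what it would do, and DRY_RUN=0 makes it execute the commands for real.

```shell
#!/usr/bin/env bash
# Print each command by default; set DRY_RUN=0 to actually execute it.
# (run and undo_nouveau_blacklist are hypothetical helper names.)
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Undo the nouveau blacklist created earlier, rebuild the initramfs, reboot.
undo_nouveau_blacklist() {
    run sudo rm -f /etc/modprobe.d/blacklist-nvidia-nouveau.conf
    run sudo update-initramfs -u
    run sudo reboot
}

undo_nouveau_blacklist
```

Reviewing the printed commands first is a cheap safeguard when working from a text console with a broken X setup.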


I've actually encountered this issue again; I think it was once more triggered by a kernel update.

I was able to install the new drivers by downloading the kernel headers, switching to gcc-12, and then installing the drivers anew. I'm not sure why I have to do this now.
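If it recurs after a kernel update, a quick way to see whether DKMS actually built the nvidia module for the new kernel is to look at dkms status. The sketch below assumes dkms status prints one line per module and kernel pair that starts with "nvidia" and ends in "installed" when the build succeeded (the exact separators differ between DKMS versions); the function name is made up for illustration.

```shell
#!/usr/bin/env bash
# Succeeds if `dkms status` output on stdin lists an installed nvidia
# module for kernel release $1. (Hypothetical helper name.)
nvidia_dkms_installed_for() {
    grep '^nvidia' | grep -F -- "$1" | grep -q 'installed'
}

# Only query DKMS when it is actually present on this machine.
if command -v dkms >/dev/null 2>&1; then
    if dkms status | nvidia_dkms_installed_for "$(uname -r)"; then
        echo "nvidia module is installed for $(uname -r)"
    else
        echo "no installed nvidia DKMS module for $(uname -r)"
    fi
fi
```

When no module is installed for the running kernel, missing headers or a compiler mismatch are the first things to check, which matches the headers plus gcc-12 fix described here.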


Thank you very much... without you, nothing would have worked. Thanks again!!


Welcome @Name_LastName to the community!


I'm running driver 510.108.03 on 22.04 with an RTX 3070 and so far have had no issues like this. I did have a similar disaster on a 20.04 laptop with a GTX 1060.

I'm filing this away "just in case" an update goes bad.
