
A few days ago I ran sudo apt update and sudo apt upgrade on my Ubuntu 16.04 machine, which I had not updated for about two months. In the meantime, I had swapped my graphics card from a GTX 1060 to a GTX 1070.

When I tried to log in today, I found myself stuck in an infinite login loop. This is the content of my .xsession-errors log:

X Error of failed request:  BadWindow (invalid Window parameter)
  Major opcode of failed request:  155 (NV-GLX)
  Minor opcode of failed request:  4 ()
  Resource id in failed request:  0x3d0
  Serial number of failed request:  46
  Current serial number in output stream:  46
openConnection: connect: No such file or directory
cannot connect to brltty at :0
[...]

Looking at the promising answers to this question, I tried the following:

  1. Check that .Xauthority is owned by me and not by root (it is owned by me)
  2. Reconfigure lightdm
  3. Reinstall lightdm
  4. Check if my /home/ is full (it's at 44% usage)

None of these worked. I then began to suspect that the cause was an NVIDIA driver update, since several sources on various websites describe this as a common problem. I should add that I have not recently modified .profile or similar files, and I have never run the startx command in my life.
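For reference, the first and last checks in the list above can be sketched as two small shell helpers. The function names xauth_owned_by_me and fs_usage are made up for this illustration, and the lightdm commands are left as comments because they need root.

```shell
#!/bin/sh
# Hypothetical helpers illustrating the checks above; the names are not from any tool.

# Succeed if the given file exists and is owned by the current user.
xauth_owned_by_me() {
    [ -e "$1" ] && [ -O "$1" ]
}

# Print the usage percentage of the filesystem that holds the given path.
fs_usage() {
    df -P "$1" | awk 'NR == 2 { print $5 }'
}

# Example calls (uncomment to run):
# xauth_owned_by_me "$HOME/.Xauthority" && echo ".Xauthority is owned by $(id -un)"
# fs_usage "$HOME"
# sudo dpkg-reconfigure lightdm
# sudo apt-get install --reinstall lightdm
```

If the ownership check fails, the file most likely belongs to root, which is the case the linked answers try to rule out.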

I found these potential solutions to my issue that revolve around NVIDIA drivers:

  • installing nvidia-current drivers (older than latest supported by NVIDIA) as proposed here;
  • reinstalling NVIDIA drivers by running nvidia-installer.sh as proposed here;

My problem is that I spent tens of hours configuring CUDA on this computer, in a delicate balance with the NVIDIA drivers and various packages. To install CUDA, I also had to install a specific Ubuntu kernel version (4.4).

Is there a chance that my CUDA environment will break if I touch the drivers? Is there actually anything else I could try to fix the issue?

raggot
  • 231

2 Answers


The solution is to reinstall the driver and reconfigure lightdm as well. In the worst case you will need to install CUDA again, but try reinstalling the driver first by downloading the installer (the shell script) from NVIDIA.

Is there a chance that my CUDA environment will break if I touch the drivers? Is there actually anything else I could try to fix the issue?

Not necessarily. As long as your drivers are in place, nvcc should function properly.

abu_bua
  • 11,313

To other people facing the same issue, I suggest trying the following first:

mv .Xauthority .Xauthority-backup

which renames .Xauthority and forces the creation of a new one at the next login attempt. If the problem lay there, the login will then work.

In my case, however, what actually worked was doing what I was afraid of doing: reinstalling the NVIDIA drivers. For CUDA users, the driver version that needs to be installed is specified in the documentation. In my case, with CUDA 9.1, that is at least driver version 390.46 (as of the time of writing, of course).
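To compare an installed driver version against such a minimum, something like the following sketch may help. The helper name version_ge is made up for this example, it relies on GNU sort -V, and 390.46 is simply the figure quoted in this answer, so check the CUDA documentation for your own release.

```shell
#!/bin/sh
# Hypothetical helper: succeed if version $1 is greater than or equal to version $2.
# Relies on GNU sort's -V (version sort) option.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Example calls (uncomment to run; nvidia-smi only works once a driver is loaded):
# installed=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)
# version_ge "$installed" 390.46 && echo "driver $installed meets the minimum"
# nvcc --version
```

When the driver is broken and nvidia-smi fails, dpkg -l | grep -i nvidia still lists which driver packages are installed.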

I followed the instructions found on this website to (re)install the drivers I needed. I also found this post on the CUDA forum, written by a moderator, explaining that in his experience this source for the drivers may not always work, as the packages are not officially released by NVIDIA. In my case it still worked, so I am sharing it.

First, remove the installed NVIDIA drivers:

sudo apt-get purge 'nvidia*'
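If you are worried about what the purge will take with it, apt-get can simulate the operation first with its -s option. The sketch below filters the simulated output down to the package names that would be removed; the function name would_remove is made up for this example, and it assumes the usual "Remv" and "Purg" line prefixes of apt-get's simulation output.

```shell
#!/bin/sh
# Hypothetical helper: read `apt-get -s` output on stdin and print the names
# of the packages that would be removed or purged.
would_remove() {
    awk '$1 == "Remv" || $1 == "Purg" { print $2 }'
}

# Example calls (uncomment to run; -s only simulates, nothing is changed):
# apt-get -s purge 'nvidia*' | would_remove
# apt-get -s purge 'nvidia*' | would_remove | grep -i cuda
```

If any CUDA packages show up in that list, you know beforehand that they will need to be reinstalled afterwards.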

Add the repository for the graphics driver:

sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt update

Then install the correct version of the drivers (in my case, xxx = 390):

sudo apt install nvidia-xxx

And finally

reboot

The login problem should now be solved. In my case, the CUDA environment was not affected and all my projects still ran normally.

raggot
  • 231