
My Ubuntu server recently crashed, and since then I have been struggling to get it back.

The server became unresponsive: ping returned only sporadic replies, and none of the services (SSH or Webmin) would connect. A clean shutdown wasn't possible either, so I eventually had to switch it off.

The hard reset seems to have destroyed the root file system: the boot folder and many others were empty, which meant I ended up in grub rescue mode after the reboot.

So I decided to reinstall the OS, which is where my journey begins.

First, what's working:

  • New installation works without a problem
  • All drives are found, including the raid
  • When opening a shell in USB drive rescue mode, I can mount all drives without a problem (raid and backup drive)

Setup is

  • SSD for the OS, home and swap (3 separate partitions)
  • 3 × 4 TB drives for a software raid 10 (one spare)
  • a separate 2 TB swappable drive for offline backups

And here's where I am stuck:

  • The server boots, shows the grub window and loads the kernel (lots of the usual status messages...)

  • The last successful messages seem to be

    Begin: Loading essential drivers ... done

    Begin: Running scripts/init-premount ... done

    [19.000] random: fast init done

    Begin waiting for root file system

From there on, the following repeats many times:

Begin: Running scripts/local-block ... mdadm: no devices listed in config file were found
done

Until it finally gives up with

Gave up waiting for root device. Common problems...
...
ALERT! UUID=.... does not exist. Dropping to shell 

After which the system freezes.

The UUID listed is correct and represents the boot partition of my SSD.

This somehow looks like none of the drives are accessible all of a sudden, neither the boot drive (UUID error) nor the raid array (mdadm error message).
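One way to narrow down the mdadm message from the rescue shell might be to compare the arrays named in the mdadm config file with the arrays the kernel has actually assembled. The sketch below is only an illustration: the helper names `array_uuids` and `same_arrays` are made up for this post, and it assumes both inputs use the usual `ARRAY ... UUID=...` line format.

```shell
#!/bin/sh
# Hypothetical helpers, not part of mdadm.

# Print the sorted UUIDs of all active (uncommented) ARRAY lines in a file.
array_uuids() {
    grep -i '^ARRAY' "$1" | tr ' \t' '\n\n' | sed -n 's/^[Uu][Uu][Ii][Dd]=//p' | sort
}

# Succeed only if both files describe the same set of array UUIDs.
same_arrays() {
    [ "$(array_uuids "$1")" = "$(array_uuids "$2")" ]
}
```

Assuming the installed system is mounted in the rescue shell and the config lives at /etc/mdadm/mdadm.conf (the Ubuntu default), usage could look like `mdadm --detail --scan > /tmp/scan.conf` followed by `same_arrays /etc/mdadm/mdadm.conf /tmp/scan.conf || echo "config and assembled arrays differ"`.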

I tried GRUB updates and reinstalls, which all give me strange errors. But whenever I boot from my USB stick, select the rescue option and open a shell on the SSD boot partition, I can happily see and mount all partitions.

Some of the GRUB messages I am getting:

update-grub

Found linux image....

Found initrd image....

WARNING: Failed to connect to lvmetad. Falling back to device scanning.
grub-probe: error: cannot find a GRUB drive for /dev/sdb1. Check your device.map.

I checked /etc/fstab, and all entries look good to me. The UUIDs match what I would expect, and / and swap are available.
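For anyone wanting to repeat the fstab check mechanically rather than by eye, here is a minimal sketch. The function names `fstab_uuids` and `missing_uuids` are hypothetical, and it assumes fstab entries use the plain `UUID=...` form in the first field.

```shell
#!/bin/sh
# Hypothetical helpers for cross-checking fstab against the UUIDs actually present.

# Print the UUID of every active (non-comment) fstab entry that mounts by UUID=.
fstab_uuids() {
    awk '$1 !~ /^#/ && $1 ~ /^UUID=/ { sub(/^UUID=/, "", $1); gsub(/"/, "", $1); print $1 }' "$1"
}

# Print every fstab UUID that is absent from a list of present UUIDs
# (second argument: a file with one UUID per line).
missing_uuids() {
    fstab_uuids "$1" | while read -r uuid; do
        grep -qxF -- "$uuid" "$2" || echo "$uuid"
    done
}
```

From the rescue shell this could be run as `blkid -s UUID -o value > /tmp/present` and then `missing_uuids /etc/fstab /tmp/present`; no output would mean every UUID referenced in fstab exists on some device.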

Does anyone have an idea where to look next? My next step would be to completely repartition the SSD, which I would like to avoid...

Thanks, Thomas

TZ04