
So I had a problem with my server running Ubuntu 14.04 with three 1 TB disks configured as a software RAID 5. I forced mdadm to assemble the array with only two disks, then added the missing disk back to the array; the system rebuilt the RAID and everything looked fine.
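
Roughly, the recovery looked like this (a sketch of the usual forced assemble and re-add, with my device names):

    # assemble the array degraded, with the two good members
    sudo mdadm --assemble --force /dev/md0 /dev/sda1 /dev/sdb1
    # re-add the missing member; the array then rebuilds
    sudo mdadm --manage /dev/md0 --add /dev/sdc1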

And now here is the problem: every time the server starts I see these messages:

[    2.440341] md0: detected capacity change from 0 to 482848079872
[    2.460418]  md0: unknown partition table

The boot waits for a couple of seconds, then the partitions are mounted as they should be, and everything is fine.

Here is some more info:

mdadm -D /dev/md0

    /dev/md0:
            Version : 0.90
      Creation Time : Sat Feb 26 10:39:28 2011
         Raid Level : raid5
         Array Size : 1921873792 (1832.84 GiB 1968.00 GB)
      Used Dev Size : 960936896 (916.42 GiB 984.00 GB)
       Raid Devices : 3
      Total Devices : 3
    Preferred Minor : 0
        Persistence : Superblock is persistent

        Update Time : Fri Jan 30 19:40:00 2015
              State : clean
     Active Devices : 3
    Working Devices : 3
     Failed Devices : 0
      Spare Devices : 0

             Layout : left-symmetric
         Chunk Size : 64K

               UUID : 91c9bf9f:53a9ecfd:80cbc40e:2f20054f
             Events : 0.602824

        Number   Major   Minor   RaidDevice State
           0       8        1        0      active sync   /dev/sda1
           1       8       17        1      active sync   /dev/sdb1
           2       8       33        2      active sync   /dev/sdc1

fdisk -l

    Disk /dev/sda: 1000.2 GB, 1000204886016 bytes
    255 heads, 63 sectors/track, 121601 cylinders, total 1953525168 sectors
    Units = sectors of 1 * 512 = 512 bytes
    Sector size (logical/physical): 512 bytes / 512 bytes
    I/O size (minimum/optimal): 512 bytes / 512 bytes
    Disk identifier: 0x00072f13

       Device Boot      Start         End      Blocks   Id  System
    /dev/sda1   *        2048  1921875967   960936960   fd  Linux raid autodetect
    /dev/sda2      1921875968  1953523711    15823872   82  Linux swap / Solaris

    Disk /dev/sdb: 1000.2 GB, 1000204886016 bytes
    255 heads, 63 sectors/track, 121601 cylinders, total 1953525168 sectors
    Units = sectors of 1 * 512 = 512 bytes
    Sector size (logical/physical): 512 bytes / 512 bytes
    I/O size (minimum/optimal): 512 bytes / 512 bytes
    Disk identifier: 0x000d8a37

       Device Boot      Start         End      Blocks   Id  System
    /dev/sdb1   *        2048  1921875967   960936960   fd  Linux raid autodetect
    /dev/sdb2      1921875968  1953523711    15823872   82  Linux swap / Solaris

    Disk /dev/sdc: 1000.2 GB, 1000204886016 bytes
    255 heads, 63 sectors/track, 121601 cylinders, total 1953525168 sectors
    Units = sectors of 1 * 512 = 512 bytes
    Sector size (logical/physical): 512 bytes / 512 bytes
    I/O size (minimum/optimal): 512 bytes / 512 bytes
    Disk identifier: 0x000e4fef

       Device Boot      Start         End      Blocks   Id  System
    /dev/sdc1   *        2048  1921875967   960936960   fd  Linux raid autodetect
    /dev/sdc2      1921875968  1953523711    15823872   82  Linux swap / Solaris

    Disk /dev/md0: 1968.0 GB, 1967998763008 bytes
    2 heads, 4 sectors/track, 480468448 cylinders, total 3843747584 sectors
    Units = sectors of 1 * 512 = 512 bytes
    Sector size (logical/physical): 512 bytes / 512 bytes
    I/O size (minimum/optimal): 65536 bytes / 131072 bytes
    Disk identifier: 0x00000000

**Disk /dev/md0 doesn't contain a valid partition table**

Why is it doing this and what can I do to fix it?


1 Answer


Why? The individual disks may well be fine, but the RAID superblock has very probably been damaged by a combined software/hardware fault.

What to do?

  1. Back up everything! (A back-up sketch follows this list.)
  2. Install smartmontools and do a full diagnostic of all drives:

    sudo apt-get install smartmontools
    sudo smartctl --test=long /dev/sda
    sudo smartctl --test=long /dev/sdb
    sudo smartctl --test=long /dev/sdc
    

    Wait until the tests have finished, then:

    sudo smartctl --all /dev/sda
    sudo smartctl --all /dev/sdb
    sudo smartctl --all /dev/sdc
    
  3. Interpret the results and see whether any drive needs replacing; a sketch of what to look for follows this list (leave a comment if it's not clear, and did I mention to back up?)

  4. Look for bad blocks (run this from a live/rescue system; badblocks -n must not be used on mounted filesystems):

    badblocks -nsv -o /media/USB-Stick/BadBlocks.sda1 /dev/sda1
    badblocks -nsv -o /media/USB-Stick/BadBlocks.sdb1 /dev/sdb1
    badblocks -nsv -o /media/USB-Stick/BadBlocks.sdc1 /dev/sdc1
    
  5. If you find bad blocks, these must be combined into one file (BadBlocks.all; see the merging sketch after this list. Didn't I forget to mention to back up?) and passed to all RAID members:

    mkfs.ext4 -l /media/USB-Stick/BadBlocks.all /dev/sda1
    mkfs.ext4 -l /media/USB-Stick/BadBlocks.all /dev/sdb1
    mkfs.ext4 -l /media/USB-Stick/BadBlocks.all /dev/sdc1
    
  6. Recreate your array (on the same member partitions, so the partition tables and swap partitions stay intact):

    mdadm --create --verbose /dev/md0 --level=5 --raid-devices=3 /dev/sda1 /dev/sdb1 /dev/sdc1
    
  7. Restore your back-up (the new array first needs a filesystem; see the filesystem sketch after this list)
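
A minimal back-up sketch for step 1, assuming the array is mounted at /mnt/md0 and a sufficiently large external disk at /media/backup (both mount points are assumptions):

    # archive everything, preserving permissions, ACLs, xattrs and hard links
    sudo rsync -aAXHv /mnt/md0/ /media/backup/md0/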
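
For step 3, a quick way to pull out the verdict and the attributes that usually matter (a sketch; exact attribute names vary by drive vendor):

    # overall health verdict: PASSED or FAILED
    sudo smartctl -H /dev/sda
    # the attributes that most often predict drive failure
    sudo smartctl -A /dev/sda | grep -Ei 'reallocated|pending|uncorrect'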
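
For step 5, one way to merge the three lists into BadBlocks.all: badblocks writes one block number per line, so a numeric, de-duplicating sort does the job:

    sort -nu /media/USB-Stick/BadBlocks.sda1 /media/USB-Stick/BadBlocks.sdb1 \
         /media/USB-Stick/BadBlocks.sdc1 > /media/USB-Stick/BadBlocks.all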
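
Between steps 6 and 7 the freshly created array still needs a filesystem before the back-up can go back onto it; a sketch, assuming ext4 and the same hypothetical mount points as above:

    # put a filesystem on the new array
    sudo mkfs.ext4 /dev/md0
    # mount it and restore the back-up
    sudo mount /dev/md0 /mnt/md0
    sudo rsync -aAXHv /media/backup/md0/ /mnt/md0/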

Notes:

  • I would definitely not do an mdadm --detail --scan beforehand, as you would just copy the error.
  • If this is really time-critical, you can skip steps 4 and 5 if the results from step 3 are fantastic, but I would not!
  • If this is time-critical, you can skip step 5 if the results from steps 3 and 4 are fantastic, but I would not!
  • If you have the budget, get rid of software RAID 5 and get a hardware RAID 5 controller ($300-500)
  • If you have the budget, add 2 more disks and go to RAID6
  • If you have the entire weekend to do this, run badblocks with -wsv (a destructive write-mode test, so back up first!) instead of -nsv

Oh, and I wasn't joking about the backup!
