
When I try to add a new disk to my mdadm array, I get an error:
sudo mdadm --add /dev/md0 /dev/sdd --verbose

mdadm: Failed to write metadata to /dev/sdd

Is this a problem with my setup or something else?

I purchased 4 replacement disks that were sold as new. However, I suspect they were either wiped and marked as new or factory refurbished, given the difficulty I have had working with them and the accumulated power-on time they report. I have only tried 2 of the 4 disks so far.

My setup is an HB-1235 disk enclosure connected via an LSI2308. I typically use multipath, but I have since disabled it and disconnected the second cable while trying to identify why I can't set up the disks. fdisk, mkfs.ext4, and other disk utilities have not been able to write to the disk. I ran a badblocks scan, which returned no bad blocks. Checking with hdparm, the read-only flag is not set.

One other odd thing I have found: even though I disabled multipathd and rebooted, I still see two drives listed with their multipath aliases. Is there a second application that could be running multipath?

The disks arrived sealed, but I am aware that anyone could reseal a disk in a static bag. Do refurbished disks show factory time?

System Details:

  • Ubuntu 22.04.1 LTS
  • mdadm - v4.2 - 2021-12-30
  • Dmsetup
    • Library version: 1.02.175 (2021-01-08)
    • Driver version: 4.45.0

sudo mdadm --query --detail /dev/md0

/dev/md0:
           Version : 1.2
     Creation Time : Sat Feb  2 22:55:00 2019
        Raid Level : raid6
        Array Size : 19533824000 (18.19 TiB 20.00 TB)
     Used Dev Size : 1953382400 (1862.89 GiB 2000.26 GB)
      Raid Devices : 12
     Total Devices : 11
       Persistence : Superblock is persistent

     Intent Bitmap : Internal

       Update Time : Sat Dec 31 04:39:07 2022
             State : clean, degraded
    Active Devices : 11
   Working Devices : 11
    Failed Devices : 0
     Spare Devices : 0

            Layout : left-symmetric
        Chunk Size : 512K

Consistency Policy : bitmap

              Name : media:0  (local to host media)
              UUID : 1599e3ae:2bb24f48:a9524f60:02b6cb8c
            Events : 5824802

Number   Major   Minor   RaidDevice State
   0       8      144        0      active sync   /dev/sdj
   1       8      176        1      active sync   /dev/sdl
   2       8      128        2      active sync   /dev/sdi
   3       8      112        3      active sync   /dev/sdh
   4       8       96        4      active sync   /dev/sdg
   5       8      192        5      active sync   /dev/sdm
   -       0        0        6      removed
   7       8       80        7      active sync   /dev/sdf
   8       8      160        8      active sync   /dev/sdk
   9     253        1        9      active sync   /dev/dm-1
  10     253        0       10      active sync   /dev/dm-0
  11       8       32       11      active sync   /dev/sdc


cat /proc/mdstat

Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md0 : active raid6 sdk[8] sdj[0] sdh[3] sdi[2] sdg[4] sdc[11] sdl[1] sdm[5] sdf[7] dm-1[9] dm-0[10]
      19533824000 blocks super 1.2 level 6, 512k chunk, algorithm 2 [12/11] [UUUUUU_UUUUU]
      bitmap: 15/15 pages [60KB], 65536KB chunk

sudo smartctl -a /dev/sdd

smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.15.0-56-generic] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Vendor:               SEAGATE
Product:              DKS2P-H2R0SS
Revision:             4F06
Compliance:           SPC-3
User Capacity:        2,000,398,934,016 bytes [2.00 TB]
Logical block size:   512 bytes
Rotation Rate:        7200 rpm
Form Factor:          3.5 inches
Logical Unit id:      0x5000c50041070d0b
Serial number:        Z1P1AFWD0000S138114Q
Device type:          disk
Transport protocol:   SAS (SPL-3)
Local Time is:        Sat Dec 31 12:31:04 2022 PST
SMART support is:     Available - device has SMART capability.
SMART support is:     Enabled
Temperature Warning:  Enabled

=== START OF READ SMART DATA SECTION ===
SMART Health Status: OK

Current Drive Temperature:     23 C
Drive Trip Temperature:        68 C

Accumulated power on time, hours:minutes 54263:40
Manufactured in week 06 of year 2012
Specified cycle count over device lifetime:  10000
Accumulated start-stop cycles:  151
Specified load-unload count over device lifetime:  300000
Accumulated load-unload cycles:  151
Elements in grown defect list: 0

Vendor (Seagate Cache) information
  Blocks sent to initiator = 89937483
  Blocks received from initiator = 195186019
  Blocks read from cache and sent to initiator = 750296
  Number of read and write commands whose size <= segment size = 3072072
  Number of read and write commands whose size > segment size = 0

Vendor (Seagate/Hitachi) factory information
  number of hours powered up = 54263.67
  number of minutes until next internal SMART test = 39

Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:   2324326148        0         0  2324326148          0       1149.377           0
write:           0        0         0           0          0        101.683           0
verify:     459775        0         0      459775          0          0.000           0

Non-medium error count: 0

[GLTSD (Global Logging Target Save Disable) set. Enable Save with '-S on']

SMART Self-test log
Num  Test              Status                 segment  LifeTime  LBA_first_err [SK ASC ASQ]
     Description                              number   (hours)
  1  Background short  Completed                   -     54215               - [-   -    -]

Long (extended) Self-test duration: 5 seconds [0.1 minutes]

For historical reference, here is my question from when I last created the array: Multipath Raid 6 - MDADM can't find superblocks - 18.04

EDIT: After reviewing some information online, I suspect my disks were at one point formatted with 520-byte sectors. However, my current checks do not show this.

REF:
https://www.youtube.com/watch?v=DAaTfv96V9w&ab_channel=ArtofServer
https://www.reddit.com/r/homelab/comments/9bu8tf/is_this_drive_actually_bad_or_did_i_screw/

I ran sdparm -all /dev/sdd >> sdd_parm.txt against a few different disks. The existing disks differ only by disk ID, but the new disk has multiple differences.

colordiff sdc_parm.txt sdd_parm.txt -B --ignore-matching-lines=RE -W 200

1c1
<     /dev/sdc: SEAGATE   ST2000NM0001      XRBA
---
>     /dev/sdd: SEAGATE   DKS2P-H2R0SS      4F06
10c10
<   EER           0  [cha: y, def:  0, sav:  0]  Enable early recovery (obsolete)
---
>   EER           1  [cha: y, def:  0, sav:  1]  Enable early recovery (obsolete)
18c18
<   RRC           20  [cha: y, def: 20, sav: 20]  Read retry count
---
>   RRC           10  [cha: y, def: 20, sav: 10]  Read retry count
28c28
<   RTL           8000  [cha: y, def: -1, sav:8000]  Recovery time limit (ms)
---
>   RTL           2000  [cha: y, def: -1, sav:2000]  Recovery time limit (ms)
41c41
<   MBS           314  [cha: y, def:314, sav:314]  Maximum burst size (512 bytes)
---
>   MBS           1040  [cha: y, def:314, sav:1040]  Maximum burst size (512 bytes)
54c54
<   DBPPS         512  [cha: n, def:512, sav:512]  Data bytes per physical sector
---
>   DBPPS         512  [cha: n, def:520, sav:512]  Data bytes per physical sector
75c75
<   V_DTE         0  [cha: y, def:  0, sav:  0]  Data terminate on error
---
>   V_DTE         1  [cha: y, def:  0, sav:  1]  Data terminate on error
77c77
<   V_RC          20  [cha: y, def: 20, sav: 20]  Verify retry count
---
>   V_RC          5  [cha: y, def: 20, sav:  5]  Verify retry count
79c79
<   V_RTL         8000  [cha: y, def: -1, sav:8000]  Verify recovery time limit (ms)
---
>   V_RTL         1000  [cha: y, def: -1, sav:1000]  Verify recovery time limit (ms)
92c92
<   WCE           0  [cha: y, def:  0, sav:  0]  Write cache enable
---
>   WCE           0  [cha: y, def:  1, sav:  0]  Write cache enable
118c118
<   NCS           32  [cha: n, def: 32, sav: 32]  Number of cache segments
---
>   NCS           3  [cha: y, def: 32, sav:  3]  Number of cache segments
126,127c126,127
<   D_SENSE       1  [cha: y, def:  0, sav:  1]  Descriptor format sense data
<   GLTSD         0  [cha: y, def:  1, sav:  0]  Global logging target save disable
---
>   D_SENSE       0  [cha: y, def:  0, sav:  0]  Descriptor format sense data
>   GLTSD         1  [cha: y, def:  1, sav:  1]  Global logging target save disable
155c155
<   ESTCT         18500  [cha: n, def:18500, sav:18500]  Extended self test completion time (sec)
---
>   ESTCT         5  [cha: y, def: 14, sav:  5]  Extended self test completion time (sec)
172,174c172,174
<   IDLE_C        0  [cha: n, def:  0, sav:  0]  Idle_c timer enable
<   IDLE_B        1  [cha: y, def:  0, sav:  1]  Idle_b timer enable
<   IDLE          1  [cha: y, def:  0, sav:  1]  Idle_a timer enable
---
>   IDLE_C        0  [cha: y, def:  0, sav:  0]  Idle_c timer enable
>   IDLE_B        0  [cha: y, def:  0, sav:  0]  Idle_b timer enable
>   IDLE          0  [cha: y, def:  0, sav:  0]  Idle_a timer enable
183c183
<   ICCT          0  [cha: n, def:  0, sav:  0]  Idle_c condition timer (100 ms)
---
>   ICCT          18000  [cha: y, def:18000, sav:18000]  Idle_c condition timer (100 ms)
195c195
<   PERF          0  [cha: y, def:  0, sav:  0]  Performance (impact of ie operations)
---
>   PERF          1  [cha: y, def:  0, sav:  1]  Performance (impact of ie operations)
202,203c202,203
<   LOGERR        1  [cha: y, def:  0, sav:  1]  Log informational exception errors
<   MRIE          4  [cha: y, def:  0, sav:  4]  Method of reporting informational exceptions
---
>   LOGERR        0  [cha: y, def:  0, sav:  0]  Log informational exception errors
>   MRIE          0  [cha: y, def:  0, sav:  0]  Method of reporting informational exceptions
207,217c207,208
<   INTT          600  [cha: y, def:  0, sav:600]  Interval timer (100 ms)
<   REPC          0  [cha: y, def:  1, sav:  0]  Report count (or Test flag number [SSC-3])
< Background control (SBC) [bc] mode page [PS=1]:
<   S_L_FULL      0  [cha: n, def:  0, sav:  0]  Suspend on log full
<   LOWIR         0  [cha: n, def:  0, sav:  0]  Log only when intervention required
<   EN_BMS        1  [cha: n, def:  1, sav:  1]  Enable background medium scan
<   EN_PS         0  [cha: y, def:  0, sav:  0]  Enable pre-scan
<   BMS_I         72  [cha: y, def: 72, sav: 72]  Background medium scan interval time (hour)
<   BPS_TL        24  [cha: y, def: 24, sav: 24]  Background pre-scan time limit (hour)
<   MIN_IDLE      250  [cha: y, def:500, sav:250]  Minumum idle time before background scan (ms)
<   MAX_SUSP      0  [cha: y, def:  0, sav:  0]  Maximum time to suspend background scan (ms)
---
>   INTT          0  [cha: y, def:  0, sav:  0]  Interval timer (100 ms)
>   REPC          1  [cha: y, def:  1, sav:  1]  Report count (or Test flag number [SSC-3])

[Screenshots: Drive Diff, pages 1/5 through 5/5]

Patch
  • 11

1 Answer


Found a solution here: https://www.reddit.com/r/homelab/comments/9bu8tf/is_this_drive_actually_bad_or_did_i_screw/

  1. The final solution was to download the firmware for the drive model from HP.
  2. Flash it to the drive using OpenSeaChest. (Note: sectors and bytes will show 0; this is OK.)
  3. Reboot. I'm not sure if it's necessary, but I did it anyway.
  4. Format the disk with a 512-byte sector size. I used OpenSeaChest_Format, but I'm sure sg_format would work just the same.

NOTES: I suspect the seller, or the vendor who supplies the drives to the seller, previously wiped the drives and reformatted them to 512-byte sectors but never flashed the firmware, so the disks were expecting 520-byte sectors while formatted for 512.
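
For anyone who would rather use sg_format for step 4, a minimal sketch might look like the following. I used OpenSeaChest_Format myself, so treat this as untested on these drives. The function name reformat_to_512 and the DRY_RUN variable are made up for illustration, and /dev/sdX is a placeholder for the real device.

```shell
# Sketch: reformat a SAS drive to 512-byte logical blocks with sg_format.
# WARNING: a real run destroys all data on the target device.
# DRY_RUN defaults to 1, so the command is only printed, not executed.
# NOTE: reformat_to_512 and DRY_RUN are hypothetical names.
reformat_to_512() {
  dev="$1"
  if [ -z "$dev" ]; then
    echo "usage: reformat_to_512 /dev/sdX" >&2
    return 1
  fi
  cmd="sg_format --format --size=512 $dev"
  if [ "${DRY_RUN:-1}" = "1" ]; then
    # Dry run: show what would be executed.
    echo "$cmd"
  else
    sudo $cmd
  fi
}
```

Running `reformat_to_512 /dev/sdX` prints the command, and `DRY_RUN=0 reformat_to_512 /dev/sdX` actually runs it. A low-level format of a 2 TB disk can take hours. Afterwards, `sudo sg_readcap --long /dev/sdX` should report the logical block length.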

Patch
  • 11