The Question
Why is my initial resync so slow? Initially, cat /proc/mdstat reported ~30000K/sec.
- I then increased /sys/block/md0/md/stripe_cache_size from 256 to 16384. That increased the speed to ~50000K/sec.
- I changed /proc/sys/dev/raid/speed_limit_min from 1000 to 100000 and /proc/sys/dev/raid/speed_limit_max from 200000 to 400000, but that didn't help.
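For completeness, the two changes above boil down to these commands (run as root; my array is /dev/md0):

# number of cached stripe entries; memory use is about this value x 4 KiB x number of drives
echo 16384 > /sys/block/md0/md/stripe_cache_size
# kernel resync speed floor and ceiling, in KiB/s
echo 100000 > /proc/sys/dev/raid/speed_limit_min
echo 400000 > /proc/sys/dev/raid/speed_limit_max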
What else can I do to speed things up?
- This topic suggests replacing Western Digital Green drives with anything different. That is not an option for me.
- Running smartctl -t short /dev/sd[bcd] and later smartctl -l selftest /dev/sd[bcd] did not reveal any errors.
- A number of resources that I found on the topic (1, 2) are not very specific about how (i.e. with which commands) to change certain settings, nor do they explain very well what those settings do and why they help.
Some Background
How I set up the raid array
I just added three 2TB Western Digital SATA (their "green" series) drives to my Ubuntu 14.04 server. They are /dev/sd[bcd].
I decided to use them in a raid5 array and set everything up like this:
1) Create one partition on each disk:
fdisk /dev/sdb (same for sdc and sdd)
Based on this blog post I chose 2048 and 3907029128 as the respective first and last sector on each disk. Both numbers are divisible by 8, which keeps the partitions aligned to 4K boundaries, since these are 4K-sector drives (8 x 512-byte logical sectors = 4K).
The fs type was set to da (Non-FS data), as per that blog post, which reads
This stops distribution auto-mount startup scripts for searching for superblocks on drives marked as “fd” and trying to mount them up in a funny order.
Since the raid array is non-essential for booting up the system, that makes sense to me.
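For reference, the same partition can also be created non-interactively. This is a sketch using sfdisk (the size is last sector - first sector + 1; older sfdisk versions need -uS to interpret the values as sectors):

# one partition: start sector 2048, 3907027081 sectors long, type 0xda (Non-FS data)
echo '2048,3907027081,da' | sfdisk /dev/sdb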
2) Create the raid array using mdadm:
mdadm --create --verbose /dev/md0 --level=5 --chunk=2048 --raid-devices=3 /dev/sdb1 /dev/sdc1 /dev/sdd1 --spare-devices=0 --force
The --chunk=2048 option was also inspired by the same blog post, "bear[ing] in mind the 'should be divisible by 4' logic."
The --spare-devices=0 --force options are my own addition: without them the mdadm command would start the initial resync, quickly slow down to < 200K/sec, and then bail out with a message in /var/mail/root that /dev/sdd "might have failed". After that happened, the output of mdadm --detail /dev/md0 would show that /dev/sdd1 had been moved to be a spare drive.
Since adding these options, the resync has kept running. It first slowed down to < 100K/sec, but then the speed increased to about 30000K/sec and stayed there.
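To keep an eye on the resync, and on whether /dev/sdd1 gets demoted to a spare again, I use the two commands already mentioned above:

# live resync progress
watch -n 5 cat /proc/mdstat
# per-device state; would list sdd1 as a spare again if it gets kicked
mdadm --detail /dev/md0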
About the running resync process
atop reveals that /dev/sdd is the slowest:
DSK | sdd | busy 103% | read 930 | write 304 | KiB/r 512 | KiB/w 512 | MBr/s 46.50 | MBw/s 15.20 | avq 112.12 | avio 8.10 ms |
DSK | sdc | busy 85% | read 942 | write 384 | KiB/r 508 | KiB/w 410 | MBr/s 46.75 | MBw/s 15.40 | avq 20.59 | avio 6.21 ms |
DSK | sdb | busy 73% | read 942 | write 387 | KiB/r 508 | KiB/w 411 | MBr/s 46.75 | MBw/s 15.55 | avq 17.31 | avio 5.31 ms |
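For a rough cross-check of atop's numbers, iostat from the sysstat package gives similar per-device figures (extended stats, refreshed every 2 seconds):

iostat -x /dev/sd[bcd] 2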
About the WD Green drives
All drives were bought at different times (so I suspect they're from different production runs, months apart).
/dev/sdb and /dev/sdc are the youngest and have 64MB cache. /dev/sdd has 32MB cache.
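In case anyone wants to verify the cache sizes: hdparm -I prints a "cache/buffer size" line for drives that report it (a sketch; some drives report nothing here):

# print each device name followed by its reported cache/buffer size
for d in /dev/sd[bcd]; do echo "$d"; hdparm -I "$d" | grep -i 'buffer size'; done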