
I know there are a lot of "ext4lazyinit" topics, but they are all about 4-6 TB HDDs, with the poster saying that it eventually completed after a few hours.

On my side, I have a newly created RAID5 array with 5×14 TB disks (so 4×14 TB usable, about 51 TiB), and "ext4lazyinit" has been running for... 6 days (that is, since the last reboot, and it had probably been running for a couple of days before that). And, of course, it is constantly generating I/O on the array. There are no errors anywhere, so apart from this, everything seems fine.

But why is it taking so long? OK, the disk array is big, but... 6 days?!

At first I wasn't aware of this behavior, so at some point (a couple of days after creating the RAID array) I rebooted the system. "ext4lazyinit" seems to have restarted automatically after that, but is it possible that the reboot corrupted something?

ps -ef | grep lazy
root       583     2  0 Dec02 ?        00:04:37 [ext4lazyinit]

And is there any way to monitor the progress of this process (something like the cat /proc/mdstat available for some mdadm operations)? I haven't been able to find anything in dmesg, journalctl, or any other logs.

Of note (and maybe this explains why it's so slow?), the amount of I/O seems constant over time but rather low, so maybe the process is not running at full HDD speed. Is there any way to increase that speed?

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.25    0.00    0.42    1.17    0.00   98.17

Device             tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
mmcblk0           0.00         0.00         0.00          0          0
sda               3.00         0.00         8.50          0         17
sdb               5.00       256.00       264.50        512        529
sdc               4.00       192.00       200.50        384        401
sdd               4.00        64.00        72.50        128        145
sde               3.00         0.00         8.50          0         17
md0               0.50         0.00       256.00          0        512
  • Not entirely related, but I strongly suggest you read this: [ZFS: You should use mirror vdevs, not RAIDZ.](https://jrs-s.net/2015/02/06/zfs-you-should-use-mirror-vdevs-not-raidz/) – Daniel B Dec 08 '20 at 07:58
  • Some useful info here: [*Forcing `ext4lazyinit` to finish its thing*](https://superuser.com/q/784606/432690). – Kamil Maciorowski Dec 08 '20 at 08:08
  • RAID5 rebuilds on slow HDDs like that are expected to take this long; some estimates say 1 day/TB, so you could be looking at close to 2 weeks. You should not use RAID5 with anything over 300-400 GB unless you're using SSDs or have really good backups. You're likely to see a drive fail by the time the array is rebuilt. – essjae Dec 08 '20 at 08:12
  • @essjae It's not an array rebuild. In fact this step has nothing to do with mdadm; it's just the ext4 lazy initialization (it would be the same without RAID). The array rebuild (actually done during the initial build of the mdadm array) finished in 36 hours. – Sergio Dec 08 '20 at 08:27
  • Sergio, @essjae is absolutely correct about his main point: RAID5 on modern HDDs is a bad idea. – davidgo Dec 08 '20 at 08:34

1 Answer


I'm having the same problem: a 24 TB RAID5 array, and I started a mkfs.ext4 yesterday. I'm leaving the info I've found here for anyone else who comes across this thread.

The easiest way to handle this is to run mkfs.ext4 with the lazy options off and then wait a long time for it to initialize everything up front. If you want to start using your array right away, performance won't be great on spinning disks anyway, since there will be a lot of scattered I/O until lazy init is done, and that absolutely kills read/write speed.

mkfs.ext4 -E lazy_itable_init=0,lazy_journal_init=0 /dev/md0
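
With both options set to 0, mkfs itself zeroes all the inode tables and the journal up front, so the format command takes hours on an array this size, but nothing is left for a background thread to do afterwards.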

Speeding it up: mount with the option init_itable=0. That value is the multiplier for how long the thread waits after zeroing a chunk: the default is 10, meaning it waits 10× as long as it took to zero the last chunk before moving on. 0 means do the next chunk right away, but that hogs a lot more of your I/O bandwidth.
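
A minimal sketch of applying it (the mount point /mnt/array here is hypothetical; init_itable can also be changed on an already-mounted filesystem via remount):

mount -o init_itable=0 /dev/md0 /mnt/array
mount -o remount,init_itable=0 /mnt/array   # adjust it on a live mount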

The link in the comments above (*Forcing `ext4lazyinit` to finish its thing*) is very useful for monitoring progress: compare the sector currently being written against the total sector count from fdisk. I'm a day in and now at 54%, so I guess I'm getting there... lazy init is writing at something like 10-12 MB/s.

Make sure you're not doing anything else on the disk, then:

echo 1 > /proc/sys/vm/block_dump  # Turn on block I/O logging (lands in the kernel log, typically /var/log/syslog).
fdisk -l /dev/md0                 # Note the total number of sectors.
echo 0 > /proc/sys/vm/block_dump  # Turn off logging again; don't fill the log :)

Divide the sector numbers being written (from the syslog entries) by the total sector count from fdisk to get an approximate percentage complete.
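
As a rough sketch of that arithmetic, assuming the classic block_dump line format (something like `ext4lazyinit(583): WRITE block 48104742912 on md0 (1024 sectors)`; the log path and exact format may differ on your system):

# Last sector ext4lazyinit wrote, according to the kernel log:
LAST=$(grep -o 'ext4lazyinit([0-9]*): WRITE block [0-9]*' /var/log/syslog | tail -n 1 | awk '{print $NF}')
# Total 512-byte sectors on the array (the same figure fdisk -l reports):
TOTAL=$(blockdev --getsz /dev/md0)
# Approximate progress, in percent:
echo "scale=1; 100 * $LAST / $TOTAL" | bc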

Hope that helps the next person who comes across this. Now I just have to wait another day until it's done, and then I can actually start using the array at decent speeds. (Until then, I can still pull 30 MB/s out of it, so it's not hopeless.)