How crazy is this - pool expansion

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
My pool (details in my sig) is getting full, and the 6x 4TB disks that comprise one of my vdevs are about 9 years old at this point, so both suggest it's time to replace them with something larger. So I've ordered new disks and am burning them in as I type--I expect badblocks will finish some time next week. Then, of course, comes replacing the old disks, one by one--or does it?

I have 36 bays in my chassis. Right now, all the disks are online. Is there any real reason not to run all six replace operations in parallel? There's no loss in redundancy--all the old disks stay online until the operation completes. I have ample power and data bandwidth to handle all the disks (and as currently configured, the old and new disks are on different backplanes, and therefore different ports on the HBA).

So, am I crazy to think I should just run these all at the same time?
 

Davvo

MVP
Joined
Jul 12, 2022
Messages
3,222
danb35 said: "Is there any real reason not to run all six replace operations in parallel?"
Greater energy consumption is the only thing that comes to mind.
 
Joined
Oct 22, 2019
Messages
3,641
danb35 said: "So, am I crazy to think I should just run these all at the same time?"
I believe that's the idea. The offline-swap-in-and-out-one-at-a-time "temporary degraded dance" is for those of us who don't have enough available ports or power.

If you do it all in parallel, in one go, there's a chance an existing drive might fail during the resilver. Oddly, though, that risk is higher if you resilver the disks one by one, since you'll have to read the vdev's data six times over.
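A rough back-of-the-envelope sketch of that argument, assuming each resilver pass reads roughly all of the vdev's allocated data once and that replacements started together share a single pass. The 14 TB figure is a made-up example, not a number from this thread.

```python
def total_resilver_reads_tb(allocated_tb: float, passes: int) -> float:
    """Data read from the existing disks, assuming every pass reads the
    vdev's allocated data once (a simplification, not measured behavior)."""
    return allocated_tb * passes


allocated_tb = 14.0  # hypothetical amount of data stored on the vdev

one_by_one = total_resilver_reads_tb(allocated_tb, passes=6)
all_at_once = total_resilver_reads_tb(allocated_tb, passes=1)

print(f"one at a time:   {one_by_one:.0f} TB read")
print(f"all in parallel: {all_at_once:.0f} TB read")
```

Under those assumptions, the old disks do six times as much reading in the one-by-one case, which is where the extra exposure comes from.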
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
Davvo said: "Greater energy consumption is the only thing that comes to mind."
But surely that's offset by the cost of keeping five extra disks online (and idle) while replacing them one at a time, no? Maybe not so much if I pulled out the extras, but let's be frank, I'm not going to do that.
 

Jailer

Not strong, but bad
Joined
Sep 12, 2014
Messages
4,977
I ran through this exact scenario when I replaced my 6 drives a couple of years ago. In the end I chose to replace them one at a time, just in case there were any issues during the replacement. There was no real need for this, as I had 6 SATA ports available; it was just less for me to keep track of while the replacements were taking place, and I had the free time to do it that way. I had them all hooked up and installed, and just replaced them one at a time.

It's kinda like the pool layout discussions that take place all the time on the forum. It all boils down to your comfort level with how you choose to proceed; there really isn't a right or wrong way to go about it.
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
In my humble opinion, since you have RAID-Z2, I would only do 2 disks at once. Then allow it to stabilize, and maybe pause for a few days. Then do another 2 disks at once, with the same stabilize & pause. Repeat one more time.

My "take" on this is that doing 2 at once would be less prone to errors (mostly human ones). For example, if a disk finished and you yanked what you thought was the old disk but was really the new one, you would have to resilver it again. (Though probably only partially, since ZFS would keep track of where it was...) By doing only 2 at once, IMO, you reduce the chances of problems.
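The two-at-a-time plan can be sketched as a simple batching step. This is only an illustration: the disk labels are hypothetical, and the code plans the batches without touching any pool.

```python
from typing import List, Tuple

Pair = Tuple[str, str]  # (old_disk, new_disk)


def plan_batches(pairs: List[Pair], batch_size: int = 2) -> List[List[Pair]]:
    """Split the replacement pairs into consecutive batches of batch_size."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [pairs[i:i + batch_size] for i in range(0, len(pairs), batch_size)]


# Hypothetical disk labels; substitute whatever identifies your disks.
pairs = [(f"old{i}", f"new{i}") for i in range(6)]

for number, batch in enumerate(plan_batches(pairs), start=1):
    print(f"batch {number}: {batch}")
    # Let the resilver finish and the pool settle before starting the next batch.
```

With six disks and a batch size of 2, this yields three batches, matching the "repeat one more time" sequence described above.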
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
Well, it took a while for the burn-in to complete, and then I ran into the bug where pool expansion doesn't actually work, so I had to do the disk replacement at the shell. But six disks are resilvering right now.
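For readers wondering what kicking off several replacements from the shell might look like, here is a minimal sketch that builds one `zpool replace <pool> <old> <new>` command per disk and issues them back to back. The pool name and device labels are hypothetical, it skips the partitioning step, and it is not necessarily the exact procedure used here. It defaults to a dry run that only prints the commands.

```python
import shlex
import subprocess
from typing import List, Tuple


def build_replace_commands(pool: str, pairs: List[Tuple[str, str]]) -> List[List[str]]:
    """Build one `zpool replace` command per (old, new) device pair."""
    return [["zpool", "replace", pool, old, new] for old, new in pairs]


def run_all(commands: List[List[str]], dry_run: bool = True) -> None:
    """Print the commands, or execute them one after another if dry_run is False."""
    for cmd in commands:
        if dry_run:
            print(shlex.join(cmd))
        else:
            # check=True stops at the first command that fails.
            subprocess.run(cmd, check=True)


# Hypothetical pool name and device labels, for illustration only.
pairs = [(f"old{i}", f"new{i}") for i in range(6)]
commands = build_replace_commands("tank", pairs)
run_all(commands, dry_run=True)
```

Issuing the commands in one go keeps the start times within seconds of each other rather than hours apart.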
 

Davvo

MVP
Joined
Jul 12, 2022
Messages
3,222

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
"Weren't there issues with the WebUI when replacing drives from the shell?"
At least while the replacement is pending, it looks like it's displaying correctly:
[Attached screenshot: 1706144983446.png]
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
So it seems disk replacements can run concurrently, but only to an extent. I'd started replacing all six disks, but I started one of them several hours after the rest. This morning, the first five finished, and the last one appears to have just started, based on the output of zpool status. Everything's still looking fine in the GUI, even though I did the partitioning/replacement at the shell:
[Attached screenshot: 1706277014125.png]
 

Davvo

MVP
Joined
Jul 12, 2022
Messages
3,222
Yes, if I understand correctly, there are tunables you can change to increase how many replacements run concurrently.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
That's weird, I remember stories of replacing entire 12-wide vdevs in one go.
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
I think it probably would have in this case too, had I started them all at the same time within an appropriate margin--plus or minus several minutes would be within that margin; plus or minus several hours, not so much. But that's just a guess at this point.
 