3x2 Mirror or 1x6 Z2

Mugga

Dabbler
Joined
Feb 19, 2020
Messages
25
Hey guys,
I would like to get your opinion on a proper pool layout.

Hardware:
Intel E-2224
32GB ECC RAM
Supermicro X11SSL-CF
6x Seagate 10TB
280GB Intel Optane 900P
10Gbit Intel NIC

Planned usage:
Pure file server for a small company (CG graphics) with 6-10 concurrent users, all connected via 10GbE. Big Photoshop files (900-2500 MB), loading various texture assets during rendering, etc.

Backup strategy:
Daily mirror to QNAP storage (RAID 5) + Weekly mirror to external QNAP storage.

I initially planned a simple setup:
6x 10TB as RAID-Z2, using the Optane as an SLOG device.
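As a minimal sketch of that layout (the pool name "tank" and the device names da0-da5/nvd0 are just placeholders):

zpool create tank raidz2 da0 da1 da2 da3 da4 da5   # 6x 10TB, two disks of parity
zpool add tank log nvd0                            # Optane 900P as the SLOG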

But then I read a bit further on the net (reading these: Link 1 / Link 2) and got confused about whether it wouldn't be better to use mirrors instead of Z2.

What I like about the mirror approach:
- Mixing hard drives of different capacities
- Buying just 2 disks to add more space instead of 6 (see the sketch below)
- Easier/faster disk upgrades, since replacements resilver in less time
- Faster reads/writes (though I don't know whether this would matter for our use case)
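The mirror layout, and a later two-disk expansion, would look roughly like this (again only a sketch with placeholder names):

zpool create tank mirror da0 da1 mirror da2 da3 mirror da4 da5   # 3x 2-way mirror vdevs
zpool add tank mirror da6 da7                                    # grow later by one more pair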

Maybe worth mentioning: I have 3x 1TB SSDs and a few smaller SSDs lying around, which I could use as a read cache.

What would you guys prefer for this use case?
 

koolmon10

Cadet
Joined
May 21, 2018
Messages
4
I think the most obvious and biggest drawback of mirroring is the capacity loss. In your case it's a difference of only a single drive's worth, so probably not a huge issue (30TB usable vs. 40TB usable).
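Back-of-the-envelope, ignoring ZFS metadata overhead and the free-space headroom you'd normally keep:

3x 2-way mirrors: 3 x 10 TB = 30 TB usable
6-disk RAIDZ2: (6 - 2) x 10 TB = 40 TB usable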
 

lightwave

Explorer
Joined
Jun 14, 2018
Messages
68
My main concern would be redundancy. The mirror option means a full pool (and data) loss if two disks from the same mirror pair fail simultaneously. With 10TB disks this is not as unlikely as it sounds, because the rebuild after a failure puts significant stress on the one remaining disk. The Z2 option allows any two disks to fail simultaneously with no data loss. Sufficient backups might be a way to mitigate the risk of the mirror setup. The Z2 option also gives more usable terabytes for the same cost. The only drawback I see with Z2 is performance under some very specific and demanding loads. You could also use triplets rather than pairs in the mirrors, but that comes at a significantly higher cost per usable terabyte. In the end, it all depends on your use case, how deep your pockets are, and your risk preferences.
 

Mugga

Dabbler
Joined
Feb 19, 2020
Messages
25
Normally I would agree about the redundancy, but according to the links I posted above, rebuilding with such big disks puts stress on all drives during resilvering, can take several days, and drops performance pretty significantly during that time. That is what made me rethink the pool layout strategy.

I can't really say whether Z2 performance would be enough for our use case. As I stated in the opening post, we mainly work with big files, but during rendering there can be up to 8 PCs loading the same data from the NAS, up to about 4GB in total of mixed data (jpg, tiff, png, max, vrmesh), all connected via 10Gbit. Sometimes we also comp videos from multiple PSD files, which can run up to 300MB per frame.
 

patrickjp93

Dabbler
Joined
Jan 3, 2020
Messages
48
Oof. Well, in terms of rebuild time and the risk of a secondary failure during rebuild, the mirror will be much faster, even though it can't tolerate a two-disk failure on the same side of a mirror.

And if you have 8 simultaneous users, the IOPS of spinning rust are so low that I think you'll find RAIDZ2 incredibly slow for files this large. You can't max out your 10G Ethernet connection on reads alone, regardless of which setup you use, if you've got high contention. Using Western Digital Gold's 250MB/s sustained sequential read per drive as the maximum, you can get up to 1.5GB/s or 12Gbps for sequential reads, but that's only for the outermost sectors of the drives. Once you add in parity checking and the associated low IOPS and high latency, I think you'll be lucky to get an effective 8Gbps, and it'll be much worse once you get past 60% drive load.
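The raw math behind that estimate (best case, outer tracks only):

6 drives x 250 MB/s = 1,500 MB/s = 1.5 GB/s
1.5 GB/s x 8 = 12 Gbit/s, which is only slightly above a single 10GbE link even before parity and seek overhead.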
 

Mugga

Dabbler
Joined
Feb 19, 2020
Messages
25
Okay, thanks for the insights.
After reading a bit more here in the forum about mirrors vs. Z2, I think Z2 is the better option: better redundancy and more usable storage.

For writes I already planned a 280GB Intel Optane 900P as SLOG. I think this should overcome any performance penalty caused by using Z2, right?
For reads I could use the 3x 1TB SSDs as cache. Or is it possible to split the Optane drive into two partitions and use it as both a SLOG and a cache device?
 

patrickjp93

Dabbler
Joined
Jan 3, 2020
Messages
48
Okay, thanks for the insights.
After reading a bit more here in the forum about mirrors vs. Z2, I think Z2 is the better option: better redundancy and more usable storage.

For writes I already planned a 280GB Intel Optane 900P as SLOG. I think this should overcome any performance penalty caused by using Z2, right?
For reads I could use the 3x 1TB SSDs as cache. Or is it possible to split the Optane drive into two partitions and use it as both a SLOG and a cache device?
Well, if you have a video editing workflow, you're write-heavy, so unless you're working on the same video file simultaneously (not quite sure how that works; the same goes for working on the same file for a render), having a strong read cache won't be very helpful. If you have an assembly-line type of workflow, then a read cache can help, especially if you can tune it to cache the most recently written/read files.

Optane might be overkill for SLOG, because you're not in the realm of 1,000+ read/write requests per second where its QD 1-8 performance really shines. A Samsung 970 Pro, or a Crucial NVMe drive with its partial power-loss protection, will probably be enough performance for you, and they might have better power-loss protection than the Optane drive. You'd have to dig a bit if that matters to you.

If you CAN split an SSD into both SLOG and read cache, then that may also be a factor in going for a more conventional 512GB NVMe SSD rather than the premium-priced Optane, depending on how many big files you're working on at once.
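Splitting one device is doable from the shell, although the FreeNAS GUI generally expects whole disks for log and cache devices. A rough sketch, assuming the Optane shows up as nvd0 and the pool is called tank (both placeholders):

gpart create -s gpt nvd0
gpart add -t freebsd-zfs -s 20g -l slog0 nvd0   # small partition for the SLOG
gpart add -t freebsd-zfs -l l2arc0 nvd0         # rest of the drive as L2ARC
zpool add tank log gpt/slog0
zpool add tank cache gpt/l2arc0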
 

Spearfoot

He of the long foot
Moderator
Joined
May 13, 2015
Messages
2,478
If you need performance, 3 mirrors will give you 3 times as many IOPS as a single 6-disk RAIDZ2 array. If you need storage space, RAIDZ2 will give you the capacity of 4 disks vs 3 disks in the mirror setup.
 

CraigD

Patron
Joined
Mar 8, 2016
Messages
343
If you need the IOPS, three-way mirrors are an option: TWO disks can fail in each three-way mirror without data loss.

You will need nine drives instead of six, and will get roughly 2.4 drives' worth of usable space (three drives of raw mirror capacity, minus the free-space headroom you normally leave).

Have fun!
PS: I am a home user.
 

Jessep

Patron
Joined
Aug 19, 2018
Messages
379
How will you be accessing the files? An SMB share?
If you are using asynchronous writes, a SLOG won't even be used.
If your in-flight data isn't a concern, just set sync=disabled; you'll get the best write performance and won't use a SLOG at all.
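A minimal sketch, assuming the share sits on a dataset called tank/projects (placeholder name):

zfs set sync=disabled tank/projects   # every write is treated as async, so the SLOG is never used
zfs get sync tank/projects            # verify the setting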

For performance you should really have an SSD pool for work in progress and a disk pool for storage: scratch vs. archive.
If your WIP files aren't that big, it wouldn't cost much. You would want to be careful about using low-DWPD SSDs as scratch disks.
You wouldn't need an L2ARC at all with SSDs.
 

Mugga

Dabbler
Joined
Feb 19, 2020
Messages
25
Oh man, I'm reading more and more about all these topics in detail and only getting more confused o_O

95% of the users are PCs with Windows 10, so it will be an SMB share. About sync vs. async I'm not quite sure. If I'm not mistaken, sync is much slower than async, but async comes at the cost of data corruption when something goes wrong? What needs to happen for async to damage/lose data: a hard drive failure, a power failure? And when something does go wrong, is only the in-flight data gone, or is there bigger damage to the filesystem?

The data we are writing is "important", but not as important as it would be for a bank or an insurance company.

The Optane drive has already been ordered with all the other components, but I have the option to send it back and maybe trade it for other components.
 

patrickjp93

Dabbler
Joined
Jan 3, 2020
Messages
48
How will you be accessing the files? An SMB share?
If you are using asynchronous writes, a SLOG won't even be used.
If your in-flight data isn't a concern, just set sync=disabled; you'll get the best write performance and won't use a SLOG at all.

For performance you should really have an SSD pool for work in progress and a disk pool for storage: scratch vs. archive.
If your WIP files aren't that big, it wouldn't cost much. You would want to be careful about using low-DWPD SSDs as scratch disks.
You wouldn't need an L2ARC at all with SSDs.
Given the current industry trend of falling flash prices and increasing endurance across all tiers, I don't think we need to worry about low drive-write endurance anymore. By the time you'd have to replace a $100 1TB Crucial BX500, Samsung 860 QVO, or Western Digital Green bought today (which can sustain a full petabyte of writes before becoming unusable), there will be a $50 higher-endurance replacement ready.
 

patrickjp93

Dabbler
Joined
Jan 3, 2020
Messages
48
Oh man, I'm reading more and more about all these topics in detail and only getting more confused o_O
Welcome to IT. No one here knows everything.

95% of the users are PCs with Windows 10, so it will be an SMB share. About sync vs. async I'm not quite sure. If I'm not mistaken, sync is much slower than async, but async comes at the cost of data corruption when something goes wrong? What needs to happen for async to damage/lose data: a hard drive failure, a power failure? And when something does go wrong, is only the in-flight data gone, or is there bigger damage to the filesystem?

The data we are writing is "important", but not as important as it would be for a bank or an insurance company.

The Optane drive has already been ordered with all the other components, but I have the option to send it back and maybe trade it for other components.

I would say just step back and think about how your data is being used. Are your team members actually working on the same data concurrently, or is it a round-robin/assembly-line flow where one person finishes their work, hands it off, and so on? If you actually are doing concurrent work, then using part of the Optane drive as SLOG is a great idea. Probably a bit pricey for the disk size for your use case, but a good idea nonetheless. I stand by my thoughts on using a larger, cheaper NVMe drive, though, if you can.
 

Mugga

Dabbler
Joined
Feb 19, 2020
Messages
25
I really hope that in the near future we can go SSD-only for storage, but at the moment it's just too pricey. Two years ago we tried splitting project folders so that active projects sat on an SSD-only pool on our QNAP, but the performance wasn't that great (it seems to be a QNAP issue, which is one of the reasons we're trying to move to FreeNAS). The bigger problem was that splitting projects into active and inactive was quite tedious. That's why we abandoned the approach and want to move to one big pool.

Regarding data usage: the team members are not working on the same data at the same time. But when rendering, multiple workstations (up to 8) load the same data simultaneously, with mixed file types and sizes, generally ranging from 100 kilobytes up to 1GB per file. Writes also go to different files; the bigger files generally range from 300MB to 3GB. That can also happen at the same time from 8 users, but it wouldn't happen that often.
 

patrickjp93

Dabbler
Joined
Jan 3, 2020
Messages
48
I really hope that in the near future we can go SSD-only for storage, but at the moment it's just too pricey. Two years ago we tried splitting project folders so that active projects sat on an SSD-only pool on our QNAP, but the performance wasn't that great (it seems to be a QNAP issue, which is one of the reasons we're trying to move to FreeNAS). The bigger problem was that splitting projects into active and inactive was quite tedious. That's why we abandoned the approach and want to move to one big pool.

Regarding data usage: the team members are not working on the same data at the same time. But when rendering, multiple workstations (up to 8) load the same data simultaneously, with mixed file types and sizes, generally ranging from 100 kilobytes up to 1GB per file. Writes also go to different files; the bigger files generally range from 300MB to 3GB. That can also happen at the same time from 8 users, but it wouldn't happen that often.
Interesting. Well, you'll have to come back and tell us how it performs for you. I've never been impressed with the corporate all-HDD setups I've seen. Even an 800 WD Gold Z2 array sitting adjacent to the two 42U towers hosting the user VMs on a 100Gbit backplane just could not sustain 13,000 users.
 

Mugga

Dabbler
Joined
Feb 19, 2020
Messages
25
I've read a bit deeper on sync and SLOG. SLOG devices are mainly useful when you need to keep writes safe and set sync=always on the dataset/pool. This seems to be important for VM usage, but not so much for normal file storage use.

So my takeaway is to skip the Optane drive as a SLOG device and put the money into more RAM, going from 32 to 64GB (the board's maximum). With more RAM the whole system should perform better, because I get a bigger (to simplify it) "read and write cache".
 

Mugga

Dabbler
Joined
Feb 19, 2020
Messages
25
Edit (I can't edit my previous post, don't know why):
And set sync=disabled to get the best performance.
 
Joined
Jan 27, 2020
Messages
577
Reading through the thread, you decided to go with RAIDZ2, am I getting that right? Despite the fact that you're facing heavy IOPS demands?

Scenario 3: High-resolution video production work via file share

We have a group of video editors that need to work on high-resolution footage stored on our system. They will be editing the footage directly from the pool, as opposed to copying it to local storage first. Streaming speeds will be very important as high-resolution video files can have gigantic bitrates. The more editors we have, the more performance we’ll need. If we only have a small handful of editors, we can probably get away with several RAIDZ2 vdevs, but as you add more editors, IOPS will become increasingly important to support all their simultaneous IO work. At a certain point, Z2 will no longer be worth its added capacity and a set of mirrored vdevs will make more sense. That exact cutoff point will vary, but will likely be between 5 and 10 total editors working simultaneously.

Six Metrics for Measuring ZFS Pool Performance Part 1
Six Metrics for Measuring ZFS Pool Performance Part 2
 

Mugga

Dabbler
Joined
Feb 19, 2020
Messages
25
Correct, at the moment I would go the RAIDZ2 route. Our workload is not exactly the same as a video production agency's. We also do video production, but it's unlikely that more than 2 people will be editing videos at the same time.
 

patrickjp93

Dabbler
Joined
Jan 3, 2020
Messages
48
Correct, at the moment I would go the RAIDZ2 route. Our workload is not exactly the same as a video production agency's. We also do video production, but it's unlikely that more than 2 people will be editing videos at the same time.
Good. I think you'll be okay on IOPS then. These are big sequential reads, which HDDs are at least acceptable at, so the latency of seeks and checksums shouldn't impact you too badly.

Hopefully Seagate, Hitachi, and WD roll out the multi-actuator tech in the next couple of years. HDDs are going to be nearly unusable at the 20TB+ sizes if they don't.

Definitely let us know how it performs for you.
 