Metadata vDev vs Metadata L2ARC?

jnussbaum

Cadet
Joined
Dec 11, 2023
Messages
4
I'm planning a new system, as discussed elsewhere, and am confused about the different types of metadata devices. I'll say right away that this is for a home setup, and I'm under no illusions that I need a bunch of fast gadgets for the normal working of my system. (I've seen the "if you have to ask if you need a SLOG, you don't" proverb.) I don't even really know what a metadata cache would do, to be honest. One thing that I've seen with my Synology NAS is that it takes a lot of time to scan large directories for new files, which is something that various apps do on a regular basis. Would a metadata cache help with this?

If so, I don't understand the different kinds. I know that I _don't_ want a dedicated vDev for this, that I can't lose without losing the pool--that's just too complicated for my basic home needs. And that's what the docs discuss. But I've seen other discussion about a metadata L2ARC that would seem to serve a similar purpose but is not integral to the pool? Would that help in this situation? If so, what's the difference between these two approaches--why would anyone want a potentially risky extra vDev?

Basically, if I can easily add a single SSD to my system and improve the speed in a noticeable way, I'd like to do that. If I'm getting in over my head and trying to do something unsupported or unintended that I don't really need to do, tell me to shut up and I'll go away ;-)
 
Joined
Oct 22, 2019
Messages
3,641
One thing that I've seen with my Synology NAS is that it takes a lot of time to scan large directories for new files, which is something that various apps do on a regular basis. Would a metadata cache help with this?
There are two parts to this:
  • Increase your RAM to improve your "cache", as it is the primary "read cache" of your system (i.e., the ZFS "ARC").
  • Starting with OpenZFS 2.2.x, there's a new tunable that you can adjust to prioritize metadata in the ARC so that it is not evicted as aggressively (see the sketch below).
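
On Linux / SCALE it should look roughly like this; the parameter name (zfs_arc_meta_balance) and the sysfs path are my understanding of the 2.2.x tunable, so verify against your version's module parameters:

    # Check the current metadata/data balance in the ARC
    # (default is 500; higher values favor keeping metadata)
    cat /sys/module/zfs/parameters/zfs_arc_meta_balance

    # Favor metadata more strongly; takes effect immediately, but does
    # not persist across reboots unless set as a module/boot option
    echo 2000 > /sys/module/zfs/parameters/zfs_arc_meta_balance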

As far as a "metadata cache" goes, you might be thinking of either the L2ARC (which is a non-critical vdev, and is not exclusive to metadata by default) or the special vdev (which cannot be lost without losing the pool, and is where metadata and, optionally, "small" blocks are stored permanently).
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,700
"metadata cache", you might be thinking of either the L2ARC (which is a non-crucial vdev, nor is it exclusive to metadata);
L2ARC can be set to only cache metadata... secondarycache=metadata (as opposed to the default =all or the alternative =none) https://openzfs.github.io/openzfs-d...rkload Tuning.html#adaptive-replacement-cache

That can be set for the pool and/or datasets within it.
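
For example (pool/dataset names here are placeholders):

    # Cache only metadata in L2ARC for everything in the pool
    zfs set secondarycache=metadata tank

    # ...or override it for a specific dataset
    zfs set secondarycache=metadata tank/media
    zfs get secondarycache tank/media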

Starting with OpenZFS 2.2.x, there's a new tunable that you can adjust to prioritize metadata in the ARC so that it is not evicted as aggressively.
That will probably make more of a difference than any cache you could add... it makes the cache you must have anyway (the ARC) actually do its job properly.
 

Jorsher

Explorer
Joined
Jul 8, 2018
Messages
88
Hello all. I'll continue this thread instead of starting a new one.

My old server reached its limits (mATX, limited to 64 GB RAM, quad-core Xeon) and I'm setting up a new one. My pools are 48 x 16 TB, 24 x 16 TB, 12 x 14 TB, and 12 x 8 TB. One is primarily large files, and the others are mixed.

My current build uses 8 x 1.92 TB Toshiba THNSN81Q92CSE/HK4R (SATA SSDs) in a striped mirror to store "apps" and databases. For my new build, I have 6 x 118 GB Intel P1600X (M.2 Optane) and plan to use those for my databases/apps in a similar striped mirror (or just striped, replicating to another pool, since it's not mission-critical?), as that's plenty of space for the purpose.

Now that I have 8 unused SATA SSDs, I looked into using them as special vdevs for metadata. However, because losing them means losing the pool, I would need mirrors at a minimum -- leaving me with one mirror per pool. For the smaller pools, a 1.75 TB mirror would leave almost all of its capacity unused. For the biggest pool, 'zdb -Lbb' shows only 49 GB of metadata -- but the pool is only 1/3 full.

As an alternative, I considered using them for L2ARC. I assumed I could make an 8-wide striped vdev that would be shared across all pools like the ARC is, but I've learned that's not the case. However, with L2ARC I wouldn't need to panic if a drive dies, and I'd have more flexibility in allocating the drives to the pools.

With the tunable to prioritize metadata in ARC and 512 GB of RAM, I'm not even sure L2ARC will benefit me much. The only benefit I see is persistence, which may not be applicable if the ARC never evicts the data? And even then, it appears L2ARC has to be 'rebuilt' on every boot?
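
(For what it's worth, persistent L2ARC seems to have been around since OpenZFS 2.0 and is governed by a module parameter; a quick check, assuming Linux/SCALE paths:)

    # 1 = the L2ARC header is rebuilt from the cache device at pool import,
    # so the cache contents survive a reboot instead of starting cold
    cat /sys/module/zfs/parameters/l2arc_rebuild_enabled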

Curious what others would do.
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,700
Curious what others would do.
Look at and understand the output from arc_summary... only if you have some evidence that your ARC hit rate is below 99%, or you see a high hit rate for an existing L2ARC, would you do anything to augment it.
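
Something like this pulls out the headline numbers (exact labels vary a bit between arc_summary versions, so treat the grep pattern as a starting point):

    # Overall ARC efficiency; look for the cache hit ratio lines
    arc_summary | grep -i -B1 -A2 'hit ratio'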
 

Jorsher

Explorer
Joined
Jul 8, 2018
Messages
88
Look at and understand the output from arc_summary... only if you have some evidence that your ARC hit rate is below 99% or you see a high hit rate for an existing L2ARC would you do anything to augment it.
Thanks, you're right. I've read this before but was hoping to have the pool structure decided up front. Jumping from 64 GB to 512 GB of RAM, chances are high that L2ARC will not benefit me.

What's the accuracy of zdb -bb for estimating metadata vdev sizing? I assume it's possible to accurately identify the metadata size of an existing pool, and I'm hoping its output can be relied on for determining the capacity needed.
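
For reference, what I'm running is roughly this (pool name is a placeholder; -L skips leak checking, but it can still take a long time on a big pool):

    # Per-object-type block statistics; I'm summing the ASIZE column
    # for the metadata types to get my estimate
    zdb -Lbb tank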
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,700
It's hard to imagine scenarios where metadata exceeds 20% of the pool size, but you also need to consider the "small files" portion of what will land on the special vdev in addition to metadata, so it can vary wildly based on your use case (if it's all miniature files, then the entire pool contents could land on the special vdev).
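
The "small files" part is controlled per dataset via the special_small_blocks property; for example (threshold and dataset name are just illustrative):

    # Blocks of 32K or smaller go to the special vdev instead of the data vdevs
    zfs set special_small_blocks=32K tank/apps

    # 0 (the default) means only metadata lands on the special vdev
    zfs get special_small_blocks tank/apps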
 