Performance suggestions?

DeWebDude

Explorer
Joined
Nov 2, 2015
Messages
52
Hello All,

I'm currently running my NAS with the following:
TrueNAS-12.0-U8
ProLiant DL180 G6
Dual Intel Xeon L5520 processors (2.27 GHz)
48GB RAM
Built initially under FreeNAS.
lz4 compression

I think I used a 9211-8i controller (not sure why we only used 6 drives; it may be related to the cabling or backplane, I'll have to revisit that).
I have 6 Seagate ST3000NM0023 drives (7200 RPM, 6Gb/s SAS).
I believe we have it set up in RAID 10 (not sure how to check that under the pool config; it's been a long time since we configured this)

Given the above config, I know RAM and CPU are overkill, but disk performance is probably the key factor.
Even used enterprise NVMe is still too expensive, so I'm trying to figure out how we can best improve performance.
One option I know of is adding an SSD cache (RAID 1).

Any ideas or suggestions appreciated!
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,700
I believe we have it set up in RAID 10 (not sure how to check that under the pool config; it's been a long time since we configured this)
Cogwheel at the right of the pool, then Status (or at the Shell: zpool status poolname)
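For reference, a pool of striped mirrors (the "RAID10 equivalent") shows up in zpool status roughly like this - pool and disk names here are just placeholders:

  pool: tank
 state: ONLINE
config:

        NAME          STATE     READ WRITE CKSUM
        tank          ONLINE       0     0     0
          mirror-0    ONLINE       0     0     0
            da0       ONLINE       0     0     0
            da1       ONLINE       0     0     0
          mirror-1    ONLINE       0     0     0
            da2       ONLINE       0     0     0
            da3       ONLINE       0     0     0
          mirror-2    ONLINE       0     0     0
            da4       ONLINE       0     0     0
            da5       ONLINE       0     0     0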

One option I know of is adding an SSD cache (RAID 1).
You wouldn't use any kind of RAID for cache (L2ARC); it's striped if anything. But you may not find L2ARC useful either, depending on your workload, which you haven't shared details of.
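If you did add L2ARC later, cache devices are simply appended to the pool and ZFS stripes across them automatically - roughly, with placeholder disk names:

  zpool add poolname cache ada6 ada7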

Performance can be measured in throughput or IOPS, and the setups for maximising each are not the same. Capacity and cost are other factors that will drive what you are prepared to do.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
trying to figure out how we can best improve performance.
Putting on my corporate hat for a moment here, I'm going to pretend you're a client who just walked in and said something similar.

"While throwing various types of hardware at a problem can be very fun, and is usually very profitable for vendors, it often isn't the right way to address the root cause. Sometimes the bottleneck exists outside the array, sometimes a configuration change can solve things, and sometimes the answer is still hardware, but it's a matter of 'what hardware do we throw at it?' So let's figure out what the array is being used for, and what the underlying performance problem that we need to find a solution for is."

So, who's using the array, what kind of workloads are you putting on it, and where do you feel the pain points?
 

DeWebDude

Explorer
Joined
Nov 2, 2015
Messages
52
Thank you both for your replies.
I realize that I didn't spell out how the environment is used, which will help in knowing how to design the array.

I thought we had done RAID 10, but based on the attached picture, it looks like we did RAID 1 three times and made each of those mirrors part of one volume. (I am not against reconfiguring the array to improve performance, reliability, or for any other logical purpose.)

Speed was the goal in the original design concept.

In respect to use, the TrueNAS box is storage for a basic VMware cluster (2 servers) over a 10Gb network.
The servers provide email and web hosting. There are probably 10-12 VMs overall, with 2-3 of them being the workhorses and the rest a fairly light workload. No other services such as snapshots are running on this server; it only provides disk to the environment.

While my post was more theoretical ("how can I make my race car go faster?"), the root desire is to improve performance when sites load: WordPress sites using MySQL/MariaDB, Apache, etc.

While at times several sites may load at once and cause heavy disk requests, it's more about how quickly we can serve up the content when a single site loads.

Hopefully this provides a little more insight.

Side question: to see the IOPS of the array from one of the guest OSes (CentOS), is dd the best way to measure?

Thanks
 

Attachments

  • 11.03.2022_11.14.25_REC.png

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
I thought we had done RAID 10, but based on the attached picture, it looks like we did RAID 1 three times and made each of those mirrors part of one volume. (I am not against reconfiguring the array to improve performance, reliability, or for any other logical purpose.)
That's the "ZFS RAID10 equivalent" - all vdevs in a pool (your mirrors) receive a distribution of I/O, so you've got the right layout.

In respect to use, the TrueNAS box is storage for a basic VMware cluster (2 servers) over a 10Gb network.
The servers provide email and web hosting. There are probably 10-12 VMs overall, with 2-3 of them being the workhorses and the rest a fairly light workload. No other services such as snapshots are running on this server; it only provides disk to the environment.

Well, a bit of bad news: you're actually running in a risky configuration right now. See the forum Resource regarding sync writes.

Basically, your writes to the array are super-quick because they're landing in unprotected RAM only. Suffer a PSU/HBA/backplane failure, and you can lose a chunk of data and possibly render your VMFS datastore unmountable.

Setting aside the sync-write question (you'll need to solve this one with an SLOG - think Optane), the question of "how to get faster performance for your site loads" is almost certainly answered by "add more RAM", which thankfully will be cheap to add to a DL180 G6. (A couple of processors with higher clock speeds might not go amiss either; X5650s are cheap, get two.)

Pop open an SSH session to your TrueNAS machine, run arc_summary.py, and post the results here inside CODE tags or attached as a .txt. Pay special attention to the ARC Total Accesses section at the top, where it shows your hit/miss ratio.
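If it's easier, you can dump it straight to a file to attach, and the part I'm after is the hit/miss summary near the top; roughly along these lines (illustrative only - the exact wording differs between versions and the numbers are made up):

  arc_summary.py > /tmp/arc_summary.txt

  ARC total accesses (hits + misses):                 123.4M
        Cache hit ratio:                    95.2 %    117.5M
        Cache miss ratio:                    4.8 %      5.9M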
 

DeWebDude

Explorer
Joined
Nov 2, 2015
Messages
52
Thanks @HoneyBadger... it took some time to collect more data and put together some questions.


First, processor-wise, there's roughly a 14% increase between my processor and, let's say, the X5675 (the top processor accepted by the DL180 G6):
Six-Core Processor - Intel Xeon Processor X5675 (3.06 GHz, 12MB L3 Cache, 95 W, DDR3-1333, HT, Turbo 2/2/2/2/3/3) FIO
Quad-Core Processor - Intel Xeon Processor L5520 (2.26 GHz, 8MB L3 Cache, 60 W, DDR3-1066, HT, Turbo 1/1/2/2)

I have the L5520, so in this case the upgrade would increase cores as well as clock speed. But based on that 14% and the attached graph showing CPU usage (I can click back for days; it doesn't really go above what's shown there), I'm not sure it would be worth it (time-wise, not the minimal cost).

Memory is a very easy fix; I can go to 128GB from the existing 64GB. Is it worth going to 256GB, or is that overkill?
Additionally, I'm not sure how to interpret the memory graph, which is attached in case it helps with this question.

On the last point you made about the SLOG... I would certainly buy an Intel Optane.
What is the recommended formula for SLOG size relative to disk or memory?

The VMware hosts access the TrueNAS via a dual 10Gb iSCSI backbone.

In respect to SAS 6Gb/s vs SAS 12Gb/s, would that provide a significant performance boost in this role?
I realize I would need new drives, and maybe a new controller, to accommodate the higher bandwidth.

Thank you for your assistance!
 

Attachments

  • cpu.png
  • Memory.png

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,947
A SLOG needs about 5 seconds of maximum throughput available to it.
Assuming you have a 10Gb NIC, that's 5 seconds of 10Gb/s = 50Gb = 6.25GB of SLOG.
A 280GB Optane 900p is nicely (excessively) oversized and is probably the best cheap option.

I actually manually partition a mirrored pair of Optanes to run as SLOG for two pools, with 20GB each and the rest of the space left unused.
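For what it's worth, on CORE that partitioning can be done from the shell along these lines (device names, partition size and pool name are placeholders - adjust to your own hardware):

  gpart create -s gpt nvd0
  gpart add -t freebsd-zfs -s 20G nvd0
  gpart create -s gpt nvd1
  gpart add -t freebsd-zfs -s 20G nvd1
  zpool add poolname log mirror nvd0p1 nvd1p1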

As an alternative, an RMS-8 or RMS-16 (if you can find one) is probably better than an Optane, but "rocking horse shit" comes to mind. Or an Optane 4800X is technically a better device than the 900p.
 

DeWebDude

Explorer
Joined
Nov 2, 2015
Messages
52
Sorry for the delayed response, we all know how crazy IT gets.

Based on my graphs shown above, will adding memory truly add to the performance of the NAS?
The same graphs show CPU usage, which from what I can tell is underutilized, so upgrading there doesn't seem worthwhile.

In respect to the bad news @HoneyBadger, I am using iSCSI to connect the NAS to the VMware hosts, so does this nullify that comment?

Speaking strictly from the box's perspective, and assuming (perhaps wrongly) that the CPU and RAM are fine, would moving from SAS 6Gb/s to 12Gb/s create any realistic improvement?

Thanks
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
To be fair, I think you were waiting on my response, so that's kind of my apologies as well.

Memory will definitely add performance for your reads, because ZFS works by caching your Most Recently Used (MRU) and Most Frequently Used (MFU) data into RAM. The more RAM you have, the more reads come from RAM and don't have to wait for disk. This also means your disks could have more time for writes, which improves that as well.

In respect to the bad news @HoneyBadger, I am using iSCSI to connect the NAS to the VMware hosts, so does this nullify that comment?
Unfortunately not. VMware's iSCSI initiator assumes the target device is handling write-safety entirely on its own, which the default TrueNAS CORE config doesn't do out of the box because of the immense performance penalty of doing so without a proper SLOG device. Enabling sync=always is the only way to guarantee write-safety on iSCSI zvols.
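Setting it is a one-liner per zvol (the pool/zvol name here is just a placeholder, use your own):

  zfs set sync=always poolname/vmware-zvol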

NFS, on the other hand, defaults to "don't trust the disk" and therefore runs equivalently to sync=always in the default state.

Speaking strictly from the box's perspective, and assuming (perhaps wrongly) that the CPU and RAM are fine, would moving from SAS 6Gb/s to 12Gb/s create any realistic improvement?
Doubtful. Your current HBA is running eight lanes of 6Gbps SAS, for a total of 48Gbps or ~6GB/s of bandwidth. Even SAS1 at 3Gbps per drive is faster than any spinning disk can generally spit out - I imagine you're hitting a bottleneck elsewhere in the system.

Memory is a very easy fix; I can go to 128GB from the existing 64GB. Is it worth going to 256GB, or is that overkill?
256GB isn't overkill at all. It's basically impossible to give ZFS "too much memory" - the more you have, the more fits into RAM.
 