Choosing between two CPUs: higher base clock frequency vs. more cores.

nickw_

Cadet
Joined
May 9, 2023
Messages
9
Hi All,

Planning out a server build, and I could use some real-world experience in helping me pick a CPU. Here are the two I am currently considering (both server pulls):

Xeon Gold 6146
- 12 cores, 24 threads
- 3.2 GHz base clock, 4.2 GHz turbo
- CPU Benchmark:
  - 23,296 Multicore
  - 2,309 Single-threaded

Xeon Gold 6148
- 20 cores, 40 threads
- 2.4 GHz base clock, 3.7 GHz turbo
- CPU Benchmark:
  - 28,993 Multicore
  - 2,183 Single-threaded


To keep this simple: roughly how much clock speed / single-threaded performance is needed to saturate a 10G or a 25G connection over SMB, assuming the underlying media can keep up? I would think it's the actual single-threaded performance that matters more here, rather than raw clock speed?

Is there any sort of rule of thumb for working out how much is needed? I looked but couldn't find anything. I am trying to deepen my understanding here.
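For reference, here's the back-of-envelope wire-rate math I'm starting from (the ~10% protocol-overhead figure is my own assumption; real SMB/TCP overhead varies):

    # Rough line rates for 10G/25G, before CPU even enters the picture.
    for gbps in (10, 25):
        raw = gbps / 8               # GB/s raw line rate
        practical = raw * 0.90       # assume ~10% lost to TCP/IP + SMB framing
        print(f"{gbps} GbE: {raw:.2f} GB/s raw, ~{practical:.2f} GB/s practical")
    # 10 GbE: 1.25 GB/s raw, ~1.12 GB/s practical
    # 25 GbE: 3.12 GB/s raw, ~2.81 GB/s practical

So the real question is how much single-threaded CPU it takes to push roughly 2.8 GB/s through Samba.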

Thanks,
Nick
 

nickw_

Cadet
Joined
May 9, 2023
Messages
9
I know some people like lots of details about the build, if you're one of those people, here they are:

Motherboard will likely be a Supermicro X11SPL-F with 256GB of RAM to start. Pool layout isn't finalized yet, but this should be close. 3 pools:
  1. NVMe pool for VMs
    • via Supermicro PCIe adapter
    • Vdev layout: 2x 2-way mirrors (4 NVMe total)
    • PCIe Optane SLOG
  2. SSD pool for video editing
    • Vdev layout: 2x 5-wide Z1
  3. HDD warm pool
    • Vdev layout: 2x 8-wide Z2
    • Special vdev: 3x mirror (SSD) for metadata

I will likely run Proxmox as an HV with TrueNAS Core as a VM, with the HBAs in hardware passthrough, so I can use this server as a secondary (and backup) HV.

In terms of load and use case:
This is a part homelab, part work, part home file server.
  • VM Pool: iSCSI target for my main HV. This box will also act as a secondary HV, with 4-6 VMs running typical homelab and home-automation tasks. So mostly light duty.
  • SSD Pool: Target for video editing via SMB. Mostly 1 user at a time, rarely 2.
  • Warm Pool: General file server, home media server, target for Time Machine backups (currently x6), and cold-storage archive.

Network:
10G currently. Original plan was SFP+ (LAGG) to my core/distribution switch. However, I am now considering upgrading to SFP28. Editing over 10G is fast enough, so 25G would be for large data transfers (10G is fine here too, but why not). All other devices will be a mix of 1G/10G connections.


Questions & circling back to CPUs.
This gives a little more insight into what I am after. It's not just SMB performance, but also running as an HV.

I suspect both will be fine (and very underutilized) for my current workload. I am leaning towards the 6148 for the higher core count and more flexibility to expand in the future. Since it's only a 6% difference in single-threaded performance on the CPU benchmark, I am guessing they will be very similar in SMB performance. Although this is operating under the assumption that single-threaded performance is the more important metric (rather than raw clock speed). I'd still like to better understand how much single-threaded performance is needed to saturate connections of different speeds.

That said, the 20-core isn't that much faster in multicore performance (25%) than the 12-core, but it all helps. I would love to hear others' input. I am open to other processor recommendations too. I am trying to keep the price down on the build, as everything adds up fast, thus I was looking at 8th-10th gen Xeon server pulls.

Cheers,
Nick
 

Etorix

Wizard
Joined
Dec 30, 2020
Messages
2,134
Probably not that much of a difference. If you want to keep the price down, look into 'C' custom Xeon Scalable parts on eBay: these go for much less than the regular parts.
 
Joined
Dec 29, 2014
Messages
1,135
Like most things, it depends on your workload. If you will be using the hypervisor functions of TrueNAS, you probably want more cores. If you are using it just for storage (like I am), you probably want higher clock speeds and hyperthreading off. I think Samba is still single-threaded, so that is someplace where the higher clock speed will make a difference.

As far as saturating a 10G or 25G link, that will most likely depend more on pool construction than CPU. I find my bare-metal, storage-only units are never CPU bound. I switched my pools from RAID-Z2 to mirrors, and that definitely helped. My ESXi datastores are NFS, so adding a SLOG helped quite a bit too.

As a general rule, RAID-Z1 is not recommended for larger drives since resilvering would take a long time and you would lose the entire pool if there was another drive failure during that process. If all the drives in a pool are NVMe, the SLOG might not help as much. A SLOG only helps if sync is on.
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,996
I will likely run Proxmox as an HV with TrueNAS Core as a VM, with the HBAs in hardware passthrough, so I can use this server as a secondary (and backup) HV.
If you haven't already, I suggest you read this resource.

That said, the 20-core isn't that much faster in multicore performance (25%) than the 12-core, but it all helps.
Where does that value come from? Most of the time I see performance values based on gaming content, not so much business applications, especially when a hypervisor is involved.

As far as saturating a 10G or 25G link, that will most likely depend more on pool construction than CPU.
I believe this is a very true statement. So which CPU to pick would depend more on how many VMs you plan to run and how many cores each of those gets, if you wanted to more or less dedicate cores to a VM. I don't think you could go wrong with either CPU here.
 

nickw_

Cadet
Joined
May 9, 2023
Messages
9
Thanks for the reply. I tried to address most of this in my second post, although I realize it's a longer read.

Like most things, it depends on your workload. If you will be using the hypervisor functions of TrueNAS, you probably want more cores.
Planning on using Proxmox as an HV, with TrueNAS Core as a VM, with hardware passthrough for the HBAs.

If you are using it just for storage (like I am), you probably want higher clock speeds and hyperthreading off. I think Samba is still single-threaded, so that is someplace where the higher clock speed will make a difference.
SMB is single-threaded. I know people say higher clock speed means higher throughput, but no numbers or guidelines are ever given (that I have found, anyway). I was hoping someone could break that down and define it more. How much is needed?

I am curious about the relationship between clock speed, single-threaded performance, and SMB. While clock speeds have mostly stagnated, single-threaded performance continues to improve at essentially the same clock speeds. I wonder how much this comes into play.

As far as saturating a 10G or 25G link, that will most likely depend more on pool construction than CPU. I find my bare-metal, storage-only units are never CPU bound. I switched my pools from RAID-Z2 to mirrors, and that definitely helped.
That's helpful. Would this be on your primary NAS in your signature, with the dual E5-2637 v4 @ 3.50GHz? I suppose there is no easy way to see how much CPU utilization there is per core during transfers (just the aggregate for all cores).
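(If the GUI only shows the aggregate, I suppose one could sample per-core load from a shell during a transfer. A minimal Python sketch, assuming the psutil package is available, which a stock install may not have:)

    # Per-core CPU sampler; psutil availability is an assumption here.
    import psutil

    for _ in range(10):                                   # ten one-second samples
        per_core = psutil.cpu_percent(interval=1, percpu=True)
        print(" ".join(f"{p:5.1f}" for p in per_core),
              f"| hottest core: {max(per_core):5.1f}%")

If one core pins near 100% while the aggregate stays low, that would point at the single-threaded SMB bottleneck.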

In my case, I am only really concerned about SMB speed with the SSD pool. Under the current build sheet it will be composed of 10 Crucial MX500 3.84TB drives in 2x 5-wide Z1, so theoretical sequential read/write speeds of about 4400MB/s. It should be able to saturate 25G.
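(For transparency, that 4400MB/s is just naive drive math; a quick sketch, assuming ~550MB/s sequential per MX500 and ignoring ZFS/RAIDZ overhead:)

    # Naive sequential estimate for 2x 5-wide RAIDZ1 of MX500s.
    drive_mb_s = 550                             # assumed per-drive sequential rate
    vdevs, width, parity = 2, 5, 1
    data_drives = vdevs * (width - parity)       # 8 data drives
    print(f"~{data_drives * drive_mb_s} MB/s")   # ~4400 MB/s, above 25G's ~2800 MB/s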

Although as shared, 25G really isn't needed. The only benefit is for large file transfers. This is more of an exercise in seeing what's possible and having fun.

As a general rule, RAID-Z1 is not recommended for larger drives since resilvering would take a long time and you would lose the entire pool if there was another drive failure during that process.
The Z1 is only for the SSD pool. My understanding is this is still considered a reasonable practice with SSDs, no? The non-recoverable error rate is specified as 1 in 10^17 bits read. Although I do realize single parity can be a problem when an NRE does occur during a rebuild.

In my case, both single-parity flash pools (NVMe mirrors and SSD Z1) will be backed up to the HDD pool (dual parity), plus off-site. Given the low NRE rate and the reliability of flash, in conjunction with multiple backups, I feel safe (enough) with single parity for flash-only pools.
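(Putting a rough number on that feeling: a back-of-envelope sketch, treating the spec as one error per 10^17 bits read and errors as independent, which is a big simplification:)

    # Odds of at least one NRE while resilvering one 5-wide Z1 vdev of 3.84TB SSDs.
    import math

    nre_per_bit = 1e-17
    bits_read = 4 * 3.84e12 * 8                    # 4 surviving drives, read in full
    # 1 - (1 - p)**n underflows in doubles for p this small; use expm1 instead.
    p_hit = -math.expm1(-nre_per_bit * bits_read)
    print(f"~{p_hit:.2%} chance of an NRE during a rebuild")   # ~0.12%

Low enough that, combined with the backups, I'm comfortable.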

If all the drives in a pool are NVMe, the SLOG might not help as much. A SLOG only helps if sync is on.
Sync would be on for the VM pool.

In terms of the benefits of a SLOG where the underlying pool is already flash-based media: I remember reading a thread here where this question was specifically asked (I wish I could still find it).

In the thread, an experienced mod (it may have been jgreco) mentioned that a SLOG can still be beneficial in situations like this. The reason given had to do with the write paths: the in-pool ZIL's writes still have to go through the pool's write stack, which takes additional time since it isn't optimized for pure speed, whereas the SLOG write path is optimized for speed. So even when NVMe is being used, a dedicated SLOG can still be faster, provided it isn't a bottleneck in and of itself.

EDIT: I found the thread:
 

nickw_

Cadet
Joined
May 9, 2023
Messages
9
If you haven't already, I suggest you read this resource.
Thanks, I know the community isn't a big fan of this approach.

My understanding is things have come a long way in the last few years, and that it's not the same level of risk it used to be. I am also willing to accept this risk, given the situation.

Where does that value come from? Most of the time I see performance values based on gaming content, not so much business applications, especially when a hypervisor is involved.
The 25% was based on CPU benchmarks. I know this is a gross oversimplification, as not all workflows are the same.

I believe this is a very true statement. So which CPU to pick would depend more on how many VMs you plan to run and how many cores each of those gets, if you wanted to more or less dedicate cores to a VM. I don't think you could go wrong with either CPU here.
The underlying pool would be composed of 10 Crucial MX500 (3.84TB) SSDs in a 2x 5-wide Z1 layout. Theoretical sequential read/writes of about 4400MB/s. I really don't need 25G speeds here; it's more an exercise in seeing what's possible and having fun. 10G brings all the practical benefit I need.

In terms of CPU, thank you for the confirmation/validation. I am leaning towards the 6148 / 20-core model, just to have more options. I was thinking of starting with 4-6 dedicated cores and going from there.
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,996
Thanks, I know the community isn't a big fan of this approach.

My understanding is things have come a long way in the last few years, and that it's not the same level of risk it used to be. I am also willing to accept this risk, given the situation.
Several of us have been running TrueNAS on ESXi for many, many years without any problems. I like ESXi because of its robust history, and it's free for home/personal use. Sure, there are some bells and whistles you don't get unless you buy the paid version, but honestly, the free ESXi has been great for me. Proxmox in the past has been problematic; however, I do hear success stories out there.

Good luck with your project.
 

AdrianB1

Dabbler
Joined
Feb 28, 2017
Messages
29
Planning on using Proxmox as a HV, with TrueNAS core as a VM, then with hardware passthrough for the HBA's.
Then the CPU will not matter for TrueNAS, but it will matter how much CPU the other VMs consume and what is left available for TrueNAS. TrueNAS by itself, as NAS-only, is not CPU intensive: 2 vCPUs are usually enough, and it will not even eat up 100% of them. I have a TrueNAS VM with a 10 Gbps NIC, and CPU consumption is in the single-digit range on a 6-core physical CPU with 3 cores assigned to the TrueNAS VM. I don't know what it will take for 25 Gbps, but I guess the CPU will not be the limitation.
SMB is single-threaded. I know people say higher clock speed means higher throughput, but no numbers or guidelines are ever given (that I have found, anyway). I was hoping someone could break that down and define it more. How much is needed?

I am curious about the relationship between clock speed, single-threaded performance, and SMB. While clock speeds have mostly stagnated, single-threaded performance continues to improve at essentially the same clock speeds. I wonder how much this comes into play.
The performance of SMB also depends a lot on the implementation and the type of files you transfer. Look for Linus' (LTT) video on YouTube with the SMB benchmarks on 100 Gbps, covering what performance he got and where the limits were; it is informative. 10 Gbps looks impressive from afar, but in practice it tops out around 1100 MB/s, nothing to write home about. 25 Gbps is still slower than a single PCIe 3.0 NVMe drive, to put it in perspective. At the same time, the NVMe SSDs you want to use will see local traffic only (for VMs), while the 10 SSDs will be the source of the network traffic. With a pool of 5 SSDs you can saturate 10 Gbps but not 25 Gbps, which means 25 Gbps will be faster only on certain operations (a few large files, not many small files), so if the cost is right, go for it. That is assuming you also have clients at 25 Gbps.

Edit: the network performance for SMB will matter if you do transfers; I don't think video editing over the network will be visibly impacted by network speed, as the limitation will be the encoding/transcoding.
 
Joined
Dec 29, 2014
Messages
1,135
That's helpful. Would this be on your primary NAS in your signature, with the dual E5-2637 v4 @ 3.50GHz? I suppose there is no easy way to see how much CPU utilization there is per core during transfers (just the aggregate for all cores).
I look at the dashboard while doing transfers and that gives me a pretty good idea. I will occasionally see an individual CPU spike up to 80% or more, but rarely. I also look at the CPU reports from the GUI. That makes me feel pretty confident that I am not CPU bound.
Planning on using Proxmox as an HV, with TrueNAS Core as a VM, with hardware passthrough for the HBAs.
Check out this resource on virtualizing TrueNAS. Mine are bare metal, so I have no suggestions for your use case.
https://www.truenas.com/community/r...guide-to-not-completely-losing-your-data.212/
Would this be on your primary NAS in your signature, with the dual E5-2637 v4 @ 3.50GHz?
Yes, that is it. Both my units have the same CPU and 256G of RAM. The secondary one has an external SAS enclosure, and the internal pool is RAID-Z2. I got all my rsync jobs set up before deciding to redo the main units as mirrors, and didn't feel like redoing all that stuff on the secondary for a pool that I only use for backups. I do have 10GBase-T NICs on the client-facing network now, but all my SMB clients max out at 1G connections. Both of these units easily manage to fill the clients' 1G connections. The storage network is 40G, which presents as NFS to my ESXi servers. I get bursts of 16G when reading from TrueNAS, with sustained reads at ~13G. Writes top out at 8G, with sustained writes at ~5G. Minimal CPU impact, even when doing Storage vMotion.
In terms of the benefits of a SLOG where the underlying pool is already flash-based media: I remember reading a thread here where this question was specifically asked (I wish I could still find it).
You would need to try it out. All my pools are spinning rust, so the SLOG makes a HUGE difference there.

There are also system tunables that will likely need to be changed to get the most out of 10G or higher connections.
https://www.truenas.com/community/r...ng-to-maximize-your-10g-25g-40g-networks.207/
The Z1 is only for the SSD pool. My understanding is this is still considered a reasonable practice with SSDs, no?
I would think the resilvering would be faster with SSDs, but the relevant question is: how bad off would you be if you lost that pool?
The underlying pool would be composed of 10 Crucial MX500 (3.84TB) SSDs in a 2x 5-wide Z1 layout.
You'll have to try this out as well. My understanding is that more vdevs = more IO throughput. That usually means mirrors for pools where IOPS is the most important thing. I resisted that for a while because you get less usable space from the pool, but I definitely noticed the increased throughput on my main pool after I made the switch.
 

nickw_

Cadet
Joined
May 9, 2023
Messages
9
Then the CPU will not matter for TrueNAS, but it will matter how much CPU the other VMs consume and what is left available for TrueNAS. TrueNAS by itself, as NAS-only, is not CPU intensive: 2 vCPUs are usually enough, and it will not even eat up 100% of them. I have a TrueNAS VM with a 10 Gbps NIC, and CPU consumption is in the single-digit range on a 6-core physical CPU with 3 cores assigned to the TrueNAS VM. I don't know what it will take for 25 Gbps, but I guess the CPU will not be the limitation.
I am planning on dedicating some cores to TrueNAS (starting with 4-6 and going from there).

My only concern about the CPU being a limiting factor is the single-threaded performance required for large file transfers over SMB at 25G. Only because I don't understand the relationship well enough yet; the primary goal of this thread was to try to understand that relationship at a deeper, more meaningful level.
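The closest thing I've found to a rule of thumb is the old "1 Hz of CPU per 1 bit/s of TCP throughput" folk estimate. It predates modern NIC offloads, so treat the numbers below as a deliberately pessimistic upper bound, not a requirement:

    # Dated "1 GHz per Gbps" TCP folk rule, used only as a pessimistic upper bound.
    # Modern NICs with checksum/segmentation offload need far less CPU than this.
    ghz_per_gbps = 1.0
    for gbps in (10, 25):
        print(f"{gbps} Gbps -> ~{gbps * ghz_per_gbps:.0f} GHz of aggregate CPU")
    # 10 Gbps -> ~10 GHz, 25 Gbps -> ~25 GHz

Taken literally that would rule out any single core, yet people clearly do saturate 10G over SMB from one client, which mostly shows how far offloads (and features like SMB multichannel) have moved the goalposts.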


The performance of SMB also depends a lot on the implementation and the type of files you transfer. Look for Linus' (LTT) video on YouTube with the SMB benchmarks on 100 Gbps, covering what performance he got and where the limits were; it is informative. 10 Gbps looks impressive from afar, but in practice it tops out around 1100 MB/s, nothing to write home about. 25 Gbps is still slower than a single PCIe 3.0 NVMe drive, to put it in perspective. At the same time, the NVMe SSDs you want to use will see local traffic only (for VMs), while the 10 SSDs will be the source of the network traffic. With a pool of 5 SSDs you can saturate 10 Gbps but not 25 Gbps, which means 25 Gbps will be faster only on certain operations (a few large files, not many small files), so if the cost is right, go for it. That is assuming you also have clients at 25 Gbps.

Edit: the network performance for SMB will matter if you do transfers; I don't think video editing over the network will be visibly impacted by network speed, as the limitation will be the encoding/transcoding.
Thanks. I'll check out the LTT video.

I'm in the same boat; I really don't consider 10G fast anymore. 25G is better, but hardly high performance (well, maybe for a home).

As for the NVMe pool, it will serve both the local HV and a secondary HV connected via iSCSI.

I will run an OM4 drop to my primary editing machine (we have some large home renos coming up, so I'll probably run conduit for OM4 to a few rooms). Originally I was planning on sticking with 10G, but when I realized the SSD pool could saturate a 25G connection, I thought it might be fun to do. But then my next thought was: how much single-threaded CPU performance is required to saturate 25G for large file transfers over SMB? Thus this thread.
 

nickw_

Cadet
Joined
May 9, 2023
Messages
9
Thanks for the reply. Some of the answers to your questions or comments have already been covered above, so I'll just respond to the ones that haven't.
There are also system tunables that will likely need to be changed to get the most out of 10G or higher connections.
https://www.truenas.com/community/r...ng-to-maximize-your-10g-25g-40g-networks.207/
Thanks, I'll check it out. I haven't gotten too deep into specific SMB tuning.

You'll have to try this out as well. My understanding is that more vdevs = more IO throughput. That usually means mirrors for pools where IOPS is the most important thing. I resisted that for a while because you get less usable space from the pool, but I definitely noticed the increased throughput on my main pool after I made the switch.
With RAIDZ, more vdevs primarily means more IOPS, not more sequential throughput; throughput is based on the number of non-parity (data) drives.

With mirrors (both 2-way and 3-way), each drive in the pool contributes to read speed. It's roughly linear: one more drive equals one more drive's worth of read throughput. For writes, it's based on the number of vdevs. Rough math for both layouts is sketched below.
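Idealized numbers for 10 SSDs in the two layouts (the per-drive constants are my assumptions, and real pools won't scale this cleanly):

    # Idealized streaming/IOPS comparison for 10 SSDs (assumed constants).
    drive_mb_s, drive_iops = 550, 90_000

    # 2x 5-wide RAIDZ1: throughput tracks data drives, IOPS tracks vdev count.
    z1_read = 2 * (5 - 1) * drive_mb_s        # ~4400 MB/s
    z1_iops = 2 * drive_iops                  # roughly one drive's IOPS per vdev

    # 5x 2-way mirrors: reads scale per drive, writes and IOPS per vdev.
    mir_read  = 10 * drive_mb_s               # ~5500 MB/s
    mir_write = 5 * drive_mb_s                # ~2750 MB/s
    mir_iops  = 5 * drive_iops

    print(z1_read, z1_iops, mir_read, mir_write, mir_iops)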

This explains things really well:
 

AdrianB1

Dabbler
Joined
Feb 28, 2017
Messages
29
how much single-threaded CPU performance is required to saturate 25G for large file transfers over SMB?
Honestly, there is no way to tell; it will depend on NIC capabilities, CPU frequency, and usage patterns, but I am not concerned that you will fail to get most of the 25 Gbps performance. Will you get the max throughput? No. Will you get close enough? Yes; the difference will not come from the CPU alone, and not from those 500 MHz. If you don't want to take chances, go for the low-core-count, high-frequency CPU and you will know you did the best in that area. Unless you need a lot of CPU power for your VMs, you will be good.

Just for perspective, this is something I do on a daily basis for my job. I am responsible for designing a couple of thousand VMs (that are replaced regularly) with conflicting requirements, mainly high frequency, high CPU count, and low cost; in the end it is about the best compromise. There is no perfect solution, but there is the "good enough" one. But pay attention to the NIC: at 25 Gbps they start to matter.
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,996
I am planning on dedicating some cores to TrueNAS (starting with 4-6 and going from there).
So you have the CPU stuff figured out, or at least a starting place. The RAM you allocate is going to be a major player as well. With 256GB available, I would hope you are going to give it at least 16GB; then see what your throughput is. That should be fine, but you can then play around with the RAM allocated, the same way you can play with the number of CPU threads you allocate. If you find a limiting factor, post the results. I think your pool design will be the largest factor here, not the CPU or RAM allocation.
 

firesyde424

Contributor
Joined
Mar 5, 2019
Messages
155
Several of us have been running TrueNAS on ESXi for many, many years without any problems. I like ESXi because of its robust history, and it's free for home/personal use. Sure, there are some bells and whistles you don't get unless you buy the paid version, but honestly, the free ESXi has been great for me. Proxmox in the past has been problematic; however, I do hear success stories out there.

Good luck with your project.
This is why I use ESXi at home as well. You don't get support with the free version, of course, and it is missing some cool stuff, but you do get a hypervisor that's extremely lightweight, stable, and well tested.
 