help!! NVME pool optimization

spikkedd

Cadet
Joined
Sep 20, 2020
Messages
4
I have a SuperMicro server running:
AMD EPYC 7451
128GB RAM
FreeNAS 11.3-U4.1
4x 4TB Intel P4510 NVMe SSDs
4x 10G NICs

and an iMac with a 10G NIC

When I create a striped pool and transfer files to it, speeds reach ~900 MB/s but then stop completely for a few seconds before continuing again.
A few things I've tried:
setting Compression and Dedup to OFF
playing around with the record size
disabling SYNC in the pool settings

but nothing seems to make a difference. What's interesting is that the dashboard never shows any noticeable CPU usage during transfers.

Any thoughts or suggestions?
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
Make sure your NVMe drives are updated to the latest firmware release:

HTH,
Patrick
 

morganL

Captain Morgan
Administrator
Moderator
iXsystems
Joined
Mar 10, 2018
Messages
2,694
I suspect you have an untuned system.

If you have a large amount of RAM and sync is disabled, all writes initially just go to RAM. That is the 900 MB/s burst speed.

Once the RAM cache fills up (or after 5 seconds), ZFS starts writing to the drives as a transaction group. It stops acknowledging new writes (the write throttle) until RAM is freed, so the write rate drops and becomes more variable. RAM is not freed immediately.
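
A toy model can help picture the burst-then-throttle behaviour described above. This is only a sketch and not how ZFS actually schedules transaction groups: the drain rate, the dirty-data limit and the one-second time step are made-up assumptions, and `simulate_writes` is a hypothetical helper. Only the 900 MB/s ingest rate and the 5 second timer come from this thread.

```python
from typing import List


def simulate_writes(seconds: int,
                    ingest_mb_s: int = 900,      # burst rate seen by the client (from the thread)
                    drain_mb_s: int = 400,       # ASSUMED rate at which the pool empties RAM
                    dirty_limit_mb: int = 4000,  # ASSUMED cap on buffered (dirty) data in RAM
                    txg_timeout_s: int = 5) -> List[int]:
    """Return the MB accepted from the client in each simulated second."""
    dirty = 0           # MB currently buffered in RAM
    since_sync = 0      # seconds since the buffer was last fully drained
    flushing = False    # True once the pool has started writing to disk
    accepted_per_second = []

    for _ in range(seconds):
        # The client can only push as much as there is room left in RAM.
        accepted = min(ingest_mb_s, dirty_limit_mb - dirty)
        dirty += accepted
        since_sync += 1

        # Start flushing when the buffer is full or the timer expires.
        if not flushing and (dirty >= dirty_limit_mb or since_sync >= txg_timeout_s):
            flushing = True

        if flushing:
            dirty -= min(drain_mb_s, dirty)
            if dirty == 0:
                flushing = False
                since_sync = 0

        accepted_per_second.append(accepted)

    return accepted_per_second


if __name__ == "__main__":
    rates = simulate_writes(10)
    print("MB accepted per second:", rates)
    print("average MB/s:", sum(rates) / len(rates))
```

With these made-up numbers the client sees the full 900 MB/s for the first four seconds and is then held to the 400 MB/s drain rate, which is the general shape of the complaint even though the real throttle is more sophisticated.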

So the current config may be OK, but the performance may be "stop-start". What average write throughput do you get?
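
One rough way to answer the average-throughput question is to time a large sequential write from the client and look at both the overall rate and the slowest individual chunk. The sketch below assumes the share is mounted on the client; `timed_write`, the file sizes and the example path are all hypothetical, and random data is used so that compression (if enabled) does not flatter the result.

```python
import os
import time
from typing import List, Tuple


def timed_write(path: str, total_mb: int = 256, chunk_mb: int = 4) -> Tuple[float, List[float]]:
    """Write total_mb of random data to path.

    Returns (average MB/s over the whole run, per-chunk write latencies in seconds).
    """
    chunk = os.urandom(chunk_mb * 1024 * 1024)  # incompressible payload
    chunks = total_mb // chunk_mb
    latencies = []

    start = time.perf_counter()
    with open(path, "wb") as f:
        for _ in range(chunks):
            t0 = time.perf_counter()
            f.write(chunk)
            latencies.append(time.perf_counter() - t0)
        # Push buffered data out before stopping the clock.
        f.flush()
        os.fsync(f.fileno())
    elapsed = time.perf_counter() - start

    return (chunks * chunk_mb) / elapsed, latencies
```

For example, `avg, lat = timed_write("/Volumes/share/testfile.bin", total_mb=20000)` (the path is a placeholder) followed by `print(avg, max(lat))` would show the average rate and the longest single stall. The file should be much larger than the server's RAM buffer, otherwise only the burst is measured.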
 

Patrick M. Hausen

Just watched the beginning of this video. This is precisely what we experienced with our brand new all-NVMe systems (see signature) until I did a firmware upgrade on all the Intel NVMe drives.

Hence my recommendation above.
 

spikkedd

I've just updated the firmware on all the SSDs to the latest version, and the problem appears to persist.
Mind if I ask what your pool configuration is?
 

spikkedd

Rabbit hole might be a bit deeper.


This is my fear as well. That said, at about 9:12 in the video, where he is in his server room, you can see his other SuperMicro, which is similar to what I have. I'm leaning towards that being his functioning NAS.
Here's the video that corresponds to that server (I think):
 

Patrick M. Hausen

See my signature. The "Production Hypervisor System".
 

Herr_Merlin

Patron
Joined
Oct 25, 2019
Messages
200
I have the same issue with my NVMe pool. It looks like FreeNAS is not (yet) able to work well with NVMe.
 

Patrick M. Hausen

Thanks Patrick, what settings do you have for your storage pool?
What do you have for compression, Dedup, and record size?
Also, if you don't mind me asking, what transfer speeds are you getting?
What do you mean by settings? Default compression, default record size, of course no dedup.

And since these systems don't serve any data but run VMs exclusively, I cannot give you transfer speeds, either.
But with the new systems, right after setup, I saw continuous "lost interrupt" messages followed by "controller reset" messages, during which the system froze and then resumed (for a while).
Upgrading from 11.2 to 11.3 (Warner Losh fixed some important parts of the interrupt handling in FreeBSD) made the problem less frequent, and upgrading the drive firmware fixed it for good.
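
To check whether a system shows the same symptom, one option is to count log lines that contain the two phrases quoted above. This is a minimal sketch: `count_nvme_events` is a hypothetical helper, the match is a plain case-insensitive substring search, and the exact wording of the kernel messages on a given release is an assumption that should be checked against the real log.

```python
from collections import Counter
from typing import Iterable, Tuple

# Phrases taken from the description above; real messages may be worded differently.
PHRASES: Tuple[str, ...] = ("lost interrupt", "controller reset")


def count_nvme_events(lines: Iterable[str], phrases: Tuple[str, ...] = PHRASES) -> Counter:
    """Count how many lines mention each phrase (case-insensitive)."""
    counts = Counter({phrase: 0 for phrase in phrases})
    for line in lines:
        lowered = line.lower()
        for phrase in phrases:
            if phrase in lowered:
                counts[phrase] += 1
    return counts
```

On a FreeBSD-based system this could be run as `count_nvme_events(open("/var/log/messages", errors="replace"))`. Non-zero counts that line up with the stalls would point towards the drive or driver rather than ZFS tuning.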

Sorry I did not give you complete information right away.

Kind regards,
Patrick
 

morganL

NVMe can work well with TrueNAS CORE, but at this stage it requires custom ZFS tuning and some wizardry. The first step is defining the precise performance goals of the system.
 

Patrick M. Hausen

but at this stage it requires custom ZFS tuning and some wizardry
Could you point me to some docs or a thread where this has been discussed? Why would it need specific tuning? I mean, it's just SSDs, only faster, isn't it? Currently using NVME devices in my virtualisation systems @work (NVME exclusively) as well as in my NAS @home (mirror pool for VMs and jails).

Thanks,
Patrick
 

morganL

NVMe drives work faster by default, but that doesn't mean the system is optimized for a specific workload.
The current docs on the subject are the ZFS source code, but that is only necessary for people with specific goals in mind.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
Which specific parts of the ZFS source, and are we talking about "requires recompile" or "significant gains can be made by adjusting the very conservative default tunables"? I ask because I know that the default tunables on TrueNAS are still very conservative for any kind of NAND media, let alone fast NVMe.
 

morganL

Who mentioned "requires recompile"?
 

HoneyBadger

Me - I'm trying to determine the extent of the customization that's required to eke out better performance, e.g. would I need to write new code logic (recompile), or just adjust some sysctls afterwards (a reboot, at worst)?
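
For the "adjust some sysctls" route, a reasonable first move is simply reading the current values before changing anything. The sketch below shells out to `sysctl -n`; the three tunable names are the ones I'd expect on a FreeBSD-based FreeNAS 11.x install and are an assumption here (confirm with `sysctl -a | grep vfs.zfs`), and `read_tunables` is a hypothetical helper. It only reads values and recommends none.

```python
import subprocess
from typing import Callable, Dict, Iterable

# ASSUMED names of write-path tunables; verify they exist on your release.
TUNABLES = (
    "vfs.zfs.txg.timeout",
    "vfs.zfs.dirty_data_max",
    "vfs.zfs.delay_min_dirty_percent",
)


def run_sysctl(name: str) -> str:
    """Return the current value of one sysctl as text."""
    result = subprocess.run(["sysctl", "-n", name],
                            capture_output=True, text=True, check=True)
    return result.stdout.strip()


def read_tunables(names: Iterable[str] = TUNABLES,
                  reader: Callable[[str], str] = run_sysctl) -> Dict[str, str]:
    """Read each tunable, marking any that cannot be read as 'unavailable'."""
    values = {}
    for name in names:
        try:
            values[name] = reader(name)
        except (subprocess.CalledProcessError, FileNotFoundError):
            values[name] = "unavailable"
    return values
```

Calling `read_tunables()` on the NAS returns a dict that can be saved before and after a change, which makes it easier to tell which adjustment actually moved the numbers.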
 

morganL

The first pass is just configuring ZFS parameters, which means understanding what the specific performance goals are and what the hardware is capable of.
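
As a small worked example of "what the hardware is capable of", the 10G link in this thread puts a ceiling on any single client. The arithmetic below ignores all protocol overhead (an assumption), so the real ceiling is somewhat lower; `line_rate_mb_s` is a hypothetical helper.

```python
def line_rate_mb_s(gbit_per_s: float) -> float:
    """Raw link speed in MB/s (decimal units, no protocol overhead)."""
    return gbit_per_s * 1000 / 8


ceiling = line_rate_mb_s(10)   # raw ceiling of a 10 Gbit/s link
observed = 900                 # burst speed reported at the start of the thread
print(f"link ceiling: {ceiling:.0f} MB/s, observed burst: {observed / ceiling:.0%} of it")
```

So the reported burst is already a large fraction of what one 10G link can carry, which suggests the goal worth defining is sustained throughput rather than a higher peak.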
 