Reviving two Supermicro boxes

edgecode

Dabbler
Joined
Nov 18, 2023
Messages
20
I inherited two servers, Supermicro iron: one is a 24-bay, the other a 36-bay (the older one, with low-profile PCI). I wonder what kind of performance I can expect from these boxes? They were neglected, running FreeNAS 9, and the hard drives are failing. What I need is eight clients accessing them over 10 GbE, with at least three of them getting a guaranteed 1 Gb/sec sustained. That's the absolute minimum. This is a post-production finishing shop, so we deal with both huge media files and long image sequences.

I will probably try to get the bigger box operational first. It has 128 GB of RAM and both channels on the Intel NIC are working. My switch is a Netgear ProSafe M4300-8X8F, so I can probably aggregate the links. So far I've upgraded to TrueNAS-13.0-U6 and it felt OK, but then I lost WebGUI and SSH access. Also, the network share kept dropping randomly.

This is not a very urgent project, and I'm not the owner, so all purchase decisions must go through the approval process. But if I can achieve my throughput goals, we are willing to invest some time and money.
 

wdp

Explorer
Joined
Apr 16, 2021
Messages
52
Do you have any other option? If that's the hardware you have, then you have to figure out how to make it work as well as possible.

It's really a complex question though. You have to take a lot of things into account for video editing.

I've cobbled a few TrueNAS builds together with junkyard parts when I've had to. There really isn't a one-size-fits-all answer. Traditionally, you'd hire an integrator who takes everything into account, from codec choice in the workflow to specific ATTO drivers across the network.

How many editors you can handle is mostly going to come down to how many drives are inside those servers and how much horsepower is under the hood.

We could start with getting more details. Besides 24 bays and 36 bays: model number? Motherboard? CPU? HBA card? Current storage?

I have no experience with ProSafe / Netgear products in a shared editing environment. Just never crossed paths. Generally any switch that can do 10GbE and jumbo frames is going to work fine, but heat management often comes down to how much RJ45 you cram in a closet and how loud the fans are allowed to get. I tend to split my 10GbE network off on its own, away from anything that touches the public switch.
 

edgecode

Dabbler
Joined
Nov 18, 2023
Messages
20
I do have an option to take them to the junkyard and keep lugging DAS boxes around. The brass doesn't mind but the hands are grumpy about that. Another option is to shop around for a brand new NAS.

Both servers are down, so the only way for me to get the specs is to pull them from the racks and take them apart. That'll have to wait until work is much slower. I'm an editor, not IT.

Some notes I took when initially trying to mess with them:

The 24-bay
Supermicro X10SRL-F
Intel Xeon E5-2630 v4 @ 2.20GHz
64 (2x32) GB RAM
Some Chelsio Dual 10Gb NIC with only one channel alive

The 36-bay has more RAM, an Intel NIC with both channels working, and a faster CPU with fewer cores.

In both boxes all HBAs are LSI 2008, some 8-port, some 16-port. Firmware is IT mode, but it is old and mismatched. Current storage is irrelevant since the hard drives are old and failing. If we go for it, we'll start with at least 24 brand-new spinning drives, likely SATA, not SAS.
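For what it's worth, when I get back into them I'm assuming something like this from the TrueNAS shell will show what firmware each card is actually on before I decide about reflashing (sas2flash comes with the LSI/Broadcom tools; the firmware/BIOS image names below are just placeholders):

Code:
# list every LSI SAS2008 controller with its firmware and BIOS versions
sas2flash -listall
# details for one controller (index 0)
sas2flash -c 0 -list
# reflash controller 0 to a matching P20 IT image -- image names are placeholders
sas2flash -o -c 0 -f 2118it.bin -b mptsas2.rom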

Heat and noise are under control.
 

wdp

Explorer
Joined
Apr 16, 2021
Messages
52
You know a lot for an editor, so that makes you IT. Pretty sure that's how it works.

What is the total storage amount you all would want to have usable?

Do you have any ideas on how you want to handle the vdev configuration? That's really going to be the biggest factor, unless you want to put in A LOT more RAM and tinker with caching layers. For video editing and expansion, mirrors would be my first choice. After that I would probably look at 6-wide RAIDZ2 vdevs. I just find 6 disks to be a nice medium. Anything over 8TB drives I'd limit to 6 disks in Z2 though.
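Just to illustrate what those two layouts look like (pool name and da* device names are made up, and in practice you'd build this through the TrueNAS GUI rather than the shell):

Code:
# 24 drives as 12 striped mirrors -- my first choice for editing
zpool create tank \
  mirror da0 da1   mirror da2 da3   mirror da4 da5   mirror da6 da7 \
  mirror da8 da9   mirror da10 da11 mirror da12 da13 mirror da14 da15 \
  mirror da16 da17 mirror da18 da19 mirror da20 da21 mirror da22 da23

# the same 24 drives as 4x 6-disk RAIDZ2 -- more capacity, less suited to this workload
zpool create tank \
  raidz2 da0 da1 da2 da3 da4 da5       raidz2 da6 da7 da8 da9 da10 da11 \
  raidz2 da12 da13 da14 da15 da16 da17 raidz2 da18 da19 da20 da21 da22 da23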

You said 8 clients with some bandwidth reserved for 3 of them; how many editors at once?

And how complex are the timelines/work? Are we talking feature-length or multicam stuff?

Sometimes if aggregation is a pain in the stack, I will just artificially load balance by assigning a few edit bays to each network port I have available.
 

edgecode

Dabbler
Joined
Nov 18, 2023
Messages
20
What I was trying to say is that I cannot look after the system full time, since I do other work in the shop, so after the initial setup the hope is that it will be as transparent as possible. Likely a single huge share, a single user account, no permissions nonsense.

60-80 TB is what I'm looking at. I will do mirrored pairs and stripe them. I think I mentioned we are an online shop, mostly docs, so the media is a mix of OCM (ARRIRAW, R3D, Sony X-OCN) and whatever crap passes for archival footage these days. The deliverables are flavors of ProRes, IMF, DCP and 16-bit DPX. All UHD or DCI 4K.

So unlike offline editorial, where a day may be spent hitting the same two dozen takes for a scene, my main concern is smooth playback of the whole show, so I don't think caching would help. I see a lot of cases where all clients are hitting the server at once, with two colorists, two conform artists, and rendering and copying of deliverables to and from it. The colorists are the priority because there is usually a client in the room. I'd prefer to do any load balancing by scheduling jobs around each other, rather than putting restrictions on artists with network layout and permissions. We don't use Avid, so nothing is put on the storage unless a human puts it there, and we are all more or less adults and can clean up after ourselves.

All the above is from the purely selfish perspective of a user, not an admin. There may be plenty of reasons it can't work this way.
 

wdp

Explorer
Joined
Apr 16, 2021
Messages
52
Nothing too crazy in your workload. Colorists often bump everything up to full too. If there are spikes in our charts, it's because the colorist came to the office.

If you populated the 24 bays with 12TB rust, that gives you 110-ish usable TB in striped mirrors. A good amount of headroom on a 60-80TB goal. Which is what? $8k in new platters. You could leave 2 hot spares. Or investigate far more capacity in something like a 4-vdev RAIDZ2 if you really don't want to do maintenance for a few weeks at a time.
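Rough math behind those numbers, in decimal TB and before ZFS overhead or the usual free-space headroom, so treat it as ballpark:

Code:
# 24 bays of 12TB drives
echo "striped mirrors, all 24 drives : $((24 / 2 * 12)) TB raw usable"      # 144
echo "striped mirrors, 2 hot spares  : $((22 / 2 * 12)) TB raw usable"      # 132
echo "4 x 6-disk RAIDZ2, no spares   : $((4 * (6 - 2) * 12)) TB raw usable" # 192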

256GB of memory is your floor, though.
 

edgecode

Dabbler
Joined
Nov 18, 2023
Messages
20
Well, sounds reassuring. Thanks.

Hopefully, next week I'll have time to do a clean install and start rebuilding.
 

wdp

Explorer
Joined
Apr 16, 2021
Messages
52
It really just comes down to your total budget.

In working condition, this hardware should mostly be on autopilot if you go with TrueNAS Core.

It's just very little maintenance if it's all set up properly. It will take a lot of hours and some trial and error, but with new drives you're generally in good hands. With used servers, your budget should include some spare parts on hand. I typically just run a closed 10GbE network for the edit bays. Health reports are pushed to our company Discord server.
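The Discord part is nothing fancy, just a cron'd shell script throwing zpool status at a webhook, something along these lines (the webhook URL is a placeholder, and multi-line error output would need proper JSON escaping):

Code:
#!/bin/sh
# daily pool health ping to a Discord webhook
WEBHOOK="https://discord.com/api/webhooks/XXXX/YYYY"   # placeholder URL
STATUS=$(zpool status -x | head -1)   # "all pools are healthy" or the first problem line
curl -s -H "Content-Type: application/json" \
     -d "{\"content\": \"$(hostname): ${STATUS}\"}" \
     "$WEBHOOK"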

Can the hardware get you through that workload, with some possible concessions? Sure, 100%. I did 2 seasons of CBS shows on a junkyard dual E5-2690 v4 with 64GB of memory and two 6-wide RAIDZ2 vdevs of 18TB rust.

If you have 24 or possibly 36 platters, your bottlenecks will be CPU, Memory and Network.

Bump it to 256-512GB and you're in very safe territory.

I would probably look at getting an Arista 7050X or the equivalent if you can. I see them used for pretty cheap. They're bomb-proof. I had 5 in a production studio that ran around the clock for 7 years. That rules out network bottlenecks even being possible. You gain 40Gb QSFP+ between the servers and the switch, and can go RJ45/Cat6a to wherever you need to.

That really leaves CPU as the most likely bottleneck, which is only solved by upgrading the server to something newer.

Between Intel and Apple M chips, H.265 acceleration has really changed the way we approach our hardware workloads. We bake 10-bit H.265s and we basically have no network load until it's time for color and render.
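For reference, baking those is a one-liner in ffmpeg, roughly like this (filenames and quality settings are placeholders; a hardware encoder would use a different codec name):

Code:
# 10-bit HEVC file, audio passed through untouched
ffmpeg -i source.mov \
       -c:v libx265 -pix_fmt yuv420p10le -crf 18 -preset medium \
       -c:a copy \
       out_10bit.mov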

Behind the TrueNAS I'd probably want an LTO deck to avoid unnecessary expansion and to get full on-site and off-site backup capabilities; just clone the tapes.
 

edgecode

Dabbler
Joined
Nov 18, 2023
Messages
20
The budget is a tricky issue. On one hand we're pretty busy and can afford to invest in hardware, but on the other we are handling the load without any shared storage, so what's the motive? Kind of a chicken-and-egg situation. To get the funds I need to show the performance, but to show the performance I need to spend on rust, which requires funds. Not explicitly, but that's how it works out, especially since these boxes never performed well.

Yes, isolating the 10Gb LAN from the rest of the network is a priority. I don't want file traffic to interfere with Resolve database access. That would be relatively easy; I have plenty of Gigabit ports on another switch.

We do LTO backups, though I have some reservations about what exactly gets backed up and what does not. Things have evolved a lot since the policies were laid out.
 

wdp

Explorer
Joined
Apr 16, 2021
Messages
52
I don't really put ZFS in the performant-filesystem category. If anything, it is extremely resource-intensive, power-hungry, and has a fairly high barrier to entry and an even steeper learning curve. Can it be performant? Absolutely. Is it the fastest way to do everything in 8K? Hell no.

The sales pitch has to start long before that.

TrueNAS provides you with one of, if not the, best block storage solutions you can provide for your footage and the integrity of your client data. ZFS is how you SHOULD be storing all your data. Plus it has an amazing snapshot system and paves a long-term path for scaling productions.

The more coal you feed it, the faster this train goes.
 

edgecode

Dabbler
Joined
Nov 18, 2023
Messages
20
Thanks. It does make sense.

Actually, with the limited time I could spend and just basic tweaks, I could take it to the performance level where network bandwidth became the bottleneck. So we are going to give it a shot in earnest.
 

edgecode

Dabbler
Joined
Nov 18, 2023
Messages
20
Small update and a head-scratcher.

So I started on the box with more RAM (256 GB).

Supermicro X9DRsomething
Intel Xeon CPU E5-2620 2.00GHz

I replaced the boot SSD, flashed the HBA with the latest firmware, rebuilt the pool as 12 mirrored pairs striped (24 SAS HDDs), and aggregated the two channels on the 10GbE NIC.
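For reference, the aggregation side boils down to an LACP lagg over the two ports. The FreeBSD CLI equivalent of what I clicked together in the GUI looks roughly like this (ix0/ix1 and the addressing are placeholders for my setup, and the two switch ports on the M4300 have to be in a matching LACP LAG):

Code:
ifconfig ix0 up
ifconfig ix1 up
ifconfig lagg0 create
ifconfig lagg0 laggproto lacp laggport ix0 laggport ix1 \
         192.168.10.2 netmask 255.255.255.0 mtu 9000 up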

I'm getting great speed copying terabytes of files to and from this box. I can play 6 streams of 3840x2160 ProRes HQ with 8 channels of audio at 23.98 fps (~75 MB/sec), but what I cannot do is play my DaVinci Resolve timeline, regardless of the settings. The playback stutters at each cut for a couple of seconds and then goes back to 23.98. If I roll back and play the same section again, it's smooth.

Is there any way to improve the playback?
 

edgecode

Dabbler
Joined
Nov 18, 2023
Messages
20
solnet-array-test reported this so far:

Code:
Performing initial serial array read (baseline speeds)
Fri Feb  2 23:41:49 PST 2024
Sat Feb  3 00:40:21 PST 2024
Completed: initial serial array read (baseline speeds)


Array's average speed is 79.0346 MB/sec per disk


Disk    Disk Size  MB/sec %ofAvg
------- ---------- ------ ------
da0      3815447MB     45     56 --SLOW--
da1      3815447MB     44     56 --SLOW--
da2      3815447MB     49     62 --SLOW--
da3      3815447MB     44     56 --SLOW--
da4      3815447MB     43     54 --SLOW--
da5      3815447MB     44     55 --SLOW--
da6      3815447MB     43     55 --SLOW--
da7      3815447MB     45     57 --SLOW--
da8      3815447MB     44     56 --SLOW--
da9      3815447MB     43     55 --SLOW--
da10     3815447MB     44     55 --SLOW--
da11     3815447MB     45     57 --SLOW--
da12     3815447MB     44     56 --SLOW--
da13     3815447MB     43     55 --SLOW--
da14     3815447MB     44     55 --SLOW--
da15     3815447MB     42     53 --SLOW--
da16     3815447MB     44     56 --SLOW--
da17     3815447MB     43     54 --SLOW--
da18     3815447MB     44     56 --SLOW--
da19     3815447MB     43     55 --SLOW--
da20     3815447MB     42     53 --SLOW--
da21     3815447MB     43     55 --SLOW--
da22     3815447MB     44     55 --SLOW--
da23     3815447MB     42     54 --SLOW--
ada0      238475MB    501    634 ++FAST++
ada1      238475MB    502    635 ++FAST++


Performing initial parallel array read
Sat Feb  3 00:40:21 PST 2024
The disk da0 appears to be 3815447 MB.
Disk is reading at about 84 MB/sec
This suggests that this pass may take around 762 minutes


                   Serial Parall % of
Disk    Disk Size  MB/sec MB/sec Serial
------- ---------- ------ ------ ------
da0      3815447MB     45     84    188 ++FAST++
da1      3815447MB     44     82    186 ++FAST++
da2      3815447MB     49     93    190 ++FAST++
da3      3815447MB     44     79    178 ++FAST++
da4      3815447MB     43     83    194 ++FAST++
da5      3815447MB     44     83    190 ++FAST++
da6      3815447MB     43     83    191 ++FAST++
da7      3815447MB     45     84    187 ++FAST++
da8      3815447MB     44     86    193 ++FAST++
da9      3815447MB     43     83    191 ++FAST++
da10     3815447MB     44     83    191 ++FAST++
da11     3815447MB     45     82    181 ++FAST++
da12     3815447MB     44     83    189 ++FAST++
da13     3815447MB     43     80    186 ++FAST++
da14     3815447MB     44     82    189 ++FAST++
da15     3815447MB     42     81    193 ++FAST++
da16     3815447MB     44     81    185 ++FAST++
da17     3815447MB     43     84    196 ++FAST++
da18     3815447MB     44     82    186 ++FAST++
da19     3815447MB     43     83    192 ++FAST++
da20     3815447MB     42     82    193 ++FAST++
da21     3815447MB     43     82    189 ++FAST++
da22     3815447MB     44     82    188 ++FAST++
da23     3815447MB     42     81    190 ++FAST++
ada0      238475MB    501      0      0 --SLOW--
ada1      238475MB    502      0      0 --SLOW--


It is still going. I'd expect old, heavily used HDDs to not shine, but the uniformity of the results looks weird to me. I am definitely replacing the drives, but will continuing the test give me any more insight? Could the expander or the HBA not be working properly?
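In the meantime, the extra check I was planning to run is a raw sequential read off individual disks, completely outside ZFS, to see whether single-drive speed is also stuck in the 40s (da0 is just an example; I'd repeat it for a few drives on different HBA ports):

Code:
# raw sequential read from one member disk, bypassing ZFS
dd if=/dev/da0 of=/dev/null bs=1M count=10240
# watch per-disk throughput from another shell while it runs
gstat -p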
 

edgecode

Dabbler
Joined
Nov 18, 2023
Messages
20
So I have a matching JBOD disk shelf. If I physically move all the hard drives there, should I expect the pool to still be visible to the system? Are there any extra steps? The HBA for it is installed and flashed with the same firmware.
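My assumption is that it's just an export before pulling the drives and an import after they come up in the shelf, roughly like this (pool name is a placeholder), but I'd rather ask before I find out the hard way:

Code:
# before shutting down and pulling the drives
zpool export tank
# after the drives are in the JBOD and the system is back up
zpool import          # lists pools visible on the new device paths
zpool import tank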
 
