Advice for ASRock C2550D4I / C2750D4I users?

Status
Not open for further replies.

Miles

Cadet
Joined
Oct 27, 2015
Messages
8
Hello All,

So I've been reading up on the horror stories of FreeNAS builds dying due to the issue with these two ASRock boards. I currently have a working system that I built last October. Right now I'm thinking of just turning it off, RMAing the board, and installing the replacement board.

Any recommendations for precautions to preserve my pool? Technically I should be able to just pull the board, replace it, reset my bios settings, and turn it back on...or am I being naive?

Any suggestions / advice would be most appreciated.
 

nojohnny101

Wizard
Joined
Dec 3, 2015
Messages
1,477
First off, I doubt ASRock will allow you to just RMA the board if it is working fine.

Second, how do you know that the replacement board will not suffer from the same issue? I've never seen any official statement from ASRock that they have identified the issue and boards being produced now do not suffer from the same issue.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
Disable the watchdog ASAP. That should prevent the issue from developing. It can be enabled after the bug is patched.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
Any recommendations for precautions to preserve my pool?
The host system is not directly affected by this issue. The worst that can happen is a temperature rise due to lack of fan control.
 

Pitfrr

Wizard
Joined
Feb 10, 2014
Messages
1,531
@Ericloewe: by disabling the watchdog I suppose you mean the option in the BIOS? (I wouldn't see another way, it's just to be sure ;-) )
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
@Ericloewe: by disabling the watchdog I suppose you mean the option in the BIOS? (I wouldn't see another way, it's just to be sure ;-) )
Yeah, disabling the OS driver might have the same effect, but it also means that the BMC would reboot the host every few minutes/seconds.
 

Pitfrr

Wizard
Joined
Feb 10, 2014
Messages
1,531
Well, I wouldn't know how to do it on the OS side so no worries there... ;-))

But just out of curiosity (and to know it) how would it be done on the OS side?
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
Probably a tunable to disable the watchdog driver.
 

Pitfrr

Wizard
Joined
Feb 10, 2014
Messages
1,531
Thanks for the info. I wouldn't have thought about tunables for that... but so far I haven't experimented much with them, so I don't know what's possible.
But I have other topics to investigate before I look into tunables. ;-)
 

nojohnny101

Wizard
Joined
Dec 3, 2015
Messages
1,477
@Ericloewe thanks for sharing this info. Out of curiosity how did you pinpoint the issue to the watchdog? Personal investigation? Did ASRock confirm?

Thanks!
 

Mirfster

Doesn't know what he's talking about
Joined
Oct 2, 2015
Messages
3,215

Miles

Cadet
Joined
Oct 27, 2015
Messages
8
Disable the watchdog ASAP. That should prevent the issue from developing. It can be enabled after the bug is patched.

Ah, so since I never enabled watchdog via the bios (or ever actively turned on any watchdog function) ....I should be fine?
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
@Ericloewe thanks for sharing this info. Out of curiosity how did you pinpoint the issue to the watchdog? Personal investigation? Did ASRock confirm?

Thanks!
I didn't, others did. A guy actually got raw console access to the BMC and spotted what was going on.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
Ah, so since I never enabled watchdog via the bios (or ever actively turned on any watchdog function) ....I should be fine?
Hopefully. But given how incredibly silly this bug is, I wouldn't be surprised if it manifested itself in other scenarios.
What we've seen points to it being a watchdog problem, though.
 

Miles

Cadet
Joined
Oct 27, 2015
Messages
8
Hopefully. But given how incredibly silly this bug is, I wouldn't be surprised if it manifested itself in other scenarios.
What we've seen points to it being a watchdog problem, though.

Thanks for all of your help...I suppose I'll keep my fingers crossed and an eye on the forums to see if there's a fix from ASRock. Cheers
 

Stux

MVP
Joined
Jun 2, 2016
Messages
4,419
It's ironic actually,

A watchdog is supposed to minimize downtime by rebooting a server if it crashes, but in this case, it maximizes it by frying the mobo ;)
 

DaveY

Contributor
Joined
Dec 1, 2014
Messages
141
Ah, so since I never enabled watchdog via the bios (or ever actively turned on any watchdog function) ....I should be fine?
No sir, disabling watchdog from BIOS only prevents the motherboard from ACTUALLY rebooting your system if it does hang, but will not prevent watchdogd from frying your BMC. You need to stop and prevent the watchdogd process from running on reboots. The tunable goes into your rc.conf

rc.conf
watchdogd_enable="NO"

Then reboot your system or kill the watchdogd process (which may actually reboot your board if you don't have watchdog turned off in the BIOS)
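For anyone following along, here is a rough sketch of how to see whether watchdogd is actually running before deciding whether to stop it. It assumes pgrep is available (it is part of the FreeBSD base system); the helper name proc_state is made up for illustration.

```shell
#!/bin/sh
# Sketch only: report whether a process with a given name is running.

# Print "running" if a process whose name is exactly $1 exists,
# otherwise print "stopped". pgrep -x matches the whole process name.
proc_state() {
    if pgrep -x "$1" >/dev/null 2>&1; then
        echo "running"
    else
        echo "stopped"
    fi
}

echo "watchdogd is $(proc_state watchdogd)"
```

If it reports running, `service watchdogd onestop` should stop the daemon for the current boot even when watchdogd_enable is set to NO, with the same caveat as above: stopping it may reboot the board if the watchdog is still enabled in the BIOS.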

The watchdog problem contributes to the shortened life of these boards, but I believe most of them die due to heat. My last board used to run at 60C idle and 90C under load. The board died after a year. Add a fan and replace the sorry excuse for a thermal pad that comes with it. My current board runs at 28C idle and never goes above 32C under load. Motherboard temp is around 44C according to IPMI.
 

Pitfrr

Wizard
Joined
Feb 10, 2014
Messages
1,531
I think I'm missing something (sorry if I'm a bit off topic, I don't know if I should have started another thread?).

The service watchdogd is not running on my system but it is enabled in rc.conf.
So, since I never used tunables, I thought it would be a good exercise to try it out.

I added the following tunable:
Code:
Variable: watchdogd_enable Value: NO Enabled: TRUE

After reboot, I still had watchdogd enabled in rc.conf.

That's where I'm not sure that I get it all...

My understanding is that the tunables are used to change parameters that are normally not modifiable or to add parameters that are not set.
How do I know the tunable is effective? In my case, rc.conf was unchanged but maybe I have to look somewhere else?
Do I have to specify where the parameter has to be (i.e. in rc.conf)?

Thanks for your help
 

DaveY

Contributor
Joined
Dec 1, 2014
Messages
141
You're doing this through the GUI right?!?? System -> Tunables

If you're editing /etc/rc.conf, the file is overwritten on reboots
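Since /etc/rc.conf on its own may not tell the whole story, one way to see which value the rc system would end up with is to source the rc files in order and print the variable afterwards. This is a minimal sketch, assuming a POSIX sh; the function name effective_rc_value is hypothetical, and the file list in the usage comment is the stock FreeBSD one, which may not match where FreeNAS stores GUI tunables.

```shell
#!/bin/sh
# Sketch only: print the value a variable has after sourcing a list of
# rc.conf-style files in order (later files override earlier ones).

effective_rc_value() {
    var=$1
    shift
    # Reject anything that is not a plain shell variable name,
    # because the name is passed to eval below.
    case $var in
        ''|*[!A-Za-z0-9_]*) echo "invalid variable name" >&2; return 1 ;;
    esac
    # Run in a subshell so the sourced settings do not leak into the caller.
    (
        for f in "$@"; do
            [ -r "$f" ] && . "$f"
        done
        eval "printf '%s\n' \"\${$var:-unset}\""
    )
}

# Example call (assumed stock FreeBSD file list, not confirmed for FreeNAS):
#   effective_rc_value watchdogd_enable /etc/defaults/rc.conf /etc/rc.conf /etc/rc.conf.local
```

Sourcing runs whatever the files contain, so only point this at files you would trust the rc system to load anyway. If the result is still YES after adding the tunable and rebooting, the tunable is probably not being applied as an rc.conf entry.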
 

Pitfrr

Wizard
Joined
Feb 10, 2014
Messages
1,531
Yes, sorry I didn't mention it, I'm doing it through the GUI.
 