Un-correctable DRAM ECC ERROR detected at CPU02/DIMM3A

Status
Not open for further replies.

g1ngerninja

Dabbler
Joined
May 8, 2016
Messages
21
Hi All,

I've got a Supermicro X8DT6/X8DTE running the latest BIOS, 2.0c.
I've just installed twelve 8GB ECC Reg DIMMs and get the above error.

They are all Hynix 8GB 2Rx4 PC3-10600R-9-10-E1 HMT31GR7BFR4C-H9 D7 AB.

I switched DIMM slots and the error moved with the DIMM, so I think I have identified the culprit.
What could be causing this? Is it a problem with the DIMM or the motherboard?
I've found a few FAQs on Supermicro's site (https://www.supermicro.com/support/faqs/faq.cfm?faq=13758) that suggest a BIOS upgrade fixes the issue, but since I already have the latest BIOS I'm not sure what to do.
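
The slot-swap reasoning above can be written down as a small decision rule. This is only a sketch of the logic, not a diagnostic tool: the function name and the second slot label `CPU02/DIMM2A` are hypothetical (only `CPU02/DIMM3A` appears in the error message), and it assumes exactly one suspect DIMM was moved between two boots.

```python
def classify_swap_result(dimm_slot_before, error_slot_before,
                         dimm_slot_after, error_slot_after):
    """Interpret a slot-swap test for one suspect DIMM.

    Each argument is a slot label as reported by the BIOS, or None when
    no error was reported on that boot. Returns "dimm", "slot",
    "no-error" or "inconclusive".
    """
    if dimm_slot_before == dimm_slot_after:
        # The DIMM never moved, so the test says nothing.
        raise ValueError("the suspect DIMM must be moved to a different slot")

    if error_slot_after is None:
        # Error vanished after reseating: could be a contact problem.
        return "no-error"

    if (error_slot_before == dimm_slot_before
            and error_slot_after == dimm_slot_after):
        # The error followed the stick to its new slot.
        return "dimm"

    if error_slot_after == error_slot_before:
        # The error stayed put: suspect the slot, channel or CPU.
        return "slot"

    return "inconclusive"


if __name__ == "__main__":
    # "CPU02/DIMM2A" is a made-up second slot, used for illustration only.
    print(classify_swap_result("CPU02/DIMM3A", "CPU02/DIMM3A",
                               "CPU02/DIMM2A", "CPU02/DIMM2A"))
```

An error that follows the stick points at the DIMM; one that stays in the same slot points at the board or the memory channel instead.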

I can't even run Memtest86+: it starts and immediately reboots when this DIMM is in.
[Attached image: ramerror21072016-v3.png]


Thanks
 

Nick2253

Wizard
Joined
Apr 21, 2014
Messages
1,633
It sounds like you got a bad stick. If you remove the bad stick, do you still get any errors?
 

Robert Smith

Patron
Joined
May 4, 2014
Messages
270
Congratulations, now you know your ECC is working.
I suggest you keep the stick, so you can use it in the future to confirm that you have configured ECC correctly. LOL

But as Nick said, boot without the bad stick to confirm this is nothing else.

And, use Memtest86 without the "Plus."
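
If the box is booted into a Linux live environment, one additional way to check that nothing else is reporting errors is to read the kernel's EDAC counters. The sketch below assumes Linux with an EDAC driver loaded for this chipset, which the thread does not confirm; `read_edac_counts` is a hypothetical helper name, and it reads only the standard `ce_count` (corrected) and `ue_count` (uncorrected) files under `/sys/devices/system/edac/mc`.

```python
from pathlib import Path


def read_edac_counts(root="/sys/devices/system/edac/mc"):
    """Return {controller: {"ce_count": int|None, "ue_count": int|None}}.

    An empty dict means the EDAC sysfs tree is absent (driver not
    loaded, or not a Linux system). None means a counter was unreadable.
    """
    base = Path(root)
    counts = {}
    if not base.is_dir():
        return counts

    # Memory controllers show up as mc0, mc1, ...
    for mc in sorted(base.glob("mc[0-9]*")):
        entry = {}
        for name in ("ce_count", "ue_count"):
            try:
                entry[name] = int((mc / name).read_text().strip())
            except (OSError, ValueError):
                entry[name] = None
        counts[mc.name] = entry
    return counts


if __name__ == "__main__":
    found = read_edac_counts()
    if not found:
        print("No EDAC memory controllers found")
    for mc, entry in found.items():
        print(mc, "corrected:", entry["ce_count"],
              "uncorrected:", entry["ue_count"])
```

Zero counts after a long run with the suspect stick removed would be consistent with the remaining DIMMs being healthy, though this complements a full memory test and does not replace one.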
 

g1ngerninja

Dabbler
Joined
May 8, 2016
Messages
21
Hi Nick, yeah, it doesn't error if I remove it. I also have a dead stick: by dead I mean that, when installed, it takes out the other channel it's connected with. At first I thought I had two dead sticks, until I found the culprit. I put the 4GB DIMMs back in place of these two faulty sticks and ran Memtest86+ for the full duration, with no issues and no errors reported for the remaining DIMMs.

It seems this board (maybe all server boards; this is my first one) doesn't play nice if you have a dead stick in one channel and a good one in the other. At least now I know.

Is this erroring stick completely goosed then?
 

g1ngerninja

Dabbler
Joined
May 8, 2016
Messages
21
Hi Robert, why use Memtest86 and not the Plus version that is on almost all Linux live boot CDs?
 

Robert Smith

Patron
Joined
May 4, 2014
Messages
270
The 'Plus' version is a fork. There have been problems with it over the years, such as unreliable multi-threading.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
It's the open-source version, whose development was continued from the old memtest86 4.x. PassMark bought the original and made new versions closed-source.

Unfortunately, development has been nonexistent for years. Fortunately, it still exercises memory well.
 