TrueNAS will not boot after upgrade to 13

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
Does the FreeBSD team know about this? Did anyone create an issue in bugs.freebsd.org or contact the freebsd-stable mailing list?
 

speedtriple

Explorer
Joined
May 8, 2020
Messages
75
Does the FreeBSD team know about this? Did anyone create an issue in bugs.freebsd.org or contact the freebsd-stable mailing list?
Yes, I filed a bug with FreeBSD; they should have info about the issue.
 

NachoMan77

Dabbler
Joined
Sep 23, 2013
Messages
17
Hi there!

I'm having the exact same problem as the OP, same OS/HW config.

Any update from the FreeBSD camp?
 

speedtriple

Explorer
Joined
May 8, 2020
Messages
75
Hi there!

I'm having the exact same problem as the OP, same OS/HW config.

Any update from the FreeBSD camp?
No. The initial feedback from FreeBSD was that there was not much information to work with, so I would not expect much. And there are probably not many people in the same situation, so not many resources will be spent on that case. I guess the same goes for iXsystems.

I moved on to a clean install of SCALE with the config from CORE 12.8, on the same ESXi HW. It's working perfectly.

I never tried a clean install of CORE 13 on the ESXi VM with the config from 12.8; it might work?
 
Last edited:

isaacc

Cadet
Joined
Aug 31, 2022
Messages
2
Just wanted to chime in, I was also having this problem with a TrueNAS VM crash and the subsequent "Doorbell handshake failed" reboot error on 13.0-U1 with an LSI SAS2008. I kept it offline for a bit as it wasn't crucial, and restarting ESXi was a chore because of other VMs. I did an ESXi reboot to bring the SAS2008 card back online, promptly applied the 13.0-U2 update, and so far I've had 20 hours of uptime, which is more than the previous best of 3 hours. It would be great if anyone else can confirm that things are stable now for us ESXi/SAS2008 card users.
 

NachoMan77

Dabbler
Joined
Sep 23, 2013
Messages
17
Just wanted to chime in, I was also having this problem with a TrueNAS VM crash and the subsequent "Doorbell handshake failed" reboot error on 13.0-U1 with an LSI SAS2008. I kept it offline for a bit as it wasn't crucial, and restarting ESXi was a chore because of other VMs. I did an ESXi reboot to bring the SAS2008 card back online, promptly applied the 13.0-U2 update, and so far I've had 20 hours of uptime, which is more than the previous best of 3 hours. It would be great if anyone else can confirm that things are stable now for us ESXi/SAS2008 card users.
This is a weird one. I had this problem with constant VM crashes under 13.0-U1 with ESXi 6.7, at least one every 24 hours. Rebooting the host provided a temporary fix, as noted above. Then I decided to patch ESXi to the latest build (I was missing 4 patches) and it's still on U1, now with 15 days of uptime. I don't know if it's a coincidence or if patching the host really solved the issue. I'm about to update to U2, wish me luck!
 

badincite

Dabbler
Joined
Aug 10, 2022
Messages
20
This is a weird one. I had this problem with constant VM crashes under 13.0-U1 with ESXi 6.7, at least one every 24 hours. Rebooting the host provided a temporary fix, as noted above. Then I decided to patch ESXi to the latest build (I was missing 4 patches) and it's still on U1, now with 15 days of uptime. I don't know if it's a coincidence or if patching the host really solved the issue. I'm about to update to U2, wish me luck!
How is it running? I'm in the same situation, trying to decide before I put my Plex server back online.
 

anaxagorasbc

Dabbler
Joined
Dec 3, 2021
Messages
15
I had the same issue, went to upgrade from 12U8 to 13U2, system locked up within 10 minutes of being on 13U2, rebooted and was getting the doorbell handshake error. Had to reboot the entire host, it was fine for less than a half hour, then it happened again. Downgraded back to 12U8, system has been up for 2 hours and so far so good.
 

isaacc

Cadet
Joined
Aug 31, 2022
Messages
2
This has reared its ugly head again for me on the same system. I had to reboot my ESXi host a few times recently for patching, and after having no problems for months I now cannot keep TrueNAS up for more than half an hour.
 

socra

Dabbler
Joined
Nov 3, 2018
Messages
34
This has reared its ugly head again for me on the same system. I had to reboot my ESXi host a few times recently for patching, and after having no problems for months I now cannot keep TrueNAS up for more than half an hour.
I have the exact same problem.
I have been running TrueNAS 12.8 for years on an older version of ESXi: VMware ESXi 6.7.0, build 18828794 (running on a SuperMicro X10SRA-F as the ESXi host).
An IBM M1015 flashed to LSI IT mode, passed through to the VM, with a couple of mirrored disks, running stable for years.
Today I upgraded to 13.1 by reinstalling it to a new boot environment using the ISO. (I powered off the VM before installing and made a snapshot.)
Reinstalled and imported my 12.8 config, rebooted, and all good, I thought!
After a few minutes I saw that my TrueNAS VM was powered off; I powered it back on, and a few minutes later the VM shut down again.

I reverted to my snapshot to go back to 12.8 (thank God for snapshots). However, my TrueNAS 12.8 VM now also crashed a few minutes after it came up.
So I'm currently shutting down my VMs and will upgrade ESXi to the Oct 2022 patch and reboot my host, hopefully getting my TrueNAS 12.8 VM back up and running. o_O

@isaacc and others, can you share what version you've been using?

I found the bug report on FreeBSD Bugzilla: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=265383
 
Last edited:

socra

Dabbler
Joined
Nov 3, 2018
Messages
34
I upgraded my ESXi host to :
Product: VMware ESXi
Version: 6.7.0
Build: Releasebuild-20497097
Update: 3
Patch: 189
After rebooting the host and starting my TrueNAS 12.8 VM, it seems to be running stable again (uptime almost 4 hours). After my upgrade to 13.1 it crashed in under 20 minutes.
I really have no clue what is going on here. I have 2 other TrueNAS boxes with the same HBA card (M1015 flashed to SAS9211-8i), running bare metal, where I upgraded the machine in the same way from 12.2 to 13.1. Those machines don't run 24/7 like my VM, but so far they've had no issues with 13.x.

It would really suck if this setup no longer worked virtualized.
 

Attachments

  • vmware-logs.zip
    140.5 KB · Views: 160

socra

Dabbler
Joined
Nov 3, 2018
Messages
34
I gave this another go by doing the following:
- Reinstalled my ESXi host with the latest 7.0.3
- Created a new VM (FreeBSD 13) with 7.x compatibility that had the exact same characteristics as my 12.x VM
- Installed TrueNAS 13.U4 on the new VM
- Created a TrueNAS config export of my 12.x VM
- Shut down the old VM and removed my IBM M1015 HBA PCI device from the VM
- Added the M1015 PCI device to my new TrueNAS 13.x VM
- Rebooted my ESXi host first to make sure ESXi registered the new configuration
- Booted my new TrueNAS VM and imported the config backup
The new VM rebooted and has now been up for 7 days, where before it would crash within 20 minutes.
 
Joined
Aug 10, 2016
Messages
28
Same thing here... Tried a fresh install too... No good news... I think I'll have to rethink my server(s) usage; I might consider TrueNAS SCALE, maybe...
The following message keeps repeating forever:
Root mount waiting for: da

It is normal to see "Root mount waiting for: CAM". I've never seen waiting for "da" before.

Then I had to power off the VM, and I got a PSOD (screenshot attached: IMG_20230820_192223.jpg).


BTW, it looks like all the da disks are OK.

The system is an HP DL380 G6 with 2 PERC H310s.
 
Joined
Aug 10, 2016
Messages
28
You do realize that ESXi 6.7 is past end of support, yes?
Yes, but I chimed in because the OP also uses ESXi 6.7.

Anyway, both the hardware and the hypervisor are old, but that's all I have to work with ATM.

I'm thinking about moving to TrueNAS SCALE. I'm just trying to find out if I can install SCALE on bare metal and add another Linux-based hypervisor appliance such as Proxmox or oVirt in an LXC container... I don't want to pass through the HBAs anymore... and I'm not sure what to expect in terms of general administration, Ansible automation, etc...

If that's not possible at all, I'll have to go with other options... I'm trying to avoid booting up a second server right now because of electric bills...
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,700
if I can install Scale in baremetal and add another linux based hypervisor appliance such as ProxMox
Not possible for now in a realistic way. The settings aren't in the GUI, so the only option would be a hack that disappears on reboot.

ovirt in a LXC container

You may or may not be able to get ovirt working in one of those.
 
Joined
Aug 10, 2016
Messages
28
Not possible for now in a realistic way. The settings aren't in the GUI, so the only option would be a hack that disappears on reboot.
Well... I'm not such a newbie; I know how to get around the persistent environment in TrueNAS CORE, pfSense, etc. Linux shouldn't be a problem for me :) ( shouldn't :grin: )

I'm just not that experienced with LXC, but I can find my way.

ATM I'm debugging why my 12.0-U8.1 box depends on DNS for every single connection, even with an IP (not a name) as the parameter... this caused havoc (I'm searching the forums for some clue).
 
Last edited:

socra

Dabbler
Joined
Nov 3, 2018
Messages
34
Yes, but I chimed in because the OP also uses ESXi 6.7.

Anyway, both the hardware and the hypervisor are old, but that's all I have to work with ATM.

I'm thinking about moving to TrueNAS SCALE. I'm just trying to find out if I can install SCALE on bare metal and add another Linux-based hypervisor appliance such as Proxmox or oVirt in an LXC container... I don't want to pass through the HBAs anymore... and I'm not sure what to expect in terms of general administration, Ansible automation, etc...

If that's not possible at all, I'll have to go with other options... I'm trying to avoid booting up a second server right now because of electric bills...
Yes, but you did use an ESXi version from 2020.
Have you tried with the latest 7.x?
 