Kubernetes not coming up on a 21.06-BETA

inman.turbo

Contributor
Joined
Aug 27, 2019
Messages
149
Anyone else facing this issue?

I need to run some more tests and try a few more installs, but so far Kubernetes is just stuck at "not ready".

The setup is a Supermicro A2SDi-8C with a 250 GB Samsung Evo M.2 NVMe as the boot drive. I'm only using the first SAS port, with a mirror of 10 TB WD Reds at the end of a forward breakout cable.

64 GB of ECC RAM, and I opted for no swap during the installation. I'll post more info when I get a chance.

I should note that everything aside from the "Apps" module seems to be working perfectly.
 

morganL

Captain Morgan
Administrator
Moderator
iXsystems
Joined
Mar 10, 2018
Messages
2,694
Thanks.

It would be good to know what steps you took to get Kubernetes started and what errors you are getting when trying to start a simple App.
 

inman.turbo

Contributor
Joined
Aug 27, 2019
Messages
149
It would be good to know what steps you took to get Kubernetes started and what errors you are getting when trying to start a simple App.

I didn't take any steps on my own to get Kubernetes started. I let TrueNAS handle that. The first time I tried to create an app, I got a "Kubernetes service isn't running" error in a pop-up alert in the UI. Oddly, there's no Kubernetes service listed in the Services section of the UI, as one would expect after such an announcement.

I tried again and it let me create a MinIO app and deploy an alpine:latest container; however, they get stuck forever.

Anyway, there's been a new development here. I "unset" the pool in the Apps module from within the UI, then set it again to the same pool, and the Kubernetes system pods started coming up.

Once I saw that the openebs-zfs pods were ready I tried again, and was able to get MinIO up and running! The only change I made in between was changing the ix-applications dataset from inherited sync=always to sync=standard. I don't know if that was related, but I will try to cycle again through all the steps I've taken to see if reproduction is possible. I'll wipe the pool first and try it without the sync setting as well in another cycle. I'll post clear steps if I'm able to isolate anything.
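
For anyone who wants to check the same property from a shell, here is a minimal Python sketch of the check and the change described above. The dataset path `tank/ix-applications` is a placeholder (the pool name is an assumption), and the helper names are made up for illustration. `zfs get` and `zfs set` are the standard OpenZFS commands; the script only runs the read-only query, and only if a `zfs` binary is found. Nothing in this thread establishes that the sync setting caused the hang.

```python
import shutil
import subprocess

# Hypothetical dataset path: replace "tank" with the pool selected in Apps.
DATASET = "tank/ix-applications"


def get_sync_cmd(dataset):
    """Command that prints only the value of the sync property."""
    return ["zfs", "get", "-H", "-o", "value", "sync", dataset]


def set_sync_cmd(dataset, value):
    """Command that sets sync to one of the values ZFS accepts."""
    if value not in ("standard", "always", "disabled"):
        raise ValueError(f"unsupported sync value: {value}")
    return ["zfs", "set", f"sync={value}", dataset]


def parse_sync(output):
    """Normalise the single-line output of `zfs get -H -o value sync`."""
    return output.strip()


if __name__ == "__main__":
    print(" ".join(get_sync_cmd(DATASET)))
    print(" ".join(set_sync_cmd(DATASET, "standard")))
    if shutil.which("zfs"):
        # Read-only query; the set command above is printed but never executed.
        result = subprocess.run(get_sync_cmd(DATASET), capture_output=True, text=True)
        print("current sync:", parse_sync(result.stdout) or result.stderr.strip())
```

Recording the value before and after each test cycle would make it easier to tell whether the property is related at all.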

Another thing to note, though: I still can't get the alpine pod up. I've tried without filling in any optional fields during launch, as well as different variations of volume/port/privileged/unprivileged etc.

It seems the system tries to create a container called "ix-chart", but then gets stuck in "back off restarting failed container".

backoff.png
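
To narrow down which container is in that state, one option is to dump the pod list as JSON and filter it. The sketch below assumes the usual shape of `kubectl get pods -A -o json` output (on SCALE of this era the command is probably reached as `k3s kubectl`, which is an assumption here). The sample data, namespaces, and pod names are made up for illustration and are not taken from the screenshot.

```python
import json

# Reasons Kubernetes reports for a container that is waiting and not starting.
STUCK_REASONS = {"CrashLoopBackOff", "ImagePullBackOff", "ErrImagePull"}


def stuck_containers(pod_list_json):
    """Return (namespace, pod, container, reason) for containers stuck waiting."""
    found = []
    for pod in json.loads(pod_list_json).get("items", []):
        meta = pod.get("metadata", {})
        for cs in pod.get("status", {}).get("containerStatuses", []):
            waiting = cs.get("state", {}).get("waiting")
            if waiting and waiting.get("reason") in STUCK_REASONS:
                found.append((meta.get("namespace"), meta.get("name"),
                              cs.get("name"), waiting["reason"]))
    return found


# Made-up sample in the shape of `kubectl get pods -A -o json` output.
SAMPLE = json.dumps({"items": [
    {"metadata": {"namespace": "ix-alpine", "name": "alpine-ix-chart-0"},
     "status": {"containerStatuses": [
         {"name": "ix-chart",
          "state": {"waiting": {"reason": "CrashLoopBackOff"}}}]}},
    {"metadata": {"namespace": "ix-minio", "name": "minio-0"},
     "status": {"containerStatuses": [
         {"name": "minio", "state": {"running": {}}}]}},
]})

if __name__ == "__main__":
    for row in stuck_containers(SAMPLE):
        print(*row)
```

Once the pod and namespace are known, `kubectl logs --previous` for that pod should show the output of the last failed start.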
 

inman.turbo

Contributor
Joined
Aug 27, 2019
Messages
149
Originally I had set sync to always on the pool (I was going to run some tests with sync and random writes in a VM). I'm not sure if that was clear. Again I have no idea if that may have been what caused the issue.
 

guyp2k

Dabbler
Joined
Nov 16, 2020
Messages
26
I always have this issue after applying an update or a nightly. I have to reboot again after the update, and then K8s starts. I haven't really done any debugging given it's in beta. If a debug taken after an update while I have the issue would help, I will be glad to open a Jira ticket.

Just applied an update and K8s will not start. I tried to get a debug, and it hung at 25% (Dump Kubernetes Information).
 

inman.turbo

Contributor
Joined
Aug 27, 2019
Messages
149
Just applied an update and K8s will not start. I tried to get a debug, and it hung at 25% (Dump Kubernetes Information).

Have you tried unsetting and then resetting the Apps pool? Strangely, this has worked for me three times in a row on the beta.
 

guyp2k

Dabbler
Joined
Nov 16, 2020
Messages
26
Have you tried unsetting and then resetting the Apps pool? Strangely, this has worked for me three times in a row on the beta.

I assume that would wipe out or reset K8s, deleting the apps I have installed.

An additional reboot after an update usually addresses the problem.
 

inman.turbo

Contributor
Joined
Aug 27, 2019
Messages
149
I assume that would wipe out or reset K8s, deleting the apps I have installed.

If you keep the same pool, everything is stored in the ix-applications dataset. Unsetting the pool won't wipe it out. Resetting will only trigger a redeploy of all your apps. I just tried it with three apps and everything remains intact.

However, if your apps contain important data, I can't guarantee there will be no data loss.
 

guyp2k

Dabbler
Joined
Nov 16, 2020
Messages
26
If you keep the same pool, everything is stored in the ix-applications dataset. Unsetting the pool won't wipe it out. Resetting will only trigger a redeploy of all your apps. I just tried it with three apps and everything remains intact.

However, if your apps contain important data, I can't guarantee there will be no data loss.


Thanks for that; I will give it a try. It's a test box, so no problem.
 

morganL

Captain Morgan
Administrator
Moderator
iXsystems
Joined
Mar 10, 2018
Messages
2,694
If you keep the same pool, everything is stored in the ix-applications dataset. Unsetting the pool won't wipe it out. Resetting will only trigger a redeploy of all your apps. I just tried it with three apps and everything remains intact.

However, if your apps contain important data, I can't guarantee there will be no data loss.

Thanks... with this new knowledge and the nightly, it would be useful if you could identify the reproducible bug(s) that can get fixed.
 

inman.turbo

Contributor
Joined
Aug 27, 2019
Messages
149
it would be useful if you could identify the reproducible bug(s) that can get fixed.

I'm working on it. The POST is very long on the Supermicro, so I'll have to wait until I have more time to sit through it and try the install again from scratch.
 

inman.turbo

Contributor
Joined
Aug 27, 2019
Messages
149
Thanks... with this new knowledge and the nightly, it would be useful if you could identify the reproducible bug(s) that can get fixed.

I actually haven't been able to reproduce this with any consistency. I suspect I overworked the drives and hung it (Kubernetes, and somehow the ZFS controller) by creating a bunch of VMs and having their installers all writing their file systems at the same time.

Two or three out of five or so tries, Kubernetes will hang if I follow that same pattern (as rapidly as possible), which is:

  • install
  • boot
  • wipe two drives
  • create mirror pool
  • set sync=always on the pool
  • create three Ubuntu VMs from the same ISO, uploading the ISO through the UI for the first one
  • launch VNC for each one simultaneously
  • quickly run through the interactive portion of the installations
  • browse to Apps
  • attempt to launch an app

If/when Kubernetes does hang, reboots won't fix it; the only fix I've found is to "unset" the pool, then "choose pool" again.

Honestly, though, the Kubernetes implementation in Apps isn't going to work for us. We are going to begin directing our hardware towards testing "rolling our own" Kubernetes installations with worker VMs running on top of multiple instances of SCALE. This is much like what we've done in the past, except that in the past FreeNAS itself was a VM and a peer to the worker nodes, and handled the storage for the hypervisor.

We've done it on the iX minis plenty as well, passing the SATA controllers through to the FreeNAS/TrueNAS instance using Intel IOMMU (VFIO), which works quite well, perhaps even better than the LSI HBAs.

Using SCALE saves us a single VM per node, and may even give a small performance boost due to less context switching. The only downside here is that we didn't always deploy everything to a VM or to a cluster; typically we would run a jail or plugin or two as well for a few basic apps, just for convenience. Some stateful apps that require a lot of good fast storage and don't require high availability are just far simpler to throw in a jail. And the Apps module in SCALE doesn't seem to be a suitable replacement for that. Not even close.

I suppose the other downside is that in our current stack we are accustomed to using libvirt and all the tools that work with it. On our larger deployments (mostly Dell PowerEdges) we may have to keep things the way they are for a while. I honestly hate having to deal with the HBA controllers though. Sometimes they go bad, sometimes the slots go bad, sometimes there's a firmware bug, etc. None of that stuff happens often, but it bites when it does.
 

morganL

Captain Morgan
Administrator
Moderator
iXsystems
Joined
Mar 10, 2018
Messages
2,694
Thanks for the write-up, @inman.turbo.
Reporting a bug and capturing the debugs would make sense if you want it fixed.
If you would like tweaks so that your architecture is better supported, try to identify the key issues. Right now, it's not a clustered K8s model.
In 2022, we are planning to provide clustered Kubernetes and have Kubernetes manage the VMs. Apps can then be deployed with a unified API.
 

tianyaxun

Dabbler
Joined
Feb 12, 2021
Messages
15
I have this issue too.
Kubernetes is not coming up on 21.06-BETA1.
20210825072207.png
 

morganL

Captain Morgan
Administrator
Moderator
iXsystems
Joined
Mar 10, 2018
Messages
2,694
Have you enabled Kubernetes first?
Then you install applications.
 

inman.turbo

Contributor
Joined
Aug 27, 2019
Messages
149
Have you enabled Kubernetes first?
Then you install applications.

Can you explain what you mean by enabling Kubernetes? There is no way to enable or disable the Kubernetes service directly from the UI, as far as I can tell. It is not listed as a service. You must select a pool for Kubernetes to use in order to use Apps, but on a fresh install, if you haven't selected a pool yet, an alert will pop up directing you to do so.

The only way to enable/disable Kubernetes is to set or unset the storage pool, which doesn't actually completely stop or start Kubernetes, as far as I can tell. It does seem to enable/disable some of the networking needed to use the service. So essentially it just cripples it.
 

morganL

Captain Morgan
Administrator
Moderator
iXsystems
Joined
Mar 10, 2018
Messages
2,694
So the ix-applications pool is set up and the Kubernetes settings and network are configured.
Under System Settings, Services, I thought there was a Kubernetes status indicator (apologies, I don't have access to a system right now).
Otherwise, it would require the CLI to check Kubernetes status.
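
As an illustration of what such a CLI check could look like, here is a small Python sketch that reads node readiness from JSON in the shape produced by `kubectl get nodes -o json`. That SCALE exposes this as `k3s kubectl` is an assumption, and the node name and sample data below are made up; this is not an official TrueNAS status check.

```python
import json


def node_ready(node_list_json):
    """Map node name -> True/False based on each node's 'Ready' condition."""
    result = {}
    for node in json.loads(node_list_json).get("items", []):
        name = node.get("metadata", {}).get("name", "<unknown>")
        conditions = node.get("status", {}).get("conditions", [])
        result[name] = any(
            c.get("type") == "Ready" and c.get("status") == "True"
            for c in conditions
        )
    return result


# Made-up sample: a single node whose Ready condition is False.
SAMPLE = json.dumps({"items": [
    {"metadata": {"name": "example-node"},
     "status": {"conditions": [
         {"type": "MemoryPressure", "status": "False"},
         {"type": "Ready", "status": "False"}]}},
]})

if __name__ == "__main__":
    for name, ready in node_ready(SAMPLE).items():
        print(name, "Ready" if ready else "NotReady")
```

A node that stays NotReady would match the "not ready" state reported at the top of the thread, though the cause would still need to come from the logs.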
 

inman.turbo

Contributor
Joined
Aug 27, 2019
Messages
149
Under System Settings, Services, I thought there was a Kubernetes status indicator (apologies, I don't have access to a system right now).

Yeah, that was what I was thinking at first; at least, that is what you would come to expect when accustomed to working within the UI. It is the typical pattern, and it would be nice.

Kubernetes is a pretty complex beast, though, so I guess it would be pretty difficult to control expected behavior with a toggle switch in the Services table. The thing is, it would have to trigger a series of jobs when toggled, which could lead to the switch getting spammed and cause further complications, I suppose. It could maybe work if the switch were grayed out until the previous on/off job had completed.
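
To make the grayed-out idea concrete, here is a rough Python sketch of a switch that ignores clicks while a start/stop job is still running. The class and method names are hypothetical and have nothing to do with the actual TrueNAS middleware; it only illustrates the guard.

```python
import threading


class GuardedToggle:
    """A service switch that ignores clicks while a start/stop job is running."""

    def __init__(self):
        self.enabled = False
        self.busy = False      # True while a job is in flight (switch grayed out)
        self._target = False   # state the in-flight job is trying to reach
        self._lock = threading.Lock()

    def request(self, want_enabled):
        """Try to start a job; return False if the click is ignored."""
        with self._lock:
            if self.busy or want_enabled == self.enabled:
                return False
            self.busy = True
            self._target = want_enabled
            return True

    def job_finished(self, success=True):
        """Called when the job ends; applies the result and re-enables the switch."""
        with self._lock:
            if not self.busy:
                return
            if success:
                self.enabled = self._target
            self.busy = False


if __name__ == "__main__":
    toggle = GuardedToggle()
    print(toggle.request(True))   # first click starts a job
    print(toggle.request(False))  # spammed click while busy is ignored
    toggle.job_finished()
    print(toggle.enabled)
```

The point of the design is that only one job can be in flight at a time, so repeated clicks cannot queue up conflicting start and stop jobs.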
 

guyp2k

Dabbler
Joined
Nov 16, 2020
Messages
26
Still the same issue with me: I just updated to the latest nightly and it will not start; I'm on my 3rd reboot. If you need a debug or more specifics, let me know.

3rd reboot was the ticket :)

TrueNAS-SCALE-21.08-MASTER-20210826-232919
 

morganL

Captain Morgan
Administrator
Moderator
iXsystems
Joined
Mar 10, 2018
Messages
2,694
Still the same issue with me: I just updated to the latest nightly and it will not start; I'm on my 3rd reboot. If you need a debug or more specifics, let me know.

3rd reboot was the ticket :)

TrueNAS-SCALE-21.08-MASTER-20210826-232919

If it is reliable going forward, a debug would not be useful... but if it fails after another reboot, it might be useful to submit a bug report. Obviously, it's very hard to work out what is different about your system that causes the issue.
 