SOLVED: Plex failure after major power failure -- 21.08 Beta fixed the issue


cyrus104

Explorer
Joined
Feb 7, 2021
Messages
70
I just had a power outage that lasted longer than my UPS did, and due to some issue with NUT, none of my devices shut down cleanly. Horrible for all the VMs running on my XCP-ng cluster that uses SCALE as the storage repository.

I don't blame TrueNAS or XCP-ng for any of my issues; NUT is running on a pfSense box and all of my other devices point to it. It has worked fine in the past, including in my pull-the-plug tests.

On to Plex: right now the app just says "Deploying" and never finishes.
When I check the log in the GUI, here is what I get:
2021-05-19 19:40:04
MountVolume.SetUp failed for volume "default-token-g47c5" : failed to sync secret cache: timed out waiting for the condition
2021-05-19 19:40:04
MountVolume.SetUp failed for volume "plex-probe-check" : failed to sync configmap cache: timed out waiting for the condition

At first I thought this was a Plex Pass token issue, so I got a new one and redeployed within the roughly four-minute window.

Edit:
2021-05-19 20:33:01
Created pod: truenas-scale-plex-57c98df45-f28lx
0/1 nodes are available: 1 node(s) had taint {node.kubernetes.io/not-ready: }, that the pod didn't tolerate.
0/1 nodes are available: 1 node(s) had taint {node.kubernetes.io/not-ready: }, that the pod didn't tolerate.
 

ornias

Wizard
Joined
Mar 6, 2020
Messages
1,458
Without more information, like what version of SCALE you're even running, it's hard to help you out.

Did you already try another reboot of the SCALE system,
and/or running service k3s restart?

Your errors show your node isn't ready, i.e. k8s hasn't started correctly (yet).
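If you go the restart route, something like this from a shell should do it (just a sketch, assuming the stock k3s service on SCALE):

service k3s status
service k3s restart
journalctl -u k3s --since "10 minutes ago"

The journalctl line is only there to check whether k3s comes up clean after the restart.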
 

cyrus104

Explorer
Joined
Feb 7, 2021
Messages
70
SCALE 21.04

I did not try another reboot, as several hypervisors start using the system as soon as the shares are mounted.

I do not see k3s or k8s in the Services menu to try restarting.

Also, forgive me: I'm a basic Docker person and have given up on Kubernetes several times because it was vastly more complex for simple tasks like a Plex container or Pi-hole.
 

ornias

Wizard
Joined
Mar 6, 2020
Messages
1,458
I gave you a command to run, not something in the menu.

Also go reboot and report afterwards.
 

cyrus104

Explorer
Joined
Feb 7, 2021
Messages
70
Sorry, I didn't parse the command after the "and/or".

No change after restarting k3s; service k3s status shows it up, but in the log there are errors syncing a pod.

I rebooted and am getting the same issue.
 

ornias

Wizard
Joined
Mar 6, 2020
Messages
1,458
Awesome thanks!
Sorry for the short reply, was on my phone at the time.

Try:
k3s kubectl get nodes

Get the node name, then run:
k3s kubectl describe node *nodename*

And post the results. Obviously a node isn't ready (which is what it's complaining about), and that shouldn't happen.

After these commands, make a debug (under Settings -> Advanced, afaik), create a bug report on the iX Jira, and attach the debug plus the readout of the above two commands just to be sure.
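To capture those two outputs for the ticket, plain shell redirection is enough, e.g.:

k3s kubectl get nodes > /tmp/k3s-nodes.txt
k3s kubectl describe node *nodename* > /tmp/k3s-node-describe.txt

(substitute *nodename* with whatever get nodes printed, same as above)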
 

cyrus104

Explorer
Joined
Feb 7, 2021
Messages
70
One of my apps is working: the Chia one, which was set up from the original instructions by Chris. It's working with no issues, so I'm not sure what the difference is between that one and this one.

truenas# k3s kubectl get nodes

NAME         STATUS     ROLES                  AGE   VERSION
ix-truenas   NotReady   control-plane,master   26d   v1.20.4-k3s1

truenas# k3s kubectl describe node ix-truenas
Name: ix-truenas
Roles: control-plane,master
Labels: beta.kubernetes.io/arch=amd64
beta.kubernetes.io/os=linux
kubernetes.io/arch=amd64
kubernetes.io/hostname=ix-truenas
kubernetes.io/os=linux
node-role.kubernetes.io/control-plane=true
node-role.kubernetes.io/master=true
openebs.io/nodename=ix-truenas
Annotations: k3s.io/node-args:
["server","--flannel-backend","none","--disable","traefik,metrics-server,local-storage","--disable-kube-proxy","--disable-network-policy",...
k3s.io/node-config-hash: JNNECX4FHXDNNVEHPYT7ARODEQ64JDVJ45FIQXW2U2AKTV7NF3MA====
k3s.io/node-env: {"K3S_DATA_DIR":"/mnt/ssd-pool/ix-applications/k3s/data/11347498feda7a0048cf376e3f4c1626523dbb94ae900b8256db941e2113a653"}
node.alpha.kubernetes.io/ttl: 0
volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp: Sat, 24 Apr 2021 09:04:18 +0700
Taints: node.kubernetes.io/not-ready:NoSchedule
Unschedulable: false
Lease:
HolderIdentity: ix-truenas
AcquireTime: <unset>
RenewTime: Thu, 20 May 2021 18:09:56 +0700
Conditions:
Type Status LastHeartbeatTime LastTransitionTime Reason Message
---- ------ ----------------- ------------------ ------ -------
MemoryPressure False Thu, 20 May 2021 18:07:10 +0700 Sat, 24 Apr 2021 09:04:18 +0700 KubeletHasSufficientMemory kubelet has sufficient memory available
DiskPressure False Thu, 20 May 2021 18:07:10 +0700 Sat, 24 Apr 2021 09:04:18 +0700 KubeletHasNoDiskPressure kubelet has no disk pressure
PIDPressure False Thu, 20 May 2021 18:07:10 +0700 Sat, 24 Apr 2021 09:04:18 +0700 KubeletHasSufficientPID kubelet has sufficient PID available
Ready False Thu, 20 May 2021 18:07:10 +0700 Wed, 19 May 2021 19:40:25 +0700 KubeletNotReady runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
Addresses:
InternalIP: 10.100.10.4
Hostname: ix-truenas
Capacity:
cpu: 32
ephemeral-storage: 1865188480Ki
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 131781732Ki
pods: 110
Allocatable:
cpu: 32
ephemeral-storage: 1814455351921
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 131781732Ki
pods: 110
System Info:
Machine ID: 37f8fb817c7b463ba06d90e2e41c4d9d
System UUID: 00000000-0000-0000-0000-d05099dd378f
Boot ID: b41f02f5-6d0f-4393-8af1-6f80ca0dc6bd
Kernel Version: 5.10.18+truenas
OS Image: Debian GNU/Linux bullseye/sid
Operating System: linux
Architecture: amd64
Container Runtime Version: docker://20.10.5
Kubelet Version: v1.20.4-k3s1
Kube-Proxy Version: v1.20.4-k3s1
PodCIDR: 172.16.0.0/16
PodCIDRs: 172.16.0.0/16
Non-terminated Pods: (4 in total)
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits AGE
--------- ---- ------------ ---------- --------------- ------------- ---
kube-system openebs-zfs-controller-0 0 (0%) 0 (0%) 0 (0%) 0 (0%) 26d
kube-system coredns-854c77959c-jq872 100m (0%) 0 (0%) 70Mi (0%) 170Mi (0%) 26d
ix-chia1 chia1-ix-chart-77b7487b77-r4hk9 0 (0%) 0 (0%) 0 (0%) 0 (0%) 11d
kube-system openebs-zfs-node-trzqc 0 (0%) 0 (0%) 0 (0%) 0 (0%) 26d
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
Resource Requests Limits
-------- -------- ------
cpu 100m (0%) 0 (0%)
memory 70Mi (0%) 170Mi (0%)
ephemeral-storage 0 (0%) 0 (0%)
hugepages-1Gi 0 (0%) 0 (0%)
hugepages-2Mi 0 (0%) 0 (0%)
Events: <none>
 

ornias

Wizard
Joined
Mar 6, 2020
Messages
1,458
Your container is working because it was installed while k8s was working.
If k8s fails, it doesn't stop your already-installed containers from running; they keep running because they were already running.
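You can see that split from the Docker side too: SCALE 21.04 uses Docker as the container runtime (your describe output shows docker://20.10.5), so even with k8s down, something like

docker ps --format '{{.Names}}\t{{.Status}}'

should still list the Chia container as Up.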

However k8s has clearly failed.

Could you give the output of:
k3s kubectl get pods -A

Could you also try going to the Apps interface and recreating the network settings there (change them and change them back; for example, change the IP and then set it back to what it was)?
 

cyrus104

Explorer
Joined
Feb 7, 2021
Messages
70
I went to the settings in Apps and checked the network settings in the menu, but nothing has changed there. It's all the same as what I recorded during the initial setup.

truenas# k3s kubectl get pods -A
NAMESPACE               NAME                                  READY   STATUS             RESTARTS   AGE
kube-system             openebs-zfs-controller-0              0/5     Error              0          26d
kube-system             coredns-854c77959c-jq872              0/1     Error              0          26d
ix-chia1                chia1-ix-chart-77b7487b77-r4hk9       1/1     Running            1          12d
kube-system             openebs-zfs-node-trzqc                1/2     CrashLoopBackOff   335        26d
ix-truenas-scale-plex   truenas-scale-plex-69cc7c9fc7-nncn5   0/1     Pending            0          90s
 

ornias

Wizard
Joined
Mar 6, 2020
Messages
1,458
I didn't ask you to check it.
I asked you to change something, save it, and change it back, to force it to be recreated/reset. But let's leave that for now!


Okay, try this too:
k3s kubectl delete pods openebs-zfs-controller-0 coredns-854c77959c-jq872 openebs-zfs-node-trzqc -n kube-system

Wait 5 minutes and do:
k3s kubectl get pods -A

This removes the crashing pods, which forces the system to try to recreate them.
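If you want to watch them come back (or fail again) in real time, kubectl's watch flag works here too:

k3s kubectl get pods -n kube-system -w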

Please do this BEFORE changing any network settings as requested earlier. First run this command, THEN get back to me, before messing with the network settings.
 

cyrus104

Explorer
Joined
Feb 7, 2021
Messages
70
I just did the delete, waited 10 minutes (got distracted by the kid), and it looks like things are pending.

truenas# k3s kubectl get pods -A
NAMESPACE               NAME                                  READY   STATUS    RESTARTS   AGE
ix-chia1                chia1-ix-chart-77b7487b77-r4hk9       1/1     Running   1          12d
ix-truenas-scale-plex   truenas-scale-plex-69cc7c9fc7-nncn5   0/1     Pending   0          7h52m
kube-system             coredns-854c77959c-8cg6k              0/1     Pending   0          10m
kube-system             openebs-zfs-controller-0              0/5     Pending   0          9m32s

Application Events
0/1 nodes are available: 1 node(s) had taint {node.kubernetes.io/not-ready: }, that the pod didn't tolerate.
 

ornias

Wizard
Joined
Mar 6, 2020
Messages
1,458
Make a bug report on Jira and add all the previous details and steps, plus a link to this thread.
 

cyrus104

Explorer
Joined
Feb 7, 2021
Messages
70
They want to do a TeamViewer session; since I'm in Bangkok, Thailand, we're trying to work out a time that works for everyone.

Thanks for taking a stab at it!
 

mea-dev

Cadet
Joined
Sep 29, 2021
Messages
2
Hello, I'm having a similar issue; the difference is the taint part. Mine has:
Taints: ix-svc-start:NoExecute

Here are some more details:

truenas# k3s kubectl describe node ix-truenas
Name: ix-truenas
Roles: control-plane,master
Labels: beta.kubernetes.io/arch=amd64
beta.kubernetes.io/os=linux
kubernetes.io/arch=amd64
kubernetes.io/hostname=ix-truenas
kubernetes.io/os=linux
node-role.kubernetes.io/control-plane=true
node-role.kubernetes.io/master=true
Annotations: k3s.io/node-args:
["server","--flannel-backend","none","--disable","traefik,metrics-server,local-storage","--disable-kube-proxy","--disable-network-policy",...
k3s.io/node-config-hash: TQ5STE32O3KXYIEQXCJZZY67QOHW7KU6UGA3LH7NTXOYYXMQ75LQ====
k3s.io/node-env: {"K3S_DATA_DIR":"/mnt/hdd1to/ix-applications/k3s/data/b923ec15745249e9ba73deb8dc22f52cc5b7615239be700fbc14a9e6d80314aa"}
node.alpha.kubernetes.io/ttl: 0
volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp: Wed, 29 Sep 2021 13:17:58 +0100
Taints: ix-svc-start:NoExecute
Unschedulable: false
Lease:
HolderIdentity: ix-truenas
AcquireTime: <unset>
RenewTime: Wed, 29 Sep 2021 23:28:28 +0100
Conditions:
Type Status LastHeartbeatTime LastTransitionTime Reason Message
---- ------ ----------------- ------------------ ------ -------
MemoryPressure False Wed, 29 Sep 2021 23:23:49 +0100 Wed, 29 Sep 2021 13:17:58 +0100 KubeletHasSufficientMemory kubelet has sufficient memory available
DiskPressure False Wed, 29 Sep 2021 23:23:49 +0100 Wed, 29 Sep 2021 13:17:58 +0100 KubeletHasNoDiskPressure kubelet has no disk pressure
PIDPressure False Wed, 29 Sep 2021 23:23:49 +0100 Wed, 29 Sep 2021 13:17:58 +0100 KubeletHasSufficientPID kubelet has sufficient PID available
Ready True Wed, 29 Sep 2021 23:23:49 +0100 Wed, 29 Sep 2021 13:18:13 +0100 KubeletReady kubelet is posting ready status. AppArmor enabled
Addresses:
InternalIP: 192.168.1.200
Hostname: ix-truenas
Capacity:
cpu: 16
ephemeral-storage: 223505024Ki
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 32906648Ki
pods: 110
Allocatable:
cpu: 16
ephemeral-storage: 217425687177
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 32906648Ki
pods: 110
System Info:
Machine ID: db81e06a94014d56883e11a4f64c0fe4
System UUID: 1683fb06-dacf-11df-bbda-de33d2541cc1
Boot ID: 19b5a0da-a11a-4160-bfb8-4a1ca4d9a3b9
Kernel Version: 5.10.42+truenas
OS Image: Debian GNU/Linux 11 (bullseye)
Operating System: linux
Architecture: amd64
Container Runtime Version: docker://20.10.8
Kubelet Version: v1.21.0-k3s1
Kube-Proxy Version: v1.21.0-k3s1
PodCIDR: 172.16.0.0/16
PodCIDRs: 172.16.0.0/16
Non-terminated Pods: (0 in total)
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits Age
--------- ---- ------------ ---------- --------------- ------------- ---
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
Resource Requests Limits
-------- -------- ------
cpu 0 (0%) 0 (0%)
memory 0 (0%) 0 (0%)
ephemeral-storage 0 (0%) 0 (0%)
hugepages-1Gi 0 (0%) 0 (0%)
hugepages-2Mi 0 (0%) 0 (0%)
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Starting 15m kubelet Starting kubelet.
Normal NodeAllocatableEnforced 15m kubelet Updated Node Allocatable limit across pods
Normal NodeHasSufficientMemory 15m kubelet Node ix-truenas status is now: NodeHasSufficientMemory
Normal NodeHasNoDiskPressure 15m kubelet Node ix-truenas status is now: NodeHasNoDiskPressure
Normal NodeHasSufficientPID 15m kubelet Node ix-truenas status is now: NodeHasSufficientPID

truenas# k3s kubectl get pods -A
NAMESPACE        NAME                          READY   STATUS    RESTARTS   AGE
ix-qbittorrent   qbittorrent-5c55897bc-h4lg7   0/1     Pending   0          9m51s
kube-system      coredns-7448499f4d-nms6q      0/1     Pending   0          9m25s
kube-system      openebs-zfs-controller-0      0/5     Pending   0          9m8s

I tried to delete these pods, but nothing changed.

This is my first experience with TrueNAS SCALE; I'm moving from the Proxmox CT/VM world.
 

truecharts

Guru
Joined
Aug 19, 2021
Messages
788
Sadly enough, we've not seen "ix-svc-start:NoExecute" during our testing...

If there are any issues preventing the Apps system from starting, it's always best to file a Jira ticket directly with iX Systems and add a debug dump (under Advanced Settings) to said ticket.

Generally speaking, the iX developers often do not get wind of forum posts, so without a ticket your issue might not be solved in the near future.
 

exodus454

Dabbler
Joined
Nov 24, 2019
Messages
14
I've also been stuck with the "ix-svc-start:NoExecute" taint.

k3s kubectl taint node ix-truenas ix-svc-stop:NoExecute-

got me back up and running again.
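For anyone else hitting this, you can check whether the taint is actually gone with:

k3s kubectl describe node ix-truenas | grep Taints

(the trailing "-" on the taint command is what tells kubectl to remove the taint rather than add it)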
 

truecharts

Guru
Joined
Aug 19, 2021
Messages
788
To add a bit to this issue:
ix-svc-stop is a special taint added by iX Systems.

This taint not being removed indicates that something in the middleware is going haywire, and after removing it you might still encounter other side effects of the middleware problems...

Hence it would need iX assistance to prevent this issue (and any potential side effects) for other users of the Apps ecosystem.
That Jira ticket would still be VERY helpful to the community as a whole :)
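If you want to dig into what the middleware was doing when the taint got stuck, the middleware journal is the place to look; assuming the stock SCALE unit name, something like:

journalctl -u middlewared --since "1 hour ago"

That readout would also be useful to attach to the Jira ticket.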
 

exodus454

Dabbler
Joined
Nov 24, 2019
Messages
14

I'm gonna wipe my apps and see if anything changes. I've also been getting crazy errors about "network not ready" (I'm not sitting in front of it), but it's complaining about /etc/cni/net.d being "not ready".
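Before wiping, I'll probably just look at what's actually in that CNI config dir first, e.g.:

ls -la /etc/cni/net.d

If it's empty, that would line up with the "cni config uninitialized" message from earlier in the thread.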
 

darkcloud784

Dabbler
Joined
Feb 28, 2019
Messages
25
exodus454 said:
I'm gonna wipe my apps and see if anything changes. I've also been getting crazy errors about "network not ready" (I'm not sitting in front of it), but it's complaining about /etc/cni/net.d being "not ready".

Were you able to get this working?
 