Pod stuck in Terminating state
Problem Description
When a Pod is being deleted, it may remain in the Terminating state for a long time. This can happen for the following reasons:
- The Pod has a finalizer associated with it, and the finalizer's task has not completed.
- The Pod is not responding to the termination signal.
When we run the kubectl get pods command, we see output like the following:
NAME                     READY   STATUS        RESTARTS   AGE
nginx-7ef9efa7cd-qasd2   1/1     Terminating   0          1h
Troubleshooting and Resolution Steps
1. Gather some information
kubectl get pod -n [NAMESPACE] [POD_NAME] -o yaml
2. Check for finalizer
First, we need to check if the pod has any associated finalizer. If there is a finalizer associated, it is likely to be the cause of the problem.
Save the pod's configuration to a file:
kubectl get pod -n [NAMESPACE] [POD_NAME] -o yaml > /tmp.txt
Then check whether a finalizers field exists under the metadata block in /tmp.txt. If it does, use Solution A.
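As a quicker alternative to reading the full YAML, the check above can be sketched with a jsonpath query. The function name pod_finalizers is hypothetical, and the sketch assumes kubectl is already configured for the target cluster.

```shell
# Print the deletion timestamp and the finalizers list of a pod.
# Usage: pod_finalizers NAMESPACE POD_NAME
pod_finalizers() {
  # deletionTimestamp is set once deletion has been requested;
  # a non-empty finalizers list keeps the object from being removed.
  kubectl get pod -n "$1" "$2" \
    -o jsonpath='{.metadata.deletionTimestamp}{"\n"}{.metadata.finalizers}{"\n"}'
}
```

An empty second line of output suggests that no finalizer is set, in which case Solution A does not apply.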
3. Check the status of the node
The problem may also lie with the node that hosts the pod, for example if the node is NotReady or its kubelet has stopped responding.
Find the node name in the spec.nodeName field of /tmp.txt. If all pods on that node are in the Terminating state, the issue is with that node.
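One way to run this check from the command line is sketched below. The function names node_of_pod and pods_on_node are hypothetical helpers, not part of kubectl.

```shell
# Print the name of the node a pod is scheduled on.
# Usage: node_of_pod NAMESPACE POD_NAME
node_of_pod() {
  kubectl get pod -n "$1" "$2" -o jsonpath='{.spec.nodeName}'
}

# List every pod on a node, across all namespaces, with its status.
# Usage: pods_on_node NODE_NAME
pods_on_node() {
  kubectl get pods --all-namespaces -o wide --field-selector "spec.nodeName=$1"
}
```

For example, pods_on_node "$(node_of_pod mynamespace nginx-7ef9efa7cd-qasd2)" lists the neighbours of the stuck pod. If most of them show Terminating, kubectl get node on that node name is a reasonable next step.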
4. Try deleting the pod
If the pod still does not terminate, its process may not be responding to signals. The specific cause depends on the application and may include:
- Existence of a “tight loop” in the user space code that does not respond to interrupt signals.
- A maintenance process during application runtime, such as garbage collection.
In this case, using Solution B may be feasible.
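To illustrate what "not responding to signals" means, here is a small local demonstration that does not touch the cluster. It is only an analogy: a child shell that ignores SIGTERM stands in for an unresponsive application process, and the variable names are made up for this sketch.

```shell
# Start a child shell that ignores SIGTERM and loops forever.
sh -c 'trap "" TERM; while :; do sleep 1; done' &
pid=$!
sleep 1                      # give the child time to install its trap

kill -TERM "$pid"            # polite request, like the signal sent on pod deletion
sleep 1
if kill -0 "$pid" 2>/dev/null; then
  after_term="still running"
else
  after_term="exited"
fi

kill -KILL "$pid"            # SIGKILL cannot be trapped or ignored
wait "$pid" 2>/dev/null || true
if kill -0 "$pid" 2>/dev/null; then
  after_kill="still running"
else
  after_kill="exited"
fi

echo "after SIGTERM: $after_term"
echo "after SIGKILL: $after_kill"
```

The process survives the SIGTERM and only goes away after the SIGKILL, which is why a stronger measure than a normal delete can be needed in this situation.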
5. Restart kubelet
If other methods do not work, try restarting kubelet on the node where the pod is running. See Solution C.
Solution
Solution A: Remove finalizer
Remove all finalizers from the corresponding pod:
kubectl patch pod -n [NAMESPACE] [POD_NAME] -p '{"metadata":{"finalizers":null}}'
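If the pod carries several finalizers and only one of them is stuck, a narrower variant is to remove a single entry with a JSON patch. This is a sketch, not part of the original procedure: the function name remove_finalizer_at is hypothetical, and the index refers to the position in the metadata.finalizers list, starting at 0.

```shell
# Remove one finalizer from a pod by its position in metadata.finalizers.
# Usage: remove_finalizer_at NAMESPACE POD_NAME INDEX
remove_finalizer_at() {
  kubectl patch pod -n "$1" "$2" --type=json \
    -p "[{\"op\":\"remove\",\"path\":\"/metadata/finalizers/$3\"}]"
}
```

Check the current list first, because the index of each entry shifts after every removal.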
Solution B: Force delete the pod
Note that this is a workaround, not a recommended solution, even though many articles online suggest it as the first option. Use it with caution: a force delete removes the pod from the API server without waiting for confirmation that its containers have actually stopped. For information on force deleting a pod from a StatefulSet, refer to https://kubernetes.io/docs/tasks/run-application/force-delete-stateful-set-pod/.
kubectl delete pod --grace-period=0 --force --namespace [NAMESPACE] [POD_NAME]
Solution C: Restart kubelet
If you have SSH access to the node, you can restart the kubelet process on that node. If you don’t have the necessary permissions, contact someone who does.
Before restarting kubelet, it’s recommended to check the kubelet logs for any potential issues.
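The following sketch shows what these two steps might look like on the node. It assumes the kubelet runs as a systemd unit named kubelet, which is common but not universal, and that you have sudo rights there. The function names are hypothetical.

```shell
# Show the most recent kubelet log lines (200 by default).
# Usage: kubelet_logs [LINES]
kubelet_logs() {
  sudo journalctl -u kubelet --no-pager -n "${1:-200}"
}

# Restart the kubelet service, then report whether it is active again.
# Usage: restart_kubelet
restart_kubelet() {
  sudo systemctl restart kubelet && sudo systemctl is-active kubelet
}
```

Run kubelet_logs first and keep its output, since the restart may remove the conditions that produced the error.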
Verification
If the kubectl get pods command no longer shows the terminating pod, the issue has been resolved. Querying the pod by name then returns a NotFound error:
$ kubectl get pod -n mynamespace nginx-7ef9efa7cd-qasd2
Error from server (NotFound): pods "nginx-7ef9efa7cd-qasd2" not found
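Instead of re-running the command by hand, the check can be left to kubectl wait, as in the sketch below. The function name wait_pod_gone and the 60s default timeout are assumptions made for this example.

```shell
# Block until the pod has been deleted or the timeout expires.
# Usage: wait_pod_gone NAMESPACE POD_NAME [TIMEOUT]
wait_pod_gone() {
  kubectl wait --for=delete "pod/$2" -n "$1" --timeout="${3:-60s}"
}
```

A zero exit status means the pod disappeared within the timeout. How kubectl wait behaves when the pod is already gone has varied between kubectl versions, so treat an immediate error as a prompt to check with kubectl get pods.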
Further Actions and Investigation
1. Check if the finalizer’s tasks need to be completed
This depends on the tasks performed by the finalizer. Tasks related to volumes are a common reason for a finalizer to remain incomplete.
2. Identify the root cause
This also depends on the tasks performed by the finalizer and requires some specific contextual information. If you have access to the node, check the kubelet logs, as they may contain useful information for troubleshooting.
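When node access is not available, the events recorded for the pod can be a starting point for the investigation. The sketch below uses a hypothetical helper named pod_events. Events are retained only for a limited time, so an empty result does not rule anything out.

```shell
# List the events recorded for one pod, oldest first.
# Usage: pod_events NAMESPACE POD_NAME
pod_events() {
  kubectl get events -n "$1" \
    --field-selector "involvedObject.kind=Pod,involvedObject.name=$2" \
    --sort-by=.lastTimestamp
}
```

Messages about failed unmounts or detaches would point toward the volume-related finalizer tasks mentioned above.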
Reference
https://www.veitor.net/posts/pod-stuck-in-terminating-state/
https://kubernetes.io/docs/tasks/access-kubernetes-api/custom-resources/custom-resource-definitions/#finalizers
https://kubernetes.io/docs/concepts/containers/container-lifecycle-hooks/
https://kubernetes.io/docs/concepts/workloads/pods/pod/#termination-of-pods
https://unofficial-kubernetes.readthedocs.io/en/latest/concepts/abstractions/pod-termination/
https://kubernetes.io/docs/tasks/debug-application-cluster/debug-cluster/#looking-at-logs