
Pod stuck in Terminating state


Problem Description

When a Pod is being deleted, it may remain in a Terminating state for a long time. This can happen due to the following reasons:

  • The Pod has a finalizer associated with it, and the task of this finalizer is not complete.
  • The Pod is not responding to the termination signal.

When we run the kubectl get pods command, we see output like the following:

NAME                     READY     STATUS             RESTARTS   AGE
nginx-7ef9efa7cd-qasd2   1/1       Terminating        0          1h

Troubleshooting and Resolution Steps

1. Gather some information

kubectl get pod [POD_NAME] -n [NAMESPACE] -o yaml
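
If you only need the fields most relevant to a stuck deletion, a jsonpath query narrows the output; a minimal sketch, with the pod and namespace names as placeholders:

# Show only the deletion timestamp, any finalizers, and the node the pod runs on
kubectl get pod [POD_NAME] -n [NAMESPACE] \
  -o jsonpath='{.metadata.deletionTimestamp}{"\n"}{.metadata.finalizers}{"\n"}{.spec.nodeName}{"\n"}'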

2. Check for finalizer

First, check whether the pod has any finalizers attached. If it does, they are the likely cause of the problem.

Save the pod's configuration to a file for inspection:

kubectl get pod [POD_NAME] -n [NAMESPACE] -o yaml > /tmp.txt

Then check whether a finalizers field exists under the metadata block in /tmp.txt. If it does, use Solution A.
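
For example, a quick grep over the saved manifest shows whether the field is present:

# Print the finalizers block (if any) with a few lines of context
grep -n -A 3 'finalizers:' /tmp.txt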

3. Check the status of the node

The problem may also lie with the node the pod is scheduled on.

Find the node name in /tmp.txt (the spec.nodeName field) and check whether that node is healthy. If all pods on that node are stuck in the Terminating state, the node itself is the likely cause.
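
A sketch of that check, assuming the node name was read from the spec.nodeName field in /tmp.txt:

# Is the node Ready?
kubectl get node [NODE_NAME]

# List every pod scheduled on that node; many pods stuck in Terminating points to a node problem
kubectl get pods --all-namespaces -o wide --field-selector spec.nodeName=[NODE_NAME]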

4. Try deleting the pod

If the pod still does not terminate, the container process may not be responding to termination signals. The exact cause depends on the application and may include:

  • A “tight loop” in user-space code that never yields to handle the termination signal (see the sketch below).
  • A long-running maintenance task inside the application, such as garbage collection.

In this case, using Solution B may be feasible.
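
To illustrate the signal-handling point, here is a minimal, hypothetical shell entrypoint that traps SIGTERM so the container shuts down promptly when Kubernetes asks it to terminate; an application without such handling keeps the pod waiting through its termination grace period:

#!/bin/sh
# Hypothetical entrypoint: exit cleanly when Kubernetes sends SIGTERM
trap 'echo "SIGTERM received, shutting down"; exit 0' TERM

while true; do
  # application work would go here
  sleep 1
done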

5. Restart kubelet

If other methods do not work, try restarting kubelet on the node where the pod is running. See Solution C.

Solution

Solution A: Remove finalizer

Remove all finalizers from the corresponding pod:

kubectl patch pod [POD_NAME] -n [NAMESPACE] -p '{"metadata":{"finalizers":null}}'
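
After patching, the pod should be deleted within a few seconds. If you want to confirm that the field was actually cleared (assuming the pod still exists at that moment), a jsonpath query works:

kubectl get pod [POD_NAME] -n [NAMESPACE] -o jsonpath='{.metadata.finalizers}'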

Solution B: Force delete the pod

Note that this is a workaround, not a recommended solution, even though many articles online suggest it as the first option. Use it with caution to avoid causing further issues. For force deleting a pod that belongs to a StatefulSet, see https://kubernetes.io/docs/tasks/run-application/force-delete-stateful-set-pod/.

kubectl delete pod --grace-period=0 --force --namespace [NAMESPACE] [POD_NAME]

Solution C: Restart kubelet

If you have SSH access to the node, you can restart the kubelet process on that node. If you don’t have the necessary permissions, contact someone who does.

Before restarting kubelet, it’s recommended to check the kubelet logs for any potential issues.
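
A sketch of both steps on a node where kubelet runs under systemd (a common setup, but verify how kubelet is managed on your distribution):

# Inspect recent kubelet logs for messages mentioning the stuck pod
journalctl -u kubelet --since "1 hour ago" | grep -i [POD_NAME]

# Restart kubelet; running containers are normally left untouched, but confirm this is acceptable in your environment
sudo systemctl restart kubelet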

Verification

If the kubectl get pods command no longer shows the terminating pod, it indicates that the issue has been resolved.

$ kubectl get pod -n mynamespace nginx-7ef9efa7cd-qasd2
Error from server (NotFound): pods "nginx-7ef9efa7cd-qasd2" not found

Further Actions and Investigation

1. Check if the finalizer’s tasks need to be completed

This depends on what the finalizer is responsible for. Volume-related cleanup is a common reason a finalizer never completes.
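
If the finalizer appears to be volume-related, inspecting the volume objects can show what the cleanup is waiting on; for example:

# PersistentVolumeClaims in the pod's namespace
kubectl get pvc -n [NAMESPACE]

# VolumeAttachments show whether volumes are still attached to nodes
kubectl get volumeattachments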

2. Identify the root cause

This also depends on the tasks performed by the finalizer and requires some specific contextual information. If you have access to the node, check the kubelet logs as they may contain useful information for troubleshooting.
