Delete Kubernetes POD stuck in terminating state?


Deleting a Kubernetes POD depends on many factors. Before you execute the kubectl delete pod command, you should go through the following checkpoints -

  1. Is the container associated with the POD still running?
  2. Are there any active Kubernetes deployments associated with the POD?
  3. Do the POD's persistent volume and persistent volume claim still exist?
  4. Is the host running out of memory?
  5. Are there any finalizers associated with the POD?

If you have done the preliminary checks based on the above-mentioned points, then you can go ahead with deleting the POD; otherwise you should not issue the kubectl delete pod command, because it can leave the POD stuck in the Terminating state forever.

You can check the status of the POD by running the following command -

kubectl get pods

Here is the status of the POD -

NAME                                     READY   STATUS        RESTARTS   AGE
mysql-95d6b45b5-72c99-7ef9efa7cd-qasd2   1/1     Terminating   0          25s

Here are the steps for troubleshooting POD deletion -

  1. Use kubectl describe pod to view the running information of the POD
  2. Check and remove the finalizer (if a finalizer is applied to the POD)
  3. Check the status of the NODE for the stuck POD
  4. Check the running information of the Deployment associated with the POD stuck in Terminating status
  5. Verify the PV (Persistent Volume) and PVC (Persistent Volume Claim)
  6. Force delete the POD stuck in Terminating status
  7. Restart the kubelet

(*Note - Follow this post - If you are planning to delete all PODs at once)
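Before walking through the steps, it helps to find every stuck POD in one shot. Terminating is not a real POD phase (it is shown once a POD has a deletionTimestamp), so a simple grep over the kubectl output is the easiest filter - a small sketch:

```shell
# List PODs in all namespaces and keep only the ones shown as Terminating
kubectl get pods --all-namespaces | grep Terminating
```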

1. Use kubectl describe pod to view the running information of the POD

Before reaching any conclusion, the first recommendation is to check the running information of the POD using the command kubectl describe pod <YOUR_POD_NAME>.

kubectl describe -n mynamespace pod mysql-95d6b45b5-6bdcw

It should return you with the following information about the POD -

Name:         mysql-95d6b45b5-6bdcw
Namespace:    default
Priority:     0
Node:         node1/100.0.0.2
Start Time:   Wed, 08 Dec 2021 19:50:39 +0000
Labels:       app=mysql
              pod-template-hash=95d6b45b5
Annotations:  cni.projectcalico.org/containerID: c93d71d0c0b2655eabc23cea3be2c22019233ad3ca1af2193dc87617ff8c2adb
              cni.projectcalico.org/podIP: 10.233.90.48/32
              cni.projectcalico.org/podIPs: 10.233.90.48/32
Status:       Running
IP:           10.233.90.48
IPs:
  IP:           10.233.90.48
Controlled By:  ReplicaSet/mysql-95d6b45b5
Containers:
  mysql:
    Container ID:   docker://0cf40c528099f14c3164d55143ba218ebd4778f174343d17818fd10fae64fad4
    Image:          mysql:5.6
    Image ID:       docker-pullable://mysql@sha256:cdb7b3a69c0f36ce61dda653cdbe1bf086b6a98c1bf6fa023f7a37bc8325dc98
    Port:           3306/TCP
    Host Port:      0/TCP
    State:          Running
      Started:      Wed, 08 Dec 2021 19:50:41 +0000
    Ready:          True
    Restart Count:  0
    Environment:
      MYSQL_ROOT_PASSWORD:  <set to the key 'password' in secret 'mysql-secret'>  Optional: false
    Mounts:
      /home/vagrant/storage from mysql-persistent-storage (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-j7l8x (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  mysql-persistent-storage:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  test-pvc
    ReadOnly:   false
  kube-api-access-j7l8x:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type    Reason     Age   From               Message
  ----    ------     ----  ----               -------
  Normal  Scheduled  12m   default-scheduler  Successfully assigned default/mysql-95d6b45b5-6bdcw to node1
  Normal  Pulled     12m   kubelet            Container image "mysql:5.6" already present on machine
  Normal  Created    12m   kubelet            Created container mysql
  Normal  Started    12m   kubelet            Started container mysql

Or you can export the above information into a .txt file -

kubectl get pod -n mynamespace mysql-95d6b45b5-6bdcw -o yaml > /tmp/pod_details.txt
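Once the export has succeeded, you can quickly scan the exported YAML for the two fields that matter most for a stuck deletion (the grep pattern here is just a sketch):

```shell
# deletionTimestamp is set as soon as a delete is requested;
# finalizers (if any) must be cleared before the POD can go away
grep -E 'deletionTimestamp|finalizers' /tmp/pod_details.txt
```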

2. Check and remove the finalizer (if a finalizer is applied to the POD)

From Step 1 you have all the necessary information available for troubleshooting the POD stuck in Terminating status.

  1. Now look for finalizers inside the information logs which you got from Step 1.
  2. If there are any finalizers, then you need to delete them first, because a finalizer will prevent the resource from being removed from the Kubernetes cluster.
  3. Use the following command to remove the finalizers -

kubectl patch -n mynamespace pod mysql-95d6b45b5-6bdcw -p '{"metadata":{"finalizers":null}}'

  4. If there are no finalizers, then proceed to Step 3.
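Instead of scanning the full describe output, you can also read the finalizers directly with a JSONPath query (shown with the POD name used in this post):

```shell
# Prints the finalizers array of the POD; empty output means no finalizers are set
kubectl get pod mysql-95d6b45b5-6bdcw -n mynamespace -o jsonpath='{.metadata.finalizers}'
```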

3. Check the status of the NODE for the stuck POD

The third thing we need to check is the NODE. If you look carefully at the running information log which we collected in Step 1, you will find the node details as well -

Here are the actual logs -

Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type    Reason     Age   From               Message
  ----    ------     ----  ----               -------
  Normal  Scheduled  12m   default-scheduler  Successfully assigned default/mysql-95d6b45b5-6bdcw to node1
  Normal  Pulled     12m   kubelet            Container image "mysql:5.6" already present on machine
  Normal  Created    12m   kubelet            Created container mysql
  Normal  Started    12m   kubelet            Started container mysql

If you do not find any warning or error in these events, then your Kubernetes NODE is healthy; otherwise you should see an error message here.
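You can also check the NODE directly instead of relying only on the POD events (node1 is the node from this post; a NotReady status here often explains a stuck POD):

```shell
# Overall node health - look for Ready vs NotReady in the STATUS column
kubectl get nodes

# Detailed node conditions (MemoryPressure, DiskPressure, Ready, ...)
kubectl describe node node1
```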


4. Check the running information of Deployment associated with POD stuck in terminating status

The next troubleshooting step is to check the deployment which is associated with your POD. There is a very thin chance of the deployment causing trouble in the deletion of the POD, but it's worth checking.

Run the following command to check the deployment -

kubectl get deployment

Here is the status of the deployment -

NAME    READY   UP-TO-DATE   AVAILABLE   AGE
mysql   1/1     1            1           53m

Also use the describe command to check the status of the deployment -

kubectl describe deployment mysql

Here is the running information of the deployment (*Note - If you do not notice any error in the deployment, then it is safe to delete the deployment)

Name:               mysql
Namespace:          default
CreationTimestamp:  Wed, 08 Dec 2021 19:48:06 +0000
Labels:             <none>
Annotations:        deployment.kubernetes.io/revision: 1
Selector:           app=mysql
Replicas:           1 desired | 1 updated | 1 total | 1 available | 0 unavailable
StrategyType:       Recreate
MinReadySeconds:    0
Pod Template:
  Labels:  app=mysql
  Containers:
   mysql:
    Image:      mysql:5.6
    Port:       3306/TCP
    Host Port:  0/TCP
    Environment:
      MYSQL_ROOT_PASSWORD:  <set to the key 'password' in secret 'mysql-secret'>  Optional: false
    Mounts:
      /home/vagrant/storage from mysql-persistent-storage (rw)
  Volumes:
   mysql-persistent-storage:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  test-pvc
    ReadOnly:   false
Conditions:
  Type           Status  Reason
  ----           ------  ------
  Progressing    True    NewReplicaSetAvailable
  Available      True    MinimumReplicasAvailable
OldReplicaSets:  <none>
NewReplicaSet:   mysql-95d6b45b5 (1/1 replicas created)
Events:
  Type    Reason             Age   From                   Message
  ----    ------             ----  ----                   -------
  Normal  ScalingReplicaSet  54m   deployment-controller  Scaled up replica set mysql-95d6b45b5 to 1

Delete the deployment associated with the POD stuck in terminating state -

kubectl delete deployment mysql
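If you would rather keep the deployment definition around, a gentler alternative (a sketch, not part of the original steps) is to scale it down to zero replicas so the controller stops recreating the POD while you troubleshoot:

```shell
# Scale the deployment to zero replicas instead of deleting it
kubectl scale deployment mysql --replicas=0
```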

5. Verify the PV (Persistent Volume) and PVC (Persistent Volume Claim)

The last resources we need to check before force deleting the POD are the PV (Persistent Volume) and PVC (Persistent Volume Claim).

Run the following kubectl commands to list the PV and PVC -

kubectl get pv

Here is the list of PVs -

NAME       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM              STORAGECLASS    REASON   AGE
jhooq-pv   1Gi        RWO            Retain           Bound    default/jhooq-pv   local-storage            65m

Here is the list of PVCs -

kubectl get pvc

NAME        STATUS   VOLUME     CAPACITY   ACCESS MODES   STORAGECLASS    AGE
jhooq-pvc   Bound    jhooq-pv   1Gi        RWO            local-storage   66m

Delete the PV and PVC associated with the POD stuck in terminating state -

kubectl delete pv jhooq-pv

and also delete the PVC -

kubectl delete pvc jhooq-pvc

(*Note- Click here to read more about deleting Persistent Volume and Persistent Volume Claim)
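If the PVC itself gets stuck in Terminating, it is usually held by the kubernetes.io/pvc-protection finalizer while a POD still references it. As a last resort you can clear the finalizer the same way as for the POD (use this with care - it bypasses the protection):

```shell
# Remove the finalizers from a PVC stuck in Terminating
kubectl patch pvc jhooq-pvc -p '{"metadata":{"finalizers":null}}'
```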


6. Force Delete POD stuck in Terminating Status

If you have followed all the previous steps for debugging and checked the respective associated Kubernetes resources, then you can safely force delete the POD which is stuck in Terminating status.

Here is the command for force delete -

kubectl delete pod [POD_NAME] --grace-period=0 --force --namespace [YOUR_NAMESPACE]
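Filling in the placeholders with the POD and namespace used throughout this post, the force delete and a follow-up check would look like this:

```shell
# Force delete the stuck POD (skips the graceful termination period)
kubectl delete pod mysql-95d6b45b5-6bdcw --grace-period=0 --force --namespace mynamespace

# Verify the POD is gone - this should now return a NotFound error
kubectl get pod -n mynamespace mysql-95d6b45b5-6bdcw
```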

7. Restart kubelet

The last thing you can do is restart the kubelet. For that you need administrative permission, and if you are working in a production-like environment, make sure to communicate the downtime to avoid unexpected service outages.

Use the following bash command to restart the kubelet -

/etc/init.d/kubelet restart
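On nodes that use systemd (most modern Linux distributions), the equivalent restart is done via systemctl - run it on the node that hosts the stuck POD:

```shell
# Restart the kubelet service and confirm it came back up
sudo systemctl restart kubelet
sudo systemctl status kubelet
```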

Learn more on Kubernetes -

  1. Setup kubernetes on Ubuntu
  2. Setup Kubernetes on CentOS
  3. Setup HA Kubernetes Cluster with Kubespray
  4. Setup HA Kubernetes with Minikube
  5. Setup Kubernetes Dashboard for local kubernetes cluster
  6. Setup Kubernetes Dashboard On GCP(Google Cloud Platform)
  7. How to use Persistent Volume and Persistent Volume Claims in Kubernetes
  8. Deploy Spring Boot Microservice on local Kubernetes cluster
  9. Deploy Spring Boot Microservice on Cloud Platform(GCP)
  10. Setting up Ingress controller NGINX along with HAproxy inside Kubernetes cluster
  11. CI/CD Kubernetes | Setting up CI/CD Jenkins pipeline for kubernetes
  12. kubectl export YAML | Get YAML for deployed kubernetes resources(service, deployment, PV, PVC….)
  13. How to setup kubernetes jenkins pipeline on AWS?
  14. Implementing Kubernetes liveness, Readiness and Startup probes with Spring Boot Microservice Application?
  15. How to fix kubernetes pods getting recreated?
  16. How to delete all kubernetes PODS?
  17. How to use Kubernetes secrets?
  18. Share kubernetes secrets between namespaces?
  19. How to Delete PV(Persistent Volume) and PVC(Persistent Volume Claim) stuck in terminating state?
  20. Delete Kubernetes POD stuck in terminating state?