If you see an ImagePullBackOff error in practice or in production, it means Kubernetes can’t fetch your image and keeps retrying with a backoff delay. There are three main reasons:

  1. Misspelled image name or tag (nginx vs nginy)
  2. Image was deleted from Docker Hub
  3. Image is in a private registry but credentials weren’t provided

To use private images in Kubernetes, first copy the deployment file from the Kubernetes documentation or write one yourself.

As you can see, there are 3 replicas running. With the help of kubectl get pods -w we can watch the pods update as the deployment rolls out, as shown in the image.

Let’s switch to the private Docker image and apply the deployment.

Fix the private registry access by providing your Docker Hub username and an access token. Keep in mind that the token used here is valid for only 30 days on Docker Hub, so it has to be renewed after that.

Create the registry secret as described in the Kubernetes documentation.
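The original screenshot of this step is not available, so here is a minimal sketch of what such a secret can look like as a manifest. The name regcred and the placeholder value are assumptions for illustration. The Kubernetes documentation also describes generating the same kind of secret with kubectl create secret docker-registry.

```yaml
# Sketch of an image pull secret for a private registry.
# "regcred" is an assumed name, not taken from the original post.
apiVersion: v1
kind: Secret
metadata:
  name: regcred
type: kubernetes.io/dockerconfigjson
data:
  # Base64-encoded Docker config JSON that holds your Docker Hub
  # username and access token. Placeholder only: substitute your own value.
  .dockerconfigjson: <base64-encoded-docker-config-json>
```

The secret has to live in the same namespace as the pods that will use it.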

The deployment.yml file looks like this:
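The author's exact file is not reproduced here. As a rough sketch, a deployment that pulls a private image could look like the following, assuming a pull secret named regcred already exists in the same namespace. The secret name and the image reference are placeholders, not values from the original post.

```yaml
# Sketch based on the nginx Deployment example from the Kubernetes docs,
# extended with imagePullSecrets for a private image.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      imagePullSecrets:
        - name: regcred  # assumed secret name holding the registry credentials
      containers:
        - name: nginx
          # Placeholder: replace with your own private repository and tag.
          image: <dockerhub-user>/<private-image>:<tag>
          ports:
            - containerPort: 80
```

Without the imagePullSecrets entry, the kubelet tries to pull anonymously and the pods end up in ImagePullBackOff.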

Now we need to apply the deployment and check its status.

As you can see in the image, running kubectl describe pod <pod name> shows the image ID, which confirms that the image has been pulled from the kubebibek Docker Hub account.

That’s all.

If your organization is using AWS ECR

You can use the given command and apply it to link the cluster with your registry, just as with Docker Hub.

CrashLoopBackOff: Continuous Pod Restarts

Causes

The CrashLoopBackOff error indicates that your pod is restarting repeatedly due to one or more issues:

  • Incorrect Startup Command: The command specified in your Dockerfile or Kubernetes manifest might be wrong (e.g., using CMD ["app1.py"] instead of app.py).
  • Failing Liveness Probe: If the liveness probe is misconfigured, it may incorrectly determine that the application is unhealthy.
  • Out of Memory (OOMKilled): The application may be consuming more memory than the limits set in your resource configuration.

Debugging Steps

To diagnose the issue, you can use the following commands:

  1. Check Logs:
     kubectl logs <pod-name>
     This will show the output from the application, helping you identify any errors.
  2. Describe Pod:
     kubectl describe pod <pod-name>
     This command provides detailed information about the pod’s status and events.

Key Indicators to Look For

  • Completed: If you see a message indicating that the app exited too early, it suggests an issue with the startup command.
  • OOMKilled: This indicates that the application exceeded its memory limits.
  • Liveness Probe Failed: This points to a misconfiguration in the health check settings.

Fixes

  • Correct the Command: Ensure the startup command in your Dockerfile or manifest is accurate.
  • Adjust Resource Limits: Modify the memory and CPU limits if your application requires more resources.
  • Fix the Probe: Review and adjust the liveness probe settings, or temporarily remove it during testing.
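To make the three fixes concrete, here is a minimal sketch of a pod spec that sets the startup command, the resource limits, and the liveness probe in one place. The pod name, image, port, health path, and all numeric values are illustrative assumptions and need to be adapted to your application.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app-demo  # hypothetical name
spec:
  containers:
    - name: app
      image: <your-image>:<tag>  # placeholder
      # Startup command: must reference a file that really exists in the image.
      command: ["python", "app.py"]
      resources:
        requests:
          memory: "128Mi"
          cpu: "100m"
        limits:
          memory: "256Mi"  # raise this if the container is OOMKilled
          cpu: "500m"
      livenessProbe:
        httpGet:
          path: /healthz  # assumed health endpoint
          port: 8080      # assumed application port
        initialDelaySeconds: 15  # give the app time to start before probing
        periodSeconds: 10
        failureThreshold: 3
```

A probe that starts checking before the application is ready is a common reason for restarts, so initialDelaySeconds is usually the first value worth revisiting.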

Pods Stuck in Pending: Understanding Scheduling

Causes

When pods remain in a pending state, it often results from scheduling issues:

  • Node Label Mismatch: The pod may be requesting a node with a specific label that does not exist.
  • Taints Without Tolerations: The node might have a taint that the pod does not tolerate.

Useful Commands

To investigate pending pods, use these commands:

  1. Describe Pod:
     kubectl describe pod <pod-name>
  2. Show Node Labels:
     kubectl get nodes --show-labels
  3. Label Node:
     kubectl label node <node-name> disktype=ssd

Common Error Messages

  • 0/3 Nodes Match Node Selector: Indicates that no nodes meet the specified criteria.
  • 0/3 Nodes Had Untolerated Taint: Means that the pod cannot be scheduled due to a taint on the nodes.

Fixes

  • Check and Correct Configuration: Review and adjust the nodeSelector, nodeAffinity, or tolerations in your deployment YAML.
  • Remove or Adjust Taints: If certain taints are unnecessary, consider removing or modifying them.
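As an illustration, the sketch below shows where nodeSelector and tolerations sit in a pod spec. The disktype: ssd label matches the labeling command shown earlier. The taint key and value (dedicated=batch) are hypothetical and stand in for whatever taint your nodes actually carry.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: ssd-demo  # hypothetical name
spec:
  # The pod is only scheduled onto nodes that carry this exact label.
  nodeSelector:
    disktype: ssd
  # Allows scheduling onto nodes tainted with dedicated=batch:NoSchedule.
  tolerations:
    - key: "dedicated"   # hypothetical taint key
      operator: "Equal"
      value: "batch"     # hypothetical taint value
      effect: "NoSchedule"
  containers:
    - name: nginx
      image: nginx
```

A toleration only permits scheduling onto a tainted node. It does not attract the pod to that node, which is why it is often combined with a nodeSelector or nodeAffinity.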

StatefulSet Pods Not Starting: PVC Binding Issues

If your StatefulSet works on one platform (like EKS) but fails on another (like Minikube or AKS), the problem is often related to the storage class.

Key Concepts

  • Sequential Pod Creation: StatefulSets create pods in a specific order.
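  • PVC Binding: Persistent Volume Claims (PVCs) must reference a StorageClass that exists in the cluster. Otherwise no volume is provisioned, the claim stays unbound, and the pod cannot start.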
  • EBS on Minikube: Amazon EBS volumes won’t work in a Minikube environment.

Fixes

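Ensure your volumeClaimTemplates are correctly defined and reference a StorageClass that is available on the target cluster:

volumeClaimTemplates:
  - metadata:
      name: www
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: standard  # must match a StorageClass in this cluster
      resources:
        requests:
          storage: 1Gi  # example size, adjust as needed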
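For context, here is a minimal sketch of a complete StatefulSet showing where the claim template fits. It loosely follows the nginx example from the Kubernetes documentation. The names, replica count, storage size, and the headless Service it refers to are assumptions.

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  serviceName: nginx  # headless Service, assumed to exist already
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx
          volumeMounts:
            - name: www  # must match the claim template name below
              mountPath: /usr/share/nginx/html
  volumeClaimTemplates:
    - metadata:
        name: www
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: standard  # pick a class listed by kubectl get storageclass
        resources:
          requests:
            storage: 1Gi  # example size
```

Because pods are created in order, web-1 is not started until web-0 is running, so a single unbound PVC on the first pod blocks the whole set.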

Additional Notes

  • Deleting PVCs: Remember that PVCs associated with StatefulSets persist even after the StatefulSet is deleted. To remove one, use:
    kubectl delete pvc <name>
  • Using CSI Drivers: For advanced storage solutions, consider using Container Storage Interface (CSI) drivers (e.g., NetApp, Portworx) to integrate with external storage platforms.

Final Thoughts

Kubernetes is a powerful orchestration tool, but effective troubleshooting in a production environment relies on:

  • Reading Logs: Analyze application logs for errors.
  • Interpreting Status Messages: Understand the output from Kubernetes commands to identify issues.
  • Understanding Scheduler and Kubelet Behavior: Gain insight into how Kubernetes schedules pods and manages node resources.