I have set up my Django project to deploy on Container Engine, based on the documentation at https://cloud.google.com/python/django/container-engine.
After creating the Kubernetes resources with
kubectl create -f project.yaml
I try to get the status of the pods with
kubectl get pods
Each of the pods has the status **CrashLoopBackOff**.
Can you please suggest how to debug this error?
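The usual starting point for a CrashLoopBackOff is to look at the pod's events and at the logs of the crashed container, roughly like this (standard kubectl; the pod name is a placeholder):
$ kubectl describe pod <pod-name>
$ kubectl logs <pod-name> --previous
The describe output lists the recent events for the pod, and --previous prints the logs of the last crashed container rather than the current restart attempt.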
I am trying to set up Multi-cluster Ingress in GCP, following the steps on the documentation page.
When I try to set up the config cluster using the following command, it gives me an error:
$ gcloud alpha container hub ingress enable \
--config-membership=projects/myproject/locations/global/memberships/gke-us
ERROR: (gcloud.alpha.container.hub.ingress.enable) INVALID_ARGUMENT: InvalidValueError for field config_membership:
Membership
"projects/myproject/locations/global/memberships/projects/myproject/locations/global/memberships/gke-us"
does not exist
When I run kubectl config get-contexts, I can see that gke-us exists.
$ gcloud container hub memberships list
NAME    EXTERNAL_ID
gke-us  2b7924f5-c55e-485f-959b-9048b0920713
gke-eu  9ed69091-56b7-44c0-af49-2ccf817c5fcf
I have done this setup a few times before and never faced this issue, but it has suddenly started occurring.
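Judging by the error, the full projects/.../memberships/ prefix appears to be prepended to a value that is already a complete resource name, which doubles the path. One thing that might be worth trying (a guess, not verified) is to pass only the short membership name:
$ gcloud alpha container hub ingress enable \
    --config-membership=gke-us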
I am trying to create an application load balancer controller on my EKS cluster by following
this link
When I run these steps (after making the necessary changes to the downloaded yaml file)
curl -o v2_1_2_full.yaml https://raw.githubusercontent.com/kubernetes-sigs/aws-load-balancer-controller/v2.1.2/docs/install/v2_1_2_full.yaml
kubectl apply -f v2_1_2_full.yaml
I get this output
customresourcedefinition.apiextensions.k8s.io/targetgroupbindings.elbv2.k8s.aws configured
mutatingwebhookconfiguration.admissionregistration.k8s.io/aws-load-balancer-webhook configured
role.rbac.authorization.k8s.io/aws-load-balancer-controller-leader-election-role unchanged
clusterrole.rbac.authorization.k8s.io/aws-load-balancer-controller-role configured
rolebinding.rbac.authorization.k8s.io/aws-load-balancer-controller-leader-election-rolebinding unchanged
clusterrolebinding.rbac.authorization.k8s.io/aws-load-balancer-controller-rolebinding unchanged
service/aws-load-balancer-webhook-service unchanged
deployment.apps/aws-load-balancer-controller unchanged
validatingwebhookconfiguration.admissionregistration.k8s.io/aws-load-balancer-webhook configured
Error from server (InternalError): error when creating "v2_1_2_full.yaml": Internal error occurred: failed calling webhook "webhook.cert-manager.io": Post https://cert-manager-webhook.cert-manager.svc:443/mutate?timeout=10s: no endpoints available for service "cert-manager-webhook"
Error from server (InternalError): error when creating "v2_1_2_full.yaml": Internal error occurred: failed calling webhook "webhook.cert-manager.io": Post https://cert-manager-webhook.cert-manager.svc:443/mutate?timeout=10s: no endpoints available for service "cert-manager-webhook"
The load balancer controller doesn't appear to start up because of this and never reaches the ready state.
Does anyone have any suggestions on how to resolve this issue?
Turns out the taints on my nodegroup prevented the cert-manager pods from starting on any node.
These commands helped debug and led me to a fix for this issue:
kubectl get po -n cert-manager
kubectl describe po <pod id> -n cert-manager
My solution was to create another nodeGroup with no taints specified. This allowed cert-manager to run.
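For anyone debugging the same symptom, a quick way to check whether node taints are what is blocking scheduling (standard kubectl; the node name is a placeholder):
$ kubectl describe node <node-name> | grep -i taints
A nodeGroup defined without a taints section, as in the fix above, leaves its nodes schedulable for pods like cert-manager that don't declare matching tolerations.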
While trying to get the pod or node status from Google Cloud Platform Cloud Shell, I'm facing this error. Can someone please help me? I can see the output of "kubectl config view".
Posting this answer as a community wiki for better visibility and because the possible solution was posted in the comments:
Does this answer your question? Unable to connect to the server: dial tcp i/o time out
Adding to that:
The command below:
$ kubectl config view
shows the configuration stored in your ~/.kube/config file. The fact that you can see the output of this command doesn't mean you have the correct cluster configured for use with kubectl.
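To check which cluster kubectl is actually pointing at, you can also run (standard kubectl):
$ kubectl config current-context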
From the perspective of Google Cloud Platform and Cloud Shell
There is an official documentation regarding troubleshooting issues with GKE:
Cloud.google.com: Kubernetes Engine: Docs: Troubleshooting
There could be several reasons why you are getting the following error:
You are referencing the wrong cluster in your ~/.kube/config file.
$ gcloud container clusters get-credentials CLUSTER_NAME --zone=ZONE - you will need to run this command to fetch the correct configuration
You can also get the above command from the Kubernetes Engine page (Connect button)
You are referencing a cluster in your ~/.kube/config file that has been deleted
You created a private GKE cluster (see the example at the end of this answer)
For more information you can look in the Cloud Console -> Kubernetes Engine -> CLUSTER_NAME
You can also run:
$ gcloud container clusters list - this command will show clusters and their state (status) they are in
$ gcloud container clusters describe CLUSTER_NAME --zone=ZONE - this command will show you the configuration of the cluster
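If the private cluster turns out to be the cause, one possible direction (only a sketch; it assumes master authorized networks are enabled and the CIDR is a placeholder for the address you are connecting from):
$ gcloud container clusters update CLUSTER_NAME --zone=ZONE \
    --enable-master-authorized-networks \
    --master-authorized-networks=203.0.113.0/32
This allows API server access from the given range; the range Cloud Shell uses can change, so treat this only as a starting point.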
With my cluster up and running on AWS EKS, I'm having trouble running helm init, which fails with the following error:
$ helm init --service-account tiller --upgrade
Error: error installing: deployments.extensions is forbidden: User "system:anonymous" cannot create deployments.extensions in the namespace "kube-system"
kubectl works properly (object retrieval, creation, and cluster administration), authenticating and authorizing correctly by running heptio-authenticator-aws at connection time (with an exec section in the kubectl config).
In order to prepare the cluster for helm, I created the service account and role binding as specified in the helm docs.
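For reference, that setup amounts to roughly the following (the tiller name matches the service account passed to helm init; this is the standard Helm v2 RBAC example from the docs, not my exact manifests):
$ kubectl create serviceaccount tiller --namespace kube-system
$ kubectl create clusterrolebinding tiller \
    --clusterrole=cluster-admin \
    --serviceaccount=kube-system:tiller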
I've heard of people who have Helm running on EKS, and I'm guessing they're skipping the exec section of the kubectl config by hardcoding the token... I'd like to avoid that!
Any ideas on how to fix this? My guess is that it is related to Helm not being able to execute heptio-authenticator-aws properly.
I was running Helm version 2.8.2 when I got this error; upgrading to v2.9.1 fixed it!
Trying to update a multi-container pod with
kubectl rolling-update my_rc --image=eu.gcr.io/project_id/myimage
I got:
error: Image update is not supported for multi-container pods
What is the way to update a single container, or must I delete and recreate the pod?
For now, your best option is to update the yaml file defining the replication controller to use the new image and run:
kubectl rolling-update my_rc -f my_file.yaml
If you don't have a yaml file defining your replication controller, you can get one by running:
kubectl get rc my_rc --output=yaml > my_file.yaml
You should then be able to update the image specified in that file and run the rolling-update.
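The field to update in that file is the container image under the pod template, roughly like this (the container and image names are placeholders matching the earlier examples):
spec:
  template:
    spec:
      containers:
        - name: my_container
          # point this at the new image for the container you want to update
          image: eu.gcr.io/project_id/myimage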
In the next release of Kubernetes (targeted for March), you'll be able to just pass the --container flag to tell kubectl which of the containers in the pod should use the new image:
kubectl rolling-update my_rc --container=my_container --image=eu.gcr.io/project_id/myimage
This feature was added by a community member after version 1.1 was cut.