EKS Connector Pods stuck in Init:CrashLoopBackOff - amazon-web-services

I have a single-node Kubernetes cluster set up on AWS. I am currently running a VPC with one public and one private subnet.
The master node is in the public subnet and the worker node is in the private subnet.
On the AWS console I can successfully register a cluster and download the connector manifest, which I then apply on my master node, but unfortunately the pods don't start. Below is what I observed.
kubectl get pods
NAME              READY   STATUS                  RESTARTS        AGE
eks-connector-0   0/2     Init:CrashLoopBackOff   7 (4m36s ago)   19m
kubectl logs eks-connector-0
Defaulted container "connector-agent" out of: connector-agent, connector-proxy, connector-init (init)
Error from server (BadRequest): container "connector-agent" in pod "eks-connector-0" is waiting to start: PodInitializing
The pod is failing to start with the errors logged above.

I would suggest providing the output of kubectl get pod eks-connector-0 -o yaml and kubectl logs -p eks-connector-0.
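Since the pod is stuck in the Init phase, the logs of the init container itself are usually more telling than the defaulted connector-agent container. As a sketch, using the container names shown in the output above:
kubectl logs eks-connector-0 -c connector-init
kubectl describe pod eks-connector-0
The describe output should also show why the init container keeps restarting (image pull failure, permissions, etc.).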

Related

AWS NFS mount volume issue in kubernetes cluster (EKS)

I am using AWS EKS. When I try to mount EFS in my EKS cluster, I get the following error.
Warning FailedMount 3m1s kubelet Unable to attach or mount volumes: unmounted volumes=[nfs-client-root], unattached volumes=[nfs-client-root nfs-client-provisioner-token-8bx56]: timed out waiting for the condition
Warning FailedMount 77s kubelet MountVolume.SetUp failed for volume "nfs-client-root" : mount failed: exit status 32
Mounting command: systemd-run
Mounting arguments: --description=Kubernetes transient mount for /var/lib/kubelet/pods/b07f3f15-b655-435c-8ec1-8d14b8690c1d/volumes/kubernetes.io~nfs/nfs-client-root --scope -- mount -t nfs 172.31.26.154:/mnt/nfs_share/ /var/lib/kubelet/pods/b07f3f15-b655-435c-8ec1-8d14b8690c1d/volumes/kubernetes.io~nfs/nfs-client-root
Output: Running scope as unit run-23226.scope.
mount.nfs: Connection timed out
I also tried to connect to an external NFS server and got the same warning message.
I have opened inbound allow-all traffic on the EKS cluster, EFS, and NFS security groups.
If the problem is that the nodes need nfs-common installed, please let me know the steps to install the nfs-common package on the nodes.
As I am using AWS EKS, I am unable to log in to the nodes.
When creating an EC2 machine for an external NFS server, you must place it in the VPC used by the EKS cluster and include it in the security group that the nodes use to communicate with each other.
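If you do need a package on the nodes themselves, one option (assuming the node's instance profile allows AWS Systems Manager) is to open a shell with SSM Session Manager instead of SSH; the instance ID below is a placeholder:
aws ssm start-session --target i-0123456789abcdef0
sudo yum install -y nfs-utils    # Amazon Linux AMIs; Ubuntu-based nodes would use: sudo apt-get install -y nfs-common
This is only a sketch; on the standard EKS-optimized Amazon Linux AMI the NFS client tooling may already be present, in which case the timeout points back to networking rather than missing packages.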

network: CNI failed to retrieve network namespace path: cannot find network namespace for the terminated container issue in EKS. help anybody, please

When pods are scaled up through HPA, the following error occurs and pod creation is not possible.
If I manually change the replicas of the deployments, the pods run normally.
It seems to be a CNI-related problem, and the same thing happens even after installing CNI 1.7.10 on the 1.20 cluster as an add-on.
200 IPs per subnet is sufficient, and the outbound security group is also open.
The issue does not occur when the number of pods is scaled manually via kubectl.
7s Warning FailedCreatePodSandBox pod/b4c-ms-test-develop-5f64db58f-bm2vc Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "7632e23d2f3db8f8b8c0335aaaa6afe1e52ad43cf293bfa6789aa14f5b665cf1" network for pod "b4c-ms-test-develop-5f64db58f-bm2vc": networkPlugin cni failed to set up pod "b4c-ms-test-develop-5f64db58f-bm2vc_b4c-test" network: CNI failed to retrieve network namespace path: cannot find network namespace for the terminated container "7632e23d2f3db8f8b8c0335aaaa6afe1e52ad43cf293bfa6789aa14f5b665cf1"
Region: eu-west-1
Cluster Name: dev-pangaia-b4c-eks
For AWS VPC CNI issue, have you attached node logs?: No
For DNS issue, have you attached CoreDNS pod log?:
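When sandbox creation fails like this, a reasonable first step (a sketch, not an official procedure) is to look at the VPC CNI pods on the affected node and at ipamd's logs on the node itself:
kubectl -n kube-system get pods -l k8s-app=aws-node -o wide
kubectl -n kube-system logs -l k8s-app=aws-node --tail=100
On the node, the CNI writes its logs under /var/log/aws-routed-eni/ (ipamd.log and plugin.log), which usually shows whether ENI/IP allocation is keeping up with the HPA scale-out.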

EKS: Unhealthy nodes in the kubernetes cluster

I’m getting an error when using Terraform to provision a node group on AWS EKS.
Error: error waiting for EKS Node Group (xxx) creation: NodeCreationFailure: Unhealthy nodes in the kubernetes cluster.
I went to the console and inspected the node. There is a message “runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker network plugin is not ready: cni config uninitialized”.
I have 5 private subnets that connect to the Internet via NAT.
Is someone able to give me some hint on how to debug this?
Here are some details on my env.
Kubernetes version: 1.18
Platform version: eks.3
AMI type: AL2_x86_64
AMI release version: 1.18.9-20201211
Instance types: m5.xlarge
There are three workloads set up in the cluster.
coredns, STATUS (2 Desired, 0 Available, 0 Ready)
aws-node STATUS (5 Desired, 5 Scheduled, 0 Available, 0 Ready)
kube-proxy STATUS (5 Desired, 5 Scheduled, 5 Available, 5 Ready)
Looking inside coredns, both pods are in Pending state, and the conditions show “Available=False, Deployment does not have minimum availability” and “Progressing=False, ReplicaSet xxx has timed out progressing”.
Looking inside one of the aws-node pods, the status shows “Waiting - CrashLoopBackOff”.
Add pod network add-on
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/2140ac876ef134e0ed5af15c65e414cf26827915/Documentation/kube-flannel.yml
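Before (or instead of) switching CNI plugins, it is worth checking why aws-node itself is crash-looping, since on EKS the VPC CNI is expected to provide pod networking; a common cause is the node IAM role missing the AmazonEKS_CNI_Policy. A quick sketch:
kubectl -n kube-system logs -l k8s-app=aws-node --tail=50
kubectl -n kube-system describe daemonset aws-node
The error in those logs usually explains why the CNI config never gets initialized, which in turn is what keeps coredns Pending and the node group unhealthy.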

Unable to connect to Redis instance from GKE pod - Connection to Redis <IP:6379> failed after 2 failures. Last Error: (110) Operation timed out

I have a GKE cluster that I created with the following command:
$ gcloud container clusters create stage1 \
--enable-ip-alias \
--release-channel stable \
--zone us-central1 \
--node-locations us-central1-a,us-central1-b
and I also created a Redis instance with the following command:
$ gcloud redis instances create redisbox --size=2 --region=us-central1 --redis-version=redis_5_0
I have retrieved the IP address of the redis instance with:
$ gcloud redis instances describe redisbox --region=us-central1
I have updated this IP in my PHP application, built my Docker image, and created the pod in the GKE cluster. When the pod is created, the container throws the following error:
Connection to Redis :6379 failed after 2 failures.Last Error : (110) Operation timed out
Note 1: This is a working application in a hosted environment and we are migrating to Google Cloud
Note 2: The GKE cluster and Redis instance are in the same region
Note 3: IP aliasing is enabled in the cluster
After reproducing this VPC-native GKE cluster and Redis instance with your gcloud commands, I could check that both the nodes and their pods can reach the redisbox host, for example with ncat in a debian:latest pod:
$ REDIS_IP=$(gcloud redis instances describe redisbox --format='get(host)' --region=us-central1)
$ gcloud container clusters get-credentials stage1 --region=us-central1
$ kubectl exec -ti my-debian-pod -- /bin/bash -c "ncat $REDIS_IP 6379 <<<PING"
+PONG
Therefore, I suggest performing this lower-level reachability test in case there is an issue with the specific request that your PHP application is making.
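If the low-level test succeeds but the PHP client still times out, a next step (a sketch, assuming default VPC-native networking) is to confirm that the Memorystore instance's authorized network is the same VPC network the cluster uses:
$ gcloud redis instances describe redisbox --region=us-central1 --format='get(authorizedNetwork)'
$ gcloud container clusters describe stage1 --region=us-central1 --format='get(network)'
The two values should refer to the same VPC network; if they do, the remaining suspects are usually client-side connection and timeout settings in the PHP Redis library.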

Kubernetes on AWS using Kops - kube-apiserver authentication for kubectl

I have set up a basic 2-node k8s cluster on AWS using kOps. I have issues connecting to and interacting with the cluster using kubectl, and I keep getting the error:
The connection to the server api.euwest2.dev.avi.k8s.com was refused - did you specify the right host or port? when trying to run any kubectl command.
I have done a basic kops export kubecfg --name xyz.hhh.kjh.k8s.com --config=~$KUBECONFIG to export the kubeconfig for the cluster I created. Not sure what else I'm missing to make a successful connection to the kube-apiserver so that kubectl works?
Sounds like either:
Your kube-apiserver is not running.
Check with docker ps -a | grep apiserver on your Kubernetes master.
api.euwest2.dev.avi.k8s.com is resolving to an IP address where nothing is listening.
208.73.210.217?
You have the wrong port configured for your kube-apiserver in your ~/.kube/config
server: https://api.euwest2.dev.avi.k8s.com:6443?
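A quick way to check each of these in turn (a sketch using the hostnames from the question):
dig +short api.euwest2.dev.avi.k8s.com
kubectl config view --minify -o jsonpath='{.clusters[0].cluster.server}'
kops validate cluster --name xyz.hhh.kjh.k8s.com
The first shows what the API hostname actually resolves to, the second shows the server and port kubectl is really using, and kops validate cluster reports whether the control plane came up at all.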