How to add a rule to migrate pods on node failure in k8s - amazon-web-services

I have a k8s cluster running with 2 nodes and 1 master in AWS.
When I increased the replica count, all of the replica pods were spawned on the same node. Is there a way to distribute them across nodes?
sh-3.2# kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE
backend-6b647b59d4-hbfrp 1/1 Running 0 3h 100.96.3.3 node1
api-server-77765b4548-9xdql 1/1 Running 0 3h 100.96.3.1 node2
api-server-77765b4548-b6h5q 1/1 Running 0 3h 100.96.3.2 node2
api-server-77765b4548-cnhjk 1/1 Running 0 3h 100.96.3.5 node2
api-server-77765b4548-vrqdh 1/1 Running 0 3h 100.96.3.7 node2
api-db-85cdd9498c-tpqpw 1/1 Running 0 3h 100.96.3.8 node2
ui-server-84874d8cc-f26z2 1/1 Running 0 3h 100.96.3.4 node1
And when I stopped/terminated the AWS instance (node2), the pods went into Pending state instead of migrating to the available node. Can we specify this behavior?
sh-3.2# kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE
backend-6b647b59d4-hbfrp 1/1 Running 0 3h 100.96.3.3 node1
api-server-77765b4548-9xdql 0/1 Pending 0 32s <none> <none>
api-server-77765b4548-b6h5q 0/1 Pending 0 32s <none> <none>
api-server-77765b4548-cnhjk 0/1 Pending 0 32s <none> <none>
api-server-77765b4548-vrqdh 0/1 Pending 0 32s <none> <none>
api-db-85cdd9498c-tpqpw 0/1 Pending 0 32s <none> <none>
ui-server-84874d8cc-f26z2 1/1 Running 0 3h 100.96.3.4 node1

Normally the scheduler takes this into account and tries to spread your pods, but there are many reasons why the other node might have been unschedulable at the time the pods were started. If you don't want multiple pods of the same set on one node, you can enforce that with Pod Anti-Affinity rules, which let you say that pods carrying the same set of labels (i.e. name and version) can never run on the same node, as in the sketch below.
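A minimal sketch of such a rule (the api-server name, the app: api-server label and the image are placeholders, not taken from your manifests; match them to the labels your Deployment actually uses):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-server
spec:
  replicas: 2
  selector:
    matchLabels:
      app: api-server
  template:
    metadata:
      labels:
        app: api-server
    spec:
      affinity:
        podAntiAffinity:
          # never schedule two pods labelled app=api-server onto the same node
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                app: api-server
            topologyKey: kubernetes.io/hostname
      containers:
      - name: api-server
        image: my-registry/api-server:latest   # placeholder image

Note that with a hard (requiredDuringScheduling...) rule and more replicas than nodes, the extra pods will stay Pending; preferredDuringSchedulingIgnoredDuringExecution expresses the same intent as a soft preference instead.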

Related

coredns pods are running but not ready

I used the Weave Net CNI plugin. When I describe the pod, I get this error: Warning Unhealthy 46s (x71 over 10m) kubelet Readiness probe failed: HTTP probe failed with statuscode: 503. Any idea on this issue? I am using Amazon EC2 RHEL 8 instances.
[ec2-user@master ~]$ kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-64897985d-jbczb 0/1 Running 0 15m
kube-system coredns-64897985d-pxxxx 0/1 Running 0 15m
kube-system etcd-master 1/1 Running 13 15m
kube-system kube-apiserver-master 1/1 Running 7 15m
kube-system kube-controller-manager-master 1/1 Running 1 15m
kube-system kube-proxy-2b9vp 1/1 Running 0 15m
kube-system kube-proxy-8sbw8 1/1 Running 0 8m18s
kube-system kube-proxy-k9w7g 1/1 Running 0 7m59s
kube-system kube-scheduler-master 1/1 Running 7 15m
kube-system weave-net-5hrbz 2/2 Running 0 7m59s
kube-system weave-net-fk4c6 2/2 Running 0 8m18s
kube-system weave-net-zpwpg 2/2 Running 0 11m
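For reference, a few commands that usually narrow this kind of readiness failure down (the pod names are taken from the output above; the weave container name is an assumption based on the standard Weave Net DaemonSet, whose containers are weave and weave-npc):

kubectl -n kube-system describe pod coredns-64897985d-jbczb
kubectl -n kube-system logs coredns-64897985d-jbczb      # coredns usually logs why its readiness endpoint returns 503
kubectl -n kube-system logs weave-net-5hrbz -c weave     # check the CNI side on the node running that coredns pod
kubectl get nodes -o wide                                # confirm all nodes are Ready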

Unable to get ArgoCD working on EC2 running centos 7

I am trying to run argocd on my EC2 instance running CentOS 7 by following the official documentation and the EKS workshop from AWS, but its pods are stuck in Pending state, while all pods in the kube-system namespace are running fine.
Below is the output of kubectl get pods --all-namespaces:
NAMESPACE NAME READY STATUS RESTARTS AGE
argocd argocd-application-controller-5785f6b79-nvg7n 0/1 Pending 0 29s
argocd argocd-dex-server-7f5d7d6645-gprpd 0/1 Pending 0 19h
argocd argocd-redis-cccbb8f7-vb44n 0/1 Pending 0 19h
argocd argocd-repo-server-67ddb49495-pnw5k 0/1 Pending 0 19h
argocd argocd-server-6bcbf7997d-jqqrw 0/1 Pending 0 19h
kube-system calico-kube-controllers-56b44cd6d5-tzgdm 1/1 Running 0 19h
kube-system calico-node-4z9tx 1/1 Running 0 19h
kube-system coredns-f9fd979d6-8d6hm 1/1 Running 0 19h
kube-system coredns-f9fd979d6-p9dq6 1/1 Running 0 19h
kube-system etcd-ip-10-1-3-94.us-east-2.compute.internal 1/1 Running 0 19h
kube-system kube-apiserver-ip-10-1-3-94.us-east-2.compute.internal 1/1 Running 0 19h
kube-system kube-controller-manager-ip-10-1-3-94.us-east-2.compute.internal 1/1 Running 0 19h
kube-system kube-proxy-tkp7k 1/1 Running 0 19h
kube-system kube-scheduler-ip-10-1-3-94.us-east-2.compute.internal 1/1 Running 0 19h
The same configuration works fine on my local Mac, and I've made sure that the docker and kubernetes services are up and running. I tried deleting pods and reconfiguring ArgoCD, but every time the result remained the same.
Being new to ArgoCD, I am unable to figure out the reason. Please let me know where I am going wrong. Thanks!
I figured out what the problem was by running:
kubectl describe pods <name> -n argocd
It gave output ending with FailedScheduling:
...
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 3m (x5 over 7m2s) default-scheduler 0/1 nodes are available: 1 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate.
Then, referring to this GitHub issue, I figured out that I needed to run:
kubectl taint nodes --all node-role.kubernetes.io/master-
After this command, the pods started to work and transitioned from Pending to Running, with kubectl describe pods showing:
...
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 3m (x5 over 7m2s) default-scheduler 0/1 nodes are available: 1 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate.
Normal Scheduled 106s default-scheduler Successfully assigned argocd/argocd-server-7d44dfbcc4-qfj6m to ip-XX-XX-XX-XX.<region>.compute.internal
Normal Pulling 105s kubelet Pulling image "argoproj/argocd:v1.7.6"
Normal Pulled 81s kubelet Successfully pulled image "argoproj/argocd:v1.7.6" in 23.779457251s
Normal Created 72s kubelet Created container argocd-server
Normal Started 72s kubelet Started container argocd-server
From this error and its resolution I've learned to always start with kubectl describe pods when troubleshooting such errors.
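As an alternative to removing the master taint cluster-wide, a toleration can be added to an individual Deployment's pod template so that only those pods are allowed onto the tainted master. This is not from the ArgoCD manifests, just the standard toleration syntax for the taint shown in the event above:

spec:
  template:
    spec:
      tolerations:
      # tolerate the taint reported in the FailedScheduling event
      - key: "node-role.kubernetes.io/master"
        operator: "Exists"
        effect: "NoSchedule"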

In AWS EKS, how to install and access etcd, kube-apiserver, and other things?

I am learning AWS EKS now and I want to know how to access etcd, kube-apiserver, and the other control plane components.
For example, when we run the command below in minikube, we can find etcd-minikube and kube-apiserver-minikube:
[vagrant@localhost ~]$ kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-6955765f44-lrt6z 1/1 Running 0 176d
kube-system coredns-6955765f44-xbtc2 1/1 Running 1 176d
kube-system etcd-minikube 1/1 Running 1 176d
kube-system kube-addon-manager-minikube 1/1 Running 1 176d
kube-system kube-apiserver-minikube 1/1 Running 1 176d
kube-system kube-controller-manager-minikube 1/1 Running 1 176d
kube-system kube-proxy-69mqp 1/1 Running 1 176d
kube-system kube-scheduler-minikube 1/1 Running 1 176d
kube-system storage-provisioner 1/1 Running 2 176d
And then, we can access them by below command:
[vagrant@localhost ~]$ kubectl exec -it -n kube-system kube-apiserver-minikube -- /bin/sh
# kube-apiserver
W0715 13:56:17.176154 21 services.go:37] No CIDR for service cluster IPs specified.
...
My question: I want to do something like the above example in AWS EKS, but I cannot find kube-apiserver:
xiaojie#ubuntu:~/environment/calico_resources$ kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system aws-node-flv95 1/1 Running 0 23h
kube-system aws-node-kpkv9 1/1 Running 0 23h
kube-system aws-node-rxztq 1/1 Running 0 23h
kube-system coredns-cdd78ff87-bjnmg 1/1 Running 0 23h
kube-system coredns-cdd78ff87-f7rl4 1/1 Running 0 23h
kube-system kube-proxy-5wv5m 1/1 Running 0 23h
kube-system kube-proxy-6846w 1/1 Running 0 23h
kube-system kube-proxy-9rbk4 1/1 Running 0 23h
AWS EKS is a managed Kubernetes offering. Control plane components such as the API server and etcd are installed, managed, and upgraded by AWS, so you can neither see these components nor exec into them.
In AWS EKS you only work with the worker nodes.
In the usual shared-responsibility picture, you are on the left (the worker nodes and workloads) and AWS is on the right (the control plane).
EKS is not a managed service for the whole Kubernetes cluster; it is a managed service only for the Kubernetes master nodes.
That's why it's worth operating EKS with tools (e.g. Terraform) that help provision the whole cluster in no time, as explained here.
As Arghya Sadhu and Abdennour TOUMI said, EKS encapsulates most control plane components except kube-proxy. See here.
Amazon Elastic Kubernetes Service (Amazon EKS) is a managed service that makes it easy for you to run Kubernetes on AWS without needing to stand up or maintain your own Kubernetes control plane.
So I tried to find a way to configure these components instead of exec'ing into their containers and running commands, but I finally gave up. See this GitHub issue.
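What you can do on EKS is inspect and configure the managed control plane through the AWS APIs rather than kubectl exec. A hedged sketch (my-cluster is a placeholder name; log output goes to CloudWatch Logs):

# show the managed API server endpoint and Kubernetes version
aws eks describe-cluster --name my-cluster \
  --query "cluster.{endpoint:endpoint,version:version}"

# enable control plane logging (api, audit, scheduler, ...) to CloudWatch
aws eks update-cluster-config --name my-cluster \
  --logging '{"clusterLogging":[{"types":["api","audit","authenticator","controllerManager","scheduler"],"enabled":true}]}'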

How do i connect AWS RDS to mysql Kubernetes Pod

So I have launched WordPress by following the documentation at https://kubernetes.io/docs/tutorials/stateful-application/mysql-wordpress-persistent-volume/, but I see that MySQL is running as a pod. My requirement is to connect the running MySQL pod to AWS RDS so that I can dump my existing data into it. Please guide me.
pod/wordpress-5f444c8849-2rsfd 1/1 Running 0 27m
pod/wordpress-mysql-ccc857f6c-7hj9m 1/1 Running 0 27m
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/kubernetes ClusterIP 10.100.0.1 <none> 443/TCP 29m
service/wordpress LoadBalancer 10.100.148.152 a4a868cfc752f41fdb4397e3133c7001-1148081355.us-east-1.elb.amazonaws.com 80:32116/TCP 27m
service/wordpress-mysql ClusterIP None <none> 3306/TCP 27m
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/wordpress 1/1 1 1 27m
deployment.apps/wordpress-mysql 1/1 1 1 27m
NAME DESIRED CURRENT READY AGE
replicaset.apps/wordpress-5f444c8849 1 1 1 27m
replicaset.apps/wordpress-mysql-ccc857f6c 1 1 1 27m
Once you have MySQL running on K8s as a ClusterIP service, it is accessible only inside the cluster via its own service address:
wordpress-mysql:3306
You can double-check your database by recreating the service as a NodePort; then you can connect from outside the cluster with a SQL administration tool like MySQL Workbench and manage it there.
here is an example: https://www.youtube.com/watch?v=s0uIvplOqJM
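A minimal sketch of that NodePort idea, assuming the MySQL pods carry the app: wordpress / tier: mysql labels used in the linked tutorial (nodePort 32306 is an arbitrary choice in the 30000-32767 range):

apiVersion: v1
kind: Service
metadata:
  name: wordpress-mysql-nodeport
spec:
  type: NodePort
  selector:
    app: wordpress
    tier: mysql
  ports:
  - port: 3306          # service port inside the cluster
    targetPort: 3306    # container port on the mysql pod
    nodePort: 32306     # port opened on every node

After applying it, Workbench or mysqldump can reach the database at <node-public-IP>:32306, provided the node's security group allows that port; from there you can dump the data and import it into RDS.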

Are these pods inside the overlay network?

How can I confirm whether or not some of the pods in this Kubernetes cluster are running inside the Calico overlay network?
Pod Names:
Specifically, when I run kubectl get pods --all-namespaces, only two of the pods in the resulting list have the word calico in their names. The other pods, like etcd and kube-controller-manager, do NOT have the word calico in their names. From what I read online, the other pods should have the word calico in their names.
$ kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system calico-node-l6jd2 1/2 Running 0 51m
kube-system calico-node-wvtzf 1/2 Running 0 51m
kube-system coredns-86c58d9df4-44mpn 0/1 ContainerCreating 0 40m
kube-system coredns-86c58d9df4-j5h7k 0/1 ContainerCreating 0 40m
kube-system etcd-ip-10-0-0-128.us-west-2.compute.internal 1/1 Running 0 50m
kube-system kube-apiserver-ip-10-0-0-128.us-west-2.compute.internal 1/1 Running 0 51m
kube-system kube-controller-manager-ip-10-0-0-128.us-west-2.compute.internal 1/1 Running 0 51m
kube-system kube-proxy-dqmb5 1/1 Running 0 51m
kube-system kube-proxy-jk7tl 1/1 Running 0 51m
kube-system kube-scheduler-ip-10-0-0-128.us-west-2.compute.internal 1/1 Running 0 51m
stdout from applying calico
The stdout that resulted from applying calico is as follows:
$ sudo kubectl apply -f https://docs.projectcalico.org/v3.3/getting-started/kubernetes/installation/hosted/kubernetes-datastore/calico-networking/1.7/calico.yaml
configmap/calico-config created
service/calico-typha created
deployment.apps/calico-typha created
poddisruptionbudget.policy/calico-typha created
daemonset.extensions/calico-node created
serviceaccount/calico-node created
customresourcedefinition.apiextensions.k8s.io/felixconfigurations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/bgppeers.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/bgpconfigurations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ippools.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/hostendpoints.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/clusterinformations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/globalnetworkpolicies.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/globalnetworksets.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/networkpolicies.crd.projectcalico.org created
How the cluster was created:
The commands that installed the cluster are:
$ sudo -i
# kubeadm init --kubernetes-version 1.13.1 --pod-network-cidr 192.168.0.0/16 | tee kubeadm-init.out
# exit
$ sudo mkdir -p $HOME/.kube
$ sudo chown -R lnxcfg:lnxcfg /etc/kubernetes
$ sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
$ sudo chown $(id -u):$(id -g) $HOME/.kube/config
$ sudo kubectl apply -f https://docs.projectcalico.org/v3.3/getting-started/kubernetes/installation/hosted/kubernetes-datastore/calico-networking/1.7/calico.yaml
$ sudo kubectl apply -f https://docs.projectcalico.org/v3.1/getting-started/kubernetes/installation/hosted/rbac-kdd.yaml
This is running on AWS in Amazon Linux 2 host machines.
As per the official docs (https://docs.projectcalico.org/v3.6/getting-started/kubernetes/), this looks fine. The docs contain further commands to run, and the demo on the front page shows some verification steps, e.g.:
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system calico-kube-controllers-6ff88bf6d4-tgtzb 1/1 Running 0 2m45s
kube-system calico-node-24h85 2/2 Running 0 2m43s
kube-system coredns-846jhw23g9-9af73 1/1 Running 0 4m5s
kube-system coredns-846jhw23g9-hmswk 1/1 Running 0 4m5s
kube-system etcd-jbaker-1 1/1 Running 0 6m22s
kube-system kube-apiserver-jbaker-1 1/1 Running 0 6m12s
kube-system kube-controller-manager-jbaker-1 1/1 Running 0 6m16s
kube-system kube-proxy-8fzp2 1/1 Running 0 5m16s
kube-system kube-scheduler-jbaker-1 1/1 Running 0 5m41s
Could you please let me know where you found the literature mentioning that the other pods would also have calico in their names?
As far as I know, in the kube-system namespace the scheduler, API server, controller manager and proxy are provided by native Kubernetes, hence their names don't contain calico.
And one more thing: Calico applies to the Pods you create for the actual applications you wish to run on k8s, not to the Kubernetes control plane.
Are you facing any problem with the cluster creation? Then the question would be different.
Hope this helps.
This is normal and expected behavior: only a few pods have names starting with calico. They are created when you initialize Calico or add new nodes to your cluster.
etcd-*, kube-apiserver-*, kube-controller-manager-*, coredns-*, kube-proxy-*, kube-scheduler-* are mandatory system components; these pods have no dependency on Calico, hence their names are system-based.
Also, as @Jonathan_M already wrote, Calico doesn't apply to the K8s control plane, only to newly created pods.
You can verify whether your pods are inside the overlay network or not by using kubectl get pods --all-namespaces -o wide
My example:
kubectl get pods --all-namespaces -o wide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
default my-nginx-76bf4969df-4fwgt 1/1 Running 0 14s 192.168.1.3 kube-calico-2 <none> <none>
default my-nginx-76bf4969df-h9w9p 1/1 Running 0 14s 192.168.1.5 kube-calico-2 <none> <none>
default my-nginx-76bf4969df-mh46v 1/1 Running 0 14s 192.168.1.4 kube-calico-2 <none> <none>
kube-system calico-node-2b8rx 2/2 Running 0 70m 10.132.0.12 kube-calico-1 <none> <none>
kube-system calico-node-q5n2s 2/2 Running 0 60m 10.132.0.13 kube-calico-2 <none> <none>
kube-system coredns-86c58d9df4-q22lx 1/1 Running 0 74m 192.168.0.2 kube-calico-1 <none> <none>
kube-system coredns-86c58d9df4-q8nmt 1/1 Running 0 74m 192.168.1.2 kube-calico-2 <none> <none>
kube-system etcd-kube-calico-1 1/1 Running 0 73m 10.132.0.12 kube-calico-1 <none> <none>
kube-system kube-apiserver-kube-calico-1 1/1 Running 0 73m 10.132.0.12 kube-calico-1 <none> <none>
kube-system kube-controller-manager-kube-calico-1 1/1 Running 0 73m 10.132.0.12 kube-calico-1 <none> <none>
kube-system kube-proxy-6zsxc 1/1 Running 0 74m 10.132.0.12 kube-calico-1 <none> <none>
kube-system kube-proxy-97xsf 1/1 Running 0 60m 10.132.0.13 kube-calico-2 <none> <none>
kube-system kube-scheduler-kube-calico-1 1/1 Running 0 73m 10.132.0.12 kube-calico-1 <none> <none>
kubectl get nodes --all-namespaces -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
kube-calico-1 Ready master 84m v1.13.4 10.132.0.12 <none> Ubuntu 16.04.5 LTS 4.15.0-1023-gcp docker://18.9.2
kube-calico-2 Ready <none> 70m v1.13.4 10.132.0.13 <none> Ubuntu 16.04.6 LTS 4.15.0-1023-gcp docker://18.9.2
You can see that the K8s control plane pods use the node (host) IPs, while the nginx deployment pods already use IPs from the Calico 192.168.0.0/16 range.
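To reproduce that check on your own cluster, a quick sketch (my-nginx is just a throwaway test deployment):

kubectl create deployment my-nginx --image=nginx
kubectl scale deployment my-nginx --replicas=3
# pod IPs should fall inside the 192.168.0.0/16 range passed to kubeadm via --pod-network-cidr
kubectl get pods -o wide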