How to deploy a web service into Amazon EKS?

I have configured a cluster in EKS but I am not able to deploy a web service into it. I tried the steps mentioned in https://github.com/spjenk/HelloSpringEKS but failed to create tiller-deploy.
C:\WINDOWS\system32>kubectl get pod --all-namespaces
NAMESPACE     NAME                            READY   STATUS              RESTARTS   AGE
kube-system   tiller-deploy-5d6cc99fc-hn6dm   0/1     ContainerCreating   0          6h45m
The tiller-deploy pod is just stuck in ContainerCreating status.
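To see why it is stuck, the usual first step is to describe the pod and read the events at the bottom (a generic diagnostic sketch using the pod name from the output above):
$ kubectl describe pod tiller-deploy-5d6cc99fc-hn6dm --namespace kube-system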
Any help would be appreciated.
Thanks.

Related

EKS cluster upgrade fails with "Kubelet version of Fargate pods must be updated to match cluster version"

I have an EKS cluster v1.23 with Fargate nodes. The cluster and nodes are on v1.23.x.
$ kubectl version --short
Server Version: v1.23.14-eks-ffeb93d
Fargate nodes are also in v1.23.14
$ kubectl get nodes
NAME                                          STATUS   ROLES    AGE     VERSION
fargate-ip-x-x-x-x.region.compute.internal    Ready    <none>   7m30s   v1.23.14-eks-a1bebd3
fargate-ip-x-x-x-xx.region.compute.internal   Ready    <none>   7m11s   v1.23.14-eks-a1bebd3
When I tried to upgrade the cluster to 1.24 from the AWS console, it gave this error:
Kubelet version of Fargate pods must be updated to match cluster version 1.23 before updating cluster version; Please recycle all offending pod replicas
What are the other things I have to check?
From your question you only have 2 nodes, so you are likely only running CoreDNS. Try kubectl scale deployment coredns --namespace kube-system --replicas 0, then upgrade. You can scale it back to 2 once the control plane upgrade is completed. Also, make sure you have selected the correct cluster in the console.
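A minimal sketch of that sequence (assuming the default two CoreDNS replicas; the upgrade itself still happens from the console or CLI):
# scale CoreDNS to zero so no Fargate pod is left running the old kubelet version
$ kubectl scale deployment coredns --namespace kube-system --replicas 0
# ...upgrade the control plane from 1.23 to 1.24...
# scale CoreDNS back; the new pods land on Fargate nodes running the new version
$ kubectl scale deployment coredns --namespace kube-system --replicas 2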

AWS EKS fargate coredns ImagePullBackOff

I'm trying to deploy a simple tutorial app to a new Fargate-based Kubernetes cluster.
Unfortunately I'm stuck on ImagePullBackOff for the coredns pod:
Events:
  Type     Reason           Age                  From               Message
  ----     ------           ----                 ----               -------
  Warning  LoggingDisabled  5m51s                fargate-scheduler  Disabled logging because aws-logging configmap was not found. configmap "aws-logging" not found
  Normal   Scheduled        4m11s                fargate-scheduler  Successfully assigned kube-system/coredns-86cb968586-mcdpj to fargate-ip-172-31-55-205.eu-central-1.compute.internal
  Warning  Failed           100s                 kubelet            Failed to pull image "602401143452.dkr.ecr.eu-central-1.amazonaws.com/eks/coredns:v1.8.0-eksbuild.1": rpc error: code = Unknown desc = failed to pull and unpack image "602401143452.dkr.ecr.eu-central-1.amazonaws.com/eks/coredns:v1.8.0-eksbuild.1": failed to resolve reference "602401143452.dkr.ecr.eu-central-1.amazonaws.com/eks/coredns:v1.8.0-eksbuild.1": failed to do request: Head "https://602401143452.dkr.ecr.eu-central-1.amazonaws.com/v2/eks/coredns/manifests/v1.8.0-eksbuild.1": dial tcp 3.122.9.124:443: i/o timeout
  Warning  Failed           100s                 kubelet            Error: ErrImagePull
  Normal   BackOff          99s                  kubelet            Back-off pulling image "602401143452.dkr.ecr.eu-central-1.amazonaws.com/eks/coredns:v1.8.0-eksbuild.1"
  Warning  Failed           99s                  kubelet            Error: ImagePullBackOff
  Normal   Pulling          87s (x2 over 4m10s)  kubelet            Pulling image "602401143452.dkr.ecr.eu-central-1.amazonaws.com/eks/coredns:v1.8.0-eksbuild.1"
While googling I found https://aws.amazon.com/premiumsupport/knowledge-center/eks-ecr-troubleshooting/
It contains the following list:
To resolve this error, confirm the following:
- The subnet for your worker node has a route to the internet. Check the route table associated with your subnet.
- The security group associated with your worker node allows outbound internet traffic.
- The ingress and egress rule for your network access control lists (ACLs) allows access to the internet.
Since I created both my private subnets and their NAT gateways manually, I tried to locate an issue there but couldn't find anything. They, as well as the security groups and ACLs, look fine to me.
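For what it's worth, those checks can also be run from the AWS CLI (a rough sketch; the subnet and security group IDs are placeholders):
# route table attached to the worker/Fargate subnet - look for a 0.0.0.0/0 route to a NAT gateway
$ aws ec2 describe-route-tables --filters Name=association.subnet-id,Values=subnet-xxxxxxxx
# egress rules of the security group used by the nodes
$ aws ec2 describe-security-groups --group-ids sg-xxxxxxxx
# network ACL associated with the subnet
$ aws ec2 describe-network-acls --filters Name=association.subnet-id,Values=subnet-xxxxxxxx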
I even added the AmazonEC2ContainerRegistryReadOnly policy to my EKS role, but after issuing kubectl rollout restart -n kube-system deployment coredns the result is unfortunately the same: ImagePullBackOff.
Unfortunately I've run out of ideas and I'm stuck. Any help troubleshooting this would be greatly appreciated. Thanks.
Edit:
After creating a new cluster via eksctl as @mreferre suggested in his comment, I get an RBAC error with this link: https://docs.aws.amazon.com/eks/latest/userguide/troubleshooting_iam.html#security-iam-troubleshoot-cannot-view-nodes-or-workloads
I'm not sure what is going on since I already have
Edit 2:
The cluster created via the AWS Console (web interface) doesn't have the aws-auth configmap. I retrieved the configmap below using kubectl edit configmap aws-auth -n kube-system:
apiVersion: v1
data:
  mapRoles: |
    - groups:
      - system:bootstrappers
      - system:nodes
      - system:node-proxier
      rolearn: arn:aws:iam::370179080679:role/eksctl-tutorial-cluster-FargatePodExecutionRole-1J605HWNTGS2Q
      username: system:node:{{SessionName}}
kind: ConfigMap
metadata:
  creationTimestamp: "2021-04-08T18:42:59Z"
  name: aws-auth
  namespace: kube-system
  resourceVersion: "918"
  selfLink: /api/v1/namespaces/kube-system/configmaps/aws-auth
  uid: d9a21964-a8bf-49e9-800f-650320b7444e
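The RBAC error linked in the first edit is usually fixed by mapping the IAM identity used in the console into this configmap. A rough sketch of the kind of mapUsers entry that gets added under data: (the account ID and user name are placeholders, not taken from this cluster):
  mapUsers: |
    - userarn: arn:aws:iam::111122223333:user/console-admin
      username: console-admin
      groups:
        - system:masters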
Creating an answer to sum up the discussion in the comments, which was deemed acceptable. The most common (and arguably easiest) way to set up an EKS cluster with Fargate support is to use EKSCTL and create the cluster with eksctl create cluster --fargate. This will build all the plumbing for you, and you will get a cluster with no EC2 instances nor managed node groups, with the two CoreDNS pods deployed on two Fargate instances. Note that when you deploy with EKSCTL via the command line you may end up using different roles/users between your CLI and the console. This may result in access-denied issues. The best course of action would be to use a non-root user to log in to the AWS console and use CloudShell to deploy with EKSCTL (CloudShell will inherit the same console user identity). {More info in the comments}
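A minimal sketch of that EKSCTL invocation (the cluster name and region are placeholders):
# creates the VPC, Fargate profile, pod execution role and aws-auth wiring for you
$ eksctl create cluster --name tutorial-cluster --region eu-central-1 --fargate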

How to work with AWS cloud controller manager

I am trying to expose my applications running in my Kubernetes cluster through an AWS load balancer.
I followed the document https://cloudyuga.guru/blog/cloud-controller-manager and got to the point where I added --cloud-provider=external in the kubeadm.conf file.
But that document is based on the DigitalOcean cloud and I'm working on AWS. I'm confused whether I have to run a deployment.yaml file to get the pods that are in Pending status running; if so, please provide me the link. I'm stuck at this point.
NAME                                                    READY   STATUS    RESTARTS   AGE
coredns-66bff467f8-dlx76                                0/1     Pending   0          3m32s
coredns-66bff467f8-svb6z                                0/1     Pending   0          3m32s
etcd-ip-172-31-74-144.ec2.internal                      1/1     Running   0          3m38s
kube-apiserver-ip-172-31-74-144.ec2.internal            1/1     Running   0          3m38s
kube-controller-manager-ip-172-31-74-144.ec2.internal   1/1     Running   0          3m37s
kube-proxy-rh8g4                                        1/1     Running   0          3m32s
kube-proxy-vsvlt                                        1/1     Running   0          3m28s
kube-scheduler-ip-172-31-74-144.ec2.internal            1/1     Running   0          3m37s
The coredns pods are pending because you have not installed a Pod Network add-on yet. From the docs here you can choose any supported Pod Network add-on. For example, to use Calico:
kubectl apply -f https://docs.projectcalico.org/v3.14/manifests/calico.yaml
After the Pod Network add-on is installed the coredns pods should come up.
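A quick way to confirm CoreDNS comes up after the add-on is installed (a generic check, assuming the standard k8s-app=kube-dns label on the CoreDNS pods):
$ kubectl get pods --namespace kube-system -l k8s-app=kube-dns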

How can I troubleshoot a Rancher HA deployment cert-manager issue on AWS?

I am new to both Rancher and K8s.
I walked through the Rancher HA documentation and deployed a 3-node cluster on AWS with a Layer 4 load balancer configured.
Everything indicates that the deployment was successful, but I am having issues with certificates. When I go to the site after install (https://rancher.domain.net), I am prompted with an untrusted-site warning. I accept the risk, then the page just hangs. I can see the Rancher favicon, but the page never loads.
I opted for the self-signed certs to get it up and running. My AWS NLB just forwards 443 and 80 to the target groups and is not using an ACM-provided cert.
I checked these two settings per the documentation:
$ kubectl -n cattle-system describe certificate
No resources found in cattle-system namespace.
$ kubectl -n cattle-system describe issuer
No resources found in cattle-system namespace.
Describe issuer originally showed what looked like appropriate output, but that is no longer showing anything.
I ran this command:
$ kubectl get pods --namespace cert-manager
NAME                                       READY   STATUS    RESTARTS   AGE
cert-manager-**********-*****              1/1     Running   0          34m
cert-manager-cainjector-**********-*****   1/1     Running   0          34m
cert-manager-webhook-**********-*****      1/1     Running   0          34m
At this point, I am beyond my experience and would appreciate some pointers on how to troubleshoot this.
List the services. What is the status of the rancher service?
kubectl -n <namespace> get services
Can you describe the rancher service object?
kubectl -n <namespace> describe service <rancher service>
or
kubectl -n <namespace> get service <rancher service> -o json
Is it of type LoadBalancer, i.e. did you let the Kubernetes AWS cloud provider create the NLB, or did you create it outside of K8s? If you can, it is better to let Kubernetes create the LB.
Reference for tweaking the cloud providers via annotations.
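As an illustration of letting Kubernetes create the load balancer, a rough sketch of a Service that asks the AWS cloud provider for an NLB via an annotation (the names and selector are placeholders; the exact annotations depend on your cloud provider/controller version):
apiVersion: v1
kind: Service
metadata:
  name: rancher
  namespace: cattle-system
  annotations:
    # request a Network Load Balancer instead of a Classic ELB
    service.beta.kubernetes.io/aws-load-balancer-type: nlb
spec:
  type: LoadBalancer
  selector:
    app: rancher
  ports:
    - name: https
      port: 443
      targetPort: 443
    - name: http
      port: 80
      targetPort: 80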

How to get k8s master logs on EKS?

I am looking for these logs:
/var/log/kube-apiserver.log
/var/log/kube-scheduler.log
/var/log/kube-controller-manager.log
In EKS the user does not have access to the control plane and can't see these files directly.
I am aware of the CloudTrail integration announced by AWS, but it shows events not from the k8s API, but from the AWS EKS API, such as the CreateCluster event. It also remains an open question how to get the scheduler and controller manager logs.
There are no pods for the API server or the controller manager in the pod list:
$ kubectl get po --all-namespaces
NAMESPACE     NAME                        READY   STATUS    RESTARTS   AGE
kube-system   aws-node-9f4lm              1/1     Running   0          2h
kube-system   aws-node-wj2cg              1/1     Running   0          2h
kube-system   kube-dns-64b69465b4-4gw6n   3/3     Running   0          2h
kube-system   kube-proxy-7mt7l            1/1     Running   0          2h
kube-system   kube-proxy-vflzv            1/1     Running   0          2h
There are no master nodes in the node list:
$ kubectl get nodes
NAME                        STATUS   ROLES    AGE   VERSION
ip-10-0-0-92.ec2.internal   Ready    <none>   9m    v1.10.3
ip-10-0-1-63.ec2.internal   Ready    <none>   9m    v1.10.3
Logs can be sent to CloudWatch (not free of charge). The following logs can be individually selected to be sent to CloudWatch:
API server
Audit
Authenticator
Controller Manager
Scheduler
Logging can be enabled via the UI or the AWS CLI. See Amazon EKS Control Plane Logging.
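For example, via the AWS CLI (cluster name and region are placeholders; the log types match the list above):
$ aws eks update-cluster-config --region eu-west-1 --name my-cluster \
    --logging '{"clusterLogging":[{"types":["api","audit","authenticator","controllerManager","scheduler"],"enabled":true}]}'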
Things like the kube-apiserver logs, the kube-scheduler logs, the kube-controller-manager logs, etc. will be available in CloudWatch Logs, while (as you have stated) EKS API calls will be logged to CloudTrail.
I take that back, I guess AWS EKS has not gotten around to that yet. You will need to use an EFK stack to get the logs.
Someone has already put together a quick how-to:
https://github.com/aws-samples/aws-workshop-for-kubernetes/tree/master/02-path-working-with-clusters/204-cluster-logging-with-EFK