kubectl wait for Service on AWS EKS to expose Elastic Load Balancer (ELB) address reported in .status.loadBalancer.ingress field

As the kubernetes.io docs state about a Service of type LoadBalancer:
On cloud providers which support external load balancers, setting the
type field to LoadBalancer provisions a load balancer for your
Service. The actual creation of the load balancer happens
asynchronously, and information about the provisioned balancer is
published in the Service's .status.loadBalancer field.
On AWS Elastic Kubernetes Service (EKS) an AWS Load Balancer is provisioned that load balances network traffic (see the AWS docs & the example project on GitHub provisioning an EKS cluster with Pulumi). Assuming we have a Deployment ready with the selector app=tekton-dashboard (it's the default Tekton dashboard you can deploy as stated in the docs), a Service of type LoadBalancer defined in tekton-dashboard-service.yml could look like this:
apiVersion: v1
kind: Service
metadata:
  name: tekton-dashboard-external-svc-manual
spec:
  selector:
    app: tekton-dashboard
  ports:
    - protocol: TCP
      port: 80
      targetPort: 9097
  type: LoadBalancer
If we create the Service in our cluster with kubectl apply -f tekton-dashboard-service.yml -n tekton-pipelines, the AWS ELB gets created automatically.
There's only one problem: the .status.loadBalancer field is populated with the ingress[0].hostname field asynchronously and is therefore not available immediately. We can check this if we run the following commands together:
kubectl apply -f tekton-dashboard-service.yml -n tekton-pipelines && \
kubectl get service/tekton-dashboard-external-svc-manual -n tekton-pipelines --output=jsonpath='{.status.loadBalancer}'
The output will be an empty field:
{}%
So if we want to run this setup in a CI pipeline (e.g. GitHub Actions, see the example project's workflow provision.yml), we need to somehow wait until the .status.loadBalancer field has been populated with the AWS ELB's hostname. How can we achieve this using kubectl wait?

TLDR;
Prior to Kubernetes v1.23 it's not possible with kubectl wait alone, but it can be done using until together with grep like this:
until kubectl get service/tekton-dashboard-external-svc-manual -n tekton-pipelines --output=jsonpath='{.status.loadBalancer}' | grep "ingress"; do : ; done
or even enhance the command using timeout (brew install coreutils on a Mac) to prevent it from running infinitely:
timeout 10s bash -c 'until kubectl get service/tekton-dashboard-external-svc-manual -n tekton-pipelines --output=jsonpath="{.status.loadBalancer}" | grep "ingress"; do : ; done'
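Since GNU timeout exits non-zero (status 124) when the time limit is hit, a CI step can be made to fail explicitly if the ELB never shows up. A small sketch (the 60s limit, the sleep interval and the message are arbitrary choices):
# Fail fast in CI if the ELB hostname never appears (GNU timeout returns 124 on expiry)
timeout 60s bash -c 'until kubectl get service/tekton-dashboard-external-svc-manual -n tekton-pipelines --output=jsonpath="{.status.loadBalancer}" | grep -q "ingress"; do sleep 2; done' \
  || { echo "Timed out waiting for the AWS ELB hostname"; exit 1; }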
Problem with kubectl wait & the solution explained in detail
As stated in this SO Q&A and the Kubernetes issues kubectl wait unable to not wait for service ready #80828 & kubectl wait on arbitrary jsonpath #83094, using kubectl wait for this isn't possible in current Kubernetes versions.
The main reason is that kubectl wait assumes the status field of a Kubernetes resource queried with kubectl get service/xyz --output=yaml contains a conditions list, which a Service doesn't have. Using jsonpath here would be a solution and will be possible from Kubernetes v1.23 on (see this merged PR). But until this version is broadly available in managed Kubernetes clusters like EKS, we need another solution. And it should also be available as a "one-liner", just as kubectl wait would be.
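To illustrate the point: for resources that do publish a conditions list, kubectl wait works as expected today. A sketch contrasting the two cases (the Deployment name is an assumption based on the default Tekton dashboard install):
# Works: Deployments expose status.conditions, so the condition form has something to match
kubectl wait deployment/tekton-dashboard -n tekton-pipelines --for=condition=Available --timeout=120s
# A Service exposes no status.conditions, so there is nothing for kubectl wait to evaluate on it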
A good starting point could be this superuser answer about "watching" the output of a command until a particular string is observed and then exiting:
until my_cmd | grep "String Im Looking For"; do : ; done
If we use this approach together with a kubectl get we can craft a command which will wait until the field ingress gets populated into the status.loadBalancer field in our Service:
until kubectl get service/tekton-dashboard-external-svc-manual -n tekton-pipelines --output=jsonpath='{.status.loadBalancer}' | grep "ingress"; do : ; done
This will wait until the ingress field has been populated and then print out the AWS ELB address (e.g. by running kubectl get service tekton-dashboard-external-svc-manual -n tekton-pipelines --output=jsonpath='{.status.loadBalancer.ingress[0].hostname}' thereafter):
$ until kubectl get service/tekton-dashboard-external-svc-manual -n tekton-pipelines --output=jsonpath='{.status.loadBalancer}' | grep "ingress"; do : ; done
{"ingress":[{"hostname":"a74b078064c7d4ba1b89bf4e92586af0-18561896.eu-central-1.elb.amazonaws.com"}]}
Now we have a one-liner command that behaves just like a kubectl wait for our Service to become available through the AWS LoadBalancer. We can double-check that this is working with the following commands combined (be sure to delete the Service using kubectl delete service/tekton-dashboard-external-svc-manual -n tekton-pipelines before you execute it, because otherwise the Service incl. the AWS LoadBalancer already exists):
kubectl apply -f tekton-dashboard-service.yml -n tekton-pipelines && \
until kubectl get service/tekton-dashboard-external-svc-manual -n tekton-pipelines --output=jsonpath='{.status.loadBalancer}' | grep "ingress"; do : ; done && \
kubectl get service tekton-dashboard-external-svc-manual -n tekton-pipelines --output=jsonpath='{.status.loadBalancer.ingress[0].hostname}'
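In a pipeline you would typically capture that hostname for later steps; a small sketch (the variable name is arbitrary):
# Store the ELB hostname once the wait loop above has finished
ELB_HOSTNAME=$(kubectl get service tekton-dashboard-external-svc-manual -n tekton-pipelines --output=jsonpath='{.status.loadBalancer.ingress[0].hostname}')
# e.g. hand it to a smoke test or print it into the job log
echo "Tekton dashboard should become reachable at http://${ELB_HOSTNAME}"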
Here's also a full GitHub Actions pipeline run if you're interested.

Related

Kubectl show expanded command when using aliases or shorthand

Kubectl has many aliases like svc, po, deploy etc.
Is there a way to show the expanded command for a command with shorthand?
for example kubectl get po
to
kubectl get pods
On a similar question api-resources is used: What's kubernetes abbreviation for deployments?
But it only gives the top-level shorthands;
e.g. kubectl get svc expands to kubectl get services,
but kubectl create svc expands to kubectl create service.
Kindly guide,
Thanks
kubectl explain may be of interest e.g.:
kubectl explain po
KIND: Pod
VERSION: v1
DESCRIPTION:
Pod is a collection of containers that can run on a host. This resource is
created by clients and scheduled onto hosts.
There are plugins for kubectl too.
I've not tried it but kubectl explore may be worth a try.
Unfortunately, kubectl isn't documented by explainshell.com which would be a boon as it would also document the various flags e.g. -n (--namespace) and -o (--output).
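Regarding the api-resources hint from the question: kubectl api-resources prints a SHORTNAMES column, which is the list of get-style abbreviations the cluster actually knows about; for example:
# List every resource kind together with its registered short names
kubectl api-resources
# Narrow it down to a single resource, e.g. services
kubectl api-resources | grep -w services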

Istio Ingress not showing address (Kubeflow on AWS)

I'm trying to set up Kubeflow on AWS; I followed this tutorial to set up Kubeflow on AWS.
I used Dex instead of Cognito, with the following policy.
Then at the step kfctl apply -V -f kfctl_aws.yaml, I first received this error:
IAM for Service Account is not supported on non-EKS cluster
So to fix this I set the property enablePodIamPolicy: false
Then I retried and it successfully deployed Kubeflow. On checking the services' status using kubectl -n kubeflow get all, I found all services ready except the MPI operator.
Ignoring this, when I tried to run kubectl get ingress -n istio-system
I got the following result.
upon investigation using kubectl -n kubeflow logs $(kubectl get pods -n kubeflow --selector=app=aws-alb-ingress-controller --output=jsonpath={.items..metadata.name})
I found the following error:
E1104 12:09:37.446342 1 controller.go:217] kubebuilder/controller "msg"="Reconciler error" "error"="failed to reconcile LB managed SecurityGroup: failed to reconcile managed LoadBalancer securityGroup: UnauthorizedOperation: You are not authorized to perform this operation. Encoded authorization failure message: Lsvzm7f4rthL4Wxn6O8wiQL1iYXQUES_9Az_231BV7fyjgs7CHrwgUOVTNTf4334_C4voUogjSuCoF8GTOKhc5A7zAFzvcGUKT_FBs6if06KMQCLiCoujgfoqKJbG75pPsHHDFARIAdxNYZeIr4klmaUaxbQiFFxpvQsfT4ZkLMD7jmuQQcrEIw_U0MlpCQGkcvC69NRVVKjynIifxPBySubw_O81zifDp0Dk8ciRysaN1SbF85i8V3LoUkrtwROhUI9aQYJgYgSJ1CzWpfNLplbbr0X7YIrTDKb9sMhmlVicj_Yng0qFka_OVmBjHTnpojbKUSN96uBjGYZqC2VQXM1svLAHDTU1yRruFt5myqjhJ0fVh8Imhsk1Iqh0ytoO6eFoiLTWK4_Crb8XPS5tptBBzpEtgwgyk4QwOmzySUwkvNdDB-EIsTJcg5RQJl8ds4STNwqYV7XXeWxYQsmL1vGPVFY2lh_MX6q1jA9n8smxITE7F6AXsuRHTMP5q0jk58lbrUe-ZvuaD1b0kUTvpO3JtwWwxRd7jTKF7xde2InNOXwXxYCxHOw0sMX56Y1wLkvEDTLrNLZWOACS-T5o7mXDip43U0sSoUtMccu7lpfQzH3c7lNdr9s2Wgz4OqYaQYWsxNxRlRBdR11TRMweZt4Ta6K-7si5Z-rrcGmjG44NodT0O14Gzj-S4i6bK-qPYvUEsVeUl51ev_MsnBKtCXcMF8W6j9D7Oe3iGj13uvlVJEtq3OIoRjBXIuQQ012H0b3nQqlkoKEvsPAA_txAjgHXVzEVcM301_NDQikujTHdnxHNdzMcCfY7DQeeOE_2FT_hxYGlbuIg5vonRTT7MfSP8_LUuoIICGS81O-hDXvCLoomltb1fqCBBU2jpjIvNALMwNdJmMnwQOcIMI_QonRKoe5W43v\n\tstatus code: 403, request id: a9be63bd-2a3a-4a21-bb87-93532923ffd2" "controller"="alb-ingress-controller" "request"={"Namespace":"istio-system","Name":"istio-ingress"}
I don't understand what exactly went wrong with the security permissions?
The alb-ingress-controller doesn't have permission to create an ALB.
By setting enablePodIamPolicy: false, I assume you went for option 2 of the guide.
The alb-ingress-controller uses the kf-admin role, and the installer needs to attach to that role the policy found in aws-config/iam-alb-ingress-policy.json. Most probably it hasn't been attached, so you'll have to add it in IAM and attach it to the role.
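If you prefer doing that from the CLI, attaching the generated policy as an inline policy could look roughly like this (the role and policy names below are placeholders for illustration; use whatever your kfctl install created):
# Attach the ALB ingress policy to the role used by the alb-ingress-controller
aws iam put-role-policy \
  --role-name <your-kf-admin-role> \
  --policy-name alb-ingress-access \
  --policy-document file://aws-config/iam-alb-ingress-policy.json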
After doing that, observe the reconciler logs of the alb-ingress-controller to see if it's able to create the ALB.
It's likely the cluster-name in the aws-alb-ingress-controller-config is not correctly configured.
If that's the case, you should edit the ConfigMap to set the right cluster name using kubectl edit cm aws-alb-ingress-controller-config -n kubeflow.
After that you should delete the pod so it restarts (kubectl -n kubeflow delete pod $(kubectl get pods -n kubeflow --selector=app=aws-alb-ingress-controller --output=jsonpath={.items..metadata.name})).
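Once the controller pod is back up, you can watch the Ingress until the ALB address appears:
# The ADDRESS column should be filled in once the controller reconciles the ALB
kubectl get ingress -n istio-system --watch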

AWS EKS Kubernetes pods taking a lot of time to get READY

Github repo: https://github.com/oussamabouchikhi/udagram-microservices
After I configured kubectl with the AWS EKS cluster, I deployed the services using these commands:
kubectl apply -f env-configmap.yaml
kubectl apply -f env-secret.yaml
kubectl apply -f aws-secret.yaml
# this is repeated for all services
kubectl apply -f svcname-deploymant.yaml
kubectl apply -f svcname-service.yaml
But the pods took hours and are still in the Pending state, and when I run the command kubectl describe pod <POD_NAME> I get the following info:
reverseproxy-667b78569b-2c6hv pod: https://pastebin.com/3xF04SEx
udagram-api-feed-856bbc5c45-jcgtk pod: https://pastebin.com/5UqB79tU
udagram-api-users-6fbd5cbf4f-qbmdd pod: https://pastebin.com/Hiqe1LAM
From your kubectl describe pod <podname>
Warning FailedScheduling 2m19s (x136 over 158m) default-scheduler 0/2 nodes are available: 2 Too many pods.
When you see this, it means that your nodes in AWS EKS are full.
To solve this, you need to add more (or bigger) nodes.
You can also investigate your nodes, e.g. list your nodes with:
kubectl get nodes
and investigate a specific node (check how many pods it has capacity for, and how many pods are running on the node) with:
kubectl describe node <node-name>
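To confirm a node really is out of pod capacity, compare its allocatable pod count with what is already scheduled on it; a small sketch (the node name is a placeholder):
# Maximum number of pods the node advertises as allocatable
kubectl get node <node-name> --output=jsonpath='{.status.allocatable.pods}'
# Number of pods currently scheduled onto that node (across all namespaces)
kubectl get pods --all-namespaces --field-selector spec.nodeName=<node-name> --no-headers | wc -l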

Kubectl get deployments shows No resources found in default namespace

I am trying my hand at Kubernetes and I tried to deploy an image as a k8s service
root@KubernetesMiniKube:/usr/local/bin# kubectl run hello-minikube --image=k8s.gcr.io/echoserver:1.10 --port=8080
pod/hello-minikube created
root@KubernetesMiniKube:/usr/local/bin# kubectl get pod
NAME READY STATUS RESTARTS AGE
hello-minikube 1/1 Running 0 16s
root@KubernetesMiniKube:/usr/local/bin# kubectl get deployments
No resources found in default namespace.
Why am I seeing No resources found when there is actually a resource running inside the default namespace?
When you are using $ kubectl run it will create a pod.
In your example that's exactly what happened: it created a pod named hello-minikube.
pod/hello-minikube created
If you want to create a deployment instead
(Deployments represent a set of multiple, identical Pods with no unique identities. A Deployment runs multiple replicas of your application and automatically replaces any instances that fail or become unresponsive.)
you can do it using the command:
$ kubectl create deployment hello-minikube --image=k8s.gcr.io/echoserver:1.10 --port=8080
deployment.apps/hello-minikube created
user#cloudshell:$ kubectl get deployments
NAME READY UP-TO-DATE AVAILABLE AGE
hello-minikube 1/1 1 1 8s
You can also create a deployment using YAML.
Save the YAML from this documentation example and use kubectl apply.
$ vi nginx.yaml
<paste a proper YAML definition; you can also use the nano editor, or download a ready-made yaml>
user#cloudshell:$ kubectl apply -f nginx.yaml
deployment.apps/nginx-deployment created
$ kubectl get deployments
NAME READY UP-TO-DATE AVAILABLE AGE
hello-minikube 1/1 1 1 3m48s
nginx-deployment 3/3 3 3 64s
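For completeness, a sketch of that documentation example applied via a heredoc instead of a saved file (the image tag is the one used in the docs example; adjust as needed):
# Minimal nginx Deployment from the Kubernetes docs, applied without creating a file first
kubectl apply -f - <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.14.2
        ports:
        - containerPort: 80
EOF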
Please let me know if you have further questions regarding this answer.

How to setup Kubernetes Master HA on AWS

What I am trying to do:
I have set up a Kubernetes cluster using the documentation available on the Kubernetes website (http://kubernetes.io/v1.1/docs/getting-started-guides/aws.html). Using kube-up.sh, I was able to bring a Kubernetes cluster up with 1 master and 3 minions (as highlighted in the blue rectangle in the diagram below). From the documentation, as far as I know we can add minions as and when required, so from my point of view the k8s master instance is a single point of failure when it comes to high availability.
Kubernetes Master HA on AWS
So I am trying to set up an HA k8s master layer with the three master nodes as shown above in the diagram. To accomplish this I am following the Kubernetes high-availability cluster guide, http://kubernetes.io/v1.1/docs/admin/high-availability.html#establishing-a-redundant-reliable-data-storage-layer
What I have done:
Set up a k8s cluster using kube-up.sh and provider aws (master1 and minion1, minion2, and minion3)
Set up two fresh master instances (master2 and master3)
I then started configuring the etcd cluster on master1, master2 and master3 by following the link mentioned below:
http://kubernetes.io/v1.1/docs/admin/high-availability.html#establishing-a-redundant-reliable-data-storage-layer
So in short I have copied etcd.yaml from the Kubernetes website (http://kubernetes.io/v1.1/docs/admin/high-availability/etcd.yaml) and updated NODE_IP, NODE_NAME and the discovery token on all three nodes as shown below.
NODE_NAME   NODE_IP        DISCOVERY_TOKEN
Master1     172.20.3.150   https://discovery.etcd.io/5d84f4e97f6e47b07bf81be243805bed
Master2     172.20.3.200   https://discovery.etcd.io/5d84f4e97f6e47b07bf81be243805bed
Master3     172.20.3.250   https://discovery.etcd.io/5d84f4e97f6e47b07bf81be243805bed
And on running etcdctl member list on all three nodes, I am getting:
$ docker exec <container-id> etcdctl member list
ce2a822cea30bfca: name=default peerURLs=http://localhost:2380,http://localhost:7001 clientURLs=http://127.0.0.1:4001
As per the documentation we need to keep etcd.yaml in /etc/kubernetes/manifests; this directory already contains etcd.manifest and etcd-event.manifest files. For testing I modified the etcd.manifest file with the etcd parameters.
After making the above changes I forcefully terminated the docker container; the container was exiting after a few seconds and I was getting the below-mentioned error on running kubectl get nodes:
error: couldn't read version from server: Get http://localhost:8080/api: dial tcp 127.0.0.1:8080: connection refused
So please kindly suggest how I can set up a highly available k8s master setup on AWS.
To configure an HA master, you should follow the High Availability Kubernetes Cluster document, in particular making sure you have replicated storage across failure domains and a load balancer in front of your replicated apiservers.
Setting up HA controllers for kubernetes is not trivial and I can't provide all the details here but I'll outline what was successful for me.
Use kube-aws to set up a single-controller cluster: https://coreos.com/kubernetes/docs/latest/kubernetes-on-aws.html. This will create CloudFormation stack templates and cloud-config templates that you can use as a starting point.
Go to the AWS CloudFormation Management Console, click the "Template" tab and copy out the complete stack configuration. Alternatively, use $ kube-aws up --export to generate the CloudFormation stack file.
Use the userdata cloud-config templates generated by kube-aws and replace the variables with actual values. This guide will help you determine what those values should be: https://coreos.com/kubernetes/docs/latest/getting-started.html. In my case I ended up with four cloud-configs:
cloud-config-controller-0
cloud-config-controller-1
cloud-config-controller-2
cloud-config-worker
Validate your new cloud-configs here: https://coreos.com/validate/
Insert your cloud-configs into the CloudFormation stack config. First compress and encode your cloud config:
$ gzip -k cloud-config-controller-0
$ cat cloud-config-controller-0.gz | base64 > cloud-config-controller-0.enc
Now copy the content of your encoded cloud-config into the CloudFormation config. Look for the UserData key of the appropriate InstanceController. (I added additional InstanceController objects for the additional controllers.)
Update the stack at the AWS CloudFormation Management Console using your newly created CloudFormation config.
You will also need to generate TLS assets: https://coreos.com/kubernetes/docs/latest/openssl.html. These assets will have to be compressed and encoded (same gzip and base64 as above), then inserted into your userdata cloud-configs.
When debugging on the server, journalctl is your friend:
$ journalctl -u oem-cloudinit # to debug problems with your cloud-config
$ journalctl -u etcd2
$ journalctl -u kubelet
Hope that helps.
There is also the kops project.
From the project README:
Operate HA Kubernetes the Kubernetes Way
also:
We like to think of it as kubectl for clusters
Download the latest release, e.g.:
cd ~/opt
wget https://github.com/kubernetes/kops/releases/download/v1.4.1/kops-linux-amd64
mv kops-linux-amd64 kops
chmod +x kops
ln -s ~/opt/kops ~/bin/kops
See kops usage, especially:
kops create cluster
kops update cluster
This assumes you already have an s3://my-kops bucket and a kops.example.com hosted zone.
Create configuration:
kops create cluster --state=s3://my-kops --cloud=aws \
--name=kops.example.com \
--dns-zone=kops.example.com \
--ssh-public-key=~/.ssh/my_rsa.pub \
--master-size=t2.medium \
--master-zones=eu-west-1a,eu-west-1b,eu-west-1c \
--network-cidr=10.0.0.0/22 \
--node-count=3 \
--node-size=t2.micro \
--zones=eu-west-1a,eu-west-1b,eu-west-1c
Edit configuration:
kops edit cluster --state=s3://my-kops
Export terraform scripts:
kops update cluster --state=s3://my-kops --name=kops.example.com --target=terraform
Apply changes directly (you can validate the result afterwards; see the sketch after this list):
kops update cluster --state=s3://my-kops --name=kops.example.com --yes
List cluster:
kops get cluster --state s3://my-kops
Delete cluster:
kops delete cluster --state s3://my-kops --name=kops.example.com --yes
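One more subcommand worth knowing once kops update cluster has applied the changes: kops validate cluster reports whether the masters and nodes are up and healthy. A sketch using the same state and name as above:
kops validate cluster --state=s3://my-kops --name=kops.example.com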