K8s service annotations getting reset after edit

I have a Service of type LoadBalancer in my EKS cluster. Currently AWS has provisioned a Classic Load Balancer for this service, which is open to the internet. Now I have to change this to a Network Load Balancer which is not open to the internet, i.e. whose scheme is internal.
To do that, I tried adding the annotations below:
annotations:
  service.beta.kubernetes.io/aws-load-balancer-internal: "true"
  service.beta.kubernetes.io/aws-load-balancer-scheme: internal
  service.beta.kubernetes.io/aws-load-balancer-type: nlb
But after I run kubectl edit svc, add these annotations, and save, the Service definition gets reset to the previous version and the newly added annotations are removed. I do see a Network Load Balancer getting created in AWS, but the Classic Load Balancer still remains and is operational.
I also tried manually deleting the Classic Load Balancer, but it gets re-created after some time.
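For reference, this is roughly the shape of the Service manifest I am aiming for. A minimal sketch only; the service name, selector, and ports are placeholders:

apiVersion: v1
kind: Service
metadata:
  name: my-service                                                # placeholder name
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: nlb
    service.beta.kubernetes.io/aws-load-balancer-internal: "true"
    service.beta.kubernetes.io/aws-load-balancer-scheme: internal
spec:
  type: LoadBalancer
  selector:
    app: my-app                                                   # placeholder selector
  ports:
    - port: 80                                                    # placeholder port
      targetPort: 8080                                            # placeholder target port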
Appreciate any help or insights on this issue.

Related

k8s service annotations not working on AWS LB

I am running a cluster in EKS with Kubernetes 1.21.5.
I know that by default Kubernetes has a Cloud Controller Manager which can be used to create load balancers, and that by default it will create a Classic Load Balancer in AWS.
I realize CLBs are going away and I should use an NLB or ALB and install the AWS Load Balancer Controller instead, but I want to work out why my annotations don't work.
What I am trying to do is set up a TLS listener using an ACM certificate, because by default it's all set up as TCP.
Here are my annotations
# service.beta.kubernetes.io/aws-load-balancer-ssl-cert: arn:aws:acm:<region>:<account>:certificate/<id>
# service.beta.kubernetes.io/aws-load-balancer-backend-protocol: ssl
# service.beta.kubernetes.io/aws-load-balancer-ssl-ports: <port>
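For context, a minimal sketch of how these annotations would sit on a LoadBalancer Service. The ARN placeholders are kept as-is; the service name, selector, ports, and the choice of 443 for the TLS listener are my assumptions:

apiVersion: v1
kind: Service
metadata:
  name: my-tls-service                                            # placeholder name
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-ssl-cert: arn:aws:acm:<region>:<account>:certificate/<id>
    service.beta.kubernetes.io/aws-load-balancer-backend-protocol: ssl
    service.beta.kubernetes.io/aws-load-balancer-ssl-ports: "443"  # assumed TLS listener port
spec:
  type: LoadBalancer
  selector:
    app: my-app                                                   # placeholder selector
  ports:
    - name: https
      port: 443                                                   # TLS terminates at the load balancer using the ACM cert
      targetPort: 8443                                            # placeholder backend port (backend speaks TLS, per backend-protocol: ssl)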
I have followed the k8s docs here which specify which annotations to use https://kubernetes.io/docs/concepts/services-networking/service/#ssl-support-on-aws
And I have checked in the k8s code that these annotations are present
https://github.com/kubernetes/kubernetes/blob/v1.21.5/staging/src/k8s.io/legacy-cloud-providers/aws/aws.go#L162
https://github.com/kubernetes/kubernetes/blob/v1.21.5/staging/src/k8s.io/legacy-cloud-providers/aws/aws.go#L167
https://github.com/kubernetes/kubernetes/blob/v1.21.5/staging/src/k8s.io/legacy-cloud-providers/aws/aws.go#L181
When I create my Service with these annotations, the Service stays in a pending state in Kubernetes.
Can anyone tell me why it won't work or give me any insight?
What I have been doing is manually configuring the LB after it's created, but I want to get away from doing that.
@congbaoguier
Thanks for your advice to look at the logs; I was being a complete dummy. After enabling logging on the control plane, I was able to see that there was an issue with my ACM ARN. Weirdly, I have no idea where I got that ARN from, because when I checked it in ACM it was wrong, DOH.
Updating my ARN, it now works, so thanks for the push to use my brain again :P

"ERR_EMPTY_RESPONSE" - ShinyApp hosted over AWS (EC2 / EKS / ShinyProxy) does not work

Update #2:
I have checked the health status of my instances within the Auto Scaling group; there the instances are listed as "healthy". (Screenshot added.)
I followed this troubleshooting tutorial from AWS, without success:
Solution: Use the ELB health check for your Auto Scaling group. When you use the ELB health check, Auto Scaling determines the health status of your instances by checking the results of both the instance status check and the ELB health check. For more information, see Adding health checks to your Auto Scaling group in the Amazon EC2 Auto Scaling User Guide.
Update #1:
I found out that the two node instances are "OutOfService" (as seen in the screenshots below) because they are failing the health check from the load balancer. Could this be the problem? And how do I solve it?
Thanks!
I am currently on the home stretch to host my ShinyApp on AWS.
To make the hosting scalable, I decided to use AWS - more precisely an EKS cluster.
For the creation I followed this tutorial: https://github.com/z0ph/ShinyProxyOnEKS
So far everything worked, except for the last step: "When accessing the load balancer address and port, the login interface of ShinyProxy can be displayed normally."
The load balancer gives me the following error message as soon as I try to call it with the corresponding port: ERR_EMPTY_RESPONSE.
I have to admit that I am currently a bit lost and lack a starting point for where the error could be.
I was already able to host the Shiny sample application in the cluster (step 3.2 in the tutorial), so it must somehow be due to ShinyProxy, the Kubernetes proxy, or the load balancer itself.
I have linked the following information below:
Overview EC2 Instances (Workspace + Cluster Nodes)
Overview Loadbalancer
Overview Repositories
Dockerfile ShinyProxy
Dockerfile Kubernetes Proxy
Dockerfile ShinyApp (sample application)
I have painted over some of the information to be on the safe side - if there is anything important, please let me know.
If you need anything else I haven't thought of, just give me a hint!
And please excuse the confusing question and formatting - I just don't know how to word or present it better. Sorry!
Many thanks and best regards
Overview EC2 Instances (Workspace + Cluster Nodes)
Overview Loadbalancer
Overview Repositories
Dockerfile ShinyProxy (source https://github.com/openanalytics/shinyproxy-config-examples/tree/master/03-containerized-kubernetes)
Dockerfile Kubernetes Proxy (source https://github.com/openanalytics/shinyproxy-config-examples/tree/master/03-containerized-kubernetes - Fork)
Dockerfile ShinyApp (sample application)
The following files are 1:1 from the tutorial:
application.yaml (shinyproxy)
sp-authorization.yaml
sp-deployment.yaml
sp-service.yaml
Health-Status in the AutoScaling-Group
Unfortunately, there is a known issue in AWS
externalTrafficPolicy: Local with Type: LoadBalancer AWS NLB health checks failing · Issue #80579 · kubernetes/kubernetes
Closing this for now since it's a known issue
As per k8s manual:
.spec.externalTrafficPolicy - denotes if this Service desires to route external traffic to node-local or cluster-wide endpoints. There are two available options: Cluster (default) and Local. Cluster obscures the client source IP and may cause a second hop to another node, but should have good overall load-spreading. Local preserves the client source IP and avoids a second hop for LoadBalancer and NodePort type Services, but risks potentially imbalanced traffic spreading.
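For orientation, this policy is set directly on the Service spec. A minimal fragment with placeholder names and ports:

apiVersion: v1
kind: Service
metadata:
  name: shinyproxy-svc                   # placeholder name
spec:
  type: LoadBalancer
  externalTrafficPolicy: Local           # switching this to Cluster trades source-IP preservation for health checks that pass on every node
  selector:
    app: shinyproxy                      # placeholder selector
  ports:
    - port: 8080                         # placeholder port
      targetPort: 8080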
But you may try the fix for the Local policy described in this answer.
Update:
This is actually a known limitation where the AWS cloud provider does not allow for --hostname-override, see #54482 for more details.
Update 2: There is a workaround via patching kube-proxy.
As per the AWS knowledge base:
A Network Load Balancer with externalTrafficPolicy set to Local (from the Kubernetes website), with a custom Amazon VPC DNS on the DHCP options set. To resolve this issue, patch kube-proxy with the hostname override flag.
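Roughly, that patch amounts to adding --hostname-override to the kube-proxy DaemonSet in kube-system. A sketch of the relevant excerpt only; the exact command and config paths vary by EKS version:

spec:
  template:
    spec:
      containers:
        - name: kube-proxy
          command:
            - kube-proxy
            - --v=2
            - --config=/var/lib/kube-proxy-config/config          # config path as shipped on EKS (may differ)
            - --hostname-override=$(NODE_NAME)                    # use the node name instead of the DHCP-derived hostname
          env:
            - name: NODE_NAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName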

Istio configuration on GKE

I have some basic questions about Istio. I installed Istio for my Tyk API gateway. Then I found that simply installing Istio will cause all traffic between the Tyk pods to be blocked. Is this the default behaviour for Istio? The Tyk gateway cannot communicate with the Tyk dashboard.
When I rebuild my deployment without Istio, everything works fine.
I have also read that Istio can be configured with virtual services to perform traffic routing. Is this what I need to do for every default installation of Istio? Meaning, if I don't create any virtual services, will Istio block all traffic by default?
Secondly, I understand a virtual service is created as a YAML file and applied as a custom resource. Regarding the host name defined in the virtual service rules: in a default Kubernetes cluster on Google Cloud, how do I find out the host name of my application?
Lastly, if I install Tyk first and later install Istio, and I have created the necessary label in Tyk's namespace for the proxy to be injected, can I just perform a rolling upgrade of my Tyk pods to have Istio start the injection?
For example, I have these labels in my Tyk dashboard service. Do I use the value called "app" in my virtual service YAML?
labels:
  app: dashboard-svc-tyk-pro
  app.kubernetes.io/managed-by: Helm
  chart: tyk-pro-0.8.1
  heritage: Helm
  release: tyk-pro
Sorry for all the basic questions!
On the question of the Tyk gateway not being able to communicate with the Tyk dashboard:
I think the problem is that your pod tries to connect to the database before the Istio sidecar is ready, and thus the connection can't be established.
Istio runs an init container that configures the pod's route table so that all traffic is routed through the sidecar. So if the sidecar isn't running yet and the other pod tries to connect to the DB, no connection can be established. (Example case: Application running in Kubernetes cron job does not connect to database in same Kubernetes cluster.)
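If that is indeed the cause, one mitigation worth looking at is Istio's holdApplicationUntilProxyStarts setting (available in Istio 1.7 and later), which delays application containers until the sidecar is ready. A sketch of an IstioOperator overlay, assuming that installation method:

apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
spec:
  meshConfig:
    defaultConfig:
      holdApplicationUntilProxyStarts: true    # block app container start until the Envoy sidecar is ready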
On the question of virtual services:
Each virtual service consists of a set of routing rules that are evaluated in order, letting Istio match each given request to the virtual service to a specific real destination within the mesh.
By default, Istio configures the Envoy proxies to passthrough requests to unknown services. However, you can’t use Istio features to control the traffic to destinations that aren’t registered in the mesh.
On the question of the host name, refer to this documentation:
The hosts field lists the virtual service’s hosts - in other words, the user-addressable destination or destinations that these routing rules apply to. This is the address or addresses the client uses when sending requests to the service.
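As an illustration only: the hosts field usually references the Kubernetes Service (its name or in-cluster DNS name) rather than a pod label. The namespace and port below are assumptions:

apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: tyk-dashboard
  namespace: tyk                                                 # assumed namespace
spec:
  hosts:
    - dashboard-svc-tyk-pro.tyk.svc.cluster.local                # the Service's in-cluster DNS name
  http:
    - route:
        - destination:
            host: dashboard-svc-tyk-pro.tyk.svc.cluster.local
            port:
              number: 3000                                       # assumed dashboard port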
For adding Istio on GKE to an existing cluster, please refer to this documentation.
If you want to update a cluster with the add-on, you may need to first resize your cluster to ensure that you have enough resources for Istio. As when creating a new cluster, we suggest at least a 4-node cluster with the 2 vCPU machine type. If you have an existing application on the cluster, you can find out how to migrate it so it's managed by Istio as mentioned in the Istio documentation.
You can uninstall the add-on by following this document, which includes shifting traffic away from the Istio ingress gateway. Please take a look at this doc for more details on installing and uninstalling Istio on GKE.
Also adding this document for installing Istio on GKE, which also covers installing it on an existing cluster to quickly evaluate Istio.

Istio: failed calling admission webhook Address is not allowed

I am getting the following error while creating a gateway for the sample bookinfo application:
Internal error occurred: failed calling admission webhook "pilot.validation.istio.io": Post https://istio-galley.istio-system.svc:443/admitpilot?timeout=30s: Address is not allowed
I have created an EKS PoC cluster in my dev AWS account using two node groups (each with two instances), one with t2.medium and the other with t2.large instance types, using two /26 subnets and the default VPC CNI provided by EKS.
But as the cluster grew, with multiple services running, I started facing issues with IPs not being available (as per the docs, the default VPC CNI driver treats pods like EC2 instances).
To avoid this, I followed the post below to change the networking from the default to Weave:
https://medium.com/codeops/installing-weave-cni-on-aws-eks-51c2e6b7abc8
That resolved the IP unavailability issue.
Now, after the network reconfiguration from the VPC CNI to Weave, I have started getting the issue in the subject line for my service mesh, which is configured using Istio.
There are a couple of services running inside the mesh, and Kiali, Prometheus, and Jaeger are also integrated with it.
I tried to have a look at GitHub (https://github.com/istio/istio/issues/9998) and the docs (https://istio.io/docs/ops/setup/validation/), but could not find a proper answer.
Let me know if anyone has faced this issue and has a partial or full solution for it.
This 'appears' to be related to the switch from the AWS CNI to Weave. The AWS CNI uses the IP range of your VPC, while Weave uses its own address range for pods, so there may be leftover iptables rules from the AWS CNI, for example.
Internal error occurred: failed calling admission webhook "pilot.validation.istio.io": Post https://istio-galley.istio-system.svc:443/admitpilot?timeout=30s: Address is not allowed
The message above implies that whatever address istio-galley.istio-system.svc resolves to, internally in your K8s cluster, is not a valid IP address. So I would also try to see what that resolves to. (It may be related to CoreDNS.)
You can also try the following steps.
Basically (quoted):
kubectl delete ds aws-node -n kube-system
delete /etc/cni/net.d/10-aws.conflist on each of the nodes
edit the instance security group to allow UDP and TCP on ports 6783 and 6784
flush the iptables nat, mangle, and filter tables
restart the kube-proxy pods
apply the weave-net DaemonSet
delete the existing pods so they get recreated in the Weave pod CIDR's address space
Furthermore, you can try reinstalling everything from the beginning using weave.
Hope it helps!

Enabling CDN on a Kubernetes backend through BackendConfig doesn't allow custom host and path rules

Not able to add custom path rules to the Google CDN load balancer.
Despite some minor issues, like the address flapping between the custom ingress controller IP and the reserved CDN IP, we are implementing CDN for our GKE-hosted app following this tutorial (https://cloud.google.com/kubernetes-engine/docs/how-to/cdn-backendconfig).
Almost everything is fine, but when we try to add some path rules, via a k8s manifest or the Google load balancer UI, they take no effect at all; in fact, in the UI case, the rules disappear after two minutes...
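For illustration, this is roughly the kind of path rule we are trying to add. A sketch only; the Ingress name, host, and ports are assumptions:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: statics-ingress                  # assumed name
  annotations:
    kubernetes.io/ingress.class: gce     # GKE external HTTP(S) load balancer
spec:
  rules:
    - host: example.com                  # placeholder host
      http:
        paths:
          - path: /static/*
            pathType: ImplementationSpecific
            backend:
              service:
                name: statics-bucket     # assumed backend Service name
                port:
                  number: 80             # assumed port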
Any thoughts?
Try using "kubectl replace" when dealing with ingress manifest. Google Cloud does not allow updates to ingress after it is created. So in Kubernetes it might look like you make changes but they will not get applied in Google Cloud.
Using kubectl describe, in the Events section, I found this warning:
Warning Translate 114s (x32 over 48m) loadbalancer-controller error while evaluating the ingress spec: service "xxx-staging/statics-bucket" is type "ClusterIP", expected "NodePort" or "LoadBalancer"
So this is the problem. I will try to change this and post the resolution here.
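For completeness, a minimal sketch of the change that warning points to, switching the backend Service from ClusterIP to NodePort; the selector and ports are assumptions:

apiVersion: v1
kind: Service
metadata:
  name: statics-bucket
  namespace: xxx-staging
spec:
  type: NodePort                         # GKE Ingress expects NodePort or LoadBalancer backends
  selector:
    app: statics-bucket                  # assumed selector
  ports:
    - port: 80                           # assumed port
      targetPort: 8080                   # assumed target port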