How to get latency details of a load balancer in Stackdriver? - google-cloud-platform

I have a simple Spring Boot application deployed on Kubernetes on GCP. I wish to custom auto-scale the application using a latency threshold (response time). Stackdriver has a set of metrics for load balancers. Details of the metrics can be found at this link.
I have exposed my application to an external IP using the following command
kubectl expose deployment springboot-app-new --type=LoadBalancer --port 80 --target-port 9000
I used this API explorer to view the metrics. The response code is 200, but the response is empty.
The metrics filter I used is metric.type = "loadbalancing.googleapis.com/https/backend_latencies"
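Assuming the API explorer page corresponds to the Monitoring API's projects.timeSeries.list method, the equivalent raw call looks roughly like this; PROJECT_ID and the interval timestamps are placeholders:

# List time series matching the backend_latencies metric (GET with URL-encoded query params)
curl -s -G \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  --data-urlencode 'filter=metric.type = "loadbalancing.googleapis.com/https/backend_latencies"' \
  --data-urlencode 'interval.startTime=2020-01-01T00:00:00Z' \
  --data-urlencode 'interval.endTime=2020-01-02T00:00:00Z' \
  "https://monitoring.googleapis.com/v3/projects/PROJECT_ID/timeSeries"

An empty response with status 200 simply means that no time series matched the filter in that window.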
Question
Why am I not getting anything in the response? Am I making a mistake?
I have already enabled the Stackdriver API. Are there any other settings to be made to get a response?

As mentioned in the comments, the metric you're trying to use belongs to an HTTP(S) load balancer and the type LoadBalancer, when used in GKE, will deploy a Network Load Balancer instead.
The reason you're not able to find its metrics in the Stackdriver Monitoring page is that the link shared in the comment corresponds to the TCP/SSL Proxy load balancer documentation rather than to the Network Load Balancer (a pass-through, layer 4 load balancer), which is the one that was actually created in your cluster. For now, the Network Load Balancer won't show up in the Stackdriver Monitoring page.
However, a feature request has been created to have this functionality enabled in the Monitoring dashboard.
If you need this particular metric (loadbalancing.googleapis.com/https/backend_latencies), it might be best to expose your deployment using an Ingress instead of a LoadBalancer-type Service. This will automatically create an HTTP(S) load balancer, with monitoring enabled, in place of the current Network Load Balancer.
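For illustration, a minimal version of that could look like the following. The Ingress name is made up, and the Service is recreated as NodePort, which a GKE Ingress requires to route to the deployment:

# Recreate the Service as NodePort so the GKE Ingress controller can use it
kubectl expose deployment springboot-app-new --type=NodePort --port 80 --target-port 9000

# ingress.yaml - apply with: kubectl apply -f ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: springboot-app-ingress   # illustrative name
spec:
  defaultBackend:
    service:
      name: springboot-app-new
      port:
        number: 80

Once the Ingress is provisioned, the loadbalancing.googleapis.com/https/* metrics should become available for it.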

Related

k8s Service annotations for AWS NLB ALPN

I'm facing an issue with the Service annotation that enables the ALPN policy in an AWS load balancer.
I'm testing an application in production, managed by EKS. I need to enable a Network Load Balancer (NLB) on AWS to manage some ingress rules (TLS cert and so on...).
Among the available annotations is:
service.beta.kubernetes.io/aws-load-balancer-alpn-policy: HTTP2Preferred
I think I need this to enable ALPN in the TLS handshake.
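For context, here is roughly where that annotation sits in a Service manifest; the name, selector, ports, and certificate ARN below are illustrative:

apiVersion: v1
kind: Service
metadata:
  name: my-service   # illustrative
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
    service.beta.kubernetes.io/aws-load-balancer-ssl-cert: "arn:aws:acm:REGION:ACCOUNT:certificate/ID"
    service.beta.kubernetes.io/aws-load-balancer-ssl-ports: "443"
    # ALPN policies only apply to TLS listeners, hence the ssl-cert/ssl-ports annotations above
    service.beta.kubernetes.io/aws-load-balancer-alpn-policy: "HTTP2Preferred"
spec:
  type: LoadBalancer
  selector:
    app: my-app
  ports:
    - port: 443
      targetPort: 8080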
The issue is that it does not get applied to my load balancer (other annotations work); I can confirm this by checking the AWS dashboard or by executing curl -s -vv https://my.example.com. To enable the ALPN policy I have to apply the change manually, e.g. through the dashboard.
What am I missing? I wonder if that annotation is only available with the AWS Load Balancer Controller and not with the base in-tree Service controller for NLBs.
EDIT: I found some GitHub issues requesting this feature for the legacy mode, without using a third-party controller; here is a comment that summarizes them all. Since it seems to be an unavailable feature (for now), how can I achieve this configuration using, for example, Terraform? Do I need to create the NLB first and then attach it to my Service?

GCP Kubernetes - Health Check Fails in Load Balancer with NEG backends

Here is what exists and works OK:
Kubernetes cluster in Google Cloud with 8 deployed workloads - basically GraphQL microservices.
Each of the workloads has a service that exposes port 80 via a NEG (Network Endpoint Group). So, each workload has its ClusterIP in the 10.12.0.0/20 network. Each of the services lives in a custom namespace, "microservices".
One of the workloads (API gateway) is exposed to the Internet via Global HTTP(S) Load Balancer. Its purpose is to handle all requests and route them to the right microservice.
Now, I needed to expose all of the workloads to the outside world so they can be reached individually without going through the gateway.
For this, I have created:
a Global Load Balancer, added backends (which refer to the NEGs), configured routing (the URL path determines which workload the request goes to), and an external IP
a Health Check that is used by the Load Balancer for each of the backends
a firewall rule that allows traffic on TCP port 80 from the Google Health Check services 35.191.0.0/16, 130.211.0.0/22 to all hosts in the network.
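For reference, the firewall rule in the last item can be expressed with gcloud roughly like this; the rule name and network are placeholders:

gcloud compute firewall-rules create allow-google-health-checks \
  --network=default \
  --direction=INGRESS \
  --action=ALLOW \
  --rules=tcp:80 \
  --source-ranges=35.191.0.0/16,130.211.0.0/22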
The problem: Health Check fails and thus the load balancer does not work - it gives error 502.
What I checked:
logs show that the firewall rule allows traffic
logs for the Health Check show only the changes I make to it and no other activity, so I do not know what happens inside.
connected via SSH to the VM which hosts the Kubernetes node and checked that the ClusterIPs (10.12.xx.xx) of each workload return HTTP status 200.
connected via SSH to a VM created for test purposes. From this VM I cannot reach any of the ClusterIPs (10.12.xx.xx).
It seems that for some reason traffic from the Health Check or my test VM does not get to the destination. What did I miss?

Does the global layer 7 HTTP(S) load balancer have a rate limiting option?

I use a global HTTP(S) load balancer for backend services running on a Kubernetes cluster. I didn't find any information on how to limit the number of requests from one IP within a time window. There is Cloud Armor, but it only provides simple IP-, region-, and header-based access control. Could you please share how I can perform IP-based rate limiting on a global HTTP load balancer on Google Cloud, to provide a defence against DoS attacks?
Edit:
The backend service running on the Kubernetes cluster is a Symfony server with a web interface. I want to use Cloud CDN for the server, therefore I had to use the gce Ingress instead of ingress-nginx. On Google Cloud, a gce Ingress creates a global HTTP(S) load balancer, while ingress-nginx creates a TCP load balancer.
With ingress-nginx, I could simply use the nginx.ingress.kubernetes.io/limit-rps annotation, which helps limit floods of HTTP requests. I want a similar configuration on my global HTTP(S) load balancer. With the current setup, I have observed that a flood of HTTP requests sent to the load balancer is forwarded to the Symfony server, and at some point the request latency increases, which makes the liveness probe for the pod fail.
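For comparison, the ingress-nginx configuration being described would look something like this; the host and service names are made up:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: symfony-ingress   # illustrative
  annotations:
    nginx.ingress.kubernetes.io/limit-rps: "10"   # ~10 requests per second per client IP
spec:
  ingressClassName: nginx
  rules:
    - host: example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: symfony-service   # illustrative
                port:
                  number: 80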
Cloud Armor is a WAF that you can configure to protect your service against DoS attacks, in particular by blocking specific IPs.
Rate limiting isn't meant to protect your service against DDoS. Indeed, if the attack floods your rate limiting service, neither your valid IPs nor the bad IPs will be served, because your service is flooded: it's a denial of service.
Rate limiting helps preserve resources for legitimate users. But some users may try to get around a constraint by using your APIs in a different (wrong/bad) manner.
For example, say you have a paid API to export all the customers and a free API to request a single customer. A user can say: "Hey, I don't want to pay, I will call the single-customer API in a loop to build my daily report!". That's a misuse of the single-customer API, and you can protect against it with rate limiting.
You can use Cloud Endpoints and ESP (Extensible Service Proxy). I wrote an article with an ESP deployed on Cloud Run, but you can also deploy it on K8s.
You can also use API Gateway, the managed version of ESP, which will soon be pluggable into the HTTPS load balancer (to use it in addition to the WAF protection).
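As a rough sketch of how that looks with Cloud Endpoints, quotas are declared in the OpenAPI spec that is deployed as the service config. The metric and limit names below are made up, and note that these quotas are enforced per consumer project or API key, not per IP:

# Excerpt of an OpenAPI spec for Cloud Endpoints
x-google-management:
  metrics:
    - name: "customer-reads"          # made-up metric name
      displayName: "Customer reads"
      valueType: INT64
      metricKind: DELTA
  quota:
    limits:
      - name: "customer-reads-limit"  # made-up limit name
        metric: "customer-reads"
        unit: "1/min/{project}"       # per consumer project, per minute
        values:
          STANDARD: 100               # 100 calls per minute

paths:
  "/customer/{id}":
    get:
      operationId: getCustomer
      x-google-quota:
        metricCosts:
          customer-reads: 1           # each call consumes 1 unit of the metric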

How to do gradual traffic migration between two Cloud Run services using Google Cloud HTTP(S) load balancer

I have setup an External HTTP(S) load balancer with the following:
2 Serverless NEGs, each pointing at a different Cloud Run service in their respective region (see the gcloud sketch after this list)
1 Backend Service, using the 2 NEGs as 2 Backends
1 Host and path rule that sends everything to the Backend Service
1 HTTPS Frontend pointing at the Host and path rule
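For reference, each of those serverless NEGs can be created with something like the following; the NEG name, region, and Cloud Run service name are placeholders:

gcloud compute network-endpoint-groups create cloudrun-neg-us \
  --region=us-central1 \
  --network-endpoint-type=serverless \
  --cloud-run-service=my-cloudrun-service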
At this point, I notice that the traffic is routed to the Cloud Run service closest to the region of the client making the request.
I would like to change that to route 100% of the traffic to one Cloud Run service on day 1, 50% to each service on day 2, and on day 3, route 100% of the traffic to the other Cloud Run service.
It's unclear if an External HTTP(S) load balancer can help with that. And if it can, it's unclear if this should be done in the Backend Service or in the Host and Path rule.
Google Cloud load balancer does not support weighted/percent-based load balancing for the external HTTP(S) LB. This is listed at https://cloud.google.com/load-balancing/docs/features#load_balancing_methods.
Maybe I need to create 2 Backend Services, each pointing at one NEG?
Yes, this is how you would do it if the external HTTPS GCLB supported it. You need to create separate backendServices for each serverless NEG and list weightedBackendServices in the route rule of the urlMap object. You can find an example here, but I believe it currently only works for the internal load balancer (ILB), per the link above.
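For illustration, the weightedBackendServices shape in the urlMap looks roughly like this. The project and backend service names are placeholders, and per the above this is currently only honored by the internal HTTP(S) load balancer:

# urlmap.yaml - imported with: gcloud compute url-maps import my-url-map --source=urlmap.yaml
defaultService: projects/PROJECT/global/backendServices/cloudrun-a
hostRules:
  - hosts: ["*"]
    pathMatcher: matcher1
pathMatchers:
  - name: matcher1
    defaultRouteAction:
      weightedBackendServices:
        - backendService: projects/PROJECT/global/backendServices/cloudrun-a
          weight: 50   # e.g. day 2 of the migration above: 50/50
        - backendService: projects/PROJECT/global/backendServices/cloudrun-b
          weight: 50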
AFAIK, External HTTPS load balancing can only route to the closest location but not dispatch the traffic according to weight.
In addition, your solution requires deploying in 2 different regions, because you can't have 2 backends in the same region in the same backend service.
The easiest solution for now is to use the Cloud Run traffic splitting feature: route all the traffic to the same service, and then let the Cloud Run load balancer dispatch the requests between revisions.
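With traffic splitting, the day-by-day migration described in the question maps to revisions of a single service; the service and revision names below are placeholders:

# Day 1: 100% of the traffic to the first revision
gcloud run services update-traffic my-service --to-revisions=my-service-rev1=100
# Day 2: 50/50 split
gcloud run services update-traffic my-service --to-revisions=my-service-rev1=50,my-service-rev2=50
# Day 3: 100% of the traffic to the second revision
gcloud run services update-traffic my-service --to-revisions=my-service-rev2=100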

Google Load Balancers don't appear in the Monitoring Dashboard

I created a global load balancer with a backend service and enabled logging in my Google Cloud project. The load balancer charts and metrics are supposed to appear in the Monitoring dashboard; however, the charts and metrics were not created.
According to the Google Cloud documentation, it looks like the load balancer dashboard should be ready to use whenever a load balancer exists in the project. I also cannot find a way to create the Load Balancers dashboard manually.
Go into Monitoring > Dashboards and create a new dashboard. Then go to Metrics Explorer and type "load balancer" into the "Find resource type and metric" field, then select your balancer type (HTTP, TCP or UDP). Then select a metric (for example, utilization for HTTP) and choose a filtering option (in my case it was the backend service name) - it should pop up in the list.
After that you can save the chart to the dashboard you created. Open the dashboard and you should have a working panel. You can add more charts to observe various metrics.
This solution may vary in the case of a TCP load balancer (different metrics), but generally that is the way to do it.
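For an HTTP(S) load balancer, for example, such a chart boils down to a Metrics Explorer filter along these lines (optionally narrowed further by a resource label such as the backend name):

resource.type = "https_lb_rule"
metric.type = "loadbalancing.googleapis.com/https/backend_latencies"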
I could provide a more specific solution, but you would have to update your question with more details (the LB type is the most important).