How to do gradual traffic migration between two Cloud Run services using Google Cloud HTTP(S) load balancer

I have set up an External HTTP(S) load balancer with the following:
2 Serverless NEGs, each pointing at a different Cloud Run service in their respective region
1 Backend Service, using the 2 NEGs as 2 Backends
1 Host and path rule that sends everything to the Backend Service
1 HTTPS Frontend pointing at the Host and path rule
At this point, I notice that the traffic is routed to the Cloud Run service closest to the region of the client making the request.
I would like to change that to route 100% of the traffic to one Cloud Run service on day 1, 50% on each service on day 2, and on day 3, route 100% of the traffic to the other Cloud Run service.
It's unclear if an External HTTP(S) load balancer can help with that. And if it can, it's unclear if this should be done in the Backend Service or in the Host and Path rule.

Google Cloud load balancer does not support weighted/percent-based load balancing for the external HTTP(S) LB. This is listed at
Maybe I need to create 2 Backend Services, each pointing at one NEG?
Yes, this is how you would do it if the external HTTPS GCLB supported it. You would need to create a separate backendService for each serverless NEG and list them under weightedBackendServices in the route rule of the urlMap object. You can find an example here, but I believe it currently only works for the internal load balancer (ILB), per the link above.
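As a rough illustration of the route rule described in this answer, here is a minimal Python sketch that builds one urlMap routeRules entry with weightedBackendServices. The backend service names are hypothetical, and the sketch only builds the data structure; whether a given load balancer type accepts it is a separate question, as noted above.

```python
def weighted_route_rule(weights, priority=1):
    """Build one urlMap routeRule that splits traffic by relative weight.

    `weights` maps a backend service name (or URL) to its weight.
    """
    if not weights:
        raise ValueError("at least one backend service is required")
    if any(w < 0 for w in weights.values()):
        raise ValueError("weights must be non-negative")
    return {
        "priority": priority,
        # Match every path, like the single catch-all rule in the question.
        "matchRules": [{"prefixMatch": "/"}],
        "routeAction": {
            "weightedBackendServices": [
                {"backendService": name, "weight": weight}
                for name, weight in weights.items()
            ]
        },
    }


# Hypothetical names: one backend service per serverless NEG.
day_2_rule = weighted_route_rule({"backend-service-a": 50, "backend-service-b": 50})
print(day_2_rule)
```

Moving from day 1 to day 3 would then only mean changing the weights (100/0, 50/50, 0/100) and updating the URL map.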

AFAIK, External HTTPS load balancing can only route to the closest location but not dispatch the traffic according to weight.
In addition, your solution requires deploying in 2 different regions, because you can't have 2 backends in the same region in the same backend service.
The easiest solution for now is to use the Cloud Run traffic splitting feature. Route all the traffic to the same service, and then let the Cloud Run load balancer dispatch the requests.
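A minimal sketch of the three-day plan using Cloud Run traffic splitting, assuming both versions are deployed as revisions of a single service (traffic splitting works between revisions of one service, not between services). The service, region, and revision names are hypothetical; the script only builds the `gcloud run services update-traffic` command lines and does not run them.

```python
import shlex

# Hypothetical plan: day -> percentage of traffic per revision.
SCHEDULE = {
    1: {"rev-old": 100},
    2: {"rev-old": 50, "rev-new": 50},
    3: {"rev-new": 100},
}


def update_traffic_command(service, region, split):
    """Return the gcloud argv that applies `split` to `service`."""
    if sum(split.values()) != 100:
        raise ValueError("traffic percentages must add up to 100")
    to_revisions = ",".join(f"{rev}={pct}" for rev, pct in split.items())
    return [
        "gcloud", "run", "services", "update-traffic", service,
        f"--region={region}",
        f"--to-revisions={to_revisions}",
    ]


for day, split in SCHEDULE.items():
    # Print the command for review; pass the list to subprocess.run to execute it.
    print(f"day {day}:", shlex.join(update_traffic_command("my-service", "us-central1", split)))
```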


Specify URL instead of IP:port in network endpoints for applications behind reverse proxy

We are using a GCP external HTTPS load balancer; the architecture is shown in the diagram below. The primary use of the LB is redirecting users to a static error site (hosted in a Cloud Storage bucket) in case the CE instance is down, Traefik crashes on CE, Docker crashes on CE, etc.
We have 4 backend services defined on load balancer:
static-error-page backend bucket
blog-backend-service, gallery-backend-service and shop-backend-service zonal network endpoint groups
Then, we defined host and path rules so that: -> blog-backend-service -> gallery-backend-service -> shop-backend-service
All unmatched (default) -> static-error-page
Each zonal network endpoint group (blog-backend-service, gallery-backend-service and shop-backend-service) has just 1 endpoint defined: 192.168.171:443 ( is internal IP of CE instance).
However, since my websites are served behind a reverse proxy (Traefik), specifying an IP:port combination in the network endpoint is useless because they all have the same IP:port. I would like to specify a URL instead of IP:port in the network endpoint. That way the network endpoint would also show the correct health status if a website is down; right now it always reports healthy, even if the application is down.
Is it possible to specify URL instead of IP:port in network endpoint? If not, what are my alternatives?
Instead of using Traefik, you can use Google API Gateway, which was meant to do that job, while you can still use the load balancer behind it.

Does the global layer 7 HTTP(S) load balancer have an option for rate limiting?

I use a global HTTP(S) load balancer for backend services running on a Kubernetes cluster. I didn't find any information on how to limit the number of requests in a time window from one IP. There is Cloud Armor, but it also seems to offer only simple IP-, region-, and header-based access control. Could you please share how I can perform IP-based rate limiting on the global HTTP load balancer on Google Cloud to provide defence against DoS attacks?
The backend service running on the Kubernetes cluster is a Symfony server with a web interface. I want to use Cloud CDN for the server, so I had to use the gce ingress instead of ingress-nginx. On Google Cloud, the gce ingress creates a global HTTP(S) load balancer and ingress-nginx creates a TCP load balancer.
With ingress-nginx, I could simply use an annotation, which helps limit floods of HTTP requests. I want to do a similar configuration on my global HTTP(S) load balancer. In the current setup, I observed that a flood of HTTP requests sent to the load balancer is forwarded to the Symfony server, and at some point request latency increases, which makes the liveness probe fail for the pod.
Cloud Armor is a WAF that you can configure to protect your service against DoS attacks, especially by blocking specific IPs.
Rate limiting isn't meant to protect your service against DDoS. Indeed, if the attack floods your rate limiting service, neither your valid IPs nor the bad IPs will be served, because the service is flooded: it's a denial of service.
Rate limiting helps to preserve resources for legitimate users. But some users can try to get around a constraint by using your APIs in a different (wrong/bad) manner.
For example, you have a paid API to export all the customers. You have a free API to request 1 customer. A user can say "Hey, I don't want to pay, I will request the single customer API in a loop to create my daily report like that!". It's a misuse of the single customer API, and you can protect it against this misuse with rate limiting.
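To make the idea concrete, here is a minimal in-memory sketch of per-client rate limiting with a sliding window. It is an illustration only, not how Cloud Armor, ESP, or API Gateway implement it; the class name and the limits are made up for the example.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `max_requests` per client within `window_seconds`."""

    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._hits = defaultdict(deque)  # client id -> timestamps of accepted requests

    def allow(self, client_ip):
        now = self.clock()
        hits = self._hits[client_ip]
        # Forget requests that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # over quota: the caller would answer with HTTP 429
        hits.append(now)
        return True


# Example: the free "single customer" API allows 100 calls per minute per client.
limiter = SlidingWindowLimiter(max_requests=100, window_seconds=60)
print(limiter.allow("203.0.113.7"))
```

A user looping over the single customer API would start receiving rejections once the quota is used up, while other clients are unaffected.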
You can use Cloud Endpoints and ESP (Extensible Service Proxy). I wrote an article with an ESP deployed on Cloud Run, but you can also deploy it on K8s.
You can also use API Gateway, which is the managed version of ESP and will soon be pluggable into the HTTPS load balancer (to use it in addition to WAF protection).

Mirror requests between Cloud Run revisions

Is it possible to mirror the traffic of one Cloud Run revision onto another?
We have a running Cloud Run service (one revision with 100% traffic), and we want to evaluate a change in our algorithm without actually deploying it to production.
It would be ideal if we could just deploy a second revision (with 0% traffic, but with a revision URL) and mirror all incoming requests onto this URL.
I've seen that you can mirror traffic using an Internal HTTP(S) Load Balancer.
However, as far as I understand, I can't use an Internal HTTP(S) Load Balancer for Cloud Run, but only for VMs (Compute Engine).
For the serverless NEGs, it's possible to create External HTTP(S) Load Balancer, but those don't support this feature.
Do I understand correctly that it's not possible to mirror Cloud Run traffic with load balancers?
Are there any other solutions? Or do we need to deploy our own load balancer (e.g. Nginx) and define our mirroring strategy there?
AFAIK, you can't mirror the request. You need, as you said, to deploy a proxy that splits the traffic.
You can put another Cloud Run service in front of your target service and have it duplicate each request to the service tags. Nginx is an option; you can deploy it on Cloud Run, for example.
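The core of such a mirroring proxy can be sketched in a few lines of Python. This is a minimal sketch under assumptions: `send_primary` and `send_shadow` are hypothetical callables that forward a request to the production revision and to the tagged revision URL, and the request is represented by whatever object those callables accept.

```python
import threading


def mirror(request, send_primary, send_shadow):
    """Forward `request` to the primary target and a copy to the shadow target.

    Returns the primary response plus the background thread handling the shadow copy.
    """
    def _shadow():
        try:
            send_shadow(request)
        except Exception:
            # Failures of the mirrored call must never affect real traffic.
            pass

    worker = threading.Thread(target=_shadow, daemon=True)
    worker.start()
    # Only the primary response is returned to the client.
    return send_primary(request), worker


# Stand-ins for real HTTP calls to the two revision URLs.
response, worker = mirror(
    {"method": "POST", "path": "/score"},
    send_primary=lambda req: "primary response",
    send_shadow=lambda req: None,
)
worker.join()
print(response)
```

The shadow response is discarded here; in practice you would probably log it or compare it with the primary response. If this runs on Cloud Run itself, the shadow call may need to finish before the response is returned, depending on the service's CPU allocation settings.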

Use of a load balancer in front of the ingress nginx controller

I am having a hard time understanding the role of a load balancer when used with Ingress Nginx.
I know a load balancer distributes requests over multiple nodes.
E.g., let's say I have two nodes A and B, and they are responsible for processing requests at
So a load balancer will take requests for and distribute them among the nodes with the help of a defined algorithm.
I also understand what an API Gateway is,
E.g., let's say I have one order service and another payment service, so an API gateway will get the request for and it will hand over the request for /orders to the order service and /payments to the payment service.
The Confusion:
Load Balancer(NLB) -> API Gateway -> Services -> order deployment -> which is running two replicas
Who distributes requests among those replicas for /orders?
What is the role of load balancer in this case?
Some articles suggest creating a service of type LoadBalancer. What does that mean? What will this service do?
Also, the load balancer sits outside of the cluster, NLB -> [ k8s cluster ]. How does it know how to distribute requests?
These could collectively be one question, I don't know.
Any kind of explanation would be appreciated.
I have gone through many articles and blogs, but none covers the complete picture.
Many of my doubts are cleared through this article
Within the cluster a service does load balancing among the replicas.
I still have some questions,
Do I only need a load balancer to expose the ingress controller service?
What if there is some problem with the ingress controller and it restarts?
What will happen? Will it get a new IP and the load balancer will point to the new one, or will the IP remain the same?
This article may help :
Q: Do I only need a load balancer to expose the ingress controller service?
A: It is mainly used to expose K8s services.
Q: What if there is some problem with the ingress controller and it restarts?
A: A problem can appear if a broken change is applied. In that case the old controller will keep working but the new one will fail to start, so you will have to use kubectl describe etc. to understand what is wrong.
Q: What will happen? Will it get a new IP and the load balancer will point to the new one, or will the IP remain the same?
A: Why do you need the LB's IPs? Use the load balancer's DNS name.
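The chain from the question (load balancer -> ingress -> service -> replicas) can be pictured with a toy Python model. It is a deliberately simplified sketch: the class names and pod names are invented, and round-robin is used only for readability, while the real selection of a replica depends on how the cluster is configured.

```python
import itertools


class Service:
    """Stands in for a Kubernetes Service: one stable name, many replicas."""

    def __init__(self, replicas):
        self._next_replica = itertools.cycle(replicas)

    def handle(self, path):
        # The Service, not the external load balancer, picks the replica.
        return f"{next(self._next_replica)} handled {path}"


class Ingress:
    """Stands in for the ingress controller: routes by path prefix."""

    def __init__(self, routes):
        self.routes = routes  # path prefix -> Service

    def handle(self, path):
        for prefix, service in self.routes.items():
            if path.startswith(prefix):
                return service.handle(path)
        return "404"


# The external load balancer only needs to deliver traffic to the ingress controller.
ingress = Ingress({
    "/orders": Service(["orders-pod-1", "orders-pod-2"]),
    "/payments": Service(["payments-pod-1"]),
})
for path in ["/orders/1", "/orders/2", "/payments/9"]:
    print(ingress.handle(path))
```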

Send POST request from one service to another in Amazon ECS

I have a Node-Express website running on a microservices-based architecture. I deployed the microservices on an Amazon ECS cluster with one EC2 instance. The microservices sit behind an Application Load Balancer that routes external traffic correctly to the services. This system is working as expected except for one problem: I need to make a POST request from one service to the other. I am trying to use axios for this, but I don't know what URL to post to in axios. When testing locally, I just used axios.post('http://localhost:3000/service2', ...) inside service 1, but how should I do it here?
There are various ways.
1. Use an Application Load Balancer in front of the services
In this method, you put your microservices behind the load balancer(s), and to send a request you use the load balancer URL. You can use path-based routing on a single load balancer, or you can use multiple load balancers.
2. Use Service Discovery
In this method, you let the requester discover the target service. Service discovery can be done in various ways, for example using an ALB, Route 53, ECS, a key-value store, configuration management, or third-party software such as Consul.
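Whichever option is chosen, the calling service usually should not hard-code the target URL. The sketch below, written in Python for illustration (the same idea applies to axios in Node), reads the base URL from an environment variable. The variable naming scheme and the host names are assumptions made up for this example.

```python
import os
from urllib.parse import urljoin


def service_url(service, path, env=os.environ):
    """Build the URL of another service from a <SERVICE>_BASE_URL variable."""
    key = f"{service.upper().replace('-', '_')}_BASE_URL"
    base = env.get(key)
    if not base:
        raise KeyError(f"{key} is not set")
    # Normalize slashes so the base URL and the path join predictably.
    return urljoin(base.rstrip("/") + "/", path.lstrip("/"))


# Locally the variable could point at localhost; on ECS it could hold the load
# balancer URL or a service discovery name.
local_env = {"SERVICE2_BASE_URL": "http://localhost:3000/service2"}
print(service_url("service2", "/orders", env=local_env))
```

Service 1 would then post to the returned URL, and only the environment variable changes between local testing and the ECS deployment.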