Is it possible to mirror the traffic of one Cloud Run revision onto another?
We have a running Cloud Run service (one revision with 100% of the traffic), and we want to evaluate a change in our algorithm without actually deploying it to production.
It would be ideal if we could just deploy a second revision (with 0% traffic, but with a revision URL) and mirror all incoming requests onto this URL.
I've seen that you can mirror traffic using an Internal HTTP(S) Load Balancer (https://cloud.google.com/load-balancing/docs/l7-internal/setting-up-traffic-management#multiple_allowed_in_a_url_map).
However, as far as I understand, I can't use an Internal HTTP(S) Load Balancer with Cloud Run, only with VMs (Compute Engine).
For serverless NEGs, it's possible to create an External HTTP(S) Load Balancer, but those don't support this feature.
Am I understanding correctly that it's not possible to mirror Cloud Run traffic with load balancers?
Are there any other solutions? Or do we need to deploy our own load balancer (e.g. Nginx) and define our mirroring strategy there?
AFAIK, you can't mirror the requests. You need, as you said, to deploy a proxy that duplicates the traffic.
You can use another Cloud Run service in front of your target service and duplicate the requests to the revision tags. Nginx is an option; you can deploy it on Cloud Run, for example.
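For illustration, here is a minimal sketch of what such an Nginx mirroring proxy could look like, using Nginx's mirror directive. The two run.app hostnames are hypothetical placeholders for your production service URL and the tagged 0%-traffic revision URL:

    # Hypothetical nginx.conf for a mirroring proxy deployed on Cloud Run.
    cat > nginx.conf <<'EOF'
    events {}
    http {
      resolver 8.8.8.8;             # required: the mirror proxy_pass uses a variable
      server {
        listen 8080;                # Cloud Run sends requests to $PORT (8080 by default)
        location / {
          mirror /mirror;           # duplicate every incoming request to the mirror location
          proxy_pass https://my-app-xyz-ew.a.run.app;  # placeholder: production URL
          proxy_ssl_server_name on; # send SNI when proxying to an HTTPS backend
        }
        location = /mirror {
          internal;                 # only reachable via mirror; responses are discarded
          proxy_pass https://candidate---my-app-xyz-ew.a.run.app$request_uri;  # placeholder: tagged revision URL
          proxy_ssl_server_name on;
        }
      }
    }
    EOF

Nginx discards the mirror subrequest's response, so the candidate revision receives the full production traffic without affecting what clients see.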
Related
I'm trying to set up a multi-region deployment with a load balancer that routes traffic to the Cloud Run app deployed in the region closest to the visitor, following this tutorial: https://cloud.google.com/run/docs/multiple-regions
I have a Google Cloud Platform load balancer set up with a backend service that points to three regional network endpoint groups, each linked to a separate instance of the Cloud Run app in a different region.
When I access the Cloud Run app in any region directly by its URL (like https://cms-us-east1-dpuglk7uja-ue.a.run.app), it works well.
When I access the app through the load balancer domain from Europe, it also works well.
But when I access the app through the load balancer domain from any other region (US, Asia), I get a 404 error with the message "The requested URL was not found on this server. That's all we know."
I've done everything explained in the tutorial and I'm not sure what's wrong. Here are the regions I'm using: europe-north1, us-east1, asia-northeast1.
Is there any chance that the beta version of the Serverless NEG is still buggy?
Your load balancer configuration is the right one: one backend service, and one serverless NEG per region.
The condition for this to work is to use the SAME Cloud Run service name, deployed in the different regions.
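As a sketch, assuming a hypothetical service named cms built from a hypothetical image, the setup would look something like this; note that every gcloud run deploy call uses the identical service name:

    # Deploy the SAME service name in every region (cms and the image are placeholders).
    for region in europe-north1 us-east1 asia-northeast1; do
      gcloud run deploy cms --image gcr.io/MY_PROJECT/cms --region "$region"
    done

    # One serverless NEG per region, each pointing at the same service name.
    for region in europe-north1 us-east1 asia-northeast1; do
      gcloud compute network-endpoint-groups create "neg-$region" \
        --region "$region" \
        --network-endpoint-type serverless \
        --cloud-run-service cms
    done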
I have set up an External HTTP(S) load balancer with the following:
2 Serverless NEGs, each pointing at a different Cloud Run service in their respective region
1 Backend Service, using the 2 NEGs as 2 Backends
1 Host and path rule that sends everything to the Backend Service
1 HTTPS Frontend pointing at the Host and path rule
At this point, I notice that the traffic is routed to the Cloud Run service closest to the region of the client making the request.
I would like to change that to route 100% of the traffic to one Cloud Run service on day 1, 50% to each service on day 2, and on day 3, 100% of the traffic to the other Cloud Run service.
It's unclear if an External HTTP(S) load balancer can help with that. And if it can, it's unclear if this should be done in the Backend Service or in the Host and Path rule.
Google Cloud load balancer does not support weighted/percent-based load balancing for the external HTTP(S) LB. This is listed at https://cloud.google.com/load-balancing/docs/features#load_balancing_methods.
Maybe I need to create 2 Backend Services, each pointing at one NEG?
Yes, this is how you would do it if the external HTTPS GCLB supported it. You need to create separate backendServices for each serverless NEG and list weightedBackendServices in the route rule of the urlMap object. You can find an example here, but I believe it currently only works for the internal load balancer (ILB), per the link above.
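For reference, a sketch of what such a urlMap could look like; the backend service names bs-blue and bs-green and the project ID are hypothetical, and per the answer above this routeAction is honored only by the internal HTTP(S) load balancer at the time of writing:

    # Hypothetical url-map.yaml with weighted backends (bs-blue/bs-green are placeholders).
    cat > url-map.yaml <<'EOF'
    name: my-url-map
    defaultService: projects/MY_PROJECT/global/backendServices/bs-blue
    hostRules:
    - hosts: ['*']
      pathMatcher: matcher1
    pathMatchers:
    - name: matcher1
      defaultService: projects/MY_PROJECT/global/backendServices/bs-blue
      routeRules:
      - priority: 1
        matchRules:
        - prefixMatch: /
        routeAction:
          weightedBackendServices:
          - backendService: projects/MY_PROJECT/global/backendServices/bs-blue
            weight: 50
          - backendService: projects/MY_PROJECT/global/backendServices/bs-green
            weight: 50
    EOF
    gcloud compute url-maps import my-url-map --source url-map.yaml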
AFAIK, external HTTPS load balancing can only route to the closest location, not dispatch the traffic according to weights.
In addition, your solution requires deploying in 2 different regions, because you can't have 2 backends in the same region in the same backend service.
The easiest solution for now is to use the Cloud Run traffic splitting feature: route all the traffic to the same service, and then let the Cloud Run load balancer dispatch the requests.
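As a sketch, with Cloud Run's built-in traffic splitting, the day-by-day rollout could be driven like this (service and revision names are placeholders):

    # Day 1: all traffic to the current revision.
    gcloud run services update-traffic my-app --to-revisions my-app-v1=100
    # Day 2: 50/50 split between the two revisions.
    gcloud run services update-traffic my-app --to-revisions my-app-v1=50,my-app-v2=50
    # Day 3: all traffic to the new revision.
    gcloud run services update-traffic my-app --to-revisions my-app-v2=100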
I am currently working on deploying a front end that will scale dynamically based on usage on Google Cloud Platform. I was advised by a friend to use Google Cloud Run. I have my Angular front end building to a Docker image with a simple Express server, deployed on Google Cloud Run. This (from what I understand) means that when the request threshold is met on one of the Docker instances, another will boot up and take on the additional requests. How does this differ from a load balancer? Do I need a load balancer on top of Google Cloud Run's scaling?
I apologize in advance for my lack of devops knowledge.
Cloud Run provides autoscaling, meaning that you don't necessarily need to put a load balancer in front of your Cloud Run services (which, for serverless products in GCP, are attached to load balancers via serverless Network Endpoint Groups): this is done automatically on your behalf, and each revision is automatically scaled to the number of container instances needed to handle all incoming requests. Even cooler, since it's a scale-to-zero service, the number of instances can reach zero if you are not receiving any requests (be aware that spinning up each new instance takes some time, known as a cold start, so you can always set a min_instances value to avoid this type of issue).

The use of Network Endpoint Groups is more relevant if you only host the backend part of your application in Cloud Run, if you need your load balancer to do some sort of special routing, or, I believe the most common case, if you need a fixed external IP address for your application.
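For illustration, setting a minimum number of instances to avoid cold starts could look like this (service and image names are placeholders):

    # Keep at least one warm instance around to avoid cold starts (names are placeholders).
    gcloud run deploy my-frontend \
      --image gcr.io/MY_PROJECT/my-frontend \
      --min-instances 1 \
      --max-instances 20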
I'm new to GCP and trying to make heads and tails of it. So far, I've experienced with GKE and Cloud Run.
In GKE, I can create a Workload (deployment) for a service of any kind under any port I like and allocate resources to it. Then I can create a load balancer and open the ports from the pods to the Internet. The load balancer has an IP that I can use to access the underlying pods.
On the other hand, when I create a Cloud Run service, I give it a Docker image and a port, and once the service is up and running, it exposes an HTTPS URL! The port that I specify in Cloud Run is Docker's internal port, and if I want to access the URL, I have to do that through port 80.
Does this mean that Cloud Run is designed only for HTTP services under port 80? Or maybe I'm missing something?
Technically "no", Cloud Run cannot be used for non-HTTP services. See Cloud Run's container runtime contract.
But also "sort of":
The URL of a Cloud Run service can be kept "private" (and they are by default), this means that nobody but some specific identities are allowed to invoked the Cloud Run service. See this page to learn more)
The container must listen for requests on a certain port, and it does not have CPU outside of request processing. However, it is very easy to wrap your binary into a lightweight HTTP server. See for example the Shell sample that Uses a very small Go HTTP sevrer to invoke an arbitrary shell script.
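Just to illustrate the wrapping idea in shell (the linked sample uses Go), a toy netcat loop can put an arbitrary script behind HTTP. This is a sketch only; it assumes the OpenBSD netcat flags and a hypothetical ./script.sh, and is not production-grade:

    # Toy sketch: serve the output of an arbitrary script over HTTP on $PORT.
    # Assumes OpenBSD netcat (nc -N closes the socket after EOF on stdin).
    # The script runs once per loop iteration; its output is buffered until a client connects.
    PORT="${PORT:-8080}"
    while true; do
      { printf 'HTTP/1.1 200 OK\r\nContent-Type: text/plain\r\nConnection: close\r\n\r\n'
        ./script.sh; } | nc -N -l "$PORT"
    done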
With the Cloud Foundry "polyglot" feature for integrated service discovery and direct communication between service containers through internal routes, how does load balancing work? Does Cloud Foundry take care of the load balancing? Is there a way to use client-side load balancing, something like Ribbon, on top of this polyglot-enabled communication?
When you are using container-to-container networking...
If you connect directly to IP addresses, no load balancing is done.
If you use the platform's DNS based polyglot service discovery, then you will get limited load balancing via round-robin DNS.
With the polyglot service discovery feature, DNS responses are rotated so that IPs are listed in different orders in the response. You can observe/validate this by doing the following:
Map an internal route to an app
Scale the same app up to have two or more instances
Run cf ssh into any app container
Inside the container, run dig <internal-route>
Repeat the last step any number of times. You should see the response from DNS come back with IP addresses in a different order (they are rotated).
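Concretely, that check might look like this; the app name and internal hostname are placeholders:

    # Map an internal route to the app (my-app is a placeholder).
    cf map-route my-app apps.internal --hostname my-app
    # Scale the app up to several instances.
    cf scale my-app -i 3
    # SSH into any app container, then query the internal route repeatedly.
    cf ssh my-app
    dig my-app.apps.internal   # run inside the container; re-run and watch the IP order rotate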
That said, there is nothing to stop you from using a different form of load balancing, be that a reverse proxy app you have deployed or something client-side like Ribbon.