What is the rate limit for GCE instance metadata service? - google-cloud-platform

In a GCP compute environment, I need to get an id_token (it expires every 3600s) for service-to-service authentication (calling GCF, Cloud Run, etc.).
I get this id_token from the instance metadata service:
http://metadata/computeMetadata/v1/instance/service-accounts/default/identity?audience=[...]
Instead of implementing some form of caching + TTL for this identity token, I'd like to know whether I can simply call this endpoint every time I make an outbound RPC (I might make a lot).
So my questions are:
What's the rate limit for the instance metadata service?
Does the rate limit differ between GCE vs serverless platforms (GAE, GCF, Cloud Run)?
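
For reference, the caching + TTL alternative mentioned above could look roughly like the sketch below. It is only an illustration under stated assumptions: `IdTokenCache`, `jwt_expiry`, the `fetch` callable and the `refresh_margin` value are hypothetical names chosen here, the token's `exp` claim is read without verifying the signature (the cache only needs the expiry time), and the class is not thread-safe.

```python
import base64
import json
import time
from typing import Callable, Dict, Tuple


def jwt_expiry(token: str) -> float:
    """Return the `exp` claim (Unix seconds) of a JWT, without verifying it."""
    payload_b64 = token.split(".")[1]
    # JWT segments are base64url without padding; restore it before decoding.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    return float(json.loads(base64.urlsafe_b64decode(payload_b64))["exp"])


class IdTokenCache:
    """Caches one ID token per audience and refetches shortly before expiry.

    `fetch` is any callable that returns a fresh ID token for an audience,
    for example a function that queries the metadata server (assumption).
    """

    def __init__(
        self,
        fetch: Callable[[str], str],
        refresh_margin: float = 300.0,  # hypothetical safety margin, seconds
        clock: Callable[[], float] = time.time,
    ) -> None:
        self._fetch = fetch
        self._margin = refresh_margin
        self._clock = clock
        self._tokens: Dict[str, Tuple[str, float]] = {}

    def get(self, audience: str) -> str:
        cached = self._tokens.get(audience)
        if cached is not None and self._clock() < cached[1] - self._margin:
            return cached[0]  # still valid, no metadata call needed
        token = self._fetch(audience)
        self._tokens[audience] = (token, jwt_expiry(token))
        return token
```

With something like this in place, the metadata endpoint is hit roughly once per token lifetime per audience rather than once per RPC, so the rate limit matters much less.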

Related

AWS API Gateway - Using custom API endpoint for authorization running always on an EC2 instance

We are planning on setting up an AWS API Gateway. We also have a requirement that every call that hits our API gateway should go through a JWT token validation. We are using Okta as our IDP which will generate an access token.
We tried the AWS REST API gateway, which supports Lambda authorizers, but the problem we have with it is that Lambdas are too slow and there is no way for us to cache the JWKS URI and keys. We get thousands of requests per second, and Lambdas do not work well here. Our goal is to validate a token within double-digit milliseconds, which is only possible if we can cache the JWKS public key and refetch it only when it is rotated. Please note that we do not want to use the REST API's caching option to cache authorization results.
We also tried the AWS HTTP API, which supports a JWT authorizer, but even there we only get triple-digit millisecond performance, around 500ms-700ms. Most of that time is spent fetching the OIDC metadata endpoint and fetching the public key from the Okta issuer endpoint.
We know that we could implement our own custom token validation (authorizer) inside one of our own REST API services (in Go or Node.js) that is always running, either on a Fargate task in an ECS cluster or, for simplicity, on a plain EC2 instance.
Does AWS API Gateway allow some kind of custom hook so that, whichever gateway endpoint is invoked, it first calls our custom REST API endpoint running on an EC2 instance and only invokes the actual URL once it gets a valid-token response? Is this even possible? If not, what other options do we have?
Thanks
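
To make the caching requirement concrete, here is a minimal sketch of the "cache the JWKS and refetch only on rotation" idea in Python (used only for illustration; the same shape applies in Go or Node.js). `JwksCache`, `fetch_jwks` and `min_refetch_interval` are hypothetical names, not part of any AWS or Okta API. The assumption is that a rotated key shows up as a `kid` the cache has not seen, and that refetches should be rate-limited so that tokens with bogus `kid` values cannot force a fetch on every request.

```python
import time
from typing import Any, Callable, Dict, Optional


class JwksCache:
    """Keeps JWKS keys in memory, indexed by `kid`.

    `fetch_jwks` is any callable returning the parsed JWKS document
    (a dict with a "keys" list), e.g. an HTTP GET of the issuer's JWKS URI.
    """

    def __init__(
        self,
        fetch_jwks: Callable[[], Dict[str, Any]],
        min_refetch_interval: float = 60.0,  # hypothetical value, seconds
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch_jwks
        self._min_interval = min_refetch_interval
        self._clock = clock
        self._keys: Dict[str, Dict[str, Any]] = {}
        self._last_fetch: Optional[float] = None

    def get_key(self, kid: str) -> Optional[Dict[str, Any]]:
        key = self._keys.get(kid)
        if key is not None:
            return key  # hot path: no network call
        # Unknown kid: the issuer may have rotated its keys. Refetch, but
        # not more often than once per min_refetch_interval.
        now = self._clock()
        if self._last_fetch is None or now - self._last_fetch >= self._min_interval:
            jwks = self._fetch()
            self._keys = {k["kid"]: k for k in jwks.get("keys", [])}
            self._last_fetch = now
        return self._keys.get(kid)
```

Signature verification itself would be done with a JWT library using the returned key; that part is left out here.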

Can Services in GCP's Monitoring monitor endpoints?

I installed managed Anthos on a GKE cluster. Anthos Service Mesh is working and is displaying my API. Thanks to that Services that are in Monitoring automatically detect my API. This is great as it enables me to easily set SLOs and Error Budget for my API.
However, I would like to be able to easily set SLOs for individual endpoints in my API. Services (in Monitoring) detects only my API and not the endpoints within it (my API is one pod/container + sidecar). I tried to add endpoints to Services in Monitoring, but it looks like it is only possible to add Kubernetes objects there.
Is there a way to use Services in Monitoring with endpoints? Is the only way to do so to break endpoints to separate microservices?
You can monitor your endpoints using Cloud Endpoints with OpenAPI, which allows you to monitor the health of APIs you own by using the logs and metrics Cloud Endpoints maintains for you automatically. When users make requests to your API, Endpoints logs information about the requests and responses and also tracks three of the four golden signals of monitoring: latency, traffic, and errors. These usage and performance metrics help you monitor your API.
See "Configuring Cloud Endpoints" for the configuration process, "Monitoring your API" for the monitoring process, and the Cloud Endpoints overview for general background.

Setting Cloud Monitoring uptime checks for non publicly accessible backends

I'm having some trouble setting uptime checks for some Cloud Run services that don't allow unauthenticated invocations.
For context, I'm using Cloud Endpoints + ESPv2 as an API gateway that's connected to a few Cloud Run services.
The ESPv2 container/API gateway allows unauthenticated invocations, but the underlying Cloud Run services do not (since requests to these backends flow via the API gateway).
Each Cloud Run service has an internal health check endpoint that I'd like to hit periodically via Cloud Monitoring uptime checks.
This serves the purpose of ensuring that my Cloud Run services are healthy, and it also gives the added benefit of reduced cold boot times, as the containers are kept 'warm'.
However, since the protected Cloud Run services expect a valid authorisation header, all of the requests from Cloud Monitoring fail with a 403.
From the Cloud Monitoring UI, it looks like you can only configure a static auth header, which won't work in this case. I need to be able to dynamically create an auth header per request sent from Cloud Monitoring.
I can see that Cloud Scheduler supports this already. I have a few internal endpoints on the Cloud Run services (that aren't exposed via the API gateway) that are hit via Cloud Scheduler, and I am able to configure an OIDC auth header on each request. Ideally, I'd be able to do the same with Cloud Monitoring.
I can see a few workarounds for this, but all of them are less than ideal:
Allow unauthenticated invocations for the underlying Cloud Run services. This will make my internal services publicly accessible and then I will have to worry about handling auth within each service.
Expose the internal endpoints via the API gateway/ESPv2. This is effectively the same as the previous workaround.
Expose the internal endpoints via the API gateway/ESPv2 AND configure some sort of auth. This sort of works but at the time of writing the only auth methods supported by ESPv2 are API Keys and JWT. JWT is already out of the question but I guess an API key would work. Again, this requires a bit of set up which I'd rather avoid if possible.
Would appreciate any thought/advice on this.
Thanks!
A simpler option that may work for your use case is a TCP uptime check on port 443:
Create your own Cloud Run service using https://cloud.google.com/run/docs/quickstarts/prebuilt-deploy.
Create a new TCP uptime check on port 443 against the Cloud Run URL.
Wait a couple of minutes.
Location results: All locations passed
Virginia OK
Oregon OK
Iowa OK
Belgium OK
Singapore OK
Sao Paulo OK
I would also point out that Cloud Run is a fully managed Google product with a 99.95% monthly uptime SLA and no recent incidents in the past few months, but proactively monitoring it on your end is still a very good idea.
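
To illustrate what a TCP check on port 443 amounts to, here is a minimal Python sketch. It is not how Cloud Monitoring implements its checks; `tcp_check` is a hypothetical helper that only tries to open a TCP connection. Keep in mind that a check like this confirms that the host accepts connections on the port and does not send any authenticated request to the service's health endpoint.

```python
import socket


def tcp_check(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS failures.
        return False


# Example call, using the service hostname from your Cloud Run URL:
#   tcp_check("myapp.a.run.app", 443)
```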

Using Pub/Sub on a public Cloud Run service

According to the "Authenticating service-to-service" documentation for Cloud Run, to use Pub/Sub and Cloud Scheduler with a service, unauthenticated access must be disabled. These products reach the service through HTTP calls, because Cloud Run services can scale to zero.
My services allow internal and Load Balancer traffic and must be publicly available for frontend clients, but they also must be able to communicate with each other privately with Pub/Sub.
Is there a way to achieve this? It feels unnatural to create a separate private service just for using Pub/Sub.
It's a missing piece. You can't plug Pub/Sub push subscriptions or Cloud Scheduler (or Cloud Tasks, Cloud Build, Workflows, ...) into your VPC. I asked Google Cloud about this a few months ago, and it should be fixed by a new networking feature soon. At least in 2021!
So, in your case, if your Cloud Run service is accessible from the public internet through a Load Balancer, you can use this public endpoint to call the path that you want on your service and thus trigger the processing.
If your Cloud Run service is only accessible with ingress=internal, you can't for now.

How can my cloud run service call other cloud run services?

I have a service listening on 'https://myapp.a.run.app/dosomething', but I want to leverage the scalability features of Cloud Run, so in the controller for 'dosomething', I send off 10 requests to 'https://myapp.a.run.app/smalltask'; with my app configured to allow servicing of only one request per instance, I expect 10 instances to spin up, all do their smalltask, and return (all within the timeout period).
But I don't know how to properly authenticate the request, so those 10 requests all result in 403's. For Cloud Run services, I manually pass in a bearer token with the initial request, though I expect to add some api proxy at some point. But without said API proxy, what's the right way to send the request such that it is accepted? The app is running as a user that does have permissions to access the endpoint.
Authenticating service-to-service
If your architecture is using multiple services, these services will likely need to communicate with each other.
You can use synchronous or asynchronous service-to-service communication:
For asynchronous communication, use
Cloud Tasks for one to one asynchronous communication
Pub/Sub for one to many asynchronous communication
Cloud Scheduler for regularly scheduled asynchronous communication.
Cloud Workflows for orchestration services.
For synchronous communication
One service invokes another one over HTTP using its endpoint URL. In this use case, it's a good idea to ensure that each service is only able to make requests to specific services. For instance, if you have a login service, it should be able to access the user-profiles service, but it probably shouldn't be able to access the search service.
First, you'll need to configure the receiving service to accept requests from the calling service:
Grant the Cloud Run Invoker (roles/run.invoker) role to the calling service identity on the receiving service. By default, this identity is PROJECT_NUMBER-compute@developer.gserviceaccount.com.
In the calling service, you'll need to:
Create a Google-signed OAuth ID token with the audience (aud) set to the URL of the receiving service. This value must contain the schema prefix (http:// or https://) and custom domains are currently not supported for the aud value.
Include the ID token in an Authorization: Bearer ID_TOKEN header. You can get this token from the metadata server, while the container is running on Cloud Run (fully managed). If the application is running outside Google Cloud, you can generate an ID token from a service account key file.
For a full guide and examples in Node/Python/Go/Java and others see: Authenticating service-to-service
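
As a rough illustration of the two calling-side steps above, here is a minimal Python sketch that uses only the standard library. The metadata server URL and the `Metadata-Flavor: Google` header are the documented way to request an identity token; the function names and the injectable `opener` parameter are assumptions made for this example, and error handling and token caching are left out.

```python
import urllib.parse
import urllib.request

METADATA_IDENTITY_URL = (
    "http://metadata.google.internal/computeMetadata/v1/"
    "instance/service-accounts/default/identity"
)


def identity_token_request(audience: str) -> urllib.request.Request:
    """Build the metadata-server request for an ID token for `audience`."""
    query = urllib.parse.urlencode({"audience": audience})
    return urllib.request.Request(
        f"{METADATA_IDENTITY_URL}?{query}",
        headers={"Metadata-Flavor": "Google"},  # required by the metadata server
    )


def authorized_request(url: str, id_token: str) -> urllib.request.Request:
    """Build a request to the receiving service carrying the ID token."""
    return urllib.request.Request(
        url, headers={"Authorization": f"Bearer {id_token}"}
    )


def call_service(url: str, audience: str, opener=urllib.request.urlopen) -> bytes:
    """Fetch an ID token for `audience`, then call `url` with it."""
    with opener(identity_token_request(audience)) as resp:
        id_token = resp.read().decode("utf-8").strip()
    with opener(authorized_request(url, id_token)) as resp:
        return resp.read()


# Example for the question above (only works when running on Google Cloud):
#   call_service("https://myapp.a.run.app/smalltask", "https://myapp.a.run.app")
```

Note that the audience is the receiving service's base URL, while the request itself goes to the specific path.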