Deploy function with long running time on Google Cloud Run - google-cloud-platform

If I understood it well, Google Cloud Run will make an API publicly available. Once a request is received, an instance is started and the job is processed. Once the job done, the instance will be terminated. Is this right?
If so, I presume that Google determine when the instance should be shutdown when the HTTP response in sent back to the client. Is that also right?
In my case the process will run from 10 to 20 Minutes. Can I still send the HTTP response after so much time? Any Advice on how to implement that?

Frankly, all of this is well documented in the cloud run docs:
Somewhat, but this depends on how you configured your scaling https://cloud.google.com/run/docs/about-instance-autoscaling
Also see above, but a request is considered "done" when the HTTP connection is closed (either by you or the client), yes
60 mins is the limit, see:
https://cloud.google.com/run/docs/configuring/request-timeout
Any Advice on how to implement that?
You just keep the connection open for 20mins, but do note the remark on long living connections in the link above.

Related

How to use Google Cloud PubSub and Run to handle resource-intensive long-running tasks?

I've got a Google Cloud PubSub topic which at times has thousands of messages and at times zero messages coming in. These messages represent tasks which can take upwards of an hour each. Preferably I'm able to use Cloud Run for this, as it scales really well to the demand, if a thousand messages gets published, I want 100s of Cloud Run instances to spin up. These Run instances get started by a push subscription. The problem is that PubSub has a 600 second timeout for the acknowledgement. This means in order to have Cloud Run process these messages they have to finish within 600 seconds. If they do not, PubSub times it out, and sends it again, causing the task to be restarted until the first task finally does acknowledge it (this causes the same task to be ran many times). Cloud Run acknowledges the messages by returning a 2** HTTP status code. The documentation states
When an application running on Cloud Run finishes handling a request, the container instance's access to CPU will be disabled or severely limited. Therefore, you should not start background threads or routines that run outside the scope of the request handlers.
So is it maybe possible to acknowledge a PubSub request through code and continue the processing, without having Google Cloud Run hand over the resources? Or is there a better solution I'm unaware of?
Because these processes are so code/resource-intensive, I feel Cloud Functions will not suffice. I've looked at https://cloud.google.com/solutions/using-cloud-pub-sub-long-running-tasks and https://cloud.google.com/blog/products/gcp/how-google-cloud-pubsub-supports-long-running-workloads. But these didn't answer my question.
I've looked at Google Cloud Tasks, which might be something? But the rest of the project has been built around PubSub/Run/Functions, so preferably I stick with that.
This project is written in Python.
So preferably I would like to write my Google Cloud Run tasks like this:
#app.route('/', methods=['POST'])
def index():
"""Endpoint for Google Cloud PubSub messages"""
pubsub_message = request.get_json()
logger.info(f'Received PubSub pubsub_message {pubsub_message}')
if message_incorrect(pubsub_message):
return "Invalid request", 400 #use normal NACK handling
# acknowledge message here without returning
# ...
# Do actual processing of the task here
# ...
So how can or should I solve this, so that the the resource-intensive tasks get properly scaled on demand ( so a push PubSub subscription ). And the tasks only get executed once.
Answers:
In short what has been answered. Cloud Run and Functions are just not suited for this problem. There is no way to have them do tasks that take longer than 9 or 15 minutes respectively. The only solution is to switch over to another Google Service and use a pull style subscription and lose out on auto-scaling of GC Run/Functions
Cloud Run on GKE can handle long process, more CPU and memory than available on managed platform. However, you have a GKE cluster always running and you loose the "pay-as-you-use" benefit.
If you want to use this solution, don't link directly PubSub push subscription to your Cloud Run on GKE. Use Cloud Task with HTTP job for this. The timeout is longer than PubSub (up to 24h instead of 10 min) and the retry policies are customizables.
Neither Cloud Functions nor Cloud Run is sufficient for arbitrarily long running operations. Cloud Functions has a hard cap of 9 minutes per invocation, and Cloud Run caps at 60. If you need more time, you're going to have to delegate the work to another product, such as Google Compute Engine. It should be possible to kick off some Compute Engine work from one of the serverless products.
Give the limits of pubsub acks, you'll probably have to find a way for a client to be able to poll or listen to some resource to find out when the work is actually done. You could use a database for that, and Cloud Firestore lets you listen to documents to find out when they change. So you could use that to track the status of your long-running work.

How to handle long requests in Google Cloud Run?

I have hosted my node app in Cloud Run and all of my requests served within 300 - 600ms time. But one endpoint that gets data from a 3rd party service so that request takes 1.2s - 2.5s to complete the request.
My doubts regarding this are
Is 1.2s - 2.5s requests suitable for cloud run? Or is there any rule that the requests should be completed within xx ms?
Also see the screenshot, I got a message along with the request in logs "The request caused a new container instance to be started and may thus take longer and use more CPU than a typical request"
What caused a new container instance to be started?
Is there any alternative or work around to handle long requests?
Any advice / suggestions would be greatly appreciated.
Thanks in advance.
I don't think that will be an issue unless you're worried about the cost of the CPU/memory time, which honestly should only matter if you're getting 10k+ requests/day. So, probably doesn't matter and cloud run can handle that just fine (my own app does requests longer than that with no problem)
It's possible that your service was "scaled to zero" meaning that there were no containers left running to serve requests. In that case, it would be necessary to start up a new instance and wait for whatever initializing/startup costs are associated with that process. It's also possible that it was auto-scaled due to all other instances being at their request limits. Make sure that your setting for max concurrent requests per instance is set greater than one - Node/Express can handle multiple requests at once. Plus, you'll only get charged for the total time spend, not per request:
In situations where you get very long (30 seconds, minutes+) operations, it may be a good idea to switch to some different data transfer method. You could use polling, where the client makes a request every 5 seconds and checks if the response is ready. You could also switch to some kind of push-based system like WebSockets, but Cloud Run doesn't have support for that.
TL;DR longer requests (~10-30 seconds) should be fine unless you're worried about the cost of the increased compute time they may occur at scale.

Alexa sent multiple request to AWS Lambda

I'm building the Alexa skill that sends the request to my web server,
then web server will do some process and upload a file to Amazon S3.
During the period of web server process, I make skill keep getting the file from Amazon S3 per 10 seconds till get the file. And the response is based on the file content.
But unfortunately, the web server process takes more than 1 minute. That means skill must stay more than 1 minute to get the file to response.
For now, I used progressive response with async await in my code,
and skill did keep waiting for the file on S3.
But I found that the skill will send the second request to Lambda after 50 seconds automatically. That means for the same skill, i got the two lambda function running at the same time.
And the execution result is : After the first response that progressive response made, 50 seconds later will hear another response that also made by the progressive response which belongs to the second request.
And nothing happened till the end.
I know it is bad to let skill waits this long, but i still want to figure out the executable way if skill needs to wait this long.
There are some points I want to figure out.
Is there anyway to prevent the skill to send the second
requests to Lambda?
Is there another way I can try to accomplish the goal?
Thanks
Eventually, I found that the second invoke of Lambda is not from Alexa, is from AWS Lambda itself. Refer to the following artical
https://cloudonaut.io/your-lambda-function-might-execute-twice-deal-with-it/
So you have to deal with this kind of situation in your Lambda code. One thing can be used is these two times invoke's request id is the same. So you can tell if this is the first time execution by checking your storage for the same request id which you store at the first time execution.
Besides, I also found that once the Alexa Skill waits for more than 1 minutes, it will crash and return the error by speaking (test by Amazon Echo). And there is nothing different in the AWS Lambda log compare to the normal execution one. That meaning the Log seems to be fine but actually the execution result is not.
Hope this can help someone is also struggled at this problem.

Is there an AWS / Pagerduty service that will alert me if it's NOT notified

We've got a little java scheduler running on AWS ECS. It's doing what cron used to do on our old monolith. it fires up (fargate) tasks in docker containers. We've got a task that runs every hour and it's quite important to us. I want to know if it crashes or fails to run for any reason (eg the java scheduler fails, or someone turns the task off).
I'm looking for a service that will alert me if it's not notified. I want to call the notification system every time the script runs successfully. Then if the alert system doesn't get the "OK" notification as expected, it shoots off an alert.
I figure this kind of service must exist, and I don't want to re-invent the wheel trying to build it myself. I guess my question is, what's it called? And where can I go to get that kind of thing? (we're using AWS obviously and we've got a pagerDuty account).
We use this approach for these types of problems. First, the task has to write a timestamp to a file in S3 or EFS. This file is the external evidence that the task ran to completion. Then you need an http based service that will read that file and calculate if the time stamp is valid ie has been updated in the last hour. This could be a simple php or nodejs script. This process is exposed to the public web eg https://example.com/heartbeat.php. This script returns a http response code of 200 if the timestamp file is present and valid, or a 500 if not. Then we use StatusCake to monitor the url, and notify us via its Pager Duty integration if there is an incident. We usually include a message in the response so a human can see the nature of the error.
This may seem tedious, but it is foolproof. Any failure anywhere along the line will be immediately notified. StatusCake has a great free service level. This approach can be used to monitor any critical task in same way. We've learned the hard way that critical cron type tasks and processes can fail for any number of reasons, and you want to know before it becomes customer critical. 24x7x365 monitoring of these types of tasks is necessary, and helps us sleep better at night.
Note: We always have a daily system test event that triggers a Pager Duty notification at 9am each day. For the truly paranoid, this assures that pager duty itself has not failed in some way eg misconfiguratiion etc. Our support team knows if they don't get a test alert each day, there is a problem in the notification system itself. The tech on duty has to awknowlege the incident as per SOP. If they do not awknowlege, then it escalates to the next tier, and we know we have to have a talk about response times. It keeps people on their toes. This is the final piece to insure you have robust monitoring infrastructure.
OpsGene has a heartbeat service which is basically a watch dog timer. You can configure it to call you if you don't ping them in x number of minutes.
Unfortunately I would not recommend them. I have been using them for 4 years and they have changed their account system twice and left my paid account orphaned silently. I have to find a new vendor as soon as I have some free time.

aws get time according to cloud

I'm not seeing any answers for this when googling, so it's likely I'm asking a silly question, but here goes!
I want to call the aws api to get what time my cloud services think it is, because that appears to be different to my local server time (by a small but significant amount).
For a bit more context, I am running an automated test which needs to check that a new object is created in S3 as a result of the system under test working. The object which is created is given a timestamp in its name by AWS, based on the amazon server time. I use this timestamp to to distinguish the object from all the other objects in the bucket as it will be the only one with a time after the start of my test run. Unfortunately the time it gets given by amazon ends up being a few seconds before the test run started, because my server time is ahead of amazon's.
So if I could get the 'cloud' start time at the start of my test and compare that I won't have this problem.
EDIT
In case anyone has the same problem, I used this quick and dirty workaround in c#, in the spirit of the accepted answer but not quite as rigourous.
public async Task<DateTime> EstimateInternetTime()
{
var httpClient = new HttpClient();
var response = await httpClient.GetAsync("http://google.com");
return response.Headers.Date.Value.UtcDateTime;
}
There isn't a "cloud time", thus no way to ensure your timestamps match exactly to those used in AWS.
The best you can do is synchronize your clock using NTP to a public NTP server.
Amazon does have their own pool of NTP servers which you can use:
0.amazon.pool.ntp.org
1.amazon.pool.ntp.org
3.amazon.pool.ntp.org
2.amazon.pool.ntp.org
In many cases, EC2 instances are already configured to use these servers.
See the AWS documentation about configuring NTP on your Linux servers:
http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/set-time.html#configure_ntp