I have a list of 10 timestamps which keeps updating dynamically. In total there are 3 such lists for 3 users. I want to build a utility that triggers a function at the next upcoming timestamp (preferably everything on serverless compute).
I am stuck on how to achieve this on AWS or Firebase.
On Firebase/Google Cloud Functions the two most common options are either to store the schedule in a database and then periodically trigger a Cloud Function and run the tasks that are due, or to use Cloud Tasks to dynamically schedule a callback to a separate Cloud Function for each task.
I recommend also reading:
Doug's blog post on How to schedule a Cloud Function to run in the future with Cloud Tasks (to build a Firestore document TTL)
Fireship.io's tutorial on Dynamic Scheduled Background Jobs in Firebase
How can scheduled Firebase Cloud Messaging notifications be made outside of the Firebase Console?
Previous questions on dynamically scheduling functions, as this has been covered quite well before.
Update (late 2022): there is now also a built-in way to schedule Cloud Functions dynamically: enqueue functions with Cloud Tasks.
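A minimal sketch of the first (poll-the-database) option, assuming Cloud Functions for Firebase v2 and a hypothetical Firestore collection scheduled_tasks with runAt, done and userId fields (names and interval are illustrative, not prescribed):

```typescript
import { onSchedule } from "firebase-functions/v2/scheduler";
import * as admin from "firebase-admin";

admin.initializeApp();

// Runs every minute (assumption), finds tasks whose timestamp has passed and
// have not been handled yet, and processes them. The compound query needs a
// composite index on (done, runAt).
export const runDueTasks = onSchedule("every 1 minutes", async () => {
  const now = admin.firestore.Timestamp.now();
  const due = await admin
    .firestore()
    .collection("scheduled_tasks")
    .where("done", "==", false)
    .where("runAt", "<=", now)
    .get();

  await Promise.all(
    due.docs.map(async (doc) => {
      // Replace this with the work you actually want to trigger per user.
      console.log(`Running task ${doc.id} for user ${doc.get("userId")}`);
      await doc.ref.update({ done: true });
    })
  );
});
```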
I heavily use Google Cloud Run, for many reasons; one of them is the simplicity of treating each request as stateless and handling it individually.
However, I was thinking recently that for a service we have which simply writes data to a DB, it would be very handy to batch a few requests rather than write each one individually. Is this possible via serverless platforms, specifically Cloud Run?
Because Cloud Run is stateless, you can't stack up the requests (that is, keep them around, which would be stateful) and process them later on. You need an intermediary layer for that.
One good way, which I have already implemented, is to publish the requests to Pub/Sub (either directly, or via a Cloud Run/Cloud Functions service that receives the request and turns it into a Pub/Sub message).
Then you can create a Cloud Scheduler job that triggers a Cloud Run service. That Cloud Run service pulls the Pub/Sub subscription and reads a bunch of messages (maybe all of them). You then have all the "requests" as a batch and can process them inside that single Cloud Scheduler-triggered request (don't forget that you can't process in the background with Cloud Run, you must stay in a request context, for now ;) ).
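A minimal sketch of that batching service, assuming an Express app on Cloud Run, a pull subscription named write-requests-sub and a placeholder writeBatchToDb() step (all of these names are assumptions):

```typescript
import express from "express";
import { v1 } from "@google-cloud/pubsub";

const app = express();
const subscriberClient = new v1.SubscriberClient();
const subscription = subscriberClient.subscriptionPath(
  "my-project",         // assumption: your project ID
  "write-requests-sub"  // assumption: a pull subscription on the request topic
);

// Cloud Scheduler calls this endpoint; it pulls up to 100 queued messages,
// writes them to the database in one batch, then acknowledges them.
app.post("/flush", async (_req, res) => {
  const [response] = await subscriberClient.pull({
    subscription,
    maxMessages: 100,
  });
  const messages = response.receivedMessages ?? [];
  if (messages.length === 0) {
    res.status(204).send();
    return;
  }

  const rows = messages.map((m) =>
    JSON.parse(Buffer.from(m.message!.data as Uint8Array).toString())
  );
  await writeBatchToDb(rows);

  await subscriberClient.acknowledge({
    subscription,
    ackIds: messages.map((m) => m.ackId!),
  });
  res.status(200).send(`Processed ${messages.length} messages`);
});

// Placeholder: insert into your database of choice in a single batched write.
async function writeBatchToDb(rows: unknown[]): Promise<void> {
  console.log(`Would write ${rows.length} rows`);
}

app.listen(Number(process.env.PORT) || 8080);
```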
These blog posts are worth a look; I've done some reading and it looks like you can pull some good ideas from them:
Running a serverless batch workload on GCP with Cloud Scheduler, Cloud Functions, and Compute Engine
Batching Jobs in GCP Using the Cloud Scheduler and Functions
Here is another Stack Overflow thread that shows a similar approach.
I have 2 Cloud Functions that run every 5 minutes, currently using two different Cloud Scheduler jobs. Is it possible to configure Cloud Scheduler to run them both at the same time using only 1 job instead of 2?
You have several options. The 2 easiest are:
With Cloud Scheduler, publish a message to Pub/Sub instead of calling a Cloud Function directly. Then add 2 push subscriptions to that topic, one for each of your Cloud Functions. Every message published to the topic is duplicated into each subscription (here 2), so the functions are called in parallel. Note: the Pub/Sub message format isn't the same as the payload you POST to your functions today (if you have data to POST), so you need to rework that entry point; a sketch of the reworked entry point follows this answer.
With Cloud Scheduler you can also call Workflows, and in your workflow you can run tasks in parallel. I wrote an article on that this week.
In both cases you can't do this out of the box; you need an intermediary component to fan out the single scheduling event.
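Here is a minimal sketch of that reworked entry point for the Pub/Sub option, assuming Cloud Functions for Firebase v2 syntax and a hypothetical topic name scheduler-fanout; the payload you used to POST now arrives base64-encoded inside the Pub/Sub message:

```typescript
import { onMessagePublished } from "firebase-functions/v2/pubsub";

// One of the two functions, now triggered by the fan-out topic instead of HTTP.
export const jobA = onMessagePublished("scheduler-fanout", (event) => {
  // .json decodes the base64-encoded message data and parses it as JSON.
  const payload = event.data.message.json as Record<string, unknown>;
  console.log("Running job A with payload", payload);
  // ...original job A logic goes here...
});
```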
At the moment I am investigating the possibility and the proper way of migrating complex web applications from AWS to GCP. There are actually no issues with mapping general compute and networking services from one provider to another, but I wonder if GCP has a service similar to AWS Step Functions. I've already taken a look at Google Dataflow and Google Cloud Tasks. The second one seems like something close, but I am not sure it's the optimal solution.
So the question is: which Google service provides the same functionality as AWS Step Functions? And if there is none, which combination of services would you recommend to achieve effective orchestration of distributed tasks (primarily Cloud Functions)?
Thanks!
2021 Update
As Brian de Alwis noted below, since this answer was written Cloud Workflows is now generally available and is functionally similar to Step Functions.
2019 Answer
As far as I'm aware there's nothing specifically like Step Functions, but I have two strategies for creating these types of micro-service systems on Google Cloud.
Strategy 1: Cloud Run/Cloud Functions with Pub/Sub
Here I'd create microservices using Cloud Run or Cloud Functions and subscribe these functions to Pub/Sub topics. That means that when Function A executes and completes its work, it publishes a message to a specific topic with a data packet that any function subscribed to that topic will receive and act on.
For example you could create two topics named FunctionASuccess and FunctionAError and create two separate functions that subscribe to one or the other and handle the success and error use cases.
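A minimal sketch of Strategy 1, assuming the topic names from the example above and a placeholder doWork() step (the orderId field is purely illustrative):

```typescript
import { PubSub } from "@google-cloud/pubsub";

const pubsub = new PubSub();

// Function A: do its work, then publish the outcome to the success or error
// topic so that downstream subscribed functions pick up the next step.
export async function functionA(input: { orderId: string }): Promise<void> {
  try {
    const result = await doWork(input);
    await pubsub
      .topic("FunctionASuccess")
      .publishMessage({ json: { orderId: input.orderId, result } });
  } catch (err) {
    await pubsub
      .topic("FunctionAError")
      .publishMessage({ json: { orderId: input.orderId, error: String(err) } });
  }
}

// Placeholder for Function A's real work.
async function doWork(input: { orderId: string }): Promise<string> {
  return `processed ${input.orderId}`;
}
```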
Strategy 2: Firebase Functions with Firestore/Realtime Database
Similarly to the above, I create Firebase Functions that watch for changes in Firestore or in the RTDB.
When Function A executes and completes its task, it saves a document to the FunctionAResults collection in Firestore or the RTDB. Functions that are subscribed to changes in the FunctionAResults collection are then executed and take the work to the next step.
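A minimal sketch of Strategy 2 with Cloud Functions for Firebase v2, assuming the FunctionAResults collection name used above:

```typescript
import { onDocumentCreated } from "firebase-functions/v2/firestore";

// Function B fires whenever Function A writes a result document.
export const functionB = onDocumentCreated(
  "FunctionAResults/{resultId}",
  (event) => {
    const result = event.data?.data(); // the document Function A just wrote
    console.log("Continuing the pipeline with", result);
    // ...next step of the workflow goes here...
  }
);
```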
They both work reliably so I have no preference, but I typically go with the 2nd strategy if I'm utilizing other Firebase services.
Cloud Workflows was announced at Cloud Next On Air 2020.
You're looking for Cloud Composer. It's based on the open-source project Apache Airflow, which allows you to define and orchestrate workflows in a similar way to Step Functions.
I'm trying to schedule a batch job using google_cloud_scheduler_job terraform resource.
As per the document https://www.terraform.io/docs/providers/google/r/cloud_scheduler_job.html, I see only the following options:
PubSub target
HTTP target
AppEngine target
Any suggestions on how to create a batch job scheduler using google_cloud_scheduler_job? Thanks.
Let us split the story into two parts. Let us assume a function ... that, when called, will initiate your batch job. You can write this function in a variety of programming languages ... in this example, we'll assume Node. In your Node function, you could (for example) call the Dataproc Node.js submitJob function to instantiate a Dataproc job.
Now the question changes from "How do I schedule the execution of my batch job?" to "How do I schedule the execution of a function (which executes the batch job)?". And here is where the combination of Google Cloud Scheduler and Google Cloud Functions comes into play. Google Cloud Functions allows you to write a code function which is externally triggered by an arriving event. Such an event could be an HTTP request (as a webhook) or a Pub/Sub message. And where can these events come from? The answer is Google Cloud Scheduler. Once you have created your function, you can define that the function be executed/triggered on a schedule. The result of all of this appears to be exactly what you are looking for.
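A minimal sketch of such a function, assuming the Functions Framework for Node/TypeScript and placeholder project, region, cluster and job values (all of these identifiers are assumptions for illustration):

```typescript
import * as functions from "@google-cloud/functions-framework";
import { v1 } from "@google-cloud/dataproc";

// Cloud Scheduler sends an HTTP request to this function, which submits a
// Dataproc job and returns immediately.
const jobClient = new v1.JobControllerClient({
  apiEndpoint: "us-central1-dataproc.googleapis.com", // assumption: region
});

functions.http("startBatchJob", async (_req, res) => {
  const [job] = await jobClient.submitJob({
    projectId: "my-project", // assumption
    region: "us-central1",   // assumption
    job: {
      placement: { clusterName: "my-cluster" }, // assumption
      pysparkJob: {
        mainPythonFileUri: "gs://my-bucket/batch_job.py", // assumption
      },
    },
  });
  res.status(200).send(`Submitted Dataproc job ${job.reference?.jobId}`);
});
```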
A tutorial illustrating Cloud Scheduler and Cloud Functions interacting can be found here.
I'm new to BigQuery and need to do some tests on it. Looking through the BigQuery documentation, I can't find anything about creating jobs and scheduling them.
I found on another page on the internet that the only available method is to create a bucket in Google Cloud Storage and a function in Cloud Functions using JavaScript, with the SQL query written inside its body.
Can someone help me here? Is that true?
Your question is a bit confusing, as you mix up scheduling jobs with defining a query in a Cloud Function.
There is a difference between scheduling jobs and scheduling queries.
BigQuery offers scheduled queries. See docs here.
BigQuery Data Transfer Service (schedule recurring data loads from GCS). See docs here.
If you want to schedule jobs (load, delete, copy jobs, etc.), you are better off doing this with a trigger on the observed resource, such as a new Cloud Storage file, a Pub/Sub message, or an HTTP trigger, all wired into a Cloud Function.
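For example, a minimal sketch of the Cloud Storage variant, assuming the Functions Framework for Node/TypeScript and placeholder dataset/table names, could look like this:

```typescript
import * as functions from "@google-cloud/functions-framework";
import { BigQuery } from "@google-cloud/bigquery";
import { Storage } from "@google-cloud/storage";

const bigquery = new BigQuery();
const storage = new Storage();

// Fires when a new object is finalized in the bucket and starts a BigQuery
// load job for that file (the job runs asynchronously).
functions.cloudEvent<{ bucket: string; name: string }>(
  "loadNewFile",
  async (event) => {
    const { bucket, name } = event.data!;
    const [job] = await bigquery
      .dataset("my_dataset") // assumption
      .table("my_table")     // assumption
      .createLoadJob(storage.bucket(bucket).file(name), {
        sourceFormat: "CSV", // assumption: adjust to your file format
        autodetect: true,
      });
    console.log(`Load job ${job.id} started for gs://${bucket}/${name}`);
  }
);
```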
Some other related blog posts:
How to schedule a BigQuery ETL job with Dataprep
Scheduling BigQuery Jobs: This time using Cloud Storage & Cloud Functions