I have a very simple WebJob that does just one thing. In settings.job, I schedule it to run every 5 minutes: {"schedule":"0 */5 * * * *"}
Now I want to add a new WebJob, and I want to include it in the same project. This new WebJob will run every 2 minutes. Is it possible to specify a schedule for each individual WebJob in settings.job? Or is it better to specify the schedule directly on the function? e.g.
public static void MyFunction(
    [TimerTrigger("0 */2 * * * *", RunOnStartup = true)] TimerInfo timerInfo,
    TextWriter log)
Yes, you can have multiple WebJobs in the same project. However, to do so you need to use the TimerTrigger from the Azure WebJobs SDK Extensions.
More details on the extension can be found at URL
https://github.com/Azure/azure-webjobs-sdk-extensions#timertrigger
You may also find the blog post below useful; it covers how to have a single project hold multiple scheduled Azure WebJobs:
https://patrickdesjardins.com/blog/how-to-have-a-single-project-to-hold-multiple-scheduled-azure-webjobs
Thanks
Alok
Related
Error message:
Error: There was an error deploying functions
firebase-debug.log holds this:
[debug] [2021-11-16T12:12:16.165Z] Error: Failed to upsert schedule function lab2 in region europe-west3
function code:
exports.lab2 = functions
  .region('europe-west3')
  .pubsub.schedule('*/10 * * * *')
  .onRun(lab);
What can I do? Google support leads to stackoverflow, so I post it here. Are there better ways to deal with the Google Cloud problems?
When you are using scheduled functions in Firebase Functions, an App Engine instance is created that is needed for Cloud Scheduler to work. You can read about it here. During its setup you're prompted to select your project's default Google Cloud Platform (GCP) resource location (if it wasn't already selected when setting up another service).
You are getting that error because the default GCP resource location you specified differs from the region of your scheduled Cloud Function. If you click the cogwheel next to Project Overview in Firebase, you can see where your resources are located. Making the default GCP resource location and the scheduled function's region the same solves the issue.
The reason our scheduled function deployment failed was the cron expression. For some reason Firebase couldn't understand 0 0 3 1/1 * ? *, but could understand every 5 minutes.
It's a shame that Firebase doesn't provide a better error message. "Error: Failed to upsert schedule function xxx in region xxx" is far too generic.
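A quick pre-deploy check can catch this kind of mismatch. The sketch below only counts fields: the failing expression above has seven (it looks like Quartz-style syntax), while unix-cron has five. The function name is made up for illustration, and the check does not validate the contents of each field.

```python
def is_five_field_cron(expression: str) -> bool:
    """Return True if the expression has exactly five whitespace-separated fields."""
    return len(expression.split()) == 5


# Compare a working schedule with the one that failed to deploy.
for expr in ("*/10 * * * *", "0 0 3 1/1 * ? *"):
    verdict = "5 fields" if is_five_field_cron(expr) else "not 5 fields"
    print(f"{expr!r}: {verdict}")
```

A failing check here points at the schedule string rather than at the region or the deployment itself.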
I had the same problem deploying my scheduled functions, and I solved it with the steps below:
Step 1: Rename the scheduled function (for example: newScheduleABC), then deploy only that function:
$ firebase deploy --only functions:newScheduleABC
NOTE: as @Nicholas mentioned, the unix-cron format that GCP accepts has only 5 fields: schedule(* * * * *)
Step 2: Delete the old (scheduled) functions by going to your Firebase Console: https://console.firebase.google.com/.../functions , where you will see all of your functions. Click the vertical ellipsis at the end of a function's row and click Delete.
That's it. There should be no problem from now.
You can read more from Manage functions
I have a very simple C# console exe. My code deletes a blob from a particular blob storage. It takes a couple of command-line arguments - container name & blob name and deletes the blob whenever triggered.
Now, I want to schedule this exe as a webjob.
I have a couple of questions -
How can I manually trigger this webjob since it takes command line arguments?
Is there any way that I can trigger this webjob via a SQL server stored procedure?
You can use the stored procedure to send http requests to fulfill your needs.
Steps
You can use the Kudu Webjob API to invoke your function.
Create an HTTP request method in SQL Server.
Related post
How can I make HTTP request from SQL server?
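To make the first step concrete, here is a minimal sketch of the HTTP call itself, written in Python only to show the shape of the request that the stored procedure would need to reproduce. It assumes the Kudu triggered-WebJobs endpoint (POST .../api/triggeredwebjobs/{job}/run with an optional arguments query parameter) and basic authentication with deployment credentials; the site name, job name, and credentials are placeholders.

```python
import base64
import urllib.parse
import urllib.request


def build_run_url(site_name: str, job_name: str, arguments: str) -> str:
    """Build the Kudu URL that starts a triggered WebJob with command-line arguments."""
    base = "https://{}.scm.azurewebsites.net/api/triggeredwebjobs/{}/run".format(
        site_name, urllib.parse.quote(job_name)
    )
    # quote_via=quote encodes spaces as %20 rather than '+'.
    query = urllib.parse.urlencode({"arguments": arguments}, quote_via=urllib.parse.quote)
    return base + "?" + query


def trigger_webjob(site_name, job_name, arguments, user, password):
    """POST to the Kudu endpoint using basic auth (deployment credentials assumed)."""
    request = urllib.request.Request(
        build_run_url(site_name, job_name, arguments), data=b"", method="POST"
    )
    token = base64.b64encode("{}:{}".format(user, password).encode()).decode()
    request.add_header("Authorization", "Basic " + token)
    with urllib.request.urlopen(request) as response:
        return response.status


# Example (placeholders): container name and blob name passed as the two arguments.
# trigger_webjob("mysite", "deleteblob", "mycontainer myblob.txt", "$mysite", "password")
```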
I have successfully scheduled my query in BigQuery, and the result is saved as a table in my dataset. I see a lot of information about scheduling data transfer in to BigQuery or Cloud Storage, but I haven't found anything regarding scheduling an export from a BigQuery table to Cloud Storage yet.
Is it possible to schedule an export of a BigQuery table to Cloud Storage so that I can further schedule having it SFTP-ed to me via Google BigQuery Data Transfer Services?
There isn't a managed service for scheduling BigQuery table exports, but one viable approach is to use Cloud Functions in conjunction with Cloud Scheduler.
The Cloud Function would contain the necessary code to export to Cloud Storage from the BigQuery table. There are multiple programming languages to choose from for that, such as Python, Node.JS, and Go.
Cloud Scheduler would send an HTTP call periodically in a cron format to the Cloud Function which would in turn, get triggered and run the export programmatically.
As an example and more specifically, you can follow these steps:
Create a Cloud Function using Python with an HTTP trigger. To interact with BigQuery from within the code you need to use the BigQuery client library. Import it with from google.cloud import bigquery. Then, you can use the following code in main.py to create an export job from BigQuery to Cloud Storage:
# Imports the BigQuery client library
from google.cloud import bigquery

def hello_world(request):
    # Replace these values according to your project
    project_name = "YOUR_PROJECT_ID"
    bucket_name = "YOUR_BUCKET"
    dataset_name = "YOUR_DATASET"
    table_name = "YOUR_TABLE"
    destination_uri = "gs://{}/{}".format(bucket_name, "bq_export.csv.gz")

    bq_client = bigquery.Client(project=project_name)

    dataset = bq_client.dataset(dataset_name, project=project_name)
    table_to_export = dataset.table(table_name)

    job_config = bigquery.job.ExtractJobConfig()
    job_config.compression = bigquery.Compression.GZIP

    extract_job = bq_client.extract_table(
        table_to_export,
        destination_uri,
        # Location must match that of the source table.
        location="US",
        job_config=job_config,
    )
    return "Job with ID {} started exporting data from {}.{} to {}".format(extract_job.job_id, dataset_name, table_name, destination_uri)
Specify the client library dependency in the requirements.txt file by adding this line:
google-cloud-bigquery
Create a Cloud Scheduler job. Set the Frequency with which you wish the job to be executed. For instance, setting it to 0 1 * * 0 would run the job once a week at 1 AM every Sunday morning. The crontab tool is pretty useful when it comes to experimenting with cron scheduling.
Choose HTTP as the Target, set the URL to the Cloud Function's URL (it can be found by selecting the Cloud Function and navigating to the Trigger tab), and choose GET as the HTTP method.
Once the job is created, you can test how the export behaves by pressing the RUN NOW button. However, before doing so, make sure the default App Engine service account has at least the Cloud IAM roles/storage.objectCreator role; otherwise the operation might fail with a permission error. The default App Engine service account has the form YOUR_PROJECT_ID@appspot.gserviceaccount.com.
If you wish to execute exports on different tables, datasets and buckets for each execution, but essentially employing the same Cloud Function, you can use the HTTP POST method instead, and configure a Body containing said parameters as data, which would be passed on to the Cloud Function, although that would imply making some small changes in its code.
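Those small changes could look roughly like the sketch below. It assumes the Scheduler job sends a JSON body such as {"bucket_name": "...", "dataset_name": "...", "table_name": "..."}; these key names and the helper resolve_export_params are made up for illustration. The request object passed to a Python HTTP Cloud Function is a Flask request, so get_json(silent=True) returns None when there is no JSON body.

```python
# Fallback values, used when the POST body omits a key (placeholders).
DEFAULTS = {
    "bucket_name": "YOUR_BUCKET",
    "dataset_name": "YOUR_DATASET",
    "table_name": "YOUR_TABLE",
}


def resolve_export_params(body):
    """Merge the parsed JSON body (possibly None) over the defaults."""
    if not isinstance(body, dict):
        body = {}
    return {key: body.get(key, default) for key, default in DEFAULTS.items()}


def export_table(request):
    """HTTP entry point: read the parameters, then build the destination URI."""
    params = resolve_export_params(request.get_json(silent=True))
    destination_uri = "gs://{}/{}".format(params["bucket_name"], "bq_export.csv.gz")
    # The extract_table call from the function above would go here, using
    # params["dataset_name"] and params["table_name"] instead of constants.
    return "Would export {}.{} to {}".format(
        params["dataset_name"], params["table_name"], destination_uri
    )
```

Keeping the defaults means a GET request with no body still behaves like the original function.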
Lastly, when the job is created, you can use the Cloud Function's returned job ID and the bq CLI to view the status of the export job with bq show -j <job_id>.
Not sure if this was in GA when this question was asked, but at least now there is an option to run an export to Cloud Storage via a regular SQL query. See the SQL tab in Exporting table data.
Example:
EXPORT DATA
  OPTIONS (
    uri = 'gs://bucket/folder/*.csv',
    format = 'CSV',
    overwrite = true,
    header = true,
    field_delimiter = ';')
AS (
  SELECT field1, field2
  FROM mydataset.table1
  ORDER BY field1
);
This could just as easily be set up via a Scheduled Query if you need a periodic export. And, of course, you need to make sure the user or service account running this has permissions to read the source datasets and tables and to write to the destination bucket.
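If you would rather trigger the same statement from code than from a Scheduled Query, a minimal sketch could look like this. The helper names are hypothetical, the option values mirror the example above, and the inputs are assumed to be trusted configuration because they are interpolated directly into the SQL text. The check for a single * wildcard reflects my understanding that EXPORT DATA expects a single-wildcard URI.

```python
def build_export_sql(uri: str, select_sql: str, field_delimiter: str = ";") -> str:
    """Compose an EXPORT DATA statement like the example above."""
    if uri.count("*") != 1:
        raise ValueError("the destination URI should contain exactly one '*' wildcard")
    return (
        "EXPORT DATA\n"
        "  OPTIONS (\n"
        f"    uri = '{uri}',\n"
        "    format = 'CSV',\n"
        "    overwrite = true,\n"
        "    header = true,\n"
        f"    field_delimiter = '{field_delimiter}')\n"
        f"AS (\n{select_sql}\n);"
    )


def run_export(uri: str, select_sql: str) -> str:
    """Run the statement as a normal query job (needs google-cloud-bigquery)."""
    from google.cloud import bigquery  # imported lazily; only needed when running

    client = bigquery.Client()
    job = client.query(build_export_sql(uri, select_sql))
    job.result()  # wait for the export to finish
    return job.job_id


print(build_export_sql(
    "gs://bucket/folder/*.csv",
    "  SELECT field1, field2\n  FROM mydataset.table1\n  ORDER BY field1",
))
```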
Hopefully this is useful for other peeps visiting this question if not for OP :)
There is an alternative to the second part of Maxim's answer. The code for extracting the table and storing it in Cloud Storage should work.
But when you schedule a query, you can also define a Pub/Sub topic to which the BigQuery scheduler posts a message when the job is over. The Cloud Scheduler setup described by Maxim is therefore optional, and you can simply plug the function into the Pub/Sub notification.
Before performing the extraction, don't forget to check the error status in the Pub/Sub notification. The notification also carries a lot of information about the scheduled query, which is useful if you want to perform more checks or generalize the function.
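A minimal sketch of that check, for a Pub/Sub-triggered function: it assumes the message data is a base64-encoded JSON description of the run with a state field and an errorStatus object. Those field names are an assumption on my part, so inspect a real notification before relying on them. The function names are hypothetical.

```python
import base64
import json


def transfer_run_succeeded(event: dict) -> bool:
    """Decode the Pub/Sub message and report whether the scheduled query succeeded."""
    payload = json.loads(base64.b64decode(event["data"]).decode("utf-8"))
    error = payload.get("errorStatus") or {}
    has_error = bool(error.get("code")) or bool(error.get("message"))
    return payload.get("state") == "SUCCEEDED" and not has_error


def on_scheduled_query_done(event, context):
    """Background function entry point: only export when the query run was clean."""
    if not transfer_run_succeeded(event):
        print("Scheduled query did not succeed; skipping the export.")
        return
    print("Scheduled query succeeded; start the extract job here.")
```

The extract code from Maxim's answer would replace the last print call.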
One more point, about the SFTP transfer: I open-sourced a project that queries BigQuery, builds a CSV file, and transfers the file to an FTP server (SFTP and FTPS aren't supported, because my previous company only used the FTP protocol!). If your file is smaller than 1.5 GB, I can update my project to add SFTP support if you want to use it. Let me know.
I need to automate a process that extracts data from Google BigQuery and exports it to a CSV file on an external server outside of GCP.
While researching how to do that, I found some commands to run from my external server. But I would prefer to do everything in GCP to avoid possible problems.
To export the table to a CSV file in Google Cloud Storage:
bq --location=US extract --compression GZIP 'dataset.table' gs://example-bucket/myfile.csv
To download the CSV from Google Cloud Storage:
gsutil cp gs://[BUCKET_NAME]/[OBJECT_NAME] [OBJECT_DESTINATION]
But I would like to hear your suggestions
If you want to fully automate this process, I would do the following:
Create a Cloud Function to handle the export:
This is the more lightweight solution, as Cloud Functions are serverless and give you the flexibility to implement code with the Client Libraries. See the quickstart; I recommend using the console to create your first functions.
In this example I recommend triggering the Cloud Function from an HTTP request, i.e. when the function URL is called, it will run the code inside it.
An example Cloud Function in Python that creates the export when an HTTP request is made:
main.py
from google.cloud import bigquery

def hello_world(request):
    project_name = "MY_PROJECT"
    bucket_name = "MY_BUCKET"
    dataset_name = "MY_DATASET"
    table_name = "MY_TABLE"
    destination_uri = "gs://{}/{}".format(bucket_name, "bq_export.csv.gz")

    bq_client = bigquery.Client(project=project_name)

    dataset = bq_client.dataset(dataset_name, project=project_name)
    table_to_export = dataset.table(table_name)

    job_config = bigquery.job.ExtractJobConfig()
    job_config.compression = bigquery.Compression.GZIP

    extract_job = bq_client.extract_table(
        table_to_export,
        destination_uri,
        # Location must match that of the source table.
        location="US",
        job_config=job_config,
    )
    return "Job with ID {} started exporting data from {}.{} to {}".format(extract_job.job_id, dataset_name, table_name, destination_uri)
requirements.txt
google-cloud-bigquery
Note that the job runs asynchronously in the background. The response you receive contains the job ID, which you can use to check the state of the export job in Cloud Shell by running:
bq show -j <job_id>
Create a Cloud Scheduler scheduled job:
Follow this documentation to get started. You can set the Frequency with the standard cron format, for example 0 0 * * * will run the job every day at midnight.
As a target, choose HTTP, in the URL put the Cloud Function HTTP URL (you can find it in the console, inside the Cloud Function details, under the Trigger tab), and as HTTP method choose GET.
Create it, and you can test it in the Cloud Scheduler by pressing the Run now button in the Console.
Synchronize your external server and the bucket:
Up to now you have only scheduled exports to run every 24 hours. To synchronize the bucket contents with your external server, you can use the gsutil rsync command. If you want to save the exports to, let's say, the my_exports folder, you can run the following on your external server:
gsutil rsync gs://BUCKET_WITH_EXPORTS /local-path-to/my_exports
To run this command periodically, you could create a standard cron job in the crontab on your external server. Run it daily as well, but a few hours later than the BigQuery export, to ensure that the export has finished.
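If you prefer to have cron call a small script rather than gsutil directly (for example, to add logging later), a minimal sketch could look like this. It only wraps the gsutil rsync command shown above; the function names, the script path, and the 03:00 schedule in the comment are placeholders.

```python
import shutil
import subprocess


def build_rsync_command(bucket: str, local_dir: str) -> list:
    """Return the gsutil rsync command as an argument list."""
    return ["gsutil", "rsync", "gs://{}".format(bucket), local_dir]


def sync_exports(bucket: str, local_dir: str) -> None:
    """Run gsutil rsync and raise if it is missing or exits with an error."""
    if shutil.which("gsutil") is None:
        raise RuntimeError("gsutil was not found on PATH")
    subprocess.run(build_rsync_command(bucket, local_dir), check=True)


# Hypothetical crontab entry, a few hours after a midnight export:
#   0 3 * * * /usr/bin/python3 /path/to/sync_exports.py
# with the script ending in:
#   sync_exports("BUCKET_WITH_EXPORTS", "/local-path-to/my_exports")
```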
Extra:
I have hard-coded most of the variables in the Cloud Function to be always the same. However, you can send parameters to the function, if you do a POST request instead of a GET request, and send the parameters as data in the body.
You will have to change the Cloud Scheduler job to send a POST request to the Cloud Function HTTP URL, and in the same place you can set the body to send the parameters regarding the table, dataset and bucket, for example. This will allow you to run exports from different tables at different hours, and to different buckets.
I'm rushing (never a good thing) to get Sync Framework up and running for an "offline support" deadline on my project. We have a SQL Express 2008 instance on our server and will then deploy SQLCE to the clients. Clients will only sync with the server, no peer-to-peer.
So far I have the following working:
Server schema setup
Scope created and tested
Server provisioned
Client provisioned w/ table creation
I've been very impressed with the relative simplicity of all of this. Then I realized the following:
Schema created through client provisioning to SQLCE does not setup default values for uniqueidentifier types.
FK constraints are not created on client
Here is the code that is being used to create the client schema (pulled from an example I found somewhere online)
static void Provision()
{
    SqlConnection serverConn = new SqlConnection(
        "Data Source=xxxxx, xxxx; Database=xxxxxx; " +
        "Integrated Security=False; Password=xxxxxx; User ID=xxxxx;");

    // create a connection to the SyncCompactDB database
    SqlCeConnection clientConn = new SqlCeConnection(
        @"Data Source='C:\SyncSQLServerAndSQLCompact\xxxxx.sdf'");

    // get the description of the scope from the SyncDB server database
    DbSyncScopeDescription scopeDesc = SqlSyncDescriptionBuilder.GetDescriptionForScope(
        ScopeNames.Main, serverConn);

    // create CE provisioning object based on the scope
    SqlCeSyncScopeProvisioning clientProvision = new SqlCeSyncScopeProvisioning(clientConn, scopeDesc);
    clientProvision.SetCreateTableDefault(DbSyncCreationOption.CreateOrUseExisting);

    // starts the provisioning process
    clientProvision.Apply();
}
When Sync Framework creates the schema on the client I need to make the additional changes listed earlier (default values, constraints, etc.).
This is where I'm getting confused (and frustrated):
I came across a code example that shows a SqlCeClientSyncProvider that has a CreatingSchema event. This code example actually shows setting the RowGuid property on a column which is EXACTLY what I need to do. However, what is a SqlCeClientSyncProvider?! This whole time (4 days now) I've been working with SqlCeSyncProvider in my sync code. So there is a SqlCeSyncProvider and a SqlCeClientSyncProvider?
The documentation on MSDN is not very good at explaining what either of these is.
I'm further confused about whether I should make schema changes at provision time or at sync time.
How would you all suggest that I make schema changes to the client CE schema during provisioning?
SqlCeSyncProvider and SqlCeClientSyncProvider are different.
The latter is what is commonly referred to as the offline provider and this is the provider used by the Local Database Cache project item in Visual Studio. This provider works with the DbServerSyncProvider and SyncAgent and is used in hub-spoke topologies.
The one you're using is referred to as a collaboration provider or peer-to-peer provider (which also works in a hub-spoke scenario). SqlCeSyncProvider works with SqlSyncProvider and SyncOrchestrator and has no corresponding Visual Studio tooling support.
Both providers require provisioning the participating databases.
The two types of providers provision the sync objects required to track and apply changes differently. The SchemaCreated event applies to the offline provider only. It is fired the first time a sync is initiated, when the framework detects that the client database has not been provisioned (provisioning creates the user tables and the corresponding Sync Framework objects).
The scope provisioning used by the other provider doesn't apply constraints other than the PK, so you will have to do a post-provisioning step to apply the defaults and constraints yourself, outside of the framework.
While researching solutions without using SyncAgent I found that the following would also work (in addition to my commented solution above):
Provision the client and let the framework create the client [user] schema. Now you have your tables.
Deprovision - this removes the restrictions on editing the tables/columns
Make your changes (in my case setting up Is RowGuid on PK columns and adding FK constraints) - this actually required me to drop and re-add a column, as you can't change the "Is RowGuid" property on an existing column
Provision again using DbSyncCreationOption.CreateOrUseExisting