I have a very simple C# console exe. My code deletes a blob from a particular blob storage account. It takes a couple of command-line arguments, the container name and the blob name, and deletes the blob whenever it is triggered.
Now, I want to schedule this exe as a webjob.
I have a couple of questions -
How can I manually trigger this WebJob, given that it takes command-line arguments?
Is there any way I can trigger this WebJob via a SQL Server stored procedure?
You can have the stored procedure send HTTP requests to fulfill both needs.
Steps
Use the Kudu WebJobs API to invoke your WebJob; the command-line arguments can be passed on the request (sketched below).
Create an HTTP-request method in SQL Server that calls that endpoint (one option is sketched after the related post).
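The Kudu call looks roughly like this. A minimal C# sketch, where the site name ("mysite"), the WebJob name ("DeleteBlobJob"), and the credentials are placeholders; the arguments query string ends up as the exe's command line:

using System;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;
using System.Threading.Tasks;

class TriggerWebJob
{
    static async Task Main()
    {
        // Publishing (deployment) credentials of the Web App hosting the WebJob.
        var user = "$mysite";
        var password = "<publish-profile-password>";
        var arguments = Uri.EscapeDataString("mycontainer myblob.txt");

        using var client = new HttpClient();
        client.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue(
            "Basic",
            Convert.ToBase64String(Encoding.ASCII.GetBytes($"{user}:{password}")));

        // Kudu endpoint for triggered WebJobs; the arguments are handed to the exe.
        var url = "https://mysite.scm.azurewebsites.net/api/triggeredwebjobs/DeleteBlobJob/run"
                  + "?arguments=" + arguments;

        var response = await client.PostAsync(url, null);
        Console.WriteLine(response.StatusCode); // 202 Accepted when the run is queued
    }
}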
Related post
How can I make HTTP request from SQL server?
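On the SQL Server side, one hedged option is a SQL CLR stored procedure written in C# that issues the same POST; the procedure and parameter names below are made up, and the assembly needs EXTERNAL_ACCESS (or UNSAFE) permission to make outbound HTTP calls:

using System.Data.SqlTypes;
using System.Net;
using Microsoft.SqlServer.Server;

public class WebJobTrigger
{
    // EXEC dbo.TriggerDeleteBlob @url = '...', @authorizationHeader = 'Basic ...';
    [SqlProcedure]
    public static void TriggerDeleteBlob(SqlString url, SqlString authorizationHeader)
    {
        var request = (HttpWebRequest)WebRequest.Create(url.Value);
        request.Method = "POST";
        request.ContentLength = 0;
        request.Headers["Authorization"] = authorizationHeader.Value;

        using (var response = (HttpWebResponse)request.GetResponse())
        {
            SqlContext.Pipe.Send("WebJob trigger returned: " + response.StatusCode);
        }
    }
}

Your stored procedure can then build the container and blob names into the arguments query string shown above.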
This should be a very easy question but I can't wrap my head around what to use. I would like to create a data pipeline that fetches data from an outside/external API (for example, the Spotify API), performs some rather simple data cleaning on it, and then either writes a JSON file to Cloud Storage or loads the data into BigQuery.
As far as I understand, I could use Composer to do it, using DAGs etc., but what I need here is something simpler/more lightweight (mainly UI based) that doesn't cost as much as Composer and is easier to use. What I am looking for is something like Data Factory in Azure.
So, in brief:
Log in to a data source using a username/password
Extract data in a well-known format (CSV/JSON)
Transform the data, e.g. remove columns or apply simple filtering such as date filtering
Reformat the data into another format (JSON/CSV/BigQuery)
...without having to code everything from scratch.
Can I handle all of this with one GCP application or do I need to use combinations like Cloud Scheduler, Cloud Functions etc?
As always, you have several options...
Cloud Scheduler seems to be a requirement, to trigger the process regularly (as often as every minute).
Then, you have 2 options:
Code the process: API Call, transform/clean the data, sink the data into the destination
Use Cloud Workflows: you can define the API calls that you want to make:
Call the API
Store the raw data in BigQuery (also an API call; there are connectors to simplify the process)
Run a query in BigQuery to clean/format your data and store it in a final table (also an API call)
You can also mix the two: Cloud Functions to get the data, and a query in BigQuery to clean/format it.
Doing something that specific without starting from scratch is... difficult...
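For reference, the fetch-clean-load core of the Cloud Functions option could look roughly like this. A minimal C# sketch, assuming the Google.Cloud.BigQuery.V2 client library and an API that returns a JSON array; the API URL, project, dataset, table, and field names are all placeholders:

using System.Collections.Generic;
using System.Net.Http;
using System.Text.Json;
using System.Threading.Tasks;
using Google.Cloud.BigQuery.V2;

class ApiToBigQuery
{
    static async Task Main()
    {
        // 1. Call the external API (assumed to return a JSON array).
        using var http = new HttpClient();
        var json = await http.GetStringAsync("https://api.example.com/recently-played");

        // 2. Light cleaning: keep only the columns of interest.
        using var doc = JsonDocument.Parse(json);
        var rows = new List<BigQueryInsertRow>();
        foreach (var item in doc.RootElement.EnumerateArray())
        {
            rows.Add(new BigQueryInsertRow
            {
                { "track_name", item.GetProperty("name").GetString() },
                { "played_at", item.GetProperty("played_at").GetString() }
            });
        }

        // 3. Sink the result into BigQuery; a final SQL query can reformat it further.
        var client = BigQueryClient.Create("my-gcp-project");
        client.InsertRows("raw_dataset", "spotify_tracks", rows);
    }
}

Cloud Scheduler in front of it (or a Workflows step calling it) takes care of the regular trigger.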
EDIT 1
If you have a look at the documentation, you can see this sample:
- getCurrentTime:
    call: http.get
    args:
      url: https://us-central1-workflowsample.cloudfunctions.net/datetime
    result: currentTime
- readWikipedia:
    call: http.get
    args:
      url: https://en.wikipedia.org/w/api.php
      query:
        action: opensearch
        search: ${currentTime.body.dayOfTheWeek}
    result: wikiResult
- returnResult:
    return: ${wikiResult.body[1]}
The first step, getCurrentTime, performs an external call and stores the result in result: currentTime.
In the next step, you can reuse the result currentTime and extract only the value that you want for another API call.
And you can chain steps like that.
If you need authentication, you can call Secret Manager to get the secret values and then use the result of that call in subsequent steps.
For easier connections to Google APIs, you can use connectors.
I've cloned a repo from here and am trying to explore AWS AppSync's subscriptions. My understanding is that if there are real-time updates to server data, the client should expect to see some sort of notification or update, so what I did was:
running the app on a simulator
opening the DynamoDB console and adding the records manually
I was expecting some notification to be received in my app, but there isn't one, and I only see the updated records if I refresh the app. Am I understanding subscriptions wrongly?
Subscriptions are not triggered by your DynamoDB table, but by your mutations (defined in your GraphQL schema). Try adding records via the mutation your subscription listens on. You can run a mutation from the AppSync console under "Queries".
If your client is set up correctly, it should update accordingly.
Hope this helps :)
Subscriptions can only be triggered by mutations. When you add a record directly to your DB, the mutation is not called, hence no subscription is triggered. That does not really serve the purpose for external DB updates, but there is a workaround available.
Scenario 1: If you are making the change to the DB directly via some DB client, you need to call the mutation endpoint explicitly (from the AWS console, Postman, etc.). This will trigger the subscription. I am guessing the direct DB change is done for testing.
Scenario 2: The direct DB change is done by some other external process and not via an AppSync mutation. You need to call a mutation mapped to a None data source from your processor. This dummy mutation will trigger the subscription.
Here's a link explaining how to create a None data source mapped mutation: https://aws.amazon.com/premiumsupport/knowledge-center/appsync-notify-subscribers-real-time/
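For scenario 2, the external processor only needs to POST that dummy mutation to the AppSync GraphQL endpoint over HTTP. A minimal C# sketch, assuming API-key auth and a None-data-source mutation named publishItemUpdate; the endpoint, key, and field names are placeholders for whatever your schema defines:

using System;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

class NotifySubscribers
{
    static async Task Main()
    {
        var endpoint = "https://<your-api-id>.appsync-api.us-east-1.amazonaws.com/graphql";

        var payload = new
        {
            query = "mutation Publish($id: ID!, $name: String!) { publishItemUpdate(id: $id, name: $name) { id name } }",
            variables = new { id = "123", name = "updated directly in DynamoDB" }
        };

        using var client = new HttpClient();
        var request = new HttpRequestMessage(HttpMethod.Post, endpoint)
        {
            Content = new StringContent(JsonSerializer.Serialize(payload), Encoding.UTF8, "application/json")
        };
        request.Headers.Add("x-api-key", "<your-api-key>");

        // AppSync resolves the mutation against the None data source and pushes the
        // result to every client subscribed to it.
        var response = await client.SendAsync(request);
        Console.WriteLine(await response.Content.ReadAsStringAsync());
    }
}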
I am trying to invoke a campaign via a SOAP request, which is supposed to insert data into the data warehouse, but it's not happening. I am using DS2 code for it.
When I run the same campaign in test mode in SAS CI, it works.
I want the log that gets generated via the SOAP request. Can anyone let me know the path for it?
Thanks in advance.
You need to check the log files on your application & mid-tier servers. Search the log files for the time you ran your request.
Start by checking the ObjectSpawner log to see if any connections were made to the SAS environment at that time, e.g. C:\SASHome\Lev1\ObjectSpawner\Logs
Having configured an Azure SQL Database, I would like to feed some tables with data from an HTTP REST GET call.
I have tried Microsoft Flow (whose HTTP Request action is utterly botched) and I am now exploring Azure Data Factory, to no avail.
The only way I can currently think of is provisioning an Azure VM and installing Postman with Newman. But then I would still need to create a web service interface to the Azure SQL Database.
Does Microsoft offer no HTTP call service to hook up to an Azure SQL Database?
I had the same situation a couple of weeks ago and ended up building the API call handling using Azure Functions. It's no problem to use the Azure SDKs to upload the result to e.g. Blob storage or Data Lake, and you can add whatever assembly you need to perform the HTTP operation.
From there you can easily pull it with Data Factory into an Azure SQL DB.
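For example, a timer-triggered function along these lines. A sketch assuming the in-process Functions model and the Azure.Storage.Blobs SDK; the API URL, schedule, connection-string setting, and container name are placeholders:

using System;
using System.IO;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;
using Azure.Storage.Blobs;
using Microsoft.Azure.WebJobs;
using Microsoft.Extensions.Logging;

public static class PullApiToBlob
{
    private static readonly HttpClient Http = new HttpClient();

    [FunctionName("PullApiToBlob")]
    public static async Task Run(
        [TimerTrigger("0 0 * * * *")] TimerInfo timer,   // every hour
        ILogger log)
    {
        // 1. Call the REST API.
        var json = await Http.GetStringAsync("https://api.example.com/data");

        // 2. Land the raw result in Blob storage; Data Factory pulls it into SQL DB from there.
        var container = new BlobContainerClient(
            Environment.GetEnvironmentVariable("StorageConnection"), "raw-api-data");
        await container.CreateIfNotExistsAsync();

        var blob = container.GetBlobClient($"data-{DateTime.UtcNow:yyyyMMddHHmm}.json");
        using var stream = new MemoryStream(Encoding.UTF8.GetBytes(json));
        await blob.UploadAsync(stream, overwrite: true);

        log.LogInformation("Uploaded {BlobName}", blob.Name);
    }
}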
I would suggest you write yourself an Azure Data Factory custom activity to achieve this. I've done this for a recent project.
Add a C# class library to your ADF solution and create a class that inherits from IDotNetActivity. Then, in the Execute method (which returns an IDictionary), make the HTTP web request to get the data. Land the downloaded file in blob storage first, then have a downstream activity to load the data into SQL DB.
public class GetLogEntries : IDotNetActivity
{
    public IDictionary<string, string> Execute(
        IEnumerable<LinkedService> linkedServices,
        IEnumerable<Dataset> datasets,
        Activity activity,
        IActivityLogger logger)
    {
        // etc...
        HttpWebResponse myHttpWebResponse = (HttpWebResponse)httpWebRequest.GetResponse();
You can use the ADF linked services to authenticate against the storage account and define which container and file name you want as the output, etc.
This is an example I used for data lake. But there is an almost identical class for blob storage.
Dataset outputDataset = datasets.Single(dataset => dataset.Name == activity.Outputs.Single().Name);

AzureDataLakeStoreLinkedService outputLinkedService;
outputLinkedService = linkedServices.First(
    linkedService =>
        linkedService.Name ==
        outputDataset.Properties.LinkedServiceName).Properties.TypeProperties
    as AzureDataLakeStoreLinkedService;
Don't bother with an input for the activity.
You will need an Azure Batch Service as well to handle the compute for the compiled classes. Check out my blog post on doing this.
https://www.purplefrogsystems.com/paul/2016/11/creating-azure-data-factory-custom-activities/
Hope this helps.
I know there are APIs to configure notifications when a job fails or finishes.
But what if, say, I run a Hive query that counts the number of rows in a table, and I want to send out emails to the concerned parties if the returned result is zero? How can I do that?
Thanks.
You may want to look at Airflow and Qubole's operator for Airflow. We use Airflow to orchestrate all jobs run on Qubole and, in some cases, non-Qubole environments. We use the DataDog API to report success/failure of each task (Qubole or non-Qubole). DataDog in this case can be replaced by Airflow's email operator. Airflow also has chat operators (e.g. for Slack).
There is no direct API for triggering a notification based on the results of a query.
However, there is a way to do this using Qubole:
- Create a workflow in Qubole with the following steps:
1. Your query (any query) that writes its output to a particular location on S3.
2. A shell script - this script reads the result from S3 and fails the job based on any criteria; for instance, in your case, fail the job if the result returns 0 rows (a sketch of this check is shown below).
- Schedule this workflow using the "Scheduler" API to notify on failure.
You can also use the "Sendmail" shell command to send mail based on the result in step 2 above.
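In practice the step-2 check is usually a few lines of shell, but for illustration, here is the same logic as a small C# program using the AWS SDK; the bucket and key names are made up, and the query is assumed to write a plain row count to that key:

using System;
using System.IO;
using System.Threading.Tasks;
using Amazon.S3;

class CheckQueryResult
{
    static async Task Main()
    {
        // Read the row count that the query in step 1 wrote to S3.
        var s3 = new AmazonS3Client();
        using var response = await s3.GetObjectAsync("my-qubole-results", "counts/latest.csv");
        using var reader = new StreamReader(response.ResponseStream);
        var count = long.Parse((await reader.ReadToEndAsync()).Trim());

        if (count == 0)
        {
            Console.Error.WriteLine("Row count is 0 - failing the job.");
            Environment.Exit(1);   // a non-zero exit fails the node, which fires the notification
        }
    }
}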