Is there any way to monitor Azure Synapse Pipelines execution?

In my project, I need to show how a pipeline is progressing on a custom web portal built in PHP. Is there any way, in any language such as C# or Java, to list pipelines and monitor their progress, or even log it to Application Insights?

Are you labelling your queries with the OPTION (LABEL='MY LABEL') syntax?
Labels make it easy to monitor pipeline progress: query sys.dm_pdw_exec_requests to pick out the individual queries, and if you use a naming convention like 'pipeline_query' you can probably achieve what you want.
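As a minimal C# sketch of that polling approach (the connection string and the 'pipeline_' label prefix are placeholders, not from the original answer):

using Microsoft.Data.SqlClient;

// Placeholder connection string for the dedicated SQL pool.
const string connectionString =
    "Server=tcp:<your-server>.sql.azuresynapse.net;Database=<your-db>;Authentication=Active Directory Default;";

await using var connection = new SqlConnection(connectionString);
await connection.OpenAsync();

// Pick out the labelled queries; 'pipeline_' is a hypothetical naming convention.
await using var command = new SqlCommand(
    @"SELECT request_id, [label], [status], total_elapsed_time
      FROM sys.dm_pdw_exec_requests
      WHERE [label] LIKE 'pipeline_%'
      ORDER BY submit_time DESC;", connection);

await using var reader = await command.ExecuteReaderAsync();
while (await reader.ReadAsync())
{
    Console.WriteLine($"{reader["request_id"]}: {reader["label"]} -> {reader["status"]}");
}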

try
{
    // Poll the pipeline run until it leaves the InProgress/Queued states.
    PipelineRunClient pipelineRunClient = new(new Uri(_Settings.SynapseExtractEndpoint), new DefaultAzureCredential());
    PipelineRun run = await pipelineRunClient.GetPipelineRunAsync(runId);
    while (run.Status == "InProgress" || run.Status == "Queued")
    {
        _Logger.LogInformation($"!!Pipeline {run.PipelineName} {runId} Status: {run.Status}");
        await Task.Delay(30000);
        run = await pipelineRunClient.GetPipelineRunAsync(runId);
    }
    _Logger.LogInformation($"!!Pipeline {run.PipelineName} {runId} Status: {run.Status} Runtime: {run.DurationInMs} Message: {run.Message}");
}
catch (Exception ex)
{
    _Logger.LogError(ex, $"Failed to monitor pipeline run {runId}");
}
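The question also asks about listing pipelines; the same Azure.Analytics.Synapse.Artifacts package exposes a PipelineClient for that. A minimal sketch, assuming the same endpoint setting as above:

PipelineClient pipelineClient = new(new Uri(_Settings.SynapseExtractEndpoint), new DefaultAzureCredential());

// Enumerate every pipeline in the workspace.
await foreach (PipelineResource pipeline in pipelineClient.GetPipelinesByWorkspaceAsync())
{
    _Logger.LogInformation($"Pipeline: {pipeline.Name}");
}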

Related

Next.js CLI - is it possible to pre-build certain routes when running dev locally?

I'm part of an org with an enterprise app built on Next.js, and as it's grown the local dev experience has been degrading. The main issue is that our pages make several calls to /api routes on load, and those are built lazily when you run yarn dev, so you're always forced to sit and wait in the browser while that happens.
I've been thinking it might be better if we were able to actually pre-build all of the /api routes right away when yarn dev is run, so we'd get a better experience when the browser is opened. I've looked at the CLI docs but it seems the only options for dev are -p (port) and -H (host). I also don't think running yarn build first will work as I assume the build output is quite different between the build and dev commands.
Does anyone know if this is possible? Any help is appreciated, thank you!
I don't believe there's a way to prebuild them, but you can tell Next how long to keep them before discarding and rebuilding. Check out the onDemandEntries docs. We had a similar issue and solved it for a big project about a year ago with this in our next.config.js:
const { PHASE_DEVELOPMENT_SERVER } = require("next/constants")

module.exports = (phase, {}) => {
  let devOnDemandEntries = {}
  if (phase === PHASE_DEVELOPMENT_SERVER) {
    devOnDemandEntries = {
      // period (in ms) where the server will keep pages in the buffer
      maxInactiveAge: 300 * 1000,
      // number of pages that should be kept simultaneously without being disposed
      pagesBufferLength: 5,
    }
  }
  return {
    onDemandEntries: devOnDemandEntries,
    // ...the rest of your config
  }
}

Enabling Change Tracking for Entity

Is there a way to enable change tracking for a couple of entities programmatically? I could not find any API in Dataverse that can help do this.
You cannot do that with the Web API, but you can definitely achieve it programmatically.
You can either create a console application or run the Code Now plugin for XrmToolBox to get this done.
Below is a code snippet for reference.
UpdateEntityRequest updateBankAccountRequest = new UpdateEntityRequest
{
    Entity = BankAccountEntity, // the EntityMetadata of the entity to update
    ChangeTrackingEnabled = true // or false here
};
_serviceProxy.Execute(updateBankAccountRequest);
Microsoft docs for ChangeTrackingEnabled
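For completeness, a minimal console-application sketch, assuming the Microsoft.PowerPlatform.Dataverse.Client package; the connection string and entity name below are placeholders, not from the original answer:

using Microsoft.PowerPlatform.Dataverse.Client;
using Microsoft.Xrm.Sdk.Messages;
using Microsoft.Xrm.Sdk.Metadata;

// Placeholder connection string; substitute your environment URL and credentials.
var service = new ServiceClient("AuthType=OAuth;Url=https://<org>.crm.dynamics.com;...");

// Retrieve the entity metadata, flip the flag, and push it back.
var retrieveRequest = new RetrieveEntityRequest
{
    LogicalName = "new_bankaccount", // hypothetical entity name
    EntityFilters = EntityFilters.Entity
};
var retrieveResponse = (RetrieveEntityResponse)service.Execute(retrieveRequest);

var metadata = retrieveResponse.EntityMetadata;
metadata.ChangeTrackingEnabled = true;

service.Execute(new UpdateEntityRequest { Entity = metadata });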

Running a request in Postman multiple times with different data only runs once

I am new to Postman and running into a recurring issue that I can't figure out.
I am trying to run the same request multiple times using an array of data set up in the pre-request script; however, when I go to the Runner, the request only runs once rather than three times.
Pre-request script:
var uuids = pm.environment.get("uuids");
if (!uuids) {
    uuids = ["1eb253c6-8784", "d3fb3ab3-4c57", "d3fb3ab3-4c78"];
}
var currentuuid = uuids.shift();
pm.environment.set("uuid", currentuuid);
pm.environment.set("uuids", uuids);
Tests:
var uuids = pm.environment.get("uuids");
if (uuids && uuids.length > 0) {
    postman.setNextRequest("myurl/?userid={{uuid}}");
} else {
    postman.setNextRequest();
}
I have looked over the relevant documentation and I cannot find what is wrong with my code.
Thanks!
A pre-request script is not a good way to test an API with different data. Better to use the Postman Runner for this.
First, prepare a request in Postman with variable data, for example a URL parameter that references the {{uuids}} variable.
Then click the Runner tab.
Prepare a CSV file with the data:
uuids
1eb253c6-8784
d3fb3ab3-4c57
d3fb3ab3-4c78
Provide it as the data file, and run the sample.
This will allow you to run the same API multiple times with different data and check your test cases.
You are so close! The issue is that you are not un-setting your uuids environment variable, so it is an empty list at the start of every run after the first. Simply add pm.environment.unset("uuids") to your exit branch and it should run all three times. Also specify that the next request should stop the execution by setting it to null.
So your new "Tests" will become:
var uuids = pm.environment.get("uuids");
if (uuids && uuids.length > 0) {
    postman.setNextRequest("myurl/?userid={{uuid}}");
} else {
    postman.setNextRequest(null);
    pm.environment.unset("uuids");
}
It seems as though the Runner tab has been removed now?
For generating 'real' data, I found this video a great help: Creating A Runner in Postman-API Testing
Sending 1000 responses to the db to simulate real usage has saved a lot of time!

Should I have concern about datastoreRpcErrors?

When I run Dataflow jobs that write to Google Cloud Datastore, I sometimes see metrics showing one or two datastoreRpcErrors:
Since these Datastore writes usually contain a batch of keys, I am wondering whether, in the event of an RpcError, some retry will happen automatically. If not, what would be a good way to handle these cases?
tl;dr: By default datastoreRpcErrors will use 5 retries automatically.
I dug into the code of datastoreio in the Beam Python SDK. It looks like the final entity mutations are flushed in batches via DatastoreWriteFn().
# Flush the current batch of mutations to Cloud Datastore.
_, latency_ms = helper.write_mutations(
    self._datastore, self._project, self._mutations,
    self._throttler, self._update_rpc_stats,
    throttle_delay=_Mutate._WRITE_BATCH_TARGET_LATENCY_MS / 1000)
The RPCError is caught by this block of code in write_mutations in the helper, and there is a decorator @retry.with_exponential_backoff on the commit method. The default number of retries is set to 5, and retry_on_rpc_error defines the concrete RPCError and SocketError reasons that trigger a retry.
for mutation in mutations:
    commit_request.mutations.add().CopyFrom(mutation)

@retry.with_exponential_backoff(num_retries=5,
                                retry_filter=retry_on_rpc_error)
def commit(request):
    # Client-side throttling.
    while throttler.throttle_request(time.time() * 1000):
        ...
    try:
        response = datastore.commit(request)
        ...
    except (RPCError, SocketError):
        if rpc_stats_callback:
            rpc_stats_callback(errors=1)
        raise
    ...
I think you should first determine which kind of error occurred in order to see what your options are.
In the official Datastore documentation there is a list of all the possible errors and their error codes, and fortunately they come with recommended actions for each.
My advice is that you implement their recommendations and look for alternatives if they are not effective for you.

Azure Queue Storage Triggered-Webjob - Dequeue Multiple Messages at a time

I have an Azure Queue Storage-Triggered Webjob. The process my webjob performs is to index the data into Azure Search. Best practice for Azure Search is to index multiple items together instead of one at a time, for performance reasons (indexing can take some time to complete).
For this reason, I would like for my webjob to dequeue multiple messages together so I can loop through, process them, and then index them all together into Azure Search.
However, I can't figure out how to get my webjob to dequeue more than one message at a time. How can this be accomplished?
According to your description, I suggest you try Microsoft.Azure.WebJobs.Extensions.GroupQueueTrigger to achieve your requirement.
This extension enables you to trigger functions with a group of messages instead of a single message, as with [QueueTrigger].
For more details, refer to the code sample below and the linked article.
Install:
Install-Package Microsoft.Azure.WebJobs.Extensions.GroupQueueTrigger
Program.cs:
static void Main()
{
    var config = new JobHostConfiguration
    {
        StorageConnectionString = "...",
        DashboardConnectionString = "...."
    };
    config.UseGroupQueueTriggers();
    var host = new JobHost(config);
    host.RunAndBlock();
}
Function.cs:
// Receive 10 messages at one time
public static void MyFunction([GroupQueueTrigger("queue3", 10)] List<string> messages)
{
    foreach (var item in messages)
    {
        Console.WriteLine(item);
    }
}
How would I get that changed to a GroupQueueTrigger? Is it an easy change?
In my opinion, it is an easy change.
You could follow these steps:
1. Install the package Microsoft.Azure.WebJobs.Extensions.GroupQueueTrigger from the NuGet package manager.
2. Change the Program.cs file to enable UseGroupQueueTriggers.
3. Change the webjob functions according to your old triggered function.
Note: the group queue message trigger must bind to a List.
As my code sample shows:
public static void MyFunction([GroupQueueTrigger("queue3", 10)] List<string> messages)
This function will get 10 messages from "queue3" at a time, so in this function you can loop over the list of messages, process them, and then index them all together into Azure Search (see the sketch after these steps).
4. Publish your webjob to Azure Web Apps.
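A sketch of that batch-indexing step, assuming the newer Azure.Search.Documents SDK; the endpoint, index name, key, and MyDocument type are all placeholders, not part of the original answer:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;
using Azure;
using Azure.Search.Documents;

public class Functions
{
    // Placeholder endpoint, index name, and admin key.
    private static readonly SearchClient _searchClient = new(
        new Uri("https://<your-search-service>.search.windows.net"),
        "my-index",
        new AzureKeyCredential("<admin-key>"));

    // Hypothetical shape of the queued payload.
    public class MyDocument
    {
        public string Id { get; set; }
        public string Content { get; set; }
    }

    // Dequeue up to 10 messages and index them in a single batch.
    public static void MyFunction([GroupQueueTrigger("queue3", 10)] List<string> messages)
    {
        var documents = messages
            .Select(m => JsonSerializer.Deserialize<MyDocument>(m))
            .ToList();

        // One round trip to Azure Search for the whole group of messages.
        _searchClient.UploadDocuments(documents);
    }
}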