Azure continuous WebJob is running but is sometimes stopped/restarted unexpectedly - azure-webjobs

Our application uses a WebJob to generate data. At the moment we are facing a problem: the WebJob is sometimes stopped or restarted unexpectedly while it is processing the message queue. Because the WebJob doesn't know when it is about to be stopped or restarted, it cannot mark which data has already been processed before the stop/restart happens.
Is there a way to get a notification before the stop/restart so that we can synchronize the data?
Many thanks!

If you're using queues, a restarting webjob shouldn't cause you to have any data loss. Since the message will not be completed, it will be put back on the queue for (re)processing.
As far as the restarting goes: make sure you don't have any scenarios in your code that break the WebJob completely.
Add Application Insights and add an alert for the specific case you're looking for.
See Set Alerts in Application Insights

Sometimes WebJobs get killed by scale-in operations. You can give them a graceful shutdown by listening for the shutdown event with the class Microsoft.Azure.WebJobs.WebJobsShutdownWatcher in the NuGet package Microsoft.Azure.WebJobs.
As of version 1.1.2 of the NuGet package:
public sealed class WebJobsShutdownWatcher : IDisposable
{
// Begin watching for a shutdown notification from Antares.
public WebJobsShutdownWatcher();
// Get a CancellationToken that is signaled when the shutdown notification is detected.
public CancellationToken Token { get; }
// Stop watching for the shutdown notification
public void Dispose();
}
One way to use this: in your WebJob's Program.cs, get the cancellation token and register the code you want executed when shutdown happens.
private static void Main()
{
...
var cancellationToken = new WebJobsShutdownWatcher().Token;
...
cancellationToken.Register(() =>
{
//Your data operations here
});
...
}

Thanks, Diana, for the information. I tried this approach but it did not work very well: the WebJob waits only 5 seconds before restarting/stopping, although I set 60 seconds in the settings.job file. Here is my code:
static void Main()
{
var config = new JobHostConfiguration();
var host = new JobHost();
var cancellationToken = new WebJobsShutdownWatcher().Token;
cancellationToken.Register(() =>
{
//Raise the signal
});
// The following code ensures that the WebJob will be running continuously
host.RunAndBlock();
}

Related

Avoid webjob waiting when ingest small batch of data to azure data explorer

I have a webjob receives site click events from azure event hub, then ingest those events into ADX.
public static async Task Run([EventHubTrigger] EventData[] events, ILogger logger)
{
// Process events
try
{
var ingestResult = await _adxIngester.IngestAsync(events);
if (!ingestResult)
{
AppInsightLogError();
logger.LogError();
}
}
catch(Exception ex)
{
AppInsightLogError();
logger.LogError();
}
}
I've used queued ingestion and turned off FlushImmediately when ingesting to ADX, which enables batch ingestion. When the events do not meet the default ingestion batching policy of 1000 events / 1 GB data size, ADX waits 5 minutes before it returns a Success status, which makes Run wait for the same amount of time.
public async Task<bool> IngestAsync(...)
{
IKustoQueuedIngestClient client = KustoIngestFactory.CreateQueuedIngestClient(kustoConnectionString);
var kustoIngestionProperties = new KustoQueuedIngestionProperties(databaseName: "myDB", tableName: "events")
{
ReportLevel = IngestionReportLevel.FailuresOnly,
ReportMethod = IngestionReportMethod.Table,
FlushImmediately = false
};
var streamIdentifier = Guid.NewGuid();
var clientResult = await client.IngestFromStreamAsync(...);
var ingestionStatus = clientResult.GetIngestionStatusBySourceId(streamIdentifier);
while (ingestionStatus.Status == Status.Pending)
{
await Task.Delay(TimeSpan.FromSeconds(15));
ingestionStatus = clientResult.GetIngestionStatusBySourceId(streamIdentifier);
}
if (ingestionStatus.Status == Status.Failed)
{
return false;
}
return true;
}
Since I don't want my WebJob to wait that long when few events are coming in, or when only QA is at work, I made the following changes:
Don't await IngestAsync, which makes Run a synchronous method.
Add a parameter Action onError to IngestAsync and call it when the ingest task fails. Call AppInsightLogError() and logger.LogError() inside onError instead of returning false.
Replace IngestFromStreamAsync with IngestFromStream.
Basically, I want to ensure that the events reach the Azure queue, and that any exception is thrown, before I poll for the ingestion status. Run can then exit without waiting for the status polling; if anything fails, it will be logged.
My questions are:
Is it good practice to avoid having a WebJob wait for minutes? If not, why?
If yes, is my solution good enough for this problem? Otherwise, how should I do this?
If you are ingesting small batches of data and wish to cut down on the ingestion batching times, please read the following article: https://learn.microsoft.com/en-us/azure/kusto/concepts/batchingpolicy
Ingestion Batching policy allows you to control the batching limits per database or table.
The ingestion is performed in a few phases. One phase is done on the client side, and one phase is done on the server side:
The ingest client code you’re using is going to take your stream and upload it to a blob, and then it will send a message to a queue.
Any exceptions thrown during that phase will indeed be propagated to your code, which is why you should also use a try-catch block, where in the catch block you can log the error message as you suggested.
You can either use IngestFromStreamAsync with the await keyword, or use IngestFromStream. The first option is better if you’d like to release the worker thread and save resources. But choosing between those two doesn’t have anything to do with the polling. The polling is relevant to the second phase.
Kusto’s DataManagement component is constantly listening to messages in the queue, so as soon as it gets to your new message, it will read it and see some metadata information about the new ingestion request, such as the blob URI where the data is stored and such as the Azure table where failures/progress should be updated.
That phase is done remotely by the server side, and you have an option to wait in your client code for each single ingestion and poll until the server completes the ingestion process. If there are any exceptions during that phase, then of course they won’t be propagated to your client code, but rather you’ll be able to examine the Azure table and see what happened.
You can also decide to defer that status examination, and have it done in some other task.
IngestFromStreamAsync uploads your data to a blob and posts a message to the Data Management input queue. It will not wait for the aggregation time, and the final state you will get is Queued.
FlushImmediately defaults to false.
If there isn't any additional processing, consider using the Event Hub to Kusto connection
[Edited] responding to comments:
The Queued state indicates that the blob is pending ingestion. You can track the status with the .show ingestion failures command, metrics and ingestion logs.
The Event Hub connection goes through queued ingestion by default. It will use streaming ingestion only if that is set as a policy on the database / table.
Some of the processing can be done on ADX, using ingestion mapping and update policy.

How to apply Singleton Attribute for NonTriggered method in Azure Webjobs

If I apply [Singleton] and [NoAutomaticTrigger] attributes and publish the webjob, it goes to pending restart state.
We want to solve multiple instance issue which is occurring in a method.
Please help.
it goes to pending restart state.
In your case, you need to check why the WebJob goes to the pending restart state.
There are many possible reasons for the pending restart state: it may be due to an issue in the code, or the WebJob's process has finished and needs to be restarted. You can check this in the WebJob log.
Before publishing to Azure, make sure that it works correctly locally, and add the app settings AzureWebJobsDashboard and AzureWebJobsStorage with a storage connection string; you can then get the WebJob log from the WebJobs dashboard.
If you publish it as a continuous WebJob and the method runs to completion, the status will change to pending restart. This is normal behavior.
The [Singleton] and [NoAutomaticTrigger] attributes do work together correctly; please refer to the following demo code.
static void Main()
{
JobHost host = new JobHost();
host.Call(typeof(Functions).GetMethod("CreateQueueMessage"), new { value = "Hello world!" + Guid.NewGuid() });
}
[Singleton]
[NoAutomaticTrigger]
public static void CreateQueueMessage(TextWriter logger,string value,[Queue("outputqueue")] out string message)
{
message = value;
// {0} is required; without a placeholder the message argument is never written.
logger.WriteLine("Creating queue message: {0}", message);
Console.WriteLine(message);
}

Azure webjob with runMode "OnDemand" keeps running

The Azure WebJob with runMode set to "OnDemand" keeps running and I am not able to stop it.
I don't see anything that needs to be handled other than the job itself.
{
"$schema": "http://schemastore.org/schemas/json/webjob-publish-settings.json",
"webJobName": "ScheduledJob",
"runMode": "OnDemand"
}
ScheduledJob Triggered Running n/a
The only way to restart it is to restart the web service and then start the job manually. After that it keeps running again; it does not stop.
What is going on with this webjob?
Update1:
I am using the code from the PnP Partner Pack, which can be found here.
As the code is too long, I am only providing the code in the Program.cs file.
For the rest, please have a look at the link I posted above.
static void Main()
{
var job = new PnPPartnerPackProvisioningJob();
job.UseThreading = false;
job.AddSite(PnPPartnerPackSettings.InfrastructureSiteUrl);
job.UseAzureADAppOnlyAuthentication(
PnPPartnerPackSettings.ClientId,
PnPPartnerPackSettings.Tenant,
PnPPartnerPackSettings.AppOnlyCertificate);
job.Run();
#if DEBUG
Console.ReadLine();
#endif
}
In your code, the PnPPartnerPackProvisioningJob class inherits from the TimerJob class.
The TimerJob class has no stop method, and once a timer job has started executing, you cannot really stop it unless you restart the WebJob. For more details, you could refer to this article.
So if your requirement is to cancel a job, you will need to delete the timer job definition. However, if the timer job has already started executing, you cannot really stop it unless you reset IIS or stop the SharePoint Windows Timer Service.

Akka daily scheduled tasks

I am preparing to rewrite my Play1 application with Play2 and I need to implement scheduled tasks that run exactly once a day at some specific time.
In my old app I implemented it as follows:
the task is scheduled using Play1 jobs and the app runs on multiple nodes
at the specified time all healthy nodes start the task and I use lock record in the database to ensure only one of them proceeds with execution and all others exit without doing anything.
How do I implement similar functionality with Akka?
You can just use the Scheduler to either execute a runnable or to send a message to an actor:
system.scheduler().scheduleOnce(Duration.create(24, TimeUnit.HOURS),
taskActor, "doTask", system.dispatcher(), null);
Or
system.scheduler().scheduleOnce(Duration.create(24, TimeUnit.HOURS),
new Runnable() {
@Override
public void run() {
doTask();
}
}, system.dispatcher());
I would prefer the method including an actor though.
You can read up on how to create an actor to receive the doTask message here.
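Note that scheduleOnce takes a delay rather than a wall-clock time, so running at a specific time of day means first working out how long to wait until the next occurrence of that time. The following is only a minimal sketch of that calculation using java.time; the class name DailySchedule and the method delayUntilNext are hypothetical helpers, not part of Akka, and the use of LocalDateTime assumes that time zones and daylight-saving shifts can be ignored.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.LocalTime;

// Hypothetical helper (not part of Akka): computes how long to wait
// until the next occurrence of a given time of day.
final class DailySchedule {
    private DailySchedule() {}

    // Returns the delay from 'now' until the next 'target' time of day.
    // If 'target' has already passed today (or is exactly now), the
    // next occurrence is tomorrow.
    static Duration delayUntilNext(LocalDateTime now, LocalTime target) {
        LocalDateTime next = now.toLocalDate().atTime(target);
        if (!next.isAfter(now)) {
            next = next.plusDays(1);
        }
        return Duration.between(now, next);
    }

    public static void main(String[] args) {
        // Assumption for illustration: the task should run daily at 02:00.
        Duration delay = delayUntilNext(LocalDateTime.now(), LocalTime.of(2, 0));
        System.out.println("First run in " + delay.toMinutes() + " minutes");
    }
}
```

The resulting duration could then be converted and passed as the delay to scheduleOnce, with the task scheduling its next run after each execution. The database lock described in the question would still be needed so that only one node actually does the work.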

Laravel 4.2 AWS SQS queue setup using EB worker environment

I'm trying to set up Laravel 4.2 queue using AWS SQS and an EB Worker environment. I'm pushing the job into the queue from another server and I want the worker environment to execute it. But so far it looks like the worker tries to execute it, but for some reason gets a 405 error in the access log...
I'm trying to get some very simple test code working... On the worker environment I have a pretty much clean Laravel installation, with just the queue config and this class:
class TestQueue {
public function fire($job, $data)
{
File::append(storage_path().'/sqs_push.txt', $data['date']);
$job->delete();
}
}
Now on the main server, from where I want to push, I have this:
public function getTestQueue(){
$data = ['date' => date('Y-m-d H:i:s')];
$queue = \Queue::push('TestQueue', $data);
var_dump($queue);
}
On the worker I have launched the
php artisan queue:listen
When I run that method, it adds it to the SQS queue (I can see it in the SQS console) and the worker tries to execute it, but all I see is some 405 errors in the access logs...
Maybe I'm doing something wrong in my queue setup? Can anyone help me please?
HTTP error 405 means "Method Not Allowed": the HTTP method used in the request is not allowed for the requested resource. Since you have mentioned that the main server successfully sends the messages to SQS (you have verified it via the console), I will provide a solution that implements a worker loop. This was taken from this repository on GitHub. Have a look at the worker.php file.
$queue = new Queue(QUEUE_NAME, unserialize(AWS_CREDENTIALS));
// Continuously poll queue for new messages and process them.
while (true) {
$message = $queue->receive();
if ($message) {
try {
$message->process();
$queue->delete($message);
} catch (Exception $e) {
$queue->release($message);
echo $e->getMessage();
}
} else {
// Wait 20 seconds if no jobs in queue to minimise requests to AWS API
sleep(20);
}
}