If I apply the [Singleton] and [NoAutomaticTrigger] attributes and publish the WebJob, it goes into the Pending Restart state.
We want to solve a multiple-instance issue that occurs in one of our methods.
Please help.
"it goes to pending restart state."
In your case, you need to check why the WebJob goes into the Pending Restart state.
There are many reasons a WebJob ends up in Pending Restart; it may be due to an error, or the WebJob thread has finished and needs to restart. We can check this with the WebJob log.
Before publishing it to Azure, make sure it works correctly locally, and add the app settings AzureWebJobsDashboard and AzureWebJobsStorage with a storage connection string; then we can get the WebJob log from the WebJob dashboard.
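If you prefer to set those connection strings in code rather than in the app settings, a minimal sketch (assuming the WebJobs SDK 2.x JobHostConfiguration; the connection string value is a placeholder):
// Sketch only: supply the same storage connection string used for the
// AzureWebJobsDashboard / AzureWebJobsStorage app settings.
var config = new JobHostConfiguration
{
    DashboardConnectionString = "<storage-connection-string>",
    StorageConnectionString = "<storage-connection-string>"
};
var host = new JobHost(config);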
If you publish it as a continuous WebJob and the method runs to completion, the status will change to Pending Restart. This is normal behavior.
The [Singleton] and [NoAutomaticTrigger] attributes work correctly; please refer to the following demo code.
static void Main()
{
    JobHost host = new JobHost();

    // Invoke the manually triggered function once.
    host.Call(typeof(Functions).GetMethod("CreateQueueMessage"),
        new { value = "Hello world!" + Guid.NewGuid() });
}

[Singleton]
[NoAutomaticTrigger]
public static void CreateQueueMessage(TextWriter logger, string value, [Queue("outputqueue")] out string message)
{
    message = value;
    logger.WriteLine("Creating queue message: {0}", message);
    Console.WriteLine(message);
}
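Note that Main returns as soon as host.Call completes, so if this demo is published as a continuous WebJob, the portal will show Pending Restart once the call finishes. A minimal sketch of keeping the host alive instead (added here for illustration, not part of the original demo):
static void Main()
{
    JobHost host = new JobHost();
    host.Call(typeof(Functions).GetMethod("CreateQueueMessage"),
        new { value = "Hello world!" + Guid.NewGuid() });

    // Keep the process running so the continuous WebJob is not marked Pending Restart.
    host.RunAndBlock();
}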
I am working on multiple Firebase Cloud Functions (all hosted in the same region) that connect to a GCP-hosted Redis instance in the same region, using a VPC connector. I am using version 3.0.2 of the Node.js library for Redis. In the cloud functions' debug logs, I am seeing frequent connection reset logs, triggered for each cloud function with no fixed pattern in when the resets happen, and each time the error captured in the error event handler is ECONNRESET. While creating the Redis client, I have provided a retry_strategy to reconnect after 5 ms with a maximum of 10 such attempts, along with retry_unfulfilled_commands set to true, expecting that any command unfulfilled at the time of a connection reset will be retried automatically (refer to the code below).
const redisLib = require('redis');

const client = redisLib.createClient(REDIS_PORT, REDIS_HOST, {
  enable_offline_queue: true,
  retry_unfulfilled_commands: true,
  retry_strategy: function(options) {
    if (options.error && options.error.code === "ECONNREFUSED") {
      // End reconnecting on a specific error and flush all commands with
      // an individual error
      return new Error("The server refused the connection");
    }
    if (options.attempt > REDIS_CONNECTION_RETRY_ATTEMPTS) {
      // End reconnecting with built in error
      console.log('Connection retry count exceeded 10');
      return undefined;
    }
    // reconnect after 5 ms
    console.log('Retrying connection after 5 ms');
    return 5;
  },
});

client.on('connect', () => {
  console.log('Redis instance connected');
});

client.on('error', (err) => {
  console.error(`Error connecting to Redis instance - ${err}`);
});
exports.getUserDataForId = (userId) => {
  console.log('getUserDataForId invoked');
  return new Promise((resolve, reject) => {
    if (!client.connected) {
      console.log('Redis instance not yet connected');
    }
    client.get(userId, (err, reply) => {
      if (err) {
        console.error(JSON.stringify(err));
        reject(err);
      } else {
        resolve(reply);
      }
    });
  });
};
// more such exports for different operations
Following are the questions / issues I am facing:
1. Why is the connection getting reset intermittently?
2. I have seen in the logs that, even while the cloud function is executing, the connection to the Redis server is lost, resulting in failure of the command.
3. With retry_unfulfilled_commands set to true, I hoped it would handle the scenario mentioned in point 2 above, but as per the debug logs, the cloud function times out in that scenario. This is what I observed in the logs in that case:
getUserDataForId invoked
Retrying connection after 5 ms
Redis instance connected
Function execution took 60002 ms, finished with status: 'timeout' --> coming from wrapper cloud function
Should I, instead of having a Redis connection instance at global level, create a connection during each such Redis operation? That might have performance issues, as well as issues around the number of concurrent Redis connections (since I have multiple cloud functions, and all of them will be creating Redis connections for each simultaneous invocation), right?
So, how do I best handle this? I am facing all these issues during development itself, so I am not really sure whether it's a code-related issue or an infrastructure configuration issue.
This behavior could be caused by background activities.
"Background activity is anything that happens after your function has terminated."
When the background activity interferes with subsequent invocations in Cloud Functions, unexpected behavior and errors that are hard to diagnose may occur. Accessing the network after a function terminates usually leads to "ECONNRESET" errors.
To troubleshoot this, make sure that there is no background activity by searching the logs for entries after the line saying that the invocation finished. Background activity can sometimes be buried deeper in the code, especially when asynchronous operations such as callbacks or timers are present. Review your code to make sure all asynchronous operations finish before you terminate the function.
Source
I have a WebJob that receives site click events from Azure Event Hubs and then ingests those events into ADX (Azure Data Explorer).
public static async Task Run([EventHubTrigger] EventData[] events, ILogger logger)
{
    // Process events
    try
    {
        var ingestResult = await _adxIngester.IngestAsync(events);
        if (!ingestResult)
        {
            AppInsightLogError();
            logger.LogError("Ingestion returned a failure status");
        }
    }
    catch (Exception ex)
    {
        AppInsightLogError();
        logger.LogError(ex, "Ingestion threw an exception");
    }
}
I've used queued ingestion and turned off FlushImmediately when ingesting into ADX, which enables batch ingestion. When the events do not meet the default IngestionBatching policy of 1000 events / 1 GB of data, ADX waits 5 minutes before it returns a Success status, which makes Run also wait for that amount of time.
public async Task<bool> IngestAsync(...)
{
    IKustoQueuedIngestClient client = KustoIngestFactory.CreateQueuedIngestClient(kustoConnectionString);

    var kustoIngestionProperties = new KustoQueuedIngestionProperties(databaseName: "myDB", tableName: "events")
    {
        ReportLevel = IngestionReportLevel.FailuresOnly,
        ReportMethod = IngestionReportMethod.Table,
        FlushImmediately = false
    };

    var streamIdentifier = Guid.NewGuid();
    var clientResult = await client.IngestFromStreamAsync(...);

    // Poll the ingestion status every 15 seconds until the batch leaves the Pending state.
    var ingestionStatus = clientResult.GetIngestionStatusBySourceId(streamIdentifier);
    while (ingestionStatus.Status == Status.Pending)
    {
        await Task.Delay(TimeSpan.FromSeconds(15));
        ingestionStatus = clientResult.GetIngestionStatusBySourceId(streamIdentifier);
    }

    if (ingestionStatus.Status == Status.Failed)
    {
        return false;
    }

    return true;
}
Since I don't want my WebJob to wait that long when there are not many events coming in, or when only QA is at work, I made the following changes:
- Don't await IngestAsync, thus making Run a synchronous method.
- Add an Action onError parameter to IngestAsync and call it when the ingest task fails; call AppInsightLogError() and logger.LogError() inside onError instead of returning false.
- Replace IngestFromStreamAsync with IngestFromStream.
Basically, I want to ensure the events reach the Azure queue and that any exception is thrown before I poll for the ingestion status; then I exit the Run method without waiting for the status polling, and if anything fails it will be logged (see the sketch below).
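Roughly, the change looks like this (a simplified sketch; the exact onError wiring is an assumption based on the description above, not the actual code):
public static void Run([EventHubTrigger] EventData[] events, ILogger logger)
{
    // Fire-and-forget: enqueue the ingestion and return without awaiting the status polling.
    _ = _adxIngester.IngestAsync(events, onError: () =>
    {
        AppInsightLogError();
        logger.LogError("ADX ingestion failed");
    });
}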
My questions are:
Is it good practice to avoid having the WebJob wait for minutes? If not, why?
If yes, is my solution good enough for this problem? Otherwise, how should I do it?
If you are ingesting small batches of data and wish to cut down on the ingestion batching times, please read the following article: https://learn.microsoft.com/en-us/azure/kusto/concepts/batchingpolicy
Ingestion Batching policy allows you to control the batching limits per database or table.
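For example, one way to lower the batching limits for the "events" table from the question is to run the policy command from C# with the Kusto admin client (a sketch only; the cluster URL and AAD application credentials are placeholders):
using Kusto.Data;
using Kusto.Data.Common;
using Kusto.Data.Net.Client;

var kcsb = new KustoConnectionStringBuilder("https://<cluster>.kusto.windows.net", "myDB")
    .WithAadApplicationKeyAuthentication("<appId>", "<appKey>", "<tenantId>");

using (ICslAdminProvider adminClient = KustoClientFactory.CreateCslAdminProvider(kcsb))
{
    // Seal a batch after 30 seconds, 500 items, or 1 GB, whichever comes first.
    adminClient.ExecuteControlCommand("myDB",
        ".alter table events policy ingestionbatching " +
        "@'{\"MaximumBatchingTimeSpan\":\"00:00:30\",\"MaximumNumberOfItems\":500,\"MaximumRawDataSizeMB\":1024}'");
}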
The ingestion is performed in a few phases. One phase is done on the client side, and one phase is done on the server side:
The ingest client code you're using takes your stream, uploads it to a blob, and then sends a message to a queue.
Any exceptions thrown during that phase will indeed be propagated to your code, which is why you should use a try-catch block; in the catch block you can log the error message, as you suggested.
You can either use IngestFromStreamAsync with the await keyword, or use IngestFromStream. The first option is better if you’d like to release the worker thread and save resources. But choosing between those two doesn’t have anything to do with the polling. The polling is relevant to the second phase.
Kusto’s DataManagement component is constantly listening to messages in the queue, so as soon as it gets to your new message, it will read it and see some metadata information about the new ingestion request, such as the blob URI where the data is stored and such as the Azure table where failures/progress should be updated.
That phase is done remotely by the server side, and you have an option to wait in your client code for each single ingestion and poll until the server completes the ingestion process. If there are any exceptions during that phase, then of course they won’t be propagated to your client code, but rather you’ll be able to examine the Azure table and see what happened.
You can also decide to defer that status examination, and have it done in some other task.
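For instance, a rough sketch of deferring that status examination to a background task, reusing the clientResult / streamIdentifier pattern from the question's code (the helper class below is illustrative, not part of the Kusto SDK):
using System;
using System.Threading.Tasks;
using Kusto.Ingest;
using Microsoft.Extensions.Logging;

static class IngestionStatusPoller
{
    // Polls until the ingestion leaves the Pending state, then logs failures.
    public static Task PollAsync(IKustoIngestionResult clientResult, Guid streamIdentifier, ILogger logger)
    {
        return Task.Run(async () =>
        {
            var ingestionStatus = clientResult.GetIngestionStatusBySourceId(streamIdentifier);
            while (ingestionStatus.Status == Status.Pending)
            {
                await Task.Delay(TimeSpan.FromSeconds(15));
                ingestionStatus = clientResult.GetIngestionStatusBySourceId(streamIdentifier);
            }

            if (ingestionStatus.Status == Status.Failed)
            {
                logger.LogError("ADX ingestion {SourceId} failed", streamIdentifier);
            }
        });
    }
}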
IngestFromStreamAsync uploads your data to a blob and posts a message to the Data Management input queue. It will not wait for the aggregation time, and the final state you will get is Queued.
FlushImmediately defaults to false.
If there isn't any additional processing, consider using the Event Hub to Kusto connection.
[Edited] responding to comments:
The Queued state indicates the blob is pending ingestion. You can track the status with the .show ingestion failures command, metrics, and the ingestion logs.
The Event Hub connection goes through queued ingestion by default. It will use streaming ingestion only if that is set as a policy on the database / table.
Some of the processing can be done in ADX, using an ingestion mapping and an update policy.
Our application uses a WebJob to generate data. At the moment we are facing a problem: sometimes the WebJob is stopped or restarted unexpectedly while it is processing the message queue. Because the WebJob doesn't know when it is being forced to restart or stop, it cannot mark which data has already been processed before the restart/stop happens.
Is there any way to get a stopping/restarting notification so we can synchronize the data?
Many thanks!
If you're using queues, a restarting webjob shouldn't cause you to have any data loss. Since the message will not be completed, it will be put back on the queue for (re)processing.
As far as the restarting goes: make sure you don't have any scenarios in code that break the WebJob completely.
Add Application Insights and add an alert for the specific case you're looking for.
See Set Alerts in Application Insights
Sometimes WebJobs can get killed by scale-in procedures. You can make sure they die gracefully by listening for the shutdown event using the class Microsoft.Azure.WebJobs.WebJobsShutdownWatcher in the NuGet package Microsoft.Azure.WebJobs.
As it appears in version 1.1.2 of the NuGet package:
public sealed class WebJobsShutdownWatcher : IDisposable
{
    // Begin watching for a shutdown notification from Antares.
    public WebJobsShutdownWatcher();

    // Get a CancellationToken that is signaled when the shutdown notification is detected.
    public CancellationToken Token { get; }

    // Stop watching for the shutdown notification
    public void Dispose();
}
A way to use this: in your WebJob's Program.cs you get a cancellation token and register the code you want to be executed when shutdown happens.
private static void Main()
{
    ...
    var cancellationToken = new WebJobsShutdownWatcher().Token;
    ...
    cancellationToken.Register(() =>
    {
        //Your data operations here
    });
    ...
}
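As a side note (not part of the original answer), functions triggered by the WebJobs SDK can also accept a CancellationToken parameter that the host signals on shutdown, which is useful for marking which data has been processed before the process exits. A minimal sketch with a hypothetical queue name:
public static async Task ProcessQueueMessage(
    [QueueTrigger("myqueue")] string message,   // "myqueue" is a placeholder queue name
    TextWriter log,
    CancellationToken cancellationToken)
{
    while (!cancellationToken.IsCancellationRequested)
    {
        // ... process a chunk of work and persist which items are done ...
        await Task.Delay(TimeSpan.FromSeconds(1));
    }

    log.WriteLine("Shutdown requested; progress has been saved.");
}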
Thanks Diana for the information. I tried this approach, but it did not work very well; the WebJob only waits 5 seconds before restarting/stopping, although I set 60 seconds in the settings.job file. Here is my code:
static void Main()
{
    var config = new JobHostConfiguration();
    var host = new JobHost(config);

    var cancellationToken = new WebJobsShutdownWatcher().Token;
    cancellationToken.Register(() =>
    {
        //Raise the signal
    });

    // The following code ensures that the WebJob will be running continuously
    host.RunAndBlock();
}
I have the following WebJob project where I'm trying to deploy a TimerTrigger WebJob function; however, I cannot get it to run on a scheduled basis when deploying it via "Publish as Azure WebJob..." in Visual Studio 2017.
Program.cs
class Program
{
    static void Main()
    {
        var config = new JobHostConfiguration();
        if (config.IsDevelopment)
        {
            config.UseDevelopmentSettings();
        }
        config.UseTimers();

        var host = new JobHost(config);
        host.RunAndBlock();
    }
}
Functions.cs
public class Functions
{
    public static async Task ProcessAsync([TimerTrigger("0 */3 * * * *")] TimerInfo timerInfo, TextWriter log)
    {
        ...
    }
}
webjob-publish-settings.json
{
  "$schema": "http://schemastore.org/schemas/json/webjob-publish-settings.json",
  "webJobName": "TestWebJob",
  "runMode": "OnDemand"
}
Settings.job
{ "schedule": "0 */3 * * * *" }
The documentation for this is pretty much non-existent, and it's baffling why Azure supports scheduled CRON TimerTriggers but doesn't actually include them as an option when deploying.
Is this possible?
If you have created a scheduled WebJob manually, you have probably found that it generates a settings.job file to set the schedule. The SCHEDULE column in the portal then reads the schedule from that file and shows it. If you deploy a TimerTrigger WebJob with VS 2017, it won't generate this file, because you have defined the schedule in the TimerTrigger function.
I did some tests to show this. First I created a WebJob with a TimerTrigger and deployed it; it showed the same result as yours, with an n/a SCHEDULE. Then I killed the WebJob process, uploaded a settings.job, and refreshed the page (not the refresh button in the portal); the SCHEDULE changed to the CRON expression. If you delete the file, it changes back.
As for the log, in my opinion it's also caused by settings.job: if you have this file, the WebJob itself is triggered every x minutes, and if you don't, the function is triggered every x minutes inside one WebJob run.
If you still have questions, please let me know.
It seems that the above code is working. However, it runs completely differently from how you would imagine if you're familiar with running "Scheduled" WebJobs manually.
If you were to run them manually, you would usually see the Schedule at the top level, along with the Status updating every x minutes, etc.:
and you would also see the logs update at the parent level, like so:
However, when deploying it using the above method via Visual Studio 2017, you only ever get the WebJob running once for the duration of its lifetime. As a result, you would only ever get one parent log in the logs list too.
Though if you click into it, you will see an individual log for each scheduled function invocation:
Hopefully this will make sense for other people who are looking into setting up WebJobs :)
The Azure WebJob with runMode set to "OnDemand" keeps running and I am not able to stop it.
I don't see anything that needs to be handled besides the job itself.
{
  "$schema": "http://schemastore.org/schemas/json/webjob-publish-settings.json",
  "webJobName": "ScheduledJob",
  "runMode": "OnDemand"
}
ScheduledJob Triggered Running n/a
The only way to restart it is by restarting the web service and then starting the job manually. And then it keeps running; it does not stop.
What is going on with this webjob?
Update 1:
I am using the code from the PnP Partner Pack, which can be found here.
As the code is too long, I am just providing the code in the Program.cs file.
For the rest, please have a look at the link I posted above.
static void Main()
{
    var job = new PnPPartnerPackProvisioningJob();
    job.UseThreading = false;
    job.AddSite(PnPPartnerPackSettings.InfrastructureSiteUrl);
    job.UseAzureADAppOnlyAuthentication(
        PnPPartnerPackSettings.ClientId,
        PnPPartnerPackSettings.Tenant,
        PnPPartnerPackSettings.AppOnlyCertificate);
    job.Run();

#if DEBUG
    Console.ReadLine();
#endif
}
In your code, the PnPPartnerPackProvisioningJob class inherits from the TimerJob class.
The TimerJob class does not have a stop method, and once the timer job has started executing, you cannot really stop it unless you restart the WebJob. For more details, you could refer to this article.
So if your requirement is to cancel a job, you will need to delete the timer job definition. However, if the timer job has started executing, you cannot really stop it unless you reset IIS or stop the SharePoint Timer Service.