The scenario here is that we have a Service Bus queue and a WebJob. The WebJob reads a message from the Service Bus queue and calls a logic app, which then goes on and does other work.
The problem we are facing is that after the WebJob reads the message from the Service Bus, it occasionally does not delete it afterwards, which causes the logic app to be called repeatedly and flood our database with data.
Here is the message in question as seen from Azure Management Studio:
https://gyazo.com/7f57b460421d1bb4a69fcb8b5a9ff01f
As you can see, there is no lock time on the message. I have tried to play around with the settings to no avail.
When I manually try to delete that message from Azure Management Studio, it is also unsuccessful, but no error message is shown.
Does anyone know what is going on here? I feel like this is a problem with the queue itself rather than a bug in our code, since the 2-3 tools I have used are all unable to delete this message from the queue.
It looks like the message is only deleted after a specific time (it does not go to the dead-letter queue, however).
Thanks
So just for information, I figured my own issue out. When the file scraper job runs, it puts a message on the Service Bus. The WebJob that then runs picks up that file and stores it locally as well as in blob storage.
The problem was that the WebJob keeps a local queue of what it has processed, which was never cleared, so every time the WebJob ran it processed all the previous files as well.
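The bug pattern can be sketched in a few lines of Python (hypothetical names; the real job is a .NET WebJob, but the logic is the same): a locally persisted work list that is extended every run but never cleared, so each run reprocesses every earlier file.

```python
def run_buggy(local_queue, new_files):
    """Appends to the local queue but never clears it, so old files get reprocessed."""
    local_queue.extend(new_files)
    return list(local_queue)  # everything ever queued gets processed again

def run_fixed(local_queue, new_files):
    """Clears the local queue once its contents have been processed."""
    local_queue.extend(new_files)
    work = list(local_queue)
    local_queue.clear()  # next run sees only genuinely new files
    return work

queue_a, queue_b = [], []
run_buggy(queue_a, ["file1"])
print(run_buggy(queue_a, ["file2"]))  # ['file1', 'file2'] -> file1 reprocessed
run_fixed(queue_b, ["file1"])
print(run_fixed(queue_b, ["file2"]))  # ['file2'] only
```

The fix is simply making sure the locally persisted list is cleared (or deduplicated) after each run.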
Related
I have simple Python code which subscribes to a Service Bus subscription. I have containerized it and deployed it as part of ACI on Azure.
When a message arrives on the Service Bus subscription, the code is executed; it runs its logic and then waits indefinitely for another message to appear.
The code is what Azure has provided in its documentation for the Python SDK here.
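For illustration, the receive-and-complete loop in that kind of sample boils down to roughly the following sketch. `StubReceiver` is a made-up stand-in so the example is self-contained (the real code uses an azure-servicebus subscription receiver with the same receive/complete shape), and `max_idle_polls` is added only so the example terminates; the real loop waits forever.

```python
class StubReceiver:
    """Stand-in for an azure-servicebus subscription receiver (hypothetical)."""
    def __init__(self, batches):
        self.batches = list(batches)
        self.completed = []

    def receive_messages(self, max_wait_time=5):
        # Returns the next batch, or [] when the wait times out with no messages.
        return self.batches.pop(0) if self.batches else []

    def complete_message(self, msg):
        self.completed.append(msg)

def pump(receiver, handle, max_idle_polls=None):
    """Receive, process, complete - looping (indefinitely in the real code)."""
    idle = 0
    processed = 0
    while max_idle_polls is None or idle < max_idle_polls:
        msgs = receiver.receive_messages(max_wait_time=5)
        if not msgs:
            idle += 1  # the real loop just keeps waiting here
            continue
        idle = 0
        for msg in msgs:
            handle(msg)
            receiver.complete_message(msg)  # delete the message once handled
            processed += 1
    return processed

r = StubReceiver([["m1", "m2"], [], ["m3"]])
print(pump(r, handle=print, max_idle_polls=2))  # prints m1, m2, m3, then 3
```

The point for the billing question below: between batches the process is blocked waiting inside `receive_messages`, but the container itself is still running.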
Since ACI is serverless and bills per second, I just wanted confirmation: will I get billed even while it is not executing my code and is just waiting for a message to appear on the topic/subscription (event-based)?
Yes. You will be billed as long as any container instance is in the running state; the cost only stops once you stop all the container instances. So even though your code is only waiting, the instance is still running and still billed.
We've got a little Java scheduler running on AWS ECS. It's doing what cron used to do on our old monolith: it fires up (Fargate) tasks in Docker containers. We've got a task that runs every hour and it's quite important to us. I want to know if it crashes or fails to run for any reason (e.g. the Java scheduler fails, or someone turns the task off).
I'm looking for a service that will alert me if it's not notified. I want to call the notification system every time the script runs successfully. Then if the alert system doesn't get the "OK" notification as expected, it shoots off an alert.
I figure this kind of service must exist, and I don't want to re-invent the wheel trying to build it myself. I guess my question is: what's it called? And where can I go to get that kind of thing? (We're using AWS, obviously, and we've got a PagerDuty account.)
We use this approach for these types of problems. First, the task writes a timestamp to a file in S3 or EFS; this file is the external evidence that the task ran to completion. Then you need an HTTP-based service that reads that file and checks whether the timestamp is still valid, i.e. has been updated in the last hour. This could be a simple PHP or Node.js script, exposed to the public web, e.g. https://example.com/heartbeat.php. The script returns an HTTP response code of 200 if the timestamp file is present and valid, or a 500 if not. We then use StatusCake to monitor the URL and notify us via its PagerDuty integration if there is an incident. We usually include a message in the response body so a human can see the nature of the error.
This may seem tedious, but it is foolproof: any failure anywhere along the line is immediately notified. StatusCake has a great free service level. The same approach can be used to monitor any critical task. We've learned the hard way that critical cron-type tasks and processes can fail for any number of reasons, and you want to know before it becomes customer-critical. 24x7x365 monitoring of these types of tasks is necessary, and it helps us sleep better at night.
Note: We always have a daily system test event that triggers a PagerDuty notification at 9am each day. For the truly paranoid, this assures that PagerDuty itself has not failed in some way, e.g. through misconfiguration. Our support team knows that if they don't get a test alert each day, there is a problem in the notification system itself. The tech on duty has to acknowledge the incident as per SOP. If they do not acknowledge it, it escalates to the next tier, and we know we have to have a talk about response times. It keeps people on their toes. This is the final piece to ensure you have robust monitoring infrastructure.
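The freshness check at the heart of that heartbeat script is tiny. Here is a minimal Python sketch of it (the answer above used PHP/Node.js; the function name and the hourly threshold are my assumptions for illustration):

```python
def heartbeat_status(last_run_epoch, now, max_age=3600.0):
    """Return (http_status, body) as the heartbeat endpoint would.

    last_run_epoch: timestamp read from the file in S3/EFS, or None if missing.
    max_age: 3600s because the task is expected to run hourly.
    """
    if last_run_epoch is None:
        return 500, "no heartbeat file found"
    age = now - last_run_epoch
    if age <= max_age:
        return 200, f"ok: task last completed {age:.0f}s ago"
    return 500, f"stale: task last completed {age:.0f}s ago"

print(heartbeat_status(1000.0, 1500.0))   # (200, 'ok: task last completed 500s ago')
print(heartbeat_status(1000.0, 99999.0))  # stale -> 500
```

StatusCake (or any URL monitor) only needs the 200/500 distinction; the body text is the human-readable message mentioned above.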
Opsgenie has a heartbeat service which is basically a watchdog timer. You can configure it to call you if you don't ping them within x minutes.
Unfortunately I would not recommend them. I have been using them for 4 years and they have changed their account system twice and left my paid account orphaned silently. I have to find a new vendor as soon as I have some free time.
I noticed that multiple instances of my Web job are receiving the same message and end up acting on it. This is not the desired behavior. I would like multiple messages to be processed concurrently, however, I do not want the same message being processed by multiple instances of the web job.
My web job is of the continuous running type.
I use a QueueTrigger to receive the message and invoke the function
My function runs for several hours.
I have looked into the JobHostConfiguration.Queues.BatchSize and MaxDequeueCount properties and I am not sure about these. I simply want a single instance processing a message, and it could take several hours to complete.
This is what I see in the web job logs indicating the message is received twice.
[01/24/2017 16:17:30 > 7e0338: INFO] Executing: 'Functions.RunExperiment' - Reason: 'New queue message detected on 'runexperiment'.'
[01/24/2017 16:17:30 > 7e0338: INFO] Executing: 'Functions.RunExperiment' - Reason: 'New queue message detected on 'runexperiment'.'
According to the official documentation, if we use Azure queue storage in a WebJob on multiple instances, we do not need to write code to prevent multiple instances from processing the same queue message:
The WebJobs SDK queue trigger automatically prevents a function from processing a queue message multiple times; functions do not have to be written to be idempotent.
I deployed a WebJob to a Web App scaled out to 2 instances, and it also worked correctly (it did not execute twice for the same queue message): it ran on the 2 instances and there were no duplicate executed messages.
So it is very odd that the queue message is executed twice. Please try to debug whether there are 2 queue messages with the same content being triggered.
The following is my debug code. It writes the message, along with the execution time and instance ID, into another queue.
public static void ProcessQueueMessage(
    [QueueTrigger("queue")] string message,
    [Queue("logqueue")] out string newMessage,
    TextWriter log)
{
    // Record which instance handled the message and when, then forward it to a log queue.
    string instance = Environment.GetEnvironmentVariable("WEBSITE_INSTANCE_ID");
    string newMsg = $"WEBSITE_INSTANCE_ID:{instance}, timestamp:{DateTime.Now}, Message:{message}";
    log.WriteLine(newMsg);
    Console.WriteLine(newMsg);
    newMessage = newMsg;
}
I had the same issue of a single message processed multiple times at the same time. The issue disappeared as soon as I have set the MaxPollingInterval property...
I'm running some tests on my web app which has a WebJob running to handle some backend tasks.
I connect to the queue using Cloud Explorer in Visual Studio and clear all the messages from the queue. When I restart my WebJob, it still finds messages and tries to process them.
Where are these messages coming from? If I clear the queue through Cloud Explorer in Visual Studio, shouldn't the queue be empty? BTW, I also clear the poison queue.
The Clear Queue command in the VS Queue explorer will indeed delete all messages in the queue, including any messages that may currently be invisible due to their invisibility timeout. When viewing the queue, if there are any invisible messages you'll see them in the display text in the bottom of the window (e.g. "0 of 5 messages").
So if you've executed the Clear command and it shows "0 of 0" messages, the queue is completely empty. If after that your queue-triggered function gets invoked on that queue, you must have some code somewhere that is adding messages to it. Not a very satisfying answer perhaps, but neither the WebJobs SDK nor Azure Storage itself is going to be manufacturing messages in this way :)
I have a Service Broker message queue; each message calls a web service via a CLR stored procedure to do some processing.
I have an issue where the conversation does not end. It works fine, everything it needs to do is done, and it doesn't error, but the conversation never ends even though EndConversation is called.
It seems to be coming back from the web service call and calling EndConversation before the processing the web service is doing has completed, so the conversation does not end and the message is processed again.
Is there any way to stop the web service call from returning before it has completed, so that the conversation in the message queue can end successfully?
I believe this is what is happening because if I cut out some of the work the web service call is doing so that it runs quicker, then everything runs fine and the conversation ends.
I have also stepped through all of the steps happening in the web service call, and everything works, there are no errors etc.
May need to see some of the code, especially the initiator.
Are you using explicit transactions?
Make sure you have a COMMIT TRANSACTION statement after END CONVERSATION.