Azure WebJob with TimerTrigger just firing 5 times - azure-webjobs

I recently created an Azure WebJob that is meant to run every 5 minutes, indefinitely. The problem is that it appears to run every 5 minutes, but only 5 times:
[08/25/2020 12:41:51 > 2e704f: INFO] The next 5 occurrences of the schedule will be:
[08/25/2020 12:41:51 > 2e704f: INFO] 8/25/2020 2:46:51 PM
[08/25/2020 12:41:51 > 2e704f: INFO] 8/25/2020 2:51:51 PM
[08/25/2020 12:41:51 > 2e704f: INFO] 8/25/2020 2:56:51 PM
[08/25/2020 12:41:51 > 2e704f: INFO] 8/25/2020 3:01:51 PM
[08/25/2020 12:41:51 > 2e704f: INFO] 8/25/2020 3:06:51 PM
On my Azure App Service => WebJobs panel I can see that it last ran 17 hours ago, which seems to confirm that it only fired at the scheduled times shown above.
My webjob-publish-settings.json is:
{
  "$schema": "http://schemastore.org/schemas/json/webjob-publish-settings.json",
  "webJobName": "Inbox",
  "runMode": "Scheduled"
}
My Functions.cs contains a single async method with this signature:
public async static Task SomethingAsync([TimerTrigger("00:05:00", RunOnStartup = true, UseMonitor = true)] TimerInfo timer)
And my Program.cs sets JobHostConfiguration to call the UseTimers() method:
public static void Main()
{
    var config = new JobHostConfiguration();
    if (config.IsDevelopment)
    {
        config.UseDevelopmentSettings();
    }
    config.UseTimers();
    var host = new JobHost(config);
    host.Start();
}
I've already tried a CRON expression instead of the TimeSpan expression, with the same result (I kept the TimeSpan expression since it's easier to read). Note that I'm using v2.3.0 of the Azure WebJobs SDK.
What exactly am I missing? Any help will be appreciated. Many thanks.

The root of my problem was a misunderstanding of the concepts.
What I really needed was a WebJob that calls my function every 5 minutes. I thought I needed a scheduled WebJob, and seeing the "scheduled occurrences" in the console output made me think the function was being called at those times when in fact it wasn't.
What was really happening was that my WebJob started, ran the function once (thanks to RunOnStartup = true), identified the specified TimerTrigger, and then the JobHost exited.
What did I change? I moved to a continuous WebJob:
1. Changed the runMode in webjob-publish-settings.json to "Continuous".
2. Changed the JobHost's method from Call() to RunAndBlock().
3. Set the TimerTrigger to 1 minute to try it out.
Be careful about the third point: I originally intended to launch the function every 5 minutes, and I remember getting SCM_COMMAND_IDLE_TIMEOUT errors because the CPU was idle for more than 120 seconds. That's another war I will probably fight later on.
So now I'm finally seeing my function called every minute:
Functions.SomethingAsync (08/26/2020 09:42:4 ...) Success 41 seconds ago (2 s running time)
Functions.SomethingAsync (08/26/2020 09:41:4 ...) Success 2 minutes ago (2 s running time)
Functions.SomethingAsync (08/26/2020 09:40:4 ...) Success 3 minutes ago (2 s running time)
Functions.SomethingAsync (08/26/2020 09:39:4 ...) Success 4 minutes ago (1 s running time)
Functions.SomethingAsync (08/26/2020 09:38:4 ...) Success 5 minutes ago (1 s running time)
Functions.SomethingAsync (08/26/2020 09:37:4 ...) Success 6 minutes ago (2 s running time)
Functions.SomethingAsync (08/26/2020 09:36:4 ...) Success 7 minutes ago (1 s running time)
Functions.SomethingAsync (08/26/2020 09:35:4 ...) Success 8 minutes ago (2 s running time)
Functions.SomethingAsync (08/26/2020 09:34:4 ...) Success 9 minutes ago (2 s running time)
Functions.SomethingAsync (08/26/2020 09:33:4 ...) Success 10 minutes ago (1 s running time)

Related

Can you schedule a cron that starts at a specific minute and hour and repeats?

For example, I'd like it to trigger at 6:30, 6:40, 6:50, 7:00, 7:10, 7:20, etc.
Is it possible to schedule a cron job that starts at 6:30 every day and runs every 10 minutes until 10:00?
I've tried (30/10 6-10 ? * * * *), but that triggers at 6:30, 6:40, 6:50, 7:30, 7:40 and misses the triggers between 7:00 and 7:30.
This is on AWS EventBridge's scheduler.
With 30/10, the schedule starts at minute 30 of each hour and then fires every 10 minutes, so in every hour it misses the 0-30 minute window.
Why not try 0/10 6-10 ? * * *
The first 3 triggers (6:00, 6:10 and 6:20) will be extra, but it works for the other times.
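To make the difference between the two minute fields concrete, here is a minimal Python sketch that expands a start/step field the way the answer describes it (start at the given minute, then step until the end of the hour). The helper names expand_field and fire_times are made up for this illustration and are not part of any AWS SDK; the sketch only models the minute and hour fields.

```python
from typing import List


def expand_field(field: str, lo: int = 0, hi: int = 59) -> List[int]:
    """Expand a 'start/step' field into the values it matches within [lo, hi]."""
    start_text, step_text = field.split("/")
    start = lo if start_text == "*" else int(start_text)
    return list(range(start, hi + 1, int(step_text)))


def fire_times(minute_field: str, first_hour: int, last_hour: int) -> List[str]:
    """List the H:MM trigger times for an inclusive range of hours."""
    minutes = expand_field(minute_field)
    return [f"{hour}:{minute:02d}"
            for hour in range(first_hour, last_hour + 1)
            for minute in minutes]


# 30/10 restarts at minute 30 in every hour, so 7:00-7:20 never fire.
print(fire_times("30/10", 6, 7))
# 0/10 covers the whole hour, at the cost of 6:00, 6:10 and 6:20.
print(fire_times("0/10", 6, 7))
```

The first call reproduces the gap reported in the question; the second shows the extra early triggers the answer mentions.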

Aws Lambda function triggers on a delay time for 2 out of 3 cron jobs

I have a Lambda function that has 3 event triggers. Here is the cron expression for each:
Cron 1: cron(50/1 22 * * ? *)
Cron 2: cron(50/1 12 * * ? *)
Cron 3: cron(*/15 * * * ? *)
The timestamp logs for Cron 2 read as follows, which is OK. Notice that it starts 2-3 seconds after the intended trigger time:
2021-07-10T05:59:03.867-07:00
2021-07-10T05:59:03.867-07:00
2021-07-10T05:59:02.314-07:00 START
2021-07-10T05:58:02.988-07:00 END
2021-07-10T05:58:02.988-07:00
2021-07-10T05:58:02.547-07:00 START
But Cron 1 and Cron 3 start more than 30 seconds after the intended trigger time. I compared everything I could, and as far as I know no settings differ. Any idea why 2 of the 3 events are delayed but one isn't? I understand a small 1-5 second delay from reading https://docs.aws.amazon.com/AmazonCloudWatch/latest/events/ScheduledEvents.html but something seems to be off.
2021-07-10T06:30:37.253-07:00
2021-07-10T06:30:37.253-07:00
2021-07-10T06:30:33.929-07:00
2021-07-10T06:15:36.931-07:00
2021-07-10T06:15:36.931-07:00
2021-07-10T06:15:33.881-07:00
2021-07-10T06:00:34.037-07:00
2021-07-10T06:00:34.037-07:00
2021-07-10T06:00:33.596-07:00
The precision of EventBridge is one minute:
All scheduled events use UTC+0 time zone, and the minimum precision for a schedule is one minute. Your scheduled rule runs within that minute, but not on the precise 0th second.
So your delays are perfectly fine and within the 1 minute interval.
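As a rough way to check this against the logs in the question, the sketch below measures how far into its scheduled minute each invocation started. It assumes a schedule that fires every step_minutes minutes, as in cron(*/15 * * * ? *); the function name delay_within_scheduled_minute is hypothetical.

```python
from datetime import datetime, timezone
from typing import Optional


def delay_within_scheduled_minute(timestamp: str, step_minutes: int) -> Optional[float]:
    """Seconds elapsed since the start of the scheduled minute (in UTC).

    Returns None when the invocation's minute is not on the schedule.
    """
    utc = datetime.fromisoformat(timestamp).astimezone(timezone.utc)
    if utc.minute % step_minutes != 0:
        return None
    return utc.second + utc.microsecond / 1_000_000


# Earliest log entry of three Cron 3 invocations, copied from the question.
for ts in ("2021-07-10T06:00:33.596-07:00",
           "2021-07-10T06:15:33.881-07:00",
           "2021-07-10T06:30:33.929-07:00"):
    delay = delay_within_scheduled_minute(ts, step_minutes=15)
    print(ts, delay, delay is not None and delay < 60)
```

Each of these invocations starts roughly 33 seconds into its scheduled minute, which is late compared with Cron 2 but still inside the one-minute window the documentation describes.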

Cron expression Aws to run every 5 minutes after EventBridge (Cloudwatch trigger) enables

I've seen several examples, and all of them just trigger one job at a specific time. Right now I have:
0 */5 * ? * *
and it triggers at minutes 0, 5, 10, and so on.
But I need the trigger to run 5 minutes after the moment the trigger was enabled, and every 5 minutes from then on.
So, if the service is enabled at 12:07 pm, I need it to run at 12:12 pm, 12:17 pm, and so on.
Is there a way to accomplish this?
As you mentioned, offsets are part of the solution to your problem.
0 */5+your_offset * ? * *
Now, coming to what your offset could be:
Let's say the CloudWatch/EventBridge rule is enabled at 12:07 (you can get that from the timestamp in the event details).
your_offset = 7 + 5
// so your cron becomes : 0 */5+12 * ? * *
Or, in general:
offset = the minute part of the timestamp + 5
// to schedule 5 minutes after the service is enabled
The solution is simple:
Before creating the rule, check the current minute at that moment and use it as the starting minute:
time_minutes_now */5 * ? * *
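The */5+offset notation in the first answer reads as shorthand and may not be accepted literally by a cron parser. One way to express the same idea, assuming the scheduler accepts the start/step form for the minute field (for example 2/5), is to derive the start value from the minute at which the rule is enabled. This is a sketch under that assumption, with hypothetical helper names, and it only lines up cleanly because 5 divides 60.

```python
from typing import List


def minute_field_for(enabled_minute: int, step: int = 5) -> str:
    """Build a 'start/step' minute field whose matches include enabled_minute + step."""
    if not 0 <= enabled_minute <= 59:
        raise ValueError("enabled_minute must be between 0 and 59")
    return f"{enabled_minute % step}/{step}"


def matching_minutes(field: str) -> List[int]:
    """Expand a 'start/step' minute field into the minutes of the hour it matches."""
    start, step = (int(part) for part in field.split("/"))
    return list(range(start, 60, step))


field = minute_field_for(7)     # rule enabled at 12:07
print(field)                    # 2/5
print(matching_minutes(field))  # includes 12, 17, 22, ...
```

With the rule enabled at 12:07 this yields 2/5, which matches minutes 2, 7, 12, 17 and so on, so the first trigger after enabling lands at 12:12.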

Airflow: Dag scheduled twice a few seconds apart

I am trying to run a DAG only once a day at 00:15:00 (midnight 15 minutes), yet, it's being scheduled twice, a few seconds apart.
dag = DAG(
    'my_dag',
    default_args=default_args,
    start_date=airflow.utils.dates.days_ago(1) - timedelta(minutes=10),
    schedule_interval='15 0 * * * *',
    concurrency=1,
    max_active_runs=1,
    retries=3,
    catchup=False,
)
The main goal of that DAG is to check for new emails, then check for new files in an SFTP directory, and then run a "merger" task to add those new files to a database.
All the jobs are Kubernetes pods:
email_check = KubernetesPodOperator(
    namespace='default',
    image="g.io/email-check:0d334adb",
    name="email-check",
    task_id="email-check",
    get_logs=True,
    dag=dag,
)
sftp_check = KubernetesPodOperator(
    namespace='default',
    image="g.io/sftp-check:0d334adb",
    name="sftp-check",
    task_id="sftp-check",
    get_logs=True,
    dag=dag,
)
my_runner = KubernetesPodOperator(
    namespace='default',
    image="g.io/my-runner:0d334adb",
    name="my-runner",
    task_id="my-runner",
    get_logs=True,
    dag=dag,
)
my_runner.set_upstream([sftp_check, email_check])
So, the issue is that there seem to be two runs of the DAG scheduled a few seconds apart. They do not run concurrently, but as soon as the first one is done, the second one kicks off.
The problem here is that the my_runner job is intended to run only once a day: it tries to create a file with the date as a suffix, and if the file already exists, it throws an exception. So the second run always throws an exception, because the file for the day has already been properly created by the first run.
Since an image (or two) are worth a thousand words, here it goes:
You'll see that there's a first run that is scheduled "22 seconds after 00:15" (that's fine... sometimes it varies a couple of seconds here and there) and then there's a second one that always seems to be scheduled "58 seconds after 00:15 UTC" (at least according to the name they get). So the first one runs fine, nothing else seems to be running... And as soon as it finishes the run, a second run (the one scheduled at 00:15:58) starts (and fails).
A "good" one:
A "bad" one:
Can you check the schedule interval parameter?
schedule_interval='15 0 * * * *'. The cron schedule takes only 5 fields, and I see an extra star.
Also, can you use a fixed start_date?
start_date: datetime(2019, 11, 10)
It looks like setting the start_date to 2 days ago instead of 1 did the trick:
dag = DAG(
'my_dag',
...
start_date=airflow.utils.dates.days_ago(2),
...
)
I don't know why.
I just have a theory. Maaaaaaybe (big maybe) the issue was this: because days_ago(...) produces a UTC datetime with hour/minute/second set to 0 and then subtracts the number of days given in the argument, saying "one day ago" or even "one day and 10 minutes ago" didn't put the start_date past the next period (00:15), and that was somehow confusing Airflow?
Let’s Repeat That: The scheduler runs your job one schedule_interval AFTER the start date, at the END of the period.
https://airflow.readthedocs.io/en/stable/scheduler.html#scheduling-triggers
So, the end of the period would be 00:15... If my theory is correct, using airflow.utils.dates.days_ago(1) - timedelta(minutes=16) would probably also work.
This doesn't explain why if I set a date very far in the past, it just doesn't run, though. ¯\_(ツ)_/¯
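To see what the two start_date expressions evaluate to, here is a small sketch that mimics days_ago using only the standard library, based on the behaviour described above (midnight UTC of the current day, minus the given number of days). The days_ago function below is a stand-in written for this illustration, not Airflow's own code, and the sample dates are arbitrary.

```python
from datetime import datetime, timedelta, timezone


def days_ago(n: int, now: datetime) -> datetime:
    """Stand-in for airflow.utils.dates.days_ago: midnight UTC of `now`, minus n days."""
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    return midnight - timedelta(days=n)


# Evaluate both expressions as the scheduler would see them on two consecutive days.
for day in (10, 11):
    now = datetime(2019, 11, day, 0, 15, 22, tzinfo=timezone.utc)
    original = days_ago(1, now) - timedelta(minutes=10)
    changed = days_ago(2, now)
    print(now.isoformat(), "| original:", original.isoformat(), "| changed:", changed.isoformat())
```

Both expressions yield a start_date that shifts forward by a day every time the DAG file is parsed on a new day, which is presumably why the first answer asks for a fixed start_date. The sketch does not confirm or refute the theory above; it only shows the values involved.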

Django-background-tasks : tasks being randomly locked and never unlocked

I am using django-background-tasks 1.2.0 on Ubuntu 18.04 and I'm running it with a cron job. Is it possible that my cron job somehow starts the tasks right before it is refreshed, and then they get stuck?
It could be one or many stuck tasks at the same moment, depending on how many pending there are.
Cronjob:
* * * * * /project/manage.py process_tasks --duration=59 --sleep=2
settings.py
BACKGROUND_TASK_RUN_ASYNC = True
BACKGROUND_TASK_ASYNC_THREADS = 4
After six months of extensive testing, the only way I don't get any stuck tasks is by running two parallel cron jobs that overlap each other, so that at the moment of refresh there is always one running. I've tried with a single job running for a longer period (3600 seconds), but I ran into the same problem.
1 * * * * /project/manage.py process_tasks --duration=3600 --sleep=2
24 * * * * /project/manage.py process_tasks --duration=3600 --sleep=2
I hope it'll help you guys as well.