Django crontab, running a job every 12 hours - django

I have a django crontab sceduled to run every 12 hours, meaning it should run twice per day however, it is running more than that.
Can anyone tell me what's wront with it ?
('* */12 * * *', 'some_method','>>'+os.path.join(BASE_DIR,'log/mail.log'))
Also what changes I need to make if I need it to run every 24 hours?

After every 12 hours you want to run job any particular minute from 0 to 59, not every other minute. So it should be (assuming 0th minute):
('0 */12 * * *', 'some_method','>>'+os.path.join(BASE_DIR,'log/mail.log'))
For once in a day or every 24 hours (You can decide any specific hour from 0 to 23, assuming at midnight):
('0 0 * * *', 'some_method','>>'+os.path.join(BASE_DIR,'log/mail.log'))

Related

Can you schedule a cron that starts at a specific minute and hour and repeats?

For example, I'd like it to trigger at 6:30, 6:40, 6:50, 7:00, 7:10, 7:20..etc
Is it possible to schedule a cron job that starts at 6:30 every day and runs every 10 minutes until 10:00?
I've tried (30/10 6-10 ? * * * *), but that triggers 6:30, 6:40, 6:50, 7:30, 7:40 and misses the triggers between 7 and 7:30
This is on AWS EventBridge's scheduler.
With 30/10 it will start after 30 mins for every 10 mins. Due to this for every hour it will miss the 0-30 mins window
Why not try 0/10 6-10 ? * * *
Starting 3 - 6.00 6.10 & 6.20 will be extra but it works for other times

AWS/Cron start schedule specific minute start every 5 minutes

I'm using AWS's Cron validator in EventBridge. I want to run a schedule starting at 8:55 that runs every 5 minutes until 5 P.M. Why is AWS telling me this cron schedule runs every hour starting on the 55-minute?
55/5 13-22 ? * MON-FRI *
Yes, 55/5 13-22 ? * MON-FRI * this means start at 55-minute then continue with 5 min until 59. That means it will run once in a hour.
*/5 14-21 ? * MON-FRI you can use this with 55 13 ? * MON-FRI to handle 8:55.

Aws Lambda function triggers on a delay time for 2 out of 3 cron jobs

I have a Lambda function that has 3 event triggers here are the Cron job for each:
Cron 1: cron(50/1 22 * * ? *)
Cron 2: cron(50/1 12 * * ? *)
Cron 3: cron(*/15 * * * ? *)
Now Cron 2 Timestamp logs reads as follows, which is ok. Notice that it starts 2-3 seconds into the intended trigger:
10
2021-07-10T05:59:03.867-07:00
11
2021-07-10T05:59:03.867-07:00
12
2021-07-10T05:59:02.314-07:00
START
13
2021-07-10T05:58:02.988-07:00
END
14
2021-07-10T05:58:02.988-07:00
15
2021-07-10T05:58:02.547-07:00
START
BUT Cron 1 & 3 starts over 30+ seconds into the intended trigger. I compared everything possible and there are no settings that are different (to my knowledge). Any idea why 2 of the 3 events have a delay but one doesn't? I understand a small 1-5 second delay by reading here https://docs.aws.amazon.com/AmazonCloudWatch/latest/events/ScheduledEvents.html but somethings seems to be off.
2021-07-10T06:30:37.253-07:00
2
2021-07-10T06:30:37.253-07:00
3
2021-07-10T06:30:33.929-07:00
4
2021-07-10T06:15:36.931-07:00
5
2021-07-10T06:15:36.931-07:00
6
2021-07-10T06:15:33.881-07:00
7
2021-07-10T06:00:34.037-07:00
8
2021-07-10T06:00:34.037-07:00
9
2021-07-10T06:00:33.596-07:00
The precision of Event Bridge is one minute:
All scheduled events use UTC+0 time zone, and the minimum precision for a schedule is one minute. Your scheduled rule runs within that minute, but not on the precise 0th second.
So your delays are perfectly fine and within the 1 minute interval.

Cron expression Aws to run every 5 minutes after EventBridge (Cloudwatch trigger) enables

I've seen several examples and all of them just trigger one job at a specific time, I have right now:
0 */5 * ? * *
and it triggers at mins 0,5,10, and on.
But, I need the trigger to run at +5 of the moment that the trigger was enabled.
So, if service becomes enable at 12:07 pm I need it to run then at 12:12 pm and on.
Is there a way to accomplish this?
Like you mentioned offsets are part of the solution to your problem.
0 */5+your_offset * ? * *
Now coming to what could be your offset:
Let's say cloudwatch-event bridge is enabled at some 12:07, (You can get that info from event details timestamp.)
your_offset = 7 + 5
// so your cron becomes : 0 */5+12 * ? * *
Or in general your
offset = the minute part of timestamp + 5
// for your to schedule 5 mins after service is enabled
Solution is simple:
Before create the rule check the minute time at that moment
time_minutes_now */5 * ? * *

Airflow: Dag scheduled twice a few seconds apart

I am trying to run a DAG only once a day at 00:15:00 (midnight 15 minutes), yet, it's being scheduled twice, a few seconds apart.
dag = DAG(
'my_dag',
default_args=default_args,
start_date=airflow.utils.dates.days_ago(1) - timedelta(minutes=10),
schedule_interval='15 0 * * * *',
concurrency=1,
max_active_runs=1,
retries=3,
catchup=False,
)
The main goal of that Dag is check for new emails then check for new files in a SFTP directory and then run a "merger" task to add those new files to a database.
All the jobs are Kubernetes pods:
email_check = KubernetesPodOperator(
namespace='default',
image="g.io/email-check:0d334adb",
name="email-check",
task_id="email-check",
get_logs=True,
dag=dag,
)
sftp_check = KubernetesPodOperator(
namespace='default',
image="g.io/sftp-check:0d334adb",
name="sftp-check",
task_id="sftp-check",
get_logs=True,
dag=dag,
)
my_runner = KubernetesPodOperator(
namespace='default',
image="g.io/my-runner:0d334adb",
name="my-runner",
task_id="my-runner",
get_logs=True,
dag=dag,
)
my_runner.set_upstream([sftp_check, email_check])
So, the issue is that there seems to be two runs of the DAG scheduled a few seconds apart. They do not run concurrently, but as soon as the first one is done, the second one kicks off.
The problem here is that the my_runner job is intended to only run once a day: it tries to create a file with the date as a suffix, and if the file already exists, it throws an exception, so that second run always throws an exception (because the file for the day has already been properly created by the first run)
Since an image (or two) are worth a thousand words, here it goes:
You'll see that there's a first run that is scheduled "22 seconds after 00:15" (that's fine... sometimes it varies a couple of seconds here and there) and then there's a second one that always seems to be scheduled "58 seconds after 00:15 UTC" (at least according to the name they get). So the first one runs fine, nothing else seems to be running... And as soon as it finishes the run, a second run (the one scheduled at 00:15:58) starts (and fails).
A "good" one:
A "bad" one:
Can you check the schedule interval parameter?
schedule_interval='15 0 * * * *'. The cron schedule takes only 5 parameters and I see an extra star.
Also, can you have fixed start_date?
start_date: datetime(2019, 11, 10)
It looks like setting the start_date to 2 days ago instead of 1 did the trick
dag = DAG(
'my_dag',
...
start_date=airflow.utils.dates.days_ago(2),
...
)
I don't know why.
I just have a theory. Maaaaaaybe (big maybe) the issue was that because.days_ago(...) sets a UTC datetime with hour/minute/second set to 0 and then subtracts whichever number of days indicated in the argument, just saying "one day ago" or even "one day and 10 minutes ago" didn't put the start_date over the next period (00:15) and that was somehow confusing Airflow?
Let’s Repeat That The scheduler runs your job one schedule_interval
AFTER the start date, at the END of the period.
https://airflow.readthedocs.io/en/stable/scheduler.html#scheduling-triggers
So, the end of the period would be 00:15... If my theory was correct, doing it airflow.utils.dates.days_ago(1) - timedelta(minutes=16) would probably also work.
This doesn't explain why if I set a date very far in the past, it just doesn't run, though. ¯\_(ツ)_/¯