Django-background-tasks : tasks being randomly locked and never unlocked - django

I am using django-background-tasks 1.2.0 on Ubuntu 18.04, and I'm running it with a cron job. Is it possible that my cron job somehow starts the tasks right before it is refreshed, and then they get stuck?
There can be one or many stuck tasks at the same moment, depending on how many are pending.
Cronjob:
* * * * * /project/manage.py process_tasks --duration=59 --sleep=2
settings.py
BACKGROUND_TASK_RUN_ASYNC = True
BACKGROUND_TASK_ASYNC_THREADS = 4

After six months of extensive testing, the only way I don't get any stuck tasks is by running two parallel cron jobs that overlap each other, so that at the moment of refresh there is always one running. I've tried with one job running for a longer period (3600 seconds), but I got the same problem.
1 * * * * /project/manage.py process_tasks --duration=3600 --sleep=2
24 * * * * /project/manage.py process_tasks --duration=3600 --sleep=2
I hope it'll help you guys as well.
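
The overlap between the two crontab entries above can be checked with a bit of arithmetic. This is only an illustrative sketch: it assumes cron starts each job exactly on its minute and that each process runs for its full --duration. The helper name seconds_into_run is made up for this example and is not part of django-background-tasks.

```python
PERIOD = 3600  # both crontab entries fire once per hour


def seconds_into_run(t, start_minute, period=PERIOD):
    """Seconds elapsed since the most recent start of a job that cron
    launches at `start_minute` past every hour (t is in seconds)."""
    return (t - start_minute * 60) % period


# Moment the minute-1 job is restarted: how far along is the minute-24 job?
at_first_restart = seconds_into_run(1 * 60, start_minute=24)
# Moment the minute-24 job is restarted: how far along is the minute-1 job?
at_second_restart = seconds_into_run(24 * 60, start_minute=1)

print(at_first_restart, at_second_restart)  # 2220 1380
```

Under those assumptions, each job is restarted while the other one is in the middle of its 3600-second run (37 and 23 minutes in), so the two handovers never coincide.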

Related

Can you schedule a cron that starts at a specific minute and hour and repeats?

For example, I'd like it to trigger at 6:30, 6:40, 6:50, 7:00, 7:10, 7:20, etc.
Is it possible to schedule a cron job that starts at 6:30 every day and runs every 10 minutes until 10:00?
I've tried (30/10 6-10 ? * * * *), but that triggers at 6:30, 6:40, 6:50, 7:30, 7:40 and misses the triggers between 7:00 and 7:30.
This is on AWS EventBridge's scheduler.
With 30/10, the minutes field starts at minute 30 and then fires every 10 minutes. Because of this, every hour misses the 0-30 minute window.
Why not try 0/10 6-10 ? * * *
The first three triggers (6:00, 6:10 and 6:20) will be extra, but it works for the other times.
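
To see the difference between the two minute fields, here is a small sketch that expands only the minutes and hours fields into trigger times. It is not an EventBridge API; fire_times is a hypothetical helper that handles just the start/step and from-to forms used in this thread.

```python
def fire_times(minute_field, hour_field):
    """Expand a 'start/step' minute field and a 'from-to' hour field
    into a list of 'H:MM' trigger times for one day."""
    start, step = (int(x) for x in minute_field.split("/"))
    first_hour, last_hour = (int(x) for x in hour_field.split("-"))
    minutes = range(start, 60, step)
    return [f"{h}:{m:02d}"
            for h in range(first_hour, last_hour + 1)
            for m in minutes]


asked = fire_times("30/10", "6-10")
suggested = fire_times("0/10", "6-10")

print(asked[:5])      # ['6:30', '6:40', '6:50', '7:30', '7:40']
print(suggested[:5])  # ['6:00', '6:10', '6:20', '6:30', '6:40']
print(suggested[-1])  # 10:50
```

Note that with hours 6-10 the expanded list also contains triggers during the 10 o'clock hour (up to 10:50), so the schedule does not stop at exactly 10:00.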

Cron expression Aws to run every 5 minutes after EventBridge (Cloudwatch trigger) enables

I've seen several examples, and all of them just trigger one job at a specific time. Right now I have:
0 */5 * ? * *
and it triggers at minutes 0, 5, 10, and so on.
But I need the trigger to run 5 minutes after the moment the trigger was enabled.
So, if the service is enabled at 12:07 pm, I need it to run at 12:12 pm and every 5 minutes after that.
Is there a way to accomplish this?
As you mentioned, offsets are part of the solution to your problem.
0 */5+your_offset * ? * *
Now coming to what could be your offset:
Let's say the CloudWatch/EventBridge rule is enabled at 12:07 (you can get that from the timestamp in the event details).
your_offset = 7 + 5
// so your cron becomes : 0 */5+12 * ? * *
Or, in general:
offset = the minute part of timestamp + 5
// to schedule 5 minutes after the service is enabled
The solution is simple:
Before creating the rule, check the current minute and use it in the expression:
time_minutes_now */5 * ? * *
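
The */5+offset notation used above is not standard cron syntax. One possible way to express the same idea is the start/step form (as in 0/10 or 30/10), with the start set to the enable minute modulo 5. The sketch below keeps the field layout from the question and assumes the scheduler accepts start/step in the minutes field; offset_cron and minutes_fired are hypothetical helpers, and the enable time is invented.

```python
from datetime import datetime


def offset_cron(enabled_at, every=5):
    """Build a cron string whose minute field is aligned to `enabled_at`."""
    offset = enabled_at.minute % every
    return f"0 {offset}/{every} * ? * *"


def minutes_fired(enabled_at, every=5):
    """Minutes of each hour at which the aligned schedule fires."""
    return list(range(enabled_at.minute % every, 60, every))


enabled = datetime(2023, 1, 1, 12, 7)  # hypothetical enable time, 12:07
print(offset_cron(enabled))        # 0 2/5 * ? * *
print(minutes_fired(enabled)[:4])  # [2, 7, 12, 17]
```

For a rule enabled at 12:07, minute 12 is in the list, so the next trigger after enabling would be 12:12. The spacing only stays even across the hour boundary when the interval divides 60, which is the case for 5.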

Airflow: Dag scheduled twice a few seconds apart

I am trying to run a DAG only once a day at 00:15:00 (15 minutes past midnight), yet it's being scheduled twice, a few seconds apart.
dag = DAG(
    'my_dag',
    default_args=default_args,
    start_date=airflow.utils.dates.days_ago(1) - timedelta(minutes=10),
    schedule_interval='15 0 * * * *',
    concurrency=1,
    max_active_runs=1,
    retries=3,
    catchup=False,
)
The main goal of that DAG is to check for new emails, then check for new files in an SFTP directory, and then run a "merger" task to add those new files to a database.
All the jobs are Kubernetes pods:
email_check = KubernetesPodOperator(
    namespace='default',
    image="g.io/email-check:0d334adb",
    name="email-check",
    task_id="email-check",
    get_logs=True,
    dag=dag,
)
sftp_check = KubernetesPodOperator(
    namespace='default',
    image="g.io/sftp-check:0d334adb",
    name="sftp-check",
    task_id="sftp-check",
    get_logs=True,
    dag=dag,
)
my_runner = KubernetesPodOperator(
    namespace='default',
    image="g.io/my-runner:0d334adb",
    name="my-runner",
    task_id="my-runner",
    get_logs=True,
    dag=dag,
)
my_runner.set_upstream([sftp_check, email_check])
So, the issue is that there seem to be two runs of the DAG scheduled a few seconds apart. They do not run concurrently, but as soon as the first one is done, the second one kicks off.
The problem here is that the my_runner job is intended to only run once a day: it tries to create a file with the date as a suffix, and if the file already exists, it throws an exception, so that second run always throws an exception (because the file for the day has already been properly created by the first run)
Since an image (or two) are worth a thousand words, here it goes:
You'll see that there's a first run that is scheduled "22 seconds after 00:15" (that's fine... sometimes it varies a couple of seconds here and there) and then there's a second one that always seems to be scheduled "58 seconds after 00:15 UTC" (at least according to the name they get). So the first one runs fine, nothing else seems to be running... And as soon as it finishes the run, a second run (the one scheduled at 00:15:58) starts (and fails).
A "good" one:
A "bad" one:
Can you check the schedule interval parameter?
You have schedule_interval='15 0 * * * *'. A standard cron schedule takes only 5 fields, and I see an extra star.
Also, can you use a fixed start_date?
start_date=datetime(2019, 11, 10)
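
As a quick way to catch the extra field, here is a minimal sketch that splits an expression into the five classic cron fields and rejects anything else. It does not use Airflow or its cron parser; split_cron is a hypothetical helper. Some cron parsers accept a sixth field (often seconds), and whether that is what produced the second run here is only a guess.

```python
FIELD_NAMES = ("minute", "hour", "day_of_month", "month", "day_of_week")


def split_cron(expr):
    """Map a classic 5-field cron expression to named fields, or raise."""
    fields = expr.split()
    if len(fields) != len(FIELD_NAMES):
        raise ValueError(f"expected 5 fields, got {len(fields)}: {expr!r}")
    return dict(zip(FIELD_NAMES, fields))


print(split_cron("15 0 * * *")["hour"])  # 0
try:
    split_cron("15 0 * * * *")
except ValueError as err:
    print(err)  # expected 5 fields, got 6: '15 0 * * * *'
```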
It looks like setting the start_date to 2 days ago instead of 1 did the trick
dag = DAG(
    'my_dag',
    ...
    start_date=airflow.utils.dates.days_ago(2),
    ...
)
I don't know why.
I just have a theory. Maaaaaaybe (big maybe) the issue was that, because days_ago(...) sets a UTC datetime with hour/minute/second set to 0 and then subtracts the number of days given in the argument, just saying "one day ago" or even "one day and 10 minutes ago" didn't put the start_date past the next period (00:15), and that was somehow confusing Airflow?
"Let's Repeat That: The scheduler runs your job one schedule_interval AFTER the start date, at the END of the period."
https://airflow.readthedocs.io/en/stable/scheduler.html#scheduling-triggers
So, the end of the period would be 00:15... If my theory is correct, setting it to airflow.utils.dates.days_ago(1) - timedelta(minutes=16) would probably also work.
This doesn't explain why if I set a date very far in the past, it just doesn't run, though. ¯\_(ツ)_/¯
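
To make the start_date values in this theory concrete, here is a small sketch that does not import Airflow. days_ago_like is a hypothetical stand-in that mimics the behaviour described above (midnight UTC minus n days), and the "now" timestamp is invented for illustration.

```python
from datetime import datetime, timedelta, timezone


def days_ago_like(n, now):
    """Stand-in for days_ago: midnight UTC of `now`, minus n days."""
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    return midnight - timedelta(days=n)


# Hypothetical moment at which the DAG file is parsed.
now = datetime(2019, 11, 12, 0, 15, 22, tzinfo=timezone.utc)

original = days_ago_like(1, now) - timedelta(minutes=10)
changed = days_ago_like(2, now)

print(original.isoformat())  # 2019-11-10T23:50:00+00:00
print(changed.isoformat())   # 2019-11-10T00:00:00+00:00
```

Because the value is recomputed from the current time, the start_date moves forward every day, which is one reason the earlier answer suggests a fixed start_date.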

Django crontab, running a job every 12 hours

I have a django crontab scheduled to run every 12 hours, meaning it should run twice per day. However, it is running more often than that.
Can anyone tell me what's wrong with it?
('* */12 * * *', 'some_method','>>'+os.path.join(BASE_DIR,'log/mail.log'))
Also, what changes do I need to make if I need it to run every 24 hours?
With '*' in the minute field, the job runs at every minute of hours 0 and 12. You want it to run at one particular minute (0 to 59) of every 12th hour instead. So it should be (assuming minute 0):
('0 */12 * * *', 'some_method','>>'+os.path.join(BASE_DIR,'log/mail.log'))
For once a day, i.e. every 24 hours (you can pick any specific hour from 0 to 23; this assumes midnight):
('0 0 * * *', 'some_method','>>'+os.path.join(BASE_DIR,'log/mail.log'))
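
The difference between the three schedules can be checked by counting how many times each one fires per day. This is a rough sketch, not part of django-crontab: expand and runs_per_day are hypothetical helpers that only understand '*', '*/n' and a plain number, and they assume the day, month and weekday fields are all '*'.

```python
def expand(field, lo, hi):
    """Expand one cron field ('*', '*/n' or a plain number) into values."""
    if field == "*":
        return list(range(lo, hi + 1))
    if field.startswith("*/"):
        return list(range(lo, hi + 1, int(field[2:])))
    return [int(field)]


def runs_per_day(expr):
    """Count daily runs from the minute and hour fields only."""
    minute, hour, *_ = expr.split()
    return len(expand(minute, 0, 59)) * len(expand(hour, 0, 23))


print(runs_per_day("* */12 * * *"))  # 120
print(runs_per_day("0 */12 * * *"))  # 2
print(runs_per_day("0 0 * * *"))     # 1
```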

Jenkins - Build Periodically doesn't work

I have googled several posts, but can't find anyone mentioning that their Build Periodically is not working in spite of configuring the cron the right way.
I have the below setting for Build Periodically which I guess is correct, as I see the below comment. Still nothing happens. I have Jenkins 1.574 (latest stable build).
"Would last have run at Friday, 8 August 2014 07:01:40 o'clock BST; would next run at Friday, 8 August 2014 07:16:40 o'clock BST."
# every fifteen minutes (perhaps at :07, :22, :37, :52)
H/15 * * * *
I have tried several combinations like 15 * * * *, H * * * *, etc., but nothing seems to help. Can someone help me debug this and point me in the right direction?
Thanks,
Raghu
I have faced the same issue.
My Jenkins version 2.50 running on Ubuntu 16.04.2 LTS.
To resolve it, just create a new job (do not copy the old one) with the same configuration.
I really don't know what was the issue, but this is the solution I found.