I am using django-celery-beat to schedule my tasks.
Currently I create an interval schedule of 2 days and a periodic task that runs at that interval.
My main problem is: when I schedule a task to run every 2 days, at what time does it run? And can I change that time? I need the interval task to run at a specific time provided by the user.
The code written so far is:
# update_or_create returns an (object, created) tuple
periodic_task, created = PeriodicTask.objects.update_or_create(
    name='my-interval-task',
    defaults={
        'interval': schedule,  # IntervalSchedule object
        'task': 'myapp.tasks.auto_refresh',
    }
)
Have a look at the crontab class (celery.schedules.crontab).
E.g. schedule = crontab(hour=0, minute=0, day_of_month='2-30/2') fires on every even-numbered day at midnight.
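To pin such a schedule to a time chosen by the user, the crontab fields can be derived from that time. Below is a minimal sketch in plain Python, assuming the user supplies a `datetime.time`. The helper names `crontab_fields` and `expand_step_range` are hypothetical; the resulting dict is meant to be passed on as keyword arguments to `crontab(...)`, which is not done here.

```python
from datetime import time


def expand_step_range(start, stop, step):
    """Days matched by a cron range with a step, e.g. '2-30/2'."""
    return list(range(start, stop + 1, step))


def crontab_fields(run_at, day_of_month="2-30/2"):
    """Build crontab keyword arguments that fire at run_at on the given days."""
    return {
        "minute": str(run_at.minute),
        "hour": str(run_at.hour),
        "day_of_month": day_of_month,
    }


fields = crontab_fields(time(hour=9, minute=30))
print(fields)
print(expand_step_range(2, 30, 2))  # even-numbered days
print(expand_step_range(2, 30, 3))  # every third day, starting on the 2nd
```

Note that a day-of-month pattern restarts every month, so the gap between the last matching day of one month and the first of the next is not always exactly two days.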
Related
I would like to add a Celery periodic task based on a user API request, from which I get the time and date. Is there any way to trigger celery beat from an API call?
Solution 1
If all you need is to execute the task at the date/time received from the API request, you don't need to run celery beat. All you need is to set the estimated time of arrival (ETA).
The ETA (estimated time of arrival) lets you set a specific date and time that is the earliest time at which your task will be executed.
tasks.py
from celery import Celery

app = Celery("my_app")


@app.task
def my_task(param):
    print(f"param {param}")
Logs (Producer)
>>> from dateutil.parser import parse
>>> import tasks
>>>
>>> api_request = "2021-08-20T15:00+08:00"
>>> tasks.my_task.apply_async(("some param",), eta=parse(api_request))
<AsyncResult: 82a5710c-095f-49a2-9289-d6b86e53d4da>
Logs (Consumer)
$ celery --app=tasks worker --queues=celery --loglevel=INFO
[2021-08-20 14:58:18,234: INFO/MainProcess] Task tasks.my_task[82a5710c-095f-49a2-9289-d6b86e53d4da] received
[2021-08-20 15:00:00,161: WARNING/ForkPoolWorker-4] param some param
[2021-08-20 15:00:00,161: WARNING/ForkPoolWorker-4]
[2021-08-20 15:00:00,161: INFO/ForkPoolWorker-4] Task tasks.my_task[82a5710c-095f-49a2-9289-d6b86e53d4da] succeeded in 0.0005905449997953838s: None
As you can see, the task executed at exactly 15:00 as requested by the mocked API request.
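When the timestamp comes from an API request, it may help to validate it before passing it as `eta`. Here is a small sketch using only the standard library instead of `dateutil`; the helper name `eta_from_request` is hypothetical, and it assumes the request always carries an ISO 8601 timestamp with a UTC offset.

```python
from datetime import datetime, timezone


def eta_from_request(timestamp):
    """Parse an ISO 8601 timestamp with a UTC offset into an aware UTC datetime."""
    eta = datetime.fromisoformat(timestamp)
    if eta.tzinfo is None:
        # A timestamp without an offset is ambiguous, so reject it early.
        raise ValueError("timestamp must include a UTC offset")
    return eta.astimezone(timezone.utc)


eta = eta_from_request("2021-08-20T15:00+08:00")
print(eta.isoformat())
# The result would then be passed along as my_task.apply_async(args, eta=eta).
```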
Solution 2
If you need to execute the task periodically based on the API request, e.g. every minute from the time indicated, then you have to run celery beat (note that this is a separate process from the worker). Because the task has to be added at runtime without restarting Celery, you can't just add a new entry to the schedule configuration, since it would not be picked up. You also can't update the celerybeat-schedule file (the file that holds the information about scheduled tasks and that the celery beat scheduler reads from time to time), because it is locked while the scheduler is running.
To solve this, replace the celerybeat-schedule file with a database-backed schedule, so that it can be updated even while the celery beat scheduler is running. This way, if you add a new scheduled task to the database at runtime, the scheduler sees it and executes it accordingly, without a restart.
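The idea can be illustrated with a toy model. This is only a sketch of the principle and not how celery beat or any scheduler backend is implemented; the `schedule` table, its columns, and the `due_tasks` helper are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE schedule (name TEXT PRIMARY KEY, every_seconds INTEGER)")
conn.execute("INSERT INTO schedule VALUES ('task-a', 60)")


def due_tasks(conn, tick):
    """Re-read the table on every tick, so new rows are seen without a restart."""
    rows = conn.execute("SELECT name, every_seconds FROM schedule ORDER BY name")
    return [name for name, every in rows if tick % every == 0]


print(due_tasks(conn, 120))  # only task-a is known so far

# A row added while the "scheduler" is running is picked up on the next tick.
conn.execute("INSERT INTO schedule VALUES ('task-b', 120)")
print(due_tasks(conn, 240))
```

Because the schedule is queried each time rather than loaded once, there is no locked file to work around.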
For the solution, since you are not in a Django application (where you could use django-celery-beat), I used celery-sqlalchemy-scheduler. You can see the detailed steps in my other answer posted here: https://stackoverflow.com/a/68858483/11043825
I want to know if a DAG in Airflow can be scheduled to run again 2 minutes after the previous run of the same DAG has completed.
Edit:
My DAG should run in such a way that every time it completes a run, it waits for 2 minutes and then starts running again. I don't want to schedule the DAG to run every 2 minutes; instead, each run should start 2 minutes after the previous run of the same DAG has completed.
You could schedule your dag at an arbitrary time in a day and use the TriggerDagRunOperator to trigger itself again. To wait for 2 minutes before triggering itself, you could simply introduce a sleep task.
DAG:
Task 1 >> Task 2 >> Task 3 >> Task 4
where Task 3 is BashOperator(bash_command="sleep 120") and Task 4 is TriggerDagRunOperator(trigger_dag_id="this-dag-id")
Yes, you can schedule a DAG to run every 2 minutes.
Set schedule_interval='*/2 * * * *'
schedule_interval accepts a cron expression:
https://en.wikipedia.org/wiki/Cron#CRON_expression
Structuring dag
If you want the DAG to rerun 2 minutes after each completion instead, try configuring TriggerDagRunOperator.
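The difference between the two approaches shows up in the start times. The following is a plain-Python timing simulation, not Airflow code; the function names and the run durations are made up for illustration.

```python
def cron_starts(first_start, interval, count):
    """Start times for a fixed schedule: one run every `interval` seconds."""
    return [first_start + i * interval for i in range(count)]


def self_trigger_starts(first_start, durations, delay):
    """Start times when each run re-triggers the DAG `delay` seconds after it ends."""
    starts, start = [], first_start
    for duration in durations:
        starts.append(start)
        start = start + duration + delay
    return starts


durations = [30, 300, 45]  # hypothetical run lengths in seconds
print(cron_starts(0, 120, 3))
print(self_trigger_starts(0, durations, 120))
```

With a fixed cron schedule the second run would be due while a 300-second run is still in progress, whereas the self-triggering chain always leaves a 2-minute gap after completion.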
Is it possible to get the seconds remaining before a scheduled Celery task starts? E.g. my_task.apply_async(countdown=100)
You can inspect the dump of scheduled tasks: http://docs.celeryproject.org/en/latest/userguide/workers.html#dump-of-scheduled-eta-tasks
Every scheduled task has an eta field, so you can calculate the remaining time.
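A minimal sketch of that calculation follows. The `scheduled` dict is a hypothetical sample shaped like the per-worker dump; in a live application it would come from `app.control.inspect().scheduled()`. The exact format of the eta string may differ between Celery versions, so the ISO 8601 format used here is an assumption.

```python
from datetime import datetime, timezone

# Hypothetical sample shaped like the per-worker dump of scheduled tasks.
scheduled = {
    "worker1@example": [
        {
            "eta": "2021-08-20T07:00:00+00:00",
            "request": {"id": "abc", "name": "tasks.my_task"},
        },
    ],
}


def seconds_remaining(dump, now):
    """Map each task id to the number of seconds left until its ETA."""
    remaining = {}
    for entries in dump.values():
        for entry in entries:
            eta = datetime.fromisoformat(entry["eta"])
            remaining[entry["request"]["id"]] = max(0.0, (eta - now).total_seconds())
    return remaining


now = datetime(2021, 8, 20, 6, 58, 20, tzinfo=timezone.utc)
print(seconds_remaining(scheduled, now))
```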
Every day I have a set of tasks (a few hundred) that need to be performed at random times. So a periodic task is probably not what I want. It seems that I cannot change the crontab dynamically.
Should I:
Dynamically schedule a task whenever it is received from a user, and let Celery "wake up" at the scheduled time to perform the task? If so, how is it done?
OR
Create a Celery task that wakes up every 60 seconds and looks in the database for tasks scheduled for the current time, so that the database acts as a queue? I am wondering if this would put too much load on the server.
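For the first option, one possible shape is to compute a concrete datetime per task and hand it to Celery as an ETA. This sketch covers only the time calculation with the standard library; the helper name `random_etas` and the task ids are hypothetical.

```python
import random
from datetime import datetime, timedelta, timezone


def random_etas(day_start, task_ids, rng):
    """Assign each task a random time within the 24 hours after day_start."""
    return {
        task_id: day_start + timedelta(seconds=rng.randrange(24 * 60 * 60))
        for task_id in task_ids
    }


day_start = datetime(2021, 8, 20, tzinfo=timezone.utc)
etas = random_etas(day_start, ["t1", "t2", "t3"], random.Random(0))
for task_id, eta in sorted(etas.items()):
    # In a Celery app each pair would become some_task.apply_async((task_id,), eta=eta).
    print(task_id, eta.isoformat())
```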
I have this task, which is set to crontab(day_of_month=1). But when it runs, it keeps sending the task every minute, although it is supposed to run only once.
From my tasks.py:
from celery.task import periodic_task
from celery.task.schedules import crontab


@periodic_task(run_every=crontab(day_of_month=1))
def Sample():
    ...
Am I missing something?
By default every crontab field is '*', so crontab(day_of_month=1) matches every minute of every hour on that day. You need to specify the minute and hour as well.
Change @periodic_task(run_every=crontab(day_of_month=1)) to @periodic_task(run_every=crontab(minute=0, hour=0, day_of_month=1))
This runs the task only once, at midnight on the first day of the month.
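The effect of the defaults can be checked with a simplified matcher. This is a plain-Python illustration, not Celery's own implementation; it only supports '*' or a single integer per field, and the function names are made up.

```python
from datetime import datetime, timedelta


def matches(moment, minute="*", hour="*", day_of_month="*"):
    """Simplified crontab check: each field is '*' or a single integer."""
    fields = ((minute, moment.minute), (hour, moment.hour), (day_of_month, moment.day))
    return all(spec == "*" or int(spec) == value for spec, value in fields)


def count_fires(start, days, **spec):
    """Count the matching minutes in the `days` days following start."""
    total = days * 24 * 60
    return sum(matches(start + timedelta(minutes=i), **spec) for i in range(total))


january = datetime(2021, 1, 1)
print(count_fires(january, 31, day_of_month=1))
print(count_fires(january, 31, minute=0, hour=0, day_of_month=1))
```

Leaving minute and hour unset yields one match per minute of the first day, while pinning both to 0 yields a single match for the month.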