Where does Celery store task functions? - django

I have a Django app that runs periodic tasks via celery+kombu+Oracle. It took me some time to notice that, to pick up changes to the task code, the Celery worker needs to be restarted, not the Django server (uWSGI).
The question is: where does Celery store that code? In some sort of cache, or what?

A Celery system consists of one or more (usually Python) processes which load your methods/tasks into memory.
It's the same as launching an interactive shell. If you do:
>>> from spam import eggs
eggs is loaded into the memory of that shell process. If you then edit eggs on disk, you'll have to restart the shell to see the changes.
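The same effect can be reproduced outside Celery. Below is a minimal sketch using only the standard library; the module name spam_demo and the temporary directory are made up for illustration. It imports a function, rewrites the file on disk, and shows that the running process keeps executing the old code until the module is explicitly reloaded. For a Celery worker the practical equivalent of that reload is a restart.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Skip .pyc files so the demo always recompiles from the source on disk.
sys.dont_write_bytecode = True

# Hypothetical module written to a throwaway directory.
workdir = Path(tempfile.mkdtemp())
module_path = workdir / "spam_demo.py"
module_path.write_text("def eggs():\n    return 'old'\n")

# Start from a clean slate if the demo is re-run in the same interpreter.
sys.modules.pop("spam_demo", None)
sys.path.insert(0, str(workdir))
importlib.invalidate_caches()

spam_demo = importlib.import_module("spam_demo")
before = spam_demo.eggs()

# Edit the code on disk while the process keeps running.
module_path.write_text("def eggs():\n    return 'new code'\n")
after_edit = spam_demo.eggs()      # still the version loaded in memory

# Only an explicit reload (or a process restart) picks up the change.
importlib.reload(spam_demo)
after_reload = spam_demo.eggs()

print(before, after_edit, after_reload)
```

The uWSGI process and the Celery worker each hold their own in-memory copy of the code, which is why restarting one does not affect the other.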

Celery runs several worker processes, separate from the django server process.
These processes load the python code into memory and execute it. They continue running until shut down.
If you update the python code on disk the change will not be picked up by the running processes - you will need to restart them.

Related

Tasks created from celery tasks getting created twice

We are using Celery 3.1.17 with a Redis backend in our Flask 0.10.1 application. On our server, every Celery task that is created from within another Celery task gets created twice. For example:
@celery.task(name='send_some_xyz_users_alerts')
def send_some_xyz_users_alerts():
    list_of_users = find_some_list_of_users()
    for user in list_of_users:
        send_user_alert.delay(user)

@celery.task(name='send_user_alert')
def send_user_alert(user):
    data = get_data_for_user(user)
    send_mail_to_user(data)
If we start send_some_xyz_users_alerts from our application, it runs once. I then see 2 send_user_alert tasks running in Celery for each user, and the two tasks have different task_ids. We have 2 workers running on the server. Sometimes the duplicate tasks run on the same worker, sometimes on different workers. I have tried a lot to find the problem, without any luck, and would really appreciate it if someone knows why this could happen. Things were running fine for months on these versions of Celery and Flask, and suddenly we are seeing this problem on our servers. Tasks run fine in the local environment.

Celery - How to route task to local worker only

I have a Django view where a user can upload a file to process.
I'd like to hand off the processing to a celery task but I need to give the task a path to the file.
The problem is I have 3 servers running the Django app and the same three servers running celery workers.
Is there a way I can tell Celery I only want the task to run on a worker that's on the same server where the file was uploaded?
(Or is there a better way to do this? I don't have any shared location that all three servers can access.)

Django Celery Beat admin updating Cron Schedule Periodic task not taking effect

I'm running a site using Django 10, RabbitMQ, and Celery 4 on CentOS 7.
My Celery Beat and Celery Worker instances are controlled by supervisor and I'm using the django celery database scheduler.
I've scheduled a cron-style task using the crontab schedule in Django admin.
When I start the Celery beat and worker instances, the job fires as expected.
But if I change the schedule time in Django admin, the changes are not picked up unless I restart the celery-beat instance.
Is there something I am missing or do I need to write my own scheduler?
Celery Beat, with the 'django_celery_beat.schedulers.DatabaseScheduler' loads the schedule from the database. According to the following doc https://media.readthedocs.org/pdf/django-celery-beat/latest/django-celery-beat.pdf this should force Celery Beat to reload:
A schedule that runs at a specific interval (e.g. every 5 seconds).
• django_celery_beat.models.CrontabSchedule
A schedule with fields like entries in cron: minute hour day-of-week day_of_month month_of_year.
• django_celery_beat.models.PeriodicTasks
This model is only used as an index to keep track of when the schedule has changed. Whenever you update a PeriodicTask a counter in this table is also incremented, which tells the celery beat service to reload the schedule from the database.
If you update periodic tasks in bulk, you will need to update the counter manually:
from django_celery_beat.models import PeriodicTasks
PeriodicTasks.changed()
From the above I would expect the Celery Beat process to check the table regularly for any changes.
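To make the mechanism described in that doc excerpt concrete, here is a simplified in-memory sketch of a change counter that a polling scheduler compares on each tick. It is not django-celery-beat's actual implementation; ScheduleStore, PollingScheduler, and their methods are hypothetical names, and a plain dict stands in for the database.

```python
class ScheduleStore:
    """Stands in for the database: schedule entries plus a change counter."""

    def __init__(self):
        self.entries = {}
        self.change_counter = 0

    def update_task(self, name, crontab):
        # A normal single update also bumps the counter.
        self.entries[name] = crontab
        self.change_counter += 1

    def bulk_update(self, mapping):
        # A bulk update bypasses the per-task hook, so the counter is untouched.
        self.entries.update(mapping)

    def changed(self):
        # Manual bump, analogous in spirit to the call quoted above.
        self.change_counter += 1


class PollingScheduler:
    """Reloads its in-memory schedule only when the counter has moved."""

    def __init__(self, store):
        self.store = store
        self.seen_counter = None
        self.schedule = {}
        self.reloads = 0

    def tick(self):
        if self.seen_counter != self.store.change_counter:
            self.schedule = dict(self.store.entries)
            self.seen_counter = self.store.change_counter
            self.reloads += 1
        return self.schedule


store = ScheduleStore()
store.update_task("report", "0 6 * * *")
scheduler = PollingScheduler(store)
scheduler.tick()                                  # initial load

store.bulk_update({"report": "30 7 * * *"})       # counter not bumped
stale = scheduler.tick()["report"]                # scheduler still sees the old entry

store.changed()                                   # bump the counter manually
fresh = scheduler.tick()["report"]                # scheduler reloads
print(stale, fresh, scheduler.reloads)
```

If the running beat process never sees the counter move, it keeps using the schedule it already has in memory, which would match the behaviour described in the question.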
I changed Celery from 4.0 to 3.1.25 and Django to 1.9.11, and installed djcelery 3.1.17. Then I tested again and it works. So maybe it's a bug.
I solved it by:
Creating a separate worker process that consumes a RabbitMQ queue.
When Django updates the database it posts a message to the queue containing the name of the Celery Beat process (name defined by Supervisor configuration).
The worker process then restarts the named Celery Beat process.
A bit long-winded, but it does the job. It also makes it easier to manage multiple Django apps on the same server that require the same functionality.
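A rough sketch of the restart worker described in this answer, under loud assumptions: a standard-library queue.Queue stands in for the RabbitMQ queue, the process name site_a_celerybeat is invented, and the command runner is injectable so the example can be exercised without Supervisor installed. The only external command assumed is supervisorctl restart followed by the process name.

```python
import queue
import subprocess


def restart_command(process_name):
    """Build the Supervisor command that restarts one named process."""
    return ["supervisorctl", "restart", process_name]


def consume_restart_requests(requests, run=subprocess.run):
    """Drain the queue and restart every Celery Beat process named in it.

    `requests` is a queue.Queue here; in the setup described above it would
    be a RabbitMQ queue that Django posts to after updating the database.
    """
    restarted = []
    while True:
        try:
            name = requests.get_nowait()
        except queue.Empty:
            break
        run(restart_command(name), check=True)
        restarted.append(name)
    return restarted


# Usage with a fake runner that records commands instead of executing them.
issued = []
pending = queue.Queue()
pending.put("site_a_celerybeat")  # hypothetical Supervisor process name
handled = consume_restart_requests(
    pending, run=lambda cmd, check: issued.append(cmd)
)
print(issued)
```

Passing the process name in the message is what lets a single worker serve several Django apps on the same server, as the answer notes.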

Gunicorn is creating workers every second

I am running Django using Gunicorn behind Nginx. In one of my installations, when I run the gunicorn process, I keep getting debug output, as if workers were being created every second (I assume this because Django is loading very slowly; note the message "[20205] [DEBUG] 3 workers"). You can check the detailed output at this gist
I am running 3 more installations with a similar setup without any such issues, and the respective sites load almost instantly.
Any idea why this is happening? Thanks.
The polling of the workers every second on --log-level debug was introduced in gunicorn==19.2.
Change the log level to info.
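If the log level is set in a config file rather than on the command line, the change could look like the fragment below. Gunicorn config files are plain Python; the file name gunicorn.conf.py is an assumption about this particular setup.

```python
# gunicorn.conf.py (assumed file name)
# "info" hides the per-second "N workers" debug message.
loglevel = "info"
```

On the command line, the equivalent is passing --log-level info instead of --log-level debug.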

Django with celery: scheduled task (ETA) executed multiple times in parallel

I'm developing a web application with Django which uses Celery to process asynchronous tasks, especially for transactional emails.
One of my email tasks is scheduled with the ETA option, but it is executed multiple times in parallel, resulting in a chain of duplicate emails, which is very annoying. I can't figure out exactly why.
I checked my Django code twice and I'm sure that the task is published only once.
I'm using Redis as the broker and result backend.
My Celery daemon is hosted on Heroku and launched via this command:
python manage.py celeryd -E -B --loglevel=INFO
Thanks for your help.
EDIT: I found a valid solution here, thanks to a guy on the #celery IRC channel: http://loose-bits.com/2010/10/distributed-task-locking-in-celery.html
Have you checked the "Ensuring a task is only executed one at a time" docs?
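The general idea behind that kind of task locking can be sketched without Celery. The version below is only an illustration under stated assumptions: InMemoryLockStore is a made-up stand-in for a shared cache with an atomic "add if absent" operation (a real deployment needs a store visible to all workers), and send_email_once is a hypothetical task body.

```python
import threading
import time
from contextlib import contextmanager


class InMemoryLockStore:
    """Toy stand-in for a shared cache with an atomic add-if-absent."""

    def __init__(self):
        self._guard = threading.Lock()
        self._expiry = {}

    def add(self, key, ttl):
        """Set the key only if it is absent or expired; return True on success."""
        now = time.monotonic()
        with self._guard:
            expires = self._expiry.get(key)
            if expires is not None and expires > now:
                return False
            self._expiry[key] = now + ttl
            return True

    def delete(self, key):
        with self._guard:
            self._expiry.pop(key, None)


@contextmanager
def task_lock(store, key, ttl=60):
    """Yield True if the lock was acquired; release it afterwards if so."""
    acquired = store.add(key, ttl)
    try:
        yield acquired
    finally:
        if acquired:
            store.delete(key)


sent = []


def send_email_once(store, email_id):
    """Hypothetical task body: skip the work if another run holds the lock."""
    with task_lock(store, f"send-email-{email_id}") as acquired:
        if not acquired:
            return False
        sent.append(email_id)
        return True


store = InMemoryLockStore()
store.add("send-email-42", ttl=60)      # pretend another worker is already sending
first = send_email_once(store, 42)      # skipped, lock is held elsewhere
store.delete("send-email-42")           # the other worker finishes
second = send_email_once(store, 42)     # runs now
print(first, second, sent)
```

Note that a lock like this only prevents overlapping executions. If duplicates arrive one after another, the task additionally needs to record that the email was already sent.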