How to set max task per child in Celery with Django? - django

I am trying to set some settings for Celery in my Django setup, but wherever I put this:
CELERYD_MAX_TASKS_PER_CHILD=1
it always allows multiple tasks to start at the same time. I tried putting it in settings.py and in proj.settings. My celery.py is as follows:
from __future__ import absolute_import, unicode_literals
import os
from celery import Celery
# set the default Django settings module for the 'celery' program.
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'proj.settings')
app = Celery('proj', backend='redis://', broker='redis://localhost')
# Using a string here means the worker doesn't have to serialize
# the configuration object to child processes.
# - namespace='CELERY' means all celery-related configuration keys
# should have a `CELERY_` prefix.
app.config_from_object('django.conf:settings', namespace='CELERY')
# Load task modules from all registered Django app configs.
app.autodiscover_tasks()

The place where it should go is settings.py, using the CELERY_-prefixed new-style setting names (since celery.py loads configuration with namespace='CELERY'):
CELERY_WORKER_CONCURRENCY = 1  # this is the one I was actually looking for
CELERY_WORKER_MAX_TASKS_PER_CHILD = 1

There is no limit by default.
On old (pre-4.0) Celery versions you could set it on the configuration object directly:
from celery import conf
conf.CELERYD_MAX_TASKS_PER_CHILD = 1  # max_tasks_per_child
You can also pass it on the command line when starting the worker (the flag name depends on the version):
--maxtasksperchild=1
or
--max-tasks-per-child=1
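On recent Celery (4+) with the Django namespace shown in the question's celery.py, a minimal settings.py fragment would look like this (a sketch, assuming the current-style setting names):

```python
# settings.py (Django) -- new-style Celery setting names with the
# CELERY_ prefix required by namespace='CELERY'
CELERY_WORKER_CONCURRENCY = 1          # run only one worker process
CELERY_WORKER_MAX_TASKS_PER_CHILD = 1  # recycle each child after one task
```

Remember to restart the worker after changing these; settings are read at startup.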

Related

django import-export-celery cannot import resource

I'm following this repo but I got this error:
Error: Import error cannot import name 'ProfileResource' from 'crowdfunding.models' (C:\_\_\_\_\_\crowdfunding\models.py)
which supposedly performs an asynchronous import. The problem is that it cannot detect my ProfileResource.
I have specified in my settings.py that the resource should be retrieved from admin.py:
def resource():
    from crowdfunding.admin import ProfileResource
    return ProfileResource

IMPORT_EXPORT_CELERY_MODELS = {
    "Profile": {
        'app_label': 'crowdfunding',
        'model_name': 'Profile',
        'resource': resource,
    }
}
but it can't seem to do that.
My celery.py is this:
from __future__ import absolute_import, unicode_literals
import os
import sys
from celery import Celery
# sys.path.append("../")
# Set the default Django settings module for the 'celery' program.
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'mainapp.settings')
from django.conf import settings
app = Celery('mainapp',
             broker='amqp://guest:guest@localhost:15672//',
             # broker='localhost',
             # backend='rpc://',
             backend='db+sqlite:///db.sqlite3',
             # include=['crowdfunding.tasks']
             )
# Using a string here means the worker doesn't have to serialize
# the configuration object to child processes.
# - namespace='CELERY' means all celery-related configuration keys
# should have a `CELERY_` prefix.
app.config_from_object('django.conf:settings', namespace='CELERY')
app.autodiscover_tasks()
and the broker and backend are working fine so it's just the config not being recognized. What could be the problem?
I believe the problem is that changes to the code are not picked up by Celery automatically. Every time you change the source code, you need to restart the Celery worker manually so it sees the new import path configured in settings.py.
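The config itself relies on a lazy-callable pattern: the dict stores the function object, and the resource class is only imported when the callable is invoked. A stand-alone toy sketch (with a hypothetical stand-in class instead of the real crowdfunding.admin import) shows why a stale worker process keeps resolving old code until it is restarted:

```python
# Toy illustration, NOT the actual import-export-celery internals.
def resource():
    # In the real project this body would be:
    #   from crowdfunding.admin import ProfileResource
    # A stand-in class keeps the sketch self-contained.
    class ProfileResource:
        name = 'ProfileResource'
    return ProfileResource

IMPORT_EXPORT_CELERY_MODELS = {
    "Profile": {
        'app_label': 'crowdfunding',
        'model_name': 'Profile',
        'resource': resource,   # note: the function itself, not resource()
    }
}

# The consumer resolves the callable only at use time -- a worker started
# before the function existed will fail with an import error until restarted.
resolved = IMPORT_EXPORT_CELERY_MODELS["Profile"]['resource']()
```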

Prevent Redis/Celery scheduled tasks from being processed multiple times - Django on Heroku

I am looking for some advice. I use Celery/Redis scheduled tasks to check for changes via an API request every 10 seconds. If there is a change, a database object with the request feedback is created, and once the calculations are done a boolean named is_called is set to True to prevent duplicates.
This worked fine locally, and for a time also on Heroku, but since an update (nothing in the task code changed) the worker seems busy and uses up to 7 ForkPoolWorkers.
For example, ForkPoolWorker-1 will work on the same task as ForkPoolWorker-7, which ignores the is_called = True boolean and gives me duplicate calculations even though only one database object is created.
What's the best way to serve a task to only one ForkPoolWorker at a time? I read the docs and searched around, but it's not completely clear what to do or how.
I built this in Django on Heroku.
celery.py
from __future__ import absolute_import, unicode_literals
import os
from celery import Celery
# set the default Django settings module for the 'celery' program.
os.environ.setdefault("DJANGO_SETTINGS_MODULE", "stockpilot.settings")
app = Celery('proj')
# Using a string here means the worker doesn't have to serialize
# the configuration object to child processes.
# - namespace='CELERY' means all celery-related configuration keys
# should have a `CELERY_` prefix.
app.config_from_object('django.conf:settings', namespace='CELERY')
# Load task modules from all registered Django app configs.
app.autodiscover_tasks()
app.conf.beat_schedule = {
    'every-ten-seconds': {
        'task': 'get_api_task',
        'schedule': 10.0
    },
}

@app.task(bind=True)
def debug_task(self):
    print('Request: {0!r}'.format(self.request))
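One commonly suggested pattern for this kind of duplicate execution (a sketch of the general technique, not an answer from this thread) is a set-if-absent lock around the task body, so only the first worker to claim a given run actually does the work. The sketch below uses a plain dict as a hypothetical stand-in for a shared store; in production the lock would live in something atomic across processes, such as Redis or Django's cache, with a timeout so it expires on its own:

```python
import threading

# Stand-in for a shared cache; a real deployment needs an atomic,
# cross-process store (e.g. Redis SETNX with an expiry).
_cache = {}
_cache_guard = threading.Lock()
executions = []

def acquire_lock(key):
    # Set-if-absent: returns True only for the first caller of this key.
    with _cache_guard:
        if key in _cache:
            return False
        _cache[key] = True
        return True

def get_api_task(run_id):
    # Hypothetical task body: only one "worker" gets past the lock
    # for a given run_id; everyone else skips the work.
    key = 'get_api_task-%s' % run_id
    if not acquire_lock(key):
        return 'skipped'
    # In the real pattern the lock carries a timeout slightly longer
    # than the work, so there is no explicit release here.
    executions.append(run_id)  # the real calculations would happen here
    return 'ran'
```

Calling get_api_task(1) twice runs the work once and skips the second call.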

celery not working in django and just waiting (pending)

I'm trying to figure out how Celery works. I have a project with about 10 apps, and now I want to use Celery.
setting.py:
CELERY_BROKER_URL = 'amqp://rabbitmq:rabbitmq@localhost:5672/rabbitmq_vhost'
CELERY_RESULT_BACKEND = 'redis://localhost'
I created a user in RabbitMQ with this info: username rabbitmq and password rabbitmq. Then I created a vhost named rabbitmq_vhost and gave the rabbitmq user permissions on it. I think that part is fine, because all the errors about RabbitMQ disappeared.
here is my test.py:
from .task import when_task_expiration

def test_celery():
    result = when_task_expiration.apply_async((2, 2), countdown=3)
    print(result.get())
task.py:
from __future__ import absolute_import, unicode_literals
import logging
from celery import shared_task
from proj.celery import app

@app.task
def when_task_expiration(task, x):
    print(task.id, 'task done')
    return True
celery.py:
from __future__ import absolute_import, unicode_literals
import os
from celery import Celery
# set the default Django settings module for the 'celery' program.
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'proj.settings')
app = Celery('proj')
# Using a string here means the worker doesn't have to serialize
# the configuration object to child processes.
# - namespace='CELERY' means all celery-related configuration keys
# should have a `CELERY_` prefix.
app.config_from_object('django.conf:settings', namespace='CELERY')
# Load task modules from all registered Django app configs.
app.autodiscover_tasks()
Now when I call test_celery() in a Python shell it stays pending. I tried replacing @shared_task and @app.task(bind=True), but nothing changed. I even tried using .delay() instead of apply_async((2, 2), countdown=3), and again nothing happened.
I'm trying to use Celery to call a function at a specific time, as in this question I asked in the past. Thank you.
You most likely forgot to run at least one Celery worker process. To do so, execute the following in the shell: celery worker -A proj.celery -c 4 -l DEBUG (here I assumed your Celery application is defined in proj/celery.py, since you have Celery('proj') in there).
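As a mental model of why the result stays PENDING without a worker (a toy sketch, not Celery's real implementation): delay()/apply_async() only publish a message to the broker queue; nothing executes until some worker process consumes the message and runs the function:

```python
from collections import deque

# Toy stand-ins for the broker queue and the result backend.
broker_queue = deque()
results = {}

def apply_async(func, args, task_id):
    # Publishing only enqueues a message; nothing executes yet,
    # so the result backend reports PENDING.
    broker_queue.append((func, args, task_id))
    results[task_id] = 'PENDING'
    return task_id

def worker_step():
    # One iteration of what a worker process does in a loop:
    # consume a message and run the task function.
    func, args, task_id = broker_queue.popleft()
    results[task_id] = func(*args)

task_id = apply_async(lambda x, y: x + y, (2, 2), 't1')
state_before = results[task_id]   # stays 'PENDING' with no worker running
worker_step()                     # this is what `celery worker ...` provides
state_after = results[task_id]    # now the computed result
```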

if a task call with delay() when will execute exactly?

I'm new to Celery and I want to use it, but I don't know: when I call a task with delay(), when exactly will it execute? And after adding a new task, what must I do for the task to work correctly? I'm extending an existing project; the old tasks work correctly, but mine doesn't.
The existing app1/task.py:
from __future__ import absolute_import, unicode_literals
import logging
logger = logging.getLogger('notification')

@shared_task
def send_message_to_users(users, client_type=None, **kwargs):
    .
    .
    # doing something here
    .
    .
    logger.info(
        'notification_log',
        exc_info=False,
    )
and this is my code in app2/task.py:
@shared_task
def update_students_done_count(homework_id):
    homework_students = HomeworkStudent.objects.filter(homework_id=homework_id)
    students_done_count = 0
    for homework_student in homework_students:
        if homework_student.student_homework_status:
            students_done_count += 1
    homework = get_object_or_404(HomeWork, homework_id=homework_id)
    homework.students_done_count = students_done_count
    homework.save()
    logger.info(
        'update_students_done_count_log : with id {id}'.format(id=task_id),
        exc_info=False,
    )
A sample of how both tasks are called:
send_message_to_users.delay(users=SomeUserList)
update_students_done_count.delay(homework_id=SomeHomeWorkId)
project/celery.py:
from __future__ import absolute_import, unicode_literals
import os
from celery import Celery
# set the default Django settings module for the 'celery' program.
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'hamclassy.settings')
app = Celery('hamclassy')
# Using a string here means the worker doesn't have to serialize
# the configuration object to child processes.
# - namespace='CELERY' means all celery-related configuration keys
# should have a `CELERY_` prefix.
app.config_from_object('django.conf:settings', namespace='CELERY')
# Load task modules from all registered Django app configs.
app.autodiscover_tasks()
project/__init__.py:
from __future__ import absolute_import, unicode_literals
# This will make sure the app is always imported when
# Django starts so that shared_task will use this app.
from .celery import app as celery_app
__all__ = ['celery_app']
After any change to a task I run the workers with this command:
celery -A project worker -l info
When I call update_students_done_count.delay(homework_id=1) in an endpoint, the log shows that the task was received, but however long I wait, the task does not execute. Any idea? Thank you.
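For debugging a task that is received but never seems to run, one option (an assumption on my part, not from this thread; the setting names map to Celery 4+'s task_always_eager/task_eager_propagates via the CELERY_ namespace) is eager mode, which makes delay() run the task synchronously in the calling process so any exception inside the task surfaces immediately:

```python
# settings.py -- local debugging only: tasks run inline, bypassing
# the broker and the worker entirely.
CELERY_TASK_ALWAYS_EAGER = True
CELERY_TASK_EAGER_PROPAGATES = True  # re-raise task exceptions in the caller
```

With this enabled, a NameError or missing import inside the task body shows up directly in the shell instead of silently failing in a worker.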

Celery + Amazon SQS - Tasks performed twice after the last Daylight Saving Time Change

Last Sunday I found that the tasks were performed twice.
This has been a problem, especially for mass emails being sent twice.
What is the problem?
Another question, same issue, but no answer:
Duplicated tasks after time change
My celery config file:
from __future__ import absolute_import, unicode_literals
import os
from celery import Celery
from celery.schedules import crontab
# set the default Django settings module for the 'celery' program.
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'myproj.settings_prod')
app = Celery('myproj')
# Using a string here means the worker doesn't have to serialize
# the configuration object to child processes.
# - namespace='CELERY' means all celery-related configuration keys
# should have a `CELERY_` prefix.
app.config_from_object('django.conf:settings', namespace='CELERY')
# Load task modules from all registered Django app configs.
app.autodiscover_tasks()
@app.task(bind=True)
def debug_task(self):
    print('Request: {0!r}'.format(self.request))

app.conf.beat_schedule = {
    'mass-email1': {
        'task': 'myproj.myapp.tasks.send_email1',
        'schedule': crontab(hour=8, minute=30, day_of_week=1),  # executes every Monday morning at 8:30am
    }
}
The timezone set in settings.py is TIME_ZONE = 'America/New_York'.
Celery packages:
celery==4.1.1
django_celery_beat==1.1.1
django_celery_results==1.0.1
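One commonly suggested mitigation (a sketch, not a confirmed fix from this thread) is to make Celery's timezone handling explicit so that beat and the workers agree across a DST change, and to upgrade the scheduler packages, since later releases addressed DST-related scheduling issues:

```python
# settings.py -- with namespace='CELERY' these map to Celery's
# `timezone` and `enable_utc` options.
CELERY_TIMEZONE = 'America/New_York'  # match Django's TIME_ZONE
CELERY_ENABLE_UTC = True              # keep internal scheduling in UTC
```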