Revoking a task defined with @periodic_task keeps sending "Discarding revoked task" and "Sending due task" messages to the workers.
Worker log output:
[2018-09-17 12:23:50,864: INFO/MainProcess] Received task: cimexapp.tasks.add[xxxxxxx]
[2018-09-17 12:23:50,864: INFO/MainProcess] Discarding revoked task: cimexapp.tasks.add[xxxxxxx]
[2018-09-17 12:24:00,865: INFO/Beat] Scheduler: Sending due task cimexapp.tasks.add (cimexapp.tasks.add)
[2018-09-17 12:24:00,869: INFO/MainProcess] Received task: cimexapp.tasks.add[xxxxxxx]
[2018-09-17 12:24:00,869: INFO/MainProcess] Discarding revoked task: cimexapp.tasks.add[xxxxxxx]
[2018-09-17 12:24:10,865: INFO/Beat] Scheduler: Sending due task cimexapp.tasks.add (cimexapp.tasks.add)
[2018-09-17 12:24:10,868: INFO/MainProcess] Received task: cimexapp.tasks.add[xxxxxxx]
[2018-09-17 12:24:10,869: INFO/MainProcess] Discarding revoked task: cimexapp.tasks.add[xxxxxxx]
tasks.py
from datetime import timedelta
from subprocess import call
from celery.task import periodic_task          # legacy import path (pre-Celery 5)
from celery.task.control import revoke         # legacy import path; newer code uses app.control.revoke

@periodic_task(run_every=timedelta(seconds=10), options={"task_id": "xxxxxxx"})
def add():
    call(["ping", "-c10", "google.com"])

def stop():
    x = revoke("xxxxxxx", terminate=True, signal="KILL")
    print(x)
    print('DONE')
I created the task with a fixed task_id so that it would be easy for me to kill it by referencing that id.
How do I completely stop Beat from sending this task?
I don't want to kill all workers with
pkill -9 -f 'celery worker'
celery -A PROJECTNAME control shutdown
I just want to stop the task/worker for the add() function.
One possible way is to store the tasks in the database and add/remove them dynamically. You can use the database-backed Celery Beat scheduler for this; see https://django-celery-beat.readthedocs.io/en/latest/. The PeriodicTask table stores the periodic tasks, and you can manipulate them with ordinary database commands (the Django ORM). Beat just has to be started with the database scheduler, i.e. --scheduler django_celery_beat.schedulers:DatabaseScheduler.
This is how I handled dynamic tasks (creating and stopping them dynamically).
from django_celery_beat.models import PeriodicTask, IntervalSchedule, CrontabSchedule
cron_schedule = CrontabSchedule.objects.create(minute='40', hour='08', day_of_week='*', day_of_month='*', month_of_year='*')  # A cron schedule (08:40 every day).
schedule = IntervalSchedule.objects.create(every=10, period=IntervalSchedule.MINUTES)  # An interval schedule that runs every 10 minutes.
PeriodicTask.objects.create(crontab=cron_schedule, name='name_to_identify_task', task='name_of_task')  # Creates an entry in the database describing the periodic task (with the cron schedule).
task = PeriodicTask.objects.create(interval=schedule, name='run for every 10 min', task='for_each_ten_min')  # Creates a periodic task with the interval schedule.
Whenever you update a PeriodicTask a counter in this table is also
incremented, which tells the celery beat service to reload the
schedule from the database.
So you don't need to restart or kill the beat process.
If you want to stop a task when a particular criterion is met:
periodic_task = PeriodicTask.objects.get(name='run for every 10 min')
periodic_task.enabled = False
periodic_task.save()
When enabled is False, the periodic task becomes idle. You can make it active again by setting enabled = True.
If you no longer need the task, you can simply delete the entry.
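For example, reusing the name from the snippet above (a minimal sketch):
from django_celery_beat.models import PeriodicTask
PeriodicTask.objects.get(name='run for every 10 min').delete()  # removes the schedule entry entirely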
When creating your Project model object, create the periodic task as well. Create a cron schedule or an interval schedule based on your scenario, then create the PeriodicTask object. You can use Project.name as the PeriodicTask name, so you can easily relate the Project object to its PeriodicTask. That's it; from that moment onwards the task will be handled by Celery Beat.
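A minimal sketch of that pattern, assuming a hypothetical Project model and a registered task named 'myapp.tasks.sync_project' (both names are illustrative, not from the original post):
import json
from django_celery_beat.models import IntervalSchedule, PeriodicTask
from myapp.models import Project  # hypothetical model used only for this example

def create_project_with_periodic_task(name):
    project = Project.objects.create(name=name)
    schedule, _ = IntervalSchedule.objects.get_or_create(every=10, period=IntervalSchedule.MINUTES)
    PeriodicTask.objects.create(
        interval=schedule,
        name=project.name,                 # reuse the project name so the two objects are easy to relate
        task='myapp.tasks.sync_project',   # dotted name of a registered task (illustrative)
        kwargs=json.dumps({'project_id': project.pk}),
    )
    return project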
If you want to disable or enable a periodic task dynamically, just set the enabled flag on the PeriodicTask as follows:
task = PeriodicTask.objects.get(name='task_name')
task.enabled = False
task.save()
In Celery 4.2.2, you can purge all task messages waiting in the queue (including those sent by beat) by executing the following command:
celery -A YourProjectName purge
Remember to replace YourProjectName.
For more info, check out this page
Related
I'm using Celery in Django with Redis as the Broker.
Tasks are being scheduled for the future using the eta argument in apply_async.
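A minimal sketch of that kind of eta scheduling (the app name, broker URL, and task are illustrative, not from the original post):
from datetime import datetime, timedelta, timezone
from celery import Celery

app = Celery('MyApp', broker='redis://localhost:6379/0')

@app.task
def process_order(order_id):
    print('processing', order_id)

# Schedule the task roughly 24 hours in the future via eta.
process_order.apply_async(args=[42], eta=datetime.now(timezone.utc) + timedelta(hours=24))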
After scheduling the task, I can run celery -A MyApp inspect scheduled and I see the task with the proper eta for the future (24 hours in the future).
Before the scheduled time, if I restart Redis (with service redis restart) or the server reboots, running celery -A MyApp inspect scheduled again shows "- empty -".
All scheduled tasks are lost after Redis restarts.
Redis is set up with AOF, so it shouldn't be losing DB state after restarting.
EDIT
After some more research, I found that running redis-cli -n 0 hgetall unacked both before and after the Redis restart shows the task in the queue. So Redis still has knowledge of the task, but for some reason, when Redis restarts, the task is removed from the worker and is never sent again; it just stays indefinitely in the unacked queue.
I'm using RabbitMQ 3.6.0 and Celery 3.1.20 on a Windows 10 machine in a Django application. Everything is running on the same computer. I've configured Celery to Acknowledge Late (CELERY_ACKS_LATE=True) and now I'm getting connection problems.
I start the Celery worker, and after 50-60 seconds of handling tasks each worker thread fails with the following message:
Couldn't ack ###, reason:ConnectionResetError(10054, 'An existing connection was forcibly closed by the remote host', None, 10054, None)
(### is the number of the task)
When I look at the RabbitMQ logs I see this:
=INFO REPORT==== 10-Feb-2016::22:16:16 ===
accepting AMQP connection <0.247.0> (127.0.0.1:55372 -> 127.0.0.1:5672)
=INFO REPORT==== 10-Feb-2016::22:16:16 ===
accepting AMQP connection <0.254.0> (127.0.0.1:55373 -> 127.0.0.1:5672)
=ERROR REPORT==== 10-Feb-2016::22:17:14 ===
closing AMQP connection <0.247.0> (127.0.0.1:55372 -> 127.0.0.1:5672):
{writer,send_failed,{error,timeout}}
The error occurs exactly when the Celery workers are getting their connection reset.
I thought this was an AMQP Heartbeat issue, so I've added BROKER_HEARTBEAT = 15 to my Celery settings, but it did not make any difference.
I was having a similar issue with Celery on Windows with long running
tasks with concurrency=1. The following configuration finally worked for
me:
CELERY_ACKS_LATE = True
CELERYD_PREFETCH_MULTIPLIER = 1
I also started the celery worker daemon with the -Ofair option:
celery -A test worker -l info -Ofair
In my limited understanding, CELERYD_PREFETCH_MULTIPLIER sets the number
of messages that sit in the queue of a specific Celery worker. By
default it is set to 4. If you set it to 1, each worker will only
consume one message and complete the task before it consumes another
message. I was having issues with long-running tasks because the connection to RabbitMQ was consistently lost in the middle of a long task, and then the task was re-attempted if any other messages/tasks were waiting in the Celery queue.
The following option was also specific to my situation:
CELERYD_CONCURRENCY = 1
Setting concurrency to 1 made sense for me because I had long-running tasks that needed a large amount of RAM, so each of them needed to run solo.
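Putting those pieces together, the relevant settings look roughly like this (Celery 3.x-style uppercase names, matching the ones used above):
# settings.py / celeryconfig.py -- Celery 3.x-style setting names
CELERY_ACKS_LATE = True           # acknowledge a message only after the task has finished
CELERYD_PREFETCH_MULTIPLIER = 1   # each worker process reserves only one message at a time
CELERYD_CONCURRENCY = 1           # run a single long, memory-hungry task at a time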
@bbaker's solution with CELERY_ACKS_LATE (which is task_acks_late in Celery 4.x) did not work for me by itself. My workers run in Kubernetes pods and must be started with --pool solo, and each task takes 30-60 s.
I solved it by additionally setting broker_heartbeat = 0; the relevant settings are:
broker_pool_limit = None
task_acks_late = True
broker_heartbeat = 0
worker_prefetch_multiplier = 1
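For context, a minimal sketch of applying these lowercase Celery 4.x settings on the app object (the app name and broker URL are illustrative):
from celery import Celery

app = Celery('myproject', broker='amqp://guest@rabbitmq//')  # broker URL illustrative

app.conf.update(
    broker_pool_limit=None,
    task_acks_late=True,
    broker_heartbeat=0,
    worker_prefetch_multiplier=1,
)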
I am using Supervisor to manage celery. The supervisor config file for celery contains the following 2 entries:
stdout_logfile = /var/log/supervisor/celery.log
stderr_logfile = /var/log/supervisor/celery_err.log
What is confusing me is that although Celery is working properly and all tasks are being successfully completed, they are all written to celery_err.log. I thought that would only be for errors. The celery.log file only shows the normal celery startup info. Is the behaviour of writing successful task completion to the error log correct?
Note - tasks are definitely completing successfully (emails being sent, db entries made etc).
I ran into the same phenomenon. It comes from Celery's logging mechanism; see the setup_task_loggers method in Celery's logging source:
If logfile is not specified, then sys.stderr is used.
So, clear enough? Celery uses sys.stderr when logfile is not specified.
Solution:
You can use supervisord's redirect_stderr = true flag to combine the two log files into one; this is what I use.
Alternatively, configure Celery's logfile option (for example, pass --logfile to the worker) so Celery writes to its own file instead of stderr.
Is the behaviour of writing successful task completion to the error log correct?
No, it's not. I have the same setup and logging works fine.
celery.log has the task info:
[2015-07-23 11:40:07,066: INFO/MainProcess] Received task: foo[b5a6e0e8-1027-4005-b2f6-1ea032c73d34]
[2015-07-23 11:40:07,494: INFO/MainProcess] Task foo[b5a6e0e8-1027-4005-b2f6-1ea032c73d34] succeeded in 0.424549156s: 1
celery_err.log only has some warnings/errors. Try restarting the supervisor process.
One issue you could be having is that Python buffers output by default.
In your supervisor file you can disable this by setting the PYTHONUNBUFFERED environment variable; see my example supervisor file for a Django project below:
[program:celery-myapp]
environment=DJANGO_SETTINGS_MODULE="myapp.settings.production",PYTHONUNBUFFERED=1
command=/home/me/.virtualenvs/myapp/bin/celery -A myapp worker -B -l DEBUG
directory=/home/me/www/saleor
user=me
stdout_logfile=/home/me/www/myapp/log/supervisor-celery.log
stderr_logfile=/home/me/www/myapp/log/supervisor-celery-err.log
autostart=true
autorestart=true
startsecs=10
I have a periodic task that I am running on Heroku using a worker process in the Procfile:
Procfile
web: gunicorn voltbe2.wsgi --log-file - --log-level debug
worker: celery -A voltbe2 worker --beat --events --loglevel INFO
tasks.py
from datetime import timedelta

from celery.task import PeriodicTask   # legacy Celery 3.x base class

from .models import MyModel            # import path assumed for this example


class PullXXXActivityTask(PeriodicTask):
    """
    A periodic task that fetches data every minute.
    """
    run_every = timedelta(minutes=1)

    def run(self, **kwargs):
        abc = MyModel.objects.all()
        for rk in abc:
            rk.pull()

        logger = self.get_logger(**kwargs)
        logger.info("Running periodic task for XXX.")
        return True
For this PeriodicTask I need --beat (I checked by turning it off, and the task does not repeat). So, in some way, --beat does the work of a clock (https://devcenter.heroku.com/articles/scheduled-jobs-custom-clock-processes).
My concern is: if I scale the worker to two dynos with heroku ps:scale worker=2, I can see from the logs that two beats are running, on worker.1 and worker.2:
Aug 25 09:38:11 emstaging app/worker.2: [2014-08-25 16:38:11,580: INFO/Beat] Scheduler: Sending due task apps.notification.tasks.SendPushNotificationTask (apps.notification.tasks.SendPushNotificationTask)
Aug 25 09:38:20 emstaging app/worker.1: [2014-08-25 16:38:20,239: INFO/Beat] Scheduler: Sending due task apps.notification.tasks.SendPushNotificationTask (apps.notification.tasks.SendPushNotificationTask)
The log shown is for a different periodic task, but the key point is that both worker dynos are getting signals to run the same task from their own clocks, while in fact there should be a single clock that ticks, decides every XX seconds what is due, and hands that task to the least loaded worker.N dyno.
More on why a single clock is essential is here : https://devcenter.heroku.com/articles/scheduled-jobs-custom-clock-processes#custom-clock-processes
Is this a problem, and if so, how do I avoid it?
You should have a separate worker for the beat process.
web: gunicorn voltbe2.wsgi --log-file - --log-level debug
worker: celery -A voltbe2 worker --events --loglevel info
beat: celery -A voltbe2 beat
Now you can scale the worker task without affecting the beat one.
Alternatively, if you won't always need the extra process, you can continue to use -B in the worker task but also have a second entry - say, extra_worker - which is normally set to 0 dynos but which you can scale up as necessary. The important thing is to keep exactly one process running beat.
I use Celery with RabbitMQ in my Django app (on Elastic Beanstalk) to manage background tasks and I daemonized it using Supervisor.
The problem now is that one of the periodic tasks that I defined is failing (after a week in which it worked properly); the error I get is:
[01/Apr/2014 23:04:03] [ERROR] [celery.worker.job:272] Task clean-dead-sessions[1bfb5a0a-7914-4623-8b5b-35fc68443d2e] raised unexpected: WorkerLostError('Worker exited prematurely: signal 9 (SIGKILL).',)
Traceback (most recent call last):
File "/opt/python/run/venv/lib/python2.7/site-packages/billiard/pool.py", line 1168, in mark_as_worker_lost
human_status(exitcode)),
WorkerLostError: Worker exited prematurely: signal 9 (SIGKILL).
All the processes managed by supervisor are up and running properly (supervisorctl status says RUNNING).
I tried to read several logs on my EC2 instance, but none of them helped me find out the cause of the SIGKILL. What should I do? How can I investigate?
These are my celery settings:
CELERY_TIMEZONE = 'UTC'
CELERY_TASK_SERIALIZER = 'json'
CELERY_ACCEPT_CONTENT = ['json']
BROKER_URL = os.environ['RABBITMQ_URL']
CELERY_IGNORE_RESULT = True
CELERY_DISABLE_RATE_LIMITS = False
CELERYD_HIJACK_ROOT_LOGGER = False
And this is my supervisord.conf:
[program:celery_worker]
environment=$env_variables
directory=/opt/python/current/app
command=/opt/python/run/venv/bin/celery worker -A com.cygora -l info --pidfile=/opt/python/run/celery_worker.pid
startsecs=10
stopwaitsecs=60
stopasgroup=true
killasgroup=true
autostart=true
autorestart=true
stdout_logfile=/opt/python/log/celery_worker.stdout.log
stdout_logfile_maxbytes=5MB
stdout_logfile_backups=10
stderr_logfile=/opt/python/log/celery_worker.stderr.log
stderr_logfile_maxbytes=5MB
stderr_logfile_backups=10
numprocs=1
[program:celery_beat]
environment=$env_variables
directory=/opt/python/current/app
command=/opt/python/run/venv/bin/celery beat -A com.cygora -l info --pidfile=/opt/python/run/celery_beat.pid --schedule=/opt/python/run/celery_beat_schedule
startsecs=10
stopwaitsecs=300
stopasgroup=true
killasgroup=true
autostart=false
autorestart=true
stdout_logfile=/opt/python/log/celery_beat.stdout.log
stdout_logfile_maxbytes=5MB
stdout_logfile_backups=10
stderr_logfile=/opt/python/log/celery_beat.stderr.log
stderr_logfile_maxbytes=5MB
stderr_logfile_backups=10
numprocs=1
Edit 1
After restarting celery beat the problem remains.
Edit 2
Changed killasgroup=true to killasgroup=false and the problem remains.
The SIGKILL your worker received was initiated by another process. Your supervisord config looks fine, and killasgroup would only affect a supervisor-initiated kill (e.g. via supervisorctl or a plugin) - and without that setting it would have sent the signal to the dispatcher anyway, not the child.
Most likely you have a memory leak and the OS's OOM killer is assassinating your process for bad behavior.
grep oom /var/log/messages. If you see messages, that's your problem.
If you don't find anything, try running the periodic process manually in a shell:
MyPeriodicTask().run()
And see what happens. I'd monitor system and process metrics with top in another terminal, if you don't have good instrumentation like Cacti, Ganglia, etc. for this host.
You see this kind of error when an asynchronous task (run through Celery) or the script you are using stores a lot of data in memory, because it leaks.
In my case, I was getting data from another system and saving it in a variable so I could export all of the data (into a Django model / Excel file) after finishing the process.
Here is the catch: my script was gathering 10 million items, and it was leaking memory while gathering them. This resulted in the exception being raised.
To overcome the issue, I divided the 10 million items into 20 parts (half a million each). Every time the collected data reached 500,000 items, I stored it in my preferred local file / Django model, and repeated this for every batch of 500k items.
You don't need to use exactly that number of partitions; the idea is to solve a complex problem by splitting it into multiple subproblems and solving them one by one. :D
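A minimal sketch of that batching idea (the data source and the save step are placeholders, not the original code):
BATCH_SIZE = 500_000

def process_in_batches(source, save_batch, batch_size=BATCH_SIZE):
    """Accumulate items from `source` and flush every `batch_size` items,
    so the full data set never has to sit in memory at once."""
    buffer = []
    for item in source:
        buffer.append(item)
        if len(buffer) >= batch_size:
            save_batch(buffer)   # e.g. write to a local file or bulk-create Django rows
            buffer = []          # drop the reference so the memory can be reclaimed
    if buffer:
        save_batch(buffer)       # flush the final, partial batch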