Celeryd Worker stops processing tasks - django

I have a celeryd running with two workers, and watching them I see them accept 4 tasks each, process them, and then stop processing tasks. (Note that these tasks are long-running, taking up to 2 minutes each.)
celeryctl provides the following information:
django#server: ./manage.py celeryctl inspect active
<- active
DEBUG 2012-06-01 12:51:11,330 amqplib 661 Start from server, version: 8.0, properties: {u'information': u'Licensed under the MPL. See http://www.rabbitmq.com/', u'product': u'RabbitMQ', u'copyright': u'Copyright (C) 2007-2012 VMware, Inc.', u'capabilities': {}, u'platform': u'Erlang/OTP', u'version': u'2.8.2'}, mechanisms: [u'PLAIN', u'AMQPLAIN'], locales: [u'en_US']
DEBUG 2012-06-01 12:51:11,331 amqplib 507 Open OK! known_hosts []
DEBUG 2012-06-01 12:51:11,331 amqplib 70 using channel_id: 1
DEBUG 2012-06-01 12:51:11,332 amqplib 484 Channel open
-> eso-dev: OK
- empty -
django#server: ./manage.py celeryctl inspect scheduled
<- scheduled
DEBUG 2012-06-01 12:52:07,107 amqplib 661 Start from server, version: 8.0, properties: {u'information': u'Licensed under the MPL. See http://www.rabbitmq.com/', u'product': u'RabbitMQ', u'copyright': u'Copyright (C) 2007-2012 VMware, Inc.', u'capabilities': {}, u'platform': u'Erlang/OTP', u'version': u'2.8.2'}, mechanisms: [u'PLAIN', u'AMQPLAIN'], locales: [u'en_US']
DEBUG 2012-06-01 12:52:07,108 amqplib 507 Open OK! known_hosts []
DEBUG 2012-06-01 12:52:07,108 amqplib 70 using channel_id: 1
DEBUG 2012-06-01 12:52:07,109 amqplib 484 Channel open
-> eso-dev: OK
- empty -
django#server: ./manage.py celeryctl inspect registered
<- registered
DEBUG 2012-06-01 12:52:20,567 amqplib 661 Start from server, version: 8.0, properties: {u'information': u'Licensed under the MPL. See http://www.rabbitmq.com/', u'product': u'RabbitMQ', u'copyright': u'Copyright (C) 2007-2012 VMware, Inc.', u'capabilities': {}, u'platform': u'Erlang/OTP', u'version': u'2.8.2'}, mechanisms: [u'PLAIN', u'AMQPLAIN'], locales: [u'en_US']
DEBUG 2012-06-01 12:52:20,568 amqplib 507 Open OK! known_hosts []
DEBUG 2012-06-01 12:52:20,568 amqplib 70 using channel_id: 1
DEBUG 2012-06-01 12:52:20,569 amqplib 484 Channel open
-> eso-dev: OK
* celery.backend_cleanup
* celery.chord
* celery.chord_unlock
* celery_haystack.tasks.CeleryHaystackSignalHandler
* celery_haystack.tasks.CeleryHaystackUpdateIndex
* convert.tasks.create_pdf
* convert.tasks.create_pngs
In addition every time this happens the last thing to be printed to logs is:
[2012-06-01 12:17:53,777: INFO/MainProcess] Task convert.tasks.create_pdf[319984de-5bc4-47fc-891f-273d827d625f] retry: None
None
[2012-06-01 12:17:54,327: INFO/MainProcess] Task convert.tasks.create_pdf[8a89f3c1-e991-487e-a2db-a57d23bae17f] retry: None
None
The tasks also happen to have failed just before this is printed, and in my code all I have called is:
except HTTPError, e:
    statsd.incr('A.stat')
    log.warn('Woops: %s', e)
    create_pdf.retry()
If I kill celeryd (^C, and it dies straight away, without waiting for tasks) and start it again, it continues as if nothing had happened for a few more tasks and then dies again. (I think it's always on the create_pdf task, but the logs show this task failing and being retried without a problem.)

This is a known bug in Celery that was fixed in v3.0 (issue 707).
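Since the answer ties the fix to a release, a quick version comparison can tell you whether an installation predates it. This is only a minimal sketch: it assumes the version is available as a dotted string (for example from `celery.__version__`), and the helper names `major_minor` and `has_retry_fix` are hypothetical.

```python
from itertools import takewhile

# Release named in the answer above as containing the fix.
FIXED_IN = (3, 0)


def major_minor(version):
    """Return (major, minor) from a dotted version string such as '2.5.3'."""
    numbers = []
    for piece in version.split(".")[:2]:
        # Keep only the leading digits so that '0rc1' is read as 0.
        digits = "".join(takewhile(str.isdigit, piece))
        numbers.append(int(digits) if digits else 0)
    while len(numbers) < 2:
        numbers.append(0)
    return tuple(numbers)


def has_retry_fix(version):
    """True when the version is at least the release said to contain the fix."""
    return major_minor(version) >= FIXED_IN
```

For example, `has_retry_fix("2.5.3")` is false, while `has_retry_fix("3.0.12")` is true.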

Related

Sitecore ECM 2.1 on Sitecore 7.2 Not Sending Emails when New Relic APM is Enabled

Sitecore Email Campaign Manager 2.1 appears to get stuck on sending emails whenever New Relic APM is Enabled, if New Relic is disabled the Email Campaign Manager sends as usual.
No errors reported in the Sitecore Logs
Request just stops processing and that thread never shows up in the log again
IIS Log shows return codes of 200 so no failures there either
When performing an IIS reset sometimes the email is received but in a delayed fashion, such as 30 minutes after an IIS Reset
Below are the logs with Sitecore ECM debug set to true:
2015-11-18 16:00:57 ManagedPoolThread #4 INFO Job started: Sending message (56E4501BEE95446BAD97171B3316226F)
2015-11-18 16:00:57 ManagedPoolThread #4 INFO EmailCampaign: 'SendAnEmail': 1 recipient is added to the queue.
2015-11-18 16:00:57 ManagedPoolThread #4 INFO EmailCampaign: Dispatch Message (SendAnEmail): Started
2015-11-18 16:00:57 6312 INFO EmailCampaign: BodyLink -> GetTargetItemUrl: 00:00:00.0130126
2015-11-18 16:00:57 6312 INFO EmailCampaign: Get body link: 00:00:00.0190269
2015-11-18 16:00:58 6312 INFO EmailCampaign: Download string content(url): 00:00:00.6889992
2015-11-18 16:00:59 6312 INFO EmailCampaign: ReplaceTokens -> Find/Add $title$ token: 00:00:00.6088072
2015-11-18 16:00:59 6312 INFO EmailCampaign: Replace tokens: 00:00:00
2015-11-18 16:00:59 6312 INFO EmailCampaign: Remove 'form' tag: 00:00:00.0157781
2015-11-18 16:00:59 6312 INFO EmailCampaign: Remove VIEWSTATE: 00:00:00
2015-11-18 16:00:59 6312 INFO EmailCampaign: Insert style sheets: 00:00:00
2015-11-18 16:00:59 6312 INFO EmailCampaign: Modify 'href' links: 00:00:00.0491805
There should be a parameter like this in the New Relic config:
<browserMonitoring autoInstrument="true" />
It is designed to measure how long pages take to download to the browser.
With this set to true, New Relic monitoring injects a JavaScript snippet into pages in the SE environment.
However, it does this to ALL web pages generated by .NET, including the email body.
It is possible that ECM 2.1 cannot correctly process the message body with this script in it.
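To test that theory you could switch the flag off and send a message again. The following is a rough sketch rather than a New Relic tool: it works on an in-memory sample, the surrounding `<configuration>` element is an assumption about the file layout, and the function name `disable_auto_instrument` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Sample content; a real config file will contain many more settings.
SAMPLE_CONFIG = """<configuration>
  <browserMonitoring autoInstrument="true" />
</configuration>"""


def disable_auto_instrument(xml_text):
    """Set autoInstrument="false" on every browserMonitoring element.

    Returns the updated XML text and the number of elements changed.
    """
    root = ET.fromstring(xml_text)
    changed = 0
    for element in root.iter():
        # Strip a '{namespace}' prefix, in case the file declares one.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name == "browserMonitoring":
            element.set("autoInstrument", "false")
            changed += 1
    return ET.tostring(root, encoding="unicode"), changed


updated, count = disable_auto_instrument(SAMPLE_CONFIG)
print(updated)
```

If email dispatch recovers with the flag off, that points to the injected script as the cause.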

Django 1.6 + RabbitMQ 3.2.3 + Celery 3.1.9 - why does my celery worker die with: WorkerLostError: Worker exited prematurely: signal 11 (SIGSEGV)

This seems to address a very similar issue, but doesn't give me quite enough insight: https://github.com/celery/billiard/issues/101 Sounds like it might be a good idea to try a non-SQLite database...
I have a straightforward celery setup with my django app. In my settings.py file I set a task to run as follows:
from datetime import timedelta
CELERYBEAT_SCHEDULE = {
    'sync_database': {
        'task': 'apps.data.tasks.celery_sync_database',
        'schedule': timedelta(minutes=5)
    }
}
I have followed the instructions here: http://celery.readthedocs.org/en/latest/django/first-steps-with-django.html
I am able to open two new terminal windows and run celery processes as follows:
ONE - the celery beat process which is required for scheduled tasks and will put the task on the queue:
PROMPT> celery -A myproj beat
celery beat v3.1.9 (Cipater) is starting.
__ - ... __ - _
Configuration ->
. broker -> amqp://myproj@localhost:5672//
. loader -> celery.loaders.app.AppLoader
. scheduler -> djcelery.schedulers.DatabaseScheduler
. logfile -> [stderr]@%INFO
. maxinterval -> now (0s)
[2014-02-20 16:15:20,085: INFO/MainProcess] beat: Starting...
[2014-02-20 16:15:20,086: INFO/MainProcess] Writing entries...
[2014-02-20 16:15:20,143: INFO/MainProcess] DatabaseScheduler: Schedule changed.
[2014-02-20 16:15:20,143: INFO/MainProcess] Writing entries...
[2014-02-20 16:20:20,143: INFO/MainProcess] Scheduler: Sending due task sync_database (apps.data.tasks.celery_sync_database)
[2014-02-20 16:20:20,161: INFO/MainProcess] Writing entries...
TWO - the celery worker, which should take the task off the queue and run it:
PROMPT> celery -A myproj worker -l info
-------------- celery@Jons-MacBook.local v3.1.9 (Cipater)
---- **** -----
--- * *** * -- Darwin-13.0.0-x86_64-i386-64bit
-- * - **** ---
- ** ---------- [config]
- ** ---------- .> app: myproj:0x1105a1050
- ** ---------- .> transport: amqp://myproj@localhost:5672//
- ** ---------- .> results: djcelery.backends.database:DatabaseBackend
- *** --- * --- .> concurrency: 4 (prefork)
-- ******* ----
--- ***** ----- [queues]
-------------- .> celery exchange=celery(direct) key=celery
[tasks]
. apps.data.tasks.celery_sync_database
. myproj.celery.debug_task
[2014-02-20 16:15:29,402: INFO/MainProcess] Connected to amqp://myproj@127.0.0.1:5672//
[2014-02-20 16:15:29,419: INFO/MainProcess] mingle: searching for neighbors
[2014-02-20 16:15:30,440: INFO/MainProcess] mingle: all alone
[2014-02-20 16:15:30,474: WARNING/MainProcess] celery@Jons-MacBook.local ready.
When the task gets sent, however, it appears that about 50% of the time the worker runs the task and the other 50% of the time I get the following error:
[2014-02-20 16:35:20,159: INFO/MainProcess] Received task: apps.data.tasks.celery_sync_database[960bcb6c-d6a5-4e32-8267-cfbe2b411b25]
[2014-02-20 16:36:54,561: ERROR/MainProcess] Process 'Worker-4' pid:19500 exited with exitcode -11
[2014-02-20 16:36:54,580: ERROR/MainProcess] Task apps.data.tasks.celery_sync_database[960bcb6c-d6a5-4e32-8267-cfbe2b411b25] raised unexpected: WorkerLostError('Worker exited prematurely: signal 11 (SIGSEGV).',)
Traceback (most recent call last):
File "/Users/jon/dev/vpe/VAN/lib/python2.7/site-packages/billiard/pool.py", line 1168, in mark_as_worker_lost
human_status(exitcode)),
WorkerLostError: Worker exited prematurely: signal 11 (SIGSEGV).
I am developing on a Macbook Pro running Mavericks.
Celery version 3.1.9
RabbitMQ 3.2.3
Django 1.6
Note that I am using django-celery 3.1.9 and have the djcelery app enabled.
When I switched from SQLite to PostgreSQL the problem disappeared.
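For reference, the switch itself is a change to `DATABASES` in settings.py. This is a sketch of what that might look like on Django 1.6, assuming the psycopg2 driver is installed; the database name, user, password, host and port are placeholders, not values from the original post.

```python
# settings.py (fragment): a possible PostgreSQL configuration for Django 1.6.
# All connection values below are placeholders for illustration only.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql_psycopg2",
        "NAME": "myproj",
        "USER": "myproj",
        "PASSWORD": "change-me",
        "HOST": "localhost",
        "PORT": "5432",
    }
}
```

After changing the backend you would need to recreate the tables (for example with syncdb) so that the djcelery scheduler and result tables exist in the new database.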

Django/Celery Quickstart example not working (worker is not executing any tasks)

I'm using Django/Celery Quickstart... or, how I learned to stop using cron and love celery, and it seems the jobs are getting queued, but never run.
tasks.py:
from celery.task.schedules import crontab
from celery.decorators import periodic_task
# this will run every minute, see http://celeryproject.org/docs/reference/celery.task.schedules.html#celery.task.schedules.crontab
@periodic_task(run_every=crontab(hour="*", minute="*", day_of_week="*"))
def test():
    print "firing test task"
So I run celery:
bash-3.2$ sudo manage.py celeryd -v 2 -B -s celery -E -l INFO
/scratch/software/python/lib/celery/apps/worker.py:166: RuntimeWarning: Running celeryd with superuser privileges is discouraged!
'Running celeryd with superuser privileges is discouraged!'))
-------------- celery@myserver v3.0.12 (Chiastic Slide)
---- **** -----
--- * *** * -- [Configuration]
-- * - **** --- . broker: django://localhost//
- ** ---------- . app: default:0x12120290 (djcelery.loaders.DjangoLoader)
- ** ---------- . concurrency: 2 (processes)
- ** ---------- . events: ON
- ** ----------
- *** --- * --- [Queues]
-- ******* ---- . celery: exchange:celery(direct) binding:celery
--- ***** -----
[Tasks]
. GotPatch.tasks.test
[2012-12-12 11:58:37,118: INFO/Beat] Celerybeat: Starting...
[2012-12-12 11:58:37,163: INFO/Beat] Scheduler: Sending due task GotPatch.tasks.test (GotPatch.tasks.test)
[2012-12-12 11:58:37,249: WARNING/MainProcess] /scratch/software/python/lib/djcelery/loaders.py:132: UserWarning: Using settings.DEBUG leads to a memory leak, never use this setting in production environments!
warnings.warn("Using settings.DEBUG leads to a memory leak, never "
[2012-12-12 11:58:37,348: WARNING/MainProcess] celery@myserver ready.
[2012-12-12 11:58:37,352: INFO/MainProcess] consumer: Connected to django://localhost//.
[2012-12-12 11:58:37,700: INFO/MainProcess] child process calling self.run()
[2012-12-12 11:58:37,857: INFO/MainProcess] child process calling self.run()
[2012-12-12 11:59:00,229: INFO/Beat] Scheduler: Sending due task GotPatch.tasks.test (GotPatch.tasks.test)
[2012-12-12 12:00:00,017: INFO/Beat] Scheduler: Sending due task GotPatch.tasks.test (GotPatch.tasks.test)
[2012-12-12 12:01:00,020: INFO/Beat] Scheduler: Sending due task GotPatch.tasks.test (GotPatch.tasks.test)
[2012-12-12 12:02:00,024: INFO/Beat] Scheduler: Sending due task GotPatch.tasks.test (GotPatch.tasks.test)
The tasks are indeed getting queued:
python manage.py shell
>>> from kombu.transport.django.models import Message
>>> Message.objects.count()
234
And the count increases over time:
>>> Message.objects.count()
477
There are no lines in the log file that seem to indicate the task is being executed. I'm expecting something like:
[... INFO/MainProcess] Task myapp.tasks.test[39d57f82-fdd2-406a-ad5f-50b0e30a6492] succeeded in 0.00423407554626s: None
Any suggestions how to diagnose / debug this?
I'm new to celery as well, but from the comments on the link you provided, it looks like there was an error in the tutorial. One of the comments points out:
At this command
sudo ./manage.py celeryd -v 2 -B -s celery -E -l INFO
You must add "-I tasks" to load tasks.py file ...
Did you try that?
You should check that you specify the BROKER_URL parameter in Django's settings.py:
BROKER_URL = 'django://'
You should also check that the time zones configured in Django, MySQL and Celery are the same.
That helped me.
P.s.:
[... INFO/MainProcess] Task myapp.tasks.test[39d57f82-fdd2-406a-ad5f-50b0e30a6492] succeeded in 0.00423407554626s: None
This line means that your task was scheduled (!not executed!)
Please check your config; I hope that helps.
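The settings this answer refers to could look roughly like the fragment below. It is a sketch that assumes django-celery with the Django database broker, as in the question; the `UTC` value is a placeholder, and a real `INSTALLED_APPS` would also list Django's own apps and yours.

```python
# settings.py (fragment), assuming django-celery with the Django database broker.
BROKER_URL = "django://"

INSTALLED_APPS = (
    # ... Django's contrib apps and your own apps go here ...
    "djcelery",
    "kombu.transport.django",  # message tables used by the django:// broker
)

TIME_ZONE = "UTC"            # Django's time zone (placeholder value)
CELERY_TIMEZONE = TIME_ZONE  # keep Celery on the same zone as Django
```

The database server's own time zone is configured separately and is not covered by this fragment.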
I hope someone could learn from my experience in hacking this.
After setting everything up according to the tutorial I noticed that when I call
add.delay(4,5)
nothing happens: the worker did not receive the task (nothing was printed on stderr).
The problem was with the RabbitMQ installation. It turns out the default free disk space requirement was 1GB, which was way too much for my VM.
What put me on track was reading the RabbitMQ log file.
To find it I had to stop and start the RabbitMQ server:
sudo rabbitmqctl stop
sudo rabbitmq-server
RabbitMQ prints the log file location to the screen. In the file I noticed this:
=WARNING REPORT==== 14-Mar-2017::13:57:41 ===
disk resource limit alarm set on node rabbit@supporttip.
**********************************************************
*** Publishers will be blocked until this alarm clears ***
**********************************************************
I then followed the instructions in "Rabbitmq ignores configuration on Ubuntu 12" in order to reduce the free disk limit.
As a baseline I used the config file from git:
https://github.com/rabbitmq/rabbitmq-server/blob/stable/docs/rabbitmq.config.example
The change itself:
{disk_free_limit, "50MB"}
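Before editing the broker config, it can help to confirm that free disk space really is under the limit. The sketch below does this from Python and is not part of RabbitMQ: the helper names are hypothetical, decimal units (MB = 10^6 bytes) are an assumption, and the path you check should be the partition that holds RabbitMQ's data directory.

```python
import shutil

# Decimal units are an assumption made for this sketch.
UNITS = {"KB": 10**3, "MB": 10**6, "GB": 10**9}


def parse_size(text):
    """Convert a size such as '50MB' or '1GB' into a number of bytes."""
    text = text.strip().upper()
    for suffix, factor in UNITS.items():
        if text.endswith(suffix):
            return int(text[:-len(suffix)]) * factor
    return int(text)  # a plain number is taken as bytes


def free_space_below(limit, path="/", usage=shutil.disk_usage):
    """Return True when free space on `path` is under `limit` (e.g. '50MB')."""
    return usage(path).free < parse_size(limit)


if __name__ == "__main__":
    for limit in ("1GB", "50MB"):
        print(limit, "limit exceeds free space:", free_space_below(limit))
```

If the check is true for "1GB" but false for "50MB", lowering the limit as described above should clear the alarm on that machine.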

How do I properly install CouchDB using build-couchdb?

I'm trying CouchDB on Ubuntu 11.10. Several tests were failing, so I followed this article's advice and tried to install from build-couchdb, but I'm getting some nasty errors trying to start couchdb after a successful build.
Does anyone know what this crash report means?
Does anyone know why 1.0.1 would be installed, and not the latest build version 1.1.0?
Thanks!
$ build/bin/couchdb
Apache CouchDB 1.0.1 (LogLevel=info) is starting.
=CRASH REPORT==== 8-Jan-2012::22:19:54 ===
crasher:
initial call: couch_event_sup:init/1
pid: <0.80.0>
registered_name: []
exception exit: {{badmatch,
{'EXIT',
{{badmatch,{error,enoent}},
[{couch_log,init,1},
{gen_event,server_add_handler,4},
{gen_event,handle_msg,5},
{proc_lib,init_p_do_apply,3}]}}},
[{couch_event_sup,init,1},
{gen_server,init_it,6},
{proc_lib,init_p_do_apply,3}]}
in function gen_server:init_it/6
ancestors: [couch_primary_services,couch_server_sup,<0.32.0>]
messages: []
links: [<0.79.0>,<0.6.0>]
dictionary: []
trap_exit: false
status: running
heap_size: 377
stack_size: 24
reductions: 116
neighbours:
=SUPERVISOR REPORT==== 8-Jan-2012::22:19:54 ===
Supervisor: {local,couch_primary_services}
Context: start_error
Reason: {{badmatch,{'EXIT',{{badmatch,{error,enoent}},
[{couch_log,init,1},
{gen_event,server_add_handler,4},
{gen_event,handle_msg,5},
{proc_lib,init_p_do_apply,3}]}}},
[{couch_event_sup,init,1},
{gen_server,init_it,6},
{proc_lib,init_p_do_apply,3}]}
Offender: [{pid,undefined},
{name,couch_log},
{mfargs,{couch_log,start_link,[]}},
{restart_type,permanent},
{shutdown,brutal_kill},
{child_type,worker}]
=SUPERVISOR REPORT==== 8-Jan-2012::22:19:54 ===
Supervisor: {local,couch_server_sup}
Context: start_error
Reason: shutdown
Offender: [{pid,undefined},
{name,couch_primary_services},
{mfargs,{couch_server_sup,start_primary_services,[]}},
{restart_type,permanent},
{shutdown,infinity},
{child_type,supervisor}]
=CRASH REPORT==== 8-Jan-2012::22:19:54 ===
crasher:
initial call: application_master:init/4
pid: <0.31.0>
registered_name: []
exception exit: {bad_return,
{{couch_app,start,
[normal,
["/etc/couchdb/default.ini",
"/etc/couchdb/local.ini"]]},
{'EXIT',
{{badmatch,{error,shutdown}},
[{couch_server_sup,start_server,1},
{application_master,start_it_old,4}]}}}}
in function application_master:init/4
ancestors: [<0.30.0>]
messages: [{'EXIT',<0.32.0>,normal}]
links: [<0.30.0>,<0.7.0>]
dictionary: []
trap_exit: true
status: running
heap_size: 987
stack_size: 24
reductions: 156
neighbours:
=INFO REPORT==== 8-Jan-2012::22:19:54 ===
application: couch
exited: {bad_return,{{couch_app,start,
[normal,
["/etc/couchdb/default.ini",
"/etc/couchdb/local.ini"]]},
{'EXIT',{{badmatch,{error,shutdown}},
[{couch_server_sup,start_server,1},
{application_master,start_it_old,4}]}}}}
type: temporary
Marcello is right in his comment. The log indicates that you are somehow (I'm not sure how) running version 1.0.1 but Build CouchDB would be building version 1.1.1.
Perhaps you could update your question with the output of these commands?
pwd
./build/bin/couchdb

Django celery: Consumer Connection Error (111) when running python manage.py celeryd

I am trying to configure a Django project to use Celery (I am using Django 1.3 on Debian Squeeze)
I installed django-celery (2.3.3) and then followed these instructions.
My django celery settings are the following:
BROKER_HOST = "localhost"
BROKER_PORT = 5672
BROKER_USER = "guest"
BROKER_PASSWORD = "guest"
BROKER_VHOST = "/"
When I try to launch the celery worker server with...
$ python manage.py celeryd -l info
I get the following output with a "Consumer: Connection Error: [Errno 111]" at the end :
/home/thomas/virtualenv/ULYSSE/lib/python2.6/site-packages/djcelery/loaders.py:84: UserWarning: Using settings.DEBUG leads to a memory leak, never use this setting in production environments!
warnings.warn("Using settings.DEBUG leads to a memory leak, never "
[2011-09-20 12:14:00,645: WARNING/MainProcess]
-------------- celery@debian v2.3.3
---- **** -----
--- * *** * -- [Configuration]
-- * - **** --- . broker: amqp://guest@localhost:5672//
- ** ---------- . loader: djcelery.loaders.DjangoLoader
- ** ---------- . logfile: [stderr]@INFO
- ** ---------- . concurrency: 1
- ** ---------- . events: OFF
- *** --- * --- . beat: OFF
-- ******* ----
--- ***** ----- [Queues]
-------------- . celery: exchange:celery (direct) binding:celery
[Tasks]
. competitions.tasks.add
[2011-09-20 12:14:00,788: INFO/PoolWorker-1] child process calling self.run()
[2011-09-20 12:14:00,795: WARNING/MainProcess] celery@debian has started.
[2011-09-20 12:14:00,809: ERROR/MainProcess] **Consumer: Connection Error: [Errno 111] Connection refused. Trying again in 2 seconds**...
Apparently, my settings are read correctly (cf. the Configuration section in the output) and the worker process starts correctly ("celery@debian has started").
I cannot figure out why this "Consumer: Connection Error: [Errno 111]" error happens...
Does this have to do with the BROKER_USER and BROKER_PASSWORD settings?
I tried different settings for user/password (my account, root account...) but I always get the same error. Do BROKER_USER and BROKER_PASSWORD refer to an OS user, a database user, or a "broker" user?
How can I get rid of this Connection Error?
Looks like RabbitMQ isn't installed or running. Can you check this? To install it:
apt-get install rabbitmq-server
on Ubuntu
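Errno 111 means nothing accepted the TCP connection, so a quick way to check is to try opening a socket to the broker's host and port yourself. This is a minimal sketch using only the standard library; the function name `broker_is_listening` is hypothetical, and the defaults mirror the BROKER_HOST and BROKER_PORT values from the question.

```python
import socket


def broker_is_listening(host="localhost", port=5672, timeout=2.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        connection = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        # Covers "connection refused" (errno 111 on Linux) and timeouts.
        return False
    connection.close()
    return True


if __name__ == "__main__":
    if broker_is_listening():
        print("Something is listening on localhost:5672")
    else:
        print("Nothing is listening on localhost:5672; is the broker running?")
```

A successful connection only shows that some process is listening on the port. It does not verify the user name or password, which are checked later during the AMQP handshake.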