Tasks not executing (Django + Heroku + Celery + RabbitMQ) - django

I'm using RabbitMQ for the first time and I must be misunderstanding some simple configuration settings. Note that I am encountering this issue while running the app locally right now; I have not yet attempted to launch to production via Heroku.
For this app, every 20 seconds I want to look for some unsent messages in the database, and send them via Twilio. Apologies in advance if I've left some relevant code out of my examples below. I've followed all of the Celery setup/config instructions. Here is my current setup:
BROKER_URL = 'amqp://VflhnMEP:8wGLOrNBP.........Bhshs' # Truncated URL string
from datetime import timedelta
CELERYBEAT_SCHEDULE = {
    'send_queued_messages_every_20_seconds': {
        'task': 'comm.tasks.send_queued_messages',
        'schedule': timedelta(seconds=20),
        # 'schedule': crontab(seconds='*/20')
    },
}
CELERY_TIMEZONE = 'UTC'
I am pretty sure that the tasks are being racked up in RabbitMQ; here is the dash that I can see with all of the accumulated messages:
The function, 'send_queued_messages' should be called every 20 seconds.
comm/tasks.py
import datetime
from celery.decorators import periodic_task
from comm.utils import get_user_mobile_number
from comm.api import get_twilio_connection, send_message
from dispatch.models import Message
@periodic_task(run_every=datetime.timedelta(seconds=20))
def send_queued_messages():
    unsent_messages = Message.objects.filter(sent_success=False)
    connection = get_twilio_connection()
    for message in unsent_messages:
        mobile_number = get_user_mobile_number(message=message)
        try:
            send_message(
                connection=connection,
                mobile_number=mobile_number,
                message=message.raw_text
            )
            message.sent_success = True
            message.save()
        except BaseException as e:
            raise e
            pass
I'm pretty sure that I have something misconfigured with RabbitMQ or in my Heroku project settings, but I'm not sure how to continue troubleshooting. When I run 'celery -A myproject beat' everything appears to be running smoothly.
(venv)josephs-mbp:myproject josephfusaro$ celery -A myproject beat
celery beat v3.1.18 (Cipater) is starting.
__ - ... __ - _
Configuration ->
. broker -> amqp://VflhnMEP:**@happ...Bhshs
. loader -> celery.loaders.app.AppLoader
. scheduler -> celery.beat.PersistentScheduler
. db -> celerybeat-schedule
. logfile -> [stderr]@%INFO
. maxinterval -> now (0s)
[2015-05-27 03:01:53,810: INFO/MainProcess] beat: Starting...
[2015-05-27 03:02:13,941: INFO/MainProcess] Scheduler: Sending due task send_queued_messages_every_20_seconds (comm.tasks.send_queued_messages)
[2015-05-27 03:02:34,036: INFO/MainProcess] Scheduler: Sending due task send_queued_messages_every_20_seconds (comm.tasks.send_queued_messages)
So why aren't the tasks executing as they do without Celery being involved*?
My Procfile:
web: gunicorn myproject.wsgi --log-file -
worker: celery -A myproject beat
*I have confirmed that my code executes as expected without Celery being involved!

Special thanks to @MauroRocco for pushing me in the right direction on this. The pieces that I was missing were best explained in this tutorial: https://www.rabbitmq.com/tutorials/tutorial-one-python.html
Note: I needed to modify some of the code in the tutorial to use URLParameters, passing in the resource URL defined in my settings file.
The only line I changed in send.py and receive.py is:
connection = pika.BlockingConnection(pika.URLParameters(BROKER_URL))
and of course we need to import the BROKER_URL variable from settings.py
from settings import BROKER_URL
settings.py
BROKER_URL = 'amqp://VflhnMEP:8wGLOrNBP...4.bigwig.lshift.net:10791/sdklsfssd'
send.py
import pika
from settings import BROKER_URL
connection = pika.BlockingConnection(pika.URLParameters(BROKER_URL))
channel = connection.channel()
channel.queue_declare(queue='hello')
channel.basic_publish(exchange='',
                      routing_key='hello',
                      body='Hello World!')
print(" [x] Sent 'Hello World!'")
connection.close()
receive.py
import pika
from settings import BROKER_URL
connection = pika.BlockingConnection(pika.URLParameters(BROKER_URL))
channel = connection.channel()
channel.queue_declare(queue='hello')
print(' [*] Waiting for messages. To exit press CTRL+C')
def callback(ch, method, properties, body):
    print(" [x] Received %r" % (body,))
channel.basic_consume(callback,
                      queue='hello',
                      no_ack=True)
channel.start_consuming()

Related

celery: could not connect to rabbitmq

I'm using RabbitMQ as the broker for Celery. The problem occurs when I run this command:
celery -A proj worker --loglevel=info
The Celery console shows this:
[2017-06-23 07:57:09,261: ERROR/MainProcess] consumer: Cannot connect to amqp://bruce:**@127.0.0.1:5672//: timed out.
Trying again in 2.00 seconds...
[2017-06-23 07:57:15,285: ERROR/MainProcess] consumer: Cannot connect to amqp://bruce:**@127.0.0.1:5672//: timed out.
Trying again in 4.00 seconds...
These are the logs from RabbitMQ:
=ERROR REPORT==== 23-Jun-2017::13:28:58 ===
closing AMQP connection <0.18756.0> (127.0.0.1:58424 -> 127.0.0.1:5672):
{handshake_timeout,frame_header}
=INFO REPORT==== 23-Jun-2017::13:29:04 ===
accepting AMQP connection <0.18897.0> (127.0.0.1:58425 -> 127.0.0.1:5672)
=ERROR REPORT==== 23-Jun-2017::13:29:14 ===
closing AMQP connection <0.18897.0> (127.0.0.1:58425 -> 127.0.0.1:5672):
{handshake_timeout,frame_header}
=INFO REPORT==== 23-Jun-2017::13:29:22 ===
accepting AMQP connection <0.19054.0> (127.0.0.1:58426 -> 127.0.0.1:5672)
Any input would be appreciated.
I know it's late, but I came across the same issue today and spent almost an hour finding the exact fix, so I thought it might help someone else.
I was using Celery version 4.1.0.
I hope you have configured RabbitMQ properly; if not, please configure it as described at http://docs.celeryproject.org/en/latest/getting-started/brokers/rabbitmq.html#setting-up-rabbitmq
Also cross-check that the broker URL is correct. Here is the broker URL syntax:
amqp://user_name:password@localhost/vhost_name
You might not need to specify the port number, since the default one is selected automatically.
If you use the same values as the setup tutorial linked above, your broker URL will look like this:
amqp://myuser:mypassword@localhost/myvhost
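A quick way to cross-check a broker URL is to split it into its parts and see what each one resolves to. The snippet below is only a sketch using the Python 3 standard library; the helper name `describe_broker_url` is made up for illustration and is not part of Celery, and the credentials are the placeholder values from the setup tutorial.

```python
from urllib.parse import urlsplit

DEFAULT_AMQP_PORT = 5672  # standard AMQP port, assumed when the URL omits one


def describe_broker_url(url):
    """Split an AMQP broker URL into user, host, port and vhost (hypothetical helper)."""
    parts = urlsplit(url)
    return {
        "user": parts.username,
        "password_set": parts.password is not None,
        "host": parts.hostname,
        "port": parts.port or DEFAULT_AMQP_PORT,
        # Everything after the first "/" is the virtual host name.
        "vhost": parts.path[1:],
    }


info = describe_broker_url("amqp://myuser:mypassword@localhost/myvhost")
print(info)
```

If the user, host or vhost printed here is not what you configured in RabbitMQ, the URL is the first thing to fix.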
Follow this project structure:
Project
    app
    Project
        settings.py
        celery.py
        tasks.py
        celery_config.py
celery_config.py
# - - - - - - - - - -
# BROKER SETTINGS
# - - - - - - - - - -
# BROKER_URL = os.environ['APP_BROKER_URL']
BROKER_HEARTBEAT = 10
BROKER_HEARTBEAT_CHECKRATE = 2.0
# Setting BROKER_POOL_LIMIT to None disables pooling
# Disabling pooling causes open/close connections for every task.
# However, the rabbitMQ cluster being behind an Elastic Load Balancer,
# the pooling is not working correctly,
# and the connection is lost at some point.
# There seems no other way around it for the time being.
BROKER_POOL_LIMIT = None
BROKER_TRANSPORT_OPTIONS = {'confirm_publish': True}
BROKER_CONNECTION_TIMEOUT = 20
BROKER_CONNECTION_RETRY = True
BROKER_CONNECTION_MAX_RETRIES = 100
celery.py
from __future__ import absolute_import, unicode_literals
from celery import Celery
from Project import celery_config
app = Celery('Project',
             broker='amqp://myuser:mypassword@localhost/myvhost',
             backend='amqp://',
             include=['Project.tasks'])
# Optional configuration, see the application user guide.
# app.conf.update(
# result_expires=3600,
# CELERY_BROKER_POOL_LIMIT = None,
# )
app.config_from_object(celery_config)
if __name__ == '__main__':
    app.start()
tasks.py
from __future__ import absolute_import, unicode_literals
from .celery import app
@app.task
def add(x, y):
    return x + y
Then start Celery with "celery -A Project worker -l info" from the project directory.
Everything will be fine.
Set CELERY_BROKER_POOL_LIMIT = None in settings.py.
This solution is for GCP users.
I've been working on GCP and faced the same issue.
The error message was :
[2022-03-15 16:56:00,318: ERROR/MainProcess] consumer: Cannot connect to amqp://root:**@34.125.161.132:5672/vhost: timed out.
I spent almost an hour on this issue before I finally found the solution:
we have to allow port 5672 in the firewall rules.
Steps:
1. Go to Firewall.
2. Select the default-allow-http rule.
3. Press Edit.
4. Find "Specified protocols and ports".
5. Add 5672 in the tcp box (for example, to allow several ports: 80,5672,8000).
6. Save the changes and there you go!
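After changing the firewall rule, it can help to confirm from the client machine that the port is reachable before starting the worker again. The following is a minimal sketch with the Python 3 standard library; `port_is_reachable` is a hypothetical helper name, and the host in the usage line is a placeholder. A successful TCP connection only shows that the firewall lets traffic through; it says nothing about the AMQP credentials or the vhost.

```python
import socket


def port_is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unresolvable host names.
        return False


if __name__ == "__main__":
    # 127.0.0.1 is a placeholder; substitute the broker's external IP.
    print(port_is_reachable("127.0.0.1", 5672))
```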

Python Django Celery is taking too much memory

I am running a Celery server which has 5 or 6 tasks to run periodically. Celery uses too much memory after 5 or 6 days of continuous execution.
The Celery documentation is very confusing. I am using the following settings.
# celeryconfig.py
import os
os.environ['DJANGO_SETTINGS_MODULE'] = 'xxx.settings'
# default RabbitMQ broker
BROKER_URL = "amqp://guest:guest@localhost:5672//"
from celery.schedules import crontab
# default RabbitMQ backend
CELERY_RESULT_BACKEND = None
# 4 concurrent processes are running.
CELERYD_CONCURRENCY = 4
# specify location of log files
CELERYD_LOG_FILE="/var/log/celery/celery.log"
CELERY_ALWAYS_EAGER = True
CELERY_IMPORTS = (
    'xxx.celerydir.cron_tasks.deprov_cron_script',
)
CELERYBEAT_SCHEDULE = {
    'deprov_cron_script': {
        'task': 'xxx.celerydir.cron_tasks.deprov_cron_script.check_deprovision_vms',
        'schedule': crontab(minute=0, hour=17),
        'args': ''
    }
}
I am running the Celery service with the nohup command (so that it runs in the background).
nohup celery beat -A xxx.celerydir &
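After going through the documentation, I realized that DEBUG was set to True in my settings. With DEBUG = True, Django keeps a record of every SQL query it runs, so the memory of a long-running Celery process keeps growing.
Just set DEBUG to False in the settings.
REF: https://github.com/celery/celery/issues/2927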
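To see why this setting matters for a process that runs for days, here is a toy model of the mechanism. It is not Django's actual code: `ToyConnection` and `run_periodic_task` are invented for illustration, and the only assumption carried over from Django is that debug mode keeps a record of every SQL statement executed.

```python
class ToyConnection:
    """Toy stand-in for a database connection; not Django's implementation."""

    def __init__(self, debug):
        self.debug = debug
        self.queries = []  # grows only while debug mode is on

    def execute(self, sql):
        if self.debug:
            # Debug mode keeps a record of every statement that was run.
            self.queries.append(sql)


def run_periodic_task(conn, runs):
    """Simulate a periodic task issuing one query per run; return the log size."""
    for i in range(runs):
        conn.execute("SELECT %d" % i)
    return len(conn.queries)


leaky = run_periodic_task(ToyConnection(debug=True), 10000)
steady = run_periodic_task(ToyConnection(debug=False), 10000)
print(leaky, steady)
```

With debug on, the log grows with every run of the task; with debug off, it stays empty no matter how long the process lives.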

Django Celery Task TwythonStreamer SIGSEGV

I had a Django project where a TwythonStreamer connection is started on a Celery task worker. The connection is started and reloaded as the search terms change. However, in its current state, and prior to updating the project to Celery 3.1.1, it would SIGSEGV when this particular task attempts to run. I can execute the same commands of the task in the Django shell and have it work just fine:
tu = TwitterUserAccount.objects.first()
stream = NetworkStreamer(settings.TWITTER_CONSUMER_KEY, settings.TWITTER_CONSUMER_SECRET, tu.twitter_access_token, tu.twitter_access_token_secret)
stream.statuses.filter(track='foo,bar')
however, with RabbitMQ/Celery running (while in the project's virtualenv) in another window:
celery worker --app=project.app -B -E -l INFO
and try to run:
@task()
def test_network():
    tu = TwitterUserAccount.objects.first()
    stream = NetworkStreamer(settings.TWITTER_CONSUMER_KEY, settings.TWITTER_CONSUMER_SECRET, tu.twitter_access_token, tu.twitter_access_token_secret)
in the Django shell via:
test_network.apply_async()
the following SIGSEGV error occurs in the Celery window (upon initialization of the NetworkStreamer):
Task project.app.tasks.test_network with id 5a9d1689-797a-4d35-8bf3-9795e51bb0ec raised exception:
"WorkerLostError('Worker exited prematurely: signal 11 (SIGSEGV).',)"
Task was called with args: [] kwargs: {}.
The contents of the full traceback was:
Traceback (most recent call last):
File "/Users/foo_user/.virtualenvs/project/lib/python2.7/site-packages/billiard/pool.py", line 1170, in mark_as_worker_lost
human_status(exitcode)),
WorkerLostError: Worker exited prematurely: signal 11 (SIGSEGV).
NetworkStreamer is simply an inherited TwythonStreamer (as shown online here).
I have other Celery tasks that run just fine in addition to various Celery Beat tasks. djcelery.setup_loader(), etc is being done. I've tried adjusting various settings (thought it might have been a pickle issue) but am not even passing any parameters. This project structure is how Celery is being setup, named, etc…
BROKER_URL = 'amqp://'
CELERYBEAT_SCHEDULER = "djcelery.schedulers.DatabaseScheduler"
CELERY_RESULT_ENGINE_OPTIONS = {"echo": True}
CELERY_TASK_SERIALIZER = 'json'
CELERY_ACCEPT_CONTENT = ['json']
CELERY_BACKEND = 'amqp'
# Short lived sessions, disabled by default
CELERY_RESULT_PERSISTENT = True
CELERY_RESULT_BACKEND = 'amqp'
CELERY_TASK_RESULT_EXPIRES = 18000 # 5 hours.
CELERY_SEND_TASK_ERROR_EMAILS = True
Versions:
Python: 2.7.5
RabbitMQ: 3.3.4
Django==1.6.5
amqp==1.4.5
billiard==3.3.0.18
celery==3.1.12
django-celery==3.1.10
flower==0.7.0
psycopg2==2.5.3
pytz==2014.4
twython==3.1.2

How to call task properly?

I configured django-celery in my application. This is my task:
from celery.decorators import task
import logging
import simplejson as json
import requests
logger = logging.getLogger(__name__)
@task
def call_api(sid):
    try:
        results = requests.put(
            'http://localhost:8000/api/v1/sids/'+str(sid)+"/",
            data={'active': '1'}
        )
        json_response = json.loads(results.text)
    except Exception as e:
        print(e)
    logger.info('Finished call_api')
When I add in my view:
call_api.apply_async(
(instance.service.id,),
eta=instance.date
)
celeryd shows me:
Got task from broker: my_app.tasks.call_api[755d50fd-0f0f-4861-9a18-7f4e4563290a]
Task my_app.tasks.call_api[755d50fd-0f0f-4861-9a18-7f4e4563290a] succeeded in 0.00513911247253s: None
So it looks fine, but nothing happens. There is no call to, for example:
http://localhost:8000/api/v1/sids/1/
What am I doing wrong?
Are you running Celery as a separate process?
For example, on Ubuntu, run it with the command
sudo python manage.py celeryd
Until you run Celery (or django-celery) as a separate process, the jobs will just be stored in the database (or the queue, or whatever persistence mechanism you have configured, generally in settings.py).

Running periodic tasks with django and celery

I'm trying to create a simple background periodic task using the Django-Celery-RabbitMQ combination. I installed Django 1.3.1, then downloaded and set up djcelery. Here is what my settings.py file looks like:
BROKER_HOST = "127.0.0.1"
BROKER_PORT = 5672
BROKER_VHOST = "/"
BROKER_USER = "guest"
BROKER_PASSWORD = "guest"
....
import djcelery
djcelery.setup_loader()
...
INSTALLED_APPS = (
    'djcelery',
)
And I put a 'tasks.py' file in my application folder with the following contents:
from celery.task import PeriodicTask
from celery.registry import tasks
from datetime import timedelta
from datetime import datetime
class MyTask(PeriodicTask):
    run_every = timedelta(minutes=1)
    def run(self, **kwargs):
        self.get_logger().info("Time now: " + str(datetime.now()))
        print("Time now: " + str(datetime.now()))
tasks.register(MyTask)
And then I start up my django server (local development instance):
python manage.py runserver
Then I start up the celerybeat process:
python manage.py celerybeat --logfile=<path_to_log_file> -l DEBUG
I can see entries like this in the log:
[2012-04-29 07:50:54,671: DEBUG/MainProcess] tasks.MyTask sent. id->72a5963c-6e15-4fc5-a078-dd26da663323
And I also can see the corresponding entries getting created in database, but I can't find where it is logging the text I specified in the actual run function in MyTask class.
I tried fiddling with the logging settings and tried using the Django logger instead of the Celery logger, but to no avail. I'm not even sure my task is getting executed. If I print any debug information in the task, where does it go?
Also, this is the first time I'm working with any type of message queuing system. It looks like the task will get executed as part of the celerybeat process, outside the Django web framework. Will I still be able to access all the Django models I created?
Thanks,
Venkat.
Celerybeat is the component that pushes tasks onto the queue when they are due; it does not execute them. Your task instances are stored in the RabbitMQ server. You need to run the celeryd daemon to execute your tasks.
python manage.py celeryd --logfile=<path_to_log_file> -l DEBUG
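The split between scheduling and executing can be sketched with nothing but the standard library. This is a toy model, not Celery code: `broker`, `beat_tick` and `worker_drain` are invented names, with an in-process queue standing in for RabbitMQ, a function standing in for celerybeat and another standing in for celeryd.

```python
import queue

broker = queue.Queue()  # stands in for the RabbitMQ queue
executed = []           # records which tasks actually ran


def beat_tick(task_name):
    """What the scheduler does: publish a message and return immediately."""
    broker.put(task_name)


def worker_drain(registry):
    """What a worker does: take messages off the queue and run the tasks."""
    while not broker.empty():
        name = broker.get()
        registry[name]()


def my_task():
    executed.append("tasks.MyTask")


for _ in range(3):
    beat_tick("tasks.MyTask")

pending_without_worker = broker.qsize()  # messages pile up
ran_without_worker = len(executed)       # but nothing has run yet

worker_drain({"tasks.MyTask": my_task})  # only now are the tasks executed
print(pending_without_worker, ran_without_worker, len(executed))
```

Until `worker_drain` is called, the messages simply accumulate, which matches the "sent" entries in the celerybeat log with no output from the task itself.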
Also, if you are using RabbitMQ, I recommend installing the RabbitMQ management plugin:
rabbitmq-plugins list
rabbitmq-plugins enable rabbitmq_management
service rabbitmq-server restart
It will be available at http://:55672/ (login: guest, password: guest). There you can check how many tasks are in your RabbitMQ instance.
You should check the Celery worker (celeryd) logs rather than the RabbitMQ logs: Celery sends the tasks to RabbitMQ, but it is a worker that executes them, so all the prints of the tasks end up in the worker's output.