django-celery as a daemon: not working

I have a website project written with Django, Celery and RabbitMQ. A '.delay' task (the task creates a new folder) is called when a button is clicked.
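For context, a minimal sketch of the kind of task being described (the task name, its argument, and the example path are hypothetical, not from the project):
# tasks.py -- a minimal sketch; create_folder is a hypothetical name
import os
from celery.task import task

@task()
def create_folder(path):
    # create the requested directory (the parent must already exist)
    os.mkdir(path)

# the button's view would then call: create_folder.delay('/tmp/new_folder')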
Everything works fine with celery (the .delay task is called and a new folder is created) when I run celery through manage.py like:
python manage.py celeryd
However, when I run celery as a daemon, the task is not executed (no folder is created), even though no error is reported.
I was kind of following the tutorial: http://www.arruda.blog.br/programacao/django-celery-in-daemon/
My settings are:
/etc/default/celeryd:
# Name of nodes to start, here we have a single node
CELERYD_NODES="w1"
# Where to chdir at start.
CELERYD_CHDIR="/var/www/myproject"
# How to call "manage.py celeryd_multi"
CELERYD_MULTI="$CELERYD_CHDIR/manage.py celeryd_multi"
# How to call "manage.py celeryctl"
CELERYCTL="$CELERYD_CHDIR/manage.py celeryctl"
# Extra arguments to celeryd
CELERYD_OPTS=""
# Name of the celery config module.
CELERY_CONFIG_MODULE="myproject.settings"
# %n will be replaced with the nodename.
CELERYD_LOG_FILE="/var/log/celery/w1.log"
CELERYD_PID_FILE="/var/run/celery/w1.pid"
# Workers should run as an unprivileged user.
#CELERYD_USER="root"
#CELERYD_GROUP="root"
# Name of the projects settings module.
export DJANGO_SETTINGS_MODULE="myproject.settings"
The corresponding log and pid folders have been created too.
For the /etc/init.d/celeryd script, I used this version:
https://raw.github.com/ask/celery/1da3aa43d1e6de525beeda398d0acb8841d5b4d2/contrib/generic-init.d/celeryd
For /var/www/myproject/myproject/settings.py, I have:
import djcelery
djcelery.setup_loader()
BROKER_HOST = "127.0.0.1"
BROKER_PORT = 5672
BROKER_VHOST = "/"
BROKER_USER = "guest"
BROKER_PASSWORD = "guest"
INSTALLED_APPS = (
    'djcelery',
    ...
)
There is no error when I start celery using:
/etc/init.d/celeryd start
but there are no results either. Does anyone know how to fix this?

Celery's docs have a daemon troubleshooting section that might be helpful. Celery has a flag that lets you run your init script without actually daemonizing, and that should show what's going wrong:
C_FAKEFORK=1 sh -x /etc/init.d/celeryd start
Newer versions of that init script have a dryrun command that's an easier-to-remember way to run the start command without daemonizing.
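For example, with a recent copy of the script that is just (illustrative; older copies of the script don't have this subcommand):
/etc/init.d/celeryd dryrun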

Related

Django - Celery 4.1 with django-celery-beat/rabbitmq : Nothing?

I followed the tutorial at http://docs.celeryproject.org/en/latest/ and I am on VirtualBox (Xubuntu 16.XX LTS), Django 1.11.3, Celery 4.1, rabbitmq 3.6.14, Python 2.7.
When I start the daemonization with the init script celerybeat (using the /etc/default/celeryd config file), the log shows:
[2017-11-19 01:13:00,912: INFO/MainProcess] beat: Starting...
and nothing more after that. Do you see what I could be doing wrong?
My celery.py:
from __future__ import absolute_import, unicode_literals
import os
from celery import Celery
# set the default Django settings module for the 'celery' program.
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'oscar.settings')
app = Celery('oscar')
# Using a string here means the worker doesn't have to serialize
# the configuration object to child processes.
# - namespace='CELERY' means all celery-related configuration keys
# should have a `CELERY_` prefix.
app.config_from_object('django.conf:settings', namespace='CELERY')
# Broker settings
app.conf.broker_url = 'amqp://oscar:oscar@localhost:5672/oscarRabbit'
# Load task modules from all registered Django app configs.
app.autodiscover_tasks()
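Note that for `from oscar import celery_app` (used in tasks.py below) to work, the Celery docs' Django guide also imports the app in the project package's __init__.py; a sketch of that file:
# oscar/__init__.py
from __future__ import absolute_import, unicode_literals

# Make sure the Celery app is loaded when Django starts, so that
# `from oscar import celery_app` works and shared tasks can use it.
from .celery import app as celery_app

__all__ = ['celery_app']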
some_app/tasks.py:
from __future__ import absolute_import, unicode_literals
from oscar import celery_app
from celery.schedules import crontab
from .models import HelpRequest
from datetime import datetime, timedelta
import logging
""" CONSTANTS FOR THE TIMER """
# Can be changed (by default 1 week)
WEEKS_BEFORE_PENDING = 0
DAYS_BEFORE_PENDING = 0
HOURS_BEFORE_PENDING = 0
MINUTES_BEFORE_PENDING = 1
# http://docs.celeryproject.org/en/latest/userguide/periodic-tasks.html
# for schedule : http://docs.celeryproject.org/en/latest/userguide/periodic-tasks.html#crontab-schedules
@celery_app.on_after_configure.connect
def setup_periodic_tasks(sender, **kwargs):
    sender.add_periodic_task(
        crontab(minute=2),  # runs at minute 2 of every hour
        set_open_help_request_to_pending.s()  # add_periodic_task expects a task signature
    )
@celery_app.task(name="HR_OPEN_TO_PENDING")
def set_open_help_request_to_pending():
    """ For the timedelta idea: https://stackoverflow.com/a/27869101/6149867 """
    logging.info("RUNNING CRON TASK FOR STUDENT COLLABORATION : set_open_help_request_to_pending")
    request_list = HelpRequest.objects.filter(
        state=HelpRequest.OPEN,
        timestamp__gte=datetime.now() - timedelta(hours=HOURS_BEFORE_PENDING,
                                                  minutes=MINUTES_BEFORE_PENDING,
                                                  days=DAYS_BEFORE_PENDING,
                                                  weeks=WEEKS_BEFORE_PENDING)
    )
    if request_list:
        # logging.info takes a format string, not print-style comma-separated values
        logging.info("FOUND %d help request(s) => PENDING", request_list.count())
        for help_request in request_list.all():
            help_request.change_state(HelpRequest.PENDING)
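(One way to check whether the worker, rather than beat, is what's missing: call the task directly from a Django shell and see whether it executes — a sketch, assuming the app layout above:)
>>> from some_app.tasks import set_open_help_request_to_pending
>>> set_open_help_request_to_pending.delay()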
/etc/default/celeryd:
# Names of nodes to start
# most people will only start one node:
CELERYD_NODES="worker1"
# but you can also start multiple and configure settings
# for each in CELERYD_OPTS
#CELERYD_NODES="worker1 worker2 worker3"
# alternatively, you can specify the number of nodes to start:
#CELERYD_NODES=10
# Absolute or relative path to the 'celery' command:
CELERY_BIN="/home/jy95/Documents/oscareducation/ve/local/bin/celery"
# App instance to use
# comment out this line if you don't use an app
CELERY_APP="oscar"
# Where to chdir at start.
CELERYD_CHDIR="/home/jy95/Documents/oscareducation"
# Extra command-line arguments to the worker
# django_celery_beat for admin purpose
CELERYD_OPTS="--scheduler django_celery_beat.schedulers:DatabaseScheduler -f /var/log/celery/celery_tasks.log"
# Set logging level to DEBUG
#CELERYD_LOG_LEVEL="DEBUG"
# %n will be replaced with the first part of the nodename.
CELERYD_LOG_FILE="/var/log/celery/%n%I.log"
CELERYD_PID_FILE="/var/run/celery/%n.pid"
# Workers should run as an unprivileged user.
# You need to create this user manually (or you can choose
# a user/group combination that already exists (e.g., nobody).
CELERYD_USER="celery"
CELERYD_GROUP="celery"
# If enabled pid and log directories will be created if missing,
# and owned by the userid/group configured.
CELERY_CREATE_DIRS=1
My setup of rabbitmq :
$ sudo rabbitmqctl add_user oscar oscar
$ sudo rabbitmqctl add_vhost oscarRabbit
$ sudo rabbitmqctl set_user_tags oscar administrator
$ sudo rabbitmqctl set_permissions -p oscarRabbit oscar ".*" ".*" ".*"
The commands I run to start (and their messages):
sudo rabbitmq-server -detached
Warning: PID file not written; -detached was passed.
sudo /etc/init.d/celerybeat start
/etc/init.d/celerybeat: lerybeat: not found
celery init v10.1.
Using configuration: /etc/default/celeryd
Starting celerybeat...
source ve/bin/activate
python manage.py runserver
Performing system checks...
System check identified no issues (0 silenced).
November 19, 2017 - 01:49:22
Django version 1.11.3, using settings 'oscar.settings'
Starting development server at http://127.0.0.1:8000/
Quit the server with CONTROL-C.
Thanks for your answer
It looks like you've started a celerybeat process and your dev server, but haven't started a celery worker process:
celery -A proj worker -B
(where proj is the name of your project).
Note that the -B flag starts the worker with an embedded beat process, so you don't need to run celerybeat separately.
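With the names from the question, that might look like this (a sketch, assuming the oscar app and the virtualenv from the post are active):
celery -A oscar worker -B --scheduler django_celery_beat.schedulers:DatabaseScheduler -l info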

Django Celery Task TwythonStreamer SIGSEGV

I had a Django project where a TwythonStreamer connection is started on a Celery task worker. The connection is started and reloaded as the search terms change. However, in its current state, and prior to updating the project to Celery 3.1.1, it would SIGSEGV when this particular task attempts to run. I can execute the same commands as the task in the Django shell and have them work just fine:
tu = TwitterUserAccount.objects.first()
stream = NetworkStreamer(settings.TWITTER_CONSUMER_KEY, settings.TWITTER_CONSUMER_SECRET, tu.twitter_access_token, tu.twitter_access_token_secret)
stream.statuses.filter(track='foo,bar')
however, with RabbitMQ/Celery running (while in the project's virtualenv) in another window:
celery worker --app=project.app -B -E -l INFO
and try to run:
@task()
def test_network():
    tu = TwitterUserAccount.objects.first()
    stream = NetworkStreamer(settings.TWITTER_CONSUMER_KEY, settings.TWITTER_CONSUMER_SECRET, tu.twitter_access_token, tu.twitter_access_token_secret)
in the Django shell via:
test_network.apply_async()
the following SIGSEGV error occurs in the Celery window (upon initializing the NetworkStreamer):
Task project.app.tasks.test_network with id 5a9d1689-797a-4d35-8bf3-9795e51bb0ec raised exception:
"WorkerLostError('Worker exited prematurely: signal 11 (SIGSEGV).',)"
Task was called with args: [] kwargs: {}.
The contents of the full traceback was:
Traceback (most recent call last):
File "/Users/foo_user/.virtualenvs/project/lib/python2.7/site-packages/billiard/pool.py", line 1170, in mark_as_worker_lost
human_status(exitcode)),
WorkerLostError: Worker exited prematurely: signal 11 (SIGSEGV).
NetworkStreamer is simply an inherited TwythonStreamer (as shown online here).
I have other Celery tasks that run just fine, in addition to various Celery Beat tasks. djcelery.setup_loader(), etc. is being done. I've tried adjusting various settings (I thought it might have been a pickle issue), but I am not even passing any parameters. This project structure is how Celery is being set up, named, etc…
BROKER_URL = 'amqp://'
CELERYBEAT_SCHEDULER = "djcelery.schedulers.DatabaseScheduler"
CELERY_RESULT_ENGINE_OPTIONS = {"echo": True}
CELERY_TASK_SERIALIZER = 'json'
CELERY_ACCEPT_CONTENT = ['json']
CELERY_BACKEND = 'amqp'
# Short lived sessions, disabled by default
CELERY_RESULT_PERSISTENT = True
CELERY_RESULT_BACKEND = 'amqp'
CELERY_TASK_RESULT_EXPIRES = 18000 # 5 hours.
CELERY_SEND_TASK_ERROR_EMAILS = True
Versions:
Python: 2.7.5
RabbitMQ: 3.3.4
Django==1.6.5
amqp==1.4.5
billiard==3.3.0.18
celery==3.1.12
django-celery==3.1.10
flower==0.7.0
psycopg2==2.5.3
pytz==2014.4
twython==3.1.2

uwsgi: no app loaded. going in full dynamic mode

In my uwsgi config, I have these options:
[uwsgi]
chmod-socket = 777
socket = 127.0.0.1:9031
plugins = python
pythonpath = /adminserver/
callable = app
master = True
processes = 4
reload-mercy = 8
cpu-affinity = 1
max-requests = 2000
limit-as = 512
reload-on-as = 256
reload-on-rss = 192
no-orphans
vacuum
My app structure looks like this:
/adminserver
    app.py
    ...
My app.py has these bits of code:
app = Flask(__name__)
...
if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5003, debug=True)
The result is that when I try to curl my server, I get this error:
Wed Sep 11 23:28:56 2013 - added /adminserver/ to pythonpath.
Wed Sep 11 23:28:56 2013 - *** no app loaded. going in full dynamic mode ***
Wed Sep 11 23:28:56 2013 - *** uWSGI is running in multiple interpreter mode ***
What do the module and callable options do? The docs say:
module, wsgi — Argument: string
Load a WSGI module as the application. The module (sans .py) must be importable, i.e. be in PYTHONPATH.
This option may be set with -w from the command line.
callable — Argument: string. Default: application
Set default WSGI callable name.
Module
A module in Python maps to a file on disk - when you have a directory like this:
/some-dir
    module1.py
    module2.py
If you start up a python interpreter while the current working directory is /some-dir you will be able to import each of the modules:
some-dir$ python
>>> import module1, module2
# Module1 and Module2 are now imported
Python searches sys.path (and a few other things, see the docs on import for more information) for a file that matches the name you are trying to import. uwsgi uses Python's import process under the covers to load the module that contains your WSGI application.
Callable
The WSGI PEPs (333 and 3333) specify that a WSGI application is a callable that takes two arguments and returns an iterable that yields bytestrings:
# simple_wsgi.py
# The simplest WSGI application
HELLO_WORLD = b"Hello world!\n"

def simple_app(environ, start_response):
    """Simplest possible application object"""
    status = '200 OK'
    response_headers = [('Content-type', 'text/plain')]
    start_response(status, response_headers)
    return [HELLO_WORLD]
uwsgi needs to know the name of a symbol inside of your module that maps to the WSGI application callable, so it can pass in the environment and the start_response callable - essentially, it needs to be able to do the following:
wsgi_app = getattr(simple_wsgi, 'simple_app')
TL;PC (Too Long; Prefer Code)
A simple parallel of what uwsgi is doing:
# Use `module` to know *what* to import
import simple_wsgi
# construct request environment from user input
# create a callable to pass for start_response
# and then ...
# use `callable` to know what to call
wsgi_app = getattr(simple_wsgi, 'simple_app')
# and then call it to respond to the user
response = wsgi_app(environ, start_response)
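Applied to the config in the question (a sketch, assuming the Flask object lives in /adminserver/app.py), the missing piece is the module option, so uwsgi knows what to import:
[uwsgi]
# ... existing options ...
module = app
callable = app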
For anyone else having this problem, if you are sure your configuration is correct, you should check your uWSGI version.
Ubuntu 12.04 LTS provides 1.0.3. Removing that and using pip to install 2.0.4 resolved my issues.
First, check whether your configuration is correct.
my uwsgi.ini configuration:
[uwsgi]
chdir=/home/air/repo/Qiy
uid=nobody
gid=nobody
module=Qiy.wsgi:application
socket=/home/air/repo/Qiy/uwsgi.sock
master=true
workers=5
pidfile=/home/air/repo/Qiy/uwsgi.pid
vacuum=true
thunder-lock=true
enable-threads=true
harakiri=30
post-buffering=4096
daemonize=/home/air/repo/Qiy/uwsgi.log
Then run uwsgi with:
uwsgi --ini uwsgi.ini
If that doesn't work, rm -rf the venv directory, re-initialize the virtualenv, and retry the steps above.
Re-initializing the venv solved my issue. The problem seemed to appear after I pip3-installed some packages from requirements.txt, upgraded pip, and then installed the uwsgi package; so I deleted the venv and re-created my virtual environment.

Django script configuration for beat mode

Below is my current /etc/init.d/celeryd script:
# Name of nodes to start, here we have a single node
#CELERYD_NODES="w1"
# or we could have three nodes:
CELERYD_NODES="w1 w2 w3"
# Where to chdir at start.
CELERYD_CHDIR="/srv/project/website"
# How to call "manage.py celeryd_multi"
CELERYD_MULTI="$CELERYD_CHDIR/manage.py celeryd_multi"
# How to call "manage.py celeryctl"
CELERYCTL="$CELERYD_CHDIR/manage.py celeryctl"
# Extra arguments to celeryd
CELERYD_OPTS="--time-limit=300 --concurrency=8"
# %n will be replaced with the nodename.
CELERYD_LOG_FILE="/srv/project/logs/celery/%n.log"
CELERYD_PID_FILE="/srv/project/celery/%n.pid"
# Workers should run as an unprivileged user.
CELERYD_USER="root"
CELERYD_GROUP="root"
# Name of the projects settings module.
export DJANGO_SETTINGS_MODULE="website.settings"
I now want to run periodic tasks. Building on my example above, how do I create the script configuration for beat mode?
Do I just add the following to the file? And what should the last line be?
# Where the Django project is.
CELERYBEAT_CHDIR="/srv/project/website"
# Name of the projects settings module.
export DJANGO_SETTINGS_MODULE="website.settings"
# Path to celerybeat
CELERYBEAT="/opt/project/website/manage.py celerybeat"
# Extra arguments to celerybeat
CELERYBEAT_OPTS="--schedule=/var/run/celerybeat-schedule"
Yes, you can add it to the bottom of the script or create a new one that your init script points to.
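For completeness, the generic init scripts also understand dedicated log/pid settings for beat; a sketch (the variable names follow the generic celerybeat init script, the paths are illustrative):
# Where celerybeat logs go and where its pidfile lives
CELERYBEAT_LOG_FILE="/srv/project/logs/celery/beat.log"
CELERYBEAT_PID_FILE="/srv/project/celery/beat.pid"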

Running periodic tasks with django and celery

I'm trying to create a simple background periodic task using the Django-Celery-RabbitMQ combination. I installed Django 1.3.1, and I downloaded and set up djcelery. Here is what my settings.py file looks like:
BROKER_HOST = "127.0.0.1"
BROKER_PORT = 5672
BROKER_VHOST = "/"
BROKER_USER = "guest"
BROKER_PASSWORD = "guest"
....
import djcelery
djcelery.setup_loader()
...
INSTALLED_APPS = (
    'djcelery',
)
And I put a 'tasks.py' file in my application folder with the following contents:
from celery.task import PeriodicTask
from celery.registry import tasks
from datetime import timedelta
from datetime import datetime

class MyTask(PeriodicTask):
    run_every = timedelta(minutes=1)

    def run(self, **kwargs):
        # str() is needed here: concatenating str and datetime raises a TypeError
        self.get_logger().info("Time now: " + str(datetime.now()))
        print("Time now: " + str(datetime.now()))

tasks.register(MyTask)
And then I start up my django server (local development instance):
python manage.py runserver
Then I start up the celerybeat process:
python manage.py celerybeat --logfile=<path_to_log_file> -l DEBUG
I can see entries like this in the log:
[2012-04-29 07:50:54,671: DEBUG/MainProcess] tasks.MyTask sent. id->72a5963c-6e15-4fc5-a078-dd26da663323
And I can also see the corresponding entries getting created in the database, but I can't find where the text I log in MyTask's run function is actually written.
I tried fiddling with the logging settings, and tried using the Django logger instead of the Celery logger, but to no avail. I'm not even sure my task is getting executed. If I print any debug information in the task, where does it go?
Also, this is the first time I'm working with any kind of message-queuing system. It looks like the task will get executed as part of the celerybeat process, outside the Django web framework. Will I still be able to access all the Django models I created?
Thanks,
Venkat.
celerybeat only schedules tasks and publishes them when they are due; it does not execute them. Your task instances are stored on the RabbitMQ server. You need to run the celeryd daemon to execute your tasks:
python manage.py celeryd --logfile=<path_to_log_file> -l DEBUG
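(Alternatively, djcelery's worker can embed the beat scheduler with the -B flag, so a single process both schedules and executes — a sketch, instead of running celerybeat separately:
python manage.py celeryd -B --logfile=<path_to_log_file> -l DEBUG)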
Also, if you are using RabbitMQ, I recommend installing the RabbitMQ management plugin:
rabbitmq-plugins list
rabbitmq-plugins enable rabbitmq_management
service rabbitmq-server restart
It will be available at http://<server>:55672/ (login: guest, pass: guest). There you can check how many tasks are in your RabbitMQ instance.
You should check the RabbitMQ logs, since celery sends the tasks to RabbitMQ and it should execute them. So all the prints of the tasks should be in RabbitMQ logs.