Some tasks are not being processed by Django-Q - django

I have a Django-Q cluster running with this configuration:
Q_CLUSTER = {
    'name': 'pretty_name',
    'workers': 1,
    'recycle': 500,
    'timeout': 500,
    'queue_limit': 5,
    'cpu_affinity': 1,
    'label': 'Django Q',
    'save_limit': 0,
    'ack_failures': True,
    'max_attempts': 1,
    'attempt_count': 1,
    'redis': {
        'host': CHANNEL_REDIS_HOST,
        'port': CHANNEL_REDIS_PORT,
        'db': 5,
    }
}
On this cluster I have a scheduled task that is supposed to run every 15 minutes.
Sometimes it works fine, and this is what I see in my worker logs:
[Q] INFO Enqueued 1
[Q] INFO Process-1 created a task from schedule [2]
[Q] INFO Process-1:1 processing [oranges-georgia-snake-social]
[ My Personal Custom Task Log]
[Q] INFO Processed [oranges-georgia-snake-social]
But other times the task does not start; this is all I get in my log:
[Q] INFO Enqueued 1
[Q] INFO Process-1 created a task from schedule [2]
And then nothing for the next 15 minutes.
Any idea where this might come from?

This was my prod environment, and it turned out my dev environment was using the same Redis db. Even though no task existed in my dev environment, sharing the db was the cause of the issue.
The solution was to use a different Redis db for the dev and prod environments!
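A minimal sketch of how the environments could be kept apart (the environment-variable names and defaults here are illustrative, not from my actual settings):

import os

# CHANNEL_REDIS_HOST / CHANNEL_REDIS_PORT come from the original settings;
# shown with illustrative defaults to keep the sketch self-contained.
CHANNEL_REDIS_HOST = os.environ.get('REDIS_HOST', 'localhost')
CHANNEL_REDIS_PORT = int(os.environ.get('REDIS_PORT', '6379'))

Q_CLUSTER = {
    'name': 'pretty_name',
    'workers': 1,
    'timeout': 500,
    'redis': {
        'host': CHANNEL_REDIS_HOST,
        'port': CHANNEL_REDIS_PORT,
        # the key fix: a different db per environment, e.g. 5 in prod
        # and 6 in dev, so the clusters never consume each other's tasks
        'db': int(os.environ.get('Q_REDIS_DB', '5')),
    }
}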

Related

Restarting celery and celery beat schedule relationship in django

Will restarting celery cause all the periodic tasks (celery beat schedules) to reset and start from the time celery is restarted, or does it retain the schedule?
For example, assume I have a periodic task that executes at 12 pm every day. Now I restart celery at 3 pm. Will the periodic task be reset to run at 3 pm every day?
How do you set up your task? There are several ways to define a task schedule.
Example: run the tasks.add task every 30 seconds.
app.conf.beat_schedule = {
    'add-every-30-seconds': {
        'task': 'tasks.add',
        'schedule': 30.0,
        'args': (16, 16)
    },
}
app.conf.timezone = 'UTC'
This task runs every 30 seconds, counting from when beat starts.
Another example:
from celery.schedules import crontab
app.conf.beat_schedule = {
    # Executes every day at 7:30 a.m.
    'add-every-morning': {
        'task': 'tasks.add',
        'schedule': crontab(hour=7, minute=30),
        'args': (16, 16),
    },
}
This task runs at 7:30 every day.
You can check more schedule examples in the Celery documentation.
So the answer depends on how you define the schedule: a crontab entry fires at fixed wall-clock times, so a 12 pm task keeps running at 12 pm after a restart, while a plain interval like 30.0 simply counts seconds between runs.
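For completeness, the variant of the crontab example in the Celery docs actually restricts the run to Mondays with a day_of_week argument:

from celery.schedules import crontab

app.conf.beat_schedule = {
    # Executes every Monday morning at 7:30 a.m.
    'add-every-monday-morning': {
        'task': 'tasks.add',
        'schedule': crontab(hour=7, minute=30, day_of_week=1),
        'args': (16, 16),
    },
}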

Django-q: WARNING reincarnated worker Process-1:1 after timeout

I've installed and configured Django-Q 1.3.5 (on Django 3.2 with Redis 3.5.3 and Python 3.8.5).
This is my Cluster configuration:
# redis defaults
Q_CLUSTER = {
    'name': 'my_broker',
    'workers': 4,
    'recycle': 500,
    'timeout': 60,
    'retry': 65,
    'compress': True,
    'save_limit': 250,
    'queue_limit': 500,
    'cpu_affinity': 1,
    'redis': {
        'host': 'localhost',
        'port': 6379,
        'db': 0,
        'password': None,
        'socket_timeout': None,
        'charset': 'utf-8',
        'errors': 'strict',
        'unix_socket_path': None
    }
}
where I have deliberately chosen timeout: 60 and retry: 65 to illustrate my problem.
I created this simple function to call via an Admin Scheduled Task:
import time

def test_timeout_task():
    # deliberately sleeps longer than the 60-second cluster timeout
    time.sleep(61)
    return "Result of task"
And this is my "Scheduled Task page" (localhost:8000/admin/django_q/schedule/):
ID | Name         | Func                            | Success
1  | test timeout | mymodel.tasks.test_timeout_task | ?
When I run this task, I get the following warning:
10:18:21 [Q] INFO Process-1 created a task from schedule [test timeout]
10:19:22 [Q] WARNING reincarnated worker Process-1:1 after timeout
10:19:22 [Q] INFO Process-1:7 ready for work at 68301
and the task is no longer executed.
So, my question is: is there a way to correctly handle a task whose runtime cannot be predicted?
You can set your timeout to 'timeout': None, and the cluster should run your task to completion without killing the worker.
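If you would rather keep a safety net than disable it entirely, another option (a sketch with illustrative values, not from the question) is to size the timeout above your slowest task, keeping retry greater than timeout as the Django-Q docs require:

# timeout must cover the slowest expected task; retry must stay greater
# than timeout so the broker does not hand the same task to a second
# worker while the first is still running.
Q_CLUSTER = {
    'name': 'my_broker',
    'workers': 4,
    'timeout': 120,  # > the 61-second sleep in test_timeout_task
    'retry': 180,    # > timeout
    'redis': {'host': 'localhost', 'port': 6379, 'db': 0},
}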

Flask-sqlalchemy / uwsgi: DB connection problem when more than one process is used

I have a Flask app running on Heroku with a uWSGI server, in which each user connects to their own database. I have implemented the solution reported here for a very similar situation. In particular, I have implemented the connection registry as follows:
from flask import current_app
from sqlalchemy import create_engine
from sqlalchemy.exc import ArgumentError
from sqlalchemy.orm import scoped_session, sessionmaker

class DBSessionRegistry():
    _registry = {}

    def get(self, URI, **kwargs):
        if URI not in self._registry:
            current_app.logger.info('INFO - CREATING A NEW CONNECTION')
            try:
                engine = create_engine(URI,
                                       echo=False,
                                       pool_size=5,
                                       max_overflow=5)
                session_factory = sessionmaker(bind=engine)
                Session = scoped_session(session_factory)
                a_session = Session()
                self._registry[URI] = a_session
            except ArgumentError:
                raise Exception('Error')
        current_app.logger.info(f'SESSION ID: {id(self._registry[URI])}')
        current_app.logger.info(f'REGISTRY ID: {id(self._registry)}')
        current_app.logger.info(f'REGISTRY SIZE: {len(self._registry.keys())}')
        current_app.logger.info(f'APP ID: {id(current_app)}')
        return self._registry[URI]
In my create_app() I assign a registry to the app:
app.DBregistry = DBSessionRegistry()
and whenever I need to talk to the DB I call:
current_app.DBregistry.get(URI)
where the URI depends on the user. This works nicely when uWSGI runs a single process. With more processes,
[uwsgi]
processes = 4
threads = 1
sometimes it gets stuck on some requests, returning a 503 error code. I have found that the problem appears when the requests are handled by different processes in uwsgi. This is an excerpt of the log, which I commented to illustrate the issue:
# ... EVERYTHING OK UP TO HERE.
# ALL PREVIOUS REQUESTS HANDLED BY PROCESS pid = 12
INFO in utils: SESSION ID: 139860361716304
INFO in utils: REGISTRY ID: 139860484608480
INFO in utils: REGISTRY SIZE: 1
INFO in utils: APP ID: 139860526857584
# NOTE THE pid IN THE NEXT LINE...
[pid: 12|app: 0|req: 1/1] POST /manager/_save_task =>
generated 154 bytes in 3457 msecs (HTTP/1.1 200) 4 headers in 601
bytes (1 switches on core 0)
# PREVIOUS REQUEST WAS MANAGED BY PROCESS pid = 12
# THE NEXT REQUEST IS FROM THE SAME USER AND TO THE SAME URL.
# SO THERE IS NO NEED FOR CREATING A NEW CONNECTION, BUT INSTEAD...
INFO - CREATING A NEW CONNECTION
# TO THIS POINT, I DON'T UNDERSTAND WHY IT CREATED A NEW CONNECTION.
# THE SESSION ID CHANGES, AS IT IS A NEW SESSION
INFO in utils: SESSION ID: 139860363793168 # <<--- CHANGED
INFO in utils: REGISTRY ID: 139860484608480
INFO in utils: REGISTRY SIZE: 1
# THE APP AND THE REGISTRY ARE UNIQUE
INFO in utils: APP ID: 139860526857584
# uwsgi GIVES UP...
*** HARAKIRI ON WORKER 4 (pid: 11, try: 1) ***
# THE FAILED REQUEST WAS MANAGED BY PROCESS pid = 11
# I ASSUME THIS IS WHY IT CREATED A NEW CONNECTION
HARAKIRI: -- syscall> 7 0x7fff4290c6d8 0x1 0xffffffff 0x4000 0x0 0x0
0x7fff4290c6b8 0x7f33d6e3cbc4
HARAKIRI: -- wchan> poll_schedule_timeout
HARAKIRI !!! worker 4 status !!!
HARAKIRI [core 0] - POST /manager/_save_task since 1587660997
HARAKIRI !!! end of worker 4 status !!!
heroku[router]: at=error code=H13 desc="Connection closed without
response" method=POST path="/manager/_save_task"
DAMN ! worker 4 (pid: 11) died, killed by signal 9 :( trying respawn ...
Respawned uWSGI worker 4 (new pid: 14)
# FROM HERE ON, NOTHINGS WORKS ANYMORE
This behavior is consistent over several attempts: when the pid changes, the request fails. Even with pool_size=1 in the create_engine call the issue persists. There is no issue if uWSGI is used with a single process.
I am pretty sure it is my fault; there is something I don't know or don't understand about how uWSGI and/or SQLAlchemy work. Could you please help me?
Thanks
What is happening is that you are trying to share memory between processes.
There are some explanations in these posts:
(is it possible to share memory between uwsgi processes running flask app?)
(https://stackoverflow.com/a/45383617/11542053)
You can use an extra layer to store your sessions outside of the app.
For that, you can use uWSGI's SharedArea (https://uwsgi-docs.readthedocs.io/en/latest/SharedArea.html), which is very low level, or you can use other approaches like uWSGI's caching (https://uwsgi-docs.readthedocs.io/en/latest/Caching.html).
Hope it helps.
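To illustrate the caching idea, here is a rough sketch against uWSGI's cache API (it assumes a cache named mycache configured in uwsgi.ini, e.g. cache2 = name=mycache,items=100). Live SQLAlchemy sessions and engines cannot be shared this way; each worker still builds its own engine, and the cache only holds small serializable values, such as which URI a user maps to:

import uwsgi  # only importable when running under uWSGI

def remember_uri(user_id, uri):
    # visible to every worker process, unlike a per-process dict
    uwsgi.cache_update('uri:%s' % user_id, uri.encode(), 0, 'mycache')

def lookup_uri(user_id):
    value = uwsgi.cache_get('uri:%s' % user_id, 'mycache')
    return value.decode() if value else None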

Airflow cannot run DAG because upstream tasks have failed

I am trying to use Apache Airflow to create a workflow. I've installed Airflow manually into my own Anaconda environment on the server.
Here is how I set things up to run a simple DAG:
export AIRFLOW_HOME=~/airflow/airflow_home # my airflow home
export AIRFLOW=~/.conda/.../lib/python2.7/site-packages/airflow/bin
export PATH=~/.conda/.../bin:$AIRFLOW:$PATH # my kernel
When I test a particular task with airflow test, it works, but only for that task in isolation. For example, in dag1: task1 >> task2
airflow test dag1 task2 2017-06-22
I assumed it would run task1 first and then task2, but it just runs task2 on its own.
Do you guys have any idea about this? Thank you very much in advance!
Here is my code:
from airflow import DAG
from airflow.operators.bash_operator import BashOperator
from airflow.operators.python_operator import PythonOperator
from datetime import datetime, timedelta
default_args = {
    'owner': 'txuantu',
    'depends_on_past': False,
    'start_date': datetime(2015, 6, 1),
    'email': ['tran.xuantu#axa.com'],
    'email_on_failure': False,
    'email_on_retry': False,
    'retries': 1,
    'retry_delay': timedelta(minutes=5),
    # 'queue': 'bash_queue',
    # 'pool': 'backfill',
    # 'priority_weight': 10,
    # 'end_date': datetime(2016, 1, 1),
}

dag = DAG(
    'tutorial', default_args=default_args, schedule_interval=timedelta(1))

def python_op1(ds, **kwargs):
    print(ds)
    return 0

def python_op2(ds, **kwargs):
    print(str(kwargs))
    return 0

# t1, t2 and t3 are examples of tasks created by instantiating operators
# t1 = BashOperator(
#     task_id='bash_operator',
#     bash_command='echo {{ ds }}',
#     dag=dag)
t1 = PythonOperator(
    task_id='python_operator1',
    python_callable=python_op1,
    # provide_context=True,
    dag=dag)
t2 = PythonOperator(
    task_id='python_operator2',
    python_callable=python_op2,
    # provide_context=True,
    dag=dag)
t2.set_upstream(t1)
Airflow: v1.8.0
Using executor SequentialExecutor with SQLite
airflow run tutorial python_operator2 2015-06-01
Here is the error message:
[2017-06-28 22:49:15,336] {models.py:167} INFO - Filling up the DagBag from /home/txuantu/airflow/airflow_home/dags
[2017-06-28 22:49:16,069] {base_executor.py:50} INFO - Adding to queue: airflow run tutorial python_operator2 2015-06-01T00:00:00 --mark_success --local -sd DAGS_FOLDER/tutorial.py
[2017-06-28 22:49:16,072] {sequential_executor.py:40} INFO - Executing command: airflow run tutorial python_operator2 2015-06-01T00:00:00 --mark_success --local -sd DAGS_FOLDER/tutorial.py
[2017-06-28 22:49:16,765] {models.py:167} INFO - Filling up the DagBag from /home/txuantu/airflow/airflow_home/dags/tutorial.py
[2017-06-28 22:49:16,986] {base_task_runner.py:112} INFO - Running: ['bash', '-c', u'airflow run tutorial python_operator2 2015-06-01T00:00:00 --mark_success --job_id 1 --raw -sd DAGS_FOLDER/tutorial.py']
[2017-06-28 22:49:17,373] {base_task_runner.py:95} INFO - Subtask: [2017-06-28 22:49:17,373] {__init__.py:57} INFO - Using executor SequentialExecutor
[2017-06-28 22:49:17,694] {base_task_runner.py:95} INFO - Subtask: [2017-06-28 22:49:17,693] {models.py:167} INFO - Filling up the DagBag from /home/txuantu/airflow/airflow_home/dags/tutorial.py
[2017-06-28 22:49:17,899] {base_task_runner.py:95} INFO - Subtask: [2017-06-28 22:49:17,899] {models.py:1120} INFO - Dependencies not met for <TaskInstance: tutorial.python_operator2 2015-06-01 00:00:00 [None]>, dependency 'Trigger Rule' FAILED: Task's trigger rule 'all_success' requires all upstream tasks to have succeeded, but found 1 non-success(es). upstream_tasks_state={'successes': 0, 'failed': 0, 'upstream_failed': 0, 'skipped': 0, 'done': 0}, upstream_task_ids=['python_operator1']
[2017-06-28 22:49:22,011] {jobs.py:2083} INFO - Task exited with return code 0
If you only want to run python_operator2 without first running its upstream task, you should execute:
airflow run tutorial python_operator2 2015-06-01 --ignore_dependencies
If you want to execute the entire dag and execute both tasks, use trigger_dag:
airflow trigger_dag tutorial
For reference, airflow test will "run a task without checking for dependencies."
Documentation for all three commands can be found at https://airflow.incubator.apache.org/cli.html
Finally, I found an answer to my problem. Basically, I thought Airflow was lazy-loading, but it seems not. So the answer is, instead of:
t2.set_upstream(t1)
It should be:
t1.set_downstream(t2)
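For reference, the Airflow docs treat these forms as equivalent ways to declare the same dependency, so the bitshift operators below give the same ordering, and any improvement from swapping the two calls was likely incidental:

# all of these declare "t1 runs before t2"
t2.set_upstream(t1)
t1.set_downstream(t2)
t1 >> t2
t2 << t1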

celerybeat automatically disables periodic task

I'd like to create a periodic task for celery using django-celery's admin interface. I have a task set up which runs great when called manually or by script. It just doesn't work through celerybeat. According to the debug logs the task is set to enabled = False on first retrieval and I wonder why.
When adding the periodic task and passing [1, False] as positional arguments, the task is automatically disabled and I don't see any further output. When added without arguments the task is executed but raises an exception instantly because I didn't supply the needed arguments (makes sense).
Does anyone see what's the problem here?
Thanks in advance.
This is the output after supplying arguments:
[DEBUG/Beat] SELECT "djcelery_periodictask"."id", [...]
FROM "djcelery_periodictask"
WHERE "djcelery_periodictask"."enabled" = true ; args=(True,)
[DEBUG/Beat] SELECT "djcelery_intervalschedule"."id", [...]
FROM "djcelery_intervalschedule"
WHERE "djcelery_intervalschedule"."id" = 3 ; args=(3,)
[DEBUG/Beat] SELECT (1) AS "a"
FROM "djcelery_periodictask"
WHERE "djcelery_periodictask"."id" = 3 LIMIT 1; args=(3,)
[DEBUG/Beat] UPDATE "djcelery_periodictask"
SET "name" = E'<taskname>', "task" = E'<task.module.path>',
"interval_id" = 3, "crontab_id" = NULL,
"args" = E'[1, False,]', "kwargs" = E'{}', "queue" = NULL,
"exchange" = NULL, "routing_key" = NULL,
"expires" = NULL, "enabled" = false,
"last_run_at" = E'2011-05-25 00:45:23.242387', "total_run_count" = 9,
"date_changed" = E'2011-05-25 09:28:06.201148'
WHERE "djcelery_periodictask"."id" = 3;
args=(
u'<periodic-task-name>', u'<task.module.path>',
3, u'[1, False,]', u'{}',
False, u'2011-05-25 00:45:23.242387', 9,
u'2011-05-25 09:28:06.201148', 3
)
[DEBUG/Beat] Current schedule:
<ModelEntry: celery.backend_cleanup celery.backend_cleanup(*[], **{}) {<crontab: 0 4 * (m/h/d)>}
[DEBUG/Beat] Celerybeat: Waking up in 5.00 seconds.
EDIT:
It works with the following setting. I still have no idea why it doesn't work with django-celery.
CELERYBEAT_SCHEDULE = {
    "example": {
        "task": "<task.module.path>",
        "schedule": crontab(),
        "args": (1, False)
    },
}
I had the same issue. Make sure the arguments are JSON formatted. For example, try setting the positional args to [1, false] -- lowercase 'false' -- I just tested it on a django-celery instance (version 2.2.4) and it worked.
For the keyword args, use something like {"name": "aldarund"}
I had the same problem too.
Following the description of the PeriodicTask model in djcelery ("JSON encoded positional arguments"), the same as Evan's answer, I tried using Python's json library to encode the arguments before saving.
This worked for me:
import json
from djcelery.models import PeriodicTask

o = PeriodicTask()
o.kwargs = json.dumps({'myargs': 'hello'})
o.save()
celery version 3.0.11
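The same encoding fixes the positional args from the question; a small sketch (the lookup by name is illustrative):

import json
from djcelery.models import PeriodicTask

task = PeriodicTask.objects.get(name='<periodic-task-name>')  # illustrative lookup
task.args = json.dumps([1, False])  # serializes to '[1, false]'
task.save()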
CELERYBEAT_SCHEDULE = {
    "example": {
        "task": "<task.module.path>",
        "schedule": crontab(),
        "enable": False
    },
}
I tried it and it worked. I am running celery beat v5.1.2.