Django-q: WARNING reincarnated worker Process-1:1 after timeout - django

I've installed and configured Django-Q 1.3.5 (on Django 3.2 with Redis 3.5.3 and Python 3.8.5).
This is my Cluster configuration:
# redis defaults
Q_CLUSTER = {
    'name': 'my_broker',
    'workers': 4,
    'recycle': 500,
    'timeout': 60,
    'retry': 65,
    'compress': True,
    'save_limit': 250,
    'queue_limit': 500,
    'cpu_affinity': 1,
    'redis': {
        'host': 'localhost',
        'port': 6379,
        'db': 0,
        'password': None,
        'socket_timeout': None,
        'charset': 'utf-8',
        'errors': 'strict',
        'unix_socket_path': None
    }
}
where I have deliberately chosen timeout: 60 and retry: 65 to illustrate my problem.
I created this simple function to call via Admin Scheduled Task:
import time

def test_timeout_task():
    time.sleep(61)
    return "Result of task"
And this is my "Scheduled Task page" (localhost:8000/admin/django_q/schedule/)
ID | Name         | Func                            | Success
1  | test timeout | mymodel.tasks.test_timeout_task | ?
When I run this task, I get the following warning:
10:18:21 [Q] INFO Process-1 created a task from schedule [test timeout]
10:19:22 [Q] WARNING reincarnated worker Process-1:1 after timeout
10:19:22 [Q] INFO Process-1:7 ready for work at 68301
and the task is no longer executed.
So, my question is: is there a way to correctly handle a task whose runtime is unpredictable and may exceed the configured timeout?

You can set your timeout to 'timeout': None and the cluster should then run your task to completion without stopping it.
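For illustration, a minimal sketch of the adjusted configuration, keeping the other settings from the question. Note that retry still controls when the broker re-presents an unacknowledged task, so it should stay above the longest expected task runtime; the value 3600 below is an assumption, not something prescribed by Django-Q.

# Sketch only: same cluster settings as in the question, with the worker timeout disabled
Q_CLUSTER = {
    'name': 'my_broker',
    'workers': 4,
    'recycle': 500,
    'timeout': None,   # workers are no longer reincarnated for running too long
    'retry': 3600,     # assumed value: comfortably above any expected task duration
    'compress': True,
    'save_limit': 250,
    'queue_limit': 500,
    'cpu_affinity': 1,
    'redis': {
        'host': 'localhost',
        'port': 6379,
        'db': 0
    }
}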

Related

Some tasks are not processing using Django-Q

I have a django Q cluster running with this configuration:
Q_CLUSTER = {
    'name': 'pretty_name',
    'workers': 1,
    'recycle': 500,
    'timeout': 500,
    'queue_limit': 5,
    'cpu_affinity': 1,
    'label': 'Django Q',
    'save_limit': 0,
    'ack_failures': True,
    'max_attempts': 1,
    'attempt_count': 1,
    'redis': {
        'host': CHANNEL_REDIS_HOST,
        'port': CHANNEL_REDIS_PORT,
        'db': 5,
    }
}
On this cluster I have a scheduled task that is supposed to run every 15 minutes.
Sometimes it works fine, and this is what I can see in my worker logs:
[Q] INFO Enqueued 1
[Q] INFO Process-1 created a task from schedule [2]
[Q] INFO Process-1:1 processing [oranges-georgia-snake-social]
[ My Personal Custom Task Log]
[Q] INFO Processed [oranges-georgia-snake-social]
But other times the task does not start; this is what I get in my log:
[Q] INFO Enqueued 1
[Q] INFO Process-1 created a task from schedule [2]
And then nothing for the next 15 minutes.
Any idea where this might come from?
This was my prod environment, and it turned out my dev environment was using the same Redis db; even though no task existed in my dev environment, that shared db was the cause of the issue.
The solution was to use different Redis dbs for my dev and prod environments!
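For illustration, a minimal sketch of how the two environments could be split, reusing the settings names from the question; the DJANGO_ENV variable and the dev db number are assumptions, not part of the original setup.

import os

# Sketch only: give dev and prod their own Redis db so their queues never mix
IS_PROD = os.environ.get('DJANGO_ENV') == 'prod'

Q_CLUSTER = {
    'name': 'pretty_name',
    'workers': 1,
    'timeout': 500,
    'redis': {
        'host': CHANNEL_REDIS_HOST,   # as defined elsewhere in settings (from the question)
        'port': CHANNEL_REDIS_PORT,
        'db': 5 if IS_PROD else 6,    # assumed db numbers: prod keeps 5, dev moves to 6
    }
}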

Trouble trying to get size of Celery queue using redis-cli (for a Django app)

I'm using Django==2.2.24 and celery[redis]==4.4.7.
I want to get the length of my celery queues, so that I can use this information for autoscaling purposes in AWS EC2.
I found the following piece of documentation:
https://docs.celeryproject.org/en/v4.4.7/userguide/monitoring.html#redis
Redis
If you’re using Redis as the broker, you can monitor the Celery
cluster using the redis-cli command to list lengths of queues.
Inspecting queues
Finding the number of tasks in a queue:
$ redis-cli -h HOST -p PORT -n DATABASE_NUMBER llen QUEUE_NAME
The default queue is named celery. To get all available queues,
invoke:
$ redis-cli -h HOST -p PORT -n DATABASE_NUMBER keys \*
Note
Queue keys only exists when there are tasks in them, so if a key doesn’t exist it simply means there are no messages in that queue.
This is because in Redis a list with no elements in it is
automatically removed, and hence it won’t show up in the keys command
output, and llen for that list returns 0. Also, if you’re using Redis
for other purposes, the output of the keys command will include
unrelated values stored in the database. The recommended way around
this is to use a dedicated DATABASE_NUMBER for Celery, you can also
use database numbers to separate Celery applications from each other
(virtual hosts), but this won’t affect the monitoring events used by
for example Flower as Redis pub/sub commands are global rather than
database based.
Now, my Celery configuration (in Django) has the following relevant part:
from kombu import Exchange, Queue

CELERY_QUEUES = (
    Queue('default', Exchange('default'), routing_key='default'),
    Queue('email', Exchange('email'), routing_key='email'),
    Queue('haystack', Exchange('haystack'), routing_key='haystack'),
    Queue('thumbnails', Exchange('thumbnails'), routing_key='thumbnails'),
)
So I tried this:
$ redis-cli -n 0 -h ${MY_REDIS_HOST} -p 6379 llen haystack
(yes, celery is configured to use redis database number 0)
I tried all 4 of my queues, and I always get 0, which simply can't be right. Some of these queues are usually very active, or my website wouldn't be working properly.
One key part of the documentation is that I can list the available queues, so I tried it:
$ redis-cli -n 0 -h ${MY_REDIS_HOST} -p 6379 keys \*
And I get about 20,000 lines of something like this:
celery-task-meta-b30fb605-d7b6-48db-b8cd-493458566876
celery-task-meta-e10ec56c-6601-420b-9f87-de6455968e76
celery-task-meta-14558a3a-1153-4f02-91f8-614bc29f6775
celery-task-meta-4c266854-512b-48af-8356-c786c507eb9e
celery-task-meta-e4ad4298-3d74-4986-8831-4c4d3c3e79f2
celery-task-meta-dfab0202-3975-46ce-9670-0d4cf3e278db
celery-task-meta-494fcb21-5995-495d-8980-0d8aa7edf0b8
celery-task-meta-345c4857-87f9-4e3f-8028-a6ef8cf93f5d
celery-task-meta-a4a48d00-68dc-4d30-87dd-869d2a20c347
celery-task-meta-d14fc394-6415-442b-8a5d-c9a4f37a9509
If I exclude all the celery-task-meta lines:
$ redis-cli -n 0 -h ${MY_REDIS_HOST} -p 6379 keys \* | grep -v celery-task-meta
I get this:
_kombu.binding.celeryev
_kombu.binding.default
_kombu.binding.thumbnails
_kombu.binding.email
unacked
_kombu.binding.celery.pidbox
_kombu.binding.haystack
unacked_index
_kombu.binding.reply.celery.pidbox
I tried to use the celery CLI to get the information, and this is some relevant output:
$ celery --app my-app inspect active_queues
-> celery#683a8e8bc84f: OK
* {'name': 'thumbnails', 'exchange': {'name': 'thumbnails', 'type': 'direct', 'arguments': None, 'durable': True, 'passive': False, 'auto_delete': False, 'delivery_mode': None, 'no_declare': False}, 'routing_key': 'thumbnails', 'queue_arguments': None, 'binding_arguments': None, 'consumer_arguments': None, 'durable': True, 'exclusive': False, 'auto_delete': False, 'no_ack': False, 'alias': None, 'bindings': [], 'no_declare': None, 'expires': None, 'message_ttl': None, 'max_length': None, 'max_length_bytes': None, 'max_priority': None}
-> celery#bf11d4c3bd6f: OK
* {'name': 'email', 'exchange': {'name': 'email', 'type': 'direct', 'arguments': None, 'durable': True, 'passive': False, 'auto_delete': False, 'delivery_mode': None, 'no_declare': False}, 'routing_key': 'email', 'queue_arguments': None, 'binding_arguments': None, 'consumer_arguments': None, 'durable': True, 'exclusive': False, 'auto_delete': False, 'no_ack': False, 'alias': None, 'bindings': [], 'no_declare': None, 'expires': None, 'message_ttl': None, 'max_length': None, 'max_length_bytes': None, 'max_priority': None}
-> celery#86151417b361: OK
* {'name': 'default', 'exchange': {'name': 'default', 'type': 'direct', 'arguments': None, 'durable': True, 'passive': False, 'auto_delete': False, 'delivery_mode': None, 'no_declare': False}, 'routing_key': 'default', 'queue_arguments': None, 'binding_arguments': None, 'consumer_arguments': None, 'durable': True, 'exclusive': False, 'auto_delete': False, 'no_ack': False, 'alias': None, 'bindings': [], 'no_declare': None, 'expires': None, 'message_ttl': None, 'max_length': None, 'max_length_bytes': None, 'max_priority': None}
-> celery#9a5360a82f14: OK
* {'name': 'haystack', 'exchange': {'name': 'haystack', 'type': 'direct', 'arguments': None, 'durable': True, 'passive': False, 'auto_delete': False, 'delivery_mode': None, 'no_declare': False}, 'routing_key': 'haystack', 'queue_arguments': None, 'binding_arguments': None, 'consumer_arguments': None, 'durable': True, 'exclusive': False, 'auto_delete': False, 'no_ack': False, 'alias': None, 'bindings': [], 'no_declare': None, 'expires': None, 'message_ttl': None, 'max_length': None, 'max_length_bytes': None, 'max_priority': None}
and
$ celery --app my-app inspect scheduled
-> celery#683a8e8bc84f: OK
- empty -
-> celery#86151417b361: OK
- empty -
-> celery#bf11d4c3bd6f: OK
- empty -
-> celery#9a5360a82f14: OK
- empty -
The command above seems to work well: if there are active tasks, they are shown there, even though in my copy/paste it says empty.
So, does anybody know what I might be doing wrong and why I can't get the real size of my queues?
Thanks!
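For reference, a programmatic equivalent of the redis-cli check from the question, as a minimal sketch with redis-py; the port, db number and queue names are taken from the question, the environment-variable lookup and fallback are assumptions.

import os
import redis

# Sketch only: the same LLEN check as redis-cli, done from Python
r = redis.Redis(host=os.environ.get('MY_REDIS_HOST', 'localhost'), port=6379, db=0)

for queue in ('default', 'email', 'haystack', 'thumbnails'):
    # LLEN returns 0 both for an empty list and for a missing key,
    # i.e. "no messages currently waiting in that queue"
    print(queue, r.llen(queue))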

Airflow: Current Task Failure, How To Mark Downstream As Skipped or No Status

I have a DAG where the last task is an EmailOperator that sends an informational "success" email. When the task before the EmailOperator fails and is marked State: failed, the last task (the EmailOperator) is shown as yellow (up_for_retry) in the UI, but is marked in the logs as State: upstream_failed.
Why does a task with upstream_failed get set to up_for_retry, at least in this specific case? Is there a way to ensure it is not up_for_retry and have it visualized as skipped or no_status?
Here is my current default_args setting:
from datetime import datetime, timedelta

default_args = {
    'owner': '<myname>',
    'depends_on_past': True,
    'start_date': datetime(2019, 9, 27),
    'email': ['<myemail>'],
    'email_on_failure': True,
    #'email_on_retry': False,
    'retries': 0
    #'retry_delay': timedelta(minutes=15)
}
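For context, a minimal sketch of the structure described above; the DAG and task names are hypothetical, only the shape (an upstream task followed by a final EmailOperator) comes from the question.

from airflow import DAG
from airflow.operators.python_operator import PythonOperator
from airflow.operators.email_operator import EmailOperator

# Sketch only: hypothetical names illustrating the described layout
with DAG('example_dag', default_args=default_args, schedule_interval='@daily') as dag:
    do_work = PythonOperator(task_id='do_work', python_callable=lambda: None)
    send_success_email = EmailOperator(
        task_id='send_success_email',
        to='<myemail>',
        subject='Pipeline succeeded',
        html_content='Informational success email',
    )
    # if do_work fails, send_success_email ends up in state upstream_failed
    do_work >> send_success_email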

How to avoid duplication of task execution in Celery? And how to assign a worker to a default queue

My task executes more than once within two minutes, although it is scheduled to run only once on the celery_beat queue.
I tried restarting Celery via supervisorctl (supervisorctl stop all, then starting everything again), but the duplication persists.
app.autodiscover_tasks(settings.INSTALLED_APPS, related_name='tasks')
app.conf.task_default_queue = 'default'
app.conf.task_routes = {'cloudtemplates.tasks.get_metrics': {'queue': 'metrics'}}
app.conf.beat_schedule = {
    'load-softlyer-machine-images': {
        'task': 'load_ibm_machine_images',
        'schedule': crontab(0, 0, day_of_month='13'),
        'args': '',
        'options': {'queue': 'celery_beat'},
    }
}
I expect the scheduled task to run only once, on the 13th of every month.
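For clarity, a sketch of the same schedule written with explicit keyword arguments (equivalent to the positional crontab(0, 0, ...) form above), plus a note on how a worker could be pointed at the named queues; the app name placeholder is an assumption.

from celery.schedules import crontab

# Sketch only: fires once, at 00:00 on the 13th of every month
app.conf.beat_schedule = {
    'load-softlyer-machine-images': {
        'task': 'load_ibm_machine_images',
        'schedule': crontab(minute=0, hour=0, day_of_month='13'),
        'args': (),                          # an empty tuple is the usual "no arguments" form
        'options': {'queue': 'celery_beat'},
    }
}

# A worker consuming the queues named in the configuration could be started with, for example:
#   celery -A <your_app> worker -Q default,celery_beat,metrics -l info
# Only one beat process should be running; a second beat would enqueue the schedule again.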

Karma tests fail with Browser DISCONNECTED error, exit code: 1

We have a total of 2016 unit test cases written with Jasmine and are using Karma to run them. The tests run for 1 min 30 sec to 2 min and then Karma suddenly disconnects from the browser. Here is a screenshot of the console logs.
The problem is that I am not able to diagnose why that is happening and which test case is causing it to get disconnected. I have tried different reporters of Karma to be able to identify the test case which forces it to disconnect from the browser but have been unsuccessful so far.
I have also tried running the tests in short batches to be able to drill down to the error test case (in case it is a test case error and not Karma configuration) but so far, the error has been thrown for all batches.
As per this post, I have tried setting the browserNoActivityTimeout as high as 10 minutes (600000 ms), but still no resolution. The post also mentions that there might be a problem with insufficient memory, so I have tried running the cases on one 8 GB RAM and one 16 GB RAM system (Windows 10 on both).
Here's the complete stack trace:
[02:06:48] Error: MyApp Chromebook Unit tests failed with exitCode: 1
at formatError (C:\Users\barnadeep.bhowmik\AppData\Roaming\npm\node_modules\gulp\bin\gulp.js:169:10)
at Gulp.<anonymous> (C:\Users\barnadeep.bhowmik\AppData\Roaming\npm\node_modules\gulp\bin\gulp.js:195:15)
at emitOne (events.js:96:13)
at Gulp.emit (events.js:188:7)
at Gulp.Orchestrator._emitTaskDone (C:\Users\barnadeep.bhowmik\Desktop\Projects\MyProject\test-player-15-may\myapp-chrome\node_modules\orchestrator\index.js:264:8)
at C:\Users\barnadeep.bhowmik\Desktop\Projects\MyProject\test-player-15-may\myapp-chrome\node_modules\orchestrator\index.js:275:23
at finish (C:\Users\barnadeep.bhowmik\Desktop\Projects\MyProject\test-player-15-may\myapp-chrome\node_modules\orchestrator\lib\runTask.js:21:8)
at cb (C:\Users\barnadeep.bhowmik\Desktop\Projects\MyProject\test-player-15-may\myapp-chrome\node_modules\orchestrator\lib\runTask.js:29:3)
at C:\Users\barnadeep.bhowmik\Desktop\Projects\MyProject\test-player-15-may\myapp-chrome\build\tasks\test.js:18:13
at removeAllListeners (C:\Users\barnadeep.bhowmik\Desktop\Projects\MyProject\test-player-15-may\myapp-chrome\node_modules\karma\lib\server.js:336:7)
at Server.<anonymous> (C:\Users\barnadeep.bhowmik\Desktop\Projects\MyProject\test-player-15-may\myapp-chrome\node_modules\karma\lib\server.js:347:9)
at Server.g (events.js:291:16)
at emitNone (events.js:91:20)
at Server.emit (events.js:185:7)
at emitCloseNT (net.js:1555:8)
at _combinedTickCallback (internal/process/next_tick.js:71:11)
at process._tickCallback (internal/process/next_tick.js:98:9)
Here's my config file:
module.exports = function(config) {
  config.set({
    basePath: '',
    frameworks: ['jasmine'],
    files: [
      'bower_components/jquery/dist/jquery.js',
      'node_modules/angular/angular.js',
      'other_dependencies/**.*.js',
      'src/app/app.js',
      'src/app/pack1-components/**/*.js',
      'src/app/pack2-components/**/*.js',
      'src/**/*.html'
    ],
    exclude: [
      'src/some-folder/*',
    ],
    port: 8081,
    logLevel: config.LOG_INFO,
    autoWatch: false,
    browsers: ['ChromeNoSandbox'], // temp fix for the Chrome browser ('Chrome')
    customLaunchers: {
      ChromeNoSandbox: {
        base: 'Chrome',
        flags: ['--no-sandbox']
      }
    },
    reporters: ['spec', 'progress', 'coverage', 'html'],
    specReporter: {
      maxLogLines: 5,              // limit number of lines logged per test
      suppressErrorSummary: false, // print the error summary
      suppressFailed: false,       // print information about failed tests
      suppressPassed: false,       // print information about passed tests
      suppressSkipped: true,       // do not print information about skipped tests
      showSpecTiming: false,       // do not print the time elapsed for each spec
      failFast: true               // finish with an error as soon as the first test fails
    },
    preprocessors: {
      'src/**/*.js': ['coverage'],
      'src/**/*.html': ['ng-html2js']
    },
    coverageReporter: {
      type: 'lcov',
      dir: 'qualityreports/testresults/unit/coverage/'
    },
    htmlReporter: {
      outputFile: 'qualityreports/testresults/unit/testresults.html'
    },
    browserNoActivityTimeout: 600000,
    captureTimeout: 60000,
    browserDisconnectTimeout: 60000,
    browserDisconnectTolerance: 1,
    ngHtml2JsPreprocessor: {},
    plugins: [
      'karma-jasmine',
      'karma-chrome-launcher',
      'karma-coverage',
      'karma-htmlfile-reporter',
      'karma-ng-html2js-preprocessor',
      'karma-spec-reporter'
    ],
    singleRun: true
  });
};
Here is a similar post but it did not have all details, hence posting mine. Any help would be deeply appreciated.