Custom metrics from celery workers into prometheus - django

I have a few Celery workers running in containers under Kubernetes. They are not autoscaled by Celery and each runs in a single process (i.e. no multiprocessing). I would like to get a bunch of different metrics from them into Prometheus. I've looked at celery-prometheus-exporter (unmaintained) and celery-exporter, but they are focused on metrics at the Celery level rather than app-level metrics inside the Celery workers.
It looks like the two options would be either to find some hacky way to get app-level metrics into celery-prometheus-exporter, which would then make them available to Prometheus, or to use Pushgateway.
Which is better, or is there another option I've missed?

Just use the default client and let it run the http server in a thread.
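A minimal sketch of that approach, assuming the prometheus_client package and a single-process worker (metric name, port, and broker URL are placeholders):

# Expose app-level metrics directly from the worker process.
from celery import Celery
from celery.signals import worker_ready
from prometheus_client import Counter, start_http_server

app = Celery("myapp", broker="redis://redis:6379/0")  # broker URL is an assumption

ITEMS_PROCESSED = Counter("app_items_processed_total",
                          "App-level items processed by this worker")

@worker_ready.connect
def expose_metrics(**kwargs):
    # start_http_server runs the scrape endpoint in a daemon thread inside
    # the worker process, so Prometheus can scrape each pod directly.
    start_http_server(9100)

@app.task
def process_item(item_id):
    # ... do the actual work, then record the app-level metric ...
    ITEMS_PROCESSED.inc()

Since each worker is a single process, there is no need for prometheus_client's multiprocess mode.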

Related

Background tasks with Django on Heroku

So I'm building a Django app on Heroku. The tasks the app performs frequently run longer than 30 seconds, so I'm hitting Heroku's 30-second request timeout. I first tried solving it by submitting the task from my Django view to AWS Lambda, but in that case the view waits for the AWS Lambda function to finish, so it doesn't solve my problem.
I have already read the tutorials on Heroku on handling background tasks with Django. I'm now faced with a few different options on how to proceed, and would love to get outside input on which one makes the most sense:
Use Celery & Redis to manage the background tasks, and let the tasks be executed on AWS Lambda.
Use Celery & Redis to manage the background tasks, but let the tasks be executed in a Python script on Heroku (roughly as in the sketch after this list).
Try to solve it with asyncio in order to keep it leaner (though I'm not sure whether this specific case can even be solved with asyncio?).
Maybe there's an even better solution that I don't see?
Looking forward to any input/suggestions!
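For reference, the second option usually boils down to something like this minimal sketch (long_running_job, the start_job view, and the REDIS_URL variable are hypothetical); the view only enqueues the work and returns immediately, while a worker dyno executes the task:

import os
from celery import Celery
from django.http import JsonResponse

app = Celery("heroku_app", broker=os.environ.get("REDIS_URL", "redis://localhost:6379/0"))

@app.task
def long_running_job(payload):
    ...  # the work that takes longer than 30 seconds lives here

def start_job(request):
    # The view returns as soon as the task is queued, so it stays well under
    # Heroku's 30-second request timeout.
    result = long_running_job.delay(request.GET.get("payload"))
    return JsonResponse({"task_id": result.id})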

Celery queue length doesn't match Redis keys

I'm using Celery workers in Google Kubernetes Engine with Redis as the broker. I'd like to be able to scale my Celery GKE deployment based on the number of queued tasks in Celery. I'm able to do this with Horizontal Pod Autoscaling using an external metric (redis.googleapis.com/keyspace/keys), but the metric itself isn't reporting what I expected.
I can use this code to get the current length of the Celery queue:
with celery_app.pool.acquire(block=True) as conn:
    print(conn.default_channel.client.llen("celery"))
I can also use Google Cloud Monitoring to look at the number of keys in Redis. I expected these numbers to match, but they do not. The number of keys in Redis is very few, usually around 10 keys, with occasional spikes up to hundreds or a thousand keys that last only a couple minutes. However, running the code above very often reports a queue length in the thousands, even when Redis is reporting only ten or so keys.
Clearly I'm misunderstanding how Redis and Celery interact, and I'm hoping someone can enlighten me. I'd also love to get suggestions on how to scale GKE pods based on Celery queue length, if using the Redis key count metric is wrong. Thanks!
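For reference, a quick way to look at both numbers side by side with plain redis-py (the connection URL is an assumption); Celery's default queue is a single Redis list, so LLEN counts queued messages while the total key count stays small:

import redis

r = redis.Redis.from_url("redis://localhost:6379/0")
print("total keys:", r.dbsize())          # roughly what the keyspace/keys metric reflects
print("queued tasks:", r.llen("celery"))  # length of the one list Celery uses as its queue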

How to record all tasks information with Django and Celery?

In my Django project I'm using Celery with a RabbitMQ broker for asynchronous tasks. How can I record the information for all of my tasks (e.g. created time (when the task appears in the queue), when a worker consumes the task, execution time, status, ...) to monitor how Celery is doing?
I know there are solutions like Flower, but that seems like too much for what I need; django-celery-results looks like what I want, but it's missing some information I need, like the task created time.
Thanks!
It seems like you often find the answer yourself right after asking on SO. I settled on using Celery signals to do all the recording I want and store the results in a database table.
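Roughly, the signals approach looks like this sketch (the TaskRecord model and its fields are hypothetical):

from celery.signals import before_task_publish, task_prerun, task_postrun
from django.utils import timezone

from myapp.models import TaskRecord  # hypothetical model: task_id, name, created_at, started_at, finished_at, status

@before_task_publish.connect
def record_created(sender=None, headers=None, **kwargs):
    # Fires in the publishing process when the task is put on the queue.
    TaskRecord.objects.create(task_id=headers["id"], name=sender,
                              created_at=timezone.now(), status="PENDING")

@task_prerun.connect
def record_started(task_id=None, **kwargs):
    # Fires in the worker just before the task starts executing.
    TaskRecord.objects.filter(task_id=task_id).update(
        started_at=timezone.now(), status="STARTED")

@task_postrun.connect
def record_finished(task_id=None, state=None, **kwargs):
    # Fires in the worker after the task finishes; state is e.g. SUCCESS or FAILURE.
    TaskRecord.objects.filter(task_id=task_id).update(
        finished_at=timezone.now(), status=state)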

Django-Celery - Work in parallel on repetitive requests

I am not familiar with Django-Celery, so I would like to know if it is the right tool for what I need before going deeper into the docs.
My Django app has a web service for tiling map images that is called like this: http://host.com/tiling/x/y/z.png
x, y and z are integer variables that are used by the tiling function to compute the output.
My question is: can Django-Celery create workers for parallel processing of this tiling function when repetitive requests are detected?
For instance, 10 or more requests could be sent by a user at a time: http://host.com/tiling/0/1/1.png, http://host.com/tiling/1/0/1.png etc...
Can Django-Celery create workers for each of them in parallel instead of computing each request one by one? What are the requirements on the server side? Do I need something like NGINX or Gunicorn or WSGI or CGI? I am confused about those things...
In most cases Celery is used for asynchronous task handling, but it also works for concurrent tasks!
By default Celery uses multiprocessing, but you can also use Eventlet, a concurrent networking library for Python.
Reference:
http://docs.celeryproject.org/en/latest/userguide/concurrency/eventlet.html#concurrency-eventlet
http://docs.celeryproject.org/en/latest/userguide/workers.html#concurrency
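As a minimal sketch of what that could look like here (module name, broker URL, and the body of render_tile() are placeholders), each request becomes a task, and the worker's process pool, or an Eventlet pool, executes queued tasks concurrently:

# tiles.py
from celery import Celery

app = Celery("tiles", broker="redis://localhost:6379/0")  # broker URL is an assumption

def render_tile(x, y, z):
    # stand-in for the real tiling computation that produces the PNG bytes
    return b"..."

@app.task
def make_tile(x, y, z):
    return render_tile(x, y, z)

# The Django view calls make_tile.delay(x, y, z) for each incoming request, and a
# worker started with e.g.  celery -A tiles worker --concurrency=10  (or -P eventlet)
# processes the queued tiles in parallel instead of one by one.

NGINX/Gunicorn still serve the HTTP requests as usual; the Celery workers are separate processes that only pull tasks from the broker.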

Why doesn't CeleryCAM work with Amazon SQS?

I'm using Celery 2.4.6 and django-celery 2.4.2.
When I configure Celery to use Amazon SQS per the resolution on this question: Celery with Amazon SQS
I don't see anything in the celerycam table in the Django admin. If I switch back to RabbitMQ, the tasks start showing up again.
I have a lot of (now 40+) queues in SQS named something like "celeryev-92e068c4-9390-4c97-bc1d-13fd6e309e19", which look like they might be related (some of the older ones even have an event in them), but nothing's showing up in the database and I see no errors in the celerycam log.
Any suggestions on what the issue might be or how to debug this further would be much appreciated.
SQS is a limited messaging service compared to a full AMQP broker. As I understand it, it doesn't support PUB/SUB broadcasting the way, say, RabbitMQ does, which is necessary for events to work properly. SNS was put in place to support broadcasting, but it's a separate system.
Some libraries/packages out there use SimpleDB as a message-model store as a hack on top of SQS to emulate proper AMQP behavior, but apparently Celery does not have a full hack in place yet.