Background tasks with Django on Heroku - django

So I'm building a Django app on Heroku. The tasks that the app performs frequently run longer than 30 seconds, and therefore I'm running into the 30s timeout by Heroku. I first tried solving it by submitting the task from my Django view to AWS lambda, but in that case, the view is waiting for the AWS Lambda function to finish, so it doesn't solve my problem.
I have already read the tutorials on Heroku on handling background tasks with Django. I'm now faced with a few different options on how to proceed, and would love to get outside input on which one makes the most sense:
Use Celery & Redis to manage the background tasks, and let the tasks be executed on AWS Lambda.
Use Celery & Redis to manage the background tasks, but let the tasks be executed in a Python script on Heroku.
Trying to solve it with asyncio in order to keep it leaner (not sure whether that specific case could be solved with asyncio, though?
Maybe there's an even better solution that I don't see?
Looking forward to any input/suggestions!

Related

How to run long tasks in the background od django without pausing the execution of the app

I want to know how to run independently a very long task that takes probably 2 minutes in the backend of django. I used threading in python and it works but as soon as i execute another task in the main django project the task in the background stops and doesn't finish executing.
Celery and django background tasks have the same issue as well, i tried them and it didn't work.
So please if anyone has an idea how to do that, help me!!!!
thanks so much in advance
Celery has capability to run background task, and it don't overrides other task, i have used and it works perfectly fine, can you re check documentation of celery, you can also check on django-celery-beat for perodic task scheduling

on heroku, celery beat database scheduler doesn’t run periodic tasks

I have an issue where django_celery_beat’s DatabaseScheduler doesn’t run periodic tasks. Or I should say where celery beat doesn’t find any tasks when the scheduler is DatabaseScheduler. In case I use the standard scheduler the tasks are executed regularly.
I setup celery on heroku by using a dyno for worker and one for beat (and one for web, obviously).
I know that beat and worker are connected to redis and to postgres for task results.
Every periodic task I run from django admin by selecting a task and “run selected task” gets executed.
However, it is about two days that I’m trying to figure out why there isn’t a way for beat/worker to find that I scheduled a task to execute every 10 seconds, or using a cron (even restarting beat and remot doesn’t change it).
I’m kind of desperate, and my next move would be to give redbeat a try.
Any help on how to how to troubleshoot this particular problem would be greatly appreciated. I suspect the problem is in the is_due method. I am using UTC (in celery and django), all cron are UTC based. All I see in the beat log is “writing entries..” every on and then.
I’ve tried changing celery version from 4.3 to 4.4 and django celery beat from 1.4.0 to 1.5.0 to 1.6.0
Any help would be greatly appreciated.
In case it helps someone who's having or will have a similar trouble as ours: to recreate this issue, it is possible to create a task as simple as:
#app.task(bind=True)
def test(self, arg):
print(kwargs.get("notification_id"))
then, in django admin, enter the task editing and put something in the extra args field. Or, viceversa, the task could be
#app.task(bind=True)
def test(self, **kwargs):
print(notification_id)
And try to pass a positional argument. While locally this breaks, on Heroku's beat and worker dyno, this somehow slips away unnoticed, and django_celery_beats stop processing any task whatsoever in the future. The scheduler is completely broken by a "wrong" task.

Django celery, celery-beat: fills the queue without control, scheduling troubles

I have small project with couple of tasks to run several times a day.
The project is based on Django 2.1, having celery 4.2.1 and django-celery-beat 1.3.0. And also have rabbitmq installed.
Each task is inside it's projects application. Runs, works, gives some result.
The problem is - on virtual server, leased from some provider, if I set any task to run periodically (each hour, or two)- it starts running immidiately, without end and, as i suppose in some kind of parallel threads, wish mesh each other.
Command rabbintmqctl list_queues name messages_unacknowldged always shows 8 in queue celery. Purging the queue celery does not give any changes. Restarting service - too.
But setting tasks schedule to run in exact time works good. Well, almost good. Two tasks have schedule to run in the beginning of different hours (even and odd). But both run in about 30 minutes after hour beginning, of the same (odd) hour. At least tasks don't run more times in a day than set in schedule. But it is still something wrong.
As a newbie with rabbitmq and celery don't know where to look for solution. Official celery docs didn't help me. May be was not looking in right place. Any help help or advice would be good. Thanks.
It seems this is bug of django-celery-beat - https://github.com/celery/celery/issues/4041.
If anyone have already made any solution for this - please inform.

How to record all tasks information with Django and Celery?

In my Django project I'm using Celery with a RabbitMQ broker for asynchronous tasks, how can I record the information of all of my tasks (e.g. created time (task appears in queue), worker consume task time, execution time, status, ...) to monitor how Celery is doing?
I know there are solutions like Flower but that seems to much for what I need, django-celery-results looks like what I want but it's missing a few information I need like task created time.
Thanks!
It seems like you often find the answer yourself after asking on SO. I settled with using celery signals to do all the recording I want and store the results in a database table.

Back end tasks and frontend separation

I use Django and Django Rest Framework for my internal API and I use Vue.js for my frontend. The backend (API) and the frontend are totally separated.
I need to run a background task (every time a user is created) and I am considering 2 solutions:
Call (with a post_save signal) a function that runs the task.
Note that this function will call a 3rd party API. The call might fail for various reasons and/or run during a long period ~20sec.
Create a background task
With Redis or RabbitMQ or django-background-tasks.
Which solution should I go for ?
If both solutions are acceptable, what would be the limitations/advantages of each one ?
You might need django celery. This is a great package for background tasks for django, you can choose either Redis or RabbitMQ as the broker, where the brokers doesn't matter much on my opinion.
Why can this be a good solution for your problem?
This is easy to install where you just need to install the django celery and redis(I prefer redis), configure some settings and you have now async functions.
You might need soon a scheduled task, where you just need to install its additional package.
You only need to build type function and attach a decorator for it to be async.
from celery import shared_task
#shared_task
def add(x,y):
return X+y
and call it anywhere in you code
add.delay()
you know how background task.