Django Celery, celery-beat: fills the queue without control, scheduling troubles

I have a small project with a couple of tasks that run several times a day.
The project is based on Django 2.1, with Celery 4.2.1 and django-celery-beat 1.3.0. RabbitMQ is also installed.
Each task lives inside its own project application. It runs, works, and gives some result.
The problem is: on a virtual server leased from a provider, if I set any task to run periodically (every hour or two), it starts running immediately and without end, and, as I suppose, in some kind of parallel threads which mess with each other.
The command rabbitmqctl list_queues name messages_unacknowledged always shows 8 messages in the celery queue. Purging the celery queue does not change anything. Neither does restarting the service.
But scheduling tasks to run at an exact time works, well, almost. Two tasks are scheduled to run at the beginning of different hours (even and odd), but both run about 30 minutes after the start of the same (odd) hour. At least the tasks don't run more times per day than scheduled, but something is still wrong.
As a newbie with RabbitMQ and Celery I don't know where to look for a solution. The official Celery docs didn't help me; maybe I was not looking in the right place. Any help or advice would be appreciated. Thanks.

It seems this is a bug in django-celery-beat - https://github.com/celery/celery/issues/4041.
If anyone has already found a solution for this, please share it.

Related

Running a particular function 2 hours after the occurrence of an event in Django

I am currently working on a Django project. The use case is: when I add a device object to the database, if it does not come online within the first 2 hours after being added, I need to delete that device object from the database.
I have written a function delete_device_from_db() which deletes the device if it does not come online. How do I invoke this function exactly 2 hours after the device is added?
In our project we are using Celery to run background tasks and periodic tasks.
What is the best way to solve this? Can it be solved using Celery?
Don't run a Celery task just to keep it paused for 2 hours. That's not a good approach: it will keep a Celery worker occupied. What happens if you have a few devices going offline? It will totally jam your application.
What I recommend is to have a Celery beat task that runs every 5 minutes and checks the last-updated datetime of each device:
import datetime
from django.utils import timezone

if (timezone.now() - device.updated) > datetime.timedelta(hours=2):
    device.delete_device_from_db()
P.S. I'm assuming here that your device model has an updated field. If not, add it:
updated = models.DateTimeField(auto_now=True)
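Putting the pieces together, a minimal sketch of the periodic check might look like the following. It assumes a Device model with the updated field above plus an is_online flag (an assumption, not from the question), and uses the ORM's delete() in place of the question's delete_device_from_db() helper; the module path is illustrative.

# myapp/tasks.py - illustrative sketch, names are assumptions
import datetime
from celery import shared_task
from django.utils import timezone
from myapp.models import Device  # hypothetical app/model path

@shared_task
def delete_stale_devices():
    # Remove every device added more than 2 hours ago that never came online.
    # is_online is an assumed field on the model.
    cutoff = timezone.now() - datetime.timedelta(hours=2)
    for device in Device.objects.filter(updated__lt=cutoff, is_online=False):
        device.delete()

The task would then be registered in django-celery-beat (or CELERY_BEAT_SCHEDULE) to run every 5 minutes.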

How to run long tasks in the background of Django without pausing the execution of the app

I want to know how to run a very long task (about 2 minutes) independently in the backend of Django. I used threading in Python and it works, but as soon as I execute another task in the main Django project, the task in the background stops and doesn't finish executing.
Celery and Django background tasks have the same issue as well; I tried them and it didn't work.
So please, if anyone has an idea how to do that, help me!
Thanks so much in advance.
Celery is capable of running background tasks, and one task does not override another; I have used it and it works perfectly fine. Can you re-check the Celery documentation? You can also look at django-celery-beat for periodic task scheduling.
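For reference, a minimal sketch of a Celery background task, assuming a standard Celery app already configured for the Django project (the module, task and project names are hypothetical):

# myapp/tasks.py - illustrative sketch
import time
from celery import shared_task

@shared_task
def long_running_job(item_id):
    # Runs in the worker process, so the Django view that queued it is not blocked.
    time.sleep(120)  # stand-in for the real two-minute computation
    return item_id

Calling long_running_job.delay(some_id) from a view queues the task and returns immediately; a worker started with "celery -A proj worker" picks it up independently of the web process.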

on heroku, celery beat database scheduler doesn’t run periodic tasks

I have an issue where django_celery_beat’s DatabaseScheduler doesn’t run periodic tasks. Or I should say where celery beat doesn’t find any tasks when the scheduler is DatabaseScheduler. In case I use the standard scheduler the tasks are executed regularly.
I set up celery on heroku by using a dyno for the worker and one for beat (and one for web, obviously).
I know that beat and worker are connected to redis and to postgres for task results.
Every periodic task I run from django admin by selecting a task and “run selected task” gets executed.
However, for about two days I've been trying to figure out why beat/worker can't find that I scheduled a task to execute every 10 seconds, or via a cron (even restarting beat and the worker doesn't change it).
I’m kind of desperate, and my next move would be to give redbeat a try.
Any help on how to troubleshoot this particular problem would be greatly appreciated. I suspect the problem is in the is_due method. I am using UTC (in celery and django), and all crontabs are UTC based. All I see in the beat log is "writing entries.." every now and then.
I've tried changing the celery version from 4.3 to 4.4 and django-celery-beat from 1.4.0 to 1.5.0 to 1.6.0.
Any help would be greatly appreciated.
In case it helps someone who is having, or will have, a similar problem to ours: to recreate this issue, it is possible to create a task as simple as:
@app.task(bind=True)
def test(self, arg):
    print(kwargs.get("notification_id"))  # kwargs is intentionally undefined: this is a "wrong" task
then, in the django admin, open the task for editing and put something in the extra args field. Or, vice versa, the task could be:
@app.task(bind=True)
def test(self, **kwargs):
    print(notification_id)  # notification_id is intentionally undefined
and try to pass a positional argument. While this breaks locally, on Heroku's beat and worker dynos it somehow slips by unnoticed, and django_celery_beat stops processing any task whatsoever from then on. The scheduler is completely broken by a single "wrong" task.
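For completeness, a version where the signature and the scheduled arguments actually match (simply combining the two fragments above) would be something like:

@app.task(bind=True)
def test(self, **kwargs):
    # Works when "notification_id" is supplied in the task's keyword arguments field.
    print(kwargs.get("notification_id"))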

Running a quick async task with Django+Gunicorn

We have coded a system that uses Django + Celery, where our Celery tasks take a few minutes each to complete.
I'm looking for a quick, easy-to-use method for running an immediate async task (a few seconds) when a user logs in, without having to go through the Celery system (where queued tasks may take ages to finish).
I have read similar questions on S.O., but they were referring to Apache+uWSGI rather than Gunicorn. Also, questions regarding Gunicorn mentioned that greenlets are blocking.
This answer suggests using threads or multiprocessing, but I am confused: will those options work with Gunicorn, or will they cause it to hang or crash? What about using fork?
I think I found a solution: I should use celery's "Task Routing" and set up:
A queue for slow tasks
A queue for quick tasks
And two (or more) workers, one of which only executes the quick tasks.
See the example in this sample code (change "windows" to "slow" or "quick" as needed); a rough configuration sketch also follows below.
(original presentation here)
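A minimal sketch of what that routing setup could look like, assuming a project named proj and hypothetical task paths (the queue names and task modules are illustrative, not taken from the sample code):

# celery configuration - illustrative routing
# (in Django settings with the CELERY_ namespace this would be CELERY_TASK_ROUTES)
task_routes = {
    "myapp.tasks.post_login_check": {"queue": "quick"},
    "myapp.tasks.generate_big_report": {"queue": "slow"},
}

# One worker per queue, so quick tasks never wait behind slow ones:
#   celery -A proj worker -Q quick --concurrency=2 -n quick@%h
#   celery -A proj worker -Q slow --concurrency=4 -n slow@%h

Tasks are still called with .delay() as usual; the routing decides which worker's queue they land in.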

AWS Elastic Beanstalk Worker timing out after inactivity during long computation

I am trying to use Amazon Elastic Beanstalk to run a very long numerical simulation - up to 20 hours. The code works beautifully when I tell it to do a short, 20 second simulation. However, when running a longer one, I get the error "The following instances have not responded in the allowed command timeout time (they might still finish eventually on their own)".
After browsing the web, it seems to me that the issue is that Elastic Beanstalk allows worker processes to run for 30 minutes at most, and then they time out because the instance has not responded (i.e. finished the simulation). The solution some have proposed is to send a message every 30 seconds or so that "pings" Elastic Beanstalk, letting it know that the simulation is going well so it doesn't time out, which would let me run a long worker process. So I have a few questions:
Is this the correct approach?
If so, what code or configuration would I add to the project to make it stop terminating early?
If not, how can I smoothly run a 12+ hour simulation on AWS or more generally, the cloud?
Additional information
Thank you for the feedback, Rohit. To give some more information, I'm using Python with Flask.
• I am indeed using an Elastic Beanstalk worker tier with SQS queues
• In my code, I'm running a simulation of variable length - from as short as 20 seconds to as long as 20 hours. 99% of the work that Elastic Beanstalk does is running the simulation. The other 1% involves saving results, sending emails, etc.
The simulation itself involves generating many random numbers and working with objects that I defined. I use numpy heavily here.
Let me know if I can provide any more information. I really appreciate the help :)
After talking to a friend who knows more about this stuff than I do, I solved the problem. It's a little sketchy, but it got the job done. For future reference, here is an outline of what I did:
1) Wrote a main script that uses Amazon's boto library to connect to my SQS queue, with an infinite while loop that polls the queue every 60 seconds. When there's a message on the queue, it runs a simulation and then continues with the loop (a rough sketch of this loop follows below).
2) Borrowed a beautiful /etc/init.d/ template to run my script as a daemon (http://blog.scphillips.com/2013/07/getting-a-python-script-to-run-in-the-background-as-a-service-on-boot/)
3) Made my main script and the script in (2) executable
4) Set up a cron job to make sure the script would start back up if it failed.
Once again, thank you Rohit for taking the time to help me out. I'm glad I still got to use Amazon even though Elastic Beanstalk wasn't the right tool for the job.
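A rough sketch of the polling loop from step 1. The original used the older boto library; this sketch uses boto3 instead, and the queue URL and run_simulation() are placeholders:

# sqs_poller.py - illustrative sketch of step 1 (boto3 instead of the original boto)
import time
import boto3

sqs = boto3.client("sqs", region_name="us-east-1")
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/simulations"  # placeholder

def run_simulation(body):
    ...  # the long-running numerical simulation

while True:
    resp = sqs.receive_message(QueueUrl=QUEUE_URL, MaxNumberOfMessages=1, WaitTimeSeconds=20)
    for msg in resp.get("Messages", []):
        run_simulation(msg["Body"])
        # Delete only after the simulation finished successfully.
        sqs.delete_message(QueueUrl=QUEUE_URL, ReceiptHandle=msg["ReceiptHandle"])
    time.sleep(60)  # poll roughly every minute, as in the outline above

Note that for a 20-hour simulation the message's visibility timeout will expire long before delete_message is called, so SQS may redeliver it; another answer below discusses that 12-hour visibility-timeout limitation.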
From your question it seems you are running into launches timing out because some commands that run on your instance during launch take more than 30 minutes.
As explained here, you can adjust the Timeout option in the aws:elasticbeanstalk:command namespace, which can have values between 1 and 1800. This means that if your commands finish within 30 minutes you won't see this error. The commands might eventually finish, as the error message says, but since Elastic Beanstalk has not received a response within the specified period it does not know what is going on on your instance.
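For example, an .ebextensions config file along these lines (the filename is illustrative) would raise the command timeout to the 1800-second value mentioned above:

# .ebextensions/00-command-timeout.config
option_settings:
  aws:elasticbeanstalk:command:
    Timeout: 1800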
It would be helpful if you could add more details about your use case. What commands are you running during startup? Apparently you are using ebextensions to launch commands that take a long time. Is it possible to run those commands in the background, or do you need them to run during server startup?
If you are running a Tomcat web app, you could also use something like the servlet init method to run app bootstrapping code. That code can take as long as it needs without giving you this error message.
Unfortunately, there is no way to 'process a message' from an SQS queue for more than 12 hours (see the description of the visibility timeout and the ChangeMessageVisibility action).
That being the case, this approach doesn't fit your application well. I have run into the same problem.
The correct way to do this: I don't know. However, I would suggest an alternative approach where you grab a message off your queue, spin off a thread or process to run your long-running simulation, and then delete the message (signaling successful processing). With this approach, be careful not to spin off too many threads on one machine, and also be wary of machines shutting down before the simulation has ended, because the queue message will already have been deleted.
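A minimal sketch of that delete-then-process pattern, reusing the hypothetical boto3 names (sqs, QUEUE_URL, run_simulation) from the sketch in the earlier answer:

# Delete the message first, then run the simulation in a child process.
from multiprocessing import Process

resp = sqs.receive_message(QueueUrl=QUEUE_URL, MaxNumberOfMessages=1, WaitTimeSeconds=20)
for msg in resp.get("Messages", []):
    # Deleting immediately means SQS never redelivers after the 12-hour limit...
    sqs.delete_message(QueueUrl=QUEUE_URL, ReceiptHandle=msg["ReceiptHandle"])
    # ...at the cost of losing the job if this machine dies mid-simulation.
    Process(target=run_simulation, args=(msg["Body"],)).start()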
Final note: your question is excellently worded and sufficiently detailed :)
For those looking to run jobs shorter than 10 hours, it should be mentioned that the current inactivity timeout limit is 36000 seconds, i.e. exactly 10 hours, and no longer the 30 minutes mentioned in posts all over the web (which led me to think a workaround like the one described above was needed).
Check out the docs: https://docs.aws.amazon.com/elasticbeanstalk/latest/dg/using-features-managing-env-tiers.html
A very nice write-up can be found here: https://dev.to/rizasaputra/understanding-aws-elastic-beanstalk-worker-timeout-42hi