I have a Django application.
One of my models looks like this:
from django.db import models

class MyModel(models.Model):
    def house_cleaning(self):
        # clean up data of the model instance
        ...
Every time I update an instance of MyModel, I need to clean up its data N days later. So I'd like to schedule a job to call
this_instance.house_cleaning()
N days from now.
Is there any job queue that would allow me to:
Integrate well with Django - allow me to call a method of individual model instances
Only run jobs that are scheduled to run today
Ideally handle failures gracefully
Thanks
django-chronograph might be good for your use case. If you write your cleanup jobs as Django management commands, you can schedule them to run at a given time. It runs using unix cron behind the scenes.
Is there any reason why a cron job wouldn't work? Or something like django-cron that behaves the same way? It's pretty easy to write stand-alone Django scripts. If you want to trigger house cleaning some number of days after a change to your model, why not add a date field to the model that is set to N days in the future whenever the job needs to be scheduled? You could then run a script on a daily basis which pulls all records where the date is <= today, calls each instance's house_cleaning() method, and clears the date field, as sketched below. If an exception is raised during the process, it's easy enough to log it or dispatch an email.
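A minimal sketch of that approach as a management command, assuming a nullable cleanup_due date field on the model (the field name and app layout are placeholders; cron would invoke the command daily):

# myapp/management/commands/house_cleaning.py
import logging

from django.core.management.base import BaseCommand
from django.utils import timezone

from myapp.models import MyModel  # assumes cleanup_due = models.DateField(null=True, blank=True)

logger = logging.getLogger(__name__)

class Command(BaseCommand):
    help = "Run house_cleaning() on instances whose cleanup date has arrived"

    def handle(self, *args, **options):
        due = MyModel.objects.filter(cleanup_due__lte=timezone.now().date())
        for instance in due:
            try:
                instance.house_cleaning()
                instance.cleanup_due = None  # clear the flag so it isn't picked up again
                instance.save(update_fields=["cleanup_due"])
            except Exception:
                logger.exception("house_cleaning failed for pk=%s", instance.pk)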
Related
I have a Django backend that I've created for a real estate company. I've built quite a few hardcoded tasks, and I want to make those tasks customizable via the admin page... but I can't quite figure out how to do that.
For example, let's say that I wanted to create a task that would send an email that could be customized from the admin page. Ideally, I'd have a list of triggers to choose from, like a contact form submission.
I had the very same need on many occasions, and solved it as follows:
I have an abstract "base" Task model with a few fields common to any "generic" background task: created_on, started_on, completed_on, status, progress, failure_reason and a few more
When a new specific task is needed, I write:
a job function to be run by the scheduler (Celery in your case)
a concrete Model derived from Task
The derived Model knows which job has to be run; this is hardcoded in the Model definition. I also add specific fields to collect any custom parameters required by the job
Now, you can create a new task either programmatically or from the Django admin, supplying actual parameters as needed; in the latter case, Django provides the required form and validation as usual
After saving the new record in the database, the model kicks off the job, passing the task (model) id; from the job you can retrieve the task's details, which is the answer to the original question.
You can also update the progress/status in the model for later inspection, and use other services provided by the base class (for example, logging).
This scheme has proven very useful since, as an added benefit, you can monitor async tasks from the Django admin.
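A minimal sketch of the pattern (field names and the rq wiring are illustrative, not the exact django-task API):

from django.db import models
import django_rq  # assumes the rq-based setup mentioned below

class Task(models.Model):
    created_on = models.DateTimeField(auto_now_add=True)
    started_on = models.DateTimeField(null=True, blank=True)
    completed_on = models.DateTimeField(null=True, blank=True)
    status = models.CharField(max_length=20, default="pending")
    progress = models.PositiveIntegerField(default=0)
    failure_reason = models.TextField(blank=True)

    class Meta:
        abstract = True

def send_email_job(task_id):
    # retrieve the task's details from the id passed by the model
    task = SendEmailTask.objects.get(pk=task_id)
    ...  # do the work, updating task.status / task.progress as it goes

class SendEmailTask(Task):
    recipient = models.EmailField()  # task-specific parameter

    def save(self, *args, **kwargs):
        is_new = self.pk is None
        super().save(*args, **kwargs)
        if is_new:
            # kick off the hardcoded job, passing the task (model) id
            django_rq.enqueue(send_email_job, self.pk)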
Having used it in several projects, I encapsulated this logic in a reusable Django app:
https://github.com/morlandi/django-task
The current implementation is based on rq, as Celery is over-engineered for my needs. I guess it can be adapted to Celery with some modifications:
remove Task.check_worker_active_for_queue()
remove Task.get_queue()
refactor Task.run()
Additionally, the helper Job class must be refactored as follows:
replace rq.get_current_job() with the equivalent for Celery
Unfortunately, I haven't used Celery recently, so I can't give more detailed advice.
Is there a way to add an expiry date to a Huey dynamic periodic task?
Just like there is an option in Celery to add an expiry date when creating a task:
some_celery_task.apply_async(args=('foo',), expires=expiry_date)
I want to add the expiry date when creating the Huey dynamic periodic task. I tried revoke(); it worked as it was supposed to, but I want the task to stop completely after the expiry date, not just be revoked. When a Huey dynamic periodic task is revoked, a message is displayed in the Huey terminal saying the function is revoked (whenever the crontab condition becomes true).
(I am using Huey in Django)
(Extra)
What I did to meet this expiry-date need:
I created a function that returns day-month pairs for crontab. For example, for start date 2021-1-20 and end date 2021-6-14, the function returns Days_Month: [['20-31',1], ['*','2-5'], ['1-14','6']]. Then I call the Huey dynamic periodic task once per pair (three times in this case).
(The function returns the day-month pairs as per the requirement: daily, weekly, monthly, or repeating every n days.)
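For reference, one way such a helper could be written (a sketch that assumes start and end fall in the same year):

import calendar
from datetime import date

def days_month_pairs(start, end):
    # Single-month range: one pair covering the days in that month.
    if start.month == end.month:
        return [[f"{start.day}-{end.day}", start.month]]
    # Partial first month, any full months in between, partial last month.
    last_day = calendar.monthrange(start.year, start.month)[1]
    pairs = [[f"{start.day}-{last_day}", start.month]]
    if end.month - start.month > 1:
        pairs.append(["*", f"{start.month + 1}-{end.month - 1}"])
    pairs.append([f"1-{end.day}", end.month])
    return pairs

# days_month_pairs(date(2021, 1, 20), date(2021, 6, 14))
# -> [['20-31', 1], ['*', '2-5'], ['1-14', 6]]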
Is there a better way to do this?
Thank you for the help.
The best solution will depend on how often you need periodic tasks with a specific end date, but the ideal approach probably involves your database.
I would create a database model (let's call it Job) with fields for your end_date, a next_execution_date and a field that indicates the interval between repetitions (like x days).
You would then create a periodic task with huey that runs every day (or even every hour/minute if you need finer grain of control). Every time this periodic task runs you would then go over all your Job instances and check whether their next_execution_date is in the past. If so, launch a new huey task that actually executes the functionality you need to have periodically executed per Job instance. On success, you calculate the new next_execution_date using the interval.
So whenever you want a new Job with a new end_date, you can just create this in the django admin (or make an interface for it) and you would set the next_execution_date as the first date where you want it to execute.
Your final solution would thus have the Job model and two huey-decorated functions: one for the periodic task that merely checks whether Job instances need to be executed and updates their next_execution_date, and another that actually executes the periodic functionality per Job instance. This way you don't have to do any manual cancelling, and you only need one periodic task that runs indefinitely but doesn't actually execute anything if there are no Job instances that need to be run.
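A minimal sketch of this setup, assuming Huey's Django contrib helpers (model and field names are illustrative):

from datetime import timedelta

from django.db import models
from django.utils import timezone
from huey import crontab
from huey.contrib.djhuey import db_periodic_task, db_task

class Job(models.Model):
    end_date = models.DateTimeField()
    next_execution_date = models.DateTimeField()
    interval_days = models.PositiveIntegerField(default=1)

@db_periodic_task(crontab(minute="0"))  # the checker: runs hourly, forever
def dispatch_due_jobs():
    now = timezone.now()
    for job in Job.objects.filter(next_execution_date__lte=now, end_date__gte=now):
        execute_job(job.pk)  # enqueue the actual work per Job instance

@db_task()
def execute_job(job_id):
    job = Job.objects.get(pk=job_id)
    ...  # the functionality to execute periodically for this Job
    job.next_execution_date += timedelta(days=job.interval_days)
    job.save(update_fields=["next_execution_date"])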
Note: this will only be a reasonable approach if you have multiple of these tasks and you potentially want to control the end_dates in your interface.
Does anyone know a way to generate a record in a table without a user interacting with the system?
I need to generate something similar to a notification or reminder, populated with data obtained from other tables, something like a report.
Thank you
To run periodic tasks, you will need some sort of task scheduler, like Celery or Huey. With that in place, you can create and save instances of whatever model you have in mind from the task scripts, and the task scheduler will repeat the job periodically.
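For example, with Huey in a Django project, a sketch like this (the Reminder and Order models and the query are placeholders):

from huey import crontab
from huey.contrib.djhuey import db_periodic_task

from myapp.models import Order, Reminder  # hypothetical models

@db_periodic_task(crontab(hour="6", minute="0"))  # every day at 06:00
def create_daily_reminder():
    # gather data from other tables and store it as a new record
    pending = Order.objects.filter(status="pending").count()
    Reminder.objects.create(text=f"{pending} orders are still pending")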
We want to collect data during the day and create a User Task once a day. How can that be done with Camunda? Is it possible to use process variables, or do we need to access our own database and mark the corresponding items as processed (as soon as we create the daily user task)?
Do we need to create these user tasks programmatically? (We are using an embedded Spring Boot Camunda instance.)
One very good option would be to use a Timer Start Event per the documentation here: https://docs.camunda.org/manual/7.10/reference/bpmn20/events/timer-events/#timer-start-event.
It seems that you may want to use that in conjunction with a Timer Intermediate Catching Event (https://docs.camunda.org/manual/7.10/reference/bpmn20/events/timer-events/#timer-intermediate-catching-event) in something like the following manner:
Start a process instance at a specific time in the morning with the Timer Start Event. Perhaps 6:30AM in your local time zone?
Execute certain steps to gather data, perhaps through external service invocations, etc.
At a specific time (in the afternoon?), create the User Task and present the data. The User Task could follow the Timer Intermediate Catching Event noted above.
I hope this helps!
We've got a pretty typical django app running on postgresql 9.0. We've recently discovered some db queries that have run for over 4 hours, due to inefficient searches in the admin interface. While we plan to fix these queries, as a safeguard we'd like to artificially constrain database query time to 15 seconds--but only in the context of a web request; batch jobs and celery tasks should not be bounded by this constraint.
How can we do that? Or is it a terrible idea?
The best way to do this would be to set up a role/user that is only used to run the web requests, then set the statement_timeout on that role.
ALTER ROLE role_name SET statement_timeout = 15000
All other roles will use the global setting of statement_timeout (which is disabled in a stock install).
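In a Django project that could look like the following in the settings used by the web processes (role names are placeholders; batch jobs and Celery workers would connect with a different role that keeps the global setting):

# settings.py for the web processes only
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql_psycopg2",
        "NAME": "myapp",
        "USER": "web_user",  # role configured with: ALTER ROLE web_user SET statement_timeout = 15000
        "PASSWORD": "...",
        "HOST": "localhost",
    }
}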
You will need to handle this manually, that is, by checking for the 15-second rule and killing the queries that violate it.
Query pg_stat_activity to find the violators, and issue calls to pg_terminate_backend(procpid) to kill the offenders.
Something like this in a loop:
SELECT pg_terminate_backend(pg_stat_activity.procpid)
FROM pg_stat_activity
WHERE pg_stat_activity.datname = 'TARGET_DB'
AND usename = 'WEBUSERNAME'
AND (now()-query_start) > '00:00:15';
As far as the timing goes, you could pass all of your queries through a class which, on instantiation, spawns two threads: one for the query, and one for a timer. If the timer reaches 15 seconds, then kill the thread with the query.
As far as figuring out whether the query comes from a web request, I don't know enough about Django to be able to help you. Simplistically, in the class that handles your database calls, an optional constructor parameter could be something like context, set to "http" in the event of a web request and "" for anything else.
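Note that Python threads can't be killed directly, so a sketch of this idea would cancel the server-side query instead, for example with psycopg2's connection.cancel() on a timer (the 15-second limit and function name are illustrative):

import threading

def run_with_timeout(conn, sql, params=None, timeout=15.0):
    # conn is a psycopg2 connection; cancel() may be called from another thread
    timer = threading.Timer(timeout, conn.cancel)
    timer.start()
    try:
        with conn.cursor() as cur:
            cur.execute(sql, params)  # raises QueryCanceled if the timer fires
            return cur.fetchall()
    finally:
        timer.cancel()  # stop the timer once the query completes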