How to use Django admin to create custom tasks using Celery

I have a Django backend that I've created for a real estate company. I've built quite a few hardcoded tasks, and I want to make those tasks customizable via the admin page... but I can't quite figure out how to do that.
For example, let's say that I wanted to create a task that would send an email that could be customized from the admin page. Ideally, I'd have a list of triggers to choose from, like a contact form submission.
Something that looked like this:

I had the very same need on many occasions, and solved it as follows:
I have an abstract "base" Task model with a few fields common to any "generic" background task: created_on, started_on, completed_on, status, progress, failure_reason and a few more
When a new specific task is needed, I write:
a job function to be run by the scheduler (Celery in your case)
a concrete Model derived from Task
The derived Model knows which job has to be run; this is hardcoded in the Model definition. I also add specific fields to collect any custom parameters required by the job
Now, you can create a new task either programmatically or from the Django admin, supplying actual parameters as needed; in the latter case, Django provides the required form and validation as usual
After saving the new record in the database, the model kicks off the job, passing it the task (model) id; from the job, you can retrieve the task's details, which is the answer to the original question.
You can also update the progress/status in the model for later inspection, and use other services provided by the base class (for example, logging).
This scheme has proven very useful since, as an added benefit, you can monitor async tasks from the Django admin.
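To make the pattern concrete, here is a minimal sketch; the base fields match the ones listed above, while SendEmailTask, its parameter fields, and send_email_task_job are hypothetical names for illustration:

    from celery import shared_task
    from django.db import models
    from django.utils import timezone

    class Task(models.Model):
        # fields common to any "generic" background task
        created_on = models.DateTimeField(auto_now_add=True)
        started_on = models.DateTimeField(null=True, blank=True)
        completed_on = models.DateTimeField(null=True, blank=True)
        status = models.CharField(max_length=20, default="PENDING")
        progress = models.PositiveSmallIntegerField(default=0)
        failure_reason = models.TextField(blank=True)

        class Meta:
            abstract = True

    class SendEmailTask(Task):
        # task-specific parameters, filled in via the Django admin form
        recipient = models.EmailField()
        subject = models.CharField(max_length=200)
        body = models.TextField()

        def save(self, *args, **kwargs):
            is_new = self.pk is None
            super().save(*args, **kwargs)
            if is_new:
                # kick off the job once the record exists, passing only the id
                send_email_task_job.delay(self.pk)

    @shared_task
    def send_email_task_job(task_id):
        # retrieve the task's details from the database
        task = SendEmailTask.objects.get(pk=task_id)
        task.status, task.started_on = "STARTED", timezone.now()
        task.save(update_fields=["status", "started_on"])
        # ... do the actual work here, updating task.progress as it goes ...
        task.status, task.completed_on = "SUCCESS", timezone.now()
        task.save(update_fields=["status", "completed_on"])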
Having used it in several projects, I encapsulated this logic in a reusable Django app:
https://github.com/morlandi/django-task
The current implementation is based on rq, as Celery is over-engineered for my needs. I guess it can be adapted to Celery with some modifications:
remove Task.check_worker_active_for_queue()
remove Task.get_queue()
refactor Task.run()
Additionally, the helper Job class must be refactored as follows:
replace rq.get_current_job() with the equivalent for Celery
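For that last point, a rough, untested sketch of the substitution (Celery exposes the running task through the celery.current_task proxy):

    from celery import current_task

    def get_current_job_id():
        # rq equivalent: rq.get_current_job().id
        # current_task resolves to None outside a worker context
        return current_task.request.id if current_task else None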
Unfortunately, I haven't used Celery recently, so I can't give more detailed advice.

Related

Making backup of db skipping part of data in django [duplicate]

Is it possible to selectively filter which records Django's dumpdata management command outputs? I have a few models, each with millions of rows, and I only want to dump records in one model fitting a specific criteria, as well as all foreign-key linked records referencing any of those records.
Consider this use-case. Say I had a production database where my User model has millions of records. I have several other models (Log, Transaction, Purchase, Bookmarks, etc) all referencing the User model. I want to do development on my Django app, and I want to test using realistic data. However, my production database is so enormous, I can't realistically take a snapshot of the entire thing and load it locally. So ideally, I'd want to use dumpdata to dump 50 random User records, and all related records to JSON, and use that to populate a development database.
Is there an easy way to accomplish this?
I think django-fixture-magic might be worth a look.
You'll find some additional background info in Scrubbing your Django database.
This snippet might be helpful for you (it follows relationships and serializes them):
http://djangosnippets.org/snippets/918/
You could also use that management command and override the default managers for whichever models you would like to return custom querysets for.
This isn't a simple answer to my question, but I found some interesting docs on Django's built-in natural keys feature, which would allow representing serialized records without the primary key. Unfortunately, it doesn't look like this is fully integrated into dumpdata, and there's an old outstanding ticket to fully rely on natural keys.
It also seems the serializers.serialize() function allows serialization of an arbitrary list of specific model instances.
Presumably, if I implemented a natural_key() method on all my models and then called serializers.serialize('json', User.objects.filter(criteria)), it would come close to accomplishing what I want. I might have to write a function to crawl all the FK references and include those in the objects passed to serialize().
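A sketch of that idea, assuming User has a natural_key() method and using the Purchase model from the example above:

    from itertools import chain
    from django.core import serializers

    users = list(User.objects.order_by("?")[:50])       # 50 random users
    purchases = Purchase.objects.filter(user__in=users)  # crawl one FK relation
    data = serializers.serialize(
        "json",
        chain(users, purchases),
        use_natural_foreign_keys=True,   # emit FKs as natural keys
        use_natural_primary_keys=True,   # omit auto PKs entirely
    )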
This is a very old question, but I recently wrote a custom management command to do just that. It looks very similar to the existing dumpdata command except that it takes some extra arguments to define how I want to filter the querysets and it overrides the get_objects function to perform the actual filtering:
from itertools import chain

def get_objects(dump_attributes, dump_values):
    # `options` holds the extra filter arguments parsed by the command
    qs_1 = ModelClass1.objects.filter(**options["filter_options_for_model_class_1"])
    qs_2 = ModelClass2.objects.filter(**options["filter_options_for_model_class_2"])
    # ...repeat for as many different model classes as you want to dump...
    yield from chain(qs_1, qs_2)
I had the same problem, but I didn't want to add another package, and the snippet still didn't let me filter my data; I just wanted a temporary solution.
So I thought to myself: why not override the default manager, apply my filter there, take the dump, and then revert the code? This is of course very hacky and dangerous, but in my case it made sense.
Yes, I had to edit the code in vim on the live server, but you don't need to reload the server, since running a command through manage.py uses your current code base; from the end user's perspective, the server basically remained untouched.
from django.db import models

class DahlBookManager(models.Manager):
    def get_queryset(self):
        return super().get_queryset().filter(is_edited=False)

class FriendshipQuestion(models.Model):
    objects = DahlBookManager()
Then running the dumpdata command did exactly what I needed, which was returning all the unedited questions in my case.
Afterwards, I ran git checkout mymodelfile.py to revert it to the original.
This is by no means a good solution, but it will get somebody either fired or unstuck.
As of Django 3.2, you can use dumpdata to dump a specific app and/or model. For example, for an app named customer:
python manage.py dumpdata customer
or, to dump a model named shoppingcart within the customer app:
python manage.py dumpdata customer.shoppingcart
There are many options with dumpdata, including writing to several output file formats and handling custom managers on models. For example:
python manage.py dumpdata customer --all --indent 4 --output my_fixtures.json
The options:
--all: dumps the records even if you use a custom manager on the model
--indent: number of spaces to indent by when writing to a file
--output: sends output to a file instead of stdout; the default format is JSON
See the docs at:
https://docs.djangoproject.com/en/3.2/ref/django-admin/#dumpdata

How to create a log database in django to record the user that created, updated or deleted an object?

I want to record in a database the user who created, updated, or deleted an object, using Django. I've found a simple solution with thread locals and a logging abstract class here: Why is using thread locals in Django bad? (but that approach is discouraged).
The problem with that solution is that it makes it extremely difficult to write any unit test. So what would be a better solution for logging event-based information about the user who created, updated, or deleted an object in Django?
You can try django-simple-history (https://django-simple-history.readthedocs.io/en/latest/querying_history.html).
It provides model history in the Django admin and lets you query history from Python code.
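A minimal sketch of its usage (the model and field names are illustrative):

    from django.db import models
    from simple_history.models import HistoricalRecords

    class Poll(models.Model):
        question = models.CharField(max_length=200)
        # creates a parallel history table recording every create/update/delete
        history = HistoricalRecords()

    # Each history record carries history_date, history_type and history_user.
    # To populate history_user automatically from request.user, add
    # "simple_history.middleware.HistoryRequestMiddleware" to MIDDLEWARE.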

Beginner Questions about django-viewflow

I am working on my first django-viewflow project, and I have some very basic questions. I have looked at the docs and the cookbook examples.
My question is which fields go into the "normal" Django models (models.Model) and which fields go into the Process models. For example, I am building a publishing flow: a document that is uploaded starts in a private state, goes into a pending state after some processing, and then an editor can update the document's state to published, making the document available through the front-facing web site. I would assume the state field (private, pending, published) is part of a process model, but what about the other fields related to the document (author, date, source, topic, etc.)? Do they go into the process model or the models.Model model? Does it matter? What are the considerations in building the models and flows for separating data between the two types of models?
Another example: why, in the Hello World example, is the text field in the Process model and not in a models.Model model? This field does not seem to have anything to do with the process, but I am probably not understanding how viewflow works.
Thanks!
Mark
That's your choice. Viewflow is a library and imposes no restrictions on how you arrange your data. The only thing that needs to exist is the link between process_pk and the process data. HelloWorld is the minimal working sample that demonstrates a workflow.
You can put everything in a separate model and provide an FK to it in the Process model.
But the state field itself is an antipattern, since eventually you can have several tasks executing in parallel, and even a sequential workflow can change constantly as new tasks are added or deleted. You can keep just a published boolean or DateTime field in the post model and filter on that in the front end.
The general rule could be: keep all human workflow decisions in the Process model, build all data models in a declarative way, and keep the workflow separate from the actual data.
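A sketch of that separation, assuming django-viewflow 1.x (the Document fields come from the question; the names and the flow class are illustrative and omitted):

    from django.db import models
    from viewflow.models import Process

    class Document(models.Model):
        # pure data: no workflow state lives here
        author = models.CharField(max_length=100)
        topic = models.CharField(max_length=100)
        published_at = models.DateTimeField(null=True, blank=True)  # front end filters on this

    class PublishProcess(Process):
        # human workflow decisions only, linked to the data via an FK
        document = models.ForeignKey(Document, on_delete=models.CASCADE)
        approved = models.BooleanField(default=False)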

Create an object from Model B while editing an object from Model A?

Hi, I'm pretty new to Django, and I'm trying to create an app with a Project model and a Task model. Every project has one or more tasks. What I want to do is add a TaskHistory model, where every task has one or more TaskHistory records. Every time I change something in a task (this will be something I'll have to edit a lot), I want a new associated TaskHistory to be created. Is this possible?
It's possible, but something that is likely to be coded at the application/view layer rather than directly in the Django models. What you're attempting appears to closely match the log entries that are created as part of the standard django.contrib.admin application so you should look in this app for ideas. django.contrib.admin will log an entry to a LogEntry table every time an object is updated, created or deleted in the admin interface.
It's likely for your application that you'll need to store changes in model content such as a change in a task description, not simply whether a task was created. To achieve this, you're likely to require both the current task object and the updated task details to be able to create a TaskHistory object.
Each view that is capable of modifying a task would also include logic to create a TaskHistory object, and would save both the updated/new Task and the TaskHistory objects as independent model objects, possibly wrapped in a database-level transaction so the changes are applied atomically.
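A sketch of such a view, assuming Task, TaskForm and a TaskHistory model with these hypothetical fields are defined in your app:

    from django.db import transaction
    from django.shortcuts import get_object_or_404, redirect, render

    def update_task(request, pk):
        task = get_object_or_404(Task, pk=pk)
        old_description = task.description        # snapshot before the form mutates it
        form = TaskForm(request.POST or None, instance=task)
        if request.method == "POST" and form.is_valid():
            with transaction.atomic():            # save both objects atomically
                task = form.save()
                TaskHistory.objects.create(
                    task=task,
                    previous_description=old_description,
                    changed_by=request.user,
                )
            return redirect("task-detail", pk=task.pk)
        return render(request, "tasks/update.html", {"form": form})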

Filtering of data according to user logged in Django app

I have a Django app that works well for me, but currently has no notion of user: I am the only one using it, and I would like to change this...
Except for the admin views, the logged-in user should not have access to the data created by other users. There is no shared data between users.
I suppose I have to add a user foreign key to all the models I created. Correct?
Is there a simple way to implement the filtering based on request.user? Can this be done more or less automatically, or do I have to go through all the code to check each and every query done on the database?
I have written the code using TDD, and I intend to follow up... What are the best strategies to ensure that user-filtering is implemented correctly, e.g. that I did not forget to filter an existing query? I suppose I can write tests that show that a particular query is not yet filtered, and implement the filter. But what about the queries that I will write later? Is there a way I can assert that all existing and future queries return objects that only belong to the current user?
Thanks.
Yes, you'll need to add a User FK. Don't forget you'll have to migrate your database tables, either manually or via a tool like South.
One way of implementing the filter would be to define custom Managers for your models, with a for_user method that takes the User as an argument: something like:
from django.db import models

class ForUserManager(models.Manager):
    def for_user(self, user):
        return self.filter(user=user)
Now you can use this manager, subclassed and/or via a mixin as necessary, on all your models, and remember to use objects.for_user(request.user) everywhere.
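For example, in a view (Project is a hypothetical model using ForUserManager):

    from django.shortcuts import render

    def project_list(request):
        projects = Project.objects.for_user(request.user)
        return render(request, "projects/list.html", {"projects": projects})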
This will make testing easier too - your test could monkeypatch that for_user method so that it sets a flag or a counter in a global variable somewhere, and then test that it has incremented as expected.
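A sketch of that idea using unittest.mock rather than a global counter (the URL and model names are the hypothetical ones from above):

    from unittest import mock
    from django.test import TestCase

    class FilteringTest(TestCase):
        def test_list_view_goes_through_for_user(self):
            # wrap the real method so behaviour is unchanged but calls are recorded
            with mock.patch.object(
                ForUserManager, "for_user",
                autospec=True, side_effect=ForUserManager.for_user,
            ) as spy:
                self.client.get("/projects/")
            self.assertTrue(spy.called)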
Edit in response to comment: No, as you suspect, that won't work. It's not even that everyone will necessarily get the last-logged-in user; it's that Managers are class-level attributes, and as such are reused throughout a process, so any request served by that server process will use the same one.