How to capture complicated database transactions over multiple forms in Django

I need to capture some fairly complicated database changes from my users, including both updating and creating objects for multiple models.
I feel like the obvious way to do this would be by leveraging a sizeable amount of Javascript to create a JSON object containing all the necessary changes that can be POSTed in a single form. I am not keen on this approach as it prevents me from utilizing Django's CreateView and UpdateView classes, as well as the validation that comes with them. Also I am more comfortable in Python than Javascript.
I want to use a series of form POSTs to build up the necessary changes over time, but also need the transaction to be atomic, which, as far as I know, is not possible in Django. Another complication is that the models contain non-nullable fields and I would need to create objects before capturing the user input required to fill them. I do not want to make these fields nullable or use placeholders as this would make it more difficult to validate.
One approach I am considering is to create a duplicate of each of the necessary models to store partial objects. All fields would be nullable so the objects could be updated a bit at a time until all the forms have been POSTed. Objects in the original (main) model could then be created or updated to match the ones in the new (partial) model, which could then be deleted.
class Product(models.Model):
    field_a = models.CharField(max_length=255)
    field_b = models.PositiveIntegerField()
class PartialProduct(models.Model):
    field_a = models.CharField(max_length=255, blank=True, null=True)
    field_b = models.PositiveIntegerField(blank=True, null=True)
The benefits of this approach, as I see them, are:
A multi-form approach, leveraging Django's model forms and related views as well as model validation.
Not polluting the main models with incomplete objects.
Enforcing fields not being null in the main models.
The potential drawbacks I can see are:
Duplicating any changes to the main model in the partial model (the approach is not DRY).
It is a somewhat complicated approach (Simple is better than complex)
Are there any drawbacks to using this approach that I have not foreseen, or is there a better one I could use?
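The partial-model idea from the question can be sketched without Django to show the one step that matters: promoting a partial object to a main object only once every required field is present. This is a minimal illustration using plain dataclasses as stand-ins for the two models; the promote function is a hypothetical helper, not part of Django, and a real version would run inside a database transaction.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class Product:
    # Stand-in for the main model: every field is required.
    field_a: str
    field_b: int


@dataclass
class PartialProduct:
    # Stand-in for the partial model: every field may still be missing.
    field_a: Optional[str] = None
    field_b: Optional[int] = None


def promote(partial: PartialProduct) -> Product:
    """Build a Product from a PartialProduct, refusing incomplete data.

    Hypothetical helper: it walks the fields of Product and rejects the
    promotion if any of them is still None on the partial object.
    """
    values = {f.name: getattr(partial, f.name) for f in fields(Product)}
    missing = sorted(name for name, value in values.items() if value is None)
    if missing:
        raise ValueError("incomplete fields: " + ", ".join(missing))
    return Product(**values)


draft = PartialProduct()
draft.field_a = "widget"  # filled in by the first form
draft.field_b = 3         # filled in by a later form
product = promote(draft)
```

Note that the field list is still written twice, once per class, which is the DRY drawback the question already mentions.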

Related

Django: Two models with OneToOneField vs a single model

Let's imagine I have a simple model Recipe:
class Recipe(models.Model):
    name = models.CharField(max_length=constants.NAME_MAX_LENGTH)
    preparation_time = models.DurationField()
    thumbnail = models.ImageField(default=constants.RECIPE_DEFAULT_THUMBNAIL, upload_to=constants.RECIPE_CUSTOM_THUMBNAIL_LOCATION)
    ingredients = models.TextField()
    description = models.TextField()
I would like to create a view listing all the available recipes where only name, thumbnail, preparation_time and first 100 characters of description will be used. In addition I will have a dedicated view to render all remaining details for a single recipe.
From the efficiency point of view, since description may be a long text, would it make sense to store the extra information in a separate model, let's say 'RecipeDetails', which would not be fetched in the list view but only in the detail view (maybe using the prefetch_related method)? I am thinking about something along these lines:
class Recipe(models.Model):
    name = models.CharField(max_length=constants.NAME_MAX_LENGTH)
    preparation_time = models.DurationField()
    thumbnail = models.ImageField(default=constants.DEFAULT_THUMBNAIL, upload_to=constants.CUSTOM_THUMBNAIL_LOCATION)
    description_preview = models.CharField(max_length=100)
class RecipeDetails(models.Model):
    recipe = models.OneToOneField(Recipe, related_name="details", primary_key=True)
    ingredients = models.TextField()
    description = models.TextField()
In my recent online searches people seem to suggest that OneToOneField should be used only for two purposes: 1. inheritance and 2. extending existing models. In other cases the two models should be merged into one. This may suggest I am missing something here. Is this a reasonable use of OneToOneField, or does it only add to the complexity of the overall design?
inheritance
Don't do that, because inheritance is only useful if you have a base class/subclass relationship. The classic example is animal and cat/dog, in which the cats and dogs all share some basic properties that can be extracted, but your Recipe and RecipeDetails don't.
"From the efficiency point of view, since description may be a long text, would it make sense to store the extra information in a separate model"
Storing extra information in a separate model doesn't improve efficiency. The underlying database would create something like a ForeignKey column with unique=True to enforce uniqueness. As far as I'm concerned, OneToOneField is only useful when your original model is hard to change, e.g. it comes from a third-party package, or in some other awkward situation. Otherwise I would still add the fields to the Recipe model. That way you can manage your model easily and avoid extra lookups like recipe.recipedetail.description; you can just do recipe.description.
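The point about what the database ends up with can be illustrated with the standard-library sqlite3 module. This is only a sketch under assumptions: the table and column names are made up for the example, and the exact SQL that Django emits depends on the backend and version. It shows a one-to-one link as a foreign key that is also unique, and that reading the description then requires a join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE recipe (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE recipe_details (
        -- The one-to-one link: a foreign key that is also the primary key,
        -- so it is unique by construction.
        recipe_id INTEGER PRIMARY KEY REFERENCES recipe(id),
        description TEXT NOT NULL
    );
""")
conn.execute("INSERT INTO recipe (id, name) VALUES (1, 'Pancakes')")
conn.execute(
    "INSERT INTO recipe_details (recipe_id, description) VALUES (1, 'Mix and fry.')"
)

# A second details row for the same recipe violates the uniqueness.
try:
    conn.execute(
        "INSERT INTO recipe_details (recipe_id, description) VALUES (1, 'Duplicate')"
    )
    second_row_rejected = False
except sqlite3.IntegrityError:
    second_row_rejected = True

# Reading the description now needs a join across both tables.
row = conn.execute(
    "SELECT r.name, d.description FROM recipe r "
    "JOIN recipe_details d ON d.recipe_id = r.id"
).fetchone()
```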
No, it's not reasonable to split your Recipes. First, your model should contain all the properties that make it a "Recipe" (and a recipe without ingredients is not a recipe at all). Second, if you want to improve performance, use Django's cache framework (it was created exactly for addressing performance issues). Third, keep it simple and do not over-engineer your development cycle. Do you really need to improve performance right now?
Hope it helps!
First mistake in development: you are thinking about efficiency before your first version is running.
Get a first version running now; later you can think about making it faster, based on real use cases from that first version. After that you can check whether a separate model with relations, a new field on the model, or the Django cache for views can do the work.
Thinking about efficiency first will, by the way, denormalize your database: whenever the model holding the full description is updated, you also need to update the model holding the "description-preview" field. A trigger at the database level? Python code at the application level? Nightmares in code design ... before your code even runs.

What are the pros and cons of using GenericForeignKey vs multitable inheritance vs OneToOneField?

Context
I am in the process of modeling my data using Django models.
The main model is an Article. It holds the actual content.
Then each Article must be attached to a group of articles. Those groups may be a Blog, a Category, a Portfolio or a Story. Every Article must be attached to exactly one of those. That is, either a blog, a category or a story. Those models have very different fields and features.
I thought of three ways to reach that goal (and a bonus one that really looks wrong).
Option #1: A generic foreign key
As in django.contrib.contenttypes.fields.GenericForeignKey. It would look like this:
class Category(Model):
    # some fields
class Blog(Model):
    # some fields
class Article(Model):
    group_type = ForeignKey(ContentType)
    group_id = PositiveIntegerField()
    group = GenericForeignKey('group_type', 'group_id')
    # some fields
On the database side, that means no relation actually exists between the models; it is enforced only by Django.
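What "no relation at the database level" means in practice can be shown with sqlite3. This is a simplified sketch with assumed table names: the group_type column is plain text here rather than a foreign key to Django's content type table, and the point is only that group_id has no REFERENCES clause, so the database accepts a row pointing at a group that does not exist.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE blog (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    CREATE TABLE article (
        id INTEGER PRIMARY KEY,
        group_type TEXT NOT NULL,   -- which table the group lives in
        group_id INTEGER NOT NULL   -- plain integer, no REFERENCES clause
    );
""")
conn.execute("INSERT INTO blog (id, title) VALUES (1, 'Dev blog')")

# Blog 999 does not exist, yet the insert succeeds.
conn.execute("INSERT INTO article (group_type, group_id) VALUES ('blog', 999)")

# Count articles whose group cannot be found.
dangling = conn.execute(
    "SELECT COUNT(*) FROM article a "
    "LEFT JOIN blog b ON a.group_type = 'blog' AND a.group_id = b.id "
    "WHERE b.id IS NULL"
).fetchone()[0]
```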
Option #2: Multitable inheritance
Make article groups all inherit from an ArticleGroup model. This would look like this:
class ArticleGroup(Model):
    group_type = ForeignKey(ContentType)
class Category(ArticleGroup):
    # some fields
class Blog(ArticleGroup):
    # some fields
class Article(Model):
    group = ForeignKey(ArticleGroup)
    # some fields
On the database side, this creates an additional table for ArticleGroup, then Category and Blog have an implicit foreign key to that table as their primary key.
Sidenote: I know there is a package that automates the bookkeeping of such constructions.
Option #3: manual OneToOneFields
On the database side, it is equivalent to option #2. But in the code, all relations are made explicit:
class ArticleGroup(Model):
    group_type = ForeignKey(ContentType)
class Category(Model):
    id = OneToOneField(ArticleGroup, primary_key=True)
    # some fields
class Blog(Model):
    id = OneToOneField(ArticleGroup, primary_key=True)
    # some fields
class Article(Model):
    group = ForeignKey(ArticleGroup)
    # some fields
I don't really see what the point of that would be, apart from making explicit what Django's inheritance magic implicitly does.
Bonus: multicolumn
It seems pretty dirty so I just add it as a bonus, but it would also be possible to define a nullable ForeignKey to each of Category, Blog, ... directly on the Article model.
So...
...I cannot really decide between those. What are the pros and cons of each approach? Are there some best practices? Did I miss a better approach?
If that matters, I'm using Django 1.8.
It seems no one had advice to share on that one.
I eventually chose the multicolumn option, despite having said it looked ugly. It all came down to 3 things:
Database-based enforceability.
The way Django ORM works with the different constructs.
My own needs (namely, collection queries on the group to get the item list, and individual queries on the items to get the group).
Option #1
Cannot be enforced at the database level.
Could be efficient on queries because the way it is constructed does not fall into usual generic foreign key pitfalls. Those happen when the items are generic, not the collections.
However, due to how the ORM handles GFK, it is impossible to use a custom manager, which I need because my articles are translated using django-hvad.
Option #2
Can be enforced at the database level.
Could be somewhat efficient, but runs into limitations of the ORM, which is clearly not built around this use, unless I use extra() or custom queries a lot, and at some point there is no reason to use an ORM anymore.
Option #3
Would actually be a bit better than #2, as making things explicit allows easier query optimisation while using the ORM.
Multicolumn
Turns out not to be so bad. It can be enforced at the database level (FK constraints plus a manual CHECK to ensure only one of the columns is non-null).
Easy and efficient. A single intuitive query does the job: select_related('category', 'blog', ...).
Though it does have the issue of being harder to extend (any new type will require altering the Article's table as well) and limiting the possible number of types, I'm unlikely to run into those.
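The multicolumn layout with its manual CHECK can be sketched with sqlite3. The table names and the exact form of the constraint are assumptions for illustration; the boolean arithmetic in the CHECK works in SQLite, and other databases may need a different expression. Only two group types are shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE category (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE blog (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE article (
        id INTEGER PRIMARY KEY,
        title TEXT NOT NULL,
        category_id INTEGER REFERENCES category(id),
        blog_id INTEGER REFERENCES blog(id),
        -- Exactly one of the two group columns must be set.
        CHECK ((category_id IS NOT NULL) + (blog_id IS NOT NULL) = 1)
    );
""")
conn.execute("INSERT INTO category (id, name) VALUES (1, 'News')")
conn.execute("INSERT INTO blog (id, name) VALUES (1, 'Dev blog')")


def try_insert(title, category_id, blog_id):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute(
            "INSERT INTO article (title, category_id, blog_id) VALUES (?, ?, ?)",
            (title, category_id, blog_id),
        )
        return True
    except sqlite3.IntegrityError:
        return False


results = {
    "one group": try_insert("ok", 1, None),
    "no group": try_insert("orphan", None, None),
    "two groups": try_insert("both", 1, 1),
    "missing blog": try_insert("dangling", None, 42),
}
```

Adding a new group type means adding a column and rewriting the CHECK, which is the extensibility cost mentioned in the answer.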
Hope it helps anyone with the same dilemma, and still interested in hearing other opinions.

Creating Dynamic Forms in Django

I'm working on a project that involves a form with some standard fields and some custom fields that users define later. The standard fields are defined on a model in models.py. For example:
class Order(models.Model):
    number = models.TextField()
    date = models.DateField()
I then use this model to create a simple model form to make a way to fill in the information. That's pretty standard Django.
The tricky thing is that my users want to be able to add arbitrary fields to the form. They would like to be able to use the Admin interface to basically modify the form and add values to it at run time.
So, they might want a new text field called "Tracking Number" or something like that. The trick is that they only need it sometimes and they want to be able to add it dynamically without rebuilding the whole database.
I can create a fairly simple model to represent the custom fields like so:
class CustomField(models.Model):
    type = models.CharField(choices=FIELD_TYPES)
    required = models.BooleanField()
I think I can then take the ModelForm for the Order class and extend it to add these custom fields. What I am unsure of is how to link the custom field values back to the Order.
I know this all might sound odd, but in practice it makes sense. Each user has slightly different needs for the form and wants to tweak it. If I have to hard code the models to have their specific fields, then I will have to have a fork for each user. That simply doesn't scale. If instead they can simply add the fields through the admin interface, then things are much simpler.
I feel like this is something that is perhaps already solved by someone out there. I simply cannot find a solution. I can't be the only one who has gotten this kind of request, right?
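One possible way to link custom field values back to an order is a separate value table keyed by both the order and the custom field. The sketch below uses sqlite3 and is only one option under stated assumptions: the custom_field_value table, the label column and all other names are hypothetical and do not come from the question's models.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, number TEXT NOT NULL);
    CREATE TABLE custom_field (
        id INTEGER PRIMARY KEY,
        label TEXT NOT NULL,        -- hypothetical display name
        type TEXT NOT NULL,
        required INTEGER NOT NULL
    );
    CREATE TABLE custom_field_value (
        order_id INTEGER NOT NULL REFERENCES orders(id),
        field_id INTEGER NOT NULL REFERENCES custom_field(id),
        value TEXT,
        PRIMARY KEY (order_id, field_id)  -- one value per field per order
    );
""")
conn.execute("INSERT INTO orders (id, number) VALUES (1, 'A-100')")
conn.execute(
    "INSERT INTO custom_field (id, label, type, required) "
    "VALUES (1, 'Tracking Number', 'text', 0)"
)
conn.execute(
    "INSERT INTO custom_field_value (order_id, field_id, value) "
    "VALUES (1, 1, 'ZX-42')"
)

# Collect the custom values of order 1 as a label -> value mapping.
extras = dict(conn.execute(
    "SELECT f.label, v.value FROM custom_field_value v "
    "JOIN custom_field f ON f.id = v.field_id WHERE v.order_id = ?",
    (1,),
).fetchall())
```

Adding a new custom field is then an insert into custom_field rather than a schema change, at the price of storing every value as text.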

Django: using ContentType vs multi_table_inheritance

I was having a similar problem as in
How to query abstract-class-based objects in Django?
The thread suggests using multi_table_inheritance. I personally find using content_type more conceptually comfortable (it just feels closer to the logic, at least to me).
Using the example in the previous link, I would just add a StellarType as
class StellarType(models.Model):
    """
    Use ContentType so we have a single access to all types
    """
    content_type = models.ForeignKey(ContentType)
    object_id = models.PositiveIntegerField()
    content_object = generic.GenericForeignKey('content_type', 'object_id')
Then add this to the abstract base model
class StellarObject(BaseModel):
    title = models.CharField(max_length=255)
    description = models.TextField()
    slug = models.SlugField(blank=True, null=True)
    stellartype = generic.GenericForeignKey(StellarType)
    class Meta:
        abstract = True
To sync between StellarObject and StellarType, we can connect post_save signal to create a StellarType instance every time a Planet or Star is created. In this way, I can query StellarObjects through StellarType.
So I'd like to know what the pros and cons of this approach are compared with multi_table_inheritance. I think both create an additional table in the database. But how about database performance? How about usability/flexibility? Thanks for any input!
To me, ContentType is the way to go when you want to relate an object to one of many models that aren't fundamentally of the same "type". Like if you want to be able to key Comments to Users, Pages, and Pictures on a social network, but there's no reasonable supertype shared by those three models. Sure you could create a "Commentable" supertype, but to me that feels more like a mixin than a fundamental type from which those three things derive. Before ContentType came out, you would have had no choice but to invent supertypes for these kind of relations, which can get really ugly really quickly if you need to do it multiple times in the same application (lets say you also have Events, Alerts, Messages, etc., each of which can apply to a different set of models).
Multi-table inheritance makes the most sense when you want to attach attributes to the base model, such that they will be shared in all concrete models that extend from it, so that you can get polymorphic behavior. Commentable doesn't really fit this mold, because all of that behavior can be put on the Comment model, less so on the Commentable objects. But if you have different classes of Users that share much of the same behavior and should be aggregable, then it makes a lot more sense.
The major pro of multi-table inheritance to me is a cleaner data model, with implicit relationships and inheritance that can be taken advantage of on the Python side (polymorphism is still a bit messy though, as seen here and here). The major pro of ContentType is that it is more general and keeps auxiliary functionality out of your models, at the cost of a slightly less pristine schema (lots of "meta" fields on your models to define these relationships). And for your example, you still have to rely on post_save, which seems unnecessarily messy/magical to me as well.
Sorry for reviving old thread. I think it all boils down to the lookup direction. Whether you look up all subclasses for a certain FK (multitable inheritance) or define the referenced class as a content type and look it up based on the table reference and id (contenttypes) makes no big difference in performance - hint: they both suck. I think content types is a nice choice if you want your app to be easily extendible, i.e. others can add new content types to reference against. Multitable is good if you only sometimes need the extra columns defined in extra tables. Sometimes it might also be a good idea to merge all your subtypes and make only one which has a few fields left empty most of the time.

Django Model field with multiple types?

I have the following (simplified) models:
class Structure(models.Model):
    name=models.CharField(max_length=100, unique=True)
class Unit(models.Model):
    name=models.CharField(max_length=100, unique=True)
Each model also has a builtFrom field, which shows what the item is built from, for example:
class Unit(models.Model):
    name=models.CharField(max_length=100, unique=True)
    builtFrom=models.ForeignKey(Structure)
However, builtFrom can be populated from either a Unit type, or a Structure type. Is there an easy way to represent this in my models?
The only thing I can think of is to have a separate model, like so:
class BuiltFromItem(models.Model):
    structure=models.ForeignKey(Structure)
    # string reference, because Unit is defined below
    unit=models.ForeignKey('Unit')
class Unit(models.Model):
    name=models.CharField(max_length=100, unique=True)
    builtFrom=models.ForeignKey(BuiltFromItem)
And then have one of the BuiltFromItem fields just be null. Then, when I need the data, figure out whether it is a structure or unit that it is built from. Is there a better solution for this?
You want what the Django docs refer to as a "generic relation". Support for them is built into Django.
A generic relation is probably the best approach, yet it can be a little problematic if you plan to manage such models via the admin panel. You would then have to add a ModelInline to the models that the generic relation points to, but as far as I know (correct me if I'm wrong), there is no convenient way of picking the related object from the other side (from the model where the relation is defined), other than choosing the model class and manually typing the instance's primary key.
Picking the best solution actually depends on structure of your models and on what they have in common. Another idea I have, is to use Multi-table inheritance, by defining some BasicObject that is a parent object to Structure and Unit models:
class BasicObject(models.Model):
    name=models.CharField(max_length=100, unique=True)
    #other common data
    builtFrom=models.ForeignKey('BasicObject')
class Structure(BasicObject):
    #data specific to Structure
class Unit(BasicObject):
    #data specific to Unit
Now all Structure and Unit objects will also be BasicObject instances, and you will be able to populate the builtFrom field with the proper BasicObject instance. It makes queries more expensive, because the data is divided into two different tables, so you should consider whether this approach is beneficial in your case.
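The cost mentioned here, data divided between a parent table and a child table, can be made concrete with sqlite3. This is a sketch under assumptions: the table layout only approximates what multi-table inheritance produces, and the floors and speed columns are invented stand-ins for the type-specific data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE basic_object (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL UNIQUE,
        built_from_id INTEGER REFERENCES basic_object(id)
    );
    -- Child tables share the parent's primary key.
    CREATE TABLE structure (
        basic_object_id INTEGER PRIMARY KEY REFERENCES basic_object(id),
        floors INTEGER NOT NULL   -- hypothetical Structure-specific column
    );
    CREATE TABLE unit (
        basic_object_id INTEGER PRIMARY KEY REFERENCES basic_object(id),
        speed INTEGER NOT NULL    -- hypothetical Unit-specific column
    );
""")
conn.execute(
    "INSERT INTO basic_object (id, name, built_from_id) VALUES (1, 'Barracks', NULL)"
)
conn.execute("INSERT INTO structure (basic_object_id, floors) VALUES (1, 2)")
conn.execute(
    "INSERT INTO basic_object (id, name, built_from_id) VALUES (2, 'Soldier', 1)"
)
conn.execute("INSERT INTO unit (basic_object_id, speed) VALUES (2, 5)")

# Fetching a unit with its own data and what it is built from takes two joins.
row = conn.execute(
    "SELECT child.name, u.speed, parent.name "
    "FROM unit u "
    "JOIN basic_object child ON child.id = u.basic_object_id "
    "JOIN basic_object parent ON parent.id = child.built_from_id"
).fetchone()
```

The built_from_id column can point at either a structure or a unit because both live in basic_object, which is the benefit the answer describes.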