database design for multiple similar content types

database design for multiple similar content types - django

I've worked on multiple sites recently with similar content types but haven't gotten the design I'm looking to achieve.
I have multiple types of content article, interview, video, gallery, blog, etc. All of these models have very similar properties (title, slug, body, pub_date, etc). And since I'm using django and the admin, almost all the admin setting are identical as well. Most will only have one or two additional fields (ie. filename for video, author for blog).
Currents options are
Using single model "Post/Article" and then just have a type_of_content field. This gives me a single model which makes searches easier and faster and its easy to maintain one model. Managers could be used to pull certain types of content.
Have models 'Video, Interview, Audio' subclass a model called "Post/Article". Gains flexibility of working with different models without all the redundacy. Lots of joins though and all the admin code is still duplicated.
Be very redundant and create a separate model for each type of content even though they share the majority of fields. More stuff to maintain, not DRY at all but highest level of flexibility.
Any insight from someone with more experience would be great.
Thank you.

I don't have that much experience with Django, but it sounds like what you want to do is subclass off of an Abstract Base Class. This avoids creating a table for the abstract parent class, so you get the advantage of your option #2 without the need for joins.

Related

Good Django design practice to add a REST api later following DRY

I am starting a web application in pure Django. However, in the future, there might be a requirement for REST api. If it happens, the most obvious choice will be Django REST framework.
Both the "old-fashioned" and REST parts share the models, however, the views are slightly different (permissions definitions, for example) and forms are replaced with serializers. Doing it the most obvious way would mean to duplicate the application logic several times, and thus a failure to follow DRY principle, and so the code becomes unmaintainable.
I got an idea to write all the logic into models (since they are shared), but in such case, there will be no use of permission mixins, generic views and the code would not be among the nicest ones.
Now I ran out of ideas. What is the best practice here?

I'd try to keep things simple as you're not sure about the future requirements for the API, and guessing can introduce extra complexity that may not even be needed when requirements will be clear.
Both Django forms and Rest Framework serializers already offer you a declarative approach that abstracts away the boilerplate code needed for basic stuff, which normally accounts for most of your code anyway.
For example, one of your Django form could look like this:
class ArticleForm(ModelForm):
class Meta:
model = Article
fields = ['title', 'content']
And in the future the DRS serializer would be:
class ArticleSerializer(ModelSerializer):
class Meta:
model = Article
fields = ['title', 'content']
As you can see, if you try and stick to ModelForm and ModelSerializer, there won't be much duplication anyway. You can also simply store the fields list in a variable and just reuse that.
For more custom things, you can start by sharing logic into simple functions, for example:
def save_article_with_author(article_data, author_data):
# custom data manipulation before saving, consider that article_data will be a dictionary either if it comes from deserialized JSON (api) or POST data
# send email, whatever
This function can be shared between your form and serializer.
For everything related to data fetching, I'd try to use Model Managers as much as possible, defining custom querysets that can be resued e.g. for options by forms and serializers.
I tend to avoid writing any logic that doesn't directly read or write data into the model classes. I think that couples too much the business logic with the data layer. As an example, I never want to write any auth/permission checks into a save() method of a model, because that couples different layers too tightly.
As a rule of thumb, imagine this scenario: you add say permissions checks or the logic to send an email when a user is created overriding the save() method of your Article model.
Then, later on you're asked to write a simple manage command that batch-import users from a spreadsheet. At this point, what you did in your save() method really gets in the way, as you can freely access your data through your model without having to bother with permissions, emails and all of that.
Regarding the view layer and assuming you need to implement some shared auth/permission checks and you don't want to have separate views, you can use this approach:
https://www.django-rest-framework.org/topics/html-and-forms/
Blockquote
REST framework is suitable for returning both API style responses, and regular HTML pages. Additionally, serializers can be used as HTML forms and rendered in templates.
Here's some guidelines on how you could dynamically switch from HTML to JSON based to the request content type:
https://www.django-rest-framework.org/api-guide/renderers/#advanced-renderer-usage
This seems like a good option in your situation, I'd just write down a quick proof-of-concept before you go all in to see if you are not too limited for what you need to do.

Is there a model MultiField (any way to compose db models Fields in Django)? Or why would not that be a useful concept?

When building a Django application, we were exposed to (forms) MultiValueField and MultiWidget.
They seem like an interesting approach to compose their respective base classes, giving more modularity.
Yet, now it seems to us that the actual piece that would make those two shine bright would be a db.models.MultiField. Here is the reasoning:
It seems that, when using a ModelForm, Django is enforcing a strict 1-to-1 association between a models.Field and a forms.Field. Now, with forms.MultiValueField, despite this strict 1-to-1 association, you can have a single models.Field actually associated to the numerous forms.Field composing the forms.MultiValueField.
Yet, it is limited to the case where a single models.Field maps more naturally to several forms.Field. What seems very interesting would be the ability to associate any number of models.Fields to any number of forms.Field. The only piece that seems to be missing to get there is an hypothetical models.MultiField. It could communicate with the exterior through a compress() method (see. MultiValueField), and potentially a decompress() method in the other direction (see MultiWidget).
The questions would then be (assuming this requirement is not emerging from a misunderstanding of Django): Is there a way to compose modelds.Field in Django ? If not, why is such an empowering concept not implemented in this great framework ;) ?
EDIT: To give a motivating example, imagine we want to implement a partial date (a date that can be precise to a day, or just precise to a month and a year, or alternatively just a year), with a Model following the one presented in this answer, i.e.:
a DateField, representing the date
a CharField, to indicate whether the date is complete or month + year or just year.
This model is working just fine with the default ModelForm, but now we want to introduce some consistency check: if a date is only precise to the month (month + year or just year), its day part should be 1, and if it is only precise to the year, its month part should also be 1.
This is a cross-fields check (two different Fields from the ModelForm needs to be accessed to complete it), so it has to be implemented at the Form.clean() level. This check would need to be copy-pasted in each form containing a partial date, which goes against the DRY cherished by Django.
Now let's imagine Django is providing this hypothetical models.MultiField, which would be a composite models.Field. We could define a PartialDateclass deriving from MultiField and containing the two leaf fields defined above (DateField and CharField). We could now say that the form field corresponding to this single model field (in a ModelForm) is a class derived from forms.MultiValueField. This class could implement the consistency check above, at the field level: it is not a cross-field check anymore.
This way, we got rid of the code duplication: any model could use a PartialDate field, automatically making any ModelForms mapping to it use the forms.MultiValueField implementing the consistency check, whose code was only written in one place.
(This is a simple example, it is easy to imagine it can get way more complex in production code, with consistency check you do not want to copy paste)

Possible: Yes. You could either subclass djangos model class, or monkey-patch that class into the existing model module.
Just (educated) guessing: I think it is not missing, but not needed.
In DB-Applications, combined fields will almost always come with special business rules. So you will have to implement a different display and validation for each one of them anyway.
Which you already can easily do in the models.Form.
Maybe you should look at customization of models.Form?

Can I make Django admin reflect a hierarchy of models?

Assume a Django application with a few models connected by one-to-many relationships:
class Blog(models.Model):
...
class Post(models.Model):
blog = models.ForeignKey(Blog)
...
class Comment(models.Model):
post = models.ForeignKey(Post)
...
Conceptually, they form a hierarchy, a tree-like structure. I want the Django admin to reflect that. In particular:
in a changelist of posts, every post should have a link to the changelist of corresponding comments;
similarly, a post’s edit page should link to the changelist of comments from the top-right buttons area;
when I open that list of related comments, it needs to reflect the relationship in the breadcrumbs (something like: Posts › “Hello world” › Comments) and, ideally, also in the URL (post/123/comment/).
This should of course also apply to the other levels of the hierarchy.
Number 1 is pretty easy with a custom list_display entry and using the ?post__id= query to the comments changelist. But this is little more than a hack. Generally Django assumes my three models to be independent, top-level entities.
Is there a straightforward way to accomplish this? I guess I could override a bunch of templates and AdminModel methods, but perhaps there is a better solution for what seems like a common situation?

Are you sure you are not just looking at Django Admin Inline Models ?
There is no way that an automated admin will pick up your relationships, because in an RDBS there can be any number of foreign keys / one to one / many to many relations, and Django does not have a customized hierarchical behavior built in.
You can indeed edit the breadcrumb customizing an admin template if you want.
For relations you might also be interested into django MPTT that allows to make hierarchical model instances. Also see this question: Creating efficient database queries for hierarchical models (django) in that respect.

How is this a common situation? Consider the fact a model can have a virtually unlimited number of foreign key relationships, let alone visa versa. How would the admin 'know' how to represent this data the way a user requires without customizing things?
One would suggest you are used to work with content management systems rather than webframeworks (no pun intended). It's important to notice Django isn't a cms, but a webframework you can built on top of as you see fit. In a nutshell: 'Django is rather clueless and unaware of contextual requirements'.
Although the admin is quite a beast out-of-the-box, it can be hard to customize. There have been quite some discussions whether it should even be part of core. I can only suggest, if customizing things tends to get hacky, you should probably write your own 'admin', it's not that hard.

django model design questions

I need to store various info about some movies, books, games, and maybe other media. Starting from publisher to disc count in DVD-box. At first i thought about abstract Item model, with children Book, Movie, Game. But it's all hard-coded and not very scalable, i think. What if i would need to add some new item type?
Then I've read about virtual fields here
Django - designing models with virtual fields?
that got my attention. But looks DB heavy and not very search-able, am i wrong?
What are the best techniques for such cases?

I think you want a concrete Item superclass (since it will likely have common fields, ie title, copyright_date, publisher, etc) and subclasses for each subtype (and further sub-sub-classes if you like, ie from Toy to ActionFigure with number_of_joints field), using multi-table inheritance.
If you are just querying the Item model, this will be fast since Django's ORM won't join to the other tables (and will return Item objects which can then be converted to their "native" type by referencing item.subclassname. Likewise, you can query each of the subclass models individually with some efficiency.
Regarding searchability, if you are using an indexer efficiency doesn't matter too much since the indexing happens infrequently.

Django: django-transmeta - sorting comments

I have created an article site, where articles are published in several languages. I am using transmeta (http://code.google.com/p/django-transmeta/) to support multiple languages in one model.
Also I am using generic comments framework, to make articles commentable. I wonder what will happen if the same article will be commented in one language and then in another. Looks like all comments will be displayed on both variants....
The question actually is:
Is there a possibility to display only comments submitted with current language of the article?

I tried the approach of transmeta for translation of dynamic texts and I had the following experience:
You want another language, you need to change the database model which is generally undesirable
You need every item in both languages, which is not flexible
You have problems linking with other objects (as you point out in your question)
If you take the way of transmeta you will need two solutions:
The transmeta solution for translating fields in a model
For objects connected to a model using transmeta you will need an additional field to determine the language, say CharField with "en", "de", "ru" etc.
These were major drawbacks that made me rethink the approach and switch to another solution: django.contrib.sites. Every model that needs internationalization inherits from a SiteModel:
class SiteModel(models.Model):
site = models.ForeignKey(Site)
Every object that would need transmeta translation is connected to a site. Every connected object can determine its language from the parent object's site attribute.
I basically ran the wikipedia approach and had a Site object for every language on a subdomain (en., de., ru.). For every site I started a server instance that had a custom settings file which would set the SITE_ID and the language of the site. I used django.contrib.sites.managers.CurrentSiteManagerto display only the items in the language of the current site. I also had a manager that would give you objects of every language. I constructed a model that connects objects of the same model from different languages denoting that they are semantically the same (think languages left column on wikipedia). The sites all use the same database and share the same untranslated User model, so users can switch between languages without any problem.
Advantages:
Your database schema doesn't need to change for additional languages
You are flexible: add languages easily, have objects in one language only etc.
Works with (generic) foreign keys, they connect to an object and know what language it is. You can display the comments of an object and they will be in one language. This solves your problem.
Disadvantages:
It's a greater deal to setup: you need a django server instance for every site and some more glue code
If you need e.g an article in different languages, you need another model to connect them
You may not need the django Site model and could implement something that does the same without the need of multiple django server instances.
I don't know what you are trying to build and what I described might not fit to your case, but it worked out perfectly for my project (internationalized community platform built upon pinax: http://www.bpmn-community.org/ ). So if you disclose some more about your project, I might be able to advise an approach.
To finally answer your question: No, the generic comments will not work out of the box with transmeta. As you realised you will have to display comments in both languages for the article that is displayed in one language. Or you will have to hack into the comments and change the model and do other dirty stuff (not recommended). The approach I described works with comments and any other pluggable app.
To answer your questions:
Two Django instances can share one database, no problem there.
If you don't want two Django instances, but one, you will have to do the following: A middleware checks the incoming request, extracts desired language from URL (en.example.com or example.com/en/ etc.) and saves the language preference in the request object. The view will have to take the request object with the language and take care of the filtering of objects accordingly. Since there is no dedicated server for the language (like in the sites approach where the language is stored in the settings.py file), you can only get the language from the request and you will have to pass attributes from the request object to Model managers to filter objects.
You could try to fake a global language state in the django application with an approach like threadlocals middleware, however I don't know if this plays out nicely with django I18N engine (which is also does some thread magic).
If you want to go big with your site in multiple languages, I recommend going for the sites-approach.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js