I need to store various information about movies, books, games, and possibly other media, everything from the publisher to the disc count of a DVD box set. At first I thought about an abstract Item model with children Book, Movie, and Game. But that is all hard-coded and, I think, not very scalable. What if I need to add a new item type?
Then I read about virtual fields in this question, which got my attention:
Django - designing models with virtual fields?
But that approach looks DB-heavy and not very searchable. Am I wrong?
What are the best techniques for such cases?
I think you want a concrete Item superclass (since it will likely have common fields, e.g. title, copyright_date, publisher) and a subclass for each subtype (and further sub-subclasses if you like, e.g. from Toy to ActionFigure with a number_of_joints field), using multi-table inheritance.
If you are just querying the Item model, this will be fast, since Django's ORM won't join to the other tables. It returns Item objects, which can then be converted to their "native" type by referencing item.subclassname (the lowercased subclass name, e.g. item.movie). Likewise, you can query each of the subclass models individually with some efficiency.
Regarding searchability, if you are using a search indexer, efficiency doesn't matter too much, since the indexing happens infrequently.
I'm currently working on a model in Django involving one model that can have a variety of different traits depending on what kind of object it is. So, let's say we have a model called Mammal, which can either be an Elephant or a Dolphin (with their own traits "tusk_length" and "flipper_length" respectively).
Basic OOP principles shout "polymorphism", and I'm inclined to agree. But, as I'm new to Django, I first want to know whether or not it is the best way to do this in Django. I've seen plenty of examples of it, and I've also seen some people state a preference for a single giant model.
I've already tried using GenericForeignKeys as described here: How can I restrict Django's GenericForeignKey to a list of models?. While this solution works beautifully, I don't like the inability to filter, and that the relationship is only one way. That is, while you can get a Dolphin from a Mammal object, you can't get the Mammal object from the Dolphin.
And so, here are my two choices:
Choice A:
from django.db import models

class Mammal(models.Model):
    hair_length = models.IntegerField()
    tusk_length = models.IntegerField()
    flipper_length = models.IntegerField()
    # choices must be (stored value, human-readable label) pairs
    animal_type = models.CharField(
        max_length=15,
        choices=[("Elephant", "Elephant"), ("Dolphin", "Dolphin")],
    )
Choice B:
from django.db import models

class Mammal(models.Model):
    hair_length = models.IntegerField()

class Elephant(Mammal):
    tusk_length = models.IntegerField()

class Dolphin(Mammal):
    flipper_length = models.IntegerField()
Choice B, from what I understand, has the advantage of nicer code when querying and listing all Elephants or Dolphins. However, I've noticed it's not as straightforward to get all of the Elephants from a list of Mammals (is there a query for this?) without putting animal_type in the class, with default being dependent on the class.
This leads to another problem I see with polymorphism. It won't come up in the example above or in my application, but it is worth mentioning: it would be difficult to turn a Dolphin object into an Elephant without deleting the Dolphin entirely.
Overall, is there any general preference, or any big reason I shouldn't use polymorphism?
My recommendation, in general with database design, is to avoid inheritance. It complicates both the access and updates.
In Django, try using an abstract class for your base model. That means a db table will not be created for it. Its fields/columns will be auto-created in its child models. The benefit is: code reuse in Django/Python code and a simple, flat design in the database. The penalty is: it's more work to manage/query a mixed collection of child models.
See an example here: Django Patterns: Model Inheritance
Alternatively, you could change the concept of "Mammal" to "MammalTraits." And include a MammalTraits object inside each specific mammal class. In code, that is composition (has-a). In the db, that will be expressed as a foreign key.
We ended up going with a large table with a lot of usually-empty columns. Our reasoning was that (in this case) our Mammal table was all we'd be querying over, and there was no (intuitive) way to filter out by certain types of Mammals besides manually checking whether they had a "dolphin" or "elephant" object, which then threw an error if they didn't. Even looking for the type of an object returned from a query that was definitely an Elephant still returned "Mammal". It would be hard to extend any Pythonic workarounds to writing pure SQL, which one of our data guys does regularly.
Good Morning guys!
The scenario is the following. For some models in Django, I would like to allow the end user to define his own fields. It would be great if I could keep all of Django's awesome features like the ORM, so I can still do calls like field__gte to search on the model, still have field validation according to field type, etc. I've thought about two ways of doing this, and I'm more than open to new suggestions. Any feedback would be VERY appreciated.
The first approach is the Entity-Attribute-Value model ( http://en.wikipedia.org/wiki/Entity%E2%80%93attribute%E2%80%93value_model ), which Django already has an app for. See http://code.google.com/p/django-custom-field/
I think this would be an OK solution, but I lose the ability to do "mymodel.objects.filter(custom_field_x=something)". Maybe there's a way to regain the ORM, any ideas? But I've heard so many bad stories about this method that I'm a little scared to use it.
The second approach would be to have a database table for each of the users (probably no more than 1000). I've read that Django has something along the lines of inspectdb, which checks which fields are there and produces the model for you. This could be useful, but I think maybe I should store the fields this particular user has created and somehow dynamically tell Django: hey, we also have these fields in this model. Is this possible? I know it's generally bad to have different tables for each user, but considering this scenario, how would you guys rate this method? Would it be OK to have one table for each user?
The model that requires custom fields is for example Person. They might want a custom field to store address, blood type, or any other thing.
MANY THANKS in advance! Have a nice sunday!
Very similar: How to create user defined fields in Django -- but only talks about the EAV, which I would like to avoid. I'm open for new ideas!
One approach is to use a NoSQL document-based solution such as MongoDB which allows you to store objects that have a fluid structure (no such restrictions as pre-defined columns).
Pros:
No restriction on custom field types, number of types of fields, etc.
Retains ORM functionality (django-mongodb)
Other various benefits of NoSQL - which you can read about online
Avoids EAV
Cons:
Need to set up a NoSQL server
Additional knowledge required on NoSQL concepts (documents vs. tables)
You may have to maintain two databases - if you decide not to migrate your entire solution to NoSQL (multi-db)
EDIT:
After reading the comments, it's worth pointing out that, depending on which NoSQL solution you go with, you may not need reversion support. CouchDB, for example, has built-in support for document versioning.
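What about creating another model for storing user-defined fields?
from django.contrib.auth.models import User
from django.db import models

class UserDefinedField(models.Model):
    #..................
    # on_delete is a required argument since Django 2.0
    user = models.ForeignKey(User, on_delete=models.CASCADE)
    field_name = models.CharField(max_length=50)
    field_value = models.TextField()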
Then you can do UserDefinedField.objects.filter(field_name=some_name,field_value=somevalue)
I'm trying to understand the purpose of Django Intermediary Models.
Conceptually, they seem to be equivalent to association classes in UML class diagrams. Is there any fundamental difference between the two that I should be aware of?
In spite of the apparent similarity, I've found several resources explaining the purpose of intermediary models, but none of them made any reference to "association classes", which makes me somewhat suspicious.
You're not likely to find any comparisons with UML diagrams in the Django literature - UML modelling isn't really a big thing in the Python world, in my experience.
But looking at your diagram, I'd agree that the concept does seem very similar. Don't forget that the ORM is just that, a mapping of relational concepts onto objects: in this case, the through table maps the intermediary table that is always created in a many-to-many relationship. The only difference is that you only need to specify it manually if you want to add extra information to that relationship, like the enrollment date in your link. If you don't need the extra fields, you don't need to specify the intermediary model, but the table still exists, containing just the foreign keys to each end of the M2M relationship.
They're used to store additional data about a many-to-many relationship. I'm sure this is blasphemy, but I think the best example is from the Ruby on Rails guides, which uses the association between patients and doctors. A doctor has many patients through appointments; a patient has many doctors through appointments as well; but you can't model this relationship directly, because an appointment also has a date and time.
I think you are right that, conceptually, they serve a similar purpose to association classes in UML.
This is how a many-to-many relation is implemented in any relational database; it is a fundamental part of relational database design. So I suggest learning database design principles first, because knowing how a database works is necessary for using an ORM properly anyway.
See Wikipedia on many-to-many relationships.
Is it possible to implement 'expando' model in Django, much like Google App Engine has? I found a django app named django-expando on github but it's still in early phase.
It's possible, but it would be a kludge of epic proportions. GAE uses a different database design known as a column-based database, and the Django ORM is designed to link with relational databases. Since technically everything in GAE is stored in one really big table with no schema (that's why you don't have to syncdb for GAE applications), adding arbitrary fields is easy. With relational databases, where each table stores exactly one kind of data (generally) and has a fixed schema, arbitrary fields aren't so easy.
One possible way you could implement this is to create a new model or table for expando properties that stores a table name, an object ID, and a TextField for pickled data. All expando models would then inherit from a base class that overrides the __setattr__ and __getattr__ methods to automatically create or read a row in this table. However, there are a few major problems with this:
First off, it's a cheap hack and is contrary to the principles of relational databases.
Second, it is not possible to query these expando fields without even more hacks, and even so it would be ludicrously slow.
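To make the attribute-interception idea concrete, here is a plain-Python stand-in, not a Django model. It keeps the extra attributes in a dict and serializes them with JSON instead of pickle (a deliberate swap for readability); the class name Expando, the fixed field "name", and the dump_extras helper are all hypothetical. In a real implementation the serialized blob would be the text stored in the side table.

```python
import json


class Expando:
    """Object with one fixed field plus arbitrary extra attributes."""

    _fixed_fields = ("name",)

    def __init__(self, name, extra_blob="{}"):
        # Bypass our own __setattr__ while setting up internal state.
        object.__setattr__(self, "name", name)
        object.__setattr__(self, "_extras", json.loads(extra_blob))

    def __getattr__(self, attr):
        # Only called when normal lookup fails, i.e. for non-schema attributes.
        extras = object.__getattribute__(self, "_extras")
        try:
            return extras[attr]
        except KeyError:
            raise AttributeError(attr) from None

    def __setattr__(self, attr, value):
        if attr in self._fixed_fields:
            object.__setattr__(self, attr, value)
        else:
            # Anything outside the fixed schema lands in the extras dict.
            self._extras[attr] = value

    def dump_extras(self):
        # The text that would be written to the side table.
        return json.dumps(self._extras, sort_keys=True)


person = Expando("Ann")
person.blood_type = "O+"
blob = person.dump_extras()

# Rebuilding the object from the stored blob restores the extra attribute.
restored = Expando("Ann", blob)
```

The second problem listed above is visible here: blood_type exists only inside the serialized blob, so the database cannot filter or index on it.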
My recommendation is to find a way to design your database structure so that you don't need expando models.
I've worked on multiple sites recently with similar content types but haven't gotten the design I'm looking to achieve.
I have multiple types of content: article, interview, video, gallery, blog, etc. All of these models have very similar properties (title, slug, body, pub_date, etc.). And since I'm using Django and the admin, almost all the admin settings are identical as well. Most will only have one or two additional fields (e.g. filename for video, author for blog).
Current options are:
1. Use a single model "Post/Article" with a type_of_content field. This gives me a single model, which makes searches easier and faster, and it's easy to maintain one model. Managers could be used to pull certain types of content.
2. Have models "Video, Interview, Audio" subclass a model called "Post/Article". This gains the flexibility of working with different models without all the redundancy. Lots of joins though, and all the admin code is still duplicated.
3. Be very redundant and create a separate model for each type of content, even though they share the majority of fields. More stuff to maintain and not DRY at all, but the highest level of flexibility.
Any insight from someone with more experience would be great.
Thank you.
I don't have that much experience with Django, but it sounds like what you want to do is subclass off of an Abstract Base Class. This avoids creating a table for the abstract parent class, so you get the advantage of your option #2 without the need for joins.