Django Nonrel - Working around multi-table inheritance with noSQL? - django

I'm working on a django-nonrel project running on Google's AppEngine. I want to create a model for a Game which contains details which are generally common to all sports - i.e. gametime, status, location, etc. I've then modelled specific classes for GameBasketball, GameBaseball etc, and these inherit from the base class.
This creates a problem however if I want to retrieve something like all the Games on a certain day:
Game.objects.filter(gametime=mydate)
This will return an error:
DatabaseError: Multi-table inheritance is not supported by non-relational DBs.
I understand that AppEngine doesn't support JOINs and so it makes sense that this fails. But I'm not sure how to properly tackle this problem in a non-relational environment. One solution I've tried is to turn Game into an abstract base class, and while that allows me to model the data in a nice way - it still doesn't resolve the use case above since its not possible to get objects for an abstract base class.
Is the only solution here to put all the data for all possible sports (and just leave fields that aren't relevant to a specific sport null) in the Game model, or is there a more elegant way to solve this problem?
EDIT:
I'm more interested in understanding the correct way of handling this type of issue in any noSQL setup, and not specifically on AppEngine. So feel free to reply even if your answer isn't GAE specific!

For anyone else who has this issue in the future, here is how I eventually went about resolving it. Keep in mind that this is somewhat of a mongo-specific solution, and AFAIK there is no easy way to solve this just using base nonrel classes.
For Mongo I able to solve this by using Embedded Documents (other noSQL setups may have similar soft-schema type features), simplified code below:
class Game(models.Model):
gametime = models.DateTimeField()
# etc
details = EmbeddedModelField() # This is the mongo specific bit.
class GameBasketballDetails(models.Model):
home = models.ForeignKey(Team)
# etc
Now when I instantiate the Game class I save a GameBasketballDetails object in the details field. So I can now do:
games = Game.objects.filter(gametime=mydate)
games[0].details.basketball_specific_method()
This is an acceptable solution since it allows me to query on the Game model but still get the "child" specific info I need!

The reason multi-table inheritance is disallowed in django-nonrel is because the API that Django provides for those sort of models will often use JOIN queries, as you know.
But I think you can just manually set up the same thing Django would have done with it's model inheritance 'sugar' and just avoid doing the joins in your own code.
eg
class Game(models.Model):
gametime = models.DateTimeField()
# etc
class GameBasketball(models.Model):
game = models.OneToOneField(Game)
basketball_specific_field = models.TextField()
you'll need a bit of extra work when you create a new GameBasketball to also create the corresponding Game instance (you could try a custom manager class) but after that you can at least do what you want, eg
qs = Game.objects.filter(gametime=mydate)
qs[0].gamebasketball.basketball_specific_field
django-nonrel and djangoappengine have a new home on GitHub: https://github.com/django-nonrel/
I'm not convinced, next to the speed of GAE datastore API itself, that the choice of python framework makes much difference, or that django-nonrel is inherently slower than webapp2.

Related

Django: storing model property on a field vs. on a different model

I am relatively new to Django and even database design and I have some thoughts I'd like to run by some other people. This isn't really a specific question; I just want to see how other people think about this stuff.
Let's say we have a model for an application to some service. It contains all the ordinary stuff you might imagine an application to contain:
class Application(models.Model):
first_name = CharField(max_length=255)
last_name = CharField(max_length=255)
date_of_birth = DateField()
married = BooleanField()
# ...other stuff
Okay, that's all well and good. But now, imagine the webapp you are writing has the feature that you can complete your application partially, save it, and come back to it later. One way to do this is to add another attribute to the model above:
complete = BooleanField()
It works, it is pretty simple to use, but I don't really like it because it muddies the semantics of an application; it adds information that isn't intrinsically connected to the application. Another approach would be to create another model that keeps track of complete applications:
class CompleteApplication(models.Model):
application = ForeignKey(Application)
I like this a bit better, since it keeps Application clean. However, it does have the disadvantage of messing up queries. Here are the two ways to query all complete applications in the system:
Method 1:
completed_applications = Application.objects.filter(complete=True)
Method 2:
pks = CompleteApplication.objects.all().values_list("application__pk")
complete_applications = Application.object.filter(pk__in=pks)
Method 2 is two lines of code vs. one and also two queries whereas previously one sufficed, so the database performance is going to take a hit.
There is a third way to do things: instead of creating a model that keeps track of complete applications, we could create a metadata model that stores any metadata that we might want to attach to the Application model. For our purposes, this model can contain a field that tracks completeness. However, this approach also has the benefit of allowing for an arbitrary number of metadata fields to be associated with each application without requiring a new DB table for each (as is the case with Method 2 above).
class ApplicationMeta(models.Model):
application = ForeignKey(Application)
complete = BooleanField()
And, for completeness (pun intended), to query all complete applications, we would use the following statement:
completed_applications = Application.objects.all(applicationmeta__complete=True)
Nice and simple, just like Method 1, but the query is certainly more work for the database. This method also has another drawback for certain applications. Pretend, for example, that we want to track some additional information about applications: they can be confirmed, or rejected. However, if an application is not confirmed, it does NOT necessarily mean it is rejected: it could be pending review. Additionally, let's say we want to track the date of confirmation and the date of rejection (if either is applicable, of course). Then, our metadata model becomes the following:
class ApplicationMeta(models.Model):
complete = BooleanField()
confirmed = BooleanField()
rejected = BooleanField()
date_confirmed = DateField()
date_rejected = DateField()
Okay...this works, but it is starting to be a mess. Firstly, we have now opened up our system to potential error: what if somehow an ApplicationMeta instance has both rejected and confirmed set to True? We could do some fancy footwork with our class (maybe override setattr) to throw an exception if something funny happens, so we can prevent from persisting to the DB, but this is added complication that I hope is not necessary. Further, any model will either have at most one of date_confirmed or date_rejected set. Is that a problem? Here, I am not actually certain. My guess is this is likely a waste of space, but I don't actually know that. This example is simple, what if more complicated examples present us with tons of fields that will necessarily not be filled? Seems like bad design.
I'd love to hear some thoughts on these ideas.
Thanks!
If you have a huge amount of possible metadata, the third approach might make sense for performance reasons. I wouldn't do it for a few boolean- and date columns. If you're concerned about the readability of the models themselves, you can factor out any metadata into an abstract base model. You can even reuse the abstract model for other models that require the same metadata. The information will still live in your Application model.
If you do take the second or third approach, I would use a OneToOneField rather than a ForeignKey. It ensures that there are no 2 possible ApplicationMeta models for a single Application, and has the added benefit of a UNIQUE database index.
As for the status of an application, the NullBooleanField is designed for exactly that. It start as None (NULL in the db) meaning "no value". It can then be set to True (accepted) or False (rejected).

Django - Polymorphic Models or one big Model?

I'm currently working on a model in Django involving one model that can have a variety of different traits depending on what kind of object it is. So, let's say we have a model called Mammal, which can either be an Elephant or a Dolphin (with their own traits "tusk_length" and "flipper_length" respectively).
Basic OOP principles shout "polymorphism", and I'm inclined to agree. But, as I'm new to Django, I first want to know whether or not it is the best way to do so in Django. I've heard of plenty of examples of and some people giving their preferences toward singular giant models
I've already tried using GenericForeignKeys as described here: How can I restrict Django's GenericForeignKey to a list of models?. While this solution works beautifully, I don't like the inability to filter, and that the relationship is only one way. That is, while you can get a Dolphin from a Mammal object, you can't get the Mammal object from the Dolphin.
And so, here are my two choices:
Choice A:
from django.db import models
class Mammal(models.Model):
hair_length = models.IntegerField()
tusk_length = models.IntegerField()
flipper_length = models.IntegerField()
animal_type = models.CharField(max_length = 15, choices = ["Elephant", "Dolphin"]
Choice B:
from django.db import models
class Mammal(models.Model):
hair_length = models.IntegerField()
class Elephant(Mammal):
tusk_length = models.IntegerField()
class Dolphin(Mammal):
flipper_length = models.IntegerField()
Choice B, from what I understand, has the advantage of nicer code when querying and listing all Elephants or Dolphins. However, I've noticed it's not as straightforward to get all of the Elephants from a list of Mammals (is there a query for this?) without putting animal_type in the class, with default being dependent on the class.
This leads to another problem I see with polymorphism, which won't come up in this example above or my application, but is worth mentioning is that it would be difficult to edit a Dolphin object into an Elephant without deleting the Dolphin entirely.
Overall, is there any general preference, or any big reason I shouldn't use polymorphism?
My recommendation, in general with database design, is to avoid inheritance. It complicates both the access and updates.
In Django, try using an abstract class for your base model. That means a db table will not be created for it. Its fields/columns will be auto-created in its child models. The benefit is: code reuse in Django/Python code and a simple, flat design in the database. The penalty is: it's more work to manage/query a mixed collection of child models.
See an example here: Django Patterns: Model Inheritance
Alternatively, you could change the concept of "Mammal" to "MammalTraits." And include a MammalTraits object inside each specific mammal class. In code, that is composition (has-a). In the db, that will be expressed as a foreign key.
We ended up going with a large table with a lot of usually-empty columns. Our reasoning was that (in this case) our Mammal table was all we'd be querying over, and there was no (intuitive) way to filter out by certain types of Mammals besides manually checking whether they had a "dolphin" or "elephant" object, which then threw an error if they didn't. Even looking for the type of an object returned from a query that was definitely an Elephant still returned "Mammal". It would be hard to extend any Pythonic workarounds to writing pure SQL, which one of our data guys does regularly.

Is There a Good Way to Forbid Templates Accessing Related Managers Through Model Instances

I need an elegant way of disabling or authorizing related field traversal in Django templates.
Imagine following setup for models.py:
class Person(models.Model):
pass
class Secret(models.Model):
owner = models.ForeignKey(Person, related_name="secrets")
Now imagine this simple view that gives the template QuerySet of all Person instances in the system just so the template could put them in a list.
def show_people(request):
render_to_response("people.html", {people=Person.objects.all()})
Now my problem is that I would not provide the templates myself in this imaginary system and I don't fully trust those who make the templates. The show_people view gives the people.html template the secrets of the Person instances through the related_name="secrets". This example is quite silly but in reality I have model structures where template providers could access all kind of vulnerable data through related managers.
The obvious solution would be not to give models to templates but to convert them in to some more secure data objects. But that would be pain in my case because the system is already quite big and it's up and running.
I think a cool solution to this would be somehow preventing related field traversal in templates. Another solution would be to have such custom related managers that could have access to the request object and filter the initial query set according to the request.user.
A possible solution could be to use a custom model.Manager with your related models.
Set use_for_related_fields = True to force Django to use it instead of the plain manager. modify the manager to filter the data as needed.
also have a look at this:
Django: using managers for related object access (use_for_related_fields docs)
stackoverflow: use_for_related_fields howto, very good explanation here.

Django - Single model instance

I want to add a layout model to my website (a general settings file), and I want it to be available in the admin interface for configuration.
class Layout(...Model):
primary_header
logo_image
...
This structure shouldn't saved be in a table.
I am wondering if there is a built-in feature that can help me do this.
Thanks
My use case is a configurable layout of the website. Wordpress style. I would like to store that data in a concrete class, without having to implement the file /xml serialization myself.
What about an abstract model? It does not save in the database and is meant to be subclassed, but you are allowed to create instances of it and use its attributes. I assume you want some kind of temporary data structure to pass around that meets the requirements of a model instance.
class Layout(models.Model):
class Meta:
abstract = True
If you for some reason need actual concrete models, and are fine with it creating tables for them, you could technically also re-implement the save() method and make it no-op.
I don't really understand where and how you will be using this, but this is indeed a model that doesn't save.
Personally, I have actually used models that aren't intended to be saved, in a project that uses mongodb and the nonrel django fork. I create models that are purely meant to be embedded into other models as nested sub-documents and I never want them to be committed to a separate collection.
Update
Here is another suggestion that might make things a whole lot easier for your goal. Why not just use a normal django model, save it to the database like normal, and create a simple import/export function to save it out to XML or read into an instance from XML. That way you get 100% normal admin functionality, you can still query the database for the values, and the XML part is just a simple add-on. You can also use this to version up preferences and mark a certain one as active.

Django end-user defined fields, how to? [duplicate]

This question already has answers here:
Closed 11 years ago.
Possible Duplicate:
Django dynamic model fields
Good Morning guys!
Scenario is the following. For some models on Django, I would like to allow the end user to define his own fields. It would be great if I could keep all Django awesome features like the ORM, so I can still do calls like field__gte to search on the model, still have field validation according to field type, etc. I've thought about two ways of doing this, and I'm more than open for new suggestions. Any feedback would be VERY appreciated.
The first approach, is the Entity-Attribute-Value ( http://en.wikipedia.org/wiki/Entity%E2%80%93attribute%E2%80%93value_model ), which django already has an app for. See http://code.google.com/p/django-custom-field/
I think this would be an OK solution, but I lose the ability to do "mymodel.objects.filter(custom_field_x=something)". Maybe there's a way to regain the ORM, any ideas? But I've heard so many bad stories about this method that I'm little scared to use it.
The second approach would be to have a database table for each of the users (probably no more than a 1000). I've read django has something in the lines of inspectdb, which actually checks which fields are there and produces the model for you. This could be useful but I think maybe I should store the fields this particular user has created and somehow dinamically tell django, hey, we also have this fields in this model. Is this possible? I know it's generally bad to have different tables for each user, but considering this scenario, how would you guys rate this method, would it be ok to have one table for each user?
The model that requires custom fields is for example Person. They might want a custom field to store address, blood type, or any other thing.
MANY THANKS in advance! Have a nice sunday!
Very similar: How to create user defined fields in Django -- but only talks about the EAV, which I would like to avoid. I'm open for new ideas!
One approach is to use a NoSQL document-based solution such as MongoDB which allows you to store objects that have a fluid structure (no such restrictions as pre-defined columns).
Pros:
No restriction on custom field types, number of types of fields, etc.
Retains ORM functionality (django-mongodb)
Other various benefits of NoSQL - which you can read about online
Avoids EAV
Cons:
Need to setup NoSQL server
Additional knowledge required on NoSQL concepts (documents vs. tables)
You may have to maintain two databases - if you decide not to migrate your entire solution to NoSQL (multi-db)
EDIT:
After reading the comments its worth pointing out that depending on which NoSQL solution you go with, you may not need reversion support. CouchDB, for example has built in support for document versioning.
what about creating another model for storing user_defined_fields?
class UserDefinedField(models.Model):
#..................
user = models.ForeignKey(User)
field_name = models.CharField(max_length=50)
field_value = models.TextField()
Then you can do UserDefinedField.objects.filter(field_name=some_name,field_value=somevalue)