I've checked a number of SO articles and I don't believe there is any way to accomplish what I want to do, but before I abandon Django I wanted to articulate the question itself and see if I missed something.
I'm implementing a graph (nodes, edges) which can contain subclasses of a base type. That is, an edge can connect a base class to a subclass, or a subclass to a subclass, etc . . . I want to be able to pull all the edges for a given object, find the objects these edges point to, and call some function on these terminal objects. I was hoping that I could call the function in a polymorphic way, but I can't figure out a way to make that happen.
class Node(models.Model):
...
def dosomething():
class SpecialNode(Node):
...
def dosomething():
class Edge(models.Model):
#yes, related_name is weird, but this seems to be what makes sense
source = models.ForeignKey(Node, related_name='targets')
target = models.ForeignKey(Node, related_name='sources')
With this structure I can do:
sourceedges = node.sources.all()
for sourceedge in sourceedges:
sourceedge.source.dosomething()
But the "dosomething" function is always called on the Node object, even if source is actually a SpecialNode object.
I've tried doing this with django_polymorphic but I don't believe this supports M2M inheritance through an Edge object (this is required for other reasons in my app).
I've tried to use contenttypes, but I think you're only allowed one generic relation per class. Edge, in other words, can't have 2 different generic relations in it.
I imagine I could establish an object called an Endpoint which would have just a single generic relation on it, then link the Edge object to it like so:
class Endpoint(models.Model)
...
content_object = generic.GenericForeignKey(...)
class Edge(models.Model)
source = models.ForeignKey(Endpoint, related_name='targets')
target = models.ForeignKey(Endpoint, related_name='sources')
But this introduces another level of indirection in my model and I start to feel like the framework is coding me rather than the other way around :)
Anyway, if someone has figured out a way to get this particular use case done, please let me know. A better way than what I suggested above would be welcome, as I currently like a lot of things I get out of the box with Django.
Thanks
Related
Quick terminology question that's somewhat related to my main question: What is the correct term for a model class and the term for a instance of that class?
I am learning Test Driven Development, and want to make sure I am learning it the right so I can form good habits.
My current project has a SalesmanController, which is pretty much a basic resource controller. Here's my current issue (I can get it working, but I want to make sure its done as "right" as possible)
I have a 'Salesman' model.
The 'Salesman' is mapped as having many 'Sales' using my ORM.
The 'Sales' model is mapped as belongsTo 'Salesman' using my ORM.
I have created a ORMSalesmanRepository which implements the SalesmanRepositoryInterface.
My controller has SalesmanRepositoryInterface passed to it upon construction(constructor dependency injection).
My controller calls the find method on the SalesmanRepositoryInterface implementation it has been given to find the correct salesman.
My view needs information on the salesman and information on all 'Sales' records that belong to him.
The current implementation of SalesmanRepositoryInterface returns an instance of a ORMRecord, and then passes that to the view, which retrieves the sales from the ORMRecord.
My gut tells me this implementation is wrong. The orm record implements Array Access, so it still behaves like an array as far as the view knows.
However, when trying to go back and implement my unit tests I am running into issues mocking my dependencies. (I now know with TDD I'm supposed to make my unit tests and then develop the actual implementation, didn't figure this out till recently).
Salesman = MockedSalesman;
SalesRecords = MockedSalesman->Sales;
Is it poor programming to expect my Salesman to return a ORMObject for the controller to use (for chaining relationships maybe?) or is a controller becoming to 'fat' if I'm allowing it to call more than just basic get methods (using arrayaccess []) on the ORMObject? Should the controller just assume that whatever it gets back is an array (or at least acts like one?)
Also, should it ever come up where something one of my mocked classes returns needs to be mocked again?
Thanks in advance everybody.
What is the correct term for a model class and the term for a instance of that class?
Depends on what you mean with "model classes"? Technically model is a layer, that contains several groups of classes. Most notable ones would be: mappers, services an domain objects. Domain objects as whole are implementation of accumulated knowledge about business requirements, insight from specialists and project goals. This knowledge is referred to as "domain model".
Basically, there is no such thing as "model class". There are classes that are part of model.
What is the controller allowed to assume about what it recieves from a service?
Nothing, because controller should not receive anything from model layer. The responsibility of controller is to alter the state of model layer (and in rare cases - state of current view).
Controller is NOT RESPONSIBLE for:
gather data from model layer,
initializing views
passing data from model layer to views
dealing with authorization checks
The current implementation of SalesmanRepositoryInterface returns an instance of a ORMRecord, and then passes that to the view, which retrieves the sales from the ORMRecord.
It sounds like you are implementing active record pattern. It has very limited use-case, where is is appropriate to use AR - when object mostly consists of getters and setters tht are directly stored in a single table. For anything beyond that active record becomes an anti-pattern because it violates SRP and you loose the ability to test your domain logic without database.
Also, are repository should be returning an instance of domain object and makes sure that you are not retrieving data repeatedly. Repositories are not factories for active record instances.
Is it poor programming to expect my Salesman to return a ORMObject for the controller to use (for chaining relationships maybe?) or is a controller becoming to 'fat' if I'm allowing it to call more than just basic get methods (using arrayaccess []) on the ORMObject?
Yes, it's bad code. Your application logic (one that would usually be contained in services) is leaking in the presentation layer.
At this stage I wouldn't stress too much about your implementation - if you try and write proper tests, you'll quickly find out what works and what doesn't.
The trick is to think hard about what each component is trying to achieve. What is your controller method supposed to do? Most likely it is intended to create a ViewModel of some kind, and then choose which View to render. So there's a few tests right there:
When I call my controller method with given arguments (ShowSalesmanDetail(5))
It should pick the correct View to render ('ShowSalesmanDetail')
It should construct the ViewModel that I expect (A Salesman object with some Sales)
In theory, at this point you don't care how the controller constructs the model, only that it does. In practice though you do need to care, because the controller has dependencies (the big one being the database), which you need to cater for. You've chosen to abstract this with a Repository class that talks to an ORM, but that shouldn't impact the purpose of your tests (though it will definitely alter how you implement those tests).
Ideally the Salesman object in your example would be a regular class, with no dependencies of its own. This way, your repository can construct a Salesman object by populating it from the database/ORM, and your unit tests can also use Salesman objects that you've populated yourself with test data. You shouldn't need to mock your models or data classes.
My personal preference is that you don't take your 'data entities' (what you get back from your ORM) and put them into Views. I would construct a ViewModel class that is tied to one View, and then map from your data entities to your ViewModel. In your example, you might have a Salesman class which represents the data in the database, then a SalesmanModel which represents the information you are displaying on the page (which is usually a subset of what's in the db).
So you might end up with a unit test looking something like this:
public void CallingShowSalesmanShouldReturnAValidModel()
{
ISalesmanRepository repository = A.Fake<ISalesmanRepository>();
SalesmanController controller = new SalesmanController(repository);
const int salesmanId = 5;
Salesman salesman = new Salesman
{
Id = salesmanId,
Name = "Joe Bloggs",
Address = "123 Sesame Street",
Sales = new[]
{
new Sale { OrderId = 123, SaleDate = DateTime.Now.AddMonths(-1) }
}
};
A.CallTo(() => repository.Find(salesmanId)).Returns(salesman);
ViewResult result = controller.ShowSalesman(salesmanId) as ViewResult;
SalesmanModel model = result.Model as SalesmanModel;
Assert.AreEqual(salesman.Id, model.Id);
Assert.AreEqual(salesman.Name, model.Name);
SaleModel saleModel = model.Sales.First();
Assert.AreEqual(salesman.Sales.First().OrderId, saleModel.OrderId);
}
This test is by no means ideal but hopefully gives you an idea of the structure. For reference, the A.Fake<> and A.CallTo() stuff is from FakeItEasy, you could replace that with your mocking framework of choice.
If you were doing proper TDD, you wouldn't have started with the Repository - you'd have written your Controller tests, probably got them passing, and then realised that having all this ORM/DB code in the controller is a bad thing, and refactored it out. The same approach should be taken for the Repository itself and so on down the dependency chain, until you run out of things that you can mock (the ORM layer most likely).
For example we have a method that fetches additional information about a model from third party APIs. Is it okay to put this as a method on the model or should it live outside?
class Entity(models.Model):
name = ...
location = ...
def fetch_location(self):
# fetch the location from another server and store it.
self.location = "result"
If the data is related to the instance than it can be the right place to put it. Only if you get a lot of these you might want to wrap them in a different class for your own readability (i.e. knowing what is internal and what is external from the instance perspective).
The way I generally do it:
Manager: anything pertaining a group of Model instances
Model: anything pertaining a single Model Instance
Well, if you think in terms of Object Oriented Programming, the answer is "yes":
If "the object can do something", than it should be included as a member function (aka method).
But: If several different classes would need the same functionality (e.g. an "Entity Owner" would like to fetch the location by himself without calling my_entity.fetch_location), you should consider a (abstract) class above both classes which implements the behaviour.
If you have to call the method without an existing instance (which seems not to be the case in your example), you might consider writing the method outside of a class or you add the #staticmethod decorator which allows you to call Entity.fetch_location (remember to omit self in this case since there is no self if there is no instance.) I would prefer the staticmethod over a global method because the caller will always know, which class it relates to.
#staticmethod
def fetch_location():
# fetch the location from another server and store it.
self.location = "result"
I'm running a Django shop where we serve each our clients an object graph which is completely separate from the graphs of all the other clients. The data is moderately sensitive, so I don't want any of it to leak from one client to another, nor for one client to delete or alter another client's data.
I would like to structure my code such that I by default write code which adheres to the security requirements (No hard guarantees necessary), but lets me override them when I know I need to.
My main fear is that in a Twig.objects.get(...), I forget to add client=request.client, and likewise for Leaf.objects.get where I have to check that twig__client=request.client. This quickly becomes error-prone and complicated.
What are some good ways to get around my own forgetfulness? How do I make this a thing I don't have to think about?
One candidate solution I have in mind is this:
Set the default object manager as DANGER = models.Manager() on my abstract base class(es).
Have a method ok(request) on said base classes which applies .filter(leaf__twig__branch__trunk__root__client=request.client) as applicable.
use MyModel.ok(request) instead of MyModel.objects wherever feasible.
Can this be improved upon? One not so nice issue is when a view calls a model method, e.g. branch.get_twigs_with_fruit, I now have to either pass a request for it to run through ok or I have to invoke DANGER. I like neither :-\
Is there some way of getting access to the current request? I think that might mitigate the situation...
Ill explain a different problem I had however I think the solution might be something to look into.
Once I was working on a project to visualize data where I needed to have a really big table which will store all the data for all visualizations. That turned out to be a big problem because I would have to do things like Model.objects.filter(visualization=5) which was just not very elegant and not efficient.
To make things simpler and more efficient I ended up creating dynamic models on the fly. Essentially I would create a separate table in the db on the fly and then store a data only for that one visualization in that. My code is something like:
def get_model_class(table_name):
class ModelBase(ModelBase):
def __new__(cls, name, bases, attrs):
name = '{}_{}'.format(name, table_name)
return super(ModelBase, cls).__new__(cls, name, bases, attrs)
class Data(models.Model):
# fields here
__metaclass__ = ModelBase
class Meta(object):
db_table = table_name
return Data
dynamic_model = get_model_class('foo')
This was useful for my purposes because it allowed queries to be much faster but getting back to your issue I think something like this can be useful because this will make sure that each client's data is separate not only via a foreign key, but is actually separated in the db.
Using this method is pretty straight forward except before using the model, you have to call the function to get it for each client. To make things more efficient you can cache/memoize the results of the function call so that it does not have to recompute the same thing more than once.
I am subclassing an existing model. I want many of the members of the parent class to now, instead, be members of the child class.
For example, I have a model Swallow. Now, I am making EuropeanSwallow(Swallow) and AfricanSwallow(Swallow). I want to take some but not all Swallow objects make them either EuropeanSwallow or AfricanSwallow, depending on whether they are migratory.
How can I move them?
It's a bit of a hack, but this works:
swallow = Swallow.objects.get(id=1)
swallow.__class__ = AfricanSwallow
# set any required AfricanSwallow fields here
swallow.save()
I know this is much later, but I needed to do something similar and couldn't find much. I found the answer buried in some source code here, but also wrote an example class-method that would suffice.
class AfricanSwallow(Swallow):
#classmethod
def save_child_from_parent(cls, swallow, new_attrs):
"""
Inputs:
- swallow: instance of Swallow we want to create into AfricanSwallow
- new_attrs: dictionary of new attributes for AfricanSwallow
Adapted from:
https://github.com/lsaffre/lino/blob/master/lino/utils/mti.py
"""
parent_link_field = AfricanSwallow._meta.parents.get(swallow.__class__, None)
new_attrs[parent_link_field.name] = swallow
for field in swallow._meta.fields:
new_attrs[field.name] = getattr(swallow, field.name)
s = AfricanSwallow(**new_attrs)
s.save()
return s
I couldn't figure out how to get my form validation to work with this method however; so it certainly could be improved more; probably means a database refactoring might be the best long-term solution...
Depends on what kind of model inheritance you'll use. See
http://docs.djangoproject.com/en/dev/topics/db/models/#model-inheritance
for the three classic kinds. Since it sounds like you want Swallow objects that rules out Abstract Base Class.
If you want to store different information in the db for Swallow vs AfricanSwallow vs EuropeanSwallow, then you'll want to use MTI. The biggest problem with MTI as the official django model recommends is that polymorphism doesn't work properly. That is, if you fetch a Swallow object from the DB which is actually an AfricanSwallow object, you won't get an instance o AfricanSwallow. (See this question.) Something like django-model-utils InheritanceManager can help overcome that.
If you have actual data you need to preserve through this change, use South migrations. Make two migrations -- first one that changes the schema and another that copies the appropriate objects' data into subclasses.
I suggest using django-model-utils's InheritanceCastModel. This is one implementation I like. You can find many more in djangosnippets and some blogs, but after going trough them all I chose this one. Hope it helps.
Another (outdated) approach: If you don't mind keeping parent's id you can just create brand new child instances from parent's attrs. This is what I did:
ids = [s.pk for s in Swallow.objects.all()]
# I get ids list to avoid memory leak with long lists
for i in ids:
p = Swallow.objects.get(pk=i)
c = AfricanSwallow(att1=p.att1, att2=p.att2.....)
p.delete()
c.save()
Once this runs, a new AfricanSwallow instance will be created replacing each initial Swallow instance
Maybe this will help someone :)
I am in a process of designing a client for a REST-ful web-service.
What is the best way to go about representing the remote resource locally in my django application?
For example if the API exposes resources such as:
List of Cars
Car Detail
Car Search
Dealership summary
So far I have thought of two different approaches to take:
Try to wrangle the django's models.Model to mimic the native feel of it. So I could try to get some class called Car to have methods like Car.objects.all() and such. This kind of breaks down on Car Search resources.
Implement a Data Access Layer class, with custom methods like:
Car.get_all()
Car.get(id)
CarSearch.search("blah")
So I will be creating some custom looking classes.
Has anyone encoutered a similar problem? Perhaps working with some external API's (i.e. twitter?)
Any advice is welcome.
PS: Please let me know if some part of question is confusing, as I had trouble putting it in precise terms.
This looks like the perfect place for a custom manager. Managers are the preferred method for "table-level" functionality, as opposed to "row-level" functionality that belongs in the model class. Basically, you'd define a manager like this:
class CarAPIManager(models.Manager):
def get_detail(self, id):
return self.get(id=id)
def search(self, term):
return self.filter(model_name__icontains=term)
This could be used either as the default manager -- e.g., in your model definition:
class Car(models.Model):
...
objects = CarAPIManager()
# usage
>>> Car.objects.search(...)
or you could just make it an additional manager, as a property of the class:
class Car(models.Model):
...
api = CarAPIManager()
# usage
>>> Car.api.search(...)