I am trying to understand design principles in C++.
In database I have users.
users:
id
name
age
gender
I want to get my users in three ways.
First: I want all my users.
Second: I want all my users filtered by age.
Third: I want all my users filtered by age and gender.
For example, if I use the same class for getAllUsers and getFilteredByAge, does that mean my class has two responsibilities: getting users and also filtering them? Am I right or not? And how does the Single-Responsibility Principle work in this example: should I split these three into different classes, or is there a better way?
A good definition for SRP is:
A module should be responsible to one, and only one, actor.
(Clean Architecture)
This means that if the person telling you what these functions do is the same, then you can leave them in the same class/module.
If, let's say, getAllUsers() is requested by accounting and getUserAtLeastThisOld(int minimumAge) is requested by HR, then it might be sensible to have them in separate classes.
Here are answers to your questions:
Q] If I use the same class for getAllUsers and getFilteredByAge, does it mean that my class has two responsibilities?
A] No, because your class's job is to get users. These functions should be overloads in the same class, not members of different classes.
Q] It is responsible for getting users and also for filtering them. Am I right or not?
A] I would say no. Filtering is not a separate task; it is a condition applied while retrieving the objects.
Q] How does the Single-Responsibility Principle work in this example: should I split these three into different classes, or is there a better way?
A]
In this case I suggest having only one class, with the following function overloads:
GetUsers() - get all users
GetUsers(AgeFilter) - get users as per age filter
GetUsers(AgeFilter, genderFilter) - get users as per age filter and gender filter
Note: Suppose that in the future you want to add more functionality to this class,
e.g. computing a user's salary or adding family details for users.
In that case, create another class instead of putting the whole burden on a single class.
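The overloads suggested above could look roughly like the following minimal sketch. The names `User`, `Gender`, `AgeFilter`, and `UserRepository` are assumptions made for illustration, and an in-memory vector stands in for the database table.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical types for illustration; they are not part of any real library.
enum class Gender { Female, Male };

struct User {
    int id;
    std::string name;
    int age;
    Gender gender;
};

// Inclusive age range used as a filter.
struct AgeFilter {
    int min_age;
    int max_age;
    bool matches(const User& u) const { return u.age >= min_age && u.age <= max_age; }
};

// One class, one job: retrieving users. A filter is just a parameter of retrieval.
class UserRepository {
public:
    explicit UserRepository(std::vector<User> users) : users_(std::move(users)) {}

    // All users.
    std::vector<User> GetUsers() const { return users_; }

    // Users whose age falls inside the filter.
    std::vector<User> GetUsers(const AgeFilter& age) const {
        assert(age.min_age <= age.max_age);  // precondition: a well-formed range
        std::vector<User> result;
        for (const User& u : users_)
            if (age.matches(u)) result.push_back(u);
        return result;
    }

    // Users matching both the age filter and the gender.
    std::vector<User> GetUsers(const AgeFilter& age, Gender gender) const {
        std::vector<User> result;
        for (const User& u : GetUsers(age))
            if (u.gender == gender) result.push_back(u);
        return result;
    }

private:
    std::vector<User> users_;  // stand-in for the users table
};
```

With a real database the filters would normally be pushed into the query rather than applied in memory. The point here is only that all three overloads share one reason to change, which is how users are retrieved.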
While there is general consensus that multi-table inheritance isn't a very good idea in the long term (Jacobian, Others), I am wondering whether in some use cases the "extra joins" created by Django during querying might be worth it.
My issue is having a Single Source of Truth in the database. Say, for Person Objects who are identified using an Identity Number and Identity Type. E.g. ID Number 222, Type Passport.
class Person(models.Model):
    identity_number = models.CharField(max_length=20)
    identity_type = models.IntegerField()
class Student(Person):
    student_number = models.CharField(max_length=20)
class Employee(Person):
    employee_number = models.CharField(max_length=20)
In abstract inheritance, any subclass model of Person (e.g. Student, Parent, Supervisor, Employee) inheriting from an abstract Person class will have identity_number and identity_type stored in its own table.
In multi-table inheritance, since they all share the same table, I can be sure that if I create a unique constraint on both columns in the Person Model then no duplicates will exist in the database.
With abstract inheritance, keeping duplicates out of the database would require extra validation logic in the application, which also slightly degrades performance. Does that cancel out the cost of the "extra join" that Django has to do with concrete inheritance?
It's a mistake to think about your data modeling in object-oriented terms at all. It's an abstraction that fits poorly to relational databases, by hiding some very important details that can massively affect performance (as pointed out in the articles) or correctness (as you've pointed out above).
A traditional SQL approach to your example would offer two possibilities:
Having a Person table with the IDs and then Student, etc. with foreign keys back to it.
Having a single table for everything, with some additional fields to distinguish the different kinds of person.
Now, if your evaluation led you to prefer 1, you might notice that in Django this could be accomplished by using a concrete inheritance model (it's the same as what you describe above). In that case, by all means, use inheritance if you'd find the resulting access patterns in Django more elegant.
So I'm not saying you shouldn't use inheritance, I'm saying you should only look at it after you've modeled your data from the SQL perspective. If you did that in the example above, you would never even consider splitting everything into separate tables—which has all the problems you noted—as suggested by the abstract inheritance model.
I'm working on an application which, among other things, downloads items that belong to a certain category from a server. I want to make the downloader look like this:
class Downloader
{
public:
    Downloader(const ItemCategoryBase &category);
    ...
};
Each class derived from ItemCategoryBase will provide its category ID through a virtual function (in fact that's the only thing each derived class will do).
The issue I'm having is that I have a total of 120 item categories and writing a derived class for each one is going to be painful.
I've considered using a primitive to hold the ID, but I do not wish to implement range checking and throw exceptions in case the ID is out of range, mainly because category IDs aren't all part of the same interval.
What I'm looking for is an efficient way of writing code that would fit the scheme above.
Any help is highly appreciated.
If you really have determined that this is the right way to do things, then I would suggest writing a code generator to handle it for you: create a CSV document containing all the category IDs, and write an app that inserts each ID into template header/source files and saves the result. (For instance, put "$CATEGORY_ID" wherever the category ID goes in the files, and then just do a replace on "$CATEGORY_ID" with each ID in turn.)
However, I'm not sure I understand your statement: "I've considered using a primitive to hold the ID but, I do not wish to implement range checking and throw exceptions in case the ID is out of range mainly because category IDs aren't all part of the same interval." I can't imagine a case in which you wouldn't have to handle the complexity somewhere in your application anyway, and the range checking wouldn't be hard: just put all the valid Category IDs into a list structure of whatever your ID type is, and a simple index lookup call can answer whether the ID is part of that list.
If I have misunderstood you, what exactly is it about your setup that makes dealing with 120 ItemCategoryBase derived classes simpler than one ItemCategoryBase base class validated against a list of the IDs? You say "mainly because category IDs aren't all part of the same interval," so perhaps the checking against a list would give you what you need there. Otherwise, can you explain a bit more about how it works? Although I realize there are always exceptions, 120 classes doing nothing other than providing different IDs really strikes me as something that's unlikely to be a solution that will serve you well in the long run.
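The list-based check described above can be sketched as follows, assuming integer IDs. The class name `ItemCategory`, the `IsValid` helper, and the sample IDs are hypothetical. The question would prefer not to throw, so treat the exception as just one possible way to report an unknown ID.

```cpp
#include <cassert>
#include <stdexcept>
#include <unordered_set>

// Hypothetical category wrapper: one class validated against a list of known IDs,
// instead of 120 derived classes.
class ItemCategory {
public:
    explicit ItemCategory(int id) : id_(id) {
        if (!IsValid(id))
            throw std::invalid_argument("unknown category ID");
    }

    int id() const { return id_; }

    // Set lookup, so the IDs do not need to form a contiguous interval.
    static bool IsValid(int id) { return ValidIds().count(id) != 0; }

private:
    // Placeholder IDs; the real values would be listed here or loaded from a file.
    static const std::unordered_set<int>& ValidIds() {
        static const std::unordered_set<int> ids = {3, 7, 42, 1001, 1002};
        return ids;
    }

    int id_;
};

class Downloader {
public:
    explicit Downloader(const ItemCategory& category) : category_id_(category.id()) {
        // Invariant already guaranteed by ItemCategory's constructor.
        assert(ItemCategory::IsValid(category_id_));
    }

    int category_id() const { return category_id_; }

private:
    int category_id_;
};
```

Because validation happens once, when the `ItemCategory` is constructed, the `Downloader` never needs to re-check the ID or know how the valid set is stored.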
Since you're using C++, why not use templates and specify a non-type template parameter containing the ID?
For example, supposing that the category is an integer:
// The template is the category type; Downloader still takes an ItemCategoryBase.
template<int category_id>
class ItemCategory : public ItemCategoryBase
{
public:
    virtual int get_id()
    {
        return category_id;
    }
};
You might as well let the compiler do the work for you.
Let's say I have an abstract base class that looks like this:
class StellarObject(BaseModel):
    title = models.CharField(max_length=255)
    description = models.TextField()
    slug = models.SlugField(blank=True, null=True)
    class Meta:
        abstract = True
Now, let's say I have two actual database classes that inherit from StellarObject
class Planet(StellarObject):
    type = models.CharField(max_length=50)
    size = models.IntegerField(max_length=10)
class Star(StellarObject):
    mass = models.IntegerField(max_length=10)
So far, so good. If I want to get Planets or Stars, all I do is this:
Thing.objects.all() #or
Thing.objects.filter() #or count(), etc...
But what if I want to get ALL StellarObjects? If I do:
StellarObject.objects.all()
It of course returns an error, because an abstract class isn't an actual database object, and therefore cannot be queried. Everything I've read says I need to do two queries, one each on Planets and Stars, and then merge them. That seems horribly inefficient. Is that the only way?
At its root, this is part of the mismatch between objects and relational databases. The ORM does a great job in abstracting out the differences, but sometimes you just come up against them anyway.
Basically, you have to choose between abstract inheritance, in which case there is no database relationship between the two classes, or multi-table inheritance, which keeps the database relationship at a cost of efficiency (an extra database join) for each query.
You can't query abstract base classes. For multi-table inheritance you can use django-model-utils and its InheritanceManager, which extends the standard QuerySet with a select_subclasses() method that does exactly what you need: it left-joins all inherited tables and returns an instance of the appropriate type for each row.
Don't use an abstract base class if you need to query on the base. Use a concrete base class instead.
This is an example of polymorphism in your models (polymorph - many forms of one).
Option 1 - If there's only one place you deal with this:
For the sake of a little bit of if-else code in one or two places, just deal with it manually - it'll probably be much quicker and clearer in terms of dev/maintenance (i.e. maybe worth it unless these queries are seriously hammering your database - that's your judgement call and depends on circumstance).
Option 2 - If you do this quite a bit, or really demand elegance in your query syntax:
Luckily there's a library to deal with polymorphism in Django, django-polymorphic. Its docs will show you precisely how to do this. This is probably the "right answer" for querying straightforwardly as you've described, especially if you want to do model inheritance in lots of places.
Option 3 - If you want a halfway house:
This kind of has the drawbacks of both of the above, but I've used it successfully in the past to automatically do all the zipping together from multiple query sets, whilst keeping the benefits of having one query set object containing both types of models.
Check out django-querysetsequence which manages the merge of multiple query sets together.
It's not as well supported or as stable as django-polymorphic, but worth a mention nevertheless.
In this case I think there's no other way.
For optimization, you could avoid inheriting from the abstract StellarObject and instead make it a separate table connected via a foreign key to the Star and Planet objects.
That way both of them would have, e.g., star.stellar_info.description.
Another way would be to add an additional model for handling the information and use StellarObject as the through model in a many-to-many relation.
I would consider moving away from either an abstract inheritance pattern or the concrete base pattern if you're looking to tie distinct sub-class behaviors to the objects based on their respective child class.
When you query via the parent class (which it sounds like you want to do), Django treats the resulting objects as objects of the parent class, so accessing child-class-level methods requires re-casting the objects into their 'proper' child class on the fly so they can see those methods. At that point a series of if statements hanging off a parent-class-level method would arguably be a cleaner approach.
If the sub-class behavior described above isn't an issue, you could consider a custom manager attached to an abstract base class sewing the models together via raw SQL.
If you're interested mainly in assigning a discrete set of identical data fields to a bunch of objects, I'd relate along a foreign-key, like bx2 suggests.
That seems horribly inefficient. Is that the only way?
As far as I know it is the only way with Django's ORM. As implemented currently abstract classes are a convenient mechanism for abstracting common attributes of classes out to super classes. The ORM does not provide a similar abstraction for querying.
You'd be better off using another mechanism for implementing hierarchy in the database. One way to do this would be to use a single table and "tag" rows using type. Or you can implement a generic foreign key to another model that holds properties (the latter doesn't sound right even to me).
I have a class, let's say Person, which is managed by another class/module, let's say PersonPool.
I have another module in my application, let's say module M, that wants to associate information with a person, in the most efficient way. I considered the following alternatives:
Add a data member to Person, which is accessed by the other part of the application. Advantage is that it is probably the fastest way. Disadvantage is that this is quite invasive. Person doesn't need to know anything about this extra data, and if I want to shield this data member from other modules, I need to make it private and make module M a friend, which I don't like.
Add a 'generic' property bag to Person, in which other modules can add additional properties. Advantage is that it's not invasive (besides having the property bag), and it's easy to add 'properties' by other modules as well. Disadvantage is that it is much slower than simply getting the value directly from Person.
Use a map/hashmap in module M, which maps the Person (pointer, id) to the value we want to store. This looks like the best solution in terms of separation of data, but again is much slower.
Give each person a unique number and make sure that no two persons ever get the same number during history (I don't even want to have these persons reuse a number, because then data of an old person may be mixed up with the data of a new person). Then the external module can simply use a vector to map the person's unique number to the specific data. Advantage is that we don't invade the Person class with data it doesn't need to know of (except its unique number), and that we have a quick way of getting the data specifically for module M from the vector. Disadvantage is that the vector may become really big if lots of persons are deleted and created (because we don't want to reuse the unique number).
In the last alternative, the problem could be solved by using a sparse vector, but I don't know if there are very efficient implementations of a sparse vector (faster than a map/hashmap).
Are there other ways of getting this done?
Or is there an efficient sparse vector that might solve the memory problem of the last alternative?
I would time the solution with map/hashmap and go with it if it performs well enough. Otherwise you have no choice but to add those properties to the class, as this is the most efficient way.
Alternatively, you can create a subclass of Person, basically forward all the interface methods to the original class but add all the properties you want and just change original Person to your own modified one during some of the calls to M.
This way module M will see the subclass and all the properties it needs but all other modules would think of it as just an instance of Person class and will not be able to see your custom properties.
The first and third are reasonably common techniques. The second is how dynamic programming languages such as Python and JavaScript implement member data for objects, so do not dismiss it out of hand as impossibly slow. The fourth is in the same ballpark as how relational databases work. It is possible, but difficult, to make relational databases run like the clappers.
In short, you've described 4 widely used techniques. The only way to rule any of them out is with details specific to your problem (required performance, number of Persons, number of properties, number of modules in your code that will want to do this, etc), and corresponding measurements.
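As a starting point for such measurements, the third alternative from the question (a side table owned by module M) could be sketched like this. `Person`, `PersonId`, and `PersonSideTable` are hypothetical names, and the key is assumed to be a unique, never-reused ID as in the fourth alternative.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical Person carrying only a unique, never-reused ID.
using PersonId = std::uint64_t;

class Person {
public:
    Person(PersonId id, std::string name) : id_(id), name_(std::move(name)) {}
    PersonId id() const { return id_; }
    const std::string& name() const { return name_; }

private:
    PersonId id_;
    std::string name_;
};

// Module M's private side table: Person knows nothing about the attached data.
template <typename Data>
class PersonSideTable {
public:
    // Data must be default-constructible because operator[] is used here.
    void set(const Person& p, Data value) { data_[p.id()] = std::move(value); }

    // Returns nullptr when no data has been attached to this person.
    const Data* find(const Person& p) const {
        auto it = data_.find(p.id());
        return it == data_.end() ? nullptr : &it->second;
    }

    // Precondition: data was attached to this person earlier.
    const Data& at(const Person& p) const {
        auto it = data_.find(p.id());
        assert(it != data_.end());
        return it->second;
    }

    // Call when a Person is destroyed so the table does not grow without bound.
    void erase(const Person& p) { data_.erase(p.id()); }

    std::size_t size() const { return data_.size(); }

private:
    std::unordered_map<PersonId, Data> data_;
};
```

Unlike a plain vector indexed by ID, the hash map only stores entries for persons that still have data attached, so it serves as the sparse structure the question asks about, at the cost of a hash lookup per access. Whether that cost matters is what the timing should show.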
Another possibility is for module M to define a class which inherits from Person, and adds extra data members. The principle here is that M's idea of a person differs from Person's idea of a person, so describe M's idea as a class. Of course this only works if all other modules operating on the same Person objects are doing so via polymorphism, and furthermore if M can be made responsible for creating the objects (perhaps via dependency injection of a factory). That's quite a big "if". An even bigger one, if nothing other than M needs to do anything life-cycle-ish with the objects, then you may be able to use composition or private inheritance in preference to public inheritance. But none of it is any use if module N is going to create a collection of Persons, and then module M wants to attach extra data to them.
Suppose I have a three level hierarchy consisting of school, students, and classes.
If I expose student as a resource, my question is whether I should always return the parent "school" and the children "classes" along with that student, or whether there should be a parameter that the user includes to request them. Perhaps something like &deep=True?
Or on the other hand, if a user gets a student, and he wants the school, he has to do a GET on the school resource, and likewise if he wants all the classes that a student is taking, he has to do a GET on the classes resource?
I'm trying to keep the design somewhat open for the unknown future user, rather than coding just for what our current requirements demand.
Thanks,
Neal Walters
If you think about Resource design more in the way you think about UI design then the problem becomes easier. There is no reason why you cannot return a subset of school information within the representation of the Student resource and also return a link to a complete representation of School resource in case the user wishes to see more.
I find it useful to think of a REST interface more like a user interface for machines instead of a data access layer. With this mindset it is not a problem to duplicate information in different resource representations.
I know there are lots of people trying to treat REST like a DAL but they are the same people that get upset when they find out that you can't do transactions via a RESTful interface.
Put another way, design your API as you would design a website (but without any of the pretty stuff) and then build a client that can crawl the site for the information it needs.
I think you should avoid thinking of classes as a sub-resource or attribute of a student. An academic class is more than just a time slot on a student's schedule; it has an instructor, a syllabus, etc., all of which may need to be encoded at some point.
As I see it, the following relations hold:
schools have zero or more students
schools have zero or more classes
students have zero or more classes
classes have zero or more students
(You could also trivially extend these with teachers/instructors, if your requirements included such information.)
In addition, each of the above resource types will have any number of attributes beyond the simple links between them.
Given that, I would think you'd want a URL structure something like the following:
http://example.com/lms/schools => list of schools
http://example.com/lms/schools/{school} => info about one school
http://example.com/lms/schools/{school}/students => list of students
http://example.com/lms/schools/{school}/students/{student} => info on one student
http://example.com/lms/schools/{school}/students/{student}/courses => list of courses (as links, not full resources) student is enrolled in
http://example.com/lms/schools/{school}/courses => list of courses
http://example.com/lms/schools/{school}/courses/{course} => info on one course
http://example.com/lms/schools/{school}/courses/{course}/students => list of students (as links, not full resources) enrolled in course
I suppose that adding query parameters to optimize delivery is reasonable. I might make it even more generic and use include=<relation>. This can be extended for all types. Note that you can use multiple includes: .../student/<id>?include=school&include=student will assign the list [school, student] to the parameter include. This general pattern may be useful for the other resources as well.