I'm wondering, with regard to the guideline stating that classes should have around 7 ± 2 variables, whether class variables (class constants) are included in this count.
Ex:
class Foo
{
    static const int SOME_THING;
    static const double SOME_OTHER;
    static const int BLAH;

    int m_ThisVariable;
    double m_ThatVariable;
    string m_SomeString;

public:
    //....
};
Would you consider the above to count as 3 or 6 with regard to the 7 ± 2 rule?
Anyone who honestly thinks that you can arbitrarily define how many member variables a class should have has not written a lot of code or is extremely arrogant. I know it's just a guideline, but honestly, if the class is well defined, conforms to the general OOP guideline of single responsibility, and is easy to maintain, you should just spend your time solving real problems.
BTW, I realize that this is not an actual answer, so let the downvoting begin. I just had to vent :)
EDIT: Just did a little searching and found that this 'guideline' comes from the fact that humans have trouble remembering sequences of information with more than five or six discrete data points. Well, that's nice, and it is something to remember (especially when designing user interfaces), but in practice you cannot design your code this way. Do what makes sense and makes your life easier (maintenance considerations being part of that decision).
Aside from the fact that the number of variables shouldn't be capped at some arbitrary maximum, I would argue that what is important is considering groups.
As such, I would consider static variables and non-static variables two separate groups (this is visually rendered in your code example as they are separated by a blank line). If they were all grouped together, then I'd think they count as one group.
I don't know, however, whether this analysis has any value whatsoever, as I agree with Ed completely.
By the way, if you want a convenient means of grouping variables together in the IDE without having to actually put them into classes, MSVC supports the #pragma region directive. That just lumps some lines of code together into regions that can be collapsed or expanded by clicking the little "+" icon to the left; it has no effect on the compiled result, it's just markup for the code editor.
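For example, a minimal sketch (the region names are arbitrary):

class Foo
{
#pragma region Constants
    static const int SOME_THING;
    static const double SOME_OTHER;
#pragma endregion

#pragma region Instance members
    int m_ThisVariable;
    double m_ThatVariable;
#pragma endregion
};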
I'm pretty sure constants shouldn't be counted. Most classes won't have many (any?), anyway. If your class does have a large number of constants, you probably ought to move them out into their own class, but one or two here and there aren't going to make any difference.
I know everybody is jumping in on the "this is crazy" side of this argument, so I'll just mention that I think it's not a totally unreasonable rule of thumb. In that respect, it's like the "no function longer than a single screen-full in the editor" rule. Violating the rule just means you ought to take a careful look at the code and make sure it's not getting more complex than necessary.
From: http://www.parashift.com/c++-faq-lite/basics-of-inheritance.html#faq-19.9
Three keys: ROI, ROI and ROI.
Every interface you build has a cost and a benefit. Every reusable
component you build has a cost and a benefit. Every test case, every
cleanly structured thing-a-ma-bob, every investment of any sort. You
should never invest any time or any money in any thing if there is not
a positive return on that investment. If it costs your company more
than it saves, don't do it!
Not everyone agrees with me on this; they have a right to be wrong.
For example, people who live sufficiently far from the real world act
like every investment is good. After all, they reason, if you wait
long enough, it might someday save somebody some time. Maybe. We hope.
That whole line of reasoning is unprofessional and irresponsible. You
don't have infinite time, so invest it wisely. Sure, if you live in an
ivory tower, you don't have to worry about those pesky things called
"schedules" or "customers." But in the real world, you work within a
schedule, and you must therefore invest your time only where you'll
get good pay-back.
Back to the original question: when should you invest time in building
a protected interface? Answer: when you get a good return on that
investment. If it's going to cost you an hour, make sure it saves
somebody more than an hour, and make sure the savings isn't "someday
over the rainbow." If you can save an hour within the current project,
it's a no-brainer: go for it. If it's going to save some other project
an hour someday maybe we hope, then don't do it. And if it's in
between, your answer will depend on exactly how your company trades
off the future against the present.
The point is simple: do not do something that could damage your
schedule. (Or if you do, make sure you never work with me; I'll have
your head on a platter.) Investing is good if there's a pay-back for
that investment. Don't be naive and childish; grow up and realize that
some investments are bad because they, in balance, cost more than they
return.
Well, I didn't understand how to relate this to the C++ protected interface.
Please give some real C++ examples to show what this FAQ is talking about.
First off, do not ever treat any programming reference as definitive. Ever. Everything is somebody's opinion, and in the end you should do what works best for you.
So, that said, what this text is basically trying to say is "don't use techniques that cost you more time than they save". One example of the "protected interface" they're describing is the following:
class C {
public:
    int x;
};
Now, in Java, all the Java EE programming books will tell you to always implement that class like this (shown here in C++ syntax):
class C {
public:
    int getX() { return x; }
    void setX(int x) { this->x = x; }
private:
    int x;
};
... that's an implementation of proper encapsulation (a technical term which, simplifying a little, means minimizing the sharing of details between discrete parts). The classes using your code only care that you have some way to get and set an integer, not that it's actually stored as an int inside the class. So if you use accessor methods, you're better able to change the underlying implementation later: maybe you want it to read that variable from the network?
However, that's a fair amount of extra code (in terms of characters) and some extra complexity. Doing things properly actually has a cost! It's not a cost in terms of correctness of the code - directly - but you spent some number of minutes doing it "better" that you could have spent doing something else, and there is a nonzero amount of work involved in maintaining everything you write, no matter how trivial.
So, what is being said in this passage is in my mind good advice: always double-check that when you go to do something, you're going to get more out of it than what you put in. Sanity check that you are not following an ideal to the detriment of your actual effectiveness as a programmer or a human being.
That's advice that will serve you well in any programming language, and in any walk of life.
From your quote above, the guy sounds like a pedantic jerk :)
Looking at the previous entries in his FAQ, he's really saying the following:
1) A class has two distinct interfaces for two distinct sets of clients:
It has a public interface that serves unrelated classes
It has a protected interface that serves derived classes
2) Should you always go to the trouble of creating two different interfaces for each class?
3) Answer: "no, not necessarily"
Sometimes it's worth the extra effort to create protected getter and setter methods, and make all data "private"
Other times - he says - it's "good enough" to make the data itself "protected". Without doing all the extra work of writing a bunch of extra code, and incurring the consequent size and performance penalties.
Sounds reasonable to me. Do what you need to do - but don't go overboard and do a bunch of unnecessary stuff in the name of "theory".
That's all he's saying - use good judgement, and don't go overboard.
You can't argue with that :)
PS:
FAQs 19.5 through 19.9 in your link deal with "derived classes". None of this discussion is relevant outside of the question "how should I structure base classes for inheritance?" In other words, it's not a discussion about "classes" in general - only about "how should a superclass best make things visible to its subclasses?".
My current project's code base has every unit and its friend typedef'd.
Extract :-
...
typedef int m;   // meter
typedef int htz; // hertz
typedef int s;   // second
...
Good or Bad?
I hate it! It's a pain, there is no benefit, and "m" is globally defined, omg!
But I want to state the reason why I hate it in a bit more technical/articulate manner... hello readers!
Can people list For/Against arguments for this pattern? Many thanks.
Better to make them custom types, as then you can control conversions and overload operators. Right now, I can do meaningless things like multiply a metre by a hertz. Ideally, m / s would yield a velocity - but it won't. It's meaningless to just typedef them like that.
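For illustration, a minimal sketch of what such custom types might look like (the type names are mine, not from the code base in question):

struct Meters { double value; };
struct Seconds { double value; };
struct Velocity { double value; }; // meters per second

// Only the conversions we define explicitly are possible:
Velocity operator/(Meters m, Seconds s)
{
    Velocity v = { m.value / s.value };
    return v;
}

int main()
{
    Meters d = { 100.0 };
    Seconds t = { 9.58 };
    Velocity v = d / t;    // fine: distance / time yields a velocity
    // Velocity w = t / d; // compile error: no such operator defined
    return 0;
}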
Presumably they are trying to document intent, but without type safety there is no enforcing it. It is just clutter that increases the barrier to entry for reasoning about the code.
Even if they did try to create type safety, trying to abstract data at low levels just adds complexity. It doesn't make solving problems easier. The variable name describes the contents well enough anyway.
I'm studying for an exam and am trying to figure this question out. The specific question is "Inheritance and object composition both promote code reuse. (T/F)", but I believe I understand the inheritance portion of the question.
I believe inheritance promotes code reuse because similar methods can be placed in an abstract base class such that they do not have to be identically implemented within multiple child classes. For example, if you have three kinds of shapes, and each shape's method "getName" simply returns a data member '_name', then why re-implement this method in each of the child classes when it can be implemented once in the abstract base class "shape"?
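For example, a minimal sketch of that (names are illustrative):

#include <string>

class Shape {
public:
    Shape(const std::string& name) : _name(name) {}
    virtual ~Shape() {}
    // Implemented once here; every derived shape reuses it:
    std::string getName() const { return _name; }
private:
    std::string _name;
};

class Circle : public Shape {
public:
    Circle() : Shape("circle") {}
};

class Square : public Shape {
public:
    Square() : Shape("square") {}
};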
However, my best understanding of object composition is the "has-a" relationship between objects/classes. For example, a student has a school, and a school has a number of students. This can be seen as object composition since they can't really exist without each other (a school without any students isn't exactly a school, is it? etc). But I see no way that these two objects "having" each other as a data member will promote code reuse.
Any help? Thanks!
Object composition can promote code reuse because you can delegate implementation to a different class, and include that class as a member.
Instead of putting all your code in your outermost classes' methods, you can create smaller classes with smaller scopes, and smaller methods, and reuse those classes/methods throughout your code.
class Inner
{
public:
    void DoSomething();
};

class Outer1
{
public:
    void DoSomethingBig()
    {
        // We delegate part of the call to inner, allowing us to reuse its code
        inner.DoSomething();
        // Todo: Do something else interesting here
    }

private:
    Inner inner;
};

class Outer2
{
public:
    void DoSomethingElseThatIsBig()
    {
        // We used the implementation twice in different classes,
        // without copying and pasting the code.
        // This is the most basic possible case of reuse.
        inner.DoSomething();
        // Todo: Do something else interesting here
    }

private:
    Inner inner;
};
As you mentioned in your question, this is one of the two most basic Object Oriented Programming principles, called a "has-a relationship". Inheritance is the other relationship, and is called an "is-a relationship".
You can also combine inheritance and composition in quite useful ways that will often multiply your code (and design) reuse potential. Any real world and well-architected application will constantly combine both of these concepts to gain as much reuse as possible. You'll find out about this when you learn about Design Patterns.
Edit:
Per Mike's request in the comments, a less abstract example:
// Assume the Person class exists
#include <list>

class Bus
{
public:
    void Board(Person newOccupant);
    std::list<Person>& GetOccupants();

private:
    std::list<Person> occupants;
};
In this example, instead of re-implementing a linked list structure, you've delegated it to a list class. Every time you use that list class, you're re-using the code that implements the list.
In fact, since list semantics are so common, the C++ standard library gave you std::list, and you just had to reuse it.
1) The student knows about a school, but this is not really a HAS-A relationship; while you would want to keep track of what school the student attends, it would not be logical to describe the school as being part of the student.
2) More people occupy the school than just students. That's where the reuse comes in. You don't have to re-define the things that make up a school each time you describe a new type of school-attendee.
I have to agree with @Karl Knechtel -- this is a pretty poor question. As he said, it's hard to explain why, but I'll give it a shot.
The first problem is that it uses a term without defining it -- and "code reuse" means a lot of different things to different people. To some people, cutting and pasting qualifies as code reuse. As little as I like it, I have to agree with them, to at least some degree. Other people define code reuse in ways that rule out cutting and pasting (counting another copy of the same code as separate code, not reuse of the same code). I can see that viewpoint too, though I tend to think their definition is intended more to serve a specific end than to be really meaningful (i.e., "code reuse"->good, "cut-n-paste"->bad, therefore "cut-n-paste"!="code reuse"). Unfortunately, what we're looking at here is right on the border, where you need a very specific definition of what code reuse means before you can answer the question.
The definition used by your professor is likely to depend heavily upon the degree of enthusiasm he has for OOP -- especially during the '90s (or so) when OOP was just becoming mainstream, many people chose to define it in ways that only included the cool new OOP "stuff". To achieve the nirvana of code reuse, you had to not only sign up for their OOP religion, but really believe in it! Something as mundane as composition couldn't possibly qualify -- no matter how strangely they had to twist the language for that to be true.
As a second major point, after decades of use of OOP, a few people have done some fairly careful studies of what code got reused and what didn't. Most that I've seen have reached a fairly simple conclusion: it's quite difficult (i.e., essentially impossible) to correlate coding style with reuse. Nearly any rule you attempt to make about what will or won't result in code reuse can and will be violated on a regular basis.
Third, and what I suspect tends to be foremost in many people's minds is the fact that asking the question at all makes it sound as if this is something that can/will affect a typical coder -- that you might want to choose between composition and inheritance (for example) based on which "promotes code reuse" more, or something on that order. The reality is that (just for example) you should choose between composition and inheritance primarily based upon which more accurately models the problem you're trying to solve and which does more to help you solve that problem.
Though I don't have any serious studies to support the contention, I would posit that the chances of code being reused will depend heavily upon a couple of factors that are rarely even considered in most studies: 1) how similar a problem somebody else needs to solve, and 2) whether they believe it will be easier to adapt your code to their problem than to write new code.
I should add that in some of the studies I've seen, there were factors found that seemed to affect code reuse. To the best of my recollection, the one that stuck out as most important/telling was not the code itself at all, but the documentation available for that code. Being able to use the code without basically having to reverse engineer it contributes a great deal toward its being reused. The second factor was simply the quality of the code -- a number of the studies were done in places/situations where they were trying to promote code reuse. In a fair number of cases, people tried to reuse quite a bit more code than they really did, but had to give up on it simply because the code wasn't good enough -- everything from bugs to clumsy interfaces to poor portability prevented reuse.
Summary: I'll go on record as saying that code reuse has probably been the most overhyped, under-delivered promise in software engineering over at least the last couple of decades. Even at best, code reuse remains a fairly elusive goal. Trying to simplify it to the point of treating it as a true/false question based on two factors is oversimplifying the question to the point that it's not only meaningless, but utterly ridiculous. It appears to trivialize and demean nearly the entire practice of software engineering.
I have an object Car and an object Engine:
class Engine {
    int horsepower;
};

class Car {
    string make;
    Engine cars_engine;
};
A Car has an Engine; this is composition. However, I don't need to redefine Engine to put an engine in a car -- I simply say that a Car has an Engine. Thus, composition does indeed promote code reuse.
Object composition does promote code re-use. Without object composition, if I understand your definition of it properly, every class could have only primitive data members, which would be beyond awful.
Consider the classes
class Vector3
{
    double x, y, z;
    double vectorNorm;
};

class Object
{
    Vector3 position;
    Vector3 velocity;
    Vector3 acceleration;
};
Without object composition, you would be forced to have something like
class Object
{
    double positionX, positionY, positionZ, positionVectorNorm;
    double velocityX, velocityY, velocityZ, velocityVectorNorm;
    double accelerationX, accelerationY, accelerationZ, accelerationVectorNorm;
};
This is just a very simple example, but I hope you can see how even the most basic object composition promotes code reuse. Now think about what would happen if Vector3 contained 30 data members. Does this answer your question?
I have created a class that models time slots in a variable-granularity daily schedule, where, for example, the first time slot is 30 minutes, but the second time slot can be 40 minutes and the first available slot starts at (a value comparable to) 1.
What I want to do now is to define somehow the maximum and minimum allowable values that this class takes and I have two practical questions in order to do so:
1.- Does it make sense to define an absolute minimum and maximum in such a way for a custom class? Or better, does it suffice that a value always compares as lower than any other possible value of the type, given the class's defined relational operators, for it to be defined as the min? (and analogously for the max)
2.- Assuming the previous question has an answer modeled after "yes" (or "yes, but ..."), how do I define such a max/min? I know that there is std::numeric_limits<>, but from what I read it is intended for "numeric types". Do I interpret that as meaning "represented as a number", or can I make a broader assumption like "represented with numbers" or "having a correspondence to integers"? After all, it would make sense to define the minimum and maximum for a date class, and maybe for a dictionary class, but numeric_limits may not be intended for those uses (I don't have much experience with it). Plus, numeric_limits has a lot of extra members and information that I don't know what to do with. If I don't use numeric_limits, what other well-known / widely-used mechanism does C++ offer to indicate the available range of values for a class?
Having trouble making sense of your question. I think what you're asking is whether it makes sense to be assertive about the class's domain (the data that can be fed to it and still make sense), and if so, how to be assertive.
The first has a very clear answer: yes, absolutely. You want your class to be, "...easy to use correctly and difficult to use incorrectly." This includes making sure the clients of the class are being told when they do something wrong.
The second has a less clear answer. Much of the time you'll simply want to use the assert() function to assert a function's or class's domain. Other times you'll want to throw an exception. Sometimes you want to do both. When performance can be an issue, sometimes you'll want to provide an interface that does neither. Usually you want to provide an interface that can at least be checked against, so that clients can tell what is valid/invalid input before attempting to feed it to your class or function.
The reason you might want to both assert and throw is that throwing an exception unwinds the stack, which can make debugging difficult, while an assert fires only in debug builds and does nothing in release builds to protect you from running calculations or doing things that can cause crashes or invalid data. Thus asserting and then throwing is often the best answer: you can debug when you run into the problem while testing, but the user is still protected when those bugs make it to the shelf.
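A minimal sketch of that assert-then-throw pattern, using a hypothetical TimeSlot class loosely based on your description (the bounds are illustrative):

#include <cassert>
#include <stdexcept>

class TimeSlot {
public:
    static const int MIN_SLOT = 1;   // illustrative bounds
    static const int MAX_SLOT = 999;

    explicit TimeSlot(int slot)
    {
        // Fires in debug builds, right at the point of the bad call:
        assert(slot >= MIN_SLOT && slot <= MAX_SLOT);
        // Still protects release builds, where the assert is compiled out:
        if (slot < MIN_SLOT || slot > MAX_SLOT)
            throw std::out_of_range("TimeSlot: slot out of range");
        m_slot = slot;
    }

private:
    int m_slot;
};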
For your class you might consider a couple of ways to provide a min/max. One is to provide min/max functions in the class's interface. Another might be to use external functionality; yes, numeric_limits might just be the thing, since a range is sometimes a kind of numeric quantity. You could even provide a more generic interface that has a validate_input() function in your class, so that you can do any comparison that might be appropriate.
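As a sketch of those options (again with a hypothetical TimeSlot; note that the standard does permit specializing std::numeric_limits for user-defined types):

#include <limits>

class TimeSlot {
public:
    explicit TimeSlot(int slot) : m_slot(slot) {}
    // Option 1: expose the bounds through the class's own interface.
    static TimeSlot min() { return TimeSlot(1); }
    static TimeSlot max() { return TimeSlot(999); }
private:
    int m_slot;
};

// Option 2: specialize std::numeric_limits for the class.
namespace std {
    template <>
    class numeric_limits<TimeSlot> {
    public:
        static const bool is_specialized = true;
        static TimeSlot min() { return TimeSlot::min(); }
        static TimeSlot max() { return TimeSlot::max(); }
        // ...the many remaining members would only need similar
        // treatment if client code actually queries them.
    };
}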
The second part of your question has a lot of valid answers depending on a lot of variables including personal taste.
As the designer of your schedule/slot code, it's up to you as to how much flexibility/practicality you want.
Two simple approaches would be to either define your own values in that class
const long MIN_SLOT = 1;
const long MAX_SLOT = 999; // for example
Or define another class that holds the definitions
class SchedLimits {
public:
    static const long MIN_SLOT = 1;
    static const long MAX_SLOT = 999;
};
Simplest of all would be enums. (my thanks to the commenter that reminded me of those)
enum {MIN_SLOT = 1, MAX_SLOT = 999};
Just create some const static members that reflect the minimums and maximums.
Assume a largish template library with around 100 files containing around 100 templates and overall more than 200,000 lines of code. Some of the templates use multiple inheritance to make the usage of the library itself rather simple (i.e. you inherit from some base templates and only have to implement certain business rules).
All that exists (grown over several years), "works" and is used for projects.
However, compilation of projects using that library consumes a growing amount of time, and it takes quite some time to locate the source of certain bugs. Fixing often causes unexpected side effects or is quite difficult, because some interdependent templates need changing. Testing is nearly impossible due to the sheer number of functions.
Now, I would really like to simplify the architecture to use less templates and more specialized smaller classes.
Is there any proven way to go about that task? What would be a good place to start?
I'm not sure I see how/why templates are the problem, and why plain non-templated classes would be an improvement. Wouldn't that just mean even more classes, less type safety and thus a larger potential for bugs?
I can understand simplifying the architecture, refactoring and removing dependencies between the various classes and templates, but automatically assuming that "fewer templates will make the architecture better" is flawed imo.
I'd say that templates potentially allow you to build a much cleaner architecture than you'd get without them, simply because you can make separate classes totally independent. Without templates, a class whose functions call into another class must know about that class, or an interface it inherits, in advance. With templates, this coupling isn't necessary.
Removing templates would only lead to more dependencies, not fewer.
The added type safety of templates can be used to detect a lot of bugs at compile time (sprinkle your code liberally with static_asserts for this purpose).
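For example, a minimal sketch (C++11; the trait check is purely illustrative):

#include <type_traits>

template <typename T>
T average(const T* values, int count)
{
    // Misuse fails at compile time with a readable message:
    static_assert(std::is_floating_point<T>::value,
                  "average() requires a floating-point type");
    T sum = T(0);
    for (int i = 0; i < count; ++i)
        sum += values[i];
    return sum / count;
}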
Of course, the added compile-time may be a valid reason to avoid templates in some cases, and if you only have a bunch of Java programmers, who are used to thinking in "traditional" OOP terms, templates might confuse them, which can be another valid reason to avoid templates.
But from an architecture point of view, I think avoiding templates is a step in the wrong direction.
Refactor the application, sure, it sounds like that's needed. But don't throw away one of the most useful tools for producing extensible and robust code just because the original version of the app misused it. Especially if you're already concerned with the amount of code, removing templates will most likely lead to more lines of code.
You need automated tests; that way, in ten years' time, when your successor has the same problem, he can refactor the code (probably to add more templates, because he thinks it will simplify usage of the library) and know it still meets all the test cases. Similarly, the side effects of any minor bug fixes will be immediately visible (assuming your test cases are good).
Other than that: "divide and conquer".
Write unit tests.
Where the new code must do the same as the old code.
That's one tip at least.
Edit:
If you deprecate old code that you have replaced with the new functionality, you can phase over to the new code little by little.
Well, the problem is that the template way of thinking is very different from the object-oriented, inheritance-based way. It's hard to answer anything other than "redesign the whole thing and start from scratch".
Of course, there may be a simple way for a particular case. We can't tell without knowing more about what you have.
The fact that the template solution is so difficult to maintain is an indication of a poor design anyway.
Some points (note: none of these are evil in themselves; if you want to change to non-template code, though, this list can help out):
Look at your static interfaces. Where do templates depend on what functions exist? Where do they need typedefs?
Put the common parts in an abstract base class. A good example is when you happen to stumble over the CRTP idiom: you can just replace it with an abstract base class having virtual functions (see the sketch after this list).
Look for integer lists. If you find your code uses integral lists like list<1, 3, 3, 1, 3>, you can replace them with std::vector, if all the code using them can live with runtime values instead of constant expressions.
Look for type traits. Typical templated code contains much machinery for checking whether some typedef exists or whether some method exists. Abstract base classes solve these two issues by using pure virtual methods and by inheriting typedefs from the base. Often, typedefs are only needed to trigger hideous features like SFINAE, which would then be superfluous too.
Look for expression templates. If your code uses expression templates to avoid creating temporaries, you will have to eliminate them and use the traditional way of returning / passing temporaries to the operators involved.
Look for function objects. If you find your code uses function objects, you can change them to use abstract base classes too, and have something like void run(); to call them (or if you want to keep using operator(), better so! It can be virtual too).
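To illustrate the CRTP point above, a minimal before/after sketch (names are illustrative):

// Before: static polymorphism via CRTP
template <typename Derived>
class ShapeBase {
public:
    double area() const
    {
        return static_cast<const Derived*>(this)->computeArea();
    }
};

// After: a plain abstract base class with virtual functions
class Shape {
public:
    virtual ~Shape() {}
    virtual double computeArea() const = 0;
};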
As I understand, you are most concerned with build times, and the maintainability of your library?
First, don't try to "fix" all at once.
Second, understand what you fix. Template complexity is often there for a reason, e.g. to enforce certain use and to make the compiler help you not make a mistake. That reason might sometimes be taken too far, but throwing out 100 lines because "no one really knows what they do" shouldn't be taken lightly. Everything I suggest here can introduce really nasty bugs; you have been warned.
Third, consider cheaper fixes first: e.g. faster machines or distributed build tools. At the very least, throw in all the RAM the boards will take, and throw out old disks. It does make a difference. One drive for the OS, one drive for the build is a cheap man's RAID.
Is the library well documented? That's your best chance at making it maintainable. Look into tools such as doxygen that help you create such documentation.
All considered? OK, now some suggestions for the build times ;)
Understand the C++ build model: every .cpp is compiled individually. That means many .cpp files with many headers = a huge build. This is NOT advice to put everything into one .cpp file, though! However, one trick (!) that can speed up a build immensely is to create a single .cpp file that includes a bunch of other .cpp files, and to feed only that "master" file to the compiler. You can't do that blindly, though - you need to understand the types of errors this could introduce.
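A minimal sketch of such a "master" file (the file names here are made up):

// master.cpp - feed only this file to the compiler, and do NOT also
// compile the included .cpp files individually. Beware: file-static
// helpers with the same name in two of these files now clash, because
// everything ends up in a single translation unit.
#include "parser.cpp"
#include "lexer.cpp"
#include "codegen.cpp"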
If you don't have one yet, get a separate build machine that you can remote into. You'll have to do a lot of almost-full builds to check whether you broke some include. You will want to run these on another machine that doesn't block you from working on something else. Long term, you'll need it for daily integration builds anyway ;)
Use precompiled headers. (scales better with fast machines, see above)
Check your header inclusion policy. While every file should be "independent" (i.e. include everything it needs, so that it can be included by someone else without prerequisites), don't include liberally. Unfortunately, I haven't yet found a tool to find unnecessary #include statements, but it might help to spend some time removing unused headers in "hotspot" files.
Create and use forward declarations for the templates you use. Often you can include a header containing only forward declarations in many places, and use the full header in just a few specific ones. This can greatly help compile time. Check the <iosfwd> header to see how the standard library does that for I/O streams.
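A minimal sketch of such a forward-declaration header (all names are illustrative):

// widgetfwd.h - declarations only; no definitions, no heavy includes
#ifndef WIDGETFWD_H
#define WIDGETFWD_H

template <typename T> class Widget;

// Even typedefs of common instantiations only need the declaration:
typedef Widget<float> WidgetF;
typedef Widget<double> WidgetD;

#endif

Code that merely passes a Widget around by pointer or reference can include widgetfwd.h; only code that actually uses its members needs the full widget.h with the template definition.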
Overloads for templates used with only a few types: if you have a complex function template that is useful only for a very few types, like this:
// .h
template <typename FLOAT> // float or double only
FLOAT CalcIt(int len, FLOAT * values) { ... }
You can declare the overloads in the header, and move the template to the body:
// .h
float CalcIt(int len, float * values);
double CalcIt(int len, double * values);
// .cpp
template <typename FLOAT> // float or double only
FLOAT CalcItT(int len, FLOAT * values) { ... }
float CalcIt(int len, float * values) { return CalcItT(len, values); }
double CalcIt(int len, double * values) { return CalcItT(len, values); }
This moves the lengthy template into a single compilation unit.
Unfortunately, this is only of limited use for classes.
Check if the PIMPL idiom can move code from the headers into .cpp files.
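A minimal sketch of that idiom (names are illustrative):

// parser.h - clients see only this; the heavy headers stay out of it
class Parser {
public:
    Parser();
    ~Parser();
    void Parse(const char* text);
private:
    class Impl;   // defined only in parser.cpp
    Impl* m_impl; // opaque pointer to the real implementation
};

// parser.cpp - all the template-heavy machinery can live here
#include "parser.h"

class Parser::Impl {
public:
    void Parse(const char* text) { /* real work here */ }
};

Parser::Parser() : m_impl(new Impl) {}
Parser::~Parser() { delete m_impl; }
void Parser::Parse(const char* text) { m_impl->Parse(text); }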
The general rule that hides behind all of this is: separate the interface of your library from the implementation. Use comments, detail namespaces and separate .impl.h headers to mentally and physically isolate what should be known to the outside from how it is accomplished. This exposes the real value of your library (does it actually encapsulate complexity?), and gives you a chance to replace "easy targets" first.
More specific advice - and how useful the advice given here is - depends largely on the actual library.
Good luck!
As mentioned, unit tests are a good idea. Indeed, rather than breaking your code by introducing "simple" changes that are likely to ripple out, just focus on creating a suite of tests, and on fixing non-compliance with the tests. Make it a regular activity to update the tests when bugs come to light.
Beyond that, I would suggest upgrading your tools, if possible, to help with debugging template-related problems.
I've often come across legacy templates that were huge and required a lot of time and memory to instantiate, but didn't need to be. In those cases, the easiest way to cut out the fat was to take all of the code that didn't rely on any of the template arguments and hide it in separate functions defined in a normal translation unit. This also had the positive side-effect of triggering fewer recompiles when this code had to be slightly modified or documentation changed. It sounds rather obvious, but it's really surprising how often people write a class template and think that EVERYTHING it does has to be defined in the header, rather than just the code that needs the templated information.
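A minimal sketch of that kind of refactoring (the names are illustrative, and LogAllocation is assumed to be defined in some ordinary .cpp file):

#include <cstddef>
#include <typeinfo>

// Ordinary non-template function; its body lives in a normal translation
// unit, so editing it no longer recompiles every user of the template.
void LogAllocation(const char* typeName, std::size_t count);

template <typename T>
class Pool {
public:
    T* Allocate(std::size_t count)
    {
        // Only this thin, type-dependent part stays in the header; the
        // type-independent work is delegated to the function above.
        LogAllocation(typeid(T).name(), count);
        return new T[count];
    }
};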
Another thing you might want to consider is cleaning up the inheritance hierarchies by making the templates "mixin" style instead of aggregations of multiple inheritance. See in how many places you can get away with making one of the template arguments the name of the base class that it should derive from (the way boost::enable_shared_from_this works). Of course, this typically only works well if the constructors take no arguments, as you don't have to worry about initializing anything correctly.
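A minimal sketch of that mixin style (names are illustrative):

class Plain {};

// Each mixin derives from whatever base it is given as a template argument.
template <typename Base>
class Serializable : public Base {
public:
    void Serialize() { /* ... */ }
};

template <typename Base>
class Printable : public Base {
public:
    void Print() { /* ... */ }
};

// Compose by stacking mixins instead of multiple inheritance:
typedef Printable<Serializable<Plain> > Document;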