Non-friend, non-member functions increase encapsulation? - c++

In the article How Non-Member Functions Improve Encapsulation, Scott Meyers argues that there is no way to prevent non-member functions from "happening".
Syntax Issues
If you're like many people with whom I've discussed this issue, you're
likely to have reservations about the syntactic implications of my
advice that non-friend non-member functions should be preferred to
member functions, even if you buy my argument about encapsulation. For
example, suppose a class Wombat supports the functionality of both
eating and sleeping. Further suppose that the eating functionality
must be implemented as a member function, but the sleeping
functionality could be implemented as a member or as a non-friend
non-member function. If you follow my advice from above, you'd declare
things like this:
class Wombat {
public:
void eat(double tonsToEat);
void sleep(double hoursToSnooze);
};
w.eat(.564);
w.sleep(2.57);
Ah, the uniformity of it all! But this uniformity is misleading,
because there are more functions in the world than are dreamt of by
your philosophy.
To put it bluntly, non-member functions happen. Let us continue with
the Wombat example. Suppose you write software to model these fetching
creatures, and imagine that one of the things you frequently need your
Wombats to do is sleep for precisely half an hour. Clearly, you could
litter your code with calls to w.sleep(.5), but that would be a lot
of .5s to type, and at any rate, what if that magic value were to
change? There are a number of ways to deal with this issue, but
perhaps the simplest is to define a function that encapsulates the
details of what you want to do. Assuming you're not the author of
Wombat, the function will necessarily have to be a non-member, and
you'll have to call it as such:
void nap(Wombat& w) { w.sleep(.5); }
Wombat w;
nap(w);
And there you have it, your dreaded syntactic inconsistency. When you
want to feed your wombats, you make member function calls, but when
you want them to nap, you make non-member calls.
If you reflect a bit and are honest with yourself, you'll admit that
you have this alleged inconsistency with all the nontrivial classes
you use, because no class has every function desired by every client.
Every client adds at least a few convenience functions of their own,
and these functions are always non-members. C++ programers are used to
this, and they think nothing of it. Some calls use member syntax, and
some use non-member syntax. People just look up which syntax is
appropriate for the functions they want to call, then they call them.
Life goes on. It goes on especially in the STL portion of the Standard
C++ library, where some algorithms are member functions (e.g., size),
some are non-member functions (e.g., unique), and some are both (e.g.,
find). Nobody blinks. Not even you.
I can't really wrap my head around what he says in the bold/italic sentence. Why will it necessarily have to be implemented as a non-member? Why not just inherit your own MyWombat class from the Wombat class, and make the nap() function a member of MyWombat?
I'm just starting out with C++, but that's how I would probably do it in Java. Is this not the way to go in C++? If not, why so?

In theory, you sort of could do this, but you really don't want to. Let's consider why you don't want to do this (for the moment, in the original context--C++98/03, and ignoring the additions in C++11 and newer).
First of all, it would mean that essentially all classes have to be written to act as base classes--but for some classes, that's just a lousy idea, and may even run directly contrary to the basic intent (e.g., something intended to implement the Flyweight pattern).
Second, it would render most inheritance meaningless. For an obvious example, many classes in C++ support I/O. As it stands now, the idiomatic way to do that is to overload operator<< and operator>> as free functions. Right now, the intent of an iostream is to represent something that's at least vaguely file-like--something into which we can write data, and/or out of which we can read data. If we supported I/O via inheritance, it would also mean anything that can be read from/written to anything vaguely file-like.
This simply makes no sense at all. An iostream represents something at least vaguely file-like, not all the kinds of objects you might want to read from or write to a file.
Worse, it would render nearly all the compiler's type checking nearly meaningless. Just for example, writing a distance object into a person object makes no sense--but if they both support I/O by being derived from iostream, then the compiler wouldn't have a way to sort that out from one that really did make sense.
Unfortunately, that's just the tip of the iceberg. When you inherit from a base class, you inherit the limitations of that base class. For example, if you're using a base class that doesn't support copy assignment or copy construction, objects of the derived class won't/can't either.
Continuing the previous example, that would mean if you want to do I/O on an object, you can't support copy construction or copy assignment for that type of object.
That, in turn, means that objects that support I/O would be disjoint from objects that support being put in collections (i.e., collections require capabilities that are prohibited by iostreams).
Bottom line: we almost immediately end up with a thoroughly unmanageable mess, where none of our inheritance would any longer make any real sense at all and the compiler's type checking would be rendered almost completely useless.

Because you are then creating a very strong dependency between your new class and the original Wombat. Inheritance is not necessarily good; it is the second strongest relationship between any two entities in C++. Only friend declarations are stronger.
I think most of us did a double-take when Meyers first published that article, but it is generally acknowledged to be true by now. In the world of modern C++ your first instinct should not be to derive from a class. Deriving is the last resort, unless you are adding a new class that really is a specialization of an existing class.
Matters are different in Java. There you inherit. You really have no other choice.

Your idea doesn't work across the board, as Jerry Coffin describes, however it is viable for simple classes that are not part of a hierarchy, such as Wombat here.
There are some couple of dangers to watch out for though:
Slicing - if there is a function that accepts a Wombat by value, then you have to cut off myWombat's extra appendages and they don't grow back. This doesn't occur in Java in which all objects are passed by reference.
Base class pointer - If Wombat is non-polymorphic (i.e. no v-table), it means you cannot easily mix Wombat and myWombat in a container. Deleting a pointer will not properly delete myWombat varieties. (However you could use shared_ptr which tracks a custom deleter).
Type mismatch: If you write any functions that accept a myWombat then they cannot be called with a Wombat. On the other hand, if you write your function to accept a Wombat then you can't use the syntactic sugar of myWombat. Casting doesn't fix this; your code won't interact properly with other parts of the interface.
A way of avoiding all these dangers would be to use containment instead of inheritance: myWombat will have a Wombat private member, and you write forwarding functions for any Wombat properties you want to expose. This is more work in terms of design and maintenance of the myWombat class; but it eliminates the possibility for anyone to use your class erroneously, and it enables you to work around problems such as the contained class being non-copyable.
For polymorphic objects in a hierarchy, you don't have the slicing and base-class-pointer problems, although the type mismatch problem is still there. In fact it's worse. Suppose the hierarchy is:
Animal <-- Marsupial <-- Wombat <-- NorthernHairyNosedWombat
You come along and derive myWombat from Wombat. However, this means that NorthernHairyNosedWombat is a sibling of myWombat, whereas it was a child of Wombat.
So any nice sugar functions you add to myWombat are not usable by NorthernHairyNosedWombat anyway.
Summary: IMHO the benefits are not worth the mess it leaves behind.

Related

Why aren't all functions virtual in C++? [duplicate]

Is there any real reason not to make a member function virtual in C++? Of course, there's always the performance argument, but that doesn't seem to stick in most situations since the overhead of virtual functions is fairly low.
On the other hand, I've been bitten a couple of times with forgetting to make a function virtual that should be virtual. And that seems to be a bigger argument than the performance one. So is there any reason not to make member functions virtual by default?
One way to read your questions is "Why doesn't C++ make every function virtual by default, unless the programmer overrides that default." Without consulting my copy of "Design and Evolution of C++": this would add extra storage to every class unless every member function is made non-virtual. Seems to me this would have required more effort in the compiler implementation, and slowed down the adoption of C++ by providing fodder to the performance obsessed (I count myself in that group.)
Another way to read your questions is "Why do C++ programmers do not make every function virtual unless they have very good reasons not to?" The performance excuse is probably the reason. Depending on your application and domain, this might be a good reason or not. For example, part of my team works in market data ticker plants. At 100,000+ messages/second on a single stream, the virtual function overhead would be unacceptable. Other parts of my team work in complex trading infrastructure. Making most functions virtual is probably a good idea in that context, as the extra flexibility beats the micro-optimization.
Stroustrup, the designer of the language, says:
Because many classes are not designed to be used as base classes. For example, see class complex.
Also, objects of a class with a virtual function require space needed by the virtual function call mechanism - typically one word per object. This overhead can be significant, and can get in the way of layout compatibility with data from other languages (e.g. C and Fortran).
See The Design and Evolution of C++ for more design rationale.
There are several reasons.
First, performance: Yes, the overhead of a virtual function is relatively low seen in isolation. But it also prevents the compiler from inlining, and that is a huge source of optimization in C++. The C++ standard library performs as well as it does because it can inline the dozens and dozens of small one-liners it consists of. Additionally, a class with virtual methods is not a POD datatype, and so a lot of restrictions apply to it. It can't be copied just by memcpy'ing, it becomes more expensive to construct, and takes up more space. There are a lot of things that suddenly become illegal or less efficient once a non-POD type is involved.
And second, good OOP practice. The point in a class is that it makes some kind of abstraction, hides its internal details, and provides a guarantee that "this class will behave so and so, and will always maintain these invariants. It will never end up in an invalid state".
That is pretty hard to live up to if you allow others to override any member function. The member functions you defined in the class are there to ensure that the invariant is maintained. If we didn't care about that, we could just make the internal data members public, and let people manipulate them at will. But we want our class to be consistent. And that means we have to specify the behavior of its public interface. That may involve specific customizability points, by making individual functions virtual, but it almost always also involves making most methods non-virtual, so that they can do the job of ensuring that the invariant is maintained. The non-virtual interface idiom is a good example of this:
http://www.gotw.ca/publications/mill18.htm
Third, inheritance isn't often needed, especially not in C++. Templates and generic programming (static polymorphism) can in many cases do a better job than inheritance (runtime polymorphism). Yes, you sometimes still need virtual methods and inheritance, but it is certainly not the default. If it is, you're Doing It Wrong. Work with the language, rather than try to pretend it was something else. It's not Java, and unlike Java, in C++ inheritance is the exception, not the rule.
I'll ignore performance and memory cost, because I have no way to measure them for the "in general" case...
Classes with virtual member functions are non-POD. So if you want to use your class in low-level code which relies on it being POD, then (among other restrictions) any member functions must be non-virtual.
Examples of things you can portably do with an instance of a POD class:
copy it with memcpy (provided the target address has sufficient alignment).
access fields with offsetof()
in general, treat it as a sequence of char
... um
that's about it. I'm sure I've forgotten something.
Other things people have mentioned that I agree with:
Many classes are not designed for inheritance. Making their methods virtual would be misleading, since it implies child classes might want to override the method, and there shouldn't be any child classes.
Many methods are not designed to be overridden: same thing.
Also, even when things are intended to be subclassed / overridden, they aren't necessarily intended for run-time polymorphism. Very occasionally, despite what OO best practice says, what you want inheritance for is code reuse. For example if you're using CRTP for simulated dynamic binding. So again you don't want to imply your class will play nicely with runtime polymorphism by making its methods virtual, when they should never be called that way.
In summary, things which are intended to be overridden for runtime polymorphism should be marked virtual, and things which don't, shouldn't. If you find that almost all your member functions are intended to be virtual, then mark them virtual unless there's a reason not to. If you find that most of your member functions are not intended to be virtual, then don't mark them virtual unless there's a reason to do so.
It's a tricky issue when designing a public API, because flipping a method from one to the other is a breaking change, so you have to get it right first time. But you don't necessarily know before you have any users, whether your users are going to want to "polymorph" your classes. Ho hum. The STL container approach, of defining abstract interfaces and banning inheritance entirely, is safe but sometimes requires users to do more typing.
The following post is mostly opinion, but here goes:
Object oriented design is three things, and encapsulation (information hiding) is the first of these things. If a class design is not solid on this, then the rest doesn't really matter very much.
It has been said before that "inheritance breaks encapsulation" (Alan Snyder '86) A good discussion of this is present in the group of four design pattern book. A class should be designed to support inheritance in a very specific manner. Otherwise, you open the possibility of misuse by inheritors.
I would make the analogy that making all of your methods virtual is akin to making all your members public. A bit of a stretch, I know, but that's why I used the word 'analogy'
As you are designing your class hierarchy, it may make sense to write a function that should not be overridden. One example is if you are doing the "template method" pattern, where you have a public method that calls several private virtual methods. You would not want derived classes to override that; everyone should use the base definition.
There is no "final" keyword, so the best way to communicate to other developers that a method should not be overridden is to make it non-virtual. (other than easily ignored comments)
At the class level, making the destructor non-virtual communicates that the class should not be used as a base class, such as the STL containers.
Making a method non-virtual communicates how it should be used.
The Non-Virtual Interface idiom makes use of non-virtual methods. For more information please refer to Herb Sutter "Virtuality" article.
http://www.gotw.ca/publications/mill18.htm
And comments on the NVI idiom:
http://www.parashift.com/c++-faq-lite/strange-inheritance.html#faq-23.3
http://accu.org/index.php/journals/269 [See sub-section]

When should I prefer non-member non-friend functions to member functions?

Meyers mentioned in his book Effective C++ that in certain scenarios non-member non-friend functions are better encapsulated than member functions.
Example:
// Web browser allows to clear something
class WebBrowser {
public:
...
void clearCache();
void clearHistory();
void removeCookies();
...
};
Many users will want to perform all these actions together, so WebBrowser might also offer a function to do just that:
class WebBrowser {
public:
...
void clearEverything(); // calls clearCache, clearHistory, removeCookies
...
};
The other way is to define a non-member non-friend function.
void clearBrowser(WebBrowser& wb)
{
wb.clearCache();
wb.clearHistory();
wb.removeCookies();
}
The non-member function is better because "it doesn't increase the number of functions that can access the private parts of the class.", thus leading to better encapsulation.
Functions like clearBrowser are convenience functions because they can't offer any functionality a WebBrowser client couldn't already get in some other way. For example, if clearBrowser didn't exist, clients could just call clearCache, clearHistory, and removeCookies themselves.
To me, the example of convenience functions is reasonable. But is there any example other than convenience function when non-member version excels?
More generally, what are the rules of when to use which?
More generally, what are the rules of when to use which?
Here is what Scott Meyer's rules are (source):
Scott has an interesting article in print which advocates
that non-member non-friend functions improve encapsulation
for classes. He uses the following algorithm to determine
where a function f gets placed:
if (f needs to be virtual)
make f a member function of C;
else if (f is operator>> or operator<<)
{
make f a non-member function;
if (f needs access to non-public members of C)
make f a friend of C;
}
else if (f needs type conversions on its left-most argument)
{
make f a non-member function;
if (f needs access to non-public members of C)
make f a friend of C;
}
else if (f can be implemented via C's public interface)
make f a non-member function;
else
make f a member function of C;
His definition of encapsulation involves the number
of functions which are impacted when private data
members are changed.
Which pretty much sums it all up, and it is quite reasonable as well, in my opinion.
I often choose to build utility methods outside of my classes when they are application specific.
The application is usually in a different context then the engines doing the work underneath. If we take you example of a web browser, the 3 clear methods belongs to the web engine as this is needed functionality that would be difficult to implement anywhere else, however, the ClearEverything() is definitely more application specific. In this instance your application might have a small dialog that has a clear all button to help the user be more efficient. Maybe this is not something another application re-using your web browser engine would want to do and therefor having it in the engine class would just be more clutter.
Another example is a in a mathematic libraries. Often it make sense to have more advanced functionality like mean value or standard derivation implemented as part of a mathematical class. However, if you have an application specific way to calculate some type of mean that is not the standard version, it should probably be outside of your class and part of a utility library specific to you application.
I have never been a big fan of strong hardcoded rules to implement things in one way or another, it’s often a matter of ideology and principles.
M.
Non-member functions are commonly used when the developer of a library wants to write binary operators that can be overloaded on either argument with a class type, since if you make them a member of the class you can only overload on the second argument (the first is implicitly an object of that class). The various arithmetic operators for complex are perhaps the definitive example for this.
In the example you cite, the motivation is of another kind: use the least coupled design that still allows you to do the job.
This means that while clearEverything could (and, to be frank, would be quite likely to) be made a member, we don't make it one because it does not technically have to be. This buys you two things:
You don't have to accept the responsibility of having a clearEverything method in your public interface (once you ship with one, you 're married to it for life).
The number of functions with access to the private members of the class is one lower, hence any changes in the future will be easier to perform and less likely to cause bugs.
That said, in this particular example I feel that the concept is being taken too far, and for such an "innocent" function I 'd gravitate towards making it a member. But the concept is sound, and in "real world" scenarios where things are not so simple it would make much more sense.
Locality and allowing the class to provide 'enough' features while maintaining encapsulation are some things to consider.
If WebBrowser is reused in many places, the dependencies/clients may define multiple convenience functions. This keeps your classes (WebBrowser) lightweight and easy to manage.
The inverse would be that the WebBrowser ends up pleasing all clients, and just becomes some monolithic beast that is difficult to change.
Do you find the class is lacking functionality once it has been put to use in multiple scenarios? Do patterns emerge in your convenience functions? It's best (IMO) to defer formally extending the class's interface until patterns emerge and there is a good reason to add this functionality. A minimal class is easier to maintain, but you don't want redundant implementations all over the place because that pushes the maintenance burden onto your clients.
If your convenience functions are complex to implement, or there is a common case which can improve performance significantly (e.g. to empty a thread safe collection with one lock, rather than one element at a time with a lock each time), then you may also want to consider that case.
There will also be cases where you realize something is genuinely missing from the WebBrowser as you use it.

Single-use class

In a project I am working on, we have several "disposable" classes. What I mean by disposable is that they are a class where you call some methods to set up the info, and you call what equates to a doit function. You doit once and throw them away. If you want to doit again, you have to create another instance of the class. The reason they're not reduced to single functions is that they must store state for after they doit for the user to get information about what happened and it seems to be not very clean to return a bunch of things through reference parameters. It's not a singleton but not a normal class either.
Is this a bad way to do things? Is there a better design pattern for this sort of thing? Or should I just give in and make the user pass in a boatload of reference parameters to return a bunch of things through?
What you describe is not a class (state + methods to alter it), but an algorithm (map input data to output data):
result_t do_it(parameters_t);
Why do you think you need a class for that?
Sounds like your class is basically a parameter block in a thin disguise.
There's nothing wrong with that IMO, and it's certainly better than a function with so many parameters it's hard to keep track of which is which.
It can also be a good idea when there's a lot of input parameters - several setup methods can set up a few of those at a time, so that the names of the setup functions give more clue as to which parameter is which. Also, you can cover different ways of setting up the same parameters using alternative setter functions - either overloads or with different names. You might even use a simple state-machine or flag system to ensure the correct setups are done.
However, it should really be possible to recycle your instances without having to delete and recreate. A "reset" method, perhaps.
As Konrad suggests, this is perhaps misleading. The reset method shouldn't be seen as a replacement for the constructor - it's the constructors job to put the object into a self-consistent initialised state, not the reset methods. Object should be self-consistent at all times.
Unless there's a reason for making cumulative-running-total-style do-it calls, the caller should never have to call reset explicitly - it should be built into the do-it call as the first step.
I still decided, on reflection, to strike that out - not so much because of Jalfs comment, but because of the hairs I had to split to argue the point ;-) - Basically, I figure I almost always have a reset method for this style of class, partly because my "tools" usually have multiple related kinds of "do it" (e.g. "insert", "search" and "delete" for a tree tool), and shared mode. The mode is just some input fields, in parameter block terms, but that doesn't mean I want to keep re-initializing. But just because this pattern happens a lot for me, doesn't mean it should be a point of principle.
I even have a name for these things (not limited to the single-operation case) - "tool" classes. A "tree_searching_tool" will be a class that searches (but doesn't contain) a tree, for example, though in practice I'd have a "tree_tool" that implements several tree-related operations.
Basically, even parameter blocks in C should ideally provide a kind of abstraction that gives it some order beyond being just a bunch of parameters. "Tool" is a (vague) abstraction. Classes are a major means of handling abstraction in C++.
I have used a similar design and wondered about this too. A fictive simplified example could look like this:
FileDownloader downloader(url);
downloader.download();
downloader.result(); // get the path to the downloaded file
To make it reusable I store it in a boost::scoped_ptr:
boost::scoped_ptr<FileDownloader> downloader;
// Download first file
downloader.reset(new FileDownloader(url1));
downloader->download();
// Download second file
downloader.reset(new FileDownloader(url2));
downloader->download();
To answer your question: I think it's ok. I have not found any problems with this design.
As far as I can tell you are describing a class that represents an algorithm. You configure the algorithm, then you run the algorithm and then you get the result of the algorithm. I see nothing wrong with putting those steps together in a class if the alternative is a function that takes 7 configuration parameters and 5 output references.
This structuring of code also has the advantage that you can split your algorithm into several steps and put them in separate private member functions. You can do that without a class too, but that can lead to the sub-functions having many parameters if the algorithm has a lot of state. In a class you can conveniently represent that state through member variables.
One thing you might want to look out for is that structuring your code like this could easily tempt you to use inheritance to share code among similar algorithms. If algorithm A defines a private helper function that algorithm B needs, it's easy to make that member function protected and then access that helper function by having class B derive from class A. It could also feel natural to define a third class C that contains the common code and then have A and B derive from C. As a rule of thumb, inheritance used only to share code in non-virtual methods is not the best way - it's inflexible, you end up having to take on the data members of the super class and you break the encapsulation of the super class. As a rule of thumb for that situation, prefer factoring the common code out of both classes without using inheritance. You can factor that code into a non-member function or you might factor it into a utility class that you then use without deriving from it.
YMMV - what is best depends on the specific situation. Factoring code into a common super class is the basis for the template method pattern, so when using virtual methods inheritance might be what you want.
Nothing especially wrong with the concept. You should try to set it up so that the objects in question can generally be auto-allocated vs having to be newed -- significant performance savings in most cases. And you probably shouldn't use the technique for highly performance-sensitive code unless you know your compiler generates it efficiently.
I disagree that the class you're describing "is not a normal class". It has state and it has behavior. You've pointed out that it has a relatively short lifespan, but that doesn't make it any less of a class.
Short-lived classes vs. functions with out-params:
I agree that your short-lived classes are probably a little more intuitive and easier to maintain than a function which takes many out-params (or 1 complex out-param). However, I suspect a function will perform slightly better, because you won't be taking the time to instantiate a new short-lived object. If it's a simple class, that performance difference is probably negligible. However, if you're talking about an extremely performance-intensive environment, it might be a consideration for you.
Short-lived classes: creating new vs. re-using instances:
There's plenty of examples where instances of classes are re-used: thread-pools, DB-connection pools (probably darn near any software construct ending in 'pool' :). In my experience, they seem to be used when instantiating the object is an expensive operation. Your small, short-lived classes don't sound like they're expensive to instantiate, so I wouldn't bother trying to re-use them. You may find that whatever pooling mechanism you implement, actually costs MORE (performance-wise) than simply instantiating new objects whenever needed.

How to deal with the idea of "many small functions" for classes, without passing lots of parameters?

Over time I have come to appreciate the mindset of many small functions ,and I really do like it a lot, but I'm having a hard time losing my shyness to apply it to classes, especially ones with more than a handful of nonpublic member variables.
Every additional helper function clutters up the interface, since often the code is class specific and I can't just use some generic piece of code.
(To my limited knowledge, anyway, still a beginner, don't know every library out there, etc.)
So in extreme cases, I usually create a helper class which becomes the friend of the class that needs to be operated on, so it has access to all the nonpublic guts.
An alternative are free functions that need parameters, but even though premature optimization is evil, and I haven't actually profiled or disassembled it...
I still DREAD the mere thought of passing all the stuff I need sometimes, even just as reference, even though that should be a simple address per argument.
Is all this a matter of preference, or is there a widely used way of dealing with that kind of stuff?
I know that trying to force stuff into patterns is a kind of anti pattern, but I am concerned about code sharing and standards, and I want to get stuff at least fairly non painful for other people to read.
So, how do you guys deal with that?
Edit:
Some examples that motivated me to ask this question:
About the free functions:
DeadMG was confused about making free functions work...without arguments.
My issue with those functions is that unlike member functions, free functions only know about data, if you give it to them, unless global variables and the like are used.
Sometimes, however, I have a huge, complicated procedure I want to break down for readability and understandings sake, but there are so many different variables which get used all over the place that passing all the data to free functions, which are agnostic to every bit of member data, looks simply nightmarish.
Click for an example
That is a snippet of a function that converts data into a format that my mesh class accepts.
It would take all of those parameter to refactor this into a "finalizeMesh" function, for example.
At this point it's a part of a huge computer mesh data function, and bits of dimension info and sizes and scaling info is used all over the place, interwoven.
That's what I mean with "free functions need too many parameters sometimes".
I think it shows bad style, and not necessarily a symptom of being irrational per se, I hope :P.
I'll try to clear things up more along the way, if necessary.
Every additional helper function clutters up the interface
A private helper function doesn't.
I usually create a helper class which becomes the friend of the class that needs to be operated on
Don't do this unless it's absolutely unavoidable. You might want to break up your class's data into smaller nested classes (or plain old structs), then pass those around between methods.
I still DREAD the mere thought of passing all the stuff I need sometimes, even just as reference
That's not premature optimization, that's a perfectly acceptable way of preventing/reducing cognitive load. You don't want functions taking more than three parameters. If there are more then three, consider packaging your data in a struct or class.
I sometimes have the same problems as you have described: increasingly large classes that need too many helper functions to be accessed in a civilized manner.
When this occurs I try to seperate the class in multiple smaller classes if that is possible and convenient.
Scott Meyers states in Effective C++ that friend classes or functions is mostly not the best option, since the client code might do anything with the object.
Maybe you can try nested classes, that deal with the internals of your object. Another option are helper functions that use the public interface of your class and put the into a namespace related to your class.
Another way to keep your classes free of cruft is to use the pimpl idiom. Hide your private implementation behind a pointer to a class that actually implements whatever it is that you're doing, and then expose a limited subset of features to whoever is the consumer of your class.
// Your public API in foo.h (note: only foo.cpp should #include foo_impl.h)
class Foo {
public:
bool func(int i) { return impl_->func(i); }
private:
FooImpl* impl_;
};
There are many ways to implement this. The Boost pimpl template in the Vault is pretty good. Using smart pointers is another useful way of handling this, too.
http://www.boost.org/doc/libs/1_46_1/libs/smart_ptr/sp_techniques.html#pimpl
An alternative are free functions that
need parameters, but even though
premature optimization is evil, and I
haven't actually profiled or
disassembled it... I still DREAD the
mere thought of passing all the stuff
I need sometimes, even just as
reference, even though that should be
a simple address per argument.
So, let me get this entirely straight. You haven't profiled or disassembled. But somehow, you intend on ... making functions work ... without arguments? How, exactly, do you propose to program without using function arguments? Member functions are no more or less efficient than free functions.
More importantly, you come up with lots of logical reasons why you know you're wrong. I think the problem here is in your head, which possibly stems from you being completely irrational, and nothing that any answer from any of us can help you with.
Generic algorithms that take parameters are the basis of modern object orientated programming- that's the entire point of both templates and inheritance.

Class design vs. IDE: Are nonmember nonfriend functions really worth it?

In the (otherwise) excellent book C++ Coding Standards, Item 44, titled "Prefer writing nonmember nonfriend functions", Sutter and Alexandrescu recommend that only functions that really need access to the members of a class be themselves members of that class. All other operations which can be written by using only member functions should not be part of the class. They should be nonmembers and nonfriends. The arguments are that:
It promotes encapsulation, because there is less code that needs access to the internals of a class.
It makes writing function templates easier, because you don't have to guess each time whether some function is a member or not.
It keeps the class small, which in turn makes it easier to test and maintain.
Although I see the value in these argument, I see a huge drawback: my IDE can't help me find these functions! Whenever I have an object of some kind, and I want to see what operations are available on it, I can't just type "pMysteriousObject->" and get a list of member functions anymore.
Keeping a clean design is in the end about making your programming life easier. But this would actually make mine much harder.
So I'm wondering if it's really worth the trouble. How do you deal with that?
Scott Meyers has a similar opinion to Sutter, see here.
He also clearly states the following:
"Based on his work with various string-like classes, Jack Reeves has observed that some functions just don't "feel" right when made non-members, even if they could be non-friend non-members. The "best" interface for a class can be found only by balancing many competing concerns, of which the degree of encapsulation is but one."
If a function would be something that "just makes sense" to be a member function, make it one. Likewise, if it isn't really part of the main interface, and "just makes sense" to be a non-member, do that.
One note is that with overloaded versions of eg operator==(), the syntax stays the same. So in this case you have no reason not to make it a non-member non-friend floating function declared in the same place as the class, unless it really needs access to private members (in my experience it rarely will). And even then you can define operator!=() a non-member and in terms of operator==().
I don't think it would be wrong to say that between them, Sutter, Alexandrescu, and Meyers have done more for the quality of C++ than anybody else.
One simple question they ask is:
If a utility function has two independent classes as parameteres, which class should "own" the member function?
Another issue, is you can only add member functions where the class in question is under your control. Any helper functions that you write for std::string will have to be non-members since you cannot re-open the class definition.
For both of these examples, your IDE will provide incomplete information, and you will have to use the "old fashion way".
Given that the most influential C++ experts in the world consider that non-member functions with a class parameter are part of the classes interface, this is more of an issue with your IDE rather than the coding style.
Your IDE will likely change in a release or two, and you may even be able to get them to add this feature. If you change your coding style to suit todays IDE you may well find that you have bigger problems in the future with unextendable/unmaintainable code.
I'm going to have to disagree with Sutter and Alexandrescu on this one. I think if the behavior of function foo() falls within the realm of class Bar's responsibilities, then foo() should be part of bar().
The fact that foo() doesn't need direct access to Bar's member data doesn't mean it isn't conceptually part of Bar. It can also mean that the code is well factored. It's not uncommon to have member functions which perform all their behavior via other member functions, and I don't see why it should be.
I fully agree that peripherally-related functions should not be part of the class, but if something is core to the class responsibilities, there's no reason it shouldn't be a member, regardless of whether it is directly mucking around with the member data.
As for these specific points:
It promotes encapsulation, because there is less code that needs access to the internals of a class.
Indeed, the fewer functions that directly access the internals, the better. That means that having member functions do as much as possible via other member functions is a good thing. Splitting well-factored functions out of the class just leaves you with a half-class, that requires a bunch of external functions to be useful. Pulling well-factored functions away from their classes also seems to discourage the writing of well-factored functions.
It makes writing function templates easier, because you don't have to guess each time whether some function is a member or not.
I don't understand this at all. If you pull a bunch of functions out of classes, you've thrust more responsibility onto function templates. They are forced to assume that even less functionality is provided by their class template arguments, unless we are going to assume that most functions pulled from their classes is going to be converted into a template (ugh).
It keeps the class small, which in turn makes it easier to test and maintain.
Um, sure. It also creates a lot of additional, external functions to test and maintain. I fail to see the value in this.
It's true that external functions should not be part of the interface. In theory, your class should only contain the data and expose the interface for what it is intended and not utilitarian functions. Adding utility functions to the interface just grow the class code base and make it less maintainable. I currently maintain a class with around 50 public methods, that's just insane.
Now, in reality, I agree that this is not easy to enforce. It's often easier to just add another method to your class, even more if you are using an IDE that can really simply add a new method to an existing class.
In order to keep my classes simple and still be able to centralize external function, I often use utility class that works with my class, or even namespaces.
I start by creating the class that will wrap my data and expose the simplest possible interface. I then create a new class for every task I have to do with the class.
Example: create a class Point, then add a class PointDrawer to draw it to a bitmap, PointSerializer to save it, etc.
If you give them a common prefix, then maybe your IDE will help if you type
::prefix
or
namespace::prefix
In many OOP languages non-friend non-class methods are third-class citizens that reside in an orphanage unconnected to anything. When I write a method, I like to pick good parents - a fitting class - where they have the best chances to feel welcome and help.
I would have thought the IDE was actually helping you out.
The IDE is hiding the protected functions from the list because they are not available to the public just as the designer of the class intended.
If you had been within the scope of the class and typed this-> then the protected functions would be displayed in the list.