Related
I've made a programm with libraries.
One library has a interface for to include a header for an extern programm call:
class IDA{
private:
class IDA_A;
IDA_A *p_IDA_A;
public:
IDA();
~IDA();
void A_Function(const char *A_String);
};
Then I opened the header with Kate and placed
class IDA_A;
IDA_A *p_IDA_A;
into the public part.
And this worked?! But why? And can I avoid that?
Greeting
Earlybite
And this worked?!
Yes
But why?
private, protected and public are just hints to the C+++ compiler how the data structure is supposed to be used. The C++ compiler will error out if piece of C++ source doesn't follow this, but that's it.
It has no effect whatsoever on the memory layout. In particular the C++ standard mandates, that member variables of a class must be laid out in memory in the order they are declared in the class definition.
And can I avoid that?
No. In the same way you can not avoid, that some other piece of code static_casts a (opaque) pointer to your class to char* and dumps it out to a file as is, completely with vtable and everything.
If you really want to avoid the user of a library to "look inside" you must give them some handle value and internally use a map (std::map or similar) to look up the actual instance from the handle value. I do this in some of the libraries I develop, for robustness reasons, but also because some of the programming environments there are bindings for don't deal properly with pointers.
Of course this works. The whole private-public stuff is only checked by the compiler. There are no runtime-checks for this. If the library is already compiled, the user only needs the library file (shared object) and the header files for development. If you afterwards change the header file in the pattern you mentioned, IDA_A is considered public by the compiler.
How can you avoid that? There is no way for this. If another user changes your headerfile, you can do nothing about it. Also is the question: Why bother with it? It can have very ugly side-effects if the headerfile for an already compiled lib is been changed. So the one who changes it also has to deal with the issues. There is just some stuff you dont do. One of them is moving stuff in headers around for libraries (unless you are currently developing on that library).
You've got access to the p_IDA_A member variable, and this is always possible in the manner you've outlined. But it is of type IDA_A, which for which you only have a declaration, but not the definition. So you can not do anything meaningful with it. (This method of data hiding is called the pimpl idiom).
Why would you have an abstract base class defining an interface for a library where there is only one (always and forever) derived class?
You may want to swap out the implementation for something like Unit Testing
One reason why you would do this is for testability. It is much simpler to test dependent objects when their dependencies are defined as interfaces. This give the easy ability to mock or stub.
To violate the reused abstraction principle.
In short, don't do this.
Those who say "for testing" are overlooking that you can just replace
Base < - > Derived
Base < - > DerivedForMockingAndTesting
with
Derived < - > DerivedForMockingAndTesting
That is, let your existing implementation Derived serve as the "abstraction" to be mocked out and tested in unit testing.
If you can be 100% certain that there will always be only one and exactly one derived class? Not much reason. BUT: In reality you hardly will be 100% certain of anything and surely not the future of your code.
You may find the need for different versions of that class to be binary compatible. You may also find that for other reasons, you wish to encapsulate the definition of the class- for example, because the definition requires a header which has poor macros, and that kind of thing, or the header containing what's necessary to define the class has a very long compile time.
For example, I wrote a class to encapsulate the features offered by my operating system- for now, things like dynamic loading and creating a window. Even though there'll only ever be one implementation for one compile target (Windows, etc), I chose to use a run-time abstraction, because I wanted to guarantee that the rest of my code never saw a platform-specific header, and the Windows header is full of so many macros and stuff, that I didn't want them leaking out.
The most obvious reason would be to be able to put the class'
implementation in a source file, and not in a header. All that is
exposed in the header is the abstract base class (and a factory function
necessary to construct it, but this could be a static member). This
avoids having to include the header files for any member data; the pimpl
idiom is more idiomatic in C++ for this, but using abstract classes like
this is far from unknown, and works fairly well as well.
Most C++ class method signatures are duplicated between the declaration normally in a header files and the definition in the source files in the code I have read. I find this repetition undesirable and code written this way suffers from poor locality of reference. For instance, the methods in source files often reference instance variables declared in the header file; you end up having to constantly switch between header files and source files when reading code.
Would anyone recommend a way to avoid doing so? Or, am I mainly going to confuse experienced C++ programmers by not doing things in the usual way?
See also Question 538255 C++ code in header files where someone is told that everything should go in the header.
There is an alternative, but the cure is worse than the illness — define all the function bodies in the header, or even inline in the class, like C#. The downsides are that this will bloat compile times significantly, and it'll annoy veteran C++ programmers. It can also get you into some annoying situations of circular dependency that, while solvable, are a nuisance to deal with.
Personally, I just set my IDE to have a vertical split, and put the header file on the right side and the source file on the left.
I assume you're talking about member function declarations in a header file and definitions in source files?
If you're used to the Java/Python/etc. model, it may well seem redundant. In fact, if you were so inclined, you could define all functions inline in the class definition (in the header file). But, you'd definitely be breaking with convention and paying the price of additional coupling and compilation time every time you changed anything minor in the implementation.
C++, Ada, and other languages originally designed for large scale systems kept definitions hidden for a reason--there's no good reason that the users of a class should have to be concerned with its implementation, nor any reason they should have to repeatedly pay to compile it. Less of an issue nowadays with faster systems, but still relevant for really large systems. Additionally, TDD, stubbing and other testing strategies are facilitated by the isolation and quicker compilation.
Don't break with convention. In the end, you will make a ball of worms that doesn't work very well. Plus, compilers will hate you. C/C++ are setup that way for a reason.
C++ language supports function overloading, which means that the entire function signature is basically a way to identify a specific function. For this reason, as long as you declare and define function separately, there's really no redundancy in having to list the parameters again. More precisely, having to list the parameter types is not redundant. Parameters names, on the other hand, play no role in this process and you are free to omit them in the declaration (i.e in the header file), although I belive this limits readability.
You "can" get around the problem. You define an abstract interface class that only contains the pure virtual functions that an outside application will call. Then in the CPP file you provide the actual class that derives from the interface and contains all the class variables. You implement as normal now. The only thing this requires is a way to instantiate the derived implementation class from the interface class. You could do that by providing a static "Create" function that has its implementation in the CPP file.
ie
InterfaceClass* InterfaceClass::Create()
{
return new ImplementationClass;
}
This way you effectively hide the implementation from any outside user. You can't, however, create the class on the stack only on the heap ... but it does solve your problem AND provides a better layer of abstraction. In the end though if you aren't prepared to do this you need to stick with what you are doing.
As I understand, the pimpl idiom is exists only because C++ forces you to place all the private class members in the header. If the header were to contain only the public interface, theoretically, any change in class implementation would not have necessitated a recompile for the rest of the program.
What I want to know is why C++ is not designed to allow such a convenience. Why does it demand at all for the private parts of a class to be openly displayed in the header (no pun intended)?
This has to do with the size of the object. The h file is used, among other things, to determine the size of the object. If the private members are not given in it, then you would not know how large an object to new.
You can simulate, however, your desired behavior by the following:
class MyClass
{
public:
// public stuff
private:
#include "MyClassPrivate.h"
};
This does not enforce the behavior, but it gets the private stuff out of the .h file.
On the down side, this adds another file to maintain.
Also, in visual studio, the intellisense does not work for the private members - this could be a plus or a minus.
I think there is a confusion here. The problem is not about headers. Headers don't do anything (they are just ways to include common bits of source text among several source-code files).
The problem, as much as there is one, is that class declarations in C++ have to define everything, public and private, that an instance needs to have in order to work. (The same is true of Java, but the way reference to externally-compiled classes works makes the use of anything like shared headers unnecessary.)
It is in the nature of common Object-Oriented Technologies (not just the C++ one) that someone needs to know the concrete class that is used and how to use its constructor to deliver an implementation, even if you are using only the public parts. The device in (3, below) hides it. The practice in (1, below) separates the concerns, whether you do (3) or not.
Use abstract classes that define only the public parts, mainly methods, and let the implementation class inherit from that abstract class. So, using the usual convention for headers, there is an abstract.hpp that is shared around. There is also an implementation.hpp that declares the inherited class and that is only passed around to the modules that implement methods of the implementation. The implementation.hpp file will #include "abstract.hpp" for use in the class declaration it makes, so that there is a single maintenance point for the declaration of the abstracted interface.
Now, if you want to enforce hiding of the implementation class declaration, you need to have some way of requesting construction of a concrete instance without possessing the specific, complete class declaration: you can't use new and you can't use local instances. (You can delete though.) Introduction of helper functions (including methods on other classes that deliver references to class instances) is the substitute.
Along with or as part of the header file that is used as the shared definition for the abstract class/interface, include function signatures for external helper functions. These function should be implemented in modules that are part of the specific class implementations (so they see the full class declaration and can exercise the constructor). The signature of the helper function is probably much like that of the constructor, but it returns an instance reference as a result (This constructor proxy can return a NULL pointer and it can even throw exceptions if you like that sort of thing). The helper function constructs a particular implementation instance and returns it cast as a reference to an instance of the abstract class.
Mission accomplished.
Oh, and recompilation and relinking should work the way you want, avoiding recompilation of calling modules when only the implementation changes (since the calling module no longer does any storage allocations for the implementations).
You're all ignoring the point of the question -
Why must the developer type out the PIMPL code?
For me, the best answer I can come up with is that we don't have a good way to express C++ code that allows you to operate on it. For instance, compile-time (or pre-processor, or whatever) reflection or a code DOM.
C++ badly needs one or both of these to be available to a developer to do meta-programming.
Then you could write something like this in your public MyClass.h:
#pragma pimpl(MyClass_private.hpp)
And then write your own, really quite trivial wrapper generator.
Someone will have a much more verbose answer than I, but the quick response is two-fold: the compiler needs to know all the members of a struct to determine the storage space requirements, and the compiler needs to know the ordering of those members to generate offsets in a deterministic way.
The language is already fairly complicated; I think a mechanism to split the definitions of structured data across the code would be a bit of a calamity.
Typically, I've always seen policy classes used to define implementation behavior in a Pimpl-manner. I think there are some added benefits of using a policy pattern -- easier to interchange implementations, can easily combine multiple partial implementations into a single unit which allow you to break up the implementation code into functional, reusable units, etc.
May be because the size of the class is required when passing its instance by values, aggregating it in other classes, etc ?
If C++ did not support value semantics, it would have been fine, but it does.
Yes, but...
You need to read Stroustrup's "Design and Evolution of C++" book. It would have inhibited the uptake of C++.
I'm writing in second-person just because its easy, for you.
You are working with a game engine and really wish a particular engine class had a new method that does 'bla'. But you'd rather not spread your 'game' code into the 'engine' code.
So you could derive a new class from it with your one new method and put that code in your 'game' source directory, but maybe there's another option?
So this is probably completely illegal in the C++ language, but you thought at first, "perhaps I can add a new method to an existing class via my own header that includes the 'parent' header and some special syntax. This is possible when working with a namespace, for example..."
Assuming you can't declare methods of a class across multiple headers (and you are pretty darn sure you can't), what are the other options that support a clean divide between 'middleware/engine/library' and 'application', you wonder?
My only question to you is, "does your added functionality need to be a member function, or can it be a free function?" If what you want to do can be solved using the class's existing interface, then the only difference is the syntax, and you should use a free function (if you think that's "ugly", then... suck it up and move on, C++ wasn't designed for monkeypatching).
If you're trying to get at the internal guts of the class, it may be a sign that the original class is lacking in flexibility (it doesn't expose enough information for you to do what you want from the public interface). If that's the case, maybe the original class can be "completed", and you're back to putting a free function on top of it.
If absolutely none of that will work, and you just must have a member function (e.g. original class provided protected members you want to get at, and you don't have the freedom to modify the original interface)... only then resort to inheritance and member-function implementation.
For an in-depth discussion (and deconstruction of std::string'), check out this Guru of the Week "Monolith" class article.
Sounds like a 'acts upon' relationship, which would not fit in an inheritance (use sparingly!).
One option would be a composition utility class that acts upon a certain instance of the 'Engine' by being instantiated with a pointer to it.
Inheritance (as you pointed out), or
Use a function instead of a method, or
Alter the engine code itself, but isolate and manage the changes using a patch-manager like quilt or Mercurial/MQ
I don't see what's wrong with inheritance in this context though.
If the new method will be implemented using the existing public interface, then arguably it's more object oriented for it to be a separate function rather than a method. At least, Scott Meyers argues that it is.
Why? Because it gives better encapsulation. IIRC the argument goes that the class interface should define things that the object does. Helper-style functions are things that can be done with/to the object, not things that the object must do itself. So they don't belong in the class. If they are in the class, they can unnecessarily access private members and hence widen the hiding of that member and hence the number of lines of code that need to be touched if the private member changes in any way.
Of course if you want to access protected members then you must inherit. If your desired method requires per-instance state, but not access to protected members, then you can either inherit or composite according to taste - the former is usually more concise, but has certain disadvantages if the relationship isn't really "is a".
Sounds like you want Ruby mixins. Not sure there's anything close in C++. I think you have to do the inheritance.
Edit: You might be able to put a friend method in and use it like a mixin, but I think you'd start to break your encapsulation in a bad way.
You could do something COM-like, where the base class supports a QueryInterface() method which lets you ask for an interface that has that method on it. This is fairly trivial to implement in C++, you don't need COM per se.
You could also "pretend" to be a more dynamic language and have an array of callbacks as "methods" and gin up a way to call them using templates or macros and pushing 'this' onto the stack before the rest of the parameters. But it would be insane :)
Or Categories in Objective C.
There are conceptual approaches to extending class architectures (not single classes) in C++, but it's not a casual act, and requires planning ahead of time. Sorry.
Sounds like a classic inheritance problem to me. Except I would drop the code in an "Engine Enhancements" directory & include that concept in your architecture.