How to use c++ classes without names? - c++

I encountered a problem in some open-source C++ code. The following is a small, simplified version that describes my problem:
#include <iostream>
using namespace std;
#define TOGETHER2(a,b) a ## b
#define TOGETHER(a,b) TOGETHER2(a,b)
#define GENERATE_NAME(a) TOGETHER(a,__COUNTER__)
#define GENERATE GENERATE_NAME(__seed_)
class base{
}b;
class GENERATE:public base{
}GENERATE;
class GENERATE:public base{
}GENERATE;
class GENERATE:public base{
}GENERATE;
class GENERATE:public base{
}GENERATE;
int main(){
return 0;
}
As we can see, the author defines several classes that inherit from a base class, but does not care about their names. So I wonder: how can I use those classes without specifying their names?
Is this a kind of design pattern in c++ that I don't know?
Thank you :)
I want to add my guesses to make the question clear.
My guesses:
The names of these classes are generated from __seed_, but when I search through the files, I cannot find any other references to __seed_. So I am sure the author did not use names such as __seed_1 or __seed_2 to create objects of these classes. (In fact, the author said in the comments that she did not care about the names of the classes.)
I also guessed that the author may have used those classes through the interface defined in the base class (virtual functions). To do that, the author would still need to create objects of these classes, but as I mentioned, I could not find __seed_ in other parts of the code. So the author could not have created them elsewhere, and therefore virtual functions would not work either.
Actually, I tried removing these class definitions, and strangely the code still compiled. It lost some of its functionality, but it did not core dump, and it could still finish some tasks successfully.
So, does anyone know:
How can we use those classes without specifying their names?
Is this design a certain kind of design pattern?
In which situations should we define classes without caring about their names?
As I mentioned, I removed part of the code and it still compiled. How could this happen? Since I removed many classes from the source, the code should fail to compile if other parts of it reference those classes. And if it does compile, can I conclude that those classes are not needed?
ADDED:
As some of you recommended,
the full source code is here: MIT Cryptdb. In the file ./main/rewrite_const.cc, the author uses the macro ANON (line 25) to define many classes without caring about their names.
Really appreciate your help :)

I recommend that you edit the code and add names for the classes. This is a strange design pattern, and I wouldn't recommend using it for anything unless you want to prevent others from using your classes.
If the author wants you to use those classes, there is probably some way you can use them without editing the code and adding the names. You should consult the documentation for this.
As I mentioned, I removed part of the code and it still compiled. How could this happen? Since I removed many classes from the source, the code should fail to compile if other parts of it reference those classes. And if it does compile, can I conclude that those classes are not needed?
All those generated classes are derived from the base class. So if you remove one class, all classes that come after it receive a new generated name. If the code now compiles, it means the other code is only calling the methods that are part of the base class. But the other code is now using other classes than what it originally used, which causes the errors you observe.
Consider this:
Initially the generated classes have names A, B, and C.
You remove class A.
Now the generated classes have names A and B. A class named C no longer exists, so code that uses it should no longer compile. And the code that previously used classes A and B is now using the classes that used to be B and C.

These classes do have names; the names are just not revealed to the human reader and are not fixed until the preprocessor has run. (If you run the compiler with option -E, it will only run the preprocessor stage and output the code as the compiler proper sees it, including the class names.)
AFAIK, there is no sensible reason to hide the names in this way. If the author doesn't want humans to write code that uses these classes, then there are other ways.
Defining such names in a header file to be included by the user implies that they cannot be used from within the library other than via polymorphism (because the library cannot know their names). This is the reason why removing them made no difference regarding compilation.
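One way such classes can be used "via polymorphism" without anyone spelling out their names is self-registration. The sketch below is only an assumption about how this kind of code typically works, not something confirmed from the CryptDB source: each generated class defines one global object, and the base-class constructor adds that object to a list. The names Handler, registry(), ANON_NAME and handler_ are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <vector>

#define TOGETHER2(a, b) a##b
#define TOGETHER(a, b) TOGETHER2(a, b)
// Each expansion yields a fresh identifier: handler_0, handler_1, ...
// (__COUNTER__ is a widely supported compiler extension, not standard C++.)
#define ANON_NAME TOGETHER(handler_, __COUNTER__)

class Handler;

// A function-local static avoids initialization-order problems between globals.
inline std::vector<Handler*>& registry() {
    static std::vector<Handler*> handlers;
    return handlers;
}

class Handler {
public:
    Handler() { registry().push_back(this); }  // every instance announces itself
    virtual ~Handler() = default;
    virtual std::string name() const = 0;
};

// The class name and the object name are two different generated identifiers;
// nothing else in the program ever spells them out.
static class ANON_NAME : public Handler {
    std::string name() const override { return "add"; }
} ANON_NAME;

static class ANON_NAME : public Handler {
    std::string name() const override { return "multiply"; }
} ANON_NAME;

// Callers only ever see the base class.
inline std::string all_handler_names() {
    std::string out;
    for (const Handler* h : registry()) out += h->name() + ";";
    return out;
}
```

Under this assumption, deleting one of the class definitions still compiles, because no other code mentions the generated name. The registry simply ends up with one entry fewer, which would match the "compiles but loses some functionality" behaviour described in the question.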

Related

C++ class extension technology

I know the official answer to "extension classes in C++, like in Objective-C or C#" is NO. But is there any hack to implement this? And what is the cost?
I ask this because my colleagues use my parser to generate C++ class files from a text file in a special format. They complain that it is difficult to extend the generated classes.
I can't force them to use inheritance, because the class generated is like this:
class A {};
class B : A {};
if my colleague extends A like this:
class C : A {};
then class B will not benefit from class C. In our situation, once C extends A, B should inherit from C instead of A. But that is not possible, because B is hard-coded to inherit from A. So inheritance is not a good option; the real demand is to extend A itself.
Using A as a member of a new class is not an option either. Our logic is more "is-a" than "has-a", so forcing A to be a member would make the code hard to read.
Currently they modify the class header file directly and implement any new member functions in a new cpp file (thanks to the C++ class file structure). When the class changes, the original cpp file is regenerated, which they don't care about, and they use git to merge the newly generated header file into the one they have modified.
I could write a parser to scan the header file and do the merge, but writing a parser that fully implements the C++ standard BNF (http://www.externsoft.ch/download/cpp-iso.html) is difficult.
For now I have decided to use macros, like the mechanism flex and bison use to copy the actions in a .y file into the generated C file. But I wonder if there's an easier way.
A common C++ solution is freeFunction(A&) instead of creating class B. Unlike pure OO languages, C++ has free functions, which are not class members. Your freeFunction_B(A&) and your colleague's freeFunction_C(A&) will not interfere.
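A minimal sketch of the free-function approach, assuming the generated class A exposes a public accessor. The member name() and the functions greeting() and name_length() are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Stand-in for a generated class; regenerating it never touches the code below.
class A {
public:
    explicit A(std::string name) : name_(std::move(name)) {}
    const std::string& name() const { return name_; }
private:
    std::string name_;
};

// Stand-in for the generated derived class.
class B : public A {
public:
    using A::A;  // inherit A's constructor
};

// Extensions live outside the class and rely only on A's public interface,
// so they work for A and for everything derived from it.
inline std::string greeting(const A& a) { return "hello " + a.name(); }
inline std::size_t name_length(const A& a) { return a.name().size(); }
```

Because greeting() takes a const A&, it can be called with a B as well, which is the "B benefits from the extension" effect the question asks for, as long as no new data members are needed.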
Obviously this is not a solution when you need to add data members. In that case, there's another option. Leave open the base class:
template<typename BASE> class B : public BASE {
// ...
};
This allows both B<A> and B<C<A>>. Slight downside: C<B<A>> is not the same type as B<C<A>>, which is logical. The members have to be in a certain order in memory, and there are two choices.
(General advice: code generation and C++? That means templates)
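To make the open-base idea concrete, here is a small sketch. The members id(), doubled_id() and next_id() are hypothetical; only the shape of A, B and C follows the discussion above.

```cpp
#include <cassert>
#include <type_traits>

// Stand-in for the generated base class.
class A {
public:
    int id() const { return 1; }
};

// A colleague's extension, written as a mixin over an open base.
template <typename Base>
class C : public Base {
public:
    int doubled_id() const { return 2 * this->id(); }
};

// The generated derived class, also with its base left open.
template <typename Base>
class B : public Base {
public:
    int next_id() const { return this->id() + 1; }
};

using PlainB = B<A>;        // original hierarchy: B derives directly from A
using ExtendedB = B<C<A>>;  // B now also picks up C's additions

// The nesting order matters: these are two distinct types.
static_assert(!std::is_same<B<C<A>>, C<B<A>>>::value,
              "B<C<A>> and C<B<A>> are different types");
```

With this layout the generator keeps emitting B unchanged, and whoever instantiates it decides whether an extension sits between A and B.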
After researching this for some time, I think "monkey patching" is the technique I'm looking for, but it seems it can only be implemented in languages that have reflection.
Currently I use "has-a" extension in my code instead of "is-a" to avoid changing the generated code.

Can I use a slim version of my header to be included with the library?

What I mean is my real header file can look like this:
#include "some_internal_class.h"
class MyLibrary {
Type private_member;
void private_function();
public:
MyLibrary();
void function_to_be_called_by_library_users();
};
Now I want to produce a dynamic library containing all the necessary definitions, and I want to ship it with a single header instead of every header in my library.
So I was thinking I could create a slim version of my header like so:
class MyLibrary {
public:
MyLibrary();
void function_to_be_called_by_library_users();
};
Headers are just declarations anyway right? they're never passed to the compiler. And I've declared what the user will be using.
Is that possible? If not, why not?
This is a One Definition Rule violation, from the moment you deviate by a single token.
[basic.def.odr]/6
There can be more than one definition of a class type, [...] in a
program provided that each definition appears in a different
translation unit, and provided the definitions satisfy the following
requirements. Given such an entity named D defined in more than one
translation unit, then
each definition of D shall consist of the same sequence of tokens; and
Your program may easily break if you violate the ODR like that. And your build system isn't at all obligated to even warn you about it.
You cannot define a class twice. It breaks the One Definition Rule (ODR). MyLibrary does that, unfortunately.
they're never passed to the compiler
They will. Members of a class must be known at compile time, so that the compiler can determine the class's size.
Headers are just declarations anyway right? they're never passed to the compiler. And I've declared what the user will be using.
No. Headers are part of source code and are compiled together with source files. They contain the information necessary for a compiler to understand how to work with code (in your case, with class MyLibrary).
As an example, you want library users to be able to create objects of class MyLibrary, so you export the constructor. However, this is not sufficient: the compiler needs to know the size of the object to be created, which is impossible unless you specify all the fields.
In practice, deciding what to expose to library users and what to hide as implementation details is a hard question, which requires detailed inspection of the library usage and semantics. If you really want to hide the class internals as implementation detail, here are some common options:
The pimpl idiom is a common solution. It enables you to work with the class as it is usually done, but the implementation details are nicely hidden.
Extract the interface into an abstract class with virtual functions, and use pointers (preferably smart pointers) to work with the objects.
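A minimal sketch of the pimpl option, shown in one file for brevity. It assumes a hypothetical private counter, and it changes the return type of function_to_be_called_by_library_users() to std::string purely so that the effect is visible; neither detail comes from the question.

```cpp
#include <cassert>
#include <memory>
#include <string>

// ---- What would ship in the single public header ----
class MyLibrary {
public:
    MyLibrary();
    ~MyLibrary();  // defined in the source file, where Impl is complete
    MyLibrary(const MyLibrary&) = delete;
    MyLibrary& operator=(const MyLibrary&) = delete;
    std::string function_to_be_called_by_library_users();
private:
    struct Impl;                  // declared only; no internal headers needed
    std::unique_ptr<Impl> impl_;  // object size no longer depends on private data
};

// ---- What would stay in the library's .cpp file ----
struct MyLibrary::Impl {
    int call_count = 0;  // hypothetical private state
    std::string private_function() {
        return "call " + std::to_string(++call_count);
    }
};

MyLibrary::MyLibrary() : impl_(new Impl) {}
MyLibrary::~MyLibrary() = default;

std::string MyLibrary::function_to_be_called_by_library_users() {
    return impl_->private_function();
}
```

The library and its users now compile against the same class definition, so there is no ODR problem, and adding a field to Impl does not change the header at all.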
Headers are just declarations anyway right? they're never passed to the compiler.
The moment you #include a file, its contents are copied and pasted into your source file exactly as they are.
So even though you don't pass them directly as compiler arguments, they're still part of your code and code in them will be compiled into your translation units.
Solutions by @lisyarus are pretty good.
But another option would be doing it the C way. Which is the most elegant in my opinion.
In C you give your users a handle, which will most likely be a pointer.
Your header would look something like this:
typedef struct MyLibrary MyLibrary;
MyLibrary*
my_library_init();
void
my_library_destroy(MyLibrary*);
void
my_library_function_to_be_called_by_library_users(MyLibrary*);
A very small and simple interface that does not show your users anything you don't want them to see.
Another nice perk is that your build system will not have to recompile your whole program just because you added a field to the MyLibrary struct.
You have to watch out though, because now you have to call my_library_destroy which will carry the logic of your destructor.
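For completeness, a sketch of what the library side of such a handle interface might look like when implemented in C++. The calls field and the int return value are hypothetical and only there to make the behaviour observable.

```cpp
#include <cassert>

// ---- Public header: the type is declared but never defined for users ----
struct MyLibrary;
MyLibrary* my_library_init();
void my_library_destroy(MyLibrary*);
int my_library_function_to_be_called_by_library_users(MyLibrary*);

// ---- Library source: the only place that knows the layout ----
struct MyLibrary {
    int calls;  // hypothetical private field
};

MyLibrary* my_library_init() { return new MyLibrary{0}; }

void my_library_destroy(MyLibrary* lib) { delete lib; }

int my_library_function_to_be_called_by_library_users(MyLibrary* lib) {
    assert(lib != nullptr);  // the handle must come from my_library_init()
    return ++lib->calls;
}
```

Users can hold and pass around a MyLibrary* but cannot create one on the stack or look inside it, because the struct definition never leaves the library.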

Abstract Base Class w/o Polymorphism

Why would you have an abstract base class defining an interface for a library where there is only one (always and forever) derived class?
You may want to swap out the implementation for something like Unit Testing
One reason why you would do this is for testability. It is much simpler to test dependent objects when their dependencies are defined as interfaces. This gives an easy way to mock or stub.
To violate the reused abstraction principle.
In short, don't do this.
Those who say "for testing" are overlooking that you can just replace
Base < - > Derived
Base < - > DerivedForMockingAndTesting
with
Derived < - > DerivedForMockingAndTesting
That is, let your existing implementation Derived serve as the "abstraction" to be mocked out and tested in unit testing.
If you can be 100% certain that there will always be only one and exactly one derived class? Not much reason. BUT: In reality you hardly will be 100% certain of anything and surely not the future of your code.
You may find the need for different versions of that class to be binary compatible. You may also find that, for other reasons, you wish to encapsulate the definition of the class: for example, because the definition requires a header that has poorly behaved macros, or because the header needed to define the class takes a very long time to compile.
For example, I wrote a class to encapsulate the features offered by my operating system- for now, things like dynamic loading and creating a window. Even though there'll only ever be one implementation for one compile target (Windows, etc), I chose to use a run-time abstraction, because I wanted to guarantee that the rest of my code never saw a platform-specific header, and the Windows header is full of so many macros and stuff, that I didn't want them leaking out.
The most obvious reason would be to be able to put the class'
implementation in a source file, and not in a header. All that is
exposed in the header is the abstract base class (and a factory function
necessary to construct it, but this could be a static member). This
avoids having to include the header files for any member data; the pimpl
idiom is more idiomatic in C++ for this, but using abstract classes like
this is far from unknown, and works fairly well as well.
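A rough sketch of that layout, with the factory as a static member. The names Platform, PlatformImpl and the returned string are invented for the example.

```cpp
#include <cassert>
#include <memory>
#include <string>

// ---- Header: the only thing clients ever include ----
class Platform {
public:
    virtual ~Platform() = default;
    virtual std::string name() const = 0;
    static std::unique_ptr<Platform> create();  // factory as a static member
};

// ---- Source file: the single concrete class, invisible to clients ----
namespace {
class PlatformImpl : public Platform {
public:
    std::string name() const override { return "example-os"; }
};
}  // namespace

std::unique_ptr<Platform> Platform::create() {
    return std::unique_ptr<Platform>(new PlatformImpl);
}
```

Any heavy or macro-laden headers that PlatformImpl needs are included only by the source file, so they never reach client code.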

When to use Header files that do not declare a class but have function definitions

I am fairly new to C++ and I have seen a bunch of code that has method definitions in the header files and they do not declare the header file as a class. Can someone explain to me why and when you would do something like this. Is this a bad practice?
Thanks in advance!
Is this a bad practice?
Not in general. There are a lot of libraries that are header only, meaning they only ship header files. This can be seen as a lightweight alternative to compiled libraries.
More importantly, though, there is a case where you cannot use separately compiled translation units: a template's definition must be visible in every translation unit that instantiates it. This may sound arcane but it has a simple consequence:
Function (and class) templates cannot be defined inside cpp files and used elsewhere; instead, they have to be defined inside header files directly (with a few notable exceptions).
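As an illustration, here is what a small header with no classes at all might contain. The file name clamp_util.h and both function names are hypothetical.

```cpp
#include <cassert>

// ---- Contents of a hypothetical header "clamp_util.h" ----
#ifndef CLAMP_UTIL_H
#define CLAMP_UTIL_H

// Template: the full definition must be visible wherever it is instantiated,
// so it lives in the header rather than in a .cpp file.
template <typename T>
T clamp_to(T value, T low, T high) {
    assert(!(high < low));  // precondition: the range must not be inverted
    return value < low ? low : (high < value ? high : value);
}

// Non-template free function: 'inline' lets the definition appear in every
// translation unit that includes the header without violating the ODR.
inline int clamp_percent(int value) { return clamp_to(value, 0, 100); }

#endif  // CLAMP_UTIL_H
```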
Additionally, classes in C++ are purely optional – while you can program object oriented in C++, a lot of good code doesn't. Classes supplement algorithms in C++, not the other way round.
It's not bad practice. The great thing about C++ is that it lets you program in many styles. This gives the language great flexibility and utility, but possibly makes it trickier to learn than other languages that force you to write code in a particular style.
If you had a small program, you could write it in one function - possibly using a couple of goto's for code flow.
When you get bigger, splitting the code into functions helps organize things.
Bigger still, and classes are generally a good way of grouping related functions that work on a certain set of data.
Bigger still, namespaces help out.
Sometimes, though, it's just easiest to write a function to do something. This is often the case when you write a function that only works on primitive types (like int). int doesn't have a class, so if you wanted to write a printInt() function, you might make it standalone. Also, if a function works on objects from multiple classes but doesn't really belong to one class more than the other, it might make sense as a standalone function. This happens a lot when you write operators, for example defining less-than so that it can compare objects of two different classes. Or, if a function can be written in terms of a class's public methods and doesn't need to access the class's data directly, some people prefer to write it as a standalone function.
But, really, the choice is yours. Whatever is the most simple thing to do to solve your problem is best.
You might start a program off as just a few functions, and then later decide some are related and refactor them into a class. But, if the other standalone functions don't naturally fit into a class, you don't have to force them into one.
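The less-than case mentioned above could look roughly like this. Meters and Feet are made-up classes; the point is only that the operators belong to neither of them.

```cpp
#include <cassert>

class Meters {
public:
    explicit Meters(double v) : value_(v) {}
    double value() const { return value_; }
private:
    double value_;
};

class Feet {
public:
    explicit Feet(double v) : value_(v) {}
    double value() const { return value_; }
private:
    double value_;
};

// Standalone operators: they need only the public interface of each class,
// and neither class has to know that the other exists.
inline bool operator<(const Meters& lhs, const Feet& rhs) {
    return lhs.value() < rhs.value() * 0.3048;  // 1 ft = 0.3048 m
}

inline bool operator<(const Feet& lhs, const Meters& rhs) {
    return lhs.value() * 0.3048 < rhs.value();
}
```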
An H file is simply a way of including a bunch of declarations. Many things in C++ are useful declarations, including classes, types, constants, global functions, etc.
C++ has a strong object oriented facet. Most OO languages tackle the question of where to deal with operations that don't rely on object state and don't actually need the object.
In some languages, like Java, language restrictions force everything to be in a class, so everything becomes a static member function (e.g., classes with math utilities or algorithms).
In C++, to maintain compatibility with C, you are allowed to declare standalone C-style functions or use the Java style of static members. My personal view is that it is better, when possible, to use the OO style and organize operations around a central concept.
However, C++ does provide the namespaces facilities and often it is used in the same way that a class would be used in those situations - to group a bunch of standalone items where each item is prefixed by the "namespace" name. As others point out, many C++ standard library functions are located this way. My view is that this is much like using a class in Java. However, others would argue that Java uses classes because it doesn't have namespaces.
As long as you use one or the other (rather than a floating standalone non-namespaced function) you're generally going to be ok.
I am fairly new to C++ and I have seen a bunch of code that has method definitions in the header files and they do not declare the header file as a class.
Let's clarify things.
method definitions in the header files
This means something like this:
file "A.h":
class A {
void method(){/*blah blah*/} //definition of a method
};
Is this what you meant?
Later you are saying "declare the header file". There is no mechanism for DECLARING a file in C++. A file can be INCLUDED by writing #include "filename.h". If you do this, the contents of the header file will be copied and pasted to wherever you have the above line before anything gets compiled.
So you mean that all the definitions are in the class definition (not anywhere in A.h FILE, but specifically in the class A, which is limited by 'class A{' and '};' ).
The implication of having the method definition inside the class definition is that the method is implicitly 'inline' (a C++ keyword). This allows the definition to appear in every file that includes the header, and it hints that the compiler may paste the method body wherever there is a call to it. If it does, this is:
good, because the function call mechanism no longer slows down the execution
bad if the function is longer than a short statement, because the size of executable code grows badly
Things are different for templates, as someone above stated, but for them there is a way of defining methods such that they are not inline, but still in the header file (they must be in headers). These definitions have to be outside the class definition anyway.
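A short sketch of the three placements discussed above, all inside one hypothetical header. Counter and Box are invented names.

```cpp
#include <cassert>

// ---- Contents of a hypothetical header "counter.h" ----
class Counter {
public:
    // Defined inside the class body: implicitly inline.
    void increment() { ++count_; }
    int count() const;  // only declared here
private:
    int count_ = 0;
};

// Defined outside the class but still in the header: for a non-template
// class this needs an explicit 'inline' to be safe in multiple source files.
inline int Counter::count() const { return count_; }

template <typename T>
class Box {
public:
    explicit Box(T v);
    const T& get() const;
private:
    T value_;
};

// Member functions of a class template can be defined outside the class body,
// as long as they stay visible to every user of the header.
template <typename T>
Box<T>::Box(T v) : value_(v) {}

template <typename T>
const T& Box<T>::get() const { return value_; }
```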
In C++, functions do not have to be members of classes.

Could C++ have not obviated the pimpl idiom?

As I understand it, the pimpl idiom exists only because C++ forces you to place all the private class members in the header. If the header contained only the public interface, then in theory a change in the class implementation would not necessitate recompiling the rest of the program.
What I want to know is why C++ was not designed to allow such a convenience. Why does it demand that the private parts of a class be openly displayed in the header (no pun intended)?
This has to do with the size of the object. The h file is used, among other things, to determine the size of the object. If the private members are not given in it, then you would not know how large an object to new.
You can simulate, however, your desired behavior by the following:
class MyClass
{
public:
// public stuff
private:
#include "MyClassPrivate.h"
};
This does not enforce the behavior, but it gets the private stuff out of the .h file.
On the down side, this adds another file to maintain.
Also, in Visual Studio, IntelliSense does not work for the private members - this could be a plus or a minus.
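The size argument at the start of this answer can be checked directly. In this sketch both classes have the same public interface, and the private members (hypothetical names) are the only difference.

```cpp
#include <cassert>
#include <cstddef>

class PublicOnly {
public:
    void work() {}
};

class WithPrivateData {
public:
    void work() {}
private:
    double samples_[8];  // hypothetical private state
    int count_;
};

// Code that creates an object must reserve this many bytes, so it has to see
// the private members even though it may never touch them.
constexpr std::size_t kSmall = sizeof(PublicOnly);
constexpr std::size_t kLarge = sizeof(WithPrivateData);
static_assert(kLarge > kSmall, "private members change the object size");
```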
I think there is a confusion here. The problem is not about headers. Headers don't do anything (they are just ways to include common bits of source text among several source-code files).
The problem, as much as there is one, is that class declarations in C++ have to define everything, public and private, that an instance needs to have in order to work. (The same is true of Java, but the way reference to externally-compiled classes works makes the use of anything like shared headers unnecessary.)
It is in the nature of common Object-Oriented Technologies (not just the C++ one) that someone needs to know the concrete class that is used and how to use its constructor to deliver an implementation, even if you are using only the public parts. The device in (3, below) hides it. The practice in (1, below) separates the concerns, whether you do (3) or not.
1. Use abstract classes that define only the public parts, mainly methods, and let the implementation class inherit from that abstract class. So, using the usual convention for headers, there is an abstract.hpp that is shared around. There is also an implementation.hpp that declares the inherited class and that is only passed around to the modules that implement methods of the implementation. The implementation.hpp file will #include "abstract.hpp" for use in the class declaration it makes, so that there is a single maintenance point for the declaration of the abstracted interface.
2. Now, if you want to enforce hiding of the implementation class declaration, you need to have some way of requesting construction of a concrete instance without possessing the specific, complete class declaration: you can't use new and you can't use local instances. (You can delete though.) Introduction of helper functions (including methods on other classes that deliver references to class instances) is the substitute.
3. Along with or as part of the header file that is used as the shared definition for the abstract class/interface, include function signatures for external helper functions. These functions should be implemented in modules that are part of the specific class implementations (so they see the full class declaration and can exercise the constructor). The signature of the helper function is probably much like that of the constructor, but it returns an instance reference as a result (This constructor proxy can return a NULL pointer and it can even throw exceptions if you like that sort of thing). The helper function constructs a particular implementation instance and returns it cast as a reference to an instance of the abstract class.
Mission accomplished.
Oh, and recompilation and relinking should work the way you want, avoiding recompilation of calling modules when only the implementation changes (since the calling module no longer does any storage allocations for the implementations).
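The steps above can be sketched as follows, collapsed into one file. Shape, Square and make_square() are hypothetical names; the comments mark which file each part would live in.

```cpp
#include <cassert>

// ---- abstract.hpp: shared with every caller ----
class Shape {
public:
    virtual ~Shape() {}  // virtual, so callers can delete through Shape*
    virtual double area() const = 0;
};

// Constructor proxy: returns a null pointer when the argument is rejected.
Shape* make_square(double side);

// ---- implementation.hpp / .cpp: seen only by the implementing module ----
class Square : public Shape {
public:
    explicit Square(double side) : side_(side) {}
    double area() const override { return side_ * side_; }
private:
    double side_;
};

Shape* make_square(double side) {
    if (side < 0.0) return nullptr;
    return new Square(side);  // handed out as a pointer to the abstract class
}
```

A caller that includes only the abstract header can create, use and delete a Shape, and adding a member to Square does not force that caller to recompile.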
You're all ignoring the point of the question -
Why must the developer type out the PIMPL code?
For me, the best answer I can come up with is that we don't have a good way to express C++ code that allows you to operate on it. For instance, compile-time (or pre-processor, or whatever) reflection or a code DOM.
C++ badly needs one or both of these to be available to a developer to do meta-programming.
Then you could write something like this in your public MyClass.h:
#pragma pimpl(MyClass_private.hpp)
And then write your own, really quite trivial wrapper generator.
Someone will have a much more verbose answer than I, but the quick response is two-fold: the compiler needs to know all the members of a struct to determine the storage space requirements, and the compiler needs to know the ordering of those members to generate offsets in a deterministic way.
The language is already fairly complicated; I think a mechanism to split the definitions of structured data across the code would be a bit of a calamity.
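The point about storage size and member offsets can be made concrete with offsetof. Record and its members are made up for the example, and the exact offsets depend on the platform's alignment rules, so only their ordering is relied on here.

```cpp
#include <cassert>
#include <cstddef>

struct Record {
    char tag;
    int id;
    double score;
};

// Reading 'r.score' compiles to "address of r plus offsetof(Record, score)",
// so every translation unit must agree on the full member list and its order.
constexpr std::size_t kTagOffset = offsetof(Record, tag);
constexpr std::size_t kIdOffset = offsetof(Record, id);
constexpr std::size_t kScoreOffset = offsetof(Record, score);
```

If one translation unit saw a Record without the id member, it would compute a different offset for score and read the wrong bytes.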
Typically, I've always seen policy classes used to define implementation behavior in a Pimpl-manner. I think there are some added benefits of using a policy pattern -- easier to interchange implementations, can easily combine multiple partial implementations into a single unit which allow you to break up the implementation code into functional, reusable units, etc.
Maybe because the size of the class is required when passing an instance by value, aggregating it in other classes, etc.?
If C++ did not support value semantics, it would have been fine, but it does.
Yes, but...
You need to read Stroustrup's "Design and Evolution of C++" book. It would have inhibited the uptake of C++.