When should I define a type as a struct or as a class?
I know that structs are value types while classes are reference types. So I wonder, for example: should I define a stack as a struct or a class?
Reason #1 to choose struct vs class: classes have inheritance, structs do not. If you need polymorphism, you must use classes.
Reason #2: structs are normally value types (though you can make them reference types if you work at it). Classes are always reference types. So, if you want a value type, choose a struct. If you want a reference type, it's easiest to go with a class.
Reason #3: If you have a type with a lot of data members, then you're probably going to want a reference type (to avoid expensive copying), in which case, you're probably going to choose a class.
Reason #4: If you want deterministic destruction for your type, then it's going to need to be a struct on the stack. Nothing on the GC heap has deterministic destruction, and the destructors/finalizers of objects on the GC heap may never be run at all: if they're collected by the GC, their finalizers will be run, but otherwise they won't. So, if you want your type to be destroyed automatically when it leaves scope, you need to use a struct and put it on the stack.
As for your particular case, containers should normally be reference types (copying all of their elements every time that you pass one around would be insanely expensive), and a Stack is a container, so you're going to want to use a class unless you want to go to the trouble of making it a ref-counted struct, which is decidedly more work. It just has the advantage of guaranteeing that its destructor will run when it's not used anymore.
On a side note, if you create a container which is a class, you're probably going to want to make it final so that its various functions can be inlined (and won't be virtual if that class doesn't derive from anything other than Object and they're not functions that Object has), which can be important for something like a container where performance can definitely matter.
Read "D"iving Into the D Programming Language
In D you get structs and then you get classes. They share many amenities but have different charters: structs are value types, whereas classes are meant for dynamic polymorphism and are accessed solely by reference. That way confusions, slicing-related bugs, and comments à la // No! Do NOT inherit! do not exist. When you design a type, you decide upfront whether it'll be a monomorphic value or a polymorphic reference. C++ famously allows defining ambiguous-gender types, but their use is rare, error-prone, and objectionable enough to warrant simply avoiding them by design.
For your Stack type, you are probably best off defining an interface first and then implementations thereof (using class) so that you don't tie a particular implementation of your Stack type to its interface.
Related
I need a template class which has different members, depending on which ctor is called.
I managed to get a class which has different members using SFINAE with a base class (I did it almost like this: SFINAE on member variable).
Now my question is: can I achieve a single template class which has different members, depending on which ctor of the class is called?
Maybe someone has an idea how to achieve this.
EDIT: I currently use boost::variant, but the problem is that the largest object in the variant is huge and the smallest is just a pointer. This is a real performance problem, because most of the time the pointer will be in the variant.
EDIT II: If this worked with a ctor it would be awesome, but if not, a factory function would work as well.
EDIT III (or what I am trying to achieve):
I am currently making a DSL, which translates to C++.
Since I am trying to make polymorphism possible, I am only passing pointers to functions. Because some pointers are reference counted and some are raw, depending on what the user wants, there can be shared_ptrs and raw pointers of the same class. That's why I can't make two different classes: if a function is called on a pointer, it should be the same function; otherwise I would have to overload all the functions, which would give me
2**n functions when the function has n arguments.
That's why I am trying to create a class which could represent either a raw pointer or a shared_ptr, based on what is passed to the ctor.
You should simply continue using variant<> but instead of storing your huge class as an object, store it as a pointer as well:
boost::variant<common_case*, huge_class*>
Since you say you usually store a pointer anyway, this doesn't cost you anything, and reclaims 100% of the wasted memory because all object pointers are the same size.
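A minimal sketch of that suggestion, assuming hypothetical type names common_case and huge_class taken from the line above (boost::get is used to query the active alternative):

#include <boost/variant.hpp>
#include <iostream>

struct common_case { int id; };
struct huge_class  { char payload[4096]; };  // stand-in for the large type

int main() {
    // Storing pointers for both alternatives keeps the variant small:
    // roughly one pointer plus the discriminator, however big huge_class is.
    boost::variant<common_case*, huge_class*> v;

    common_case c{42};
    v = &c;  // the common, cheap case

    // boost::get on a pointer-to-variant returns nullptr on a type mismatch.
    if (common_case** p = boost::get<common_case*>(&v))
        std::cout << (*p)->id << '\n';

    std::cout << sizeof(v) << '\n';  // small, regardless of sizeof(huge_class)
}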
So here's my dilemma:
I have a container which is going to store some objects. I'll interact with the objects in the container as if they were all of the base class. The base class is pure virtual. Some objects can be copied, and some can't. They're all movable though, so that's what I'm sticking with.
To give you an idea, I'm writing a container that is agnostic to accepting a custom shared_ptr and unique_ptr.
All objects will be the same size, and this will be verified at compile time with static_asserts.
I want to move objects around, and change the derived class type as I'm doing this. I'm guessing this is for the most part unsupported, but I'm looking to see if there's enough definition to what I want to do that I can create a properly formed solution.
I want to avoid undefined behaviour at all costs, but implementation-defined and unspecified behaviour is fine.
Can I simply run a memcpy from one object to another in this case? If not, is there something else I can do to get this to work?
I found the post below:
C++ polymorphism without pointers
which explains that to have polymorphic behaviour, C++ must use pointer or reference types.
I looked into some further resources; all of them say the same, but none of them explain the reason why.
Is there any technical difficulty in supporting polymorphism with values, or is it possible but the C++ designers decided not to provide that ability?
The problem with treating values polymorphically boils down to the object slicing issue: since derived objects could use more memory than their base class, declaring a value in the automatic storage (i.e. on the stack) leads to allocating memory only for the base, not for the derived object. Therefore, parts of the object that belong to the derived class may be sliced off. That is why C++ designers made a conscious decision to re-route virtual member-functions to the implementations in the base class, which cannot touch the data members of the derived class.
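A short illustration of the slicing described above (the type names are made up for the example):

#include <iostream>

struct Base {
    virtual ~Base() = default;
    virtual const char* name() const { return "Base"; }
};

struct Derived : Base {
    int extra = 7;  // data that has no room in a Base
    const char* name() const override { return "Derived"; }
};

int main() {
    Derived d;

    Base& ref = d;   // reference: no copy, dynamic type preserved
    std::cout << ref.name() << '\n';  // prints "Derived"

    Base val = d;    // value: only the Base subobject is copied
    std::cout << val.name() << '\n';  // prints "Base" -- 'extra' was sliced off
}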
The difficulty comes from the fact that what you call objects are allocated in automatic memory (on the stack) and the size must be known at compile-time.
Size of pointers are known at compile-time regardless of what they point to, and references are implemented as pointers under the hood, so no worries there.
Consider objects though:
BaseObject obj = ObjectFactory::createDerived();
How much memory should be allocated for obj if createDerived() conditionally returns derived objects? To overcome this, the object returned is sliced and "converted" to a BaseObject whose size is known.
This all stems from the "pay for what you use" mentality.
The short answer is because the standard specifies it. But are there any insurmountable technical barriers to allowing it?
C++ data structures have known size. Polymorphism typically requires that the data structures can vary in size. In general, you cannot store a different (larger) type within the storage of a smaller type, so storing a child class with extra variables (or other reasons to be larger) within storage for a parent class is not generally possible.
Now, we can get around this. We can create a buffer larger than what is required to store the parent class, and construct child classes within that buffer: but in this case, exposure to said instance will be via references, and you will carefully wrap the class.
This is similar to the technique known as "small object optimization" used by boost::any, boost::variant and many implementations of std::string, where we store (by value) objects in a buffer within a class and manage their lifetime manually.
There is also an issue where Derived pointers to an instance can have different values than Base pointers to the same instance: most implementations presume that a value instance of an object exists where the storage for the instance starts.
So in theory, C++ could allow polymorphic instances if we restricted it to derived classes that could be stored in the same memory footprint, with the same "pointer to" value for both Derived and Base. But this would be an extremely narrow corner case, and it could reduce the kinds of optimizations and assumptions compilers can make about value instances of a class in nearly every case (right now, the compiler can assume that a value instance of a class C has virtual methods that are not overridden elsewhere, as an example). That is a non-trivial cost for an extremely marginal benefit.
What's more, we are capable of using the C++ language to emulate this corner case using existing language features (placement new, references, and manual destruction) if we really need it, without imposing that cost.
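A rough sketch of that emulation, using an aligned buffer, placement new, and manual destruction; the Base/Small types, the PolyValue name, and the Capacity limit are illustrative assumptions:

#include <cstddef>
#include <iostream>
#include <new>

struct Base {
    virtual ~Base() = default;
    virtual void hello() const { std::cout << "Base\n"; }
};

struct Small : Base {
    void hello() const override { std::cout << "Small\n"; }
};

// Holds any derived object that fits in Capacity bytes, by value, but
// exposes it only through a Base reference -- the corner case done by hand.
template <std::size_t Capacity>
class PolyValue {
    alignas(std::max_align_t) unsigned char buf_[Capacity];
    Base* ptr_ = nullptr;
public:
    template <class D, class... Args>
    void emplace(Args&&... args) {
        static_assert(sizeof(D) <= Capacity, "type too large for buffer");
        reset();
        ptr_ = ::new (buf_) D(static_cast<Args&&>(args)...);  // placement new
    }
    Base& get() { return *ptr_; }
    void reset() { if (ptr_) { ptr_->~Base(); ptr_ = nullptr; } }  // manual destruction
    ~PolyValue() { reset(); }
};

int main() {
    PolyValue<64> v;
    v.emplace<Small>();
    v.get().hello();  // prints "Small": dynamic dispatch through the buffer
}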
It is not immediately clear what you mean by "polymorphism with values". In C++ when you have an object of type A, it always behaves as an object of type A. This is a perfectly normal and logical thing to expect. I don't see how it could possibly behave in any other way. So, it is not clear what "ability" that someone decided "not to provide" you are talking about.
Polymorphism in C++ means one thing: virtual function calls made through an expression with polymorphic type are resolved in accordance with the dynamic type of that expression (as opposed to static type for non-virtual functions). That's all there is to it.
Polymorphism in C++ always works in accordance with the above rule. It works that way through pointers. It works that way through references. It works that way through immediate objects ("values", as you called them). So, it is not correct to say that polymorphism in C++ only works with pointers and references. It works with "values" as well. They all follow the same rule, as stated above.
However, for an immediate object (a "value") its dynamic type is always the same as its static type. So, even though polymorphism works for immediate values, it does not demonstrate anything truly "polymorphic". The behavior of an immediate object with polymorphism is the same as it would be without polymorphism. So, polymorphism of an immediate object is degenerate, trivial polymorphism. It exists only conceptually. This is, again, perfectly logical: an object of type A should behave as an object of type A. How else could it behave?
In order to observe the actual non-degenerate polymorphism, one needs an expression whose static type is different from its dynamic type. Non-trivial polymorphism is observed when an expression of static type A behaves (with regard to virtual function calls) as an object of different type B. For this an expression of static type A must actually refer to an object of type B. This is only possible with pointers or references. The only way to create that difference between static and dynamic type of an expression is through using pointers or references.
In other words, it's not correct to say that polymorphism in C++ only works through pointers or references. It is correct to say that with pointers or references polymorphism becomes observable and non-trivial.
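A tiny example of that distinction; the static type of both expressions below is A, but only the reference's dynamic type differs from it:

#include <iostream>

struct A {
    virtual ~A() = default;
    virtual const char* who() const { return "A"; }
};

struct B : A {
    const char* who() const override { return "B"; }
};

int main() {
    B b;

    A value = b;  // immediate object: static type A, dynamic type A
    std::cout << value.who() << '\n';  // "A": degenerate, trivial polymorphism

    A& ref = b;   // static type A, dynamic type B
    std::cout << ref.who() << '\n';    // "B": observable polymorphism
}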
Is there any efficiency disadvantage associated with deep inheritance trees (in C++), i.e., a large set of classes A, B, C, and so on, such that B extends A, C extends B, and so on? One efficiency implication that I can think of is that when we instantiate the bottom-most class, say C, then the constructors of B and A are also called, which will have performance implications.
Let's enumerate the operations we should consider:
Construction/destruction
Each constructor/destructor will call its base class equivalents. However, as James McNellis pointed out, you were obviously going to do that work anyway. You didn't derive from A just because it was there. So the work is going to get done one way or another.
Yes, it will involve a few more function calls. But function call overhead will be nothing compared to the actual work any significantly deep class hierarchy will have to actually do. If you're at the point where function call overhead is actually important for performance, I would strongly suggest that calling constructors at all is probably not what you want to be doing in that code.
Object Size
In general, the overhead for a derived class is nothing. The overhead for virtual members is a v-table pointer per object, with a little more for virtual inheritance.
Member Function Calls, Static
By this, I mean calling non-virtual member functions, or calling virtual member functions with class names (ClassName::FunctionName syntax). Both of these allow the compiler to know at compile time which function to call.
The performance of this is invariant with the size of the hierarchy, since it's compile-time determined.
Member Function Calls, Dynamic
This is calling virtual functions with the full and complete expectation of runtime calls.
Under most sane C++ implementations, this is invariant with the size of the object hierarchy. Most implementations use a v-table for each class. Each object has a v-table pointer as a member. For any particular dynamic call, the compiler accesses the v-table pointer, picks out the method, and calls it. Since the v-table is the same for each class, it won't be any slower for a class that has a deep hierarchy than one with a shallow one.
Virtual inheritance plays a bit with this.
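A sketch of that invariance: whether the concrete class is one or three levels deep, the call below is the same single v-table lookup (assuming the usual v-table implementation described above):

#include <iostream>

struct A     { virtual ~A() = default; virtual void f() const { std::cout << "A\n"; } };
struct B : A { void f() const override { std::cout << "B\n"; } };
struct C : B { void f() const override { std::cout << "C\n"; } };
struct D : C { void f() const override { std::cout << "D\n"; } };

void call(const A& a) {
    a.f();  // one indirect call through the v-table pointer, regardless of depth
}

int main() {
    D d;
    call(d);  // prints "D"
}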
Pointer Casts, Static
This refers to static_cast or any equivalent operation. This means the implicit cast from a derived class to a base class, the explicit use of static_cast or C-style casts, etc.
Note that this technically includes reference casting.
The performance of static casts between classes (up or down) is invariant with the size of the hierarchy. Any pointer offsets will be compile-time generated. This should be true for virtual inheritance as well as non-virtual inheritance, but I'm not 100% certain of that.
Pointer Casts, Dynamic
This obviously refers to the explicit use of dynamic_cast. This is typically used when casting from a base class to a derived one.
The performance of dynamic_cast will likely change for a large hierarchy. But sane implementations should only check the classes between the current class and the requested one. So it's simply linear in the number of classes between the two, not linear in the number of classes in the hierarchy.
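For example (a hypothetical hierarchy), the cost of the casts below depends on the path between the two classes involved, not on how many other classes exist:

#include <iostream>

struct Base        { virtual ~Base() = default; };
struct Mid : Base  {};
struct Leaf : Mid  {};
struct Other : Base {};

int main() {
    Leaf leaf;
    Base* b = &leaf;

    // Succeeds: checks the Base -> Mid -> Leaf path.
    if (Leaf* l = dynamic_cast<Leaf*>(b))
        std::cout << "got a Leaf: " << l << '\n';

    // Fails cleanly: b does not point to an Other, so we get nullptr.
    if (dynamic_cast<Other*>(b) == nullptr)
        std::cout << "not an Other\n";
}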
Typeid
This means the use of the typeid operator to fetch the std::type_info object associated with an object.
The performance of this will be invariant with the size of the hierarchy. If the class is polymorphic (has virtual functions or virtual base classes), then it will simply pull it out of the vtable. If it's not polymorphic, then it's compile-time defined.
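A short example; for a polymorphic type the lookup goes through the v-table at runtime, for a non-polymorphic type it is resolved at compile time:

#include <iostream>
#include <typeinfo>

struct Poly     { virtual ~Poly() = default; };
struct PolyDer : Poly {};
struct Plain    { int x; };

int main() {
    PolyDer d;
    Poly& p = d;
    // Polymorphic: consults the v-table and reports the dynamic type.
    std::cout << (typeid(p) == typeid(PolyDer)) << '\n';  // prints 1

    Plain pl;
    // Non-polymorphic: the static type, known at compile time.
    std::cout << typeid(pl).name() << '\n';  // implementation-defined name
}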
Conclusion
In short, most operations are invariant with the size of the hierarchy. But even in the cases where it has an impact, it's not a problem.
I'd be more concerned with some design ethic where you felt the need to build such a hierarchy. In my experience, hierarchies like this come from two lines of design.
The Java/C# ideal of having everything derived from a common base class. This is a horrible idea in C++ and should never be used. Each object should derive from what it needs to, and only that. C++ was built on the "pay for what you use" principle, and deriving from a common base works against that. In general, anything you could do with such a common base class is either something you shouldn't be doing period, or something that could be done with function overloading (using operator<< to convert to strings, for example).
Misuse of inheritance. Using inheritance when you should be using containment. Inheritance creates an "is a" relationship between objects. More often than not, "has a" relationships (one object having another as a member) are far more useful and flexible. They make it easier to hide data, and you don't allow the user to pretend one class is another.
Make sure that your design does not fall afoul of one of these principles.
There will be, but not as bad as the programmer-performance implications.
As #Nicol points out, it may be doing a number of things.
If those are things that you require to be done, regardless of design, because they are all precisely necessary steps in getting the program from the call to main to exit within the fewest possible cycles, then your design is simply a matter of coding clarity (or maybe the lack of it :).
In my experience performance tuning, as in this example, what I often see as a huge source of wasted time is over-design of data (i.e. class) structures.
Weirdly enough, the justification for the data structures is often (guess what?) performance!
In my experience, the thing to do with data structure is keep it as simple as possible and as normalized as possible. If it is completely normalized, then any single change to it can't make it inconsistent. You can't always achieve complete normality, in which case you have to deal with the possibility that the data can be temporarily inconsistent.
This is why people write notification handlers, and this is encouraged in OOP.
The idea is, if you change something in one place, that can trigger notifications that "automatically" propagate the change to other places, trying to maintain consistency.
The problem with notifications is they can run away. Simply changing some boolean property from true to false can cause a fire-storm of notifications ripping through the data structure in ways no one programmer understands, updating databases, painting windows, zipping files, etc. etc. I often find this is where most clock cycles go.
I think it is simpler and far more efficient to temporarily tolerate inconsistency, and periodically repair it with some kind of sweeping process.
Another way data structures bring huge inefficiency is when the data is effectively being interpreted by some process to produce some output.
This is very common in graphics.
If the data changes at a very slow rate, it may make sense to "compile" it rather than "interpret" it.
In other words, translate it into a simpler instruction set, or source code which is compiled "on the fly", which can then execute far more quickly to produce the desired output.
How are objects stored in memory in C++?
For a regular class such as
class Object
{
public:
int i1;
int i2;
char i3;
int i4;
private:
};
Can a pointer to the Object be used as an array to access i1, as follows?
((Object*)&myObject)[0] === i1?
Other questions on SO seem to suggest that casting a pointer to a struct gives a pointer to its first member, for POD types. How is this different for classes with constructors, if at all?
Also in what way is it different for non-POD types?
Edit:
In memory therefore would the above class be laid out like the following?
[i1 - 4bytes][i2 - 4bytes][i3 - 1byte][padding - 3bytes][i4 - 4bytes]
Almost: you cast to an Object* where you need a pointer to the first member's type. Let's re-ask it as the following:
((int*)&myObject)[0] == i1
You have to be really careful with assumptions like this. As you've defined the structure, this should be true in any compiler you're likely to come across. But all sorts of other properties of the object (which you may have omitted from your example) will, as others said, make it non-POD and could (possibly in a compiler-dependent way) make the above statement not true.
Note that I wouldn't be so quick to tell you it would work if you had asked about i3 -- in that case, even for plain POD, alignment or endianness could easily screw you up.
In any case, you should be avoiding this kind of thing, if possible. Even if it works fine now, if you (or anybody else who doesn't understand that you're doing this trick) ever changes the structure order or adds new fields, this trick will fail in all the places you've used it, which may be hard to find.
Answer to your edit: If that's your entire class definition, and you're using one of the mainstream compilers with default options, and running on an x86 processor, then yes, you've probably guessed the right memory layout. But choice of compiler, compiler options, and different CPU architecture could easily invalidate your assumptions.
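One way to check the guessed layout on your particular compiler instead of assuming it; offsetof is only guaranteed for standard-layout types, which this Object is:

#include <cstddef>
#include <cstdio>

class Object {
public:
    int  i1;
    int  i2;
    char i3;
    int  i4;
};

int main() {
    // On a typical x86 compiler this prints offsets 0, 4, 8, 12 and size 16,
    // i.e. 3 bytes of padding after i3 -- but verify rather than assume.
    std::printf("i1:%zu i2:%zu i3:%zu i4:%zu size:%zu\n",
                offsetof(Object, i1), offsetof(Object, i2),
                offsetof(Object, i3), offsetof(Object, i4),
                sizeof(Object));
}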
Classes without virtual members and without inheritance are laid out in memory just like structs. But when you start adding levels of inheritance, things can get tricky, and it can be hard to figure out what order things are in memory (particularly with multiple inheritance).
When a class has virtual members, each object carries a pointer to a "vtable" in memory, which contains pointers to the actual functions to call, generated from the inheritance hierarchy of the class.
The bottom line is: don't access classes this way at all if you can avoid it (and also don't memset them or memcpy them). If you must do this (why?) then take care that you know exactly how your class objects are going to be in memory and be careful to avoid inheritance.
The difference is that this trick is only valid for POD types. That's really all there is to it. The standard specifies that this cast is valid for a POD type, but makes no guarantees about what happens with non-POD types.
It really depends on the compiler, or rather, it is left up to the compiler to determine the memory layout.
For instance, a mix of public, private, and protected member variables could be laid out such that each access type is contiguous. Or, derived classes could have member variables interleaved with unused space in the super class.
Things get worse with virtual inheritance, where the virtually inherited base classes can be laid out anywhere in the memory allocated for that particular instance.
POD is different because it needs to be compatible with C.
Usually what matters isn't whether the class has a constructor: what matters is whether the class has any virtual methods. For details, google for 'vtable' and 'vptr'.
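A quick way to see the vptr in action is to compare sizes; the exact numbers are implementation-dependent, but on a typical 64-bit compiler the virtual version gains a pointer:

#include <iostream>

struct NoVirtual   { int x; };
struct WithVirtual { int x; virtual ~WithVirtual() = default; };

int main() {
    // Typically prints "4 16": int alone vs. int + vptr + alignment padding.
    std::cout << sizeof(NoVirtual) << ' ' << sizeof(WithVirtual) << '\n';
}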