What are the often misunderstood concepts in C++? [closed] - c++

What are the often misunderstood concepts in c++?

C++ is not C with classes!
And there is no language called C/C++. Everything goes downhill from there.

That C++ does have automatic resource management.
(Most people who claim that C++ does not have memory management use new and delete far too much, not realising that if they let C++ manage the resource itself, the task gets much easier.)
Example: (Made with a made up API because I do not have time to check the docs now)
// C++
void DoSomething()
{
File file("/tmp/dosomething", "rb");
// ... do stuff with file ...
// file is automatically closed and its resources released when it goes out of scope.
}
// C#
public void DoSomething()
{
File file = new File("/tmp/dosomething", "rb");
// ... do stuff with file ...
// file is NOT automatically closed.
// What if the caller calls DoSomething() in a tight loop?
// C# requires you to be aware of the implementation of the File class
// and forces you to accommodate, thus voiding implementation-hiding
// principles.
// Approaches may include:
// 1) Utilizing the IDisposable pattern.
// 2) Utilizing try-finally guards, which quickly gets messy.
// 3) The nagging doubt that you've forgotten something /somewhere/ in your
// 1 million loc project.
// 4) The realization that point #3 can not be fixed by fixing the File
// class.
}

Free functions are not bad just because they are not within a class
C++ is not an OOP language alone, but builds upon a whole stack of techniques.
I've heard many times people say that free functions (those in namespaces and the global namespace) are a "relic of C times" and should be avoided. Quite the opposite is true. Free functions decouple functionality from specific classes and allow it to be reused. It is also recommended to use free functions instead of member functions when the function doesn't need access to implementation details because, among other advantages, this eliminates cascading changes when the implementation of a class changes.
This is also reflected in the language: the range-based for loop in C++11 (still called C++0x when this was written) can obtain its begin and end iterators by calling the free functions begin and end, found through argument-dependent lookup, when the range type has no begin and end members.

The difference between assignment and initialisation:
string s = "foo"; // initialisation
s = "bar"; // assignment
For class types, initialisation always uses a constructor and assignment always uses operator=, even though both can be written with an = sign.
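To make the distinction visible, here is a minimal sketch. Tracker is a hypothetical type (not from the answer above) that records which special member function ran last; the first check only asserts that operator= was not involved, because before C++17 the compiler is allowed, but not required, to skip the temporary.

```cpp
#include <cassert>

// Hypothetical type that records which special member function ran last.
struct Tracker {
    enum Op { None, FromInt, Copy, Assign };
    Op last = None;

    Tracker(int) : last(FromInt) {}             // converting constructor
    Tracker(const Tracker&) : last(Copy) {}     // copy constructor
    Tracker& operator=(const Tracker&) {        // copy assignment
        last = Assign;
        return *this;
    }
};

void demo_init_vs_assign() {
    Tracker a = 1;      // initialisation: a constructor runs, despite the '='
    assert(a.last != Tracker::Assign);

    Tracker b = a;      // initialisation: copy constructor
    assert(b.last == Tracker::Copy);

    b = a;              // assignment: operator=
    assert(b.last == Tracker::Assign);
}
```

The practical consequence is that a class with an expensive or deleted operator= can still be initialised with the = syntax.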

In decreasing order:
make sure to release pointers for allocated memory
when destructors should be virtual
how virtual functions work
Interestingly not many people know the full details of virtual functions, but still seem to be ok with getting work done.

The most pernicious concept I've seen is that it should be treated as C with some addons. In fact, with modern C++ systems, it should be treated as a different language, and most of the C++-bashing I see is based on the "C with add-ons" model.
To mention some issues:
While you probably need to know the difference between delete and delete[], you should normally be writing neither. Use smart pointers and std::vector<>.
In fact, you should be using a * only rarely. Use std::string for strings. (Yes, it's badly designed. Use it anyway.)
RAII means you don't generally have to write clean-up code. Clean-up code is bad style, and destroys conceptual locality. As a bonus, using RAII (including smart pointers) gives you a lot of basic exception safety for free. Overall, it's much better than garbage collection in some ways.
In general, class data members shouldn't be directly visible, either by being public or by having getters and setters. There are exceptions (such as x and y in a point class), but they are exceptions, and should be considered as such.
And the big one: there is no such language as C/C++. It is possible to write programs that can compile properly under either language, but such programs are not good C++ and are not normally good C. The languages have been diverging since Stroustrup started working on "C with Classes", and are less similar now than ever. Using "C/C++" as a language name is prima facie evidence that the user doesn't know what he or she is talking about. C++, properly used, is no more like C than Java or C# are.

The overuse of inheritance unrelated to polymorphism. Most of the time, unless you really do use runtime polymorphism, composition or static polymorphism (i.e., templates) is better.

The static keyword which can mean one of three distinct things depending on where it is used.
It can be a static member function or member variable.
It can be a static variable or function declared at namespace scope.
It can be a static variable declared inside a function.
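A small sketch of the three meanings side by side may help. All names here (file_local_counter, Widget, next_id) are made up for illustration.

```cpp
#include <cassert>

// 1) Namespace scope: internal linkage, visible only in this translation unit.
static int file_local_counter = 0;

struct Widget {
    // 2) Class scope: one variable shared by all Widget objects.
    static int instances;
    Widget() { ++instances; }
    // A static member function has no 'this' pointer.
    static int count() { return instances; }
};
int Widget::instances = 0;

// 3) Function scope: initialised once, keeps its value between calls.
int next_id() {
    static int id = 0;
    return ++id;
}

void demo_static() {
    assert(next_id() == 1);
    assert(next_id() == 2);         // the local static remembered its value

    Widget a, b;
    assert(Widget::count() == 2);   // shared across both objects

    assert(++file_local_counter == 1);
}
```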

Arrays are not pointers
They are different. So &array is not a pointer to a pointer, but a pointer to an array. This is the most misunderstood concept in both C and C++, in my opinion. Just look at all the SO answers that recommend passing 2-D arrays as type**!
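The types involved can be checked at compile time. In this sketch, grid and sum are illustrative names; the static_asserts spell out what &grid and the decayed array really are, and sum shows one way to pass a 2-D array without pretending it is an int**.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

int grid[3][4] = {};

// &grid is a pointer to the whole array, not a pointer to a pointer.
static_assert(std::is_same<decltype(&grid), int (*)[3][4]>::value,
              "&grid has type int (*)[3][4]");

// A 2-D array decays to a pointer to its first row, int (*)[4], never to int**.
static_assert(std::is_same<std::decay<decltype(grid)>::type, int (*)[4]>::value,
              "int[3][4] decays to int (*)[4]");

// sizeof sees the whole array, not a pointer.
static_assert(sizeof(grid) == 3 * 4 * sizeof(int),
              "the array size covers all elements");

// Pass a 2-D array as a pointer to rows of known width.
int sum(int (*rows)[4], std::size_t row_count) {
    int total = 0;
    for (std::size_t r = 0; r < row_count; ++r)
        for (std::size_t c = 0; c < 4; ++c)
            total += rows[r][c];
    return total;
}
```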

Here is an important concept in C++ that is often forgotten:
C++ should not be simply used like an object oriented language such as Java or C#.
Inspire yourself from the STL and write generic code.

Here are some:
Using templates to implement polymorphism without vtables, à la ATL.
Logical const-ness vs actual const-ness in memory. When to use the mutable keyword.
ACKNOWLEDGEMENT: Thanks for correcting my mistake, spoulson.
EDIT:
Here are more:
Virtual inheritance (not virtual methods): In fact, I don't understand it at all! (by that, I mean I don't know how it's implemented)
Unions whose members are objects whose respective classes have non-trivial constructors.
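The first item in the list above, polymorphism through templates without vtables, is usually done with the curiously recurring template pattern. The following is a minimal sketch of that idea, not ATL's actual code; Shape, Square, Rect and twice_area are made-up names.

```cpp
#include <cassert>
#include <type_traits>

// Base class template: calls into Derived without any virtual function.
template <class Derived>
struct Shape {
    double area() const {
        // Resolved at compile time; no vtable lookup is involved.
        return static_cast<const Derived*>(this)->area_impl();
    }
};

struct Square : Shape<Square> {
    double side;
    explicit Square(double s) : side(s) {}
    double area_impl() const { return side * side; }
};

struct Rect : Shape<Rect> {
    double w, h;
    Rect(double w_, double h_) : w(w_), h(h_) {}
    double area_impl() const { return w * h; }
};

// Neither class has a vtable.
static_assert(!std::is_polymorphic<Square>::value, "no virtual functions");
static_assert(!std::is_polymorphic<Rect>::value, "no virtual functions");

// Generic code accepts any Shape<T>; one function is instantiated per T.
template <class T>
double twice_area(const Shape<T>& s) { return 2.0 * s.area(); }
```

The trade-off is that Shape<Square> and Shape<Rect> are unrelated types, so they cannot be stored together in one container the way pointers to a common virtual base could.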

Given this:
int x = sizeof(char);
What value is x?
The answer you often hear depends on the level of understanding of the specification.
Beginner - x is one because chars are always eight bit values.
Intermediate - it depends on the compiler implementation, chars could be UTF16 format.
Expert - x is one and always will be one since a char is the smallest addressable unit of memory and sizeof determines the number of units of memory required to store an instance of the type. So in a system where a char is eight bits, a 32 bit value will have a sizeof of 4; but in a system where a char is 16 bits, a 32 bit value will have a sizeof of 2.
It's unfortunate that the standard uses 'byte' to refer to a unit of memory since many programmers think of 'byte' as being eight bits.
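The "expert" answer can be written down as compile-time checks. In this sketch bits_in is a made-up helper; CHAR_BIT comes from <climits> and is the number of bits in the platform's byte.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>

// Guaranteed on every conforming implementation.
static_assert(sizeof(char) == 1, "sizeof(char) is 1 by definition");

// A byte is CHAR_BIT bits wide, which is at least 8 but may be more.
static_assert(CHAR_BIT >= 8, "a byte has at least 8 bits");

// Number of bits occupied by an object of type T on this platform.
template <class T>
constexpr std::size_t bits_in() { return sizeof(T) * CHAR_BIT; }
```

So sizeof always counts in units of char, and only CHAR_BIT tells you how wide that unit is.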

C++ is a multi-paradigm language. Many people associate C++ strictly with OOP.

A classic among beginners coming to C++ from C:
confusing delete and delete[]
EDIT:
Another classic failure, among all levels of experience, when using a C API:
std::string helloString = "hello world";
printf("%s\n", helloString);
instead of:
printf("%s\n", helloString.c_str());
It happens to me every week. You could use streams, but sometimes you have to deal with printf-like APIs.
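A self-contained version of the correct call might look like this. format_greeting is a made-up helper, and std::snprintf is used instead of printf only so that the result can be inspected.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Formats a greeting through a printf-style C API.
std::string format_greeting(const std::string& name) {
    char buffer[64];
    // %s expects a const char*, so pass name.c_str(), never the std::string itself.
    std::snprintf(buffer, sizeof buffer, "hello %s", name.c_str());
    return std::string(buffer);
}
```

Passing the std::string object itself through the variadic argument list is undefined behaviour; many compilers can warn about it when format checking is enabled.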

Pointers.
Dereferencing pointers, through either * or ->.
Taking an address with & when a pointer is required.
Functions that take parameters by reference by specifying a & in the signature.
Pointer to pointers to pointers *** or pointers by reference void someFunc(int *& arg)
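Here is a short sketch of the difference between passing a pointer by value, passing a pointer by reference (int*&), and passing an int by reference. The function names and the fallback variable are invented for the example.

```cpp
#include <cassert>

static int fallback = -1;

// Takes the pointer by value: only the local copy is redirected.
void reset_copy(int* p) { p = &fallback; (void)p; }

// Takes the pointer by reference: the caller's pointer is redirected.
void reset_ref(int*& p) { p = &fallback; }

// Takes an int by reference: modifies the caller's int, no '&' at the call site.
void increment(int& value) { ++value; }

void demo_pointers() {
    int x = 10;
    int* p = &x;            // '&' in an expression takes the address of x
    assert(*p == 10);       // '*' dereferences the pointer

    reset_copy(p);
    assert(p == &x);        // the caller's pointer is unchanged

    reset_ref(p);
    assert(p == &fallback); // the caller's pointer now points elsewhere

    increment(x);
    assert(x == 11);
}
```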

There are a few things that people seem to be constantly confused by or have no idea about:
Pointers, especially function pointers and multiple pointers (e.g. int(*)(void*), void***)
The const keyword and const correctness (e.g. what is the difference between const char*, char* const and const char* const, and what does void class::member() const; mean?)
Memory allocation (e.g. every pointer new'ed should be deleted, malloc/free should not be mixed with new/delete, when to use delete [] instead of delete, why the C functions are still useful (e.g. expand(), realloc()))
Scope (i.e. that you can use { } on its own to create a new scope for variable names, rather than just as part of if, for etc...)
Switch statements. (e.g. not understanding that they can optimise as well (or better in some cases) than chains of ifs, not understanding fall through and its practical applications (loop unrolling as an example) or that there is a default case)
Calling conventions (e.g. what is the difference between cdecl and stdcall, how would you implement a pascal function, why does it even matter?)
Inheritance and multiple inheritance and, more generally, the entire OO paradigm.
Inline assembler, as it is usually implemented, is not part of C++.
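For the const question in the list above, a compact sketch may help; the commented-out lines are the ones a compiler would reject. Counter and the variable names are illustrative only.

```cpp
#include <cassert>
#include <type_traits>

struct Counter {
    int value = 0;
    // const member function: inside it 'this' is a const Counter*,
    // so it may be called on const objects and cannot modify value.
    int get() const { return value; }
    void bump() { ++value; }
};

void demo_const() {
    char text[] = "abc";
    char other[] = "xyz";

    const char* p1 = text;          // pointer to const char
    p1 = other;                     // OK: the pointer itself can be reseated
    // *p1 = 'q';                   // error: the pointed-to char is read-only

    char* const p2 = text;          // const pointer to (mutable) char
    *p2 = 'A';                      // OK: the char can be modified
    // p2 = other;                  // error: the pointer cannot be reseated

    const char* const p3 = text;    // neither the pointer nor the char can change

    static_assert(!std::is_const<decltype(p1)>::value, "p1 itself is not const");
    static_assert(std::is_const<decltype(p2)>::value, "p2 itself is const");

    assert(p1[0] == 'x');
    assert(text[0] == 'A');
    assert(p3[0] == 'A');

    const Counter c{};
    assert(c.get() == 0);           // OK: get() is a const member function
    // c.bump();                    // error: bump() is not const
}
```

Reading the declaration from right to left ("p2 is a const pointer to char") is a common way to keep the three forms apart.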

Pointers to members and pointers to member functions.
Non-type template parameters.
Multiple inheritance, particularly virtual base classes and shared base objects.
Order of construction and destruction, the state of virtual functions in the middle of constructing an intermediate base class.
Cast safety and variable sizes. No, you can't assume that sizeof(void *) == sizeof(int) (or any other type for that matter, unless a portable header specifically guarantees it) in portable code.
Pointer arithmetic.
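The point about virtual functions during construction is easy to demonstrate. In this sketch (Base and Derived are made-up classes), the call made inside the Base constructor reaches Base::name, because the Derived part of the object does not exist yet.

```cpp
#include <cassert>
#include <string>

struct Base {
    std::string seen_in_ctor;
    // While Base is being constructed the object is still "only a Base",
    // so this call dispatches to Base::name, not Derived::name.
    Base() { seen_in_ctor = name(); }
    virtual ~Base() {}
    virtual std::string name() const { return "Base"; }
};

struct Derived : Base {
    std::string name() const override { return "Derived"; }
};
```

Once construction has finished, the same call through a Base reference dispatches to Derived::name as usual.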

Headers and implementation files
This is also a concept misunderstood by many. Typical questions are what goes into header files, and why it causes link errors when a function definition appears multiple times in a program, whereas a class definition may appear multiple times without any problem.
Very similar to those questions is why it is important to have header guards.

If a function accepts a pointer to a pointer, void* will still do it
I've seen that the concept of a void pointer is frequently confused. It's believed that if you have a pointer, you use a void*, and if you have a pointer to a pointer, you use a void**. But you can and should in both cases use void*. A void** does not have the special properties that a void* has.
The special property is that a void* can also be assigned a pointer to a pointer, and when it is cast back to the original type, the original value is recovered.
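A minimal sketch of that round trip, with Slot as a made-up stand-in for any API that stores a generic pointer:

```cpp
#include <cassert>

// Stand-in for an API that stores any object pointer as void*.
struct Slot {
    void* data;
};

void demo_void_pointer() {
    int value = 42;
    int* p = &value;
    int** pp = &p;

    Slot s;
    s.data = pp;                                // int** converts implicitly to void*
    int** back = static_cast<int**>(s.data);    // cast back to the original type
    assert(back == pp);
    assert(**back == 42);

    // void** vpp = pp;   // does not compile: int** is not convertible to void**
}
```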

I think the most misunderstood concept about C++ is why it exists and what its purpose is. It's often under fire from above (Java, C# etc.) and from below (C). C++ has the ability to operate close to the machine to deal with computational complexity, and abstraction mechanisms to manage domain complexity.

NULL is always zero.
Many confuse NULL with an address, and therefore think it's not necessarily zero if the platform has a different null pointer address.
But NULL is always zero and it is not an address. It's a zero-valued integer constant expression that can be converted to pointer types.

Memory Alignment.

std::vector does not create elements when reserve is used
I've seen programmers argue that they can access elements at positions beyond what size() returns, as long as they have called reserve() up to those positions. That's a wrong assumption, but it is very common among programmers, especially because it's quite hard for the compiler to diagnose the mistake, so things will silently appear to "work".
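The difference between reserve() and resize() in a few lines (demo_reserve is just an illustrative wrapper):

```cpp
#include <cassert>
#include <vector>

void demo_reserve() {
    std::vector<int> v;
    v.reserve(10);                  // allocates storage, creates no elements
    assert(v.size() == 0);
    assert(v.capacity() >= 10);
    // v[5] = 1;                    // undefined behaviour: no element exists at index 5

    v.resize(10);                   // creates 10 value-initialised elements
    assert(v.size() == 10);
    v[5] = 1;                       // fine now
    assert(v[5] == 1);
}
```

Using v.at(5) instead of v[5] right after reserve() would throw std::out_of_range, which is one way to catch this mistake during testing.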

Hehe, this is a silly reply: the most misunderstood thing in C++ programming is the error messages from g++ when template classes fail to compile!

C++ is not C with string and vector!

C structs VS C++ structs is often misunderstood.

C++ is not a typical object oriented language.
Don't believe me? look at the STL, way more templates than objects.
It's almost impossible to use Java/C# ways of writing object oriented code; it simply doesn't work.
In Java/C# programming, there's a lot of newing, lots of utility objects that implement some single cohesive functionality.
In C++, any object newed must be deleted, but there's always the problem of who owns the object
As a result, objects tend to be created on the stack
But when you do that, you have to copy them around all the time if you're going to pass them around to other functions/objects, thus wasting a lot of performance that is said to be achieved with the unmanaged environment of C++
Upon realizing that, you have to think about other ways of organizing your code
You might end up doing things the procedural way, or using metaprogramming idioms like smart pointers
At this point, you've realized that OO in C++ cannot be used the same way as it is used in Java/C#
Q.E.D.
If you insist on doing oop with pointers, you'll usually have large (gigantic!) classes, with clearly defined ownership relationships between objects to avoid memory leaks. And then even if you do that, you're already too far from the Java/C# idiom of oop.
Actually I made up the term "object-oriented", and I can tell you I did not have C++ in mind.
-- Alan Kay (click the link, it's a video, the quote is at 10:33)
Although from a purist point of view (e.g. Alan Kay), even Java and C# fall short of true oop

A pointer is an iterator, but an iterator is not always a pointer
This is also an often misunderstood concept. A pointer to an object is a random access iterator: it can be incremented or decremented by an arbitrary number of elements and can be read and written. However, an iterator class whose overloaded operators do the same fulfills those requirements too. So it is also an iterator, but is of course not a pointer.
I remember one of my past C++ teachers was teaching (wrongly) that you get a pointer to an element of a vector if you do vec.begin(). He was actually assuming - without knowing - that the vector implements its iterators using pointers.
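Both halves of that statement can be shown in a few lines. This is a sketch; demo_iterators is an invented name, and whether vector's iterator type is a raw pointer is deliberately not assumed.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <type_traits>
#include <vector>

// A raw pointer satisfies the random access iterator requirements.
static_assert(std::is_same<std::iterator_traits<int*>::iterator_category,
                           std::random_access_iterator_tag>::value,
              "int* is a random access iterator");

void demo_iterators() {
    int raw[] = {3, 1, 2};
    std::sort(raw, raw + 3);                // pointers used directly as iterators
    assert(raw[0] == 1 && raw[2] == 3);

    std::vector<int> vec = {3, 1, 2};
    std::sort(vec.begin(), vec.end());      // may be a class type or a pointer

    // To get a real pointer to the first element, ask for one explicitly.
    int* first = vec.data();
    assert(*first == 1);
    assert(&*vec.begin() == first);
}
```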

That anonymous namespaces are almost always what is truly wanted when people are making static variables in C++
When making library header files, the pimpl idiom (http://www.gotw.ca/gotw/024.htm) should be used for almost all private functions and members to aid in dependency management

I still don't get why vector doesn't have a pop_front and the fact that I can't sort(list.begin(), list.end())..

Related

C++ std features and Binary size

I was recently told in a job interview that the project focuses on building the smallest possible binary for the application (it runs embedded), so I would not be able to use things such as templates or smart pointers because these would increase the binary size. They generally seemed to imply that using things from std would be a no-go (though not in all cases).
After the interview, I tried to research online which coding practices and standard library features cause large binary sizes, and I could find basically nothing. Is there a way to quantify the size impact of using certain features (without needing to code 100 smart pointers in a code base versus self-managed pointers, for example)?
This question probably deserves more attention than it’s likely to get, especially for people trying to pursue a career in embedded systems. So far the discussion has gone about the way that I would expect, specifically a lot of conversation about the nuances of exactly how and when a project built with C++ might be more bloated than one written in plain C or a restricted C++ subset.
This is also why you can’t find a definitive answer from a good old fashioned google search. Because if you just ask the question “is C++ more bloated than X?”, the answer is always going to be “it depends.”
So let me approach this from a slightly different angle. I’ve both worked for, and interviewed at companies that enforced these kinds of restrictions, I’ve even voluntarily enforced them myself. It really comes down to this. When you’re running an engineering organization with more than one person with plans to keep hiring, it is wildly impractical to assume everyone on your team is going to fully understand the implications of using every feature of a language. Coding standards and language restrictions serve as a cheap way to prevent people from doing “bad things” without knowing they’re doing “bad things”.
How you define a “bad thing” is then also context specific. On a desktop platform, using lots of code space isn’t really a “bad” enough thing to rigorously enforce. On a tiny embedded system, it probably is.
C++ by design makes it very easy for an engineer to generate lots of code without having to type it out explicitly. I think that statement is pretty self-evident, it’s the whole point of meta-programming, and I doubt anyone would challenge it, in fact it’s one of the strengths of the language.
So then coming back to the organizational challenges, if your primary optimization variable is code space, you probably don’t want to allow people to use features that make it trivial to generate code that isn’t obvious. Some people will use that feature responsibly and some people won’t, but you have to standardize around the least common denominator. A C compiler is very simple. Yes you can write bloated code with it, but if you do, it will probably be pretty obvious from looking at it.
(Partially extracted from comments I wrote earlier)
I don't think there is a comprehensive answer. A lot also depends on the specific use case and needs to be judged on a case-by-case basis.
Templates
Templates may result in code bloat, yes, but they can also avoid it. If your alternative is introducing indirection through function pointers or virtual methods, then the function that would otherwise be a template may itself become bigger in code size, simply because indirect function calls take several instructions and remove optimization potential.
Another aspect where they can at least not hurt is when used in conjunction with type erasure. The idea here is to write generic code, then put a small template wrapper around it that only provides type safety but does not actually emit any new code. Qt's QList is an example that does this to some extent.
This bare-bones vector type shows what I mean:
#include <cstddef>

class VectorBase
{
protected:
    // All three point into an array of void* elements.
    void** start, **end, **capacity;
    void push_back(void*);
    void* at(std::size_t i);
    void clear(void (*cleanup_function)(void*));
};

template<class T>
class Vector: public VectorBase
{
public:
    void push_back(T* value)
    { this->VectorBase::push_back(value); }
    T* at(std::size_t i)
    { return static_cast<T*>(this->VectorBase::at(i)); }
    ~Vector()
    { clear(+[](void* object) { delete static_cast<T*>(object); }); }
};
By carefully moving as much code as possible into the non-templated base, the template itself can focus on type-safety and to provide necessary indirections without emitting any code that wouldn't have been here anyway.
(Note: This is just meant as a demonstration of type erasure, not an actually good vector type)
Smart pointers
When written carefully, they won't generate much code that wouldn't be there anyway. Whether an inline function generates a delete statement or the programmer does it manually doesn't really matter.
The main issue that I see with those is that the programmer is better at reasoning about code and avoiding dead code. For example even after a unique_ptr has been moved away, the destructor of the pointer still has to emit code. A programmer knows that the value is NULL, the compiler often doesn't.
Another issue comes up with calling conventions. Objects with destructors are usually passed on the stack, even if you declare them pass-by-value. Same for return values. So a function unique_ptr<foo> bar(unique_ptr<foo> baz) will have higher overhead than foo* bar(foo* baz) simply because pointers have to be put on and off the stack.
Even more egregiously, the calling convention used for example on Linux makes the caller clean up parameters instead of the callee. That means if a function accepts a complex object like a smart pointer by value, a call to the destructor for that parameter is replicated at every call site, instead of putting it once inside the function. Especially with unique_ptr this is so stupid because the function itself may know that the object has been moved away and the destructor is superfluous; but the caller doesn't know this (unless you have LTO).
Shared pointers are a different beast altogether, simply because they allow a lot of different tradeoffs. Should they be atomic? Should they allow type casting, weak pointers, what indirection is used for destruction? Do you really need two raw pointers per shared pointer or can the reference counter be accessed through shared object?
Exceptions, RTTI
Generally avoided and removed via compiler flags.
Library components
On a bare-metal system, pulling in parts of the standard library can have a significant effect that can only be measured after the linker step. I suggest any such project use continuous integration and track the code size as a metric.
For example I once added a small feature, I don't remember which, and in its error handling it used std::stringstream. That pulled in the entire iostream library. The resulting code exceeded my entire RAM and ROM capacity. IIRC the issue was that even though exception handling was deactivated, the exception message was still being set up.
Move constructors and destructors
It's a shame that C++'s move semantics aren't the same as for example Rust's where objects can be moved with a simple memcpy and then "forgetting" their original location. In C++ the destructor for a moved object is still invoked, which requires more code in the move constructor / move assignment operator, and in the destructor.
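The extra code this requires can be seen in a small sketch. Handle is a made-up owning type; the counters exist only to make the behaviour observable.

```cpp
#include <cassert>
#include <utility>

static int destructor_calls = 0;
static int releases = 0;

// Hypothetical owning handle around a heap-allocated int.
struct Handle {
    int* resource;
    explicit Handle(int* r) : resource(r) {}
    Handle(Handle&& other) noexcept : resource(other.resource) {
        other.resource = nullptr;   // extra code: leave the source destructible
    }
    Handle(const Handle&) = delete;
    Handle& operator=(const Handle&) = delete;
    ~Handle() {
        ++destructor_calls;
        if (resource) {             // extra branch: the object may have been moved from
            ++releases;
            delete resource;
        }
    }
};

void demo_move() {
    destructor_calls = 0;
    releases = 0;
    {
        Handle a(new int(7));
        Handle b(std::move(a));
        assert(a.resource == nullptr);
        assert(*b.resource == 7);
    }
    // Both destructors ran, but only one of them had anything to release.
    assert(destructor_calls == 2);
    assert(releases == 1);
}
```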
Qt for example accounts for such simple cases in its meta type system.

Is it bad design for a class to give access to its data (via ptr/it) when this data can be deleted before the class object is out of scope?

The classic example is iterator invalidation:
std::string test("A");
auto it = test.insert(test.begin()+1,'B');
test.erase();
...
std::cout << *it;
Do you think having this kind of API is bad design, and will be difficult for beginners to learn and use?
A costly solution, performance- and memory-wise, would be, in this type of case, to point the pointer/iterator at an empty string (or a nullptr, but that's not very helpful) when a clear method is used.
Some clarifications
I'm thinking of this design for returning const char* values that can be modified internally (maybe they're stored in a std::vector that can be cleared). I don't want to return a std::string (binary compatibility) and I don't want a get(char*,std::size_t) method because of the size argument that needs to be fetched (too slow). Also I don't want to create a wrapper around std::string or my own string class.
I would recommend reading up on Stepanov's design philosophy (pages 9-11):
[This example] is written in a clear object-oriented style with getters and setters. The proponents of this style say that the advantage of having such functions is that it allows programmers later on to change the implementation. What they forget to mention is that sometimes it is awfully good to expose the implementation. Let us see what I mean. It is hard for me to imagine an evolution of a system that would let you keep the interface of get and set, but be able to change the implementation. I could imagine that the implementation outgrows int and you need to switch to long. But that is a different interface. I can imagine that you decide to switch from an array to a list but that also will force you to change the interface, since it is really not a very good idea to index into a linked list.
Now let us see why it is really good to expose the implementation. Let us assume that tomorrow you decide to sort your integers. How can you do it? Could you use the C library qsort? No, since it knows nothing about your getters and setters. Could you use the STL sort? The answer is the same. While you design your class to survive some hypothetical change in the implementation, you did not design it for the very common task of sorting. Of course, the proponents of getters and setters will suggest that you extend your interface with a member function sort. After you do that, you will discover that you need binary search and median, etc. Very soon your class will have 30 member functions but, of course, it will be hiding the implementation. And that could be done only if you are the owner of the class. Otherwise, you need to implement a decent sorting algorithm on top of the setter-getter interface from scratch and that is a far more difficult and dangerous activity than one can imagine. ...
Setters and getters make our daily programming hard but promise huge rewards in the future when we discover better ways to store arrays of integers in memory. But I do not know a single realistic scenario when hiding memory locations inside our data structure helps and exposure hurts; it is, therefore, my obligation to expose a much more convenient interface that also happens to be consistent with the familiar interface to the C arrays. When we program in C++ we should not be ashamed of its C heritage, but make full use of it. The only problems with C++, and even the only problems with C, arise when they themselves are not consistent with their own logic. ...
My remark about exposing the address locations of consecutive integers is not facetious. It took a major effort to convince the standard committee that such a requirement is an essential property of vectors; they would not, however, agree that vector iterators should be pointers and, therefore, on several major platforms – including the Microsoft one – it is faster to sort your vector by saying the unbelievably ugly
if (!v.empty()) {
sort(&*v.begin(), &*v.begin() + v.size());
}
than the intended
sort(v.begin(), v.end());
Attempts to impose pseudo-abstractness at the cost of efficiency can be defeated, but at a terrible cost.
Stepanov has a lot of other interesting documents available, especially in the "Class Notes" section.
Yes, there are several rules of thumb regarding OOP. No, I'm not convinced that they are really the best way to do things. When you're working with the STL it makes a lot of sense to do things the STL compatible way. And when your abstraction is low level (like std::vector, which is meant specifically to make working with dynamically allocated arrays easier; i.e., it should be usable almost like an array with some added features), then some of those OOP rules of thumb make no sense at all.
To answer the original question: even beginners will eventually need to learn about iterators, object lifetimes, and what I'll call an object's useful life (i.e., "the object hasn't fallen out of scope, but is no longer valid to use, like an invalidated iterator"). I don't see any reason to try to hide those facts of life from the user, so I personally wouldn't rule out an iterator-based API on those grounds. The real question is what your API is meant to abstract and what it's meant to expose (similar to the fact that a vector is a nicer array and is meant to expose its array nature). If you answer that, you should have a better idea about whether an iterator-based API makes sense.
As Scott Meyers states in Effective C++: yes it is indeed not a good design to grant access to private/protected members via pointers, iterators or references because you never know what the client code will do with it.
As far as I can remember this should be avoided, and it is sometimes better to create a copy of data members which are then returned to the caller.
It is a bad or faulty implementation rather than design.
As for providing access to private or protected members through pointers, it basically destroys one of the basic OOP principles, abstraction.
I am unsure, though, what the question is. Yes, of course it is bad to have an implementation that invalidates iterators. What is the real question here?

Create an object: A.new or new A?

Just out of curiosity: why did C++ choose a = new A instead of a = A.new as the way to instantiate an object? Doesn't the latter seem more object-oriented?
Does it?
That depends on how you define "object-oriented".
If you define it the way Java did, as "everything must have syntax of the form X.Y, where X is an object and Y is whatever you want to do with that object", then yes, you're right. This isn't object-oriented, and Java is the pinnacle of OOP.
But luckily, there are also a few people who feel that "object-oriented" should relate to the behavior of your objects, rather than which syntax is used on them. Essentially it should be boiled down to what the Wikipedia page says:
Object-oriented programming is a programming paradigm that uses "objects" – data structures consisting of datafields and methods together with their interactions – to design applications and computer programs. Programming techniques may include features such as information hiding, data abstraction, encapsulation, modularity, polymorphism, and inheritance
Note that it says nothing about the syntax. It doesn't say "and you must call every function by specifying an object name followed by a dot followed by the function name".
And given that definition, foo(x) is exactly as object-oriented as x.foo().
All that matters is that x is an object, that is, it consists of datafields and a set of methods by which it can be manipulated. In this case, foo is obviously one of those methods, regardless of where it is defined, and regardless of which syntax is used in calling it.
C++ gurus have realized this long ago, and written articles such as this.
An object's interface is not just the set of member methods (which can be called with the dot syntax). It is the set of functions which can manipulate the object. Whether they are members or friends doesn't really matter. It is object-oriented as long as the object is able to stay consistent, that is, it is able to prevent arbitrary functions from messing with it.
So, why would A.new be more object-oriented? How would this form give you "better" objects?
One of the key goals behind OOP was to allow more reusable code.
If new had been a member of each and every class, that would mean every class had to define its own new operation. Whereas when it is a non-member, every class can reuse the same one. Since the functionality is the same (allocate memory, call constructor), why not put it out in the open where all classes can reuse it? (Preemptive nitpick: Of course, the same new implementation could have been reused in this case as well, by inheriting from some common base class, or just by a bit of compiler magic. But ultimately, why bother, when we can just put the mechanism outside the class in the first place)
The . in C++ is only used for member access so the right hand side of the dot is always an object and not a type. If anything it would be more logical to do A::new() than A.new().
In any case, dynamic object allocation is special as the compiler allocates memory and constructs an object in two steps and adds code to deal with exceptions in either step ensuring that memory is never leaked. Making it look like a member function call rather than a special operation could be considered as obscuring the special nature of the operation.
I think the biggest confusion here is that new has two meanings: there's the built-in new-expression (which combines memory allocation and object creation) and then there's the overloadable operator new (which deals only with memory allocation). The first, as far as I can see, is something whose behavior you cannot change, and hence it wouldn't make sense to masquerade it as a member function. (Or it would have to be - or look like - a member function that no class can implement / override!!)
This would also lead to another inconsistency:
int* p = int.new;
C++ is not a pure OOP language in that not everything is an object.
C++ also allows the use of free functions (which is encouraged by some authors and the example set in the SC++L design), which a C++ programmer should be comfortable with. Of course, the new-expression isn't a function, but I don't see how the syntax reminding vaguely of free-function call can put anybody off in a language where free function calls are very common.
please read the code (it works), and then you'll have different ideas:
CObject *p = (CObject*)malloc(sizeof *p);
...
p = new(p) CObject;
p->DoSomthing();
...
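For completeness, here is a sketch of the full lifecycle that the snippet above begins. Because the memory came from malloc, the object has to be destroyed with an explicit destructor call and the memory released with free, not delete. Widget, its alive counter, and placementNewRoundTrip are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstdlib>
#include <new>      // placement new

// Hypothetical class that tracks how many instances are alive.
struct Widget {
    static int alive;
    int value;
    Widget() : value(42) { ++alive; }
    ~Widget() { --alive; }
};
int Widget::alive = 0;

// Allocates raw memory, constructs a Widget in it, then tears both down.
// Returns the value read from the object while it was alive.
int placementNewRoundTrip() {
    void* raw = std::malloc(sizeof(Widget));   // step 1: memory only
    if (raw == nullptr) return -1;
    Widget* w = new (raw) Widget;              // step 2: constructor only
    int seen = w->value;
    assert(Widget::alive == 1);
    w->~Widget();                              // explicit destructor call
    std::free(raw);                            // release the raw memory
    return seen;
}
```

The two halves of an ordinary new-expression (allocation, then construction) are visible here as two separate statements, and so are the two halves of delete.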
A.new is a static function of A while a = new A allocates memory and calls the object's constructor afterwards
Actually, you can instantiate object with something like A.new, if you add the proper method:
class A{
public: static A* instance()
{ return new A(); }
};
A *a = A::instance();
But that's not the case. Syntax is not the case either: you can distinguish :: and . "operations" by examining right-hand side of it.
I think the reason is memory management. In C++, unlike many other object-oriented languages, memory management is done by the user. There's no default garbage collector, although standard and non-standard libraries offer various techniques to manage memory. Therefore the programmer must see the new operator to understand that memory allocation is involved here!
Unless it has been overloaded, the new operator first allocates raw memory, then calls the object constructor that builds the object within the allocated memory. Since a "raw" low-level operation is involved here, it should be a separate language operator and not just one of the class's methods.
I reckon there is no reason. It's a = new a just because it was first drafted that way. In hindsight, it should probably be a = a.new();
Why should each class have a separate new?
I don't think it's needed at all, because the objective of new is to allocate appropriate memory and construct the object by calling the constructor.
Thus the behaviour of new is the same regardless of the class. So why not make it reusable?
You can override new when you want to do memory management by yourself ( i.e. by allocating memory pool once and returning memory on demand).
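A minimal sketch of what overriding new for one class looks like. Tracked and its counters are made-up names, and the "pool" here simply delegates to the global operator new; a real pool would hand out memory from a preallocated block instead.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical class with its own allocation functions.
struct Tracked {
    static int allocations;
    static int constructions;

    Tracked() { ++constructions; }

    // Class-specific operator new: it only supplies memory.
    // The new-expression still calls the constructor afterwards.
    static void* operator new(std::size_t size) {
        ++allocations;
        return ::operator new(size);   // delegate to the global one
    }
    static void operator delete(void* p) {
        --allocations;
        ::operator delete(p);
    }
};
int Tracked::allocations = 0;
int Tracked::constructions = 0;
```

Writing `Tracked* t = new Tracked;` runs Tracked::operator new and then the constructor; `delete t;` runs the destructor and then Tracked::operator delete. Only the memory part is replaceable.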

C++ Memory management

I've learned in College that you always have to free your unused Objects but not how you actually do it. For example structuring your code right and so on.
Are there any general rules on how to handle pointers in C++?
I'm currently not allowed to use boost. I have to stick to pure c++ because the framework I'm using forbids any use of generics.
I have worked with the embedded Symbian OS, which had an excellent system in place for this, based entirely on developer conventions.
Only one object will ever own a pointer. By default this is the creator.
Ownership can be passed on. To indicate passing of ownership, the object is passed as a pointer in the method signature (e.g. void Foo(Bar *zonk);).
The owner will decide when to delete the object.
To pass an object to a method just for use, the object is passed as a reference in the method signature (e.g. void Foo(Bat &zonk);).
Non-owner classes may store references (never pointers) to objects they are given only when they can be certain that the owner will not destroy it during use.
Basically, if a class simply uses something, it uses a reference. If a class owns something, it uses a pointer.
This worked beautifully and was a pleasure to use. Memory issues were very rare.
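The convention above could look roughly like this in template-free C++. Bar, Owner, use, and adopt are hypothetical names; the only point is that a pointer parameter signals a transfer of ownership while a reference parameter signals plain use.

```cpp
#include <cassert>

struct Bar {
    static int alive;
    int hits;
    Bar() : hits(0) { ++alive; }
    ~Bar() { --alive; }
};
int Bar::alive = 0;

// Reference parameter: the callee only uses the object.
void use(Bar& bar) { ++bar.hits; }

// Pointer parameter: the callee takes ownership and must delete.
class Owner {
public:
    Owner() : bar_(0) {}
    ~Owner() { delete bar_; }                 // the owner decides when to delete
    void adopt(Bar* bar) { delete bar_; bar_ = bar; }
    Bar& bar() { return *bar_; }              // hand out references for use
private:
    Owner(const Owner&);                      // non-copyable: exactly one owner
    Owner& operator=(const Owner&);
    Bar* bar_;
};
```

A caller would write `owner.adopt(new Bar);` to give the object away and `use(owner.bar());` to lend it out.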
Rules:
Wherever possible, use a smart pointer. Boost has some good ones.
If you can't use a smart pointer, null out your pointer after deleting it.
Never work anywhere that won't let you use rule 1.
If someone disallows rule 1, remember that if you grab someone else's code, change the variable names and delete the copyright notices, no-one will ever notice. Unless it's a school project, where they actually check for that kind of shenanigans with quite sophisticated tools. See also, this question.
I would add another rule here:
Don't new/delete an object when an automatic object will do just fine.
We have found that programmers who are new to C++, or programmers coming over from languages like Java, seem to learn about new and then obsessively use it whenever they want to create any object, regardless of the context. This is especially pernicious when an object is created locally within a function purely to do something useful. Using new in this way can be detrimental to performance and can make it all too easy to introduce silly memory leaks when the corresponding delete is forgotten. Yes, smart pointers can help with the latter but it won't solve the performance issues (assuming that new/delete or an equivalent is used behind the scenes). Interestingly (well, maybe), we have found that delete often tends to be more expensive than new when using Visual C++.
Some of this confusion also comes from the fact that functions they call might take pointers, or even smart pointers, as arguments (when references would perhaps be better/clearer). This makes them think that they need to "create" a pointer (a lot of people seem to think that this is what new does) to be able to pass a pointer to a function. Clearly, this requires some rules about how APIs are written to make calling conventions as unambiguous as possible, which are reinforced with clear comments supplied with the function prototype.
In the general case (resource management, where resource is not necessarily memory), you need to be familiar with the RAII pattern. This is one of the most important pieces of information for C++ developers.
In general, avoid allocating from the heap unless you have to. If you have to, use reference counting for objects that are long-lived and need to be shared between diverse parts of your code.
Sometimes you need to allocate objects dynamically, but they will only be used within a certain span of time. For example, in a previous project I needed to create a complex in-memory representation of a database schema -- basically a complex cyclic graph of objects. However, the graph was only needed for the duration of a database connection, after which all the nodes could be freed in one shot. In this kind of scenario, a good pattern to use is something I call the "local GC idiom." I'm not sure if it has an "official" name, as it's something I've only seen in my own code, and in Cocoa (see NSAutoreleasePool in Apple's Cocoa reference).
In a nutshell, you create a "collector" object that keeps pointers to the temporary objects that you allocate using new. It is usually tied to some scope in your program, either a static scope (e.g. -- as a stack-allocated object that implements the RAII idiom) or a dynamic one (e.g. -- tied to the lifetime of a database connection, as in my previous project). When the "collector" object is freed, its destructor frees all of the objects that it points to.
Also, like DrPizza I think the restriction to not use templates is too harsh. However, having done a lot of development on ancient versions of Solaris, AIX, and HP-UX (just recently - yes, these platforms are still alive in the Fortune 50), I can tell you that if you really care about portability, you should use templates as little as possible. Using them for containers and smart pointers ought to be ok, though (it worked for me). Without templates the technique I described is more painful to implement. It would require that all objects managed by the "collector" derive from a common base class.
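A rough sketch of the "collector" idea in its common-base-class form. Collectable, Node, and Collector are invented names, and std::vector is used only as the container, which the answer above considers acceptable.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical common base class for everything the collector manages.
struct Collectable {
    static int alive;
    Collectable() { ++alive; }
    virtual ~Collectable() { --alive; }
};
int Collectable::alive = 0;

struct Node : Collectable {
    Node* next;   // cycles are fine: the collector owns the nodes
    Node() : next(0) {}
};

class Collector {
public:
    Collector() {}
    // Frees every tracked object in one shot.
    ~Collector() {
        for (std::size_t i = 0; i < objects_.size(); ++i) delete objects_[i];
    }
    // Takes ownership of obj; deletes it if it cannot be recorded.
    void track(Collectable* obj) {
        try { objects_.push_back(obj); }
        catch (...) { delete obj; throw; }
    }
private:
    Collector(const Collector&);              // non-copyable
    Collector& operator=(const Collector&);
    std::vector<Collectable*> objects_;
};
```

Tying a Collector to a scope (as a local variable) or to a longer-lived object (as a member of a connection class) gives the static and dynamic variants described above.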
G'day,
I'd suggest reading the relevant sections of "Effective C++" by Scott Meyers. Easy to read and he covers some interesting gotchas to trap the unwary.
I'm also intrigued by the lack of templates. So no STL or Boost. Wow.
BTW Getting people to agree on conventions is an excellent idea. As is getting everyone to agree on conventions for OOD. BTW The latest edition of Effective C++ doesn't have the excellent chapter about OOD conventions that the first edition had which is a pity, e.g. conventions such as public virtual inheritance always models an "isa" relationship.
Rob
When you have to manage memory manually, make sure you call delete in the same scope/function/class/module, whichever applies first, e.g.:
Let the caller of a function allocate the memory that is filled by it; do not return new'ed pointers.
Always call delete in the same exe/dll as you called new in, because otherwise you may have problems with heap corruptions (different incompatible runtime libraries).
You could derive everything from some base class that implements smart-pointer-like functionality (using ref()/unref() methods and a counter).
All points highlighted by @Timbo are important when designing that base class.
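Such a base class might be sketched as follows. RefCounted, refCount, and Texture are hypothetical names, the counter is not thread-safe, and the protected destructor is one possible way to force deletion through unref().

```cpp
#include <cassert>

// Hypothetical intrusive reference-counting base class (not thread-safe).
class RefCounted {
public:
    RefCounted() : count_(1) {}        // the creator holds the first reference
    void ref() { ++count_; }
    void unref() { if (--count_ == 0) delete this; }
    int refCount() const { return count_; }
protected:
    virtual ~RefCounted() {}           // protected: destroy only via unref()
private:
    RefCounted(const RefCounted&);     // non-copyable
    RefCounted& operator=(const RefCounted&);
    int count_;
};

struct Texture : RefCounted {
    static int alive;
    Texture() { ++alive; }
protected:
    ~Texture() { --alive; }
};
int Texture::alive = 0;
```

Because the delete happens inside unref(), it is always executed by the module that compiled the base class, which fits the advice about keeping new and delete in the same exe/dll.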

What C++ pitfalls should I avoid? [closed]

Closed 11 years ago.
I remember first learning about vectors in the STL and after some time, I wanted to use a vector of bools for one of my projects. After seeing some strange behavior and doing some research, I learned that a vector of bools is not really a vector of bools.
Are there any other common pitfalls to avoid in C++?
A short list might be:
Avoid memory leaks through the use of shared pointers to manage memory allocation and cleanup
Use the Resource Acquisition Is Initialization (RAII) idiom to manage resource cleanup - especially in the presence of exceptions
Avoid calling virtual functions in constructors
Employ minimalist coding techniques where possible - for example, declaring variables only when needed, scoping variables, and early-out design where possible.
Truly understand the exception handling in your code - both with regard to exceptions you throw, as well as ones thrown by classes you may be using indirectly. This is especially important in the presence of templates.
RAII, shared pointers and minimalist coding are of course not specific to C++, but they help avoid problems that do frequently crop up when developing in the language.
Some excellent books on this subject are:
Effective C++ - Scott Meyers
More Effective C++ - Scott Meyers
C++ Coding Standards - Sutter & Alexandrescu
C++ FAQs - Cline
Reading these books has helped me more than anything else to avoid the kind of pitfalls you are asking about.
Pitfalls in decreasing order of their importance
First of all, you should visit the award winning C++ FAQ. It has many good answers to pitfalls. If you have further questions, visit ##c++ on irc.freenode.org in IRC. We are glad to help you, if we can. Note all the following pitfalls are originally written. They are not just copied from random sources.
delete[] on new, delete on new[]
Solution: Doing the above yields undefined behavior: anything could happen. Understand your code and what it does, and always delete[] what you new[], and delete what you new, then that won't happen.
Exception:
typedef T type[N]; T * pT = new type; delete[] pT;
You need to delete[] even though you new, since you new'ed an array. So if you are working with typedef, take special care.
Calling a virtual function in a constructor or destructor
Solution: Calling a virtual function from a constructor or destructor won't call the overriding functions in the derived classes. Calling a pure virtual function in a constructor or destructor is undefined behavior.
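A small sketch of the effect, with made-up class names. During Base's constructor the object is still a Base, so the call resolves to Base::name() even though a Derived is being built.

```cpp
#include <cassert>
#include <string>

struct Base {
    std::string seenInCtor;
    Base() { seenInCtor = name(); }   // resolves to Base::name()
    virtual ~Base() {}
    virtual std::string name() const { return "Base"; }
};

struct Derived : Base {
    virtual std::string name() const { return "Derived"; }
};
```

After `Derived d;`, `d.seenInCtor` holds "Base", while `d.name()` called later returns "Derived".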
Calling delete or delete[] on an already deleted pointer
Solution: Assign 0 to every pointer you delete. Calling delete or delete[] on a null-pointer does nothing.
Taking the sizeof of a pointer, when the number of elements of an 'array' is to be calculated.
Solution: Pass the number of elements alongside the pointer when you need to pass an array as a pointer into a function. Use the function proposed here if you take the sizeof of an array that is supposed to be really an array.
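The linked function is not preserved here, so the following is only an assumption about its usual shape: a template that accepts real arrays by reference and refuses pointers at compile time. arraySize and sum are illustrative names.

```cpp
#include <cassert>
#include <cstddef>

// Accepts only real arrays; passing a pointer fails to compile.
template <typename T, std::size_t N>
std::size_t arraySize(T (&)[N]) { return N; }

// The pitfall: inside this function `data` is a pointer, so
// sizeof(data) would be sizeof(const int*), not the array's size.
// The element count therefore travels alongside the pointer.
int sum(const int* data, std::size_t count) {
    int total = 0;
    for (std::size_t i = 0; i < count; ++i) total += data[i];
    return total;
}
```

With `int values[] = {1, 2, 3, 4};` the call `sum(values, arraySize(values))` passes both the decayed pointer and the correct count.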
Using an array as if it were a pointer. Thus, using T ** for a two-dimensional array.
Solution: See here for why they are different and how you handle them.
Writing to a string literal: char * c = "hello"; *c = 'B';
Solution: Allocate an array that is initialized from the data of the string literal, then you can write to it:
char c[] = "hello"; *c = 'B';
Writing to a string literal is undefined behavior. Anyway, the above conversion from a string literal to char * is deprecated. So compilers will probably warn if you increase the warning level.
Creating resources, then forgetting to free them when something throws.
Solution: Use smart pointers like std::unique_ptr or std::shared_ptr as pointed out by other answers.
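As a hedged illustration (Resource, mayThrow, and work are invented for this sketch), a std::unique_ptr releases its object during stack unwinding, so the throwing path and the normal path clean up the same way.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>

struct Resource {
    static int alive;
    Resource() { ++alive; }
    ~Resource() { --alive; }
};
int Resource::alive = 0;

void mayThrow(bool fail) {
    if (fail) throw std::runtime_error("something went wrong");
}

// The unique_ptr's destructor runs when the function exits, whether by
// return or by an exception that is caught further up the call stack.
void work(bool fail) {
    std::unique_ptr<Resource> r(new Resource);
    mayThrow(fail);
}
```

With a raw `Resource* r = new Resource;` and a trailing `delete r;`, the throwing path would skip the delete and leak.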
Modifying an object twice like in this example: i = ++i;
Solution: The above was supposed to assign to i the value of i+1. But what it does is not defined. Instead of incrementing i and assigning the result, it changes i on the right side as well. Changing an object between two sequence points is undefined behavior. Sequence points include ||, &&, comma-operator, semicolon and entering a function (non exhaustive list!). Change the code to the following to make it behave correctly: i = i + 1;
Misc Issues
Forgetting to flush streams before calling a blocking function like sleep.
Solution: Flush the stream by streaming either std::endl instead of \n or by calling stream.flush();.
Declaring a function instead of a variable.
Solution: The issue arises because the compiler interprets for example
Type t(other_type(value));
as a function declaration of a function t returning Type and having a parameter of type other_type which is called value. You solve it by putting parentheses around the first argument. Now you get a variable t of type Type:
Type t((other_type(value)));
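The difference can be checked at compile time. In this sketch Other and Type stand in for other_type and Type from the lines above, and the static_asserts show what each declaration actually introduces; compilers typically also warn about the first form.

```cpp
#include <cassert>
#include <type_traits>

struct Other { explicit Other(int) {} };
struct Type  { explicit Type(Other) {} };

int value = 0;

// Parsed as a function declaration: t1 takes an Other named `value`.
Type t1(Other(value));

// The extra parentheses force an expression, so t2 is a variable.
Type t2((Other(value)));

static_assert(std::is_function<decltype(t1)>::value, "t1 is a function");
static_assert(!std::is_function<decltype(t2)>::value, "t2 is an object");
```

Since C++11, brace initialization such as `Type t3{Other(value)};` is another way to avoid the ambiguity.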
Calling the function of a free object that is only declared in the current translation unit (.cpp file).
Solution: The standard doesn't define the order of creation of free objects (at namespace scope) defined across different translation units. Calling a member function on an object not yet constructed is undefined behavior. You can define the following function in the object's translation unit instead and call it from other ones:
House & getTheHouse() { static House h; return h; }
That would create the object on demand and leave you with a fully constructed object at the time you call functions on it.
Defining a template in a .cpp file, while it's used in a different .cpp file.
Solution: Almost always you will get errors like undefined reference to .... Put all the template definitions in a header, so that when the compiler is using them, it can already produce the code needed.
static_cast<Derived*>(base); if base is a pointer to a virtual base class of Derived.
Solution: A virtual base class is a base which occurs only once, even if it is inherited more than once by different classes indirectly in an inheritance tree. Doing the above is not allowed by the Standard. Use dynamic_cast to do that, and make sure your base class is polymorphic.
dynamic_cast<Derived*>(ptr_to_base); if base is non-polymorphic
Solution: The standard doesn't allow a downcast of a pointer or reference when the object passed is not polymorphic. It or one of its base classes has to have a virtual function.
Making your function accept T const **
Solution: You might think that's safer than using T **, but actually it will cause headache to people that want to pass T**: The standard doesn't allow it. It gives a neat example of why it is disallowed:
int main() {
char const c = 'c';
char* pc;
char const** pcc = &pc; //1: not allowed
*pcc = &c;
*pc = 'C'; //2: modifies a const object
}
Always accept T const* const*; instead.
Another (closed) pitfalls thread about C++, so people looking for them will find them, is Stack Overflow question C++ pitfalls.
Some must have C++ books that will help you avoid common C++ pitfalls:
Effective C++
More Effective C++
Effective STL
The Effective STL book explains the vector of bools issue :)
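A short sketch of why a vector of bools "is not really a vector of bools": its operator[] returns a proxy object rather than a bool&. The function name proxyAliasesElement is made up for this example.

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

// operator[] on vector<int> yields a real reference...
static_assert(std::is_same<std::vector<int>::reference, int&>::value,
              "vector<int> hands out int&");
// ...but vector<bool> yields a proxy object, not bool&.
static_assert(!std::is_same<std::vector<bool>::reference, bool&>::value,
              "vector<bool> hands out a proxy");

// `auto` copies the value for int, but copies the *proxy* for bool,
// so the apparent copy still refers to the vector's element.
bool proxyAliasesElement() {
    std::vector<bool> flags(1, false);
    auto first = flags[0];   // std::vector<bool>::reference
    first = true;            // writes through to flags[0]
    return flags[0];
}
```

The same pattern with std::vector<int> leaves the vector untouched, which is where the "strange behavior" tends to come from. It also means `bool* p = &flags[0];` does not compile.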
Brian has a great list: I'd add "Always mark single argument constructors explicit (except in those rare cases you want automatic casting)."
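To illustrate the explicit advice, here is a sketch with a hypothetical Buffer class. Marking the constructor explicit removes the silent int-to-Buffer conversion while still allowing direct construction.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

struct Buffer {
    explicit Buffer(std::size_t size) : size_(size) {}
    std::size_t size() const { return size_; }
private:
    std::size_t size_;
};

std::size_t capacityOf(const Buffer& b) { return b.size(); }

// No implicit conversion, so capacityOf(16) would not compile...
static_assert(!std::is_convertible<int, Buffer>::value,
              "int does not silently become a Buffer");
// ...but spelling out the construction is still fine.
static_assert(std::is_constructible<Buffer, int>::value,
              "Buffer(16) is allowed");
```

Without explicit, a call like `capacityOf(16)` would quietly build a temporary Buffer, which is rarely what the caller meant.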
Not really a specific tip, but a general guideline: check your sources. C++ is an old language, and it has changed a lot over the years. Best practices have changed with it, but unfortunately there's still a lot of old information out there. There have been some very good book recommendations on here - I can second buying every one of Scott Meyers C++ books. Become familiar with Boost and with the coding styles used in Boost - the people involved with that project are on the cutting edge of C++ design.
Do not reinvent the wheel. Become familiar with the STL and Boost, and use their facilities whenever possible instead of rolling your own. In particular, use STL strings and collections unless you have a very, very good reason not to. Get to know auto_ptr and the Boost smart pointers library very well, understand under which circumstances each type of smart pointer is intended to be used, and then use smart pointers everywhere you might otherwise have used raw pointers. Your code will be just as efficient and a lot less prone to memory leaks.
Use static_cast, dynamic_cast, const_cast, and reinterpret_cast instead of C-style casts. Unlike C-style casts they will let you know if you are really asking for a different type of cast than you think you are asking for. And they stand out visually, alerting the reader that a cast is taking place.
The web page C++ Pitfalls by Scott Wheeler covers some of the main C++ pitfalls.
Two gotchas that I wish I hadn't learned the hard way:
(1) A lot of output (such as printf) is buffered by default. If you're debugging crashing code, and you're using buffered debug statements, the last output you see may not really be the last print statement encountered in the code. The solution is to flush the buffer after each debug print (or turn off the buffering altogether).
(2) Be careful with initializations - (a) avoid class instances as globals / statics; and (b) try to initialize all your member variables to some safe value in a ctor, even if it's a trivial value such as NULL for pointers.
Reasoning: the ordering of global object initialization is not guaranteed (globals includes static variables), so you may end up with code that seems to fail nondeterministically since it depends on object X being initialized before object Y. If you don't explicitly initialize a primitive-type variable, such as a member bool or enum of a class, you'll end up with different values in surprising situations -- again, the behavior can seem very nondeterministic.
I've already mentioned it a few times, but Scott Meyers' books Effective C++ and Effective STL are really worth their weight in gold for helping with C++.
Come to think of it, Steven Dewhurst's C++ Gotchas is also an excellent "from the trenches" resource. His item on rolling your own exceptions and how they should be constructed really helped me in one project.
Using C++ like C. Having a create-and-release cycle in the code.
In C++, this is not exception safe and thus the release may not be executed. In C++, we use RAII to solve this problem.
All resources that have a manual create and release should be wrapped in an object so these actions are done in the constructor/destructor.
// C Code
void myFunc()
{
Plop* plop = createMyPlopResource();
// Use the plop
releaseMyPlopResource(plop);
}
In C++, this should be wrapped in an object:
// C++
class PlopResource
{
public:
PlopResource()
{
mPlop=createMyPlopResource();
// handle exceptions and errors.
}
~PlopResource()
{
releaseMyPlopResource(mPlop);
}
private:
Plop* mPlop;
};
void myFunc()
{
PlopResource plop;
// Use the plop
// Exception safe release on exit.
}
The book C++ Gotchas may prove useful.
Here are a few pits I had the misfortune to fall into. All these have good reasons which I only understood after being bitten by behaviour that surprised me.
virtual functions in constructors aren't.
Don't violate the ODR (One Definition Rule), that's what anonymous namespaces are for (among other things).
Order of initialization of members depends on the order in which they are declared.
class bar {
vector<int> vec_;
unsigned size_; // Note size_ declared *after* vec_
public:
bar(unsigned size)
: size_(size)
, vec_(size_) // size_ is uninitialized
{}
};
Default values and virtual have different semantics.
class base {
public:
virtual void foo(int i = 42) { cout << "base " << i; }
};
class derived : public base {
public:
virtual void foo(int i = 12) { cout << "derived "<< i; }
};
derived d;
base& b = d;
b.foo(); // Outputs `derived 42`
The most important pitfall for beginning developers to avoid is confusing C and C++. C++ should never be treated as a mere better C or C with classes, because this prunes its power and can even make it dangerous (especially when using memory as in C).
Check out boost.org. It provides a lot of additional functionality, especially their smart pointer implementations.
PRQA have an excellent and free C++ coding standard based on books from Scott Meyers, Bjarne Stroustrup and Herb Sutter. It brings all this information together in one document.
Not reading the C++ FAQ Lite. It explains many bad (and good!) practices.
Not using Boost. You'll save yourself a lot of frustration by taking advantage of Boost where possible.
Be careful when using smart pointers and container classes.
Avoid pseudo classes and quasi classes... Overdesign basically.
Forgetting to define a base class destructor virtual. This means that calling delete on a Base* won't end up destructing the derived part.
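A sketch of the correct form, with illustrative names. Note that deleting a derived object through a base pointer whose destructor is not virtual is formally undefined behavior, so only the fixed version is shown.

```cpp
#include <cassert>

struct Base {
    virtual ~Base() {}        // virtual: delete through Base* is well-defined
};

struct Derived : Base {
    static int destroyed;
    ~Derived() { ++destroyed; }
};
int Derived::destroyed = 0;

// Only sees a Base*, yet the Derived destructor still runs.
void destroy(Base* p) { delete p; }
```

Calling `destroy(new Derived);` runs ~Derived() first and then ~Base().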
Keep the name spaces straight (including struct, class, namespace, and using). That's my number-one frustration when the program just doesn't compile.
To mess up, use straight pointers a lot. Instead, use RAII for almost anything, making sure of course that you use the right smart pointers. If you write "delete" anywhere outside a handle or pointer-type class, you're very likely doing it wrong.
Read the book C++ Gotchas: Avoiding Common Problems in Coding and Design.
Blizpasta. That's a huge one I see a lot...
Uninitialized variables are a huge mistake that students of mine make. A lot of Java folks forget that just saying "int counter" doesn't set counter to 0. Since you have to define variables in the h file (and initialize them in the constructor/setup of an object), it's easy to forget.
Off-by-one errors on for loops / array access.
Not properly cleaning object code when voodoo starts.
static_cast downcast on a virtual base class
Not really... Now about my misconception: I thought that A in the following was a virtual base class when in fact it's not; it's, according to 10.3.1, a polymorphic class. Using static_cast here seems to be fine.
struct B { virtual ~B() {} };
struct D : B { };
In summary, yes, this is a dangerous pitfall.
Always check a pointer before you dereference it. In C, you could usually count on a crash at the point where you dereference a bad pointer; in C++, you can create an invalid reference which will crash at a spot far removed from the source of the problem.
class SomeClass
{
...
void DoSomething()
{
++counter; // crash here!
}
int counter;
};
void Foo(SomeClass & ref)
{
...
ref.DoSomething(); // if DoSomething is virtual, you might crash here
...
}
void Bar(SomeClass * ptr)
{
Foo(*ptr); // if ptr is NULL, you have created an invalid reference
// which probably WILL NOT crash here
}
Forgetting an & and thereby creating a copy instead of a reference.
This happened to me twice in different ways:
One instance was in an argument list, which caused a large object to be put on the stack with the result of a stack overflow and crash of the embedded system.
I forgot the & on an instance variable, with the effect that the object was copied. After registering as a listener to the copy I wondered why I never got the callbacks from the original object.
Both were rather hard to spot, because the difference is small and hard to see, and otherwise objects and references are used syntactically in the same way.
Intention is (x == 10):
if (x = 10) {
//Do something
}
I thought I would never make this mistake myself, but I actually did it recently.
The essay/article Pointers, references and Values is very useful. It talks about avoiding pitfalls and about good practices. You can browse the whole site too, which contains programming tips, mainly for C++.
I spent many years doing C++ development. I wrote a quick summary of problems I had with it years ago. Standards-compliant compilers are not really a problem anymore, but I suspect the other pitfalls outlined are still valid.
#include <boost/shared_ptr.hpp>
class A {
public:
void nuke() {
boost::shared_ptr<A> (this);
}
};
int main(int argc, char** argv) {
A a;
a.nuke();
return(0);
}
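The snippet above builds a shared_ptr from this for an object that lives on the stack. When the temporary shared_ptr is destroyed it calls delete on an object that was never allocated with new, which is undefined behavior. One way to get a shared_ptr to the current object safely is sketched below with std::enable_shared_from_this (Boost offers boost::enable_shared_from_this); it assumes the object is already owned by a shared_ptr, and self() is a made-up member name.

```cpp
#include <cassert>
#include <memory>

class A : public std::enable_shared_from_this<A> {
public:
    // Returns a shared_ptr that shares ownership with the existing owners
    // instead of starting a second, independent reference count.
    std::shared_ptr<A> self() { return shared_from_this(); }
};
```

Usage would be along the lines of `std::shared_ptr<A> a = std::make_shared<A>();` followed by `a->self()`, which yields a second pointer to the same object with a shared count.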