Explain C++ Mutability through Indirection example - c++

In C++ Programming Language, 4th edition, section 16.2.9.4 "Mutability through Indirection" has a sketch of an example of using indirection instead of the mutable keyword for lazy evaluation.
struct cache {
    bool valid;
    string rep;
};
class Date {
public:
    // ...
    string string_rep() const;
private:
    cache * c; // initialize in constructor
    void compute_cache_value() const;
    // ...
};
string Date::string_rep() const {
    if (!c->valid) {
        compute_cache_value();
        c->valid = true;
    }
    return c->rep;
}
Full runnable example.
There isn't a lot of explanation:
Declaring a member mutable is most appropriate when only a small part of a representation of a small object is allowed to change. More complicated cases are often better handled by placing the changing data in a separate object and accessing it indirectly.
I am looking for a more complete explanation. In particular,
What is the smallness constraint? Is it a small amount of memory or a small amount of logic?
Doesn't initializing c in the constructor defeat (to a nontrivial degree) the laziness? That is, it does work you may never need.
Why is c a naked pointer instead of something like unique_ptr? The previous chapters went to a bit of effort to demonstrate exception safety and RAII.
Why not just have a mutable cache c member if you're going to allocate and initialize c in the constructor anyway?
In other words, is this a real pattern or a contrived example to demonstrate indirection versus const-ness?

The "smallness constraint" isn't a real constraint, just a hint about how to write the code (and it has little to do with memory use). If you have a class with 30 members and 20 of them are mutable, it could make more sense to split it into two classes (one for the mutable part and one for the rest).
Why it's not a smart pointer: I don't know, but probably a tired book author :p
Why it's a pointer at all: You're right, it's not necessary. A mutable cache object without any pointer will work too, and if nothing pointer-like is needed otherwise (like getting an existing object from outside), the pointer only adds another opportunity for bugs.
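A minimal sketch of the pointer-free alternative described above, assuming a stripped-down Date that stores only three integers. The constructor, the y_/m_/d_ members, the computations() counter and the demo function are hypothetical additions for illustration; they are not from the book.

```cpp
#include <cassert>
#include <string>

struct cache {
    bool valid = false;
    std::string rep;
};

// Hypothetical, stripped-down Date; only the caching machinery is shown.
class Date {
public:
    Date(int y, int m, int d) : y_(y), m_(m), d_(d) {}

    // Logically const: the observable value of the Date never changes.
    std::string string_rep() const {
        if (!c_.valid) {
            compute_cache_value();
            c_.valid = true;
        }
        return c_.rep;
    }

    int computations() const { return computations_; }  // demo-only

private:
    void compute_cache_value() const {
        c_.rep = std::to_string(y_) + '-' + std::to_string(m_) + '-' + std::to_string(d_);
        ++computations_;
    }

    int y_, m_, d_;
    mutable cache c_;               // no pointer, no allocation, nothing to delete
    mutable int computations_ = 0;  // demo-only counter
};

// Usage: the string is built on the first call and reused afterwards.
inline void demo_mutable_cache() {
    const Date d(2024, 2, 29);
    assert(d.computations() == 0);
    assert(d.string_rep() == "2024-2-29");
    assert(d.string_rep() == "2024-2-29");
    assert(d.computations() == 1);
}
```

Grouping the changing state in one mutable struct keeps the "this part may change behind const" decision in a single, visible place.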

Warning: none of the following text has any practical value for battlefield programmers; it is rather for programmer-philosophers or just plain philosophers. Also, I'm a big fan of Bjarne Stroustrup and my opinion might be biased. Unfortunately, the StackOverflow format is not a good fit for book discussions.
Also, we are discussing awkward issues about constness and mutability where we have to lie to the compiler and to the class user. There is no single right opinion. I'm ready to be commented on and downvoted ;)
In short:
you probably misunderstood what lazy initialization means exactly (probably because the term is not well chosen in the book). As with RAII, we know that Bjarne is known to be bad at picking terminology ;)
there are a few decisions that have to be made when writing a book about programming. So some of the questions boil down to "How to write a book?" rather than "How to write production code?".
In long:
What is the smallness constraint? Is it a small amount of memory or a
small amount of logic?
I'll quote Bjarne again:
Declaring a member mutable is most appropriate when only a small part of a representation of a small object is allowed to change
I think he meant "small number of data members" here. Refactoring by grouping data members into a separate class is good advice in general. Where is the line between "appropriate" and "small"? You decide it for yourself (given a real problem, a profiler tool and constraints on memory/speed/battery_life/money/client_happiness etc.)
Doesn't initializing c in the constructor defeat (to a nontrivial degree) the laziness? That is, it does work you may never need.
Well, by lazy initialization we mean not computing the proper string value (i.e. calling compute_cache_value()) each time the user asks for a string, but only when really needed. It does not mean avoiding initialization with an empty string, right? (std::string initializes to an empty string on construction anyway.)
There are no constructors in Bjarne's code in sections 16.2.9.3 and 16.2.9.4! And you don't calculate the string in the constructor in your code either; you initialize it with an empty string literal. All calculations are delayed until the last moment. So lazy initialization works perfectly for me here.
As a further premature optimization, if you want real lazy initialization, you could leave the cache* pointer null in the constructor and allocate on the first Date::string_rep() call. This will save a bunch of heap if your cache is big and the user never needs it. This way you also wrap the calculations in the cache constructor, which turns lazy evaluation into really lazy initialization.
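The allocate-on-first-use idea can be sketched roughly as follows. This is only an illustration: LazyDate, its integer members and cache_allocated() are hypothetical names, and std::unique_ptr is swapped in for the book's raw pointer so that no destructor has to be written.

```cpp
#include <cassert>
#include <memory>
#include <string>

struct cache {
    bool valid = false;
    std::string rep;
};

// Hypothetical variant of Date that allocates its cache on first use.
class LazyDate {
public:
    LazyDate(int y, int m, int d) : y_(y), m_(m), d_(d) {}  // no allocation here

    std::string string_rep() const {
        if (!c_) c_.reset(new cache);  // allocate only when first asked
        if (!c_->valid) {
            c_->rep = std::to_string(y_) + '-' + std::to_string(m_) + '-' + std::to_string(d_);
            c_->valid = true;
        }
        return c_->rep;
    }

    bool cache_allocated() const { return static_cast<bool>(c_); }  // demo-only

private:
    int y_, m_, d_;
    // mutable is still required: a const member function assigns to the pointer itself.
    mutable std::unique_ptr<cache> c_;
};

inline void demo_lazy_date() {
    const LazyDate d(2024, 1, 5);
    assert(!d.cache_allocated());          // nothing allocated yet
    assert(d.string_rep() == "2024-1-5");
    assert(d.cache_allocated());           // allocated by the first call
}
```

Note the trade-off: once the pointer itself has to change inside a const member function, indirection alone no longer avoids mutable.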
Why is c a naked pointer instead of something like unique_ptr? The
previous chapters went to a bit of effort to demonstrate exception
safety and RAII.
In "C++ Programming Language, 4th edition" smart pointers are introduced in chapter 17, and we are talking about chapter 16. Also, a smart pointer isn't needed to describe mutability and brings no advantage as long as you remember to delete in the destructor. Another thing is that the author would have had to explain in this chapter why you can mutate a resource owned by a smart pointer member when you only have a const smart pointer object inside a const method, which would pull in a description of operator overloading (and most high-level Java and Python programmers would throw the book away at that point ;) ).
Apart from this, that's a hard question in general. First of all, books by Bjarne Stroustrup are mostly regarded as teaching materials or guides. So, when teaching newcomers, should we jump straight to smart pointers or teach raw pointers first? Should we use the standard library right away or leave it for the last chapters? C++14 from the beginning or "C+" first? Who knows? There is also a problem known as "over-usage of smart pointers", notably shared_ptr.
Why not just have a mutable cache c member if you're going to allocate and initialize c in the constructor anyway?
That's what described in 16.2.9.3, right?
And adding a level of indirection here is the alternative solution (showing "there is no universal solutions for all purposes") and demonstration of this amazing quote:
All problems in computer science can be solved by another level of indirection, except for the problem of too many layers of indirection.
– David J. Wheeler
No. As user xan clarified, 16.2.9.3 is about multiple mutable members, whereas a single mutable struct would provide some separation of concerns.
Hope you enjoy the reading!

Related

C++ std features and Binary size

I was told recently in a job interview that their project works on building the smallest possible binary for their application (it runs embedded), so I would not be able to use things such as templates or smart pointers because these would increase the binary size. They generally seemed to imply that using things from std would be a no-go (though not in all cases).
After the interview, I tried to research online which coding practices and standard library features cause large binary sizes, and I could find basically nothing on this. Is there a way to quantify the size impact of using certain features (without needing to code, for example, 100 smart pointers in a code base versus self-managed pointers)?
This question probably deserves more attention than it’s likely to get, especially for people trying to pursue a career in embedded systems. So far the discussion has gone about the way that I would expect, specifically a lot of conversation about the nuances of exactly how and when a project built with C++ might be more bloated than one written in plain C or a restricted C++ subset.
This is also why you can’t find a definitive answer from a good old fashioned google search. Because if you just ask the question “is C++ more bloated than X?”, the answer is always going to be “it depends.”
So let me approach this from a slightly different angle. I’ve both worked for, and interviewed at companies that enforced these kinds of restrictions, I’ve even voluntarily enforced them myself. It really comes down to this. When you’re running an engineering organization with more than one person with plans to keep hiring, it is wildly impractical to assume everyone on your team is going to fully understand the implications of using every feature of a language. Coding standards and language restrictions serve as a cheap way to prevent people from doing “bad things” without knowing they’re doing “bad things”.
How you define a “bad thing” is then also context specific. On a desktop platform, using lots of code space isn’t really a “bad” enough thing to rigorously enforce. On a tiny embedded system, it probably is.
C++ by design makes it very easy for an engineer to generate lots of code without having to type it out explicitly. I think that statement is pretty self-evident, it’s the whole point of meta-programming, and I doubt anyone would challenge it, in fact it’s one of the strengths of the language.
So then coming back to the organizational challenges, if your primary optimization variable is code space, you probably don’t want to allow people to use features that make it trivial to generate code that isn’t obvious. Some people will use that feature responsibly and some people won’t, but you have to standardize around the least common denominator. A C compiler is very simple. Yes you can write bloated code with it, but if you do, it will probably be pretty obvious from looking at it.
(Partially extracted from comments I wrote earlier)
I don't think there is a comprehensive answer. A lot also depends on the specific use case and needs to be judged on a case-by-case basis.
Templates
Templates may result in code bloat, yes, but they can also avoid it. If your alternative is introducing indirection through function pointers or virtual methods, then the function itself may become bigger in code size than the templated version, simply because indirect function calls take several instructions and remove optimization potential.
Another aspect where they can at least not hurt is when used in conjunction with type erasure. The idea here is to write generic code, then put a small template wrapper around it that only provides type safety but does not actually emit any new code. Qt's QList is an example that does this to some extent.
This bare-bones vector type shows what I mean:
#include <cstddef>

// Non-templated base: all real work lives here and is compiled once.
class VectorBase
{
protected:
    void **start, **end, **capacity;  // each one points into an array of void*
    void push_back(void*);
    void* at(std::size_t i);
    void clear(void (*cleanup_function)(void*));
};

// Thin typed wrapper: adds type safety, emits almost no code of its own.
template<class T>
class Vector: public VectorBase
{
public:
    void push_back(T* value)
    { this->VectorBase::push_back(value); }
    T* at(std::size_t i)
    { return static_cast<T*>(this->VectorBase::at(i)); }
    ~Vector()
    { clear(+[](void* object) { delete static_cast<T*>(object); }); }
};
By carefully moving as much code as possible into the non-templated base, the template itself can focus on type safety and on providing the necessary indirections without emitting any code that wouldn't have been there anyway.
(Note: This is just meant as a demonstration of type erasure, not an actually good vector type)
Smart pointers
When written carefully, they won't generate much code that wouldn't be there anyway. Whether an inline function generates a delete statement or the programmer does it manually doesn't really matter.
The main issue that I see with those is that the programmer is better at reasoning about code and avoiding dead code. For example even after a unique_ptr has been moved away, the destructor of the pointer still has to emit code. A programmer knows that the value is NULL, the compiler often doesn't.
Another issue comes up with calling conventions. Objects with destructors are usually passed in memory (on the stack) rather than in registers, even if you declare them pass-by-value. Same for return values. So a function unique_ptr<foo> bar(unique_ptr<foo> baz) will have higher overhead than foo* bar(foo* baz) simply because pointers have to be put on and off the stack.
Even more egregiously, the calling convention used for example on Linux makes the caller clean up parameters instead of the callee. That means if a function accepts a complex object like a smart pointer by value, a call to the destructor for that parameter is replicated at every call site, instead of putting it once inside the function. Especially with unique_ptr this is so stupid because the function itself may know that the object has been moved away and the destructor is superfluous; but the caller doesn't know this (unless you have LTO).
Shared pointers are a different beast altogether, simply because they allow a lot of different tradeoffs. Should they be atomic? Should they allow type casting, weak pointers, what indirection is used for destruction? Do you really need two raw pointers per shared pointer or can the reference counter be accessed through shared object?
Exceptions, RTTI
Generally avoided and removed via compiler flags.
Library components
On a bare-metal system, pulling in parts of the standard library can have a significant effect that can only be measured after the linker step. I suggest any such project use continuous integration and track the code size as a metric.
For example I once added a small feature, I don't remember which, and in its error handling it used std::stringstream. That pulled in the entire iostream library. The resulting code exceeded my entire RAM and ROM capacity. IIRC the issue was that even though exception handling was deactivated, the exception message was still being set up.
Move constructors and destructors
It's a shame that C++'s move semantics aren't the same as for example Rust's where objects can be moved with a simple memcpy and then "forgetting" their original location. In C++ the destructor for a moved object is still invoked, which requires more code in the move constructor / move assignment operator, and in the destructor.
Qt for example accounts for such simple cases in its meta type system.

std::auto_ptr<T> Usage

I've read a reasonable amount in decent textbooks about the auto_ptr class. While I understand what it is, and how it gets you around the problem of getting exceptions in places like constructors, I am having trouble figuring out when someone would actually use it.
An auto_ptr can only hold a single type (no array new[] initialization is supported). It changes ownership when you pass it into functions or try and duplicate it (it's not a reference counting smart pointer).
What is a realistic usage scenario for this class given its limitations? It seems like most of the textbook examples of its use are reaching because there isn't even a reason to be using a pointer over a stack variable in most of the cases...
Anyway, I'll stop my rant - but if you can provide a short example/description or a link to a good usage scenario for this I'd be grateful. I just want to know where I should use it in practice in case I come across the situation - I like to practice what I learn so I remember it.
I'll give you a short example for a good usage. Consider this:
auto_ptr<SomeResource> some_function() {
    auto_ptr<SomeResource> my_ptr = get_the_resource();
    function_that_throws_an_exception();
    return my_ptr;
}
The function that raises an exception would normally cause your pointer to be lost, and the object pointed to would not be deleted. With the auto_ptr this can't happen, since the object is destroyed when the auto_ptr leaves the frame in which it was created, unless ownership has been transferred (for example with return).
auto_ptr has been deprecated in the now finalized C++11 standard. Some of the replacements are already available through TR1 or the Boost libraries. Examples are shared_ptr and unique_ptr (scoped_ptr in boost).
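For comparison, here is a rough sketch of the same exception-safety pattern written with std::unique_ptr. SomeResource, its live counter and the fail flag are hypothetical stand-ins so the behaviour can be observed; they are not part of any real API.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>

// Hypothetical resource that counts how many instances are alive.
struct SomeResource {
    static int live;
    SomeResource() { ++live; }
    ~SomeResource() { --live; }
};
int SomeResource::live = 0;

void function_that_throws_an_exception(bool fail) {
    if (fail) throw std::runtime_error("simulated failure");
}

std::unique_ptr<SomeResource> some_function(bool fail) {
    std::unique_ptr<SomeResource> my_ptr(new SomeResource);
    function_that_throws_an_exception(fail);  // on throw, my_ptr's destructor frees the resource
    return my_ptr;                            // otherwise ownership moves to the caller
}

void demo_unique_ptr() {
    try {
        some_function(true);
        assert(false);  // not reached
    } catch (const std::runtime_error&) {
        assert(SomeResource::live == 0);  // nothing leaked during unwinding
    }
    {
        std::unique_ptr<SomeResource> p = some_function(false);
        assert(SomeResource::live == 1);  // caller now owns the object
    }
    assert(SomeResource::live == 0);      // released when p went out of scope
}
```

Unlike auto_ptr, unique_ptr cannot be copied, so ownership only changes hands through an explicit move or a return.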

Is it bad design for a class to give access to its data (via ptr/it) when this data can be deleted before the class object is out of scope?

Classic example is iterator invalidation :
std::string test("A");
auto it = test.insert(test.begin()+1,'B');
test.erase();
...
std::cout << *it;
Do you think having this kind of API is bad design, and will be difficult to learn/use for beginners ?
A costly, performance/memory wise, solution would be, in that type of case, to assign the pointer/iterator to an empty string (or a nullptr, but that's not very helpful) when a clear method is used.
Some clarifications
I'm thinking of this design for returning const char* values whose storage can be modified internally (maybe they're stored in a std::vector that can be cleared). I don't want to return a std::string (binary compatibility) and I don't want a get(char*,std::size_t) method because of the size argument that needs to be fetched (too slow). Also I don't want to create a wrapper around std::string or my own string class.
I would recommend reading up on Stepanov's design philosophy (pages 9-11):
[This example] is written in a clear object-oriented style with getters and setters. The proponents of this style say that the advantage of having such functions is that it allows programmers later on to change the implementation. What they forget to mention is that sometimes it is awfully good to expose the implementation. Let us see what I mean. It is hard for me to imagine an evolution of a system that would let you keep the interface of get and set, but be able to change the implementation. I could imagine that the implementation outgrows int and you need to switch to long. But that is a different interface. I can imagine that you decide to switch from an array to a list but that also will force you to change the interface, since it is really not a very good idea to index into a linked list.
Now let us see why it is really good to expose the implementation. Let us assume that tomorrow you decide to sort your integers. How can you do it? Could you use the C library qsort? No, since it knows nothing about your getters and setters. Could you use the STL sort? The answer is the same. While you design your class to survive some hypothetical change in the implementation, you did not design it for the very common task of sorting. Of course, the proponents of getters and setters will suggest that you extend your interface with a member function sort. After you do that, you will discover that you need binary search and median, etc. Very soon your class will have 30 member functions but, of course, it will be hiding the implementation. And that could be done only if you are the owner of the class. Otherwise, you need to implement a decent sorting algorithm on top of the setter-getter interface from scratch and that is a far more difficult and dangerous activity than one can imagine. ...
Setters and getters make our daily programming hard but promise huge rewards in the future when we discover better ways to store arrays of integers in memory. But I do not know a single realistic scenario when hiding memory locations inside our data structure helps and exposure hurts; it is, therefore, my obligation to expose a much more convenient interface that also happens to be consistent with the familiar interface to the C arrays. When we program in C++ we should not be ashamed of its C heritage, but make full use of it. The only problems with C++, and even the only problems with C, arise when they themselves are not consistent with their own logic. ...
My remark about exposing the address locations of consecutive integers is not facetious. It took a major effort to convince the standard committee that such a requirement is an essential property of vectors; they would not, however, agree that vector iterators should be pointers and, therefore, on several major platforms – including the Microsoft one – it is faster to sort your vector by saying the unbelievably ugly
if (!v.empty()) {
    sort(&*v.begin(), &*v.begin() + v.size());
}
than the intended
sort(v.begin(), v.end());
Attempts to impose pseudo-abstractness at the cost of efficiency can be defeated, but at a terrible cost.
Stepanov has a lot of other interesting documents available, especially in the "Class Notes" section.
Yes, there are several rules of thumb regarding OOP. No, I'm not convinced that they are really the best way to do things. When you're working with the STL it makes a lot of sense to do things the STL compatible way. And when your abstraction is low level (like std::vector, which is meant specifically to make working with dynamically allocated arrays easier; i.e., it should be usable almost like an array with some added features), then some of those OOP rules of thumb make no sense at all.
To answer the original question: even beginners will eventually need to learn about iterators, object lifetimes, and what I'll call an object's useful life (i.e., "the object hasn't fallen out of scope, but is no longer valid to use, like an invalidated iterator"). I don't see any reason to try to hide those facts of life from the user, so I personally wouldn't rule out an iterator-based API on those grounds. The real question is what your API is meant to abstract and what it's meant to expose (similar to the fact that a vector is a nicer array and is meant to expose its array nature). If you answer that, you should have a better idea about whether an iterator-based API makes sense.
As Scott Meyers states in Effective C++: yes it is indeed not a good design to grant access to private/protected members via pointers, iterators or references because you never know what the client code will do with it.
As far as I can remember this should be avoided, and it is sometimes better to create a copy of data members which are then returned to the caller.
It is a bad or faulty implementation rather than design.
As for providing access to private or protected members through pointers, it basically destroys one of the basic OOP principles, abstraction.
I am unsure, though, what the question is. Yes, of course it is bad to have an implementation which invalidates iterators. What is the real question here?

What are the often misunderstood concepts in C++? [closed]

What are the often misunderstood concepts in c++?
C++ is not C with classes!
And there is no language called C/C++. Everything goes downhill from there.
That C++ does have automatic resource management.
(Most people who claim that C++ does not have memory management try to use new and delete way too much, not realising that if they allowed C++ to manage the resource itself, the task gets much easier).
Example: (Made with a made up API because I do not have time to check the docs now)
// C++
void DoSomething()
{
    File file("/tmp/dosomething", "rb");
    // ... do stuff with file ...
    // file is automatically freed and closed.
}
// C#
public void DoSomething()
{
    File file = new File("/tmp/dosomething", "rb");
    // ... do stuff with file ...
    // file is NOT automatically closed.
    // What if the caller calls DoSomething() in a tight loop?
    // C# requires you to be aware of the implementation of the File class
    // and forces you to accommodate, thus voiding implementation-hiding
    // principles.
    // Approaches may include:
    // 1) Utilizing the IDisposable pattern.
    // 2) Utilizing try-finally guards, which quickly gets messy.
    // 3) The nagging doubt that you've forgotten something /somewhere/ in your
    //    1 million loc project.
    // 4) The realization that point #3 can not be fixed by fixing the File
    //    class.
}
Free functions are not bad just because they are not within a class
C++ is not an OOP language alone, but builds upon a whole stack of techniques.
I've heard it many times when people say free functions (those in namespaces and the global namespace) are a "relic of C times" and should be avoided. Quite the opposite is true. Free functions make it possible to decouple functions from specific classes and to reuse functionality. It's also recommended to use free functions instead of member functions if the function doesn't need access to implementation details, because this will eliminate cascading changes when one changes the implementation of a class, among other advantages.
This is also reflected in the language: the range-based for loop in C++11 (still called C++0x when this was written) can be driven by free function calls. For a class type without begin and end member functions, it gets its iterators by calling the free functions begin and end, found through argument-dependent lookup.
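A small sketch of a range-based for loop picking up free begin and end functions. The demo namespace, the Triple type and sum_triple are hypothetical names chosen for illustration; the type deliberately has no begin/end members, so the loop has to find the free functions in its namespace.

```cpp
#include <cassert>

namespace demo {

// Hypothetical aggregate with no begin()/end() member functions.
struct Triple {
    int values[3];
};

// Free functions in the same namespace, found by argument-dependent lookup.
inline const int* begin(const Triple& t) { return t.values; }
inline const int* end(const Triple& t) { return t.values + 3; }

}  // namespace demo

// The loop below compiles only because demo::begin and demo::end exist.
int sum_triple(const demo::Triple& t) {
    int total = 0;
    for (int v : t) total += v;
    return total;
}

void demo_range_for() {
    demo::Triple t = {{1, 2, 3}};
    assert(sum_triple(t) == 6);
}
```

This lets a type be made iterable without modifying the class itself, which is the decoupling argument made above.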
The difference between assignment and initialisation:
string s = "foo"; // initialisation
s = "bar"; // assignment
Initialisation always uses constructors, assignment always uses operator=
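To make the distinction visible for a class type, here is a minimal sketch with a hypothetical Tracer class that counts which special operation runs. The class and its counters are illustration only.

```cpp
#include <cassert>

// Hypothetical class that records how it was given a value.
struct Tracer {
    static int constructions;
    static int assignments;
    Tracer(const char*) { ++constructions; }
    Tracer& operator=(const char*) { ++assignments; return *this; }
};
int Tracer::constructions = 0;
int Tracer::assignments = 0;

void demo_init_vs_assign() {
    Tracer::constructions = 0;
    Tracer::assignments = 0;

    Tracer t = "foo";  // initialisation: the constructor runs, despite the '='
    assert(Tracer::constructions == 1 && Tracer::assignments == 0);

    t = "bar";         // assignment: operator= runs on the existing object
    assert(Tracer::constructions == 1 && Tracer::assignments == 1);
}
```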
In decreasing order:
make sure to release pointers for allocated memory
when destructors should be virtual
how virtual functions work
Interestingly not many people know the full details of virtual functions, but still seem to be ok with getting work done.
The most pernicious concept I've seen is that it should be treated as C with some addons. In fact, with modern C++ systems, it should be treated as a different language, and most of the C++-bashing I see is based on the "C with add-ons" model.
To mention some issues:
While you probably need to know the difference between delete and delete[], you should normally be writing neither. Use smart pointers and std::vector<>.
In fact, you should be using a * only rarely. Use std::string for strings. (Yes, it's badly designed. Use it anyway.)
RAII means you don't generally have to write clean-up code. Clean-up code is bad style, and destroys conceptual locality. As a bonus, using RAII (including smart pointers) gives you a lot of basic exception safety for free. Overall, it's much better than garbage collection in some ways.
In general, class data members shouldn't be directly visible, either by being public or by having getters and setters. There are exceptions (such as x and y in a point class), but they are exceptions, and should be considered as such.
And the big one: there is no such language as C/C++. It is possible to write programs that can compile properly under either language, but such programs are not good C++ and are not normally good C. The languages have been diverging since Stroustrup started working on "C with Classes", and are less similar now than ever. Using "C/C++" as a language name is prima facie evidence that the user doesn't know what he or she is talking about. C++, properly used, is no more like C than Java or C# are.
The overuse of inheritance unrelated to polymorphism. Most of the time, unless you really do use runtime polymorphism, composition or static polymorphism (i.e., templates) is better.
The static keyword which can mean one of three distinct things depending on where it is used.
It can be a static member function or member variable.
It can be a static variable or function declared at namespace scope.
It can be a static variable declared inside a function.
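The three meanings can be put side by side in a short sketch. Widget, file_local_calls, note_call and next_id are hypothetical names used only for this illustration.

```cpp
#include <cassert>

class Widget {
public:
    Widget() { ++live_; }
    ~Widget() { --live_; }
    // 1) static member function: called without an object, has no `this`.
    static int live() { return live_; }
private:
    // 1) static data member: one copy shared by every Widget.
    static int live_;
};
int Widget::live_ = 0;

// 2) static at namespace scope: internal linkage, visible only in this file.
static int file_local_calls = 0;
static void note_call() { ++file_local_calls; }

// 3) static inside a function: initialised once, keeps its value between calls.
int next_id() {
    static int id = 0;
    note_call();
    return ++id;
}

void demo_static() {
    int first = next_id();
    assert(next_id() == first + 1);  // the local static persisted
    assert(file_local_calls >= 2);

    assert(Widget::live() == 0);
    {
        Widget a, b;
        assert(Widget::live() == 2);  // shared counter, not per object
    }
    assert(Widget::live() == 0);
}
```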
Arrays are not pointers
They are different. So &array is not a pointer to a pointer, but a pointer to an array. This is the most misunderstood concept in both C and C++ in my opinion. Just look at all the SO answers that tell you to pass 2-d arrays as type**!
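A short sketch of what the types actually are. The grid variable and sum_rows function are hypothetical names; the static_asserts check the claims at compile time.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

int grid[2][3] = {{1, 2, 3}, {4, 5, 6}};

// &grid is a pointer to the whole array, not a pointer to a pointer.
static_assert(std::is_same<decltype(&grid), int (*)[2][3]>::value,
              "&grid has type int (*)[2][3]");
// The array carries its full size; a pointer would not.
static_assert(sizeof(grid) == 6 * sizeof(int), "grid holds six ints");

// A 2-d array decays to a pointer to its first row, int (*)[3], never to int**.
int sum_rows(int (*rows)[3], std::size_t row_count) {
    int total = 0;
    for (std::size_t r = 0; r < row_count; ++r)
        for (int v : rows[r]) total += v;  // rows[r] is still a real int[3]
    return total;
}

void demo_arrays() {
    assert(sum_rows(grid, 2) == 21);
    // int** wrong = grid;  // error: no conversion from int (*)[3] to int**
}
```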
Here is an important concept in C++ that is often forgotten:
C++ should not be simply used like an object
oriented language such as Java or C#.
Inspire yourself from the STL and write generic code.
Here are some:
Using templates to implement polymorphism without vtables, à la ATL.
Logical const-ness vs actual const-ness in memory. When to use the mutable keyword.
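The first item, polymorphism through templates without vtables, is commonly done with the curiously recurring template pattern. Below is a minimal sketch of that pattern; Shape, Square, Rect and doubled_area are hypothetical names and this is not ATL code.

```cpp
#include <cassert>
#include <type_traits>

// Base template: forwards to the derived class, resolved at compile time.
template <class Derived>
class Shape {
public:
    double area() const {
        return static_cast<const Derived&>(*this).area_impl();
    }
};

class Square : public Shape<Square> {
public:
    explicit Square(double side) : side_(side) {}
    double area_impl() const { return side_ * side_; }
private:
    double side_;
};

class Rect : public Shape<Rect> {
public:
    Rect(double w, double h) : w_(w), h_(h) {}
    double area_impl() const { return w_ * h_; }
private:
    double w_, h_;
};

// Generic code written against the base template; no virtual call is involved.
template <class D>
double doubled_area(const Shape<D>& s) { return 2.0 * s.area(); }

// No virtual functions anywhere, so no vtable is required.
static_assert(!std::is_polymorphic<Square>::value, "Square has no virtual functions");

void demo_crtp() {
    assert(doubled_area(Square(3.0)) == 18.0);
    assert(doubled_area(Rect(2.0, 5.0)) == 20.0);
}
```

The cost is that the concrete type must be known at compile time, so a heterogeneous container of shapes still needs runtime polymorphism.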
ACKNOWLEDGEMENT: Thanks for correcting my mistake, spoulson.
EDIT:
Here are more:
Virtual inheritance (not virtual methods): In fact, I don't understand it at all! (by that, I mean I don't know how it's implemented)
Unions whose members are objects whose respective classes have non-trivial constructors.
Given this:
int x = sizeof(char);
what value is x?
The answer you often hear depends on the level of understanding of the specification.
Beginner - x is one because chars are always eight bit values.
Intermediate - it depends on the compiler implementation, chars could be UTF16 format.
Expert - x is one and always will be one since a char is the smallest addressable unit of memory and sizeof determines the number of units of memory required to store an instance of the type. So in a system where a char is eight bits, a 32 bit value will have a sizeof of 4; but in a system where a char is 16 bits, a 32 bit value will have a sizeof of 2.
It's unfortunate that the standard uses 'byte' to refer to a unit of memory since many programmers think of 'byte' as being eight bits.
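The expert answer can be checked at compile time. A minimal sketch, in which bits_in is a hypothetical helper and CHAR_BIT from <climits> gives the number of bits in a char on the platform being compiled for:

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <cstdint>

// sizeof counts in units of char, so these hold on every conforming platform.
static_assert(sizeof(char) == 1, "sizeof(char) is 1 by definition");
static_assert(sizeof(signed char) == 1 && sizeof(unsigned char) == 1,
              "all three char types have size 1");
// A char (the standard's 'byte') has at least 8 bits, possibly more.
static_assert(CHAR_BIT >= 8, "a byte has at least 8 bits");

// Hypothetical helper: storage size of T in bits.
template <class T>
constexpr std::size_t bits_in() { return sizeof(T) * CHAR_BIT; }

void demo_sizeof_char() {
    assert(bits_in<char>() == CHAR_BIT);
    // std::int32_t, where it exists, has exactly 32 bits and no padding,
    // so its sizeof is 32 / CHAR_BIT: 4 with 8-bit chars, 2 with 16-bit chars.
    assert(bits_in<std::int32_t>() == 32);
}
```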
C++ is a multi-paradigm language. Many people associate C++ strictly with OOP.
a classic among beginners to c++ from c:
confuse delete and delete[]
EDIT:
another classic failure among all levels of experience when using C API:
std::string helloString = "hello world";
printf("%s\n", helloString);
instead of:
printf("%s\n", helloString.c_str());
it happens to me every week. You could use streams, but sometimes you have to deal with printf-like APIs.
Pointers.
Dereferencing pointers, through either * or ->
Taking an address with & when a pointer is required.
Functions that take params by reference by specifying a & in the signature.
Pointer to pointers to pointers *** or pointers by reference void someFunc(int *& arg)
There are a few things that people seem to be constantly confused by or have no idea about:
Pointers, especially function pointers and multiple pointers (e.g. int(*)(void*), void***)
The const keyword and const correctness (e.g. what is the difference between const char*, char* const and const char* const, and what does void class::member() const; mean?)
Memory allocation (e.g. every pointer new'ed should be deleted, malloc/free should not be mixed with new/delete, when to use delete [] instead of delete, why the C functions are still useful (e.g. expand(), realloc()))
Scope (i.e. that you can use { } on its own to create a new scope for variable names, rather than just as part of if, for etc...)
Switch statements. (e.g. not understanding that they can optimise as well (or better in some cases) than chains of ifs, not understanding fall through and its practical applications (loop unrolling as an example) or that there is a default case)
Calling conventions (e.g. what is the difference between cdecl and stdcall, how would you implement a pascal function, why does it even matter?)
Inheritance and multiple inheritance and, more generally, the entire OO paradigm.
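For the const item in the list above, the three pointer declarations and a const member function can be sketched as follows. Account and the local variable names are hypothetical; the commented-out lines are the ones a compiler would reject.

```cpp
#include <cassert>

class Account {
public:
    explicit Account(int balance) : balance_(balance) {}
    // const member function: inside it, `this` points to a const Account.
    int balance() const { return balance_; }
    void deposit(int amount) { balance_ += amount; }
private:
    int balance_;
};

void demo_const() {
    char a[] = "ab";
    char b[] = "cd";

    const char* p1 = a;        // pointer to const char
    p1 = b;                    // OK: the pointer itself can be re-seated
    // *p1 = 'x';              // error: cannot write through p1

    char* const p2 = a;        // const pointer to (non-const) char
    *p2 = 'x';                 // OK: the character can be changed
    // p2 = b;                 // error: the pointer cannot be re-seated

    const char* const p3 = b;  // neither the pointer nor the character can change

    assert(*p1 == 'c' && a[0] == 'x' && *p3 == 'c');

    const Account acct(10);
    assert(acct.balance() == 10);  // OK: balance() is a const member
    // acct.deposit(5);            // error: deposit() is not const
}
```

Reading the declaration from right to left helps: "p2 is a const pointer to char", "p1 is a pointer to const char".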
Inline assembler, as it is usually implemented, is not part of C++.
Pointers to members and pointers to member functions.
Non-type template parameters.
Multiple inheritance, particularly virtual base classes and shared base objects.
Order of construction and destruction, the state of virtual functions in the middle of constructing an intermediate base class.
Cast safety and variable sizes. No, you can't assume that sizeof(void *) == sizeof(int) (or any other type for that matter, unless a portable header specifically guarantees it) in portable code.
Pointer arithmetic.
Headers and implementation files
This is also a concept misunderstood by many. Questions like what goes into header files and why it causes link errors if function definitions appear multiple times in a program on the one side but not when class definitions appear multiple times on the other side.
Very similar to those questions is why it is important to have header guards.
If a function accepts a pointer to a pointer, void* will still do it
I've seen the concept of a void pointer frequently confused. It's believed that if you have a pointer you use a void*, and if you have a pointer to a pointer you use a void**. But you can and should use void* in both cases. A void** does not have the special properties that a void* has.
The special property is that a void* can also be assigned a pointer to a pointer, and when it is cast back to the original type the original value is recovered.
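As a minimal sketch of that point (the function names reset_pointer and check_void_pointer are made up for illustration), a function taking a plain void* can receive the address of an int* and cast it back to int**:

```cpp
#include <cassert>

// Hypothetical callback-style function taking an opaque void*.
// The caller passes the address of an int*, i.e. an int**.
void reset_pointer(void* opaque) {
    int** pp = static_cast<int**>(opaque);  // cast back to the original type
    *pp = nullptr;
}

// Returns true if reset_pointer cleared the caller's pointer.
bool check_void_pointer() {
    int value = 7;
    int* p = &value;
    assert(p == &value);
    reset_pointer(&p);   // int** converts implicitly to void*
    // void** q = &p;    // would not compile: int** does not convert to void**
    return p == nullptr;
}
```

The commented-out line shows why void** is not a "generic pointer to pointer": only void* accepts any object pointer implicitly.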
I think the most misunderstood concept about C++ is why it exists and what its purpose is. It's often under fire from above (Java, C# etc.) and from below (C). C++ has the ability to operate close to the machine to deal with computational complexity, and abstraction mechanisms to manage domain complexity.
NULL is always zero.
Many confuse NULL with an address, and therefore think it's not necessarily zero if the platform has a different null pointer address.
But NULL is always zero and it is not an address. It's an integer constant expression with the value zero that can be converted to pointer types.
Memory Alignment.
std::vector does not create elements when reserve is used
I've seen programmers argue that they can access elements at positions greater than what size() returns if they reserve()'d up to those positions. That's a wrong assumption but a very common one, especially because it's quite hard for the compiler to diagnose the mistake, which will silently make things "work".
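A small sketch of the difference (the function names are invented for this example): reserve() only changes capacity(), while resize() actually creates elements.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Size of a vector after reserve() alone: still 0.
std::size_t size_after_reserve() {
    std::vector<int> v;
    v.reserve(10);               // capacity() >= 10, but no elements exist yet
    assert(v.capacity() >= 10);
    return v.size();             // v[3] at this point would be undefined behavior
}

// Size of a vector after resize(): the elements really exist.
std::size_t size_after_resize() {
    std::vector<int> v;
    v.resize(10);                // creates 10 value-initialized ints
    v[3] = 42;                   // fine: index 3 < size()
    return v.size();
}
```

Here size_after_reserve() returns 0 and size_after_resize() returns 10.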
Hehe, this is a silly reply: the most misunderstood thing in C++ programming is the error messages from g++ when template classes fail to compile!
C++ is not C with string and vector!
C structs VS C++ structs is often misunderstood.
C++ is not a typical object oriented language.
Don't believe me? look at the STL, way more templates than objects.
It's almost impossible to use Java/C# ways of writing object oriented code; it simply doesn't work.
In Java/C# programming, there's a lot of newing, lots of utility objects that implement some single cohesive functionality.
In C++, any object newed must be deleted, but there's always the problem of who owns the object
As a result, objects tend to be created on the stack
But when you do that, you have to copy them around all the time if you're going to pass them around to other functions/objects, thus wasting a lot of performance that is said to be achieved with the unmanaged environment of C++
Upon realizing that, you have to think about other ways of organizing your code
You might end up doing things the procedural way, or using metaprogramming idioms like smart pointers
At this point, you've realized that OO in C++ cannot be used the same way as it is used in Java/C#
Q.E.D.
If you insist on doing oop with pointers, you'll usually have large (gigantic!) classes, with clearly defined ownership relationships between objects to avoid memory leaks. And then even if you do that, you're already too far from the Java/C# idiom of oop.
Actually I made up the term "object-oriented", and I can tell you I did not have C++ in mind.
-- Alan Kay (click the link, it's a video, the quote is at 10:33)
Although from a purist point of view (e.g. Alan Kay), even Java and C# fall short of true oop
A pointer is an iterator, but an iterator is not always a pointer
This is also an often misunderstood concept. A pointer to an object is a random access iterator: it can be incremented/decremented by an arbitrary number of elements and can be read and written. However, an iterator class whose overloaded operators do the same fulfills those requirements too. So it is also an iterator, but of course not a pointer.
I remember one of my past C++ teachers was teaching (wrongly) that you get a pointer to an element of a vector if you do vec.begin(). He was actually assuming - without knowing - that the vector implements its iterators using pointers.
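To make this concrete, here is a hedged sketch (function names are illustrative only) in which the same standard algorithm accepts raw pointers and vector iterators, and in which a real pointer to the first element is obtained portably with data() rather than begin():

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Raw pointers into an array work as random access iterators.
int sum_with_pointers() {
    int arr[] = {1, 2, 3, 4};
    return std::accumulate(arr, arr + 4, 0);
}

// Vector iterators satisfy the same requirements but need not be pointers.
int sum_with_iterators() {
    std::vector<int> v = {1, 2, 3, 4};
    return std::accumulate(v.begin(), v.end(), 0);
}

// Portable way to get an actual pointer to the first element.
int* first_element(std::vector<int>& v) {
    assert(!v.empty());
    return v.data();   // &v[0] works as well; v.begin() is not guaranteed to be an int*
}
```

Both sum functions return 10; only first_element is guaranteed to hand back a pointer.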
That anonymous namespaces are almost always what is truly wanted when people are making static variables in C++
When making library header files, the pimpl idiom (http://www.gotw.ca/gotw/024.htm) should be used for almost all private functions and members to aid in dependency management
I still don't get why vector doesn't have a pop_front and the fact that I can't sort(list.begin(), list.end())..

What C++ pitfalls should I avoid? [closed]

I remember first learning about vectors in the STL and after some time, I wanted to use a vector of bools for one of my projects. After seeing some strange behavior and doing some research, I learned that a vector of bools is not really a vector of bools.
Are there any other common pitfalls to avoid in C++?
A short list might be:
Avoid memory leaks through the use of shared pointers to manage memory allocation and cleanup
Use the Resource Acquisition Is Initialization (RAII) idiom to manage resource cleanup - especially in the presence of exceptions
Avoid calling virtual functions in constructors
Employ minimalist coding techniques where possible - for example, declaring variables only when needed, scoping variables, and early-out design where possible.
Truly understand the exception handling in your code - both with regard to exceptions you throw, as well as ones thrown by classes you may be using indirectly. This is especially important in the presence of templates.
RAII, shared pointers and minimalist coding are of course not specific to C++, but they help avoid problems that do frequently crop up when developing in the language.
Some excellent books on this subject are:
Effective C++ - Scott Meyers
More Effective C++ - Scott Meyers
C++ Coding Standards - Sutter & Alexandrescu
C++ FAQs - Cline
Reading these books has helped me more than anything else to avoid the kind of pitfalls you are asking about.
Pitfalls in decreasing order of their importance
First of all, you should visit the award winning C++ FAQ. It has many good answers to pitfalls. If you have further questions, visit ##c++ on irc.freenode.org in IRC. We are glad to help you, if we can. Note that all the following pitfalls are original writing. They are not just copied from random sources.
delete[] on new, delete on new[]
Solution: Doing the above results in undefined behavior: anything could happen. Understand your code and what it does, and always delete[] what you new[], and delete what you new, then that won't happen.
Exception:
typedef T type[N]; T * pT = new type; delete[] pT;
You need to delete[] even though you new, since you new'ed an array. So if you are working with typedef, take special care.
Calling a virtual function in a constructor or destructor
Solution: Calling a virtual function from a constructor or destructor won't call the overriding functions in the derived classes. Calling a pure virtual function in a constructor or destructor is undefined behavior.
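A minimal sketch of this behavior, with made-up class and member names: the base constructor records what the virtual call returned while the Derived part did not exist yet.

```cpp
#include <cassert>
#include <string>

struct Base {
    std::string seen;                 // what name() returned during construction
    Base() { seen = name(); }         // resolves to Base::name, even for a Derived object
    virtual ~Base() {}
    virtual std::string name() const { return "Base"; }
};

struct Derived : Base {
    std::string name() const override { return "Derived"; }
};

void check_virtual_in_constructor() {
    Derived d;
    assert(d.seen == "Base");         // the override was not used in Base()
    assert(d.name() == "Derived");    // after construction, dispatch is as usual
}
```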
Calling delete or delete[] on an already deleted pointer
Solution: Assign 0 to every pointer you delete. Calling delete or delete[] on a null-pointer does nothing.
Taking the sizeof of a pointer, when the number of elements of an 'array' is to be calculated.
Solution: Pass the number of elements alongside the pointer when you need to pass an array as a pointer into a function. Use the function proposed here if you take the sizeof of an array that is supposed to be really an array.
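One common form of such a helper is sketched below; it is not necessarily the function the answer originally linked to, and the names array_size and sum are assumptions for this example. Because the parameter is a reference to an array, passing a pointer fails to compile instead of silently giving a wrong count.

```cpp
#include <cassert>
#include <cstddef>

// Deduces N from a real array; does not accept pointers.
template <typename T, std::size_t N>
std::size_t array_size(T (&)[N]) {
    return N;
}

// A function receiving a pointer must be told the element count explicitly.
int sum(const int* data, std::size_t count) {
    int total = 0;
    for (std::size_t i = 0; i < count; ++i) {
        total += data[i];
    }
    return total;
}

int sum_of_demo_array() {
    int values[] = {1, 2, 3, 4, 5};
    assert(array_size(values) == 5);
    return sum(values, array_size(values));   // count passed alongside the pointer
}
```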
Using an array as if it were a pointer. Thus, using T ** for a two-dimensional array.
Solution: See here for why they are different and how you handle them.
Writing to a string literal: char * c = "hello"; *c = 'B';
Solution: Allocate an array that is initialized from the data of the string literal, then you can write to it:
char c[] = "hello"; *c = 'B';
Writing to a string literal is undefined behavior. Moreover, the above conversion from a string literal to char * is deprecated in C++03 and ill-formed since C++11, so compilers will probably warn if you increase the warning level.
Creating resources, then forgetting to free them when something throws.
Solution: Use smart pointers like std::unique_ptr or std::shared_ptr as pointed out by other answers.
Modifying an object twice like in this example: i = ++i;
Solution: The above was supposed to assign to i the value of i+1, but under the C++03 rules its behavior is undefined. Instead of merely incrementing i and assigning the result, it modifies i on the right-hand side as well. Modifying an object more than once between two sequence points is undefined behavior. Sequence points include ||, &&, the comma operator, the semicolon and entering a function (a non-exhaustive list!). Change the code to the following to make it behave correctly: i = i + 1;
Misc Issues
Forgetting to flush streams before calling a blocking function like sleep.
Solution: Flush the stream by streaming either std::endl instead of \n or by calling stream.flush();.
Declaring a function instead of a variable.
Solution: The issue arises because the compiler interprets for example
Type t(other_type(value));
as a function declaration of a function t returning Type and having a parameter of type other_type which is called value. You solve it by putting parentheses around the first argument. Now you get a variable t of type Type:
Type t((other_type(value)));
Calling the function of a free object that is only declared in the current translation unit (.cpp file).
Solution: The standard doesn't define the order of creation of free objects (at namespace scope) defined across different translation units. Calling a member function on an object not yet constructed is undefined behavior. You can define the following function in the object's translation unit instead and call it from other ones:
House & getTheHouse() { static House h; return h; }
That would create the object on demand and leave you with a fully constructed object at the time you call functions on it.
Defining a template in a .cpp file, while it's used in a different .cpp file.
Solution: Almost always you will get errors like undefined reference to .... Put all the template definitions in a header, so that when the compiler is using them, it can already produce the code needed.
static_cast<Derived*>(base); if base is a pointer to a virtual base class of Derived.
Solution: A virtual base class is a base which occurs only once, even if it is inherited more than once by different classes indirectly in an inheritance tree. Doing the above is not allowed by the Standard. Use dynamic_cast to do that, and make sure your base class is polymorphic.
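A short sketch of such a hierarchy (all class names here are illustrative): Base is a virtual base of Derived, so the downcast has to go through dynamic_cast.

```cpp
#include <cassert>

struct Base { virtual ~Base() {} };   // polymorphic, as dynamic_cast requires
struct Left : virtual Base {};
struct Right : virtual Base {};
struct Derived : Left, Right {};      // contains exactly one shared Base subobject

Derived* to_derived(Base* base) {
    // static_cast<Derived*>(base) would be rejected by the compiler here,
    // because Base is a virtual base class of Derived.
    return dynamic_cast<Derived*>(base);
}

void check_virtual_base_downcast() {
    Derived d;
    Base* b = &d;
    assert(to_derived(b) == &d);      // succeeds: b really points into a Derived

    Base plain;
    assert(to_derived(&plain) == nullptr);   // fails cleanly for a plain Base
}
```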
dynamic_cast<Derived*>(ptr_to_base); if base is non-polymorphic
Solution: The standard doesn't allow a downcast of a pointer or reference when the object passed is not polymorphic. It or one of its base classes has to have a virtual function.
Making your function accept T const **
Solution: You might think that's safer than using T **, but actually it will cause headaches for people who want to pass a T**: the standard doesn't allow that conversion. It gives a neat example of why it is disallowed:
int main() {
    char const c = 'c';
    char* pc;
    char const** pcc = &pc; //1: not allowed
    *pcc = &c;
    *pc = 'C'; //2: modifies a const object
}
Always accept T const* const* instead.
Another (closed) pitfalls thread about C++, so people looking for them will find them, is Stack Overflow question C++ pitfalls.
Some must have C++ books that will help you avoid common C++ pitfalls:
Effective C++
More Effective C++
Effective STL
The Effective STL book explains the vector of bools issue :)
Brian has a great list: I'd add "Always mark single argument constructors explicit (except in those rare cases you want automatic casting)."
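As a hedged illustration of that advice (Buffer and buffer_size are hypothetical names), marking the constructor explicit makes the compiler reject the accidental conversion from a number to an object:

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

struct Buffer {
    explicit Buffer(std::size_t n) : size(n) {}   // no implicit size_t -> Buffer
    std::size_t size;
};

std::size_t buffer_size(const Buffer& b) {
    return b.size;
}

// With explicit, a number is not implicitly convertible to a Buffer.
static_assert(!std::is_convertible<std::size_t, Buffer>::value,
              "explicit constructor must block implicit conversion");

void check_explicit_constructor() {
    // buffer_size(16);                      // would not compile
    assert(buffer_size(Buffer(16)) == 16);   // the conversion must be spelled out
}
```

Without the explicit keyword, buffer_size(16) would compile and quietly build a temporary Buffer.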
Not really a specific tip, but a general guideline: check your sources. C++ is an old language, and it has changed a lot over the years. Best practices have changed with it, but unfortunately there's still a lot of old information out there. There have been some very good book recommendations on here - I can second buying every one of Scott Meyers C++ books. Become familiar with Boost and with the coding styles used in Boost - the people involved with that project are on the cutting edge of C++ design.
Do not reinvent the wheel. Become familiar with the STL and Boost, and use their facilities whenever possible instead of rolling your own. In particular, use STL strings and collections unless you have a very, very good reason not to. Get to know auto_ptr and the Boost smart pointers library very well, understand under which circumstances each type of smart pointer is intended to be used, and then use smart pointers everywhere you might otherwise have used raw pointers. Your code will be just as efficient and a lot less prone to memory leaks.
Use static_cast, dynamic_cast, const_cast, and reinterpret_cast instead of C-style casts. Unlike C-style casts they will let you know if you are really asking for a different type of cast than you think you are asking for. And they stand out visually, alerting the reader that a cast is taking place.
The web page C++ Pitfalls by Scott Wheeler covers some of the main C++ pitfalls.
Two gotchas that I wish I hadn't learned the hard way:
(1) A lot of output (such as printf) is buffered by default. If you're debugging crashing code, and you're using buffered debug statements, the last output you see may not really be the last print statement encountered in the code. The solution is to flush the buffer after each debug print (or turn off the buffering altogether).
(2) Be careful with initializations - (a) avoid class instances as globals / statics; and (b) try to initialize all your member variables to some safe value in a ctor, even if it's a trivial value such as NULL for pointers.
Reasoning: the ordering of global object initialization is not guaranteed (globals includes static variables), so you may end up with code that seems to fail nondeterministically since it depends on object X being initialized before object Y. If you don't explicitly initialize a primitive-type variable, such as a member bool or enum of a class, you'll end up with different values in surprising situations -- again, the behavior can seem very nondeterministic.
I've already mentioned it a few times, but Scott Meyers' books Effective C++ and Effective STL are really worth their weight in gold for helping with C++.
Come to think of it, Steven Dewhurst's C++ Gotchas is also an excellent "from the trenches" resource. His item on rolling your own exceptions and how they should be constructed really helped me in one project.
Using C++ like C. Having a create-and-release cycle in the code.
In C++, this is not exception safe and thus the release may not be executed. In C++, we use RAII to solve this problem.
All resources that have a manual create and release should be wrapped in an object so these actions are done in the constructor/destructor.
// C Code
void myFunc()
{
Plop* plop = createMyPlopResource();
// Use the plop
releaseMyPlopResource(plop);
}
In C++, this should be wrapped in an object:
// C++
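class PlopResource
{
public:
    PlopResource()
    {
        mPlop = createMyPlopResource();
        // handle exceptions and errors.
    }
    ~PlopResource()
    {
        releaseMyPlopResource(mPlop);
    }
    // Non-copyable (C++11 syntax): a copy would release the same Plop twice.
    PlopResource(const PlopResource&) = delete;
    PlopResource& operator=(const PlopResource&) = delete;
private:
    Plop* mPlop;
};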
void myFunc()
{
PlopResource plop;
// Use the plop
// Exception safe release on exit.
}
The book C++ Gotchas may prove useful.
Here are a few pits I had the misfortune to fall into. All these have good reasons which I only understood after being bitten by behaviour that surprised me.
virtual functions in constructors aren't.
Don't violate the ODR (One Definition Rule), that's what anonymous namespaces are for (among other things).
Order of initialization of members depends on the order in which they are declared.
class bar {
vector<int> vec_;
unsigned size_; // Note size_ declared *after* vec_
public:
bar(unsigned size)
: size_(size)
, vec_(size_) // size_ is uninitialized
{}
};
Default values and virtual have different semantics.
class base {
public:
    virtual void foo(int i = 42) { cout << "base " << i; }
};
class derived : public base {
public:
    virtual void foo(int i = 12) { cout << "derived " << i; }
};
derived d;
base& b = d;
b.foo(); // Outputs `derived 42`
The most important pitfall for beginning developers to avoid is confusing C and C++. C++ should never be treated as a mere better C or C with classes, because this prunes its power and can even make it dangerous (especially when using memory as in C).
Check out boost.org. It provides a lot of additional functionality, especially their smart pointer implementations.
PRQA have an excellent and free C++ coding standard based on books from Scott Meyers, Bjarne Stroustrup and Herb Sutter. It brings all this information together in one document.
Not reading the C++ FAQ Lite. It explains many bad (and good!) practices.
Not using Boost. You'll save yourself a lot of frustration by taking advantage of Boost where possible.
Be careful when using smart pointers and container classes.
Avoid pseudo classes and quasi classes... Overdesign basically.
Forgetting to declare a base class destructor virtual. Calling delete on a Base* that points to a derived object is then undefined behavior; in practice the derived part usually doesn't get destroyed.
Keep the name spaces straight (including struct, class, namespace, and using). That's my number-one frustration when the program just doesn't compile.
To mess up, use straight pointers a lot. Instead, use RAII for almost anything, making sure of course that you use the right smart pointers. If you write "delete" anywhere outside a handle or pointer-type class, you're very likely doing it wrong.
Read the book C++ Gotchas: Avoiding Common Problems in Coding and Design.
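To illustrate the item above about virtual base class destructors, here is a minimal sketch with invented names. Because ~Base() is virtual, deleting through a Base* also runs ~Derived(); the flag only exists to make that observable.

```cpp
#include <cassert>

struct Base {
    virtual ~Base() {}                 // virtual: delete through Base* is safe
};

struct Derived : Base {
    explicit Derived(bool& flag) : destroyed(flag) {}
    ~Derived() override { destroyed = true; }   // records that it ran
    bool& destroyed;
};

// Returns true if deleting through a Base* ran the Derived destructor.
bool derived_destructor_runs() {
    bool flag = false;
    Base* p = new Derived(flag);
    delete p;                          // calls ~Derived(), then ~Base()
    return flag;
}
```

If the virtual keyword were removed from ~Base(), the delete would be undefined behavior rather than merely skipping ~Derived().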
Blizpasta. That's a huge one I see a lot...
Uninitialized variables are a huge mistake that students of mine make. A lot of Java folks forget that just saying "int counter" doesn't set counter to 0. Since you have to define variables in the h file (and initialize them in the constructor/setup of an object), it's easy to forget.
Off-by-one errors on for loops / array access.
Not properly cleaning object code when voodoo starts.
static_cast downcast on a virtual base class
Not really... Now about my misconception: I thought that B in the following was a virtual base class when in fact it's not; according to 10.3.1, it is a polymorphic class. Using static_cast here seems to be fine.
struct B { virtual ~B() {} };
struct D : B { };
In summary, yes, this is a dangerous pitfall.
Always check a pointer before you dereference it. In C, you could usually count on a crash at the point where you dereference a bad pointer; in C++, you can create an invalid reference which will crash at a spot far removed from the source of the problem.
class SomeClass
{
...
void DoSomething()
{
++counter; // crash here!
}
int counter;
};
void Foo(SomeClass & ref)
{
...
ref.DoSomething(); // if DoSomething is virtual, you might crash here
...
}
void Bar(SomeClass * ptr)
{
Foo(*ptr); // if ptr is NULL, you have created an invalid reference
// which probably WILL NOT crash here
}
Forgetting an & and thereby creating a copy instead of a reference.
This happened to me twice in different ways:
One instance was in an argument list, which caused a large object to be put on the stack with the result of a stack overflow and crash of the embedded system.
I forgot the & on an instance variable, with the effect that the object was copied. After registering as a listener to the copy I wondered why I never got the callbacks from the original object.
Both were rather hard to spot, because the difference is small and hard to see, and otherwise objects and references are used syntactically in the same way.
Intention is (x == 10):
if (x = 10) {
//Do something
}
I thought I would never make this mistake myself, but I actually did it recently.
The essay/article Pointers, references and Values is very useful. It talks about avoiding pitfalls and good practices. You can browse the whole site too, which contains programming tips, mainly for C++.
I spent many years doing C++ development. I wrote a quick summary of problems I had with it years ago. Standards-compliant compilers are not really a problem anymore, but I suspect the other pitfalls outlined are still valid.
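#include <boost/shared_ptr.hpp>
class A {
public:
    void nuke() {
        // The temporary shared_ptr takes ownership of this and calls
        // delete on it when the temporary is destroyed.
        boost::shared_ptr<A> (this);
    }
};
int main(int argc, char** argv) {
    A a;        // a lives on the stack and was never allocated with new
    a.nuke();   // undefined behavior: delete on a stack object
    return(0);
}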