C++ style vs. performance?

C++ style vs. performance: is it bad practice to use C-style constructs that are faster than some C++ equivalents? For example:
Don't use atoi(), itoa(), atol(), etc.! Use std::stringstream <- probably sometimes it's better, but always? What's so bad about using the C functions? Yes, it's C-style, not C++, but so what? This is C++; we're looking for performance all the time.
Never use raw pointers, use smart pointers instead - OK, they're really useful, everyone knows that, I know that, I use them all the time and I know how much better they are than raw pointers, but sometimes it's completely safe to use raw pointers. Why not? "Not C++ style"? <- is that enough?
Don't use bitwise operations - too C-style? WTH? Why not, when you're sure what you're doing? For example: don't do a bitwise exchange of variables ( a ^= b; b ^= a; a ^= b; ); use the standard 3-step exchange. Don't use left-shift for multiplying by two. Etc., etc. (OK, that's not C++ style vs. C style, but it's still called "not good practice".)
And finally, the most expensive: "Don't use enums to return codes, it's too C-style, use exceptions for different errors"? Why? OK, when we're talking about error handling on deep levels, fine, but why always? What's so wrong with this, for example: a function that returns different error codes, where the error handling is implemented only in the function that calls it? I mean, there's no need to pass the error codes to an upper level. Exceptions are rather slow, and they're exceptions, for exceptional situations, not for... beauty.
etc., etc., etc.
Okay, I know that good coding style is very, very important <- the code should be easy to read and understand. I know that there's no need for micro-optimizations, as modern compilers are very smart and compiler optimizations are very powerful. But I also know how expensive exception handling is, how (some) smart pointers are implemented, and that there's no need for smart_ptr all the time. I know that, for example, atoi is not as "safe" as std::stringstream is, but still... what about performance?
EDIT: I'm not talking about some really hard things that are only C-specific. I mean: don't be surprised by function pointers, virtual methods, and that kind of stuff, which a C++ programmer may not know if he's never used such things (while C programmers use them all the time). I'm talking about more common and easy things, such as in the examples.

In general, the thing you're missing is that the C way often isn't faster. It just looks more like a hack, and people often think hacks are faster.
Never use raw pointers, use smart pointers instead - OK, they're really useful, everyone knows that, I know that, I use them all the time and I know how much better they are than raw pointers, but sometimes it's completely safe to use raw pointers. Why not?
Let's turn the question on its head. Sometimes it's safe to use raw pointers. Is that alone a reason to use them? Is there anything about raw pointers that is actually superior to smart pointers? It depends. Some smart pointer types are slower than raw pointers. Others aren't. What is the performance rationale for using a raw pointer over a std::unique_ptr or a boost::scoped_ptr? Neither of them has any overhead; they just provide safer semantics.
This isn't to say that you should never use raw pointers. Just that you shouldn't do it just because you think you need performance, or just because "it seems safe". Do it when you need to represent something that smart pointers can't. As a rule of thumb, use pointers to point to things, and smart pointers to take ownership of things. But it's a rule of thumb, not a universal rule. Use whichever fits the task at hand. But don't blindly assume that raw pointers will be faster. And when you use smart pointers, be sure you are familiar with them all. Too many people just use shared_ptr for everything, and that is just awful, both in terms of performance and the very vague shared ownership semantics you end up applying to everything.
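To make the zero-overhead claim concrete, here is a minimal sketch (the Widget type and function names are invented for illustration): a std::unique_ptr owns the object, while a raw pointer is handed out purely as a non-owning view.

#include <iostream>
#include <memory>

struct Widget { int value = 42; };

// A non-owning observer: raw pointers are fine for pointing at things.
void inspect(const Widget* w) {
    if (w) std::cout << w->value << '\n';
}

int main() {
    // The unique_ptr owns the Widget; it compiles down to the same
    // pointer access as a raw pointer, plus a guaranteed delete.
    std::unique_ptr<Widget> owner = std::make_unique<Widget>();
    inspect(owner.get());   // hand out a raw, non-owning view
    // No delete needed: the destructor runs when owner goes out of scope.
}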
Don't use bitwise operations - too C-style? WTH? Why not, when you're sure what you're doing? For example: don't do a bitwise exchange of variables ( a ^= b; b ^= a; a ^= b; ); use the standard 3-step exchange. Don't use left-shift for multiplying by two. Etc., etc. (OK, that's not C++ style vs. C style, but it's still called "not good practice".)
That one is correct. And the reason is precisely "it's faster": the plain version is the faster one. Bitwise exchange is problematic in many ways:
it is slower on a modern CPU
it is more subtle and easier to get wrong
it works with a very limited set of types
And when multiplying by two, multiply by two. The compiler knows about this trick, and will apply it if it is faster. And once again, shifting has many of the same problems. It may, in this case, be faster (which is why the compiler will do it for you), but it is still easier to get wrong, and it works with a limited set of types. In particular, it might compile fine with types that you think it is safe to do this trick with... and then blow up in practice. Bit shifting on negative values, especially, is a minefield. Let the compiler navigate it for you.
Incidentally, this has nothing to do with "C style". The exact same advice applies in C. In C, a regular swap is still faster than the bitwise hack, and bitshifting instead of a multiply will still be done by the compiler if it is valid and if it is faster.
But as a programmer, you should use bitwise operations for one thing only: to do bitwise manipulation of integers. You've already got a multiplication operator, so use that when you want to multiply. And you've also got a std::swap function. Use that if you want to swap two values. One of the most important tricks when optimizing is, perhaps surprisingly, to write readable, meaningful code. That allows your compiler to understand the code and optimize it. std::swap can be specialized to do the most efficient exchange for the particular type it's used on. And the compiler knows several ways to implement multiplication, and will pick the fastest one depending on circumstance... If you tell it to. If you tell it to bit shift instead, you're just misleading it. Tell it to multiply, and it will give you the fastest multiply it has.
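As a rough illustration of "say what you mean", a minimal sketch: both lines below are the readable versions, and the compiler remains free to specialize the swap per type or emit a shift for the multiply if that is actually faster on the target.

#include <utility>

void demo() {
    int a = 3, b = 5;
    std::swap(a, b);    // readable, and specializable for the type at hand

    // Say "multiply": the compiler will emit a shift (or whatever else)
    // itself if that is the fastest multiply available.
    int c = a * 8;
    (void)c;
}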
And finally, the most expensive: "Don't use enums to return codes, it's too C-style, use exceptions for different errors"?
Depends on who you ask. Most C++ programmers I know of find room for both. But keep in mind that one unfortunate thing about return codes is that they're easily ignored. If that is unacceptable, then perhaps you should prefer an exception in this case. Another point is that RAII works better together with exceptions, and a C++ programmer should definitely use RAII wherever possible. Unfortunately, because constructors can't return error codes, exceptions are often the only way to indicate errors.
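A minimal sketch of that last point, assuming a C-style resource like FILE* (the class is invented for illustration): a constructor has no way to return an error code, so throwing is the natural way to report a failed acquisition, and the destructor releases the resource on every exit path.

#include <cstdio>
#include <stdexcept>

// RAII wrapper: a constructor cannot return an error code,
// so it throws when acquiring the resource fails.
class File {
public:
    explicit File(const char* path) : f_(std::fopen(path, "r")) {
        if (!f_) throw std::runtime_error("cannot open file");
    }
    ~File() { std::fclose(f_); }   // released on every exit path
    File(const File&) = delete;
    File& operator=(const File&) = delete;
private:
    std::FILE* f_;
};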
but still... what about performance?
What about it? Any decent C programmer would be happy to tell you not to optimize prematurely.
Your CPU can execute perhaps 8 billion instructions per second. If you make two calls to a std::stringstream in that second, is that going to make a measurable dent in the budget?
You can't predict performance. You can't make up a coding guideline that will result in fast code. Even if you never throw a single exception, and never ever use stringstream, your code still won't automatically be fast. If you try to optimize while you write the code, then you're going to spend 90% of the effort optimizing the 90% of the code that is hardly ever executed. In order to get a measurable improvement, you need to focus on the 10% of the code that makes up 95% of the execution time. Trying to make everything fast just results in a lot of wasted time with little to show for it, and a much uglier code base.

I'd advise against atoi and atol as a rule, but not just on style grounds. They make it essentially impossible to detect input errors. While a stringstream can do the same job, strtol (for one example) is what I'd usually advise as the direct replacement.
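For example, a small sketch of the kind of wrapper strtol makes possible (the helper name is made up); atoi can report neither of these failure cases:

#include <cerrno>
#include <cstdlib>

// Unlike atoi, strtol can report both garbage input and overflow.
bool parse_long(const char* s, long& out) {
    char* end = nullptr;
    errno = 0;
    long v = std::strtol(s, &end, 10);
    if (end == s || *end != '\0') return false;  // no digits, or trailing junk
    if (errno == ERANGE) return false;           // value out of range for long
    out = v;
    return true;
}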
I'm not sure who's giving that advice. Use smart pointers when they're helpful, but when they're not, there's nothing wrong with using a raw pointer.
I really have no idea who thinks it's "not good practice" to use bitwise operators in C++. Unless there were some specific conditions attached to that advice, I'd say it was just plain wrong.
This one depends heavily on where you draw the line between an exceptional input, and (for example) an input that's expected, but not usable. Generally speaking, if you're accepting input direct from the user, you can't (and shouldn't) classify anything as truly exceptional. The main good point of exceptions (even in a situation like this) is ensuring that errors aren't just ignored. OTOH, I don't think that's always the sole criterion, so you can't say it's the right way to handle every situation.
All in all, it sounds to me like you've gotten some advice that's dogmatic to the point of ignoring reality. It's probably best ignored or at least viewed as one rather extreme position about how C++ could be written, not necessarily how it always (or ever, necessarily) should be written.

Adding to @Jerry Coffin's answer, which I think is extremely useful, I would like to present some subjective observations.
The thing is that programmers tend to get fancy. That is, most of us really like writing fancy code just for the sake of it. This is perfectly fine as long as you are doing the project on your own. Remember, good software is software whose binary works as expected, not software whose source code is clean. However, when it comes to larger projects which are developed and maintained by lots of people, it is economically better to write simpler code, so that no one on the team loses time understanding what you meant - even at a (naturally minor) runtime cost. That's why many people, including myself, would discourage using the xor trick instead of plain assignment (you may be surprised, but there are a great many programmers out there who haven't heard of the xor trick). The xor trick works only for integers anyway, and the traditional way of swapping integers is very fast anyway, so using the xor trick is just being fancy.
Using itoa, atoi, etc. instead of streams is faster. Yes, it is. But how much faster? Not much. Unless most of your program does nothing but convert between text and numbers, you won't notice the difference. Why do people use itoa, atoi, etc.? Well, some of them do because they are unaware of the C++ alternative. Another group does because it's just one LOC. For the former group: shame on you. For the latter: why not boost::lexical_cast?
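For reference, a sketch of the boost::lexical_cast one-liner, assuming Boost is available; it throws boost::bad_lexical_cast on malformed input:

#include <boost/lexical_cast.hpp>
#include <string>

int main() {
    // One line each way; bad input raises boost::bad_lexical_cast.
    int n = boost::lexical_cast<int>("42");
    std::string s = boost::lexical_cast<std::string>(n * 2);
    return s == "84" ? 0 : 1;
}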
Exceptions... ah... yeah, they can be slower than return codes, but in most cases not really. Return codes can carry information which is not an error. Exceptions should be used to report severe errors, ones which cannot be ignored. Some people forget about this and use exceptions to simulate some weird signal/slot mechanism (believe me, I have seen it, yuck). My personal opinion is that there is nothing wrong with using return codes, but severe errors should be reported with exceptions, unless the profiler has shown that refraining from them would considerably boost performance.
raw pointers - My own opinion is this: never use smart pointers when it's not about ownership. Always use smart pointers when it's about ownership. Naturally with some exceptions.
Bit-shifting instead of multiplication by powers of two. This, I believe, is a classic example of premature optimization. x << 3; I bet at least 25% of your co-workers will need some time to realize this means x * 8. Obfuscated (at least for that 25%) code, for what exact reason? Again, if the profiler tells you this is the bottleneck (which will happen only in extremely rare cases), then green light, go ahead and do it (leaving a comment that this in fact means x * 8).
To sum it up: a good professional acknowledges the "good styles", understands why and when they are good, and rightfully makes exceptions because he knows what he's doing. Average/bad professionals fall into two types: the first type doesn't acknowledge good style and doesn't even understand what it is or why it matters; fire them. The other type treats style as a dogma, which is not always good.

What's a best practice? Wikipedia's words are better than mine would be:
A best practice is a technique, method, process, activity, incentive, or reward which conventional wisdom regards as more effective at delivering a particular outcome than any other technique, method, process, etc. when applied to a particular condition or circumstance.
[...]
A given best practice is only applicable to particular condition or circumstance and may have to be modified or adapted for similar circumstances. In addition, a "best" practice can evolve to become better as improvements are discovered.
I believe there is no such thing as universal truth in programming: if you think that something is a better fit in your situation than a so-called "best practice", then do what you believe is right, but know perfectly well why you do it (i.e. prove it with numbers).

Functions with mutable char* arguments are bad in C++ because it's too difficult to handle their memory manually when we have alternatives. They aren't generic: we can't easily switch from char to wchar_t the way basic_string allows. Also, lexical_cast is a more direct replacement for atoi and itoa.
If you don't really need smartness of a smart pointer - don't use it.
To swap, use swap. Use bitwise operations only for bitwise operations: checking/setting/inverting flags, etc.
Exceptions are fast. They allow you to remove error-checking branches, so if errors really "never happen", they increase performance.
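A minimal sketch of that idea, with hypothetical helper functions: the happy path is straight-line code with no per-call error branches, and failures propagate on their own.

#include <iostream>
#include <stdexcept>

// Hypothetical steps that throw on failure instead of returning codes.
void do_a() { /* succeeds */ }
void do_b() { throw std::runtime_error("b failed"); }

int main() {
    try {
        do_a();   // the happy path is straight-line code: no
        do_b();   // "if (rc != OK) return rc;" after every call
    } catch (const std::exception& e) {
        std::cerr << e.what() << '\n';   // failures land here on their own
        return 1;
    }
}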

Multiplication by bit-shifting doesn't improve performance in C; the compiler will do that for you. Just be sure you're multiplying or dividing by 2^n values if you expect that optimization.
Bitwise swapping is also something that'll probably just confuse your compiler.
I'm not very experienced with string handling in C++, but from what I know, it's hard to believe it's more flexible than scanf and printf.
Also, these "you should never" statements, I generally regard them as recommendations.

All of your questions are a priori. What I mean is, you are asking them in the abstract, not in the context of any specific program whose performance is your concern.
That's like trying to swim without being in water.
If you do tuning on a specific concrete program, you will find performance problems, and chances are they will have almost nothing whatever to do with these abstract questions. They will most likely all be things you could not have thought of a priori.
For a specific example of this, look here.
If I could generalize from experience, a major source of performance problems is galloping generality.
That is, while data structure abstraction is generally considered a good thing, any good thing can be massively over-used, and then it becomes a crippling bad thing. This is not rare. In my experience it is typical.

I think you're answering big parts of your question on your own. I personally prefer easy-to-read code (even if you understand the C style, maybe the next person to read your code will have more trouble with it) and safe code (which suggests stringstream, exceptions, smart pointers...).
If you really have something where it makes sense to consider bitwise operations - ok. But often I see C programmers use a char instead of a couple of bools. I do NOT like this.
Speed is important, but most of the time is usually spent in a few hotspots of a program. So unless you measure that some technique is a problem (or you know pretty surely that it will become one), I would rather use what you call C++ style.

Why is the expensiveness of exceptions an argument at all? Exceptions are exceptions because they are rare. Their performance doesn't influence overall performance. The steps you have to take to make your code exception-safe do not influence the performance either. And on the other hand, exceptions are convenient and flexible.

This is not really an "answer", but if you work on a project where performance is important (e.g. embedded/games), people usually do the faster C way instead of the slower C++ way in the cases you described.
The exception may be bitwise operations, where not as much is gained as you might think. For example, "Don't use left-shift for multiplying by two." A half-way decent compiler will generate the same code for x << 1 and x * 2.

Related

Am I casting integer types too much?

I'm very new to Rust and I've been re-solving Project Euler questions. The thing is, I realised that I kept casting integer types (mainly i32 and i64) around to fit my statements: for iterators, in loops, for function inputs, for conditionals, etc. Is that normal?
I'm guessing I'm doing something wrong, working with one-off functions going through PE and coming from mainly dynamically typed languages.
I always try to use the smallest possible (or most feasible) integral type for the problem and I feel like I should just go with i64 for everything and be done with it instead of so much casting around.
Which is the better/recommended approach: a blanket i64 type, or sensible integer types with casting in the code?
edit: After the comments I wanted to clarify that this is not exactly a code-review query but a question about best practices and readability concerns, as in which of the two choices is preferred. I reckon the performance impact of casting is negligible when not misused in loops.
Unrelated PS: I was doing P4 with a twist of prime factors, turns out there are no palindromes that are a product of two 4-digit primes and the largest one from two 3-digit primes is 99899
I always try to use the smallest possible (or most feasible) int type for the problem
Here you are. Optimization. (Because why else would you do that?)
Rust encourages you to think about the integer types. This leads to a better defined and explicit program behaviour, helps catch a certain type of bugs and lets you optimize.
But coming from a language that wasn't that meticulous, you're likely to overdo it. That's how our minds often work: when we encounter some new ability or skill, we try to use it everywhere.
This is often a problem with Rust. The language introduces programmers to certain new concepts (safe borrowing, zero-cost futures), and then, as people do, we rush in and jumble the things we have little experience with, borrowing ourselves into a corner or using futures absolutely everywhere.
You're trying to optimize your program, using the smallest possible integer type, even though you feel it makes your life less comfortable. If you're asking this question on stackoverflow then I'd go out on a limb and say, yes, you're casting too much!
You probably know Knuth's motto, that "premature optimization is the root of all evil".
Don't let over-optimizing ruin your programming experience. Optimize only when you're comfortable with it. When you want to!
If you're afraid of premature pessimization then save this fear for the algorithms. Remember that programming in most dynamic languages you don't have that much freedom to over-optimize and yet your dynamic language programs still work.

Does C++ code run faster if there is no structure in the program

I know it helps a lot if we structure our programs using classes, structs, etc., but does it help in terms of running speed if we avoid these structures and write code plainly, in terms of basic C++ syntax?
For example, I am trying to write a program that works on vectors. Now it sounds tempting to write a class vector and define its methods, like set_at_index(int i), which sets the value of a specific row i of this vector. Furthermore, I can check whether i<=N, where N is the length of the vector in question.
My confusion is that with this routine, every call to the heavily used set_at_index method will require one 'if' statement. So if I want my code to run faster, should I avoid it, declare a plain array instead, and manually take care that there is no memory leak?
Is there any way I can check for the memory leaks without putting burden on the code speed?
Yes, bounds checking will take slightly more time. But it will take so little extra time that it will only matter if the code is being run 28894389375 times and then it might add up to a millisecond. Note that std::vector only performs bounds checking if you use the at member function, not if you use operator[]. Also, if you are doing anything like writing to a file or printing text to the console, doing that one time will likely take more time than ten million bounds-checked array accesses, because I/O is relatively very very slow.
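To illustrate the difference, a small sketch: operator[] is unchecked and free, while at() pays for the bounds test and throws on a bad index.

#include <iostream>
#include <stdexcept>
#include <vector>

int main() {
    std::vector<int> v(10, 0);
    v[3] = 42;          // operator[]: no bounds check, no overhead
    try {
        v.at(99) = 7;   // at(): bounds-checked, throws on a bad index
    } catch (const std::out_of_range& e) {
        std::cerr << e.what() << '\n';
    }
}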
Typically, without bounds checking code using classes will run at the same speed as code using plain arrays. The problem with manually managing memory like you suggest is that it's easy to forget to clean it up, or to clean it up only through one path of execution through the program, or to fail to clean it up in the event of an exception. It's really hardly ever worth it. Also, it'll be just as fast to use a vector class without bounds checking as it will be to use a dynamic array without bounds checking. You pay for it either way.
I also suggest using std::vector instead of writing your own vector class, since the implementers have done pretty much every optimisation you could do yourself, and they usually have the advantage of writing the code for their specific compiler, so they can take advantage of things only that compiler does, because they know more of its implementation. The STL classes are also rigorously tested and written by experts (usually).
You should write your code first, then measure with a profiler to see the bottlenecks in your code if it is not fast enough already, then optimise the bottlenecks. I will bet that bounds checking on arrays is probably not going to be one of those bottlenecks.
Checking for memory leaks can be done with a tool like valgrind. You don't do it in the code itself.
Don't try to over optimize before you even start writing. Go ahead and write code that is easily maintainable, readable, and as bug free as possible. Once you have things working, you can start profiling to see the real bottlenecks.
"Premature optimization is root of all evils" - Donald Knuth. (this is true 97% of the time).
Unless you profile your application and see that your class encapsulation is a bottleneck that slows your application by a significant amount, don't hesitate to use high-level structures. They will bring you plenty of benefits, like readability, maintainability, and understanding what you do. That's what OOP is for: big-scale programs.
Some good answers have already been posted, and premature optimization is indeed inadvisable as others have said. However, let me put your question in slightly another light.
I know it helps a lot if we structure our programs using classes, structs, etc., but does it help in terms of running speed if we avoid these structures and write code plainly, in terms of basic C++ syntax?
Theoretically, most properly written C++ code should run just as fast with fully developed classes as without, but
there are exceptions to the rule;
the effort required to write the C++ code theoretically properly may be too great; and
the same features of the C++ compiler that make it hard to write incorrect code can make it all too easy to write grossly inefficient code.
Point-by-point remarks follow.
Consider a complex three-dimensional vector type, of which each instance consists of six doubles (three real parts and three imaginary parts). If there were not so many doubles, your compiler might load them directly into your microprocessor's registers, but with six they are likely to remain on the stack when the complex three-dimensional vector is loaded. Some operations however on a complex three-dimensional vector do not require all six doubles, but only one, two or three of them. If so, then it might be preferable to store the six floating-point components separately. Thus, rather than an array of 1000 vectors, you'd keep six arrays of 1000 doubles each. Of course, one can (and probably should) bind the arrays together in a class of some kind, but -- for efficiency reasons only -- a good design might never explicitly associate individual elements from one array to another.
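A minimal sketch of the two layouts described above (type names invented for illustration): an array of structs versus a struct of arrays bound together in one class.

#include <cstddef>
#include <vector>

// Array of structs: each element carries all six doubles.
struct Vec3c { double re[3]; double im[3]; };

// Struct of arrays: six parallel arrays bound together in one class.
// An operation touching only the real parts streams through contiguous
// memory without dragging the imaginary parts along.
struct Vec3cArray {
    std::vector<double> re_x, re_y, re_z, im_x, im_y, im_z;
    explicit Vec3cArray(std::size_t n)
        : re_x(n), re_y(n), re_z(n), im_x(n), im_y(n), im_z(n) {}
};

int main() {
    std::vector<Vec3c> aos(1000);  // 1000 vectors, interleaved
    Vec3cArray soa(1000);          // 6 arrays of 1000 doubles each
    (void)aos; (void)soa;
}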
Sometimes, you know where your data is and what you want to do with it, and C++'s elaborate organizational and access-control facilities only get in your way. In this case, you might skip the high-level C++ and just do what you want in primitive, hackworthy, brutish, machete-wielding C-style code. Indeed, C++ explicitly supports this style of coding by making it possible -- nay, easy -- to wrap the primitive C code safely within a module and thus to hide its horror from the rest of your beautiful C++ program. Of course, if you hand your code a machete, so to speak, then you take a risk, don't you, because your code may hack up data you never wanted it to, and your compiler will stand aside and let it do it; but sometimes the risk is worth the gain, and sometimes the risk is even fun (and character-building) for a programmer's change of pace.
This point is the most subtle of the three. Where a user-defined type consists partly of other user-defined types, multiple layers of constructors will be called and implicitly invoked. This is great, and usually it is what you want, especially if you have a good unit-testing regime at each layer. The rose however has a thorn, as it were. A properly written constructor is careful never to lose anything it needs. So, unless the programmer is most careful, a constructor may quietly make a lot of strictly unnecessary copies of very large objects. Sometimes, the programmer will mentally lose track of all the levels of implicit invocation, which he never would have done if he had had to handle each invocation explicitly. Also, your data in an object of one type may lack access to a member function to which it can easily gain access, so long as it temporarily copies itself to an object of another type (you can avoid the copy with the use of handle types, reference counting and so forth, but this is not free: it's quite a bit of work). Even if the programmer is conscious of the implicit copy, the implicit copy is so much easier to code in the moment that the temptation to do so is sometimes too great - especially when a deadline looms! Several hidden inefficiencies can arise in these ways. One can, and should, work around such inefficiencies, of course, but it can take a lot of coding effort to do so and, even then, your compiler is so busy helping you to avoid logical errors that it tends to cause you to create inadvertent inefficiencies that you would never purposely have created. The unnecessary, hidden copying of data is a much bigger problem in C++ than it ever was in C.
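One common instance of such a hidden copy, as a sketch (the Big type is invented): passing by value quietly copy-constructs the whole object, while a reference-to-const offers the caller the same interface without the copy.

#include <numeric>
#include <vector>

struct Big { std::vector<double> data = std::vector<double>(1000000); };

// Pass by value: quietly copy-constructs a million doubles per call.
double sum_by_value(Big b) {
    return std::accumulate(b.data.begin(), b.data.end(), 0.0);
}

// Pass by reference-to-const: same result, no hidden copy.
double sum_by_ref(const Big& b) {
    return std::accumulate(b.data.begin(), b.data.end(), 0.0);
}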
All in all, I would say that the C++ trade-off is worth it 80 percent of the time. C++'s organizational and access-control facilities merit the effort it takes to apply them properly. If your question regards the 20 percent, well, there is more than one valid approach to programming, in my view. Sometimes it really does help "that we avoid these structures and write code plain in terms of basic C++ syntax," as you have said.
Usually, no. Sometimes, yes. I think that the earlier answers are right, though, that the particular example you have posed is probably better treated in boring, neat, orderly C++, without tricks.
Two things:
DO NOT do any kind of premature optimization. CPUs are fast nowadays, and compilers are smart, able to figure out optimizations that you wouldn't think of in months of looking at your code.
You can easily check things like memory leaks by profiling your code and/or using conditional compilation. Leaks shouldn't occur in release versions, so you can just skip those checks there.

C++: Using '.' operator on expressions and function calls

I was wondering if it is good practice to use the member operator . like this:
someVector = (segment.getFirst() - segment.getSecond()).normalize().normalCCW();
Just made that up to show the two different things I was wondering about, namely whether using (expression).member/function() and foo.getBar().getmoreBar() is in keeping with the spirit of readability and maintainability. In all the C++ code and books I learned from, I've never seen it used this way, yet it's intoxicatingly easy to use it as such. Don't want to develop any bad habits, though.
Probably more (or less) important than that, I was also wondering if there would be any performance gains/losses by using it in this fashion, or unforeseen pitfalls that would introduce bugs into the program.
Thank you in advance!
or unforeseen pitfalls that would introduce bugs into the program
Well, the possible pitfalls would be
Harder to debug. You won't be able to view the results of each function call, so if one of them is returning something unexpected you will need to break it up into smaller segments to see what is going on. Also, any call in the chain may fail completely, so again, you may have to break it up to find out which call is failing.
Harder to read (sometimes). Chaining function calls can make the code harder to read. It depends on the situation, there's no hard and fast rule here. If the expression is even somewhat complex it can make things hard to follow. I don't have any problem reading your specific example.
It ultimately comes down to personal preference. I don't strive to fit as much as possible on a single line, and I have been bitten enough times by chaining where I shouldn't that I tend to break things up a bit. However, for simple expressions which are not likely to fail, chaining is fine.
Yes, this is perfectly acceptable and in fact would be completely unreadable in a lot of contexts if you were to NOT do this.
It's called method chaining.
There MIGHT be some performance gain in that you're not creating temporary variables. But any competent compiler will optimise it anyway.
It is perfectly valid to use it the way you showed. It is used in the named parameter idiom described in the C++ FAQ Lite, for example.
One reason it is not always used is that you may have to store an intermediate result, either for performance reasons (if normalize is costly and you have to use its result more than once, it is better to store it in a variable) or for readability.
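For reference, a minimal sketch of the named parameter idiom mentioned above (class and member names are made up): each setter returns *this, so the calls chain and read like named arguments.

#include <iostream>

// Named parameter idiom: each setter returns *this, so calls chain.
class WindowOptions {
public:
    WindowOptions& width(int w)         { width_ = w;  return *this; }
    WindowOptions& height(int h)        { height_ = h; return *this; }
    WindowOptions& title(const char* t) { title_ = t;  return *this; }
    void show() const {
        std::cout << title_ << ": " << width_ << 'x' << height_ << '\n';
    }
private:
    int width_ = 640, height_ = 480;
    const char* title_ = "untitled";
};

int main() {
    // Reads like named arguments, in any order, skipping defaults:
    WindowOptions().width(800).title("demo").show();
}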
my2c
Using a variable to hold intermediate results can sometimes enhance readability, especially if you use good variable names. Excessive chaining can make it hard to understand what is happening. You have to use your judgement to decide if it's worthwhile to break down chains using variables. The example you present above is not excessive to me. Performance shouldn't differ much one way or the other if you enable optimization.
someVector = (segment.getFirst() - segment.getSecond()).normalize().normalCCW();
Not an answer to your question, but I should tell you that
the behavior of the expression (segment.getFirst() - segment.getSecond()) is unspecified as per the C++ Standard: the order in which the two operands are evaluated is not fixed, so if the getters have side effects the result can differ!
Also, see this related topic: Is this code well-defined?
I suppose what you are doing is less readable; on the other hand, too many temporary variables can also become unreadable.
As far as performance goes, I'm sure there is a little overhead in creating temporary variables, but the compiler can optimize that out.
There's no big problem with using it this way; some APIs benefit greatly from method chaining. Plus, it's misleading to create a variable and then use it only once: without it, whoever reads the next line has no leftover names to keep track of.
It depends on what you're doing. For readability you should try to use intermediate variables: assign calculation results to named variables, then use them.

Efficiency of program

I want to know whether there is an effect on program efficiency from adopting an object-oriented approach to a problem, as compared to the structured programming approach, in any programming language, but especially in C++.
Maybe. Maybe not.
You can write efficient object-oriented code. You can write inefficient structured code.
It depends on the application, how well the code is written, and how heavily the code is optimized. In general, you should write code so that it has a good, clean, modular architecture and is well designed, then if you have problems with performance optimize the hot spots that are causing performance issues.
Use object oriented programming where it makes sense to use it and use structured programming where it makes sense to use it. You don't have to choose between one and the other: you can use both.
I remember back in the early 1990's when C++ was young there were studies done about this. If I remember correctly, the guys who took (well written) C++ programs and recoded them in C got around a 15% increase in speed. The guys who took C programs and recoded them in C++, and modified the imperative style of C to an OO style (but same algorithms) for C++ got the same or better performance. The apparent contradiction was explained by the observation that the C programs, in being translated to an object oriented style, became better organized. Things that you did in C because it was too much code and trouble to do better could more easily be done properly in C++.
Thinking back on this, I wonder about the conclusion somewhat. Writing a program a second time will always result in a better program, so it didn't have to be the move from imperative to OO style that made the difference. Today's computer architectures are designed with hardware support for common operations done by OO programs, and compilers have gotten better at using the instructions, so I think it is likely that whatever overhead a virtual function call had in 1992, it is far smaller today.
There doesn't have to be, if you are very careful to avoid it. If you just take the most straightforward approach, using dynamic allocation, virtual functions, and (especially) passing objects by value, then yes there will be inefficiency.
It doesn't have to be. The algorithm is all that matters. I agree encapsulation will slow you down a little bit, but compilers are there to optimize.
You would say no if this were a question on a computer science paper. However, in a real development environment this tends to be true if the OOP paradigm is used correctly. The reason is that in a real development process we generally need to maintain our code base, and that is when the OOP paradigm helps us. One strong point of OOP over structured programming like C is that OOP makes the code easier to maintain. More maintainable code means fewer bugs, less time spent fixing bugs, and less time needed to implement new features. The bottom line: we then have more time to focus on the efficiency of the application.
The problem is not technical, it is psychological. It is in what it encourages you to do by making it easy.
To make a mundane analogy, it is like a credit card. It is much more efficient than writing checks or using cash. If that is so, why do people get in so much trouble with credit cards? Because they are so easy to use that they abuse them. It takes great discipline not to over-use a good thing.
The way OO gets abused is by
Creating too many "layers of abstraction"
Creating too much redundant data structure
Encouraging the use of notification-style code, attempting to maintain consistency within redundant data structures.
It is better to minimize data structure, and if it must be redundant, be able to tolerate temporary inconsistency.
ADDED:
As an illustration of the kind of thing that OO encourages, here's what I see sometimes in performance tuning: Somebody sets SomeProperty = true;. That sounds innocent enough, right? Well that can ripple to objects that contain that object, often through polymorphism that's hard to trace. That can mean that some list or dictionary somewhere needs to have things added to it or removed from it. That can mean that some tree or list control needs controls added or removed or shuffled. That can mean windows are being created or destroyed. It can also mean some things need to be changed in a database, which might not be local so there's some I/O or mutex locking to be done.
It can really get crazy. But who cares? It's abstract.
There could be: the OO approach tends to be closer to a decoupled approach where different modules don't go poking around inside each other. They are restricted to public interfaces, and there is always a potential cost in that. For example, calling a getter instead of just directly examining a variable; or calling a virtual function by default because the type of an object isn't sufficiently obvious for a direct call.
That said, there are several factors that diminish this as a useful observation.
A well written structured program should have the same modularity (i.e. hiding implementations), and therefore incur the same costs of indirection. The cost of calling a function pointer in C is probably going to be very similar to the cost of calling a virtual function in C++ (see the sketch after this list).
Modern JITs, and even the use of inline methods in C++, can remove the indirection cost.
The costs themselves are probably relatively small (typically just a few extra simple operations per function call). This will be insignificant in a program where the real work is done in tight loops.
Finally, a more modular style frees the programmer to tackle more complicated, but hopefully less complex algorithms without the peril of low level bugs.
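Here is the sketch promised above, purely structural (type names invented for illustration): the C-style struct of function pointers and the C++ virtual function both compile down to an indirect call.

#include <cstdio>

// C-style indirection: a struct holding a function pointer
// (shown only for comparison; unused below).
struct shape_c {
    double (*area)(const shape_c*);
};

// The C++ equivalent: a virtual function. Both end up as an
// indirect call through a pointer at runtime.
struct Shape {
    virtual double area() const = 0;
    virtual ~Shape() = default;
};

struct Square : Shape {
    double side = 2.0;
    double area() const override { return side * side; }
};

int main() {
    Square s;
    const Shape& shape = s;
    std::printf("%f\n", shape.area());  // dispatched through the vtable
}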

Learning C++ and overcautiousness

How should I learn C++? I hear that the language gives enough rope to shoot myself in the head, so should I treat every C++ line I write as a potential minefield?
How should I learn C++?
Refer to:
Books to refer for learning OOP through C++
https://stackoverflow.com/questions/631793/good-book-to-learn-c-from
The Definitive C++ Book Guide and List
https://stackoverflow.com/questions/1122921/suggested-c-books
https://stackoverflow.com/questions/1686906/what-is-a-very-practical-c-book
https://stackoverflow.com/questions/681551/a-c-book-that-covers-non-syntax-related-problems
I hear that the language gives enough rope to shoot myself in the head, so should I treat every C++ line I write as a potential minefield?
A C++ statement can do either what you want it to do, or something else. This depends on your understanding of what that C++ statement means. But this is not specific to this language.
By learning the language, and using techniques for building correct software (like Object Oriented design, especially Design by Contract, and testing techniques), you will be able to guarantee that your program behaves as you intended it to.
I love your metaphor! What Stroustrup actually said was:
http://en.wikiquote.org/wiki/Bjarne_Stroustrup
C makes it easy to shoot yourself in the foot; C++ makes it harder, but when you do it blows your whole leg off.
This was many years ago. I started learning C++ in ca. 1991 and it really was a minefield. There were no common libraries, no debuggers and the AT&T approach used a C code generator. There are now many good IDEs which support C++.
Personally I moved to Java because I find it a cleaner language, but C++ is fine as long as you don't try to be tricky. Avoid native C constructs where there are existing class libraries (Stroustrup initially did not provide a String class as he thought it a useful "rite of passage" to have to write one!). Now you can use a proven one.
I'm assuming you have no choice in the language. How you go about it depends on where you are coming from. C++ is not the easiest of the object-oriented languages to start on and Stroustrup's book is not necessarily the best intro.
UPDATE: the OP is worried about blowing themselves up when learning the language. Generally it's a good idea to start with a subset of what one will do later. I assumed the OP is worried about:
Things which you have to know and use whatever level you program at (such as destructors)
Things which add additional complexity to the learning process and can be shelved until later (such as multiple inheritance)
what follows are some places where I blew myself up... They are not subjective, they happened!
There are some up-front gotchas that don't exist in Java or C#.
destructors. You have to manage your own memory. Failing to write destructors will blow your fingers and toes off.
equality. You will have to write an equals method (in simple Java you may get away without it)
copy constructor. Ditto. Foo a = b; will invoke this, while a = b; on an existing object invokes the assignment operator, which needs the same care. Bites you in the bottom. (See the sketch below.)
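A sketch of the classic "rule of three" that covers these last gotchas (the Buffer class is invented for illustration): a class that owns a resource needs the destructor, copy constructor, and copy-assignment operator written together.

#include <cstddef>
#include <cstring>
#include <utility>

// Rule of three: a class that owns a resource needs a destructor,
// a copy constructor, and a copy-assignment operator.
class Buffer {
public:
    explicit Buffer(std::size_t n) : size_(n), data_(new char[n]) {}
    ~Buffer() { delete[] data_; }                 // 1. destructor

    Buffer(const Buffer& o)                       // 2. copy constructor
        : size_(o.size_), data_(new char[o.size_]) {
        std::memcpy(data_, o.data_, size_);
    }

    Buffer& operator=(const Buffer& o) {          // 3. copy assignment
        Buffer tmp(o);                  // copy-and-swap: exception safe
        std::swap(size_, tmp.size_);
        std::swap(data_, tmp.data_);
        return *this;
    }
private:
    std::size_t size_;
    char* data_;
};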
And I'd suggest avoiding multiple inheritance unless you really need it. Then avoid it anyway.
And avoid operator overloading. It looks cute to write:
vector1 = vector2 + vector3;
but
vector1 = vector2.plus(vector3);
is just as clear, only a few more characters, and you can search for it.
Well, it's not a minefield.
Really, most problems are related to pointers, so you'll have to understand them (which is not easy at first) and be careful when using them.
I think it's more a question of experience: having all the basics clear and trying to get a clean design from the beginning.
More than a minefield, I think it's like going to the most dangerous neighbourhood in your town. Yes, it's dangerous, but only for those without the right attitude. :-D
I would say that C++ is a challenging environment, if not a minefield. The fundamental issue is that problem symptoms and problem causes are not always easy to tie together. As Khelben has said, one major reason for that is that we have pointers to deal with, and hence we can do quite a lot of damage when pointers are not pointing where we think they are.
So you need to pay special attention when dealing with arrays and pointers; off-by-one errors can result in memory corruption, and these then result in interesting problem manifestations.
Every formal language is a minefield. There are fewer mines in managed environments. For instance, in C# if you overrun an array you won't cause someone else's remote function to do strange things. You won't have code run differently in tests and prod because someone forgot to initialize a variable in a constructor.
However, these are the easy ones. You learn to avoid them, and then you're left with the real mines, which are there in every language.
More specifically, these are some of the most important points when moving to C++:
always initialize variables. Even the theoretical possibility of having your program logic depend on what was in the memory beforehand is a nightmare.
dependencies: avoid data members of other compound types (classes) without the pimpl idiom (see the sketch after this list). Otherwise you expose your users to the inner workings of the types you use, and you increase compilation time dramatically. Dependencies are your enemy.
in C++, you can optimize for performance in a ridiculously huge number of ways. Don't. Unless you are in the innermost loop of heavy math software, and even then don't.
avoid DLLs on windows. They don't work with singletons, causing problems to popular libraries.
use boost, shared pointers whenever you can. avoid reinventing the wheel and regular pointers.
use std::string and smart containers instead of raw arrays; arrays are dangerous. It will be faster than managed containers anyway.
use RAII. This one is priceless.
prefer data members (composition) to inheritance, or you will expose the base type definition to your type's users.
learn to avoid nested includes with forward declarations.
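And here is the pimpl sketch promised above (Widget and its members are invented for illustration): clients see only a forward-declared Impl, so the implementation can change without recompiling them. A std::unique_ptr is used here for brevity; a raw pointer deleted in the destructor, or boost::scoped_ptr, works the same way.

#include <memory>

// widget.h - clients of Widget never see Impl's definition, so changing
// the implementation does not force them to recompile.
class Widget {
public:
    Widget();
    ~Widget();            // defined where Impl is a complete type
    void frob();
private:
    struct Impl;          // forward declaration only
    std::unique_ptr<Impl> impl_;
};

// widget.cpp - the only file that knows what is inside.
struct Widget::Impl { int counter = 0; };
Widget::Widget() : impl_(new Impl) {}
Widget::~Widget() = default;
void Widget::frob() { ++impl_->counter; }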
How should I learn C++?
depends. where are you coming from? anyway, I'd suggest:
use an up-to-date compiler such as gcc-4.4 or 4.5
C++0x is worth it for the type inference alone (local variables don't need explicit type designations)
write small, standalone, short-lived utilities (try porting such tools written in other languages)
STL has complex parts, but the basic things are easy, don't shy away from it. FMPOV it embodies the spirit of C++
use state-of-the-art C++ libraries: stuff like Boost.Foreach, Boost.Tuple, Boost.Regex or Boost.Optional turn C++ into serious competition in the scripting department
when you're comfortable:
learn to generalize your code with templates
learn to use RAII
then:
add C libraries to the mix. this might be the first time you'll need to tinker with pointers and casts!
add OOP if you feel like it
should I treat every C++ line I write as a potential minefiled?
be cautious, but don't worry too much. it's true that you can't know what a + b means without knowing the whole program containing such an expression because of operator overloads and argument-dependent lookup, and I've seen many people whine about it. a killing counter-argument is that you cannot really know what a->plus(b) does in Java or a scripting language in the face of inheritance: all methods are virtual, yoyo effect in extremis! (this does kill me in large codebases with rampant inheritance written in languages w/o ADL or operator overloading!)
anecdotes from my experience learning basics of C and C++:
C: unless you do something really, really, really stupid, the program will compile just fine, and SIGSEGV or SIGBUS as soon as you run it
C++: unless you do something really, really, really "clever", the program will either fail to compile, or compile and do what you mean (a mantra Perl "inherited" from Interlisp as I've been told).
a ranty post scriptum:
C++ can be used as a much higher-level language than C: whereas you can hardly do anything beyond simple arithmetic without pointers in C, it's possible to write complete programs in C++ without a pointer in sight, save for char **argv.
there's a whole class of programs that can be implemented in C++ using it as a "scripting" language with unparalleled runtime speed and simple runtime environment (the "dll hell" is nothing compared to the volatility of real scripting languages).
however, the "scripting language" cloak is a leaky abstraction: it's built from native C++ mechanisms such as ADL, operator overloading and templates, and that has its price. get ready for abysmal compile times and unintelligible error messages. OTOH, at least the error messages can be greatly improved with tools like STLfilt, and I think it's well worth it overall.
one thing where C++ really shines in contrast to environments such as Java (perhaps C# too? don't know that one that well) is destructors (vs finalizers and GC). it's one of the pillars of the "scriptiness" of the language. whereas GC adds a whole level of semantic complexity (things don't cease to exist as soon as they're inaccessible from the program) and syntactic verbiage and duplication (finally), destructors are the workhorse of natural semantics and obviate code duplication that's unavoidable with finally.
BTW, "enough rope to shoot myself in the head" almost killed me. I think I'll borrow it. ;)
C++ certainly has some pitfalls, but writing safe code is certainly possible.
Here are some things to think about. They are far from the only guidelines for writing safe C++ code, but they seem like a good start.
Use std::string and std::vector to store strings and collections rather than C style strings and native arrays. They are much easier to get right.
When you allocate an object using new, always think about who owns the pointer to this memory and is responsible for deleting it. If you can't think of a single owner that manages the data's lifetime, then either rethink your design or think about using a "smart pointer" to manage the lifetime.
Prefer indexing into arrays rather than using pointer arithmetic where possible. Whenever you index into an array, ask yourself: "How do I know that this index can only be a valid index into the array?"
If a class has a pointer to some data then write methods to act on that data. Don't write methods that return that pointer or at some point you'll end up using the pointer after the data has been deleted. (Not always possible but something to aim for)
If you write simple code that uses strings and vectors, and as much as possible encapsulate pointers as members of classes that both manage the lifetime of the data and provide the methods that act on that data, then that's a good starting point.
As others have said, read Effective C++ and other books.
In C++, foot shoots YOU.
The question is do you need it for anything? If you want to make game code, 3d tools, or something similar you pretty much have to have it. If not, you don't. The errors people are afraid of are seldom big killers but there are plenty of other things that will come up if you make a large enough project.
You may find this spoof interview with Bjarne Stroustrup to be enlightening:
http://www-users.cs.york.ac.uk/susan/joke/cpp.htm
The syntax of C++ is easy, just like Java or C# with pointers. So learning C++ is fast.
The hard part is that when it comes to a real project, C++ is harder to use and more error-prone than Java or C#. It is just too flexible, and the programmer is responsible for too many things.
In 100 lines of code you don't need to worry about memory and null pointers at all, since you can find the problems quickly. But when it comes to 10000 lines of code, memory management becomes hard. The exception mechanism in C++ is also weak. And you need to worry about the null pointer problem in a big project.
I look at the dilemma from a different perspective. The more discipline you have in development the faster you can develop quality robust code. Assembly requires more discipline than C. C requires more discipline than C++.
Don't worry about hanging yourself, blowing your foot or leg off. Just work on improving your quality develop process. For example, a code review will help regardless of the language. Unit testing and test frameworks will also save some bloodshed. Everything boils down to project deadlines and money.