I believe C is generally faster to compile than C++, because it lacks features like late binding and operator overloading. Which features of C++ tend to slow down compilation the most?
This is a difficult question to answer in a meaningful way. If you look purely at lines of code per second (or something on that order), there's no question that a C compiler should be faster than a C++ compiler. By itself, that doesn't mean much though.
The mention of late-binding in the question is an excellent case in point: it's almost certainly true that compiling a C++ virtual function is at least somewhat slower than compiling a C (non-virtual) function. That doesn't mean much though -- the two aren't equivalent at all. The C equivalent of a C++ virtual function will typically be a pointer to a function or else code that uses a switch on an enumerated type to determine which of a number of pieces of code to invoke.
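To make that concrete, here is a minimal sketch (C++11, names invented for illustration) of the two styles of late binding side by side - a C-style function pointer stored in a struct, and a C++ virtual function:

#include <cstdio>

// C style: the "vtable" is a function pointer you manage yourself.
struct ShapeC {
    double (*area)(const ShapeC*);
    double w, h;
};

double rect_area(const ShapeC* s) { return s->w * s->h; }

// C++ style: the compiler generates the vtable and the dispatch for you.
struct Shape {
    virtual double area() const = 0;
    virtual ~Shape() {}
};

struct Rect : Shape {
    double w, h;
    Rect(double w_, double h_) : w(w_), h(h_) {}
    double area() const override { return w * h; }
};

int main() {
    ShapeC c = { rect_area, 3, 4 };
    Rect r(3, 4);
    std::printf("%f %f\n", c.area(&c), r.area());
}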
By the time you create code that's actually equivalent, it's open to question whether C will have any advantage at all. In fact, my guess would be rather the opposite: at least in the compilers I've written, an awful lot of the time is spent on the front-end, doing relatively simple things like just tokenizing the input stream. Given the extra length I'd expect from code like this in C, by the time you had code that was actually equivalent, it wouldn't surprise me a whole lot if it ended up about the same or even somewhat slower to compile.
Operator overloading could give somewhat the same effect: on one hand, the code that overloads the operator almost certainly takes a bit of extra time to compile. At the same time, the code that uses the overloaded operator will often be shorter specifically because it uses an overloaded operator instead of needing to invoke functions via names that will almost inevitably be longer. That's likely to reduce that expensive up-front tokenization step, so if you use the overloaded operator a lot, overall compilation time might actually be reduced.
Templates can be a bit the same way, except that in this case it's often substantially more difficult to even conceive of a reasonable comparison. Just for example, when you're doing sorting in C, you typically use qsort, which takes a pointer to a function to handle the comparison. The most common alternative in C++ is std::sort, which is a template that includes a template argument for the comparison. The difference is that since that is a template argument, the code for the comparison is typically generated inline instead of being invoked via a pointer.
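For illustration, here is roughly what the two approaches look like side by side (a sketch, not a benchmark; requires C++11 for the lambda):

#include <algorithm>
#include <cstdlib>

// C: the comparison is invoked through a function pointer at run time.
int compare_ints(const void* a, const void* b) {
    int x = *static_cast<const int*>(a);
    int y = *static_cast<const int*>(b);
    return (x > y) - (x < y);
}

int main() {
    int a[] = { 3, 1, 2 };
    std::qsort(a, 3, sizeof(int), compare_ints);

    // C++: the comparator is part of the template instantiation, so the
    // compiler can generate the comparison inline.
    int b[] = { 3, 1, 2 };
    std::sort(b, b + 3, [](int x, int y) { return x < y; });
}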
In theory I suppose one could perhaps write a giant macro to do the same -- but I'm pretty sure I've never seen such a thing actually done, so it's extremely difficult to guess how much slower or faster it might be if one existed. Given the simplicity of macros versus templates, I'd guess it would compile faster, but exactly how much faster will probably remain forever a mystery; I'm certainly not going to try to write a complete Quicksort or Introsort in a C macro!
Templates are a Turing-complete functional language that's executed at compilation time. So they can cause the compiler to take quite a long time, though most compilers have a recursion depth limit that strongly limits this.
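The classic demonstration is a computation the compiler performs entirely at instantiation time - a sketch (C++11 for the static_assert):

// The recursion below is evaluated by the compiler, not at run time;
// deep instantiations cost compile time, and the recursion depth limit
// caps how far this can go.
template <unsigned N>
struct Factorial {
    static const unsigned long long value = N * Factorial<N - 1>::value;
};

template <>
struct Factorial<0> {
    static const unsigned long long value = 1;
};

static_assert(Factorial<10>::value == 3628800ULL,
              "computed entirely during compilation");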
Templates are generally more compiler-intensive, partly because the whole templated library needs to be compiled. This is especially true for the STL, where all the code for the library lives in header files and must be recompiled whenever the client code is compiled.
Putting code in the header files will slow down your compilation. Not just because there is more code to compile but because changing the code in one file may cause lots of recompilations to be necessary.
It's a smaller language with fewer rules so the compiler has less work to do.
But all of that is moot. A Saturn V is slower off the line than a Prius. Measure their speeds 30 seconds down the line, however...
When it comes to procedural programming, functional decomposition is ideal for maintaining complicated code. However, functions are expensive: adding to the call stack, passing parameters, storing return addresses. All of this takes extra time! When speed is crucial, how can I get the best of both worlds? I want a highly decomposed program without any of the overhead introduced by function calls. I'm familiar with the keyword "inline", but that seems to be only a suggestion to the compiler, and if used incorrectly by the programmer it will yield an even slower program. I'm using g++, so will the -O3 flag optimize away my functions that call functions that call functions?
I just wanted to know, if my concerns are valid and if there are any methods to combat this issue.
First, as always when dealing with performance issues, you should try to measure your bottlenecks with a profiler. The first thing that comes up is usually not function calls, and by a large margin. If you did this, then please read on.
Then, you can anticipate a bit which functions you want inlined by using the inline keyword. The compiler is usually smart enough to know what to inline and what not to inline (it can inline functions you forgot, and may not inline some you mentioned if it thinks that won't help).
If (really) you still want to improve performance on function calls and want to force inlining, some compilers allow you to do so (see this question). Please consider that massive inlining may actually decrease performance: your code will use a lot of memory and you may get more cache misses on the code than before (which is not good).
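For reference, the usual compiler-specific spellings look something like this (a sketch; always_inline and __forceinline are real extensions in GCC/Clang and MSVC respectively):

// Use sparingly: forcing inlining can bloat code and hurt the cache.
#if defined(__GNUC__)
#  define FORCE_INLINE inline __attribute__((always_inline))
#elif defined(_MSC_VER)
#  define FORCE_INLINE __forceinline
#else
#  define FORCE_INLINE inline
#endif

FORCE_INLINE int square(int x) { return x * x; }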
If it's a specific piece of code you're worried about, you can measure the time yourself. Just run it in a loop a large number of times and get the system time before and after. Use the difference to find the average time of each call.
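A minimal sketch of that idea using std::chrono (the work() function here is just a stand-in for the code you want to time):

#include <chrono>
#include <iostream>

volatile long sink = 0;  // volatile keeps the optimizer from deleting the loop

void work() { sink += 1; }  // stand-in for the code under test

int main() {
    const long iterations = 10000000;
    auto start = std::chrono::steady_clock::now();
    for (long i = 0; i < iterations; ++i)
        work();
    auto end = std::chrono::steady_clock::now();
    std::chrono::duration<double, std::nano> elapsed = end - start;
    std::cout << "average: " << elapsed.count() / iterations
              << " ns per call\n";
}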
As always the numbers you get are subjective, since they will vary depending on your system and compiler. You can compare the times you get from different methods to see which is generally faster, such as replacing the function with a macro. My guess, however, is that you won't notice much difference, or at the very least it will be inconsequential.
If you don't know where the slowdown is, follow J.N's advice and use a code profiler, then optimise where it's needed. As a rule of thumb, always pass large objects to functions by reference or const reference to avoid copy times.
I highly doubt speed is that crucial, but my suggestion would be to use preprocessor macros.
For example
#define max(a,b) ( (a) > (b) ? (a) : (b) )
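Note that even with the arguments parenthesized, a macro like this evaluates each argument twice, so a call such as max(i++, j) misbehaves; an inline function avoids that pitfall.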
This would seem obvious to me, but I don't consider myself an expert in C++, so I may have misunderstood the question.
I am not sure if this is a valid comparison or a valid statement, but over the years I have heard folks claim that programs written in C++ generally take longer to compile than the same programs written in C, and that applications coded in C++ are generally slower at run time than ones written in C.
Is there any truth in these statements?
Apart from reaping the benefits of the OOP flexibility that C++ provides, should the above comparison be given consideration purely from a compilation/execution time perspective?
I hope that this doesn't get closed as too generic or vague; it is just an attempt to learn the actual facts about statements I have been hearing over the years from many programmers (predominantly C programmers).
I'll answer one specific part of the question that's pretty objective: C++ code that uses templates is going to be slower to compile than C code. If you don't use templates (which you probably will if you use the standard library), compilation times should be very similar.
EDIT:
In terms of runtime it's much more subjective. Even though C may be a somewhat lower level language, C++ optimizers are getting really good, and C++ lends itself to more naturally representing real world concepts. If it's easier to represent your requirements in code (as I'd argue in C++) it's often easier to write better (and more performant) code than you would in another language. I don't think there's any objective data showing C or C++ being faster in all possible cases. I would actually suggest picking your language based on project needs and then write it in that language. If things are too slow, profile and proceed with normal performance improvement techniques.
The relative runtime speed is a bit hard to predict. At one time, when most people thought of C++ as being all about inheritance, and used virtual functions a lot (even when they weren't particularly appropriate), code written in C++ was typically a little bit slower than equivalent C.
With (what most of us would consider) modern C++, the reverse tends to be true: templates give you so much more compile-time flexibility that you can frequently produce code that's noticeably faster than any reasonable equivalent in C. In theory you could always avoid that by writing specialized code equivalent to the result of "expanding" a template -- but in reality, doing so is exceedingly rare, and quite expensive.
There is something of a tendency for C++ to be written somewhat more generally as well -- just for example, reading data into std::string or std::vector (or std::vector<std::string>) so the user can enter an arbitrary amount of data without a buffer overflow or the data simply being truncated at some point. In C it's a lot more common to see somebody just code up a fixed-size buffer, and if you enter more than that, it either overflows or truncates. Obviously enough, you pay something for that -- the C++ code typically ends up using dynamic allocation (new), which is typically slower than just defining an array. OTOH, if you write C to accomplish the same thing, you end up writing a lot of extra code, and it typically runs about the same speed as the C++ version.
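As a quick illustration (just a sketch), compare the two styles of reading a line:

#include <cstdio>
#include <iostream>
#include <string>

int main() {
    // C++: std::string grows to fit arbitrary input - at the cost of
    // possible dynamic allocation.
    std::string line;
    std::getline(std::cin, line);

    // Typical C: fixed-size buffer; input beyond 255 characters is
    // silently truncated (fgets at least prevents the overflow).
    char buf[256];
    std::fgets(buf, sizeof buf, stdin);
}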
In other words, it's pretty easy to write C that's noticeably faster for things like benchmarks and single-use utilities, but the speed advantage evaporates in real code that has to be robust. In the latter case, about the best you can usually hope for is that the C code is equivalent to a C++ version, and in all honesty doing even that well is fairly unusual (at least IME).
Comparing compilation speed is no easier. On one hand, it's absolutely true that templates can be slow -- at least with most compilers, instantiating templates is quite expensive. On a line-for-line basis, there's no question that C will almost always be faster than anything in C++ that uses templates much. The problem with that is that a line-for-line comparison rarely makes much sense -- 10 lines of C++ may easily be equivalent to hundreds or even thousands of lines of C. As long as you look only at compile time (not development time), the balance probably favors C anyway, but certainly not by nearly as dramatic a margin as might initially seem to be the case. This also depends heavily on the compiler: just for example, clang does a lot better than gcc in this respect (and gcc has improved a lot in the last few years too).
The run time of C++ versus C would suffer only if you use certain C++-specific features. Exceptions and virtual function calls add to run time compared to returning error codes and making direct calls. On the other hand, if you find yourself using function pointers in C (as, say, GTK does) you are already paying at least some of the price of virtual functions. And checking an error code after each function return consumes time too - you don't do that when you use exceptions.
On the other hand, inlining and templates in C++ may allow you to do a lot of work compile-time - work that C defers to run time. In some cases, C++ may end up faster than C.
If you compile the same code as C and C++, there should be no difference.
If you let the compiler do the job for you, like expanding templates, that will take some time. If you do the same in C, with cut-and-paste or some intricate macros, it will take up your time instead.
In some cases, inline expansion of templates will actually result in code that is more specialized and runs faster than the equivalent C code. Like here:
http://www2.research.att.com/~bs/new_learning.pdf
Or this report showing that many C++ features have no runtime cost:
http://www.open-std.org/jtc1/sc22/wg21/docs/TR18015.pdf
Although it's an old question, I'd like to add my 5 cents here, as I'm probably not the only one who finds this question via a search engine.
I can't comment on compilation speed, but I can comment on execution speed:
To the best of my knowledge, there is only one feature in C++ that costs performance even if you don't use it: exceptions. They prevent a few compiler optimizations (which is the reason noexcept was introduced in C++11). However, if you use some kind of error checking mechanism, then exceptions are probably more efficient than the combination of return value checking and a lot of if/else statements. This is especially true if you have to escalate an error up the stack.
Anyway, if you turn off exceptions during compilation, C++ introduces no overhead except in places where you deliberately use the associated features (e.g. you don't have to pay for polymorphism if you don't use virtual functions), whereas most features introduce no runtime overhead at all (overloading, templates, namespaces, and so on).
On the other hand, most forms of generic code will be much faster in C++ than the equivalent in C, because C++ provides the built-in mechanisms (templates and classes) to do this. A typical example is C's qsort vs C++'s std::sort. The C++ version is usually much faster, because inside sort the comparator is known at compile time, which at least saves a call through a function pointer and in the best case allows a lot of additional compiler optimizations.
That being said, the "problem" with C++ is that it's easy to hide complexity from the user, such that seemingly innocent code might be much slower than expected. This is mainly due to operator overloading, polymorphism and constructors/destructors, but even a simple call to a member function hides the passed this-pointer, which also isn't a NOP.
Consider operator overloading: when you see a * in C, you know it is (on most architectures) a single, cheap assembler instruction; in C++, on the other hand, it can be a complex function call (think of matrix multiplication). That doesn't mean you could implement the same functionality in C any faster, but in C++ you don't directly see that this might be an expensive operation.
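For example, this hypothetical Matrix type makes a * b look like a single instruction at the call site while actually expanding to an O(N^3) function call (a sketch):

// Nothing at the call site `a * b` hints at the triple loop below.
struct Matrix {
    enum { N = 64 };
    double m[N][N];
};

Matrix operator*(const Matrix& a, const Matrix& b) {
    Matrix r = {};  // zero-initialize the result
    for (int i = 0; i < Matrix::N; ++i)
        for (int j = 0; j < Matrix::N; ++j)
            for (int k = 0; k < Matrix::N; ++k)
                r.m[i][j] += a.m[i][k] * b.m[k][j];
    return r;
}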
Destructors are a similar case: in "modern" C++ you will hardly see any explicit destruction via delete, but any local variable that goes out of scope might potentially trigger an expensive call to a (virtual) destructor without a single line of code indicating this (ignoring the } of course).
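A tiny sketch of that effect:

#include <vector>

void f() {
    std::vector<int> v(1000000);
    // ... use v ...
}   // this closing brace quietly runs v's destructor and frees
    // a multi-megabyte buffer; no explicit code indicates the cost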
And finally, some people (especially coming from Java) tend to write complex class hierarchies with lots of virtual functions, where each call to such a function is a hidden indirect function call which is difficult or impossible to optimize.
So while hiding complexity from the programmer is in general a good thing, it sometimes has an adverse effect on runtime, if the programmer is not aware of the costs of these "easy to use" constructs.
As a summary I would say that C++ makes it easier for inexperienced programmers to write slow code (because they don't directly see the inefficiencies in a program). But C++ also allows good programmers to write "good", correct and fast code faster than they could in C - which gives them more time to think about optimizations when they really are necessary.
P.S.:
Two things I haven't mentioned (probably among others that I simply forgot) are C++'s ability to do complex compile-time computations (thanks to templates and constexpr) and C's restrict keyword. This is because I haven't used either of them in time-critical programs yet, so I can't comment on their general usefulness and real-world performance benefit.
C applications are faster to compile and execute than C++ applications: C++ programs are generally slower at runtime, and much slower to compile.
Look at these links:
C vs C++ (a)
C vs C++ (b)
Efficient C/C++
C++ Compile Time and RunTime Performance
I am wondering if it is still worth with modern compilers and their optimizations to write some critical code in C instead of C++ to make it faster.
I know C++ might lead to bad performance in cases where objects are copied when they could be passed by reference, or when temporary objects are created automatically by the compiler (typically with overloaded operators) and in many other similar cases; but for a good C++ developer who knows how to avoid all of this, is it still worth writing code in C to improve performance?
I'm going to agree with a lot of the comments. C syntax is intentionally supported in C++ (diverging only where C99 does), so all C++ compilers have to support it. In fact I think it's hard to find a dedicated C compiler anymore. For example, in GCC you'll actually end up using the same optimization/compilation engine regardless of whether the code is C or C++.
The real question is then: does writing plain C code and compiling it as C++ suffer a performance penalty? The answer is, for all intents and purposes, no. There are a few tricky points about exceptions and RTTI, but those are mainly size changes, not speed changes. You'd be so hard pressed to find an example that actually takes a performance hit that it doesn't seem worth it to write a dedicated C module.
What was said about what features you use is important. It is very easy in C++ to get sloppy about copy semantics and suffer huge overheads from copying memory. In my experience this is the biggest cost -- in C you can also suffer this cost, but not as easily I'd say.
Virtual function calls are ever so slightly more expensive than normal functions. At the same time forced inline functions are cheaper than normal function calls. In both cases it is likely the cost of pushing/popping parameters from the stack that is more expensive. Worrying about function call overhead though should come quite late in the optimization process -- as it is rarely a significant problem.
Exceptions are costly at throw time (in GCC at least). But setting up catch statements and using RAII doesn't have a significant cost associated with it. This was by design in the GCC compiler (and others) so that truly only the exceptional cases are costly.
But to summarize: a good C++ programmer would not be able to make their code run faster simply by writing it in C.
measure! measure before thinking about optimizing, measure before applying optimization, measure after applying optimization, measure!
If you must run your code 1 nanosecond faster (because it's going to be used by 1000 people, 1000 times in the next 1000 days, and that time is very important), anything goes.
Yes! It is worth ...
changing languages (C++ to C; Python to COBOL; Matlab to Fortran; PHP to Lisp)
tweaking the compiler (enable/disable all the -f options)
using different libraries (even writing your own)
etc.
What you must not forget is to measure!
pmg nailed it. Just measure instead of making global assumptions. Also think of it this way: compilers like gcc separate the front, middle, and back end, so the front ends for Fortran, C, C++, Ada, etc. all feed into the same internal middle language, which is where most of the optimization happens. That generic middle language is then turned into assembler for the specific target, where target-specific optimizations occur. So the front end may or may not produce more middle-end code when the languages differ greatly, but for C/C++ I would assume it is the same or very similar. Binary size is another story: the libraries that get pulled into the binary for C only vs C++ (even if it is only C syntax) can and will vary. That doesn't necessarily affect execution performance, but it can bulk up the program file, costing storage and transfer differences as well as memory if the program is loaded as a whole into RAM. Here again, just measure.
I would also add to the measure comment: compile to assembler and/or disassemble the output and compare the results of your different language/compiler choices. This can and will supplement the timing differences you see when you measure.
The question has been answered to death, so I won't add to that.
Simply as a generic question: assuming you have measured, etc., and you have identified that a certain C++ (or other) code segment is not running at optimal speed (which generally means you have not used the right tool for the job), and you know you can get better performance by writing it in C, then yes, definitely, it is worth it.
There is a certain mindset that is common: trying to do everything with one tool (Java or SQL or C++). It is not just Maslow's hammer, but the actual belief that one can code a C construct in Java, etc. This leads to all kinds of performance problems. Architecture, as a true profession, is about placing code segments in the appropriate architectural location or platform. It is the correct combination of Java, SQL and C that will deliver performance. That produces an app that does not need to be revisited; uneventful execution. In which case, it will not matter if or when C++ implements this construct or that.
I am wondering if it is still worth with modern compilers and their optimizations to write some critical code in C instead of C++ to make it faster.
No. Keep it readable. If your team prefers C++ or C, prefer that - especially if it is already functioning in production code (don't rewrite it without very good reasons).
I know C++ might lead to bad performance in cases where objects are copied when they could be passed by reference
Then forbid copying and assigning.
or when temporary objects are created automatically by the compiler (typically with overloaded operators) and in many other similar cases
Could you elaborate? If you are referring to templates, they have no additional runtime cost (although they can lead to additional exported symbols, resulting in a larger binary). In fact, using a template method can improve performance if (for example) a conversion would otherwise be necessary.
but for a good C++ developer who knows how to avoid all of this, is it still worth writing code in C to improve performance?
In my experience, an expert C++ developer can create a faster, more maintainable program.
You have to be selective about the language features that you use (and do not use). If you pare C++ features down to the set available in C (e.g., remove exceptions, virtual function calls, RTTI), then you're off to a good start. If you learn to use templates, metaprogramming and optimization techniques, and avoid type aliasing (which becomes increasingly difficult or verbose in C), then you should be on par with or faster than C - with a program which is more easily maintained (since you are familiar with C++).
If you're comfortable using the features of C++, use C++. It has plenty of features (many of which have been added with speed/cost in mind), and it can be written to be as fast as C (or faster).
With templates and metaprogramming, you can turn many runtime variables into compile-time constants, for exceptional gains. Sometimes that goes well into micro-optimization territory.
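A small sketch of that idea: making an array length a template parameter instead of a runtime variable lets the compiler see the trip count, so it can fully unroll or vectorize the loop:

// N is fixed at compile time for each instantiation.
template <int N>
double dot(const double (&a)[N], const double (&b)[N]) {
    double sum = 0.0;
    for (int i = 0; i < N; ++i)
        sum += a[i] * b[i];
    return sum;
}

int main() {
    double a[4] = { 1, 2, 3, 4 };
    double b[4] = { 4, 3, 2, 1 };
    return static_cast<int>(dot(a, b));  // instantiates dot<4>
}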
Two, maybe trivial questions:
1. Why can't I beat the std functions?
Really. I spent the last three days implementing something faster than std::sort, just for the sake of doing it. It is supposed to be an introsort, and I suspect it uses the single-pivot version of quicksort inside. Epic fail. Mine was at least twice as slow.
In my utter bitterness I even copy-pasted other - top notch - programmers' code. To no avail.
I benchmarked my other algorithms too... My binary search and my upper_bound and lower_bound versions are so stripped down they couldn't really be written with fewer instructions. Still, they are about twice as slow.
I ask, why, why, why? And this leads me to my next question...
2. Where can I find the source code of the STL library functions?
Of course, I want to look at their sources! Is it even possible to write more efficient code than theirs, or am I at an abstraction level with my "simple" main.cpp where I cannot reach the optimisations utilized by the STL library?
I mean, for example... Let's take maps... which are simple associative containers. The documentation says it is implemented with a red-black tree. Now... would it be worth it to try implementing my own red-black tree, or have they taken this joy :-) away from me, and should I just throw every piece of data I get my hands on into the map container?
I hope this does make sense.
If not, please forgive me.
The short answer is "if it was possible to write faster code which did the same thing, then the standard library would have done it already".
The standard library was designed by clever people, and the reason it was made part of C++ is that other clever people recognized it as clever. And since then, 15 years have passed in which yet more clever people have tried to take these specifications and write the most efficient code possible to implement them.
That's a lot of cleverness you're trying to compete with. ;)
So there is no magic in the STL; it doesn't cheat or use tricks unavailable to you. It is just very carefully designed to maximize performance.
The thing about C++ is that it's not a fast language as such. If you're not careful, it is easy to introduce all sorts of inefficiencies: virtual function calls, cache misses, excessive memory allocations, unnecessary copying of objects, all of this can cripple the performance of C++ code if you're not careful.
With care, you can write code that's about as efficient as the STL. It's not that special. But in general, the only way you're going to get faster code is to change the requirements. The standard library is required to be general, to work as well as possible across all use cases. If your requirement is more specific, it is sometimes possible to write specialized code that favors those specific cases. But then the tradeoff is that the code either will not work, or will be inefficient, in other cases.
A final point is that a key part of the reason why the STL is so clever, and why it was adopted into the standard, is that it is pretty much zero-overhead. The standard libraries in many languages are "fast enough", but not as fast as hand-rolled code. They have a sorting algorithm, but it's not quite as fast as if you wrote it yourself in-place. It might use a few casts to and from a common "object" base class, or maybe use boxing on value types. The STL is designed so that pretty much everything can be inlined by the compiler, yielding code equivalent to if you'd hand-rolled it yourself. It uses templates to specialize for the type you're using, so there's no overhead of converting to a type understood by the container or algorithm.
That's why it's hard to compete with. It is a ridiculously efficient library, and it had to be. With the mentality of your average C or C++ programmer, especially 10-15 years ago, no one would ever have used a std::vector if it was 5% slower than a raw array. No one would use iterators and std algorithms if they weren't as fast as just writing the loop yourself.
So the STL pioneered a lot of clever C++ tricks in order to become just as efficient as hand-rolled C code.
They are probably optimized to a great extent. Such implementations take account of memory page faults, cache misses, etc.
Getting the source of those implementations depends on the compiler they are shipped with. I think most compilers (even Microsoft) will allow you to see them.
I think the most important things to know are the architecture you are compiling to and the operating system (if any) your program will be running on. Understanding these things will allow you to precisely target the hardware.
There are also countless optimization techniques. This is a good summary. Also, global optimization is a whole science, so there are certainly many things to learn.
There are some clever things on this site, too. Good luck!
Looking at the disassembled version of your code versus their code and comparing them may give you some insights into why theirs is faster than yours.
It seems like a fool's errand to reimplement standard library functionality from scratch for the sake of making faster versions. You'd be far better served trying to modify their version to achieve your goals, although even then you really need to understand the underlying platform to be able to judge the value of the changes that you're making.
I would guess that were you to post your sort routine, it would be torn apart in minutes, and you would gain an understanding of why your version is so substantially slower than the standard library version.
Most IDEs have a command to open a named header file using the compiler's search paths. I use this pretty often and tend to keep the code for algorithm open.
For me, the code you're looking for is in
/usr/include/c++/4.2.1/bits/stl_algo.h
/usr/include/c++/4.2.1/bits/stl_tree.h
Note that a lot of people have done their theses on sorting and tree-balancing (fields I would think are picked to the bone, I would not attempt research there), and many of them are probably more determined than you to make GCC's standard library faster.
That said, there's always the possibility of exploiting patterns specific to your code (some subranges are already sorted, frequently used specific small sequence sizes, etc).
Two answers, probably generally-applicable:
You will probably not be able to implement more efficient versions of algorithms that many other smart people have spent much more time optimizing. By virtue of time and testing alone, the standard algorithms will be pretty good.
For identical algorithms, optimization is something which is "very hard" with all the current hardware and configuration variations. To give an example, the primary factor for algorithm performance on a particular platform might be which levels of cache its most frequently used routines can be stored in, which is not generally something you can optimize by hand. Hence, the compiler is generally much more of a factor for actual algorithm performance than any particular code you can write.
But yeah... if you're really serious about optimization, get down into the assembly and compare. My advice, though, would be to focus on other things, unless it's your job to optimize your implementation or something. Just my 2c.
I have a section of my program that contains a large amount of math with some rather long equations. It's long and unsightly and I wish to replace it with a function. However, this chunk of code is used an extreme number of times in my code and also requires a lot of variables to be initialized.
If I'm worried about speed, is the cost of calling the function and initializing the variables negligible here, or should I stick to directly coding it in each time?
Thanks,
-Faken
Most compilers are smart about inlining reasonably small functions to avoid the overhead of a function call. For functions big enough that the compiler won't inline them, the overhead for the call is probably a very small fraction of the total execution time.
Check your compiler documentation to understand its specific approach. Some older compilers required, or could benefit from, hints that a function is a candidate for inlining.
Either way, stick with functions and keep your code clean.
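For instance (a sketch with a made-up equation), wrapping the repeated math in a small function costs nothing once the optimizer inlines it:

// With optimization on (e.g. -O2), a small function like this is
// normally inlined, so the call overhead disappears.
inline double long_equation(double a, double b, double c) {
    return (a * b) / (a + b + 1.0) - (c * c) / (a + c + 1.0);  // made-up math
}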
Are you asking if you should optimize prematurely?
Code it in a maintainable manner first; if you then find that this section is a bottleneck in the overall program, worry about tuning it at that point.
You don't know where your bottlenecks are until you profile your code. Anything you assume about your code's hot spots is likely to be wrong. I remember once I wanted to optimize some computational code. I ran a profiler and it turned out that 70% of the running time was spent zeroing arrays. Nobody would have guessed it by looking at the code.
So: first code clean, then run a profiler, then optimize the rough spots. Not earlier. If it's still slow, change the algorithm.
Modern C++ compilers generally inline small functions to avoid function call overhead. As far as the cost of variable initialization, one of the benefits of inlining is that it allows the compiler to perform additional optimizations at the call site. After performing inlining, if the compiler can prove that you don't need those extra variables, the copying will likely be eliminated. (I assume we're talking about primitives, not things with copy constructors.)
The only way to answer that is to test it. Without knowing more about the proposed function, nobody can really say whether the compiler can/will inline that code or not. This may/will also depend on the compiler and compiler flags you use. Depending on the compiler, if you find that it's really a problem, you may be able to use different flags, a pragma, etc., to force it to be generated inline even if it wouldn't be otherwise.
Without knowing how big the function would be, and/or how long it takes to execute, it's impossible to guess how much effect on speed it will have if it isn't generated inline.
With both of those being unknown, none of us can really guess at how much effect moving the code into a function will have. There might be none, or little or huge.