Is ++(a = b); faster than a = b + 1;? - c++

Is it faster to use ++(a = b); instead of a = b + 1;?
As I understand it, the first approach consists of these operations:
move the value of b to a
increment a in memory
while the second approach does:
push b and 1 to the stack
call add
pop the result to a register
move the register to a
Does it actually take fewer cycles? Or does the compiler (gcc for example) do an optimization so it does not make a difference?

edit: TIL that ++(a=b) is UB, at least in pre-C++11. Nevertheless, I'll discuss this assuming it's either legal or the compiler does what you expect.
Generally speaking, a = b + 1; is faster.
The optimizer will almost certainly generate the same code for both. If not, it is more likely to optimize the second version, because it is a very common thing to write, and optimizers are more likely to recognize common things than weird corner cases.
Why do I say it should be the same after optimization, but the second is faster? Because of the fellow developers. Everyone recognizes a = b + 1; immediately. No one really has to think about it. The other case is more likely to trigger a reaction along the lines of "wtf is he doing there, and why?". Many people will figure out eventually what you did there. Some will not. Some might even introduce bugs because of it. Few people will find out why you did it and nevertheless stumble each time they have to read that line. Everyone will lose time wondering while reading that line. That's why the other is faster.
Caveat: all this is written silently assuming that you are talking of builtin types, like ints or pointers. Your interpretation of what the two do supports that. If we're talking of UDTs, the two lines are not even guaranteed to do the same thing. It then depends completely on how operator=, operator++ and operator+ and maybe the conversion from int are implemented. Nevertheless, if the implementations make you consider writing ++(a=b), they are most likely bad implementations and should be improved rather than hacked around.
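To make the UDT caveat concrete, here is a minimal sketch. Tracked, OpCounts and the two helper functions are hypothetical names invented for this illustration. It only shows that the two spellings call different overloaded operators, so whether they do the same work depends entirely on how those operators are written.

```cpp
// Hypothetical bookkeeping: counts which overloaded operators get called.
struct OpCounts {
    int assign = 0;
    int pre_increment = 0;
    int plus = 0;
};

OpCounts counts;

// Hypothetical user-defined type wrapping an int.
struct Tracked {
    int value;

    explicit Tracked(int v = 0) : value(v) {}
    Tracked(const Tracked&) = default;

    Tracked& operator=(const Tracked& other) {
        ++counts.assign;
        value = other.value;
        return *this;
    }

    Tracked& operator++() {
        ++counts.pre_increment;
        ++value;
        return *this;
    }
};

Tracked operator+(const Tracked& lhs, int rhs) {
    ++counts.plus;
    return Tracked(lhs.value + rhs);
}

// For a class type these are ordinary function calls:
// operator= runs first, then operator++ on the reference it returns.
void increment_of_assignment(Tracked& a, const Tracked& b) {
    ++(a = b);
}

// Builds a temporary with operator+, then assigns it with operator=.
void assignment_of_sum(Tracked& a, const Tracked& b) {
    a = b + 1;
}
```

With these particular definitions both helpers leave a holding b's value plus one, but the first goes through operator= and operator++ while the second goes through operator+ and operator=. Change any one of those operators and the two lines stop being interchangeable.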
tl;dr: if I'd catch you doing ++(a=b) in any codebase I work on, we'd have to have a serious talk ;-)

There is no simple answer to this question. The question has been flagged with C++ so we have no way of knowing what this code is actually doing without knowing the precise type of all the operands. Also, the context within which the code appears will make a difference to the way the optimiser generates code - the compiler could alias the variables and move the increment into instructions further down the program, for example, into effective address calculations for the two variables.
But the real question is, why do you care? As Arne said above, readability is far more important and you've not posted a scenario whereby any difference would have a measurable effect.
Only worry about it if it is actually causing a problem.

With optimizations on, they generate exactly the same code for me so they will perform exactly the same. This shouldn't be a surprise as the effects of both statements are exactly the same.
++(a = b); is undefined behaviour because there are two unsequenced modifications to a.
Although the value computation of a in a = b is sequenced before the modification of a due to ++, the side-effect of a = b (storage to a) is unsequenced relative to the side-effect of ++ (again, storage to a).
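If you want to compare the two forms yourself without relying on the contested expression, one option is to split it into two statements, which is well defined in every C++ version. A minimal sketch for built-in ints follows; the function names are made up for this example.

```cpp
// Copy first, then increment: the well-defined two-statement spelling
// of what ++(a = b) is meant to do.
int copy_then_increment(int b) {
    int a;
    a = b;
    ++a;
    return a;
}

// The ordinary spelling.
int add_one(int b) {
    int a;
    a = b + 1;
    return a;
}
```

Both functions return the same value for every input that does not overflow. I would expect an optimizing compiler to emit the same instructions for both, but that is an expectation to check in the generated assembly, not something the language guarantees.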

Related

How much do C/C++ compilers optimize conditional statements?

I recently ran into a situation where I wrote the following code:
for(int i = 0; i < (size - 1); i++)
{
    // do whatever
}
// Assume 'size' will be constant during the duration of the for loop
When looking at this code, it made me wonder how exactly the for loop condition is evaluated for each loop. Specifically, I'm curious as to whether or not the compiler would 'optimize away' any additional arithmetic that has to be done for each loop. In my case, would this code get compiled such that (size - 1) would have to be evaluated for every loop iteration? Or is the compiler smart enough to realize that the 'size' variable won't change, so that it could precalculate the value once instead of recomputing it on every iteration?
This then got me thinking about the general case where you have a conditional statement that may specify more operations than necessary.
As an example, how would the following three pieces of code compile:
if(6)
if(1+1+1+1+1+1)
int foo = 1;
if(foo + foo + foo + foo + foo + foo)
How smart is the compiler? Will the 3 cases listed above be converted into the same machine code?
And while I'm at it, why not list another example. What does the compiler do if you are doing an operation within a conditional that won't have any effect on the end result? Example:
if(2*(val))
// Assume val is an int that can take on any value
In this example, the multiplication is completely unnecessary. While this case seems a lot stupider than my original case, the question still stands: will the compiler be able to remove this unnecessary multiplication?
Question:
How much optimization is involved with conditional statements?
Does it vary based on compiler?
Short answer: the compiler is exceptionally clever, and will generally optimise those cases that you have presented (including utterly ignoring irrelevant conditions).
One of the biggest hurdles language newcomers face in terms of truly understanding C++, is that there is not a one-to-one relationship between their code and what the computer executes. The entire purpose of the language is to create an abstraction. You are defining the program's semantics, but the computer has no responsibility to actually follow your C++ code line by line; indeed, if it did so, it would be abhorrently slow as compared to the speed we can expect from modern computers.
Generally speaking, unless you have a reason to micro-optimise (game developers come to mind), it is best to almost completely ignore this facet of programming, and trust your compiler. Write a program that takes the inputs you want, and gives the outputs you want, after performing the calculations you want… and let your compiler do the hard work of figuring out how the physical machine is going to make all that happen.
Are there exceptions? Certainly. Sometimes your requirements are so specific that you do know better than the compiler, and you end up optimising. You generally do this after profiling and determining what your bottlenecks are. And there's also no excuse to write deliberately silly code. After all, if you go out of your way to ask your program to copy a 50MB vector, then it's going to copy a 50MB vector.
But, assuming sensible code that means what it looks like, you really shouldn't spend too much time worrying about this. Because modern compilers are so good at optimising, that you'd be a fool to try to keep up.
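As a rough illustration using the three conditions from the question (the function names are invented for this sketch): the first two are constant expressions, which static_assert can confirm at compile time. The third relies on the optimizer propagating the value of foo, which compilers typically do but the language does not promise.

```cpp
// Both conditions are constant expressions with the same value.
static_assert(6 == 1 + 1 + 1 + 1 + 1 + 1, "both conditions are the constant 6");

bool branch_on_literal() {
    if (6)
        return true;
    return false;
}

bool branch_on_sum_of_literals() {
    if (1 + 1 + 1 + 1 + 1 + 1)
        return true;
    return false;
}

bool branch_on_sum_of_local() {
    int foo = 1;  // not a constant expression, but its value is known here
    if (foo + foo + foo + foo + foo + foo)
        return true;
    return false;
}
```

All three functions always return true, so an optimizer is free to reduce each of them to the same code. Whether a given compiler actually does so is best checked in its assembly output.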
The C++ language specification permits the compiler to make any optimization that results in no observable changes to the expected results.
If the compiler can determine that size is constant and will not change during execution, it can certainly make that particular optimization.
Alternatively, if the compiler can also determine that i is not used in the loop (and its value is not used afterwards), that it is used only as a counter, it might very well rewrite the loop to:
for(int i = 1; i < size; i++)
because that might produce smaller code. Even if this i is used in some fashion, the compiler can still make this change and then adjust all other usage of i so that the observable results are still the same.
To summarize: anything goes. The compiler may or may not make any optimization change as long as the observable results are the same.
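A small sketch of what "same observable results" means for the loop in the question. The two functions below (hypothetical names) differ only in whether size - 1 is written in the condition or computed once up front. Because they return the same value for every input, a compiler is allowed to turn the first into the second.

```cpp
#include <vector>

// Bound written in the condition, as in the question.
long long sum_recomputed_bound(const std::vector<int>& data) {
    const int size = static_cast<int>(data.size());
    long long total = 0;
    for (int i = 0; i < (size - 1); i++) {
        total += data[i];
    }
    return total;
}

// Bound hoisted out of the loop by hand.
long long sum_hoisted_bound(const std::vector<int>& data) {
    const int size = static_cast<int>(data.size());
    const int last = size - 1;
    long long total = 0;
    for (int i = 0; i < last; i++) {
        total += data[i];
    }
    return total;
}
```

Writing the hoisted version by hand is rarely worth it for a local int like this; the point is only that the transformation is invisible from the outside.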
Yes, there is a lot of optimization, and it is very complex.
It varies based on the compiler, and it also varies based on the compiler options.
Check
https://meta.stackexchange.com/questions/25840/can-we-stop-recommending-the-dragon-book-please
for some book recommendations if you really want to understand what a compiler may do. It is a very complex subject.
You can also compile to assembly with the -S option (gcc / g++) to see what the compiler is really doing. Use -O3 / ... / -O0 / -O to experiment with different optimization levels.

Is uninitialized local variable the fastest random number generator?

I know that reading an uninitialized local variable is undefined behaviour (UB), and that the value may be a trap representation which could affect further operation. But sometimes I want a random number only for visual representation and will not use it in any other part of the program, for example, to give something a random color in a visual effect:
void updateEffect(){
    for(int i=0;i<1000;i++){
        int r;
        int g;
        int b;
        star[i].setColor(r%255,g%255,b%255);
        bool isVisible;
        star[i].setVisible(isVisible);
    }
}
Is it faster than
void updateEffect(){
    for(int i=0;i<1000;i++){
        star[i].setColor(rand()%255,rand()%255,rand()%255);
        star[i].setVisible(rand()%2==0?true:false);
    }
}
and also faster than other random number generators?
As others have noted, this is Undefined Behavior (UB).
In practice, it will (probably) actually (kind of) work. Reading from an uninitialized register on x86[-64] architectures will indeed produce garbage results, and probably won't do anything bad (as opposed to e.g. Itanium, where registers can be flagged as invalid, so that reads propagate errors like NaN).
There are two main problems though:
It won't be particularly random. In this case, you're reading from the stack, so you'll get whatever was there previously. Which might be effectively random, completely structured, the password you entered ten minutes ago, or your grandmother's cookie recipe.
It's Bad (capital 'B') practice to let things like this creep into your code. Technically, the compiler could insert reformat_hdd(); every time you read an undefined variable. It won't, but you shouldn't do it anyway. Don't do unsafe things. The fewer exceptions you make, the safer you are from accidental mistakes all the time.
The more pressing issue with UB is that it makes your entire program's behavior undefined. Modern compilers can use this to elide huge swaths of your code or even go back in time. Playing with UB is like a Victorian engineer dismantling a live nuclear reactor. There's a zillion things to go wrong, and you probably won't know half of the underlying principles or implemented technology. It might be okay, but you still shouldn't let it happen. Look at the other nice answers for details.
Also, I'd fire you.
Let me say this clearly: we do not invoke undefined behavior in our programs. It is never ever a good idea, period. There are rare exceptions to this rule; for example, if you are a library implementer implementing offsetof. If your case falls under such an exception you likely know this already. In this case we know using uninitialized automatic variables is undefined behavior.
Compilers have become very aggressive with optimizations around undefined behavior and we can find many cases where undefined behavior has led to security flaws. The most infamous case is probably the Linux kernel null pointer check removal which I mention in my answer to C++ compilation bug? where a compiler optimization around undefined behavior turned a finite loop into an infinite one.
We can read CERT's Dangerous Optimizations and the Loss of Causality (video) which says, amongst other things:
Increasingly, compiler writers are taking advantage of undefined
behaviors in the C and C++ programming languages to improve
optimizations.
Frequently, these optimizations are interfering with
the ability of developers to perform cause-effect analysis on their
source code, that is, analyzing the dependence of downstream results
on prior results.
Consequently, these optimizations are eliminating
causality in software and are increasing the probability of software
faults, defects, and vulnerabilities.
Specifically with respect to indeterminate values, the C standard defect report 451: Instability of uninitialized automatic variables makes for some interesting reading. It has not been resolved yet but introduces the concept of wobbly values, which means the indeterminateness of a value may propagate through the program, and the value can take different indeterminate values at different points in the program.
I don't know of any examples where this happens but at this point we can't rule it out.
Real examples, not the result you expect
You are unlikely to get random values. A compiler could optimize away the loop altogether. For example, with this simplified case:
void updateEffect(int arr[20]){
    for(int i=0;i<20;i++){
        int r ;
        arr[i] = r ;
    }
}
clang optimizes it away (see it live):
updateEffect(int*): # #updateEffect(int*)
retq
or perhaps get all zeros, as with this modified case:
void updateEffect(int arr[20]){
    for(int i=0;i<20;i++){
        int r ;
        arr[i] = r%255 ;
    }
}
see it live:
updateEffect(int*): # #updateEffect(int*)
xorps %xmm0, %xmm0
movups %xmm0, 64(%rdi)
movups %xmm0, 48(%rdi)
movups %xmm0, 32(%rdi)
movups %xmm0, 16(%rdi)
movups %xmm0, (%rdi)
retq
Both of these cases are perfectly acceptable forms of undefined behavior.
Note, if we are on an Itanium we could end up with a trap value:
[...]if the register happens to hold a special not-a-thing value,
reading the register traps except for a few instructions[...]
Other important notes
It is interesting to note the variance between gcc and clang noted in the UB Canaries project over how willing they are to take advantage of undefined behavior with respect to uninitialized memory. The article notes (emphasis mine):
Of course we need to be completely clear with ourselves that any such expectation has nothing to do with the language standard and everything to do with what a particular compiler happens to do, either because the providers of that compiler are unwilling to exploit that UB or just because they have not gotten around to exploiting it yet. When no real guarantee from the compiler provider exists, we like to say that as-yet unexploited UBs are time bombs: they’re waiting to go off next month or next year when the compiler gets a bit more aggressive.
As Matthieu M. points out What Every C Programmer Should Know About Undefined Behavior #2/3 is also relevant to this question. It says amongst other things (emphasis mine):
The important and scary thing to realize is that just about any
optimization based on undefined behavior can start being triggered on
buggy code at any time in the future. Inlining, loop unrolling, memory
promotion and other optimizations will keep getting better, and a
significant part of their reason for existing is to expose secondary
optimizations like the ones above.
To me, this is deeply dissatisfying, partially because the compiler
inevitably ends up getting blamed, but also because it means that huge
bodies of C code are land mines just waiting to explode.
For completeness sake I should probably mention that implementations can choose to make undefined behavior well defined, for example gcc allows type punning through unions while in C++ this seems like undefined behavior. If this is the case the implementation should document it and this will usually not be portable.
No, it's terrible.
The behaviour of using an uninitialised variable is undefined in both C and C++, and it's very unlikely that such a scheme would have desirable statistical properties.
If you want a "quick and dirty" random number generator, then rand() is your best bet. In a typical implementation, all it does is a multiplication, an addition, and a modulus.
The fastest generator I know of requires you to use a uint32_t as the type of the pseudo-random variable I, and use
I = 1664525 * I + 1013904223
to generate successive values. You can choose any initial value of I (called the seed) that takes your fancy. Obviously you can code that inline. The standard-guaranteed wraparound of an unsigned type acts as the modulus. (The numeric constants are hand-picked by that remarkable scientific programmer Donald Knuth.)
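Coded inline, the recurrence above could look like the following minimal sketch. The struct name Lcg32 and its interface are my own choices for illustration; only the two constants come from the text.

```cpp
#include <cstdint>

// Linear congruential generator using the recurrence
// I = 1664525 * I + 1013904223, with the modulus 2^32 supplied
// by unsigned 32-bit wraparound.
struct Lcg32 {
    std::uint32_t state;

    explicit Lcg32(std::uint32_t seed) : state(seed) {}

    std::uint32_t next() {
        state = 1664525u * state + 1013904223u;
        return state;
    }
};
```

Seeded with 0, the first two values are 1013904223 and 0x47502932. Note that the low-order bits of a power-of-two-modulus generator like this one repeat with very short periods, so if you need a small range (such as a color channel) take the high bits, e.g. next() >> 24, rather than next() % 256.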
Good question!
Undefined does not mean it's random. Think about it, the values you'd get in global uninitialized variables were left there by the system or your/other applications running. Depending what your system does with no longer used memory and/or what kind of values the system and applications generate, you may get:
Always the same value.
One of a small set of values.
Values in one or more small ranges.
Many values divisible by 2/4/8, left over from pointers on a 16/32/64-bit system
...
The values you'll get completely depend on which non-random values are left by the system and/or applications. So, indeed there will be some noise (unless your system wipes no longer used memory), but the value pool from which you'll draw will by no means be random.
Things get much worse for local variables because these come directly from the stack of your own program. There is a very good chance that your program will actually write these stack locations during the execution of other code. I estimate the chances for luck in this situation very low, and a 'random' code change you make tries this luck.
Read about randomness. As you'll see randomness is a very specific and hard to obtain property. It's a common mistake to think that if you just take something that's hard to track (like your suggestion) you'll get a random value.
Many good answers, but allow me to add another and stress the point that in a deterministic computer, nothing is random. This is true for both the numbers produced by a pseudo-RNG and the seemingly "random" numbers found in areas of memory reserved for C/C++ local variables on the stack.
BUT... there is a crucial difference.
The numbers generated by a good pseudorandom generator have the properties that make them statistically similar to truly random draws. For instance, the distribution is uniform. The cycle length is long: you can get millions of random numbers before the cycle repeats itself. The sequence is not autocorrelated: for instance, you will not begin to see strange patterns emerge if you take every 2nd, 3rd, or 27th number, or if you look at specific digits in the generated numbers.
In contrast, the "random" numbers left behind on the stack have none of these properties. Their values and their apparent randomness depend entirely on how the program is constructed, how it is compiled, and how it is optimized by the compiler. By way of example, here is a variation of your idea as a self-contained program:
#include <stdio.h>

void notrandom(void)
{
    int r, g, b;   /* deliberately left uninitialized */
    printf("R=%d, G=%d, B=%d", r&255, g&255, b&255);
}

int main(int argc, char *argv[])
{
    int i;
    for (i = 0; i < 10; i++)
    {
        notrandom();
        printf("\n");
    }
    return 0;
}
When I compile this code with GCC on a Linux machine and run it, it turns out to be rather unpleasantly deterministic:
R=0, G=19, B=0
R=130, G=16, B=255
R=130, G=16, B=255
R=130, G=16, B=255
R=130, G=16, B=255
R=130, G=16, B=255
R=130, G=16, B=255
R=130, G=16, B=255
R=130, G=16, B=255
R=130, G=16, B=255
If you looked at the compiled code with a disassembler, you could reconstruct what was going on, in detail. The first call to notrandom() used an area of the stack that was not used by this program previously; who knows what was in there. But after that call to notrandom(), there is a call to printf() (which the GCC compiler actually optimizes to a call to putchar(), but never mind) and that overwrites the stack. So the next and subsequent times, when notrandom() is called, the stack will contain stale data from the execution of putchar(), and since putchar() is always called with the same arguments, this stale data will always be the same, too.
So there is absolutely nothing random about this behavior, nor do the numbers obtained this way have any of the desirable properties of a well-written pseudorandom number generator. In fact, in most real-life scenarios, their values will be repetitive and highly correlated.
Indeed, as others, I would also seriously consider firing someone who tried to pass off this idea as a "high performance RNG".
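For contrast, here is a sketch of the same kind of color loop with fully defined behavior, using the standard <random> header. The Color struct and make_colors function are hypothetical stand-ins for the question's star array, which we never see.

```cpp
#include <array>
#include <cstdint>
#include <random>

// Hypothetical stand-in for whatever star[i] stores.
struct Color {
    int r;
    int g;
    int b;
    bool visible;
};

// Fills 1000 colors from a seeded Mersenne Twister engine.
std::array<Color, 1000> make_colors(std::uint32_t seed) {
    std::mt19937 engine(seed);
    std::uniform_int_distribution<int> channel(0, 255);  // inclusive range
    std::bernoulli_distribution coin(0.5);

    std::array<Color, 1000> colors{};
    for (Color& c : colors) {
        c.r = channel(engine);
        c.g = channel(engine);
        c.b = channel(engine);
        c.visible = coin(engine);
    }
    return colors;
}
```

The same seed reproduces the same colors on a given standard library, which is handy for debugging. For different colors on each run, the seed could come from std::random_device.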
Undefined behavior means that the authors of compilers are free to ignore the problem because programmers will never have a right to complain whatever happens.
While in theory when entering UB land anything can happen (including a daemon flying off your nose), what it normally means is that compiler authors just won't care and, for local variables, the value will be whatever is in the stack memory at that point.
This also means that often the content will be "strange" but fixed, or slightly random, or variable but with a clearly evident pattern (e.g. increasing values at each iteration).
You certainly cannot expect it to be a decent random generator.
Undefined behaviour is undefined. It doesn't mean that you get an undefined value, it means that the program can do anything and still meet the language specification.
A good optimizing compiler should take
void updateEffect(){
    for(int i=0;i<1000;i++){
        int r;
        int g;
        int b;
        star[i].setColor(r%255,g%255,b%255);
        bool isVisible;
        star[i].setVisible(isVisible);
    }
}
and compile it to a noop. This is certainly faster than any alternative. It has the downside that it will not do anything, but such is the downside of undefined behaviour.
Not mentioned yet, but code paths that invoke undefined behavior are allowed to do whatever the compiler wants, e.g.
void updateEffect(){}
Which is certainly faster than your correct loop, and because of UB, is perfectly conformant.
For security reasons, new memory assigned to a program has to be cleaned; otherwise the information could be used, and passwords could leak from one application into another. Only when you reuse memory do you get values other than 0. And it is very likely that on the stack the previous value is simply fixed, because the previous use of that memory is fixed.
Your particular code example would probably not do what you are expecting. While technically each iteration of the loop re-creates the local variables for the r, g, and b values, in practice it's the exact same memory space on the stack. Hence it won't get re-randomized with each iteration, and you will end up assigning the same 3 values for each of the 1000 colors, regardless of how random the r, g, and b are individually and initially.
Indeed, if it did work, I would be very curious as to what's re-randomizing it. The only thing I can think of would be an interleaved interrupt that piggybacked atop that stack, which is highly unlikely. Perhaps internal optimization that kept those as register variables rather than as true memory locations, where the registers get re-used further down in the loop, would do the trick, too, especially if the set visibility function is particularly register-hungry. Still, far from random.
As most people here have mentioned, this is undefined behavior. Undefined also means that you may (luckily) get some valid integer value, and in that case this will be faster (as no rand function call is made).
But don't use it in practice. I am sure this will give terrible results, as luck is not with you all the time.
Really bad! Bad habit, bad result.
Consider:
A_Function_that_use_a_lot_the_Stack();
updateEffect();
If A_Function_that_use_a_lot_the_Stack() always performs the same initialization, it leaves the stack with the same data on it. That data is what we get when calling updateEffect(): always the same value!
I performed a very simple test, and it wasn't random at all.
#include <stdio.h>
int main() {
    int a;
    printf("%d\n", a);
    return 0;
}
Every time I ran the program, it printed the same number (32767 in my case) -- you can't get much less random than that. This is presumably whatever the startup code in the runtime library left on the stack. Since it uses the same startup code every time the program runs, and nothing else varies in the program between runs, the results are perfectly consistent.
You need to have a definition of what you mean by 'random'.
A sensible definition involves that the values you get should have little correlation. That's something you can measure. It's also not trivial to achieve in a controlled, reproducible manner. So undefined behaviour is certainly not what you are looking for.
There are certain situations in which uninitialized memory may be safely read using type "unsigned char*" [e.g. a buffer returned from malloc]. Code may read such memory without having to worry about the compiler throwing causality out the window, and there are times when it may be more efficient to have code be prepared for anything memory might contain than to ensure that uninitialized data won't be read (a commonplace example of this would be using memcpy on partially-initialized buffer rather than discretely copying all of the elements that contain meaningful data).
Even in such cases, however, one should always assume that if any combination of bytes will be particularly vexatious, reading it will always yield that pattern of bytes (and if a certain pattern would be vexatious in production, but not in development, such a pattern won't appear until code is in production).
Reading uninitialized memory might be useful as part of a random-generation strategy in an embedded system where one can be sure the memory has never been written with substantially-non-random content since the last time the system was powered on, and if the manufacturing process used for the memory causes its power-on state to vary in semi-random fashion. Code should work even if all devices always yield the same data, but in cases where e.g. a group of nodes each need to select arbitrary unique IDs as quickly as possible, having a "not very random" generator which gives half the nodes the same initial ID might be better than not having any initial source of randomness at all.
As others have said, it will be fast, but not random.
What most compilers will do for local variables is to grab some space for them on the stack, but not bother setting it to anything (the standard says they don't need to, so why slow down the code you're generating?).
In this case, the value you'll get will depend on what was previously on the stack. If you call a function before this one that has a hundred local char variables all set to 'Q', and then call your function after that returns, then you'll probably find your "random" values behave as if you've memset() them all to 'Q's.
Importantly for your example function trying to use this, these values won't change each time you read them; they'll be the same every time. So you'll get 1000 stars all set to the same colour and visibility.
Also, nothing says that the compiler shouldn't initialize these values, so a future compiler might do so.
In general: bad idea, don't do it.
(like a lot of "clever" code level optimizations really...)
As others have already mentioned, this is undefined behavior (UB), but it may "work".
Apart from the problems already mentioned by others, I see one other problem (disadvantage): it will not work in any language other than C and C++. I know that this question is about C++, but if you can write code which will be good C++ and Java code and it's not a problem then why not? Maybe some day someone will have to port it to another language, and searching for bugs caused by "magic tricks" UB like this will definitely be a nightmare (especially for an inexperienced C/C++ developer).
Here is a question about another similar UB. Just imagine yourself trying to find a bug like this without knowing about this UB. If you want to read more about such strange things in C/C++, read the answers to the question from the link and see this GREAT slideshow. It will help you understand what's under the hood and how it's working; it's not just another slideshow full of "magic". I'm quite sure that even most experienced C/C++ programmers can learn a lot from this.
It is not a good idea to base any logic on undefined behaviour in the language. In addition to what has been discussed in this post, I would like to mention that with a modern C++ approach/style such a program may not even compile.
This was mentioned in my previous post, which covers the advantages of the auto feature and has a useful link on the topic.
https://stackoverflow.com/a/26170069/2724703
So, if we change the above code and replace the actual types with auto, the program would not even compile.
void updateEffect(){
    for(int i=0;i<1000;i++){
        auto r;
        auto g;
        auto b;
        star[i].setColor(r%255,g%255,b%255);
        auto isVisible;
        star[i].setVisible(isVisible);
    }
}
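The reason the auto version is rejected is that auto deduces the type from the initializer, so a declaration without one is ill-formed. A tiny sketch of the compiling counterpart, with a made-up function name and an arbitrary value of 128:

```cpp
int pick_channel() {
    // auto r;       // ill-formed: 'auto' needs an initializer to deduce a type
    auto r = 128;    // deduced as int, and therefore always initialized
    return r % 255;
}
```

In other words, declaring locals with auto makes it impossible to leave them uninitialized by accident.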
I like your way of thinking. Really outside the box. However the tradeoff is really not worth it. Memory-runtime tradeoff is a thing, including undefined behavior for runtime is not.
It must give you a very unsettling feeling to know you are using such "random" as your business logic. I wouldn't do it.
Use 7757 every place you are tempted to use uninitialized variables. I picked it randomly from a list of prime numbers:
it is defined behavior
it is guaranteed to not always be 0
it is prime
it is likely to be as statistically random as uninitialized variables
it is likely to be faster than uninitialized variables since its value is known at compile time
There is one more possibility to consider.
Modern compilers (ahem g++) are so intelligent that they go through your code to see what instructions affect state, and what don't, and if an instruction is guaranteed to NOT affect the state, g++ will simply remove that instruction.
So here's what will happen. g++ will definitely see that you are reading, performing arithmetic on, saving, what is essentially a garbage value, which produces more garbage. Since there is no guarantee that the new garbage is any more useful than the old one, it will simply do away with your loop. BLOOP!
This method is useful, but here's what I would do. Combine UB (Undefined Behaviour) with rand() speed.
Of course, reduce rand()s executed, but mix them in so compiler doesn't do anything you don't want it to.
And I won't fire you.
Using uninitialized data for randomness is not necessarily a bad thing if done properly. In fact, OpenSSL does exactly this to seed its PRNG.
Apparently this usage wasn't well documented however, because someone noticed Valgrind complaining about using uninitialized data and "fixed" it, causing a bug in the PRNG.
So you can do it, but you need to know what you're doing and make sure that anyone reading your code understands this.

what's the difference of i++ and ++i in for loop? [duplicate]

Perhaps it doesn't matter to the compiler once it optimizes, but in C/C++, I see most people make a for loop in the form of:
for (i = 0; i < arr.length; i++)
where the incrementing is done with the post fix ++. I get the difference between the two forms. i++ returns the current value of i, but then adds 1 to i on the quiet. ++i first adds 1 to i, and returns the new value (being 1 more than i was).
I would think that i++ takes a little more work, since a previous value needs to be stored in addition to a next value: Push *(&i) to stack (or load to register); increment *(&i). Versus ++i: Increment *(&i); then use *(&i) as needed.
(I get that the "Increment *(&i)" operation may involve a register load, depending on CPU design. In which case, i++ would need either another register or a stack push.)
Anyway, at what point, and why, did i++ become more fashionable?
I'm inclined to believe azheglov: It's a pedagogic thing, and since most of us do C/C++ on a Windows or *nix system where the compilers are of high quality, nobody gets hurt.
If you're using a low quality compiler or an interpreted environment, you may need to be sensitive to this. Certainly, if you're doing advanced C++ or device driver or embedded work, hopefully you're well seasoned enough for this to be not a big deal at all. (Do dogs have Buddha-nature? Who really needs to know?)
It doesn't matter which you use. On some extremely obsolete machines, and in certain instances with C++, ++i is more efficient, but modern compilers don't store the result if it's not used. As to when it became popular to postincrement in for loops, my copy of K&R 2nd edition uses i++ on page 65 (the first for loop I found while flipping through.)
For some reason, i++ became more idiomatic in C, even though it creates a needless copy. (I thought that was through K&R, but I see this debated in other answers.) But I don't think there's a performance difference in C, where it's only used on built-ins, for which the compiler can optimize away the copy operation.
It does make a difference in C++, however, where i might be a user-defined type for which operator++ is overloaded. The compiler might not be able to prove that the copy made by the postfix form has no visible side effects and might thus not be able to eliminate it.
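To make the point about user-defined types concrete, here is a small sketch. Counter is a hypothetical type, not taken from any library, whose copy constructor has a visible side effect (it bumps a counter), so the copy made by the postfix form cannot be silently dropped.

```cpp
#include <cassert>

// Hypothetical iterator-like type that counts how often it is copied.
struct Counter {
    int value = 0;
    static int copies;  // number of copy constructions so far

    Counter() = default;
    Counter(const Counter& other) : value(other.value) { ++copies; }
    Counter& operator=(const Counter&) = default;

    // Prefix: increment in place and return *this by reference, no copy.
    Counter& operator++() {
        ++value;
        return *this;
    }

    // Postfix: must return the OLD state by value, so it makes a copy.
    Counter operator++(int) {
        Counter old(*this);
        ++value;
        return old;
    }
};

int Counter::copies = 0;

// Number of copies made by n prefix increments.
int copies_for_prefix(int n) {
    Counter c;
    Counter::copies = 0;
    for (int k = 0; k < n; ++k) ++c;
    return Counter::copies;
}

// Number of copies made by n postfix increments.
int copies_for_postfix(int n) {
    Counter c;
    Counter::copies = 0;
    for (int k = 0; k < n; ++k) c++;
    return Counter::copies;
}

void check_copy_counts() {
    assert(copies_for_prefix(10) == 0);
    assert(copies_for_postfix(10) >= 10);  // at least one copy per increment
}
```

The exact postfix count depends on whether the compiler elides the copy on return, which is why the check only states a lower bound.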
As for the reason why, here is what K&R had to say on the subject:
Brian Kernighan
you'll have to ask dennis (and it might be in the HOPL paper). i have a
dim memory that it was related to the post-increment operation in the
pdp-11, though beyond that i don't know, so don't quote me.
in c++ the preferred style for iterators is actually ++i for some subtle
implementation reason.
Dennis Ritchie
No particular reason, it just became fashionable. The code produced
is identical on the PDP-11, just an inc instruction, no autoincrement.
HOPL Paper
Thompson went a step further by inventing the ++ and -- operators, which increment or decrement; their prefix or postfix position determines whether the alteration occurs before or after noting the value of the operand. They were not in the earliest versions of B, but appeared along the way. People often guess that they were created to use the auto-increment and auto-decrement address modes provided by the DEC PDP-11 on which C and Unix first became popular. This is historically impossible, since there was no PDP-11 when B was developed. The PDP-7, however, did have a few ‘auto-increment’ memory cells, with the property that an indirect memory reference through them incremented the cell. This feature probably suggested such operators to Thompson; the generalization to make them both prefix and postfix was his own. Indeed, the auto-increment cells were not used directly in implementation of the operators, and a stronger motivation for the innovation was probably his observation that the translation of ++x was smaller than that of x=x+1.
For integer types the two forms should be equivalent when you don't use the value of the expression. This is no longer true in the C++ world with more complicated types, but is preserved in the language name.
I suspect that "i++" became more popular in the early days because that's the style used in the original K&R "The C Programming Language" book. You'd have to ask them why they chose that variant.
Because as soon as you start using "++i" people will be confused and curious. They will halt their everyday work and start googling for explanations. 12 minutes later they will end up on Stack Overflow and create a question like this. And voila, your employer just spent yet another $10.
Going a little further back than K&R, I looked at its predecessor: Kernighan's C tutorial (~1975). Here the first few while examples use ++n. But each and every for loop uses i++. So to answer your question: Almost right from the beginning i++ became more fashionable.
My theory (why i++ is more fashionable) is that when people learn C (or C++) they eventually learn to code iterations like this:
while( *p++ ) {
...
}
Note that the postfix form is important here (using the prefix form would create an off-by-one type of bug).
When the time comes to write a for loop where ++i or i++ doesn't really matter, it may feel more natural to use the postfix form.
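The off-by-one risk in that idiom can be sketched with a string copy in the same style. Both function names are hypothetical, and the prefix version is wrong on purpose.

```cpp
#include <cassert>
#include <cstring>

// Postfix: copy the current character, THEN advance both pointers.
// Copies every character including the terminating '\0'.
void copy_postfix(char* dst, const char* src) {
    while ((*dst++ = *src++) != '\0') {
    }
}

// Prefix (buggy on purpose): advance FIRST, then copy. Index 0 is skipped,
// and an empty src would be read past its terminator, so callers here must
// pass a non-empty string.
void copy_prefix(char* dst, const char* src) {
    while ((*++dst = *++src) != '\0') {
    }
}

void check_copy_forms() {
    char a[8] = "_______";
    char b[8] = "_______";
    copy_postfix(a, "abc");
    copy_prefix(b, "abc");
    assert(std::strcmp(a, "abc") == 0);
    assert(std::strcmp(b, "_bc") == 0);  // first character was never copied
}
```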
ADDED: What I wrote above applies to primitive types, really. When coding something with primitive types, you tend to do things quickly and do what comes naturally. That's the important caveat that I need to attach to my theory.
If ++ is an overloaded operator on a C++ class (the possibility Rich K. suggested in the comments) then of course you need to code loops involving such classes with extreme care as opposed to doing simple things that come naturally.
At some level it's idiomatic C code. It's just the way things are usually done. If that's your big performance bottleneck you're likely working on a unique problem.
However, looking at my K&R The C Programming Language, 1st edition, the first instance I find of i in a loop (p. 38) does use ++i rather than i++.
In my opinion it became more fashionable with the creation of C++ as C++ enables you to call ++ on non-trivial objects.
Ok, I elaborate: If you call i++ and i is a non-trivial object, then storing a copy containing the value of i before the increment will be more expensive than for say a pointer or an integer.
I think my predecessors are right regarding the side effects of choosing postincrement over preincrement.
As for its fashionability, it may be as simple as wanting to start all three expressions within the for statement in the same repetitive way, something the human brain seems to lean towards.
I would add to what other people told you: the main rule is to be consistent. Pick one, and do not use the other one unless it is a specific case.
If the loop is too long, you need to reload the value into the cache to increment it before the jump to the beginning.
You don't need that with ++i: no cache move.
In C, all operators that result in a variable having a new value besides prefix inc/dec modify the left hand variable (i=2, i+=5, etc). So in situations where ++i and i++ can be interchanged, many people are more comfortable with i++ because the operator is on the right hand side, modifying the left hand variable.
Please tell me if that first sentence is incorrect, I'm not an expert with C.

Why worry about 'undefined behavior' in >> of signed type?

My question is related to this one and will contain a few questions.
For me the most obvious solution to the above problem (meaning the one I would use in my code) is just this:
uint8_t x = some value;
x = (int8_t)x >> 7;
Yes, yes I hear you all .... undefined behavior and this is why I've not posted my 'solution'.
I have a feeling (maybe it is only my sick mind) that the term 'undefined behavior' is overused on SO just to justify downvoting someone if the question is tagged c/c++.
So - let's (for a while) put aside C/C++ standards and think about everyday life/programming, real compiler implementations and code they generate for contemporary hardware.
Taking into account the following:
As far as I remember all the hardware I had encountered had distinct instructions for arithmetic and logical shift.
All compilers that I know translate >> into arithmetic shift for signed types and logical shift for unsigned types.
I cannot recall any compiler ever emitting a div-like low-level instruction when >> was used in C/C++ code (and we are not talking about operator overloading here).
All the hardware I know uses two's complement (U2).
So ... is there anything (any contemporary compiler or hardware) that behaves differently from what is mentioned above? Put simply, should I ever be worried about a right shift of a signed value not being translated to an arithmetic shift?
My 'solution' compiles to just one low level instruction on many platforms while others require multiple low level instructions. What would you use in your code?
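For comparison, here is a sketch of the one-liner next to a variant that avoids shifting a negative value. It assumes the goal is to turn the top bit of x into an all-ones or all-zeros byte, which is what the one-liner computes when >> is an arithmetic shift; the function names are hypothetical. Note that in C and in C++ before C++20 the result of right-shifting a negative signed value is implementation-defined rather than undefined, and C++20 specifies it as an arithmetic shift.

```cpp
#include <cassert>
#include <cstdint>

// The one-liner from the question. Converting values >= 128 to int8_t and
// right-shifting a negative value are implementation-defined before C++20.
std::uint8_t mask_by_shift(std::uint8_t x) {
    return static_cast<std::uint8_t>(static_cast<std::int8_t>(x) >> 7);
}

// Hypothetical portable variant: x is promoted to a non-negative int, so the
// shift yields 0 or 1, and converting -1 to an unsigned type wraps to 0xFF.
std::uint8_t mask_portable(std::uint8_t x) {
    return static_cast<std::uint8_t>(-(x >> 7));
}

void check_masks() {
    assert(mask_portable(0x00) == 0x00);
    assert(mask_portable(0x7F) == 0x00);
    assert(mask_portable(0x80) == 0xFF);
    assert(mask_portable(0xFF) == 0xFF);
}
```

On compilers that implement signed >> as an arithmetic shift, both functions return the same value for every input, so the choice is about what the language guarantees rather than about the result.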
Truth please ;-)
Why worry about 'undefined behavior' in >> of signed type?
Because it doesn't really matter how well defined any particular undefined behaviour is now; the point is that it may break at any point in the future. You're relying on a side-effect that may be optimized (or un-optimized) away at any point for any reason or no reason.
Also, I don't want to have to ask somebody with detailed knowledge of many different compilers' implementations before I use something I shouldn't use in the first place, so I skip it.
Yes, there are compilers which behave differently from what you assume.
In particular, optimization phases within compilers. These take advantage of the known possible values of variables, and will derive those possible values from the absence of UB. A pointer must be non-null if it's been dereferenced, an integer must be non-zero if it's been used as a divisor, and a right-shifted value must be non-negative.
And that works back in time:
if (x<0) {
    printf("This is dead code\n");
}
x >> 3;
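The ordering point can be sketched with integer division, where dividing by zero is undefined behavior: a guard written after the division may legally be removed, because the compiler is allowed to assume the division was valid. The helper below is hypothetical and simply puts every guard before the operation.

```cpp
#include <cassert>
#include <climits>

// Hypothetical helper: every input that would make num / den undefined is
// rejected BEFORE the division is evaluated.
bool checked_div(int num, int den, int& out) {
    if (den == 0) return false;                     // division by zero
    if (num == INT_MIN && den == -1) return false;  // quotient overflows int
    out = num / den;
    return true;
}

void check_checked_div() {
    int q = 0;
    assert(checked_div(7, 2, q) && q == 3);
    assert(!checked_div(1, 0, q));
    assert(!checked_div(INT_MIN, -1, q));
}
```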
What it really comes down to is, are you willing to take the risk?
"The standard doesn't guarantee yada yada" is nice and all, but let's be honest now, the risk isn't big. If you're going to run your code on some crazy platform, you generally know in advance. And if it takes you by surprise, well, that's the risk you took.
Also, the workaround is horrible. If you're not going to need it, it's just polluting your codebase with pointless "function calls instead of right shifts" that will be harder to maintain (and thus carry a cost). And you'll never be able to "paste and forget" code from other places into the project - you'd always have to check the code for the possibility of right shifting negative signed integers.

Rate ++a, a++, a=a+1 and a+=1 in terms of execution efficiency in C. Assume gcc to be the compiler [duplicate]

Possible Duplicate:
Is there a performance difference between i++ and ++i in C++?
In terms of usage of the following, please rate in terms of execution time in C.
In some interviews I was asked which of these variations I should use and why.
a++
++a
a=a+1
a+=1
Here is what g++ -S produces:
void irrelevant_low_level_worries()
{
    int a = 0;
    // movl $0, -4(%ebp)
    a++;
    // incl -4(%ebp)
    ++a;
    // incl -4(%ebp)
    a = a + 1;
    // incl -4(%ebp)
    a += 1;
    // incl -4(%ebp)
}
So even without any optimizer switches, all four statements compile to the exact same machine code.
You can't rate the execution time in C, because it's not the C code that is executed. You have to profile the executable code compiled with a specific compiler running on a specific computer to get a rating.
Also, rating a single operation doesn't give you something that you can really use. Today's processors execute several instructions in parallel, so the efficiency of an operation relies very much on how well it can be paired with the instructions in the surrounding code.
So, if you really need to use the one that has the best performance, you have to profile the code. Otherwise (which is about 98% of the time) you should use the one that is most readable and best conveys what the code is doing.
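As a rough illustration of what "measure it" can look like, here is a minimal timing sketch built on std::chrono::steady_clock. The names time_it, Timings and compare_increments are hypothetical, and a real profiler will tell you more than this does.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper: runs body() `iterations` times and returns the
// elapsed wall-clock time.
template <typename F>
std::chrono::nanoseconds time_it(F&& body, int iterations) {
    auto start = std::chrono::steady_clock::now();
    for (int k = 0; k < iterations; ++k) body();
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start);
}

struct Timings {
    std::chrono::nanoseconds post;  // time spent on a++
    std::chrono::nanoseconds pre;   // time spent on ++b
    int a;                          // final values, so the work is observable
    int b;
};

// Times the two spellings. With optimization enabled the compiler may
// collapse both loops, so the numbers only describe one specific build
// on one specific machine.
Timings compare_increments(int iterations) {
    int a = 0;
    int b = 0;
    auto post = time_it([&] { a++; }, iterations);
    auto pre = time_it([&] { ++b; }, iterations);
    return {post, pre, a, b};
}

void check_compare_increments() {
    Timings t = compare_increments(1000);
    assert(t.a == 1000);
    assert(t.b == 1000);
}
```

Treat any difference between the two timings as noise until it shows up repeatedly, on the target machine, with the build flags you actually ship.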
The circumstances where these kinds of things actually matter are few and far between. Most of the time, it doesn't matter at all. In fact I'm willing to bet that this is the case for you.
What is true for one language/compiler/architecture may not be true for others. And really, the fact is irrelevant in the bigger picture anyway. Knowing these things does not make you a better programmer.
You should study algorithms, data structures, asymptotic analysis, clean and readable coding style, programming paradigms, etc. Those skills are a lot more important in producing performant and manageable code than knowing these kinds of low-level details.
Do not optimize prematurely, but also, do not micro-optimize. Look for the big picture optimizations.
This depends on the type of a as well as on the context of execution. If a is of a primitive type and if all four statements have the same identical effect then these should all be equivalent and identical in terms of efficiency. That is, the compiler should be smart enough to translate them into the same optimized machine code. Granted, that is not a requirement, but if it's not the case with your compiler then that is a good sign to start looking for a better compiler.
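One way to check this on your own compiler is to put each spelling in its own function and compare the assembly, for example from g++ -S. The four function names below are hypothetical; what the sketch itself shows is only that the results are identical for an int.

```cpp
#include <cassert>

// One hypothetical function per spelling; each returns a + 1.
int via_post_increment(int a) { a++; return a; }
int via_pre_increment(int a)  { ++a; return a; }
int via_add_assign(int a)     { a = a + 1; return a; }
int via_compound(int a)       { a += 1; return a; }

void check_four_forms() {
    assert(via_post_increment(41) == 42);
    assert(via_pre_increment(41) == 42);
    assert(via_add_assign(41) == 42);
    assert(via_compound(41) == 42);
}
```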
For most compilers it should compile to the same ASM code.
Same.
For more details see http://www.linux-kongress.org/2009/slides/compiler_survey_felix_von_leitner.pdf
I can't see why there should be any difference in execution time, but let's prove me wrong.
a++
and
++a
are not the same however, but this is not related to efficiency.
When it comes to performance of individual lines, context is always important, and guessing is not a good idea. Testing and measuring is better.
In an interview, I would go with two answers:
At first glance, the generated code should be very similar, especially if a is an integer.
If execution time were a known problem, you would have to measure it using some kind of profiler.
Well, you could argue that a++ is short and to the point. It can only increment a by one, but the notation is very well understood. a=a+1 is a little more verbose (not a big deal, unless you have variablesWithGratuitouslyLongNames), but some might argue it's more "flexible" because you can replace the 1 or either of the a's to change the expression. a+=1 is maybe not as flexible as the other two but is a little more clear, in the sense that you can change the increment amount. ++a is different from a++ and some would argue against it because it's not always clear to people who don't use it often.
In terms of efficiency, I think most modern compilers will produce the same code for all of these but I could be mistaken. Really, you'd have to run your code with all variations and measure which performs best.
(assuming that a is an integer)
It depends on the context, and on whether we are in C or C++. In C the code you posted (except for a-- :-) will cause a modern C compiler to produce exactly the same code. But very likely the expected answer is that a++ is the fastest one and a=a+1 the slowest, since ancient compilers relied on the user to perform such optimizations.
In C++ it depends on the type of a. When a is a numeric type, it acts the same way as in C, which means a++, a+=1 and a=a+1 generate the same code. When a is an object, it depends on whether any operator (++, + and =) is overloaded, since then the overloaded operator of the a object is called.
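Nothing in the language forces those overloads to agree with each other. A conventional way to keep them consistent, sketched here with a hypothetical Meters type, is to implement everything in terms of operator+=.

```cpp
#include <cassert>

// Hypothetical user-defined type. All increment spellings are built on
// operator+=, so ++a, a++, a += 1 and a = a + 1 agree on the final value.
struct Meters {
    int value;

    Meters& operator+=(int delta) {
        value += delta;
        return *this;
    }

    // ++a: modify in place, return a reference.
    Meters& operator++() { return *this += 1; }

    // a++: return the old state by value, then modify.
    Meters operator++(int) {
        Meters old = *this;
        ++*this;
        return old;
    }
};

// a + 1: works on a copy of the left operand.
Meters operator+(Meters lhs, int delta) { return lhs += delta; }

void check_meters() {
    Meters a{1}, b{1}, c{1}, d{1};
    ++a;
    b++;
    c += 1;
    d = d + 1;
    assert(a.value == 2 && b.value == 2 && c.value == 2 && d.value == 2);
}
```

With a class written differently, each of the four statements could call unrelated code, which is why no general efficiency ranking exists for objects.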
Also when you work in a field with very special compilers (like microcontrollers or embedded systems) these compilers can behave very differently on each of these input variations.