As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 10 years ago.
I have the following C++ code:
#include <iostream>

int main(int argc, const char * argv[])
{
    goto line2;
line1:
    std::cout << "line 1";
    goto line3;
line2:
    std::cout << "line 2";
    goto line1;
line3:
    std::cout << "line 3";
    goto line4;
line4:
    std::cout << "Hello, World!\n";
    return 0;
}
Suppose I made a larger program of, let's say, 10,000 lines of code, and I decided never to write functions of my own, using only goto statements and only global variables. I am slightly insane in terms of best practices, but it's for a very specific purpose. The question is: would it be efficient to jump around with goto statements? What if I have 1000 goto labels?
Do the goto statements translate directly into machine code which tells the computer just to JUMP to a different memory address? Is this a lower cost in the machine to jump around like this when compared with the cost to call a function?
I wish to know as I want to write a very efficient program to do some computations and I need to be very efficient without resorting to Assembly/Machine code.
No need to tell me this is a bad idea in terms of maintenance, understandability of code, best practices, I'm very aware of that, I just wish to have an answer to the question. I don't want any debate between whether its good to use function calls or good to use goto.
To clarify the question: I am concerned with how a 10,000-line program using only gotos would compare with a traditional program using functions. There are multiple ways to compare and contrast the two, for example how the CPU cache would perform. What sort of saving would avoiding function calls give? Without a call stack, how would this impact the CPU cache, given that CPU caches usually keep the stack close? Could there therefore be a case where performance suffers because the cache is not utilized correctly? And what is the actual cost of calling a function compared to a jump, in terms of time efficiency? There are many ways to compare and contrast the two styles of programming in terms of efficiency.
Do the goto statements translate directly into machine code which tells the computer just to JUMP to a different memory address?
Yes.
Is this a lower cost in the machine to jump around like this when compared with the cost to call a function?
Yes.
However, when the compiler sees a function call, it doesn't have to actually generate code to call a function. It can take the guts of the function and stick them right in where the call was, without even a jump. So it could be more efficient to call a function!
Additionally, the smaller your code, the more efficient it will be (generally speaking), since it is more likely to fit in the CPU cache. The compiler can see this, and can determine when a function is small and it's better to inline it, or when it's big and better to separate it and make it a real function, to generate the fastest code (if you have it set to generate the fastest code possible). You can't see this, so you guess and probably guess wrong.
And those are just some of the obvious ones. There are so many other optimisations a compiler can do. Let the compiler decide. It's smarter than you. It's smarter than me. The compiler knows all. Seriously, Cthulhu is probably a compiler.
You said not to, but I'm going to say it: I highly advise you to profile your code before deciding to do this, I can almost guarantee it's not worth your time. The compiler (most of which are near-AI level smart) can probably generate as fast or faster code with regular function calls, not to mention the maintenance aspect.
Do the goto statements translate directly into machine code which tells the computer just to JUMP to a different memory address?
Pretty much.
Is this a lower cost in the machine to jump around like this when compared with the cost to call a function?
A function call is going to make a pretty similar jump, but before you can make the jump, you have to set up the new stack frame for the new function, push on parameters according to calling conventions, and at the end set up any return value and unwind. Yes, it's probably faster to not do this.
I am slightly insane
Yes.
1) The question is, would this be efficient to jump around with goto statements? What if I have 1000 goto labels?
From your small example with four goto labels, where you jump back and forth: no, it is not efficient in terms of performance. In avoiding the overhead of the function call mechanism, this method disables many other optimizations that the compiler would otherwise perform for you automatically. I am not listing them all here, but they are worth reading about.
2) Do the goto statements translate directly into machine code which tells the computer just to JUMP to a different memory address?
YES (as others have correctly pointed out).
3) Is this a lower cost in the machine to jump around like this when compared with the cost to call a function?
YES, but only if your compiler is prehistoric and has no built-in optimization machinery. Otherwise, NO.
And I am not even talking about best practices.
Yes, the machine code generated from goto will be a straight JUMP. And this will probably be faster than a function call, because nothing has to be done to the stack (although a call without any variables to pass will be optimized in such a way that it might be just as fast).
And God help you when something doesn't work with this code. Or when someone else has to maintain it.
It is hard to answer your query precisely; it depends on the complexity of your program and how the goto statements are used within it.
A goto statement is equivalent to an unconditional jump instruction (e.g. jmp). Note that a label's scope is the function it appears in.
Ritchie suggested avoiding the goto statement, and, if you still want or have to use it, using it only in a top-down way (jumping forward), not bottom-up (jumping backward).
Well, those are the textbook details.
Practically, you should be very sure where you use a goto statement and where your program flow will be after the jump; otherwise, as you mentioned, with 1000 goto statements it will be difficult even for you to follow the program flow, never mind anyone else, so further improvement of your program will be very, very difficult.
There are plenty of other facilities, such as loops, conditional statements, break and continue statements, and functions, to help you avoid such problems.
Hope it helps.
In short, goto statements can be efficient, but it really comes down to how they are used. According to Herbert Schildt (C++ from the GROUND UP), "There are no programming situations that require the use of the goto statement -- it is not an item necessary for making the language complete." Ultimately, the primary reason many programmers dislike the statement is that gotos tend to clutter your code and/or make it very difficult to read because, as per the name, they can jump from place to place. That said, there are times when a goto statement can reduce clutter as well as make code more efficient, but that is totally dependent upon how you use it and the context it is used in. Personally, I would recommend using function calls, as opposed to several goto statements; others may disagree.
Related
I have a function which does a task; let's call this function F(). Now, I need to do this task n times, where n is sufficiently small. I can think of doing two things:
//Code Here...
Code-for-function-F()
Code-for-function-F()
.
.
.
Code-for-function-F()
//following code
//Code Here
for(int i=0; i<n; ++i)
F()
//Following code
In the first case, I avoid function call overheads. But since the code is repeated n-times, the code can be rather large and would lead to worse cache locality/performance. For the second case, cache would be better utilized but results in overheads due to function calls. I was wondering if someone has done an analysis on which of the two is better approach.
PS: I understand that actual answer might depend on what code profiling tells me, is there a theoretically better approach between the two? I am using c++ on Linux.
There is no one-size-fits-all answer when the question is which code is faster. You have to measure it.
However, the optimizations you have in mind, loop-unrolling and function inlining, are techniques that the compiler is really good at. It is rare that applying them explicitly in your code helps the compiler to perform better optimizations. I would rather worry about preventing such compiler optimizations by writing unnecessarily clever code.
If you have a concrete example, I suggest you to take a look at godbolt. It is a nice tool that can help you to see the effect of variations on the code on the output of the compiler.
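For instance, a pair of functions like these (a hypothetical example of mine) can be pasted into godbolt to see that hand-unrolling typically produces the same assembly the compiler already generates for the plain loop at -O2:

```cpp
// Paste each version into godbolt.org and compare the output at -O2.
// sum_loop is typically unrolled (and often vectorized) automatically,
// so the hand-unrolled version usually buys nothing.
int sum_loop(const int* a) {
    int s = 0;
    for (int i = 0; i < 4; ++i)
        s += a[i];
    return s;
}

int sum_unrolled(const int* a) {
    return a[0] + a[1] + a[2] + a[3];
}
```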
Also, don't forget the famous quote from D. Knuth:
Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%.
It is often cited incompletely, although the last part is as important as the rest: "Yet we should not pass up our opportunities in that critical 3%." To know where those 3% are, you have to profile your code.
TL;DR: Don't optimize prematurely. Measure and profile first; only then will you know where it is worth improving and whether you can get an improvement at all.
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 8 years ago.
I'm wondering what I should be using instead of goto statements?
Should I be using nested if/while/do-while statements?
They say that using goto creates 'spaghetti code', but surely if someone is writing a large console application and they have if statement after if statement in an attempt to control the flow, that's going to make a mess?
I'm asking as many people have asked why the goto statement is bad, but not what to replace it with. I'm sure a lot of beginners could do with this.
This is for C++.
You are much better off using functions, loops, and conditional statements. Use break and continue as necessary.
I can nearly guarantee you any situation in which you are utilizing goto there is a better alternative. There is one notable exception to this: multi-level break.
while (1) {
    while (1) {
        goto breakOut;
    }
    // more code here?
}
breakOut:
In this one (relatively) rare situation, goto can be used in place of a typical "break" to make it clear we are actually getting out of a nested loop. The other way to approach this is with a "done" variable:
while (!done) {
    while (!done) {
        done = true;
        break;
    }
    if (done) { break; }
    // More code here? If so, the above line is important!
}
As you can see, the done variable is more verbose when you have additional processing in outer loops, so a goto is a cleaner way of breaking free!
However, in 99% of cases you really, really, don't want to start writing a bunch of goto statements. Really think through each one.
With functions the above could also be written like so:
bool innerLoop() {
    while (1) {
        return false;
    }
    return true;
}
...
while (innerLoop()) { // can only be written this way if the inner loop is the first thing that should run.
    // More code here?
}
...
Sometimes breaking an inner loop out in this way can be messy if there are a lot of dependencies on the outer one. But it remains a viable way of breaking out of code early with return statements instead of goto or break.
If you can write your logic using a clean, modern construct, then use it. Otherwise, if goto makes sense, use it.
In general, goto can make your code harder to read and follow the flow of the code. For that reason, newbies are told to avoid goto altogether. This encourages them to think in terms of other constructs.
But some people just get religious about it. Coding is not a religion. And if a goto makes sense, C++ has a perfectly valid goto statement and you should use it when it makes good sense.
It doesn't make sense to me to ask what you should use instead of goto. You can generally use some type of loop or other construct, depending on what you are doing. But where goto makes sense, use it.
Don’t listen to people who say "never use goto". You're quite right that there are cases where nesting scoped blocks will make a heck of a lot more mess than a goto. It has its place, just as a switch/case does. However, with functions you can often refactor away the entire argument and keep people happy.
Simply forget that the goto statement exists in C++, and there will be no question of how to replace it. :)
As for me, I have never seen code with a goto statement that I could call good code. Usually a goto statement is the source of some kind of errors and difficulties, and such code is especially difficult to modify. Moreover, one goto statement usually starts to produce other goto statements in the code. :)
The goto is bad because it allows you to jump from context to context. The context is a vector of all your variables (with their values) at a particular point of your program. A program execution graph shows you how your program jumps from context to context. It's in your best interests to keep the execution graph as simple as possible. The simplest one is a chain, the next step is an execution tree. A loop adds complexity, but it's a managed complexity. If you have a node in your execution graph, which is reachable via more than one execution path, and you need to understand a context at this node, then you'll need to follow more than one execution path back. These "merge" nodes add a lot to the complexity of your program. Each labeled statement (the goto target) is a potential merge node.
So, try not to use the goto operator at all; the language itself will force you to find a manageable solution using loops, boolean variables, classes, functions, etc., and your program will be clearer and more understandable. C++ exceptions are another manageable way to jump between contexts.
There are computing geniuses, who can keep in their minds and process very complex execution graphs, so they don't care much about complexity of their programs or a next programmer, who will be assigned to support or take over their code in the future. I guess, we have some of them here :-)
Note: For this question I will mainly be referring to C++, however this may apply to other languages.
Note: Please assume there is no recursion.
People often say (if you have an exceptionally large function) to "break up" a function into several smaller functions, but is this logical? What if I know for a fact that I will never reuse one of those smaller functions? That is just a waste of memory and performance, and you may have to jump around the code more when reading it.
Also, what if you are only going to use a (hypothetically large) function once? Should you just insert the function body into the place where it would be called (for the same reasons as before, i.e. memory, performance, and having to jump around the code more when reading it)? So... to make a function or not to make a function, that is the question.
TO ALL
*EDIT*
I am still going through all the answers, however from what I have read so far I have formed a hypothesis.
Would it be correct to say: split things up into functions during development, but do what I suggest in the question before deployment? That is, write single-use functions during development, but insert their bodies in place before deployment?
This really depends on the context.
When we say the size of a function, we actually mean the semantic distance between the lines inside the function. We prefer that one function do only one thing. If your function does only one thing and the semantic distance inside it is small, then it is OK to have a large function.
However, it is not good practice to make one function do a lot of things; it is better to refactor such functions into a few smaller ones with good names and good placement of code, so that the reader does not need to jump around.
Don't worry too much about performance and memory. Your compiler should take care of the bulk of that for you, especially for very thin functions.
My goal is typically to ensure that the given function call can be replaced entirely in the reader's memory--the developer can treat the abstraction purely. Take this:
// Imagine here that these are real variable/function names as written by a
// lazy coder. I have seen code like this in the wild.
void someFunc(int arg1, int arg2) {
    int val3 = doFirstPart(arg1, field1);
    int val4 = doSecondPart(arg2, val3);
    queue.push(val4);
}
The refactoring of doFirstPart and doSecondPart buys you very little, and likely makes things harder to understand. The problem here isn't method extraction, though: The problem is poor naming and abstraction! You will have to read doFirstPart and doSecondPart or the point of the whole function is lost.
Consider this, instead:
void pushLatestRateAndValue(int rate, int value) {
    int rateIndex = calculateRateIndex(rate, latestRateTable);
    int valueIndex = calculateValueIndex(rateIndex, value);
    queue.push(valueIndex);
}
In this contrived example, you don't have to read calculateRateIndex or calculateValueIndex unless you really want to dig deep--you know exactly what it does just by reading it.
Aside from that, it may be a matter of personal style. I know that some coders prefer to extract every business "statement" into a different function, but I find that a little hard to read. My personal preference is to look for an opportunity to extract a function from any function longer than one "screenful" (~25 lines), which has the advantage of keeping the entire function visible at once, and also because 25 lines happens to be my personal limit of short-term memory and temporary understanding.
There are many good arguments for not making a routine longer than roughly what will fit on one page. One that most people don't always think about is that, unless you deploy debug symbols (which most people don't do), a stack trace coming in from the field is a lot easier to analyze and turn into a hypothesis about a cause when the routines it refers to are small than when the error turns out to be occurring somewhere in that 2,000-line whale of a method that you never got around to splitting up.
I read recently, in an article on game programming written in 1996, that using global variables is faster than passing parameters.
Was this ever true, and if so, is this still true today?
Short answer - No, good programmers make code go faster by knowing and using the appropriate tools for the job, and then optimizing in a methodical way where their code does not meet their requirements.
Longer answer - This article, which in my opinion is not especially well-written, is not in any case general advice on program speedup but '15 ways to do faster blits'. Extrapolating this to the general case is missing the writer's point, whatever you think of the merits of the article.
If I was looking for performance advice, I would place zero credence in an article that does not identify or show a single concrete code change to support the assertions in the sample code, and without suggesting that measuring the code might be a good idea. If you are not going to show how to make the code better, why include it?
Some of the advice is years out of date - FAR pointers stopped being an issue on the PC a long time ago.
A serious game developer (or any other professional programmer, for that matter) would have a good laugh about advice like this:
You can either take out the assert's completely, or you can just add a #define NDEBUG when you compile the final version.
My advice to you, if you really wish to evaluate the merit of any of these 15 tips, and since the article is 14 years old, would be to compile the code in a modern compiler (Visual C++ 10 say) and try to identify any area where using a global variable (or any of the other tips) would make it faster.
[Just joking - my real advice would be to ignore this article completely and ask specific performance questions on Stack Overflow as you hit issues in your work that you cannot resolve. That way the answers you get will be peer reviewed, supported by example code or good external evidence, and current.]
When you switch from parameters to global variables, one of three things can happen:
it runs faster
it runs the same
it runs slower
You will have to measure performance to see what's faster in a non-trivial concrete case. This was true in 1996, is true today and is true tomorrow.
Leaving the performance aside for a moment, global variables in a large project introduce dependencies which almost always make maintenance and testing much harder.
When trying to find legitimate uses of globals variables for performance reasons today I very much agree with the examples in Preet's answer: very often needed variables in microcontroller programs or device drivers. The extreme case is a processor register which is exclusively dedicated to the global variable.
When reasoning about the performance of global variables versus parameter passing, the way the compiler implements them is relevant. Global variables typically are stored at fixed locations. Sometimes the compiler generates direct addressing to access the globals. Sometimes however, the compiler uses one more indirection and uses a kind of symbol table for globals. IIRC gcc for AIX did this 15 years ago. In this environment, globals of small types were always slower than locals and parameter passing.
On the other hand, a compiler can pass parameters by pushing them on the stack, by passing them in registers or a mixture of both.
Everyone has already given the appropriate caveat answers about this being platform and program specific, needing to actually measure timings, etc. So, with that all said already, let me answer your question directly for the specific case of game programming on x86 and PowerPC.
In 1996, there were certain cases where pushing parameters onto the stack took extra instructions and could cause a brief stall inside the Intel CPU pipeline. In those cases there could be a very small speedup from avoiding parameter passing altogether and reading data from literal addresses.
This isn't true any more on the x86 or on the PowerPC used in most game consoles. Using globals is usually slower than passing parameters for two reasons:
Parameter passing is implemented better now. Modern CPUs pass their parameters in registers, so reading a value from a function's parameter list is faster than a memory load operation. The x86 uses register shadowing and store forwarding, so what looks like shuffling data onto the stack and back can actually be a simple register move.
Data cache latency far outweighs CPU clock speed in most performance considerations. The stack, being heavily used, is almost always in cache. Loading from an arbitrary global address can cause a cache miss, which is a huge penalty as the memory controller has to go and fetch the data from main RAM. ("Huge" here is 600 cycles or more.)
What do you mean, "faster"?
I know for a fact, that understanding a program with global variables takes me a whole lot more time than one without.
If the extra time it takes the programmer(s) is less than the time gained by the users when they run the program with globals, then I'd say using global is faster.
But consider that the program is going to be run by 10 people once a day for 2 years. And that it takes 2.84632 secs without globals and 2.84217 secs with globals (a 0.00415 sec saving). Over those roughly 7,300 runs, that's about 30 seconds less TOTAL runtime. Gaining half a minute of run time is not worth the introduction of a global as regards programmer time.
To a degree, any code that avoids processor instructions (i.e. shorter code) will be faster. However, how much faster? Not very! Also note that compiler optimisation strategies may produce the smaller code anyway.
These days this is only an optimisation on very specific applications usually in ultra time critical drivers or micro-control code.
Putting aside the issues of maintainability and correctness, there are basically two factors that will govern performance with regard to globals vs. parameters.
When you make a global you avoid a copy. That's slightly faster. When you pass a parameter by value, it has to be copied so that a function can work on a local copy of it and not damage the caller's copy of the data. At least in theory. Some modern optimizers do pretty tricky things if they identify that your code is well behaved. A function may get automatically inlined, and the compiler may notice that the function doesn't do anything to the parameters, and just optimise away any copying.
When you make a global, you are lying to the cache. When you have all of your variables neatly contained in your function, and a few parameters, the data will tend to all be in one place. Some of the variables will be in registers, and some will probably be in cache right away because they are right 'next to' each other. Using a lot of global variables is basically pathological behavior for the cache. There is no guarantee that various globals will be used by the same functions. Location has no obvious correlation with usage. Perhaps you have a small enough working set that it makes no difference where anything is, and it all winds up in cache.
All of this just adds up to the point made by a poster above me:
When you switch from parameters to global variables, one of three things can happen:
* it runs faster
* it runs the same
* it runs slower
You will have to measure performance to see what's faster in a non-trivial concrete case. This was true in 1996, is true today and is true tomorrow.
Depending on the specific behavior of your exact compiler, and precise details of the hardware that you use to run your code, it's possible that global variables could be a very slight performance win in some cases. That possibility may be worth trying it on some code that runs too slow as an experiment. It's probably not worth dedicating yourself to, as the answer of your experiment could change tomorrow. So, the right answer is almost always to go with "correct" design patterns and avoid the uglier design. Look for better algorithms, more efficient data structures, etc., before intentionally trying to spaghettify your project. Much better payoff in the long run.
And, aside from the dev time vs. user time argument, I'll add the dev time vs. Moore's time argument. If you assume Moore's law will make computers something like half again as fast every year, then for the sake of a simple round number we can assume that progress happens at a steady 1% per week. If you are looking at a micro-optimisation that may improve things by about 1%, and it will add a week to the project by complicating things, then just taking the week off will have the same effect on average run times for your users.
Perhaps a micro optimisation, and would probably be wiped out by optimisations your compiler could generate without resort to such practices. In fact the use of globals may even inhibit some compiler optimisations. Reliable and maintainable code would generally be of greater value, and globals are not conducive to that.
Using globals to replace function parameters renders all such functions non-reentrant, which may be a problem if multi-threading is used - not a common practice in game development in 1996, but more common with the advent of multi-core processors. It also precludes recursion, although that is probably less of an issue since recursion has its own issues.
In any significant body of code, there is likely to be more mileage in higher-level optimisation of algorithms and data structures. Moreover there are options open to you other than global variables that avoid parameter passing, most especially C++ class-member variables.
If the habitual use of global variables in your code makes a measurable or useful difference to its performance, I would question the design first.
For a discussion of the problems inherent in global variables and some ways to avoid them, see A Pox on Globals by Jack Ganssle. The article relates to embedded systems development but is generally applicable; it's just that some embedded systems developers think they have good reason to use globals, probably for all the same misguided reasons used to justify it in game development.
Well, if you are considering using global variables instead of parameter passing, that could mean that you have a long chain of methods/functions through which you have to pass that parameter down. If that is the case, you really WILL save CPU cycles by switching from a parameter to a global variable.
So the guys who say "it depends" are, I guess, plain wrong. Even with register-based parameter passing, there will still be MORE CPU cycles and MORE overhead for pushing the parameters down to the callee.
HOWEVER - I never do that. CPUs are superior now; back in the days of 12 MHz 8086s that could be an issue. Nowadays, unless you write embedded or super-turbo-charged performance code, stick to what looks good in code, doesn't break the code's logic, and strives to be modular.
And lastly, leave machine code generation to the compiler; the guys who designed it know best how their baby performs and will make your code run at its best.
In general (though it may depend greatly on the compiler and platform implementation), passing parameters means writing them onto the stack, which you would not need to do with a global variable.
That said, a global variable may mean a page refresh in the MMU or memory controller, whereas the stack may be located in much faster memory available to the processor...
Sorry, there is no good answer for a general question like this; just measure it (and try different scenarios too).
It was faster when we had sub-100 MHz processors. Now that processors are 100x faster, this 'problem' is 100x less significant. It wasn't a big deal then; it was a big deal when you did it in assembly and had no (good) optimizer.
Says the guy who programmed on a 3 MHz processor. Yes, you read that right, and 64k was NOT enough.
I see a lot of theoretical answers, but no practical advice for your scenario. What I'm guessing is that you have a large number of parameters to pass down through a number of function calls, and you're worried about accumulated overhead from many levels of call frames and many parameters at each level. Otherwise your concern is completely unfounded.
If this is your scenario, you should probably put all of the parameters in a "context" structure and pass a pointer to that structure. This will ensure data locality, and makes it so you don't have to pass more than one argument (the pointer) at each function call.
Parameters accessed this way are slightly more expensive to access than true function arguments (you need an extra register to hold the pointer to the base of the structure, as opposed to the frame pointer which would serve this purpose with function arguments), and individually (but probably not with cache effects factored in) more expensive to access than global variables in normal, non-PIC code. However, if your code is in a shared library/DLL using position independent code, the cost of accessing parameters passed by pointer to struct is cheaper than accessing a global variable and identical to accessing static variables, due to GOT and GOT-relative addressing. This is another reason never to use global variables for parameter passing: if you may eventually put your code in a shared library/DLL, any possible performance benefits will suddenly backfire!
Like everything else: yes and no. There is no one answer because it depends on context.
Counterpoints:
Imagine programming on Itanium, where you have hundreds of registers. You could keep quite a few globals in those, which would be faster than the typical way globals are implemented in C: a load from some static address (although the compiler might encode a global directly into an instruction as an immediate if it fits in a word). Even if the globals stay in cache the whole time, registers may still be faster.
In Java, overuse of globals (statics) can decrease performance because of the initialization locks that have to be taken. If 10 classes want to access some static class, they all have to wait for that class to finish initializing its static fields, which can take anywhere from no time up to forever.
In any case, global state is just bad practice; it raises code complexity. Well-designed code is naturally fast enough 99.9% of the time. It seems like newer languages are removing global state altogether. E removes global state because it violates its security model. Haskell removes mutable state altogether. The fact that Haskell exists and has implementations that outperform most other languages is proof enough for me that I will never use globals again.
Also, in the near future, when we all have hundreds of cores, global state isn't really going to help much.
It might still be true, under some circumstances.
A global variable might be as fast as access through a pointer that is kept and passed in registers only. So it comes down to how many registers you have available.
To speed up function calls, you could do several other things that might perform better than global-variable hacks:
Minimize the number of local variables in the function to a few (explicit) register variables.
Minimize the number of parameters, e.g. by passing a pointer to a structure instead of repeating the same parameter lists through functions that call each other.
Make the function "naked", meaning it does not use the stack at all.
Use "proper tail calls" (this works with neither Java bytecode nor JavaScript/ECMAScript).
If there is no better way, hack yourself something like TABLES_NEXT_TO_CODE, which locates your global variables next to the function code. In functional languages this is a backend optimization that uses the function pointer as a data pointer too; but as long as you are not programming in a functional language, you only need to locate those variables beside the ones the function uses. Then again, you only want this to remove the stack handling from your function. If your compiler generates assembler code that handles the stack anyway, there is no point in doing this; you could use pointers instead.
I've found this "gcc attribute overview":
http://www.ohse.de/uwe/articles/gcc-attributes.html
and I can give you these tags for googling:
- Proper Tail Call (it is mostly relevant to imperative backends of functional languages)
- TABLES_NEXT_TO_CODE (it is mostly relevant to Haskell and LLVM)
But you end up with 'spaghetti code' when you use global variables a lot.
Sometimes, mainly for optimization purposes, very simple operations are implemented as complicated and clumsy code.
One example is this integer initialization function:
void assign( int* arg )
{
    __asm__ __volatile__ ( "mov %%eax, %0" : "=m" (*arg));
}
Then:
int a;
assign ( &a );
But actually I don't understand why it is written this way...
Have you seen any example with real reasons to do so?
In the case of your example, I think it is a result of the fallacious assumption that writing code in assembly is automatically faster.
The problem is that the person who wrote this didn't understand WHY assembly can sometimes run faster. That is, you sometimes know more about what you are trying to do than the compiler does, and can use that knowledge to write lower-level code that performs better because it doesn't have to make the conservative assumptions the compiler must.
In the case of a simple variable assignment, I seriously doubt that holds true, and the code is likely to perform slower because it has the additional overhead of managing the assign function call on the stack. Mind you, it won't be noticeably slower; the main cost here is code that is less readable and maintainable.
This is a textbook example of why you shouldn't implement optimizations without understanding WHY it is an optimization.
It seems the intent of the assembly code was to ensure that the assignment to the int pointed to by arg is performed every time, deliberately preventing the compiler from applying any optimization in this regard.
Usually the volatile keyword is used in C++ (and C...) to tell the compiler that a value should not be kept in a register (for instance) and reused from that register (an optimization to fetch the value faster), because it can be changed asynchronously (by an external module, an assembly routine, an interrupt, etc.).
For instance, in a function
int a = 36;
g(a);
a = 21;
f(a);
in this case the compiler knows that the variable a is local to the function and is not modified outside it (no pointer to a is passed to any call, for instance). It may use a processor register to store and work with the a variable.
In conclusion, that ASM instruction seems to be injected to the C++ code in order not to perform some optimizations on that variable.
While there are several reasonable justifications for writing something in assembly, in my experience those are uncommonly the actual reason. Where I've been able to study the rationale, they boil down to:
Age: The code was written so long ago that it was the most reasonable option for dealing with compilers of the era. Typically, before about 1990 can be justified, IMHO.
Control freak: Some programmers have trust issues with the compiler, but aren't inclined to investigate its actual behavior.
Misunderstanding: A surprisingly widespread and persistent myth is that anything written in assembly language inherently results in more efficient code than what a "clumsy" compiler produces, what with all its mysterious function entry/exit code, etc. Certainly a few compilers deserved this reputation.
To be "cool": When time and money are not factors, what better way to strut a programmer's significantly elevated hormone levels than some macho, preferably inscrutable, assembly language?
The example you give seems flawed, in that the assign() function is liable to be slower than directly assigning the variable, reason being that calling a function with arguments involves stack usage, whereas just saying int a = x is liable to compile to efficient code without needing the stack.
The only times I have benefited from using assembler is by hand optimising the assembler output produced by the compiler, and that was in the days where processor speeds were often in the single megahertz range. Algorithmic optimisation tends to give a better return on investment as you can gain orders of magnitudes in improvement rather than small multiples. As others have already said, the only other times you go to assembler is if the compiler or language doesn't do something you need to do. With C and C++ this is very rarely the case any more.
It could well be someone showing off that they know how to write some trivial assembler code, making the next programmer's job more difficult, and possibly as a half-assed measure to protect their own job. For the example given, the code is confusing, possibly slower than native C, less portable, and should probably be removed. Certainly, if I see any inline assembler in any modern C code, I'd expect copious comments explaining why it is absolutely necessary.
Let compilers optimize for you. There's no possible way this kind of "optimization" will ever help anything... ever!