How to (computed) goto and longjmp in C++? - c++

I don't usually code C++, but a strange comp sci friend of mine got sick of looking at my wonderful FORTRAN programs and challenged me to rewrite one of them in C++, since he likes my C++ codes better. (We're betting money here.) Exact terms being that it needs to be compilable in a modern C++ compiler. Maybe he hates a good conio.h - I don't know.
Now I realize there are perfectly good ways of writing in C++, but I'm going for a personal win here by trying to make my C++ version as FORTRAN-esque as possible. For bonus points, this might save me some time and effort when I'm converting code.
SO! This brings me to the following related queries:
On gotos:
How do you work a goto?
What are the constraints on gotos in C++?
Any concerns about scope? (I'm going to try to globally scope as much as possible, but you never know.)
If I use the GCC extension to goto to a void pointer array, are there any new concerns about undefined behavior, etc?
On longjmp:
How would you safely use a longjmp?
What are the constraints on longjmps in C++?
What does it do to scope?
Are there any specific moments when it looks like a longjmp should be safe but in fact it isn't that I should watch out for?
How would I simulate a computed goto with longjmp?
Is there any tangible benefit to using longjmp over goto if I only have one function in my program?
Right now my main concern is making a computed goto work for this. It looks like I'll probably use the longjmp to make this work because a void pointer array isn't a part of the C++ standard but a GCC specific extension.

I'll bite and take the downvote.
I seriously doubt that your friend will find Fortran written in C++ any easier (which is effectively what you'll get if you use goto and longjmp significantly) to read and he might even find it harder to follow. The C++ language is rather different from Fortran and I really don't think you should attempt a straight conversion from Fortran to C++. It will just make the C++ harder to maintain and you might as well stay with your existing codebase.
goto: You set up a label (my_label:) and then use the goto command goto my_label; which will cause your program flow to execute at the statement following the goto. You can't jump past the initialization of a variable or between functions. You can't create an array of goto targets but you can create an array of object or function pointers to jump to.
longjmp: There is no reason to prefer longjmp over goto if you have only one function. But if you have only one function, again, you really aren't writing C++ and you'll be better off in the long run just maintaining your Fortran.

You'll get plenty of haterade about using goto at all. Normally I'd jump right on the bandwagon, but in this particular case it sounds more like code golf to me. So here you go.
Use goto to move the instruction pointer to a "label" in your code, which is a C++ identifier followed by a colon. Here's a simple example of a working program:
#include <iostream>
#include <iostream>
#include <iomanip>
using namespace std;
int main()
{
int i = 0;
step:
cout << "i = " << i;
++i;
if( i < 10 )
goto step;
}
In this case, step: is the label.
There are concerns about context.
You can only goto to a label within the current function.
If your goto skips the initialization of a variable, you may evoke Undefined Behavior (Code which will compile, but you can't say for sure what it will actually do.).
You cannot goto in to a try block or catch handler. However, you can goto out of a try block.
You "can goto" with pointers etc provided the other concerns are met. If the pointer in question is in-scope at the call site and in-scope at the branch site, no problem.

I think this reference has most of the information you are looking for.
goto
longjmp

computed goto --> switch
Really, they share a (common, but not universal) underling implementation as a jump table.

If I understand the original question, the question is actually an interesting one. Rewording the question (to what I think is an equivalent question): "How do you do a FORTRAN computed goto in C?"
First we need to know what a computed goto is: Here is a link to one explanation: http://h21007.www2.hp.com/portal/download/files/unprot/fortran/docs/lrm/lrm0124.htm.
An example of a computed GOTO is:
GO TO (12,24,36), INDEX
Where 12, 24, and 36 are statement numbers. (C language labels could serve as an equivalent, but is not the only thing that could be an equivalent.)
And where INDEX is a variable, but could be the result of a formula.
Here is one way (but not the only way) to do the same thing in C:
int SITU(int J, int K)
{
int raw_value = (J * 5) + K;
int index = (raw_value % 5) - 1;
return index;
}
int main(void)
{
int J = 5, K= 2;
// fortran computed goto statement: GO TO (320,330,340,350,360), SITU(J,K) + 1
switch (SITU(J,K) + 1)
{
case 0: // 320
// code statement 320 goes here
printf("Statement 320");
break;
case 1: // 330
// code statement 330 goes here
printf("Statement 330");
break;
case 2: // 340
// code statement 340 goes here
printf("Statement 340");
break;
case 3: // 350
// code statement 350 goes here
printf("Statement 350");
break;
case 4: // 360
// code statement 360 goes here
printf("Statement 360");
break;
}
printf("\nPress Enter\n");
getchar();
return 0;
}
In this particular example, we see that you do not need C gotos to implement a FORTRAN computed goto!

Longjmp can get you out of a signal handler which can be nice - and it'll add some confusion as it will not reset automatic (stack-based) variables in the function it long jumps to defined prior to the setjmp line. :)

There is a GCC extension called Labels as Values that will help you code golf, essentially giving you computed goto. You can generate the labels automatically, of course. You will probably need to do that since you can't know how many bytes of machine code each line will generate.

Related

2 while loops vs if else statement in 1 while loop

First a little introduction:
I'm a novice C++ programmer (I'm new to programming) writing a little multiplication tables practising program. The project started as a small program to teach myself the basics of programming and I keep adding new features as I learn more and more about programming. At first it just had basics like ask for input, loops and if-else statements. But now it uses vectors, read and writes to files, creates a directory etc.
You can see the code here: Project on Bitbucket
My program now is going to have 2 modes: practise a single multiplication table that the user can choose himself or practise all multiplication tables mixed. Now both modes work quite different internally. And I developed the mixed mode as a separate program, as would ease the development, I could just focus on writing the code itself instead of also bothering how I will integrate it in the existing code.
Below the code of the currently separate mixed mode program:
#include <iostream>
#include <sstream>
#include <string>
#include <vector>
#include <algorithm>
#include <time.h>
using namespace std;
using std::string;
int newquestion(vector<int> remaining_multiplication_tables, vector<int> multiplication_tables, int table_selecter){
cout << remaining_multiplication_tables[table_selecter] << " * " << multiplication_tables[remaining_multiplication_tables[table_selecter]-1]<< " =" << "\n";
return remaining_multiplication_tables[table_selecter] * multiplication_tables[remaining_multiplication_tables[table_selecter]-1];
}
int main(){
int usersanswer_int;
int cpu_answer;
int table_selecter;
string usersanswer;
vector<int> remaining_multiplication_tables = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
vector<int> multiplication_tables(10, 1);//fill vector with 10 elements that contain the value '1'. This vector will store the "progress" of each multiplication_table.
srand(time(0));
table_selecter = rand() % remaining_multiplication_tables.size();
cpu_answer = newquestion(remaining_multiplication_tables, multiplication_tables, table_selecter);
while(remaining_multiplication_tables.size() != 0){
getline(cin, usersanswer);
stringstream usersanswer_stream(usersanswer);
usersanswer_stream >> usersanswer_int;
if(usersanswer_int == cpu_answer){
cout << "Your answer is correct! :)" << "\n";
if(multiplication_tables[remaining_multiplication_tables[table_selecter]-1] == 10){
remaining_multiplication_tables.erase(remaining_multiplication_tables.begin() + table_selecter);
}
else{
multiplication_tables[remaining_multiplication_tables[table_selecter]-1] +=1;
}
if (remaining_multiplication_tables.size() != 0){
table_selecter = rand() % remaining_multiplication_tables.size();
cpu_answer = newquestion(remaining_multiplication_tables, multiplication_tables, table_selecter);
}
}
else{
cout << "Unfortunately your answer isn't correct! :(" << "\n";
}
}
return 0;
}
As you can see the newquestion function for the mixed mode is quite different. Also the while loop includes other mixed mode specific code.
Now if I want to integrate the mixed multiplication tables mode into the existing main program I have 2 choices:
-I can clutter up the while loop with if-else statements to check each time the loop runs whether mode == 10 (single multiplication table mode) or mode == 100 (mixed multiplication tables mode). And also place a if-else statement in the newquestion() function to check if mode == 10 or mode == 100
-I can let the program check on startup whether the user chose single multiplication table or mixed multiplication tables mode and create 2 while loops and 2 newquestion() functions. That would look like this:
int newquestion_mixed(){
//newquestion function for mixed mode
}
int newquestion_single(){
//newquestion function for single mode
}
//initialization
if mode == 10
//create necessary variables for single mode
while(){
//single mode loop
}
else{
//create necessary variables for mixed mode
while(){
//mixed mode loop
}
}
Now why would I bother creating 2 separate loops and functions? Well isn't it inefficient if the program checks each time the loop runs (each time the user is asked a new question, for example: '5 * 3 =') which mode the user chose? I'm worried about the performance with this option. Now I hear you think: but why would you bother about performance for such a simple, little non-performance critical application with the extremely powerful processors today and the huge amounts of RAM? Well, as I said earlier this program is mainly about teaching myself a good coding style and learning how to program etc. So I want to teach myself the good habits from the beginning.
The 2 while loops and functions option is much more efficient will use less CPU, but more space and includes duplicating code. I don't know if this is a good style either.
So basically I'm asking the experts what's the best style/way to handle this kind of things. Also if you spot something bad in my code/bad style please tell me, I'm very open to feedback because I'm still a novice. ;)
First, a fundamental rule of programming is that of "don't prematurely optimize the code" - that is, don't fiddle around with little details, before you have the code working correctly, and write code that expresses what you want done as clearly as possible. This is good coding style. To obsess over the details of "which is faster" (in a loop that spends most of it's time waiting for the user to input some number) is not good coding style.
Once it's working correcetly, analyse (using for example a profiler tool) where the code is spending it's time (assuming performance is a major factor in the first place). Once you have located the major "hotspot", then try to make that better in some way - how you go about that depends very much on what that particular hot-spot code does.
As to which performs best will highly depend on details the code and the compiler (and which compiler optimizations are chosen). It is quite likely that having an if inside a while-loop will run slower, but modern compilers are quite clever, and I have certainly seen cases where the compiler hoists such a choice out of the loop, in cases where the conditions don't change. Having two while-loops is harder for the compiler to "make better", because it most likely won't see that you are doing the same thing in both loops [because the compiler works from the bottom of the parse-tree up, and it will optimize the inside of the while-loop first, then go out to the if-else side, and at that point it's "lost track" of what's going on inside each loop].
Which is clearer, to have one while loop with an if inside, or an if with two while-loops, that's another good question.
Of course, the object oriented solution is to have two classes - one for mixed, another for single - and just run one loop, that calls the relevant (virtual) member function of the object created based on an if-else statement before the loop.
Modern CPU branch predictors are so good that if during the loop the condition never changes, it will probably be as fast as having two while loops in each branch.

Switch statement with huge number of cases

What happens if the switch has more than 5000 case. What are the drawbacks and how we can replace it with something faster?
Note: I am not expecting to use array to store cases as it's the same.
There's no specific reason to think you'd want anything other than a switch/case statement (and indeed I'd actively expect it to be unhelpful). The compiler should create efficient dispatching code, which might involve some combination of static [sparse] table(s) and direct indexing, binary branching etc.; it's got insights into the static values of the cases and should do an excellent job (retuning it on the fly each time you change the cases, whereas new values that don't fit well with a hand-crafted approach - such as wildly differing values when you'd had a pretty packed array lookup - could require reworking of code or silently cause memory bloat or a performance drop).
People really cared about this kind of thing back when C was trying to win over hard-core assembly programmers... the compilers were held accountable for generating good code. Put another way - if it's not (measurably) broken, don't fix it.
More generally, it's great to be curious about this kind of thing and get people's ideas on alternatives and their performance implications, but if you really care and the performance difference could make a useful difference to your program (especially if profiling suggests it) then always benchmark with your program doing real work.
As food for thought... in case one might be stuck with an old/buggy/inefficient compiler or just love hacking.
Inner work of switch statement consist of two parts. Finding address to jump, and well jumping there. For the first part you need to use a table to find the address. If the number of cases increases, table gets bigger - searching address to jump takes time. This is the point compilers tries to optimize, combining several techniques but one easy approach is to use table directly which depends on case value space.
In a back of the napkin example;
switch (n) {
case 1: foo(); break;
case 2: bar(); break;
case 3: baz(); break;
}
with such piece of code compiler can create an array of jump_addresses and directly get the address by array[n]. Now search just took O(1). But if you had a switch like below:
switch (n) {
case 10: foo(); break;
case 17: bar(); break;
case 23: baz(); break;
// and a lot other
}
compiler needs to generate a table containing case_id, jump_address pairs and code to search through that structure which with worst implementation can take O(n). (Decent compilers optimize the hell out of such scenario when they are fully unleashed by enabling their optimization flags to a degree that when you need to debug such optimized code your brain starts to fry.)
Then question is can you do this all yourself at C level to beat the compiler? and funny thing is while creating tables and searching through them seems easy, jumping to a variable point using goto is not possible in standard C. So there is a chance that if you are not going to use function pointers due to overhead or code structure, you are stuck... well if you are not using GCC. GCC has a non-standard feature called Labels as Values which helps you to get pointers to labels.
To complete the example you can write the second switch statement with "labels as values" feature like this:
const void *cases[] = {&&case_foo, &&case_bar, &&case_baz, ....};
goto *labels[n];
case_foo:
foo();
goto switch_end;
case_bar:
bar();
goto switch_end;
case_baz:
baz();
goto switch_end;
// and a lot other
switch_end:
Of course if you are talking about 5000 cases, it is much better if you write a piece of code to create this code for you - and it is probably only way to maintain such software.
As closing notes; will this improve your daily work? No. Will this improve your skills? Yes and talking from experience, I once found myself improved a security algorithm in a smart card just by optimizing case values. It is a strange world.
Try to use Dictionary class with Delegate values. At least it makes code a little bit more readable.
Big switch statement, generally auto-generated one, may take long time to compile. But I like the idea that compiler optimizes the switch statement.
One way to break apart the switch statement is to use bucketing,
int getIt(int input)
{
int bucket = input%16;
switch(bucket)
{
case 1:
return getItBucket1(input);
case 2:
return getItBucket2(input);
...
...
}
return -1;
}
So in the code above, we broke apart our switch statement into 16 parts. It is easy to change the number of buckets in auto-generated code.
This code has added run-time cost of one layer of indirection or function-call. . But considering the buckets defined in different files, it is faster to compile them in parallel.

Is the GOTO statement bad in C? How so? [duplicate]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 7 years ago.
Improve this question
Everyone is aware of Dijkstra's Letters to the editor: go to statement considered harmful (also here .html transcript and here .pdf) and there has been a formidable push since that time to eschew the goto statement whenever possible. While it's possible to use goto to produce unmaintainable, sprawling code, it nevertheless remains in modern programming languages. Even the advanced continuation control structure in Scheme can be described as a sophisticated goto.
What circumstances warrant the use of goto? When is it best to avoid?
As a follow-up question: C provides a pair of functions, setjmp() and longjmp(), that provide the ability to goto not just within the current stack frame but within any of the calling frames. Should these be considered as dangerous as goto? More dangerous?
Dijkstra himself regretted that title, for which he was not responsible. At the end of EWD1308 (also here .pdf) he wrote:
Finally a short story for the record.
In 1968, the Communications of the ACM
published a text of mine under the
title "The goto statement considered
harmful", which in later years would
be most frequently referenced,
regrettably, however, often by authors
who had seen no more of it than its
title, which became a cornerstone of
my fame by becoming a template: we
would see all sorts of articles under
the title "X considered harmful" for
almost any X, including one titled
"Dijkstra considered harmful". But
what had happened? I had submitted a
paper under the title "A case against
the goto statement", which, in order
to speed up its publication, the
editor had changed into a "letter to
the Editor", and in the process he had
given it a new title of his own
invention! The editor was Niklaus
Wirth.
A well thought out classic paper about this topic, to be matched to that of Dijkstra, is Structured Programming with go to Statements, by Donald E. Knuth. Reading both helps to reestablish context and a non-dogmatic understanding of the subject. In this paper, Dijkstra's opinion on this case is reported and is even more strong:
Donald E. Knuth: I believe that by presenting such a
view I am not in fact disagreeing
sharply with Dijkstra's ideas, since
he recently wrote the following:
"Please don't fall into the trap of
believing that I am terribly
dogmatical about [the go to
statement]. I have the uncomfortable
feeling that others are making a
religion out of it, as if the
conceptual problems of programming
could be solved by a single trick, by
a simple form of coding discipline!"
A coworker of mine said the only reason to use a GOTO is if you programmed yourself so far into a corner that it is the only way out. In other words, proper design ahead of time and you won't need to use a GOTO later.
I thought this comic illustrates that beautifully "I could restructure the program's flow, or use one little 'GOTO' instead." A GOTO is a weak way out when you have weak design. Velociraptors prey on the weak.
The following statements are generalizations; while it is always possible to plead exception, it usually (in my experience and humble opinion) isn't worth the risks.
Unconstrained use of memory addresses (either GOTO or raw pointers) provides too many opportunities to make easily avoidable mistakes.
The more ways there are to arrive at a particular "location" in the code, the less confident one can be about what the state of the system is at that point. (See below.)
Structured programming IMHO is less about "avoiding GOTOs" and more about making the structure of the code match the structure of the data. For example, a repeating data structure (e.g. array, sequential file, etc.) is naturally processed by a repeated unit of code. Having built-in structures (e.g. while, for, until, for-each, etc.) allows the programmer to avoid the tedium of repeating the same cliched code patterns.
Even if GOTO is low-level implementation detail (not always the case!) it's below the level that the programmer should be thinking. How many programmers balance their personal checkbooks in raw binary? How many programmers worry about which sector on the disk contains a particular record, instead of just providing a key to a database engine (and how many ways could things go wrong if we really wrote all of our programs in terms of physical disk sectors)?
Footnotes to the above:
Regarding point 2, consider the following code:
a = b + 1
/* do something with a */
At the "do something" point in the code, we can state with high confidence that a is greater than b. (Yes, I'm ignoring the possibility of untrapped integer overflow. Let's not bog down a simple example.)
On the other hand, if the code had read this way:
...
goto 10
...
a = b + 1
10: /* do something with a */
...
goto 10
...
The multiplicity of ways to get to label 10 means that we have to work much harder to be confident about the relationships between a and b at that point. (In fact, in the general case it's undecideable!)
Regarding point 4, the whole notion of "going someplace" in the code is just a metaphor. Nothing is really "going" anywhere inside the CPU except electrons and photons (for the waste heat). Sometimes we give up a metaphor for another, more useful, one. I recall encountering (a few decades ago!) a language where
if (some condition) {
action-1
} else {
action-2
}
was implemented on a virtual machine by compiling action-1 and action-2 as out-of-line parameterless routines, then using a single two-argument VM opcode which used the boolean value of the condition to invoke one or the other. The concept was simply "choose what to invoke now" rather than "go here or go there". Again, just a change of metaphor.
Sometimes it is valid to use GOTO as an alternative to exception handling within a single function:
if (f() == false) goto err_cleanup;
if (g() == false) goto err_cleanup;
if (h() == false) goto err_cleanup;
return;
err_cleanup:
...
COM code seems to fall into this pattern fairly often.
I can only recall using a goto once. I had a series of five nested counted loops and I needed to be able to break out of the entire structure from the inside early based on certain conditions:
for{
for{
for{
for{
for{
if(stuff){
GOTO ENDOFLOOPS;
}
}
}
}
}
}
ENDOFLOOPS:
I could just have easily declared a boolean break variable and used it as part of the conditional for each loop, but in this instance I decided a GOTO was just as practical and just as readable.
No velociraptors attacked me.
Goto is extremely low on my list of things to include in a program just for the sake of it. That doesn't mean it's unacceptable.
Goto can be nice for state machines. A switch statement in a loop is (in order of typical importance): (a) not actually representative of the control flow, (b) ugly, (c) potentially inefficient depending on language and compiler. So you end up writing one function per state, and doing things like "return NEXT_STATE;" which even look like goto.
Granted, it is difficult to code state machines in a way which make them easy to understand. However, none of that difficulty is to do with using goto, and none of it can be reduced by using alternative control structures. Unless your language has a 'state machine' construct. Mine doesn't.
On those rare occasions when your algorithm really is most comprehensible in terms of a path through a sequence of nodes (states) connected by a limited set of permissible transitions (gotos), rather than by any more specific control flow (loops, conditionals, whatnot), then that should be explicit in the code. And you ought to draw a pretty diagram.
setjmp/longjmp can be nice for implementing exceptions or exception-like behaviour. While not universally praised, exceptions are generally considered a "valid" control structure.
setjmp/longjmp are 'more dangerous' than goto in the sense that they're harder to use correctly, never mind comprehensibly.
There never has been, nor will there
ever be, any language in which it is
the least bit difficult to write bad
code. -- Donald Knuth.
Taking goto out of C would not make it any easier to write good code in C. In fact, it would rather miss the point that C is supposed to be capable of acting as a glorified assembler language.
Next it'll be "pointers considered harmful", then "duck typing considered harmful". Then who will be left to defend you when they come to take away your unsafe programming construct? Eh?
We already had this discussion and I stand by my point.
Furthermore, I'm fed up with people describing higher-level language structures as “goto in disguise” because they clearly haven't got the point at all. For example:
Even the advanced continuation control structure in Scheme can be described as a sophisticated goto.
That is complete nonsense. Every control structure can be implemented in terms of goto but this observation is utterly trivial and useless. goto isn't considered harmful because of its positive effects but because of its negative consequences and these have been eliminated by structured programming.
Similarly, saying “GOTO is a tool, and as all tools, it can be used and abused” is completely off the mark. No modern construction worker would use a rock and claim it “is a tool.” Rocks have been replaced by hammers. goto has been replaced by control structures. If the construction worker were stranded in the wild without a hammer, of course he would use a rock instead. If a programmer has to use an inferior programming language that doesn't have feature X, well, of course she may have to use goto instead. But if she uses it anywhere else instead of the appropriate language feature she clearly hasn't understood the language properly and uses it wrongly. It's really as simple as that.
In Linux: Using goto In Kernel Code on Kernel Trap, there's a discussion with Linus Torvalds and a "new guy" about the use of GOTOs in Linux code. There are some very good points there and Linus dressed in that usual arrogance :)
Some passages:
Linus: "No, you've been brainwashed by
CS people who thought that Niklaus
Wirth actually knew what he was
talking about. He didn't. He doesn't
have a frigging clue."
-
Linus: "I think goto's are fine, and
they are often more readable than
large amounts of indentation."
-
Linus: "Of course, in stupid languages
like Pascal, where labels cannot be
descriptive, goto's can be bad."
In C, goto only works within the scope of the current function, which tends to localise any potential bugs. setjmp and longjmp are far more dangerous, being non-local, complicated and implementation-dependent. In practice however, they're too obscure and uncommon to cause many problems.
I believe that the danger of goto in C is greatly exaggerated. Remember that the original goto arguments took place back in the days of languages like old-fashioned BASIC, where beginners would write spaghetti code like this:
3420 IF A > 2 THEN GOTO 1430
Here Linus describes an appropriate use of goto: http://www.kernel.org/doc/Documentation/CodingStyle (chapter 7).
Today, it's hard to see the big deal about the GOTO statement because the "structured programming" people mostly won the debate and today's languages have sufficient control flow structures to avoid GOTO.
Count the number of gotos in a modern C program. Now add the number of break, continue, and return statements. Furthermore, add the number of times you use if, else, while, switch or case. That's about how many GOTOs your program would have had if you were writing in FORTRAN or BASIC in 1968 when Dijkstra wrote his letter.
Programming languages at the time were lacking in control flow. For example, in the original Dartmouth BASIC:
IF statements had no ELSE. If you wanted one, you had to write:
100 IF NOT condition THEN GOTO 200
...stuff to do if condition is true...
190 GOTO 300
200 REM else
...stuff to do if condition is false...
300 REM end if
Even if your IF statement didn't need an ELSE, it was still limited to a single line, which usually consisted of a GOTO.
There was no DO...LOOP statement. For non-FOR loops, you had to end the loop with an explicit GOTO or IF...GOTO back to the beginning.
There was no SELECT CASE. You had to use ON...GOTO.
So, you ended up with a lot of GOTOs in your program. And you couldn't depend on the restriction of GOTOs to within a single subroutine (because GOSUB...RETURN was such a weak concept of subroutines), so these GOTOs could go anywhere. Obviously, this made control flow hard to follow.
This is where the anti-GOTO movement came from.
Go To can provide a sort of stand-in for "real" exception handling in certain cases. Consider:
ptr = malloc(size);
if (!ptr) goto label_fail;
bytes_in = read(f_in,ptr,size);
if (bytes_in=<0) goto label_fail;
bytes_out = write(f_out,ptr,bytes_in);
if (bytes_out != bytes_in) goto label_fail;
Obviously this code was simplified to take up less space, so don't get too hung up on the details. But consider an alternative I've seen all too many times in production code by coders going to absurd lengths to avoid using goto:
success=false;
do {
ptr = malloc(size);
if (!ptr) break;
bytes_in = read(f_in,ptr,size);
if (count=<0) break;
bytes_out = write(f_out,ptr,bytes_in);
if (bytes_out != bytes_in) break;
success = true;
} while (false);
Now functionally this code does the exact same thing. In fact, the code generated by the compiler is nearly identical. However, in the programmer's zeal to appease Nogoto (the dreaded god of academic rebuke), this programmer has completely broken the underlying idiom that the while loop represents, and did a real number on the readability of the code. This is not better.
So, the moral of the story is, if you find yourself resorting to something really stupid in order to avoid using goto, then don't.
Donald E. Knuth answered this question in the book "Literate Programming", 1992 CSLI. On p. 17 there is an essay "Structured Programming with goto Statements" (PDF). I think the article might have been published in other books as well.
The article describes Dijkstra's suggestion and describes the circumstances where this is valid. But he also gives a number of counter examples (problems and algorithms) which cannot be easily reproduced using structured loops only.
The article contains a complete description of the problem, the history, examples and counter examples.
Goto considered helpful.
I started programming in 1975. To 1970s-era programmers, the words "goto considered harmful" said more or less that new programming languages with modern control structures were worth trying. We did try the new languages. We quickly converted. We never went back.
We never went back, but, if you are younger, then you have never been there in the first place.
Now, a background in ancient programming languages may not be very useful except as an indicator of the programmer's age. Notwithstanding, younger programmers lack this background, so they no longer understand the message the slogan "goto considered harmful" conveyed to its intended audience at the time it was introduced.
Slogans one does not understand are not very illuminating. It is probably best to forget such slogans. Such slogans do not help.
This particular slogan however, "Goto considered harmful," has taken on an undead life of its own.
Can goto not be abused? Answer: sure, but so what? Practically every programming element can be abused. The humble bool for example is abused more often than some of us would like to believe.
By contrast, I cannot remember meeting a single, actual instance of goto abuse since 1990.
The biggest problem with goto is probably not technical but social. Programmers who do not know very much sometimes seem to feel that deprecating goto makes them sound smart. You might have to satisfy such programmers from time to time. Such is life.
The worst thing about goto today is that it is not used enough.
Attracted by Jay Ballou adding an answer, I'll add my £0.02. If Bruno Ranschaert had not already done so, I'd have mentioned Knuth's "Structured Programming with GOTO Statements" article.
One thing that I've not seen discussed is the sort of code that, while not exactly common, was taught in Fortran text books. Things like the extended range of a DO loop and open-coded subroutines (remember, this would be Fortran II, or Fortran IV, or Fortran 66 - not Fortran 77 or 90). There's at least a chance that the syntactic details are inexact, but the concepts should be accurate enough. The snippets in each case are inside a single function.
Note that the excellent but dated (and out of print) book 'The Elements of Programming Style, 2nd Edn' by Kernighan & Plauger includes some real-life examples of abuse of GOTO from programming text books of its era (late-70s). The material below is not from that book, however.
Extended range for a DO loop
do 10 i = 1,30
...blah...
...blah...
if (k.gt.4) goto 37
91 ...blah...
...blah...
10 continue
...blah...
return
37 ...some computation...
goto 91
One reason for such nonsense was the good old-fashioned punch-card. You might notice that the labels (nicely out of sequence because that was canonical style!) are in column 1 (actually, they had to be in columns 1-5) and the code is in columns 7-72 (column 6 was the continuation marker column). Columns 73-80 would be given a sequence number, and there were machines that would sort punch card decks into sequence number order. If you had your program on sequenced cards and needed to add a few cards (lines) into the middle of a loop, you'd have to repunch everything after those extra lines. However, if you replaced one card with the GOTO stuff, you could avoid resequencing all the cards - you just tucked the new cards at the end of the routine with new sequence numbers. Consider it to be the first attempt at 'green computing' - a saving of punch cards (or, more specifically, a saving of retyping labour - and a saving of consequential rekeying errors).
Oh, you might also note that I'm cheating and not shouting - Fortran IV was written in all upper-case normally.
Open-coded subroutine
...blah...
i = 1
goto 76
123 ...blah...
...blah...
i = 2
goto 76
79 ...blah...
...blah...
goto 54
...blah...
12 continue
return
76 ...calculate something...
...blah...
goto (123, 79) i
54 ...more calculation...
goto 12
The GOTO between labels 76 and 54 is a version of computed goto. If the variable i has the value 1, goto the first label in the list (123); if it has the value 2, goto the second, and so on. The fragment from 76 to the computed goto is the open-coded subroutine. It was a piece of code executed rather like a subroutine, but written out in the body of a function. (Fortran also had statement functions - which were embedded functions that fitted on a single line.)
There were worse constructs than the computed goto - you could assign labels to variables and then use an assigned goto. Googling assigned goto tells me it was deleted from Fortran 95. Chalk one up for the structured programming revolution which could fairly be said to have started in public with Dijkstra's "GOTO Considered Harmful" letter or article.
Without some knowledge of the sorts of things that were done in Fortran (and in other languages, most of which have rightly fallen by the wayside), it is hard for us newcomers to understand the scope of the problem which Dijkstra was dealing with. Heck, I didn't start programming until ten years after that letter was published (but I did have the misfortune to program in Fortran IV for a while).
There is no such things as GOTO considered harmful.
GOTO is a tool, and as all tools, it can be used and abused.
There are, however, many tools in the programming world that have a tendency to be abused more than being used, and GOTO is one of them. the WITH statement of Delphi is another.
Personally I don't use either in typical code, but I've had the odd usage of both GOTO and WITH that were warranted, and an alternative solution would've contained more code.
The best solution would be for the compiler to just warn you that the keyword was tainted, and you'd have to stuff a couple of pragma directives around the statement to get rid of the warnings.
It's like telling your kids to not run with scissors. Scissors are not bad, but some usage of them are perhaps not the best way to keep your health.
Since I began doing a few things in the linux kernel, gotos don't bother me so much as they once did. At first I was sort of horrified to see they (kernel guys) added gotos into my code. I've since become accustomed to the use of gotos, in some limited contexts, and will now occasionally use them myself. Typically, it's a goto that jumps to the end of a function to do some kind of cleanup and bail out, rather than duplicating that same cleanup and bailout in several places in the function. And typically, it's not something large enough to hand off to another function -- e.g. freeing some locally (k)malloc'ed variables is a typical case.
I've written code that used setjmp/longjmp only once. It was in a MIDI drum sequencer program. Playback happened in a separate process from all user interaction, and the playback process used shared memory with the UI process to get the limited info it needed to do the playback. When the user wanted to stop playback, the playback process just did a longjmp "back to the beginning" to start over, rather than some complicated unwinding of wherever it happened to be executing when the user wanted it to stop. It worked great, was simple, and I never had any problems or bugs related to it in that instance.
setjmp/longjmp have their place -- but that place is one you'll not likely visit but once in a very long while.
Edit: I just looked at the code. It was actually siglongjmp() that I used, not longjmp (not that it's a big deal, but I had forgotten that siglongjmp even existed.)
It never was, as long as you were able to think for yourself.
Because goto can be used for confusing metaprogramming
Goto is both a high-level and a low-level control expression, and as a result it just doesn't have a appropriate design pattern suitable for most problems.
It's low-level in the sense that a goto is a primitive operation that implements something higher like while or foreach or something.
It's high-level in the sense that when used in certain ways it takes code that executes in a clear sequence, in an uninterrupted fashion, except for structured loops, and it changes it into pieces of logic that are, with enough gotos, a grab-bag of logic being dynamically reassembled.
So, there is a prosaic and an evil side to goto.
The prosaic side is that an upward pointing goto can implement a perfectly reasonable loop and a downward-pointing goto can do a perfectly reasonable break or return. Of course, an actual while, break, or return would be a lot more readable, as the poor human wouldn't have to simulate the effect of the goto in order to get the big picture. So, a bad idea in general.
The evil side involves a routine not using goto for while, break, or return, but using it for what's called spaghetti logic. In this case the goto-happy developer is constructing pieces of code out of a maze of goto's, and the only way to understand it is to simulate it mentally as a whole, a terribly tiring task when there are many goto's. I mean, imagine the trouble of evaluating code where the else is not precisely an inverse of the if, where nested ifs might allow in some things that were rejected by the outer if, etc, etc.
Finally, to really cover the subject, we should note that essentially all early languages except Algol initially made only single statements subject to their versions of if-then-else. So, the only way to do a conditional block was to goto around it using an inverse conditional. Insane, I know, but I've read some old specs. Remember that the first computers were programmed in binary machine code so I suppose any kind of an HLL was a lifesaver; I guess they weren't too picky about exactly what HLL features they got.
Having said all that I used to stick one goto into every program I wrote "just to annoy the purists".
If you're writing a VM in C, it turns out that using (gcc's) computed gotos like this:
char run(char *pc) {
void *opcodes[3] = {&&op_inc, &&op_lda_direct, &&op_hlt};
#define NEXT_INSTR(stride) goto *(opcodes[*(pc += stride)])
NEXT_INSTR(0);
op_inc:
++acc;
NEXT_INSTR(1);
op_lda_direct:
acc = ram[++pc];
NEXT_INSTR(1);
op_hlt:
return acc;
}
works much faster than the conventional switch inside a loop.
Denying the use of the GOTO statement to programmers is like telling a carpenter not to use a hammer as it Might damage the wall while he is hammering in a nail. A real programmer Knows How and When to use a GOTO. I’ve followed behind some of these so-called ‘Structured Programs’ I’ve see such Horrid code just to avoid using a GOTO, that I could shoot the programmer. Ok, In defense of the other side, I’ve seen some real spaghetti code too and again, those programmers should be shot too.
Here is just one small example of code I’ve found.
YORN = ''
LOOP
UNTIL YORN = 'Y' OR YORN = 'N' DO
CRT 'Is this correct? (Y/N) : ':
INPUT YORN
REPEAT
IF YORN = 'N' THEN
CRT 'Aborted!'
STOP
END
-----------------------OR----------------------
10: CRT 'Is this Correct (Y)es/(N)o ':
INPUT YORN
IF YORN='N' THEN
CRT 'Aborted!'
STOP
ENDIF
IF YORN<>'Y' THEN GOTO 10
"In this link http://kerneltrap.org/node/553/2131"
Ironically, eliminating the goto introduced a bug: the spinlock call was omitted.
The original paper should be thought of as "Unconditional GOTO Considered Harmful". It was in particular advocating a form of programming based on conditional (if) and iterative (while) constructs, rather than the test-and-jump common to early code. goto is still useful in some languages or circumstances, where no appropriate control structure exists.
About the only place I agree Goto could be used is when you need to deal with errors, and each particular point an error occurs requires special handling.
For instance, if you're grabbing resources and using semaphores or mutexes, you have to grab them in order and you should always release them in the opposite manner.
Some code requires a very odd pattern of grabbing these resources, and you can't just write an easily maintained and understood control structure to correctly handle both the grabbing and releasing of these resources to avoid deadlock.
It's always possible to do it right without goto, but in this case and a few others Goto is actually the better solution primarily for readability and maintainability.
-Adam
One modern GOTO usage is by the C# compiler to create state machines for enumerators defined by yield return.
GOTO is something that should be used by compilers and not programmers.
Until C and C++ (amongst other culprits) have labelled breaks and continues, goto will continue to have a role.
If GOTO itself were evil, compilers would be evil, because they generate JMPs. If jumping into a block of code, especially following a pointer, were inherently evil, the RETurn instruction would be evil. Rather, the evil is in the potential for abuse.
At times I have had to write apps that had to keep track of a number of objects where each object had to follow an intricate sequence of states in response to events, but the whole thing was definitely single-thread. A typical sequence of states, if represented in pseudo-code would be:
request something
wait for it to be done
while some condition
request something
wait for it
if one response
while another condition
request something
wait for it
do something
endwhile
request one more thing
wait for it
else if some other response
... some other similar sequence ...
... etc, etc.
endwhile
I'm sure this is not new, but the way I handled it in C(++) was to define some macros:
#define WAIT(n) do{state=(n); enque(this); return; L##n:;}while(0)
#define DONE state = -1
#define DISPATCH0 if state < 0) return;
#define DISPATCH1 if(state==1) goto L1; DISPATCH0
#define DISPATCH2 if(state==2) goto L2; DISPATCH1
#define DISPATCH3 if(state==3) goto L3; DISPATCH2
#define DISPATCH4 if(state==4) goto L4; DISPATCH3
... as needed ...
Then (assuming state is initially 0) the structured state machine above turns into the structured code:
{
DISPATCH4; // or as high a number as needed
request something;
WAIT(1); // each WAIT has a different number
while (some condition){
request something;
WAIT(2);
if (one response){
while (another condition){
request something;
WAIT(3);
do something;
}
request one more thing;
WAIT(4);
}
else if (some other response){
... some other similar sequence ...
}
... etc, etc.
}
DONE;
}
With a variation on this, there can be CALL and RETURN, so some state machines can act like subroutines of other state machines.
Is it unusual? Yes. Does it take some learning on the part of the maintainer? Yes. Does that learning pay off? I think so. Could it be done without GOTOs that jump into blocks? Nope.
I actually found myself forced to use a goto, because I literally couldn't think of a better (faster) way to write this code:
I had a complex object, and I needed to do some operation on it. If the object was in one state, then I could do a quick version of the operation, otherwise I had to do a slow version of the operation. The thing was that in some cases, in the middle of the slow operation, it was possible to realise that this could have been done with the fast operation.
SomeObject someObject;
if (someObject.IsComplex()) // this test is trivial
{
// begin slow calculations here
if (result of calculations)
{
// just discovered that I could use the fast calculation !
goto Fast_Calculations;
}
// do the rest of the slow calculations here
return;
}
if (someObject.IsmediumComplex()) // this test is slightly less trivial
{
Fast_Calculations:
// Do fast calculations
return;
}
// object is simple, no calculations needed.
This was in a speed critical piece of realtime UI code, so I honestly think that a GOTO was justified here.
Hugo
One thing I've not seen from any of the answers here is that a 'goto' solution is often more efficient than one of the structured programming solutions often mentioned.
Consider the many-nested-loops case, where using 'goto' instead of a bunch of if(breakVariable) sections is obviously more efficient. The solution "Put your loops in a function and use return" is often totally unreasonable. In the likely case that the loops are using local variables, you now have to pass them all through function parameters, potentially handling loads of extra headaches that arise from that.
Now consider the cleanup case, which I've used myself quite often, and is so common as to have presumably been responsible for the try{} catch {} structure not available in many languages. The number of checks and extra variables that are required to accomplish the same thing are far worse than the one or two instructions to make the jump, and again, the additional function solution is not a solution at all. You can't tell me that's more manageable or more readable.
Now code space, stack usage, and execution time may not matter enough in many situations to many programmers, but when you're in an embedded environment with only 2KB of code space to work with, 50 bytes of extra instructions to avoid one clearly defined 'goto' is just laughable, and this is not as rare a situation as many high-level programmers believe.
The statement that 'goto is harmful' was very helpful in moving towards structured programming, even if it was always an over-generalization. At this point, we've all heard it enough to be wary of using it (as we should). When it's obviously the right tool for the job, we don't need to be scared of it.
I avoid it since a coworker/manager will undoubtedly question its use either in a code review or when they stumble across it. While I think it has uses (the error handling case for example) - you'll run afoul of some other developer who will have some type of problem with it.
It’s not worth it.
Almost all situations where a goto can be used, you can do the same using other constructs. Goto is used by the compiler anyway.
I personally never use it explicitly, don't ever need to.
You can use it for breaking from a deeply nested loop, but most of the time your code can be refactored to be cleaner without deeply nested loops.

How to store goto labels in an array and then jump to them?

I want to declare an array of "jumplabels".
Then I want to jump to a "jumplabel" in this array.
But I have not any idea how to do this.
It should look like the following code:
function()
{
"gotolabel" s[3];
s[0] = s0;
s[1] = s1;
s[2] = s2;
s0:
....
goto s[v];
s1:
....
goto s[v];
s2:
....
goto s[v];
}
Does anyone have a idea how to perform this?
It is possible with GCC feature known as "labels as values".
void *s[3] = {&&s0, &&s1, &&s2};
if (n >= 0 && n <=2)
goto *s[n];
s0:
...
s1:
...
s2:
...
It works only with GCC!
goto needs a compile-time label.
From this example it seems that you are implementing some kind of state machine. Most commonly they are implemented as a switch-case construct:
while (!finished) switch (state) {
case s0:
/* ... */
state = newstate;
break;
/* ... */
}
If you need it to be more dynamic, use an array of function pointers.
There's no direct way to store code addresses to jump to in C.
How about using switch.
#define jump(x) do{ label=x; goto jump_target; }while(0)
int label=START;
jump_target:
switch(label)
{
case START:
/* ... */
case LABEL_A:
/* ... */
}
You can find similar code produced by every stack-less parser / state machine generator.
Such code is not easy to follow so unless it is generated code or your problem is most
easily described by state machine I would recommend not do this.
could you use function pointers instead of goto?
That way you can create an array of functions to call and call the appropriate one.
In plain standard C, this not possible as far as I know. There is however an extension in the GCC compiler, documented here, that makes this possible.
The extension introduces the new operator &&, to take the address of a label, which can then be used with the goto statement.
That's what switch statements are for.
switch (var)
{
case 0:
/* ... */
break;
case 1:
/* ... */
break;
default:
/* ... */
break; /* not necessary here */
}
Note that it's not necessarily translated into a jump table by the compiler.
If you really want to build the jump table yourself, you could use a function pointers array.
You might want to look at setjmp/longjmp.
You can't do it with a goto - the labels have to be identifiers, not variables or constants. I can't see why you would not want to use a switch here - it will likely be just as efficient, if that is what is concerning you.
For a simple answer, instead of forcing compilers to do real stupid stuff, learn good programming practices.
Tokenizer? This looks like what gperf was made for. No really, take a look at it.
Optimizing compilers (including GCC) will compile a switch statement into a jump table (making a switch statement exactly as fast as the thing you're trying to construct) IF the following conditions are met:
Your switch cases (state numbers) start at zero.
Your switch cases are strictly increasing.
You don't skip any integers in your switch cases.
There are enough cases that a jump table is actually faster (a couple dozen compare-and-gotos in the checking-each-case method of dealing with switch statements is actually faster than a jump table.)
This has the advantage of allowing you to write your code in standard C instead of relying on a compiler extension. It will work just as fast in GCC. It will also work just as fast in most optimizing compilers (I know the Intel compiler does it; not sure about Microsoft stuff). And it will work, although slower, on any compiler.

Guideline: while vs for

Disclaimer: I tried to search for similar question, however this returned about every C++ question... Also I would be grateful to anyone that could suggest a better title.
There are two eminent loop structure in C++: while and for.
I purposefully ignore the do ... while construct, it is kind of unparalleled
I know of std::for_each and BOOST_FOREACH, but not every loop is a for each
Now, I may be a bit tight, but it always itches me to correct code like this:
int i = 0;
while ( i < 5)
{
// do stuff
++i; // I'm kind and use prefix notation... though I usually witness postfix
}
And transform it in:
for (int i = 0; i < 5; ++i)
{
// do stuff
}
The advantages of for in this example are multiple, in my opinion:
Locality: the variable i only lives in the scope of the loop
Pack: the loop 'control' is packed, so with only looking at the loop declaration I can figure if it is correctly formed (and will terminate...), assuming of course that the loop variable is not further modified within the body
It may be inlined, though I would not always advised it (that makes for tricky bugs)
I have a tendency therefore not to use while, except perhaps for the while(true) idiom but that's not something I have used in a while (pun intended). Even for complicated conditions I tend to stick to a for construct, though on multiple lines:
// I am also a fan of typedefs
for (const_iterator it = myVector.begin(), end = myVector.end();
it != end && isValid(*it);
++it)
{ /* do stuff */ }
You could do this with a while, of course, but then (1) and (2) would not be verified.
I would like to avoid 'subjective' remarks (of the kind "I like for/while better") and I am definitely interested to references to existing coding guidelines / coding standards.
EDIT:
I tend to really stick to (1) and (2) as far as possible, (1) because locality is recommended >> C++ Coding Standards: Item 18, and (2) because it makes maintenance easier if I don't have to scan a whole body loop to look for possible alterations of the control variable (which I takes for granted using a for when the 3rd expression references the loop variables).
However, as gf showed below, while do have its use:
while (obj.advance()) {}
Note that this is not a rant against while but rather an attempt to find which one of while or for use depending on the case at hand (and for sound reasons, not merely liking).
Not all loops are for iteration:
while(condition) // read e.g.: while condition holds
{
}
is ok, while this feels forced:
for(;condition;)
{
}
You often see this for any input sources.
You might also have implicit iteration:
while(obj.advance())
{
}
Again, it looks forced with for.
Additionally, when forcing for instead of while, people tend to misuse it:
for(A a(0); foo.valid(); b+=x); // a and b don't relate to loop-control
Functionally, they're the same thing, of course. The only reason to differentiate is to impart some meaning to a maintainer or to some human reader/reviewer of the code.
I think the while idiom is useful for communicating to the reader of the code that a non-linear test is controlling the loop, whereas a for idiom generally implies some kind of sequence. My brain also kind of "expects" that for loops are controlled only by the counting expression section of the for statement arguments, and I'm surprised (and disappointed) when I find someone conditionally messing with the index variable inside the execution block.
You could put it in your coding standard that "for" loops should be used only when the full for loop construct is followed: the index must be initialized in the initializer section, the index must be tested in the loop-test section, and the value of the index must only be altered in the counting expression section. Any code that wants to alter the index in the executing block should use a while construct instead. The rationale would be "you can trust a for loop to execute using only the conditions you can see without having to hunt for hidden statements that alter the index, but you can't assume anything is true in a while loop."
I'm sure there are people who would argue and find plenty of counter examples to demonstrate valid uses of for statements that don't fit my model above. That's fine, but consider that your code can be "surprising" to a maintainer who may not have your insight or brilliance. And surprises are best avoided.
i does not automatically increase within a while loop.
while (i < 5) {
// do something with X
if (X) {
i++;
}
}
One of the most beautiful stuff in C++ is the algorithms part of STL. When reading code written properly using STL, the programmer would be reading high-level loops instead of low-level loops.
I don't believe that compilers can optimize significantly better if you chose to express your loop one way or the other. In the end it all boils down to readability, and that's a somewhat subjective matter (even though most people probably agree on most examples' readability factor).
As you have noticed, a for loop is just a more compact way of saying
type counter = 0;
while ( counter != end_value )
{
// do stuff
++counter;
}
While its syntax is flexible enough to allow you to do other things with it, I try to restrict my usage of for to examples that aren't much more complicated than the above. OTOH, I wouldn't use a while loop for the above.
I tend to use for loops when there is some kind of counting going on and the loop ends when the counting ends. Obviously you have your standard for( i=0; i < maxvalue; i++ ), but also things like for( iterator.first(); !iterator.is_done(); iterator.next() ).
I use while loops when it's not clear how many times the loop might iterate, i.e. "loop until some condition that cannot be pre-computed holds (or fails to hold)".
// I am also a fan of typedefs
for (const_iterator it = myVector.begin(), end = myVector.end();
it != end && isValid(*it);
++it)
{ /* do stuff */ }
It seems to me that the above code, is rather less readable than the code below.
// I am also a fan of typedefs
const_iterator it = myVector.begin();
end = myVector.end();
while(it != end && isValid(*it))
{
/* do stuff */
++it}
Personally, I think legibility trumps these kind of formatting standards. If another programmer can't easily read your code, that leads to mistakes at worst, and at best it results in wasted time which costs the company money.
In Ye Olde C, for and while loops were not the same.
The difference was that in for loops, the compiler was free to assign a variable to a CPU register and reclaim the register after the loop. Thus, code like this had non-defined behaviour:
int i;
for (i = 0; i < N; i++) {
if (f(i)) break;
}
printf("%d", i); /* Non-defined behaviour, unexpected results ! */
I'm not 100% sure, but I believe this is described in K&R
This is fine:
int i = 0;
while (i < N) {
if (f(i)) break;
i++;
}
printf("%d", i);
Of course, this is compiler-dependent. Also, with time, compilers stopped making use of that freedom, so if you run the first code in a modern C compiler, you should get the expected results.
I wouldn't be so quick to throw away do-while loops. They are useful if you know your loop body will run at least once. Consider some code which creates one thread per CPU core. With a for loop it might appear:
for (int i = 0; i < number_of_cores; ++i)
start_thread(i);
Uentering the loop, the first thing that is checked is the condition, in case number_of_cores is 0, in which case the loop body should never run. Hang on, though - this is a totally redundant check! The user will always have at least one core, otherwise how is the current code running? The compiler can't eliminate the first redundant comparison, as far as it knows, number_of_cores could be 0. But the programmer knows better. So, eliminating the first comparison:
int i = 0;
do {
start_thread(i++);
} while (i < number_of_cores);
Now every time this loop is hit there is only one comparison instead of two on a one-core machine (with a for loop the condition is true for i = 0, false for i = 1, whereas the do while is false all the time). The first comparison is omitted, so it's faster. With less potential branching, there is less potential for branch mispredicts, so it is faster. Because the while condition is now more predictable (always false on 1-core), the branch predictor can do a better job, which is faster.
Minor nitpick really, but not something to be thrown away - if you know the body will always run at least once, it's premature pessimization to use a for loop.