Why do we have different versions of main functions in C++? [duplicate]

How do void main(...), int main(...) and int _tmain() differ? They are all single-entry, single-exit functions. In what situations do we use each of these start-up functions?

void main() is invalid; the C++ standard requires main to return int. Some compilers let you get away with it.
int main() is one of the two standard forms. The other is int main(int argc, char *argv[]), which can be used to receive command-line arguments. Implementations may permit other forms, but are not required to -- but all such forms must return int.
int _tmain() is specific to Microsoft.

The reason why different implementations support different entry points (or choices of entry points) is that different OSes or environments have different ways of running programs, or different ways of passing argument information into the program.
There are two kinds of C++ implementation:
"hosted" implementations assume the existence of some kind of OS. On hosted implementations, main is required in conforming programs and must return int.
"freestanding" implementations don't assume the existence of an OS. On freestanding implementations it's up to the implementation whether to require main or not, but the standard does still say that if main is required then it must return int.
It's common practice for implementations to provide the facilities of a hosted implementation, but to allow entry points other than main. This conforms to the standard for a hosted implementation, provided that a conforming program that does define main is accepted. In effect the implementation allows (as an extension) certain non-conforming programs with no main function provided they contain instead the implementation-defined alternative. Technically I think it must diagnose the "error", but in practice nobody would use such an extension by accident, so they probably don't want to see a diagnostic.
Similarly, a conforming implementation can accept a program containing void main. Again, for the implementation to conform it must diagnose that the program doesn't conform.
The meaning of a non-conforming program that the implementation accepts anyway, is up to the implementation.
_tmain is a Microsoft extension. It is an alias for main in narrow-character builds and for wmain in wide-character builds. wmain is also a Microsoft extension; it is like main except that the argv strings are provided as wide strings instead of narrow strings. So this is an example of an environment where there are two different ways to give arguments to programs, depending on whether or not the program handles characters outside the range of narrow characters (i.e. outside the 8-bit code page).
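A rough, portable sketch of how an alias like _tmain can work: one macro switch selects the character type, and the entry-point-shaped function is written once against that type. The names DEMO_UNICODE, demo_tchar, DEMO_T, demo_tcslen and demo_tmain are hypothetical stand-ins invented for this illustration; they are not Microsoft's definitions, which live in <tchar.h> and are keyed on _UNICODE.

```cpp
#include <cstddef>
#include <cstring>
#include <cwchar>

// Hypothetical stand-ins for the _UNICODE / _TCHAR / _T machinery.
#ifdef DEMO_UNICODE
using demo_tchar = wchar_t;
#define DEMO_T(s) L##s
inline std::size_t demo_tcslen(const demo_tchar* s) { return std::wcslen(s); }
#else
using demo_tchar = char;
#define DEMO_T(s) s
inline std::size_t demo_tcslen(const demo_tchar* s) { return std::strlen(s); }
#endif

// Written once; compiles against char or wchar_t depending on DEMO_UNICODE.
// Returns the total length of all arguments so there is something to observe.
int demo_tmain(int argc, demo_tchar* argv[]) {
    std::size_t total = 0;
    for (int i = 0; i < argc; ++i) {
        total += demo_tcslen(argv[i]);
    }
    return static_cast<int>(total);
}
```

Building with -DDEMO_UNICODE switches the same source to wide strings, which is the same idea as a wide-character build turning _tmain into wmain.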

Related

Why does modern C++ still retain the old C style prototype for main with int argc, char** argv

Other than backward compatibility, is there any specific reason for not changing the prototype and benefit from modern features of C++?
Modern C++ encourages the use of value semantics.
Why should argv still be a char array with pointer semantics, which could possibly lead to issues if not properly handled?
In Java, the entry point is void main(String[] args) inside a class.
Am I missing anything fundamental?
Other than backward compatibility
This is the reason, and it is considered the most important one in moving C++ forward.
But there is a proposal: A Modern C++ Signature for main
Many systems contain code written in languages other than C++. While it would be possible to have the main entry-point function in user code accept C++ string parameters and then have an implementation-supplied wrapper which translates the underlying platform's arguments into C++ strings and then passes them to the user-code entry point, this would only be advantageous in cases where the underlying environment represented strings in some fashion other than zero-terminated sequences of characters, and where the wrapper would be able to avoid having to first translate to C-style strings before translating into C++ strings. Otherwise, anything that could be done by the wrapper could be done just as well by user code.
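The wrapper described above can be sketched in a few lines of user code, which is the point of the last sentence: nothing here needs help from the implementation. The names modern_main and to_args are hypothetical, chosen only for this illustration.

```cpp
#include <string>
#include <vector>

// Hypothetical "modern" entry point that receives arguments as values.
// It returns the number of arguments after the program name.
int modern_main(const std::vector<std::string>& args) {
    return args.empty() ? 0 : static_cast<int>(args.size()) - 1;
}

// Translate the platform's zero-terminated strings into C++ strings.
std::vector<std::string> to_args(int argc, char* argv[]) {
    return std::vector<std::string>(argv, argv + argc);
}

// The standard entry point would then just forward:
// int main(int argc, char* argv[]) { return modern_main(to_args(argc, argv)); }
```

Each argument is copied once, which is the translation cost the answer refers to when the platform already hands over C-style strings.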

What is the origin of void main?

Often times I see the infamous void main() around the forums and almost immediately a comment following the question telling the user to never use void main() (which I am in complete agreement with). But where is the origin of void main()?
Why am I still seeing newer people pick up the bad habit of having main return nothing when the proper way is to return an int.
I understand WHY this method is wrong as explained in this question and multitudes of others, but I don't know how this method of declaring main came about or even why it is still taught to some students.
Even Bjarne Stroustrup has written void main, in C++, so it's indeed a common anti-meme, and an old one, predating Java and other contemporary languages that support void main. Of course Bjarne has also written that void main has never been part of either C or C++. However, for this latter statement (in his FAQ), at least as of C99 it looks as if Bjarne is wrong, because the N869 draft of the C99 standard says in its §5.1.2.2.3/1 that
“If the return type of the main function is a type compatible with int, a return from the initial call to the main function is equivalent to calling the exit function with the value returned by the main function as its argument; reaching the } that terminates the main function returns a value of 0. If the return type is not compatible with int, the termination status returned to the host environment is unspecified.”
And earlier, in its §5.1.2.2.1/1 it states about the signature of main,
“[...] or in some other implementation-defined manner.”
A return type “not compatible with int” could, for example, be void.
So, while this is not a complete answer (I doubt that historical sources about this are available on the net), at least it goes some way towards correcting the assumptions of the question. It is not the case that void main is a complete abomination in C and C++. But in C++ it's invalid: it's a C thing that's not supported in a hosted C++ implementation.
I have been a victim of this problem, so I think I can tell you why this happens. During our C lectures, the faculty have to start with a sample program (probably "Hello World"), and for that they have to use the main() function.
But since they don't want to confuse students or get into the complexity of teaching return types and return statements at the very start of their C programming lessons, they use (and also ask us to use) void main() and tell us to assume this as the default form until we study functions and return types in detail.
This leads to the wrong habit of using void main() from the very first lecture of our C programming course.
I hope that explains why so many programmers, especially the newer ones, pick up this bad practice.
Cheers,
Mayank
Personally I think it's the following: K&R C didn't require you to specify a return type and implicitly assumed it to be int, and at the same time the examples in K&R didn't use a return value.
For example the first code in K&R first edition is the following:
#include <stdio.h>
main()
{
printf("Hello World\n");
}
So it's no wonder that people reading this later (after a void type was added to the language as an extension by some compilers) assumed that main actually had a void return type. I would've done the same thing.
Actually K&R does say later:
In the interests of simplicity, we have omitted return statements from
our main functions up to this point, but we will include them
hereafter, as a reminder that programs should return status to their
environment.
So that's just another example of what happens when you write incorrect code and include a disclaimer later under the assumption that people will read everything before doing stupid things ;)
As one author amongst a number of others, Herbert Schildt wrote some popular but not necessarily high quality books which espoused the idea.
One egregious example is his The Annotated C Standard. He quotes the ISO/IEC 9899:1990 standard on left-hand pages and provides annotations on the right-hand pages. When he quotes section 5.1.2.2.1 Program Startup, it says:
The function called at program startup is named main. The implementation declares no prototype for this function. It can be defined with no parameters:
int main(void) { /* ... */ }
or with two parameters (…):
int main(int argc, char *argv[]) { /* ... */ }
This doesn't include the 'or in some other implementation-defined manner' clause that was added to C99.
Then, in the annotations, he says:
Interestingly, there is no prototype for main() declared by the compiler. You are therefore free to declare main() as required by your program. For example, here are three common methods of declaring main():
void main(void) /* no return value, no parameters */
int main(void) /* return a value, no parameters */
/* return a value and include command-line parameters */
int main(int argc, char *argv[])
The first variation is not allowed by the C90 standard, despite what he says, but innocent readers might be confused.
Note that section 5.1.2.2.3 Program termination says:
A return from the initial call to the main function is equivalent to calling the exit function with the value returned by the main function as its argument. If the main function executes a return that specifies no value, the termination status returned to the hosted environment is undefined.
Since you'd find that exit takes an int argument, it is clear from this that the return type of main should be int.
The commentary says:
In most implementations, the return value from main(), if there is one, is returned to the operating system. Remember, if you don't explicitly return a value from main() then the value passed to the operating system is, technically, undefined. Though most compilers will automatically return 0 when no other return value is specified (even when main() is declared as void), you should not rely on this fact because it is not guaranteed by the standard.
Some of this commentary is so much bovine excrement, a view I am not alone in holding. The only merit in the book is that it includes almost all of the C90 standard (there's one page missing from the description of fprintf; the same page got printed twice) for far less than the cost of the standard. It's been argued that the difference in price represents the loss of value from the commentary. See Lysator generally for some information on C, and Clive Feather's review of The Annotated C Standard.
Another of his books is C: The Complete Reference, which made it to at least the 4th Edition. The 3rd Edition used void main() extensively; this may have been cleaned up by the 4th Edition, but it's sad it took that many editions to get such a fundamental issue correct.
Embedded programs that run on bare metal, that is without an operating system, never return. On power up, the reset vector jumps indirectly (there is some memory initialization that happens first) to main and inside of main, there is an infinite while (1){} loop. Semantically, a return value for main doesn't make sense.
Possible reasons:
Java programmers used to writing public static void main(...).
A missing return statement could lead some to assume main doesn't return anything, although it implicitly returns 0.
In C you were able to write main() with no return type, and it would be int by default. Maybe some assume a missing return type is equivalent to a void.
Bad books / teachers?
From a C++ point-of-view, 3 sources of confusion exist:
Fundamentalist PC/desktop programmers who fanatically and blindly preach int main() without actually knowing the complete picture in the standard themselves. C and C++ have completely different rules for how main() should be declared in freestanding systems (when programming bare metal embedded systems or operating systems).
The C language, which historically has had different rules compared with C++. And in C, the rules for main() have changed over time.
Legacy compilers and coding standards from the dark ages, including programming teachers stuck in the 1980s.
I'll address each source of confusion in this answer.
The PC/desktop programmers are problematic since they assume that hosted systems are the only systems existing and therefore spread incorrect/incomplete propaganda about the correct form of main(), dogmatically stating that you must use int main(), incorrectly citing the standard while doing so, if at all.
Both the C and C++ standards have always listed two kinds of systems: freestanding and hosted.
In freestanding implementations, void main (void) has always been allowed in C. In C++, freestanding implementations are slightly different: a freestanding implementation must either not name the entry function main() or follow the stated forms that return int.
Not even Bjarne Stroustrup manages to cite the standards or explain this correctly/completely, so no wonder that the average programmer is confused! (He is citing the hosted environment sub-chapter and fails to cite all relevant parts of it).
This is all discussed in detail with references to the standard(s) here, Bjarne and others please read.
Regarding void main (void) in hosted systems, this originates way back, from somewhere in the dark ages before the ISO C standard, where everything was allowed.
I would suspect that the major culprit behind it is the Borland Turbo C compiler, which was already the market leader when ISO C was released in 1990. This compiler allowed void main (void).
And it should be noted that void main (void) was implicitly forbidden in C90 for hosted systems; no implementation-defined forms were allowed. So Turbo C was never a strictly conforming implementation. Yet it is still used in schools (particularly in India), teaching every student incorrect programming standards from scratch.
Since C99, void main (void) and other forms became allowed in C, because of a strange sentence which was added: "or in some other implementation-defined manner". This is also discussed in the linked answer above, with references to the C99 rationale and other parts of the C standard that are assuming that a hosted system main() may not return int.
Therefore in C, void main (void) is (arguably) currently an allowed form for hosted implementations, given that the compiler documents what it does. But note that since this is implementation-defined behavior, it is the compiler that determines whether this form is allowed or not, not the programmer!
In C++, void main (void) is not an allowed form.

How the CRT calls main with different parameters

We can write the main function in several ways:
int main()
int main(int argc,char *argv[])
int main(int argc,char *argv[],char * environment)
How does the run-time CRT startup function know which main should be called? Please note that I am not asking about Unicode support here.
The accepted answer is incorrect; there's no special code in the CRT to recognize the kind of main() declaration.
It works because of the cdecl calling convention, which specifies that arguments are pushed on the stack from right to left and that the caller cleans up the stack after the call. So the CRT simply passes all arguments to main() and pops them again when main() returns. The only thing you need to do is specify the arguments in the right order in your main() function declaration. The argc parameter has to be first; it is the one on the top of the stack. argv has to be second, and so on. Omitting an argument makes no difference, as long as you omit all the ones that follow as well.
This is also why the printf() function can work even though it takes a variable number of arguments: one argument, the first, is in a known position.
In general, the compiler/linker would need to recognise the particular form of main that you are using and then include code to adapt that from the system startup function to your C or C++ main function.
It is true that specific compilers on specific platforms could get away without doing this, using the methods that Hans describes in his answer. However, not all platforms use the stack to pass parameters, and it is possible to write conforming C and C++ implementations which have incompatible parameter lists. For such cases, then the compiler/linker would need to determine which form of main to call.
Hmmm. It seems that perhaps the currently accepted answer, which indicates that the previously accepted answer is incorrect, is itself incorrect. The tags on this question indicate it applies to C++ as well as C, so I’ll stick to the C++ spec, not C99. Regardless of all other explanations or arguments, the primary answer to this question is that “main() is treated special in an implementation-defined way.” I believe that David's answer is technically more correct than Hans', but I'll explain it in more detail....
The main() function is a funny one, treated by the compiler & linker with behavior that matches no other function. Hans is correct that there is no special code in the CRT to recognize different signatures of main(), but his assertion that it “works because of the cdecl calling convention” applies only to specific platform(s), notably Visual Studio. The real reason that there’s no special code in the CRT to recognize different signatures of main() is that there’s no need to. And though it’s sort of splitting hairs, it’s the linker whose job it is to tie the startup code into main() at link time, it’s not the CRT’s job at startup time.
Much of how the main() function is treated is implementation-defined, as per the C++ spec (see Section 3.6, “Start and termination”). It’s likely that most implementations’ compilers treat main() implicitly with something akin to extern "C" linkage, leaving main() in a non-decorated state so that regardless of its function prototype, its linker symbol is the same. Alternatively, the linker for an implementation could be smart enough to scan through the symbol table looking for any symbol whose decorated name resolves to some form of “[int|void] main(...)” (note that void as a return type is itself an implementation-specific thing, as the spec itself says that the return type of main() must be ‘int’). Once such a function is found in the available symbols, the linker could simply use that where the startup code refers to “main()”, so the exact symbol name doesn’t necessarily have to match anything in particular; it could even be wmain() or other, as long as either the linker knows what variations to look for, or the compiler endows all of the variations with the same symbol name.
Also key to note is that the spec says that main() may not be overloaded, so the linker shouldn’t have to “pick” between multiple user implementations of various forms of main(). If it finds more than one, that’s a duplicate symbol error (or other similar error) even if the argument lists don’t match. And though all implementations “shall” allow both
int main() { /* ... */ }
and
int main(int argc, char* argv[]) { /* ... */ }
they are also permitted to allow other argument lists, including the version you show that includes an environment string array pointer, and any other variation that makes sense in any given implementation.
As Hans indicates, the Visual Studio compiler’s cdecl calling convention (and calling conventions of many other compilers) provide a framework wherein a caller can set up the calling environment (i.e. the stack, or ABI-defined registers, or some combination of the two) in such a way that a variable number of arguments can be passed, and when the callee returns, the caller is responsible for cleanup (popping the used argument space off the stack, or in the case of registers, nothing needs to be done for cleanup). This setup lends itself neatly to the startup code passing more parameters than might be needed, and the user’s main() implementation is free to use or not use any of these arguments, as is the case with many platforms’ treatment of the various forms of main() you list in your question. However, this is not the only way a compiler+linker could accomplish this goal: instead, the linker could choose between various versions of the startup code based on the definition of your main(). Doing so would allow a wide variety of main() argument lists that would otherwise be impossible with the cdecl caller-cleanup model. And since all of that is implementation-defined, it’s legal per the C++ spec, as long as the compiler+linker supports at least the two combinations shown above (int main() and int main(int, char**)).
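The second approach, startup code chosen to match the signature of the user's entry point, can be imitated in ordinary C++17 as a thought experiment. This is only an analogy: a real toolchain would make the choice in the linker or in its startup objects, not with templates, and call_entry and the entry_* functions are hypothetical names.

```cpp
#include <type_traits>

// Three user-style entry points with the signatures discussed above.
int entry_none() { return 10; }
int entry_args(int argc, char* argv[]) { (void)argv; return 20 + argc; }
int entry_env(int argc, char* argv[], char* envp[]) {
    (void)argv;
    (void)envp;
    return 30 + argc;
}

// "Startup code" that picks the call matching the entry point's signature
// at compile time instead of always passing every argument.
template <typename Entry>
int call_entry(Entry entry, int argc, char* argv[], char* envp[]) {
    if constexpr (std::is_invocable_r_v<int, Entry, int, char**, char**>) {
        return entry(argc, argv, envp);
    } else if constexpr (std::is_invocable_r_v<int, Entry, int, char**>) {
        return entry(argc, argv);
    } else {
        static_assert(std::is_invocable_r_v<int, Entry>,
                      "unsupported entry-point signature");
        return entry();
    }
}
```

Unlike the pass-everything model, this never calls a function with arguments it did not declare, so it does not depend on a caller-cleanup calling convention.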
The C99 Standard (5.1.2.2.1 Program startup) says that an implementation declares no prototype for the main() function, and that a program can define it as either of:
1) int main(void);
2) int main(int argc, char *argv[]);
or in a manner semantically equivalent to 2), e.g.
2') int main(int argc, char **argv);
or in other implementation defined ways. It does not mandate that the prototype:
3) int main(int argc, char *argv[],char * envp[]);
will have the intended behaviour - although that prototype must compile, because any prototype must compile. 3) is supported by GCC and Microsoft C among other compilers. (N.B. The questioner's 3rd prototype has char *envp rather than char *envp[], whether by accident or because he/she has some other compiler).
Both GCC and Microsoft C will compile main() with any prototype whatsoever, as they ought to. They parse the prototype that you actually specify and generate assembly language to consume the arguments, if any, in the correct manner. Thus for example they will each generate the expected behaviour for the program:
#include <stdio.h>
void main(double d, char c)
{
printf("%lf\n",d);
putchar(c);
}
if you could find a way of passing a double and a char directly to the program, not via an array of strings.
These observations can be verified by enabling the assembly language listings for experimental programs.
The question of how the compiler's standard CRT permits us to invoke the generated implementation of main() is distinct from the question of how main() may be defined to the compiler.
For both GCC and MS C, main() may be defined any way we like. In each case, however, the implementation's standard CRT, AFAIK, supports passing arguments to main() only as per 3). So 1) - 2') will also have the expected behavior by ignoring excess arguments, and we have no other options short of providing a non-standard runtime of our own.
Hans Passant's answer seems incidentally misleading in suggesting that argc tells the function how many subsequent arguments to consume in the same manner as the first argument to printf(). If argc is present at all, it only denotes the number of elements in the array passed as the second argument argv. It does not indicate how many arguments are passed to main(). Both GCC and MS C figure out what arguments are expected by parsing the prototype that you write - essentially what a compiler does with any function except those, like printf(), that are defined to take a variable number of arguments.
main() does not take a variable number of arguments. It takes the arguments you specify in your definition, and the standard CRTs of the usual compilers assume them to be (int, char *[], char *[]).
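To make the role of argc concrete: it describes the length of the argv array, not the number of parameters main receives. On a hosted implementation argv[argc] is a null pointer, so the same count can be recovered from argv alone. A small sketch, with count_until_null as a hypothetical helper name:

```cpp
#include <cstddef>

// Counts the entries of a null-terminated, argv-style array.
// For the argv passed to main, the result equals argc.
std::size_t count_until_null(char* const argv[]) {
    std::size_t n = 0;
    while (argv[n] != nullptr) {
        ++n;
    }
    return n;
}
```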
First, the main function is treated specifically in GCC (e.g. the main_identifier_node in file gcc/c-family/c-common.c of the source tree of GCC 4.7)
And the C11 and C++11 standards have specific wording and specification about it.
Then, the C calling ABI conventions are usually so that extra arguments don't harm much.
So you can think of it as if both the language specification and the compiler have specific things regarding "overloading" of main.
I even think that main might not be an ordinary function. I believe that some words in the standard -which I don't have right now- might be e.g. understood as forbidding taking its address or recursing on main.
In practice, main is called by some assembly code compiled into crt*.o files linked by gcc. Use gcc -v to understand more about what is happening.

Why is it bad to type void main() in C++ [duplicate]

Why is
void main() {
//return void
}
bad?
The other day I typed this and someone pointed out to me that it is wrong to do so. I was so confused. I have been writing like this for a while now, I know it isn't C++ standard, but the compiler doesn't give out any warnings. Why is this wrong?
Just because the compiler you use does not error out on it doesn't mean other compilers won't. You know it's not standard, after all...
It is wrong exactly because it is not standard. One compiler might accept this, another might complain, and the pedantic believers will burn your ass on the stake anyways.
Because every program should indicate to other programs whether or not it completed successfully, or if there was some sort of error, and you can't do that if your main doesn't return anything.
Plus, the standard says that main should return an int.
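A minimal sketch of what reporting success or failure looks like, assuming the program's work lives in a function whose result main simply returns. The function run and its rule (fail if any argument is empty) are hypothetical; EXIT_SUCCESS and EXIT_FAILURE come from <cstdlib>.

```cpp
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical program logic: succeeds only if every argument is non-empty.
int run(const std::vector<std::string>& args) {
    for (const std::string& arg : args) {
        if (arg.empty()) {
            return EXIT_FAILURE;  // the caller sees a failure status
        }
    }
    return EXIT_SUCCESS;
}

// main would pass the status on to the environment:
// int main(int argc, char* argv[]) {
//     return run(std::vector<std::string>(argv, argv + argc));
// }
```

With void main there is no return value to carry this status, which is the practical objection raised above.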
It's wrong because the standard (at least C++03) states that main should return an int (for hosted environments, that is - freestanding environments like embedded systems can pretty well do whatever they want). From 3.6.1 Main function, paragraph 2:
An implementation shall not predefine the main function. This function shall not be overloaded. It shall have a return type of type int, but otherwise its type is implementation-defined.
All implementations shall allow both of the following definitions of main: int main() { /* ... */ } and int main(int argc, char* argv[]) { /* ... */ }.
If you value portability at all (and you should), you should write code that conforms with the standard as much as practicable.
Undefined behaviour like:
x = x++ + --x;
may work (for whatever definition of "work" you have) under some circumstances as well, that doesn't make it a good idea :-)
It's nonstandard.
i.e. you're not writing "C++" (as it was conceived) when you write this. It might look like C++, but you're not following the rules, so you're not actually writing C++.
Also its result is undefined in most cases.
Unlike in other languages such as C#, where "bad" behavior causes errors, C++ allows anything to happen when an erroneous construct is used. So you can't depend on the compiler doing the "correct" thing, because it may do so one time, but not another.
In general, you want to avoid undefined behavior, so you shouldn't do this.

C++ void return type of main()

Some C++ compilers allow the main function to have return type void. But doesn't the Operating System require int type value returned to specify whether the program ended well or not?
C++ does not allow main to have a void return type. The published C++ standard requires it to be int. Some C++ compilers allow you to use void, but that's not recommended. In general, the OS doesn't care one way or the other. A specific OS might require a program to give a return value, but it doesn't necessarily have to come from main's return value. If the C++ compiler allows void, then it probably provides some other means of specifying the program's exit code.
C++ allows main function to have return type void
No, it doesn't.
The C++ standard only requires 2 different types of main signatures. Others may be optionally added if the return type is int.
Implementations of C++ which allow void return types are incorrect in terms of the C++ standard.
C++03 standard S. 3.6.1-2:
An implementation shall not predefine the main function.
This function shall not be overloaded. It shall
have a return type of type int, but otherwise its type is implementation-defined.
All implementations shall allow both
of the following definitions of main:
int main() { /* ... */ }
int main(int argc, char* argv[]) {/* ... */ }
If you want portable C++ code, or to write good C++ examples then you should always use one of the 2 variations above.
main returning void is accepted for backwards compatibility, but it is not legal.
In this case, the exit code will be 0. You can still change the exit code, using exit function.
The C++ standard does not allow main() to have a return type of void. Most compilers will let it pass for historical reasons, though.
In languages where a void return from main is legal (not C++), the OS usually sees a return value of 0 on normal (non-exceptional) program termination.
That's why void main() is not allowed by standard C++, though some compilers accept it as an extension (GCC accepts it with a warning when compiling C, but g++ rejects it).
To make it short: always use int main(), never void main().
Depending on the compiler, you may be able to use a void main function, however the proper way (that a truly standard compliant compiler should follow) is to return int with 0 being a nice & clean exit and anything else indicating that your program has done something wrong.
But doesn't OS require int type value returned to specify whether program
ended well or not?
Why would it always? On Windows, when you double-click the icon, the process simply goes away after it ends; the OS does not check the return value there. Even on Linux, if you just run the binary as ./runBinary, it simply runs and exits. The OS does not by itself show a message saying whether it failed or succeeded.
All the above answers are right that the standard says it is int, but some compilers allow void too.