Why calling main() is not allowed in C++

C++03 3.6.1.3: The function main shall not be used (3.2) within a program. ...
I wonder why this rule exists... Is anyone aware of any system/implementation where it would be a problem if main were used?
P.S. 1. I know the definition of the term used. 2. I know there are simple workarounds like calling a single MyMain() from main() and using MyMain() instead. 3. The question is about real-world implementations which would have a problem if the restriction weren't there. Thanks!

In addition to the other answers: the C++ spec guarantees that all static initialization has happened before main is called.
If code could call main, then the constructor of some static-scoped object could call main, in which case that fundamental guarantee would be violated.
The spec can't say "static-scoped objects should not call main()", because many classes are not written to be instantiated only at static scope. It also can't say that constructors should not call main(), because it's very difficult to audit and prove that a constructor isn't calling a method that calls another method that might sometimes call main().

I'd imagine this preserves an implementation's freedom to prefix main with code to construct globals and statics, accept any parameters identifying the environment and command line arguments and map them to the argc/argv/env conventions of C++, construct an appropriate stack and exception framework for the application to execute etc. Consider that not all environments may allow an executable image to have any other symbol designated as initialisation code to be run before main().
Similarly, cleanup code may be appended to main(), along with a call to the OS with some mapping from the 0/non-zero convention of C and C++ to the actual success/failure values used by that specific OS.
Consequently, calling main from elsewhere could attempt a second reinitialisation of the application framework or force an unintended exit to the OS - sounds catastrophic to me.

C++'s main() is a strange little function that has different syntax for exception-handling, doesn't have to return a value, even though it has to be defined as returning int, etc. I don't know whether this affects any real implementations, but I would guess that the restriction exists to give compiler writers some latitude in how they implement main().
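The workaround mentioned in the question (moving the program body into a separate function and calling that instead of main) can be sketched as follows. This is only an illustration: MyMain is the name used in the question, while its signature and the toy logic inside it are assumptions made for the example.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for the real program logic. Unlike main, an ordinary
// function like this may be called from anywhere, including from itself.
int MyMain(int argc, const char* const* argv) {
    assert(argc >= 1);                      // argv[0] is the program name
    if (argc == 1) return 0;                // no arguments left to process
    // Add the length of the first argument, then "re-enter" on the rest.
    return static_cast<int>(std::strlen(argv[1])) + MyMain(argc - 1, argv + 1);
}
```

The real main would then do nothing except forward its arguments to MyMain and return the result, so any start-up or clean-up code the implementation attaches to main runs exactly once.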

Related

What is a magic function in C/C++ (in regards to OpenMP)

Currently looking at this guide to using OpenMP with C/C++ programs and wonder what they mean by creating a magic function in the quote below:
Internally, GCC implements this by creating a magic function and
moving the associated code into that function, so that all the
variables declared within that block become local variables of that
function (and thus, locals to each thread). ICC, on the other hand,
uses a mechanism resembling fork(), and does not create a magic
function. Both implementations are, of course, valid, and semantically
identical.
A "magic" function is a function created by the compiler - its magicness comes from the fact that you as a programmer don't need to do anything, it's "magically done for you".
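As a rough sketch of what such a compiler-generated function might look like, the example below outlines the body of a parallel block by hand. All names here are hypothetical, and the "runtime" is simulated with a plain loop rather than real threads, so it only illustrates how block-local variables become locals of the outlined function.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// What the programmer writes (conceptually):
//   #pragma omp parallel
//   { int local = thread_id * 10; results[thread_id] = local; }

// Hypothetical "magic" function: the block body is moved here, so `local`
// becomes an ordinary local variable of this function.
static void outlined_parallel_block(std::size_t thread_id, std::vector<int>& results) {
    int local = static_cast<int>(thread_id) * 10;  // private to each invocation
    results[thread_id] = local;
}

// Stand-in for the runtime. A real OpenMP runtime would invoke the outlined
// function once per thread, concurrently; this sketch calls it sequentially.
std::vector<int> run_parallel_region(std::size_t num_threads) {
    std::vector<int> results(num_threads, 0);
    for (std::size_t id = 0; id < num_threads; ++id) {
        outlined_parallel_block(id, results);
    }
    assert(results.size() == num_threads);
    return results;
}
```

Because each call to the outlined function gets its own stack frame, each thread in a real implementation would get its own copy of `local` without any extra bookkeeping.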

Is the C++ calling convention constrained by the standard, since the return type of a function does not need to be defined when the function is declared?

While studying the One Definition Rule in Wikipedia, I became stuck on the following example in the Examples section:
struct S; // declaration of S
...
S f(); // ok, no definition required
...
I know that space on the stack needs to be allotted for the return value, but seeing this example made me think that C++ calling conventions might dictate that stack management for the return value is handled by the code block in which the function is defined, rather than the code block in which it is called. So I investigated "C vs. C++ calling convention" (recalling that the issue of stack return value allocation might be a primary difference), and came across this answer, which indicates that "calling convention" is not defined by the standard.
However, given the apparent requirement that the above code snippet is valid, it seems to me that there must be some constraints on calling convention in order to support the above code snippet.
Am I right? Does the C++ standard implicitly require that stack management for the return value of a function be handled by the code that defines the function, in order to support the syntax above?
As mentioned in the comments
As you have written your example, both struct S and function f are forward declarations. The compiler will indeed complain if you attempt to use either.
EDIT: as noted by Steven Sudit, function f is not a forward declaration but a function prototype.
and
Also, I believe that the default calling convention (and any optional calling conventions) are explicitly implementation-dependent, with the exception of those with external linkage. If you search the C++ standard for "calling convention", it is mentioned only once, in section 7.5 Linkage Specifications.
As to your specific question
Am I right? Does the C++ standard implicitly require that stack management for the return value of a function be handled by the code that defines the function, in order to support the syntax above?
Definitely not: many compilers support calling conventions in which values are not even passed or returned on the stack (fastcall), and the conventions differ on who cleans the stack (the caller under cdecl, the callee under stdcall and Microsoft's thiscall).
The C/C++ standard does not define calling conventions. It is the job of compiler vendors to implement them on their own, as is evident from the fact that calling convention keywords start with underscores, indicating that they are vendor-provided extensions.
The C/C++ standard defines the base rules (how to assign values to parameters and return values, pass by-value vs by-reference, etc), but the calling conventions dictate how to accomplish those rules in different ways (passing parameters via stack or registers, in which order, which registers, who cleans up the stack, etc).
In the case of x86, vendors have agreed on the semantics of the __cdecl and __stdcall calling conventions for interoperability (although there are some slight variations in __cdecl implementations in some cases), but other calling conventions are vendor-specific (Microsoft's __fastcall/__thiscall, Borland's __fastcall/__safecall/__msfastcall, etc).
In the case of x64, there is essentially one calling convention per platform, dictated by the platform ABI (the Microsoft x64 convention on Windows, the System V AMD64 convention elsewhere). The x86 calling convention keywords are generally ignored by x64 compilers, so existing code will still compile and work correctly (as long as it is not using inline assembly to access or manipulate the call stack directly).
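To make the point about the Wikipedia snippet concrete, here is a small sketch showing that only the declaration of f is allowed while S is incomplete. By the time f is defined or called, S must be complete, so the compiler always knows the size and layout of the return value at every call site. The member `value` and the helper `call_f` are made up for the example.

```cpp
#include <cassert>

struct S;        // declaration only: S is an incomplete type here
S f();           // OK: declaring a function that returns S needs no definition of S

// Neither defining nor calling f() would compile at this point, because both
// require S to be a complete type.

struct S {       // S is complete from here on
    int value;
};

S f() { return S{42}; }   // definition: allowed now that S is complete

int call_f() {
    S result = f();       // call site: the size and layout of S are known here
    assert(result.value == 42);
    return result.value;
}
```

So the snippet places no constraint on the calling convention: the caller and the definition both see the complete type, and the implementation remains free to return S in registers, on the stack, or through a hidden pointer.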

Using constructor of static data to perform work before main()

Our system has a plugin-based architecture with each module effectively having a 'main' function. I need to have a small piece of code run before a module's main() is invoked. I've had success putting the code in the constructor of a dummy class, then declaring one static variable of that class, eg:
namespace {
    class Dummy {
    public:
        Dummy() { /* do work here */ }
    };
    Dummy theDummy;
}
void main() {...}
This seems to work well, but is it a valid solution in terms of the compiler guaranteeing the code will run? Is there any chance it could detect that theDummy is not referenced anywhere else in the system and compile/link it away completely, or will it realise that the constructor needs to run? Thanks
See n3797 S3.7.1/2:
If a variable with static storage duration has initialization or a destructor with side effects, it shall not be eliminated even if it appears to be unused,
Yes, the initialisation has to run. It cannot be simply omitted.
See S3.6.2/4:
It is implementation-defined whether the dynamic initialization of a non-local variable with static storage duration is done before the first statement of main. If the initialization is deferred to some point in time after the first statement of main, it shall occur before the first odr-use (3.2) of any function or variable defined in the same translation unit as the variable to be initialized.
Yes, the initialisation has to be completed before any code runs in the same translation unit.
The use of an entry point called main() in your plugin is of no particular importance.
You're good to go.
As per a comment, you do need to make sure that your Dummy constructor and your main function are in the same translation unit for this to work. If they were compiled separately and only linked together this guarantee would not apply.
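A minimal sketch of the pattern under discussion, with a flag added so the effect can be observed. The names `g_setup_done` and `module_entry` are hypothetical; `module_entry` stands in for the plugin's entry point and lives in the same translation unit as the Dummy object.

```cpp
#include <cassert>

namespace {

bool g_setup_done = false;   // constant-initialized, before any dynamic initialization

class Dummy {
public:
    Dummy() { g_setup_done = true; }   // the work to do before the entry point
};

Dummy theDummy;   // initializer has side effects, so it may not be eliminated

}  // namespace

// Hypothetical module entry point, defined in the same translation unit.
int module_entry() {
    assert(g_setup_done);   // theDummy was initialized before this function was first used
    return g_setup_done ? 0 : 1;
}
```

Even on an implementation that defers dynamic initialization, the first use of module_entry forces theDummy to be initialized first, because both are defined in the same translation unit.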
Don't call your function main() unless it is a program entry point. If it is, then you are guaranteed that static object constructors will be called before main().
Once main has started, the initialization is guaranteed to run before any function or variable in the same translation unit is used.
So if, as here, the object is in the same translation unit as main, then its initialization is guaranteed to run before main. If it's in another translation unit, then it's implementation-defined whether it will run before main. In the worst case, if the program doesn't use anything from that translation unit, it might not run at all.
In general, a compiler is allowed to optimize something out only if it can be sure that the semantics are the same. So if you call any function that it can't see into, for example, then it must assume that the function has side effects, and won't optimize the code out.
Note that you may have initialization order issues between translation units, however, since the initialization order of static objects between TUs is in general not guaranteed. It is, however, guaranteed that the constructor will be called before the "main" for your module is entered (assuming same TU). See Section 3.6.2 of the C++11 standard for full details.
If a platform-specific mechanism will work for you, look into using a function attribute such as __attribute__((constructor)), which is supported by g++ and clang++.
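A minimal sketch of the attribute approach, assuming the GCC/Clang __attribute__((constructor)) extension is the attribute meant. The names `g_hook_ran`, `before_main_hook` and `hook_ran` are hypothetical, and the fallback branch for other compilers simply reuses the static-object technique.

```cpp
#include <cassert>

namespace {
bool g_hook_ran = false;   // constant-initialized, so it is valid before any hook runs
}

#if defined(__GNUC__) || defined(__clang__)
// GCC/Clang extension: this function runs during program start-up, before main.
__attribute__((constructor)) static void before_main_hook() {
    g_hook_ran = true;
}
#else
// Fallback for other compilers: a static object with a side-effecting constructor.
namespace {
struct HookRunner {
    HookRunner() { g_hook_ran = true; }
} g_hook_runner;
}
#endif

// Reports whether the start-up hook has run.
bool hook_ran() {
    assert(g_hook_ran);
    return g_hook_ran;
}
```

This is not standard C++, so code that must be portable should prefer the static-object form.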

How function declared __declspec(naked) stores local variables?

__declspec(naked) void printfive() {
int i = 5;
printf("%i\n", i);
}
For some reason this code works, but I do not understand where i is stored. In the frame of the calling function? Does it become a global variable? If it is stored in the caller's frame, how does the compiler know the displacement, given that printfive() can be called from different functions with different frame sizes and local variables? If it is global, or perhaps something like static: I have tried recursion and can see that the variable is not changed, so it is not truly local. But that's obvious, since there is no entry code (prolog). OK, I understand: no prolog, no frame, no register changes. But that covers the values; what happens to the scope? Is the behaviour of this specifier defined in any reference? Is it part of the C++ standard at all? Functions of this sort are great if you mostly use asm {} inside them (or use asm to call them and want to be sure the function is not over-optimized), but you can also mix in C++, and that is a bit of a brain-twister.
I know this topic is several years old, but here are my own answers.
Since no references have been made to the Microsoft documentation on this topic, here it is for those who care to learn more about what is and isn't needed, as raised by Keltar. It explains most of what Keltar didn't explain here.
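According to the Microsoft documentation, initialized local variables at function scope, as in the example, are not permitted in naked functions and should be avoided.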
The following rules and limitations apply to naked functions:
The return statement is not permitted.
Structured Exception Handling and C++ Exception Handling constructs are not permitted because they must unwind across the stack frame.
For the same reason, any form of setjmp is prohibited
Use of the _alloca function is prohibited.
To ensure that no initialization code for local variables appears before the prolog sequence, initialized local variables are not permitted at function scope. In particular, the declaration of C++ objects are not permitted at function scope. There may, however, be initialized data in a nested scope.
Frame pointer optimization (the /Oy compiler option) is not recommended, but it is automatically suppressed for a naked function.
You cannot declare C++ class objects at the function lexical scope. You can, however, declare objects in a nested block.
From gcc manual:
Use this attribute ... to indicate that the specified function does
not need prologue/epilogue sequences generated by the compiler. It is
up to the programmer to provide these sequences. The only statements
that can be safely included in naked functions are asm statements that
do not have operands. All other statements, including declarations of
local variables, if statements, and so forth, should be avoided. Naked
functions should be used to implement the body of an assembly
function, while allowing the compiler to construct the requisite
function declaration for the assembler.
And it isn't standard (nor is any __declspec or __attribute__).
When entering or exiting a function, the compiler adds code to help with the passing of parameters. When a function is declared naked, none of the parameter assignment code is generated; if you want to get at any of the parameters, you will need to access the relevant registers or stack directly (depending on the calling convention defined by the ABI).
In your case, you are not passing parameters to the function, therefore your code works even though the function is declared naked. Take a look at the disassembly if you want to see the difference.

Is the main() function re-entrant?

I heard that in C, main() is reentrant, while in C++ it is not.
Is this true? What is the scenario of re-entering the main() function?
Early C++ implementations, which were based on translation to C, implemented global constructors via adding a function call to the beginning of main. Under such an implementation, calling main again would re-run the global ctors, resulting in havoc, so it was simply forbidden to do so.
C on the other hand had no reason to forbid calling main, and it was always traditionally possible.
As for when it's useful, I would say "rarely". Most of the programs I've seen that called main were IOCCC entries.
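The problem with the early translation scheme described above can be simulated in a few lines. This is not how any particular compiler actually worked; `run_global_ctors` and `translated_main` are hypothetical stand-ins for the generated initialization call and the translated main.

```cpp
#include <cassert>

namespace {

int g_ctor_runs = 0;   // how many times the "global constructors" have run

// Hypothetical stand-in for the generated code that constructs all globals.
void run_global_ctors() { ++g_ctor_runs; }

// Stand-in for a translated main: the initialization call is prefixed to the
// user's code, so it runs on every entry.
int translated_main(int reenter_depth) {
    run_global_ctors();                              // inserted by the translator
    if (reenter_depth > 0) {
        return translated_main(reenter_depth - 1);   // "calling main again"
    }
    return 0;
}

}  // namespace

// Returns how many times the globals were "constructed" for a given nesting depth.
int count_ctor_runs(int reenter_depth) {
    assert(reenter_depth >= 0);
    g_ctor_runs = 0;
    translated_main(reenter_depth);
    return g_ctor_runs;
}
```

With no re-entry the globals are constructed once, as intended; a single recursive call already constructs them twice, which is the havoc the rule was meant to prevent.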