How function declared __declspec(naked) stores local variables? - c++

__declspec(naked) void printfive() {
int i = 5;
printf("%i\n", i);
}
for some reason this code works, but I do not understand where the i is stored? In the frame of the calling function? It becomes global variable? If it is stored in the caller's frame, then how compiler knows the displacement, because you can call printfive() from different functions with different frame size and local variables. If it is global, or something like static maybe, I have tried to go recursive and I can see that variable is not changed, it is not truly local indeed. But that's obvious, there is no entry code (prolog). Ok, I understand, no prolog, no frame, no register change, but this is values, but what happens to the scope? Is the behaviour of this specifier defined in any reference? Is this part of C++ standard anyway? This sort of functions are great if you mostly use asm {} inside them, (or use asm to call them and want to be sure the function is not over optimized), but you can mix with C++. But this is sort of brain twister.

I know this topic is more than several years old, Here are my own answers.
Since no references are made to Microsoft documentation regarding this topic, for those who care to learn more about what is needed or not as stated by Keltar, here is the Microsoft documentation that explains most of what Keltar didn't explain here.
According to Microsoft documentation, and should be avoided.
The following rules and limitations apply to naked functions:
The return statement is not permitted.
Structured Exception Handling and C++ Exception Handling constructs are not permitted because they must unwind across the stack frame.
For the same reason, any form of setjmp is prohibited
Use of the _alloca function is prohibited.
To ensure that no initialization code for local variables appears before the prolog sequence, initialized local variables are not
permitted at function scope. In particular, the declaration of C++
objects are not permitted at function scope. There may, however, be
initialized data in a nested scope.
Frame pointer optimization (the /Oy compiler option) is not recommended, but it is automatically suppressed for a naked function.
You cannot declare C++ class objects at the function lexical scope. You can, however, declare objects in a nested block.

From gcc manual:
Use this attribute ... to indicate that the specified function does
not need prologue/epilogue sequences generated by the compiler. It is
up to the programmer to provide these sequences. The only statements
that can be safely included in naked functions are asm statements that
do not have operands. All other statements, including declarations of
local variables, if statements, and so forth, should be avoided. Naked
functions should be used to implement the body of an assembly
function, while allowing the compiler to construct the requisite
function declaration for the assembler.
And it isn't standard (as well as any __declspec or __attribute__)

When entering or exiting a function, the compiler adds code to help with the passing or parameters. When a function is declared naked, non of the parameter variables assignment code is generated, if you want to get at any of the parameters, you will need to directly access the relevant registers or stack (depending on the ABI defined calling convention).
In your case, you are not passing parameters to the function, therefore your code works even though the function is declared naked. Take a look at the dis-assembler if you want to see the difference.

Related

What is a magic function in C/C++ (in regards to OpenMP)

Currently looking at this guide to using OpenMP with C/C++ programs and wonder what they mean by creating a magic function in the quote below:
Internally, GCC implements this by creating a magic function and
moving the associated code into that function, so that all the
variables declared within that block become local variables of that
function (and thus, locals to each thread). ICC, on the other hand,
uses a mechanism resembling fork(), and does not create a magic
function. Both implementations are, of course, valid, and semantically
identical.
A "magic" function is a function created by the compiler - its magicness comes from the fact that you as a programmer don't need to do anything, it's "magically done for you".

Is the C++ calling convention constrained by the standard, since the return type of a function does not need to be defined when the fn is declared?

While studying the One Definition Rule in Wikipedia, I became stuck on the following example in the Examples section:
struct S; // declaration of S
...
S f(); // ok, no definition required
...
I know that space on the stack needs to be allotted for the return value, but seeing this example made me think that C++ calling conventions might dictate that stack management for the return value is handled by the code block in which the function is defined, rather than the code block in which it is called. So I investigated "C vs. C++ calling convention" (recalling that the issue of stack return value allocation might be a primary difference), and came across this answer, which indicates that "calling convention" is not defined by the standard.
However, given the apparent requirement that the above code snippet is valid, it seems to me that there must be some constraints on calling convention in order to support the above code snippet.
Am I right? Does the C++ standard implicitly require that stack management for the return value of a function be handled by the code that defines the function, in order to support the syntax above?
As mentioned in the comments
As you have written your example, Both Struct S and function f are forward declarations. The Compiler Will indeed complain if you attempt to use either
** EDIT as noted by Steven Sudit, function f is not a forward declaration but a function prototype**
and
Also, I believe that default calling convention ( and optional calling conventions ) are explicitly implementation dependent with the exception of those with external linkage. If you search the c++ standard for "calling convention". It is mentioned only once in section 7.5 Linkage Specifications
As to your specific question
Am I right? Does the C++ standard implicitly require that stack management for the return value of a function be handled by the code that defines the function, in order to support the syntax above?
Definitely not, as many compilers support calling conventions where the values are not even passed/returned on the stack (FASTCALL) or microsofts version of (thiscall) where the caller cleans the stack.
The C/C++ standard does not define calling conventions. That is the job of compiler vendors to implement on their own, as evident by the fact that calling convention keywords start with underscores indicating they are vendor-provided extensions.
The C/C++ standard defines the base rules (how to assign values to parameters and return values, pass by-value vs by-reference, etc), but the calling conventions dictate how to accomplish those rules in different ways (passing parameters via stack or registers, in which order, which registers, who cleans up the stack, etc).
In the casev of x86, vendors have agreed on the semantics of the __cdecl and __stdcall calling conventions for interoperability (although there are some slight variations in __cdecl implementations in some cases), but other calling conventions are vendor-specific (Microsoft's __fastcall/__thiscall, Borland's __fastcall/__safecall/__msfastcall, etc).
In the case of x64, there is only one calling convention, dictated by x64 itself. Calling convention keywords are silently ignored by x64 compiler so existing code will still compile and work correctly (as long as it is not using inline assembly to access/manipulate the call stack directly).

Is the register keyword still used?

Just came across the register keyword in C++ and I wondered as this seems a good idea (keeping certain variables in a register) surely the compiler does this by default?
So I wondered is this keyword still used?
Most implementations just ignore the register keyword (unless it imposes a syntactical or semantical error).
The standard also doesn't say that anything must be kept in a register; merely that it's a hint to the implementation that the variable is going to be used very often. Its use is even deprecated.
7.1.1 Storage class specifiers [dcl.stc]
3) A register specifier is a hint to the implementation that the variable so declared will be heavily used. [ Note: The hint can be ignored and in most implementations it will be ignored if the address of the variable is taken. This use is deprecated (see D.2). — end note ]
The standard says this (7.1.1(2-3)):
The register specifier shall be applied only to names of variables declared in a block (6.3) or to function parameters (8.4). It specifies that the named variable has automatic storage duration (3.7.3). A variable declared without a storage-class-specifier at block scope or declared as a function parameter has automatic storage duration by default.
A register specifier is a hint to the implementation that the variable so declared will be heavily used. [ Note: The hint can be ignored and in most implementations it will be ignored if the address of the variable is taken. This use is deprecated (see D.2). — end note ]
In summary: register is useless, vestigial, atavistic and deprecated. Its main purpose is to make the life of people harder who are trying to implement self-registering classes and want to name the main function register(T *).
Probably the only remotely serious use for the register keyword left is a GCC extension that allows you to use a hard-coded hardware register without inline assembly:
register int* foo asm("a5");
This will mean that any access to foo will affect the CPU register a5.
This extension of course has little use outside of very low-level code.
Only specific number of registers are available for any C++ program.
Also, it is just a suggestion for the compiler mostly compilers can do this optimization themselves so there is not really much use of using register keyword and so more because compilers may or may not follow the suggestion.
So the only thing register keyword does with modern compilers is prevent you from using & to take the address of the variable.
Using the register keyword just prevents you from taking the address of the variable in C, while in C++ taking the address of the variable just makes the compiler ignore the register keyword.
Bottomline is, Just don't use it!
Nicely explained by Herb:
Keywords That Aren't (or, Comments by Another Name)
No, it's not used. It's only a hint, and a very weak one at that. Compilers have register allocators, they can figure out which variables should be kept in registers (and account for things you probably never thought about).
The keyword "register" has been deprecated since the 2011 C++ standard; see "Remove Deprecated Use of the register Keyword". It should therefore not be used.
In my own experiments I found that debug code generated by gcc (v8.1.1) does differ if the "register" keyword is used; the generated assembly code allocates designated variables to registers. Benchmarks even showed that this code ran faster (than code without "register"). This is irrelevant, however, as release (optimised) code showed no differences (ie, using "register" had no effect). Vacbob states here that if any optimization is enabled, then gcc ignores "register". My own tests confirm this.
So, in summary, don't use "register" and if debug code appears to run faster when "register" is used, bear in mind that the optimized release code will not.

Why calling main() is not allowed in C++

C++03 3.6.1.3: The function main shall not be used (3.2) within a program. ...
I wonder why this rule exists... Is anyone aware of any system/implementation where it would be a problem if main were used?
P.S. 1. I know the definition of the term used. 2. I know there are simple workarounds like calling a single MyMain() from main() and using MyMain() instead. 3. The question is about real-world implementations which would have a problem if the restriction weren't there. Thanks!
In addition to the other answers: The c++ spec guarantees that all static initialization has happened before main is called.
If code could call main then some static scoped object could call main, in which case a fundamental guarantee is violated.
The spec can't say "static scoped objects should not call main()" because many objects are not written specifically to be instantiated at static scope always. It also can't say that constructors should not call main() - because its very difficult to audit and prove that a constructor isn't calling a method, calling a method, that might sometimes, call main().
I'd imagine this preserves an implementation's freedom to prefix main with code to construct globals and statics, accept any parameters identifying the environment and command line arguments and map them to the argc/argv/env conventions of C++, construct an appropriate stack and exception framework for the application to execute etc. Consider that not all environments may allow an executable image to have any other symbol designated as initialisation code to be run before main().
Similarly, cleanup code may be appended to main(), along with a call to the OS with some mapping from the 0/non-zero convention of C and C++ to the actual success/failure values used by that specific OS.
Consequently, calling main from elsewhere could attempt a second reinitialisation of the application framework or force an unintended exit to the OS - sounds catastrophic to me.
C++'s main() is a strange little function that has different syntax for exception-handling, doesn't have to return a value, even though it has to be defined as returning int, etc. I don't know whether this affects any real implementations, but I would guess that the restriction exists to give compiler writers some latitude in how they implement main().

Function and declaring a local variable

Just having an conversation with collegue at work how to declare a variables.
For me I already decided which style I prefer, but maybe I wrong.
"C" style - all variable at the begining of function.
If you want to know data type of variable, just look at the begining of function.
bool Foo()
{
PARAM* pParam = NULL;
bool rc;
while (true)
{
rc = GetParam(pParam);
... do something with pParam
}
}
"C++" style - declare variables as local as possible.
bool Foo()
{
while (true)
{
PARAM* pParam = NULL;
bool rc = GetParam(pParam);
... do something with pParam
}
}
What do you prefer?
Update
The question is regarding POD variables.
The second one. (C++ style)
There are at least two good reasons for this:
This allow you to apply the YAGNI principle in the code, as you only declare variable when you need them, as close as possible to their use. That make the code easier to understand quickly as you don't have to get back and forth in the function to understand it all. The type of each variable is the main information about the variable and is not always obvious in the varaible name. In short : the code is easier to read.
This allow better compiler optimizations (when possible). Read : http://www.tantalon.com/pete/cppopt/asyougo.htm#PostponeVariableDeclaration
If due to the language you are using you are required to declare variables at the top of the function then clearly you must do this.
If you have a choice then it makes more sense to declare variables where they are used. The rule of thumb I use is: Declare variables with the smallest scope that is required.
Reducing the scope of a variable prevents some types errors, for example where you accidentally use a variable outside of a loop that was intended only to be used inside the loop. Reducing the scope of the variable will allow the compiler to spot the error instead of having code that compiles but fails at runtime.
I prefer the "C++ style". Mainly because it allows RAII, which you do in both your examples for the bool variable.
Furthermore, having a tight scope for the variable provides the compile better oppertunities for optimizations.
This is probably a bit subjective.
I prefer as locally as possible because it makes it completely clear what scope is intended for the variable, and the compiler generates an error if you access it outside the intended useful scope.
This isn't a style issue. In C++, non-POD types will have their constructors called at the point of declaration and destructors called at the end of the scope. You have to be wise about selecting where to declare variables or you will cause unnecessary performance issues. For example, declaring a class variable inside a loop may not be the wisest idea since constructor/destructor will be called every iteration of the loop. But sometimes, declaring class variables at the top of the function may not be the best if there is a chance that variable doesn't get used at all (like a variable is only used inside some 'if' statement).
I prefer C style because the C++ style has one major flaw to me: in a dense function it is very hard on eyes to find the declaration/initialization of the variable. (No syntax highlighting was able yet to cope reliably and predictably with my C++ coding hazards habits.)
Though I do adhere to no style strictly: only key variables are put there and most smallish minor variables live within the block where they are needed (like bool rc in your example).
But all important key variables in my code inevitably end up being declared on the top. And if in a nested block I have too much local variables, that is the sign that I have to start thinking about splitting the code into smaller functions.