C++ declare 'main' as a reference to function?

What if I define main as a reference to function?
#include<iostream>
#include<cstring>
using namespace std;
int main1()
{
cout << "Hello World from main1 function!" << endl;
return 0;
}
int (&main)() = main1;
What will happen? I tested this in an online compiler and got a "Segmentation fault" error.
Under VC++ 2013 it builds, but the program crashes at run time: the generated code jumps to the function pointer's data as if it were code, so it crashes immediately on launch.
I would also like an ISO C++ standard quote about this.
The concept will be useful if you want to define either of 2 entry-points depending on some macro like this:
int main1();
int main2();
#ifdef _0_ENTRY
int (&main)() = main1;
#else
int (&main)() = main2;
#endif

That's not a conformant C++ program. C++ requires that (section 3.6.1)
A program shall contain a global function called main
Your program contains a global not-a-function called main, which introduces a name conflict with the main function that is required.
One justification for this would be it allows the hosted environment to, during program startup, make a function call to main. It is not equivalent to the source string main(args) which could be a function call, a function pointer dereference, use of operator() on a function object, or construction of an instance of a type main. Nope, main must be a function.
One additional thing to note is that the C++ Standard never says what the type of main actually is, and prevents you from observing it. So implementations can (and do!) rewrite the signature, for example adding any of int argc, char** argv, char** envp that you have omitted. Clearly it couldn't know to do this for your main1 and main2.

This would be useful if you want to define either of 2 entry-points depending on some macro
No, not really. You should do this:
int main1();
int main2();
#ifdef _0_ENTRY
int main() { return main1(); }
#else
int main() { return main2(); }
#endif

This is soon going to become clearly ill-formed, thanks to the resolution of CWG issue 1886, currently in "tentatively ready" status, which adds, among other things, the following to [basic.start.main]:
A program that declares a variable main at global scope or that
declares the name main with C language linkage (in any namespace) is
ill-formed.

What will happen in practice is highly dependent on the implementation.
In your case your compiler apparently implements that reference as a "pointer in disguise". In addition, the pointer has external linkage. That is, your program exports an external symbol called main, which is actually associated with a memory location in the data segment occupied by a pointer. The linker, without looking too closely, records that memory location as the program's entry point.
Later, trying to use that location as an entry point causes a segmentation fault. Firstly, there's no meaningful code at that location. Secondly, a mere attempt to pass control to a location inside a data segment might trigger the "data execution protection" mechanisms of your platform.
By doing this you apparently hoped that the reference will get optimized out, i.e. that main will become just another name for main1. In your case it didn't happen. The reference survived as an independent external object.
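For contrast, a reference to a function works fine under any name other than main. The following is a minimal sketch, not taken from the question: entry and checkEntry are hypothetical names, and main1 returns 7 only so that there is something to check.

```cpp
#include <cassert>

int main1() { return 7; }

// A reference to a function is fine for an ordinary name; only the name
// main has to denote a real function.
int (&entry)() = main1;

// Hypothetical helper: calling through the reference reaches main1.
void checkEntry() {
    assert(entry() == 7);
    assert(&entry == &main1);  // both expressions name the same function
}
```

Here the compiler is free to treat entry as another name for main1 or to store it as a pointer; either way an ordinary call goes through the right address, which is exactly what the startup code does not do for main.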

Looks like you already answered the question about what happens.
As for why: in your code main is a reference (in effect, a pointer) to a function, which is different from a function. I would expect a segmentation fault if the startup code jumps to a pointer as though it were a function.

Related

What is the purpose of having a function prototype in a function?

I will admit I have not kept up with the latest C/C++ releases but I was wondering why having a function prototype in a function is valid code? Is it related to lambda usage?
Here is sample code - this will compile/run on Visual Studio 2019 and g++ 5.4.0
int main()
{
int func(bool test);
return 0;
}
A code block may contain any number of declarations. And because a function prototype is a declaration, it may appear in a block.
Granted, it doesn't make much sense to do so logistically as opposed to declaring a function at file scope, but it's syntactically correct.
In that example the declaration is pointless. But in a more complex example it's not:
int main() {
int func(bool test);
func(true);
return 0;
}
That code is equivalent to the more usual formulation:
int func(bool test);
int main() {
    func(true);
    return 0;
}
except that the first one introduces the name func only inside the scope of main; the second one introduces the name into the global scope.
I occasionally use the first form when I don't want to scroll through the source file to figure out where to put the declaration; putting it in the function where it's going to be used is a quick-and-dirty solution. And if it's temporary code (adding debugging output, for example), it makes it easier to remove it all afterwards. But, in general, having declarations at global scope is simpler to deal with. After all, you might want to call that same function from somewhere else, too, and having the global declaration means you don't have to repeat it.
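A small sketch of the block-scope form, with the definition appearing later in the same file. The names caller, twice and checkCaller are hypothetical and chosen only for illustration.

```cpp
#include <cassert>

int caller() {
    int twice(int);    // block-scope declaration: the name is visible only inside caller
    return twice(21);
}

// The definition can come later in this file, or live in another file.
int twice(int x) { return 2 * x; }

// Hypothetical helper that exercises the call.
void checkCaller() { assert(caller() == 42); }
```

Any other function placed before the definition of twice would need its own declaration before it could call twice, which is the practical cost of the block-scope form.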

Does int main() need a declaration in C++?

I was taught that functions need declarations to be called. To illustrate, the following example would give me an error as there is no declaration for the function sum:
#include <iostream>
int main() {
std::cout << "The result is " << sum(1, 2);
return 0;
}
int sum(int x, int y) {
return x + y;
}
// main.cpp:4:36: error: use of undeclared identifier 'sum'
// std::cout << "The result is " << sum(1, 2);
// ^
// 1 error generated.
To fix this, I'd add the declaration:
#include <iostream>
int sum(int x, int y); // declaration
int main() {
std::cout << "The result is " << sum(1, 2);
return 0;
}
int sum(int x, int y) {
return x + y;
}
Why doesn't the main function need a declaration, the way other functions like sum do?
A definition of a function is also a declaration of a function.
The purpose of declaring a function is to make it known to the compiler. Declaring a function without defining it allows a function to be used in places where it is inconvenient to define it. For example:
If a function is used in a source file (A) other than the one it is defined in (B), we need to declare it in A (usually via a header that A includes, such as B.h).
If two or more functions may call each other, then we cannot define all those functions before the others—one of them has to be first. So declarations can be provided first, with definitions coming afterward.
Many people prefer to put “higher level” routines earlier in a source file and subroutines later. Since those “higher level” routines call various subroutines, the subroutines must be declared earlier.
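The mutual-call case from the list above can be sketched like this. The functions isEven, isOdd and checkParity are hypothetical examples, not code from the question.

```cpp
#include <cassert>

bool isOdd(unsigned n);   // declared first so that isEven can call it

bool isEven(unsigned n) { return n == 0 ? true : isOdd(n - 1); }
bool isOdd(unsigned n)  { return n == 0 ? false : isEven(n - 1); }

// Hypothetical helper that exercises both functions.
void checkParity() {
    assert(isEven(10));
    assert(isOdd(7));
    assert(!isEven(3));
}
```

Whichever of the two functions is defined first needs the other one declared beforehand; swapping their order only moves the declaration to the other function.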
In C++, a user program never calls main, so it never needs a declaration before the definition. (Note that you could provide one if you wished. There is nothing special about a declaration of main in this regard.) In C, a program can call main. In that case, it does require that a declaration be visible before the call.
Note that main does need to be known to the code that calls it. This is special code in what is typically called the C++ runtime startup code. The linker includes that code for you automatically when you are linking a C++ program with the appropriate linker options. Whatever language that code is written in, it has whatever declaration of main it needs in order to call it properly.
I was taught that functions need declarations to be called.
Indeed. A function must be declared before it can be called.
why we don't add a declaration for the main function?
Well, you didn't call the main function. In fact, you must not call main at all [1], so there is never a need to declare main before anything.
Technically though, all definitions are also declarations, so your definition of main also declares main.
Footnote 1: The C++ standard says that main "shall not be used within a program", so a program that calls main is ill-formed; where a compiler accepts such a call anyway, the behaviour is effectively undefined.
This allows C++ implementations to put special run-once startup code at the top of main, if they aren't able to have it run earlier from hooks in the startup code that normally calls main. Some real implementations do in fact do this, e.g. calling a fast-math function that sets some FPU flags like denormals-are-zero.
On a hypothetical implementation, calling main could result in fun things like re-running constructors for all static variables, re-initializing the data structures used by new/delete to keep track of allocations, or other total breakage of your program. Or it might not cause any problem at all. Undefined behaviour doesn't mean it has to fail on every implementation.
The prototype is required if you want to call the function, but it's not yet available, like sum in your case.
You must not call main yourself, so there is no need to have a prototype. It's even a bad idea to write a prototype.
No, the compiler does not need a forward declaration for main().
main() is a special function in C++.
Some important things to remember about main() are:
The linker requires that one and only one main() function exist when creating an executable program.
The compiler expects a main() function in one of the following two forms:
int main () { /* body */ }
int main (int argc, char *argv[]) { /* body */ }
where body is zero or more statements
An additional acceptable form is implementation specific and provides a list of the environment variables at the time the function is called:
int main (int argc, char* argv[], char *envp[]) { /* body */ }
The coder must provide the 'definition' of main using one of these acceptable forms, but the coder does not need to provide a declaration. The coded definition is accepted by the compiler as the declaration of main().
If no return statement is provided, the compiler will provide a return 0; as the last statement in the function body.
As an aside, there is sometimes confusion about whether a C++ program can make a call to main(). This is not recommended. The C++17 draft states that main() "shall not be used within a program." In other words, it cannot be called from within a program. See e.g. Working Draft Standard for C++ Programming Language, dated "2017-03-21", Paragraph 6.6.1.3, page 66. I realize that some compilers support this (including mine), but the next version of the compiler could modify or remove that behavior, as the standard uses the term "shall not".
It is illegal to call main from inside your program. That means the only thing that is going to call it is the runtime, and the compiler/linker can handle setting that up. This means you do not need a prototype for main.
A definition of a function also implicitly declares it. If you need to reference a function before it is defined you need to declare it before you use it.
So writing the following is also valid:
#include <iostream>
int sum(int x, int y) {
    return x + y;
}
int main() {
    std::cout << "The result is " << sum(1, 2);
    return 0;
}
If you use a declaration in one file to make a function known to the compiler before it is defined, then its definition has to be known at linking time:
main.cpp
#include <iostream>
int sum(int x, int y); // declared here, defined in sum.cpp
int main() {
    std::cout << "The result is " << sum(1, 2);
    return 0;
}
sum.cpp
int sum(int x, int y) {
return x + y;
}
Or sum could have its origin in a library, so you do not even compile it yourself.
The main function is not used/referenced in your code anywhere, so there is no need to add the declaration of main anywhere.
Before and after your main function the C++ library might execute some init and cleanup steps, and it will call your main function. If that part of the library were written as C++ code, it would contain a declaration of int main() so that it could be compiled. That code could look like this:
int main();
int __main() {
    __startup_runtime();
    int status = main();   // keep the result so it can be passed on
    __cleanup_runtime();
    return status;
}
But then you have the same problem again with __main, so at some point there is no C++ anymore, and a certain function (main) simply represents the entry point of your code.
Nope. You can't call it anyway.
You only need forward declarations for functions called before they are defined. You need external declarations (which look exactly like forward declarations on purpose) for functions defined in other files.
But you can't call main in C++ so you don't need one. This is because the C++ compiler is allowed to modify main to do global initialization.
[I looked at crt0.c and it does have a declaration for main but that's neither here nor there].

Bad practice to call static function from external file via function pointer?

Consider the following code:
file_1.hpp:
typedef void (*func_ptr)(void);
func_ptr file1_get_function(void);
file_1.cpp:
// file_1.cpp
#include "file_1.hpp"
static void some_func(void)
{
do_stuff();
}
func_ptr file1_get_function(void)
{
return some_func;
}
file2.cpp
#include "file_1.hpp"
void file2_func(void)
{
func_ptr function_pointer_to_file1 = file1_get_function();
function_pointer_to_file1();
}
While I believe the above example is technically possible - to call a function with internal linkage only via a function pointer, is it bad practice to do so? Could there be some funky compiler optimizations that take place (auto inline, for instance) that would make this situation problematic?
There's no problem, this is fine. In fact, IMHO, it is a good practice which lets your function be called without polluting the space of externally visible symbols.
It would also be appropriate to use this technique in the context of a function lookup table, e.g. a calculator which passes in a string representing an operator name, and expects back a function pointer to the function for doing that operation.
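A minimal sketch of such a lookup table, assuming a hypothetical calculator with two operators. The names BinaryOp, lookupOperator, add, mul and checkLookup are illustrative and not part of the question's code.

```cpp
#include <cassert>
#include <cstring>

using BinaryOp = int (*)(int, int);

// Internal linkage: these names are invisible to other translation units.
static int add(int a, int b) { return a + b; }
static int mul(int a, int b) { return a * b; }

// The only externally visible symbol; returns nullptr for unknown names.
BinaryOp lookupOperator(const char* name) {
    if (std::strcmp(name, "add") == 0) return add;
    if (std::strcmp(name, "mul") == 0) return mul;
    return nullptr;
}

// Hypothetical helper showing how a caller in another file would use it.
void checkLookup() {
    assert(lookupOperator("add")(2, 3) == 5);
    assert(lookupOperator("mul")(4, 5) == 20);
    assert(lookupOperator("pow") == nullptr);
}
```

Callers only ever see lookupOperator and the pointer type, so the static functions can be renamed or replaced without touching any other translation unit.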
The compiler/linker isn't allowed to make optimizations which break correct code and this is correct code.
Historical note: back in C89, externally visible symbols had to be unique on the first 6 characters; this was relaxed in C99 and also commonly by compiler extension.
In order for this to work, you have to expose some portion of it as external and that's the clue most compilers will need.
Is there a chance that there's a broken compiler out there that will make mincemeat of this strange practice because they didn't foresee someone doing it? I can't answer that.
I can only think of false reasons to want to do this, though. One is fingerprint hiding, which fails because you have to expose the signature in the function pointer declaration, unless you are planning to cast your way around things, in which case the question is "how badly is this going to hurt".
The other reason would be facading callbacks - you have some super-sensitive static local function in module m and you now want to expose the functionality in another module for callback purposes, but you want to audit that so you want a facade:
static void voodoo_function() {
}
fnptr get_voodoo_function(const char* file, int line) {
// you tagged the question as C++, so C++ io it is.
std::cout << "requested voodoo function from " << file << ":" << line << "\n";
return voodoo_function;
}
...
// question tagged as c++, so I'm using c++ syntax
auto* fn = get_voodoo_function(__FILE__, __LINE__);
but that's not really helping much, you really want a wrapper around execution of the function.
At the end of the day, there is a much simpler way to expose a function pointer. Provide an accessor function.
static void voodoo_function() {}
void do_voodoo_function() {
// provide external access to voodoo
voodoo_function();
}
Because here you provide the compiler with an optimization opportunity - when you link, if you specify whole program optimization, it can detect that this is a facade that it can eliminate, because you let it worry about function pointers.
But is there a really compelling reason not just to remove the static from in front of voodoo_function, other than not exposing the internal name for it? And if so, why is the internal name so precious that you would go to these lengths to hide it?
static void ban_account_if_user_is_ugly() {
...;
}
void do_that_thing() {
ban_account_if_user_is_ugly();
}
vs
void do_that_thing() { // ban account if user is ugly
...
}
--- EDIT ---
Conversion. Your function pointer is int(*)(int) but your static function is unsigned int(*)(unsigned int) and you don't want to have to cast it.
Again: Just providing a facade function would solve the problem, and it will transform into a function pointer later. Converting it to a function pointer by hand can only be a stumbling block for the compiler's whole program optimization.
But if you're casting, let's consider this:
// v1
fnptr get_fn_ptr() {
    // brute force cast because otherwise it's 'hassle'
    return (fnptr)(static_fn);
}
// v2
int facade_fn(int i) {
    auto ui = static_cast<unsigned int>(i);
    auto result = static_fn(ui);
    return static_cast<int>(result);
}
Ok unsigned to signed, not a big deal. And then someone comes along and changes what fnptr needs to be to void(int, float);. One of the above becomes a weird runtime crash and one becomes a compile error.
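A self-contained sketch of the facade variant, assuming the int(*)(int) pointer type and the unsigned static function described in the edit. The names FnPtr, staticFn, facadeFn, getFnPtr and checkFacade, and the body of staticFn, are hypothetical.

```cpp
#include <cassert>

using FnPtr = int (*)(int);

// Stand-in for the static function with the "wrong" signature.
static unsigned int staticFn(unsigned int x) { return x + 1u; }

// Facade with exactly the signature FnPtr expects; conversions are explicit.
static int facadeFn(int i) {
    return static_cast<int>(staticFn(static_cast<unsigned int>(i)));
}

// No cast needed: if FnPtr's type ever changes, this line stops compiling.
FnPtr getFnPtr() { return facadeFn; }

// Hypothetical helper that calls through the returned pointer.
void checkFacade() { assert(getFnPtr()(41) == 42); }
```

Because getFnPtr contains no cast, a later change to FnPtr turns into a compile error at the return statement rather than a mismatched call at run time.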

On an example of global inline functions in C++

With the following setup, I have run into a rather strange phenomenon that I cannot really explain. Using Visual Studio 2005, the following piece of code results in a crash. I would really like to know the reason.
playground.cpp
static int local=-1;
#include "common.h"
int main(int arg)
{
setit();
docastorUpdate();
return 0;
}
common.h
#include <stdio.h>
#include <iostream>
void docastorUpdate();
static int *gemini;
inline void setit()
{
gemini = &local;
}
castor.cpp
static int local = 2;
#include "common.h"
void docastorUpdate() {
setit();
// crashing here, dereferencing a null pointer
std::cout << "castor:" << *gemini << std::endl;
}
The thing is, that the crash disappears when
I move the inline function setit() to an unnamed namespace
I make it static
In a nutshell, I need help understanding the reasons. Any suggestion is appreciated! (I am aware that this solution is not one of the best practices; I am just curious.)
This breaks because you are violating the one-definition rule. The one-definition rule says that in a program, across all translation units, there is only one definition of any given function. inline is sort of an exception to this rule, and it more or less means "dear compiler, there will be several definitions of this function, but they will all be the same, I promise".
static, when used here for local, means "dear compiler, this is an internal detail that only this translation unit will ever see; please do not confuse it with variables named local from other translation units"
So you promised the compiler that all definitions of setit will be the same, and asked the compiler to give each translation unit its very own local variable.
However, since the setit function uses whatever variable named local is in scope, the end result is two different definitions of setit, each one using a different variable. You just broke your promise. The compiler trusted you and the result was a totally messed up program. It thought it could do certain things with the code based on your promises, but since you broke them behind its back, those things it tried to do with the code didn't work at all.
Your code invokes undefined behaviour, in a very subtle way.
[C++11: 7.1.2/4]: An inline function shall be defined in every translation unit in which it is ODR-used and shall have exactly the same definition in every case. [..]
Although the definition looks the same because it's lexically copy-pasted into each translation unit by your #include, it's not because &local isn't taking the address of the same variable, in each case.
As a result, anything can happen when you run your program, including transferring all your hard-earned savings into my bank account, or taking away all of Jon Skeet's rep.
This is why making the function static solves the problem: it then has internal linkage, so each translation unit gets its own separate function. Putting it in an unnamed namespace likewise makes it a different function in each translation unit.
You can avoid using static everywhere; in your case you can use extern instead:
common.h:
extern int *gemini;
common.cpp:
int *gemini = nullptr;
Also avoid using local like that. Instead, you can do this:
inline void setit(int * p)
{
gemini = p;
}
void docastorUpdate()
{
static int local = 2;
setit(&local);
std::cout << "castor:" << *gemini << std::endl;
}

main () returns an integer? [duplicate]

I am currently reading about what functions are in C++. It says that they are "artifacts that enable you to divide the content of your application into functional units that can be invoked in a sequence of your choosing. A function, when invoked, typically returns a value to the calling function."
It then goes on to say that main() is recognized by the compiler as the starting point of your C++ application and has to return an int (integer).
I don't know what is meant by 'has to return an integer'. From my (extremely limited) experience, int main() is the start of your application. But what is meant by 'has to return an int'? This is also intertwined with my not understanding 'typically returns a value to the calling function'.
Just like in mathematics, in C++ functions return values. All functions in C++ must specify exactly what type of value they return, and every function must return only one type of thing. In some cases, that "one type of thing" might be nothing, which is denoted in C++ with the keyword void.
Every function must declare what it returns. This is done via a function declaration. Here are several examples:
int foo();
void bar();
string baz();
int main();
4 function declarations. foo returns an int, bar returns nothing, baz returns a string (which is declared in the C++ Standard Library), and main returns an int.
Not only must every function declare what it returns, it must also return that type of thing. If your function returns void, then you can write:
void bar()
{
return;
}
...or just do nothing:
void bar()
{
}
If your function returns anything other than void, then you have to have a return statement that returns that type of thing:
int foo()
{
return 42;
}
If you declare a function to return one type of thing but then try to return another type of thing, there must be a way to implicitly convert from what you're trying to return to what the function is declared to return. If there is no possible implicit conversion, your program won't compile. Consider:
int foo()
{
return "foobar rulez!";
}
Here, foo is declared to return an int, but I'm trying to return a string (not a string from the Standard Library, but an old C-style const char* string; "foobar rulez!" here is called a string literal).
It is possible to write code to provide the implicit conversion I mentioned earlier, but unless you know exactly why you want to do that it's better to not get mixed up in all that right now.
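For comparison, here is a sketch of two returns where a built-in implicit conversion does exist, so the code compiles. The function names asDouble, greeting and checkConversions are hypothetical.

```cpp
#include <cassert>

// int converts implicitly to double, so returning n is allowed.
double asDouble(int n) { return n; }

// A string literal converts implicitly to const char*.
const char* greeting() { return "foobar rulez!"; }

// Hypothetical helper that checks the converted values.
void checkConversions() {
    assert(asDouble(42) == 42.0);
    assert(greeting()[0] == 'f');
}
```

The earlier foo fails because no such conversion exists from a string literal to int, not because the returned expression has a different type from the declared one.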
What do you do with the values that are returned from functions? Again, just like with mathematics, you can use those values somewhere else in your program.
#include <cstdlib>
#include <iostream>
int foo()
{
return 42;
}
int main()
{
int answer = foo();
std::cout << "The answer to Life, the Universe and Everything is...\n"
<< answer << "!\n";
return 0;
}
Obviously you can't do anything with the value that is returned from a function that returns void, because a function that returns void doesn't really return anything at all. But these kinds of functions are useful for doing stuff kind of on the side.
#include <cstdlib>
#include <iostream>
int theAnswer = 0;
void DeepThought()
{
theAnswer = 42;
}
int foo()
{
return theAnswer;
}
int main()
{
DeepThought();
int answer = foo();
std::cout << "The answer to Life, the Universe and Everything is...\n"
<< answer << "!\n";
return 0;
}
OK, back to all this business with main.
main is a function in C++. There are a few things about main that make it special compared to other functions in C++, and two of those things are:
Every program must have exactly one function called main() (in global scope).
That function must return an int
There is one more thing about main that's a little special and possibly confusing. You don't actually have to write a return statement in main*, even though it is declared to return an int. Consider:
int main()
{
}
Note that there's no return statement here. That is legal and valid in C++ for main, but main is the only function where this is allowed. All other functions must have an explicit return statement if they don't return void.
So what about the return value from main()? When you run a program on a Windows or Linux computer, the program returns a value to the operating system. What that value means depends on the program, but in general a value of 0 means that the program worked without any problems. A value other than 0 often means that the program didn't work, and the exact value is actually a code for what went wrong.
Scripts and other programs can use these return values to decide what to do next. For example, if you wrote a program to rename an MP3 file based on the Artist and Track Number, then your program might return 0 if it worked, 1 if it couldn't figure out the Artist, and 2 if it couldn't figure out the Track Number. You can call this program in a script that renames and then moves files. If you want your script to quit if there was an error renaming the file, then it can check these return values to see if it worked or not.
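The MP3 example's exit codes could be computed by a helper like the one below. This is only a sketch: renameStatus, its two parameters and checkRenameStatus are hypothetical, and the actual tag parsing is left out.

```cpp
#include <cassert>

// Exit codes from the MP3 example:
// 0 = success, 1 = artist unknown, 2 = track number unknown.
int renameStatus(bool artistKnown, bool trackKnown) {
    if (!artistKnown) return 1;
    if (!trackKnown) return 2;
    return 0;
}

// Hypothetical helper that checks each documented code.
void checkRenameStatus() {
    assert(renameStatus(true, true) == 0);
    assert(renameStatus(false, true) == 1);
    assert(renameStatus(true, false) == 2);
}
```

A real program would end main with something like return renameStatus(artistKnown, trackKnown); so that the calling script sees the code.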
* No explicit return statement in main: in cases where main does not have an explicit return, it is defined to return the value 0.
Although it may appear so when you are programming in C or C++, main is not actually the "first thing" that happens. Typically, somewhere in the guts of the C or C++ runtime library is a call to main, which starts your program. When your program is finished and returns from main, it will return a value (in C++, if you don't specify something, the compiler will automatically add return 0), and this return value is used to signal "the success" of the program.
In Unix/Linux etc., this is available as $?, so you can run echo $? after a program to see what the "result" was: 0 means "went well", other values are used for "failure". In Windows, there is an ERRORLEVEL variable in batch scripts, etc., that can be used to see the result of the last command.
Edit: If your code calls another program, e.g. through CreateProcess in Windows, or fork()/exec() in a Unix-style OS (or the C runtime functions spawn and siblings in almost any OS), the return value from main is reported when the new process finishes and is made available to the owning process. End Edit.
Even in C++, main is a "C"-style function: if you change the return type, it still has the same name, so the linker can't detect that it has the wrong return type. If you declare void main(), std::string main(), float main() or anything other than int main(), some compilers reject it outright; where it does compile, what happens in the code calling main is undefined behaviour, which means "almost anything can happen".
This is how you report back to the operating system the exit status of the program, whether it ran successfully or not. For example in Linux you can use the following command:
echo $?
to obtain the exit status of the program that ran previously.
Yes, main should always return an int. This can be used to show whether the program ran successfully: usually 0 represents success, and a non-zero value represents some kind of failure.
For example, in Linux, you can call your program in a bash script, and in this script, run different commands on the return status of your program.
It means, in the signature of main() the return type is int:
int main();
int main(int argc, char const *argv[]);
The question now is what value to return from main(). The return value is the exit status, which tells the runtime whether main() executed successfully or unsuccessfully.
In Linux, I usually return EXIT_SUCCESS or EXIT_FAILURE depending on the cases. These are macros defined by <cstdlib>.
int main()
{
//code
if ( some failure condition )
return EXIT_FAILURE;
//code
return EXIT_SUCCESS;
}
As per the doc:
#define EXIT_SUCCESS /*implementation defined*/
#define EXIT_FAILURE /*implementation defined*/
The EXIT_SUCCESS and EXIT_FAILURE macros expand into an integral expression and indicate program execution status.
Constant      Explanation
EXIT_SUCCESS  successful execution of a program
EXIT_FAILURE  unsuccessful execution of a program