I was taught that functions need declarations to be called. To illustrate, the following example would give me an error as there is no declaration for the function sum:
#include <iostream>
int main() {
std::cout << "The result is " << sum(1, 2);
return 0;
}
int sum(int x, int y) {
return x + y;
}
// main.cpp:4:36: error: use of undeclared identifier 'sum'
// std::cout << "The result is " << sum(1, 2);
// ^
// 1 error generated.
To fix this, I'd add the declaration:
#include <iostream>
int sum(int x, int y); // declaration
int main() {
std::cout << "The result is " << sum(1, 2);
return 0;
}
int sum(int x, int y) {
return x + y;
}
Why doesn't the main function need a declaration, the way other functions like sum do?
A definition of a function is also a declaration of a function.
The purpose of declaring a function is to make it known to the compiler. Declaring a function without defining it allows the function to be used in places where it is inconvenient to define it. For example:
If a function is used in a source file (A) other than the one it is defined in (B), we need to declare it in A (usually via a header that A includes, such as B.h).
If two or more functions may call each other, then we cannot define each of those functions before the others; one of them has to be first. So declarations can be provided first, with definitions coming afterward.
Many people prefer to put “higher level” routines earlier in a source file and subroutines later. Since those “higher level” routines call various subroutines, the subroutines must be declared earlier.
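The mutual-recursion case can be sketched as follows. The names is_even, is_odd and check_parity are hypothetical, chosen only for illustration. Neither function can be defined before the other, so one declaration has to come first.

```cpp
#include <cassert>

// Forward declaration: is_even calls is_odd before is_odd is defined.
bool is_odd(unsigned n);

bool is_even(unsigned n) {
    return n == 0 ? true : is_odd(n - 1);
}

bool is_odd(unsigned n) {
    return n == 0 ? false : is_even(n - 1);
}

// Small self-check of the two functions.
void check_parity() {
    assert(is_even(10));
    assert(is_odd(7));
    assert(!is_odd(0));
}
```

Removing the first declaration should make the compiler reject the call to is_odd inside is_even, for the same reason sum was rejected in the question.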
In C++, a user program never calls main, so it never needs a declaration before the definition. (Note that you could provide one if you wished. There is nothing special about a declaration of main in this regard.) In C, a program can call main. In that case, it does require that a declaration be visible before the call.
Note that main does need to be known to the code that calls it. This is special code in what is typically called the C++ runtime startup code. The linker includes that code for you automatically when you are linking a C++ program with the appropriate linker options. Whatever language that code is written in, it has whatever declaration of main it needs in order to call it properly.
I was taught that functions need declarations to be called.
Indeed. A function must be declared before it can be called.
why we don't add a declaration for the main function?
Well, you didn't call the main function. In fact, you must not call main at all (see footnote 1), so there is never a need to declare main before anything.
Technically though, all definitions are also declarations, so your definition of main also declares main.
Footnote 1: The C++ standard says it's undefined behaviour to call main from within the program.
This allows C++ implementations to put special run-once startup code at the top of main, if they aren't able to have it run earlier from hooks in the startup code that normally calls main. Some real implementations do in fact do this, e.g. calling a fast-math function that sets some FPU flags like denormals-are-zero.
On a hypothetical implementation, calling main could result in fun things like re-running constructors for all static variables, re-initializing the data structures used by new/delete to keep track of allocations, or other total breakage of your program. Or it might not cause any problem at all. Undefined behaviour doesn't mean it has to fail on every implementation.
The prototype is required if you want to call the function, but it's not yet available, like sum in your case.
You must not call main yourself, so there is no need to have a prototype. It's even a bad idea to write a prototype.
No, the compiler does not need a forward declaration for main().
main() is a special function in C++.
Some important things to remember about main() are:
The linker requires that one and only one main() function exist when creating an executable program.
The compiler expects a main() function in one of the following two forms:
int main () { /* body */ }
int main (int argc, char *argv[]) { /* body */ }
where body is zero or more statements
An additional acceptable form is implementation specific and provides a list of the environment variables at the time the function is called:
int main (int argc, char* argv[], char *envp[]) { /* body */ }
The coder must provide the 'definition' of main using one of these acceptable forms, but the coder does not need to provide a declaration. The coded definition is accepted by the compiler as the declaration of main().
If no return statement is provided, the compiler will provide a return 0; as the last statement in the function body.
As an aside, there is sometimes confusion about whether a C++ program can make a call to main(). This is not recommended. The C++17 draft states that main() "shall not be used within a program." In other words, it cannot be called from within a program. See e.g. Working Draft Standard for C++ Programming Language, dated "2017-03-21", Paragraph 6.6.1.3, page 66. I realize that some compilers support this (including mine), but the next version of the compiler could modify or remove that behavior, as the standard uses the term "shall not".
It is illegal to call main from inside your program. That means the only thing that is going to call it is the runtime, and the compiler/linker can handle setting that up. This means you do not need a prototype for main.
A definition of a function also implicitly declares it. If you need to reference a function before it is defined you need to declare it before you use it.
So writing the following is also valid:
#include <iostream>
int sum(int x, int y) {
return x + y;
}
int main() {
std::cout << "The result is " << sum(1, 2);
return 0;
}
If you use a declaration in one file to make a function known to the compiler before it is defined, then its definition has to be known at linking time:
main.cpp
#include <iostream>
int sum(int x, int y); // declaration only; the definition lives in sum.cpp
int main() {
std::cout << "The result is " << sum(1, 2);
return 0;
}
sum.cpp
int sum(int x, int y) {
return x + y;
}
Or sum could have its origin in a library, so you do not even compile it yourself.
The main function is not used/referenced in your code anywhere, so there is no need to add the declaration of main anywhere.
Before and after your main function, the C++ library might execute some init and cleanup steps, and it will call your main function. If that part of the library were represented as C++ code, it would contain a declaration of int main() so that it could be compiled. That code could look like this:
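int main(); // the declaration the startup code needs
int __main() {
__startup_runtime();
int result = main(); // keep main's return value as the exit status
__cleanup_runtime();
return result;
}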
But then you again have the same problem with __main, so at some point there is no C++ anymore and a certain function (main) just represents the entry point of your code.
Nope. You can't call it anyway.
You only need forward declarations for functions called before they are defined. You need external declarations (which look exactly like forward declarations on purpose) for functions defined in other files.
But you can't call main in C++ so you don't need one. This is because the C++ compiler is allowed to modify main to do global initialization.
[I looked at crt0.c and it does have a declaration for main but that's neither here nor there].
Related
I have a legacy interface that has a function with a signature that looks like the following:
int provide_values(int &x, int &y)
x and y are considered output parameters in this function. Note: I'm aware of the drawbacks of using output parameters and that there are better design choices for such an interface. I'm not trying to debate the merits of this interface.
Within the implementation of this function, it first checks to see if the addresses of the two output parameters are the same, and returns an error code if they are.
if (&x == &y) {
return -1; // Error: both output parameters are the same variable
}
Is there a way at compile time to prevent callers of this function from providing the same variable for the two output parameters without having such a check within the body of the function? I'm thinking of something similar to the restrict keyword in C, but that only is a signal to the compiler for optimization, and only provides a warning when compiling code that calls such a function with the same pointer.
No, there's not. Keep in mind that the calling code could derive x and y from references returned from some arbitrary black-box functions. But even otherwise, it is impossible in general for the compiler to robustly determine whether they refer to the same object (the question is undecidable, in the same way as the halting problem), since the objects they are bound to are determined by the execution of the program.
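To illustrate why the check has to stay at run time, here is a minimal sketch. The body of provide_values (the values 1 and 2) and the helpers pick and check_provide_values are assumptions made for this example, not part of the original interface. Which object pick returns depends on a run-time value, so the aliasing only shows up when the function actually runs.

```cpp
#include <cassert>
#include <memory>

// Sketch of the legacy function: rejects aliased output parameters at run time.
int provide_values(int &x, int &y) {
    if (std::addressof(x) == std::addressof(y)) {
        return -1; // Error: both output parameters are the same variable
    }
    x = 1; // placeholder values, assumed for this example
    y = 2;
    return 0;
}

// Hypothetical helper: which object it returns depends on a run-time value.
int &pick(bool first, int &a, int &b) {
    return first ? a : b;
}

// Exercises both the distinct and the aliased case.
void check_provide_values() {
    int a = 0, b = 0;
    assert(provide_values(a, b) == 0);
    assert(provide_values(a, pick(true, a, b)) == -1); // aliases a
}
```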
If all you want to do is prevent the user from calling provide_values(xyz, xyz), you can use a macro as in the following example. However, this won't protect the user from calling provide_values(xyz, reference_to_xyz), so the whole thing is probably pointless anyway.
#include <cstring>
void provide_values(int&, int&) {}
#define PROV_VAL(x, y) if (strcmp((#x),(#y))) { provide_values(x, y); } else { throw -1; }
int main()
{
int x;
int y;
PROV_VAL(x,y);
//PROV_VAL(x,x); // this throws
int& z = x;
PROV_VAL(x,z); // this passes though!
}
I will admit I have not kept up with the latest C/C++ releases but I was wondering why having a function prototype in a function is valid code? Is it related to lambda usage?
Here is sample code - this will compile/run on Visual Studio 2019 and g++ 5.4.0
int main()
{
int func(bool test);
return 0;
}
A code block may contain any number of declarations. And because a function prototype is a declaration, it may appear in a block.
Granted, it doesn't make much sense to do so logistically as opposed to declaring a function at file scope, but it's syntactically correct.
In that example the declaration is pointless. But in a more complex example it's not:
int main() {
int func(bool test);
func(true);
return 0;
}
That code is equivalent to the more usual formulation:
int func(bool test);
int main() {
func(true);
return 0;
}
except that the first one introduces the name func only inside the scope of main; the second one introduces the name into the global scope.
I occasionally use the first form when I don't want to scroll through the source file to figure out where to put the declaration; putting it in the function where it's going to be used is a quick-and-dirty solution. And if it's temporary code (adding debugging output, for example), it makes it easier to remove it all afterwards. But, in general, having declarations at global scope is simpler to deal with. After all, you might want to call that same function from somewhere else, too, and having the global declaration means you don't have to repeat it.
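The scoping difference can be seen in a small sketch. The functions uses_local_declaration, uses_definition and check_scopes, as well as the body of func, are hypothetical. The block-scope declaration makes func callable inside one function only.

```cpp
#include <cassert>

int uses_local_declaration() {
    int func(bool test); // block-scope declaration: visible only in this function
    return func(true);
}

// A function placed here could not call func without a declaration of its
// own, because the name was introduced only inside uses_local_declaration.

int func(bool test) { // the definition, which also declares func from here on
    return test ? 1 : 0;
}

int uses_definition() {
    return func(false); // fine: the definition above is visible
}

void check_scopes() {
    assert(uses_local_declaration() == 1);
    assert(uses_definition() == 0);
}
```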
What if I define main as a reference to function?
#include<iostream>
#include<cstring>
using namespace std;
int main1()
{
cout << "Hello World from main1 function!" << endl;
return 0;
}
int (&main)() = main1;
What will happen? I tested it in an online compiler and got a "Segmentation fault" error.
Under VC++ 2013 it also builds a program that crashes at run-time!
The compiled startup code jumps to the data of the function pointer as if it were code, so the program crashes immediately on launch.
I would also like an ISO C++ standard quote about this.
The concept will be useful if you want to define either of 2 entry-points depending on some macro like this:
int main1();
int main2();
#ifdef _0_ENTRY
int (&main)() = main1;
#else
int (&main)() = main2;
#endif
That's not a conformant C++ program. C++ requires that (section 3.6.1)
A program shall contain a global function called main
Your program contains a global not-a-function called main, which introduces a name conflict with the main function that is required.
One justification for this would be it allows the hosted environment to, during program startup, make a function call to main. It is not equivalent to the source string main(args) which could be a function call, a function pointer dereference, use of operator() on a function object, or construction of an instance of a type main. Nope, main must be a function.
One additional thing to note is that the C++ Standard never says what the type of main actually is, and prevents you from observing it. So implementations can (and do!) rewrite the signature, for example adding any of int argc, char** argv, char** envp that you have omitted. Clearly it couldn't know to do this for your main1 and main2.
This would be useful if you want to define either of 2 entry-points depending on some macro
No, not really. You should do this:
int main1();
int main2();
#ifdef _0_ENTRY
int main() { return main1(); }
#else
int main() { return main2(); }
#endif
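A variant of the same idea, sketched under assumptions: a reference to a function is fine for any name other than main, so the selection could also be done with an ordinary reference that a real main then calls. The names entry and check_entry and the bodies of main1 and main2 are hypothetical. main itself is left out of the sketch and would simply be int main() { return entry(); }.

```cpp
#include <cassert>

int main1() { return 1; } // placeholder bodies, assumed for illustration
int main2() { return 2; }

// Select the entry point at compile time. `entry` is an ordinary
// reference to a function, which is allowed for any name except main.
#ifdef _0_ENTRY
int (&entry)() = main1;
#else
int (&entry)() = main2;
#endif

void check_entry() {
#ifdef _0_ENTRY
    assert(entry() == 1);
#else
    assert(entry() == 2);
#endif
}
```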
This is soon going to become clearly ill-formed, thanks to the resolution of CWG issue 1886, currently in "tentatively ready" status, which adds, among other things, the following to [basic.start.main]:
A program that declares a variable main at global scope or that declares the name main with C language linkage (in any namespace) is ill-formed.
What will happen in practice is highly dependent on the implementation.
In your case your compiler apparently implements that reference as a "pointer in disguise". In addition to that, the pointer has external linkage. I.e. your program exports an external symbol called main, which is actually associated with memory location in data segment occupied by a pointer. The linker, without looking too much into it, records that memory location as the program's entry point.
Later, trying to use that location as an entry point causes segmentation fault. Firstly, there's no meaningful code at that location. Secondly, a mere attempt to pass control to a location inside a data segment might trigger the "data execution protection" mechanisms of your platform.
By doing this you apparently hoped that the reference will get optimized out, i.e. that main will become just another name for main1. In your case it didn't happen. The reference survived as an independent external object.
Looks like you already answered the question about what happens.
As for why, in your code main is a reference/pointer to a function. That is different from a function. And I would expect a segmentation fault if the code is calling a pointer instead of a function.
There are a few legal ways in which we can declare a function in C++.
Some of the legal ways are:
void function ();
void function (void);
dataType function (dataType);
and so on...
Recently, I came across a function declaration as such:
void (function) (); // Take note of the parentheses around the function name
I have never seen something like this before. When I tested it in a C++ compiler, it compiled without any warnings or errors.
My question is: Why is void (function) (); a legal way to declare a function prototype? Is there any special meaning to declaring a function in this way? Or does it just work like any other function declaration?
One difference is that enclosing the name in parentheses prevents the expansion of function-like macros by the preprocessor. As mentioned in the other answers, though, it makes no difference to the actual compiler.
For instance:
// somewhere buried deep in a header
#define function(a, b) a + b
// your code
void function() { // this expands the macro and gives compilation error
}
void (function)() { // this does not expand and works as expected
}
This comes in handy for instance when the bright minds behind the Microsoft Visual Studio library decided to provide function-like macros for things like min and max. (There are other ways like #undef to go around this).
Note that object-like macros (e.g. #define function 3 + 4) are still expanded.
The preprocessor is just a dumb text replacement tool (as opposed to the compiler, which is just a (smart) text replacement tool). It takes the macro definition and replaces it everywhere. It is not aware of the semantics of what it replaces.
For instance:
// somewhere buried deep in a header
#define function 3 + 2
// your code
void function() {
}
The preprocessor sees the word function and textually replaces it with the string 3 + 2. It is unaware that function is the identifier in a function declaration and definition. After the preprocessing phase come the actual compile phases. So the compiler actually sees:
// your code
void 3 + 2() {
}
which does not make any sense to it and gives an error.
For function-like macros
// somewhere buried deep in a header
#define function(a, b) a + b
The preprocessor does the same, except that it expects two ‘tokens’ enclosed in parentheses and separated by a comma (the arguments), and does the replacement (again, without any awareness of semantics):
int d = function(2, 3);
//will be replaced by the preprocessor to:
int d = 2 + 3; // passes compilation phase
void function();
// the preprocessor doesn’t find the arguments for function so it gives an error.
However, if it encounters (function), it will not try to expand it (it ignores it). This is simply the rule: a function-like macro name is expanded only when the next token is an opening parenthesis.
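The effect is easy to observe in a sketch where a macro and a function share a name. The names twice and check_twice and both definitions are made up for this example. The function deliberately returns a different value from the macro so the two can be told apart.

```cpp
#include <cassert>

// A function-like macro, standing in for one buried deep in a header.
#define twice(x) ((x) + (x))

// The parentheses keep the preprocessor from treating this as a macro call,
// so a real function with the same name is defined here.
int (twice)(int x) {
    return 2 * x + 1; // deliberately different from the macro
}

void check_twice() {
    assert(twice(3) == 6);   // name followed by '(' : the macro is expanded
    assert((twice)(3) == 7); // name followed by ')' : the function is called
}
```

The same trick is commonly written as (std::max)(a, b) when a header defines max as a function-like macro.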
it's the same as
void function();
you can declare it as
void ((function)) ();
if you want :)
be careful not to mix this up with the function pointer declaration syntax.
There is nothing special about it, it means exactly the same as the version without parentheses. It is just an artifact of how the syntax is declared. Usually you see the use of parentheses around the function name when a function pointer is declared, e.g.
void (*function_pointer)() = nullptr;
// a pointer to a function taking no arguments and returning void
in contrast to
void *function();
// a declaration of a function taking no arguments and returning void*
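The difference between the two declarations can be checked at compile time with type traits. This is only a sketch; the name function_returning_pointer is illustrative, and the function is declared but never defined because it is only used inside decltype.

```cpp
#include <type_traits>

void (*function_pointer)() = nullptr; // an object: pointer to function
void *function_returning_pointer();   // a function returning void*

static_assert(std::is_pointer<decltype(function_pointer)>::value,
              "parentheses around *name make it a pointer to function");
static_assert(std::is_function<decltype(function_returning_pointer)>::value,
              "without them, * binds to the return type");
static_assert(std::is_same<decltype(function_returning_pointer()), void *>::value,
              "the call expression has type void*");
```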
I think it works the same as a normal function because function pointers are declared like: void (*function)() so if you leave out the * then it should be just a function.
This follows from the C++ grammar. Simplified, one of the rules that define a declarator looks like this:
declarator:
(declarator)
So you can write for example
void (function) ();
or
void ( (function) () );
or even the following way
struct A
{
void ( ( function )() const );
};
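That the extra parentheses change nothing can be checked directly: the declarations below all redeclare one and the same function. The names answer and check_answer and the return value are hypothetical.

```cpp
#include <cassert>
#include <type_traits>

int answer();     // ordinary declaration
int (answer)();   // same function, redundant parentheses
int ((answer))(); // still the same function

int (answer)() {  // the single definition
    return 42;
}

static_assert(std::is_same<decltype(answer), int()>::value,
              "the parentheses are not part of the type");

void check_answer() {
    assert(answer() == 42);
    assert((answer)() == 42); // parentheses in the call are redundant too
}
```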
I think you may find that was:
void (*function) ();
since there is no benefit to using void (function)(); or void (((((function)))))(); for that matter, they're equivalent. If I'm mistaken and it's not a typo, the answer is that you can put as many parentheses around the function name as you like, subject to compiler limitations, as per the code for output6() below.
If I'm not mistaken, that one with the * actually declares a function pointer which can be used to hold a pointer to a function. It does not declare a function at all, just a pointer that can be used to reference a function.
Like an int pointer (for example), the function pointer can point to an arbitrary function, provided its parameters and return type match.
So for example:
#include <iostream>
void (((((output6)))))() { std::cout << 6; }
void output7() { std::cout << 7; }
void output8() { std::cout << 8; }
void (*fn)();
int main() {
fn = &output6; fn();
fn = &output7; fn();
fn = &output8; fn();
std::cout << '\n';
}
would output 678.
Here goes newbie question number 5, but I don't have a teacher.. so.. anyhow here we go:
I'm wondering if it is necessary to have function prototypes at the top of the file, instead of putting the main function at the end of the file and defining all the functions at the top of the file. As far as I can tell, neither VC++ nor G++ complains. Is there a standard that disallows this?
It seems rather annoying to have to change the prototype whenever you change a function's parameters or return type.
Example:
#include <iostream>
void say_hi(){
std::cout << "hi" << std::endl;
}
int main(){
say_hi();
return 0;
}
This declares but does not define the function say_hi:
void say_hi();
This both declares and defines the function say_hi:
void say_hi(){
std::cout << "hi" << std::endl;
}
You can declare a function many times; you can only define it once.
A function must be declared in the file before you can call it. A function must be defined somewhere--in the same file before or after you call it or maybe even in a different file.
So, yes, this is perfectly fine.
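The rule can be sketched in one file. The function greeting is a hypothetical stand-in for say_hi that returns its text instead of printing it; greet_twice and check_greeting are likewise made up for the example. The repeated declarations are redundant but legal.

```cpp
#include <cassert>
#include <string>

std::string greeting(); // first declaration
std::string greeting(); // repeating a declaration is allowed

std::string greet_twice() {
    return greeting() + greeting(); // callable: a declaration is visible
}

std::string greeting() { // the one and only definition
    return "hi";
}

// A second definition of greeting() here would be rejected by the compiler.

void check_greeting() {
    assert(greet_twice() == "hihi");
}
```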
You are correct; if you define all your functions above where they are called, you don't need function prototypes. The actual function definition serves the same purpose as a separate declaration.
This works when you have tiny functions. It works less well when they get long. Or when you have more than one file of code. As a matter of style, many teachers demand that even tiny applications be written with the structure that serves large applications well.