How does the main() method work in C?

I know there are two different signatures to write the main method -
int main()
{
//Code
}
or, for handling command-line arguments, we write it as:
int main(int argc, char * argv[])
{
//code
}
In C++ I know we can overload a method, but in C how does the compiler handle these two different signatures of the main function?

Some of the features of the C language started out as hacks which just happened to work.
Multiple signatures for main, as well as variable-length argument lists, are among those features.
Programmers noticed that they can pass extra arguments to a function, and nothing bad happens with their given compiler.
This is the case if the calling conventions are such that:
The calling function cleans up the arguments.
The leftmost arguments are closer to the top of the stack, or to the base of the stack frame, so that spurious arguments do not invalidate the addressing.
One set of calling conventions which obeys these rules is stack-based parameter passing whereby the caller pops the arguments, and they are pushed right to left:
;; pseudo-assembly-language
;; main(argc, argv, envp); call
push envp ;; rightmost argument
push argv ;;
push argc ;; leftmost argument ends up on top of stack
call main
pop ;; caller cleans up
pop
pop
In compilers where this type of calling convention is the case, nothing special needs to be done to support the two kinds of main, or even additional kinds. main can be a function of no arguments, in which case it is oblivious to the items that were pushed onto the stack. If it's a function of two arguments, then it finds argc and argv as the two topmost stack items. If it's a platform-specific three-argument variant with an environment pointer (a common extension), that will work too: it will find that third argument as the third element from the top of the stack.
And so a fixed call works for all cases, allowing a single, fixed start-up module to be linked to the program. That module could be written in C, as a function resembling this:
/* I'm adding envp to show that even a popular platform-specific variant
can be handled. */
extern int main(int argc, char **argv, char **envp);
void __start(void)
{
/* This is the real startup function for the executable.
It performs a bunch of library initialization. */
/* ... */
/* And then: */
exit(main(argc_from_somewhere, argv_from_somewhere, envp_from_somewhere));
}
In other words, this start module just calls a three-argument main, always. If main takes only int, char **, or no arguments at all, it happens to work fine, due to the calling conventions.
If you were to do this kind of thing in your program, it would be nonportable and considered undefined behavior by ISO C: declaring and calling a function in one manner, and defining it in another. But a compiler's startup trick does not have to be portable; it is not guided by the rules for portable programs.
But suppose that the calling conventions are such that it cannot work this way. In that case, the compiler has to treat main specially. When it notices that it's compiling the main function, it can generate code which is compatible with, say, a three-argument call.
That is to say, you write this:
int main(void)
{
/* ... */
}
But when the compiler sees it, it essentially performs a code transformation so that the function which it compiles looks more like this:
int main(int __argc_ignore, char **__argv_ignore, char **__envp_ignore)
{
/* ... */
}
except that names such as __argc_ignore don't literally exist. No such names are introduced into your scope, and there won't be any warning about unused arguments.
The code transformation causes the compiler to emit code with the correct linkage which knows that it has to clean up three arguments.
Another implementation strategy is for the compiler or perhaps linker to custom-generate the __start function (or whatever it is called), or at least select one from several pre-compiled alternatives. Information could be stored in the object file about which of the supported forms of main is being used. The linker can look at this info, and select the correct version of the start-up module which contains a call to main which is compatible with the program's definition. C implementations usually have only a small number of supported forms of main so this approach is feasible.
Compilers for the C99 language always have to treat main specially, to some extent, to support the hack that if the function terminates without a return statement, the behavior is as if return 0 were executed. This, again, can be treated by a code transformation. The compiler notices that a function called main is being compiled. Then it checks whether the end of the body is potentially reachable. If so, it inserts a return 0;

There is NO overloading of main even in C++. The main function is the entry point for a program and only a single definition should exist.
For Standard C
For a hosted environment (that's the normal one), the C99 standard says:
5.1.2.2.1 Program startup
The function called at program startup is named main. The implementation declares no prototype for this function. It shall be defined with a return type of int and with no parameters:
int main(void) { /* ... */ }
or with two parameters (referred to here as argc and argv, though any names may be used, as they are local to the function in which they are declared):
int main(int argc, char *argv[]) { /* ... */ }
or equivalent;[9] or in some other implementation-defined manner.
[9] Thus, int can be replaced by a typedef name defined as int, or the type of argv can be written as char **argv, and so on.
For standard C++:
3.6.1 Main function [basic.start.main]
1 A program shall contain a global function called main, which is the designated start of the program. [...]
2 An implementation shall not predefine the main function. This function shall not be overloaded. It shall have a return type of type int, but otherwise its type is implementation defined. All implementations shall allow both of the following definitions of main:
int main() { /* ... */ }
and
int main(int argc, char* argv[]) { /* ... */ }
The C++ standard explicitly says "It [the main function] shall have a return type of type int, but otherwise its type is implementation defined", and requires the same two signatures as the C standard.
In a hosted environment (a C environment which also supports the C libraries), the operating system, through the implementation's startup code, calls main.
In a non-hosted environment (one intended for embedded applications), you can often change the entry point (or exit) of your program. Some compilers provide compiler-specific directives such as
#pragma startup [priority]
#pragma exit [priority]
Where priority is an optional integral number.
Pragma startup executes the function before the main (priority-wise) and pragma exit executes the function after the main function. If there is more than one startup directive then priority decides which will execute first.

There is no need for overloading. Yes, there are 2 versions, but only one can be used at a time.

This is one of the strange asymmetries and special rules of the C and C++ languages.
In my opinion it exists only for historical reasons and there's no real serious logic behind it. Note that main is special also for other reasons (for example main in C++ cannot be recursive and you cannot take its address and in C99/C++ you are allowed to omit a final return statement).
Note also that even in C++ it's not an overload... either a program has the first form or it has the second form; it cannot have both.

What's unusual about main isn't that it can be defined in more than one way, it's that it can only be defined in one of two different ways.
main is a user-defined function; the implementation doesn't declare a prototype for it.
The same thing is true for foo or bar, but you can define functions with those names any way you like.
The difference is that main is invoked by the implementation (the runtime environment), not just by your own code. The implementation isn't limited to ordinary C function call semantics, so it can (and must) deal with a few variations -- but it's not required to handle infinitely many possibilities. The int main(int argc, char *argv[]) form allows for command-line arguments, and int main(void) in C or int main() in C++ is just a convenience for simple programs that don't need to process command-line arguments.
As for how the compiler handles this, it depends on the implementation. Most systems probably have calling conventions that make the two forms effectively compatible, and any arguments passed to a main defined with no parameters are quietly ignored. If not, it wouldn't be difficult for a compiler or linker to treat main specially. If you're curious how it works on your system, you might look at some assembly listings.
And like many things in C and C++, the details are largely a result of history and arbitrary decisions made by the designers of the languages and their predecessors.
Note that both C and C++ permit other implementation-defined definitions for main -- but there's rarely any good reason to use them. And for freestanding implementations (such as embedded systems with no OS), the program entry point is implementation-defined, and isn't necessarily even called main.

The main is just a name for a starting address decided by the linker where main is the default name. All function names in a program are starting addresses where the function starts.
The function arguments are pushed/popped on/from the stack so if there are no arguments specified for the function there are no arguments pushed/popped on/off the stack. That is how main can work both with or without arguments.

Well, the two different signatures of main() only come into the picture when you need them. If your program needs data before it starts its actual processing, you can pass that data in by using:
int main(int argc, char * argv[])
{
//code
}
where argc stores the number of arguments passed and argv is an array of pointers to char, each pointing to one of the values passed from the console.
Otherwise it's always good to go with
int main()
{
//Code
}
However, in any case there can be one and only one main() in a program, because that is the only point from which a program starts its execution, and hence there cannot be more than one.
(hope its worthy)
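To make the roles of argc and argv concrete, here is a minimal sketch. The helper name describe_args and its buffer-based interface are assumptions made for illustration; they are not part of any standard API. It walks the argument array exactly as main would receive it and writes one line per argument.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: writes one "argv[i]=value" line per argument into buf.
   Returns the number of arguments written, or -1 if buf is too small. */
int describe_args(int argc, char *argv[], char *buf, size_t bufsize)
{
    size_t used = 0;

    if (bufsize == 0)
        return -1;
    buf[0] = '\0';

    for (int i = 0; i < argc; i++) {
        /* snprintf reports how many characters the full line needs */
        int n = snprintf(buf + used, bufsize - used, "argv[%d]=%s\n", i, argv[i]);
        if (n < 0 || (size_t)n >= bufsize - used)
            return -1;              /* line did not fit */
        used += (size_t)n;
    }
    return argc;
}
```

A main(int argc, char *argv[]) could pass its own parameters straight through to this function and print the buffer; argv[0] is conventionally the program name.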

A similar question was asked before: Why does a function with no parameters (compared to the actual function definition) compile?
One of the top-ranked answers was:
In C func() means that you can pass any number of arguments. If you want no arguments then you have to declare as func(void)
So, I guess it's how main is declared (if you can apply the term "declared" to main). In fact you can write something like this:
int main(int only_one_argument) {
// code
}
and it will still compile and run on typical implementations, although the standard does not require this form to be supported.

You do not need to overload it, because only one will be used at a time. Yes, there are 2 different versions of the main function.

Related

What is the reason C++ doesn't add `std::vector<std::string>` "overload" as argument to `main()`?

Is there any fundamental reason why the new C++17 (or later) won't allow for an alternative way of writing main as
int main(std::vector<std::string> args){...}
? I know that one needs compatibility with previous code, so
int main(int, char**)
still has to exist, but is there anything technical that prevents the first "alternative" declaration?

Here's how to do that yourself, trivially, in a very few lines of code:
#include <string>
#include <vector>
auto my_main( std::vector<std::string> const& args ) -> int;
auto main( int n_args, char** args )
-> int
{ return my_main( std::vector<std::string>( args, args + n_args ) ); }
Modulo notation I believe this approach is presented in the Accelerated C++ book, i.e. it's well known.
One doesn't need to add this to the standard: those who find it useful can just copy and paste the code.
Others may not find it useful: it doesn't work so well in Windows, because by common convention the main arguments are not Unicode in Windows, and Microsoft's setlocale explicitly does not support UTF-8 locales.

This might be somewhat non-trivial to implement, at least in one respect.
This basically requires kind of a reverse-lookup form of function overloading. That is, the startup code normally looks roughly like this:
extern int main(int argc, char *argv[], char *envp[]);
void entry() {
// OS-specific stuff to retrieve/parse command line, env, etc.
static_constructors();
main(argc, argv, envp);
execute(onexit_list);
static_destructors();
}
With your scheme, we'd need two separate pieces of startup code: one that calls main passing argc/argv, the other passing a std::vector<std::string>.
I should add that while this means the job isn't entirely trivial, it's still far from an insurmountable problem. Just for one example, Microsoft's linker already links different startup code depending on whether you've defined main or WinMain (or wmain or wWinMain). As such, it's obviously possible to detect the (mangled) name of the entry point the user has provided, and link to an appropriate set of startup code accordingly.

How to start the execution of a program in C/C++ from a different function, but not main() [duplicate]

Possible Duplicate:
Does the program execution always start from main in C?
I want to start the execution of my program, which contains 2 functions (excluding main):
void check(void);
void execute(void);
I want to start execution from check(). Is it possible in C/C++?

You can do this with a simple wrapper:
int main()
{
check();
}
You can't portably do it in any other way since the standard explicitly specifies main as the program entry point.
EDIT for comment: Don't ever do this. In C++ you could abuse static initialization to have check called before main during static init, but you still can't call main legally from check. You can just have check run first. As noted in a comment this doesn't work in C because it requires constant initializers.
// At file scope.
bool abuse_the_language = (check(), true);
int main()
{
// No op if desired.
}

Various linkers have various options to specify the entry point. E.g., the Microsoft linker uses /ENTRY:function:
The /ENTRY option specifies an entry point function as the starting address for an .exe file or DLL.
GNU ld uses the -e option or the ENTRY() command in a linker script.
Needless to say, modifying the entry point is a very advanced feature, and you must understand exactly how it works before using it. For one, it may skip the initialization of the standard libraries.

int main()
{
check();
return 0;
}

Calling check from main seems like the most logical solution, but you could still explore using /ENTRY to define another entry point for your application. See here for more info.

You cannot start in something other than main, although there are ways to have some code execute before main.
Putting code in a static initialization block will have the code run prior to main; however, it won't be 100% controllable. While you can be assured it runs prior to main, you cannot specify the order in which two static initialization blocks in different translation units run relative to each other, only that both execute before main.
Linkers and loaders both have the concept of main held as a shared "understood" start of a C / C++ program; however, there is code that runs prior to main. This code is responsible for "setting up the environment" of the program (things like setting up stdin or cin). By putting code in a static initialization block, you effectively say, "hey you need to do this too to have the right environment". Generally, this should be something small, that can stand independently in execution order of other items.
If you need two or three things to execute in order before main, then make them into proper functions and call them at the beginning of main.

There is a contrived way to achieve that, but it is nothing more than a hack.
The idea is to create a static library containing the main function, and make it call your "check" function.
The linker will resolve the symbol when linking against your "program", and your "program" code will indeed not have a main by itself.
This is NOT recommended, unless you have very specific needs (an example that pops to mind is Windows Screensavers, as the helper library that comes with the Windows SDK has a main function that performs specific initialization like parsing the command line).

It may be supported by the compiler. For example, with gcc you can use -nostartfiles and --entry=xxx to set the entry point of the program. The default entry point is _start, which will call the function main.

You can "intercept" the call to main by creating an object before the main starts. The constructor needs to execute your function.
#include <iostream>
void foo()
{
// do stuff
std::cout<<"exiting from foo" <<std::endl;
}
struct A
{
A(){ foo(); };
};
static A a;
int main()
{
// something
std::cout<<"starting main()" <<std::endl;
}

I have found a solution to my own question.
We can simply use the following compiler-specific pragmas:
#pragma startup function-name <priority>
#pragma exit function-name <priority>
These two pragmas allow the program to specify function(s) that should be called either upon program startup (before the main function is called), or program exit (just before the program terminates through _exit).
The specified function-name must be a previously declared function taking no arguments and returning void; in other words, it should be declared as:
void func(void);
The optional priority parameter should be an integer in the range 64 to 255. The highest priority is 0, so lower numbers mean higher priority. Functions with higher priorities are called first at startup and last at exit. If you don't specify a priority, it defaults to 100.
thanks!

Process argc and argv outside of main()

If I want to keep the bulk of my code for processing command line arguments out of main (for organization and more readable code), what would be the best way to do it?
void main(int argc, char* argv[]){
//lots of code here I would like to move elsewhere
}

Either pass them as parameters, or store them in global variables. As long as you don't return from main and try to process them in an atexit handler or the destructor of an object at global scope, they still exist and will be fine to access from any scope.
For example:
// Passing them as args:
void process_command_line(int argc, char **argv)
{
// Use argc and argv
...
}
int main(int argc, char **argv)
{
process_command_line(argc, argv);
...
}
Alternatively:
// Global variables
int g_argc;
char **g_argv;
void process_command_line()
{
// Use g_argc and g_argv
...
}
int main(int argc, char **argv)
{
g_argc = argc;
g_argv = argv;
process_command_line();
...
}
Passing them as parameters is a better design, since it's encapsulated and lets you modify/substitute parameters if you want or easily convert your program into a library. Global variables are easier, since if you have many different functions which access the args for whatever reason, you can just store them once and don't need to keep passing them around between all of the different functions.
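A middle ground between the two approaches above is to parse once into a struct and pass that struct around. The following is only a sketch under stated assumptions: the struct options type, the -v and -o flags, and the return convention are all invented for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical options; the flag names are assumptions for this sketch. */
struct options {
    bool verbose;         /* set by -v */
    const char *output;   /* set by -o FILE, NULL if absent */
    int first_operand;    /* index of the first non-option argument */
};

/* Parses argv into *opts. Returns 0 on success, -1 on a malformed command line. */
int process_command_line(int argc, char **argv, struct options *opts)
{
    int i = 1;                         /* argv[0] is the program name */

    opts->verbose = false;
    opts->output = NULL;

    for (; i < argc; i++) {
        if (strcmp(argv[i], "-v") == 0) {
            opts->verbose = true;
        } else if (strcmp(argv[i], "-o") == 0) {
            if (i + 1 >= argc)
                return -1;             /* -o needs a value */
            opts->output = argv[++i];
        } else {
            break;                     /* first operand reached */
        }
    }
    opts->first_operand = i;
    return 0;
}
```

With this shape, main shrinks to declaring a struct options, calling process_command_line(argc, argv, &opts), and handing opts to the rest of the program, so no globals are needed.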

One should keep to standards wherever practical. Thus, don't write
void main
which has never been valid C or C++, but instead write
int main
With that, your code can compile with e.g. g++ (with usual compiler options).
Given the void main I suspect a Windows environment. And anyway, in order to support use of your program in a Windows environment, you should not use the main arguments in Windows. They work in *nix because they were designed in and for that environment; they don't in general work in Windows, because by default (by very strong convention) they're encoded as Windows ANSI, which means they cannot encode filenames with characters outside the user's current locale.
So for Windows you had better use the GetCommandLine API function and its sister parsing function. For portability this is best encapsulated in some command line arguments module. Then you need to deal with the interesting problem of using wchar_t in Windows and char in *nix…
Anyway, I'm not sure of corresponding *nix API, or even if there is one, but google it. In the worst case, for *nix you can always initialize a command line arguments module from main. The ugliness for *nix stems directly from the need to support portability with C++'s most non-portable, OS-specific construct, namely standard main.

Simply pass argc and argv as arguments of the function in which you want to process them.
void parse_arg(int argc, char *argv[]);

Linux (glibc) provides program_invocation_name and program_invocation_short_name.

Check out the "getopt_long" family of functions and libraries. These offer a structured way of defining the arguments your program expects and can then parse them readily for you. They can also help with the generation of documentation / help responses.
It's an old library in the UNIX world, and there is a .Net implementation in C# too. (+ Perl, Ruby, & probably more. Nice to have a single paradigm usable across all of these! Learn once, use everywhere!)
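As a rough illustration of that style in C, here is a sketch that uses getopt_long from <getopt.h>. That function is a GNU extension (also available on the BSDs), not standard C, so its availability is an assumption about the platform. The struct settings type and the --verbose and --output options are hypothetical names chosen for the example.

```c
#include <getopt.h>
#include <stddef.h>

/* Hypothetical settings filled in from the command line. */
struct settings {
    int verbose;          /* --verbose or -v */
    const char *output;   /* --output FILE or -o FILE */
};

/* Returns the index of the first non-option argument, or -1 on a bad option. */
int parse_with_getopt_long(int argc, char **argv, struct settings *s)
{
    static const struct option long_opts[] = {
        { "verbose", no_argument,       NULL, 'v' },
        { "output",  required_argument, NULL, 'o' },
        { NULL, 0, NULL, 0 }           /* terminator required by getopt_long */
    };
    int c;

    s->verbose = 0;
    s->output = NULL;
    optind = 1;   /* restart scanning; matters if called more than once */
    opterr = 0;   /* suppress getopt's own error messages */

    while ((c = getopt_long(argc, argv, "vo:", long_opts, NULL)) != -1) {
        switch (c) {
        case 'v': s->verbose = 1;      break;
        case 'o': s->output = optarg;  break;
        default:  return -1;           /* unknown option or missing value */
        }
    }
    return optind;
}
```

Keeping the call in its own function, rather than in main, matches the goal of the question: main only forwards argc and argv.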

In C++ the main function is the entry point to the program. How can I change it to another function?

I was asked an interview question to change the entry point of a C or C++ program from main() to any other function. How is it possible?

In standard C (and, I believe, C++ as well), you can't, at least not for a hosted environment (but see below). The standard specifies that the starting point for the C code is main. The standard (C99) doesn't leave much scope for argument:
5.1.2.2.1 Program startup: (1) The function called at program startup is named main.
That's it. It then waffles on a bit about parameters and return values but there's really no leeway there for changing the name.
That's for a hosted environment. The standard also allows for a freestanding environment (i.e., no OS, for things like embedded systems). For a freestanding environment:
In a freestanding environment (in which C program execution may take place without any benefit of an operating system), the name and type of the function called at program startup are implementation-defined. Any library facilities available to a freestanding program, other than the minimal set required by clause 4, are implementation-defined.
You can use "trickery" in C implementations so that you can make it look like main isn't the entry point. This is in fact what early Windows compilers did to mark WinMain as the start point.
First way: a linker may include some pre-main startup code in a file like start.o and it is this piece of code which runs to set up the C environment then call main. There's nothing to stop you replacing that with something that calls bob instead.
Second way: some linkers provide that very option with a command-line switch so that you can change it without recompiling the startup code.
Third way: you can link with this piece of code:
int main (int c, char *v[]) { return bob (c, v); }
and then your entry point for your code is seemingly bob rather than main.
However, all this, while of possibly academic interest, doesn't change the fact that I can't think of one single solitary situation in my many decades of cutting code, where this would be either necessary or desirable.
I would be asking the interviewer: why would you want to do this?

The entry point is actually the _start function (implemented in crt1.o).
The _start function prepares the command line arguments and then calls main(int argc, char* argv[], char* env[]).
You can change the entry point from _start to mystart by setting a linker parameter:
g++ file.o -Wl,-emystart -o runme
Of course, this is a replacement for the entry point _start so you won't get the command line arguments:
void mystart(){
}
Note that global/static variables that have constructors or destructors must be initialized at the beginning of the application and destroyed at the end. Keep that in mind if you are planning on bypassing the default entry point which does it automatically.

From C++ standard docs 3.6.1 Main Function,
A program shall contain a global function called main, which is the designated start of the program. It is implementation-defined whether a program in a freestanding environment is required to define a main function.
So, it does depend on your compiler/linker...

If you are on VS2010, this could give you some idea
As is easy to understand, this is not mandated by the C++ standard and falls in the domain of 'implementation-specific behavior'.

This is highly speculative, but you might have a static initializer instead of main:
#include <cstdlib>
#include <iostream>
int mymain()
{
std::cout << "mymain";
exit(0);
}
static int sRetVal = mymain();
int main()
{
std::cout << "never get here";
}
You might even make it 'Java-like', by putting the stuff in a constructor:
#include <cstdlib>
#include <iostream>
class MyApplication
{
public:
MyApplication()
{
std::cout << "mymain";
exit(0);
}
};
static MyApplication sMyApplication;
int main()
{
std::cout << "never get here";
}
Now. The interviewer might have thought about these, but I'd personally never use them. The reasons are:
It's non-conventional. People won't understand it, it's nontrivial to find the entry point.
Static initialization order is nondeterministic. Put in another static variable and you'll never know if it gets initialized.
That said, I've seen it being used in production instead of init() for library initializers. The caveat is, on Windows, (from experience) your statics in a DLL might or might not get initialized based on usage.

Modify the crt object that actually calls the main() function, or provide your own (don't forget to disable linking of the normal one).

With gcc, declare the function with __attribute__((constructor)) and gcc will execute this function before main is called.
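A minimal sketch of that attribute follows. It assumes GCC or a compatible compiler such as Clang, since __attribute__((constructor)) is an extension and not standard C. The flag check_ran and the accessor check_has_run are hypothetical names, added only so the effect can be observed.

```c
/* Hypothetical flag, only here to make the effect observable. */
static int check_ran = 0;

/* GCC/Clang extension: a function marked as a constructor runs
   during program startup, before main is entered. */
__attribute__((constructor))
static void check(void)
{
    check_ran = 1;
}

/* Reports whether check() has already run. */
int check_has_run(void)
{
    return check_ran;
}
```

If main calls check_has_run() as its first statement, it should already see 1. The entry point is still main; the constructor simply runs earlier.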

For Solaris-based systems I have found this. I guess you can use the .init section on every platform:
#pragma init (function [, function]...)
Source:
This pragma causes each listed function to be called during initialization (before main) or during shared module loading, by adding a call to the .init section.

It's very simple:
As you should know, when you use #define constants in C, the preprocessor performs a kind of macro substitution, replacing the name of the constant with its value.
Just put a #define at the beginning of your code that maps the name of your start-up function to main. The function the compiler sees is still named main; only the spelling in your source changes.
Example:
#define my_startup_function main

I think it is easy to remove the undesired main() symbol from the object before linking.
Unfortunately the entry point option for g++ is not working for me (the binary crashes before entering the entry point), so I strip the undesired entry point from the object file.
Suppose we have two sources that contain an entry point function.
target.c contains the main() we do not want.
our_code.c contains the testmain() we want to be the entry point.
After compiling (g++ -c option) we get the following object files.
target.o, which contains the main() we do not want.
our_code.o, which contains the testmain() we want to be the entry point.
So we can use objcopy to strip the undesired main() function.
objcopy --strip-symbol=main target.o
We can redefine testmain() to main() using objcopy too.
objcopy --redefine-sym testmain=main our_code.o
And then we can link both of them into binary.
g++ target.o our_code.o -o our_binary.bin
This works for me. Now when we run our_binary.bin, the entry point is the our_code.o:main() symbol, which refers to the our_code.c::testmain() function.

On Windows there is another (rather unorthodox) way to change the entry point of a program: TLS. See this for more explanation: http://isc.sans.edu/diary.html?storyid=6655

Yes.
We can change the main function's name to any other name, e.g. Start, bob, rem, etc.
How does the compiler know that it has to search for main() in the entire code?
Nothing is automatic in programming.
Somebody has done some work to make it look automatic for us.
It has been defined in the startup file that main() is the function to look for.
We can change the name main there to anything else, e.g. Bob, and then Bob() will be looked for instead.

Changing a value in Linker Settings will override the entry point. E.g., MFC applications use a value of 'Windows (/SUBSYSTEM:WINDOWS)' to change the entry point from main() to CWinApp::WinMain().
Right clicking on solution > Properties > Linker > System > Subsystem > Windows (/SUBSYSTEM:WINDOWS)
...
Very practical benefit to modifying entry point:
MFC is a framework we take advantage of to write Windows applications in C++. I know it's ancient, but my company maintains one for legacy reasons! You will not find a main() in MFC code. MSDN says the entry point is WinMain(), instead. Thus, you can override the WinMain() of your base CWinApp object. Or, most people override CWinApp::InitInstance() because the base WinMain() will call it.
Disclaimer: I use empty parentheses to denote a method, without caring how many arguments.

Two 'main' functions in C/C++

Can I write a program in C or in C++ with two main functions?

No. All programs have a single main(); that's how the compiler and linker generate an executable that starts somewhere sensible.
You basically have two options:
Have the main() interpret some command line arguments to decide what actual main to call. The drawback is that you are going to have an executable with both programs.
Create a library out of the shared code and compile each main file against that library. You'll end up with two executables.

You can have two functions called main. The name is not special in any way and it's not reserved. What's special is the function, and it happens to have that name. The function is global. So if you write a main function in some other namespace, you will have a second main function.
namespace kuppusamy {
int main() { return 0; }
}
int main() { kuppusamy::main(); }
The first main function is not special - notice how you have to return explicitly.

Yes; however, it's platform specific instead of standard C, and if you ask about what you really want to achieve (instead of this attempted solution to that problem), then you'll likely receive answers which are more helpful for you.

No, a program can have just 1 entry point (which is main()). In fact, more generally, you can only have one function of a given name in C.

If one is static and resides in a different source file I don't see any problem.

No, main() defines the entry point to your program and you must have only one main() function (entry point) in your program.
Frankly speaking your question doesn't make much sense to me.

What do you mean by "main function"? If you mean the first function to execute when the program starts, then you can have only one. (You can only have one first!)
If you want to have your application do different things on start up, you can write a main function which reads the command line (for example) and then decides which other function to call.
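One way to sketch that idea is a small table that maps a command-line word to the function to run. Everything named here (the check and execute subcommands, the entry_fn type, the return values 10 and 20) is hypothetical and chosen only for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Each alternative "main" has the usual (argc, argv) shape. */
typedef int (*entry_fn)(int argc, char **argv);

/* Hypothetical alternatives; the return values just identify which one ran. */
static int run_check(int argc, char **argv)   { (void)argc; (void)argv; return 10; }
static int run_execute(int argc, char **argv) { (void)argc; (void)argv; return 20; }

static const struct {
    const char *name;
    entry_fn fn;
} commands[] = {
    { "check",   run_check },
    { "execute", run_execute },
};

/* Looks at argv[1] and calls the matching function.
   Returns that function's result, or -1 if no subcommand matches. */
int dispatch(int argc, char **argv)
{
    if (argc < 2)
        return -1;
    for (size_t i = 0; i < sizeof commands / sizeof commands[0]; i++) {
        if (strcmp(argv[1], commands[i].name) == 0)
            return commands[i].fn(argc - 1, argv + 1);  /* drop the program name */
    }
    return -1;
}
```

The real main would then consist of little more than return dispatch(argc, argv);, and adding another start-up behavior means adding one row to the table.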

In some very special architecture, you can. This is the case of the Cell Processor where you have a main program for the main processor (64-bit PowerPC Processors Element called PPE) and one or many main program for the 8 different co-processor (32-bit Synergistic Processing Element called SPE).

No, you cannot have more than one main() function in the C language. In standard C, main() is a special function that is defined as the entry point of the program. In C there cannot be more than one externally visible definition of any function name; languages with overloading, such as C++, allow several functions with the same name only if their signatures differ. But in the case of main(), I think you have no choice ;)

No. main() is the entry point to your program; since you can't have two entry points, you can't have two main() functions.

You can write it, and it'll compile, but it won't link (unless your linker is non-conformant).

The idiom is to dispatch on the value of argv[0]. With hard links (POSIX) you don't even lose disk space.
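A minimal sketch of that idiom, assuming POSIX-style paths with '/' separators: strip the directory part of argv[0] and compare what is left against the names the binary is installed under. The names check and execute and the numeric codes are hypothetical.

```c
#include <string.h>

/* Returns the part of path after the last '/', or path itself if there is none. */
const char *base_name(const char *path)
{
    const char *slash = strrchr(path, '/');
    return slash ? slash + 1 : path;
}

/* Maps the invoked name to a hypothetical personality code:
   1 for "check", 2 for "execute", 0 for anything else. */
int select_personality(const char *argv0)
{
    const char *name = base_name(argv0);

    if (strcmp(name, "check") == 0)
        return 1;
    if (strcmp(name, "execute") == 0)
        return 2;
    return 0;
}
```

main would call select_personality(argv[0]) and branch on the result, so one executable hard-linked under two names behaves like two programs.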

Standard C doesn't allow nested functions, but GCC allows them as an extension.
#include <stdio.h>
void main()
{
void main()
{
printf("stackoverflow");
}
printf("hii");
}
The output of this code will be hii if you use the GCC compiler.
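There is a simple trick if you want to use 2 main() functions in your program such that both are successfully executed: you can use #define. Example:
#include <stdio.h>
void func1(void); /* forward declaration; the second main becomes func1 */
int main(void)
{
printf("In 1st main\n");
func1();
return 0;
}
#define main func1
void main(void)
{
printf("In 2nd main\n");
}
Here the output will be:
In 1st main
In 2nd main
NOTE: without the forward declaration, older compilers warn about conflicting types for func1, and since C99 removed implicit function declarations, newer compilers may reject the call.
And yes, don't change the position of the #define.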
