Question on entry point (beginner level) - C++

I have started learning C++ from a list of recommended books on Stack Overflow. One of them, "C++ Primer", got me interested. In that book the author calls the "main" function an entry point. According to Wikipedia (as far as I understood it), entry points are used to run a program. Does it give the OS permission to run my code? Is that why main is needed, so that the OS can recognize the code and has the authority to run it?

Does it give permission to OS to run my code?
Nope.
A program is a sequence of commands for the computer, commands such as std::cout << "Hello, world!\n";. The formal term for such a command (in C++) is statement.
Statements are generally executed from top to bottom, but which statement should be executed first? Can't be the first statement in the source code file, because there can be more than one file.
In C++ it was decided that the first statement to be executed is the first statement of main, followed by the rest of statements in it. Even if your program contains more than one source code file, there can't be more than one main.
Execution of statements in a specific order is called control flow, and since control flow enters your program at the beginning of main, it's called the entry point.
It will make more sense once you learn about functions.


The use of return; For a beginner, from a different perspective c++

I've just started programming and I've come to hear the standard beginner's definition of "the use of the return value in main" a lot, but it does not get to the point I am trying to understand. So, yes a return value 0 for 'int main' for example signifies that the programme running was successful and since main is of int datatype, 0 reflects this.
BUT what is the point of this? Won't the computer already know that the code was successful or not? Surely, we could write a flawed code and then return 0, and by that logic, we (the programmers) are saying this code is correct but the compiler actually executes the programme and if it's wrong/flawed it simply cannot operate on it.
Please use explanations that a beginner could understand.
The return code of your program isn't about crashing; it's about a functional kind of failure.
For example, the program grep defines exit code 0 as "match found" and 1 as "no match found", while 2 is used for errors such as invalid input.
Within scripts, this can be used for automated logic, without needing a user to interpret the results.
As you are a beginner, I would recommend always returning zero while you are focusing on learning the language. Looking into how applications can connect to each other via exit codes adds unneeded distraction and complexity.
A program can fail, because some expectations are not met.
For example, a program which counts the number of lines in files passed as arguments to main would fail when one of the arguments is not a valid file name, or if, for some reason, that file could not be opened. And if you code such a program, you'll need to explicitly add some program logic (that is, several or many source code lines) for that. A good programmer doesn't allow their program to crash (even with wrong or missing input or arguments).
A simple program which copies a source file into a destination requires two arguments. If main is not given two arguments, it should fail. If the first argument does not name a valid and accessible file, the program should also fail. If the copy could not be achieved because some disk is full, that program should also fail.
The return from main is, in practice, not some arbitrary integer. On Linux and many POSIX systems, it should be some integer between 0 and 255 (with 0 meaning "successful execution", and other exit values are for failures). See exit(3) & waitpid(2) for more.
By some convention (which you need to document) the failure codes (in practice there are few of them, usually less than a dozen and quite often 0 -named EXIT_SUCCESS- on success and 1 -named EXIT_FAILURE- on failure) could tell about the failure reason. See for examples the documentation of tar(1), coreutils programs, grep(1), etc.
BSD unixes define some conventions in sysexits (but Linux programs generally don't use that).
Shell scripts can easily test and handle the exit code.
Read also about the Unix philosophy. Successful command-line programs (e.g. cp(1)) could often be silent. Error messages would go (by convention) to stderr.
As you learn more about C programming, you'll understand that conventions matter a great deal (and that it is important to document them). Study also the source code of some existing free software programs, e.g. on GitHub.
Remember that you don't write code only for the computer, but also, and mostly, for the person (perhaps you in a few months, perhaps some future developer working at your company, once you are a professional developer) who will have to improve your code.
The return value from main indicates if something worked in a "business" sense, not in a "technical" sense. If the program has a technical flaw, main probably won't return at all, as the program will probably have crashed, or the return value will be meaningless, due to undefined behaviour.
The return value is used in things like search programs to indicate if something the program was interested in was found or not. The computer can't know what to return in these sorts of cases, as it has no understanding of the semantics of the program.
The return value of main, which becomes the exit code of the process once it's done running, is not related to whether the code is correct C++, but whether it has executed correctly from the point of view of its semantics (its business logic, let's say).
While the program exists as C++ source code, returning from main is an instruction like any other. Having return 0; in main will not affect whether your program is a valid C++ program, and will not fix e.g. syntax errors. While being compiled, it's totally irrelevant w.r.t. correctness.
The return value of main comes into play when the compiled program actually runs (already in binary form).
That is, when you're executing e.g. gcc ... -o myapp, the return value of main does not come into play (indeed, it doesn't even exist). But when you're then executing ./myapp, its process exit code (which is used by e.g. shell) is what gets set by the return value of main.
For example, the unix if command tests whether its argument returned 0 or non-0:
if ./myapp; then
echo "Success"
fi
Whether the above shell script echoes Success or not depends on whether the process exit code of myapp was 0 or not, in other words, whether its main function returned 0 or not.
The Windows-world equivalent of such a check would be:
myapp.exe
if errorlevel 1 goto bad
echo "Success"
:bad
One common convention is to have a process exit code of 0 on success, 1 when the program couldn't complete its task (e.g. it was asked to remove a file which doesn't exist), and 2 when it was invoked incorrectly (e.g. it was given a command-line option it doesn't understand). These are the values main returns.

What is the difference between dprintf vs break + commands + continue?

For example:
dprintf main,"hello\n"
run
Generates the same output as:
break main
commands
silent
printf "hello\n"
continue
end
run
Is there a significant advantage to using dprintf over commands, e.g. it is considerably faster (if so why?), or has some different functionality?
I imagine that dprintf could in theory be faster, as it could compile and inject code with a mechanism analogous to the compile code GDB command.
Or is it mostly a convenience command?
Source
In the 7.9.1 source, breakpoint.c:dprintf_command, which defines dprintf, calls create_breakpoint which is also what break_command calls, so they both seem to use the same underlying mechanism.
The main difference is that dprintf passes the dprintf_breakpoint_ops structure, which has different callbacks and gets initialized at initialize_breakpoint_ops.
dprintf stores a list of command strings, much like the commands command does, depending on the settings. They are:
set at update_dprintf_command_list
which gets called after a type == bp_dprintf check inside init_breakpoint_sal
which gets called by create_breakpoint.
When a breakpoint is reached:
bpstat_stop_status gets called and invokes b->ops->after_condition_true (bs); for the breakpoint reached
after_condition_true for dprintf is dprintf_after_condition_true
bpstat_do_actions_1 runs the commands
There are two main differences.
First, dprintf has some additional output modes that can be used to make it work in other ways. See help set dprintf-channel, or the manual, for more information. I think these modes are the reason that dprintf was added as a separate entity; though at the same time they are fairly specialized and unlikely to be of general interest.
More usefully, though, dprintf doesn't interfere with next. If you write a breakpoint and use commands, and then next over such a breakpoint, gdb will forget about the next and act as if you had typed continue. This is a longstanding oddity in the gdb scripting language. dprintf doesn't suffer from this problem. (If you need similar functionality from an ordinary breakpoint, you can do this from Python.)

Fortran script only runs when print statement added

I am running an atmospheric model, and need to compile an executable to convert some files. If I compile the code as supplied, it runs but it gets stuck and doesn't ever complete. It doesn't give an error or anything like that.
After doing some testing by adding print statements to see where it was getting stuck, I've found that the executable only runs if I compile the code with a print statement in one of the subroutines.
The piece of code in question is the one here. Specifically, the code fails to run unless I put a print statement somewhere in the get_bottom_top_dim subroutine.
Does anyone know why this might be? It doesn't matter what the print statement is (currently I'm using print*, '!'), but as soon as I remove it or comment it out, the code no longer works.
I'm assuming it must have something to do with my machine or compiler (ifort 12.1.0), but I'm stumped as to what the problem is!
This is an extended comment rather than an answer:
The situation you describe, inserting a print statement which apparently fixes a program, often arises when the underlying problem is due to either
a) an attempt to access an element outside the declared bounds of an array; or
b) a mismatch between dummy and actual arguments to some procedure.
Recompile your program with the compiler options to check interfaces at compile-time and to check array bounds at run-time.
Fortran has evolved a LOT since I last used it but here's how to go about solving your problem.
Think of some hypotheses that could explain the symptoms, e.g. the compiler is optimizing the subroutine down to a no-op when it has no print side effect. Or a compiler bug is translating this code into something empty or an infinite loop or crashing code. (What exactly do you mean by "fails to run"?) Or the linker is failing to link in some needed code unless the subroutine explicitly calls print.
Or there's a bug in this subroutine and the print statement alters its symptoms e.g. by changing which data gets overwritten by an index-out-of-bounds bug.
Think of ways to test these hypotheses. You might already have observations adequate to rule out some of them. You could decompile the object code to see if this subroutine is empty. Or step through it in a debugger. Or replace the print statement with a different side effect like logging to a file or to an in-memory text buffer.
Or turn on all optional runtime memory checks and compile time warnings. Or simplify the code until the problem goes away, then binary search on bringing back code until the problem recurs.
Do the most likely or easiest tests first. Rule out some hypotheses, and iterate.
I had a similar bug and found that the problem was in the dependencies in the makefile.
This was what I had:
I set a variable to a value and the program stops.
I add a print statement and it works.
I delete the print statement and it continues to work.
I alter the variable value and it stops.
The thing is, the variable value is set in parameters.f90.
The print statement is in a file H3.f90 that depends on parameters.f90, but that dependency was not declared in the makefile.
After correcting:
h3.o: h3.f90 variables.f90 parameters.f90
$(FC) -c h3.f90
It all worked properly.

Do you really need a main() in C++?

From what I can tell you can kick off all the action in a constructor when you create a global object. So do you really need a main() function in C++ or is it just legacy?
I can understand that it could be considered bad practice to do so. I'm just asking out of curiosity.
If you want to run your program on a hosted C++ implementation, you need a main function. That's just how things are defined. You can leave it empty if you want of course. On the technical side of things, the linker wants to resolve the main symbol that's used in the runtime library (which has no clue of your special intentions to omit it - it just still emits a call to it). If the Standard specified that main is optional, then of course implementations could come up with solutions, but that would need to happen in a parallel universe.
If you go with the "Execution starts in the constructor of my global object", beware that you set yourself up for many problems related to the order of construction of namespace scope objects defined in different translation units (So what is the entry point? The answer is: You will have multiple entry points, and which entry point is executed first is unspecified!). In C++03 you aren't even guaranteed that cout is properly constructed (in C++0x you have a guarantee that it is, before any code tries to use it, as long as there is a preceding include of <iostream>).
You don't have those problems and don't need to work around them (which can be very tricky) if you properly start executing things in ::main.
As mentioned in the comments, there are however several systems that hide main from the user by having them supply the name of a class which is instantiated within main. This works similarly to the following example:
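#include <string>
#include <vector>

class MyApp {
public:
    // Receives the command-line arguments collected by the hidden main.
    explicit MyApp(std::vector<std::string> const& argv) : args_(argv) {}
    int run() {
        /* code comes here */
        return 0;
    }
private:
    std::vector<std::string> args_;
};
IMPLEMENT_APP(MyApp);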
To the user of this system, it's completely hidden that there is a main function, but that macro would actually define such a main function as follows
#define IMPLEMENT_APP(AppClass) \
int main(int argc, char **argv) { \
AppClass m(std::vector<std::string>(argv, argv + argc)); \
return m.run(); \
}
This doesn't have the problem of unspecified order of construction mentioned above. The benefit of them is that they work with different forms of higher level entry points. For example, Windows GUI programs start up in a WinMain function - IMPLEMENT_APP could then define such a function instead on that platform.
Yes! You can do away with main.
Disclaimer: You asked if it were possible, not if it should be done. This is a totally un-supported, bad idea. I've done this myself, for reasons that I won't get into, but I am not recommending it. My purpose wasn't getting rid of main, but it can do that as well.
The basic steps are as follows:
Find crt0.c in your compiler's CRT source directory.
Add crt0.c to your project (a copy, not the original).
Find and remove the call to main from crt0.c.
Getting it to compile and link can be difficult; how difficult depends on which compiler and which compiler version.
Added
I just did it with Visual Studio 2008, so here are the exact steps you have to take to get it to work with that compiler.
Create a new C++ Win32 Console Application (click next and check Empty Project).
Add new item.. C++ File, but name it crt0.c (not .cpp).
Copy contents of C:\Program Files (x86)\Microsoft Visual Studio 9.0\VC\crt\src\crt0.c and paste into crt0.c.
Find mainret = _tmain(__argc, _targv, _tenviron); and comment it out.
Right-click on crt0.c and select Properties.
Set C/C++ -> General -> Additional Include Directories = "C:\Program Files (x86)\Microsoft Visual Studio 9.0\VC\crt\src".
Set C/C++ -> Preprocessor -> Preprocessor Definitions = _CRTBLD.
Click OK.
Right-click on the project name and select Properties.
Set C/C++ -> Code Generation -> Runtime Library = Multi-threaded Debug (/MTd) (*).
Click OK.
Add new item.. C++ File, name it whatever (app.cpp for this example).
Paste the code below into app.cpp and run it.
(*) You can't use the runtime DLL, you have to statically link to the runtime library.
#include <iostream>
class App
{
public: App()
{
std::cout << "Hello, World! I have no main!" << std::endl;
}
};
static App theApp;
Added
I removed the superfluous exit call and the blurb about lifetime as I think we're all capable of understanding the consequences of removing main.
Ultra Necro
I just came across this answer and read both it and John Dibling's objections below. It was apparent that I didn't explain what the above procedure does and why that does indeed remove main from the program entirely.
John asserts that "there is always a main" in the CRT. Those words are not strictly correct, but the spirit of the statement is. Main is not a function provided by the CRT, you must add it yourself. The call to that function is in the CRT provided entry point function.
The entry point of every C/C++ program is a function in a module named 'crt0'. I'm not sure if this is a convention or part of the language specification, but every C/C++ compiler I've come across (which is a lot) uses it. This function basically does three things:
Initialize the CRT
Call main
Tear down
In the example above, the call is _tmain but that is some macro magic to allow for the various forms that 'main' can have, some of which are VS specific in this case.
What the above procedure does is it removes the module 'crt0' from the CRT and replaces it with a new one. This is why you can't use the Runtime DLL, there is already a function in that DLL with the same entry point name as the one we are adding (2). When you statically link, the CRT is a collection of .lib files, and the linker allows you to override .lib modules entirely. In this case a module with only one function.
Our new program contains the stock CRT, minus its CRT0 module, but with a CRT0 module of our own creation. In there we remove the call to main. So there is no main anywhere!
(2) You might think you could use the runtime DLL by renaming the entry point function in your crt0.c file, and changing the entry point in the linker settings. However, the compiler is unaware of the entry point change and the DLL contains an external reference to a 'main' function which you're not providing, so it would not compile.
Generally speaking, an application needs an entry point, and main is that entry point. The fact that initialization of globals might happen before main is pretty much irrelevant. If you're writing a console or GUI app you have to have a main for it to link, and it's only good practice to have that routine be responsible for the main execution of the app rather than use other features for bizarre unintended purposes.
Well, from the perspective of the C++ standard, yes, it's still required. But I suspect your question is of a different nature than that.
I think doing it the way you're thinking about would cause too many problems though.
For example, in many environments the return value from main is given as the status result from running the program as a whole. And that would be really hard to replicate from a constructor. Some bit of code could still call exit of course, but that seems like using a goto and would skip destruction of anything on the stack. You could try to fix things up by having a special exception you threw instead in order to generate an exit code other than 0.
But then you still run into the problem of the order of execution of global constructors not being defined. That means that in any particular constructor for a global object you won't be able to make any assumptions about whether or not any other global object yet exists.
You could try to solve the constructor order problem by just saying each constructor gets its own thread, and if you want to access any other global objects you have to wait on a condition variable until they say they're constructed. That's just asking for deadlocks though, and those deadlocks would be really hard to debug. You'd also have the issue of which thread exiting with the special 'return value from the program' exception would constitute the real return value of the program as a whole.
I think those two issues are killers if you want to get rid of main.
And I can't think of a language that doesn't have some basic equivalent to main. In Java, for example, there is an externally supplied class name whose static main function is called. In Python, there's the __main__ module. In Perl there's the script you specify on the command line.
If you have more than one global object being constructed, there is no guarantee as to which constructor will run first.
If you are building static or dynamic library code then you don't need to define main yourself, but you will still wind up running in some program that has it.
If you are coding for windows, do not do this.
Running your app entirely from within the constructor of a global object may work just fine for quite a while, but sooner or later you will make a call to the wrong function and end up with a program that terminates without warning.
Global object constructors run during the startup of the C runtime.
The C runtime startup code runs during the DllMain of the C runtime DLL.
During DllMain, you are holding the DLL loader lock.
Trying to load another DLL while already holding the DLL loader lock results in a swift death for your process.
Compiling your entire app into a single executable won't save you - many Win32 calls have the potential to quietly load system DLLs.
There are implementations where global objects are not possible, or where non-trivial constructors are not possible for such objects (especially in the mobile and embedded realms).

Custom front end and back end with Pantheios logging

Apologies if I'm missing something really obvious, but I'm trying to understand how to write a custom front end and back end with Pantheios. (I'm using it from C++, not C.)
I can follow the purposes of the initialisation functions (I think) but I'm unsure about the others: pantheios_be_logEntry, pantheios_fe_getProcessIdentity and pantheios_fe_isSeverityLogged.
In particular, I'm confused about the relationship between a front end and a back end. How do I make them communicate with each other?
Not sure I understand exactly what you don't understand, but maybe that's part of the problem. ;-) So I'll try my best and you let me know whether it's near or not.
pantheios_fe_getProcessIdentity() is called once, when Pantheios is initializing. You need to return a string that identifies the process. (Actually, it identifies the link-unit; a term defined in Imperfect C++, written by Pantheios' creator, Matthew Wilson, which means the scope of link names, i.e. an executable program module or a dynamic library module.)
pantheios_fe_isSeverityLogged() is called whenever a log statement is executed in application code. It returns non-zero to indicate that the statement should be processed and sent to the output (via the back-end). If it returns zero, no processing occurs. FWIU, this is the main reason why Pantheios is so fast.
pantheios_be_logEntry() is called whenever a log statement is to be sent for output, when pantheios_fe_isSeverityLogged() has returned non-zero and the Pantheios core has processed the statement (forming all the arguments in your code into a single string). It sends the statement string to wherever it should go. For example, the be.fprintf back-end prints it to the console using fprintf().
Once you grok these aspects, the second part of your question is where it gets interesting. When your front-end and back-end are initialized they get to create some context (e.g. a C++ object) that the Pantheios core holds for them, and gives them back each time it calls a front/back end API function. When you're customizing both, you can have them communicate via some shared context that they both know about, but which the Pantheios core does not (and should not) know about, beyond having an opaque handle (void*) to it.
HTH