Is stripping certain information from a C++ binary safe? - c++

I am brand new to C++, trying to create a program to read pixels on the screen on Linux.
I currently compile the project without any optimization flag, as I am unsure what it does to the program, but that would be another question, here's mine:
Is stripping certain information from a C++ binary safe?
I found a possibly helpful manual page for the strip program.
As I don't really know what stripping means in this context, I am unsure if it is as simple as stripping all of it with:
-s --strip-all Remove all symbol and relocation information
But, of course, I'd want the program to still work flawlessly afterwards, so does it interfere in any way with the program's execution?
As for my motivation for stripping: I want to know if it's safe, and to repeat what I said above:
I don't really know what stripping means in this context.
I'd hope an answer could cover that as well, so that I can decide for myself.

Symbols are used for debugging.
Your application will continue to work without issues if you strip them, but you may find it harder to debug if there's a problem.
Relocation information is used for dynamic library loading and for address space layout randomisation (thank you @interjay). From the strip documentation:
--remove-relocations=sectionpattern
... Note that using this option inappropriately may make the output file unusable. ...
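To see what stripping actually did to a file, one option is to check whether the ELF section headers still list a symbol table. This is only a rough sketch, assuming a 64-bit ELF file with the same byte order as the machine running the check, and a Linux system that provides <elf.h>. The names check_symtab and SymtabStatus are made up for this example.

```cpp
#include <elf.h>      // Linux-specific: Elf64_Ehdr, Elf64_Shdr, SHT_SYMTAB
#include <cstring>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical result type for this sketch.
enum class SymtabStatus { Present, Absent, Unknown };

// Reports whether the file at `path` has a section of type SHT_SYMTAB
// (the static symbol table that `strip --strip-all` removes).
// Returns Unknown if the file cannot be read or is not a 64-bit ELF file.
SymtabStatus check_symtab(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    if (!in) return SymtabStatus::Unknown;

    Elf64_Ehdr header;
    if (!in.read(reinterpret_cast<char*>(&header), sizeof header))
        return SymtabStatus::Unknown;

    // Check the ELF magic, the 64-bit class, and the section header size.
    if (std::memcmp(header.e_ident, ELFMAG, SELFMAG) != 0 ||
        header.e_ident[EI_CLASS] != ELFCLASS64 ||
        header.e_shentsize != sizeof(Elf64_Shdr))
        return SymtabStatus::Unknown;

    // Read the whole section header table.
    std::vector<Elf64_Shdr> sections(header.e_shnum);
    if (!sections.empty()) {
        in.seekg(static_cast<std::streamoff>(header.e_shoff));
        if (!in.read(reinterpret_cast<char*>(sections.data()),
                     static_cast<std::streamsize>(sections.size() *
                                                  sizeof(Elf64_Shdr))))
            return SymtabStatus::Unknown;
    }

    for (const Elf64_Shdr& section : sections)
        if (section.sh_type == SHT_SYMTAB) return SymtabStatus::Present;
    return SymtabStatus::Absent;
}
```

Calling check_symtab on the binary before and after running strip should go from Present to Absent, while the program itself keeps running as before.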

Related

How to export function names and variable names using GCC or clang?

I am making commercial software and I don't want it to be easily crackable. It is targeted at Linux and I am compiling it with GCC (8.2.1). The problem is that when I compile it, technically anyone can use a disassembler like IDA or Binary Ninja to see all the function names. Here is an example (you can see function names in the left panel):
Is there any way to protect my program from this kind of reverse engineering? Is there any way of exporting all of these function names and variables from the code automatically (with GCC or clang?), so I can make a simple script to change them to something completely random before compilation?
So you want to hide/mask the names of symbols in your binary. You've decided that, to do this, you need to get a list of them so that you can create a script to modify them. Well, you could get that list with nm but you don't need any of that (rewriting names inside a compiled binary? oof… recipe for disaster).
Instead, just do what everybody does in a release build and strip the symbols! You'll see a much smaller binary, too. Of course this doesn't prevent reverse engineering (nothing does), though it arguably makes said task more difficult.
Honestly you should be stripping your release binaries anyway, and not to prevent cracking. Common wisdom is not to try too hard to prevent cracking, because you'll inevitably fail, and at the cost of wasted dev time in the attempt (and possibly a more complex codebase that's harder to maintain / a more complex executable that is less fast and/or useful for the honest customer).

How do you ascertain that you are running the latest executable?

Every so often I (re)compile some C (or C++) file I am working on -- which by the way succeeds without any warnings -- and then I execute my program only to realize that nothing has changed since my previous compilation. To keep things simple, let's assume that I added an instruction to my source to print out some debugging information onto the screen, so that I have a visual evidence of trouble: indeed, I compile, execute, and unexpectedly nothing is printed onto the screen.
This happened to me once when I had buggy code (I ran past the bounds of a static array). Of course, if your code has some kind of hidden bug (What are all the common undefined behaviours that a C++ programmer should know about?), the compiled code can do pretty much anything.
This happened to me twice when I used a ridiculously slow network hard drive which, I guess, simply did not update my executable file after compilation, so I kept running the old version despite the updated source. I am just speculating here, so feel free to correct me if such a phenomenon is impossible, but I suspect it had something to do with certain processes waiting for IO.
Well, such things could of course happen (and they indeed do), when you execute an old version in the wrong directory (that is: you execute something similar, but actually completely unrelated to your source).
It is happening again, and it annoys me enough to ask: how do you make sure that your executable matches the source you are working on? Should I compare the date strings of the source and the executable in the main function? Should I delete the executable prior to compilation? I guess people might do something similar by means of version control.
Note: I was warned that this might be a subjective topic likely doomed to be closed.
Just use good old version control.
In the easy case you can just add a visible version ID to the code (hash, revision ID, timestamp) and check it.
If your project has a lot of dependent files and you suspect that something older than the "latest" version ended up in the produced code, you can also track the version of every file used for the build, in addition to (obviously) having good makefile rules. This is VCS-dependent, but not a very heavy trick.
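A minimal sketch of the "visible version ID" idea, using the compile timestamp as the ID. It relies on the standard __DATE__ and __TIME__ macros. The function name build_stamp is made up for this example, and a VCS revision hash passed in by the build system would work the same way.

```cpp
#include <string>

// Returns the date and time at which THIS translation unit was compiled,
// e.g. in the form "Mmm dd yyyy hh:mm:ss".
// Caveat: the stamp only changes when this particular file is recompiled,
// so the build rules must make sure that happens on every rebuild.
std::string build_stamp() {
    return std::string(__DATE__) + " " + __TIME__;
}
```

Printing build_stamp() as the first thing in main makes it obvious at a glance whether the executable you are running comes from the compilation you just did.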
Check the timestamp of your executable. That should give you a hint regarding whether or not it is recent/up-to-date.
Alternatively, calculate a checksum of your executable and display it on startup: if the checksum is the same as before the rebuild, the executable was not updated.
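The checksum idea could look roughly like this. It is a sketch only: FNV-1a is just one simple, non-cryptographic hash picked for illustration, the function names are made up, and reading the running program through /proc/self/exe is a Linux-specific assumption.

```cpp
#include <cstdint>
#include <fstream>
#include <iterator>
#include <string>

// 64-bit FNV-1a hash over a byte string (simple, not cryptographic).
std::uint64_t fnv1a_64(const std::string& bytes) {
    std::uint64_t hash = 14695981039346656037ULL;  // FNV offset basis
    for (unsigned char byte : bytes) {
        hash ^= byte;
        hash *= 1099511628211ULL;                  // FNV prime
    }
    return hash;
}

// Reads the whole file at `path` and stores its FNV-1a hash in `out`.
// Returns false (leaving `out` untouched) if the file cannot be opened.
bool checksum_of_file(const std::string& path, std::uint64_t& out) {
    std::ifstream in(path, std::ios::binary);
    if (!in) return false;
    std::string bytes((std::istreambuf_iterator<char>(in)),
                      std::istreambuf_iterator<char>());
    out = fnv1a_64(bytes);
    return true;
}
```

On Linux, calling checksum_of_file("/proc/self/exe", value) at startup and printing value in hex gives a number that should change whenever the binary's contents change.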

How to know the optimization options used to build a shared library in C++

I have a very simple question but I haven't been able to find the answer yet so here I go:
I am using a shared library and I'd like to know whether it was compiled with an optimization flag such as -O3.
Is there a way to find that information?
Thank you very much
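If you are using gcc 4.3 or later, check out -frecord-gcc-switches. It only helps if that flag was passed when the binary was built; the recorded command line is stored in the .GCC.command.line section, which you can dump with readelf -p .GCC.command.line (readelf -n prints the notes section, which is not where the switches end up).
More info can be found in "Detect GCC compile-time flags of a binary".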
Unless whoever compiled the library in the first place used a compiler that saves these flags to the binary somehow (I think only recent GCC allows that, and probably clang), there's inherently no way to know exactly what flags have been used. You can, of course, if you have had a lot of experience looking at assembly, deduce a lot (for example "this looks like an automatically unrolled loop", "This looks like someone optimized for a processor where A xor A is faster than A := 0x0", etc).
Generally, different source code can always end up as the same compiled code, so in many cases there's no way to tell whether what has been compiled was optimized "by hand" in the first place or has seen compiler optimization.
Also, there are a lot of C++ compilers out there, a lot of versions of these and even more flags...
Now, your question comes out of somewhere; I'm guessing you're asking this because either
1. you want to know if there's debugging symbols in there, or
2. you want to make sure something isn't crashing because of incorrect optimization, or
3. you want to know whether there's potential for optimization.
Now, 1. is really rather independent of the level of optimization; of course, the more you optimize, the less your machine code corresponds to "lines of source code", but you can still have debugging symbols.
The second point: I've learned the hard way that unless I've successfully excluded every other alternative, I'm the one to blame for bugs (and not my compiler).
The third point: There's always room for optimization, but that won't help you unless you're in a position to recompile the library yourself. If you recompile, you'll set the flags, so no need to find out if they were set in the first place. If you're not able to recompile: Knowing there is room won't help you. If you're just getting your library out of a complex build process: Most build systems leave you with a log that will include things like compiler flags.
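If you are in a position to recompile the library, one way to make the answer discoverable later is to have the library report it itself. This is a sketch that assumes GCC or clang, which predefine __OPTIMIZE__ when optimizing and __OPTIMIZE_SIZE__ for -Os. The function name is made up, and it cannot tell -O1, -O2 and -O3 apart.

```cpp
#include <string>

// Reports how this translation unit was compiled, based on macros that
// GCC and clang predefine. Other compilers may not define them at all,
// in which case this falls through to "not optimized".
std::string build_optimization_info() {
#if defined(__OPTIMIZE_SIZE__)
    return "optimized for size";
#elif defined(__OPTIMIZE__)
    return "optimized";
#else
    return "not optimized";
#endif
}
```

This does nothing for a third-party binary you cannot rebuild; for those, the build log or the recorded switches (if any) remain the only reliable sources.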

Strip symbols and RTTI text from GCC executable

My project uses template metaprogramming heavily. Most of the action happens inside recursive templates which produce objects and functions with very long (mangled) symbol names.
Despite the build time being only ~30 sec, the resulting executable is about a megabyte, and it's mostly symbol names.
On Linux, adding a -s argument to GCC brings the size down to ~300 KiB, but a quick look with a text editor shows there are still a lot of cumbersome names in there. I can't find how to strip anything properly on OS X… will just write that off for now.
I suspect that the vtable entries for providing typeid(x).name() are taking up a big chunk. Removing all use of the typeid operator did not cause anything more to be stripped on Linux. I think that the default exception handler uses the facility to report the type of an uncaught exception.
How might I maximize strippage and minimize these kilobyte-sized symbols in my executable?
Just run the program strip on the final executable. If you want to be fancier, you can use some other tools to store the debug info separately, but for your stated purpose, just strip a.out is fine. Maybe use the --strip-all option--I haven't tried that myself to see if it differs from the default behavior.
If you really want to try disabling RTTI, well, it's gcc -fno-rtti. But that may break your program badly--only one way to find out I guess.
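To illustrate why strip cannot remove those names: the strings behind typeid(x).name() are ordinary read-only data that the program may use at run time, not entries in the symbol table. A small sketch with made-up types Shape and Circle:

```cpp
#include <string>
#include <typeinfo>

struct Shape {
    virtual ~Shape() {}  // makes the type polymorphic, so typeid is dynamic
};
struct Circle : Shape {};

// Returns the implementation-defined name of the dynamic type of `s`.
// With GCC this is a mangled name; the string is stored in the binary as
// data, so it survives `strip`.
std::string dynamic_type_name(const Shape& s) {
    return typeid(s).name();
}
```

With -fno-rtti GCC rejects the typeid expression above (and dynamic_cast on polymorphic types), so that flag is only an option if the code base does not use either.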

Can GNU ld be instructed to print which .o files are needed during a link?

A little background: I'm trying to build an AVR binary for an embedded sensor system, and I'm running close to my size limit. I use a few external libraries to help me, but they are rather large when compiled into one object per library. I want to pull these into smaller objects so only the functionality I need is linked into my program. I've already managed to drop the binary size by 2k by splitting up a large library.
It would help a lot to know which objects are being used at each stage of the game so I can split them more efficiently. Is there a way to make ld print which objects it's linking?
I'm not sure about the object level, but I believe you might be able to tackle this on the symbol level using CFLAGS="-fdata-sections -ffunction-sections" and LDFLAGS="-Wl,--gc-sections -Wl,--print-gc-sections". This should get rid of the code for all unreferenced symbols, and display the removed symbols to you as well which might be useful if for some reason you decide to go back to the object file level and want to identify object files only containing removed symbols.
To be more precise, the compiler flags I quoted will ask the compiler to place each function or global variable in a section of its own, and the --gc-sections linker flag will then remove all the sections which have not been used. It might be that each object file contains its own sections, even if the functions therein all share a single section. In that case the linker flag alone should do what you ask for: eliminate whole objects which are not used. The gcc manual states that the compiler flags will increase the object size, and although I hope that the final executable should not be affected by this, I don't know for sure, so you should give LDFLAGS="-Wl,--gc-sections" by itself a try in any case.
The listed option names might be useful keywords to search on stackoverflow for other suggestions on how to reduce the size of the binary. gc-sections e.g. yields 62 matches at the moment.
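As a small illustration of what the flags above act on, here is a made-up library source file with one function a program would call and two definitions nothing references. The file name, function names and values are all hypothetical, and the commands in the comments just combine the flags quoted above (use your cross compiler, e.g. the AVR one, in place of g++).

```cpp
// sensor_utils.cpp (hypothetical)
//
// Compile so that every function and variable gets its own section:
//   g++ -Os -fdata-sections -ffunction-sections -c sensor_utils.cpp
// Link and let the linker drop and report unreferenced sections:
//   g++ main.o sensor_utils.o -Wl,--gc-sections -Wl,--print-gc-sections

// Referenced by the program: its section is kept.
int scale_reading(int raw) {
    return raw * 2;  // placeholder arithmetic
}

// Never referenced: with the flags above this lands in its own data
// section, which --gc-sections can discard.
int unused_calibration_table[256] = {1};

// Never referenced either: its own text section can be discarded too.
int unused_self_test() {
    return unused_calibration_table[0];
}
```

Without -ffunction-sections and -fdata-sections, all three definitions would normally share the object's .text and .data sections, so keeping scale_reading would keep the unused code and table along with it.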