I am developing a program for a board that uses the PowerPC architecture. I have just made some changes to the repository: refactored a bit, and moved and deleted some classes.
On my development machine (a Linux x64 VM) the binaries build fine and run. When I build with the cross-compile toolchain, the build runs through smoothly without any errors or warnings. But on the target system I cannot get the program to run; it does not even seem to reach the main entry point.
So my guess is that I have somehow created a linkage problem in the project. I just don't know how to untangle that beast.
So my questions: how can I get to the bottom of errors that occur before the main entry point is reached? How can I find any circular dependencies that exist?
And just for "fun": why in God's name would it build and run on x86 but not on PPC?
Yes, I know this is too little information to really help with, but I am asking for directions, sort of, since I will have to deal with these problems at some point anyway.
Why in God's name would it build and run on x86 but not on PPC?
There are a million possible reasons, from broken cross-toolchain, to bugs in libc, to incorrect toolchain invocation.
I am asking for directions
You should start by compiling this source:
int main() { return 0; }
and verifying that it works (this verifies basic toolchain sanity). If it does, you then extend it to print something. Then compile it with exactly the flags that you use in your real project.
If all of that checks out, you can run your real project under strace and/or GDB and see if you understand where the crash is happening. If you don't, edit your question with output from the tools, and someone may be able to guess better.
Update:
It seems the PPC toolchain, or rather its compiler, did not know how to handle static variables being declared after usage
If that were true, you would get a compilation error. You didn't, so this is false.
On the target: run "gdb crashing/app" and then look at the frames; somewhere there is a "__static_initialization_and_destruction_0" frame, which should even point you to the file and line where the not-yet-constructed static variable is used.
Your real problem is very likely this: the order of construction of global (or class-static) variables in different translation units (i.e. different source files) is undefined (and often happens to be opposite between x86_64 and PPC). If you have two such globals in different files (say A and B), and B's constructor depends on A having been constructed already, then your program will work fine on platforms where A is constructed before B, but will crash on platforms where B is constructed before A.
Related
I have a program that has been running fine for a long time on one platform. Because of its success, it is to be ported to another platform. No problem, I thought, since it is written in standard C++...
My approach (illustrated with pseudo CMake):
set up the development environment by sourcing the platform-specific toolchain to ensure that the correct platform is targeted
factor out all core business logic into an application object and build a library out of it (one library per platform, from the same source code):
add_library(appLib STATIC app.cpp)
target_link_libraries(appLib utilLib networkLib dbLib ${boostLibs})
have one main_a.cpp and another main_b.cpp, which do the platform-specific initialization for platforms A and B respectively, and let the main function in each instantiate the application object.
int main()
{
    auto result = initAndDoPlatformStuff();
    App app(result);
    app.run();
}
instruct compiler and linker to assemble an executable:
if(Platform_A)
    add_executable(appExe main_a.cpp)
else()
    add_executable(appExe main_b.cpp)
endif()
target_link_libraries(appExe appLib)
In principle, this is a perfectly valid approach, I guess. But in reality it does not work. Within a second the program crashes, and the crashes are different almost every time; inspecting the core dumps indicates it sometimes crashed in the standard library, sometimes in the Boost library, and also in my code, but that is nonsense, I guess. The program seems to work 1 out of 10 times, but eventually crashes.
However, if I use the exact same code, only extracted back into its original main.cpp file and built together differently, like this:
int main()
{
    auto result = initAndDoStuff();
    processForever(result); // Business logic
}
add_executable(appExe main.cpp)
target_link_libraries(appExe utilLib networkLib dbLib ${boostLibs})
then it works!
I'm puzzled and confused. I suspect it has something to do with code layout, so I've played around with different variants of PIC and PIE, but had no success with that. Are there any tools that give you a comprehensive overview of the binary code layout? I know about nm, od, and objdump, but they are low-level and I don't know what to look for...
Maybe I'm on the wrong path anyway; maybe the problem is related to something completely different. Does anyone have a hunch about what can cause this behavior? How else can I approach this problem?
Actually, the fault was mine. Of course...
I really tried to get all the details right when I refactored the code into a lib, but obviously I was not careful enough, and blind when searching for the problem.
The problem, which I finally found, was that I had kept one variable as a local variable after refactoring; it then went out of scope, causing deallocated memory to be referenced, which resulted in all sorts of undefined behavior.
Every so often I (re)compile some C (or C++) file I am working on -- which by the way succeeds without any warnings -- and then I execute my program only to realize that nothing has changed since my previous compilation. To keep things simple, let's assume that I added an instruction to my source to print out some debugging information onto the screen, so that I have a visual evidence of trouble: indeed, I compile, execute, and unexpectedly nothing is printed onto the screen.
This happened to me once when I had buggy code (I ran out of the bounds of a static array). Of course, if your code has some kind of hidden bug (What are all the common undefined behaviours that a C++ programmer should know about?), the compiled code can be pretty much anything.
This happened to me twice when I used a ridiculously slow network hard drive which, I guess, simply did not update my executable file after compilation, so I kept running and running the old version despite the updated source. I am just speculating here, and feel free to correct me if such a phenomenon is impossible, but I suspect it had something to do with certain processes waiting for IO.
Well, such things can of course happen (and they indeed do) when you execute an old version in the wrong directory (that is, you execute something similar, but actually completely unrelated to your source).
It is happening again, and it annoys me enough to ask: how do you make sure that the executable you run matches the source you are working on? Should I compare the date stamps of the source and the executable in the main function? Should I delete the executable prior to compilation? I guess people might do something similar by means of version control.
Note: I was warned that this might be a subjective topic likely doomed to be closed.
Just use good ol' version control.
In the easy case you can just add a visible version-id in the code and check it (a hash, revision-id, or timestamp).
If your project has a lot of dependent files and you suspect that an older version than the latest ended up in the produced code, you can (besides, obviously, good makefile rules) also track the version of every file used to build the code (VCS-dependent, but not such a heavy trick).
Check the timestamp of your executable. That should give you a hint regarding whether or not it is recent/up-to-date.
Alternatively, calculate a checksum for your executable and display it on startup; then you have a clue: if the checksum is the same, the executable was not updated.
I want to give a C++ programme to someone for testing but I don't want them to see the source just yet. My main issues are that I don't know what platform that person is using and I don't want to create a shared library unless I have no other option. Ideally, I would like to send headers and object files for the person to compile and link him/herself but as far as I know that would only work if the person has the same set up that I have.
I am currently using Windows, but I'm comfortable working on any Unix-like system as well. I am not using an IDE, in case you need that information.
Well, a Windows development environment allows you to bind to native, always backward-compatible WinAPI functions. Distributing correctly set up binary .dll files, along with consistent headers, is enough.
For Linux, the scenario is different: you either need a package compiled from source (which discloses the source), or distributed binaries for every Linux distribution you actually want to support.
If you want to avoid source code disclosure where compiling on specific target systems is required, use a licensing mechanism that prevents the program from being run without one.
Assuming the choice of machine is "reasonable" - in other words, it's something running Linux, Windows, Android or MacOS and a reasonable target processor such as MIPS, Sparc, x86 or ARM, then one POSSIBLE solution is to use clang -S -emit-llvm yourfile.cpp to produce an intermediate form of the LLVM "virtual machine code". This can then, using llc, be compiled to machine code for any target that LLVM supports.
It's not completely impossible to figure out roughly what the source code looked like, but unless someone wants to put a LOT of effort into picking apart your code, they won't be able to see what it does. Then again, even giving someone a binary allows them, if they are so inclined, to reverse engineer the code.
The other alternative, as I see it, is that you demonstrate the code on your machine [or a machine under your control].
There are also tools that can "obfuscate" source code (rename variables, structure/class members, and functions to a, b, c; remove all comments; and "unformat" the code, all of which makes it much harder to understand what the code does). Sorry, you'll have to google to find a good one, as I have never used such a thing myself. And again, of course, it's not IMPOSSIBLE to recover the code into something that can be used, modified, and rebuilt. There is really no way to avoid giving the customer something they can compile unless you know what OS/processor it is for.
Unfortunately I am not working with open code right now, so please consider this a question of pure theoretical nature.
The C++ project I am working with seems to be definitely crippled by the following options. At least GCC 4.3 through 4.8 cause the same problems; I didn't notice any trouble with the 3.x series (these options might not have existed or might have worked differently there). The affected platforms are Linux x86 and Linux ARM. The options themselves are automatically set at the O1 or O2 level, so I first had to find out which options were causing it:
tree-dominator-opts
tree-dse
tree-fre
tree-pre
gcse
cse-follow-jumps
It's not my own code, but I have to maintain it. So how could I find the sources of the trouble these options are causing? Once I disabled the optimizations above with their "-fno-" counterparts, the code works.
On a side note, the project works flawlessly with Visual Studio 2008, 2010, and 2013, without any noticeable problems or specific compiler options. Granted, the code is not 100% cross-platform, so some parts are Windows/Linux specific, but even then I'd like to know what's happening here.
It's not a vital question, since I can make the code run flawlessly, but I am still interested in how to track down such problems.
So to make it short: How to identify and find the affected code?
I doubt it's a giant GCC bug, and maybe there is not even a real fix for the code I am working with, but it is of real interest to me.
I take it that most of these options are eliminations of some kind, and I have read the explanations for them, but I still have no idea where I would start.
First of all: try using a debugger. If the program crashes, check the backtrace for places to look for the faulty function. If the program misbehaves (wrong outputs), you should be able to tell where this occurs by carefully placing breakpoints.
If that didn't help and the project is small, you could try compiling a subset of your project with the "-fno-" options that stop your program from misbehaving. You can brute-force your way to the smallest subset of faulty .cpp files and work from there. Note: a search strategy with good complexity (e.g. bisection) can save you a lot of time.
If, by any chance, there is a single faulty .cpp file, you could further split its contents into several .cpp files to see which functions cause the misbehavior.
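For the per-file bisection, most build systems let you override flags for individual sources, so you do not have to split the project into separate targets. In CMake (3.11 or later; target and file names here are made up for illustration) that could look like:

```cmake
# Compile only the suspect file with the problematic optimizations
# disabled; the rest of the target keeps its normal -O2 flags.
set_source_files_properties(suspect.cpp PROPERTIES
    COMPILE_OPTIONS "-fno-tree-pre;-fno-gcse;-fno-cse-follow-jumps")
add_executable(appExe main.cpp suspect.cpp)
```

Moving the property from file to file (or over halves of the file list) narrows the search quickly.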
I am facing a rather peculiar issue: I have a Qt C++ application that used to work fine. Now, suddenly I cannot start it anymore. No error is thrown, no nothing.
Some more information:
Last line of output when application is started in debug mode with Visual Studio 2012:
The program '[4456] App.exe' has exited with code -1 (0xffffffff).
Actual application code (= the first line in main()) is never called, or at least no breakpoints are triggered, so debugging is not possible.
The executable process for a few seconds appears in the process list and then disappears again.
Win 7 x64 with latest Windows updates.
The issue appeared simultaneously on two separate machines.
Application was originally built with Qt 5.2.1. Today I test-wise switched to Qt 5.4.1, but, as expected, no change.
No changes to source code were made. The issue also applies to existing builds of the application.
Running DependencyWalker did not yield anything of interest from my point of view.
I am flat out of ideas. Any pointers on what to try or look at? How can an executable suddenly stop working at all with no error?
I eventually found the reason for this behavior... sort of. The code (e.g. my singletons) was never the problem (as I expected, since the code had always worked). Instead, an external library (the SAP RFC SDK) caused the trouble.
This library depends on the ICU Unicode libraries, and apparently on specific versions at that. Since I wasn't aware of that fact, I only had the ICU libraries that my currently used Qt version needs in my application directory. The ICU libraries for the SAP RFC SDK must have been loaded from a standard Windows path so far.
In the end, some software changes (Windows updates, manual application uninstalls, etc.) must have removed those libraries, which resulted in the described silent failure. Simply copying the required versions of the ICU DLLs into my application folder solved the issue.
The only thing I am not quite sure about is why this was not visible when tracing the loaded DLLs via DependencyWalker.
"Actual application code (= first line in main()) is never called. So debugging is not possible."
You probably have some static storage initialization failing; that runs before main() is called.
Do you use any interdependent singletons in your code? If so, consolidate them into a single singleton (remember, there shouldn't be more than one singleton).
Also note that debugging is still possible in such a situation. The trap is that, for a case like the one described in my answer, the debugger by default sets its first breakpoint at the first line of main()'s body when you start your program.
Nothing prevents you from setting breakpoints that are hit before execution actually reaches main().
As for your clarification from comments:
"I do use a few singletons ..."
As mentioned above, if you are really sure you need to use a singleton, actually use a single one.
Otherwise you may end up struggling with the undefined order of initialization of static storage.
Anyway, it doesn't matter that much if static storage data depends on each other, provided you give a single access point to it throughout your code, to avoid cluttering the code with heavy coupling to a variety of instances.
Coupling to a single instance makes it easier to refactor the code to use an interface if it turns out a singleton wasn't the right choice.