I've been using this OpenCL-based program (https://github.com/johguse/profanity) for a while now and wanted to build it from source. The resulting executable (source code unchanged) seems to stop executing when it looks for devices.
user#room:~/profanity$ ./profanity.x64 -I 900 --zeros
Mode: zeros
Target: Address
Devices:
user#room:~/profanity$
I've tried this on three different PCs, all of which had no problem running the original program, so my hardware shouldn't be the problem.
Since this is largely about the program itself, probably nobody will know the solution right away, but I'm getting kind of desperate, so I would like to know what some common causes of GPU-detection problems in OpenCL programs are.
Thanks in advance.
Working Fortran compilers sometimes generate invalid Win32 .exe files
Hello everybody,
several working Fortran compilers seem to have a strange behavior in certain situations. I have tried to compile and run Prof. John Denton's programs which can be found here:
https://www.dropbox.com/sh/8i0jyxzjb57q4j4/AABD9GQ1MUFwUm5hMWFylucva?dl=0
The different versions of the programs Meangen and Stagen compiled and worked fine. The last program, Multall, also has several different versions. As before, the corresponding source code compiled without any problems. However, when I tried to run the resulting .exe files, I got a very strange error message saying that Multall's .exe is NOT a valid Win32 executable.
I used four different Fortran compilers (g77, Cygwin, Mingw, FTN95) on Windows XP and Windows 8, always with the same result. I ran several tests, and it seems to me that the reason for the strange error message is the huge amount of source code Multall consists of. There are many more than 16000 lines of code, so maybe the memory the compiler allocates by default for the code segment is too small and an overflow occurs.
I tried several g77 command-line options in order to increase the amount of memory for the code segment, but none worked. Can anybody tell me which of g77's command-line options would make the .exe of the huge program Multall work? Or maybe I am wrong, and the strange error message has nothing to do with the code segment? Who can help me?
Thanks a lot, I highly appreciate your help
Indeed, the problem is not the program size but the stack size. This is due to the large common blocks. As a test you could reduce JD in commall-open-18.3 to 1000 and you will notice that the problem is solved.
You could check whether the arrays are oversized and adjust some parameters.
I tried reducing the common blocks, without any effect. Then I tried on another computer, and there the compilation went fine and the code runs. I am guessing it is some sort of screw-up of the libraries, maybe because I made a messy (first) installation where I didn't really know what I was doing, but I really don't know.
For many weeks now, our compilation server has been crashing randomly while compiling our C++ code.
Sometimes the compilation fails with the following error:
/usr/include/c++/7/future:429:7: internal compiler error: Segmentation fault
The error is always raised from system libraries (but not always the same one) and at different steps of the compilation process.
We have tried increasing the RAM to 10 GB and the swap to 5 GB, but that did not solve the issue. We have also tried multiple versions of the compiler, without success.
We have a set of machines, but the issue is only reproducible on our compilation server. We have to fix it because this server is part of our continuous integration chain.
The source code consists of about 10000-20000 lines (not really much), but we use some templates.
Does anyone know how to solve or investigate this error?
System information:
compiler = c++
compiler version = c++ (Ubuntu 7.2.0-1ubuntu1~16.04) 7.2.0
compilation tools = cmake and make
ubuntu-xenial
RAM = 10G
Swap = 5G
NbCPU = 4
Thank you very much for your help
So you've got intermittent errors in (presumably well tested) compiler internals from (presumably also well tested) system libraries, and the problems are reproducible on multiple compiler versions, but only on this single machine. That points towards a hardware issue.
Bad RAM seems like a good candidate. A C++ compiler processing a moderately sized code base is likely to crash from e.g. random bit flips at least some of the time.
You should test your RAM (or just swap it out and see if the failures go away).
At work I inherited a big code base. Older version was compiled with VC6.0 and works fine on Windows XP and 32-bit Windows 7. The quad core computer is specifically made for field use in a special industry.
I managed to upgrade to VC2005 and VC2013; however, the binaries produced by the newer compilers yield very high CPU usage, to the point where the UI is not usable.
I tried a few profilers but got quite different results. For example, one points to PostMessageA, and another points to LineTo (MFC function).
Any clue where I should look to find the cause?
I've rarely trusted profilers. One thing I do is I will repeatedly pause the debugger and see where it ends up. If it keeps ending up with a similar call stack, that's where the problem probably is.
If you have a lot of threads, you can play with freezing individual threads and pressing play/pause. Of course, if there are a lot of inter-thread dependencies, this will be difficult.
Every so often I (re)compile some C (or C++) file I am working on -- which by the way succeeds without any warnings -- and then I execute my program only to realize that nothing has changed since my previous compilation. To keep things simple, let's assume that I added an instruction to my source to print out some debugging information onto the screen, so that I have a visual evidence of trouble: indeed, I compile, execute, and unexpectedly nothing is printed onto the screen.
This happened to me once when I had buggy code (I ran out of the bounds of a static array). Of course, if your code has some kind of hidden bug (What are all the common undefined behaviours that a C++ programmer should know about?) the compiled code can be pretty much anything.
This happened to me twice when I used some ridiculously slow network hard drive which -- I guess -- simply did not update my executable file after compilation, so I kept running the old version despite the updated source. I am just speculating here, and feel free to correct me if such a phenomenon is impossible, but I suspect it had something to do with certain processes waiting for IO.
Such things can of course also happen (and they indeed do) when you execute an old version in the wrong directory (that is: you execute something similar, but actually completely unrelated to your source).
It is happening again, and it annoys me enough to ask: how do you make sure that your executable matches the source you are working on? Should I compare the date strings of the source and the executable in the main function? Should I delete the executable prior to compilation? I guess people might do something similar by means of version control.
Note: I was warned that this might be a subjective topic likely doomed to be closed.
Just use good ol' version control.
In the easy case you can just add (any) visible version ID to the code and check it (hash, revision ID, timestamp).
If your project has a lot of dependent files and you suspect that a version older than the latest ended up in the produced code, you can (apart from, obviously, writing good makefile rules) also track the version of every file used for the build (this is VCS-dependent, but not a particularly heavy trick).
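As a rough illustration of the "visible version ID" idea, here is a minimal sketch that uses the standard predefined macros `__DATE__` and `__TIME__` as a build timestamp. The helper name `build_stamp` is made up for this example. Note that the macros are only refreshed when the file containing them is actually recompiled, so this is a hint, not a proof that every object file is fresh.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not from the thread): returns the date and time at
// which THIS translation unit was compiled, e.g. "Jan  5 2024 13:37:00".
// It only changes when this particular file is recompiled.
inline std::string build_stamp() {
    std::string stamp = std::string(__DATE__) + " " + __TIME__;
    // "Mmm dd yyyy" (11 chars) + ' ' + "hh:mm:ss" (8 chars)
    assert(stamp.size() == 20);
    return stamp;
}
```

Printing `build_stamp()` as the first line of `main` makes a stale executable easy to spot. A revision ID passed in by the build system (for example through a `-D` define) would serve the same purpose.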
Check the timestamp of your executable. That should give you a hint regarding whether or not it is recent/up-to-date.
Alternatively, calculate a checksum of your executable and display it on startup; if the checksum is the same as before, you know the executable was not updated.
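A minimal sketch of the checksum idea, assuming a simple non-cryptographic hash is good enough to tell two builds apart. The function names are invented for this example, and FNV-1a is just one possible choice of hash.

```cpp
#include <cassert>
#include <cstdint>
#include <fstream>
#include <iterator>
#include <string>

// 64-bit FNV-1a hash over a byte string (non-cryptographic).
inline std::uint64_t fnv1a64(const std::string& bytes) {
    std::uint64_t h = 0xcbf29ce484222325ULL;   // FNV offset basis
    for (unsigned char c : bytes) {
        h ^= c;
        h *= 0x100000001b3ULL;                 // FNV prime
    }
    return h;
}

// Sanity check: the hash of the empty input is the offset basis itself.
inline void checksum_self_test() {
    assert(fnv1a64("") == 0xcbf29ce484222325ULL);
}

// Hypothetical helper: hashes the file at `path` (read in binary mode).
// Returns false if the file cannot be opened; `out` is then left untouched.
inline bool checksum_file(const std::string& path, std::uint64_t& out) {
    std::ifstream in(path, std::ios::binary);
    if (!in) return false;
    std::string bytes((std::istreambuf_iterator<char>(in)),
                      std::istreambuf_iterator<char>());
    out = fnv1a64(bytes);
    return true;
}
```

At startup you could call `checksum_file(argv[0], sum)` and print `sum` in hex. This assumes `argv[0]` is a usable path to the executable, which is not guaranteed; on Linux, `/proc/self/exe` is a more reliable alternative.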
Unfortunately I am not working with open code right now, so please consider this a question of pure theoretical nature.
The C++ project I am working with definitely seems to be crippled by the following options. At least GCC 4.3 - 4.8 cause the same problems; I didn't notice any trouble with the 3.x series (these options might not have existed or might have worked differently there). The affected platforms are Linux x86 and Linux ARM. The options themselves are enabled automatically at the O1 or O2 level, so I first had to find out which options are causing it:
tree-dominator-opts
tree-dse
tree-fre
tree-pre
gcse
cse-follow-jumps
It's not my own code, but I have to maintain it, so how could I possibly find the sources of the trouble these options are causing? Once I disabled the optimizations above with "-fno", the code worked.
On a side note, the project works flawlessly with Visual Studio 2008, 2010 and 2013 without any noticeable problems or specific compiler options. Granted, the code is not 100% cross-platform, so some parts are Windows/Linux specific, but even then I'd like to know what's happening here.
It's not a vital question, since I can make the code run flawlessly, but I am still interested in how to track down such problems.
So to make it short: How to identify and find the affected code?
I doubt it's a giant GCC bug and maybe there is not even a real fix for the code I am working with, but it's of real interest for me.
I take it that most of these options are eliminations of some kind and I also read the explanations for these, still I have no idea how I would start here.
First of all: try using a debugger. If the program crashes, check the backtrace for places to look for the faulty function. If the program misbehaves (wrong outputs), you should be able to tell where it occurs by carefully placing breakpoints.
If it didn't help and the project is small, you could try compiling a subset of your project with the "-fno" options that stop your program from misbehaving. You could brute-force your way to finding the smallest subset of faulty .cpp files and work your way from there. Note: finding a search algorithm with good complexity could save you a lot of time.
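One way to do the file search with good complexity is a plain bisection, sketched below. It assumes that exactly one .cpp file is responsible; with several interacting files you would need something like delta debugging instead. The callback `works_when_safe` is hypothetical: it stands for "rebuild with the -fno options applied only to these files, run the program, report whether it behaves".

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical callback: rebuild the project with the "-fno-..." flags
// applied ONLY to the given files (everything else fully optimized), run
// the program, and return true if it behaves correctly.
using BuildAndTest = std::function<bool(const std::vector<std::string>&)>;

// Bisects the file list; assumes exactly one file triggers the misbehaviour.
// Needs about log2(N) rebuilds instead of N.
inline std::string find_culprit(const std::vector<std::string>& files,
                                const BuildAndTest& works_when_safe) {
    assert(!files.empty());
    std::size_t lo = 0, hi = files.size();   // the culprit lies in [lo, hi)
    while (hi - lo > 1) {
        const std::size_t mid = lo + (hi - lo) / 2;
        // De-optimize only the first half of the remaining candidates.
        std::vector<std::string> safe(files.begin() + lo, files.begin() + mid);
        if (works_when_safe(safe)) {
            hi = mid;   // program works: the culprit was among the safe files
        } else {
            lo = mid;   // still broken: the culprit is still being optimized
        }
    }
    return files[lo];
}
```

In practice the callback would shell out to the build system with per-file flags. The same loop can then be repeated over the six options to see which single one matters for that file.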
If, by any chance, there is a single faulty .cpp file, then you could further factor its contents into several .cpp files to see which functions are the cause of misbehavior.