c++ floats and valgrind strange behaviour - c++

I have valgrind 3.6.0, I've searched everywhere and found nothing.
The problem is that when I'm trying to access a float number while using valgrind, I get a segfault, but when I run the program as is, without valgrind, everythings goes as expected.
This is the piece of code:
class MyClass {
public:
void end() {
float f;
f = 1.23;
std::stringstream ss;
ss << f;
std::cout << ss.str();
}
};
extern "C" void clean_exit_on_sig(int sig) {
//Code logging the error
mc->end();
exit(1);
}
MyClass *mc;
int main(int argc, char *argv[]) {
signal(SIGINT , clean_exit_on_sig);
signal(SIGABRT , clean_exit_on_sig);
signal(SIGILL , clean_exit_on_sig);
signal(SIGFPE , clean_exit_on_sig);
signal(SIGSEGV, clean_exit_on_sig);
signal(SIGTERM , clean_exit_on_sig);
mc = new MyClass();
while(true) {
// Main program loop
}
}
When I press Control+C, the program catches the signal correctly and everything goes fine, but when I run the program using valgrind, when tries to execute this command ss << f; // (Inside MyClass) a segfault is thrown :-/
I've tried this too:
std::string stm = boost::lexical_cast<std::string>(f);
But I keep on receiving a segfault signal when boost acceses the float number too.
This is the backtrace when I get segfault with boost:
./a.out(_Z17clean_exit_on_sigi+0x1c)[0x420e72]
/lib64/libc.so.6(+0x32920)[0x593a920]
/usr/lib64/libstdc++.so.6(+0x7eb29)[0x51e6b29]
/usr/lib64/libstdc++.so.6(_ZNKSt7num_putIcSt19ostreambuf_iteratorIcSt11char_traitsIcEEE15_M_insert_floatIdEES3_S3_RSt8ios_baseccT_+0xd3)[0x51e8f43]
/usr/lib64/libstdc++.so.6(_ZNKSt7num_putIcSt19ostreambuf_iteratorIcSt11char_traitsIcEEE6do_putES3_RSt8ios_basecd+0x19)[0x51e9269]
/usr/lib64/libstdc++.so.6(_ZNSo9_M_insertIdEERSoT_+0x9f)[0x51fc87f]
./a.out(_ZN5boost6detail26lexical_stream_limited_srcIcSt15basic_streambufIcSt11char_traitsIcEES4_E9lcast_putIfEEbRKT_+0x8f)[0x42c251]
./a.out(_ZN5boost6detail26lexical_stream_limited_srcIcSt15basic_streambufIcSt11char_traitsIcEES4_ElsEf+0x24)[0x42a150]
./a.out(_ZN5boost6detail12lexical_castISsfLb0EcEET_NS_11call_traitsIT0_E10param_typeEPT2_m+0x75)[0x428349]
./a.out(_ZN5boost12lexical_castISsfEET_RKT0_+0x3c)[0x426fbb]
./a.out(This line of code corresponds to the line where boost tries to do the conversion)
and this is with the default stringstream conversion:
./a.out(_Z17clean_exit_on_sigi+0x1c)[0x41deaa]
/lib64/libc.so.6(+0x32920)[0x593a920]
/usr/lib64/libstdc++.so.6(+0x7eb29)[0x51e6b29]
/usr/lib64/libstdc++.so.6(_ZNKSt7num_putIcSt19ostreambuf_iteratorIcSt11char_traitsIcEEE15_M_insert_floatIdEES3_S3_RSt8ios_baseccT_+0xd3)[0x51e8f43]
/usr/lib64/libstdc++.so.6(_ZNKSt7num_putIcSt19ostreambuf_iteratorIcSt11char_traitsIcEEE6do_putES3_RSt8ios_basecd+0x19)[0x51e9269]
/usr/lib64/libstdc++.so.6(_ZNSo9_M_insertIdEERSoT_+0x9f)[0x51fc87f]
./a.out(This line of code corresponds to the line where I try to do the conversion)
a.out is my program, and I run valgrind this way: valgrind --tool=memcheck ./a.out
Another weird thing is that when I call mc->end(); while the program runs fine (Any signal received, Object just finished his work), I don't get segfault in any way (as is and with valgrind).
Please, don't tell me 'Don't close your program with Control+C blah blah...' this piece of code is for logging any error the program possibly have without losing data in case of segfault, killing it because of deadlock or something else.
EDIT: Maybe is a valgrind bug (I don't know, searched on google but found nothing, don't kill me), any workaround will be accepted too.
EDIT2: Just realized that boost calls ostream too (Here is clearer than using vim :-/), going to try sprintf float conversion.
EDIT3: Tried this sprintf(fl, "%.1g", f); but still crashes, backtrace:
./a.out(_Z17clean_exit_on_sigi+0x40)[0x41df24]
/lib64/libc.so.6(+0x32920)[0x593a920]
/lib64/libc.so.6(sprintf+0x56)[0x5956be6]
./a.out(Line where sprintf is)

Ok, after some hours of reading and research, I found the problem, I'm going to answer my own question because noone does, only a comment by #Kerrek SB [ https://stackoverflow.com/users/596781/kerrek-sb ] but I cannot accept a comment. (Thank you)
It's as easy as inside a signal handler you only can call a bunch of functions safely: http://pubs.opengroup.org/onlinepubs/009695399/functions/xsh_chap02_04.html
If you call some non-async-safe functions, they can work, but not always.
If you want to call non-async-safe functions inside a signal handler, you can do this:
Create 2 pipes. int pip1[2]; int pip2[2]; pipe(pip1); pipe(pip2);
Create a new thread and make the thread wait to receive some data from the 1rst pipe read(pip1[0], msg, 1);
When signal handler is called, use write async-safe function to write to the 1rst pipe write(pip1[1], "0", 1);
Then make the signal wait for the second pipe with read(pip2[0], msg, 1);
The thread will wake up and do all the job he has to do (saving data to database in this case), after that, make the thread write data to the second pipe write(pip2[1], "0", 1);
Now main thread will wake up and finish with _Exit(1) or something else.
Info:
I'm using 2 pipes because if I write to a pipe and just after that I read it, it's possible that the 2nd thread never wakes up because the main thread have read the data have just written. And I'm using a secondary pipe to block the main thread because I don't want it to exit while the 2nd thread is saving data.
Keep in mind that signal handler maybe has been called while modifying a shared resource, if your 2nd thread acceses that resource is possible that you encounter a second segfault, so be careful when accesing shared resources with your 2nd thread (Global variables or something else).
If you are testing with valgrind and don't want to receive 'false' memory leaks when receiving a signal you can do this before exiting pthread_join(2ndthread, NULL) and exit(1) instead of _Exit(1). These are non-async-safe functions, but at least you can test memory leaks and close you app with a signal without receiving 'false' memory leaks.
Hope this helps someone. Thanks again #Kerrek SB.

Debuggers and stuff sometimes toss signals to the process that you don't normally get. I had to alter a function that used recv to work under gdb for example. Check to see what your signal is and verify that mc is not null before trying to use it. See if that starts getting you closer to an answer.
I am thinking perhaps your use of new (or something else maybe) is possibly causing valgrind to send a signal that is being caught by your handler before mc is initialized.
It's also clear you didn't paste actual code because your use of 'class' without making the end() function public means this should not compile.

Related

Interprocess communication, reading from multiple children stdout

I'm trying to write a custom shell-like program, where multiple commands can be executed concurrently. For a single command this is not much complicated. However, when I try to concurrently execute multiple commands (each one in a separate child) and capture their stdout I'm having a problem.
What I tried so far is this under my shell application I have two functions to run the commands concurrently, execute() takes multiple commands and for each of the commands it fork() a child process to execute the command, subprocess() takes 1 cmd and executes it.
void execute(std::vector<std::string> cmds) {
int fds[2];
pipe(fds);
std::pair<pid_t, int> sp;
for (int i = 0; i < cmds.size(); i++) {
std::pair<pid_t, int> sp = this->subprocess(cmds[i], fds);
}
// wait for all children
while (wait(NULL) > 0);
close(sp.second);
}
std::pair<pid_t, int> subprocess(std::string &cmd, int *fds) {
std::pair<pid_t, int> process = std::make_pair(fork(), fds[0]);
if (process.first == 0) {
close(fds[0]); // no reading
dup2(fds[1], STDIN_FILENO);
close(fds[1]);
char *argv[] = {"/bin/sh", "-c", cmd.data(), NULL};
execvp(argv[0], argv);
exit(0);
}
close(fds[1]); // only reading
return process;
}
The problem here is, when I execute multiple commands on my custom shell (not diving into spesifics here, but it will call execute() at some point.) if I use STDIN_FILENO as above to capture child process stdout, it keeps writing to shell's stdin forever what the captured output is, for example
if the input commands are
echo im done, yet?
echo nope
echo maybe
then, in writing to STDIN_FILENO case, the output is like (where >>> ) is my marker for user input.
im done, yet?
nope
maybe
>>> nope
maybe
im done, yet?
>>> im done, yet?
nope
maybe
in writing to STDOUT_FILENO case, it seems it's ignoring one of the commands (probably the first child), I'm not sure why?
maybe
nope
>>> maybe
nope
>>> nope
maybe
>>> maybe
nope
>>> nope
So, potential things I thought are in my shell I'm using std::cin >> ... for user input in a while loop ofc, this may somehow conflict with stdin case. On the other hand, in the main process (parent) I'm waiting for all children to exit, so children somehow is not exiting, but child should die off after execvp, right ? Moreover, I close the reading end in the main process close(sp.second). At this point, I'm not sure why this case happens ?
Should I not use pipe() for a process like this ? If I use a temp file to redirect stdout of child process, would everything be fine ? and if so, can you please explain why ?
There are multiple, fundamental, conceptual problems in the shown code.
std::pair<pid_t, int> sp;
This declares a new std::pair object. So far so good.
std::pair<pid_t, int> sp = this->subprocess(cmds[i], fds);
This declares a new std::pair object inside the for loop. It just happens to have the same name as the sp object at the function scope. But it's a different object that has nothing to do, whatsoever, with it. That's how C++ works: when you declare an object inside an inner scope, inside an if statement, a for loop, or anything that's stuffed inside another pair of { ... } you end up declaring a new object. Whether its name happens to be the same as another name that's been declared in a larger scope, it's immaterial. It's a new object.
// wait for all children
while (wait(NULL) > 0);
close(sp.second);
There are two separate problems here.
For starters, if we've been paying attention: this sp object has not been initialized to anything.
If the goal here is to read from the children, that part is completely missing, and that should be done before waiting for the child processes to exit. If, as the described goal is here, the child processes are going to be writing to this pipe the pipe should be read from. Otherwise if nothing is being read from the pipe: the pipe's internal buffer is limited, and if the child processes fill up the pipe they'll be blocked, waiting for the pipe to be read from. But the parent process is waiting for the child processes to exist, so everything will hang.
Finally, it is also unclear why the pipe's file descriptor is getting passed to the same function, only to return a std::pair with the same file descriptor. The std::pair serves no useful purpose in the shown code, so it's likely that there's also more code that's not shown here, where this is put to use.
At least all of the above problems must be fixed in order for the shown code to work correctly. If there's other code that's not shown, it may or may not have additional issues, as well.

C++ What exactly does exit() do when using multiple threads?

I'm working on some code that uses an existing code-base which is now a DLL. What I'm trying to do it to terminate all the threads that are were started, but keep my main program running.
This is the basic structure of the code:
void Mainprogram()
{
tempProcessingThreadHandle = (HMODULE)_beginthread(SomeDLLEntry, 0, (void)&Params); //SomeDLLEntry is a function in some.dll
//Other Code I Want to Run
}
In the DLL:
void SomeDLLEntry()
{
tempProcessingThreadHandle = (HMODULE)_beginthread(SomeOtherDLLThing1, 0, (void)&Params);
tempProcessingThreadHandle = (HMODULE)_beginthread(SomeOtherDLLThing2, 0, (void)&Params);
if (someCondition)
return;
}
void SomeOtherDLLThing1()
{
if (someOtherCondition)
exit(1);
}
I thought that returning from SomeDLLEntry() would cause the threads started in the DLL (SomeOtherDLLThing1 and 2) to terminate as well, but that's not the case, as seen in the debugger; SomeDLLEntry() thread would disappear, but the others are still running.
Now, if I set someOtherCondition to true, and exit(1) is called from SomeOtherDLLThing1(), what should happen? When I debug over this, the debugger seems to crash, but it appears to go past the exit(1) line and give me:
Unhandled exception at 0x6B4E87CD (Mso20win32client.dll) in Mainprogram.exe: 0xC0000005: Access violation reading location 0x0000018C.
Is this because the whole process (including Mainprogram()) has been terminated? What exactly does calling exit(1) in SomeOtherDLLThing1() do? How can I properly terminate all of the DLL-related threads and continue with my Mainprogram()?
The only way to safely terminate threads and continue to execute is to coordinate with those threads, have them cleanly finish, then join the thread handles.
There is no free lunch.
Other choices are "hack, force halt of threads, and pray you get lucky" or "do your work in another process, and pray summary shutdown causes no problems".

Interrupt running program and save data

How to design a C/C++ program so that it can save some data after receiving interrupt signal.
I have a long running program that I might need to kill (say, by pressing Ctrl-C) before it finished running. When killed (as opposed to running to conclusion) the program should be able to save some variables to disk. I have several big Linux books, but not very sure where to start. A cookbook recipe would be very helpful.
Thank you.!
to do that, you need to make your program watch something, for example a global variable, that will tell him to stop what it is doing.
For example, supposing your long-running program execute a loop, you can do that :
g_shouldAbort = 0;
while(!finished)
{
// (do some computing)
if (g_shouldAbort)
{
// save variables and stuff
break; // exit the loop
}
}
with g_shouldAbort defined as a global volatile variable, like that :
static volatile int g_shouldAbort = 0;
(It is very important to declare it "volatile", or else the compiler, seeing that no one write it in the loop, may consider that if (g_shouldAbort) will always be false and optimize it away.)
then, using for example the signal API that other users suggested, you can do that :
void signal_handler(int sig_code)
{
if (sig_code == SIGUSR1) // user-defined signal 1
g_shouldAbort = 1;
}
(you need to register this handler of course, cf. here.
signal(SIGUSR, signal_handler);
Then, when you "send" the SIGUSR1 signal to your program (with the kill command for example), g_shouldAbort will be set to 1 and your program will stop its computing.
Hope this help !
NOTE : this technique is easy but crude. Using signals and global variables makes it difficult to use multiple threads of course, as other users have outlined.
What you want to do isn't trivial. You can start by installing a signal handler for SIGINT (C-c) using signal or sigaction but then the hard part starts.
The main problem is that in a signal handler you can only call async-signal-safe functions (or reentrant functions). Most library function can't be reliably considered reentrant. For instance, stdio functions, malloc, free and many others aren't reentrant.
So how do you handle this ? Set a flag in you handler (set some global variable done to 1) and look out for EINTR errors. It should be safe to do the cleanup outside the handler.
What you are trying to do falls under the rubric of checkpoint/restart.
There's several big problems with using a signal-driven scheme for checkpoint/restart. One is that signal handlers have to be very compact and very primitive. You cannot write the checkpoint inside your signal handler. Another problem is that your program can be anywhere in its execution state when the signal is sent. That random location almost certainly is not a safe point from which a checkpoint can be dropped. Yet another problem is that you need to outfit your program with some application-side checkpoint/restart capability.
Rather than rolling your own checkpoint/restart capability, I suggest you look into using a free one that already exists. gdb on linux provides a checkpoint/restart capability. Another is DMTCP, see http://dmtcp.sourceforge.net/index.html .
Use signal(2) or sigaction(2) to assign a function pointer to the SIGINT signal, and do your cleanups there.
Make your you enter only once in your save function
// somewhere in main
signal( SIGTERM, signalHandler );
signal( SIGINT, signalHandler );
void saveMyData()
{
// save some data here
}
void signalHandler( int signalNumber )
{
static pthread_once_t semaphore = PTHREAD_ONCE_INIT;
std::cout << "signal " << signalNumber << " received." << std::endl;
pthread_once( & semaphore, saveMyData );
}
If your process get 2 or more signals before you finish writing your file you'll save weird data

Make main() "uncrashable"

I want to program a daemon-manager that takes care that all daemons are running, like so (simplified pseudocode):
void watchMe(filename)
{
while (true)
{
system(filename); //freezes as long as filename runs
//oh, filename must be crashed. Nevermind, will be restarted
}
}
int main()
{
_beginThread(watchMe, "foo.exe");
_beginThread(watchMe, "bar.exe");
}
This part is already working - but now I am facing the problem that when an observed application - say foo.exe - crashes, the corresponding system-call freezes until I confirm this beautiful message box:
This makes the daemon useless.
What I think might be a solution is to make the main() of the observed programs (which I control) "uncrashable" so they are shutting down gracefully without showing this ugly message box.
Like so:
try
{
char *p = NULL;
*p = 123; //nice null pointer exception
}
catch (...)
{
cout << "Caught Exception. Terminating gracefully" << endl;
return 0;
}
But this doesn't work as it still produces this error message:
("Untreated exception ... Write access violation ...")
I've tried SetUnhandledExceptionFilter and all other stuff, but without effect.
Any help would be highly appreciated.
Greets
This seems more like a SEH exception than a C++ exception, and needs to be handled differently, try the following code:
__try
{
char *p = NULL;
*p = 123; //nice null pointer exception
}
__except(GetExceptionCode() == EXCEPTION_ACCESS_VIOLATION ?
EXCEPTION_EXECUTE_HANDLER : EXCEPTION_CONTINUE_SEARCH)
{
cout << "Caught Exception. Terminating gracefully" << endl;
return 0;
}
But thats a remedy and not a cure, you might have better luck running the processes within a sandbox.
You can change the /EHsc to /EHa flag in your compiler command line (Properties/ C/C++ / Code Generation/ Enable C++ exceptions).
See this for a similar question on SO.
You can run the watched process a-synchronously, and use kernel objects to communicate with it. For instance, you can:
Create a named event.
Start the target process.
Wait on the created event
In the target process, when the crash is encountered, open the named event, and set it.
This way, your monitor will continue to run as soon as the crash is encountered in the watched process, even if the watched process has not ended yet.
BTW, you might be able to control the appearance of the first error message using drwtsn32 (or whatever is used in Win7), and I'm not sure, but the second error message might only appear in debug builds. Building in release mode might make it easier for you, though the most important thing, IMHO, is solving the cause of the crashes in the first place - which will be easier in debug builds.
I did this a long time ago (in the 90s, on NT4). I don't expect the principles to have changed.
The basic approach is once you have started the process to inject a DLL that duplicates the functionality of UnhandledExceptionFilter() from KERNEL32.DLL. Rummaging around my old code, I see that I patched GetProcAddress, LoadLibraryA, LoadLibraryW, LoadLibraryExA, LoadLibraryExW and UnhandledExceptionFilter.
The hooking of the LoadLibrary* functions dealt with making sure the patching was present for all modules. The revised GetProcAddress had provide addresses of the patched versions of the functions rather than the KERNEL32.DLL versions.
And, of course, the UnhandledExceptionFilter() replacement does what you want. For example, start a just in time debugger to take a process dump (core dumps are implemented in user mode on NT and successors) and then kill the process.
My implementation had the patched functions implemented with __declspec(naked), and dealt with all the registered by hand because the compiler can destroy the contents of some registers that callers from assembly might not expect to be destroyed.
Of course there was a bunch more detail, but that is the essential outline.

Deleting And Reconstructing Singleton in C++

I have an application which runs on a controlling hardware connected with different sensors. On loading the application, it checks the individual sensors one by one to see whether there is proper communication with the sensor according to predefined protocol or not.
Now, I have implemented the code for checking the individual sensor communication as a singleton thread and following is the run function, it used select system call and pipe for interprocess communication to signal the end of thread.
void SensorClass::run()
{
mFdWind=mPort->GetFileDescriptor();
fd_set readfs;
int max_fd = (mFdWind > gPipeFdWind[0] ? mFdWind : gPipeFdWind[0]) + 1;
int res;
mFrameCorrect=false;
qDebug("BEFORE WHILE");
while(true)
{
qDebug("\n IN WHILE LOOP");
usleep(50);
FD_ZERO(&readfs);
FD_SET(mFdWind,&readfs);
FD_SET(gPipeFdWind[0],&readfs);
res=select(max_fd,&readfs,NULL,NULL,NULL);
if(res < 0)
perror("Select Failed");
else if(res == 0)
puts("TIMEOUT");
else
{
if(FD_ISSET(mFdWind,&readfs))
{
puts("*************** RECEIVED DATA ****************");
mFrameCorrect=false;
FlushBuf();
//int n=mPort->ReadPort(mBuf,100);
int n=mPort->ReadPort(mBuf,100);
if(n>0)
{
Count++;
QString str((const char*)mBuf);
//qDebug("\n %s",qPrintable(str));
//See if the Header of the frame is valid
if(IsHeaderValid(str))
{
if( (!IsCommaCountOk(str)) || (!IsChecksumOk(str,mBuf)) || (!CalculateCommaIndexes(str)) )
{
qDebug("\n not ok");
mFrameCorrect=false;
} //if frame is incorrect
else
{
qDebug("\n OK");
mFrameCorrect=true;
}//if frame is correct(checksum etc are ok)
}//else if header is ok
}//if n > 0
}//if data received FD_ISSET
if(FD_ISSET(gPipeFdWind[0],&readfs))
break;
}//end nested else res not <= 0
}//infinite loop
}
The above thread is run started from the main GUI thread. This runs fine. The problem is I have given an option to the user to retest the subsystem at will. For this I delete the singleton instance using
delete SensorClass::instance();
and then restart the singleton using
SensorClass::instace()->start();
The problem is this time the control comes out of while loop in run() function immedeately upon entering the while loop, my guess is the pipe read has again read from the write pipe which was written to the last time. I have tried to use the fflush() to clear out the I/O but no luck.
My question is
Am I thinking on the right track?
If yes then how do we clear out the pipes?
If not can anyone suggest why is the selective retest not working?
Thanks in advance..
fflush clears the output buffer. If you want to clear the input buffer, you're going to need to read the data or seek to the end.
I'm not convinced the "Singleton" pattern is appropriate. There are other ways of ensuring at most one instance for each piece of hardware. What if you later want multiple threads, each working with a different sensor?
Let's assume that you're creating this thread by inheriting from QThread (which you don't specify). From the documentation of QThread::~QThread ():
Note that deleting a QThread object will not stop the execution of the thread it represents. Deleting a running QThread (i.e. isFinished() returns false) will probably result in a program crash.
So the statement delete SensorClass::instance(); is probably a really, really bad idea. In particular, it's going to be tough making any sense of this program's behavior given this flaw. Before continuing, you might want to find a way to remove the instance and ensure that the thread goes away, too.
Another problem comes to mind. When you run delete SensorClass::instance(), you get rid of some object (on the heap, one hopes). Who tells the singleton holder that its object is gone? E.g. so that the next call to SensorClass::instance() knows it needs to allocate another instance? Is this handled properly in SensorClass::~SensorClass?
Suppose that's not a problem. That likely means that the pointer to the instance is held in a global variable (or, e.g. a class level static member). It probably doesn't matter for this situation, but is access to that member properly synchronized? I.e. is there a mutex that's locked for each access to it?
You really don't want to run your initialization in thread. That is issue number one that dramatically complicates your problem and which is the kind of thing for some reason no one points out.
Just make the initialization its own function, then have a guard variable and lock, and have everything that uses it separately initialize it when they start up.
So you're signaling by writing something to the pipe, and the pipe is only created once - i.e. reused in the later threads?
Read the signaling away from the pipe. Assuming you signal by writing a single byte, then instead of just breaking out, you'd do something like (NB, no error checking etc below):
if(FD_ISSET(gPipeFdWind[0],&readfs)) {
char c;
read(gPipeFdWind[0], &c, 1);
break;
}
There are also Qt classes for handling socket I/O, e.g. QTcpSocket, which would make the code not only cleaner, also more cross-platform. Or at least QSocketNotifier to abstract the select away.