Undebuggable non-deterministic heisenbug in single-threaded C++ function call

I'm at the end of my rope here: I have a single-threaded C++ program. Here is some empirical data and background information; I tried to highlight the most important keywords:
The entire section I'm talking about does not have any syscalls, other than the memory (de-)allocation calls the standard C++ library may perform (std::sets are involved). It's a purely logical algorithm.
The behaviour of this should be deterministic, depending on the input, which I do not vary.
If the bug manifests itself, the program simply falls into what looks like an endless loop in which it seems to allocate memory without bound.
The bug does not manifest itself predictably: when I run the program from the command line, it shows up sometimes (perhaps 30%-50% of the time); otherwise, everything runs smoothly and correctly as far as I can tell.
Once I run the program not directly from the prompt but under gdb or valgrind, the bug is gone; the program never dies.
Now comes the best part: I traced the problem to a (templated) non-virtual member function call. Just before the call, I print a message to std::cout, which I can see in the terminal. The first line inside the function also has a debug message, which is never shown.
I don't see any reasonable explanation any more. Maybe you can come up with an idea how to proceed.
Edit: Here are the significant lines of code. I changed the line numbers so we can refer to them and omitted irrelevant parts, so not everything may make perfect sense.
a.cpp
10 std::set<Array const*>* symbols;
11 std::set<Array const*> allSymbols;
12 symbols = &allSymbols;
// ... allSymbols are populated with std::inserter
15 std::cout << "eval; cd = " << &cd << ", cg = " << &cd.cg << std::endl;
16 senderConstraints = cd.cg.eval(*symbols);
b.cpp
31 template <typename ArrayContainer>
32 ConstraintList eval(ArrayContainer const request) {
33 std::cout << "inside eval ... going to update graph now" << std::endl;
The last line of output is:
eval; cd = 0x2e6ebb0, cg = 0x2e6ebc0
Then it's trapped in the endless loop.

I bet the second line is printed when you change
ConstraintList eval(ArrayContainer const request)
to
ConstraintList eval(ArrayContainer const & request)
If so, either the state of allSymbols is corrupted between line 12 and line 15, or your code really looks more like this:
std::set<Array const*>* symbols;
{
std::set<Array const*> allSymbols;
symbols = &allSymbols;
// ... allSymbols are populated with std::inserter
}
std::cout << "eval; cd = " << &cd << ", cg = " << &cd.cg << std::endl;
senderConstraints = cd.cg.eval(*symbols);
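This is undefined behavior (UB), because symbols points to an object that has already been destroyed. Since eval takes its argument by value, the set is copied before the first statement of the function body runs; copying a corrupted set can loop forever while allocating memory, which would explain why the debug message inside eval never appears.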
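To illustrate why the parameter type matters here, a minimal sketch follows. Tracked, evalByValue and evalByRef are hypothetical stand-ins, not the asker's code. The sketch only shows that a by-value argument is copied before the function body starts, while a const reference causes no copy; that would be consistent with the first debug message inside eval never being reached.

```cpp
// Hypothetical stand-in for the container: counts how often it is copied.
struct Tracked {
    static int copies;
    Tracked() = default;
    Tracked(Tracked const&) { ++copies; }
};
int Tracked::copies = 0;

// By value: the copy constructor runs before the first statement of the body.
int evalByValue(Tracked const request) {
    (void)request;
    return Tracked::copies;  // number of copies already made on entry
}

// By const reference: no copy is made, so the source object is not read here.
int evalByRef(Tracked const& request) {
    (void)request;
    return Tracked::copies;
}
```

With a real std::set, that copy walks the whole tree, so a set whose internals are garbage can hang right there, before any code in the callee runs.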

Related

I don't understand how my compiler got this output

Note that I am using a Turbo C++ compiler because we are supposed to learn only Turbo C++ for our school syllabus. That is why the cout statement is evaluated from right to left in this case.
Program
#include <iostream.h>
#include <string.h>
void func(char *s, char t[]) {
strcpy(t, "Have fun");
s = "Be\0Cool";
cout << s[0] << ++s << s++ << --s << strupr(s+2) << ++s << s++ << s;
}
int main() {
char x[] = "Hello World!!!", y[] = "Hello World";
func(x, y);
cout << x << y;
return 0;
}
Output
CCOOLeeOOLBeBeHello World!!!Have fun
I feel the output should be:
CCooleeOOLBeBeHello World!!!Have fun
Because in the ++s part of the cout statement (second position), the pointer is at index 3 of the string s, so only 'Cool' should be printed. Instead 'COOL' is being printed. Why does this happen?

Tests with Visual Studio 2019
For comparison, in Visual Studio 2019 (Debug), if we make the changes required to compile the code, the program crashes because we try to modify a constant string ("Be\0Cool").
If we make an additional change to avoid the crash (by using a local array), the output is:
CCoOLeCoOLOLCoOLBeCoOLHello World!!!Have fun
If we split the cout << s[]…; line into multiple calls to cout (one before each <<), then the output is:
BeeeCOOLCOOLHello World!!!Have fun
Or, if we add a line break after each output, we get:
B
e
e
e
COOL
COOL
Hello World!!!
Have fun
Trying to understand the output of Turbo C++
If we then reverse the order of the calls to cout, starting with the last (i.e. cout << s << endl;) and ending with the first (cout << s[0] << endl;), we get:
Be
Be
OOL
e
e
COOL
C
Hello World!!!
Have fun
If we write these out without spaces, starting from the third-to-last line and moving up, followed by the last two lines in order, we get:
CCOOLeeOOLBeBeHello World!!!Have fun
Which is exactly what you got as an output.
Thus, it appears that Turbo C++ evaluates every expression from right to left.
Notes about required changes to compile (and run)
<iostream.h> is not available, so I had to use <iostream> instead.
#define _CRT_SECURE_NO_WARNINGS must be added at the top because some functions are considered insecure (and won't compile by default).
using namespace std; is added to avoid making more changes to the code.
Add a cast in s = (char *)"Be\0Cool"; so that the line compiles.
This leads to a crash because the data is constant and we try to modify it.
Remove the cast and instead write char data[] = "Be\0Cool"; s = data;
The program runs, but the output is CCoOLeCoOLOLCoOLBeCoOLHello World!!!Have fun
In fact, this is undefined behavior. It just happens to be the actual output.
Undefined behavior
Some things are not defined by the standard and thus are not required to work in a particular way. For example, if read-only memory is not supported, a string literal simply behaves like read-write memory.
For the order of evaluation, common possibilities are:
left to right
right to left
whatever is more optimal
Also, since a variable is modified more than once, the value of s is not defined during the evaluation or afterward. An easy rule to remember: avoid modifying the same variable more than once in a single expression.
About strupr
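That function modifies the string in place, up to the terminating null character. In your case, it converts to uppercase every letter starting from whatever value s has at the moment of the call. With right-to-left evaluation, strupr(s+2) runs before anything is printed, so every operand that later prints from that part of the buffer shows the uppercase letters; that is why you see COOL rather than Cool.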
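The rule about not modifying s several times in one expression can be sketched as follows. This is only an illustration: upperInPlace is a hypothetical portable replacement for the non-standard strupr, and printInSourceOrder collects the output in a string instead of writing to cout. With one statement per operand, each side effect on s is complete before the next statement starts, so the result no longer depends on the compiler's evaluation order.

```cpp
#include <cctype>
#include <sstream>
#include <string>

// Hypothetical stand-in for the non-standard strupr(): uppercases in place
// up to the terminating null character and returns its argument.
char* upperInPlace(char* p) {
    for (char* q = p; *q != '\0'; ++q) {
        *q = static_cast<char>(std::toupper(static_cast<unsigned char>(*q)));
    }
    return p;
}

// One statement per operand, in the order they appear in the source.
std::string printInSourceOrder() {
    char data[] = "Be\0Cool";  // writable array instead of a string literal
    char* s = data;
    std::ostringstream out;
    out << s[0];
    out << ++s;
    out << s++;
    out << --s;
    out << upperInPlace(s + 2);
    out << ++s;
    out << s++;
    out << s;
    return out.str();
}
```

This returns BeeeCOOLCOOL, the same text as the split-statement Visual Studio run above.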

R check doesn't like std::cout (C++)

I'm trying to submit a package to CRAN which contains C++ code (I have no clue about C++, the cpp files were written by somebody else).
The R check complains about ‘std::cout’ (C++)
Compiled code should not call entry points which might terminate R nor
write to stdout/stderr instead of to the console, nor the C RNG
I found in the code the following command:
integrate_const(stepper_type( default_error_checker< double >( abs_error , rel_error ) ),
mDifEqn,
x,
0.0,
(precipitationLength * timeStep),
timeStep,
streaming_observer(std::cout) );
I guess R (CRAN) expects something else rather than std::cout... but what?

Your C++ project may well be using standard input and output.
The issue, as discussed in the Writing R Extensions manual, is that you then end up mixing two output systems: R's, and the C++ one.
So you are "encouraged" to replace all uses of, say,
std::cout << "The value of foo is " << foo << std::endl;
with something like
Rprintf("The value of foo is %f\n", foo);
so that your output gets blended properly with R's. In one of my (non-Rcpp) packages I had to do a lot of tedious patching for that...
Now, as mentioned in a comment by @vasicbre and an answer by @Dason, if you use Rcpp you can simply do
Rcpp::Rcout << "The value of foo is " << foo << std::endl;
If you already use Rcpp this is pretty easy, otherwise you need to decide if that makes it worth adding Rcpp...
edit: fixed typo in Rcpp::Rcout.

If you want to stream to R's buffered output you'll want to use Rcpp::Rcout instead of std::cout.
For more details you can read this article by one of Rcpp's authors: http://dirk.eddelbuettel.com/blog/2012/02/18/

Easier-to-type alternative to std::cout for printing to screen in C++

Often I just want to quickly check the contents of a series of variables (let's call them a,b,c,d and e, and suppose they're a mixture of floats, integers and strings). I'm fed up typing
cout << a << " " << b << " " << " " << c << " " << " " << d << " " << e << endl;
Is there a more convenient (less key-strokes) way to quickly dump a few variables to stdout in C++? Or do C++ people just always define their own simple print function or something? Obviously something like
printf("%d %f %s %d %s\n",a,b,c,d,e);
is not the alternative I'm looking for, but rather something like
print a,b,c,d,e
Even
print*, a,b,c,d,e
or
write(*,*) a,b,c,d,e
isn't too inconvenient to type.
Of course, googling 'quickly print to screen in C++' keeps just sending me back to std::cout.

Is this what you want?
print(a, b, c);
It could be implemented like this:
template <typename T>
void print(T t)
{
std::cout << t << " ";
}
template<typename T, typename... Args>
void print(T t, Args... args)
{
std::cout << t << " ";
print(args...) ;
}

It's easy to create a "print" class which has an overloaded template operator, (the comma operator); then you could write something like
print(),a,b,c,d;
The expression print() would create a temporary instance of the print class, and then use that temporary instance for the printing with the comma operator. The temporary instance would be destroyed at the end of the expression (after last comma overload is called).
The implementation could look something like this:
struct print
{
template<typename T>
print& operator,(const T& v)
{
std::cout << v;
return *this;
}
};
Note: This is just off the top of my head, without any testing.

Is there a more convenient (less key-strokes) way to quickly dump a
few variables to stdout in C++?
I would have to say No, not in the language. But I do not consider std::cout a challenging amount to type.
You can try out the template methods provided in the other answers.
But you should try GDB (or some debugger available on your system). GDB can 'dump' automatic variables with no effort at all: the info locals command prints them, and front ends that offer a "Locals" window keep the automatic variables of the current stack frame up to date there.
Or do C++ people just always define their own simple print function or something?
No, or maybe something.
I use std::cout and std::cerr (as defined) for lots of debugging, but not in the 'how can I save the most typing' frame of mind.
My view is that creating a 'convenience' (i.e. not required) function is appropriate for doing something you wish to repeat. My rule of thumb is 3 times ... if I do a particular something 3 (or more) times (like generate a std::cout statement with the same or similar variables in it) then I might write a function (rather than copy the line) for that repeated effort.
Typically, I use one of two (what I call) disposable debug methods ... and most of my objects also have both show() and dump(), and there can be multiple show/dump functions or methods, each with different signatures, and default values.
if(dbg1) show(a,b,c,d,e);
if(dbg1b) show(b);
// etc
and
if(dbg2) dump(a,b,c,d,e);
Show typically uses and does what std::cout provides, and little else.
Dump does what show does, but also might provide an alternate view of the data, either hex or binary translations of the values, or perhaps tables. Whatever helps.
Disposable does not mean I will dispose of them, but rather I might, and I often get tired of output that does not change, so I set dbgX to false when this code seems to be working, at least until I decide to dispose of the debug invocation.
But then, you do have to implement each of the functions and methods, and yes, you are going to have to learn to type.
If these variables are automatic, you should know that the debugger GDB automatically displays them in a window called "Locals", and keeps them up-to-date during single step.
In GDB, object instance contents can often be displayed with "p *obj", and there are ways to add a particular obj name to the local display window.
It does not take a lot to run GDB. If you object to creating the 80 char std::cout code above, it takes far less typing to launch GDB, set break in main, and run the simple task under gdb control, (then single step to observe these variables at any step in your code) not just where you happened to insert a show() or dump() command.
And if you have GDB, you can also command to print using "p show()" (when the show() function is in scope) to see what the in-scope variables look like to std::cout (if you don't believe the "Locals" window).
GDB allows you to "p this->show()" when stepping through a method of the instance, or "p myObj->show()" when the myObj is accessible.
Also, "p *this" and "p *myObj" will provide a default, typically useful, display of the current contents of your object.
Anyway. Yes you can always work hard to shorten your typing effort.

3D vector indices inconsistency C++

I was playing around with vectors of vectors in C++. In my case what I call a 3D-vector is shown in the following code
#include <iostream>
#include <vector>
typedef std::vector<double> RandomSample;
typedef std::vector<RandomSample> TimeSample;
typedef std::vector<TimeSample> Option;
int main(int argc, const char * argv[])
{
unsigned int numberOfOptions = 3;
unsigned int timeNodes = 7;
unsigned int numberOfRandSamples = 10;
Option options(numberOfOptions, TimeSample(timeNodes, RandomSample(numberOfRandSamples)));
std::cout << options[0][0][0] << std::endl;
//std::cout << options[3][6][9] << std::endl; //SEGMENTATION FAULT
//std::cout << options[2][7][9] << std::endl; //SEGMENTATION FAULT
std::cout << options[2][6][20] << std::endl; //NO ERROR !!
std::cout << "Hola Mundo !" << std::endl;
return 0;
}
The code itself shows the problem: when accessing beyond the vector bounds with the first or second index I get the expected runtime error, but when doing the same with the third index nothing happens, no error at all. I've even tried big numbers in the third index and everything, apparently, still works fine. What am I missing, or what is going on with this code?
I'm developing on Mac OS X 10.8.4 + Xcode 4.6.3

If you're expecting a runtime error when accessing a vector outside bounds with operator[] then it's your expectation that is wrong.
When you make that kind of mistake the C++ standard says that it's "undefined behaviour", not "runtime error".
In C++, for performance reasons, there are very few runtime error angels (i.e. checks that you're not doing something wrong at runtime) so unless you specifically request them (e.g. using std::vector::at() instead of std::vector::operator[]) or unless you implement them yourself no check will be done and whatever happens happens.
Sometimes when you make this kind of mistake you get an immediate crash, but that happens only when you're very lucky. In most cases you instead end up corrupting data that belongs to some other object or to the runtime library, and one million instructions later a perfectly innocent part of the program starts behaving like crazy.
Murphy says that you will only get a crash if you're giving a demonstration of your software in front of potential investors and your family. Until that point everything will seem to work perfectly even if you overwrite memory that wasn't yours.
The main philosophy of C++ is that programmers never make this kind of error ;-)
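For completeness, here is a minimal sketch of the checked access mentioned above, using the typedefs from the question. tryRead is a hypothetical helper name; it relies only on std::vector::at(), which throws std::out_of_range for an invalid index at any of the three levels.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

typedef std::vector<double> RandomSample;
typedef std::vector<RandomSample> TimeSample;
typedef std::vector<TimeSample> Option;

// Returns true and stores the element in value if all three indices are
// valid; returns false if at() rejects any of them.
bool tryRead(Option const& options, std::size_t i, std::size_t j,
             std::size_t k, double& value) {
    try {
        value = options.at(i).at(j).at(k);  // at() throws std::out_of_range
        return true;
    } catch (std::out_of_range const&) {
        return false;
    }
}
```

With the 3 x 7 x 10 container from the question, the indices (3, 6, 9), (2, 7, 9) and (2, 6, 20) are all rejected, including the one that appeared to work with operator[].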

What are some possible causes of program crashing when returning a value?

I have a bunch of code roughly equivalent to this:
bool test(double e, short a, short b, short c) {
// Things being calculated here...
cout << "debug_3" << endl;
return (1 - abs(cos_th)) < (1 - cos(e));
}
int main() {
// something...
cout << "debug_0" << endl;
if(test(e,1,2,0)) {
cout << "debug_4" << endl;
// Bunch of useful operations...
}
// something...
}
Running the code generates the output:
debug_3
After which the program crashes (displaying "The program has stopped working..." in Windows). I have never encountered crashing at value return and I don't know what causes it or how I could fix it. Any thoughts on the issue?
EDIT: Some more info:
In my builds I also verify that the values of cos_th and e are valid.
People seem to point to the second something as the source of problems but my problem seems resolved (i.e. no crashes) when I get rid of the if-statement with a call to test()...

The only things we can fix without knowing more about the system are to change the type of a, b and c to unsigned short, since they are just array indexes, and to make sure they are within the array bounds. You might also need to make sure the following is not zero, since you divide by the result:
sqrt((Xca*Xca+Yca*Yca+Zca*Zca)*(Xba*Xba+Yba*Yba+Zba*Zba))
Use cerr instead of cout so that the output is flushed immediately; that way you can be sure debug_4 is really never reached.
Put more output inside an else condition or after the if: maybe the function returns false?
If you can't locate the error precisely, use a debugger.
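The zero-divisor check suggested above could look roughly like this. This is a sketch under assumptions: the original test() body is not shown, so cosineOfAngle is a hypothetical helper, and the numerator is assumed to be the dot product of the two vectors; only the denominator comes from the expression quoted above.

```cpp
#include <cmath>

// Hypothetical helper: cosine of the angle between (Xba, Yba, Zba) and
// (Xca, Yca, Zca). Returns false instead of dividing when the denominator
// is zero, i.e. when one of the vectors has zero length.
bool cosineOfAngle(double Xba, double Yba, double Zba,
                   double Xca, double Yca, double Zca, double& cos_th) {
    double const denom = std::sqrt((Xca*Xca + Yca*Yca + Zca*Zca) *
                                   (Xba*Xba + Yba*Yba + Zba*Zba));
    if (denom == 0.0) {
        return false;  // caller must handle the degenerate case
    }
    cos_th = (Xba*Xca + Yba*Yca + Zba*Zca) / denom;  // assumed numerator
    return true;
}
```

The caller can then log or skip the degenerate case explicitly instead of comparing against a NaN or infinity.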

A crash at return usually means that your function overwrites the stack (and thus the return address) and your program jumps to nowhere. You can verify this by stepping instruction by instruction at the disassembly level.
