What could cause initialization order to corrupt the stack? - c++

Question is in bold below :
This works fine:
void process_batch(
string_vector & v
)
{
training_entry te;
entry_vector sv;
assert(sv.size() == 0);
...
}
However, this causes the assert to fail :
void process_batch(
string_vector & v
)
{
entry_vector sv;
training_entry te;
assert(sv.size() == 0);
...
}
Now I know this issue isn't shrink-wrapped, so I'll restrict my question to this: what conditions could cause such a problem? Specifically: variable initialization getting damaged depending on the order of appearance in the stack frame. There are no mallocs or frees in my code, and no unsafe functions like strcpy, memcpy, etc. It's modern C++. Compilers used: gcc and clang.
For brevity, here are the types:
struct line_string
{
boost::uint32_t line_no;
std::string line;
};
typedef std::vector<boost::uint32_t> line_vector;
typedef std::vector<line_vector> entry_vector;
typedef std::vector<line_string> string_vector;
struct training_body
{
boost::uint32_t url_id;
bool relevant;
};
struct training_entry
{
boost::uint32_t session_id;
boost::uint32_t region_id;
std::vector< training_body> urls;
};
P.S. I am in no way saying that there is an issue in the compiler; it's probably my code. But since I am templatizing some code I wrote a long time ago, the issue has me completely stumped, and I don't know where to look to find the problem.
Edit:
I followed nim's suggestion and went through the following loop:
Shrink-wrap the code to what I have shown here, compile and test: no problem.
Use #if 0 / #endif to shrink-wrap the main program.
Remove headers until it compiles in shrink-wrapped form.
Remove library links until it compiles in shrink-wrapped form.
Solution: removing the link to Protocol Buffers gets rid of the problem.

The C++ standard guarantees that the following assertion will succeed:
std::vector<anything> Default;
//in your case anything is line_vector and Default is sv
assert(Default.size() == 0);
So, either you're not telling the whole story or you have a broken STL implementation.
OR: You have undefined behavior in your code. The C++ standard gives no guarantees about the behavior of a program which has a construct leading to UB, even prior to reaching that construct.

The usual cause for this is that one of the created objects writes beyond its end in the constructor. And the most frequent reason this happens in code I've seen is that object files have been compiled with different versions of the header; e.g. at some point in time, you added (or removed) a data member of one of the classes, and didn't recompile all of the files which use it.
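To make the stale-header scenario concrete, here is a minimal sketch, assuming two versions of the question's training_body struct. The namespaces old_header and new_header, the extra timestamp member, and the helper names are all hypothetical; they only stand in for "the header before and after someone added a member". The idea is that the library side reports the size it was compiled with, and the caller compares that against its own view.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical: the header as the caller's stale object file saw it.
namespace old_header {
struct training_body {
    std::uint32_t url_id;
    bool relevant;
};
}

// Hypothetical: the header after a data member was added.
namespace new_header {
struct training_body {
    std::uint32_t url_id;
    bool relevant;
    std::uint64_t timestamp;  // assumed new member, not in the original post
};
}

// Stands in for code compiled against the new header (the "library" side).
std::size_t library_sizeof_training_body() {
    return sizeof(new_header::training_body);
}

// The caller passes its own view of the type; a mismatch means the two sides
// disagree about the layout and one of them needs recompiling.
template <class T>
bool layout_matches() {
    return sizeof(T) == library_sizeof_training_body();
}
```

A check like this only catches size changes, not reordered members, so a full rebuild is still the more reliable cure.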

What might cause the sort of problem you see is a user-defined type with a misbehaving constructor:
class BrokenType {
public:
int i;
BrokenType() { this[1].i = 9999; } // Bug!
};
void process_batch(
string_vector & v
)
{
training_entry te;
BrokenType b; // bug in BrokenType shows up as assert fail in std::vector
entry_vector sv;
assert(sv.size() < 100);
...
}

Do you have the right version of the Boost libraries for your platform (64-bit/32-bit)? I'm asking since the entry_vector object seems to have a couple of member variables of type boost::uint32_t. I'm not sure what the behaviour would be if your executable is built for one platform and the Boost library that gets loaded is for another.

Related

Constructor is not called with /NODEFAULTLIB

I'm using /NODEFAULTLIB to disable the CRT (C Runtime). However, my constructor is not called, which ends up causing an access violation in std::map: the map is not initialized properly because its constructor is never called.
Code compiled with LLVM 8.0.0, compiled in mode debug x86
class c_test
{
public:
c_test( int a ) // Constructor not called
{
printf( "Test: %i\n", a ); // Doesn't appear and breakpoint is not reached
}
void add( const std::string& key, const std::string& val )
{
_data[ key ] = val;
}
private:
std::map< std::string, std::string > _data;
};
c_test test{ 1337 };
int main()
{
test.add( "qwrqrqr", "23142421" );
test.add( "awrqw", "12asa1faf" );
return 1;
}
I've implemented my own functions for new (HeapAlloc), delete (HeapFree), printf, memcpy, memmove, etc., and all are working perfectly. I have no idea why this is happening.
Disabling the CRT is madness.
This performs crucial functions, such as static initialisation. Lack of static initialisation is why your map is in a crippled state. I would also wholly expect various parts of the standard library to just stop working; you're really creating a massive problem for yourself.
Don't reinvent little pieces of critical machinery — turn the CRT back on and use the code the experts wrote. There is really nothing of relative value to gain by turning it off.
I found the problem and solved it. Someone on another forum said that I needed to manually call the constructors whose pointers are stored in the .CRT section; I did that and it worked perfectly.
I just called the _GLOBAL__sub_I_main_cpp function, which calls my constructor, and that solved all my problems. Thanks for the answers.
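As a rough, portable illustration of what "manually calling the constructors" amounts to, the sketch below walks a table of function pointers and calls each non-null entry. Everything here is hypothetical: k_init_table stands in for the pointers the toolchain would place in the .CRT section, and init_value / init_other stand in for compiler-generated initializer functions such as the _GLOBAL__sub_I_main_cpp mentioned above. A real startup routine would obtain the table bounds from the linker rather than from an ordinary array.

```cpp
#include <cassert>
#include <cstddef>

using init_fn = void (*)();

// Hypothetical globals that would normally be set up by static initialization.
int g_value = 0;
int g_other = 0;

void init_value() { g_value = 1337; }
void init_other() { g_other = 42; }

// Hypothetical stand-in for the initializer table; null entries are skipped,
// mimicking padding or sentinel slots.
const init_fn k_init_table[] = { nullptr, init_value, init_other, nullptr };
const std::size_t k_init_count = sizeof(k_init_table) / sizeof(k_init_table[0]);

// Calls every non-null initializer in [first, last) and returns how many ran.
int run_initializers(const init_fn* first, const init_fn* last) {
    int called = 0;
    for (; first != last; ++first) {
        if (*first != nullptr) {
            (*first)();
            ++called;
        }
    }
    return called;
}
```

Calling run_initializers(k_init_table, k_init_table + k_init_count) once, before any code touches the globals, plays the role the CRT startup code normally plays.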

Mutex assert in boost regex constructor

I'm using boost 1.47 for Arm, with the Code Sourcery C++ compiler (4.5.1), crosscompiling from Windows 7 targeting Ubuntu.
When we compile the debug version (i.e. asserts are enabled), there is an assert triggered:
pthread_mutex_lock.c:62: __pthread_mutex_lock: Assertion 'mutex->__data.__owner == 0' failed.
Compiling in release mode, the assert is not triggered and the program works fine (as far as we can tell).
This is happening under a Ubuntu 10.x Arm board.
So, it appears that the pthread_mutex_lock thinks the mutex was set by a different thread than the current one. At this point in my program, we're still single threaded, verified by printing out pthread_self in main and just before the regex constructor is called. That is, it should not have failed the assertion.
Below is the snippet of code that triggers the problem.
// Set connection server address and port from a URL
bool MyHttpsXmlClient::set_server_url(const std::string& server_url)
{
#ifdef BOOST_HAS_THREADS
cout <<"Boost has threads" << endl;
#else
cout <<"WARNING: boost does not support threads" << endl;
#endif
#ifdef PTHREAD_MUTEX_INITIALIZER
cout << "pthread mutex initializer" << endl;
#endif
{
pthread_t id = pthread_self();
printf("regex: Current threadid: %d\n",id);
}
const boost::regex e("^((http|https)://)?([^:]*)(:([0-9]*))?"); // 2: service, 3: host, 5: port // <-- dies in here
I've confirmed that BOOST_HAS_THREADS is set, as is PTHREAD_MUTEX_INITIALIZER.
I tried following the debugger through Boost, but it's templated code and it was rather difficult to follow the assembly. We basically die in do_assign
(roughly line 380 in basic_regex.hpp):
basic_regex& assign(const charT* p1,
const charT* p2,
flag_type f = regex_constants::normal)
{
return do_assign(p1, p2, f);
}
the templated code is:
// out of line members;
// these are the only members that mutate the basic_regex object,
// and are designed to provide the strong exception guarentee
// (in the event of a throw, the state of the object remains unchanged).
//
template <class charT, class traits>
basic_regex<charT, traits>& basic_regex<charT, traits>::do_assign(const charT* p1,
const charT* p2,
flag_type f)
{
shared_ptr<re_detail::basic_regex_implementation<charT, traits> > temp;
if(!m_pimpl.get())
{
temp = shared_ptr<re_detail::basic_regex_implementation<charT, traits> >(new re_detail::basic_regex_implementation<charT, traits>());
}
else
{
temp = shared_ptr<re_detail::basic_regex_implementation<charT, traits> >(new re_detail::basic_regex_implementation<charT, traits>(m_pimpl->m_ptraits));
}
temp->assign(p1, p2, f);
temp.swap(m_pimpl);
return *this;
}
I'm not sure what component is actually using the mutex--does anyone know?
In the debugger, I could retrieve the address for the variable mutex and then inspect (mutex->__data.__owner). I got the offsets from the compiler header file bits/pthreadtypes.h, which shows:
/* Data structures for mutex handling. The structure of the attribute
type is not exposed on purpose. */
typedef union
{
struct __pthread_mutex_s
{
int __lock;
unsigned int __count;
int __owner;
/* KIND must stay at this position in the structure to maintain
binary compatibility. */
int __kind;
unsigned int __nusers;
__extension__ union
{
int __spins;
__pthread_slist_t __list;
};
} __data;
char __size[__SIZEOF_PTHREAD_MUTEX_T];
long int __align;
I used these offsets to inspect the data in memory. The values did not make sense:
For instance, the __data.__lock field (an int) is 0xb086b580. The __count (an unsigned int) is 0x6078af00, and __owner (an int) is 0x6078af00.
This leads me to think that somehow initialization of this mutex was not performed. Either that or it was completely corrupted, but I'm leaning to missed initialization because when I linked with debug boost libraries, there was no assert.
Now, I'm assuming that whatever mutex that is being queried, is some global/static that is used to make regex threadsafe, and that somehow it was not initialized.
Has anyone encountered anything similar? Is there some extra step needed for Ubuntu to ensure mutex initialization?
Is my implementation assumption correct?
If it is correct, can someone point me to where this mutex is declared, and where its initialization occurs?
Any suggestions on further debugging steps? I'm thinking I might have to somehow download the source and rebuild with tracing in there (hoping StackOverflow can help me before I get to this point).
One of the first things to check when a really REALLY peculiar runtime crash appears in a well-known, well-tested library like Boost is whether there's a header/library configuration mismatch. IMHO, putting _DEBUG or NDEBUG in headers, especially within the structures in a way that affects their binary layout, is an anti-pattern. Ideally we should be able to use the same .lib whether we define _DEBUG, DEBUG, Debug, NDEBUG, or whatever (so that we can select the .lib based on whether we want to have debug symbols or not, not whether it matches header defines). Unfortunately this isn't always the case.
"I used these offsets to inspect the data in memory. The values did not make sense:
For instance, the __data.__lock field (an int) is 0xb086b580. The __count (an unsigned int) is 0x6078af00, and __owner (an int) is 0x6078af00."
This sounds like different parts of your code have different views on how large various structures should be. Some things to check:
Is there any #define which enlarges a data structure, but is not consistently set throughout your code base? (On Windows, _SECURE_SCL is infamous for this kind of bug.)
Do you do any structure packing? If you set #pragma pack anywhere in a header and forget to unset it at the end of the header, any data structures included after that will have a different layout than elsewhere in your program.
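The packing point can be seen in a few lines. The sketch below declares the same two members twice, once with default alignment and once under #pragma pack(push, 1); the struct and function names are made up for illustration. If a header forgot the matching pop, every struct declared after it would silently get the packed layout in translation units that include that header, and the default layout elsewhere.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Default alignment: the compiler normally pads after 'tag' so that
// 'value' is aligned for a 32-bit integer.
struct DefaultLayout {
    char tag;
    std::uint32_t value;
};

// Same members, but declared while 1-byte packing is in effect.
#pragma pack(push, 1)
struct PackedLayout {
    char tag;
    std::uint32_t value;
};
#pragma pack(pop)  // forgetting this line is the bug described above

std::size_t default_size() { return sizeof(DefaultLayout); }
std::size_t packed_size() { return sizeof(PackedLayout); }
std::size_t packed_value_offset() { return offsetof(PackedLayout, value); }
```

Code that believes in one layout and receives an object built with the other will read fields at the wrong offsets, which matches the nonsensical field values reported in the question.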

Calling Function Overwrites Value

I have several configuration flags that I am implementing as structs. I create an object. I call a method of the object with a flag, which eventually triggers a comparison between two flags. However, by this time, one of the flags has been overwritten somehow.
To clarify, here's a VERY simplified version of the code that should illustrate what I'm seeing:
class flag_type { unsigned int flag; /*more stuff*/ };
flag_type FLAG1;
flag_type FLAG2;
class MyObject {
public:
void method1(const flag_type& flag_arg) {
//conditionals, and then:
const flag_type flag_args[2] = {flag_arg,flag_arg};
method2(flag_args);
}
void method2(const flag_type flag_args[2]) {
//conditionals, and then:
method3(flag_args[0]);
}
void method3(const flag_type& flag_arg) { //Actually in a superclass
//stuff
if (flag_arg==FLAG1) { /*stuff*/ }
//stuff
}
};
int main(int argc, const char* argv[]) {
//In some functions called by main:
MyObject* obj = new MyObject();
//Later in some other functions:
obj->method1(FLAG1);
}
With a debugger and print statements, I can confirm that both FLAG1 and flag_arg/flag_args are fine in both "method1" and "method2". However, when I get to method3, "FLAG1.flag" has been corrupted, so the comparison fails.
Now, although I'm usually stellar about not doing it, and it passes MSVC's static code analysis on strictest settings, this to me looks like the behavior of a buffer overrun.
I haven't found any such error by looking, but of course one usually doesn't. My questions are:
A: Am I screwing up somewhere else? I realize I'm not sharing any real code, but am I missing something already? This scheme worked before I rewrote a large portion of the code.
B: Is there an easier way than picking through the code more carefully until I find it? The code is cross-platform, so I'm already setting it up to check with Valgrind on an Ubuntu box.
Thanks to those who tried to help. It should be noted, though, that the code was for clarification purposes only; I typed it from scratch to show generally what was happening, not to compile. In retrospect, I realize it wasn't fair to ask people to solve it on so little information, though my actual question, "Is there an easier way than picking through the code more carefully", didn't really concern solving the problem, just how to approach it.
As to this question, on Ubuntu Linux, I got "stack smashing" which told me more or less where the problem occurred. Interestingly, the traceback for stack smashing was the most helpful. Long story short, it was an embarrassingly basic error; strcpy was overflowing (in the operators for ~, | and &, the flags have a debug string set this way). At least it wasn't me who wrote that code. Always use strncpy, people :P
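One caveat on the strncpy advice: strncpy does not write a terminating null when the source is at least as long as the limit, so it needs a little help to be safe. Below is a minimal sketch of a bounded copy; the helper name copy_debug_string is hypothetical and not taken from the original code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Copies at most dst_size - 1 characters from src into dst and always
// null-terminates, so a too-long debug string is truncated instead of
// overflowing the buffer.
void copy_debug_string(char* dst, std::size_t dst_size, const char* src) {
    if (dst_size == 0) {
        return;  // nothing can be stored, not even the terminator
    }
    std::strncpy(dst, src, dst_size - 1);
    dst[dst_size - 1] = '\0';  // strncpy alone does not guarantee this
}
```

In C++ code that does not need a fixed char buffer, storing the debug text in a std::string avoids the problem entirely.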

C++ Struct initialisation problem

This C++ code works fine; however, a memory validator says that I am using a deleted pointer in:
grf->filePath = fname; Do you have any idea why? Thank you.
Dirloader.h
// Other code
class CDirLoader
{
public:
struct TKnownGRF
{
std::string filePath;
DWORD encodingType;
DWORD userDataLen;
char *userData;
};
// Other Code
CDirLoader();
virtual ~CDirLoader();
Dirloader.cpp
// Other code
void CDirLoader::AddGroupFile(const std::string& _fname)
{
// Other code including std::string fname = _fname;
TKnownGRF *grf = new TKnownGRF;
grf->filePath = fname;
delete grf; // Just for testing purposes
P.S.: This is only a code extract. Of course, if I define the struct TKnownGRF inside the .cpp and use it as an actual object (grf.filePath = something) instead of through a pointer (grf->filePath = something), then it is OK. But I do need to have it inside the *.h in the CDirLoader class, due to many other vector allocations.
Since the function returns void
void CDirLoader::AddGroupFile(const std::string& _fname)
the question is what are you going to do with grf?
Are you going to delete it? If so, then why do a new? You can just declare a TKnownGRF variable on the stack! In that case, _fname is not contributing to the logic of this method.
I guess that the class CDirLoader has a member variable of type TKnownGRF, say grf_, and that it needs to be used in the AddGroupFile() method, e.g.:
grf_.filePath = _fname;
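A minimal sketch of the stack-variable suggestion follows, under some assumptions: DWORD is replaced by unsigned long so the snippet compiles without Windows headers, the members get default initializers that the original struct does not show, and add_group_file is a hypothetical free function standing in for CDirLoader::AddGroupFile.

```cpp
#include <cassert>
#include <string>

// Assumed stand-in for the struct from Dirloader.h (DWORD -> unsigned long).
struct TKnownGRF {
    std::string filePath;
    unsigned long encodingType = 0;
    unsigned long userDataLen = 0;
    char* userData = nullptr;
};

// Hypothetical stand-in for CDirLoader::AddGroupFile: the object has
// automatic storage, so there is no new/delete pair to get wrong.
std::string add_group_file(const std::string& fname) {
    TKnownGRF grf;         // constructed here, destroyed at scope exit
    grf.filePath = fname;  // ordinary std::string assignment
    return grf.filePath;   // returned only so the effect is observable
}
```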
Does this happen to be using an older version of STL, say, VC6, and running multithreaded? Older versions of STL's string class used a reference counted copy on write implementation, which didn't really work in a multithreaded environment. See this KB article on VC 6.
Or, it's also possible that you are looking at the wrong problem. If you call std::string::c_str() and cache the result at all, the cached result would probably be invalidated when you modified the original string. There are a few cases where you can get away with that, but it's very much implementation specific.

user defined Copy ctor, and copy-ctors further down the chain - compiler bug ? programmers brainbug?

I have a little problem, and I am not sure if it's a compiler bug or stupidity on my side.
I have this struct :
struct BulletFXData
{
int time_next_fx_counter;
int next_fx_steps;
Particle particles[2];//this is the interesting one
ParticleManager::ParticleId particle_id[2];
};
The member "Particle particles[2]" has a self-made kind of smart pointer in it (a resource-counted texture class). This smart pointer has a default constructor that initializes the ptr to 0 (but that is not important).
I also have another struct, containing the BulletFXData struct :
struct BulletFX
{
BulletFXData data;
BulletFXRenderFunPtr render_fun_ptr;
BulletFXUpdateFunPtr update_fun_ptr;
BulletFXExplosionFunPtr explode_fun_ptr;
BulletFXLifetimeOverFunPtr lifetime_over_fun_ptr;
BulletFX( BulletFXData data,
BulletFXRenderFunPtr render_fun_ptr,
BulletFXUpdateFunPtr update_fun_ptr,
BulletFXExplosionFunPtr explode_fun_ptr,
BulletFXLifetimeOverFunPtr lifetime_over_fun_ptr)
:data(data),
render_fun_ptr(render_fun_ptr),
update_fun_ptr(update_fun_ptr),
explode_fun_ptr(explode_fun_ptr),
lifetime_over_fun_ptr(lifetime_over_fun_ptr)
{
}
/*
//USER DEFINED copy-ctor. if it's defined things go crazy
BulletFX(const BulletFX& rhs)
:data(data),//this line of code seems to do a plain memory-copy without calling the right ctors
render_fun_ptr(render_fun_ptr),
update_fun_ptr(update_fun_ptr),
explode_fun_ptr(explode_fun_ptr),
lifetime_over_fun_ptr(lifetime_over_fun_ptr)
{
}
*/
};
If I use the user-defined copy-ctor, my smart-pointer class goes crazy, and it seems that the copy ctor / assignment operator aren't called as they should be.
So, does this all make sense? It seems as if my own copy-ctor of struct BulletFX should do exactly what the compiler-generated one would, but it seems to forget to call the right constructors down the chain.
Compiler bug? Me being stupid?
Sorry about the big code; a small example could have illustrated it too. But often you guys ask for the real code, so well, here it is :D
EDIT : more info :
typedef unsigned int ParticleId;
Particle has no user-defined copy ctor, but has a member of type Resource<Texture>:
Particle
{
....
Resource<Texture> tex_res;
...
}
Resource is a smart-pointer class, and has all ctors defined (also the assignment operator),
and it seems that Resource is copied bitwise.
EDIT :
henrik solved it... data(data) is stupid of course ! it should of course be rhs.data !!!
sorry for huge amount of code, with a very little bug in it !!!
(Guess you shouldn't code at 1 in the morning :D )
:data(data)
This is problematic. This is because your BulletFXData struct does not have its own copy-ctor. You need to define one.
Two things jump out at me:
"Is this a compiler bug?"
No. It is never a compiler bug. In twenty years I have seen numerous complaints that "it must be a compiler bug"; only one has ever turned out to be a bug, and that was way back with gcc 2.95 (nowadays gcc is solidly stable, as is Dev Studio).
"I built my own smart pointer."
It's a nice concept and a nice learning experience, but it is so much harder to get correct than you think, especially when you seem to be having trouble with copy constructors.
This is wrong: the structure is copy constructed using itself as the object to be copied. Thus you are copying random data into itself.
Look at the comments to see what you should be using as parameters.
//USER DEFINED copy-ctor. if it's defined things go crazy
BulletFX(const BulletFX& rhs)
:data(data), // rhs.data
render_fun_ptr(render_fun_ptr), // rhs.render_fun_ptr
update_fun_ptr(update_fun_ptr), // rhs.update_fun_ptr
explode_fun_ptr(explode_fun_ptr), // rhs.explode_fun_ptr
lifetime_over_fun_ptr(lifetime_over_fun_ptr) // rhs.lifetime_over_fun_ptr
{
}
Of course, at this point you may as well use the compiler-generated version of the copy constructor, as this is exactly what the corrected version does.
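For a self-contained version of the fix, here is a small sketch using made-up types: Tracked stands in for a member with a non-trivial copy constructor (like the Resource smart pointer), and Holder stands in for BulletFX. The copy constructor initializes each member from rhs, so the member's own copy constructor actually runs.

```cpp
#include <cassert>

// Hypothetical member type that counts how often it is copy constructed.
struct Tracked {
    int value = 0;
    static int copies;
    Tracked() = default;
    explicit Tracked(int v) : value(v) {}
    Tracked(const Tracked& other) : value(other.value) { ++copies; }
};
int Tracked::copies = 0;

// Hypothetical stand-in for BulletFX.
struct Holder {
    Tracked data;
    int tag;
    Holder(const Tracked& d, int t) : data(d), tag(t) {}
    // Correct: every member is initialized from rhs. Writing data(data)
    // here would instead initialize the member from itself.
    Holder(const Holder& rhs) : data(rhs.data), tag(rhs.tag) {}
};

// Returns how many Tracked copies a single Holder copy performs,
// or -1 if the copied values do not match the source.
int copies_made_by_holder_copy() {
    Holder a(Tracked(7), 1);
    const int before = Tracked::copies;
    Holder b(a);
    if (b.data.value != 7 || b.tag != 1) {
        return -1;
    }
    return Tracked::copies - before;
}
```

Both gcc and clang can warn about the self-initializing form (for example under -Wall), which is worth enabling when hand-writing copy constructors.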