Find uninitialized variables in C&C++ - c++

I really mean C and C++. This project is using a lib in C which is calling my functions in C++. However the functions are in extern "C" because the lib expects it.
Anyway, in these functions I do new Blah. When a specific function (end_tree) is called, I expect all my variables to be initialized. Using Visual Studio, GCC or any other compiler, is there a way I can check? I just noticed some bools are TRUE which shouldn't be, because they were never initialized. Is there some kind of _VS_CheckThisMemory(mytree) function or magic I can use?

Don't know if this is what you want, but gcc has -Wmaybe-uninitialized and -Wuninitialized. There may be more on the warning options page.
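Alongside those warnings, it may be worth checking how the objects are allocated. A minimal sketch, assuming Blah is a plain struct with no user-provided constructor (the member names and make_zeroed_blah are made up for illustration): writing new Blah() instead of new Blah value-initializes the object, which zeroes its members. If Blah has a constructor of its own that skips some members, the parentheses will not help.

```cpp
#include <cassert>

// Hypothetical stand-in for the question's Blah: no user-provided constructor.
struct Blah {
    bool done;
    int  count;
};

// new Blah   -> default-initialization: done and count are indeterminate.
// new Blah() -> value-initialization: members are zero-initialized.
Blah* make_zeroed_blah() {
    return new Blah();
}
```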

Use valgrind (on Linux):
valgrind ./myprogram
Especially easy when myprogram was compiled with debug info (gcc -g), but not required. Valgrind will notify where memory is being used that was uninitialized, and (with --track-origins=yes) where it was allocated from. If it has debug info, valgrind will report exactly at which file:linenumber things happened. (It can even attach a debugger on the fly for you to inspect things.)
It will also detect access beyond allocation boundaries and access after freeing. This is incredibly useful.
Here endeth the useful answer
Edit because it wasn't exactly clear why I was posting the following, as became clear from the comment, let me introduce the remainder of this answer:
When starting to use valgrind with existing codebases, it is almost inevitable that you'll get 'false' positives, i.e. reports that aren't really problems (yet). I include one example of what might trigger such a report, and how you'd typically fix those.
I'm just including this to raise awareness of how to tackle or recognize (semi-)false positives.
Another way of wording it (with reference to Matthieu's convincing reasoning in the comments) is to treat even the 'not-actually-killing' Valgrind warnings as critical: get them fixed, not forgotten.
It is possible that valgrind will report uninitialized access when it is not really a problem. Like, e.g.
char buf[1024];
strcpy(buf, "hello");
char clone[1024];
memcpy(clone, buf, 1024);
You should fix that by copying only the bytes that were actually written, terminator included:
memcpy(clone, buf, strlen(buf) + 1);
This makes sure there are no uninitialized 'parts' of buf in the area accessed.

Use a self-initializing class to cover those annoying primitives.
template<typename T> class always_initialized {
T t;
public:
always_initialized()
: t(T()) {}
always_initialized(const T& ref)
: t(ref) {}
operator T&() { return t; }
operator const T&() const { return t; }
T& operator=(const T& ref) { return t = ref; }
};
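If C++11 is available, default member initializers are another way to get the same effect without a wrapper type. This is a different technique from the template above, shown only as a sketch; Node and its members are hypothetical names.

```cpp
#include <cassert>

// Hypothetical node type; the names are illustrative only.
struct Node {
    bool  visited = false;    // applied by every constructor that does not override it
    int   depth   = 0;
    Node* parent  = nullptr;

    Node() {}                            // members still get their defaults
    explicit Node(int d) : depth(d) {}   // overrides depth only
};
```

The defaults live next to the declarations, so adding a constructor later cannot silently leave a member unset.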

In response to the code linked in the comment, RAIIIA (Resource Acquisition is Initialization In Action (R))
class OtherClass;
class MyClass : public SomeBase {
public:
// note I got rid of your default constructor, which leaves values uninitialized
MyClass(Var* name, OtherClass* loop)
: m_name(name), m_loop(loop) // this right here
{ }
virtual ~MyClass(); // no implementation needed here
void save();
// made the members protected, other classes have no business accessing them directly
protected:
Var* m_name;
OtherClass* m_loop;
};
Your default constructor left the values uninitialized, and goes against RAII in its pure form. It's OK to do that, but as you're having problems with uninitialized variables, I would recommend removing default constructors.
EDIT: storing unknown pointers as class members without newing and deleteing them in the class constructor/destructor isn't really RAII, but I hope you do that somewhere.

I solved this with templates that act like properties. I based it on "Property like features in C++?", but there are other examples for other things, like passing in a getter/setter.
I generated most of this code because I was able to, and each property keeps track of whether its variable has been set. At the end I just check the members, with asserts to tell me whether each variable was set. I also assert on every 'get', just in case.
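The actual code was not posted, so the following is only a rough sketch of what such a property could look like. The names tracked, Tree and all_set are hypothetical, not taken from the original solution.

```cpp
#include <cassert>

// Hypothetical property wrapper: remembers whether a value was ever assigned.
template <typename T>
class tracked {
    T    value_;
    bool set_;
public:
    tracked() : value_(), set_(false) {}
    tracked& operator=(const T& v) { value_ = v; set_ = true; return *this; }
    bool is_set() const { return set_; }
    const T& get() const {
        assert(set_ && "property read before it was set");
        return value_;
    }
};

// Hypothetical tree type with one check to run at the end (e.g. in end_tree).
struct Tree {
    tracked<bool> balanced;
    tracked<int>  depth;
    bool all_set() const { return balanced.is_set() && depth.is_set(); }
};
```

A call such as assert(tree.all_set()) at the end then reports any member that was never assigned.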

Related

GCC warns about order of initialization not matching the order of declaration

Following code:
class C
{
int a, b;
public:
C()
:b(0), a(0)
{}
};
Causes GCC to complain about wrong order of initialization. Specifically:
../AppSrc/MainForm.cpp: In constructor 'C::C()':
../AppSrc/MainForm.cpp:51:9: warning: 'C::b' will be initialized after
../AppSrc/MainForm.cpp:51:6: warning: 'int C::a'
What's the big deal here, why the whining? It's not like there is, or could potentially be, an interdependency between members. Primitives, duh.
Oh, and how do I turn this warning off or at least make it less aggressive?
EDIT: there are many ways to shoot yourself in the foot in C++; member interdependency is one of them. I'm aware of that and I avoid that anyway, just like I would avoid null pointer dereferencing.
Arbitrary init order can be perfectly safe, just like in the snippet above. I'm unhappy about the compiler not recognizing such cases and complaining anyway. I mean, it does not complain about every single pointer dereference that is not immediately preceded by a null check, does it?
In this particular example, the warning can be safely ignored. If they depended on each other, you'd have an issue. You can turn it off with -Wno-reorder.
But better re-order them. It might not make a difference to the compiler, but you'll get in the habit of initializing members in the order in which they appear, which is a good thing.
It's not like there is, or could potentially be, an interdependency between members. Primitives, duh.
Erm, unless someone changes it:
class C
{
int a, b;
public:
C()
:b(a), a(0)
{}
};
Most people think it's better to get a warning about that before the problem happens, so they can fix the code. You seem to be in a minority who want to leave the code intentionally flawed and complain about the compiler.
Arbitrary init order can be perfectly safe, just like in the snippet above.
The point is the order of initialization isn't arbitrary, it's always in the order the members are declared, and some of us want to be warned when we write mem-initializers in the wrong order.
I'm unhappy about compiler not recognizing such cases and complaining anyway.
If you don't like the warning turn it off, how to do that is in the documentation (and in an earlier answer, so you don't even have to look far ;-)
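To see that the order really is fixed by the declarations, here is a small sketch that logs construction order. Tracer and construction_order are made-up helper names, and the pragma is just one way to silence the warning locally on GCC and Clang, assuming you do not want -Wno-reorder for the whole build.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Records the order in which members are constructed.
struct Tracer {
    Tracer(const char* name, std::vector<std::string>& log) { log.push_back(name); }
};

#if defined(__GNUC__)
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wreorder"   // silence the warning for this demo only
#endif
struct C {
    Tracer a, b;   // declaration order: a, then b
    explicit C(std::vector<std::string>& log)
        : b("b", log), a("a", log)   // written b-first, but a is still built first
    {}
};
#if defined(__GNUC__)
#pragma GCC diagnostic pop
#endif

// Returns the names in the order the members were actually constructed.
std::vector<std::string> construction_order() {
    std::vector<std::string> log;
    C c(log);
    return log;
}
```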
There is a VERY big reason for the whining. See Scott Meyers' book Effective C++, Item 13, page 57 for details.
Or as he puts it on page 58:
class Wacko {
public:
Wacko(const char *s): s1(s), s2(0) {}
Wacko(const Wacko &rhs): s2(rhs.s1), s1(0) {}
private:
string s1, s2;
};
Wacko w1 = "Hello World!";
Wacko w2 = w1;
Is w2 the same as w1?

checking invariants in C++

Are there any established patterns for checking class invariants in C++?
Ideally, the invariants would be automatically checked at the beginning and at the end of each public member function. As far as I know, C with classes provided special before and after member functions, but unfortunately, design by contract wasn't quite popular at the time and nobody except Bjarne used that feature, so he removed it.
Of course, manually inserting check_invariants() calls at the beginning and at the end of each public member function is tedious and error-prone. Since RAII is the weapon of choice to deal with exceptions, I came up with the following scheme of defining an invariance checker as the first local variable, and that invariance checker checks the invariants both at construction and destruction time:
template <typename T>
class invariants_checker
{
const T* p;
public:
invariants_checker(const T* p) : p(p)
{
p->check_invariants();
}
~invariants_checker()
{
p->check_invariants();
}
};
void Foo::bar()
{
// class invariants checked by construction of _
invariants_checker<Foo> _(this);
// ... mutate the object
// class invariants checked by destruction of _
}
Question #0: I suppose there is no way to declare an unnamed local variable? :)
We would still have to call check_invariants() manually at the end of the Foo constructor and at the beginning of the Foo destructor. However, many constructor bodies and destructor bodies are empty. In that case, could we use an invariants_checker as the last member?
#include <string>
#include <stdexcept>
class Foo
{
std::string str;
std::string::size_type cached_length;
invariants_checker<Foo> _;
public:
Foo(const std::string& str)
: str(str), cached_length(str.length()), _(this) {}
void check_invariants() const
{
if (str.length() != cached_length)
throw std::logic_error("wrong cached length");
}
// ...
};
Question #1: Is it valid to pass this to the invariants_checker constructor which immediately calls check_invariants via that pointer, even though the Foo object is still under construction?
Question #2: Do you see any other problems with this approach? Can you improve it?
Question #3: Is this approach new or well-known? Are there better solutions available?
Answer #0: You can have unnamed local variables, but you give up control over the life time of the object - and the whole point of the object is because you have a good idea when it goes out of scope. You can use
void Foo::bar()
{
invariants_checker<Foo>(this); // goes out of scope at the semicolon
new invariants_checker<Foo>(this); // the constructed object is never destructed
// ...
}
but neither is what you want.
Answer #1: No, I believe it's not valid. The object referenced by this is only fully constructed (and thus starts to exist) when the constructor has finished. You're playing a dangerous game here.
Answer #2 & #3: This approach is not new, a simple google query for e.g. "check invariants C++ template" will yield a lot of hits on this topic. In particular, this solution can be improved further if you don't mind overloading the -> operator, like this:
template <typename T>
class invariants_checker {
public:
class ProxyObject {
public:
ProxyObject(T* x) : m(x) { m->check_invariants(); }
~ProxyObject() { m->check_invariants(); }
T* operator->() { return m; }
const T* operator->() const { return m; }
private:
T* m;
};
invariants_checker(T* x) : m(x) { }
ProxyObject operator->() { return m; }
const ProxyObject operator->() const { return m; }
private:
T* m;
};
The idea is that for the duration of a member function call, you create an anonymous proxy object which performs the check in its constructor and destructor. You can use the above template like this:
void f() {
Foo f;
invariants_checker<Foo> g( &f );
g->bar(); // this constructs and destructs the ProxyObject, which does the checking
}
Ideally, the invariants would be automatically checked at the beginning and at the end of each public member function
I think this is overkill; I instead check invariants judiciously. The data members of your class are private (right?), so only its member functions can change the data members and therefore invalidate invariants. So you can get away with checking an invariant just after a change to a data member that participates in that invariant.
Question #0: I suppose there is no way to declare an unnamed local variable? :)
You can usually whip up something using macros and __LINE__, but if you just pick a strange enough name, it should already do, since you shouldn't have more than one (directly) in the same scope. This
class invariants_checker {};
template<class T>
class invariants_checker_impl : public invariants_checker {
public:
invariants_checker_impl(T* that) : that_(that) {that_->check_invariants();}
~invariants_checker_impl() {that_->check_invariants();}
private:
T* that_;
};
template<class T>
inline invariants_checker_impl<T> get_invariant_checker(T* that)
{return invariants_checker_impl<T>(that);}
#define CHECK_INVARIANTS const invariants_checker& \
my_fancy_invariants_checker_object_ = get_invariant_checker(this)
works for me.
Question #1: Is it valid to pass this to the invariants_checker constructor which immediately calls check_invariants via that pointer, even though the Foo object is still under construction?
I'm not sure whether it technically invokes UB. In practice it would certainly be safe to do so, were it not for the fact that, in practice, a class member that has to be declared at a specific position in relation to other class members is going to be a problem sooner or later.
Question #2: Do you see any other problems with this approach? Can you improve it?
See #1. Take a moderately sized class, add half a decade of extending and bug-fixing by two dozen developers, and I consider the chances of messing this up at least once to be about 98%.
You can somewhat mitigate this by adding a shouting comment to the data member. Still.
Question #3: Is this approach new or well-known? Are there better solutions available?
I hadn't seen this approach, but given your description of before() and after() I immediately thought of the same solution.
I think Stroustrup had an article many (~15?) years ago, where he described a handle class overloading operator->() to return a proxy. This could then, in its ctor and dtor, perform before- and after-actions while being oblivious to the methods being invoked through it.
Edit: I see that Frerich has added an answer fleshing this out. Of course, unless your class already needs to be used through such a handle, this is a burden onto your class' users. (IOW: It won't work.)
#0: No, but things could be slightly better with a macro (if you're ok with that)
#1: No, but it depends. You cannot do anything that would cause this to be dereferenced before the body (which yours would, but just before, so it could work). This means that you can store this, but not access fields or virtual functions. Calling check_invariants() is not OK if it's virtual. I think it would work for most implementations, but it is not guaranteed to work.
#2: I think it will be tedious, and not worth it. This has been my experience with invariant checking. I prefer unit tests.
#3: I've seen it. It seems like the right way to me if you're going to do it.
Unit testing is a better alternative that leads to smaller code with better performance.
I clearly see the issue that your destructor is calling a function that will often throw, that's a no-no in C++ isn't it?
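On the throwing destructor: since C++11 destructors are implicitly noexcept, so an exception escaping ~invariants_checker would end in std::terminate. One possible way around it, sketched here under the assumption that the checked class can expose a non-throwing bool invariants_hold() const (a hypothetical name), is to assert instead of throwing:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Variant of the checker that asserts instead of throwing; assumes the
// checked class exposes a hypothetical bool invariants_hold() const.
template <typename T>
class asserting_checker {
    const T* p_;
public:
    explicit asserting_checker(const T* p) : p_(p) { assert(p_->invariants_hold()); }
    ~asserting_checker() { assert(p_->invariants_hold()); }
};

class Foo {
    std::string str_;
    std::size_t cached_length_;
public:
    explicit Foo(const std::string& s) : str_(s), cached_length_(s.length()) {}
    bool invariants_hold() const { return str_.length() == cached_length_; }
    void append(const std::string& more) {
        asserting_checker<Foo> guard(this);   // checked on entry and on exit
        str_ += more;
        cached_length_ = str_.length();
    }
    std::size_t length() const { return cached_length_; }
};
```

With NDEBUG defined the checks compile away, which also addresses the performance concern for release builds.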

Const Functions and Interfaces in C++

I'll use the following (trivial) interface as an example:
struct IObject
{
virtual ~IObject() {}
virtual std::string GetName() const = 0;
virtual void ChangeState() = 0;
};
Logic dictates that GetName should be a const member function while ChangeState shouldn't.
All code that I've seen so far doesn't follow this logic, though. That is, GetName in the example above wouldn't be marked as a const member function.
Is this laziness/carelessness or is there a legitimate reason for this? What are the major cons of me forcing my clients to implement const member functions when they are logically called for?
EDIT: Thanks for your responses everyone. I think it's pretty much unanimous: laziness/ignorance is the reason for what I'm seeing.
I think it's laziness/carelessness. GetName() should have no effect on the object's state, and the contract of IObject should state that fact explicitly.
If the inheriting class was somehow forced to make GetName() have (hidden!) side effects, they could always declare the corresponding fields as mutable.
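As a rough illustration of the mutable escape hatch, here is a sketch of an implementation that does some hidden bookkeeping inside a const GetName(). CountingObject and lookups() are made-up names for the example.

```cpp
#include <cassert>
#include <string>

struct IObject {
    virtual ~IObject() {}
    virtual std::string GetName() const = 0;
    virtual void ChangeState() = 0;
};

// Hypothetical implementation that counts lookups without giving up const.
class CountingObject : public IObject {
    std::string name_;
    mutable int lookups_;   // bookkeeping only, not part of the observable state
public:
    explicit CountingObject(const std::string& name) : name_(name), lookups_(0) {}
    std::string GetName() const override { ++lookups_; return name_; }
    void ChangeState() override { name_ += "!"; }
    int lookups() const { return lookups_; }
};
```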
Is this laziness/carelessness or is there a legitimate reason for this?
The former. If you really haven't seen any code which does this right, get a new job immediately.
What are the major cons of me forcing my clients to implement const member functions when they are logically called for?
It allows the compiler to discover common bugs at compile-time. (Nothing better than errors discovered at compile-time. Everything that fails on your desk, won't fail at the client's site.)
More than ten years ago, shortly after I joined a new company and got to hacking at one of their projects, I found that a method that should have been const wasn't, preventing some of my const-correct code from compiling. I considered just casting away const and getting on with it, but I couldn't bring myself to do this.
So I made the method const - just to discover that it called other methods, which should have been const, too, but weren't either. So I changed them as well - just to discover...
In the end, I spent several days hunting through all of the project, adding const left and right.
Co-workers laughed at me - until I showed them some of the bugs the compiler had discovered due to me adding const. Interestingly, a few long-standing bugs nobody had ever taken the time to thoroughly investigate were not reproducible anymore either, after that.
While I think the "laziness" answer is probably right in your case, I do just want to make the point that sometimes a single const keyword is not expressive enough to capture the details of mutability of your class.
Consider:
class MyClass {
public:
bool operator==(const MyClass &other) const {
return identity == other.identity;
}
void setVisible(bool vis) { gfx.setVisible(vis); }
bool isVisible() const;
// other methods ...
private:
string identity;
GraphicsData gfx;
};
I think this code is reasonable:
MyClass item = ...
item.setVisible(true);
// I want to call a function and be sure that the object's
// visibility did not change, so pass a const ref.
const MyClass &constRef = item;
someSafeFunction(constRef);
But at the same time, I think this code is reasonable, too:
// Imagine an appropriate std::hash<MyClass> has been
// defined, based on MyClass::identity.
unordered_set<MyClass> set = ...
// Hide some items
for (MyClass &item : set) {
item.setVisible(false);
}
However, that second bit of code will not compile, because unordered_set only gives const references to its contents.
This is because a modification to the object could change its hash code, invalidating its location in the container.
So in effect, unordered_set demands that operator== and const are referring to the same notion of identity.
But that's not what we want in our first use case.
The problem is that our code has two notions of "did the object change", which both make sense from different points of view.
But there is only one const keyword you can apply, so you have to pick one, and the other case will suffer.
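One possible workaround, assuming the visibility really is not part of the identity, is to split the two: keep the identity as the key of an unordered_map and the mutable display state as the mapped value. This is only a sketch with made-up names (DisplayState, ItemTable, hide_all), not a claim about how MyClass should be designed.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical split of MyClass: the identity becomes the key, and the
// mutable display state becomes the mapped value.
struct DisplayState {
    bool visible = true;
};

using ItemTable = std::unordered_map<std::string, DisplayState>;

// Mapped values are reachable through non-const references, so this compiles.
void hide_all(ItemTable& items) {
    for (auto& entry : items) {
        entry.second.visible = false;
    }
}
```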

Adding a field to a structure without breaking existing code

So I'm working with this huge repository of code and have realized that one of the structs lacks an important field. I looked at the code (which uses the struct) as closely as I could and concluded that adding an extra field isn't going to break it.
Any ideas on where I could've screwed up?
Also: design advice is welcome - what's the best way I can accomplish this?
E.g. (if I wasn't clear):
typedef struct foo
{
int a;
int b;
}
foo;
Now it's :
typedef struct foo
{
int a;
int b;
int c;
}
foo;
If that structure is being serialized/deserialized anywhere, be sure to pay attention to that section of the code.
Double check areas of the code where memory is being allocated.
From what you've written above I can't see anything wrong. Two things I can think of:
Whenever you change code and recompile, you may expose "hidden" bugs. That is, bugs such as uninitialized pointers that were harmless before may start corrupting data now that the structure is slightly bigger.
Are you making sure you initialize c before it gets used?
Follow Up:
Since you haven't found the error yet I'd stop looking at your struct. Someone once wrote look for horses first, zebras second. That is, the error is probably not an exotic one. How much coverage do you have in your unit tests? I'm assuming this is legacy code which almost invariably means 0% or at least that's been my experience. Is this accurate?
If you are using sizeof(struct) to allocate memory at all places and are accessing the members using -> or . operators, I don't think you should face any problem. But, it also depends on where you are trying to add the member, it might screw up your structure alignment if you are not careful.
Any ideas on where I could've screwed up?
Nothing. Everything. It all depends on how, where and why this is used.
Assuming this structure you talk about is a C-style POD and the code is but the simplest, you'll get away with it. But, the moment you are trying something more ambitious, you are dealing with alignment issues (depending on how and where you create objects) and padding at least. If this is C++ and your POD contains custom operators/ctors etc -- you're getting into a lot of trouble. Cross-platform issues may arise, if you rely on the endianness ever etc.
If the code had a robust set of unit tests, it would probably be much easier to track down the problem (you asked for design advice ;) )
I assume you don't need to use the new 'c' variable everywhere in this giant codebase, you're just adding it so you can use it in some code you're adding or modifying? Instead of adding c to foo, you could make a new struct, bar, which contains a foo object and c. Then use bar where it's needed.
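A minimal sketch of that idea, with legacy_sum standing in for any existing function that only knows about foo (it is a made-up name):

```cpp
#include <cassert>

struct foo {
    int a;
    int b;
};

// New code that needs c uses bar; foo keeps its size and layout.
struct bar {
    foo base;
    int c;
};

// Stand-in for existing code that only knows about foo.
int legacy_sum(const foo* f) { return f->a + f->b; }
```

Existing call sites keep receiving a foo (here &x.base for a bar x), so nothing that depends on sizeof(foo) changes.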
As for the actual bug, it could be anything with so little information to go on, but if I had to guess, I'd say someone used a magic number instead of sizeof() somewhere.
Look for memcpy, memset, memcmp. These functions are not member-wise. If they were used using the previous structure length, you may have problems.
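For illustration, this is the kind of pattern to search for. The sketch assumes the extended struct from the question; clear_foo and copy_foo are hypothetical helpers.

```cpp
#include <cassert>
#include <cstring>

struct foo {
    int a;
    int b;
    int c;   // the newly added field
};

// Fragile: a hard-coded 8 (two ints on many platforms) would leave c untouched.
// Robust: sizeof follows the struct whenever fields are added.
void clear_foo(foo* f) {
    std::memset(f, 0, sizeof *f);
}

void copy_foo(foo* dst, const foo* src) {
    std::memcpy(dst, src, sizeof *dst);
}
```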
Also search the files for every instance of the struct. There may be functions or methods that do not use the new important field. As others have said, if you find the structure in a #define or typedef, you'll have to search those too.
Since you tagged your question C++:
For the future, Pimpl/d-Pointer is a strategy that allows you much greater freedom in extending or re-designing your classes without breaking compatibility.
For example, if you had originally written
// foo.h
class Foo {
public:
Foo();
Foo(const Foo &);
~Foo();
int a() const;
void a(int);
int b() const;
void b(int);
private:
class FooPrivate *const d;
};
// foo.cpp
class FooPrivate {
public:
FooPrivate() : a(0), b(0) {}
FooPrivate(const FooPrivate &o) : a(o.a), b(o.b) {}
int a;
int b;
};
Foo::Foo() : d(new FooPrivate()) {}
Foo::Foo(const Foo &o) : d(new FooPrivate(*o.d)) {}
Foo::~Foo() { delete d; }
int Foo::a() const { return d->a; }
void Foo::a(int a) { d->a = a; }
// ...
you can easily extend this to
// foo.h
class Foo {
public:
// ...
int a() const;
void a(int);
int b() const;
void b(int);
int c() const;
void c(int);
// ...
};
// foo.cpp
class FooPrivate {
// ...
int a;
int b;
int c;
};
// ...
without breaking any existing (compiled!) code using Foo.
If the code is used to transfer data across the network, you could be breaking things.
If adding a structure member anywhere other than as the first member breaks anything, then the code has undefined behaviour and it's wrong. So at least you have someone else (or your earlier self) to blame for the breakage. But yes, undefined behaviour includes "happens to do what we'd like it to do", so as the other guys say, watch out for memory allocation, serialization (network and file IO).
As an aside, I always cringe when I see typedef FOO ... struct FOO, as if one is trying to make C code look like C++. I realize I'm in a minority here :)
It's always safe to add new elements at the end of a C struct, even if that struct is passed to different processes. The code which has been recompiled will see the new struct member, and the code which hasn't been will just be aware of the old struct size and read the old members it knows about.
The caveat here is that new member has to be added at the end of the structure and not in the middle.

Is it OK to return a const reference to a private member?

I need to implement read-only access to a private member container. If I return a constant reference is it possible to const_cast it and obtain a full access to the member? What's the technique to be used?
Thanks.
Is it safe to return a const reference to a private member
Yes as long as the lifetime of the reference does not exceed the lifetime of the object which returned it. If you must expose the private member you do not want modified, this is a good way to do so. It's not foolproof but it's one of the better ways to do so in C++
Is it possible to use const_cast to actually mess around with member
Yes and there is nothing you can do to prevent this. There is no way to prevent someone from casting away const in C++ at any time. It's a limitation / feature of C++.
In general though, you should flag every use of const_cast as a bug unless it contains a sufficiently detailed comment as to why it's necessary.
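To make the loophole concrete, here is a small sketch (Inventory and sneak_in are invented names). It is well-defined only because the Inventory object itself is not const; doing the same to an object that was declared const would be undefined behaviour.

```cpp
#include <cassert>
#include <vector>

// Hypothetical class exposing a private container read-only.
class Inventory {
    std::vector<int> items_;
public:
    Inventory() : items_(3, 0) {}
    const std::vector<int>& items() const { return items_; }
};

// Legal only because the underlying object is not itself const;
// shown to illustrate the loophole, not to recommend it.
void sneak_in(const Inventory& inv, int value) {
    const_cast<std::vector<int>&>(inv.items()).push_back(value);
}
```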
Returning a const & is a sensible thing to do in many circumstances, particularly if the object being returned is large or cannot be copied.
Regarding the const_cast, remember the "private" access specifier in C++ is there as an aid to the programmer - it is not intended to be a security measure. If someone wants access to an object's private members, they can get them, no matter what you try to do to prevent it.
const int &ref = your_object.your_function();
*(int*)&ref = 1234;
Don't worry about users doing const_casts just to break your invariants. If they really want to break your code they can without you providing accessors to your internal attributes. By returning a constant reference, the common user will not mistakenly modify your data.
Encapsulation prevents mistakes, not espionage. A malicious coder can break it anyway if they really care and know the environment (compiler). Const-ness is lost in the compilation process (in all compilers I know of). Once the compilation unit is converted into binary objects, those objects do not know about const-ness, and that can be exploited to take advantage.
// a.h
class A
{
public:
A( int a ) : data_( a ) {}
int get() const { return data_; }
private:
int data_;
};
// malicious.h
class A;
void change( A& a, int new_value );
// malicious.cpp
// does not include a.h, but redefines an almost exact copy of it
class A {
public:
A( int a ) : data_( a ) {}
int get() const { return data_; }
int data_; // private removed
};
void change( A& a, int new_value )
{
a.data_ = new_value;
}
// main.cpp
#include <iostream>
#include "a.h"
#include "malicious.h"
int main()
{
A a(0);
change( a, 10 );
std::cout << a.get() << std::endl; // 10
}
While the code above is incorrect (the One Definition Rule is broken, there are two definitions for class A), the fact is that with most compilers the definition of A and the malicious A are binary compatible. The code will compile and link, and the result is that external code has access to your private attributes.
Now that you know of it, don't do it. It will later be a maintenance pain in the ***. That has cost Microsoft quite a bit of money in providing backwards compatibility to software that used private parts of the API returned objects (new versions of the API that shared the same public interface but changed the internals would break some third party application code). With some broadly available software the provider (Microsoft in this case) will go through the pain of providing backwards compatibility, but with lesser known applications they won't and suddenly your previously running application will fail in all sort of ways.
I think it was Herb Sutter that once said that one should "Protect against Murphy, not against Machiavelli." That is, you should do everything possible to protect against the code being used incorrectly by accident, but there's nothing you can do about people abusing your code on purpose.
If someone really wants to break your code, they can, even if it's by #define private public before including your header (and thus creating an ODR violation, but I digress).
So yes, passing back a const ref is fine.
const_cast can definitely be used to obtain full access to the member. I guess you cannot stop people if they are hell-bent on shooting themselves in the foot. If the private member is not heavy, consider returning a copy of that variable.
Yes, so it's probably not what you want to do. On the other hand, if someone is going to the trouble of const casting your reference, it's possible they really know what they are doing.
It is possible to obtain a full access. But what for?
Don't forget to make the accessor const-correct:
const MyType& getMyValue() const;
You can also inject your private value into a callback:
void doJob( callback c )
{
c( myPrivateValue_ );
}
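A more complete sketch of the callback idea, assuming std::function is acceptable for the callback type. Owner and the string member are placeholders invented for the example.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Hypothetical owner that passes its private value to a caller-supplied
// callback instead of returning a reference to it.
class Owner {
    std::string myPrivateValue_;
public:
    explicit Owner(const std::string& v) : myPrivateValue_(v) {}
    void doJob(const std::function<void(const std::string&)>& callback) const {
        callback(myPrivateValue_);
    }
};
```

The caller only sees the value for the duration of the call, e.g. owner.doJob([](const std::string& s) { /* read s here */ });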
According to Scott Meyers' book, Effective C++ (see item #28), you should avoid it. Here is an excerpt from item #28:
This is why any function that returns a handle to an internal part of
the object is dangerous. It doesn’t matter whether the handle is a
pointer, a reference, or an iterator. It doesn’t matter whether it’s
qualified with const. It doesn’t matter whether the member function
returning the handle is itself const. All that matters is that a
handle is being returned, because once that’s being done, you run the
risk that the handle will outlive the object it refers to.