Bool member variable in class is set to True when uninitialized - c++

Updates:
The reason I was getting true is that any value other than 0 is treated as true, so it makes sense that I was very unlikely to see false from an uninitialized variable.
I have read a post similar to my question on StackOverflow. It said that it is good practice to initialize all member variables, which I agree with.
Post: Boolean variables aren't always false by default?
This post is a bit old (9 years ago), so I think some things may have changed in newer C++ versions; I am currently using C++17. I also have a slightly different question from the ones discussed in that post.
I am aware that if a variable is uninitialized it may contain some "garbage data" or as one of the answers in the post said (which I think that is what they meant but I'm not 100% sure), "if not explicitly initialized -- will contain an arbitrary value.".
I have tried testing that, and the results showed that when I didn't initialize my variables, they contained random numbers (for int, double). I also tested std::string but they are set to "" by default from what I saw.
Anyway, when I tried the built-in type bool I would always get true. (This is after the class is constructed, with the boolean never initialized; in the debugger the value was true.) What confuses me is that no matter how many times I ran the test to see whether the value was randomly true or false, it was always true. If the variable is uninitialized, shouldn't the value be more or less "random"? Why was it always true (again, for a member variable of my class that wasn't initialized on construction)?
Solutions I tried:
Obviously one is just to initialize on construction, but I thought of another one...
What if I want it to be true once the object has been constructed, and false when it has not? That way, when I have a vector of pointers to my objects, I could check that I am not reading an uninitialized object through a pointer by testing whether that boolean is true (initialized) or false (not). I couldn't use method 1, initializing to false on construction, and with the behaviour reversed I can't rely on what the uninitialized boolean would hold, since, as described above, I am unsure how it behaves given the results I had been getting. I did the following and it worked...
class Testing{
private:
    bool condition{false}; // Initialize it here, which kind of confuses me, but it works
public:
    Testing() : condition{true} {} // Constructor setting the condition value to true
};
Could someone explain whether it is wrong to do this? Personally I have never seen anyone do this, but I tried it and no errors were given.

While bool conceptually contains only one bit of information, the requirements of the C++ standard mean that a bool object must take up at least eight bits. There are three main ways that a compiler might represent bool at a bitwise level:
All bits zeroed for false; all bits one'd for true (0x00 versus 0xFF)
All bits zeroed for false; the lowest bit one'd for true (0x00 versus 0x01)
All bits zeroed for false; at least one bit one'd for true (0x00 versus anything else)
(Note that this choice of representation is not ordinarily visible in the effects of a program. Regardless of how the bits are represented, a bool becomes a 0 or 1 when cast to a wider integer type. It's only relevant to the machine code being generated.)
In practice, modern x86/x64 compilers go with option 2. There are instructions which make this straightforward, it makes casting bool to int trivial, and comparing bools works without additional effort.
A side effect is that if the bits making up a bool end up set to, say, 0x37, weird stuff can happen, because the executable code isn't expecting that. For instance, both branches of an if-statement might be taken. A good debugger should loudly yell at you when it sees a bool with an unexpected bit pattern, but in practice they tend to show the value as true.
The common theme of all those options is that most random bit patterns are not the bit pattern for false. So if the allocator really did set it to a "random" value, it almost certainly would be shown as true in the debugger.
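On the second half of the question: a default member initializer and a constructor's member initializer list can coexist. Below is a minimal sketch, with the class reshaped for illustration (the tagged second constructor, the accessor, and the check function are hypothetical additions, not part of the original code). When a constructor names the member in its initializer list, that value is used and the default member initializer is ignored for that constructor; otherwise the default applies.

```cpp
#include <cassert>

class Testing {
public:
    struct UseDefault {};  // hypothetical tag, only here to select the second constructor

    Testing() : condition{true} {}   // member initializer list wins over the default below
    explicit Testing(UseDefault) {}  // member not listed: the default member initializer applies

    bool is_set() const { return condition; }  // hypothetical accessor for the demo

private:
    bool condition{false};  // default member initializer
};

// Small usage check covering both constructors.
inline void check_testing() {
    Testing constructed;
    Testing defaulted{Testing::UseDefault{}};
    assert(constructed.is_set());
    assert(!defaulted.is_set());
}
```

Neither object ever holds an indeterminate bool, so there is nothing "random" to observe in the debugger.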

Related

Using a flag number within unsigned integers

People often fold a boolean check into an int variable they already have, using -1 to signal that something does not exist.
However, what if someone wants to use unsigned integers but still wants to use this method, and 0 already has a meaning other than non-existence?
Is there a way to have a data range be -1 to 4,294,967,294?
The obvious choice here is to just use a separate bool, but it is my understanding that a bool takes a byte, which can really add to the storage size if you have an array of structs. This is why I wondered whether there was a way to keep as many useful (positive) numbers as possible while leaving just one number to act as a flag.
In fact, if it were possible to shift the range of a data type, shifting it to something like -10 to 4,294,967,285 would seem to allow you to have 10 boolean flags at no additional cost (bits).
The obvious hacky method here is just to add some offset to what you're storing and remember to account for it later on, but I wanted to keep it a bit more readable (I guess if that's the case I shouldn't even be using -1, but meh).
If you simply want to pick a value which cannot occur in your interpretation of the variable and use it to indicate an exception or error value, why not simply do it? You can take such a value, define it as a macro and use it. For example, if you are sure that your variable never reaches the max limit, put:
#include <limits.h> /* UINT_MAX */
#define MY_FUN_ERROR_VALUE (UINT_MAX)
then you can use it as:
unsigned r = my_function_maybe_returning_error();
if (r == MY_FUN_ERROR_VALUE) { /* handle error */ }
You should also ensure that my_function_maybe_returning_error does not return MY_FUN_ERROR_VALUE in normal conditions, when no error has actually happened. For this you may use an assert:
unsigned my_function_maybe_returning_error() {
    ...
    // branch going to return normal (not error) value r
    assert(r != MY_FUN_ERROR_VALUE);
    return r;
}
I do not see anything wrong with this.
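A self-contained sketch of the same sentinel idea, assuming UINT_MAX is never a legitimate result. The names kNotFound and index_of are hypothetical, and a constexpr constant stands in for the macro purely as a stylistic choice:

```cpp
#include <cassert>
#include <climits>

// Hypothetical sentinel: assumes UINT_MAX is never a legitimate result.
constexpr unsigned kNotFound = UINT_MAX;

// Hypothetical lookup: index of the first occurrence of `wanted`, or kNotFound.
inline unsigned index_of(const int* data, unsigned size, int wanted) {
    for (unsigned i = 0; i < size; ++i) {
        if (data[i] == wanted) {
            assert(i != kNotFound);  // a real result must never collide with the sentinel
            return i;
        }
    }
    return kNotFound;
}
```

A caller compares the result against kNotFound before using it as an index, exactly as with MY_FUN_ERROR_VALUE above. This costs one value out of the unsigned range and no extra storage.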
You just asked how to use a value that can be 0 or something greater than 0 to hold the three states: whatever 0 means, something greater than 0, and does not exist. So no, (by the pigeonhole principle I guess) it's not possible.
Nor should it be. Overloading a variable is bad practice unless you're down to your last 3 bytes left of RAM, which you almost certainly aren't. So yes, please use another variable with a correct name and clear purpose.

bool colon initialization

While reading some C++ code, I saw and was confused by this little line in a class:
bool x:1;
In debug builds, I noticed that 'x' is initialized as 'false', but I can not find any documentation about that. Can anyone tell me what this syntax does?
it's a bit field. read up on bit fields in your c++ textbook.
the initialization to false is independent of the declaration. whether it is guaranteed by your code depends on your code (not given).
the c++ standard gives the compiler some leeway for integer and enumeration bitfields of size 1: storing the value 1 in such a field, you may get out the value -1. happily this applies only to fields of size 1, and it does not apply to a field of type bool.
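A small sketch of what such a declaration looks like in context. The struct Flags and the helper round_trip are hypothetical. The point it tries to show is that the ": 1" only sets the width of the member; whether the field starts out false depends on how the object is initialized, here by value-initialization with empty braces.

```cpp
#include <cassert>

struct Flags {
    bool x : 1;          // one bit wide, holds false or true
    bool y : 1;
    unsigned count : 6;  // six bits wide, holds 0..63
};

// Hypothetical helper: value-initialize, then write and read back the fields.
inline bool round_trip() {
    Flags f{};  // value-initialization zeroes every field
    assert(!f.x && !f.y && f.count == 0);
    f.x = true;
    f.count = 63;
    return f.x && !f.y && f.count == 63;
}
```

Without the empty braces (plain "Flags f;" as a local variable), the fields would be left uninitialized, which matches the remark that the initialization is independent of the declaration.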

Default value of an integer?

My program requires several floats to be set to a default number when the program launches. As the program runs these integers will be set to their true values. These true values however can be any real number. My program will constantly be checking these numbers to see if their value has been changed from the default.
For example, let's say I have integers A, B, C. All these integers will be set to a default value at the start (let's say -1). Then as the program progresses, let's say A and B are set to 3 and 2 respectively. Since C is still at the default value, the program can conclude that C hasn't been assigned a non-default value yet.
The problem arises when trying to find a unique default value. Since the numbers can be set to anything, if the value one is set to is identical to the default value, my program won't know whether a float still has the default value or its true value is just identical to the default value.
I considered NULL as a default value, but NULL is equal to 0 in C++, leading to the same problem!
I could create a whole object consisting of an bool and a float as members, where the bool indicates whether the float has been assigned its own value yet or not. This however seems like an overkill. Is there a default value I can set my floats to such that the value isn't identical to any other value? (Examples include infinity or i)
I am asking for C/C++ solutions.
I could create a whole object consisting of an bool and a integer as
members, where the bool indicates whether the number has been assigned
its own value yet or not. This however seems like an overkill.
What you described is called a "nullable type" in .NET. A C++ implementation is boost::optional:
#include <boost/optional.hpp>
boost::optional<int> A;
if (A)
    do_something(*A);
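Since C++17 the standard library has std::optional, which can play the same role without Boost. A minimal sketch, assuming a C++17 compiler; the Settings struct and the helper functions are hypothetical names for the question's A, B and C:

```cpp
#include <cassert>
#include <optional>

// Hypothetical holder for the question's A, B and C.
struct Settings {
    std::optional<float> a;  // empty until assigned
    std::optional<float> b;
    std::optional<float> c;
};

// Counts how many values have been assigned so far.
inline int assigned_count(const Settings& s) {
    return int(s.a.has_value()) + int(s.b.has_value()) + int(s.c.has_value());
}

inline void check_settings() {
    Settings s;
    assert(assigned_count(s) == 0);
    s.a = 3.0f;
    s.b = -1.0f;  // -1 is an ordinary value here, not a marker
    assert(assigned_count(s) == 2);
    assert(!s.c);  // c has not been assigned yet
}
```

Because emptiness is tracked separately from the stored value, every float remains usable as a real value.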
On a two's complement machine there's an integer value that is less useful than the others: INT_MIN. You can't make a valid positive value by negating it. Since it's the least useful value in the integer range, it makes a good choice for a marker value. It also has an easily recognizable hex value, 0x80000000.
There is no bit pattern you can assign to an int that isn't an actual int. You need to keep separate flags if you really have no integer values that are out of bounds.
If the domain of valid int values is unlimited, the only choice is a management bit indicating whether it is assigned or not.
But, are you sure MAX_INT is a desired choice?
There is no way to guarantee that a value you assign an int to is not going to be equal to another random int. The only way to assure that what you want to happen occurs, is to create a separate bool to account for changes.
No, you will have to create your own data type which contains the information about whether it has been assigned or not.
If as you say, no integer value is off limits, then you cannot assign a default "uninitialised" value. Just use a struct with an int and a bool as you suggest in your question.
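The struct with an int and a bool that these answers recommend might look like the sketch below. TrackedInt and its set function are hypothetical names; the default member initializers make a fresh object start out as "not assigned".

```cpp
#include <cassert>

// Hypothetical value-plus-flag pair.
struct TrackedInt {
    int value = 0;
    bool assigned = false;

    void set(int v) {
        value = v;
        assigned = true;
    }
};

inline void check_tracked_int() {
    TrackedInt c;
    assert(!c.assigned);  // nothing stored yet
    c.set(-1);            // -1 is a legitimate value, the flag carries the state
    assert(c.assigned && c.value == -1);
}
```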
I could create a whole object consisting of an bool and a integer as
members, where the bool indicates whether the number has been assigned
its own value yet or not. This however seems like an overkill.
My first guess would be to effectively use a flag and mark each variable. But this is not your only choice of course.
You can use pointers (which can be NULL) and allocate the memory dynamically. Not very convenient.
You can pick a custom value which is almost never used and define it to be the default value. Of course, sometimes you will need to assign this very value to one of your floats, but this won't happen often, and you just need to keep track of those variables. Given how rare such cases are, a simple linked list should do.

C++ test to verify equality operator is kept consistent with struct over time

I voted up @TomalakGeretkal for a good note about by-contract; I haven't accepted an answer, as my question is how to programmatically check the equals function.
I have a POD struct & an equality operator, a (very) small part of a system with >100 engineers.
Over time I expect the struct to be modified (members added/removed/reordered) and I want to write a test to verify that the equality op is testing every member of the struct (i.e. is kept up to date as the struct changes).
As Tomalak pointed out - comments & "by contract" is often the best/only way to enforce this; however in my situation I expect issues and want to explore whether there are any ways to proactively catch (at least many) of the modifications.
I'm not coming up with a satisfactory answer - this is the best I've thought of:
- new up two instances of the struct (x, y), fill each with identical non-zero data
- check x==y
- modify x "byte by byte":
  - take ptr to be (unsigned char*)&x
  - iterate over ptr (for sizeof(x)):
    - increment the current byte
    - check !(x==y)
    - decrement the current byte
    - check x==y
The test passes if the equality operator caught every byte (NOTE: there is a caveat to this: not all bytes are used in the compiler's representation of x, therefore the test would have to 'skip' these bytes, e.g. hard-code the bytes to ignore)
My proposed test has significant problems: (at least) the 'don't care' bytes, and the fact that incrementing one byte of the types in x may not result in a valid value for the variable at that memory location.
Any better solutions?
(This shouldn't matter, but I'm using VS2008, rtti is off, googletest suite)
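The byte-by-byte idea from the question can be sketched as follows, under assumptions that sidestep its two stated problems: the hypothetical struct Sample has only integer members (so every byte pattern is a valid value) and a static_assert guards against padding. The sketch uses C++11 features, which VS2008 does not provide, and it copies through a byte buffer with memcpy instead of modifying x in place. All names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

// Hypothetical struct under test; integer members only, so any byte pattern is valid.
struct Sample {
    int a;
    int b;
    unsigned c;
};

inline bool operator==(const Sample& l, const Sample& r) {
    return l.a == r.a && l.b == r.b && l.c == r.c;
}

// The sketch assumes no padding bytes; this guards that assumption.
static_assert(sizeof(Sample) == 2 * sizeof(int) + sizeof(unsigned),
              "Sample has padding; padding bytes would need to be skipped");

// Returns true if changing any single byte of `reference` makes operator== report a difference.
template <typename T>
bool equality_sees_every_byte(const T& reference) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "memcpy needs a trivially copyable type");
    unsigned char bytes[sizeof(T)];
    for (std::size_t i = 0; i < sizeof(T); ++i) {
        std::memcpy(bytes, &reference, sizeof(T));
        bytes[i] = static_cast<unsigned char>(bytes[i] + 1);  // wraps 255 -> 0, so the byte always changes
        T probe;
        std::memcpy(&probe, bytes, sizeof(T));
        if (probe == reference) return false;  // this byte is invisible to operator==
    }
    return true;
}

inline void check_sample_equality() {
    const Sample s{1, 2, 3};
    assert(equality_sees_every_byte(s));
}
```

If a member is added to Sample without updating operator==, the loop finds a byte whose change goes unnoticed and the function returns false. Structs with padding, floating-point members, bools or pointers would need extra handling that this sketch does not attempt.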
Though tempting to make code 'fool-proof' with self-checks like this, it's my experience that keeping the self-checks themselves fool-proof is, well, a fool's errand.
Keep it simple and localise the effect of any changes. Write a comment in the struct definition making it clear that the equality operator must also be updated if the struct is; then, if this fails, it's just the programmer's fault.
I know that this will not seem optimal to you as it leaves the potential for user error in the future, but in reality you can't get around this (at least without making your code horrendously complicated), and often it's most practical just not to bother.
I agree with (and upvoted) Tomalak's answer. It's unlikely that you'll find a foolproof solution. Nonetheless, one simple semi-automated approach could be to validate the expected size within the equality operator:
bool MyStruct::operator==(const MyStruct &rhs) const
{
    assert(sizeof(MyStruct) == 42); // reminder to update as new members added
    // actual functionality here ...
}
This way, if any new members are added, the assert will fire until someone updates the equality operator. This isn't foolproof, of course. (Member vars might be replaced with something of the same size, etc.) Nonetheless, it's a relatively simple check (a one-line assert) that has a good shot of detecting the error case.
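With a C++11 or later compiler (an assumption; the question mentions VS2008, which lacks static_assert), the same reminder can be moved to compile time, so it cannot be skipped in builds where assert is disabled. A sketch with a hypothetical two-member stand-in for MyStruct:

```cpp
#include <cassert>

// Hypothetical stand-in for MyStruct.
struct MyStruct {
    int id;
    int count;

    bool operator==(const MyStruct& rhs) const {
        // Compile-time reminder: the build fails when the size changes.
        // Assumes two ints are laid out without padding, which holds on mainstream compilers.
        static_assert(sizeof(MyStruct) == 2 * sizeof(int),
                      "MyStruct changed: update operator== and this check");
        return id == rhs.id && count == rhs.count;
    }
};

inline void check_my_struct() {
    const MyStruct a{1, 2};
    const MyStruct b{1, 2};
    const MyStruct c{1, 3};
    assert(a == b);
    assert(!(a == c));
}
```

The same limitation applies as for the runtime assert: a change that keeps the size constant goes undetected.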
I'm sure I'm going to get downvoted for this but...
How about a template equality function that takes a reference to an int parameter and the two objects being tested? The equality function will return bool, but will increment the referenced int by sizeof(T).
Then have a large test function that calls the template for each object and sums the total size --> compare this sum with the sizeof the object. The existence of virtual functions/inheritance, etc could kill this idea.
it's actually a difficult problem to solve correctly in a self-test.
the easiest solution i can think of is to take a few template functions which operate on multiple types, perform the necessary conversions, promotions, and comparisons, then verify the result in an external unit test. when a breaking change is introduced, at least you'll know.
some of these challenges are more easily maintained/verified using approaches such as composition, rather than extension/subclassing.
Agree with Tomalak and Eric. I have used this for very similar problems.
assert is compiled out when NDEBUG is defined (as is typical for release builds), so potentially you can release code that is wrong. These tests will not always work reliably. If the structure contains bit fields, or items are inserted that take up slack space caused by the compiler aligning to word boundaries, the size won't change. For this reason they offer limited value, e.g.
struct MyStruct {
    char a;
    ulong l;
};
changed to
struct MyStruct {
    char a;
    char b;
    ulong l;
};
Both structures are 8 bytes (on 32bit Linux x86)

Coding conventions for method returns in C++

I have observed that the general coding convention for a successful completion of a method intended functionality is 0. (As in exit(0)).
This kind of confuses me because if I call the method inside an if statement and it returns 0, the condition is false, which makes me think for a minute that the method has failed. Of course I know I have to prepend a "!" (as in if(!Method())), but isn't this convention kind of contradicting itself?
You need to differentiate between an error code and an error flag. A code is a number representing any number of errors, while a flag is a boolean that indicates success.
When it comes to error codes, the idea is: There is only one way to succeed, but there are many ways to fail. Take 0 as the single unique number representing success; then every other number is a way of indicating failure. (It doesn't make sense any other way.)
When it comes to error flags, the idea is simple: True means it worked, false means it didn't. You can usually then get the error code by some other means, and act accordingly.
Some functions use error codes, some use error flags. There's nothing confusing or backwards about it, unless you're trying to treat everything as a flag. Not all return values are a flag, that's just something you'll have to get used to.
Keep in mind in C++ you generally handle errors with exceptions. Instead of looking up an error code, you just get the necessary information out of the caught exception.
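A sketch of the exception-based style mentioned above, where the caller reads the failure details from the caught exception instead of looking up a code. The functions parse_percentage and describe are hypothetical; std::stoi, std::out_of_range and std::exception::what are standard library facilities.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical operation that reports failure by throwing instead of returning a code.
inline int parse_percentage(const std::string& text) {
    const int value = std::stoi(text);  // throws std::invalid_argument on non-numeric input
    if (value < 0 || value > 100) {
        throw std::out_of_range("percentage must be between 0 and 100, got " + text);
    }
    return value;
}

// The caller takes the failure details from the exception object.
inline std::string describe(const std::string& text) {
    try {
        return "ok: " + std::to_string(parse_percentage(text));
    } catch (const std::exception& e) {
        return std::string("error: ") + e.what();
    }
}

inline void check_describe() {
    assert(describe("42") == "ok: 42");
    assert(describe("150") == "error: percentage must be between 0 and 100, got 150");
}
```

The return value stays free for the actual result, and there is no question of whether 0 means success or failure.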
The convention isn't contradicting itself, it's contradicting your desired use of the function.
One of them has to change, and it's not going to be the convention ;-)
I would usually write either:
if (Function() == 0) {
    // code for success
} else {
    // code for failure
}
or if (Function() != 0) with the cases the other way around.
Integers can be implicitly converted to boolean, but that doesn't mean you always should. 0 here means 0, it doesn't mean false. If you really want to, you could write:
int error = Function();
if (!error) {
    // code for success
} else {
    // code for failure, perhaps using error value
}
There are various conventions, but the most common for C functions is to return 0 on failure and a positive value on success, so you can use the call directly inside an if statement (almost all CPUs have conditional jumps that test whether a value is 0 or not, which is why in C this is "abused", with 0 meaning false and everything else meaning true).
Another convention is to return -1 on error and some other value on success instead (you especially see this with POSIX functions that set the errno variable). And this is where 0 can be interpreted as "success".
Then there's exit. It is different, because the value returned is not to be interpreted by C, but by a shell. And here the value 0 means success, and every other value means an error condition (a lot of tools tell you what type of error occurred with this value). This is because in the shell, you normally only have a range of 0-127 for returning meaningful values (historic reasons, it's an unsigned byte and everything above 127 means killed by some signal IIRC).
You have tagged your question as [c] and [c++]. Which is it? Because the answer will differ somewhat.
You said:
I have observed that the general coding convention for a successful completion of a method intended functionality is 0.
For C, this is fine.
For C++, it decidedly is not. C++ has another mechanism to signal failure (namely exceptions). Abusing (numeric) return values for that is usually a sign of bad design.
If exceptions are a no-go for some reasons, there are other ways to signal failure without clogging the method’s return type. Alternatives comprise returning a bool (consider a method try_insert) or using an invalid/reserved return value for failure, such as string::npos that is used by the string::find method when no occurrence is found in the string.
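The two alternatives named above can be sketched with standard library pieces: std::string::find with its reserved value std::string::npos, and a bool-returning try_insert. The helper functions has_extension and try_insert are hypothetical; the latter is built on std::set::insert, whose result carries a bool telling whether the insertion took place.

```cpp
#include <cassert>
#include <set>
#include <string>

// Reserved return value: std::string::find yields std::string::npos when nothing is found.
inline bool has_extension(const std::string& filename) {
    return filename.find('.') != std::string::npos;
}

// bool result: hypothetical try_insert, true only if the value was not present before.
inline bool try_insert(std::set<int>& values, int value) {
    return values.insert(value).second;
}

inline void check_alternatives() {
    assert(has_extension("notes.txt"));
    assert(!has_extension("Makefile"));

    std::set<int> seen;
    assert(try_insert(seen, 7));   // first insertion succeeds
    assert(!try_insert(seen, 7));  // duplicate is reported as failure
}
```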
exit(0) is a very special case because it's a sentinel value requesting that the compiler tell the operating system to return whatever the real success value is on that OS. It's entirely possible that it won't be the number 0.
As you say, many functions return 0 for success, but they're mainly "legacy" C library OS-interfacing functions, following the interfacing style of the operating systems on which C was first developed and deployed.
In C++, 0 may therefore be a success value when wrapping such a C legacy interface. Another situation where you might consider using 0 for success is when you are effectively returning an error code, such that all errors are non-zero values and 0 makes sense as a not-an-error value. So, don't think of the return value as a boolean (even though C++ will implicitly convert it to one), but as an error code where 0 means "no error". (In practice, using an enum is typically best).
Still, you should generally return a boolean value from functions that are predicates of some form, such as is_empty(), has_dependencies(), can_fit() etc., and typically throw an exception on error. Alternatively, use a sub-system (and perhaps thread) specific value for error codes as per libc's errno, or accept a separate reference/pointer argument to the variable to be loaded with the error code.
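The enum suggestion above could look like the following sketch. SaveResult, save_work and is_success are hypothetical names. A scoped enum (enum class) is used here as one possible choice; it does not convert implicitly to bool, so a test like if(!save_work(...)) does not compile and the caller has to spell out which value is meant.

```cpp
#include <cassert>

// Hypothetical error codes; 0 is reserved for "no error".
enum class SaveResult { Success = 0, PermissionDenied, NoSpace };

// Hypothetical operation: reports why it failed instead of a bare true/false.
inline SaveResult save_work(bool can_write, long free_bytes, long needed_bytes) {
    if (!can_write) return SaveResult::PermissionDenied;
    if (free_bytes < needed_bytes) return SaveResult::NoSpace;
    return SaveResult::Success;
}

// Predicates, by contrast, still return bool.
inline bool is_success(SaveResult r) { return r == SaveResult::Success; }

inline void check_save_work() {
    assert(save_work(true, 100, 10) == SaveResult::Success);
    assert(save_work(false, 100, 10) == SaveResult::PermissionDenied);
    assert(save_work(true, 5, 10) == SaveResult::NoSpace);
    assert(static_cast<int>(SaveResult::Success) == 0);  // 0 still means "no error"
}
```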
int retCode = SaveWork();
if(retCode == 0) {
    //Success !
} else if(retCode == ERR_PERMISSIONS) {
    //User doesn't have permissions, inform him
    //and let him choose another place
} else if(retCode == ERR_NO_SPACE) {
    //No space left to save the work. Figure out something.
} else {
    //I give up, user is screwed.
}
So, if 0/false were returned to mean failure, you could not distinguish what was the cause of the error.
For C++, you could use exceptions to distinguish between different errors. You could also use a global variable, akin to errno, which you inspect in case of a failure. When neither exceptions nor global variables are desired, returning an error code is commonly used.
As the other answers have already said, using 0 for success leaves everything non-zero for failure. Often (though not always) this means that individual non-zero values are used to indicate the type of failure. So as has already been said, in that situation you have to think of it as an error code instead of a success/failure flag.
And in that kind of situation, I absolutely loathe seeing a statement like if(!Method()) which is actually a test for success. For myself too, I've found it can cause a moment of wondering whether that statement is testing for success or failure. In my opinion, since it's not a simple boolean return, it shouldn't be written like one.
Ideally, since the return is being used as an error code the author of that function should have provided an enum (or at least a set of defines) that can be used in place of the raw numbers. If so, then I would always prefer the test rewritten as something like if(Method() == METHOD_SUCCESS). If the author didn't provide that but you have documentation on the error code values, then consider making your own enum or defines and using that.
If nothing else, I would write it as if(Method() == 0) because then at least it should still be clear to the reader that Method doesn't return a simple boolean and that zero has a specific meaning.
While that convention is often used, it certainly isn't the only one. I've seen conventions that distinguish success and failure by using positive and negative values. This particularly happens when a function returns a count upon success, but needs to return something that isn't a valid count upon failure (often -1). I've also seen variations using unsigned numbers where (unsigned int)-1 (aka 0xffffffff) represented an error. And there are probably others that I can't even think of offhand.
Since there's no single right way to do it, various authors at various times have invented various schemes.
And of course this is all without mentioning that exceptions offer (among other things) a totally different way to provide information when a function has an error.
I use this convention in my code:
int r = Count();
if(r >= 0) {
    // Function successful, r contains a useful non-negative value
}
else {
    // A negative r represents an error code
}
I tend to avoid bool return values, since they have no advantages (not even regarding their size, which is rounded up to a byte) and limit possible extensions of the function's return values.
Before using exceptions, take into account the performance and memory issues they bring.