I am refactoring some older code that uses NULL in many places. The question is:
Is it safe to blindly replace all instances of NULL with nullptr?
I am particularly interested in scenarios where replacing NULL with nullptr may lead to run-time errors (compile-time errors would be OK), but I can't think of any. If there are none, would it be safe to just auto-replace NULL with nullptr (fixing compile-time errors, if any)?
I apologize if this question has been asked earlier - I couldn't find it; I will delete it if you point me to the answer!
In practice it should be fairly safe.
But, technically, it is possible that the meaning of your program changes, without causing any compiler errors. Consider the following program:
#include <cstddef>
#include <iostream>

void foo(int)  { std::cout << "int\n"; }
void foo(int*) { std::cout << "int*\n"; }

int main() {
    foo(NULL);    // prints 'int'
    foo(nullptr); // prints 'int*'
    return 0;
}
Note that when there's ambiguity between an int and a pointer when passing NULL, the pointer version is what's almost always desired -- which means that most real programs won't have an ambiguity like that in the first place (or will use casts like (int*)NULL to get around it, in which case replacement by nullptr is perfectly fine).
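For illustration, here is a small sketch (reusing the same foo overloads as above, nothing beyond that is assumed) of how the explicit-cast workaround behaves identically before and after a mechanical replacement:
#include <cstddef>
#include <iostream>

void foo(int)  { std::cout << "int\n"; }
void foo(int*) { std::cout << "int*\n"; }

int main() {
    foo((int*)NULL);    // the cast selects the pointer overload: prints 'int*'
    foo((int*)nullptr); // after the mechanical replacement: still prints 'int*'
    return 0;
}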
I am a C++ learner. Others told me "an uninitialized pointer may point anywhere". How can I prove that with code? I made a little test program, but my uninitialized pointer always points to 0. In which cases does it not point to 0? Thanks
#include <cstdio>

int main() {
    int* p;              // uninitialized
    printf("%d\n", p);   // note: printing a pointer with %d is itself not portable
    char* p1;            // uninitialized
    printf("%d\n", p1);
    return 0;
}
Any uninitialized variable by definition has an indeterminate value until a value is supplied, and even accessing it is undefined behaviour. Because this is a grey area of undefined behaviour, there's no way to guarantee that an uninitialized pointer will be anything other than 0 (or anything in particular).
Anything you write to demonstrate this would be dictated by the compiler and system you are running on.
If you really want to, you can try writing a function that fills up a local array with garbage values, and create another function that defines an uninitialized pointer and prints it. Run the second function after the first in your main() and you might see it.
Edit: For your curiosity, I exhibited the behaviour with VS2015 on my system with this code:
#include <iostream>
#include <cstdint>

void f1()
{
    // fill part of the stack with junk values
    char arr[24];
    for (char& c : arr) c = 1;
}

void f2()
{
    // uninitialized pointers, living where f1's junk values were written
    int* ptr[4];
    std::cout << (std::uintptr_t)ptr[1] << std::endl;
}

int main()
{
    f1();
    f2();
    return 0;
}
Which prints 16843009 (0x01010101). But again, this is all undefined behaviour.
Well, I don't think it is worth trying to prove this, because good coding style says: initialise all variables! One example: if you free() a pointer, give it a value afterwards, like in this example:
char *p = NULL; // yes, this is not needed, but do it! Later you may change your program and add code beneath this line...
p = (char *)malloc(512);
...
free(p);
p = NULL;
That is safe and good style. Also, if you call free(p) again by accident, it will not crash your program, because free(NULL) is a no-op! If you don't set p to NULL after the free(), you could use the pointer again by mistake, and your program would try to access already-freed memory - this will crash your program or (worse) may end in strange results.
So don't waste time on the question of where uninitialized pointers point. Just give your variables (pointers) values! :-)
It depends on the compiler. Your code, executed on an old MSVC 2008, displays this in release mode (effectively random values):
1955116784
1955116784
and this in debug mode (after complaining about uninitialized pointer usage):
-858993460
-858993460
because that implementation sets uninitialized pointers to 0xcccccccc in debug mode to detect their usage.
The standard says that using an uninitialized pointer leads to undefined behaviour. That means that, as far as the standard is concerned, anything can happen. But a particular implementation is free to do whatever it wants:
yours happens to set the pointers to 0 (but you should not rely on it unless it is documented in the implementation's documentation)
MSVC in debug mode sets the pointer to 0xcccccccc, but AFAIK does not document it (*), so we still cannot rely on it
(*) at least I could not find any reference...
I have a class with a method that returns a std::shared_ptr (typedef'd as Product_SPTR):
Product_SPTR Mill::Production(sf::Time time)
{
    if (m_isProducing)
    {
        if (elapsedTime.getElapsedTime() > m_manufacturingTime)
        {
            elapsedTime.restart();
            Flour_SPTR a(new Flour(5, 1, ProductType::CONSTRUCTION), deleter<Flour>);
            return a;
        }
    }
}
Then I have typedef std::vector<Product_SPTR> VectorProduct_SPTR,
and when I try to add a new Product_SPTR to the vector I get a segmentation fault.
Here:
products.push_back(a->Production(gameTime.getElapsedTime()));
But when I do something like this:
products.push_back(Product_SPTR(new Flour(5, 1, ProductType::CONSTRUCTION), deleter<Flour>));
the problem does not occur...
I have just started to use smart pointers, so maybe I don't know how to use them...
You missed a return statement for the case when the conditions in the ifs evaluate to false; flowing off the end of a value-returning function like that is undefined behaviour. It compiles, probably giving you compiler warnings. You should always work at the highest warning level and eliminate all warnings one by one, unless you understand a warning and its implications.
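A minimal sketch of what Production might look like with a value returned on every path (assuming an empty Product_SPTR is an acceptable "nothing produced" result, which the caller then has to check before pushing it into the vector or dereferencing it):
Product_SPTR Mill::Production(sf::Time time)
{
    if (m_isProducing && elapsedTime.getElapsedTime() > m_manufacturingTime)
    {
        elapsedTime.restart();
        return Flour_SPTR(new Flour(5, 1, ProductType::CONSTRUCTION), deleter<Flour>);
    }
    return Product_SPTR(); // empty shared_ptr: nothing was produced
}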
Additionally, instead of writing
Flour_SPTR a(new Flour(5, 1, ProductType::CONSTRUCTION),deleter<Flour>);
return a;
You probably should write
return Flour_SPTR(new Flour(5, 1, ProductType::CONSTRUCTION), deleter<Flour>);
Modern compilers shouldn't have any problems optimizing out the redundant variable, but it's always good to help the compiler do its job. If you could skip the custom deleter, you could also write:
return std::make_shared<Flour>(5, 1, ProductType::CONSTRUCTION);
First of all: I know that most optimization bugs are due to programming errors or relying on facts which may change depending on optimization settings (floating point values, multithreading issues, ...).
However, I experienced a very hard-to-find bug, and I am somewhat unsure whether there is any way to prevent this kind of error without turning optimization off. Am I missing something? Could this really be an optimizer bug? Here's a simplified example:
struct Data {
    int a;
    int b;
    double c;
};

struct Test {
    void optimizeMe();
    Data m_data;
};

void Test::optimizeMe() {
    Data * pData; // Note that this pointer is not initialized!
    bool first = true;
    for (int i = 0; i < 3; ++i) {
        if (first) {
            first = false;
            pData = &m_data;
            pData->a = i * 10;
            pData->b = i * pData->a;
            pData->c = pData->b / 2;
        } else {
            pData->a = ++i;
        } // end if
    } // end for
}

int main(int argc, char *argv[]) {
    Test test;
    test.optimizeMe();
    return 0;
}
The real program of course has a lot more to do than this. But it all boils down to the fact that instead of accessing m_data directly, a (previously uninitialized) pointer is being used. As soon as I add enough statements to the if (first)-part, the optimizer seems to change the code to something along these lines:
if (first) {
    first = false;
    // pData-assignment has been removed!
    m_data.a = i * 10;
    m_data.b = i * m_data.a;
    m_data.c = m_data.b / 2;
} else {
    pData->a = ++i; // This will crash - pData is not set yet.
} // end if
As you can see, it replaces the unnecessary pointer dereference with a direct write to the member struct. However, it does not do this in the else-branch. It also removes the pData assignment. Since the pointer is still uninitialized, the program will crash in the else-branch.
Of course there are various things which could be improved here, so you might blame it on the programmer:
Forget about the pointer and do what the optimizer does - use m_data directly.
Initialize pData to nullptr - that way the optimizer knows that the else-branch will fail if the pointer is never assigned. At least it seems to solve the problem in my test-environment.
Move the pointer assignment in front of the loop, effectively initializing pData with &m_data (which then could also be a reference instead of a pointer, for good measure). This makes sense because pData is needed in all cases, so there is no reason to do this inside the loop.
The code is obviously smelly, to say the least, and I'm not trying to "blame" the optimizer for doing this. But I'm asking: What am I doing wrong? The program might be ugly, but it's valid code...
I should add that I'm using VS2012 with C++/CLI and v110_xp-Toolset. Optimization is set to /O2. Please also note that if you really want to reproduce the problem (that's not really the point of this question though) you need to play around with the complexity of the program. This is a very simplified example and the optimizer sometimes doesn't remove the pointer assignment. Hiding &m_data behind a function seems to "help".
EDIT:
Q: How do I know that the compiler is optimizing it to something like the example provided?
A: I'm not very good at reading assembler; I have looked at it, however, and made three observations which make me believe that it's behaving this way:
As soon as optimization kicks in (adding more assignments usually does the trick), the pointer assignment has no associated assembler statement. It also hasn't been moved up to the declaration, so it really seems to be left uninitialized (at least to me).
In cases where the program crashes, the debugger skips the assignment statement. In cases where the program runs without problems, the debugger stops there.
If I watch the contents of pData and m_data while debugging, it clearly shows that all assignments in the if-branch have an effect on m_data, and m_data receives the correct values. The pointer itself is still holding the same uninitialized value it had from the beginning. Therefore I have to assume that it is in fact not using the pointer to make the assignments at all.
Q: Does it have to do anything with i (Loop unrolling)?
A: No, the actual program uses do { ... } while() to loop over a SQL SELECT result set, so the iteration count is completely runtime-specific and cannot be predetermined by the compiler.
It sure looks like a bug to me. It's fine for the optimizer to eliminate the unnecessary indirection, but it should not eliminate the assignment to pData.
Of course, you can work around the problem by assigning to pData before the loop (at least in this simple example). I gather that the problem in your actual code isn't as easily resolved.
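A minimal sketch of that workaround, with the assignment hoisted out of the loop (otherwise the body is unchanged):
void Test::optimizeMe() {
    Data * pData = &m_data; // assigned exactly once, before the loop
    bool first = true;
    for (int i = 0; i < 3; ++i) {
        if (first) {
            first = false;
            pData->a = i * 10;
            pData->b = i * pData->a;
            pData->c = pData->b / 2;
        } else {
            pData->a = ++i; // pData is now guaranteed to point at m_data here
        }
    }
}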
I also vote for an optimizer bug if it is really reproducible in this example. To overrule the optimizer, you could try declaring pData as volatile.
I'm currently using a library that uses code like
T& being_a_bad_boy()
{
    return *reinterpret_cast<T*>(0);
}
to make a reference to a T without there actually being a T. This is undefined behavior, specifically noted to be unsupported by the standard, but it's not an unheard-of pattern.
I am curious if there are any examples or platforms or usages that show that in practice this can cause problems. Can anyone provide some?
Classically, compilers treated "undefined behavior" as simply an excuse not to check for various types of errors and merely "let it happen anyway." But contemporary compilers are starting to use undefined behavior to guide optimizations.
Consider this code:
int table[5];

bool does_table_contain(int v)
{
    for (int i = 0; i <= 5; i++) {
        if (table[i] == v) return true;
    }
    return false;
}
Classical compilers wouldn't notice that your loop limit was written incorrectly and that the last iteration reads off the end of the array. It would just try to read off the end of the array anyway, and return true if the value one past the end of the array happened to match.
A post-classical compiler on the other hand might perform the following analysis:
The first five times through the loop, the function might return true.
When i = 5, the code performs undefined behavior. Therefore, the case i = 5 can be treated as unreachable.
The case i = 6 (loop runs to completion) is also unreachable, because in order to get there, you first have to do i = 5, which we have already shown was unreachable.
Therefore, all reachable code paths return true.
The compiler would then simplify this function to
bool does_table_contain(int v)
{
    return true;
}
Another way of looking at this optimization is that the compiler mentally unrolled the loop:
bool does_table_contain(int v)
{
    if (table[0] == v) return true;
    if (table[1] == v) return true;
    if (table[2] == v) return true;
    if (table[3] == v) return true;
    if (table[4] == v) return true;
    if (table[5] == v) return true;
    return false;
}
And then it realized that the evaluation of table[5] is undefined, so everything past that point is unreachable:
bool does_table_contain(int v)
{
    if (table[0] == v) return true;
    if (table[1] == v) return true;
    if (table[2] == v) return true;
    if (table[3] == v) return true;
    if (table[4] == v) return true;
    /* unreachable due to undefined behavior */
}
and then observed that all reachable code paths return true.
A compiler which uses undefined behavior to guide optimizations would see that every code path through the being_a_bad_boy function invokes undefined behavior, and therefore the being_a_bad_boy function can be reduced to
T& being_a_bad_boy()
{
    /* unreachable due to undefined behavior */
}
This analysis can then back-propagate into all callers of being_a_bad_boy:
void playing_with_fire(bool match_lit, T& t)
{
    kindle(match_lit ? being_a_bad_boy() : t);
}
Since we know that being_a_bad_boy is unreachable due to undefined behavior, the compiler can conclude that match_lit must never be true, resulting in
void playing_with_fire(bool match_lit, T& t)
{
    kindle(t);
}
And now everything is catching fire regardless of whether the match is lit.
You may not see this type of undefined-behavior-guided optimization in current-generation compilers much, but like hardware acceleration in Web browsers, it's only a matter of time before it starts becoming more mainstream.
The largest problem with this code isn't that it's likely to break - it's that it defies an implicit assumption programmers have about references: that they will always be valid. This is just asking for trouble when someone unfamiliar with the "convention" runs into this code.
There's a potential technical glitch too. Since references may only refer to valid objects without undefined behavior, and no object has the address NULL, an optimizing compiler is allowed to optimize away any check for null-ness. I haven't actually seen this done, but it is possible.
T &bad = being_a_bad_boy();
if (&bad == NULL) // this could be optimized away!
Edit: I'm going to shamelessly steal from a comment by #mcmcc and point out that this common idiom is likely to crash because it's using an invalid reference. According to Murphy's Law it will be at the worst possible moment, and of course never during testing.
T bad2 = being_a_bad_boy();
I also know from personal experience that the effects of an invalid reference can propagate far from where the reference was generated, making debugging pure hell.
T &bad3 = being_a_bad_boy();
bad3.do_something();

void T::do_something()
{
    use_a_member_of_T();
}

void T::use_a_member_of_T()
{
    member = get_unrelated_value(); // crash occurs here, leaving you wondering what happened in get_unrelated_value
}
Use the NullObject pattern.
class Null_T : public T
{
public:
    // implement virtual functions to do whatever
    // you'd expect in the null situation
};

T& doing_the_right_thing()
{
    static Null_T null;
    return null;
}
The important thing to remember is that you have a contract with your users. If you're trying to return a reference to a null pointer, undefined behavior is now part of your function's interface. If your users are all prepared to accept this, then that's on them... but I would try to avoid it if at all possible.
If your code can result in an invalid object, then either have it return a pointer (preferably a smart pointer, but that's another discussion), use the null object pattern mentioned above (boost::optional may be useful here), or throw an exception.
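For illustration, a rough sketch of the first and last options (the names here are hypothetical, and T simply stands in for the real type), under the assumption that the lookup can legitimately come up empty:
#include <stdexcept>

struct T { /* ... */ };

// Option 1: return a pointer; null means "nothing available" and the caller must check it.
T* try_get_thing(bool available, T& storage) {
    return available ? &storage : nullptr;
}

// Option 2: keep returning a reference, but throw when there is nothing valid to refer to.
T& get_thing(bool available, T& storage) {
    if (!available) throw std::runtime_error("no T available");
    return storage;
}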
I don't know if this is problem enough for you, or near enough to your "use case", but this crashes for me with gcc (on x86_64):
int main()
{
    volatile int* i = 0;
    *i;
}
That said, we should keep in mind that it is always UB, and compilers might change their minds later, so that it works today but not tomorrow.
Another, not so obvious, bad thing happens when you call a virtual function through a null pointer (virtual dispatch is usually implemented via a vptr into a vtable), and of course the same applies to a null reference (which does not exist in standard C++).
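For example, a minimal sketch of that case (nothing here comes from the library in question; with a typical vtable-based implementation the vptr load from a null object usually crashes, but formally it is simply undefined behavior):
struct Base {
    virtual void speak() { /* never reached */ }
};

int main() {
    Base* p = nullptr;
    p->speak(); // virtual dispatch through a null pointer: undefined behavior,
                // typically a crash when the vptr is read from address 0
    return 0;
}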
Btw., I have even heard that architectures exist where merely copying around a (non-null) pointer to invalid memory will trap; maybe some also exist that make a distinction between a pointer and a reference.
I would expect that on most platforms, the compiler will convert all references into pointers. If that assumption is true, then this will be identical to just passing around a NULL pointer, which is fine as long as you never use it. The question, then, is whether there are any compilers that handle references in some way other than just converting them to pointers. I don't know of any such compilers, but I suppose it's possible that they exist.
When do we need to use "assert" for pointers in C++, and when they are used, how are they most commonly implemented?
Generally you would use an assert to check a condition that, if false, would indicate a bug in your application. So if a NULL pointer shouldn't ever be encountered at some point in the application, unless there's a bug, then assert it. If it might be encountered due to some invalid input then you need to do proper error handling.
You don't need to use assert on pointers at all. The idea is to ensure you don't crash when dereferencing your pointers when they're null.
You can do this with assert, but it's not a very professional way to handle errors like this, since it invariably terminates the program - not a good idea if the user hasn't, for example, saved their last three hours' worth of data entry.
What you should do with pointers is to check them for null-ness and fail gracefully. In other words, have your function return an error of some sort or do nothing (not everyone will agree with this approach but it's perfectly acceptable if it's documented).
The assert stuff is meant, in my opinion, for catching problems during development which is why you'll find assert does nothing in release builds under some compilers. It is not a substitute for defensive programming.
As to how to do it:
#include <cassert>
#include <iostream>

void doSomethingWithPointer(int *p) {
    assert(p != 0);
    std::cout << *p << std::endl;
}
but this would be better done as:
void doSomethingWithPointer(int *p) {
    if (p != 0)
        std::cout << *p << std::endl;
}
In other words, even if your "contract" (API) states that you're not allowed to receive null pointers, you should still handle them gracefully. An old quote: be conservative in what you give, liberal in what you accept (paraphrased).
ASSERT statements are great as "enforced documentation" - that is, they tell the reader something about the code ("this should never happen") and then enforce it by letting you know if it doesn't hold true.
If it's something that could happen (invalid input, memory not able to be allocated), that's not the time to use ASSERT. Asserts are only for things that cannot possibly happen if everyone is obeying the pre-conditions and such.
You can do it thusly:
ASSERT(pMyPointer);
From experience, if you assert on null conditions that should never happen under normal circumstances, your program is in a really bad state. Recovering from such a null condition will more likely than not mask the original problem.
Unless you code with exception guarantees in mind (linky), I say let it crash; then you know you have a problem.
I would use an ASSERT where a null pointer wouldn't immediately cause a crash but might lead to something wrong later that's hard to spot.
eg:
ASSERT(p);
strcpy(p, "hello");
Is a little unnecessary, it simply replaces a fatal exception with a fatal assert!
But in more complex code, particularly things like smart pointers, it might be useful to check whether the pointer is what you think it is.
Remember, ASSERTs only run in debug builds; they disappear in release builds.
In C, there is also an assert function: in debug mode, if the condition x in assert(x) is false, an alert will pop up.
But remember that it works only in debug mode; in release mode, all assert calls are skipped.
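A tiny sketch of that behaviour: assert comes from <cassert> (or <assert.h> in C), and defining NDEBUG, which release builds typically do, turns every assert into a no-op:
// #define NDEBUG   // release builds usually define this; the assert below then compiles to nothing
#include <cassert>

int main() {
    int* p = nullptr;
    assert(p != nullptr && "p must not be null"); // aborts with a diagnostic in debug builds
    return 0;
}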
Assertions are used to define how the program should function. That being said, the most common use of Assert() when dealing with pointers is to check either that they are valid (non-NULL and pointing to valid memory) or that the internal state of the object they point to is valid.
Assertions are not a replacement for error-handling code; instead, they enforce rules that you are placing on the functioning of your code, such as what conditions should hold at given points in time.
For example,
int f(int x, int * pY)
{
    // These are entrance conditions expected in the function. It would be
    // a BUG if this happened at all.
    Assert(x >= 0);
    Assert(pY != nullptr);
    Assert(*pY >= 0, "*pY should never be less than zero");

    // ...Do a bunch of computations with x and pY and return the result as z...
    int z = x * 2 / (x + 1) + pow(*pY, x);

    // Maybe x should always be positive after these calculations:
    Assert(x >= 0, "x should always be positive after calculations");
    // Maybe *pY should always be non-zero after the calculations
    Assert(*pY != 0, "*pY should never be zero after computation");
    Assert(z > 0);

    return z;
}
Many users of asserts choose to apply assertions to internal state validation once they become familiar with them. We call these Invariant() methods: methods on a class that assert many things about the internals of the object that should always hold true.
For example:
class A
{
public:
    A(wchar_t * wszName)
    {
        _wszName = wszName;
        _cch = wcslen(wszName);
    }

    // Invariant method to be called at times to verify that the
    // internal state is consistent. Here this means that the
    // internal variable tracking the length of the string
    // matches the actual length of the string.
    void Invariant()
    {
        Assert(_wszName != nullptr);
        Assert(_cch == wcslen(_wszName));
    }

    void ProcessABunchOfThings()
    {
        ...
    }

protected:
    int _cch;
    wchar_t * _wszName;
};

// Call Invariant() at times to validate that the internal state of the object is consistent/OK.
A a(L"Test Object");
a.Invariant();
a.ProcessABunchOfThings();
a.Invariant();
The important thing to remember is that this is to make sure that when bugs do happen (meaning the program is not working as you would expect), the effect of the bug shows up as close as possible to where it occurred in the code, in order to make debugging easier. I have used asserts extensively in my own code and while at Microsoft, and I swear by them, since they have saved me so much time in debugging and even in knowing that a defect is there.