Which is faster, pointer access or reference access? - c++

In the example code below I allocate some instances of the struct Chunk. In the for loops I then iterate through the memory block, access the different instances using either pointers or references, and assign them some data.
But which for loop will execute faster? As far as I know, the reference loop should be faster because it requires no dereferencing and has direct access to the instance in memory. How wrong or right am I?
struct Chunk {
    unsigned int a;
    float b;
    const char* c;
};
int main() {
    Chunk* pData = new Chunk[8];
    for( unsigned int i = 0; i < 8; ++i ) {
        Chunk* p = &pData[i];
        p->a = 1;
        p->b = 1.0f;
        p->c = "POINTERS";
    }
    for( unsigned int i = 0; i < 8; ++i ) {
        Chunk& r = pData[i];
        r.a = 1;
        r.b = 1.0f;
        r.c = "REFERENCES";
    }
    delete [] pData;
    return 0;
}

They should be the same (not about the same, but exactly the same) with any non-idiotic compiler. Under the hood, references are pointers (on 99% of compilers). There's no reason for any difference.
Pedantic: the second loop could be faster (probably not) because the data is in the cache already, but that's it. :)
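To make the claim above concrete, here is a minimal sketch. The helper names fill_via_pointer, fill_via_reference and same_result are made up for illustration and are not part of the question. It only shows that both forms perform the same stores; whether a particular compiler emits identical machine code is something you would check by comparing its assembly output, which this sketch does not prove.

```cpp
#include <cstddef>

struct Chunk {
    unsigned int a;
    float b;
    const char* c;
};

// Hypothetical helper: writes each element through a pointer.
void fill_via_pointer(Chunk* data, std::size_t n, const char* label) {
    for (std::size_t i = 0; i < n; ++i) {
        Chunk* p = &data[i];
        p->a = 1;
        p->b = 1.0f;
        p->c = label;
    }
}

// Hypothetical helper: writes each element through a reference.
void fill_via_reference(Chunk* data, std::size_t n, const char* label) {
    for (std::size_t i = 0; i < n; ++i) {
        Chunk& r = data[i];
        r.a = 1;
        r.b = 1.0f;
        r.c = label;
    }
}

// Returns true if both helpers leave their arrays in the same state.
bool same_result() {
    static const char label[] = "DATA";
    Chunk x[8];
    Chunk y[8];
    fill_via_pointer(x, 8, label);
    fill_via_reference(y, 8, label);
    for (std::size_t i = 0; i < 8; ++i) {
        if (x[i].a != y[i].a || x[i].b != y[i].b || x[i].c != y[i].c) {
            return false;
        }
    }
    return true;
}
```

Compiling both helpers with optimizations on and diffing the generated assembly is the practical way to confirm the "exactly the same" claim for a given toolchain.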

I'm tempted to say: who cares? Any difference in speed will be
negligible, and you should choose the most readable. In this particular
case, I would expect to see exactly the same code generated in both
cases. In more complicated cases, the compiler may not be able to
determine later in the loop that the pointer has not been reseated, and
may have to reread it. But for this to be the case, you'd have to be
doing enough other things that the difference wouldn't be measurable.

There should be no difference in code produced by any decent compiler.

When you hesitate between two versions of the code like those in your example, you should choose the more readable one. An optimization of the kind you propose is supposed to be done by the compiler.
The more readable one in your case is arguably the version with references (maybe not really more readable, but the consensus is to prefer references because pointers are more "dangerous").
But back to efficiency (if you know assembler, better stop reading or you risk a laughing fit...): in my opinion, as pData is allocated on the heap, the compiler will have to compile it using pointers anyway. I think your reasoning could be kind of correct if your structure were allocated on the stack with just "Chunk data[8];". But at the latest once compiler optimizations are turned on, any difference should disappear anyway.

The execution time is almost the same, but using references is less of a hassle.

Related

Can char* and void* be used interchangeably in all cases as buffers? c++

Say we declared a char* buffer:
char *buf = new char[sizeof(int)*4];
//to delete:
delete [] buf;
or a void* buffer:
void *buf = operator new(sizeof(int)*4);
//to delete:
operator delete(buf);
How would they differ if they were used exclusively as pre-allocated memory, always casting them to other types (never dereferencing them on their own):
int *intarr = static_cast<int*>(buf);
intarr[1] = 1;
Please also say whether the code above is incorrect and the following should be preferred (only considering cases where the final types are primitives like int):
int *intarr = static_cast<int*>(buf);
for(size_t i = 0; i<4; i++){
    new(&intarr[i]) int;
}
intarr[1] = 1;
Finally, please say whether it is safe to delete the original void*/char* buffer once it has been used to create objects of other types in it with the latter approach (placement new).
It is worth clarifying that this question is a matter of curiosity. I firmly believe that by knowing the basics of what is and isn't possible in a programming language, I can use these as building blocks and come up with solutions suitable for each specific use case in the future. This is not an XY question, as I don't have a specific implementation in mind.
In any case, I can name a few things related to this question off the top of my head (pre-allocated buffers specifically):
Sometimes you want to make memory buffers for custom allocation. Sometimes you even want to align these buffers to cache line boundaries or other memory boundaries, almost always in the name of performance and sometimes by requirement (e.g. SIMD, if I'm not mistaken). Note that for alignment you could use std::aligned_alloc().
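As a rough illustration of the aligned-buffer use case, the sketch below allocates raw storage on a chosen boundary. It uses the C++17 aligned form of operator new instead of std::aligned_alloc, which is a swapped-in technique and not what the question names. The helper names allocate_aligned, free_aligned and is_aligned are hypothetical, and the 64-byte cache line size in the usage note is an assumption about the target machine.

```cpp
#include <cstddef>
#include <cstdint>
#include <new>

// Hypothetical helper: allocates `bytes` of raw storage aligned to `alignment`
// (which must be a power of two), using the C++17 aligned operator new.
void* allocate_aligned(std::size_t bytes, std::size_t alignment) {
    return ::operator new(bytes, static_cast<std::align_val_t>(alignment));
}

// Storage from allocate_aligned must be released with the matching
// aligned operator delete, passing the same alignment.
void free_aligned(void* p, std::size_t alignment) {
    ::operator delete(p, static_cast<std::align_val_t>(alignment));
}

// Returns true if the address held in p is a multiple of `alignment`.
bool is_aligned(const void* p, std::size_t alignment) {
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}
```

A call such as allocate_aligned(sizeof(int) * 4, 64) would request a buffer for four ints on an assumed 64-byte cache line boundary; it has to be paired with free_aligned(p, 64).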
On a technicality, delete [] buf; may be considered UB: reusing the memory ends the lifetime of the array of characters, which invalidates buf.
I wouldn't expect it to cause problems in practice, but using raw memory from operator new doesn't suffer from this technicality and there is no reason to not prefer it.
Even better is to use this:
std::allocator<T> alloc; // allocate() is a non-static member, so an instance is needed
T* memory = alloc.allocate(n);
Because this will also work for overaligned types (unlike your suggestions) and it is easy to replace with a custom allocator. Or simply use std::vector.
for(size_t i = 0; i<4; i++){
    new(&intarr[i]) int;
}
There is a standard function for this:
std::uninitialized_fill_n(intarr, 4, 0);
OK, it's slightly different in that it initialises with an actual value, but that's probably the better thing to do anyway.
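Putting the two suggestions together, a minimal sketch might look like the following. The function name sum_of_filled is hypothetical and exists only to give the example something observable to return; it is not a definitive recipe for buffer management.

```cpp
#include <cstddef>
#include <memory>

// Hypothetical helper: creates n ints in allocator-provided storage,
// sums them, and releases the storage again. Expects n > 0.
int sum_of_filled(std::size_t n, int value) {
    std::allocator<int> alloc;
    int* arr = alloc.allocate(n);              // raw, uninitialized storage
    std::uninitialized_fill_n(arr, n, value);  // constructs n ints in place

    int sum = 0;
    for (std::size_t i = 0; i < n; ++i) {
        sum += arr[i];
    }

    // int is trivially destructible, so no destructor calls are needed
    // before handing the storage back.
    alloc.deallocate(arr, n);
    return sum;
}
```

For a type with a non-trivial destructor, the elements would have to be destroyed (for example with std::destroy_n in C++17) before deallocate is called.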

Can I guarantee accessing pointer with negative shift?

I found a negative index access in some embedded code I'm debugging:
for (int i = len; i > 0; i--)
{
    data[i - 1] = data[i - 2]; // negative access when i == 1
}
I read this about similar cases, but in the OP arr[-2] is guaranteed to be OK since arr points to the middle of a previously allocated array. In my case, data is a pointer inside a class that is initialized by the constructor with:
public:
constructor_name(): ... data(new T_a[size]), ...
And the pointer data is the first member in the class:
template <class T_a, class T_b, int size>
class T_c
{
private:
    T_a *data;
    T_b *...;
    int ...;
    int ...;
    int ...;
public:
    constructor_name(): ... data(new T_a[size]), ...
Now, is there a possibility that the negative index access was deliberate and meaningful? Is there a way the programmer who wrote it could have ensured that data[-1] accesses a specific datum, using #pragma pack () or any other method?
Seeing *data is the first member in the class made me think it was a bug, but I'm not sure. If it is indeed a bug - is it UB?
You are asking about a guarantee (quite a strong word). Your code has undefined behavior (because you are accessing data outside of your object), which really means you cannot have any guarantee. Arbitrarily bad things could happen, even if in practice they usually don't (in particular when data points to elements of a scalar type, such as pointers).
I would recommend replacing for (int i = len; i > 0; i--) with for (int i = len; i > 1; i--), at least to make the code more readable and more standard conforming.
If for some weird reason the data[-1] access was meaningful to the previous programmer, he should at least have left a comment about it. I guess that since he did not, it is simply a bug.
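A sketch of the corrected loop, assuming the intent was simply to shift the elements one slot to the right and leave data[0] untouched (the question does not confirm this). The helper names shift_right and shift_right_std are hypothetical.

```cpp
#include <algorithm>

// Hypothetical helper: moves data[0..len-2] one slot to the right.
// The loop stops after i == 2, so data[-1] is never read and data[0]
// keeps its old value.
template <class T>
void shift_right(T* data, int len) {
    for (int i = len; i > 1; i--) {
        data[i - 1] = data[i - 2];
    }
}

// Same effect using the standard library; copy_backward is the variant
// that is safe when the destination range overlaps the source on the right.
template <class T>
void shift_right_std(T* data, int len) {
    if (len > 1) {
        std::copy_backward(data, data + len - 1, data + len);
    }
}
```

Either version keeps every access inside [0, len), so the behaviour no longer depends on whatever happens to sit in front of the allocation.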
It depends on the embedded chip you are using.
First, the important thing to notice is that data lives on the heap. So normally (on a desktop computer) you cannot assume that data sits right next to another piece of information you control.
But as we are talking about embedded systems here, some of them have a single contiguous memory space for stack and heap, and start filling the heap from one side of that memory space and the stack from the other, until they meet in the middle and the program crashes.
Anyway, in this case, the programmer of your code could have been so careful that he knew which heap allocation would happen next, and thus which variable lives next to data, but I think that is very unlikely. Even so, he would be relying on undefined behaviour, because it is a memory access outside the array bounds, and I would consider it not just bad programming practice but an outright bug.
Found a picture of how that heap/stack model works in a specific chip here.

Are evil byteblock reinterpretations valid C++?

Let us not discuss the badness of the following code, it's not mine, and I fully
agree with you in advance that it's not pretty, rather C-ish and potentially
very dangerous:
void * buf = std::malloc(24 + sizeof(int[3]));
char * name = reinterpret_cast<char *>(buf);
std::strcpy(name, "some name");
int * values = reinterpret_cast<int *>(name + 24);
values[0] = 0; values[1] = 13; values[2] = 42;
Its intent is quite clear; it's a "byte block" storing two arrays of different
types. To access elements not in front of the block, it interprets the block as
char * and increments the pointer by sizeof(type[len]).
My question is however, is it legal C++(11), as in "is it guaranteed that it
will work on every standard conforming compiler"? My intuition says it's not, however g++ and clang seem to be fine
with it.
I would appreciate a standard quote on this one; unfortunately I
was not able to find a related passage myself.
This is perfectly valid C++ code (not nice though, as you noted yourself). As long as the string is not longer than 23 characters, it is not even in conflict with the strict aliasing rules, because you never access the same byte in memory through differently typed pointers. However, if the string exceeds the fixed limit, you have undefined behaviour like any other out of bounds bug.
Still, I'd recommend using a structure, at the very least:
typedef struct whatever {
    char name[24];
    int values[3];
} whatever;
whatever* myData = new whatever;
std::strcpy(myData->name, "some name");
myData->values[0] = 0; myData->values[1] = 13; myData->values[2] = 42;
This is 100% equivalent to the code you gave, except for a bit more overhead in the new operator as opposed to directly calling malloc(). If you are worried about performance, you can still do whatever* myData = (whatever*)std::malloc(sizeof(*myData)); instead of using new.
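A slightly more defensive sketch of the struct approach, which also guards against the "string longer than 23 characters" case mentioned earlier. The names Record, set_name and make_record are hypothetical, and bounding the copy with std::strncpy is an addition that the code above does not make.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical record type: a 24-byte name followed by three ints.
struct Record {
    char name[24];
    int values[3];
};

// Copies at most 23 characters, so the name is always null-terminated
// and a long input cannot run into the values array.
void set_name(Record& r, const char* name) {
    std::strncpy(r.name, name, sizeof(r.name) - 1);
    r.name[sizeof(r.name) - 1] = '\0';
}

// Builds a record holding the same data as the byte-block example.
Record make_record() {
    Record r{};
    set_name(r, "some name");
    r.values[0] = 0;
    r.values[1] = 13;
    r.values[2] = 42;
    return r;
}
```

With a named member for each array, the compiler works out the offsets and padding, so no manual pointer arithmetic or reinterpret_cast is needed.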

Address of address of array

If I define a variable:
int (**a)[30];
It is a pointer. This pointer points to a pointer which points to an array of 30 ints.
How do I declare or initialize it?
int (**a)[10] = new [10][20][30];
int (**a)[10] = && new int[10];
Neither of these works.
The direct answer to your question of how to initialize a (whether or not that's what you actually need) is
int (**a)[10] = new (int (*)[10]);
I don't think this is actually what you want though; you probably want to initialize the pointer to point to an actual array, and either way std::vector is the better way to do it.
If you want an answer to the question as it stands, then you can do this kind of thing:
int a[30];
int (*b)[30] = &a;
int (**c)[30] = &b;
But it's unlikely to be what you want, as other people have commented. You probably need to clarify your underlying goal - people can only speculate otherwise.
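To show what the three declarations above give you, here is a small sketch that writes into the array and reads the value back through both levels of indirection. The function names read_through and demo are hypothetical.

```cpp
// Hypothetical helper: reads element `index` through a pointer to a
// pointer to an array of 30 ints.
int read_through(int (**pp)[30], int index) {
    // *pp is the pointer to the array, **pp is the array itself.
    return (**pp)[index];
}

int demo() {
    int a[30] = {};      // the actual array, zero-initialized
    a[5] = 42;
    int (*b)[30] = &a;   // pointer to the array
    int (**c)[30] = &b;  // pointer to that pointer
    return read_through(c, 5);
}
```

Note that every level needs a real object behind it: c points at b, and b points at a. That is why taking "the address of an address" directly does not compile, since the result of &a is a temporary value with no storage of its own.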
Just to follow on from MooingDuck's remark, I can in fact see a way to do it without the typedef, but not directly:
template <typename T>
T *create(T *const &)
{
    return new T;
}
int (**a)[30] = create(a);
It's not pretty though.
What do you expect to get by writing &(&var)? This is an equivalent of address of address of a block of memory. Doing things like this just to satisfy the number of * in your code makes no sense.
Think about it - how can you get an address of an address? Even if, by sheer luck or weird language tricks, you manage to do it, there is no way it will work.

Optimizer bug or programming error?

First of all: I know that most optimization bugs are due to programming errors or relying on facts which may change depending on optimization settings (floating point values, multithreading issues, ...).
However, I ran into a very hard-to-find bug and am somewhat unsure whether there is any way to prevent this kind of error from happening without turning optimization off. Am I missing something? Could this really be an optimizer bug? Here's a simplified example:
struct Data {
    int a;
    int b;
    double c;
};
struct Test {
    void optimizeMe();
    Data m_data;
};
void Test::optimizeMe() {
    Data * pData; // Note that this pointer is not initialized!
    bool first = true;
    for (int i = 0; i < 3; ++i) {
        if (first) {
            first = false;
            pData = &m_data;
            pData->a = i * 10;
            pData->b = i * pData->a;
            pData->c = pData->b / 2;
        } else {
            pData->a = ++i;
        } // end if
    } // end for
}
int main(int argc, char *argv[]) {
    Test test;
    test.optimizeMe();
    return 0;
}
The real program of course has a lot more to do than this. But it all boils down to the fact that instead of accessing m_data directly, a (previously uninitialized) pointer is being used. As soon as I add enough statements to the if (first)-part, the optimizer seems to change the code to something along these lines:
if (first) {
    first = false;
    // pData-assignment has been removed!
    m_data.a = i * 10;
    m_data.b = i * m_data.a;
    m_data.c = m_data.b / 2;
} else {
    pData->a = ++i; // This will crash - pData is not set yet.
} // end if
As you can see, it replaces the unnecessary pointer dereference with a direct write to the member struct. However, it does not do this in the else-branch. It also removes the pData-assignment. Since the pointer is now still uninitialized, the program will crash in the else-branch.
Of course there are various things which could be improved here, so you might blame it on the programmer:
Forget about the pointer and do what the optimizer does - use m_data directly.
Initialize pData to nullptr - that way the optimizer knows that the else-branch will fail if the pointer is never assigned. At least it seems to solve the problem in my test-environment.
Move the pointer assignment in front of the loop (effectively initializing pData with &m_data, which could then also be a reference instead of a pointer, for good measure). This makes sense because pData is needed in all cases, so there is no reason to do this inside the loop.
The code is obviously smelly, to say the least, and I'm not trying to "blame" the optimizer for doing this. But I'm asking: What am I doing wrong? The program might be ugly, but it's valid code...
I should add that I'm using VS2012 with C++/CLI and v110_xp-Toolset. Optimization is set to /O2. Please also note that if you really want to reproduce the problem (that's not really the point of this question though) you need to play around with the complexity of the program. This is a very simplified example and the optimizer sometimes doesn't remove the pointer assignment. Hiding &m_data behind a function seems to "help".
EDIT:
Q: How do I know that the compiler is optimizing it to something like the example provided?
A: I'm not very good at reading assembler, I have looked at it however and have made 3 observations which make me believe that it's behaving this way:
As soon as optimization kicks in (adding more assignments usually does the trick) the pointer assignment has no associated assembler statement. It also hasn't been moved up to the declaration, so it's really left uninitialized it seems (at least to me).
In cases where the program crashes, the debugger skips the assignment statement. In cases where the program runs without problems, the debugger stops there.
If I watch the content of pData and the content of m_data while debugging, it clearly shows that all assignments in the if-branch have an effect on m_data and m_data receives the correct values. The pointer itself is still pointing to the same uninitialized value it had from the beginning. Therefore I have to assume that it is in fact not using the pointer to make the assignments at all.
Q: Does it have to do anything with i (Loop unrolling)?
A: No, the actual program uses do { ... } while() to loop over a SQL SELECT result set, so the iteration count is completely runtime-specific and cannot be predetermined by the compiler.
It sure looks like a bug to me. It's fine for the optimizer to eliminate the unnecessary indirection, but it should not eliminate the assignment to pData.
Of course, you can work around the problem by assigning to pData before the loop (at least in this simple example). I gather that the problem in your actual code isn't as easily resolved.
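For the simplified example, that workaround could be sketched as follows, here using a reference bound before the loop (the name data is an arbitrary choice). This only illustrates the restructuring; it makes no claim about reproducing or fixing the VS2012 behaviour described in the question.

```cpp
struct Data {
    int a;
    int b;
    double c;
};

struct Test {
    void optimizeMe();
    Data m_data;
};

void Test::optimizeMe() {
    // Bound once, before the loop: a reference cannot be left
    // uninitialized, so no code path can see an unset pointer.
    Data& data = m_data;
    bool first = true;
    for (int i = 0; i < 3; ++i) {
        if (first) {
            first = false;
            data.a = i * 10;
            data.b = i * data.a;
            data.c = data.b / 2;
        } else {
            data.a = ++i;
        }
    }
}
```

The observable result is the same as in the original function; only the point at which the alias to m_data is established has moved.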
I also vote for an optimizer bug if it is really reproducible in this example. To overrule the optimizer you could try to declare pData as volatile.