Are references or pointers faster? - c++

From what I know, references are just another name for a variable whilst pointers are their own variable. Pointers take up space. People often say "use a reference or pointer" but they don't say which is better. If references take up no memory of their own, then references win in that department. What I don't know is if the compiler makes a distinction between references and normal variable. If you do operations on a reference, does it compile to the same code as normal variable?

Internally references are also implemented in terms of pointer. So, it's difficult to say which is faster pointer/reference.
It's a usage of these two which makes a difference.
For example you want to pass by reference a parameter to the function.
void func(int& a) case_1
{
//No need to check for NULL reference...
}
void func(int* a) case_2
{
//Need o check if pointer is not NULL
}
In case_2 you have to explicitly check if pointer is not NULL before dereferncing it whereas that's not the case with references because references are initialized to something.
Assumption is that you are playing game in civilized manner i.e
You are not doing something like:-
int*p = NULL;
int &a = *p;

Here are my two test programs:
Reference:
int i = 0;
int& r = i;
++r;
int j = 0;
++j;
Pointer:
int i = 0;
int* r = &i;
++(*r);
int j = 0;
++j;
My compiler wrote the EXACT same assembly code for both.
movl $0, -16(%rbp) #, i
leaq -16(%rbp), %rax #, tmp87
movq %rax, -8(%rbp) # tmp87, r
movq -8(%rbp), %rax # r, tmp88
movl (%rax), %eax # *r_1, D.31036
leal 1(%rax), %edx #, D.31036
movq -8(%rbp), %rax # r, tmp89
movl %edx, (%rax) # D.31036, *r_1
movl $0, -12(%rbp) #, j
addl $1, -12(%rbp) #, j
movl $0, %eax #, D.31036

They are the same, references are just a language mechanic that is a pointer that cannot be null. The difference remains only in the compilation phase, where you will get a complaint if you try to do something illegal.

Related

<vector> in C++ weird behavior [duplicate]

I understand that arrays are a primitive class and therefore do not have built in methods to detect out of range errors. However, the vector class has the built in function .at() which does detect these errors. By using namespaces, anyone can overload the [ ] symbols to act as the .at() function by throwing an error when a value out of the vector's range is accessed. My question is this: why is this functionality not default in C++?
EDIT: Below is an example in pseudocode (I believe - correct me if needed) of overloading the vector operator [ ]:
Item_Type& operator[](size_t index) { // Verify that the index is legal.
if (index < 0 || index >= num_items) {
throw std::out_of_range
("index to operator[] is out of range");
}
return the_data[index]
}
I believe this function can be written into a user-defined namespace and is reasonably easy to implement. If this is true, why is it not default?
For something that's normally as cheap as [], bounds checking adds a significant overhead.
Consider
int f1(const std::vector<int> & v, std:size_t s) { return v[s]; }
this function translates to just three lines of assembly:
movq (%rdi), %rax
movl (%rax,%rsi,4), %eax
ret
Now consider the bounds-checking version using at():
int f2(const std::vector<int> & v, std:size_t s) { return v.at(s); }
This becomes
movq (%rdi), %rax
movq 8(%rdi), %rdx
subq %rax, %rdx
sarq $2, %rdx
cmpq %rdx, %rsi
jae .L6
movl (%rax,%rsi,4), %eax
ret
.L6:
pushq %rax
movl $.LC1, %edi
xorl %eax, %eax
call std::__throw_out_of_range_fmt(char const*, ...)
Even in the normal (non-throwing) code path, that's 8 lines of assembly - almost three times as many.
C++ has a principle of only pay for what you use. Therefore unchecked operations definitely have their place; just because you're too lazy to be careful about your bounds doesn't mean I should have to pay a performance penalty.
Historically array [] has been unchecked in both C and C++. Just because languages 10-20 years younger made that a checked operation doesn't mean C++ needs to make such a fundamental backward-incompatible change.

Increment variable by N inside array index

Could someone please tell me whether or not such a construction is valid (i.e not an UB) in C++. I have some segfaults because of that and spent couple of days trying to figure out what is going on there.
// Synthetic example
int main(int argc, char** argv)
{
int array[2] = {99, 99};
/*
The point is here. Is it legal? Does it have defined behaviour?
Will it increment first and than access element or vise versa?
*/
std::cout << array[argc += 7]; // Use argc just to avoid some optimisations
}
So, of course I did some analysis, both GCC(5/7) and clang(3.8) generate same code. First add than access.
Clang(3.8): clang++ -O3 -S test.cpp
leal 7(%rdi), %ebx
movl .L_ZZ4mainE5array+28(,%rax,4), %esi
movl $_ZSt4cout, %edi
callq _ZNSolsEi
movl $.L.str, %esi
movl $1, %edx
movq %rax, %rdi
callq _ZSt16__ostream_insertIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_PKS3_l
GCC(5/7) g++-7 -O3 -S test.cpp
leal 7(%rdi), %ebx
movl $_ZSt4cout, %edi
subq $16, %rsp
.cfi_def_cfa_offset 32
movq %fs:40, %rax
movq %rax, 8(%rsp)
xorl %eax, %eax
movabsq $425201762403, %rax
movq %rax, (%rsp)
movslq %ebx, %rax
movl (%rsp,%rax,4), %esi
call _ZNSolsEi
movl $.LC0, %esi
movq %rax, %rdi
call _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc
movl %ebx, %esi
So, can I assume such a baheviour is a standard one?
By itself array[argc += 7] is OK, the result of argc + 7 will be used as an index to array.
However, in your example array has just 2 elements, and argc is never negative, so your code will always result in UB due to an out-of-bounds array access.
In case of a[i+=N] the expression i += N will always be evaluated first before accessing the index. But the example that you provided invokes UB as your example array contains only two elements and thus you are accessing out of bounds of the array.
Your case is clearly undefined behaviour, since you will exceed array bounds for the following reasons:
First, expression array[argc += 7] is equal to *((array)+(argc+=7)), and the values of the operands will be evaluated before + is evaluated (cf. here); Operator += is an assignment (and not a side effect), and the value of an assignment is the result of argc (in this case) after the assignment (cf. here). Hence, the +=7 gets effective for subscripting;
Second, argc is defined in C++ to be never negative (cf. here); So argc += 7 will always be >=7 (or a signed integer overflow in very unrealistic scenarious, but still UB then).
Hence, UB.
It's normal behavior. Name of array actualy is a pointer to first element of array. And array[n] is the same as *(array+n)

Performance of chained public member access compared to pointer

Since I couldn't find any question relating to chained member access, but only chained function access, I would like to ask a couple of questions about it.
I have the following situation:
for(int i = 0; i < largeNumber; ++i)
{
//do calculations with the same chained struct:
//myStruct1.myStruct2.myStruct3.myStruct4.member1
//myStruct1.myStruct2.myStruct3.myStruct4.member2
//etc.
}
It is obviously possible to break this down using a pointer:
MyStruct4* myStruct4_pt = &myStruct1.myStruct2.myStruct3.myStruct4;
for(int i = 0; i < largeNumber; ++i)
{
//do calculations with pointer:
//(*myStruct4_pt).member1
//(*myStruct4_pt).member2
//etc.
}
Is there a difference between member access (.) and a function access that, e.g., returns a pointer to a private variable?
Will/Can the first example be optimized by the compiler and does that strongly depend on the compiler?
If no optimizations are done during compilation time, will/can the CPU optimize the behaviour (e.g. keeping it in the L1 cache)?
Does a chained member access make a difference at all in terms of performance, since variables are "wildly reassigned" during compilation time anyway?
I would kindly ask to leave discussions out regarding readability and maintainability of code, as the chained access is, for my purposes, clearer.
Update:
Everything is running in a single thread.
This is a constant offset that you're modifying, a modern compiler will realize that.
But - don't trust me, lets ask a compiler (see here).
#include <stdio.h>
struct D { float _; int i; int j; };
struct C { double _; D d; };
struct B { char _; C c; };
struct A { int _; B b; };
int bar(int i);
int foo(int i);
void foo(A &a) {
for (int i = 0; i < 10; i++) {
a.b.c.d.i += bar(i);
a.b.c.d.j += foo(i);
}
}
Compiles to
foo(A&):
pushq %rbp
movq %rdi, %rbp
pushq %rbx
xorl %ebx, %ebx
subq $8, %rsp
.L3:
movl %ebx, %edi
call bar(int)
addl %eax, 28(%rbp)
movl %ebx, %edi
addl $1, %ebx
call foo(int)
addl %eax, 32(%rbp)
cmpl $10, %ebx
jne .L3
addq $8, %rsp
popq %rbx
popq %rbp
ret
As you see, the chaining has been translated to a single offset in both cases: 28(%rbp) and 32(%rbp).

Why doesn't C++ detect when out of range vector elements are accessed with the [ ] operators by default?

I understand that arrays are a primitive class and therefore do not have built in methods to detect out of range errors. However, the vector class has the built in function .at() which does detect these errors. By using namespaces, anyone can overload the [ ] symbols to act as the .at() function by throwing an error when a value out of the vector's range is accessed. My question is this: why is this functionality not default in C++?
EDIT: Below is an example in pseudocode (I believe - correct me if needed) of overloading the vector operator [ ]:
Item_Type& operator[](size_t index) { // Verify that the index is legal.
if (index < 0 || index >= num_items) {
throw std::out_of_range
("index to operator[] is out of range");
}
return the_data[index]
}
I believe this function can be written into a user-defined namespace and is reasonably easy to implement. If this is true, why is it not default?
For something that's normally as cheap as [], bounds checking adds a significant overhead.
Consider
int f1(const std::vector<int> & v, std:size_t s) { return v[s]; }
this function translates to just three lines of assembly:
movq (%rdi), %rax
movl (%rax,%rsi,4), %eax
ret
Now consider the bounds-checking version using at():
int f2(const std::vector<int> & v, std:size_t s) { return v.at(s); }
This becomes
movq (%rdi), %rax
movq 8(%rdi), %rdx
subq %rax, %rdx
sarq $2, %rdx
cmpq %rdx, %rsi
jae .L6
movl (%rax,%rsi,4), %eax
ret
.L6:
pushq %rax
movl $.LC1, %edi
xorl %eax, %eax
call std::__throw_out_of_range_fmt(char const*, ...)
Even in the normal (non-throwing) code path, that's 8 lines of assembly - almost three times as many.
C++ has a principle of only pay for what you use. Therefore unchecked operations definitely have their place; just because you're too lazy to be careful about your bounds doesn't mean I should have to pay a performance penalty.
Historically array [] has been unchecked in both C and C++. Just because languages 10-20 years younger made that a checked operation doesn't mean C++ needs to make such a fundamental backward-incompatible change.

How do I declare an array created using malloc to be volatile in c++

I presume that the following will give me 10 volatile ints
volatile int foo[10];
However, I don't think the following will do the same thing.
volatile int* foo;
foo = malloc(sizeof(int)*10);
Please correct me if I am wrong about this and how I can have a volatile array of items using malloc.
Thanks.
int volatile * foo;
read from right to left "foo is a pointer to a volatile int"
so whatever int you access through foo, the int will be volatile.
P.S.
int * volatile foo; // "foo is a volatile pointer to an int"
!=
volatile int * foo; // foo is a pointer to an int, volatile
Meaning foo is volatile. The second case is really just a leftover of the general right-to-left rule.
The lesson to be learned is get in the habit of using
char const * foo;
instead of the more common
const char * foo;
If you want more complicated things like "pointer to function returning pointer to int" to make any sense.
P.S., and this is a biggy (and the main reason I'm adding an answer):
I note that you included "multithreading" as a tag. Do you realize that volatile does little/nothing of good with respect to multithreading?
volatile int* foo;
is the way to go. The volatile type qualifier works just like the const type qualifier. If you wanted a pointer to a constant array of integer you would write:
const int* foo;
whereas
int* const foo;
is a constant pointer to an integer that can itself be changed. volatile works the same way.
Yes, that will work. There is nothing different about the actual memory that is volatile. It is just a way to tell the compiler how to interact with that memory.
I think the second declares the pointer to be volatile, not what it points to. To get that, I think it should be
int * volatile foo;
This syntax is acceptable to gcc, but I'm having trouble convincing myself that it does anything different.
I found a difference with gcc -O3 (full optimization). For this (silly) test code:
volatile int v [10];
int * volatile p;
int main (void)
{
v [3] = p [2];
p [3] = v [2];
return 0;
}
With volatile, and omitting (x86) instructions which don't change:
movl p, %eax
movl 8(%eax), %eax
movl %eax, v+12
movl p, %edx
movl v+8, %eax
movl %eax, 12(%edx)
Without volatile, it skips reloading p:
movl p, %eax
movl 8(%eax), %edx ; different since p being preserved
movl %edx, v+12
; 'p' not reloaded here
movl v+8, %edx
movl %edx, 12(%eax) ; p reused
After many more science experiments trying to find a difference, I conclude there is no difference. volatile turns off all optimizations related to the variable which would reuse a subsequently set value. At least with x86 gcc (GCC) 4.1.2 20070925 (Red Hat 4.1.2-33). :-)
Thanks very much to wallyk, I was able to devise some code use his method to generate some assembly to prove to myself the difference between the different pointer methods.
using the code: and compiling with -03
int main (void)
{
while(p[2]);
return 0;
}
when p is simply declared as pointer, we get stuck in a loop that is impossible to get out of. Note that if this were a multithreaded program and a different thread wrote p[2] = 0, then the program would break out of the while loop and terminate normally.
int * p;
============
LCFI1:
movq _p(%rip), %rax
movl 8(%rax), %eax
testl %eax, %eax
jne L6
xorl %eax, %eax
leave
ret
L6:
jmp L6
notice that the only instruction for L6 is to goto L6.
==
when p is volatile pointer
int * volatile p;
==============
L3:
movq _p(%rip), %rax
movl 8(%rax), %eax
testl %eax, %eax
jne L3
xorl %eax, %eax
leave
ret
here, the pointer p gets reloaded each loop iteration and as a consequence the array item also gets reloaded. However, this would not be correct if we wanted an array of volatile integers as this would be possible:
int* volatile p;
..
..
int* j;
j = &p[2];
while(j);
and would result in the loop that would be impossible to terminate in a multithreaded program.
==
finally, this is the correct solution as tony nicely explained.
int volatile * p;
LCFI1:
movq _p(%rip), %rdx
addq $8, %rdx
.align 4,0x90
L3:
movl (%rdx), %eax
testl %eax, %eax
jne L3
leave
ret
In this case the the address of p[2] is kept in register value and not loaded from memory, but the value of p[2] is reloaded from memory on every loop cycle.
also note that
int volatile * p;
..
..
int* j;
j = &p[2];
while(j);
will generate a compile error.