Accessing individual objects created by "new", through pointer arithmetic - c++

I am dynamically creating 3 objects of MyClass.
MyClass *ptr = new MyClass[3];
I'm assuming ptr is the address of the first instance of said object. I can do
(*ptr).doStuff();
However, when I try to access the 2nd object, via
(* (ptr + sizeof(MyClass)) ).doStuff();
It throws an exception. How am I supposed to get at the other objects?

Here are the options (from most recommended to least recommended):
ptr[1].doStuff();
(ptr+1)->doStuff();
(*(ptr+1)).doStuff();
((MyClass*)((char*)ptr+sizeof(MyClass)))->doStuff();
(*(MyClass*)((char*)ptr+sizeof(MyClass))).doStuff();

The valid code looks like this:
(* (ptr + 1) ).doStuff();
This is called pointer arithmetic. You shift the pointer by the number of elements you want, not by a number of bytes.

That's because incrementing ptr by 1 does not simply advance the pointer by 1 byte. Instead, it moves the pointer far enough to point to the next element in the array. Instead of your final line you have to write
(* (ptr + 1) ).doStuff();
or:
(ptr+1)->doStuff();
or, even more readable, as the commenters have suggested:
ptr[1].doStuff();

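Pointer arithmetic already takes the size of the pointee type into account, so you should add 1, not sizeof(MyClass).
Otherwise you move the pointer sizeof(MyClass) elements ahead, which is well past the second element and usually past the end of the array.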

Related

Clarification on the relation of arrays and pointers

I wanted some further understanding, and possibly clarification, on a few things that confuse me about arrays and pointers in C++. One of the main things that confuses me is how, when you refer to the name of an array, it could be referring to the array, or to a pointer to the first element, as well as a few other things. In order to better display where my trouble understanding is coming from, I'll show a few lines of code, as well as the conclusion that I've drawn from each one.
Let's say that I have
int vals[] = {1,50,3,28,32,500};
int* valptr = vals;
Because vals is a pointer to the first value of the array, then that means that valptr, should equal vals as a whole, since I'm just setting a pointer equal to a pointer, like 1=1.
cout<<vals<<endl;
cout<<&vals<<endl;
cout<<*(&vals)<<endl;
cout<<valptr<<endl;
The code above all prints out the same value, leading me to conclude two things: first, that valptr and vals are equal and can be treated in the same way; and second, that for some reason adding & and * to vals doesn't seem to refer to anything different, meaning that using them in this way is useless, since they all refer to the same value.
cout<<valptr[1]<<endl;//outputs 50
cout<<vals[1]<<endl;//outputs 50
cout<<*vals<<endl;//outputs 1
cout<<*valptr<<endl;//outputs 1
The code above reinforces my impression that valptr and vals are the same, seeing as whenever I do something to vals and do the same thing to valptr, they both yield the same result.
cout<<*(&valptr +1)-valptr<<endl;
cout<<endl;
cout<< *(&vals + 1) -vals<<endl;
Now that we have established what I know, or the misconceptions that I may have, now we move on to the two main problems that I have, which we'll go over now.
The first confusion I have is with cout<< *(&vals + 1) -vals<<endl; I know that this outputs the size of the array, and the general concept on how it works, but I'm confused on several parts.
As shown earlier, if
cout<<vals<<endl;
cout<<&vals<<endl;
cout<<*(&vals)<<endl;
all print out the same value, then why do I need the * and & in cout<< *(&vals + 1) -vals<<endl; ? I know that if I do vals+1 it just refers to the address of the next element in the array, meaning that *(vals+1) returns 50.
This brings me to my first question: Why does the & in *(&vals+1) refer to the next address out the array? Why is it not the same output as (vals+1), in the way that *(&vals) and (vals) have the same output?
Now to my second question. We know that cout<< *(&vals + 1) -vals<<endl; is a valid statement, successfully printing the size of the array. However, as I stated earlier, in every other instance valptr could be substituted for vals interchangeably. Yet when I write cout<<*(&valptr +1)-valptr<<endl; I get 0 returned, instead of the expected value of 6. How can something that was proven to be interchangeable before, no longer be interchangeable in this instance?
I appreciate any help that anyone reading this can give me
Arrays and pointers are two different types in C++ that have enough similarities between them to create confusion if they are not understood properly. The fact that pointers are one of the most difficult concepts for beginners to grasp doesn't help either.
So I feel a quick crash course is needed.
Crash course
Arrays, easy
int a[3] = {2, 3, 4};
This creates an array named a that contains 3 elements.
Arrays have the array subscript operator defined:
a[i]
evaluates to the i'th element of the array.
Pointers, easy
int val = 24;
int* p = &val;
p is a pointer holding the address of the object val; in other words, p points to val.
Pointers have the indirection (dereference) operator defined:
*p
evaluates to the value of the object pointed to by p.
Pointers, acting like arrays
Pointers give you the address of an object in memory. It can be a "standalone object" like in the example above, or it can be an object that is part of an array. Neither the pointer type nor the pointer value can tell you which one it is. Only the programmer knows.
That is why pointers also have the array subscript operator defined:
p[i]
evaluates to the i'th element to the "right" of the object pointed to by p. This assumes that the object pointed to by p is part of an array (except for p[0], where it doesn't need to be part of an array).
Note that p[0] is equivalent to (the exact same as) *p.
Arrays, acting like pointers
In most contexts array names decay to a pointer to the first element in the array. That is why many beginners think that arrays and pointers are the same thing. In fact they are not. They are different types.
Back to your questions
Because vals is a pointer to the first value of the array
No, it's not.
... then that means that valptr, should equal vals as a whole, since I'm just setting a pointer equal to a pointer, like 1=1.
Because the premise is false the rest of the sentence is also false.
valptr and vals are equal
No they are not.
can be treated in the same way
Most of the time yes, because of array-to-pointer decay. However, that is not always the case, e.g. when the array is the operand of sizeof or of the & (address-of) operator.
Why does the & in *(&vals+1) refer to the next address out the array?
&vals is one of the few situations where vals doesn't decay to a pointer. &vals is the address of the array and is of type int (*)[6] (pointer to array of 6 ints). That's why &vals + 1 is the address of a hypothetical other array just to the right of the array vals.
How can something that was proven to be interchangeable before, no longer be interchangeable in this instance?
That is simply how the language works. In most cases the name of the array decays to a pointer to the first element of the array, except in the few situations that I've mentioned.
More crash course
A pointer to the array is not the same thing as a pointer to the first element of the array. Taking the address of the array is one of the few instances where the array name doesn't decay to a pointer to its first element.
So &a is a pointer to the array, not a pointer to the first element of the array. Both the array and the first element of the array start at the same address so the values of the two (&a and &a[0]) are the same, but their types are different and that matters when you apply the dereference or array subscript operator to them:
| Expression | Expression type | Dereference / array subscript expression | Dereference / array subscript type |
|---|---|---|---|
| a | int[3] | *a / a[i] | int |
| &a | int (*)[3] (pointer to array) | *&a / (&a)[i] | int[3] |
| &a[0] | int * | *&a[0] / (&a[0])[i] | int |

Converting a pointer value that is returned from a malloc operation that creates an array object to the element type results in UB?

struct T{
T(){}
~T(){}
};
int main(){
auto ptr = (T*)malloc(sizeof(T)*10); // #1
ptr++;
}
Since T is not an implicit-lifetime class, the operation malloc can never implicitly create an object of class type T. In this example, we intend to make ptr point to the initial element of the array object with 10 elements of type T. Based on this assumption, ptr++ would have valid behavior. As per [intro.object] p11
Further, after implicitly creating objects within a specified region of storage, some operations are described as producing a pointer to a suitable created object. These operations select one of the implicitly-created objects whose address is the address of the start of the region of storage, and produce a pointer value that points to that object, if that value would result in the program having defined behavior.
Since an array of any type is an implicit-lifetime type, malloc(sizeof(T)*10) can implicitly create an array object with 10 elements and start the lifetime of that array object. Since the initial element and the containing array are not pointer-interconvertible, ptr can never point to the initial element of that array. Instead, the operation can only produce the pointer value of the array object. With this assumption, we should divide #1 into several steps to make the program well-formed?
// produce the pointer value points to array object of 10 T
auto arrPtr = (T(*)[10])malloc(sizeof(T)*10);
// decay the array
auto ptr =*arrPtr; // use array-to-pointer conversion
ptr++;
Should we first get the array pointer and then use that pointer value to acquire the pointer to the initial element? Are these steps necessary to make the program well-formed?
I think the first example is perfectly valid but rather pointless. The storage returned by malloc does not contain any initialized objects. While I think you can form a pointer to each element and iterate over the array, you must not dereference the pointer at any time.
In my opinion the only thing that is missing here is the use of either placement new or construct_at like this:
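#include <cstdlib> // std::malloc
#include <memory>  // std::construct_at (C++20)
struct T{
    T(){}
    ~T(){}
};
int main(){
    constexpr int SIZE = 10;
    auto ptr = (T*)std::malloc(sizeof(T)*SIZE); // #1
    if (!ptr) return 1; // allocation failed
    for (auto p = ptr; p < ptr + SIZE; ++p) {
        std::construct_at(p); // start the lifetime of each T
    }
    ptr++;
}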
The objects in the array can only be constructed using pointer arithmetic or array indexing ptr[i], which is equivalent to *(ptr + i), so basically the same as p++. If your initial example is UB then the construct_at calls above would be UB too. The only way around that would be to treat the return value of malloc as a byte array and call construct_at every sizeof(T) bytes. After construction the array could then be cast to the right type, I hope. But that would be silly and I hope nobody finds a reason why that should be necessary.
Overall, if your example were illegal, then how would you ever manage to implement new() itself?

Whats the difference between incrementing a pointer compared to incrementing the elements of the array the pointer points to?

I'm trying to understand what the difference between these two functions is:
void funk1(char* goal, char* source){
    int i = 0;
    while((goal[i] = source[i]) != '\0')
        i++;
}
and
void funk2(char* goal, char* source){
    while((*goal = *source) != '\0'){
        goal++;
        source++;
    }
}
Can someone help me please?
Whats the difference between incrementing a pointer compared to incrementing the elements of the array the pointer points to?
Incrementing a pointer (or more generally, any iterator) modifies the pointer. The resulting pointer will point to the next element of the array.
Indirecting through a pointer and incrementing the pointed object modifies the pointed object. How increment modifies the object depends on the type of the object.
Note that in neither of your examples do you increment an element of an array. In the first you increment the variable i, which is an integer that you use as an index, while in the second you increment two pointers.
The compiler will probably generate the same machine code for both functions. If you compile with a flag that explicitly disables optimization, funk2 may be slightly faster, although on modern computers the difference is negligible.
Why?
funk1 needs 4 registers while funk2 needs only 3.
Also, goal[i] is equivalent to *(goal + i), which is a heavier computation than ++.
goal[i] is equivalent to *(goal + i), so both implementations work.
Note: none of these functions are actually "incrementing the elements of the array the pointer points to". The elements of the array are modified with operator=, but the ++ just modifies the pointers themselves.

Intuitively explaining pointers and their significance?

I'm having a hard time understanding pointers, particularly function pointers, and I was hoping someone could give me a rundown of exactly what they are and how they should be used in a program. Code blocks in C++ would be especially appreciated.
Thank you.
The concept of indirection is important to understand.
Here we are passing by value (note that a local copy is created and operated on, not the original version) via increment(x):
And here, by pointer (memory address) via increment(&x):
Note that references work similarly to pointers except that the syntax is similar to value copies (obj.member) and that pointers can point to 0 ("null" pointer) whereas references must be pointing to non-zero memory addresses.
Function pointers, on the other hand, let you dynamically change the behaviour of code at runtime by conveniently passing around and dealing with functions in the same way you would pass around variables. Functors are often preferred (especially by the STL) since their syntax is cleaner and they let you associate local state with a function instance (read up about callbacks and closures, both are useful computer science concepts). For simple function pointers/callbacks, lambdas are often used (new in C++11) due to their compact and in-place syntax.
A pointer is a value that holds the address of a location in memory; it "points to" that location.
Pointers can point to other pointers, which lets you reach a value through more than one level of indirection.
Address-of (reference) operator (&):
You may assign to a pointer the address of a variable or of another pointer.
Dereference operator (*):
You may get the value of the cell pointed to by the pointer.
Arrays decay into a pointer when passed to a function by value rather than by reference.
Calls through function pointers are generally not inlined, but they make a program more flexible. A callback is an example of this.
As an analogy, think of the memory in the computer as an Excel sheet. The equivalent of assigning a value to a variable in a C/C++ program would be to write something into a cell on the Excel sheet. Reading from a variable would be like looking at a cell's content.
Now, if you have a cell (say C3) whose content is "B8" you can interpret that content as a reference to another cell. If you treat the cell C3 in that manner, C3 becomes like a pointer. (In Excel, you can actually achieve this behavior by entering =B8 into C3).
In such a scenario, you basically state that the cell whose value you're interested in is referenced in C3. In C++, this could be something like:
int B8 = 42;
int* C3 = &B8;
You now have two variables that occupy memory. Now, if you want to know what C3 points to, you'll use
int my_value = *C3;
As for function pointers: these are variables like ordinary pointers but the address (cell) they point to is not just a value but rather a function you can call.
Here you can find some example uses of Function pointer:
http://www.cprogramming.com/tutorial/function-pointers.html
In order to understand pointers one needs to understand a bit about hardware and memory layout.
Computer memory can be seen as a cupboard with drawers. A pointer can point to an arbitrary drawer, when you "dereference" the pointer you are looking inside the drawer, i.e. the value stored in the "drawer":
e.g. ten numbers stored after one another
short a[] = {9,2,3,4,5,6,7,8,1,-1};
in memory the values that consist the array 'a' are stored sequentially
+---+---+---+---+---+---+---+---+---+---+
a->| 9 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 1 | -1|
+---+---+---+---+---+---+---+---+---+---+
the name 'a' is not itself a pointer, but in most expressions it converts to a pointer to the start address where the
values are stored in memory.
short* ptr = 0; // a pointer not pointing to anything (NULL)
ptr = a + 5; // the pointer is now pointing to the 6th value in the array
// the + 5 is an offset on the starting address of a and since the type of
// a is short int array the compiler calculates the correct byte offset based
// on that type. in the above example 5 is 5 short ints since ptr is of type
// short*
*ptr has the value 6 i.e. we are looking at what the ptr is pointing to.
The size of each "drawer" is determined by the data type that is stored.
In the above example 10 short ints are stored. On typical platforms every short int occupies 2 bytes,
so the whole array occupies 20 bytes of memory (sizeof(a)).
Variables store values.
Pointers store values too. The only difference is that they are memory addresses. They are still numbers.
Nothing complicated here.
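You could, in principle, store a pointer in an integer variable that is wide enough (int is too narrow on most 64-bit platforms; std::uintptr_t from <cstdint> is the portable choice):
std::uintptr_t p = 43567854;
char *p1 = (char *) p;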
But the advantage of storing them in pointer variables is that you can describe what type of variable is saved at the address the pointer points to.
What gets complicated about pointers is the cryptic syntax you have to use with them. Syntax is cryptic so that it's short to type.
Like:
&p = return address of variable
*p = return the value of first member of array that is stored at the address
By using the two rules above we can write this cryptic code:
&(*++t)
Which translated into human language gets quite long:
Increase the value of t by 1. It now points to the second member of the array of values the pointer points to. Then get the value of the second member (*) and then get the address of this value. If we print this pointer, it will print the whole string except the first character.
You should make a "pointers syntax cheat sheet.txt" and you are good.
And keep a "Pointers tests" project open to test everything that is unclear to you.
Pointers are similar to regular expressions in a way.

Pointers and dynamically allocated arrays

According to my class notes, you can allocate an array in C++ like
int *A = new int[5];
where A is a pointer to the array.
But then you can access the array as A[3]. Why can you do that? Isn't A a pointer and not the actual array?
The indexing operator[] is actually defined to work on pointers, not on arrays. A[3] is actually a synonym for *(A+3). It works on arrays as a consequence of the fact that arrays can be implicitly converted to pointers to their first element.
A[3] is an alias for *(A+3) which dereferences the pointer.
You are right! Basically, this is syntactic sugar, meaning that the language designers put something in place to make your life a bit easier, but behind the scenes it is doing something quite different. (This point is arguable)
Effectively, what this is doing is taking the pointer location at A, moving the pointer 3 int-sizes up, then giving you the value by dereferencing.
This is the starting point of pointer arithmetic: you can add or subtract values with pointers, as long as you don't move off the array (well, you can move up to right after the array). Basically, the following hold (for A being a pointer to an array with at least i + 1 elements):
A[i]
*(A + i) // this what A[i] means
*(i + A) // addition is commutative
i[A] // this should work, too - and it does!