a is array of integers, if I try to subtract the address value of &a[2] - &a[1] == ?
what should the result be 4 or 1 ?
EDIT: see 4th comment on top answer here why he says 1 ?? this is why I'm confused I thought it will be 4
EDIT: here is a test
&a[2] is same as &(*(a + 2)) (i.e (a + 2)) and &a[1] is same as &(*(a + 1)) (i.e. (a + 1)). So answer will be 1.
Pointer subtraction gives you the difference in elements, not bytes. It does not matter what the element type of the array is, the result of &a[2] - &a[1] will always be 1, because they are 1 element apart.
It is always 1. The pointer arithmetics are not concerned with the number of bytes that each element has, and this is very useful. Compare these:
ptr++; // go to the next element (correct)
ptr += sizeof *ptr; // go to the next element (wrong)
When you work with arrays you are usually interested in the elements, not in the bytes comprising them, and that is why pointer arithmetics in C has been defined this way.
The difference must be 1. When you compare pointers you will always get the difference of elements.
Since this is C++, I'm going to assume that you have not overridden the & or * operators on whatever type a is. Minding that, the following is true:
&a[2] - &a[1]
(a + 2) - (a + 1)
a + 2 - a - 1
2 - 1
1
A couple of the answers here (deleted since this answer was posted) clearly had byte* in mind:
int a[10];
byte * pA2 = (byte*)&a[2];
byte * pA1 = (byte*)&a[1];
int sz1 = &a[2] - &a[1];
int sz2 = pA2 - pA1;
CString msg;
msg.Format("int * %d, byte * %d\n", sz1, sz2);
OutputDebugString(msg);
output is:
int * 1, byte * 4
Two addresses but depending on the declaration of the variable the addresses are stored in, the difference between the two can be 1 or 4.
Related
When we subtract a pointer from another pointer the difference is not equal to how many bytes they are apart but equal to how many integers (if pointing to integers) they are apart. Why so?
The idea is that you're pointing to blocks of memory
+----+----+----+----+----+----+
| 06 | 07 | 08 | 09 | 10 | 11 | mem
+----+----+----+----+----+----+
| 18 | 24 | 17 | 53 | -7 | 14 | data
+----+----+----+----+----+----+
If you have int* p = &(array[5]) then *p will be 14. Going p=p-3 would make *p be 17.
So if you have int* p = &(array[5]) and int *q = &(array[3]), then p-q should be 2, because the pointers are point to memory that are 2 blocks apart.
When dealing with raw memory (arrays, lists, maps, etc) draw lots of boxes! It really helps!
Because everything in pointer-land is about offsets. When you say:
int array[10];
array[7] = 42;
What you're actually saying in the second line is:
*( &array[0] + 7 ) = 42;
Literally translated as:
* = "what's at"
(
& = "the address of"
array[0] = "the first slot in array"
plus 7
)
set that thing to 42
And if we can add 7 to make the offset point to the right place, we need to be able to have the opposite in place, otherwise we don't have symmetry in our math. If:
&array[0] + 7 == &array[7]
Then, for sanity and symmetry:
&array[7] - &array[0] == 7
So that the answer is the same even on platforms where integers are different lengths.
Say you have an array of 10 integers:
int intArray[10] = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
Then you take a pointer to intArray:
int *p = intArray;
Then you increment p:
p++;
What you would expect, because p starts at intArray[0], is for the incremented value of p to be intArray[1]. That's why pointer arithmetic works like that. See the code here.
"When you subtract two pointers, as long as they point into the same array, the result is the number of elements separating them"
Check for more here.
This way pointer subtraction behaves is consistent with the behaviour of pointer addition. It means that p1 + (p2 - p1) == p2 (where p1 and p2 are pointers into the same array).
Pointer addition (adding an integer to a pointer) behaves in a similar way: p1 + 1 gives you the address of the next item in the array, rather than the next byte in the array - which would be a fairly useless and unsafe thing to do.
The language could have been designed so that pointers are added and subtracted the same way as integers, but it would have meant writing pointer arithmetic differently, and having to take into account the size of the type pointed to:
p2 = p1 + n * sizeof(*p1) instead of p2 = p1 + n
n = (p2 - p1) / sizeof(*p1) instead of n = p2 - p1
So the result would be code that is longer, and harder to read, and easier to make mistakes in.
When applying arithmetic operations on pointers of a specific type, you always want the resulting pointer to point to a "valid" (meaning the right step size) memory-address relative to the original starting-point. That is a very comfortable way of accessing data in memory independently from the underlying architecture.
If you want to use a different "step-size" you can always cast the pointer to the desired type:
int a = 5;
int* pointer_int = &a;
double* pointer_double = (double*)pointer_int; /* totally useless in that case, but it works */
#fahad Pointer arithmetic goes by the size of the datatype it points.So when ur pointer is of type int you should expect pointer arithmetic in the size of int(4 bytes).Likewise for a char pointer all operations on the pointer will be in terms of 1 byte.
This question already has answers here:
Incrementing pointers in C arrays
(1 answer)
Adding an offset to a pointer
(3 answers)
Closed 3 years ago.
I am coming back to C programming after some years so I guess I am a bit rusty, but I am seeing some weird behavior in my code.
I have the following:
memcpy(dest + (start_position * sizeof(MyEnum)), source, source_size * sizeof(MyEnum));
Where:
dest and source are arrays of MyEnum with different sizes,
dest is 64 bytes long.
source is 16 bytes long.
sizeof(MyEnum) is 4 bytes
source_size is 4, as there are 4 enums inside the array.
I am looping this code 4 times, advancing start_position at each time, so at each of the 4 loop iterations I get memcpy being called with the following values (I already checked this with the debugger):
memcpy(dest + (0), source, 16); (start_position = 0 * 4, since source size is 4)
memcpy(dest + (16), source, 16); (start_position = 1 * 4, since source size is 4)
memcpy(dest + (32), source, 16); (start_position = 2 * 4, since source size is 4)
memcpy(dest + (48), source, 16); (start_position = 3 * 4, since source size is 4)
memcpy works fine on the first loop, but on the second it copies data to another array instead, clearly going outside the memory region of dest array, violating another array's memory area.
So I checked the pointer arithmetic happening inside my function and this is what I got:
dest address is 0xbeffffa74
dest + (start_position * sizeof(MyEnum)) is 0xbefffab4 for (start_position * sizeof(MyEnum) = 16
The array being violated is at 0xbefffab4.
Although this explains why the array's memory is being violated, I don't get how 0xbeffffa74 + 16 is going to be 0xbefffab4, but I can confirm that's the address that memcpy is being called at.
I am running this on a Raspberry Pi, but AFAIK this shouldn't matter.
Pointer arithmetic works on the size of the pointed datatype. If you have a char* then ever time you increment the pointer it’ll move by one. If it’s an int* then every increment adds more than one, usually 4 to the pointer (due to int usually, but not always, being 32bit).
If you have a pointer to a struct then incrementing the pointer moves it by the size of the struct. Therefore the sizeof shouldn’t be there or you’ll move way too much.
memcpy(dest + (start_position * sizeof(MyEnum)), source, source_size * sizeof(MyEnum));
This moves the pointer 4*4 bytes every position since the MyEnum is four bytes.
memcpy(dest + start_position, source, source_size * sizeof(MyEnum));
This moves it only 4 bytes at a time.
This is logical because pointer[2] is the same as *(pointer + 2) so if pointer arithmetic didn’t implicitly take the pointed type size into account all indexing would also need the sizeof and you’d end up writing a lot of pointer[2 * sizeof(*pointer)].
I found this piece of code:
void* aligned_malloc(size_t required_bytes, size_t alignment) {
int offset = alignment - 1;
void* P = (void * ) malloc(required_bytes + offset);
void* q = (void * ) (((size_t)(p) + offset) & ~(alignment - 1));
return q;
}
that is the implementation of aligned malloc in C++. Aligned malloc is a function that supports allocating memory such that the
memory address returned is divisible by a specific power of two.
Example:
align_malloc (1000, 128) will return a memory address that is a multiple of 128 and that points to memory of size 1000 bytes.
But I don't understand line 4. Why sum twice the offset?
Thanks
Why sum twice the offset?
offset isn't exactly being summed twice. First use of offset is for the size to allocate:
void* p = (void * ) malloc(required_bytes + offset);
Second time is for the alignment:
void* q = (void * ) (((size_t)(p) + offset) & ~(alignment - 1));
Explanation:
~(alignment - 1) is a negation of offset (remember, int offset = alignment - 1;) which gives you the mask you need to satisfy the alignment requested. Arithmetic-wise, adding the offset and doing bitwise and (&) with its negation gives you the address of the aligned pointer.
How does this arithmetic work? First, remember that the internal call to malloc() is for required_bytes + offset bytes. As in, not the alignment you asked for. For example, you wanted to allocate 10 bytes with alignment of 16 (so the desired behavior is to allocate the 10 bytes starting in an address that is divisible with 16). So this malloc() from above will give you 10+16-1=25 bytes. Not necessarily starting at the right address in terms of being divisible with 16). But then this 16-1 is 0x000F and its negation (~) is 0xFFF0. And now we apply the bitwise and like this: p + 15 & 0xFFF0 which will cause every pointer p to be a multiple of 16.
But wait, why add this offset of alignment - 1 in the first place? You do it because once you get the pointer p returned by malloc(), the one thing you cannot do -- do in order to find the nearest address which is a multiple of the alignment requested -- is look for it before p, as this could cross into an address space of something allocated before p. For this, you begin by adding alignment - 1, which, think about it, is exactly the maximum by which you'd have to advance to get your alignment.
* Thanks to user DevSolar for some additional phrasing.
Note 1: For this way to work the alignment must be a power of 2. This snippet does not enforce such a thing and so could cause unexpected behavior.
Note 2: An interesting question is how could you implement a free() version for such an allocation, with the return value from this function.
This question already has an answer here:
When I subtract memory addresses, why is the result smaller than I expected?
(1 answer)
Closed 8 years ago.
I am just confused with the output which I got for the following code :
int arr[] = {10,20,30};
cout<<&arr[1]<<"\t"<<&arr[0]<<"\t"<<&arr[1] - &arr[0];
the output which I got was like
0046F7A0 0046F79C 1
I want to know why the difference between the address gave 1 ( I expected 4)...?
Is it something to do with pointer subtraction..?
Yes, this is the result of pointer arithmetic. This is the same reason why arr + 1 would point to arr[1]. Pointer arithmetic is only well-defined when both pointers point to elements in the same array. If two such pointers, P and Q, point to array locations i and j, then P-Q = i-j.
Also, if you look at the differences of the actual addresses printed, they match your expectations - the difference is 4.
You are correct, this has to do with pointer arithmetic. Subtracting two int pointers gives you the difference between them measured in sizeof (int) units. You can get the difference in plain bytes by casting your pointers to char pointers, as chars are guaranteed to be of size 1.
uint arr[] = {10,20,30};
cout << &arr[1] << "\t" << &arr[0] << "\t" << (char*)&arr[1] - (char*)&arr[0];
Output:
0x23fe44 0x23fe40 4
0046F7A0 - 0046F79C is actually 4 but &arr[0]-&arr[1] = (0046F7A0 - 0046F79C)/sizeof(int= 4 bytes) because subtracting two pointers gives you the number of elements between them.
$5.7 -
"[..]For addition, either both operands shall have arithmetic or enumeration type, or one operand shall be a pointer to a completely defined object type and the other shall have integral or enumeration type.
2 For subtraction, one of the following shall hold:
— both operands have arithmetic or enumeration type; or
— both operands are pointers to cv-qualified or cv-unqualified versions of the same completely defined object type; or
— the left operand is a pointer to a completely defined object type and the right operand has integral or enumeration type.
int main(){
int buf[10];
int *p1 = &buf[0];
int *p2 = 0;
p1 + p2; // Error
p1 - p2; // OK
}
So, my question is why 'pointer addition' is not supported in C++ but 'pointer subtraction' is?
The difference between two pointers means the number of elements of the type that would fit between the targets of the two pointers. The sum of two pointers means...er...nothing, which is why it isn't supported.
The result of subtraction is distance (useful).
The result of adding a pointer and a distance is another meaningful pointer.
The result of adding 2 pointers is another pointer, this time meaningless though.
It's the same reason there are distinct TimeSpan and DateTime objects in most libraries.
The first thing that comes to mind is that it doesn't make sense to do pointer addition, so it's not supported. If you have 2 pointers 0x45ff23dd, 0x45ff23ed. What does it mean to add them?? Some memory out-of-bounds. And people in standard comittee have not found good enough reasons to support stuff like that, and rather warn you at compile time about possible problem. While pointer subtraction is fine because it indicates memory distance, which is often useful.
Because adding two pointers doesn't make sense.
Consider I have two ints in memory at 0x1234 and 0x1240. The difference between these addresses is 0xc and is a distance in memory. The sum is 0x2474 and doesn't correspond to anything meaningful.
You can however, add a pointer to an integer to get another pointer. This is what you do when you index into an array: p[4] means *(p + 4) which means "the thing stored at the address 4 units past this address."
In general, you can determine the "pointerness" of an arithmetic operation by assigning each pointer a value 1 and each integer a value zero. If the result is 1, you've got a pointer; if it's 0, you've got an integer; if it's any other value, you have something that doesn't make sense. Examples:
/* here p,q,r are pointers, i,j,k are integers */
p + i; /* 1 + 0 == 1 => p+i is a pointer */
p - q; /* 1 - 1 == 0 => p-q is an integer */
p + (q-r); /* 1 + (1-1) == 1 => pointer */
Result of pointer subtraction is number of objects between two memory addresses. Pointer addition doesn't mean anything, this is why it is not allowed.
subtraction of pointers is only defined if they point into the same array of objects. The resulting subtraction is scaled by the size of object they point to. ie, pointer subtraction gives the number of elements between the two pointers.
N.B. No claims on C standards here.
As a quick addendum to #Brian Hooper's answer, "[t]he sum of two pointers means...er...nothing" however the sum of a pointer and an integer allows you to offset from the initial pointer.
Subtracting a higher value pointer from a lower value pointer gives you the offset between the two. Note that I am not accounting for memory paging here; I'm assuming the memory values are both within the accessible scope of the process.
So if you have a pointer to a series of consecutive memory locations on the heap, or an array of memory locations on the stack (whose variable name decays to a pointer), these pointers (the real pointer and the one that decays to a pointer) will point to the fist memory location question (i.e. element [0]). Adding an integer value to the pointer is equivalent to incrementing the index in brackets by the same number.
#include <stdio.h>
#include <stdlib.h>
int main()
{
// This first declaration does several things (this is conceptual and not an exact list of steps the computer takes):
// 1) allots space on the stack for a variable of type pointer
// 2) allocates number of bytes on the heap necessary to fit number of chars in initialisation string
// plus the NULL termination '\0' (i.e. sizeof(char) * <characters in string> + 1 for '\0')
// 3) changes the value of the variable from step 1 to the memory address of the beginning of the memory
// allocated in step 2
// The variable iPointToAMemoryLocationOnTheHeap points to the first address location of the memory that was allocated.
char *iPointToAMemoryLocationOnTheHeap = "ABCDE";
// This second declaration does the following:
// 1) allots space on the stack for a variable that is not a pointer but is said to decay to a pointer allowing
// us to so the following iJustPointToMemoryLocationsYouTellMeToPointTo = iPointToAMemoryLocationOnTheHeap;
// 2) allots number of bytes on the stack necessary to fit number of chars in initialisation string
// plus the NULL termination '\0' (i.e. sizeof(char) * <characters in string> + 1 for '\0')
// The variable iPointToACharOnTheHeap just points to first address location.
// It just so happens that others follow which is why null termination is important in a series of chars you treat
char iAmASeriesOfConsecutiveCharsOnTheStack[] = "ABCDE";
// In both these cases it just so happens that other chars follow which is why null termination is important in a series
// of chars you treat as though they are a string (which they are not).
char *iJustPointToMemoryLocationsYouTellMeToPointTo = NULL;
iJustPointToMemoryLocationsYouTellMeToPointTo = iPointToAMemoryLocationOnTheHeap;
// If you increment iPointToAMemoryLocationOnTheHeap, you'll lose track of where you started
for( ; *(++iJustPointToMemoryLocationsYouTellMeToPointTo) != '\0' ; ) {
printf("Offset of: %ld\n", iJustPointToMemoryLocationsYouTellMeToPointTo - iPointToAMemoryLocationOnTheHeap);
printf("%s\n", iJustPointToMemoryLocationsYouTellMeToPointTo);
printf("%c\n", *iJustPointToMemoryLocationsYouTellMeToPointTo);
}
printf("\n");
iJustPointToMemoryLocationsYouTellMeToPointTo = iPointToAMemoryLocationOnTheHeap;
for(int i = 0 ; *(iJustPointToMemoryLocationsYouTellMeToPointTo + i) != '\0' ; i++) {
printf("Offset of: %ld\n", (iJustPointToMemoryLocationsYouTellMeToPointTo + i) - iPointToAMemoryLocationOnTheHeap);
printf("%s\n", iJustPointToMemoryLocationsYouTellMeToPointTo + i);
printf("%c\n", *iJustPointToMemoryLocationsYouTellMeToPointTo + i);
}
printf("\n");
iJustPointToMemoryLocationsYouTellMeToPointTo = iAmASeriesOfConsecutiveCharsOnTheStack;
// If you increment iAmASeriesOfConsecutiveCharsOnTheStack, you'll lose track of where you started
for( ; *(++iJustPointToMemoryLocationsYouTellMeToPointTo) != '\0' ; ) {
printf("Offset of: %ld\n", iJustPointToMemoryLocationsYouTellMeToPointTo - iAmASeriesOfConsecutiveCharsOnTheStack);
printf("%s\n", iJustPointToMemoryLocationsYouTellMeToPointTo);
printf("%c\n", *iJustPointToMemoryLocationsYouTellMeToPointTo);
}
printf("\n");
iJustPointToMemoryLocationsYouTellMeToPointTo = iAmASeriesOfConsecutiveCharsOnTheStack;
for(int i = 0 ; *(iJustPointToMemoryLocationsYouTellMeToPointTo + i) != '\0' ; i++) {
printf("Offset of: %ld\n", (iJustPointToMemoryLocationsYouTellMeToPointTo + i) - iAmASeriesOfConsecutiveCharsOnTheStack);
printf("%s\n", iJustPointToMemoryLocationsYouTellMeToPointTo + i);
printf("%c\n", *iJustPointToMemoryLocationsYouTellMeToPointTo + i);
}
return 1;
}
The first notable thing we do in this program is copy the value of the pointer iPointToAMemoryLocationOnTheHeap to iJustPointToMemoryLocationsYouTellMeToPointTo. So now both of these point to the same memory location on the heap. We do this so we don't lose track of the beginning of it.
In the first for loop we increment the value we just copied into iJustPointToMemoryLocationsYouTellMeToPointTo (increasing it by 1 means it points to one memory location further away from iPointToAMemoryLocationOnTheHeap).
The second loop is similar but I wanted to more clearly show how incrementing the value is related to the offset, and how the arithmetic works.
The third and fourth loops repeat the process but work on an array on the stack as opposed to allocated memory on the heap.
Note the asterisk * when printing the individual char. This tells printf to output whatever is pointed to by the variable, and not the contents of the variable itself. This is in contrast to the line above where the balance of the string is printed and there is no asterisk before the variable because printf() is looking at the series of memory locations in its entirety until NULL is reached.
Here is the output on ubuntu 15.10 running on an i7 (the first and third loops output starting at an offset of 1 because my loop choice of for increments at the beginning of the loop as opposed to a do{}while(); I just wanted to keep it simple):
Offset of: 1
BCDE
B
Offset of: 2
CDE
C
Offset of: 3
DE
D
Offset of: 4
E
E
Offset of: 0
ABCDE
A
Offset of: 1
BCDE
B
Offset of: 2
CDE
C
Offset of: 3
DE
D
Offset of: 4
E
E
Offset of: 1
BCDE
B
Offset of: 2
CDE
C
Offset of: 3
DE
D
Offset of: 4
E
E
Offset of: 0
ABCDE
A
Offset of: 1
BCDE
B
Offset of: 2
CDE
C
Offset of: 3
DE
D
Offset of: 4
E
E
Because the result of that operation is undefined. Where does p1 + p2 point to? How can you make sure it points to a properly initialized memory so that it could be dereferenced later?
p1 - p2 gives the offset between those 2 pointers and that result could be used further on.