Why does conceptual storage allocation differ from the actual? [duplicate] - c++

This question already has answers here:
Pointer subtraction confusion
(8 answers)
Closed 6 years ago.
I have a puzzling question (at least for me)
Say I declare an integer array:
int arr[3];
Conceptually, what happens in the memory is that, at compile time, 12 bytes are allocated to store 3 consecutive integers, right? (Here's an illustration)
Based on the illustration, the sample addresses of
arr[0] is 1000,
arr[1] is 1004, and
arr[2] is 1008.
My question is:
If I output the difference between the addresses of arr[0] and arr[1]:
std::cout << &arr[1] - &arr[0] << std::endl;
instead of getting 4,
I surprisingly get 1.
Can anybody explain why I get that output?
PS: On my computer, an int is 4 bytes.

Pointer arithmetic automatically divides the byte difference by the size of the pointed-to type, so this is not surprising: the addresses are 4 bytes apart, and 4 / 4 is 1. Cast to unsigned char * to see the difference in bytes.
#include <iostream>
int
main(void)
{
int arr[2];
// Difference in elements: prints 1.
std::cout << &arr[1] - &arr[0] << std::endl;
// Difference in bytes: prints sizeof(int), 4 on the asker's machine.
std::cout << reinterpret_cast<unsigned char *>(&arr[1]) -
reinterpret_cast<unsigned char *>(&arr[0]) << std::endl;
return 0;
}

Related

Calculating the number of elements in an array using pointer arithmetic

I have come across a piece of example code that uses pointers and a simple subtraction to calculate the number of items in an array using C++.
I have run the code and it works but when I do the math on paper I get a different answer.
Their explanation does not really show why this works, and I was hoping someone could explain this to me.
#include <iostream>
using namespace std;
int main() {
int array[10] = {0, 9, 1, 8, 2, 7, 3, 6, 4, 5};
int stretch = *(&array + 1) - array;
cout << "Array is consists of: " << stretch << " numbers" << endl;
cout << "Hence, Length of Array is: " << stretch;
return 0;
}
From: https://www.educba.com/c-plus-plus-length-of-array/
When I run the code I get the number 10.
When I print the results of *(&array + 1) and array by
cout << *(&array+1) << endl; cout << array << endl;
I get, of course, two hex addresses.
When I subtract these hex numbers I get 1C or 28???
Is it possible that C++ does not actually give the hex results or their translation to decimal but rather sees these numbers as addresses and therefore only returns the number of address slots remaining?
That is the closest I can come to an explanation. If someone with more knowledge than I have could explain this, I would be very grateful.
Let's take one step back and take it step-by-step to see if it will help. Continuing from my comment, the problem you are having difficulty with is one of type.
Let's take the array itself:
int array[10] = {0, 9, 1, 8, 2, 7, 3, 6, 4, 5};
On access, an array is converted to a pointer to the first element in the array (i.e. the address of the first element), subject to caveats not relevant here. So when you say array, you have type int *, a pointer to the first element in array.
Now what happens when I take the address of the array? (&array in)
int stretch = *(&array + 1) - array;
When you take the address of the array, the result is the same address as array, but it has type int (*)[10] (a pointer to an array of 10 int). When you add 1 to that pointer (recall that type controls pointer arithmetic), you get a pointer to where the next int[10] would sit in memory after array, which is 10 int past the first element of array.
So (&array + 1) gives you the address of the next array of int[10] after array, and the dereference is only needed for type compatibility. When you dereference an int (*)[10] you are left with an int[10], which on access gives you the address of the first element of that array (one after the original) as an int *. Subtracting array, which is also an int * here, then counts the int elements between the two addresses: 10.
Think through the types and let me know if you have further questions.
You forgot a small detail of how pointer addition or subtraction works. Let's start with a simple example.
int *p;
This is pointing to some integer. If, with your C++ compiler, ints are four bytes long:
++p;
This does not increment the actual pointer value by 1, but by 4. The pointer is now pointing to the next int. If you look at the actual pointer value, in hexadecimal, it will increase by 4, not 1.
Pointer subtraction works the same way:
int *a;
int *b;
// ...
std::ptrdiff_t c = b - a; // pointer differences are signed (std::ptrdiff_t, from <cstddef>)
If the difference in the hexadecimal values of a and b is 12, the result of this subtraction will not be 12, but 3.
When I subtract these hex numbers I get 1C or 28 ???
There must've been a mistake with your subtraction. Your result should be 0x28, or 40 (most likely you asked your debugger or compiler to do the subtraction, you got the result in hexadecimal and assumed that it was decimal instead). That would be the ten ints you were looking for.
I will try it with 5 items
#include <iostream>
using namespace std;
int main(int argc, char *argv[])
{
int array[] {1,2,3,4,5};
int items= sizeof(array)/sizeof(array[0]);
cout << items << endl;
int items2 = *(&array +1) - array;
cout << items2 << endl;
cout << array << endl;
cout << *(&array +1) << endl;
return 0;
}
root#localhost:~/Source/c++# g++ arraySize.cpp
root#localhost:~/Source/c++# ./a.out
5
5
0x7fe2ec2800
0x7fe2ec2814
using https://www.gigacalculator.com/calculators/hexadecimal-calculator.php to subtract the numbers from each other
I get
14 hex
20 decimal.
that fits with the 4 bytes to an integer.
thanx guys :)
This is an edit made on the 12th of December, Melbourne time:
I still had questions on this topic, and something did not sit right with me about the entire route to counting array items via this code.
I found something I think is interesting and, again, would love to know why (I shall try to explain it as best I can myself anyway).
*(&array + 1) is the question.
Let's have a look at it.
As I understand it, arrays in C and C++ are by their very nature only pointers to the first element in the array, so how can this work?
I shall use a small set of cout calls to see if I can find out what's happening.
#include <iostream>
using namespace std;
int main(int argc, char* argv[]){
int array[] {1,2,3,4,5,6,7,8,9,10};
int size {0};
size = *(&array + 1) - array;
cout << "size = *(&array + 1) - array = " << size << endl;
cout << "*(&array + 1) = " << *(&array + 1) << endl;
cout << "(&array + 1) = " << (&array + 1) << endl;
cout << "(array + 1) = " << (array + 1) << endl;
cout << "&array = " << &array << endl;
cout << "array = " << array << endl;
cout << "*(&array) = " << *(&array) << endl;
cout << "*(array) = " << *(array) << endl;
cout << "*array = " << *array << endl;
return 0;
}
Again, this is run under proot on my phone, so still as root with no systemd.
root#localhost:~/Source/c++# g++ arrayPointerSize.cpp
root#localhost:~/Source/c++# ./a.out
size = *(&array + 1) - array = 10
*(&array + 1) = 0x7ff6a51798
(&array + 1) = 0x7ff6a51798
(array + 1) = 0x7ff6a51774
&array = 0x7ff6a51770
array = 0x7ff6a51770
*(&array) = 0x7ff6a51770
*(array) = 1
*array = 1
We see that array, used as a pointer, can be dereferenced with * to give the value held at position [0] in the array.
Calling &array (a reference to array) returns the address of the first position in the array, [0].
Calling just array also gives the address of the first position in the array, [0].
When calling *array, the * works as it does for pointers: it dereferences the array's first position [0] to give the value.
Now things get a little interesting.
*(array) also dereferences the array, as is seen by its value being given as 1 in this instance.
Yet *(&array) does not dereference the array and returns the address of the first position in the array.
In this instance, memory address 0x7ff6a51770 is the first spot in the array (array = 0x7ff6a51770),
and &array (a reference to the pointer to the position in memory that is the first spot in the array) gives the same address, 0x7ff6a51770.
It is also worth noting that *(&array) returns the address of the first position in the array, while *(array) does not,
so it seems we cannot dereference a pointer to a position in memory when its value is itself the position in memory.
If array and &array give the same answer, since array is a pointer to the memory position of the first spot in our array and &array is a reference to this pointer,
why is there a different answer for (array + 1) and (&array + 1)?
We get the memory address 0x7ff6a51774 for (array + 1), which is in line with an integer taking four bytes on my Linux: four bytes past the first spot in the array (the second array spot). But (&array + 1) gives a different answer.
If we follow the bytes and the code, we see that (&array + 1) actually gives us the memory address just past the end of the array (0x7ff6a51798, which is 40 bytes after the start).
So the pointer to the memory address plus one gives the address one element size past the start of the array,
and the reference to the pointer plus one gives the address one element size past the start of the last spot in the array.
How then can array and &array return the same answer if (array + 1) and (&array + 1) do not?
It seems to me that the & reference, when working with arrays, overloads the + operator when doing arithmetic.
This would explain the difference in answers, as plain &array has no operator to overload and so returns the same answer as plain array.
This small piece of code also suggests that *(&array + 1) is a very bad way to demonstrate finding the array size with pointers, as really arrays are pointers and *(&array + 1) and (&array + 1) give the same result.
The heavy work was really being done by the reference operator &.
I may still be missing something here, as I have used cout directly with the different expressions, and being a stream it may be limited in its ability to show what the reference operator is really doing when working with arrays.
I am still learning this language, but I shall for sure keep this in mind as I dive deeper into C++.
I believe, other than a few more trials with variables, that the true answer will be found by reading the source of GCC's g++.
I am not ready for that yet.

How does vectors in C++ use memory? [duplicate]

This question already has answers here:
Why the libc++ std::vector internally keeps three pointers instead of one pointer and two sizes?
(3 answers)
Closed 1 year ago.
#include <iostream>
#include <vector>
int main(int argc, char const *argv[])
{
std::vector<int> a(32768, 0);
std::cout << "Size " << sizeof a << "\nCapacity " << a.capacity() << "\nElements " << a.size();
return 0;
}
For this program I'm getting the output:
Size 24
Capacity 32768
Elements 32768
Using valgrind, I calculated the heap usage, which is:
132096 bytes
that is (32768 x 4 bytes) + 24 bytes
I'm interested in how these 24 bytes are used by vector a.
As Kamil noted in the comments, a std::vector typically keeps track of three pointers internally: one to the beginning, one to the end of the elements, and one to the end of the allocated memory (see stack post). A pointer is 8 bytes on typical 64-bit platforms, so 3 * 8 bytes = 24 bytes (see wiki).

Does converting a pointer to int actually return the location of it in byte or something? [duplicate]

This question already has answers here:
What does "dereferencing" a pointer mean?
(6 answers)
Closed 2 years ago.
I am new to C++, but I am curious enough to dig into these strange things.
I was wondering what happens when I convert a pointer to an int, and realized the result could indicate something. So I wrote this program to test my ideas, since pointers into the same array are close enough in memory to be compared.
This is the code that will explain my question clearly.
#include <iostream>
#include <string>
using namespace std;
int main() {
cout << "--------------------[ Pointers ]--------------------" << endl;
const unsigned int NSTRINGS = 9;
string strArray[NSTRINGS] = { "One", "Two", "Three", "Four", "Five", "Six", "Seven", "Eight", "Nine" };
string *pStartArray = &strArray[0]; // Setting pStartArray pointer location to the first block of the array.
string *pEndArray = &strArray[NSTRINGS - 1]; // Setting pEndArray pointer location to the last block of the array.
cout << "---[ pStartArray value : " << *pStartArray << endl; // Showing the value of the pStartArray pointer (Just for safety check).
cout << "---[ pEndArray value : " << *pEndArray << endl; // Showing the value of the pEndArray pointer (Just for safety check).
short int blockDifferential = pEndArray - pStartArray; // Calculating the block differential of those two pointers.
cout << "---[ Differential of the block locations that pointers are pointing to in array (pEndArray - pStartArray) : " << blockDifferential << endl;
long long pStartIntLocation = (long long)pStartArray; // Converting the memory location (Hexadecimal) of pStartArray pointer to int (Maybe it's byte, regardless of being positive or negative). What's your opinion on this?
cout << "---[ (long long) pStartArray current memory location to int : \"" << pStartIntLocation << "\"" << endl;
long long pEndIntLocation = (long long)pEndArray;
cout << "---[ (long long) pEndArray current memory location converted to int : \"" << pEndIntLocation << "\"" << endl; // Converting the memory location (Hexadecimal) of pEndArray pointer to int (Maybe it's byte, regardless of being positive or negative). What's your opinion on this?
short int locationDifferential = pEndIntLocation - pStartIntLocation; // Subtracting the integer-converted location of pStartArray from that of pEndArray.
cout << "---[ Differential of the memory locations converted to int ((long long)pStartArray - (long long)pEndArray) : " << locationDifferential << " (Bytes?)" << endl; // Seems like even after running the program multiple times, this number does not change. Something's fishy. Doesn't it seem like it's a random thing. It must be investigated.
cout << "---[ Size of variable <string> (According to the computer that it's running) : " << sizeof(string) << " (Bytes)" << endl; // To know how much memory does a string consume. For example mine was 40.
// Here it goes interesting. I can get the block differential of the pointers using <locationDifferential>.
cout << "---[ Differential of the cell location (AGAIN) using the <locationDifferential> that I have calculated : " << locationDifferential/sizeof(string) << endl; // So definitely <locationDifferential> was in bytes, because I got 8 again. I just wonder whether it is a new discovery. LOL.
/*
I might look really crazy, because I can't tell it another way. It just can't happen.
This is the last try to make it as clear as I can.
pStrArray ]--\ pEndArray ]--\
\ - 8 cell difference - \
Array = | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|-------- Differential ---------|
Cell difference : 8 string cells
String size that I considered : 32 Bytes
data (location difference) : 8 * 32 = 256
So if you see this, all this might make sense.
I am excited to see what opinions you professional programmers have come up with.
- D3F4U1T
*/
cout << "----------------------------------------------------" << endl;
return 0;
}
How exactly does this work?
Are pStartIntLocation and pEndIntLocation both in bytes?
If so, why do they sometimes return a negative value?
This is strange.
Also correct me if I am wrong about any information I provided.
- Best regards.
- D3F4U1T.
Edit 2:
Does the value that results from the conversion from pointer to a long long mean anything? Like the memory address but with the difference that this one is in bytes?
Edit 3: It seems like this is related to virtual address space. Correct me if I am wrong. Does the OS have any mechanism to number the memory in bytes? For example: Byte 1, Byte 2, ...
A "pointer" is an integer quantity of some length whose contents are understood to represent a memory address. (By convention, zero means NULL ... no address.)
If you typecast it into an integer, you are simply declaring to the compiler: "No, these however-many bits should not be treated as an address ... treat them as an integer." The content of the location does not change, only the compiler's interpretation of it.
Typecasting does not change the bits – only the momentary interpretation of what they are and what they mean.
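FYI: unions are another way to do a similar thing: every element of a union overlaps the others and describes various interpretations of the same area of storage. (In the Fortran language, this was called EQUIVALENCE.) Note that in C++, reading a union member other than the one most recently written is undefined behavior, so std::memcpy is the safer route there.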

Using unsigned char array to store memory efficiently [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 3 years ago.
I was looking for a memory-efficient way to store data, and I realized that an unsigned char uses only one byte of memory. With that in mind, I made a small program that prints the size in bytes of each variable.
#include <iostream>
int main() {
int myInt = 10;
long long myLongLong = 10;
unsigned char charArray1[] = { 10, 10, 10, 10 };
unsigned char charArray2[] = { 10, 10, 10, 10, 10, 10, 10, 10 };
std::cout << "Using sizeof () we get:" << std::endl;
std::cout << "myInt -> " << sizeof(myInt) << std::endl;
std::cout << "myLongLong -> " << sizeof(myLongLong) << std::endl;
std::cout << "charArray1 -> " << sizeof(charArray1) << std::endl;
std::cout << "charArray2 -> " << sizeof(charArray2) << std::endl;
return 0;
}
Output:
Using sizeof () we get:
myInt -> 4
myLongLong -> 8
charArray1 -> 4
charArray2 -> 8
Is it correct to say that I can store bytes in an unsigned char array? If correct, how can I get some array elements from the unsigned char array and assign them to a variable?
Example: If an integer occupies 4 bytes in memory, I can get 4 elements from the unsigned char array and assign it to an integer.
Is it correct to say that I can store bytes in an unsigned char array?
Yes.
If an integer occupies 4 bytes in memory, I can get 4 elements from the unsigned char array and assign it to an integer.
If and only if the array contains the bytes of the integer in the exact same format as the system uses natively, then you can do this:
// std::memcpy is declared in <cstring>
static_assert(sizeof myInt == sizeof charArray1, "sizes must match");
std::memcpy(&myInt, charArray1, sizeof myInt);
If the format isn't the same, then it is still possible to calculate the value as long as you know what the bytes represent.

C++ array size is always 4 [duplicate]

This question already has answers here:
How to find the size of an array (from a pointer pointing to the first element array)?
(17 answers)
Closed 9 years ago.
Hi, I have an array defined in my header file:
private:
Customer** customerListArray;
In my cpp file I set it as follows:
customerListArray = new Customer* [data.size()];
cout << "arr size " << data.size() << "\n";
cout << "arr size " << sizeof(customerListArray) << "\n";
data.size() is 11900, but sizeof(customerListArray) is always 4. I've tried replacing data.size() with 100 and I still get 4.
What am I doing wrong here?
Thank you.
Pointers always have a fixed size, and the OP is using a pointer. For sizeof() to return the actual size of an array, you have to declare an array and pass its name to sizeof().
int arr[100];
sizeof(arr); // This would be 400 (assuming int is 4 bytes and there are 100 elements)
int *ptr = arr;
sizeof(ptr); // This would be 4 (assuming pointers are 4 bytes on this platform)
It is also important to note that sizeof() returns the number of bytes, not the number of elements.
Because customerListArray is a pointer.
sizeof() returns the size in bytes of its operand; in this case your Customer** is 4 bytes in size.
See this page for reference on sizeof().