Why does binary saving of an array to a file work? [C++] - c++

C++ newbie here.
I'm trying to figure out the following line that writes a buffer into a file:
fOut.write((char *)&_data, sizeof(_data));
_data = array of integers...
I have a couple of questions:
Is &_data the address to the first element of the array?
Whatever address it is, does that mean that we only save the address of the array? Then how come I can still access the array after I free it?
Shouldn't I pass sizeof(_data)*arrLength? What is the meaning of passing the size of int (in this case) and not the size of the entire array?
What does casting into char* mean when dealing with addresses?
I would really appreciate some clarifications.

Contrary to your comment, the array must be automatic or static storage duration in order for sizeof to work.
It should be just (char*)_data. The name of an array implicitly converts to a pointer to the first element.
No, write expects a pointer, and stores the content found at that location, not the location's address.
No. Since _data is an array, sizeof (_data) is the cumulative size of all elements in the array. If _data were a pointer (such as when an array is dynamically allocated on the heap), you would want numElems * sizeof(_data[0]). Multiplying the size of a pointer by the number of elements isn't helpful.
It means that the content at that address will be treated as a series of individual bytes, losing whatever numeric meaning it might have had. This is often done to perform efficient bulk copy of data, either to and from a file, or with memcpy/memmove. The data type should be POD (plain old data) or you'll get unexpected results.
If _data is a pointer to an array allocated from the heap, as your comment suggests, then the code is badly broken. In that case, you are saving just the address, and it may appear to work if you load the file back into the same instance of your program, but that's just because it's finding the data still in memory at the same address. The data wouldn't actually be in the file, and if you re-started the program before loading the file, you'd find that the data was gone. Make the changes I mentioned in both (1) and (3) in order to save the complete array regardless of whether it's allocated automatic, static, or dynamically.
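To make the array-versus-pointer distinction concrete, here is a minimal sketch. It assumes a std::ostringstream as a stand-in for the fOut file stream in the question, and the function names save_array and save_dynamic are made up for illustration. It only shows how the byte count passed to write() should be computed in each case.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Case 1: a true array. sizeof(data) is the size of all elements together.
std::string save_array() {
    int data[4] = {1, 2, 3, 4};
    std::ostringstream out;  // stand-in for the file stream (assumption)
    out.write(reinterpret_cast<const char*>(data),
              static_cast<std::streamsize>(sizeof(data)));
    return out.str();
}

// Case 2: a heap array reached through a pointer. sizeof(data) would only be
// the size of the pointer, so the byte count is count * sizeof(data[0]).
std::string save_dynamic(std::size_t count) {
    int* data = new int[count];
    for (std::size_t i = 0; i < count; ++i) {
        data[i] = static_cast<int>(i);
    }
    std::ostringstream out;
    out.write(reinterpret_cast<const char*>(data),
              static_cast<std::streamsize>(count * sizeof(data[0])));
    delete[] data;
    return out.str();
}
```

In both cases the stream receives the element bytes themselves, not an address, so the result is count * sizeof(int) bytes long.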

What does casting into char* means
when dealing with addresses?
Imagine this simple example
int x = 12;
char * z = (char *)&x;
And assume an architecture where int is 4 bytes long. From the C++ Standard sizeof(char)==1.
In the declaration char * z, the char * part determines how pointer arithmetic on z behaves.
On the second line of the example, z now points to the first of the 4 bytes that x occupies. Doing ++z; makes z point to the second byte of the (in my example) 4-byte int.
To simplify: the pointed-to type in a declaration determines the step used in pointer arithmetic. Incrementing a char * moves you by one byte, while incrementing an int * moves you by the number of bytes an int occupies in memory.
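The step sizes described above can be checked with a small sketch. The function names are hypothetical; the code measures how far one increment moves a char* and an int*, and then rebuilds an int from its individual bytes to show that viewing the bytes through a char pointer loses nothing.

```cpp
#include <cstddef>
#include <cstring>

// Bytes covered by one increment of a char* (always 1).
std::ptrdiff_t char_step() {
    int x[2] = {12, 0};
    char* z = reinterpret_cast<char*>(x);
    char* next = z + 1;
    return next - z;
}

// Bytes covered by one increment of an int* (sizeof(int) on this platform).
std::ptrdiff_t int_step() {
    int x[2] = {12, 0};
    int* p = x;
    int* next = p + 1;
    return reinterpret_cast<char*>(next) - reinterpret_cast<char*>(p);
}

// Read an int one byte at a time, then reassemble it from those bytes.
int roundtrip(int value) {
    const unsigned char* bytes = reinterpret_cast<const unsigned char*>(&value);
    unsigned char copy[sizeof(int)];
    for (std::size_t i = 0; i < sizeof(int); ++i) {
        copy[i] = bytes[i];
    }
    int result = 0;
    std::memcpy(&result, copy, sizeof(int));
    return result;
}
```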

Yes
No, write uses this address as the first location, and reads through sizeof(_data) writing the whole array
sizeof(_data) will return the size of the entire array not the same as sizeof(int)
Means the data will be read byte by byte, this is the pointer required by write as it writes in binary format(byte by byte)

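1) Yes, as far as the address value goes: &_data is the address of the whole array, which starts at the same location as its first element (only the pointer type differs).
2) No, write() writes the number of bytes you have specified via sizeof(_data), starting at address &_data.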
3) You would pass sizeof(int)*arrLength if _data is a pointer to an array, but since it is an array sizeof() returns the correct size.
4) don't know. ;)

Read this: http://www.cplusplus.com/reference/iostream/ostream/write/
It should be.
If you call "fstream.write(a,b)", it writes b bytes starting from location a into the file (i.e. what the address is pointing at).
It should be the size in bytes (chars).
Not much; it is similar to casting stuff to byte[] in more civilized languages.
By the way, it will only work on simple arrays with simple values inside them.
I suggest you look into the >> and << operators.

is &_data the address to the first element of the array?
Yes, it is. This is the usual way to refer to an array in C and C++. You can pass either &_data or just _data; either way, only an address is passed and the array contents are not copied to the stack.
whatever address it is, does that mean that we only save the address of the array? then how come I still can access the array after I delete it from the memory?
No, the method uses the address it gets to read the array contents; just saving the memory address would be pointless, as you point out.
Shouldn't I pass sizeof(_data)*arrLength? I mean...
what is the logic of passing the size
of int (in this case) and not the size
of the entire array?
No, sizeof(_data) is the size of the array, not of one member. No need to multiply by length.
What does casting into char* means when dealing with addresses?
Casting to char* means that the array is accessed as a list of bytes; that's necessary for accessing and writing the raw values.

Related

How to check if a pointer points to an array or single int or char

I want to know whether a pointer is pointing to an array or a single integer. I have a function which takes two pointers (int and char) as input and should tell whether each pointer is pointing to an array or a single value.
pointer=pointer+4;
pointer1=pointer1+4;
Is this a good idea?
Like others have said here, C doesn't know what a pointer is pointing to. However if you should choose to go down this path, you could put a sentinel value in the integer or first position in the array to indicate what it is...
#define ARRAY_SENTINEL -1
int x = 0;
int x_array[3] = {ARRAY_SENTINEL, 7, 11};
pointer = &x_array[0];
if (*pointer == ARRAY_SENTINEL)
{
// do some crazy stuff
}
pointer = &x;
if (*pointer != ARRAY_SENTINEL)
{
// do some more crazy stuff
}
That's not a good idea. Using just raw pointers there's no way to know if they point to an array or a single value.
A pointer that is being used as an array and a pointer to a single value are identical - they're both just a memory address - so there's no information to use to distinguish between them. If you post what you want to ultimately do there might be a solution that doesn't rely on comparing pointers to arrays and single values.
Actually pointers point to a piece of memory, not integers or arrays. It is not possible to distinguish if an integer is single variable or the integer is an element of array, both will look exactly the same in memory.
Can you use some C++ data structures, std::vector for example?
For C++ questions, the answer is simple. Do not use C-style dynamic arrays in C++. Whenever you need a C-style dynamic array, you should use std::vector.
This way you would never guess what the pointer points to, because only std::vector will be holding an array.
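As a rough illustration of why std::vector sidesteps the problem: a raw pointer carries no length, so the caller has to supply one, while a vector knows its own size. The function names sum_raw and sum_vec are invented for this sketch.

```cpp
#include <cstddef>
#include <vector>

// With a raw pointer, the element count must be passed separately; the
// function cannot tell a single int from the start of an array.
int sum_raw(const int* p, std::size_t count) {
    int total = 0;
    for (std::size_t i = 0; i < count; ++i) {
        total += p[i];
    }
    return total;
}

// A std::vector carries its element count, so no extra parameter is needed.
int sum_vec(const std::vector<int>& v) {
    int total = 0;
    for (std::size_t i = 0; i < v.size(); ++i) {
        total += v[i];
    }
    return total;
}
```

Calling sum_raw(&x, 1) for a single int x and sum_raw(arr, 3) for a three-element array looks the same to the function; only the count argument differs.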

What is the difference between Pointer and strings?

What is the difference between a pointer and an array, or are they the same? Since an array also works with pointer arithmetic, can it be said that an array is nothing but a pointer to its first element?
They both are different by the following differences:-
int array[40];
int * arrayp;
Now compare their sizes: for the pointer the size is the same every time, whereas for the array it varies with the number of elements.
sizeof(array); // 160 when int is 4 bytes (40 * sizeof(int))
sizeof(arrayp); // 4 on 32-bit machines
This means the compiler treats all the integers in an array as one object, which is not the case for a pointer.
Secondly, perform increment operation.
array++; // Error
arrayp++; // No error
If an array were a pointer, the location it points to could be changed, as in the second case with arrayp, but that is not so.
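The size difference can be verified without hard-coding platform numbers. This is a small sketch with made-up function names; element_count relies on a reference-to-array parameter, which keeps the array type (and therefore its length) intact instead of letting it convert to a pointer.

```cpp
#include <cstddef>

// sizeof on the array itself: the size of all 40 elements.
std::size_t array_bytes() {
    int array[40] = {};
    return sizeof(array);
}

// sizeof on a pointer to the first element: just the size of a pointer.
std::size_t pointer_bytes() {
    int array[40] = {};
    int* arrayp = array;  // implicit array-to-pointer conversion
    return sizeof(arrayp);
}

// A reference-to-array parameter preserves the length N in the type.
template <std::size_t N>
std::size_t element_count(int (&)[N]) {
    return N;
}

std::size_t count_forty() {
    int array[40] = {};
    return element_count(array);
}
```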

Is the name of a two dimensional array address of the address of its first element in C++?

When implementing a two dimensional array like this:
int A[3][3];
these hold: A=&A[0], at the same time A[0]=&A[0][0]. So, A=&(&A[0][0]), what basically says that A is the address of the address of the first element of the array, which is not quite true. What is my mistake here? Does A really decay to a pointer to a pointer?
Your mistake is that you have an incorrect understanding of the relationship between arrays and pointers. An array is not a pointer. It is an array. However, an array is implicitly convertible to a pointer to its own first element. So, while this expression does evaluate to true:
A == &A[0]
It is not correct to say that A is &A[0]. The conversion does not happen in all expressions. For example:
&A
This does not take the address of the address of the first element of A (that doesn't even make sense). It takes the actual address of A, whose type is int[3][3]. So the type of &A is int(*)[3][3], read as "pointer to array of 3 arrays of 3 ints".
The primary difference between &A and &A[0] is that if you add 1 to &A, you will get an address that is 3 * 3 * sizeof(int) bytes away, while if you add 1 to &A[0], you will get a pointer that is only 3 * sizeof(int) bytes away.
With all this in mind, you should be able to see where your mistake is. A[0] is not &A[0][0], but it is implicitly convertible to it. However, like all conversions, this results in a temporary, which you cannot take the address of. So the expression &(&A[0][0]) doesn't even make sense.
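The difference between &A + 1 and &A[0] + 1 can be measured directly. The sketch below (function names are hypothetical) checks the type of &A at compile time and returns the byte distance covered by adding 1 to each kind of pointer.

```cpp
#include <cstddef>
#include <type_traits>

// Adding 1 to a pointer to the whole array skips all 3 * 3 ints.
std::ptrdiff_t whole_array_step() {
    int A[3][3] = {};
    static_assert(std::is_same<decltype(&A), int (*)[3][3]>::value,
                  "&A is a pointer to an array of 3 arrays of 3 ints");
    int (*p)[3][3] = &A;
    return reinterpret_cast<char*>(p + 1) - reinterpret_cast<char*>(p);
}

// Adding 1 to a pointer to the first row skips only one row of 3 ints.
std::ptrdiff_t row_step() {
    int A[3][3] = {};
    int (*row)[3] = A;  // A converts implicitly to &A[0]
    return reinterpret_cast<char*>(row + 1) - reinterpret_cast<char*>(row);
}
```

Both pointers hold the same address value; only their types, and so their step sizes, differ.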
Because of reactions on my previous answer I did some research to learn more on whatever was wrong in my explanation.
Found a rather elaborate explanation of the topic here :
http://eli.thegreenplace.net/2009/10/21/are-pointers-and-arrays-equivalent-in-c
I'll try to summarize :
if you have following :
char array_place[100] = "don't panic";
const char* ptr_place = "don't panic"; // const is required for string literals in C++
the way that this is represented in memory is entirely different.
whereas ptr_place is a real pointer, array_place is just a label.
char a = array_place[7];
char b = ptr_place[7];
The semantics of arrays in C dictate that the array name is the address of the first element of the array, which is not the same as saying that it is a pointer. Hence in the assignment to a, the 8th character of the array is taken by offsetting the value of array_place by 7, and moving the contents pointed to by the resulting address into the al register, and later into a.
The semantics of pointers are quite different. A pointer is just a regular variable that happens to hold the address of another variable inside. Therefore, to actually compute the offset of the 8th character of the string, the CPU will first copy the value of the pointer into a register and only then increment it. This takes another instruction [1].
This point is frequently ignored by programmers who don't actually hack on compilers. A variable in C is just a convenient, alphanumeric pseudonym of a memory location. Were we writing assembly code, we would just create a label in some memory location and then access this label instead of always hard-coding the memory value - and this is what the compiler does.
Well, actually the address is not hard-coded in an absolute way because of loading and relocation issues, but for the sake of this discussion we don't have to get into these details.
A label is something the compiler assigns at compile time. This is the key difference between arrays and pointers. It also explains why sizeof(array_place) gives the full size of the array, whereas sizeof applied to a pointer gives the size of the pointer.
I must say, I was not aware of these subtle differences myself, and I have been coding for quite a long time in C and C++ and with arrays too.
Nevertheless, since the name of the array yields the address of its first element, you can create a pointer and initialise it with that value:
char* p = array_place;
p will point to the memory location where the characters are.
to conclude :
There is one difference between an array name and a pointer that must be kept in mind. A pointer is a variable, so p=array_place and p++ are legal. But an array name is not a variable; constructions like array_place=p and array_place++ are illegal. That I did know ;-)
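Here is a short sketch of the points in this answer, using the same two declarations (with const added to the pointer, since C++ requires it for string literals). The function names are invented for illustration.

```cpp
#include <cstddef>

// Indexing works the same way through the array and through the pointer.
char eighth_via_array() {
    char array_place[100] = "don't panic";
    return array_place[7];
}

char eighth_via_pointer() {
    const char* ptr_place = "don't panic";
    return ptr_place[7];
}

// sizeof sees the whole array, not a pointer.
std::size_t array_place_size() {
    char array_place[100] = "don't panic";
    return sizeof(array_place);
}

// A pointer is a variable and can be moved; ++array_place would not compile.
char second_via_walking_pointer() {
    char array_place[100] = "don't panic";
    char* p = array_place;  // array name converts to a pointer to element 0
    ++p;
    return *p;
}
```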

a couple of simple questions about basic pointer use in c++ and the c++ memory model

I've been studying along with the Stanford courses on iTunes U and have hit pointers in C++. I think I understand how pointers work, but I just want to check how to do some simple stuff. Let's say I want to create a dynamic array:
double *array;
At this point there's a variable called "array" in the stack and nothing in the heap. First question - what's stored in "array" at this point? A pointer to some nonsense piece of memory?
I then allocate memory using "new":
array = new double[10];
Second question - at this point, what's stored in "array"? A pointer to some contiguous piece of memory big enough to hold ten doubles? (Sorry for the simple questions, but I really want to make sure I understand)
I assign the double 2.0 to each element in the array:
for(int i=0; i<array.length(); i++) array[i]=2.0;
Third question - is this different from using the dereference operator to assign? (i.e., *array[i]=2.0). I then pass the array to some other function:
myFunc(double array[]){
for(int i=1; i<array.length(); i++){
array[i]=array[i]*array[i-1];
}
}
Fourth question - on the pass to myFunc, since array is an array of pointers to doubles, and not an array of doubles, it passes by reference without "&", right? That means the operations in my loop are affecting the actual data stored in "array". What if I wanted to pass by value, so that I wouldn't be touching the data in "array"? Would I use
myFunc(double *array[]){...}?
Last question - what if I wanted to manipulate the memory addresses for the contents of "array" for some reason? Could i use
someVar = &array[5];
to assign the hex address of array[5] to someVar?
I've read the section on pointers in the reader and watched the Binky video a dozen times and it still doesn't make sense. Any help would be greatly appreciated.
EDIT: Thanks a lot to everyone who answered so far. If you wouldn't mind I just have one more question. In the declaration double *array;, "array" is declared as a pointer to a double, but once I use "new" to assign it, "array" ceases being a pointer to a double, and becomes an array of doubles, right?
array contains junk data - whatever was in that memory location before array existed is still there. If you try to play with it you're going to shoot yourself in the foot, which is why you need to assign it to a valid memory location, (hence the ensuing call to new[]).
Yes, array now contains a pointer (memory address) to some contiguous piece of memory big enough to hold ten doubles.
*array[i]=2.0 won't actually compile. array[i] results in a double, and you can't use the dereference operator on a double.
What you're passing is that address to the first element in the array. So you are passing the pointer by value, and the array by reference (as the pointer is a reference to the array.) To pass the array itself by value you'd have to have one parameter for each entry. You could also copy the array and send in the copy, but the copy itself would be passed by reference, too.
double* someVar = &array[5]; will return to you a pointer to the 6th element of the array. array[5] gives you the double, and taking the address of it (with &) will give you the memory address (pointer) of that double.
Yep, that's what's happening
Most definitely. More specifically, a pointer to the beginning of a contiguous piece of memory.
Not in this case; * (for dereference) is a unary operator, and yet you have passed it two arguments. You can be sure it is multiplication that is performed (or an overloaded version of it) - also, what could array[i](*array[i-1]) mean? you can't dereference something that isn't a pointer (or doesn't have the unary * operator overloaded)
You're only passing the pointer by value and not the data. If you want to pass the data by value (make it unchanged outside the function), you'd have to copy it first, and pass that (or just use a vector)
Yes, you're just getting the address of a part of contiguous memory, and you can store the address and modify the dereferenced value elsewhere, the array will be modified also.
Also, be wary that when you allocate on the heap, you have to delete the memory afterwards. In this case, you would use delete[] array;
After declaration, the array variable contains an arbitrary value. You're not allowed to do anything with that value. After new, it contains a pointer to a contiguous range of memory large enough to hold 10 doubles. *array[i]=2.0 is an error (that would imply that array is an array of pointers to double). The indexing expression array[i]=2.0 is just syntactic sugar for *(array+i)=2.0.
Fourth question: SAY WHAT?? You don't have an array of pointers to doubles anywhere in that code. In functions, void f(double *x) and void f(double x[]) are THE SAME THING: a pointer to double. If you pass to f an array, x will receive the address of the first element (which is the VALUE of an array).
You can't pass arrays by value. Alternatively, they are always passed by value (as everything else in C), but note that the VALUE of an array is the address of its first element.
Your last question: I have no idea what you're trying to achieve, but the question clearly shows that you're confused. An address is an address, there's no such thing as "hex address".
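Putting the answers together, here is a minimal sketch of the code from the question in a form that compiles. It assumes an explicit count parameter for myFunc, because a double* has no length() member; the function name demo is made up as well.

```cpp
#include <cstddef>

// double array[] and double* array declare the same parameter type. The
// pointer is passed by value, but it still refers to the caller's memory.
void myFunc(double* array, std::size_t count) {
    for (std::size_t i = 1; i < count; ++i) {
        array[i] = array[i] * array[i - 1];
    }
}

double demo() {
    const std::size_t count = 10;
    double* array = new double[count];  // pointer to 10 contiguous doubles
    for (std::size_t i = 0; i < count; ++i) {
        *(array + i) = 2.0;  // equivalent to array[i] = 2.0
    }
    myFunc(array, count);          // modifies the caller's data
    double* someVar = &array[5];   // address of the 6th element
    double result = *someVar;
    delete[] array;                // memory from new[] needs delete[]
    return result;
}
```

After myFunc runs, the elements are 2, 4, 8, 16, 32, 64, and so on, so the value read through someVar is 64.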

Pointer arithmetic on string type arrays, how does C++ handle this?

I am learning about pointers and one concept is troubling me.
I understand that if you have a pointer (e.g.'pointer1') of type INT that points to an array then you can fill that array with INTS. If you want to address a member of the array you can use the pointer and you can do pointer1 ++; to step through the array. The program knows that it is an array of INTs so it knows to step through in INT size steps.
But what if the array is of strings, which can vary in length? How does it know what to do when you try to increment with ++ as each element is a different length?
Similarly, when you create a vector of strings and use the reserve keyword how does it know how much to reserve if strings can be different lengths?
This is probably really obvious but I can't work it out and it doesn't fit in with my current (probably wrong) thinking on pointers.
Thanks
Quite simple.
An array of strings is different from a vector of strings.
An array of strings (C-style pointers) is an array of pointers to arrays of characters, "char**". So each element in the array-of-strings has the size of a "pointer-to-char-array", and it can step through the elements in the string array without a problem. The pointers in the array can point at differently sized chunks of memory.
With a vector of strings it is an array of string-objects (C++ style). Each string-object has the same object size, but contains, somewhere, a pointer to a piece of memory where the contents of the string are actually stored. So in this case, the elements in the vector are also identical in size, although different from "just a pointer-to-char-array", allowing simple element-address computation.
This is because a string (at least in C/C++) is not quite the same sort of thing as an integer. If we're talking C-style strings, then an array of them like
const char* test[3] = { "foo", "bar", "baz" };
what is actually happening under the hood is that "test" is an array of pointers, each of which point to the actual data where the characters are. Let's say, at random, that the "test" array starts at memory address 0x10000, and that pointers are four bytes long, then we might have
test[0] (memory location 0x10000) contains 0x10020
test[1] (memory location 0x10004) contains 0x10074
test[2] (memory location 0x10008) contains 0x10320
Then we might look at the memory locations around 0x10020, we would find the actual character data:
test[0][0] (memory location 0x10020) contains 'f'
test[0][1] (memory location 0x10021) contains 'o'
test[0][2] (memory location 0x10022) contains 'o'
test[0][3] (memory location 0x10023) contains '\0'
And around memory location 0x10074
test[1][0] (memory location 0x10074) contains 'b'
test[1][1] (memory location 0x10075) contains 'a'
test[1][2] (memory location 0x10076) contains 'r'
test[1][3] (memory location 0x10077) contains '\0'
With C++ std::string objects much the same thing is going on: the actual C++ string object doesn't "contain" the characters because, as you say, the strings are of variable length. What it actually contains is a pointer to the characters. (At least, it does in a simple implementation of std::string - in reality it has a more complicated structure to provide better memory use and performance).
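The fixed element size can be checked with a small sketch. The function names are hypothetical; the code measures the step between elements of an array of C-style strings, follows the second pointer to its characters, and compares the object sizes of a short and a long std::string.

```cpp
#include <cstddef>
#include <string>

// Stepping through an array of C-style strings moves by one pointer each
// time, regardless of how long the pointed-to strings are.
std::ptrdiff_t cstring_array_step() {
    const char* test[3] = {"foo", "bar", "baz"};
    const char** p = test;
    return reinterpret_cast<const char*>(p + 1) -
           reinterpret_cast<const char*>(p);
}

// After one increment the pointer refers to the second string.
char second_string_first_char() {
    const char* test[3] = {"foo", "bar", "baz"};
    const char** p = test;
    ++p;
    return **p;
}

// std::string objects have one fixed size, whatever their contents.
bool string_objects_same_size() {
    std::string short_one = "a";
    std::string long_one(1000, 'x');
    return sizeof(short_one) == sizeof(long_one);
}
```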
An array of strings is an array of pointers to the first character of some strings. The size of a pointer to a char is probably the same size as a pointer to an int.
Essentially, a 2D array built this way isn't necessarily linear in memory; the pointed-to arrays could be anywhere.
This might seem like pedantry, but in a shoot-yer-foot language like C++ this is important: in your original question you say:
you can do pointer1 ++; to step through the array.
Postincrement (pointer1++) is usually semantically wrong here, because it means "increment pointer1 but keep the expression value at the original value of pointer1". If you have no need for the original value of pointer1, use pre-increment (++pointer1) instead, which has semantically exactly the meaning of "increment the pointer by one".
For some reason most C++ textbooks do the postincrement thing everywhere, teaching new C++-ers bad habits ;-)
In C++, arrays and vectors always contain fixed-size elements. Strings fit this condition, because your string elements are then either pointers to null-terminated C strings (char *) stored somewhere else, or plain std::string objects.
The std::string object has a constant size, the actual string data is allocated somewhere else (except for small string optimization, but that's another story).
vector<string> a;
a.resize( 2 ); // allocate memory for 2 strings of any length.
vector<char *> b;
b.resize( 2 ); // allocate memory for 2 string pointers.
vector<char> c; // one string. Should use std::string instead.
c.resize( 2 ); // allocate memory for 2 characters (including or not the terminator).
Note that the reserve() function of std::vector only prepares the vector to grow: it allocates capacity but does not create any elements. It's used mainly for optimization purposes. You probably want to use resize().
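A quick sketch of the reserve() versus resize() difference, with invented function names: reserve() changes the capacity only, while resize() actually creates elements that can then hold strings of any length.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// reserve() allocates room but creates no elements: size stays 0.
std::size_t size_after_reserve() {
    std::vector<std::string> v;
    v.reserve(2);
    return v.size();
}

// The capacity after reserve(2) is at least 2.
std::size_t capacity_after_reserve() {
    std::vector<std::string> v;
    v.reserve(2);
    return v.capacity();
}

// resize() creates two empty strings; each can then grow independently.
std::size_t size_after_resize() {
    std::vector<std::string> v;
    v.resize(2);
    v[0] = "short";
    v[1] = std::string(1000, 'x');
    return v.size();
}
```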