How is it possible to have an array of strings in C++?

When you access elements of an array using array[i], I thought that C++ would take the starting position of the array in memory and add i*sizeof(one array element) and then dereference that address (or do something equivalent to what I just described). However, it seems to me that if you have an array of strings (std::string), each element could be a different size based on the number of characters in the string, so there must be something else going on.
Also, to my understanding, array elements are stored in contiguous memory. If you had strings stored in contiguous memory and then appended more characters to one of them, all of the succeeding strings would have to be moved over.
Can someone explain to me how this works?

The string size is constant, but it (at some level) has a pointer to some non-constant-sized data.
The pointer size is constant, the pointee size is not.

std::strings are objects. The size of one std::string is the same as the size of another std::string. They indirectly "contain" their data via dynamic allocation, which does not affect the size of the owning object.
Similarly, if you mean C-style strings, you actually only pass around char* (or pointers-to-char). Pointers are always the same size, no matter the length of the block of memory to which they point.

std::string is a wrapper of char*, not an array. Arrays can be different sizes, yes, but char*s are pointers and have a constant size. The char* that std::string encapsulates points to dynamically allocated memory. This is why sizeof(std::string) returns the same size no matter how large the string grows.

If you're referring to the C++ type std::string, each element of an array of strings occupies the same amount of memory. However, each string may contain a pointer to a different position in memory, holding a buffer of a different length, where the characters are actually stored.
To see an example (sample code), imagine the std::string class something like this:
struct string
{
    size_t length;
    const char* data;
    // other members..
};
Note how the structure size is always the same (a size_t and a pointer), but the memory pointed to, where the actual string is stored, may be different.

The string object has a fixed size, although that size differs between implementations and compilers. You are correct in your assessment of how C++ handles arrays, but you overlook pointer data. Internally, the string class holds a pointer to some heap data. That heap data can be of any size, but the pointer to it has a fixed size. So the data layout of a string lets the compiler create uniform objects whose contents have no fixed size.

Since a string is a character pointer, an array of strings is an array of (char *) — a contiguous vector of (char *) pointers. Modifying a string would modify the memory pointed to by each element. Now, if you declared it to be statically allocated:
char foo[10][10];
then in terms of memory layout it's indistinguishable from
char foo2[100];
and it would be possible to corrupt memory by writing past the declared size; this is one reason one should use std::string instead of C-style strings, which are perhaps the best example of why C is a lousy language for application programming. (An array of std::string would be an array of objects, each of which would have a (char *) stored in it somewhere that you wouldn't need to worry about — std::string does it for you, and much more reliably.)

This is how you would do it in code...
const int ARRSIZE = 5;
string arrayOfStr[ARRSIZE] = {"one", "two", "three"};
for (int i = 0; i < ARRSIZE; ++i)
{
    cout << arrayOfStr[i] << endl;
}

Related

C++: Why can't I convert a string to C-string with an initializer char foo[]?

This is my code:
const char readArr[] = readWord.c_str();
This gives an error: array initializer must be an initializer list or string literal
Why must I use
const char *readArr = readWord.c_str();?

It's for the same reason you can't
const char *p="foo";
const char readArr[]=p;
either. An array can only be initialized from a brace-enclosed initializer list or, for character arrays, from a string literal; it cannot be initialized from a pointer or from another array. When an array is used in most expressions, such as:
readArr[i]
The array's name decays to a pointer to the first element in the array. Now, guess what you did when you wrote this:
const char *readArr = readWord.c_str();
Well, you just stored a pointer to the first element in an array of characters, that's owned by the readWord std::string.
In a regular array declaration:
char readArr[]="Hello";
the compiler knows the length of the string literal, so it initializes a consecutive list of character values and binds the name readArr to it.

const char readArr[] = readWord.c_str();
The reason this is not legal is that it simply doesn't make sense to initialise an array from a pointer. A pointer is in essence a memory address: it points to some data, whether that data is dynamically or automatically allocated (allocated 'on the heap' or 'on the stack' respectively). A pointer does not record how much memory is there, so the compiler cannot work out how large the array would have to be.
This is confusing to newcomers to C and C++ because the language often allows you to treat arrays as if they were just pointers to their first element. That doesn't mean that arrays are just pointers to their first element. They aren't. But if you use them in an expression they will decay to a pointer to their first element.

Because arrays are not pointers. An array... is an array, period. char readArr[] (just like char arr[4]) declares something directly in the local memory space (the stack, for a function) so that something has to be statically allocated.
str.c_str() is somewhere on the heap so that can't work.

How to check if a pointer points to an array or single int or char

I want to know whether a pointer is pointing to an array or to a single integer. I have a function which takes two pointers (int and char) as input and should tell whether each pointer is pointing to an array or to a single value.
pointer=pointer+4;
pointer1=pointer1+4;
Is this a good idea?

Like others have said here, C doesn't know what a pointer is pointing to. However if you should choose to go down this path, you could put a sentinel value in the integer or first position in the array to indicate what it is...
#define ARRAY_SENTINEL -1
int x = 0;
int x_array[3] = {ARRAY_SENTINEL, 7, 11};
pointer = &x_array[0];
if (*pointer == ARRAY_SENTINEL)
{
    // do some crazy stuff
}
pointer = &x;
if (*pointer != ARRAY_SENTINEL)
{
    // do some more crazy stuff
}

That's not a good idea. Using just raw pointers there's no way to know if they point to an array or a single value.
A pointer that is being used as an array and a pointer to a single value are identical: they're both just a memory address, so there's no information to use to distinguish between them. If you post what you ultimately want to do, there might be a solution that doesn't rely on telling pointers to arrays apart from pointers to single values.

Actually, pointers point to a piece of memory, not to integers or arrays. It is not possible to tell whether an integer is a single variable or an element of an array; both look exactly the same in memory.
Can you use some C++ data structures, std::vector for example?

For C++ questions, the answer is simple. Do not use C-style dynamic arrays in C++. Whenever you need a C-style dynamic array, you should use std::vector.
This way you never have to guess what a pointer points to, because only std::vector will be holding an array.

assign a char array to a literal string - c++

char arr[3];
arr="hi";// ERROR
cin>>arr;// and at runtime I type hi, which works fine.
1) Can someone explain to me why?
2) What exactly is the type of "hi"? I know it's called a string literal, but is it just an array of chars too?
3) Isn't cin>>arr; just like assigning to arr whatever you type at runtime?

Arrays in C++ are not first-class types. An array is just a structured representation of a series of values; it is not a pointer, whatever you may read elsewhere (although it decays into one). You can't use arrays the way you would use other types, and that includes assignment. The choice was to either add lots of support for arrays, or to keep them as simple and fast as possible. The latter was chosen, which is one of the distinctions C++ has from some other languages.
To copy an array, copy each element one at a time.
In C++11, there is an STL container std::array. It was designed to fit in as a plain array with operator overloading, as well as relating to the rest of the STL.
A better alternative is std::string. It incorporates the behaviour you want and more, and is specifically designed for holding arrays of characters.
"hi" is, as Konrad Rudolph points out, a const char [3].
As for cin >> arr: it compiles because operator>> has an overload for character arrays (taking char* before C++20 and char (&)[N] since C++20). It does not assign to arr; it copies the typed characters, plus a terminating '\0', into the storage arr already has. Before C++20 that overload cannot see the size of the array, so typing more than two characters here overflows arr, which is another reason to prefer a container that knows its own size.

If you'd like, you can declare:
char array[] = "hi!";
Creates an array and 'initializes' it to 4 bytes long, "hi!"
char const *array2 = "hey!";
Creates a pointer to read-only memory, a string literal
array2 = array;
You can now use the array2 pointer to access array one. This is called pointer decay; array and array2 are not of the same type, even though they can cooperate here. An array of type char "decays" to a pointer-to of type char.
array = array2; // ERROR
An array is not a pointer. You're thinking like an array is a pointer, when really, it is pre-allocated. You're attempting to assign an address, but array[] already has one "hard-coded" when it was created, and it cannot be changed.

Why does binary saving of an array in a file work? [C++]

C++ newbie here.
I'm trying to figure out the following line that writes a buffer into a file:
fOut.write((char *)&_data, sizeof(_data));
_data = array of integers...
I have a couple of questions:
1) Is &_data the address to the first element of the array?
2) Whatever address it is, does that mean that we only save the address of the array? Then how come I can still access the array after I free it?
3) Shouldn't I pass sizeof(_data)*arrLength? What is the meaning of passing the size of int (in this case) and not the size of the entire array?
4) What does casting into char* mean when dealing with addresses?
I would really appreciate some clarifications.

Contrary to your comment, the array must be automatic or static storage duration in order for sizeof to work.
1) It should be just (char*)_data. The name of an array implicitly converts to a pointer to the first element.
2) No, write expects a pointer, and stores the content found at that location, not the location's address.
3) No. Since _data is an array, sizeof (_data) is the cumulative size of all elements in the array. If _data were a pointer (such as when an array is dynamically allocated on the heap), you would want numElems * sizeof(_data[0]). Multiplying the size of a pointer by the number of elements isn't helpful.
4) It means that the content at that address will be treated as a series of individual bytes, losing whatever numeric meaning it might have had. This is often done to perform efficient bulk copy of data, either to and from a file, or with memcpy/memmove. The data type should be POD (plain old data) or you'll get unexpected results.
If _data is a pointer to an array allocated from the heap, as your comment suggests, then the code is badly broken. In that case, you are saving just the address, and it may appear to work if you load the file back into the same instance of your program, but that's just because it's finding the data still in memory at the same address. The data wouldn't actually be in the file, and if you re-started the program before loading the file, you'd find that the data was gone. Make the changes I mentioned in both (1) and (3) in order to save the complete array regardless of whether it's allocated automatic, static, or dynamically.

What does casting into char* means when dealing with addresses?
Imagine this simple example
int x = 12;
char * z = (char *)&x;
And assume an architecture where int is 4 bytes long. From the C++ Standard sizeof(char)==1.
In the declaration char * z, the char * part determines how pointer arithmetic on z works.
On the second line of the example, z is made to point to the first of the 4 bytes that x occupies. Doing ++z; makes z point to the second byte of the (in my example) 4-byte int.
To simplify things, you could say that the pointed-to type in a declaration drives pointer arithmetic: incrementing a char * moves you by one byte, while incrementing an int * moves you by the number of bytes an int occupies in memory.

1) Yes.
2) No, write uses this address as the first location and reads through sizeof(_data) bytes, writing the whole array.
3) sizeof(_data) returns the size of the entire array, which is not the same as sizeof(int).
4) It means the data will be read byte by byte. This is the pointer type required by write, as it writes in binary format (byte by byte).

1) Yes, &_data is the address of the first element of your array.
2) No, write() writes the number of bytes you have specified via sizeof(_data), starting at address &_data.
3) You would pass sizeof(int)*arrLength if _data is a pointer to an array, but since it is an array sizeof() returns the correct size.
4) don't know. ;)

read this : http://www.cplusplus.com/reference/iostream/ostream/write/
1) It should be.
2) If you call "fstream.write(a,b)" then it writes b bytes starting from location a into the file (i.e. what the address is pointing at).
3) It should be the size in bytes or chars.
4) Not much, similar to casting stuff to byte[] in more civilized languages.
By the way, it will only work on simple arrays with simple values inside them...
i suggest you look into the >> << operators .

is &_data the address to the first element of the array?
Yes, it is. This is the usual way to pass a "reference" to an array in C and C++. If you passed the array itself as a parameter, the whole array contents would be copied, which is usually wasteful and unnecessary. Correction: You can pass either &_data, or just _data. Either way, the array does not need to be copied to the stack.
whatever address it is, does that mean that we only save the address of the array? then how come I still can access the array after I delete him from the memory?
No, the method uses the address it gets to read the array contents; just saving the memory address would be pointless, as you point out.
Shouldn't I pass sizeof(_data)*arrLength? I mean... what is the logic of passing the size of int (in this case) and not the size of the entire array?
No, sizeof(_data) is the size of the array, not of one member. No need to multiply by length.
What does casting into char* means when dealing with addresses?
Casting to char* means that the array is accessed as a list of bytes; that's necessary for accessing and writing the raw values.

Pointer arithmetic on string type arrays, how does C++ handle this?

I am learning about pointers and one concept is troubling me.
I understand that if you have a pointer (e.g.'pointer1') of type INT that points to an array then you can fill that array with INTS. If you want to address a member of the array you can use the pointer and you can do pointer1 ++; to step through the array. The program knows that it is an array of INTs so it knows to step through in INT size steps.
But what if the array is of strings, which can vary in length? How does it know what to do when you try to increment with ++, as each element is a different length?
Similarly, when you create a vector of strings and use the reserve keyword how does it know how much to reserve if strings can be different lengths?
This is probably really obvious but I can't work it out and it doesn't fit in with my current (probably wrong) thinking on pointers.
Thanks

Quite simple.
An array of strings is different from a vector of strings.
An array of strings (C-style pointers) is an array of pointers to arrays of characters, "char**". Each element in the array of strings therefore has the size of a pointer to char, so the program can step through the elements without a problem. The pointers in the array can point at differently sized chunks of memory.
With a vector of strings it is an array of string-objects (C++ style). Each string-object has the same object size, but contains, somewhere, a pointer to a piece of memory where the contents of the string are actually stored. So in this case, the elements in the vector are also identical in size, although different from "just a pointer-to-char-array", allowing simple element-address computation.

This is because a string (at least in C/C++) is not quite the same sort of thing as an integer. If we're talking C-style strings, then take an array of them like
const char* test[3] = { "foo", "bar", "baz" };
What is actually happening under the hood is that "test" is an array of pointers, each of which points to the actual data where the characters are. Let's say, at random, that the "test" array starts at memory address 0x10000, and that pointers are four bytes long, then we might have
test[0] (memory location 0x10000) contains 0x10020
test[1] (memory location 0x10004) contains 0x10074
test[2] (memory location 0x10008) contains 0x10320
Then we might look at the memory locations around 0x10020, we would find the actual character data:
test[0][0] (memory location 0x10020) contains 'f'
test[0][1] (memory location 0x10021) contains 'o'
test[0][2] (memory location 0x10022) contains 'o'
test[0][3] (memory location 0x10023) contains '\0'
And around memory location 0x10074
test[1][0] (memory location 0x10074) contains 'b'
test[1][1] (memory location 0x10075) contains 'a'
test[1][2] (memory location 0x10076) contains 'r'
test[1][3] (memory location 0x10077) contains '\0'
With C++ std::string objects much the same thing is going on: the actual C++ string object doesn't "contain" the characters because, as you say, the strings are of variable length. What it actually contains is a pointer to the characters. (At least, it does in a simple implementation of std::string - in reality it has a more complicated structure to provide better memory use and performance).

An array of strings is an array of pointers to the first character of some strings. The size of a pointer to a char is probably the same size as a pointer to an int.
Essentially, an array of pointers, unlike a true 2D array, isn't necessarily linear in memory; the pointed-to arrays could be anywhere.

This might seem like pedantry, but in a shoot-yer-foot language like C++ this is important: in your original question you say:
you can do pointer1 ++; to step through the array.
Postincrement (pointer1++) is usually semantically wrong here, because it means "increment pointer1 but keep the expression value at the original value of pointer1". If you have no need for the original value of pointer1, use pre-increment (++pointer1) instead, which has semantically exactly the meaning of "increment the pointer by one".
For some reason most C++ textbooks do the postincrement thing everywhere, teaching new C++-ers bad habits ;-)

In C++, arrays and vectors always contain fixed-size elements. Strings fit this condition, because your string elements are then either pointers to null-terminated C strings (char *) stored somewhere else, or plain std::string objects.
The std::string object has a constant size, the actual string data is allocated somewhere else (except for small string optimization, but that's another story).
vector<string> a;
a.resize( 2 ); // allocate memory for 2 strings of any length.
vector<char *> b;
b.resize( 2 ); // allocate memory for 2 string pointers.
vector<char> c; // one string. Should use std::string instead.
c.resize( 2 ); // allocate memory for 2 characters (including or not the terminator).
Note that the reserve() function of std::vector only prepares the vector to grow: it allocates capacity without creating any elements, and is used mainly for optimization. You probably want to use resize().
