Using strings as arrays in C++ - c++

Our instructor told us that a string is an array of characters. Whenever we use an array statically, we have to define its size before compiling the program in C++, so why don't we do the same with a string?
Thanks in advance.

The compiler can choose an array's size automatically to match its initial content, for example:
int a[] = { 3, 5, 2 };
So this is not something that string literals have and other arrays don't.

The string is an object, which is smarter than a character array. A character array is just an allocation in memory, it has no logic associated with it. However the string (because it is an object) is able to manage its own memory and expand as needed.
In C++ you can overload operators. Because the string class has its [ ] operators overloaded you can use the string as an array and access individual characters. However when you use the [ ] operators you are actually invoking a method on the string (namely operator[ ]).
So you can create a string, expand by adding to it, and access individual characters in it:
string str1 = "Hello "; // create a string and assign value
string str2("World"); // use the constructor to assign a value
str1 += str2; // append one string to another
cout << str1[0]; // should print H
But even though operator overloading gives it the same feel as an array, it's actually an object.

If we talk about char* arr = "hello world";
no string object is involved here: "hello world" is a string literal that the compiler stores in static memory, and arr simply points to its first character (modern C++ requires const char* for this).
If we say std::string str = "hello world";
here the constructor of the std::string class is called, and it copies the characters of "hello world" into memory that the str object manages itself.
Here we do not have to give the size; instead, the constructor of the string class does all the work of allocating dynamic memory and initializing it.

Related

How can this constructor that has pointer as the parameter receive string?

This is a constructor that I found from the internet, but it didn't have enough descriptions so I couldn't understand how it's even possible for this constructor to have string as a parameter.
For instance, if I say MyString str1("hello world"), this "hello world" goes into the constructor, but I have no idea how this string can get into this const pointer parameter. Can anyone explain how this is possible?
MyString::MyString(const char* str) {
string_length = strlen(str);
string_content = new char[string_length];
for (int i = 0; i != string_length; i++) string_content[i] = str[i];
}
When you get a const char* you are getting what is called a C string, which is a pointer to an array of characters that ends with a special value '\0', which you don't have to type. All strings that you write as literals in the code are of this form.
Don't confuse C strings with the C++ string class (which works with C strings underneath), since they are different.
So what is happening is the constructor doesn't actually receive a string, it receives a pointer that points to something of type char (and what is a string if not a sequence of chars, and I mean the concept of string and not the string object in C++ itself).
When you do "MyString str1("hello world")" what is actually passed to the constructor is a pointer that points to the first char of the string, in this case the "h" char.

How to code a strcat function that works with two dynamic arrays

As we know, the strcat function concatenates one c-string onto another to make one big c-string containing the two others.
My question is how to make a strcat function that works with two dynamically allocated arrays.
The desired strcat function should be able to work for any sized myStr1 and myStr2
//dynamic c-string array 1
char* myStr1 = new char [26];
strcpy(myStr1, "The dog on the farm goes ");
//dynamic c-string array 2
char* myStr2 = new char [6];
strcpy(myStr2, "bark.");
//desired function
strcat(myStr1,myStr2);
cout<<myStr1; //would output 'The dog on the farm goes bark.'
This is as far as I was able to get on my own:
//*& indicates that the dynamic c-string str1 is passed by reference
void strcat(char*& str1, char* str2)
{
int size1 = strlen(str1);
int size2 = strlen(str2);
//unknown code
//str1 = new char [size1+size2]; //Would wipe out str1's original contents
}
Thanks!
You need first to understand better how pointers work. Your code for example:
char* myStr1 = new char [25];
myStr1 = "The dog on the farm goes ";
first allocates 25 characters, then ignores the pointer to that allocated area (the technical term is "leaks it") and sets myStr1 to point to a string literal.
That code should have used strcpy instead to copy from the string literal into the allocated area. Except that the string is 25 characters so you will need to allocate space for at least 26 as one is needed for the ASCII NUL terminator (0x00).
Correct code for that part should have been:
char* myStr1 = new char [26]; // One more than the actual string length
strcpy(myStr1, "The dog on the farm goes ");
To do the concatenation of C strings the algorithm could be:
measure the lengths n1 and n2 of the two strings (with strlen)
allocate n1+n2+1 characters for the destination buffer (+1 is needed for the C string terminator)
strcpy the first string at the start of the buffer
strcat the second string to the buffer (*)
delete[] the memory for the original string buffers if they are not needed (if this is the right thing to do or not depends on who is the "owner" of the strings... this part is tricky as the C string interface doesn't specify that).
(*) This is not the most efficient way. strcat will go through all the characters of the string to find where it ends, but you already know that the first string length is n1 and the concatenation could be done instead with strcpy too by choosing the correct start as buffer+n1. Even better instead of strcpy you could use memcpy everywhere if you know the count as strcpy will have to check each character for being the NUL terminator. Before getting into this kind of optimization however you should understand clearly how things work... only once the string concatenation code is correct and for you totally obvious you are authorized to even start thinking about optimization.
PS: Once you get all this correct and working and efficient you will appreciate how much of a simplification is to use std::string objects instead, where all this convoluted code becomes just s1+s2.
You allocate memory and make your pointers point to that memory. Then you overwrite the pointers, making them point somewhere else. The assignment of e.g. myStr1 causes the variable to point to the string literal instead of the memory you allocated. You need to copy the strings into the memory you have allocated.
Of course, that copying will lead to another problem, as you seem to forget that C-strings need an extra character for the terminator. So a C-string with 5 characters needs space for six characters.
As for your concatenation function, you need to do copying here too. Allocate enough space for both strings plus a single terminator character. Then copy the first string into the beginning of the new memory, and copy the second string into the end.
Also you need a temporary pointer variable for the memory you allocate, as you otherwise "would wipe out str1's original contents" (not strictly true, you just make str1 point somewhere else, losing the original pointer).

How to assign a size to a new string? C++

I am trying to create an array of pointers to strings. I want each of the strings to have only 3 chars. This is the code I have so far:
string **ptr=new string *[100]; // An array of 100 pointers to strings
for (i=0;i<100; i++) // Assigning each pointer with a new string
{
ptr[i]=new string;
(*ptr[i])[3];
}
I am having trouble with the line (*ptr[i])[3]. If I were to create a string with only 3 chars not via a pointer I would write:
string str[3];
How do I assign 3 chars with the pointer? Thanks!
std::vector<std::string> vec(100, std::string(3, ' '));
That does exactly what you are looking for without the need to manage memory yourself.
string str[3];
That does not create a string with 3 characters, but an array of 3 strings.
(*ptr[i])[3]; simply accesses the fourth character in the string, it doesn't resize it. The std::string DOES provide a resize() method though.
As already mentioned, string str[3] creates an array of three strings but I don't think you're trying to talk about that.
As already pointed out, you can use the string ctor that takes a size argument and a fill char, like so:
ptr[i]=new string( 3, ' ' );
And of course you should use vector.

string initializing as NULL in C++

string a=NULL;
it gives error. Why and how can I initialize string as NULL?
but when I write
string a="foo";
this it works fine.
Actually to get an empty std::string, you just write
std::string a;
std::string's default constructor will give you an empty string without further prompting.
As an aside, using NULL in C++ is generally discouraged; the recommendation is to use either 0 (which NULL tends to be defined as anyway) or, if you have a modern enough compiler, nullptr.
There is a difference between null and empty string (an empty string is still a valid string). If you want a "nullable" object (something that can hold at most one object of a certain type), you can use boost::optional:
boost::optional<std::string> str; // str is *nothing* (i.e. there is no string)
str = "Hello, world!"; // str is "Hello, world!"
str = ""; // str is "" (i.e. empty string)
Let's break down what you are in fact doing:
string a=NULL;
This is an initialization, not a default construction followed by an assignment, so it calls a constructor of the string class. But what is NULL? In C++, NULL is a macro for a null pointer constant, commonly just 0. The constructor that gets selected is therefore the one taking a const char*, which requires a pointer to a valid null-terminated character array. A null pointer does not point to one, so the behavior is undefined; libstdc++, for example, throws std::logic_error at run time.
string a="abc"
works because "abc" is a char array that converts to a valid const char*, and the string class has a constructor for exactly that. That's why NULL doesn't work and "abc" works.

Basic c-style string memory allocation

I am working on a project with existing code which uses mainly C++ but with c-style strings. Take the following:
#include <iostream>
using namespace std;
int main(int argc, char *argv[])
{
char* myString = "this is a test";
myString = "this is a very very very very very very very very very very very long string";
cout << myString << endl;
return 0;
}
This compiles and runs fine with the output being the long string.
However I don't understand WHY it works. My understanding is that
char* myString
is a pointer to an area of memory big enough to hold the string literal "this is a test". If that's the case, then how am I able to then store a much longer string in the same location? I expected it to crash when doing this due to trying to cram a long string into a space set aside for the shorter one.
Obviously there's a basic misunderstanding of what's going on here so I appreciate any help understanding this.
You're not changing the content of the memory, you're changing the value of the pointer to point to a different area of memory which holds "this is a very very very very very very very very very very very long string".
Note that char* myString only allocates enough bytes for the pointer (usually 4 or 8 bytes). When you do char* myString = "this is a test";, what actually happened was that before your program even started, the compiler allocated space in the executable image and put "this is a test" in that memory. Then when you do char* myString = "this is a test"; what it actually does is just allocate enough bytes for the pointer, and make the pointer point to that memory it had already allocated at compile time, in the executable.
So if you like diagrams:
char* myString = "this is a test";
(allocate memory for myString)
            ---> "this is a test"
           /
myString---
                 "this is a very very very very very very very very very very very long string"
Then
myString = "this is a very very very very very very very very very very very long string";
                 "this is a test"
myString---
           \
            ---> "this is a very very very very very very very very very very very long string"
There are two strings in memory. The first is "this is a test", and let's say it begins at address 0x1000. The second is "this is a very very ... long string", and it begins at address 0x1200.
By
char* myString = "this is a test";
you create a variable called myString and assign address 0x1000 to it. Then, by
myString = "this is a very very ... long string";
you assign 0x1200. By
cout << myString << endl;
you just print the string beginning at 0x1200.
You have two string literals of type const char[n]. These can be assigned to a variable of type char*, which is nothing more than a pointer to a char. Whenever you declare a variable of type pointer-to-T you are only declaring the pointer, and not the memory to which it points.
The compiler reserves memory for both literals and you just take your pointer variable and point it at those literals one after the other. String literals are read-only and their allocation is taken care of by the compiler. Typically they are stored in the executable image in protected read-only memory. A string literal typically has a lifetime equal to that of the program itself.
Now, it would be UB if you attempted to modify the contents of a literal, but you don't. To help prevent yourself from attempting modifications in error you would be wise to declare your variable as const char*.
During program execution, a block of memory containing "this is a test" is allocated, and the address of the first character in that block of memory is assigned to the myString variable. In the next line, a separate block of memory containing "this is a very very..." is allocated, and the address of the first character in that block of memory is now assigned to the myString variable, replacing the address it used to store with the new address to the "very very long" string.
just for illustration, let's say the first block of memory looks like this:
[t][h][i][s][ ][i][s][ ][a][ ][t][e][s][t]
and let's just say the address of this first 't' character in this sequence/array of characters is 0x100.
so after the first assignment of the myString variable, the myString variable contains the address 0x100, which points to the first letter of "this is a test".
then, a totally different block of memory contains:
[t][h][i][s][ ][i][s][ ][a][ ][v][e][r][y]...
and let's just say that the address of this first 't' character is 0x200.
so after the second assignment of the myString variable, the myString variable NOW contains the address 0x200, which points to the first letter of "this is a very very very...".
Since myString is just a pointer to a character (hence "char *" is its type), it only stores the address of a character; it has no concern for how big the array is supposed to be, and it doesn't even know that it is pointing to an "array", only that it is storing the address of a character...
for example, you could legally do this:
char myChar = 'C';
/* assign the address of the location in
memory in which 'C' is stored to
the myString variable. */
myString = &myChar;
Hopefully that was clear enough. If so, upvote/accept answer. If not, please comment so that I may clarify.
String literals do not require allocation by your code; they are stored as-is in the program and can be used directly. Essentially myString was a pointer to one string literal, and was changed to point to another string literal.
char* means a pointer to a block of memory that holds a character.
C style string functions get a pointer to the start of a string. They assume there's a sequence of characters that ends with a null character ('\0').
So what the << operator actually does is loop from that first character position until it finds a null character.