Different ways to make a string? - c++

I'm new to C++ and I'd like to know right from the start:
Do all of these methods of making strings work exactly the same way and always give exactly the same results? Is there any difference in the result between any of them?
1) char greeting [6] = { 'h','e','l','l','o','\0' };
2) char greeting[] = "hello";
3) #include <string>
string greeting = "hello";

1) and 2) work exactly the same. Both create a 6-element non-heap-allocated array, and copy the characters 'h', 'e', 'l', 'l', 'o', '\0' to the array at runtime or load time.
3) creates an instance of std::string and calls its constructor which copies the characters 'h', 'e', 'l', 'l', 'o'(, '\0')* to its internal memory buffer. (* The '\0' is not required to be stored in the memory buffer.)
There is another way to declare a string in C++, using a pointer to char:
const char* greeting = "hello";
This will not copy anything. It will just point the pointer to the first character 'h' of the null-terminated "hello" string which is located somewhere in memory. The string is also read-only (modifying it causes undefined behavior), which is why one should use a pointer-to-const here.
If you're wondering which one to use, choose std::string, it's the safest and easiest.
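As a rough illustration of the points above, the sketch below declares the four forms side by side and checks the claims that can be checked portably. The variable and function names are made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// The two array forms from the question, plus std::string and a pointer.
// All names here are illustrative.
char by_chars[6] = { 'h', 'e', 'l', 'l', 'o', '\0' };
char by_literal[] = "hello";
std::string as_string = "hello";
const char* as_pointer = "hello";

// True when both arrays have the same size and hold the same bytes.
bool arrays_identical() {
    return sizeof by_chars == sizeof by_literal &&
           std::memcmp(by_chars, by_literal, sizeof by_chars) == 0;
}

void demo_equivalence() {
    assert(arrays_identical());            // 1) and 2) produce the same array
    assert(sizeof by_literal == 6);        // five letters plus the '\0'
    assert(as_string.size() == 5);         // size() does not count a terminator
    assert(std::strlen(as_pointer) == 5);  // the literal itself is null-terminated
    assert(as_string == by_literal);       // same text, different storage
}
```

Calling demo_equivalence() from main should pass silently; the asserts only restate what the answer says.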

Do these methods of making strings work exactly the same way and give the exact same results always in every case?
The first two are array definitions:
char greeting [6] = { 'h','e','l','l','o','\0' };
char greeting[] = "hello";
They work the same, because in the second definition a '\0' is appended implicitly.
As for the third definition:
string greeting = "hello";
A string is a class type object and as such it is more complex than a simple array.
Is there any difference in the result in any of them?
There is a quantitative [1] and a qualitative [2] difference between the first two and the third, stemming from the fact that std::string is a class type object.
[1] Quantitative: arrays occupy less memory space than a std::string.
[2] Qualitative: std::string provides resource management and many facilities for element manipulation.

Get away with Initialize the char array without putting \0 at the end of string

I am new to the C++ language. Recently I was taught that we should put '\0' at the end of a char array when initializing it, for example:
char x[6] = "hello"; //OK
However,if you do :
char x[5] = "hello";
Then this would raise the error :
initializer-string for array of chars is too long
Everything goes as I expect until the expression below, which does not raise a compile error:
char x[5] = {'h','e','l','l','o'};
This really confuses me, so I would like to ask two questions:
1. Why doesn't the expression char x[5] = "hello"; raise an error?
2. To my knowledge, strlen() stops only when it finds '\0' to determine the length of a char array. In this case, what would strlen(x) return?
Thanks!
The string literal "hello" has six characters, because there's an implied nul terminator. So
char x[] = "hello";
defines an array of six char. That's almost always what you want, because the C-style string functions (strlen, strcpy, strcat, etc.) operate on C-style strings, which are, by definition, nul terminated.
But that doesn't mean that every array of char will be nul terminated.
char x[] = { 'h', 'e', 'l', 'l', 'o' };
This defines an array of five char. Applying C-style string functions to this array will result in undefined behavior, because the array does not have a nul terminator.
You can do character-by-character initialization and create a valid C-style string by explicitly including the nul terminator:
char x[] = { 'h', 'e', 'l', 'l', 'o', '\0' };
This defines an array of six char that holds a C-style string (i.e., a nul terminated sequence of characters).
The key here is to separate in your mind the general notion of an array of char from the more specific notion of an array of char that holds a C-style string. The latter is almost always what you want to do, but that doesn't mean that there is never a use for the former. It's just that the former is uncommon.
As an aside, in C you're allowed to elide the nul terminator:
char x[5] = "hello";
This is legal C, and it creates an array of 5 char with no nul terminator. In C++ that's not legal.
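To make the distinction between "array of char" and "array of char holding a C-style string" concrete, here is a small sketch. element_count is a hypothetical helper written for this example, not a standard function.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: reports how many elements a char array has.
template <std::size_t N>
constexpr std::size_t element_count(const char (&)[N]) { return N; }

void demo_sizes() {
    char terminated[] = "hello";                        // 6 elements, ends in '\0'
    char unterminated[] = { 'h', 'e', 'l', 'l', 'o' };  // 5 elements, no '\0'

    assert(element_count(terminated) == 6);
    assert(element_count(unterminated) == 5);

    // Safe: this array holds a C-style string.
    assert(std::strlen(terminated) == 5);

    // std::strlen(unterminated) would be undefined behavior, so the scan
    // is bounded by the array size instead of relying on a terminator.
    const void* nul = std::memchr(unterminated, '\0', sizeof unterminated);
    assert(nul == nullptr);  // no terminator anywhere inside the array
}
```

The point of the std::memchr call is only that any scan over the five-element array has to be given the length explicitly.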
Why doesn't expression char x[5] = "hello"; raise an error?
This is not true. The appearance of an error is expected in this case.
To my knowledge, the function strlen() would stop only if it finds '\0' to determine the length of the char array, in this case, what would strlen(x) return?
If you could somehow run the code, the program would have undefined behavior. That is, you would not get what you expect. strlen() only stops counting when it finds a null terminator, so it may run past the end of the char array and read memory outside it; that is where the UB is invoked.

declare char variable in c++, why need to add 1 for declare the array size [duplicate]

There is a saying when we declare char variable.
We should declare like this -> char ArrayName[Maximum_C-String_Size+1];
For example:
char arr[4+1] = {'a', 'b', 'c', 'd'};
but
char arr[4] = {'a', 'b', 'c', 'd'}; also works.
Why do we need to add 1?
thanks!
There is no need to do this, unless you are defining something that will be used as a null-terminated string.
// these two definitions are equivalent
char a[5] = { 'a', 'b', 'c', 'd' };
char b[5] = { 'a', 'b', 'c', 'd', '\0' };
If you only want an array with 4 char values in it, and you won't be using that with anything that expects to find a string terminator, then you don't need to add an extra element.
If you’re storing a C-style string in an array, then you need an extra element for the string terminator.
Unlike C++, C does not have a unique string data type. In C, a string is simply a sequence of character values including a zero-valued terminator. The string "foo" is represented as the sequence {'f','o','o',0}. That terminator is how the various string handling functions know where the string ends. The terminator is not a printable character and is not counted towards the length of the string (strlen("foo") returns 3, not 4), however you need to set aside space to store it. So, if you need to store a string that’s N characters long, then the array in which it is stored needs to be at least N+1 elements wide to account for the terminator.
However, if you’re storing a sequence that’s not meant to be treated as a string (you don’t intend to print it or manipulate it with the string library functions), then you don’t need to set aside the extra element.
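The N+1 rule and the equivalence of the two definitions above can be checked with a short sketch (the function name is illustrative only):

```cpp
#include <cassert>
#include <cstring>

void demo_zero_fill() {
    char a[5] = { 'a', 'b', 'c', 'd' };        // a[4] is implicitly '\0'
    char b[5] = { 'a', 'b', 'c', 'd', '\0' };  // same contents, spelled out
    char c[4] = { 'a', 'b', 'c', 'd' };        // exactly four chars, no terminator

    assert(std::memcmp(a, b, sizeof a) == 0);  // a and b are byte-for-byte equal
    assert(std::strlen(a) == 4);               // 4 characters stored in 4 + 1 elements
    assert(sizeof c == 4);                     // fine as raw data, not a C-style string
}
```

Elements left out of a braced initializer are zero-initialized, which is why a ends up with a terminator even though none was written.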

How do char arrays and their pointers work in c++ exactly?

I am a beginner student in c++ and there is one thing I cannot understand when working with character arrays: So, I know that pointers are essentially variables that "point" to the memory address of another variable, and that the name of an array(Ex: int a[20]) is a constant pointer to the values in that array. When working with different numeric types(int, float etc.) if we output through a message the name of that array it shows the address of the first element, but if we do the same with a char type, it doesn't show the address, but the value of the variable.
Example:
#include <iostream>
using namespace std;
int main()
{int a[]={1,2,3,4,5};
cout<<a<<endl; //through output, it shows the memory address of the first element of the array
char b[]={"Mountain"};
cout<<b; //It outputs the word "Mountain"
return 0;
}
Is the pointer from a char array automatically converted to its value when you output it?
There is no difference. char pointers aren't somehow magically different than int pointers. So what's going on, then?
std::cout << has an overload for char* (the older printf() has the %s conversion for the same purpose). That overload behaves differently from the one for other pointers: the characters are read starting at the pointer until a '\0' character is reached (see null terminated string).
char b[]={"Mountain"};
b does not contain
{'M', 'o', 'u', 'n', 't', 'a', 'i', 'n'}
but instead
{'M', 'o', 'u', 'n', 't', 'a', 'i', 'n', '\0'} <- '\0'
making the iterating and stopping possible.
This also explains why the array size of b is 1 larger than the number of characters inside the word.
To add, you should not use these char pointers. They are dangerous and have long been replaced by modern utilities like std::string.
Now, int a[]={1,2,3,4,5}; is OK, but std::array<int, 5> a = {1,2,3,4,5}; is even better:
the types are unique (std::array<int, 4> != std::array<int, 5>)
it has a .size() function.
you can therefore pass it to other functions without having to add a size argument
it's as fast as a normal array
std::array can be used by including <array>.
If you ever go for something like int* a = new int[5]; then stop right there and use std::vector instead.
Finally, never ever write using namespace std;
It all depends on how you interpret the parameters. cout << operator will consider (sometype*) as an address, but particularly char* as a string.
If you write a function taking your own parameters, you can interpret what ever the way you like.
In this problem, if you want to get the address, you can do it so
std::cout << static_cast<const void*>(b);
In C, strings are represented as a pointer to char. For this reason, when you pass a char* to an ostream (such as std::cout) in C++, it will interpret it as a null-terminated string and print that string's content rather than the address. If you want to print the address, you'll have to cast that pointer to a different kind:
std::cout << (void*)b;
cout is an output stream. When we pass a char* to an output stream, it treats it as a null-terminated string (i.e. it prints all the characters until it finds '\0'). For any other pointer type, the address is printed.
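The difference between the two overloads can be observed without looking at a terminal by writing into a std::ostringstream. The two helper functions below are hypothetical names for this sketch; the exact text of the printed address is implementation-defined, so the example only checks that it is not the word itself.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: captures what operator<< writes for a char pointer.
std::string print_as_text(const char* p) {
    std::ostringstream out;
    out << p;                            // char* overload: characters up to '\0'
    return out.str();
}

// Hypothetical helper: captures what operator<< writes for a void pointer.
std::string print_as_address(const char* p) {
    std::ostringstream out;
    out << static_cast<const void*>(p);  // void* overload: the address
    return out.str();
}

void demo_stream() {
    char b[] = "Mountain";
    assert(print_as_text(b) == "Mountain");
    assert(print_as_address(b) != "Mountain");
}
```

The same cast works with std::cout directly, as in the answers above.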

Initialize char array to hold non-null-terminated string [duplicate]

I am trying to initialize a char array with a long string. However, I do not want it to be NULL terminated.
This:
const char s[] = "The actual string is much longer then this...";
is much easier to read (and to write) than this:
const char s[] = {'T', 'h', 'e', ' ', 'a', 'c', 't', 'u', 'a', 'l', ' ', 's', ...};
but the former gets NULL terminated. Is there a way to avoid the NULL on a string literal?
The reason for doing this is that there is the need to pack densely strings in memory of fixed size length known during development.
No.
A string literal is a C-string which, by definition, is null-terminated.
Either ignore the final character, revisit your requirements (why do you care about a final character?!) or … I dunno, something else. Perhaps generate the objects with xxd?
I would do:
size_t length = 45;
char s[] = "The actual string is much longer then this..";
s[length - 1] = '.';
What you have there is a trade-off between readability and functionality, and I think you can get away with this easily, since you cannot avoid the null terminator in the "normal" initialization.
If I were in your shoes, I would re-consider my approach and use std::string.
No. If you want easy-to-write code, copy to a second array but leave off the last char. You can use a char pointer in the first case to perhaps save some memory.
In C, the terminating nul will be omitted if it doesn't fit (C++ rejects such an initializer). Since your strings are all fixed length, that's not a problem to arrange. For example:
#include <stdio.h>
char foo[3][4] = { "four", ".by." , "3333" };
int main(void)
{
for (int i = 0; i < 3; ++i)
{
for (int j = 0; j < 4; ++j)
putchar(foo[i][j]);
putchar('\n');
}
}
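The program above relies on a C rule; a C++ compiler rejects an initializer string that is too long for its array. One possible C++ workaround, sketched here under the assumption that a std::array is acceptable as the storage type, is a small helper that copies the literal and leaves the terminator behind. without_nul is a made-up name, not a library function.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical helper: copies a string literal into a std::array that has
// no room for the terminating '\0'.
template <std::size_t N>
std::array<char, N - 1> without_nul(const char (&literal)[N]) {
    std::array<char, N - 1> result{};
    for (std::size_t i = 0; i + 1 < N; ++i) {
        result[i] = literal[i];  // copy everything except the final '\0'
    }
    return result;
}

void demo_without_nul() {
    auto s = without_nul("four");
    assert(s.size() == 4);               // four chars, no terminator stored
    assert(s[0] == 'f' && s[3] == 'r');
}
```

The result must never be handed to functions that expect a C-style string; use s.data() together with s.size() instead.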
There is no way to have a string literal not null terminated.
But you actually want to pack a number of strings densely and you know both the sizes and the strings at development time.
Assume:
"First"
"Second"
"Third"
to pack them you can safely do:
char const s[] = "First""Second""Third";
You only need to save the lengths and properly reconstruct the termination in case you want to print a string or build a std::string from it. That is easy though.
As a bonus, you save the extra pointer you would otherwise have to store for each string.
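A minimal sketch of this packing idea follows. The lengths table and the unpack function are assumptions made for the example; the answer only says that the lengths have to be saved somewhere.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Adjacent literals are concatenated into one array with a single final '\0'.
const char packed[] = "First" "Second" "Third";

// Lengths known at development time (illustrative table).
const std::size_t lengths[] = { 5, 6, 5 };

// Rebuilds entry `index` as a std::string from its offset and length.
std::string unpack(std::size_t index) {
    std::size_t offset = 0;
    for (std::size_t i = 0; i < index; ++i) {
        offset += lengths[i];  // skip over the earlier entries
    }
    return std::string(packed + offset, lengths[index]);
}

void demo_packed() {
    assert(sizeof packed == 17);   // 16 characters plus one terminator
    assert(unpack(1) == "Second");
}
```

The std::string(pointer, count) constructor copies exactly count characters, so it does not need a terminator between the entries.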

The importance of null character when initializing char arrays

I'm new with C++ and I started to wonder, what happens if you leave the null character out when defining a char array?
For example, if I define a char array with the null character:
char myarray[] = {'a', 'b', 'c', '\0'};
and then I define it without the null character:
char myarray[] = {'a', 'b', 'c'};
What is the importance of the null character in this scenario? Might the absence of null character in the example above cause some problems later on?...Do you recommend always including or excluding the null character when defining char arrays this way?
Thank you for any help :)
It means that anything that takes a char* as parameter, expecting it to be a null-terminated string, will invoke undefined behaviour, and fail in one way or another*.
Some examples are strlen, the std::string(const char*) constructor, the std::ostream operator<< overload for const char*...
* undefined behaviour means it could even work "correctly", but there is no guarantee this is reproducible.
char myarray[] = {'a', 'b', 'c'};
If you define it without the nul character, it is a valid character array but not a valid string.
1. You should not use this character array as an argument to string functions like strlen(), strcpy(), etc.
2. You should not print it the way we print a string with %s in C.
3. You can print character by character.
4. You can compare character by character.
Further char myarray[] = {'a', 'b', 'c', '\0'}; is equal to "abc"
but char myarray[] = {'a', 'b', 'c'}; is not equal to "abc"
If you don't have the terminating null-character, you can still use your char array like you could with the null-char.
However, functions that expect null-terminated strings (like strlen), will not stop at the end, since they don't know where the end is. (That's what the null-char is for)
They will therefore continue to work in memory until they either go out of bounds and you get a segmentation fault or run until they find their null-char.
Basically, if you want your char array to be a string, append a null-char to denote the end.
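If an array without a terminator still has to become text at some point, one option is to pass its length along explicitly. The sketch below assumes a hypothetical helper, to_string_exact, that uses the array size instead of searching for '\0'.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: builds a std::string from every element of a char
// array, using the array's size rather than looking for a terminator.
template <std::size_t N>
std::string to_string_exact(const char (&chars)[N]) {
    return std::string(chars, N);
}

void demo_exact() {
    char unterminated[] = { 'a', 'b', 'c' };
    char terminated[] = { 'a', 'b', 'c', '\0' };

    assert(to_string_exact(unterminated) == "abc");  // no '\0' needed

    // The const char* constructor stops at the terminator...
    assert(std::string(terminated) == "abc");
    // ...while the sized version copies the '\0' as a fourth character.
    assert(to_string_exact(terminated).size() == 4);
}
```

Passing unterminated to the plain std::string(const char*) constructor would be undefined behavior, for the reason given above.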
what happens if you leave the null character out when defining a char array?
You get an array containing just the characters you specify.
What is the importance of the null character in this scenario?
In C, it's conventional to represent a string as a null-terminated character array. This convention is sometimes used in C++ to interoperate with C-style interfaces, or to work with string literals (which inherited their specification from C), or because the programmer thinks it's a good idea for some reason. If you're going to do this, then obviously you'll need to terminate all the arrays you want to interpret as strings.
The question seems to be about C++, although you've also tagged it C for some reason. In C++, you usually want to use std::string to manage strings for you. Life is too short for messing around with low-level arrays and pointers.
Might the absence of null character in the example above cause some problems later on?
If you pass a non-terminated array to a function expecting a terminated array, then it will stomp off the end of the array causing undefined behaviour.
Do you recommend always including or excluding the null character when defining char arrays this way?
I recommend understanding what the array is supposed to be used for, and include the terminator if it's supposed to be a C-style string.
What is the importance of the null character in this scenario?
High.
Might the absence of null character in the example above cause some problems later on?
Yes. C and C++ functions taking a char* that points to this C-string will require it to be null-terminated.
Do you recommend always including or excluding the null character when defining char arrays this way?
Neither. I recommend using std::string, since you said you are writing C++.
The null character will be used by strlen-like functions if you wish to use your array as a string. If I need a string because I want to use some text, I write:
const char* mystr = "abc"; // it is already null terminated
writing:
char myarray[] = {'a', 'b', 'c', '\0'};
is too verbose.