What's the difference, in terms of the underneath process, for the following two statements:
string strA = "stringA" + "stringB";
string strB = string("stringA") + string("stringB");
The difference is pretty fundamental.
The type of "stringA" is char const[8].
The type of std::string("stringA") is std::string.
There is no operator+ defined that accepts two arguments of types char const[] or char const*.
Whereas, there are overloaded operator+(std::string, char const*) and operator+(char const*, std::string).
In other words, if you'd like to use operator+ to concatenate string literals, the first or the second string must be std::string, so that it finds that overloaded operator+. E.g.
std::string("a") + "b" + "c" + "d"
// or
"a" + std::string("b") + "c" + "d"
string strA = "stringA" + "stringB";
error, cannot add two pointers
string strA = "stringA" "stringB";
concatenates in compiler, same as "stringAstringB"
string strB = string("stringA") + string("stringB");
creates two std::string objects, adds them, returning a new object, which should then be move-constructed into strB, so only 3 std::string constructors should be called in C++11. The compiler will probably optimize all that away.
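A minimal sketch of the two forms above that do compile. The function names (joined_by_compiler, joined_at_runtime, check_joined) are hypothetical and chosen only for illustration:

```cpp
#include <cassert>
#include <string>

// Adjacent literals are merged by the compiler into a single literal.
std::string joined_by_compiler() {
    return "stringA" "stringB";
}

// Two std::string temporaries are created and joined by operator+ at run time.
std::string joined_at_runtime() {
    return std::string("stringA") + std::string("stringB");
}

// Hypothetical self-check: both forms produce the same text.
void check_joined() {
    assert(joined_by_compiler() == "stringAstringB");
    assert(joined_at_runtime() == joined_by_compiler());
}

// std::string s = "stringA" + "stringB";  // does not compile: adds two pointers
```

Calling check_joined() from main runs both checks.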
Today I was surprised when trying to concatenate an std::string with an int. Consider the following MWE:
#include <iostream>
#include <string>
void print(const std::string& text)
{
std::cout << "The string is: " << text << ".\n";
}
int main()
{
print("iteration_" + 1);
return 0;
}
Instead of printing
The string is: iteration_1.
which I would expect, it prints
The string is: teration_.
What exactly is going on in the background? Does the string for some reason get converted into char[] or something of the sort? The documentation of operator+ does not list any with an std::string and int.
And what is the proper way of concatenating an std::string with a number? Do I really have to throw them both into an std::stringstream or convert the number into std::string explicitly with std::to_string()?
Does the string for some reason get converted into char[]
Actually it is the other way around. "iteration_" is a const char[11] which decays to a const char* when you add 1. Incrementing the pointer by one makes it point to the next character in the string. This is then used to construct a temporary std::string that contains all but the first character.
The documentation you link is for operator+ of std::string, but to use that you need a std::string first.
This line is the problem:
print("iteration_" + 1);
The string literal decays to a const char*. You are adding 1 to this pointer, moving it to the next character.
If you wanted to add the string "1" to the end of your literal, a fairly simple way is to pass the string literal to the std::string constructor and convert the 1 to a string manually. For example:
print(std::string("iteration_") + std::to_string(1));
"iteration_" is not a std::string but a const char[], which decays to const char*. "iteration_" + 1 just performs pointer arithmetic and moves the pointer to the next char (i.e. 't'), so you get the C-style string "teration_".
You can use std::to_string to convert int to std::string, then concatenate them. e.g.
print("iteration_" + std::to_string(1));
In this case the std::operator+ overload that takes a const char* on the left and a std::string on the right is selected, so "iteration_" is concatenated with the result of std::to_string(1) directly, and the resulting std::string is passed to print.
If you try to use the following:
std::string str = "iteration" + 1;
Clang will emit the warning:
warning: adding 'int' to a string does not append to the string
[-Wstring-plus-int]
This is because you are incrementing the pointer to the "iteration" string by 1, which means that the string "teration" is assigned to the str variable.
The proper way of concatenating would be:
std::string str = "iteration" + std::to_string(1);
The expression "iteration_" + 1 is a const char[11] literal added to the int 1.
In that expression, "iteration_" decays to a const char* pointer to the first element of the array. + 1 then takes place in pointer arithmetic on that pointer. The entire expression evaluates to a const char* type (pointing to the first t) which is a valid NUL-terminated input to a std::string constructor! (The anonymous temporary std::string binds to the const std::string& function parameter.)
This is completely valid C++ and can occasionally be put to good use.
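The pointer arithmetic described above can be sketched as follows. The helper name skip_chars is hypothetical, and the assert is only there to keep the offset inside the array:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Returns the text seen after skipping `offset` characters of a C string.
std::string skip_chars(const char* text, std::size_t offset) {
    assert(offset <= std::strlen(text));  // stay inside the array
    const char* shifted = text + offset;  // pointer arithmetic, not concatenation
    return std::string(shifted);          // reads up to the terminating '\0'
}
```

With this helper, skip_chars("iteration_", 1) yields "teration_", which is the same value that print("iteration_" + 1) receives.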
If you want to treat + as a concatenation, then
print("iteration_" + std::to_string(1));
is one way.
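For comparison, here is a small sketch of the two approaches the question mentions, std::to_string and a string stream. The function names are hypothetical, and either approach should work for this case:

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Builds "iteration_<n>" by converting the number to a std::string first.
std::string label_with_to_string(int n) {
    return "iteration_" + std::to_string(n);
}

// Same result through a stream; convenient when mixing several types.
std::string label_with_stream(int n) {
    std::ostringstream out;
    out << "iteration_" << n;
    return out.str();
}

// Hypothetical self-check: both helpers agree.
void check_labels() {
    assert(label_with_to_string(1) == "iteration_1");
    assert(label_with_stream(1) == label_with_to_string(1));
}
```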
I am reading Accelerated C++ by Koenig. He writes that "the new idea is that we can use + to concatenate a string and a string literal - or, for that matter, two strings (but not two string literals)."
Fine, this makes sense I suppose. Now onto two separate exercises meant to illuminate this .
Are the following definitions valid?
const string hello = "Hello";
const string message = hello + ",world" + "!";
Now, I tried to execute the above and it worked! So I was happy.
Then I tried to do the next exercise;
const string exclam = "!";
const string message = "Hello" + ",world" + exclam;
This did not work. Now I understand it has something to do with the fact that you cannot concatenate two string literals, but I don't understand the semantic difference between why I managed to get the first example to work (isn't ",world" and "!" two string literals? Shouldn't this not have worked?) but not the second.
const string message = "Hello" + ",world" + exclam;
The + operator has left-to-right associativity, so the equivalent parenthesized expression is:
const string message = (("Hello" + ",world") + exclam);
As you can see, the two string literals "Hello" and ",world" are "added" first, hence the error.
One of the first two strings being concatenated must be a std::string object:
const string message = string("Hello") + ",world" + exclam;
Alternatively, you can force the second + to be evaluated first by parenthesizing that part of the expression:
const string message = "Hello" + (",world" + exclam);
It makes sense that your first example (hello + ",world" + "!") works because the std::string (hello) is one of the arguments to the leftmost +. That + is evaluated, the result is a std::string object with the concatenated string, and that resulting std::string is then concatenated with the "!".
As for why you can't concatenate two string literals using +, it is because a string literal is just an array of characters (a const char [N] where N is the length of the string plus one, for the null terminator). When you use an array in most contexts, it is converted into a pointer to its initial element.
So, when you try to do "Hello" + ",world", what you're really trying to do is add two const char*s together, which isn't possible (what would it mean to add two pointers together?) and if it was it wouldn't do what you wanted it to do.
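A short sketch of the two working groupings discussed above, wrapped in hypothetical functions (left_to_right, right_first) so the results can be compared:

```cpp
#include <cassert>
#include <string>

// The leftmost + has a std::string operand, so every step yields a std::string.
std::string left_to_right() {
    const std::string exclam = "!";
    return std::string("Hello") + ",world" + exclam;
}

// The parentheses make (",world" + exclam) run first; its std::string result
// is then the right-hand operand of the remaining +.
std::string right_first() {
    const std::string exclam = "!";
    return "Hello" + (",world" + exclam);
}

// Hypothetical self-check: both groupings build the same message.
void check_grouping() {
    assert(left_to_right() == "Hello,world!");
    assert(right_first() == "Hello,world!");
}
```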
Note that you can concatenate string literals by placing them next to each other; for example, the following two are equivalent:
"Hello" ",world"
"Hello,world"
This is useful if you have a long string literal that you want to break up onto multiple lines. They have to be string literals, though: this won't work with const char* pointers or const char[N] arrays.
You should always pay attention to types.
Although they all seem like strings, "Hello" and ",world" are literals.
And in your example, exclam is a std::string object.
C++ has an operator+ overload that takes a std::string object and appends another string to it. When you concatenate a std::string object with a literal, an overload that accepts a const char* is selected for the literal.
But if you try to concatenate two literals, the compiler won't be able to find an operator that takes two literals.
Since C++14 you can use real std::string literals (the s suffix, made available by using namespace std::string_literals;):
const string hello = "Hello"s;
const string message = hello + ",world"s + "!"s;
or
const string exclam = "!"s;
const string message = "Hello"s + ",world"s + exclam;
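A self-contained sketch of the suffix form, assuming a C++14 (or later) compiler. The function names greet and check_greet are hypothetical:

```cpp
#include <cassert>
#include <string>
#include <type_traits>

using namespace std::string_literals;  // brings operator""s into scope

// With the suffix, the literal is already a std::string, not a char array.
static_assert(std::is_same<decltype("Hello"s), std::string>::value,
              "a literal with the s suffix is a std::string");

std::string greet() {
    const std::string exclam = "!"s;
    return "Hello"s + ",world"s + exclam;
}

// Hypothetical self-check.
void check_greet() {
    assert(greet() == "Hello,world!");
}
```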
Your second example does not work because there is no operator + for two string literals. Note that a string literal is not of type string, but instead is of type const char[N], which decays to const char *. Your second example will work if you revise it like this:
const string message = string("Hello") + ",world" + exclam;
The difference between a string (or to be precise, std::string) and a string literal is that for the latter there is no + operator defined. This is why the second example fails.
In the first case, the compiler can find a suitable operator+ with the first argument being a string and the second a string literal (const char*), so it uses that. The result of that operation is again a string, so it repeats the same trick when adding "!" to it.
In case 1, because of order of operations you get:
(hello + ", world") + "!" which resolves to (a std::string) + "!" and finally to a std::string
In case 2, as James noted, you get:
("Hello" + ", world") + exclam which is the concat of 2 string literals.
Hope it's clear :)
If we write
string s = "hello" + "world!";
the operands on the right-hand side have the types
const char [6] + const char [7]
Both are built-in types, not std::string, so none of the operator+ overloads provided for std::string apply. Only the operators that the language itself defines for built-in types are considered.
After array-to-pointer conversion both operands are const char*, and the language defines no binary + for two pointers because the operation would be meaningless: the sum of two addresses does not denote a concatenated string, or any useful address at all.
There is also a general rule for operator overloading: an operator cannot be overloaded when all of its operands have built-in types; at least one operand must be of class or enumeration type. So std::string cannot supply an operator+ for two string literals, and the language does not define one itself.
If we like the literal form, we can use the s suffix (C++14, from std::string_literals):
std::string p = "hello"s + "world!"s;
With the suffix, each literal is a std::string rather than a character array, so std::string's operator+ applies.
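The overloading rule can be illustrated with a small sketch. The wrapper type Text and the function check_text are hypothetical and exist only for this example:

```cpp
#include <cassert>
#include <string>

// Hypothetical wrapper type, used only to illustrate the rule.
struct Text {
    std::string value;
};

// Allowed: at least one operand (Text) is a class type.
Text operator+(const Text& lhs, const char* rhs) {
    return Text{lhs.value + rhs};
}

// Not allowed: both operands are built-in (pointer) types, so the compiler
// would reject this declaration.
// Text operator+(const char* lhs, const char* rhs);

// Hypothetical self-check.
void check_text() {
    Text t = Text{"hello"} + " world!";
    assert(t.value == "hello world!");
}
```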
I'm really confused about const char * and char *.
I know that with char *, when we want to modify the content, we need to do something like this
const char * temp = "Hello world";
char * str = new char[strlen(temp) + 1];
memcpy(str, temp, strlen(temp));
str[strlen(temp) + 1] = '\0';
and if we want to use something like this
char * str = "xxx";
char * str2 = "xts";
str = str2;
we get a compiler warning. That's OK; I know that when I want to change a char * I have to use something like a memory copy. But I'm really confused about const char *. With const char * I can use this
const char * str = "Hello";
const char * str2 = "World";
str = str2; // and now str points to "World"
and I have no compiler error! Why? Why do we need a memory copy when the pointer is not const, but with const we can simply use the assignment operator and be done? How is that possible? Is it OK to just use assignment with const? Will no problem happen later?
As other answers say, you should distinguish pointers and bytes they point to.
Both types of pointers, char * and const char *, can be changed, that is, "redirected" to point to different bytes. However, if you want to change the bytes (characters) of the strings, you cannot use const char *.
So, if you have string literals "Hello" and "World" in your program, you can assign them to pointers, and printing the pointer will print the corresponding literal. However, to do anything non-trivial (e.g. change Hello to HELLO), you will need non-const pointers.
Another example: with some pointer manipulation, you can remove leading bytes from a string literal:
const char* str = "Hello";
std::cout << str; // Hello
str = str + 2;
std::cout << str; // llo
However, if you want to extract a substring, or do any other transformation on a string, you should reallocate it, and for that you need a non-const pointer.
BTW since you are using C++, you can use std::string, which makes it easier to work with strings. It reallocates strings without your intervention:
#include <iostream>
#include <string>
int main() {
    std::string str("Hello");
    str = str.substr(1, 3); // 3 characters starting at index 1
    std::cout << str; // ell
}
This is a confusing hangover from the days of early C. Early C didn't have const, so string literals were "char *". They remained char * to avoid breaking old code, but they became non-modifiable, so const char * in all but name. So modern C++ either warns or gives an error (to be strictly conforming) when the const is omitted.
Your memcpy missed the trailing nul byte, incidentally. Use strcpy() to copy a string, that's the right function with the right name. You can create a string in read/write memory by use of the
char rwstring[] = "I am writeable";
syntax.
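As a hedged sketch of what a correct copy into writable memory could look like, the following copies the terminating '\0' together with the characters. The helper names writable_copy and check_copy are hypothetical, and std::unique_ptr is used here only so the buffer is released automatically:

```cpp
#include <cassert>
#include <cstring>
#include <memory>

// Copies a C string into a newly allocated, writable buffer.
std::unique_ptr<char[]> writable_copy(const char* source) {
    const std::size_t length = std::strlen(source);
    std::unique_ptr<char[]> buffer(new char[length + 1]);
    std::memcpy(buffer.get(), source, length + 1);  // + 1 copies the '\0' too
    return buffer;
}

// Hypothetical self-check: the copy can be modified, unlike the literal.
void check_copy() {
    std::unique_ptr<char[]> copy = writable_copy("Hello world");
    copy[0] = 'J';  // fine: the copy is not const
    assert(std::strcmp(copy.get(), "Jello world") == 0);
}
```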
That is because your variables are just pointers. You're not modifying their contents, only where they point.
char * a = "asd";
char * b = "qwe";
a = b;
Now the old value of a (the address of "asd") is discarded, and a and b point to the same place. If the characters there were modified through one pointer, the change would be visible through the other.
In other words, the pointers themselves are (usually) not constant. The const in const char* applies to the characters pointed to, not to the pointer variable.
The real difference is that the (non-const) pointer points to const data; when you change the pointer, it simply points to a different piece of const data. That is why const has no effect on reassigning a plain pointer.
Note: more complex declarations, such as a const pointer (char* const), behave differently, but in a simple case like this const does not restrict reassignment.
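A minimal sketch of the two places const can appear in a pointer declaration. The variable and function names are hypothetical, and the lines that would not compile are left as comments:

```cpp
#include <cassert>
#include <cstring>

void check_const_positions() {
    char buffer[] = "asd";       // writable array initialised from a literal

    const char* view = "qwe";    // pointer to const char: the pointee is read-only
    view = buffer;               // OK: the pointer itself can be re-aimed
    // view[0] = 'x';            // error: cannot write through a pointer to const

    char* const fixed = buffer;  // const pointer to char: the pointer is fixed
    fixed[0] = 'A';              // OK: the pointee is writable
    // fixed = nullptr;          // error: the pointer itself is const

    // view and fixed both refer to buffer, so the write is visible through view.
    assert(std::strcmp(view, "Asd") == 0);
}
```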
Citing Malcolm McLean:
This is a confusing hangover from the days of early C. Early C didn't have const, so string literals were "char *". They remained char * to avoid breaking old code, but they became non-modifiable, so const char * in all but name.
Actually, string literals are not pointers but arrays, which is why sizeof("hello world") works like a charm (it yields 12, since the terminating null character is included, in contrast to strlen...). Apart from this small detail, the above statement is correct for good old C even these days.
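The sizeof/strlen difference can be checked directly. This is a small sketch with hypothetical function names:

```cpp
#include <cassert>
#include <cstring>

// The literal is an array of 12 chars: 11 visible characters plus '\0'.
static_assert(sizeof("hello world") == 12,
              "the array size includes the terminator");

// strlen counts characters up to, but not including, the terminating '\0'.
std::size_t visible_length() {
    return std::strlen("hello world");
}

// Hypothetical self-check.
void check_lengths() {
    assert(visible_length() == 11);
    assert(sizeof("hello world") == visible_length() + 1);
}
```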
In C++, though, string literals have been arrays of constant characters (char const[]) right from the start:
C++ standard, 5.13.5.8:
Ordinary string literals and UTF-8 string literals are also referred to as narrow string literals. A narrow string literal has type “array of n const char”, where n is the size of the string as defined below, and has static storage duration.
(Emphasised by me.) In general, you are not allowed to assign pointer to const to pointer to non-const:
char const* s = "hello";
char* ss = s;
This will fail to compile. Assigning string literals to pointer to non-const should normally fail, too, as the standard explicitly states in C.1.1, subclause 5.13.5:
Change: String literals made const.
The type of a string literal is changed from “array of char” to “array of const char”.
[...]char* p = "abc"; // valid in C, invalid in C++
Still, assigning a string literal to a pointer to non-const is commonly accepted by compilers (as an extension!), probably to retain compatibility with C. As this is invalid according to the standard, the compiler yields a warning, at least...
I'm having trouble understanding some particular behaviour of assignment in strings.
//method 1
std::string s;
s += 'a'; // This works perfectly
but
//method2
std::string s;
s = "" + 'a'; // This gives an unexpected value
Why does the 2nd method give an unexpected value? From what I've read, the std::string default constructor initialises the variable as an empty string if no constructor is specified. And s += 'a' should be the same as s = s + 'a'. So why isn't method 2 the same as method 1?
And one more query on the same topic: if we can't initialise a string with a char literal, then how can we assign a char literal to it?
std::string s2 = 'a'; // gives an error while compiling
whereas
std::string s2;
s2 = 'a'; // works perfectly
From what I understand, we cannot initialise a string variable with a char because the string constructor needs an argument of type const char *. Why is there no such restriction when assigning?
For your first query: method 1 works because you are adding a char literal to a std::string object, and s += 'a' is indeed the same as s = s + 'a'. Note that s is a std::string object, not a string literal.
In method 2 you are adding a string literal and a char literal. std::string provides operators for appending strings or characters to a std::string object, but no such operators exist for two literals; the built-in + is used instead, and it does not concatenate. (Two adjacent string literals, "StringLiteral1" "StringLiteral2", are concatenated by the compiler, but that applies only to string literals.)
For your second query: initialisation is not the same as assignment. Initialisation uses a constructor, and std::string has no constructor that takes a single char, so std::string s2 = 'a'; does not compile. Assignment uses operator=, and std::string does provide an operator= overload that takes a char, so s2 = 'a'; works.
"" is a string literal of type const char[1], which decays to a pointer to its only element, '\0'. In "" + 'a' the char 'a' is promoted to an integer (97 in ASCII) and added to that pointer, so the expression is pointer arithmetic that goes far past the end of the array. That is undefined behaviour, which is why you see something other than what you expected.
If you want it to be the same as s += 'a', you'll need to use a std::string literal (C++14): s = ""s + 'a';. This works, as ""s is an empty std::string, and you just add another character to it.
When you write s = "" + 'a'; remember that "" is not a std::string but a const char[1] that decays to const char*, and const char* doesn't have a predefined concatenation operator. That's why you get unexpected behaviour instead of concatenation.
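For reference, a small sketch of ways to end up with a one-character std::string, assuming C++14 for the s suffix. The variable names are hypothetical:

```cpp
#include <cassert>
#include <string>

using namespace std::string_literals;  // for the s suffix

void check_single_char() {
    std::string from_count(1, 'a');       // constructor: one copy of 'a'
    std::string from_suffix = ""s + 'a';  // std::string + char overload
    std::string assigned;
    assigned = 'a';                       // std::string has operator=(char)
    std::string appended;
    appended += 'a';                      // method 1 from the question

    assert(from_count == "a");
    assert(from_suffix == "a");
    assert(assigned == "a");
    assert(appended == "a");
}
```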
I can do what the following code shows:
std::string a = "chicken ";
std::string b = "nuggets";
std::string c = a + b;
However, this fails:
std::string c = "chicken " + "nuggets";
It gives me an error saying " '+' cannot add two pointers". Why is that so? Is there an alternative way to do this without getting an error?
"chicken " and "nuggets" are literal C-string and not std::string.
You may concatenate directly with:
std::string c = "chicken " "nuggets";
Since C++14, you may add the suffix s to get std::string literals:
using namespace std::string_literals;
std::string c = "chicken "s + "nuggets"s;
"chicken " and "nuggets" are not of type std::string but const char[N] arrays. Even though you want to assign the concatenation to a string, those types have no operator+. You could solve this using:
std::string c = std::string("chicken ") + "nuggets";
Both "chicken" and "nuggets" have type const char* (as literals in code), so you are trying to add two pointers.
Try
std::string c = std::string("chicken") + "nuggets";
std::string is not part of the language itself; it is part of the standard library. C++ aims to have as few language features as possible, so there is no built-in string type in the parser etc. That's why string literals are treated as pointers.
EDIT
To be completely correct: the type of a literal is really const char[N] (where N is the character count in the literal, +1 for \0). But in C++ arrays ([]) are implicitly converted to pointers, and that is what the compiler tries to do (as it cannot add arrays).
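The array type and its conversion to a pointer can be checked at compile time. This is a sketch using standard type traits; the function names are hypothetical:

```cpp
#include <cassert>
#include <type_traits>

// A string literal is an lvalue array, so decltype reports a reference to it.
// "chicken " has 8 characters plus the terminating '\0'.
static_assert(std::is_same<decltype("chicken "), const char (&)[9]>::value,
              "the literal is an array of 9 const char");

// Array-to-pointer conversion ("decay") yields a pointer to the first element.
static_assert(std::is_same<std::decay<decltype("nuggets")>::type,
                           const char*>::value,
              "the array decays to const char*");

// Reports whether the literal's own type (reference removed) is an array.
bool literal_is_array() {
    return std::is_array<std::remove_reference<decltype("nuggets")>::type>::value;
}

// Hypothetical self-check.
void check_literal_type() {
    assert(literal_is_array());
}
```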