return char1 + char2? Isn't it possible? - c++

I'm trying to return a string from a function, which basically adds some chars together and returns the string representation.
string toString() {
char c1, c2, c3;
// some code here
return c1 + c2; // Error: invalid conversion from `char' to `const char*'
}
It is possible to return boolean values like return c1 == 'x'. Isn't it possible to return string values? I know that it is possible to do it like this:
string result;
result += c1; // append(c1, c2) would instead append c1 copies of c2
result += c2;
return result;
I'm new to C++ so I thought that there must be more elegant solution around.

No, you can't do that because adding two chars together doesn't give you a string. It gives you an integer: both operands are promoted to int and summed, and converting the result back to a char gives you another character. In this case 'a'+'b' gives you '├' (on Windows with the standard CP_ACP code page). char is an integral type, and the compiler only knows how to add such values arithmetically. Strings are a completely different beast.
You can do it, but you have to be explicit:
return string(1, c1) + string(1, c2);
This will construct two temporary strings, each initialized to one repetition of the character passed as the second parameter. Since operator+ is defined for strings to be a concatenation function, you can now do what you want.
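As a minimal sketch of the explicit conversion described above (the helper name concat_chars is a placeholder and does not appear in the question):

```cpp
#include <string>

// Hypothetical helper: builds a two-character string from two chars.
std::string concat_chars(char c1, char c2) {
    // std::string(1, c) is the (count, char) constructor: one copy of c.
    // operator+ on two std::string objects concatenates them.
    return std::string(1, c1) + std::string(1, c2);
}
```

Calling concat_chars('a', 'b') returns the string "ab" rather than a sum of character codes.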

char types in C++ (as well as in C) are integral types. They behave as integral types. Just like when you write 5 + 3 in your code, you expect to get integral 8 as the result (and not string "53"), when you write c1 + c2 in your code above you should expect to get an integral result - the arithmetic sum of c1 and c2.
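The integral behaviour can be checked at compile time. This is a small sketch; sum_chars is a hypothetical name used only for illustration:

```cpp
#include <type_traits>

// Both operands of + undergo integral promotion, so the result type is int.
static_assert(std::is_same<decltype('a' + 'b'), int>::value,
              "char + char yields int, not std::string");

// Hypothetical helper: returns the arithmetic sum of two character codes.
int sum_chars(char c1, char c2) {
    return c1 + c2;
}
```

On an ASCII system sum_chars('a', 'b') is 97 + 98, which is 195.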
If you actually want to concatenate two characters to form a string, you have to do it differently. There are many ways to do it. For example, you can form a C-style string
char str[] = { c1, c2, '\0' };
which will be implicitly converted to std::string by
return str;
Or you can build a std::string right away (which can also be done in several different ways).
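A few of those ways, sketched side by side. The function names are placeholders, and the array variant assumes c1 is not itself the null character (otherwise the resulting string would be empty):

```cpp
#include <string>

// Hypothetical helpers; each returns the two-character string made of c1 and c2.

std::string from_array(char c1, char c2) {
    const char str[] = { c1, c2, '\0' };  // null-terminated C-style string
    return str;                           // implicitly converted to std::string
}

std::string from_init_list(char c1, char c2) {
    return std::string{ c1, c2 };         // initializer-list constructor (C++11)
}

std::string from_append(char c1, char c2) {
    std::string result;
    result += c1;                         // operator+= accepts a single char
    result += c2;
    return result;
}
```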

You can convert each char to a string then use +:
return string(1, c1)+string(1, c2);
Alternately, string has the + operator overload to work with characters, so you can write:
return string(1, c1) + c2;
No matter what method you choose, you will need to convert the integral type char to either a C-style string (char*) or a C++ style string (std::string).

return string(1, c1) + c2;
This constructs a 1-character string, containing c1, then adds (overloaded to concatenate) c2 (creating another string), then returns it.

No, that just adds up the character codes. You need to convert them to strings.

You need to create a string from the chars.
And then return the string (actually a copy of the string)

Related

How to properly concatenate std::strings in this case? [C++] [duplicate]

I am reading Accelerated C++ by Koenig. He writes that "the new idea is that we can use + to concatenate a string and a string literal - or, for that matter, two strings (but not two string literals)."
Fine, this makes sense I suppose. Now onto two separate exercises meant to illuminate this.
Are the following definitions valid?
const string hello = "Hello";
const string message = hello + ",world" + "!";
Now, I tried to execute the above and it worked! So I was happy.
Then I tried to do the next exercise;
const string exclam = "!";
const string message = "Hello" + ",world" + exclam;
This did not work. Now I understand it has something to do with the fact that you cannot concatenate two string literals, but I don't understand the semantic difference between why I managed to get the first example to work (isn't ",world" and "!" two string literals? Shouldn't this not have worked?) but not the second.
const string message = "Hello" + ",world" + exclam;
The + operator has left-to-right associativity, so the equivalent parenthesized expression is:
const string message = (("Hello" + ",world") + exclam);
As you can see, the two string literals "Hello" and ",world" are "added" first, hence the error.
One of the first two strings being concatenated must be a std::string object:
const string message = string("Hello") + ",world" + exclam;
Alternatively, you can force the second + to be evaluated first by parenthesizing that part of the expression:
const string message = "Hello" + (",world" + exclam);
It makes sense that your first example (hello + ",world" + "!") works because the std::string (hello) is one of the arguments to the leftmost +. That + is evaluated, the result is a std::string object with the concatenated string, and that resulting std::string is then concatenated with the "!".
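The three working groupings can be put side by side. This is only a sketch, and the function names are invented for the example:

```cpp
#include <string>

// Each function builds "Hello,world!" using one of the valid groupings.

std::string leftmost_is_string() {
    const std::string hello = "Hello";
    return hello + ",world" + "!";          // parsed as (hello + ",world") + "!"
}

std::string wrap_first_literal() {
    const std::string exclam = "!";
    return std::string("Hello") + ",world" + exclam;
}

std::string parenthesize_right() {
    const std::string exclam = "!";
    return "Hello" + (",world" + exclam);   // the inner + already sees a std::string
}
```

In every case each + has at least one std::string operand, which is what lets the compiler pick a std::string overload.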
As for why you can't concatenate two string literals using +, it is because a string literal is just an array of characters (a const char [N] where N is the length of the string plus one, for the null terminator). When you use an array in most contexts, it is converted into a pointer to its initial element.
So, when you try to do "Hello" + ",world", what you're really trying to do is add two const char*s together, which isn't possible (what would it mean to add two pointers together?) and if it was it wouldn't do what you wanted it to do.
Note that you can concatenate string literals by placing them next to each other; for example, the following two are equivalent:
"Hello" ",world"
"Hello,world"
This is useful if you have a long string literal that you want to break up onto multiple lines. They have to be string literals, though: this won't work with const char* pointers or const char[N] arrays.
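A short sketch of adjacent-literal concatenation, using an invented constant and function name:

```cpp
#include <string>

// Adjacent literals are merged by the compiler before any operator is applied.
const char* const kMerged = "Hello" ",world";

// Hypothetical example of splitting one long literal over several lines.
std::string long_message() {
    return "The quick brown fox "
           "jumps over "
           "the lazy dog";
}
```

No operator+ is involved here, so no std::string is needed until the final conversion of the return value.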
You should always pay attention to types.
Although they all seem like strings, "Hello" and ",world" are literals.
And in your example, exclam is a std::string object.
C++ has an operator overload that takes a std::string object and adds another string to it. When you concatenate a std::string object with a literal, the compiler picks the overload that accepts the literal (as a const char*).
But if you try to concatenate two literals, the compiler won't be able to find an operator that takes two literals.
Since C++14 you can use two real string literals:
const string hello = "Hello"s;
const string message = hello + ",world"s + "!"s;
or
const string exclam = "!"s;
const string message = "Hello"s + ",world"s + exclam;
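One detail worth spelling out: the s suffix is only visible after a using-directive such as using namespace std::string_literals (a plain using namespace std also works). A minimal sketch, assuming a C++14 or later compiler, with greet as a made-up function name:

```cpp
#include <string>

// operator""s for strings lives in the inline namespace std::literals::string_literals.
using namespace std::string_literals;

// Hypothetical function name, used only for this example.
std::string greet() {
    const std::string exclam = "!"s;
    // Every operand is already a std::string, so the grouping no longer matters.
    return "Hello"s + ",world"s + exclam;
}
```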
Your second example does not work because there is no operator + for two string literals. Note that a string literal is not of type string; it is an array of const char that decays to const char *. Your second example will work if you revise it like this:
const string message = string("Hello") + ",world" + exclam;
The difference between a string (or to be precise, std::string) and a string literal is that for the latter there is no + operator defined. This is why the second example fails.
In the first case, the compiler can find a suitable operator+ with the first argument being a string and the second a string literal (const char*), so it uses that. The result of that operation is again a string, so it repeats the same trick when adding "!" to it.
In case 1, because of order of operations you get:
(hello + ", world") + "!" which resolves to (a std::string) + "!" and finally to a std::string
In case 2, as James noted, you get:
("Hello" + ", world") + exclam which is the concat of 2 string literals.
Hope it's clear :)
If we write
string s = "hello" + "world!";
the right-hand side has the operand types
const char [6] + const char [7]
Both operands are built-in types, not std::string, so none of the operator+ overloads provided by std::string applies and only the built-in meaning of + is considered.
In this expression both arrays decay to const char*, and the language defines no + for two pointers because the result would be meaningless: the sum of two addresses is not a usable address.
There is also a general rule for operator overloading: you cannot overload an operator when all of its operands are built-in types. So std::string could not supply an operator+ for two string literals even if it wanted to, and the language itself deliberately leaves pointer + pointer undefined.
If you prefer the literal form, you can use the s suffix (C++14):
std::string p = "hello"s + "world!"s;
With the suffix, each literal becomes a std::string (s is an overloaded literal operator), so the std::string operator+ is found.

C++ operator '==' cannot compare string[i] with another string. Compile error

#include <iostream>
using namespace std;
int main()
{
string str = "abcdef";
string x = "a";
if (str[0] == x) {
//do something...
}
return 0;
}
and it does not compile.
"error: no match for ‘operator==’ (operand types are ‘__gnu_cxx::__alloc_traits<std::allocator<char>, char>::value_type’ {aka ‘char’} and ‘std::string’ {aka ‘std::__cxx11::basic_string<char>’})"
Besides being a string, std::string also provides the interface of a container of chars. So when you use operator[] you access an individual char from this container, and you cannot compare a char with a string. If you want a single-character string instead, use std::string::substr() with length 1. Or, if you want to compare the character with another one, declare x as a single char instead of a string.
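The options mentioned above might look like this. It is a sketch with hypothetical helper names; the third variant, based on std::string::compare, is an extra alternative that avoids the temporary created by substr():

```cpp
#include <string>

// Compare the first character directly with a char.
bool first_char_is(const std::string& str, char c) {
    return !str.empty() && str[0] == c;
}

// Compare a one-character substring with another string.
bool first_substr_is(const std::string& str, const std::string& x) {
    return str.substr(0, 1) == x;
}

// Compare the leading x.size() characters of str with x, without a temporary.
bool has_prefix(const std::string& str, const std::string& x) {
    return str.compare(0, x.size(), x) == 0;
}
```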
The problem here is that you're comparing a char with a string
str[0] is actually a char
Just need to declare x as char...
#include <iostream>
#include <string>
using namespace std;
int main()
{
string str = "abcdef";
char x = 'a';
if (str[0] == x) {
//do something...
}
return 0;
}
You are asking why your code doesn't compile.
If we look at your code line by line we can see that...
string str = "abcdef";
string x = "a";
if (str[0] == x)
In the first line you declare a string str that stores the character values {a, b, c, d, e, f}, whatever the encoding (ASCII, UTF-8, etc.).
In the second line you declare another string x that stores the single character value {a}.
The compile error does not appear until the expression inside the if statement.
On the LHS of the expression you use std::string's operator[] to access the element at index 0, which returns a reference to a char. On the RHS you compare that char reference against the std::string named x.
The issue is that there is no implicit conversion from a char to std::string, the standard library provides no operator== taking a char and a std::string, and you have not defined such an operator==() yourself.
The easiest fix is to change either the LHS to a string or the RHS to a char. The standard library also has functions and algorithms that can do the comparison for you.
You can refer to cppreference:string:basic_string:operator_at for detailed information about std::string's operator[]. You can also search that site for other functions, algorithms, string manipulators and container types. It is probably one of the best resources available for the C and C++ standard libraries.

Issues with converting between "string" "const unsigned char" and "utf8proc_uint8_t"

Maybe a simple issue, but I've gotten confused between "byte arrays", pointers and casts in c++.
Take a look at the following and let me know what I need to read about to fix it, as well as the fix. It relates to the utf8proc library.
const unsigned char *aa = (const unsigned char*)e.c_str();
utf8proc_uint8_t* a = utf8proc_NFC(aa);
char b = (char)a;
string d = string(b);
The code is bad enough that there is no need for an error message here, but the failure is on the string(b) line: std::string has no constructor that takes a single char.
There appear to be a couple problems here. The biggest is the assignment:
char b = (char)a;
What you are doing is telling the compiler to take the pointer (memory location) and convert that to a char, then assign it to the single char value b. So you'll basically have random gibberish in b.
Instead, if you want to treat a like a basic char*, you would write:
char* b = (char*)a;
Then you could use the string class with either:
string d = string(b);
or you could skip several lines by using the direct conversion:
string d = string((char*)a);
You are also looking for a headache down the line if you don't free the buffer returned by the utf8proc_NFC() call, and if you don't do an error check after the conversion.
Plus I'll put in a plug for using some Hungarian notation to distinguish a pointer (a 'p' prefix on variables). This makes it obvious that you can do things like:
char tmp = *pStr; // a single character (first in the string)
char tmp2 = pStr[1]; // a single character (second in the string)
char* pTmp = pStr; // a pointer to a null terminated string
But you would never see:
char tmp3 = (char)pStr; // at best compiles with a warning; treating a pointer as a character makes no sense.
So here is a clean version of all of this:
utf8proc_uint8_t* pUTF = utf8proc_NFC( (const unsigned char*)e.c_str() );
string strUTF;
if (pUTF)
{
strUTF = (char*)pUTF;
free(pUTF);
}
This code is almost certainly not what you want, since it is casting a pointer into a scalar.
utf8proc_uint8_t* a = ...;
char b = (char)a;
Instead, you want to cast and produce a pointer:
utf8proc_uint8_t* a = ...;
const char *b = (const char *)a;
I also added const, which is not strictly necessary but a good idea to use wherever you can.
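The copy-then-free pattern can also be wrapped so that the buffer is released on every path. The sketch below deliberately does not call utf8proc: fake_normalize is a stand-in invented for this example, and it only assumes a C function that returns a malloc'ed, null-terminated buffer or a null pointer on failure. FreeDeleter and normalized_copy are likewise hypothetical names.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <memory>
#include <string>

// Stand-in for a C API that returns a malloc'ed, null-terminated buffer
// (or nullptr on failure). NOT part of utf8proc; it merely copies its input.
unsigned char* fake_normalize(const unsigned char* in) {
    const std::size_t n = std::strlen(reinterpret_cast<const char*>(in)) + 1;
    void* out = std::malloc(n);
    if (out) std::memcpy(out, in, n);
    return static_cast<unsigned char*>(out);
}

// Deleter that releases malloc'ed memory with free().
struct FreeDeleter {
    void operator()(void* p) const { std::free(p); }
};

// Copies the C buffer into a std::string; the buffer is freed automatically.
std::string normalized_copy(const std::string& e) {
    std::unique_ptr<unsigned char, FreeDeleter> buf(
        fake_normalize(reinterpret_cast<const unsigned char*>(e.c_str())));
    if (!buf) return std::string();  // conversion failed
    // Cast the pointer (not the pointee) to const char* before building the string.
    return std::string(reinterpret_cast<const char*>(buf.get()));
}
```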

Difference when concatenating strings in c++

What's the difference, in terms of the underneath process, for the following two statements:
string strA = "stringA" + "stringB";
string strB = string("stringA") + string("stringB");
The difference is pretty fundamental.
The type of "stringA" is char const[8].
The type of std::string("stringA") is std::string.
There is no operator+ defined that accepts two arguments of types char const[] or char const*.
Whereas, there are overloaded operator+(std::string, char const*) and operator+(char const*, std::string).
In other words, if you'd like to use operator+ to concatenate string literals, the first or the second string must be std::string, so that it finds that overloaded operator+. E.g.
std::string("a") + "b" + "c" + "d"
// or
"a" + std::string("b") + "c" + "d"
string strA = "stringA" + "stringB";
error, cannot add two pointers
string strA = "stringA" "stringB";
concatenates in compiler, same as "stringAstringB"
string strB = string("stringA") + string("stringB");
creates two std::string objects, adds them, returning a new object, and should then move construct into strB, so only 3 std::string constructors should be called in c++11. Compiler will probably optimize all that away.

C++: set of C-strings

I want to create one so that I could check whether a certain word is in the set using set::find
However, C-strings are pointers, so the set would compare them by the pointer values by default. To function correctly, it would have to dereference them and compare the strings.
I could just pass the constructor a pointer to the strcmp() function as a comparator, but this is not exactly how I want it to work. The word I might want to check could be part of a longer string, and I don't want to create a new string due to performance concerns. If it weren't for the set, I would use strncmp(a1, a2, 3) to check the first 3 letters. In fact, 3 is probably the longest it could go, so I'm fine with having the third argument constant.
Is there a way to construct a set that would compare its elements by calling strncmp()? Code samples would be greatly appreciated.
Here's pseudocode for what I want to do:
bool WordInSet (string, set, length)
{
for (each word in set)
{
if strncmp(string, word, length) == 0
return true;
}
return false;
}
But I'd prefer to implement it using the standard library functions.
You could create a comparator function object.
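#include <cstring>
#include <set>
struct set_object {
    // std::set needs a "less than" predicate, so compare the strncmp result with 0.
    bool operator()(const char* first, const char* second) const {
        return strncmp(first, second, 3) < 0;
    }
};
std::set<const char*, set_object> c_string_set;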
However it would be far easier and more reliable to make a set of std::strings.
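A minimal sketch of the std::string variant. The helper name word_in_set is borrowed from the pseudocode in the question, and the caller is assumed to guarantee that text points to at least length characters. Building the short temporary key is usually cheap, since many implementations store strings this small without a heap allocation, though that is not guaranteed by the standard.

```cpp
#include <cstddef>
#include <set>
#include <string>

// Hypothetical helper: true if the first `length` characters of `text`
// form a word that is stored in `words`.
bool word_in_set(const std::set<std::string>& words,
                 const char* text, std::size_t length) {
    // std::string(ptr, n) copies exactly n characters starting at ptr.
    return words.find(std::string(text, length)) != words.end();
}
```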
Make a wrapper function:
bool myCompare(const char * lhs, const char * rhs)
{
return strncmp(lhs, rhs, 3) < 0;
}
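Such a wrapper can be handed to std::set as a function pointer. In this sketch prefix3_less mirrors the wrapper above under a different name, and PrefixSet and make_word_set are names made up for the example:

```cpp
#include <cstring>
#include <set>

// Orders C strings by their first three characters only.
bool prefix3_less(const char* lhs, const char* rhs) {
    return std::strncmp(lhs, rhs, 3) < 0;
}

// The comparator type is a function pointer; the function itself is
// passed to the set's constructor.
using PrefixSet = std::set<const char*, bool (*)(const char*, const char*)>;

// Hypothetical factory that fills a set with two sample words.
PrefixSet make_word_set() {
    PrefixSet words(prefix3_less);
    words.insert("cat");   // string literals outlive the set
    words.insert("dog");
    return words;
}
```

With this ordering, find("dogma") locates the stored "dog" because only the first three characters are compared. The flip side is that two words sharing their first three letters count as the same key, so only one of them can be stored.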
Assuming a constant value as a word length looks like asking for trouble to me. I recommend against this solution.
Look: The strcmp solution doesn't work for you because it treats the const char* arguments as nul-terminated strings. You want a function which does exactly the same, but treats the arguments as words - which translates to "anything-not-a-letter"-terminated string.
One could define strcmp in a generic way as:
// is_end(c) returns true when c terminates the string being compared.
template<typename EndPredicate>
int generic_strcmp(const char* s1, const char* s2, EndPredicate is_end) {
    char c1;
    char c2;
    do {
        c1 = *s1++;
        c2 = *s2++;
        if (is_end(c1)) {
            return c1 - c2;
        }
    } while (c1 == c2);
    return c1 - c2;
}
If EndPredicate is a function which returns true iff its argument is equal to \0, then we obtain a regular strcmp which compares 0-terminated strings.
But in order to have a function which compares words, the only required change is the predicate. It's sufficient to use the inverted isalpha function from <cctype> header file to indicate that the string ends when a non-alphabetic character is encountered.
So in your case, your comparator for the set would look like this:
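#include <cctype>
// Compares the leading alphabetic words of s1 and s2; any non-letter ends a word.
int wordcmp(const char* s1, const char* s2) {
    for (;; ++s1, ++s2) {
        // isalpha requires a value representable as unsigned char.
        const unsigned char c1 = static_cast<unsigned char>(*s1);
        const unsigned char c2 = static_cast<unsigned char>(*s2);
        const bool end1 = !std::isalpha(c1);
        const bool end2 = !std::isalpha(c2);
        // Both words ended: equal. Otherwise the word that ended first sorts first.
        if (end1 || end2) return (end1 ? 0 : 1) - (end2 ? 0 : 1);
        if (c1 != c2) return c1 - c2;
    }
}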