Prefix vs infix operator* with string literals [duplicate] - c++

This question already has an answer here:
Implicit conversion with operator
(1 answer)
Closed 7 years ago.
I recently gave an answer to this question on how to get Python-like string repeats, e.g. "hello" * 2 gives "hellohello".
I won't repeat the definition here, but the function declaration is:
std::string repeat(std::string str, const std::size_t n);
and can of course be used like:
std::cout << repeat("helloworld", 2) << std::endl;
To get closer to the Python version, I thought I'd overload operator*. Ideally I'd use a universal reference to avoid the additional std::string move, but operators must use a user-defined type. So I tried this instead:
#include <type_traits> // std::enable_if_t, std::is_integral
#include <utility> // std::move
template <typename T, typename = std::enable_if_t<std::is_integral<T>::value>>
std::string operator*(std::string str, const T n)
{
return repeat(std::move(str), static_cast<std::size_t>(n));
}
Now I can do this:
std::cout << (std::string("helloworld") * 2) << std::endl;
and this:
std::cout << operator*("helloworld", 2) << std::endl;
but not this:
std::cout << ("helloworld" * 2) << std::endl;
// error: invalid operands to binary expression ('const char *' and 'int')
Why not?

When you define an overloaded operator, at least one of the operands must be of a user-defined type. For pre-defined types, all operators are either pre-defined, or else prohibited.
Where you've explicitly converted to std::string the string ctor that takes a char const * as its parameter can/will be used to convert the literal to an std::string, but without that, the compiler can't/won't do the conversion.
Likewise, when you invoke the operator more explicitly as operator*("helloworld", 2), the compiler "knows" that it needs to convert the string literal to a type supported by an overload of operator *, so it (basically) enumerates all the types to which a string literal can be converted, and then sees if it can find an operator * that fits one of those types. If it finds more than one, it does (if memory serves) normal overload resolution on the candidate operator * implementations to decide which to use.
With just the expression string-literal * int, however, both types are built in, so it only examines the built-in operators. Since none of them fits, the expression is prohibited.
Note that with a current compiler, you could use a suffix of s on the string literal to create a std::string:
#include <string>
using namespace std::string_literals; // makes the s suffix available (C++14)
std::cout << "helloworld"s * 2 << "\n";
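For reference, the pieces discussed above can be put together as a minimal sketch. The body of repeat() is a hypothetical stand-in, since the question gives only its declaration, and demo() is an illustrative name; only the operator* template is taken from the question.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical stand-in: the question shows only the declaration of repeat().
std::string repeat(std::string str, const std::size_t n)
{
    if (n == 0) {
        return std::string();
    }
    const std::string unit = str;      // keep one copy of the original text
    str.reserve(unit.size() * n);
    for (std::size_t i = 1; i < n; ++i) {
        str += unit;                   // append n - 1 further copies
    }
    return str;
}

// operator* from the question: the left operand is a class type (std::string),
// so the overload is allowed.
template <typename T, typename = std::enable_if_t<std::is_integral<T>::value>>
std::string operator*(std::string str, const T n)
{
    return repeat(std::move(str), static_cast<std::size_t>(n));
}

// Illustrative helper: the s suffix turns the literal into a std::string,
// which makes the overloaded operator* a candidate.
std::string demo()
{
    using namespace std::string_literals;
    const std::string twice = "hello"s * 2;
    assert(twice == "hellohello");
    return twice;
}
```

Writing "hello" * 2 without the suffix would still be rejected, because both operands would then be built-in types.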

Because "helloworld" is not a std::string, it's a char array.

Related

Which overload does an operator use in C++?

Everybody knows that you can't concatenate 2 string literals using the + operator.
#include <iostream>
int main() {
std::cout << "hello " + "world";
}
// Error
What's happening here is that you are trying to add 2 char* which is an error. You can however add a string literal to a std::string.
#include <iostream>
#include <string>
int main() {
std::string s = "hello ";
std::cout << s + "world";
}
// Fine, prints hello world
But what I found is that the below code is also valid.
#include <iostream>
#include <string>
int main() {
std::string s = "world";
std::cout << "hello " + s;
}
// Fine, prints hello world
I would imagine in the above example that you are trying to add a std::string to a char* but it works fine. I think it may just be using the std::string overload of the + operator. My question is what exactly is happening here and how does the operator decide which overload to use in a situation such as with 2 different classes with perfectly valid overloads being added together.
What's happening here is that you are trying to add 2 char* which is an error.
To be a bit more correct, you're trying to add two arrays, each of which decays to const char*.
My question is what exactly is happening here
You're using these overloads:
std::string
operator+(const std::string& lhs, const char* rhs);
std::string
operator+(const char* lhs, const std::string& rhs);
how does the operator decide which overload to use
It uses the same overload resolution as normal functions do. The complete and precise description won't fit within this answer since overload resolution is quite complex.
In short: There is a list of all functions by the same name. This is the overload set. If all arguments (operands in case of operator overload) can be converted to the formal parameters of the function, then that function is a viable candidate for the overload resolution. The candidates are ranked by a set of rules. Candidate requiring "less" conversion is ranked higher. If one candidate is unambiguously the most highly ranked candidate, then that overload will be called; otherwise there is an error.
Operator precedence: + has higher precedence than <<, hence the line is parsed as:
(std::cout << ("hello " + s) );
And operator+(const char*,const std::string&) is the one on place 4 here: https://en.cppreference.com/w/cpp/string/basic_string/operator%2B.
Maybe you are a little surprised, because operators are often member functions, which would imply that the left operand has to be the std::string. However, that's not always the case: operators can be free functions.
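The point about free functions can be illustrated with a small sketch. Label and demo() are hypothetical names that do not appear in the question; the two overloads simply mirror the shape of the std::string ones quoted above.

```cpp
#include <cassert>
#include <string>

// Hypothetical class type used only for this illustration.
struct Label
{
    std::string text;
};

// Free function: the class type appears only as the RIGHT operand.
Label operator+(const char* lhs, const Label& rhs)
{
    return Label{std::string(lhs) + rhs.text};
}

// Free function: the class type appears as the LEFT operand.
Label operator+(const Label& lhs, const char* rhs)
{
    return Label{lhs.text + rhs};
}

// The literal decays to const char* and the first overload is selected.
std::string demo()
{
    const Label world{"world"};
    const Label joined = "hello " + world;
    assert(joined.text == "hello world");
    return joined.text;
}
```

A member operator+ could only cover the second case, since a member function always takes the class object as its left operand.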

Overload resolution for char*, char array, and string literals using constexpr, SFINAE and/or type_traits

I have run into an interesting challenge that I have been trying to solve for hours, but after much research and many failed attempts, I find myself asking this question.
I would like to write 3 overloaded functions that each take one of the following types: const char*, const char(&)[N] and string literal (e.g. "BOO"). I understand that a string literal is simply a char array, but please bear with me while I explain my approach.
The two functions below are able to differentiate between the first two types (const char* and const char(&)[N]) thanks to the wrapper class CharPtrWrapper:
#include <cstddef> // std::size_t
#include <iostream>
class CharPtrWrapper
{
public:
CharPtrWrapper(const char* charPtr)
: m_charPtr(charPtr)
{
}
const char * m_charPtr;
};
void processStr(CharPtrWrapper charPtrWrapper)
{
std::cout << "From function that takes a CharPtrWrapper = " << charPtrWrapper.m_charPtr << '\n';
}
template<std::size_t N>
void processStr(const char (&charArr)[N])
{
std::cout << "From function that takes a \"const char(&)[N]\" = " << charArr << '\n';
}
int main()
{
const char* charPtr = "ABC";
processStr(charPtr);
const char charArr[] = {'X', 'Y', 'Z', '\0'};
processStr(charArr);
}
Output:
From function that takes a CharPtrWrapper = ABC
From function that takes a "const char(&)[N]" = XYZ
Now, if I call processStr with a string literal (e.g. processStr("BOO")), the version that takes a const char(&)[N] gets called, which makes sense, since a string literal is simply a char array.
Here is where I reach the crux of the problem. I have not been able to write a function that is able to differentiate between a char array and a string literal. One thing I thought might work was to write a version that takes an rvalue reference:
template<std::size_t N>
void processStr(const char (&&charArr)[N])
{
std::cout << "From function that takes a \"const char(&&)[N]\" = " << charArr << '\n';
}
But it turns out that string literals are lvalues. I have also played with different versions that use std::enable_if and std::is_array, but I still don't get the result I'm looking for.
So I guess my question is the following: is it possible to differentiate between char arrays and string literals in modern C++?
Per [expr.prim.id.unqual]:
[...] The type of the expression is the type of the identifier. The
result is the entity denoted by the identifier. The expression is an
lvalue if the entity is a function, variable, or data member and a
prvalue otherwise; it is a bit-field if the identifier designates a
bit-field ([dcl.struct.bind]).
Therefore, given a declaration
const char arr[] = "foo";
The expression arr is an lvalue of type const char[4].
Per [lex.string]/8:
Ordinary string literals and UTF-8 string literals are also referred
to as narrow string literals. A narrow string literal has type “array
of n const char”, where n is the size of the string as defined
below, and has static storage duration.
And per [expr.prim.literal]:
A literal is a primary expression. Its type depends on its form. A
string literal is an lvalue; all other literals are prvalues.
Therefore, the expression "foo" is an lvalue of type const char[4].
Conclusion: a function is unable to differentiate between a (const) char array and a string literal.
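The conclusion can be checked at compile time with a short sketch. It relies on the fact that decltype((expr)) yields a reference type T& exactly when expr is an lvalue of type T; bound_of and check_bounds are hypothetical helper names introduced here.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

const char arr[] = "foo";

// Both expressions are lvalues of type const char[4], so both give
// the same reference type.
static_assert(std::is_same<decltype(("foo")), const char (&)[4]>::value,
              "a string literal is an lvalue of type const char[4]");
static_assert(std::is_same<decltype((arr)), const char (&)[4]>::value,
              "a named const char array is an lvalue of the same type");

// Hypothetical helper: deduces the array bound. It accepts the literal and
// the named array alike and has no way to tell them apart.
template <std::size_t N>
constexpr std::size_t bound_of(const char (&)[N])
{
    return N;
}

inline void check_bounds()
{
    assert(bound_of("foo") == 4);  // three characters plus the terminating '\0'
    assert(bound_of(arr) == 4);
}
```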

Is it possible to overload the operator + with char strings?

I want to simplify the use of strings, as in Java,
so I can write "count "+6; and get the string "count 6".
With std::string it's possible to concatenate two strings, or a std::string with a char string.
I wrote 2 functions
template<typename T>
inline static std::string operator+(const std::string str, const T gen){
return str + std::to_string(gen);
}
template<typename T>
inline static std::string operator+(const T gen, const std::string str){
return std::to_string(gen) + str;
}
to concatenate std::string with numbers, but cannot write something like "count "+6; because "count " is a const char[] and not a std::string.
It works with std::string("count")+" is "+6+" and it works also with double "+1.234;, but that's not really pretty =)
Is there a way to do the same without starting with std::string("")?
template<typename T>
inline static std::string operator+(const char* str, const T gen){
return str + std::to_string(gen);
}
This method doesn't work, and I get a compiler error:
error: invalid operands of types 'const char*' and 'const char [1]' to binary 'operator+'
No. You cannot overload operators for built-in types only; one of the two types involved in the operation must either be a class type or an enumeration.
What you could do to make things more palatable is construct strings on the fly by using a user-defined literal:
"count"s + 3.1415;
Be aware that this is a C++14 feature that may or may not be supported by your compiler yet.
When overloading operators, at least one of the operands has to be a user type (while types from the std library are considered user types). In other words: Not both operands of operator+ can be builtin types.
Since C++11, there are literal operators available. They make it possible to write
"count "_s
instead of
std::string("count ")
Such an operator is defined like this (the name of the following literal operator is _s; they have to start with an underscore for custom literal operator overloads):
std::string operator ""_s(const char *str, std::size_t len) {
return std::string(str, len);
}
Then, your expression becomes
"count "_s + 6
In C++14, such an operator is already available, and named more conveniently s (the standard may use operator names without a leading underscore), so it becomes
"count "s + 6

C++ Implicit Conversion Operators Precedence

EDIT: Following Mike Seymour's comment, I replaced operator std::string () const; with operator char * () const; and changed the implementation accordingly. This allows implicit casting, but, for some reason, the unsigned long int operator has precedence over the char * operator, which just does not feel right... Also, I don't want to expose nasty C stuff like char * outside the class, when I have std::string. I have a hunch that my CustomizedInt class needs to inherit from some stuff in order to support the feature that I desire. Could anybody please elaborate Mike's comment regarding std::basic_string? I'm not sure I understood it properly.
I have this piece of code:
#include <string>
#include <sstream>
#include <iostream>
class CustomizedInt
{
private:
int data;
public:
CustomizedInt() : data(123)
{
}
operator unsigned long int () const;
operator std::string () const;
};
CustomizedInt::operator unsigned long int () const
{
std::cout << "Called operator unsigned long int; ";
unsigned long int output;
output = (unsigned long int)data;
return output;
}
CustomizedInt::operator std::string () const
{
std::cout << "Called operator std::string; ";
std::stringstream ss;
ss << this->data;
return ss.str();
}
int main()
{
CustomizedInt x;
std::cout << x << std::endl;
return 0;
}
Which prints "Called operator unsigned long int; 123". My questions are these:
After I remove the operator unsigned long int, why do I need to cast x to std::string explicitly? Why does it not call the implicit cast operator (std::string) directly?
Is there any documentation that explains which implicit casts are allowed and which is their order of precedence? It seems that if I add an operator unsigned int to this class together with the operator unsigned long int, I receive a compiler error about ambiguity for the << operator...
Also, I know that defining such an operator may be poor practice, but I am not sure I fully understand the associated caveats. Could somebody please outline them? Would it be better practice to just define public methods ToUnsignedLongInt and ToString?
After I remove the operator unsigned long int, why do I need to cast x to std::string explicitly? Why does it not call the implicit cast operator (std::string) directly?
The version of << for strings is a template, parametrised by the parameters of the std::basic_string template (std::string itself being a specialisation of that template). Its template parameters have to be deduced from the argument, and template argument deduction does not consider user-defined conversions, so it only works if the argument is actually a specialisation of std::basic_string, not something convertible to that.
Is there any documentation that explains which implicit casts are allowed and which is their order of precedence?
The rules are quite complex, and you'd need to read the C++ standard for the full story. Simple rules of thumb are that an implicit conversion sequence can't contain more than one user-defined conversion and (as you've found out) the result of an implicit conversion can't be used to deduce the template arguments of a function template.
I am not sure I fully understand the associated caveats. Could somebody please outline them?
I don't fully understand them either; the interactions between implicit conversions, name lookup and template specialisation (and probably other factors that I can't think of right now) are rather complex, and most people don't have the inclination to learn them all. There are quite a few instances where implicit conversion won't happen, and others where it might happen when you don't expect it; personally, I find it easier just to avoid implicit conversions most of the time.
Would it be better practice to just define public methods ToUnsignedLongInt and ToString?
That's probably a good idea, to avoid unwanted conversions. You can fix your problem by leaving them and use them explicitly when necessary:
std::cout << std::string(x) << std::endl;
In C++11, you can declare them explicit, so that they can only be used in this manner. In my opinion, that would be the best option if you can; otherwise, I would use explicit conversion functions as you suggest.
By the way, the return type of main() must be int, not void.
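The behaviour described in this answer can be reproduced with a compile-time check. This is only a sketch: OnlyString is a hypothetical cut-down version of CustomizedInt that keeps just the std::string conversion, and is_streamable is a helper trait written for this illustration, not a standard facility.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical cut-down CustomizedInt with only the string conversion.
class OnlyString
{
    int data = 123;
public:
    operator std::string() const { return std::to_string(data); }
};

// Helper trait: true when "stream << value" is well-formed for a const T.
template <typename T, typename = void>
struct is_streamable : std::false_type {};

template <typename T>
struct is_streamable<T, decltype(void(std::declval<std::ostream&>()
                                      << std::declval<const T&>()))>
    : std::true_type {};

static_assert(is_streamable<std::string>::value,
              "a real std::string can be streamed");
static_assert(!is_streamable<OnlyString>::value,
              "being convertible to std::string is not enough");

// An explicit conversion produces an actual std::string, which can be streamed.
std::string demo()
{
    const OnlyString x;
    std::ostringstream out;
    out << std::string(x);
    assert(out.str() == "123");
    return out.str();
}
```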

Why is the Visual C++ compiler calling the wrong overload here?

I have a subclass of ostream that I use to define a buffer for formatting. Sometimes I want to create a temporary and immediately insert a string into it with the usual << operator like this:
M2Stream() << "the string";
Unfortunately, the program calls the operator<<(ostream, void *) member overload, instead of the operator<<(ostream, const char *) nonmember one.
I wrote the sample below as a test where I define my own M2Stream class that reproduces the problem.
I think the problem is that the M2Stream() expression produces a temporary and this somehow causes the compiler to prefer the void * overload. But why? This is borne out by the fact that if I make the first argument for the nonmember overload const M2Stream &, I get an ambiguity.
Another strange thing is that it calls the desired const char * overload if I first define a variable of type const char * and then call it, instead of a literal char string, like this:
const char *s = "char string variable";
M2Stream() << s;
It's as if the literal string has a different type than the const char * variable! Shouldn't they be the same? And why does the compiler cause a call to the void * overload when I use the temporary and the literal char string?
#include "stdafx.h"
#include <iostream>
using namespace std;
class M2Stream
{
public:
M2Stream &operator<<(void *vp)
{
cout << "M2Stream bad operator<<(void *) called with " << (const char *) vp << endl;
return *this;
}
};
/* If I make first arg const M2Stream &os, I get
\tests\t_stream_insertion_op\t_stream_insertion_op.cpp(39) : error C2666: 'M2Stream::operator <<' : 2 overloads have similar conversions
\tests\t_stream_insertion_op\t_stream_insertion_op.cpp(13): could be 'M2Stream &M2Stream::operator <<(void *)'
\tests\t_stream_insertion_op\t_stream_insertion_op.cpp(20): or 'const M2Stream &operator <<(const M2Stream &,const char *)'
while trying to match the argument list '(M2Stream, const char [45])'
note: qualification adjustment (const/volatile) may be causing the ambiguity
*/
const M2Stream & operator<<(M2Stream &os, const char *val)
{
cout << "M2Stream good operator<<(const char *) called with " << val << endl;
return os;
}
int main(int argc, char *argv[])
{
// This line calls void * overload, outputs: M2Stream bad operator<<(void *) called with literal char string on constructed temporary
M2Stream() << "literal char string on constructed temporary";
const char *s = "char string variable";
// This line calls the const char * overload, and outputs: M2Stream good operator<<(const char *) called with char string variable
M2Stream() << s;
// This line calls the const char * overload, and outputs: M2Stream good operator<<(const char *) called with literal char string on prebuilt object
M2Stream m;
m << "literal char string on prebuilt object";
return 0;
}
Output:
M2Stream bad operator<<(void *) called with literal char string on constructed temporary
M2Stream good operator<<(const char *) called with char string variable
M2Stream good operator<<(const char *) called with literal char string on prebuilt object
The compiler is doing the right thing: Stream() << "hello"; should use the operator<< defined as a member function. Because the temporary stream object cannot be bound to a non-const reference but only to a const reference, the non-member operator that handles char const* won't be selected.
And it's designed that way, as you see when you change that operator: you get ambiguities, because the compiler can't decide which of the available operators to use, since all of them were designed with rejection of the non-member operator<< for temporaries in mind.
Then, yes, a string literal has a different type than a char const*. A string literal is an array of const characters. But that wouldn't matter in your case, I think. I don't know what overloads of operator<< MSVC++ adds. It's allowed to add further overloads, as long as they don't affect the behavior of valid programs.
For why M2Stream() << s; works even when the first parameter is a non-const reference... Well, MSVC++ has an extension that allows non-const references to bind to temporaries. Set the warning level to 4 to see a warning about that (something like "non-standard extension used...").
Now, because there is a member operator<< that takes a void const*, and a char const* can convert to that, that operator will be chosen and the address will be output as that's what the void const* overload is for.
I've seen in your code that you actually have a void* overload, not a void const* overload. Well, a string literal can convert to char*, even though the type of a string literal is char const[N] (with N being the number of characters you put, plus the terminating null). But that conversion is deprecated. A string literal converting to void* should not be standard; it looks to me like another extension of the MSVC++ compiler. But that would explain why the string literal is treated differently than the char const* pointer. This is what the Standard says:
A string literal (2.13.4) that is not a wide string literal can be converted to an rvalue of type "pointer to char"; a wide string literal can be converted to an rvalue of type "pointer to wchar_t". In either case, the result is a pointer to the first element of the array. This conversion is considered only when there is an explicit appropriate pointer target type, and not when there is a general need to convert from an lvalue to an rvalue. [Note: this conversion is deprecated. See Annex D. ]
The first problem is caused by weird and tricky C++ language rules:
1. A temporary created by a call to a constructor is an rvalue.
2. An rvalue may not be bound to a non-const reference.
3. However, an rvalue object can have non-const methods invoked on it.
What is happening is that ostream& operator<<(ostream&, const char*), a non-member function, attempts to bind the M2Stream temporary you create to a non-const reference, but that fails (rule #2); but ostream& ostream::operator<<(void*) is a member function and therefore can bind to it. In the absence of the const char* function, it is selected as the best overload.
I'm not sure why the designers of the IOStreams library decided to make operator<<() for void* a method but not operator<<() for const char*, but that's how it is, so we have these weird inconsistencies to deal with.
I'm not sure why the second problem is occurring. Do you get the same behaviour across different compilers? It's possible that it's a compiler or C++ Standard Library bug, but I'd leave that as the excuse of last resort -- at least see if you can replicate the behaviour with a regular ostream first.
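The effect of these rules can be seen in a reduced sketch. Sink is a hypothetical miniature of M2Stream that records which overload ran, and the results noted in the comments assume a conforming compiler without the MSVC extension discussed elsewhere in this thread.

```cpp
#include <cassert>
#include <string>

// Hypothetical miniature of M2Stream that remembers which overload was called.
struct Sink
{
    const char* last = "none";

    // Member overload: may be invoked on a temporary.
    Sink& operator<<(const void*)
    {
        last = "member";
        return *this;
    }
};

// Non-member overload: its first parameter is a non-const lvalue reference,
// which cannot bind to a temporary.
Sink& operator<<(Sink& s, const char*)
{
    s.last = "free";
    return s;
}

// Temporary: only the member overload is viable.
std::string on_temporary()
{
    return (Sink() << "text").last;
}

// Named object: both are viable, and const char* is the better match
// for a string literal than const void*.
std::string on_lvalue()
{
    Sink s;
    s << "text";
    return s.last;
}
```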
The problem is that you're using a temporary stream object. Change the code to the following and it will work:
M2Stream ms;
ms << "the string";
Basically, the compiler is refusing to bind the temporary to the non const reference.
Regarding your second point about why it binds when you have a "const char *" object, this I believe is a bug in the VC compiler. I cannot say for certain; however, when you have just the string literal, there is a conversion to 'void *' and a conversion to 'const char *'. When you have the 'const char *' object, then there is no conversion required on the second argument - and this might be a trigger for the non-standard behaviour of VC to allow the non const ref bind.
I believe 8.5.3/5 is the section of the standard that covers this.
I'm not sure that your code should compile. I think:
M2Stream & operator<<( void *vp )
should be:
M2Stream & operator<<( const void *vp )
In fact, looking at the code more, I believe all your problems are down to const. The following code works as expected:
#include <iostream>
using namespace std;
class M2Stream
{
};
const M2Stream & operator<<( const M2Stream &os, const char *val)
{
cout << "M2Stream good operator<<(const char *) called with " << val << endl;
return os;
}
int main(int argc, char *argv[])
{
M2Stream() << "literal char string on constructed temporary";
const char *s = "char string variable";
// This line calls the const char * overload, and outputs: M2Stream good operator<<(const char *) called with char string variable
M2Stream() << s;
// This line calls the const char * overload, and outputs: M2Stream good operator<<(const char *) called with literal char string on prebuilt object
M2Stream m;
m << "literal char string on prebuilt object";
return 0;
}
You could use an overload such as this one:
template <int N>
M2Stream & operator<<(M2Stream & m, char const (& param)[N])
{
// output param
return m;
}
As an added bonus, you now know N to be the length of the array.
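A filled-in version of this overload might look like the sketch below. Buffer is a hypothetical stand-in for M2Stream, and std::size_t is used for N instead of int, which is a choice of this sketch. Note that N counts the terminating '\0', so the sketch assumes the array holds a null-terminated string and appends N - 1 characters.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for M2Stream that just collects the text.
struct Buffer
{
    std::string data;
};

// N is deduced from the array bound and includes the terminating '\0'.
template <std::size_t N>
Buffer& operator<<(Buffer& b, const char (&param)[N])
{
    b.data.append(param, N - 1);  // assumes param is null-terminated
    return b;
}

std::string demo()
{
    Buffer b;
    b << "abc" << "de";
    assert(b.data == "abcde");
    return b.data;
}
```

Because the first parameter is still a non-const reference, this overload does not accept a temporary Buffer on a conforming compiler.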