Creating a compatible String object - c++

So I have an existing library that provides a string type.
It implicitly converts to-from C style strings like so:
struct TypeIDoNotOwn {
TypeIDoNotOwn() {}
TypeIDoNotOwn(TypeIDoNotOwn const&) {}
TypeIDoNotOwn(char const*) {}
TypeIDoNotOwn& operator=(TypeIDoNotOwn const&) {return *this;}
TypeIDoNotOwn& operator=(char const*) {return *this;}
operator char const*() const {return nullptr;}
};
it has other methods, but I do not think they are important. These methods have bodies, but my problem doesn't involve them, so I have stubbed them out.
What I want to do is to create a new type that can be used relatively interchangably with the above type, and with "raw string constants". I want to be able to take an instance of TypeIDoNotOwn, and replace it with TypeIDoOwn, and have code compile.
As an example, this set of operations:
void test( TypeIDoNotOwn const& x ) {}
int main() {
TypeIOwn a = TypeIDoNotOwn();
TypeIDoNotOwn b;
a = b;
b = a;
TypeIOwn c = "hello";
TypeIDoNotOwn d = c;
a = "world";
d = "world";
char const* e = a;
std::pair<TypeIDoNotOwn, TypeIDoNotOwn> f = std::make_pair( TypeIOwn(), TypeIOwn() );
std::pair<TypeIOwn, TypeIOwn> g = std::make_pair( TypeIDoNotOwn(), TypeIDoNotOwn() );
test(a);
}
If I replace TypeIOwn with TypeIDoNotOwn above, it compiles. How do I get it to compile with TypeIOwn without modifying TypeIDoNotOwn? And without having to introduce any casts or changes other than the change-of-type at point of declaration?
My first attempt looks somewhat like this:
struct TypeIOwn {
TypeIOwn() {}
operator char const*() const {return nullptr;}
operator TypeIDoNotOwn() const {return {};}
TypeIOwn( TypeIOwn const& ) {}
TypeIOwn( char const* ) {}
TypeIOwn( TypeIDoNotOwn const& ) {}
TypeIOwn& operator=( char const* ) {return *this;}
TypeIOwn& operator=( TypeIOwn const& ) {return *this;}
TypeIOwn& operator=( TypeIDoNotOwn const& ) {return *this;}
};
but I get a series of ambiguous overloads:
main.cpp:31:4: error: use of overloaded operator '=' is ambiguous (with operand types 'TypeIDoNotOwn' and 'TypeIOwn')
b = a;
~ ^ ~
main.cpp:9:17: note: candidate function
TypeIDoNotOwn& operator=(TypeIDoNotOwn const&) {return *this;}
^
main.cpp:10:17: note: candidate function
TypeIDoNotOwn& operator=(char const*) {return *this;}
and
/usr/include/c++/v1/utility:315:15: error: call to constructor of 'TypeIDoNotOwn' is ambiguous
: first(_VSTD::forward<_U1>(__p.first)),
^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
main.cpp:40:51: note: in instantiation of function template specialization 'std::__1::pair<TypeIDoNotOwn, TypeIDoNotOwn>::pair<TypeIOwn, TypeIOwn>' requested here
std::pair<TypeIDoNotOwn, TypeIDoNotOwn> f = std::make_pair( TypeIOwn(), TypeIOwn() );
^
main.cpp:7:7: note: candidate constructor
TypeIDoNotOwn(TypeIDoNotOwn const&) {}
^
main.cpp:8:7: note: candidate constructor
TypeIDoNotOwn(char const*) {}
^
In my "real" code I have other operators, like += and ==, that have similar problems.
The scope of the real problem is large; millions of lines of code, and I want to swap out TypeIDoNotOwn for TypeIOwn at many thousands of locations, but not at many hundreds of others. And at thousands of locations they interact in a way that causes the conversion ambiguity.
I have solved the problem of a function taking TypeIDoNotOwn& at the 100s of spots where it happens by wrapping it with a macro that creates a temporary object that creates a TypeIDoNotOwn from the TypeIOwn, returns a reference to that, then when the temporary object is destroyed copies it back to the TypeIOwn. I want to avoid having to do a similar sweep to handle ==, +=, =, copy-construction, and similar situations.
Live example.
If I try to remove the operator TypeIDoNotOwn to clear up that ambiguity, other cases where the conversion need occur don't work right (as it requires 2 user-defined constructions to get from TypeIOwn to TypeIDoNotOwn), which then requires an explicit conversion to occur (at many 100s or 1000s of locations)
If I could make one conversion look worse than the other, it would work. Failing that, I could try fixing the non-operator= and copy-construct cases by overloading a free TypeIDoNotOwn == TypeIOwn operator with exact matching (and similar for other cases), but that doesn't get me construction, function calls, and assignment.

With the usual caveats that this is C++ and there's bound to be some clever workaround... no.
Let's go through your use cases. You want both copy initialization and copy assignment to work:
TypeIOwn a = ...;
TypeIDoNotOwn b = a; // (*)
TypeIDoNotOwn c;
c = a; // (*)
That necessitates:
operator TypeIDoNotOwn();
If you just provided operator const char*(), then assignment would work, but copy-initialization would fail. If you provided both, it's ambiguous as there's no way to force one conversion to be preferred to the other (the only real way to force conversion ordering would be to create type hierarchies, but you can't inherit from const char* so you can't really force that to work).
Once we get ourselves down to having just the one conversion function, the only code that doesn't work from the list of examples is:
const char* e = a; // error: no viable conversion
At which point, you'll have to add a member function:
const char* e = a.c_str();
Both pair constructions work fine with the one conversion function. But just by process of elimination, we can't have both.

There's no magic bullet, but you might get some improvement by declaring the conversion from TypeIOwn to TypeIDoNotOwn as explicit.
explicit operator TypeIDoNotOwn() const { return{}; }
This means you do have to make a change at each spot where this happens, but it does resolve the problem with "const char*" being equally valid for assignments. Is it worth the trade-off? You'll have to decide.
However, for incrementally changing the code base, I have had some luck in similar situations using a different strategy. I just set a #define flag and compile using entirely one or the other, and I can continue to code normally with TypeIDoNotOwn, while simultaneously making progress on making everything work with TypeIDoOwn.
#ifdef SOME_FLAG
struct TypeIOwn {...};
typedef TypeIOwn TypeIDoNotOwn;
#else
struct TypeIDoNotOwn {...};
#endif
You will have to test both for every update, until you finally make the plunge.
Since you say this is a string class, also consider the option of moving towards std::string, so your TypeIOwn becomes a thin wrapper for std::string, and no longer provides an implicit conversion to const char*. Instead, provide data(). You no longer have ambiguous conversions from TypeIOwn -> (const char* | TypeIDoNotOwn) -> TypeIDoNotOwn, because like std::string, you no longer allow implicit conversion to const char*, and any work you put into making the code work with this will pay off when you ditch both string classes entirely and use std::string.

Related

How to prevent a temporary on the left hand side of the assignment operator

If I define operator+ for a type, in the usual fashion
struct S {};
S operator+(S const &, S const &) {
return {};
}
users of S can write code like
S s{};
s + s = S{}; // huh
From what I can tell, operator+ returns a temporary object of type S, which is then assigned to. The object then dies at the end of the statement, because there's no name for it, and so the statement is effectively a no-op.
I don't see any use for code like that, so I would like to make that a compile error. Is there a way to do that? Even a warning would be better than nothing.
One simple way to prevent this from happening is to return a constant object:
const S operator+(S const &, S const &) {
return {};
}
That will now result in a compilation error, in this situation, but
s= s + s;
will still work just fine.
This, of course, has a few other ramifications, and may or may not have undesirable side-effects, which may or may not pose an issue, but that would be a new question, here...
Just found what I need here. Apparently I can make the assignment operator only bind to lvalues.
struct S {
S& operator=(S const&) & // only binds to lvalues
{
return *this;
}
};
Now I get exactly the error I want.

more than one operator "[]" matches these operands

I have a class that has both implicit conversion operator() to intrinsic types and the ability to access by a string index operator[] that is used for a settings store. It compiles and works very well in unit tests on gcc 6.3 & MSVC however the class causes some ambiguity warnings on intellisense and clang which is not acceptable for use.
Super slimmed down version:
https://onlinegdb.com/rJ-q7svG8
#include <memory>
#include <unordered_map>
#include <string>
struct Setting
{
int data; // this in reality is a Variant of intrinsic types + std::string
std::unordered_map<std::string, std::shared_ptr<Setting>> children;
template<typename T>
operator T()
{
return data;
}
template<typename T>
Setting & operator=(T val)
{
data = val;
return *this;
}
Setting & operator[](const std::string key)
{
if(children.count(key))
return *(children[key]);
else
{
children[key] = std::shared_ptr<Setting>(new Setting());
return *(children[key]);
}
}
};
Usage:
Setting data;
data["TestNode"] = 4;
data["TestNode"]["SubValue"] = 55;
int x = data["TestNode"];
int y = data["TestNode"]["SubValue"];
std::cout << x <<std::endl;
std::cout << y;
output:
4
55
Error message is as follows:
more than one operator "[]" matches these operands:
built-in operator "integer[pointer-to-object]" function
"Setting::operator[](std::string key)"
operand types are: Setting [ const char [15] ]
I understand why the error/warning exists as it's from the ability to reverse the indexer on an array with the array itself (which by itself is extremely bizarre syntax but makes logical sense with pointer arithmetic).
char* a = "asdf";
char b = a[5];
char c = 5[a];
b == c
I am not sure how to avoid the error message it's presenting while keeping with what I want to accomplish. (implicit assignment & index by string)
Is that possible?
Note: I cannot use C++ features above 11.
The issue is the user-defined implicit conversion function template.
template<typename T>
operator T()
{
return data;
}
When the compiler considers the expression data["TestNode"], some implicit conversions need to take place. The compiler has two options:
Convert the const char [9] to a const std::string and call Setting &Setting::operator[](const std::string)
Convert the Setting to an int and call const char *operator[](int, const char *)
Both options involve an implicit conversion so the compiler can't decide which one is better. The compiler says that the call is ambiguous.
There a few ways to get around this.
Option 1
Eliminate the implicit conversion from const char [9] to std::string. You can do this by making Setting::operator[] a template that accepts a reference to an array of characters (a reference to a string literal).
template <size_t Size>
Setting &operator[](const char (&key)[Size]);
Option 2
Eliminate the implicit conversion from Setting to int. You can do this by marking the user-defined conversion as explicit.
template <typename T>
explicit operator T() const;
This will require you to update the calling code to use direct initialization instead of copy initialization.
int x{data["TestNode"]};
Option 3
Eliminate the implicit conversion from Setting to int. Another way to do this is by removing the user-defined conversion entirely and using a function.
template <typename T>
T get() const;
Obviously, this will also require you to update the calling code.
int x = data["TestNode"].get<int>();
Some other notes
Some things I noticed about the code is that you didn't mark the user-defined conversion as const. If a member function does not modify the object, you should mark it as const to be able to use that function on a constant object. So put const after the parameter list:
template<typename T>
operator T() const {
return data;
}
Another thing I noticed was this:
std::shared_ptr<Setting>(new Setting())
Here you're mentioning Setting twice and doing two memory allocations when you could be doing one. It is preferable for code cleanliness and performance to do this instead:
std::make_shared<Setting>()
One more thing, I don't know enough about your design to make this decision myself but do you really need to use std::shared_ptr? I don't remember the last time I used std::shared_ptr as std::unique_ptr is much more efficient and seems to be enough in most situations. And really, do you need a pointer at all? Is there any reason for using std::shared_ptr<Setting> or std::unique_ptr<Setting> over Setting? Just something to think about.

Clang ambiguity with custom conversion operator

I've been developing a kind of adapter class, when I encountered a problem under clang. When both conversion operators for lvalue-reference and rvalue reference are defined you get an ambiguity compilation error trying to move from your class (when such code should be fine, as
operator const T& () const&
is allowed only for lvalues AFAIK). I've reproduced error with simple example:
#include <string>
class StringDecorator
{
public:
StringDecorator()
: m_string( "String data here" )
{}
operator const std::string& () const& // lvalue only
{
return m_string;
}
operator std::string&& () && // rvalue only
{
return std::move( m_string );
}
private:
std::string m_string;
};
void func( const std::string& ) {}
void func( std::string&& ) {}
int main(int argc, char** argv)
{
StringDecorator my_string;
func( my_string ); // fine, operator std::string&& not allowed
func( std::move( my_string ) ); // error "ambiguous function call"
}
Compiles fine on gcc 4.9+, fails on any clang version.
So the question: is there any workaround? Is my understanding of const& function modifier right?
P.S.: To clarify - the question is about fixing StringDecorator class itself (or finding the workaround for such class as if were a library code). Please refrain providing answers that call operator T&&() directly or specify conversion type explicitly.
The problem comes from the selection of the best viable function. In the case of the second func call, it implies the comparison of 2 user-defined conversion sequence. Unfortunately, 2 user-defined conversion sequence are undistinguishable if they do not use the same user-defined conversion function or constructor C++ standard [over.ics.rank/3]:
Two implicit conversion sequences of the same form are indistinguishable conversion sequences unless one of
the following rules applies:
[...]
User-defined conversion sequence
U1
is a better conversion sequence than another user-defined conversion
sequence
U2
if they contain the same user-defined conversion function or constructor [...]
Because a rvalue can always bind to a const lvalue reference, you will in any case fall on this ambiguous call if a function is overloaded for const std::string& and std::string&&.
As you mentioned it, my first answer consisting in redeclaring all functions is not a solution as you are implementing a library. Indeed it is not possible to define proxy-functions for all functions taking a string as argument!!
So that let you with a trade off between 2 imperfect solutions:
You remove operator std::string&&() &&, and you will loose some optimization, or;
You publicly inherit from std::string, and remove the 2 conversion functions, in which case you expose your library to misuses:
#include <string>
class StringDecorator
: public std::string
{
public:
StringDecorator()
: std::string("String data here" )
{}
};
void func( const std::string& ) {}
void func( std::string&& ) {}
int main(int argc, char** argv)
{
StringDecorator my_string;
func( my_string ); // fine, operator std::string&& not allowed
func( std::move( my_string )); // No more bug:
//ranking of standard conversion sequence is fine-grained.
}
An other solution is not to use Clang because it is a bug of Clang.
But if you have to use Clang, the Tony Frolov answer is the solution.
Oliv answer is correct, as standard seems to be quite clear in this case. The solution I've chosen at a time was to leave only one conversion operator:
operator const std::string& () const&
The problem exists because both conversion operators are considered viable. So this can be avoided by changing type of implicit argument of lvalue conversion operator from const& to &:
operator const std::string& () & // lvalue only (rvalue can't bind to non-const reference)
{
return m_string;
}
operator std::string&& () && // rvalue only
{
return std::move( m_string );
}
But this breaks conversion from const StringDecorator, making its usage awkward in typical cases.
This broken solution led me thinking if there is a way to specify member function qualifier that will make conversion operator viable with const lvalue object, but not with the rvalue. And I've managed to achieve this by specifying implicit argument for const conversion operator as const volatile&:
operator const std::string& () const volatile& // lvalue only (rvalue can't bind to volatile reference)
{
return const_cast< const StringDecorator* >( this )->m_string;
}
operator std::string&& () && // rvalue only
{
return std::move( m_string );
}
Per [dcl.init.ref]/5, for a reference to be initialized by binding to
an rvalue, the reference must be a const non-volatile lvalue
reference, or an rvalue reference:
While lvalue reference and const lvalue reference can bind to const volatile reference. Obviously volatile modifier on member function serves completely different thing. But hey, it works and sufficient for my use-case. The only remaining problem is that code becomes misleading and astonishing.
clang++ is more accurate. Both func overloads are not exact matches for StringDecorator const& or StringDecorator&&. Thus my_string can not be moved. The compiler can not choose between possible transforms StringDecorator&& --> std::string&& --> func(std::string&&) and StringDecorator&& -> StringDecorator& --> std::string& --> func(const std::string&). In other words the compiler can not determine on what step it should apply cast operator.
I do not have g++ installed to check my assumption. I guess it goes to the second way, since my_string can not be moved, it applies the cast operator const& to StringDecorator&. You can check it if you add debug output into bodies of cast operators.

How to avoid `operator[](const char*)` ambiguity?

Consider the following code:
class DictionaryRef {
public:
operator bool() const;
std::string const& operator[](char const* name) const;
// other stuff
};
int main() {
DictionaryRef dict;
char text[256] = "Hello World!";
std::cout << dict[text] << std::endl;
}
This produces the following warning when compiled with G++:
warning: ISO C++ says that these are ambiguous, even though the worst conversion for the first is better than the worst conversion for the second:
note: candidate 1: const string& DictionaryRef::operator[](const char*) const
note: candidate 2: operator[](long int, char*) <built-in>
I know what this means (and the cause is been explained in operator[](const char *) ambiguity) but I'm looking for a way to ensure correct behavior/resolve the warning without changing my class design - since it makes perfect sense for a class to have both a boolean conversion, and a [](const char*) operator.
What's the purpose of operator[](long int, char*) besides generating random compiler warnings? I can't imagine someone writing 1["hello"] in real code.
Although you "can't imagine someone writing 1["hello"] in a real code", this is something that's legal C++, as a consequence of the commutative [] inherited from C. Reasonable or not, that's how the language is defined, and it's unlikely to be changing for us.
The best way to avoid the ambiguity is to add explicit to the boolean conversion - it's very rare that we ever want a non-explicit operator bool().
An alternative is to replace operator bool() with an operator void*(), which will still satisfy boolean tests, but not convert to integer.

Should I still return const objects in C++11? [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
Should I return const objects?
(The original title of that question was: int foo() or const int foo()? explaining why I missed it.)
Effective C++, Item 3: Use const whenever possible. In particular, returning const objects is promoted to avoid unintended assignment like if (a*b = c) {. I find it a little paranoid, nevertheless I have been following this advice.
It seems to me that returning const objects can degrade performance in C++11.
#include <iostream>
using namespace std;
class C {
public:
C() : v(nullptr) { }
C& operator=(const C& other) {
cout << "copy" << endl;
// copy contents of v[]
return *this;
}
C& operator=(C&& other) {
cout << "move" << endl;
v = other.v, other.v = nullptr;
return *this;
}
private:
int* v;
};
const C const_is_returned() { return C(); }
C nonconst_is_returned() { return C(); }
int main(int argc, char* argv[]) {
C c;
c = const_is_returned();
c = nonconst_is_returned();
return 0;
}
This prints:
copy
move
Do I implement the move assignment correctly? Or I simply shouldn't return const objects anymore in C++11?
Returning const objects is a workaround that might cause other problems. Since C++11, there is a better solution for the assignment issue: Reference Qualifiers for member functions. I try to explain it with some code:
int foo(); // function declaration
foo() = 42; // ERROR
The assignment in the second line results in a compile-time error for the builtin type int in both C and C++. Same for other builtin types. That's because the assignment operator for builtin types requires a non-const lvalue-reference on the left hand side. To put it in code, the assignment operator might look as follows (invalid code):
int& operator=(int& lhs, const int& rhs);
It was always possible in C++ to restrict parameters to lvalue references. However, that wasn't possible until C++11 for the implicit first parameter of member functions (*this).
That changed with C++11: Similar to const qualifiers for member functions, there are now reference qualifiers for member functions. The following code shows the usage on the copy and move operators (note the & after the parameter list):
struct Int
{
Int(const Int& rhs) = default;
Int(Int&& rhs) noexcept = default;
~Int() noexcept = default;
auto operator=(const Int& rhs) & -> Int& = default;
auto operator=(Int&& rhs) & noexcept -> Int& = default;
};
With this class declaration, the assignment expression in the following code fragment is invalid, whereas assigning to a local variable works - as it was in the first example.
Int bar();
Int baz();
bar() = baz(); // ERROR: no viable overloaded '='
So there is no need to return const objects. You can restrict the assigment operators to lvalue references, so that everything else still works as expected - in particular move operations.
See also:
What is "rvalue reference for *this"?
Returning a const object by value arguably was never a very good idea, even before C++11. The only effect it has is that it prevents the caller from calling non-const functions on the returned object – but that is not very relevant given that the caller received a copy of the object anyway.
While it is true that being returned a constant object prevents the caller from using it wrongly (e.g. mistakenly making an assignment instead of a comparison), it shouldn't be the responsibility of a function to decide how the caller can use the object returned (unless the returned object is a reference or pointer to structures the function owns). The function implementer cannot possibly know whether the object returned will be used in a comparison or for something else.
You are also right that in C++11 the problem is even graver, as returning a const effectively prevents move operations. (It does not prevent copy/move elision, though.)
Of course it is equally important to point out that const is still extremely useful (and using it no sign of paranoia) when the function returns a reference or a pointer.
The reason that your call to const_is_returned triggers copy constructor rather than move constructor is the fact that move must modify the object thus it can't be used on const objects. I tend to say that using const isn't advised in any case and should be subjected to a programmer judgement, otherwise you get the things you demonstrated. Good question.