C++ user-defined conversion operators without classes?

In C++ is it possible to define conversion operators which are not class members? I know how to do that for regular operators (such as +), but not for conversion operators.
Here is my use case: I work with a C Library which hands me out a PA_Unichar *, where the library defines PA_Unichar to be a 16-bit int. It is actually a string coded in UTF-16. I want to convert it to a std::string coded in UTF-8. I have all the conversion code ready and working, and I am only missing the syntactic sugar that would allow me to write:
PA_Unichar *libOutput = theLibraryFunction();
std::string myString = libOutput;
(usually in one line without the temp variable).
Also worth noting:
I know that std::string doesn't define implicit conversion from char* and I know why. The same reason might apply here, but that's beside the point.
I do have a ustring, subclass of std::string that defines the right conversion operator from PA_Unichar*. It works but this means using ustring variables instead of std::string and that then requires conversion to std::string when I use those strings with other libraries. So that doesn't help very much.
Using an assignment operator doesn't work as those must be class members.
So is it possible to define implicit conversion operators between two types you don't control (in my case PA_Unichar* and std::string), which may or may not be class types?
If not what could be workarounds?

What's wrong with a free function?
std::string convert(PA_Unichar *libOutput);
std::string myString = convert(theLibraryFunction());
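For reference, a minimal sketch of what such a free function could look like. The asker already has working conversion code, so this body is only an illustration. It assumes PA_Unichar is an unsigned 16-bit type (the typedef below is a stand-in for the library's own), that the input is NUL-terminated, and that unpaired surrogates should become U+FFFD. The helper append_utf8 is a hypothetical name.

```cpp
#include <cstdint>
#include <string>

// Assumption: stand-in for the library's typedef, described in the question as a 16-bit int.
typedef std::uint16_t PA_Unichar;

// Appends one Unicode code point to out, encoded as UTF-8 (hypothetical helper).
inline void append_utf8(std::string& out, std::uint32_t cp) {
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Converts a NUL-terminated UTF-16 string to a UTF-8 std::string.
inline std::string convert(const PA_Unichar* utf16) {
    std::string out;
    if (!utf16) return out;
    for (; *utf16; ++utf16) {
        std::uint32_t cp = *utf16;
        if (cp >= 0xD800 && cp <= 0xDBFF) {            // high surrogate
            std::uint32_t low = utf16[1];              // at worst the terminator
            if (low >= 0xDC00 && low <= 0xDFFF) {      // valid pair
                cp = 0x10000 + ((cp - 0xD800) << 10) + (low - 0xDC00);
                ++utf16;
            } else {
                cp = 0xFFFD;                           // unpaired high surrogate
            }
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            cp = 0xFFFD;                               // unpaired low surrogate
        }
        append_utf8(out, cp);
    }
    return out;
}
```

The call site then stays a one-liner, std::string myString = convert(theLibraryFunction());, and the conversion is visible to whoever reads the code.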
Edit answering to the comment:
As DrPizza says: Everybody else is trying to plug the holes opened up by implicit conversions by replacing them with the explicit conversions that you call "visual clutter".
As to the temporary string: Just wait for the next compiler version. It's likely to come with rvalue references, and its std::string implementation will implement move semantics on top of that, which eliminates the copy. I have yet to see a cheaper way to speed up your code than by simply upgrading to a new compiler version.

I don't think you can define "global" conversion operators. The standard says that conversion functions are special member functions. If a bit of syntactic sugar is acceptable, I would propose the following:
struct mystring : public string
{
mystring(PA_Unichar * utf16_string)
{
// All string functionality is present in your mystring.
// All what you have to do is to write the conversion process.
string::operator=("Hello World!");
string::push_back('!');
// any string operation...
}
};
Be aware that polymorphic use of this class is broken. As long as you never delete an object of it through a pointer of type string*, though, you are on the safe side. So this code is fine:
mystring str(....);
As said before, the following code is broken!
string* str = new mystring(....);
....
delete str; // undefined behavior: std::string doesn't define a virtual destructor,
// so the "mystring" part of the object is not destroyed properly

Implicit conversions are the devil, anyway. Make it explicit with a converting function call.

No, you can't. What you could do as an alternative is to create a conversion constructor in the target class (not your case, as you want to convert to std::string, unless you derive from it). But I agree with the other answers: I think an implicit conversion is not recommended in this case, especially because you're not converting from an object but from a pointer. Better to have a free function; your code will be easier to understand, and the next programmer to inherit the code will surely thank you.

Related

Passing a class with conversion function for const char* through a variadic function like printf

I have a class SpecialString. It has an operator overload / conversion function it uses any time it's passed off as a const char*. It then returns a normal c-string.
class SpecialString
{
...
operator char* () const { return mCStr; }
...
};
This used to work a long time ago (literally 19 years ago) when I passed these directly into printf(). The compiler was smart enough to know that the argument was meant to be a char* and it used the conversion function, but now g++ complains.
SpecialString str1("Hello"), str2("World");
printf("%s %s\n", str1, str2);
error: cannot pass object of non-POD type 'SPECIALSTRING' (aka 'SpecialString') through variadic method; call will abort at runtime [-Wnon-pod-varargs]
Is there any way to get this to work again without changing the code? I can add a deref operator overload function that returns the c-string and pass the SpecialString objects around like this.
class SpecialString
{
...
operator char* () const { return mCStr; }
char* operator * () const { return mCStr; }
...
};
SpecialString str1("Hello"), str2("World");
printf("%s %s\n", *str1, *str2);
But I'd prefer not to because this requires manually changing thousands of lines of code.
You could disable the warning, if you don't want to be informed about it... but that's a bad idea.
The behaviour of the program is undefined; you should fix it, and that requires changing the code. You can use the existing conversion operator with static_cast, or you can use your unary * operator idea, which is going to be terser.
Even less change would be required if you used unary + instead which doesn't require introducing an overload, since it will invoke the implicit conversion instead. That may add some confusion to the reader of the code though.
Since you don't want to modify the existing code, you can write a "hack" instead. More specifically, a bunch of overloads to printf() that patch the existing code.
For example:
int printf(const char* f, const SpecialString& a, const SpecialString& b)
{
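// Call std::printf (from <cstdio>): an unqualified printf here could resolve back to this overload if SpecialString is implicitly constructible from const char*.
return std::printf(f, static_cast<const char*>(a), static_cast<const char*>(b));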
}
With this function declared in your header, every call to printf() with those specific parameters will use this function instead of the "real" printf() you're familiar with, and perform the needed conversions.
I presume you have quite a few combinations of printf() calls in your code involving SpecialString, so you may have to write a bunch of different overloads, and this is ugly af to say the least, but it does fit your requirement.
As mentioned in another comment, it has always been undefined behavior that happens to work in your case.
With Microsoft's CString class, it seems the undefined behavior was relied upon so widely (as it happened to work) that the layout is now defined in a way that keeps it working. See How can CString be passed to format string %s?.
In our code base, whenever I modify a file I try to fix the code so that it does the conversion explicitly by calling GetString().
There are a few things you could do:
Fix existing code everywhere you get the warning.
In that case, a named function like c_str or GetString is preferable to a conversion operator, because it avoids explicit casting (for example static_cast or, even worse, a C-style cast such as (const char *)). The deref operator might be an acceptable compromise.
Use some formatting library
<iostream>
fmt: https://github.com/fmtlib/fmt
many other choices (search C++ formatting library or something similar)
Use a variadic template function so that the conversion can be done before the arguments reach printf.
If you only use a few types (int, double, string) and rarely more than 2 or 3 parameters, defining overloads might also be a possibility.
Not recommended: Hack your class so that it works again.
Have you made any change to your class definition that caused it to break, or did you only upgrade the compiler version or change compiler options?
Such a hack relies on undefined behavior, so you must figure out how your compiler works, and the code won't be portable.
For it to work, the class must have the size of a pointer and the data itself must be compatible with a pointer. Essentially, the data must consist of a single pointer (no v-table or other stuff).
Side note: I think one should avoid defining one's own string class. In most cases, the standard C++ string should be used (or string view). If you need additional functions, I would recommend writing stand-alone functions in a namespace like StringUtilities, for example. That way, you avoid converting back and forth between your own string and the standard string (or some library string like MFC, Qt or something else).
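The variadic template option from the list above could be sketched roughly like this. Everything here is illustrative: SpecialString is a stand-in class, and to_printf_arg and checked_snprintf are hypothetical names. The idea is that each argument is passed through an overload set that turns a SpecialString into a const char* and leaves every other type alone, so nothing of class type ever reaches the C variadic function.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical stand-in for the asker's class.
class SpecialString {
public:
    explicit SpecialString(const char* s) : text_(s) {}
    operator const char*() const { return text_.c_str(); }
private:
    std::string text_;
};

// Ordinary argument types are passed through unchanged
// (string literals decay to const char* here).
template <typename T>
T to_printf_arg(T value) { return value; }

// A SpecialString is converted to its C string before the variadic call.
inline const char* to_printf_arg(const SpecialString& s) { return s; }

// Wrapper with printf-style formatting; snprintf is used so the result
// can be inspected, a printf version would look the same.
template <typename... Args>
int checked_snprintf(char* buf, std::size_t size,
                     const char* format, const Args&... args) {
    return std::snprintf(buf, size, format, to_printf_arg(args)...);
}
```

Existing call sites would still need the function name changed (or a macro), so this does not avoid touching the code, but the edit is mechanical.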

Is there a way to use only the object name of a class as a "default" member?

Think in a similar fashion like:
1. The bare name of an array is equivalent with the pointer to the first element, without the need to specify index 0.
2. toString() from Java makes it possible to use the name of an object as a string without calling any object method.
Now is there a way in C++ to use the name of a class object to refer to its first member?
Consider:
class Program
{
public:
int id;
char *str;
};
void function(int p)
{
//...
}
and then:
Program prog0;
function(prog0); // instead of function(prog0.id)
Any way to "hide" the member reference?
EDIT:
Why was the holyBlackCat's answer deleted? I was inclining to vote it as the best answer -- no offense, Mateusz. But he was the first to suggest conversion operator and the example was complete and simple.
In C++, such behaviour would be a cataclysm. If I understand correctly, Java tries to convert object of type A to object of type B by searching for first member in A, that is of type B or is implicitly convertible to B.
C++ wasn't designed that way. We like to write code, that is always predictable. You can achieve what you want, but for a price.
The best solution in this case would be conversion operator - consider:
class Program
{
public:
int id;
char *str;
operator int()
{
return this->id;
}
//You can have more than one!
operator const char*()
{
return this->str;
}
};
void function_int(int p)
{
}
void function_str(const char* s)
{
}
Now it is possible to do the following:
Program prog;
function_int(prog); //Equivalent of function_int(prog.id)
function_str(prog); //Equivalent of function_str(prog.str)
The price is, that if you add another int and place it before id it will not be used in conversion, because we stated in our operator explicitly, that "int content" of our class is represented by id and this member is considered when it comes to such conversion.
However, even this simple example shows some potential problems - overloading functions with integral and pointer types could result in very unpredictable behavior. When type contains conversion operators to both pointers and integers, it can get even worse.
Assume, that we have following function:
void func(unsigned long)
{
}
And we call func with argument of type Program. Which conversion operator would you expect to be called? Compiler knows how to convert Program to either int or const char*, but not unsigned long. This article on cppreference should help you to understand how implicit conversions work.
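To make the question concrete, here is a small sketch (the aggregate initialization, the const on the operators and the call_with_program helper are my additions). A const char* has no implicit conversion to unsigned long, so only operator int is usable; its result then undergoes an ordinary integral conversion.

```cpp
class Program {
public:
    int id;
    const char* str;
    operator int() const { return id; }
    operator const char*() const { return str; }
};

// Returns its argument so the chosen conversion can be observed.
inline unsigned long func(unsigned long value) { return value; }

// Program -> int (user-defined conversion) -> unsigned long (integral conversion).
inline unsigned long call_with_program() {
    Program prog = {42, "name"};
    return func(prog);
}
```

Note that a negative id would silently wrap around to a huge unsigned value, which is exactly the kind of surprise being discussed.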
Also, as Barry pointed out, more meaningless constructs become available. Consider this one:
int x = prog + 2;
What does it mean? It is perfectly valid code, though. That is why conversion operators should be used extremely carefully (in the pre-C++11 era, the general advice was that every class should have at most one such operator).
Quote from MSDN:
If a conversion is required that causes an ambiguity, an error is generated. Ambiguities arise when more than one user-defined conversion is available or when a user-defined conversion and a built-in conversion exist.
Sometimes, simple solution to this problem is to mark conversion operator with explicit keyword, so you would need to change above calls to:
function_int((int)prog);
function_str((const char*)prog);
It is not as pretty as the previous form, but much safer. It basically means, that compiler is forbidden to perform any implicit conversion using operator marked as explicit. Very useful to avoid ambiguous calls, while still providing some flexibility in code - you can still very easily convert objects of one type to another, but you can be sure when and where these conversions are performed.
However, explicit conversion operators are still not supported by some compilers, as this is a C++11 feature (for example, Visual C++ 11 doesn't support it).
You can read more about explicit keyword here.
Now is there a way in C++ to use the name of a class object to refer to its first member?
No, C++ doesn't have any reflection, so there's no way to actually determine what the "first member" is.
However, if what you really want is to get an ID for any object, you could just require that object to have that method:
template <typename T>
void function(const T& t) {
int id = t.getID();
// etc.
}
Without knowing more about your use-case, it's hard to know what to propose.

Are implicit conversions good or bad in modern C++?

In this proposal:
N3830 Scoped Resource - Generic RAII Wrapper for the Standard Library
a scoped_resource RAII wrapper is presented.
On page 4, there is some code like this:
auto hFile = std::make_scoped_resource(
...
);
...
// cast operator makes it seamless to use with other APIs needing a HANDLE
ReadFile(hFile, ...);
The Win32 API ReadFile() takes a raw HANDLE parameter, instead hFile is an instance of scoped_resource, so to make the above code work, there is an implicit cast operator implemented by scoped_resource.
But, wasn't the "modern" recommendation to avoid these kind of implicit conversions?
For example, ATL/MFC CString has an implicit conversion (cast operator) to LPCTSTR (const char/wchar_t*, i.e. a raw C-string pointer), instead STL strings have an explicit c_str() method.
Similarly, smart pointers like unique_ptr have an explicit get() method to access the underlying wrapped pointer; and that recommendation against implicit conversion seems also present in this blog post:
Reader Q&A: Why don’t modern smart pointers implicitly convert to *?
So, are these implicit conversions (like ATL/MFC CString and the newly proposed scoped_resource) good or not for modern C++?
From a coding perspective, I'd say that being able to simply directly pass a RAII wrapper - be it CString or scoped_resource - to a C API expecting a "raw" parameter (like a raw C string pointer, or raw handle), relying on implicit conversions, and without calling some .GetString()/.get() method, seems very convenient.
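To illustrate the two styles side by side, here is a minimal sketch of a RAII wrapper offering both an explicit get() and an implicit conversion. It is not the N3830 interface and not Win32: RawHandle, raw_open, raw_close, raw_read and ScopedHandle are all hypothetical names invented for the example.

```cpp
// Hypothetical stand-ins for a C API that works on raw handles.
typedef int RawHandle;
inline int& open_count() { static int n = 0; return n; }
inline RawHandle raw_open() { ++open_count(); return 7; }
inline void raw_close(RawHandle) { --open_count(); }
inline int raw_read(RawHandle h) { return h * 2; }

// RAII wrapper: closes the handle when it goes out of scope.
class ScopedHandle {
public:
    explicit ScopedHandle(RawHandle h) : handle_(h) {}
    ~ScopedHandle() { raw_close(handle_); }
    ScopedHandle(const ScopedHandle&) = delete;
    ScopedHandle& operator=(const ScopedHandle&) = delete;

    // Explicit access, in the style of unique_ptr::get() or c_str().
    RawHandle get() const { return handle_; }
    // Implicit access, in the style of CString's conversion to LPCTSTR.
    operator RawHandle() const { return handle_; }
private:
    RawHandle handle_;
};
```

With this, raw_read(h) and raw_read(h.get()) both work. The cost of the implicit form shows up when the raw handle is an arithmetic type, as here: an expression like h + 1 also compiles.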
Below is quote from C++ Primer 5th edition.
Caution: Avoid Overuse of Conversion Functions
As with using overloaded operators, judicious use of conversion operators can
greatly simplify the job of a class designer and make using a class easier.
However, some conversions can be misleading. Conversion operators are
misleading when there is no obvious single mapping between the class type
and the conversion type.
For example, consider a class that represents a Date. We might think it
would be a good idea to provide a conversion from Date to int. However,
what value should the conversion function return? The function might return
a decimal representation of the year, month, and day. For example, July 30, 1989 might be represented as the int value 19890730. Alternatively, the conversion operator might return an int representing the number of days that have elapsed since some epoch point, such as January 1, 1970. Both these conversions have the desirable property that later dates correspond to larger integers, and so either might be useful.
The problem is that there is no single one-to-one mapping between an
object of type Date and a value of type int. In such cases, it is better not
to define the conversion operator. Instead, the class ought to define one or
more ordinary members to extract the information in these various forms.
So, what I can say is that in practice, classes should rarely provide conversion operators. Too often users are more likely to be surprised if a conversion happens automatically than to be helped by the
existence of the conversion. However, there is one important exception to this rule of
thumb: It is not uncommon for classes to define conversions to bool.
Under earlier versions of the standard, classes that wanted to define a conversion to
bool faced a problem: Because bool is an arithmetic type, a class-type object that is
converted to bool can be used in any context where an arithmetic type is expected.
Such conversions can happen in surprising ways. In particular, if istream had a
conversion to bool, the following code would compile:
int i = 42;
cin << i; // this code would be legal if the conversion to bool were not explicit!
You should use explicit conversion operators. Below is a little example:
class SmallInt
{
private:
    int val;
public:
    // constructors and other members
    SmallInt(int i = 0) : val(i) {} // non-explicit converting constructor
    explicit operator int() const { return this->val; }
};
... and in the program:
int main()
{
SmallInt si = 3; // ok: the SmallInt constructor is not explicit
si + 3; // error: implicit conversion is required, but operator int is explicit
static_cast<int>(si) + 3; // ok: explicitly request the conversion
return 0;
}
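The conversion to bool mentioned in the quote behaves a little differently from other explicit conversions, so here is a short sketch of it. Source and describe are hypothetical names. An explicit operator bool can still be used directly in a condition (a "contextual" conversion), but it does not take part in ordinary implicit conversions.

```cpp
#include <type_traits>

// Hypothetical stream-like type that reports whether it is in a good state.
class Source {
public:
    explicit Source(bool ok) : ok_(ok) {}
    explicit operator bool() const { return ok_; }
private:
    bool ok_;
};

// An if condition converts contextually, so no cast is needed here.
inline int describe(const Source& s) {
    if (s) return 1;
    return 0;
}

// No implicit conversion: "bool b = source;" or "source << 1" would not compile.
static_assert(!std::is_convertible<Source, bool>::value,
              "implicit conversion to bool is disabled");
// Explicit requests such as static_cast<bool>(source) are still allowed.
static_assert(std::is_constructible<bool, Source>::value,
              "explicit conversion to bool is available");
```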
I think implicit conversions are good/convenient for the application programmer, but bad for library developers as they introduce complexity and other issues: A good example is the interplay between implicit conversion and template deduction.
From a library perspective, it is easier to constrain language features to a minimal set. For that, you could even say that library is easier to maintain if you remove some of the overloading and default parameters. It's less convenient for application programmers, but there is less ambiguity and complexity.
So, it's really a choice between convenience and simplicity.
It highly depends on exact conversion type, but, in general, there are some areas where implicit conversions may introduce problems, whereas explicit rather may not.
Generally, doing things explicitly in programming is the safer way.
There is a good source of information on C++ in general and on implicit and explicit conversions in particular.

In what situations is using operator cast more useful than confusing?

In C++, the use of operator cast can confuse readers of your code because it is not obvious that a function call is being invoked. For that reason, I've seen its use discouraged.
However, under what circumstances would using operator cast be appropriate and have value which exceeds any possible confusion it might lead to?
When the conversion is natural and has no side effects it can be useful. Nobody is going to argue that an automatic conversion from int to double is inappropriate for example, even if you can come up with a corner case that makes it confusing (and I'm not sure anybody can).
I've found the conversion from Microsoft's CString to const char * to be incredibly handy, even though I know others disagree. I wouldn't mind seeing a similar capability in std::string.
Operator casts are very useful in a C++ idiom of wrapper objects. For example, suppose you have some copy-on-write implementation of a string class. You want your users to be able to index it naturally, like
const String s = "abc";
assert(s[0] == 'a');
// given
char String::operator[](int) const
So far, you'd think this would work. Yet what happens when someone wants to modify your string? Perhaps this will work, then?
String s = "abc";
s[0] = 'z';
assert(s[0] == 'z');
// given
char & String::operator[](int)
But this implementation gives a reference to a non-const character. So someone can always use that reference to modify the string. So, before it hands out the reference, it has to perform a copy of the string internally, so that other strings won't be modified. Thus it's not possible to use operator[] on non-const strings without forcing a copy. What to do?
Instead of returning a character reference, you can return a wrapper object with following interface:
class CharRef {
public:
operator char() const;
CharRef & operator=(char);
};
The char() conversion operator simply returns a copy of the character stored in the string. When you assign to the wrapper, though, the operator=(char) will force the string to perform an internal copy if the reference count is >1, and modify that copy instead.
The wrapper's implementation may, for example, hold the char and a pointer to the string (probably some subpart of the string's implementation).
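A minimal sketch of that idea, under some assumptions of mine: the shared buffer is a std::shared_ptr<std::string>, the proxy holds a reference to its owner plus an index (rather than the char itself), and shares_buffer_with is a hypothetical helper that exists only to make the sharing observable. It is not thread-safe and not meant as a production string.

```cpp
#include <cstddef>
#include <memory>
#include <string>

class String {
public:
    String(const char* s) : data_(std::make_shared<std::string>(s)) {}

    // Proxy returned by the non-const operator[].
    class CharRef {
    public:
        CharRef(String& owner, std::size_t index) : owner_(owner), index_(index) {}
        // Reading: just return a copy of the character, no buffer copy.
        operator char() const { return (*owner_.data_)[index_]; }
        // Writing: detach from other owners first, then modify.
        CharRef& operator=(char c) {
            owner_.detach();
            (*owner_.data_)[index_] = c;
            return *this;
        }
    private:
        String& owner_;
        std::size_t index_;
    };

    char operator[](std::size_t i) const { return (*data_)[i]; }
    CharRef operator[](std::size_t i) { return CharRef(*this, i); }

    // Hypothetical helper: true while both strings use the same buffer.
    bool shares_buffer_with(const String& other) const { return data_ == other.data_; }

private:
    // Copy the buffer only if someone else is still using it.
    void detach() {
        if (data_.use_count() > 1) data_ = std::make_shared<std::string>(*data_);
    }
    std::shared_ptr<std::string> data_;
};
```

Reading through s[0] on a non-const string no longer forces a copy; only an assignment through the proxy does.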

Is this a cast or a construction?

I'm a little confused after reading something in a textbook. Regarding the code:
void doSomeWork(const Widget& w)
{
//Fun stuff.
}
doSomeWork(Widget(15));
doSomeWork() takes a const Widget& parameter. The textbook, Effective C++ III, states that this creates a temporary Widget object to pass to doSomeWork. It says that this can be replaced by:
doSomeWork(static_cast<Widget>(15));
as both versions are casts - the first is just a function-style C cast apparently. I would have thought that Widget(15) would invoke a constructor for widget taking one integer parameter though.
Would the constructor be executed in this case?
In C++ this kind of expression is a form of a cast, at least syntactically. I.e. you use a C++ functional cast syntax Widget(15) to create a temporary object of type Widget.
Even when you construct a temporary using a multi-argument constructor (as in Widget(1, 2, 3)) it is still considered a variation of functional cast notation (see 5.2.3)
In other words, your "Is this a cast or a construction" question is incorrectly stated, since it implies mutual exclusivity between casts and "constructions". They are not mutually exclusive. In fact, every type conversion (be that an explicit cast or something more implicit) is nothing else than a creation ("construction") of a new temporary object of the target type (excluding, maybe, some reference initializations).
BTW, functional cast notation is a chiefly C++ notation. C language has no functional-style casts.
Short: Yes.
Long:
You can always test those things yourself, by doing e.g.:
#include <iostream>
struct W
{
W( int i )
{
std::cout << "W(" << i << ")\n";
}
};
int main(int argc, const char *argv[])
{
W w(1);
W(2);
static_cast<W>(3);
}
which is outputting
W(1)
W(2)
W(3)
Yes, it is both :). A cast is a syntactic construct (i.e. something you type). In this case, a constructor is invoked as a consequence of the cast. Much like a constructor would be invoked as a consequence of typing
Widget w(15);
Both Widget(15) and static_cast<Widget>(15) are casts, or conversion
operators, if you prefer. Both create a new object of the designated
type, by converting 15 into a Widget. Since 15 doesn't have any
conversion operators, the only way to do this conversion is by
allocating the necessary memory (on the stack) and calling the
appropriate constructor. This is really no different than double(15)
and static_cast<double>(15), except that we usually don't think of
double as having a constructor (but the resulting double is a new
object, distinct from the 15, which has type int).
You said:
the first is just a function-style C cast apparently
The first would not compile in C, it's not C-style. C-style looks like (Widget)15. Here, the temporary object is created, using Widget::Widget(int).
Therefore, it is not a C-style cast.
Yes, of course. Any constructor that can be called with a single argument, and that is not marked explicit, is considered a converting constructor. Your constructor takes a single int parameter, so the compiler can "implicitly" call it to match the argument (the value 15, which is an int).
There is a simple trick to prevent such implicit conversions: just put the keyword explicit before your constructor.
Check this for more information.
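A short sketch of the difference, using two hypothetical Widget variants (StrictWidget and the int return value of doSomeWork are my additions so the effect can be checked):

```cpp
#include <type_traits>

// Converting constructor: an int converts to Widget implicitly.
struct Widget {
    Widget(int v) : value(v) {}
    int value;
};

// Explicit constructor: the conversion has to be requested.
struct StrictWidget {
    explicit StrictWidget(int v) : value(v) {}
    int value;
};

inline int doSomeWork(const Widget& w) { return w.value; }
inline int doSomeWork(const StrictWidget& w) { return w.value; }

static_assert(std::is_convertible<int, Widget>::value,
              "int converts implicitly to Widget");
static_assert(!std::is_convertible<int, StrictWidget>::value,
              "int does not convert implicitly to StrictWidget");
```

Here doSomeWork(15) quietly builds a temporary Widget, because the StrictWidget overload is not even a candidate for an int argument. Both StrictWidget(15) and static_cast<StrictWidget>(15) still work, since those forms ask for the conversion explicitly.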
Yeeeah. You can replace
Widget(15)
with
static_cast<Widget>(15)
Because the compiler will replace it right back :D
When you cast an int to Widget, the compiler looks for Widget::Widget(int) and calls it there.