I understand that the keyword explicit can be used to prevent implicit conversion.
For example
class Foo {
public:
    explicit Foo(int i) {}
};
My question is: under what conditions should implicit conversion be prohibited? Why is implicit conversion harmful?
Use explicit when you would prefer a compiling error.
explicit only matters for constructors that can be called with a single argument (one parameter, or several where all but the first have default values).
You would want to use the explicit keyword anytime that the programmer may construct an object by mistake, thinking it may do something it is not actually doing.
Here's an example:
class MyString
{
public:
    MyString(int size)
        : size(size)
    {
    }

    //... other stuff

    int size;
};
With the following code you are allowed to do this:
int x = 3;
//...
// Lots of code
//...
// Pretend at this point the programmer forgot the type of x and thought it was a string
MyString s = x;
But the caller probably meant to store "3" inside the MyString variable, not to create a string of size 3. It is better to get a compile error so the user can call itoa (or some other conversion function) on x first.
The new version, which produces a compile error for the code above:
class MyString
{
public:
    explicit MyString(int size)
        : size(size)
    {
    }

    //... other stuff

    int size;
};
Compile errors are always better than bugs, because they are immediately visible for you to correct.
It introduces unexpected temporaries:
struct Bar
{
Bar(); // default constructor
Bar( int ); // value constructor with implicit conversion
};
void func( const Bar& );
Bar b;
b = 1; // expands to b.operator=( Bar( 1 ));
func( 10 ); // expands to func( Bar( 10 ));
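For contrast, here is a minimal sketch (Bar2, func2, and demo are just illustrative names) of the same type with the converting constructor marked explicit; the two surprising lines above then fail to compile, and the temporary has to be requested explicitly:
struct Bar2
{
    Bar2() {}                  // default constructor
    explicit Bar2( int ) {}    // conversion from int must now be spelled out
};

void func2( const Bar2& ) {}

void demo()
{
    Bar2 b;
    // b = 1;               // no longer compiles
    // func2( 10 );         // no longer compiles
    b = Bar2( 1 );          // fine: the temporary is created explicitly
    func2( Bar2( 10 ) );    // fine
}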
A real world example:
class VersionNumber
{
public:
    VersionNumber(int major, int minor, int patch = 0, char letter = '\0')
        : mLetter(letter), mPatch(patch), mMinor(minor), mMajor(major) {}

    explicit VersionNumber(uint32 encoded_version)
    {
        memcpy(&mLetter, &encoded_version, 4);
    }

    uint32 Encode() const
    {
        uint32 ret;
        memcpy(&ret, &mLetter, 4);
        return ret;
    }

protected:
    char mLetter;
    uint8 mPatch;
    uint8 mMinor;
    uint8 mMajor;
};
VersionNumber v = 10; would almost certainly be an error, so the explicit keyword requires the programmer to type VersionNumber v(10); and - with a decent IDE - they will notice through the IntelliSense popup that it wants an encoded_version.
Implicit conversion is mostly a problem when it lets code compile (and probably do something strange) in a situation where you didn't intend the conversion and would rather the code simply failed to compile.
For example, pre-C++11 iostreams have a conversion to void *. If you're a bit tired and type something like std::cout << std::cout; it will actually compile -- and produce some worthless result -- typically an 8 or 16 digit hexadecimal number (8 digits on a 32-bit system, 16 digits on a 64-bit system).
At the same time, I feel obliged to point out that a lot of people seem to have gotten an almost reflexive aversion to implicit conversions of any kind. There are classes for which implicit conversions make sense. A proxy class, for example, allows conversion to one other specific type. Conversion to that type is never unexpected for a proxy, because it's just a proxy -- i.e. it's something you can (and should) think of as completely equivalent to the type for which it's a proxy -- except of course that to do any good, it has to implement some special behavior for some sort of specific situation.
For example, years ago I wrote a bounded<T> class that represents an (integer) type that always remains within a specified range. Other than refusing to be assigned a value outside the specified range, it acts exactly like the underlying integer type. It does that (largely) by providing an implicit conversion to int. Just about anything you do with it, it'll act like an int. Essentially the only exception is when you assign a value to it -- then it'll throw an exception if the value is out of range.
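This is not the original class, just a minimal sketch of the idea, with the bounds as template parameters for brevity:
#include <stdexcept>

template <int Low, int High>
class bounded
{
public:
    bounded(int v = Low) { assign(v); }
    bounded& operator=(int v) { assign(v); return *this; }
    operator int() const { return value; }   // implicit conversion: it just acts like an int
private:
    void assign(int v)
    {
        if (v < Low || v > High)
            throw std::out_of_range("bounded: value out of range");
        value = v;
    }
    int value;
};

void demo_bounded()
{
    bounded<0, 100> percent = 50;
    int doubled = percent * 2;   // the implicit conversion makes this just work
    (void)doubled;
    // percent = 150;            // would throw std::out_of_range at run time
}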
It's not harmful for the experienced. It may be harmful for a beginner, or for someone debugging somebody else's code.
"Harmful" is a strong statement. "Not something to be used without thought" is a good one. Much of C++ is that way (though some could argue some parts of C++ are harmful...)
Anyway, the worst part of implicit conversion is that not only can it happen when you don't expect it, it can also combine with standard conversions: as long as an implicit conversion path exists between type Foo and type Bar (at most one user-defined conversion, with standard conversions on either side of it), the compiler will find it and convert along that path - which may have effects you didn't expect.
If the only thing it gains you is not having to type a few characters, it's just not worth it. Being explicit means you know what is actually happening and won't get bit.
To expand Brian's answer, consider you have this:
class MyString
{
public:
    MyString(int size)
        : size(size)
    {
    }

    // ... (assume an operator== that compares two MyStrings, among other things)

    int size;
};
This actually allows this code to compile:
MyString mystr(10);
// ...
if (mystr == 5)
// ... do something
The compiler doesn't have an operator== to compare MyString to an int, but it knows how to make a MyString out of an int, so it looks at the if statement like this:
if (mystr == MyString(5))
That's very misleading since it looks like it's comparing the string to a number. In fact this type of comparison is probably never useful, assuming the MyString(int) constructor creates an empty string. If you mark the constructor as explicit, this type of conversion is disabled. So be careful with implicit conversions - be aware of all the types of statements that it will allow.
I use explicit as my default choice for converting (single parameter or equivalent) constructors. I'd rather have the compiler tell me immediately when I'm converting between one class and another and make the decision at that point if the conversion is appropriate or instead change my design or implementation to remove the need for the conversion completely.
Harmful is a slightly strong word for implicit conversions. It's harmful not so much for the initial implementation, but for maintenance of applications. Implicit conversions allow the compiler to silently change types, especially in parameters to yet another function call - for example automatically converting an int into some other object type. If you accidentally pass an int into that parameter the compiler will "helpfully" silently create the temporary for you, leaving you perplexed when things don't work right. Sure we can all say "oh, I'll never make that mistake", but it only takes one time debugging for hours before one starts thinking maybe having the compiler tell you about those conversions is a good idea.
TL;DR:
Why can't templated functions access the same conversions that non-templated functions can?
#include <cstddef>   // std::nullptr_t
#include <utility>   // std::forward

struct A {
    A(std::nullptr_t) {}
};

template <typename T>
A makeA(T&& arg) {
    return A(std::forward<T>(arg));
}

void foo() {
    A a1(nullptr);   // This works, of course
    A a2(0);         // This works
    A a3 = makeA(0); // This does not
}
Background
I'm trying to write some templated wrapper classes to use around existing types, with the goal of being drop-in replacements with minimal need to rewrite existing code that uses the now-wrapped values.
One particular case I can't get my head around is as follows: we have a class which can be constructed from std::nullptr_t (here called A), and as such, there's plenty of places in the code base where someone has assigned zero to an instance.
However, the wrapper cannot be assigned a zero, despite forwarding the constructors. I have made a very similar example that reproduces the issue without using an actual wrapper class - a simple templated function is sufficient to show the issue.
I would like the syntax of assigning zero to continue to be allowed - it isn't my favourite, but minimising friction when moving to newer code is often a necessity just to get people on board with using it.
I also don't want to add a constructor that takes any int other than zero because that's very much absurd, was never allowed before, and it should continue to be caught at compile time.
If such a thing is not possible, it would satisfy me to find an explanation, because with as much as I know so far, it makes no sense to me.
This example has the same behaviour in VC++ (Intellisense seems to be OK with it though...), Clang, and GCC. Ideally a solution will also work in all 3 (4 with intellisense) compilers.
A more directly applicable example follows:
struct A {
    A() {}
    A(std::nullptr_t) {}
};

template <typename T>
struct Wrapper {
    A a;
    Wrapper(const A& a) : a(a) {}
    template <typename U>   // note: the member template must not reuse the name T
    Wrapper(U&& u) : a(std::forward<U>(u)) {}
    Wrapper() {}
};
void foo2() {
A a1;
a1 = 0; // This works
Wrapper<A> a2;
a2 = 0; //This does not
}
Why has the compiler decided to treat the zero as an int?
Because it is an integer.
The literal 0 is a literal. Literals get to do funny things. A string literal can be converted to a const char* pointing at its first character (its own type is const char[N], where N is the length of the string plus the NUL terminator). The literal 0 gets to do funny things too: it can be used to initialize a pointer, as a null pointer constant. It can be used to initialize an object of type std::nullptr_t. And of course, it can be used to create an integer.
But once it gets passed as a parameter, it can't be a magical compiler construct anymore. It becomes an actual C++ object with a concrete type. And when it comes to template argument deduction, it gets the most obvious type: int.
Once it becomes an int, it stops being a literal 0 and behaves exactly like any other int - unless it is used in a context where the compiler can figure out that it is indeed a literal 0 (as in your A a2(0)) and let it keep its magical properties. Function parameters are never constant expressions, and thus they cannot partake in this.
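A minimal illustration of that distinction (take is a made-up function name):
#include <cstddef>

void take(std::nullptr_t) {}

int main()
{
    take(0);        // fine: the literal 0 is a null pointer constant
    int i = 0;
    // take(i);     // error: an int object is not a null pointer constant
    take(nullptr);  // fine, and does not rely on the literal's special status
}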
See [conv.ptr]/1:
A null pointer constant is an integer literal with value zero or a prvalue of type std::nullptr_t. A null pointer constant can be converted to a pointer type; the result is the null pointer value of that type [...]
So the integer literal 0 can be converted to a null pointer. But if you attempt to convert some other integer value, one that is not a literal, to a pointer type, then the above quote does not apply. In fact there is no other implicit conversion from integer to pointer (no such conversion is listed in [conv.ptr]), so your code fails.
Note: Explicit conversion is covered by [expr.reinterpret.cast]/5.
Think of it in a fashion similar to:
1. The bare name of an array decays to a pointer to its first element, without the need to spell out index 0.
2. toString() in Java makes it possible to use the name of an object where a string is expected, without explicitly calling any method on it.
Now is there a way in C++ to use the name of a class object to refer to its first member?
Consider:
class Program
{
public:
int id;
char *str;
};
void function(int p)
{
//...
}
and then:
Program prog0;
function(prog0); // instead of function(prog0.id)
Any way to "hide" the member reference?
EDIT:
Why was holyBlackCat's answer deleted? I was inclined to vote it as the best answer -- no offense, Mateusz. But he was the first to suggest a conversion operator, and the example was complete and simple.
In C++, such behaviour would be a cataclysm. If I understand correctly, Java tries to convert an object of type A to an object of type B by searching for the first member in A that is of type B or is implicitly convertible to B.
C++ wasn't designed that way. We like to write code that is always predictable. You can achieve what you want, but for a price.
The best solution in this case would be conversion operator - consider:
class Program
{
public:
    int id;
    char *str;

    operator int()
    {
        return this->id;
    }

    //You can have more than one!
    operator const char*()
    {
        return this->str;
    }
};
void function_int(int p)
{
}
void function_str(const char* s)
{
}
Now it is possible to do the following:
Program prog;
function_int(prog); //Equivalent of function_int(prog.id)
function_str(prog); //Equivalent of function_str(prog.str)
The price is that if you add another int member and place it before id, it will not be used in the conversion, because we stated explicitly in our operator that the "int content" of our class is represented by id, and only that member is considered for such a conversion.
However, even this simple example shows some potential problems - overloading functions with integral and pointer types can result in very unpredictable behaviour. When a type contains conversion operators to both pointers and integers, it can get even worse.
Assume, that we have following function:
void func(unsigned long)
{
}
And we call func with an argument of type Program. Which conversion operator would you expect to be called? The compiler knows how to convert Program to either int or const char*, but not to unsigned long. This article on cppreference should help you understand how implicit conversions work.
Also, as Barry pointed out, more meaningless constructs become available. Consider this one:
int x = prog + 2;
What does it mean? It is perfectly valid code, though. That is why conversion operators should be used extremely sparingly (in the pre-C++11 era, the general advice was that every class should have at most one such operator).
Quote from MSDN:
If a conversion is required that causes an ambiguity, an error is generated. Ambiguities arise when more than one user-defined conversion is available or when a user-defined conversion and a built-in conversion exist.
Sometimes, a simple solution to this problem is to mark the conversion operator with the explicit keyword, so you would need to change the above calls to:
function_int((int)prog);
function_str((const char*)prog);
It is not as pretty as the previous form, but much safer. It basically means that the compiler is forbidden from performing any implicit conversion using an operator marked explicit. Very useful for avoiding ambiguous calls while still keeping some flexibility - you can still very easily convert objects of one type to another, but you can be sure when and where these conversions are performed.
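For reference, here is a sketch of what the class might look like with its conversion operators marked explicit (C++11); the cast forms shown above then become the only ones that compile:
class Program
{
public:
    int id;
    char *str;

    explicit operator int() const
    {
        return id;
    }

    explicit operator const char*() const
    {
        return str;
    }
};

// function_int(prog);                              // error: the conversion is explicit
// function_int((int)prog);                         // fine
// function_str(static_cast<const char*>(prog));    // also fine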
However, explicit conversion operators are still not supported by some compilers, as this is a C++11 feature (for example, Visual C++ 11 doesn't support it).
You can read more about the explicit keyword here.
Now is there a way in C++ to use the name of a class object to refer to its first member?
No, C++ doesn't have any reflection, so there's no way to actually determine what the "first member" is.
However, if what you really want is to get an ID for any object, you could just require that object to have that method:
template <typename T>
void function(const T& t) {
int id = t.getID();
// etc.
}
Without knowing more about your use-case, it's hard to know what to propose.
Is there a reason why in c++ std::string is not implicitly converted to bool? For example
std::string s = "";
if (s) { /* s is not empty */ }
as in other languages (e.g. Python). I think it is tedious to use the empty method.
This probably could be added now that C++11 has added the concepts of explicit conversions and contextual conversion.
When std::string was designed, neither of these was present though. That made classes that supported conversion to bool fairly difficult to keep safe. In particular, that conversion could (and would) happen in lots of cases you almost never wanted it to. For example, if we assume std::string converts to false if empty and otherwise to true, then you could use a string essentially anywhere an integer or pointer was intended.
Rather than telling you about the type mismatch, the compiler would convert the string to bool, and then the bool to an integer (false -> 0, true -> 1).
Things like this happened often enough with many early attempts at string types (and there were many) that the committee apparently decided it was better to keep implicit conversions to an absolute minimum (so about the only implicit conversion supported by string is to create a string object from a C-style string).
There were a number of methods devised for handling conversion to bool more safely. One was converting to void * instead, which prevented some problems, but not others (this was used by iostreams). There was also a "safe bool" idiom (actually, more like a "safe bool" theme, of which there were several variations). While these certainly improved control over what conversions would and wouldn't be allowed, most of them involved a fair amount of overhead (a typical safe bool required a base class of ~50 lines of code, plus derivation from that base class, etc.)
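A much-simplified sketch of the old void* trick (the Stream class and all names here are purely illustrative, not the real iostreams code):
class Stream
{
public:
    Stream() : ok(true) {}

    // the old trick: convert to void* instead of bool
    operator void*() const
    {
        return ok ? const_cast<Stream*>(this) : 0;
    }
private:
    bool ok;
};

void takes_pointer(void*) {}

void demo(Stream& s, Stream& t)
{
    if (s) { /* fine: the pointer is tested against null */ }
    // int n = s;             // good: void* does not convert to int
    takes_pointer(s);         // bad: still compiles, almost certainly not intended
    bool strange = (s < t);   // bad: compares two unrelated pointers
    (void)strange;
}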
As to how explicit conversion and contextual conversion would help, the basic idea is pretty simple. You can (starting with C++11) mark a conversion function as explicit, which allows it to be used only where an explicit cast to the target type is used:
struct X {
    explicit operator bool() { return true; }
};

int main() {
    X x;
    bool b1 = static_cast<bool>(x); // compiles
    bool b2 = x; // won't compile
}
Contextual conversion adds a little to let the conversion to bool happen implicitly, but only in something like an if statement, so using a class with the conversion function above, you'd get:
X x;
if (x) {       // allowed
}
int y = x;     // would require an explicit cast to compile
I'd add that complaints about "orthogonality" seem quite inapplicable here. Although convenient, converting a string to a Boolean doesn't really make a lot of sense. If anything, we should complain about how strange it is for string("0") to convert to 1 (in languages where that happens).
This article mentions some reasons why operator bool() can lead to surprising results.
Note that std::string is just a typedef for std::basic_string<char>. There is also std::wstring for wide characters. An implicit conversion to bool would let you write:
std::string foo = "foo";
std::wstring bar = L"bar";
if (foo == bar) {
    std::cout << "This will be printed, because both are true!\n";
}
std::string still has to coexist with C-style strings.
A C-style string is by definition "a contiguous sequence of characters terminated by and including the first null character", and is generally accessed via a pointer to its first character. An expression such as "hello, world" is, in most contexts, implicitly converted to a pointer to the first character. Such a pointer may then be implicitly converted to bool, yielding true if the pointer is non-null, false if it's null. (In C, it's not converted to bool, but it can still be used directly as a condition, so the effect is nearly the same.)
So, due to C++'s C heritage, if you write:
if ("") { ... }
the empty string is already treated as true, and that couldn't easily be changed without breaking C compatibility.
I suggest that having a C-style empty string evaluate as true and a C++ empty std::string evaluate as false would be too confusing.
And writing if (!s.empty()) isn't that difficult (and IMHO it's more legible).
The closest thing to what you (and I) want, that I've been able to find, is the following.
You can define the ! operator for std::string like so:
bool operator!(const std::string& s)
{
return s.empty();
}
This allows you to do:
std::string s;
if (!s) // if |s| is empty
And using a simple negation, you can do:
if (!!s) // if |s| is not empty
That's a little awkward, but the question is, how badly do you want to avoid extra characters? !strcmp(...) is awkward, too, but we still functioned and got used to it, and many of us preferred it because it was faster than typing strcmp(...) == 0.
If anyone discovers a way to do if (s), in C++, please let us know.
Which of the following declarations is standard and preferred?
int x = 7;
or
int x(7);
int x(7);
is not a valid way of declaring and initializing a variable in C.
I suggest you get a good book for learning C and use a good compiler.
In C-like languages you normally use
int x = 7;
You've tagged this question as both C and C++, but the answers aren't really the same for the two.
In C, int x(7); simply doesn't work. It won't even compile. Since this form doesn't work at all in C, the preferred form is int x = 7;.
In C++, int x(7); works -- but you have to be careful, as this form can lead to the "most vexing parse"; if whatever was in the parentheses could be interpreted as a type instead of a value, this would be parsed as declaring a function named x that returned an int instead of defining an int with the value specified in the parentheses. Likewise, if you leave the parens empty: int x(); you end up with a function declaration.
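A quick illustration of both traps (Widget and the variable names are just placeholders):
struct Widget { Widget() {} Widget(int) {} };

int a(7);              // fine: an int with value 7
int b();               // oops: declares a function returning int
Widget w(Widget());    // oops: declares a function taking a (pointer to) function
Widget v((Widget()));  // the extra parentheses make it a variable definition again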
C++ does have another form: int x{7};. Some call this "uniform initialization". It does remove any ambiguity -- anything like T id { x }; where T is a type must be a definition of an id with x as an initializer. Some people dislike this, however, because it introduces somewhat different semantics -- for example, "narrowing" conversions are prohibited in this case, so you can't blindly change existing code to use the new form. For example:
int x(12.34); // no problem
int y{ 12.34 }; // won't compile -- double -> int is a narrowing conversion
This isn't particularly likely to happen with a literal as I've shown above, but something like:
void f(double x) {
    int y{ x };   // narrowing conversion: double -> int
    // ...
}
...is rather more likely -- and still isn't allowed.
Unfortunately, going "back" to the C-style initialization doesn't cure all the possible problems either. At least in theory, it does copy initialization instead of direct initialization, so the type you're initializing must have a copy constructor available to do this initialization. For example:
class T {
    T(T const &) = delete;
public:
    T(int) {}
};

int main() {
    T t = 1;
}
This isn't officially allowed to work, because what it's supposed to do is use T(int) to create a temporary T object, then use T(T const &) to copy-construct t from that temporary. Since we've deleted the copy constructor, it can't (officially) be used. This can be particularly confusing, because nearly all compilers will normally do the job without using the copy constructor at all, so the code will normally compile and work just fine--but the minute you turn on the mode where the compiler tries to follow the standard as closely as possible, the code won't compile at all.
Some people find the changes in "uniform initialization" so off-putting that they recommend against using it at all. Personally, I prefer to use it and simply ensure that I'm not doing any narrowing conversions (or use an explicit cast if a narrowing conversion just can't be avoided).
In general
int x = 7;
is the standard way, and the ONLY way in C.
However, in C++, you can initialize an int using x(7) in cases such as constructor initializer lists, where you have to invoke a 'constructor' for each variable you are initializing that way. For primitives, you do this with the x(7) syntax.
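For example (Counter and its members are just illustrative names):
class Counter
{
public:
    Counter() : count(0), limit(7) {}   // primitives are 'constructed' with the (value) syntax
private:
    int count;
    int limit;
};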
I have various functions with two int arguments (I write both the functions and the calling code myself). I am afraid to confuse the order of argument in some calls.
How can I use type safety to have the compiler warn me or give an error if I call a function with the wrong sequence of arguments (all arguments being int)?
I tried typedefs, but they do not trigger any compiler warnings or errors:
typedef int X; typedef int Y;
void foo(X, Y);

X x; Y y;
foo(y, x); // compiles without warning
You will have to create wrapper classes. Let's say you have two different units (say, seconds and minutes), both of which are represented as ints. You would need something like the following to be completely typesafe:
class Minute
{
public:
    explicit Minute(int m) : myMinute(m) {}
    operator int () const { return myMinute; }
private:
    int myMinute;
};
and a similar class for seconds. The explicit constructor prevents you from accidentally using an int as a Minute, but the conversion operator allows you to use a Minute anywhere you need an int.
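A sketch of how the pair of wrappers catches the mistake (Second and setTimeout are illustrative names; Minute is the class above):
class Second
{
public:
    explicit Second(int s) : mySecond(s) {}
    operator int () const { return mySecond; }
private:
    int mySecond;
};

void setTimeout(Minute m, Second s) {}

void demo()
{
    setTimeout(Minute(5), Second(30));    // fine
    // setTimeout(Second(30), Minute(5)); // error: neither argument implicitly
    //                                    // converts to the other wrapper type
    // setTimeout(5, 30);                 // error: the explicit constructors block plain ints
}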
typedef creates type aliases. As you've discovered, there's no type safety there.
One possibility, depending on what you're trying to achieve, is to use enum. That's not fully typesafe either, but it's closer. For example, you can't pass an int to an enum parameter without casting it.
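A small sketch of the enum approach (the names are made up, and giving the enums a fixed underlying type requires C++11):
enum XCoord : int { };   // fixed underlying type, so any int value is representable
enum YCoord : int { };

void plot(XCoord x, YCoord y) {}

void demo()
{
    plot(XCoord(3), YCoord(4));      // fine: the casts are explicit
    // plot(3, 4);                   // error: int does not implicitly convert to the enums
    // plot(YCoord(4), XCoord(3));   // error: the enum types don't convert to each other

    XCoord x = XCoord(5);
    int n = x;                       // but an enum still converts to int - hence "not fully typesafe"
    (void)n;
}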
Get a post-it note. Write on it, in big letters, "X FIRST! THEN Y!" Stick it to your computer screen. I honestly don't know what else to advise. Using wrapper classes is surely overkill, when the problem can be solved with a post-it and a magic marker.