Casts are used for both type conversion and disambiguation. In further research I found these two as examples :
(double) 3; // conversion
(double) 3.0; // disambiguation
Can someone explain the difference? I don't see any. Is this distinction also valid in C++?
EDIT
Originally the code snippet was like so:
(float) 3; // conversion
(float) 3.0; // disambiguation
But I changed it to double because an unsuffixed floating-point literal has type double, not float, so the question had no meaning as originally written. I hope I interpreted the comments correctly, and I apologize for any answer already posted that became irrelevant after the edit.
The (double) 3 is a conversion from an integer (int) to a floating-point number (double).
The cast in (double) 3.0 is useless; it doesn't do anything, since 3.0 is already a double.
An unsuffixed floating constant has type double.
(ANSI C Standard, §3.1.3.1 Floating constants)
This answer is valid for C, it should be the same in C++.
This is similar to many things asked of programmers in any language. I will give a first example, which is different from what you are asking, but should illustrate better why this would appear wherever you've seen it.
Say I define a constant variable:
static const int a = 5;
Now, what is 'a'? Let's try again...
static const int max_number_of_buttons = 5;
This is explicit variable naming. It is much longer, but in the long run it is likely to save your ass. In C++ you have another potential problem in that regard: the naming of member variables versus local and parameter variables. All three should use a different scheme so that when you read your C++ function you know exactly what it is doing. Here is a small function which tells you nothing about its variables:
void func(char a)
{
char b;
b = a;
p = b * 3;
g = a - 7;
}
With proper naming conventions you would know whether p and g are local variables, parameters to the function, member variables, or global variables. (This function is very small so you have an idea; imagine a function of 1,000 lines of code [I've seen those]. After a couple of pages you have no idea what's what, and often you will shadow variables in ways that are really hard to debug...)
void func(char a)
{
char b;
b = a;
g_p = b * 3;
f_g = a - 7;
}
My personal convention is to prefix global variables with g_ and member variables with f_. At this point I do not distinguish local and parameter variables, although you could write p_a instead of just a for the parameter, and then you would know the kind of every variable at a glance.
Okay, so that makes sense in regard for disambiguation of variable names, although in your case you specifically are asking about types. Let's see...
Ada is known for its very strong typing of variables. For example:
type a is 1 .. 100;
type b is 1 .. 100;
A: a;
B: b;
A := 5;
B := A;
The last line does NOT compile. Even though type a and type b look alike, the compiler views them as two different numeric types. In other words, it is quite explicit. To make it work in Ada, you have to convert A as follows:
B := B'(A);
The same in C/C++, you declare two types and variables:
typedef int a;
typedef int b;
a A;
b B;
A = 5;
B = A;
That works as-is in C/C++ because type a and type b are considered to be exactly the same type (int, even though you gave them the names a and b). It is clearly NOT explicit. So writing something like:
A = (b) 5;
would explicitly tell you that you view that 5 as of type b and convert it (implicitly) to type a in the assignment. You could also write it in this way to be fully explicit:
A = a(b(5));
Most C/C++ programmers will tell you that this is silly, which is why we have so many bugs in our software, unfortunately. Ada protects you against mixing carrots and bananas because even if both are defined as integers they both are different types.
Now there is a way to mitigate that problem in C/C++, albeit one that is pretty much never used: you can use a class for each kind of object, including numbers. Then you'd have a specific type for each different concept (because variables of class A and class B cannot just be mixed together, at least not unless you allow it by adding functions for the purpose). Very frankly, even I do not do that, because it would be way too much work to write a C++ program of any length...
So, as Juri said: 3.0 and (double) 3.0 are exactly the same thing, and the (double) cast is redundant (a pleonasm: you say the same thing twice). Yet it may help the person who comes after you see that you really meant for that number to be a double and not whatever the language says it could eventually be.
Related
In the code base, I often see people writing
void f(unsigned int) { /* some stuff */ } // define f here
f(0); // call f later
unsigned int a = 0;
double d = 0;
An initializer (0 here) that does not match the declared type annoys me. If this is performance-critical code, does this kind of initialization hurt performance?
EDIT
For the downvoters:
I did search before I posted the question. C++ is famous for hidden rules; I didn't know the integer promotion rules until someone commented under this question. For people saying "even a stupid compiler won't do the conversion at run time", here is an example from this answer:
float f1(float x) { return x*3.14; }
float f2(float x) { return x*3.14F; }
These two versions have different performance, and to an inexperienced C++ programmer there is not much visible difference between my question and this example. C++ is famous for hidden rules and pitfalls, which means intuition sometimes is not right, so why is a question like this getting downvoted?
You are right that 0 is an int, not a double (0. would be) nor an unsigned int. This does not matter here though as int is implicitly convertible to those types and 0 can be represented perfectly by both of them. There's no performance implication either; the compiler does it all at compile time.
if this is performance critical code, does this kind of initialization hurt performance?
Very unlikely. Even without any optimization enabled, there is no reason for a compiler to generate code that takes the original value and converts it to the variable's type at run time, rather than initializing the variable with an already-converted representation of that type (if the conversion is necessary at all). This may, however, affect your code if you use the new style that some people recommend in modern C++:
auto a = 0U; // if you do not specify U then variable type would be signed int
Whether to use this style or not is a subjective question.
For your addition:
float f1(float x) { return x*3.14; }
float f2(float x) { return x*3.14F; }
this case is quite different. In the first case, x is promoted to double, the multiplication is carried out in double, and the result is then converted back to float; in the second case, a float is multiplied by a float. The difference is significant: it determines whether a compile-time constant is merely converted from one type to another, or whether the run-time calculation itself is done in the wider type.
I'm not sure if this is a proper programming question, but it's something that has always bothered me, and I wonder if I'm the only one.
When initially learning C++, I understood the concept of references, but pointers had me confused. Why, you ask? Because of how you declare a pointer.
Consider the following:
#include <cstddef> // for NULL

void foo(int* bar)
{
}

int main()
{
    int x = 5;
    int* y = NULL;
    y = &x;
    *y = 15;
    foo(y);
}
The function foo(int*) takes an int pointer as a parameter. Since I've declared y as an int pointer, I can pass y to foo. But when first learning C++ I associated the * symbol with dereferencing, so I figured a dereferenced int needed to be passed. I would try to pass *y into foo, which obviously doesn't work.
Wouldn't it have been easier to have a separate operator for declaring a pointer? (or for dereferencing). For example:
void test(int# x)
{
}
In The Development of the C Language, Dennis Ritchie explains his reasoning thusly:
The second innovation that most clearly distinguishes C from its predecessors is this fuller type structure and especially its expression in the syntax of declarations... given an object of any type, it should be possible to describe a new object that gathers several into an array, yields it from a function, or is a pointer to it.... [This] led to a declaration syntax for names mirroring that of the expression syntax in which the names typically appear. Thus,

int i, *pi, **ppi;

declare an integer, a pointer to an integer, a pointer to a pointer to an integer. The syntax of these declarations reflects the observation that i, *pi, and **ppi all yield an int type when used in an expression.

Similarly, int f(), *f(), (*f)(); declare a function returning an integer, a function returning a pointer to an integer, a pointer to a function returning an integer. int *api[10], (*pai)[10]; declare an array of pointers to integers, and a pointer to an array of integers.

In all these cases the declaration of a variable resembles its usage in an expression whose type is the one named at the head of the declaration.

An accident of syntax contributed to the perceived complexity of the language. The indirection operator, spelled * in C, is syntactically a unary prefix operator, just as in BCPL and B. This works well in simple expressions, but in more complex cases, parentheses are required to direct the parsing. For example, to distinguish indirection through the value returned by a function from calling a function designated by a pointer, one writes *fp() and (*pf)() respectively. The style used in expressions carries through to declarations, so the names might be declared

int *fp(); int (*pf)();

In more ornate but still realistic cases, things become worse: int *(*pfp)(); is a pointer to a function returning a pointer to an integer.

There are two effects occurring. Most important, C has a relatively rich set of ways of describing types (compared, say, with Pascal). Declarations in languages as expressive as C—Algol 68, for example—describe objects equally hard to understand, simply because the objects themselves are complex. A second effect owes to details of the syntax. Declarations in C must be read in an `inside-out' style that many find difficult to grasp. Sethi [Sethi 81] observed that many of the nested declarations and expressions would become simpler if the indirection operator had been taken as a postfix operator instead of prefix, but by then it was too late to change.
The reason is clearer if you write it like this:
int x, *y;
That is, both x and *y are ints. Thus y is an int *.
That is a language decision that predates C++, as C++ inherited it from C. I once heard that the motivation was that the declaration and the use would be equivalent, that is, given a declaration int *p; the expression *p is of type int in the same way that with int i; the expression i is of type int.
Because the committee, and those that developed C++ in the decades before its standardisation, decided that * should retain its original three meanings:
A pointer type
The dereference operator
Multiplication
You're right to suggest that the multiple meanings of * (and, similarly, &) are confusing. I've been of the opinion for some years that they are a significant barrier to understanding for language newcomers.
Why not choose another symbol for C++?
Backwards compatibility is the root cause: it was better to re-use existing symbols in a new context than to break C programs by turning characters that were previously not operators into new operators.
Why not choose another symbol for C?
It's impossible to know for sure, but there are several arguments that can be — and have been — made. Foremost is the idea that:
when [an] identifier appears in an expression of the same form as the declarator, it yields an object of the specified type. {K&R, p216}
This is also why C programmers tend to[citation needed] prefer aligning their asterisks to the right rather than to the left, i.e.:
int *ptr1; // roughly C-style
int* ptr2; // roughly C++-style
though both varieties are found in programs of both languages, varyingly.
Page 65 of Expert C Programming: Deep C Secrets includes the following: And then, there is the C philosophy that the declaration of an object should look like its use.
Page 216 of The C Programming Language, 2nd edition (aka K&R) includes: A declarator is read as an assertion that when its identifier appears in an expression of the same form as the declarator, it yields an object of the specified type.
I prefer the way van der Linden puts it.
Haha, I feel your pain, I had the exact same problem.
I thought a pointer should be declared as &int because it makes sense that a pointer is an address of something.
After a while I thought for myself, every type in C has to be read backwards, like
int * const a
is for me
a constant something, when dereferenced equals an int.
Something that has to be dereferenced, has to be a pointer.
In the book Coders at Work (p355), Guy Steele says of C++:
I think the decision to be
backwards-compatible with C is a fatal
flaw. It’s just a set of difficulties
that can’t be overcome. C
fundamentally has a corrupt type
system. It’s good enough to help you
avoid some difficulties but it’s not
airtight and you can’t count on it
What does he mean by describing the type system as "corrupt"?
Can you demonstrate with a simple example in C?
Edit:
The quote sounds polemic, but I'm not trying to be. I simply want to understand what he means.
Please give examples in C not C++. I'm interested in the "fundamentally" part too :)
The obvious examples of non-type-safety in C simply come from the fact that a void * converts to any object pointer type without an explicit cast.
#include <stdio.h>

struct X
{
    int x;
};

struct Y
{
    double y;
};

int main(void)
{
    struct X xx;
    xx.x = 1;
    void * vv = &xx;
    struct Y * yy = vv; /* no need to cast explicitly */
    printf( "%f\n", yy->y ); /* undefined behavior: reads the X as a Y */
    return 0;
}
Of course printf itself is not exactly typesafe.
C++ is not totally typesafe.
#include <iostream>

struct Base
{
    int b;
};

struct Derived : Base
{
    int d;
    Derived()
    {
        b = 1;
        d = 3;
    }
};

int main()
{
    Derived derivs[50];
    Base * bb = &derivs[0];
    std::cout << bb[3].b << std::endl; // steps by sizeof(Base), so this reads a d
}
It has no problem converting the Derived* to a Base*, but you run into trouble when you try to use the Base* as an array: the pointer arithmetic is done in units of Base, so although all the b values are 1 you may well get a 3 (the ints in memory go 1-3-1-3, etc.).
Basically you can cast any data type to any data type
struct SomeStruct {
void* data;
};
struct SomeStruct object;
*( (int*) &object ) = 10;
and no one catches you.
char buffer[42];
FunctionThatDestroysTheStack(buffer); // By writing 43 chars or more
The C type system does have some problems. Things like implicit function declaration and implicit conversion from void* can SILENTLY break type safety.
C++ fixes pretty much all of these holes. The C++ type system is NOT backwards compatible with C, it's only compatible with well-written typesafe C code.
Furthermore, the people arguing against C++ typically point you to Java or C# as the "solution". Yet Java and C# do have holes in their type system (array covariance). C++ doesn't have this problem.
EDIT: Examples, in C++, attempting to use array covariance that would (improperly) be allowed by the Java and C# type systems.
#include <stdlib.h>
struct Base {};
struct Derived : Base {};
template<size_t N>
void func1( Base (&array)[N] );
void func2( Base** pArray );
void func3( Base*& refArray );
void test1( void )
{
Base b[40];
Derived d[40];
func1(b); // ok
func1(d); // error caught by C++ type system
}
void test2( void )
{
Base* b[40] = {};
Derived* d[40] = {};
func2(b); // ok
func2(d); // error caught by C++ type system
func3(b[0]); // ok
func3(d[0]); // error caught by C++ type system
}
Results:
Comeau C/C++ 4.3.10.1 (Oct 6 2008 11:28:09) for ONLINE_EVALUATION_BETA2
Copyright 1988-2008 Comeau Computing. All rights reserved.
MODE:strict errors C++ C++0x_extensions
"ComeauTest.c", line 19: error: no instance of function template "func1" matches
the argument list
The argument types that you used are: (Derived [40])
func1(d); // error caught by C++ type system
^
"ComeauTest.c", line 28: error: argument of type "Derived **" is incompatible with
parameter of type "Base **"
func2(d); // error caught by C++ type system
^
"ComeauTest.c", line 31: error: a reference of type "Base *&" (not const-qualified)
cannot be initialized with a value of type "Derived *"
func3(d[0]); // error caught by C++ type system
^
3 errors detected in the compilation of "ComeauTest.c".
This doesn't mean that there are no holes at all in the C++ type system, but it does show that you can't silently overwrite a pointer-to-Derived with a pointer-to-Base like Java and C# allow.
You'd have to ask him what he meant to get a definitive answer, or perhaps provide more context for that quote.
However, it is pretty clear that if this is a fatal flaw for C++, the disease is chronic, not acute - C++ is thriving, and continually evolving as evidenced by ongoing Boost and C++0x efforts.
I don't even think about C and C++ as coupled any more - a few weeks on the respective fora here quickly cures one of any confusion over the fact that they are two different languages, each with its own strengths and foibles.
IMHO the "most broken" part of the C type system is that the concepts of
values/parameters that are optional
mutable values/pass-by-reference
arrays
non-POD function parameters
are all mapped to the single language concept "pointer". That means that if you get a function parameter of type X*, it might be an optional parameter; it might be expected that the function changes the value pointed to by X*; it might be that there are multiple instances of X after the one pointed to (how many is left open: the number could be passed as a separate parameter, or some kind of special "terminator" value might mark the end of the array, as in nul-terminated strings). Or the parameter might simply be a single structure that you're not expected to change, but that is cheaper to pass by reference.
If you get something of type X**, it might be an array of optional values, or it might be an array of simple values and you're expected to change it. Or it might be a 2d jagged array. Or an optional value passed by reference.
In contrast, take the ML family of languages (F#, OCaML, SML). Here these concepts map to separate language constructs:
values that are optional have the type X option
values that are mutable/pass by reference have the type X ref
arrays have the type X array
and non-POD types can be passed like PODs. Because they aren't mutable, the compiler can pass them by reference internally, but you don't need to know about that implementation detail
And you can of course combine those, i.e. int optional ref is a mutable value that can be set to nothing or to some integer value. int ref optional, on the other hand, is an optional mutable value; it can be nothing (and no one can change it), or it can be some mutable int (and you can change it to any other int, but not to nothing).
These distinctions are very subtle, but you have to make them whether you program in ML or not. In C you have to make the same distinctions, but they're not explicitly stated in the type system. You have to read the documentation very carefully, or you might introduce subtle (read: hard-to-find) bugs if you misunderstand which kind of pointer usage is meant.
Here, "corrupt" means that it is not "strict", leading to never-ending delight in C++ (because of the many custom types (objects) and overloaded operators, casting becomes a major nuisance in C++).
The attack against C comes in regard to its MISPLACED USAGE as a strict OOP basis.
C has never been designed to limit coders, hence, maybe the frustration of Academia (and the flamboyant splendor of the ++ given to the World by B.S.).
"I invented the term Object-Oriented, and I can tell you I did not have C++ in mind"
(Alan Kay)
I'm relatively new to C++ (about one year of experience, on and off). I'm curious about what led to the decision of type * name as the syntax for defining pointers. It seems to me that the syntax should be type & name as the & symbol is used everywhere else in code to refer to the variable's memory address. So, to use the traditional example of int pointers:
int a = 1;
int * b = &a;
would become
int a = 1;
int & b = &a;
I'm sure there's some reason for this that I'm just not seeing, and I'd love to hear some input from C++ veterans.
Thanks,
-S
C++ adopts the C syntax. As revealed in "The Development of the C Language" (by Dennis Ritchie) C uses * for pointers in type declarations because it was decided that type syntax should follow use.
For each object of [a compound type], there was already a way to mention the underlying object: index the array, call the function, use the indirection operator [*] on the pointer. Analogical reasoning led to a declaration syntax for names mirroring that of the expression syntax in which the names typically appear. Thus,
int i, *pi, **ppi;
declare an integer, a pointer to an integer, a pointer to a pointer to an integer. The syntax of these declarations reflects the observation that i, *pi, and **ppi all yield an int type when used in an expression.
Here's a more complex example:
int *(*foo)[4][];
This declaration means an expression *(*foo)[4][0] has type int, and from that (and the fact that [] has higher precedence than unary *) you can decode the type: foo is a pointer to an array of size 4 of arrays of pointers to ints.
This syntax was adopted in C++ for compatibility with C. Also, don't forget that C++ has a use for & in declarations.
int & b = a;
The above line declares a reference variable referring to another variable of type int. The difference between a reference and a pointer, roughly, is that a reference is initialized only once, you cannot change what it refers to, and it is always dereferenced automatically.
int x = 5, y = 10;
int& r = x;
int sum = r + y; // you do not need to say '*r' automatically dereferenced.
r = y; // legal, but it does NOT rebind 'r': it assigns y's value to x; 'r' refers to x for its entire life
I think that Dennis Ritchie answered this in The Development of the C Language:
For each object of such a composed type, there was already a way to mention the underlying object: index the array, call the function, use the indirection operator on the pointer. Analogical reasoning led to a declaration syntax for names mirroring that of the expression syntax in which the names typically appear. Thus,

int i, *pi, **ppi;

declare an integer, a pointer to an integer, a pointer to a pointer to an integer. The syntax of these declarations reflects the observation that i, *pi, and **ppi all yield an int type when used in an expression. Similarly,

int f(), *f(), (*f)();

declare a function returning an integer, a function returning a pointer to an integer, a pointer to a function returning an integer;

int *api[10], (*pai)[10];

declare an array of pointers to integers, and a pointer to an array of integers. In all these cases the declaration of a variable resembles its usage in an expression whose type is the one named at the head of the declaration.
So we use type * var to declare a pointer because this allows the declaration to mirror the usage (dereferencing) of the pointer.
In this article, Ritchie also recounts that in "NB", an extended version of the "B" programming language, he used int pointer[] to declare a pointer to an int, as opposed to int array[10] to declare an array of ints.
If you are a visual thinker, it may help to imagine the asterisk as a black hole leading to the data value. Hence, it is a pointer.
The ampersand is the opposite end of the hole, think of it as an unraveled asterisk or a spaceship wobbling about in an erratic course as the pilot gets over the transition coming out of the black hole.
I remember being very confused by C++ overloading the meaning of the ampersand, to give us references. In their desperate attempt to avoid using any more characters, which was justified by the international audience using C and known issues with keyboard limitations, they added a major source of confusion.
One thing that may help in C++ is to think of references as pre-prepared dereferenced pointers. Rather than using &someVariable when you pass in an argument, you've already used the trailing ampersand when you defined someVariable. Then again, that might just confuse you further!
One of my pet hates, which I was unhappy to see promulgated in Apple's Objective-C samples, is the layout style int *someIntPointer instead of int* someIntPointer
IMHO, keeping the asterisk with the variable is an old-fashioned C approach emphasizing the mechanics of how you define the variable, over its data type.
The data type of someIntPointer is literally a pointer to an integer and the declaration should reflect that. This does lead to the requirement that you declare one variable per line, to avoid subtle bugs such as:
int* a, b; // b is a straight int, was that our intention?
int *a, *b; // old-style C declaring two pointers
int* a;
int* b; // b is another pointer to an int
Whilst people argue that the ability to declare mixed pointers and values on the same line, intentionally, is a powerful feature, I've seen it lead to subtle bugs and confusion.
Your second example is not valid C code, only C++ code. The difference is that one is a pointer, whereas the other is a reference.
On the right-hand side the '&' always means address-of. In a definition it indicates that the variable is a reference.
On the right-hand side the '*' always means value-at-address. In a definition it indicates that the variable is a pointer.
References and pointers are similar, but not the same. This article addresses the differences.
Instead of reading int* b as "b is a pointer to int", read it as int *b: "*b is an int". Then, you have & as an anti-*: *b is an int. The address of *b is &*b, or just b.
I think the answer may well be "because that's the way K&R did it."
K&R are the ones who decided what the C syntax for declaring pointers was.
It's not int & x; instead of int * x; because that's the way the language was defined by the guys who made it up -- K&R.
C++ is mostly a superset of C, but not always. In particular, while enumeration values in both C and C++ implicitly convert into int, the reverse isn't true: only in C do ints convert back into enumeration values. Thus, bitflags defined via enumeration declarations don't work correctly. Hence, this is OK in C, but not in C++:
typedef enum Foo
{
Foo_First = 1<<0,
Foo_Second = 1<<1,
} Foo;
int main(void)
{
Foo x = Foo_First | Foo_Second; // error in C++
return 0;
}
How should this problem be handled efficiently and correctly, ideally without harming the debugger-friendly nature of using Foo as the variable type (it decomposes into the component bitflags in watches etc.)?
Consider also that there may be hundreds of such flag enumerations, and many thousands of use-points. Ideally some kind of efficient operator overloading would do the trick, but it really ought to be efficient; the application I have in mind is compute-bound and has a reputation of being fast.
Clarification: I'm translating a large (>300K) C program into C++, so I'm looking for an efficient translation in both run-time and developer-time. Simply inserting casts in all the appropriate locations could take weeks.
Why not just cast the result back to a Foo?
Foo x = Foo(Foo_First | Foo_Second);
EDIT: I didn't understand the scope of your problem when I first answered this question. The above will work for doing a few spot fixes. For what you want to do, you will need to define a | operator that takes 2 Foo arguments and returns a Foo:
Foo operator|(Foo a, Foo b)
{
return Foo(int(a) | int(b));
}
The int casts are there to prevent undesired recursion.
It sounds like an ideal application for a cast - it's up to you to tell the compiler that yes, you DO mean to instantiate a Foo with a random integer.
Of course, technically speaking, Foo_First | Foo_Second isn't one of the named enumerators of Foo (though the value is within the enum's range of representable values).
Either leave the result as an int or static_cast:
Foo x = static_cast<Foo>(Foo_First | Foo_Second); // not an error in C++