Does Ceylon allow explicit Type Casting (downcast)?

I know there's a concept of flow-sensitive typing in Ceylon in which we can narrow down the type of an expression by case. Is there a way to explicitly convert the type of an expression in Ceylon as in Java?

As a statement:
assert(variable is NarrowType);
I don't remember whether there's a recommended way to do that as an expression, but you could always put the above (plus return variable;) into a generic function and call it with your expression.

Related

What does static_cast mean when it's followed by two pairs of parentheses?

What does this statement mean?
return static_cast<Hasher &>(*this)(key);
I can't tell whether *this or key is passed to static_cast. I looked around and found this answer, but unlike what I'm stuck on, there's nothing inside the first pair of parentheses.
The statement is parsed as
return (static_cast<Hasher &>(*this))(key);
So, the argument to static_cast is *this. Then the result of the cast, let's call it x, is used as postfix-expression in a function call with key as argument, i.e. x(key), the result of which is returned.
I can't tell whether *this or key is passed to static_cast
If you're unsure, you can just look up the syntax.
In the informal documentation, the only available syntax for static_cast is:
static_cast < new-type > ( expression )
and the same is true in any standard draft you compare.
So there is no static_cast<T>(X)(Y) syntax, and the only possible interpretation is:
new-type = Hasher&
expression = *this
and the overall statement is equivalent to
Hasher& __tmp = static_cast<Hasher &>(*this);
return __tmp(key);
In the skarupke/flat_hash_map code you linked, the interpretation is that this class has a function call operator inherited from the private base class Hasher, and it wants to call that explicitly, i.e., Hasher::operator() rather than any other inherited operator(). You'll note the same mechanism is used to explicitly call the other privately-inherited function call operators.
It would be more legible if it used a different function name for each of these policy type parameters, but then you couldn't use std::equal_to directly for the Equal parameter, for example.
It might also be more legible if it used data members rather than private inheritance for Hasher, Equal etc. but this way is chosen to allow the empty base-class optimization for stateless policies.

In OCaml Menhir, how to write a parser for C++/Rust/Java-style generics

In C++, a famous parsing ambiguity happens with code like
x<T> a;
If T is a type, then it is what it looks like: a declaration of a variable a of type x<T>. Otherwise, it is (x < T) > a (here < and > are comparison operators, not angle brackets).
In fact, we could make a change that would render this unambiguous: we could make < and > non-associative. Then x < T > a, without parentheses, would not be a valid sentence anyway, even if x, T and a were all variable names.
How could one resolve this conflict in Menhir? At first glance it seems we just can't. Even with the aforementioned modification, we need to look ahead an indeterminate number of tokens before we see another closing > and conclude that it was a template instantiation, or otherwise conclude that it was an expression. Is there any way in Menhir to implement such an arbitrary lookahead?
Different languages (including the ones listed in your title) actually have very different rules for templates/generics (like what type of arguments there can be, where templates/generics can appear, when they are allowed to have an explicit argument list and what the syntax for template/type arguments on generic methods is), which strongly affect the options you have for parsing. In no language that I know is it true that the meaning of x<T> a; depends on whether T is a type.
So let's go through the languages C++, Java, Rust and C#:
In all four of those languages both types and functions/methods can be templates/generic. So we'll not only have to worry about an ambiguity with variable declarations, but also function/method calls: is f<T>(x) a function/method call with an explicit template/type argument, or is it two relational operators with the last operand parenthesized? In all four languages template/generic functions/methods can be called without explicit template/type arguments when those can be inferred, but that inference isn't always possible, so just disallowing explicit template/type arguments for function/method calls is not an option.
Even if a language does not allow relational operators to be chained, we could get an ambiguity in expressions like this: f(a<b, c, d>(e)). Is this calling f with the three arguments a<b, c and d>(e), or with the single argument a<b, c, d>(e), calling a function/method named a with the type/template arguments b, c, d?
Now beyond this common foundation, most everything else is different between these languages:
Rust
In Rust the syntax for a variable declaration is let variableName: type = expr;, so x<T> a; couldn't possibly be a variable declaration because that doesn't match the syntax at all. In addition, it's not a valid expression statement either, because comparison operators can no longer be chained.
So there's no ambiguity here or even a parsing difficulty. But what about function calls? For function calls, Rust avoided the ambiguity by simply choosing a different syntax to provide type arguments: instead of f<T>(x) the syntax is f::<T>(x). Since type arguments for function calls are optional when they can be inferred, this ugliness is thankfully not necessary very often.
So in summary: let a: x<T> = ...; is a variable declaration, f(a<b, c, d>(e)); calls f with three arguments and f(a::<b, c, d>(e)); calls a with three type arguments. Parsing is easy because all of these are sufficiently different to be distinguished with just one token of lookahead.
Java
In Java x<T> a; is in fact a valid variable declaration, but it is not a valid expression statement. The reason for that is that Java's grammar has a dedicated non-terminal for expressions that can appear as an expression statement and applications of relational operators (or any other non-assignment operators) are not matched by that non-terminal. Assignments are, but the left side of assignment expressions is similarly restricted. In fact, an identifier can only be the start of an expression statement if the next token is either a =, ., [ or (. So an identifier followed by a < can only be the start of a variable declaration, meaning we only need one token of lookahead to parse this.
Note that when accessing static members of a generic class, you can and must refer to the class without type arguments (i.e. FooClass.bar(); instead of FooClass<T>.bar()), so even in that case the class name would be followed by a ., not a <.
But what about generic method calls? Something like y = f<T>(x); could still run into the ambiguity because relational operators are of course allowed on the right side of =. Here Java chooses a similar solution as Rust by simply changing the syntax for generic method calls. Instead of object.f<T>(x) the syntax is object.<T>f(x) where the object. part is non-optional even if the object is this. So to call a generic method with an explicit type argument on the current object, you'd have to write this.<T>f(x);, but like in Rust the type argument can often be inferred, allowing you to just write f(x);.
So in summary: x<T> a; is a variable declaration and there can't be expression statements that start with relational operations; in general expressions, this.<T>f(x) is a generic method call and f<T>(x) is a comparison (well, a type error, actually). Again, parsing is easy.
C#
C# has the same restrictions on expression statements as Java does, so variable declarations aren't a problem, but unlike the previous two languages, it does allow f<T>(x) as the syntax for function calls. In order to avoid ambiguities, relational operators need to be parenthesized when used in a way that could also be a valid call of a generic function. So the expression f<T>(x) is a method call and you'd need to add parentheses, f<(T>(x)) or (f<T)>(x), to make it a comparison (though actually those would be type errors because you can't compare booleans with < or >, but the parser doesn't care about that); similarly f(a<b, c, d>(e)) calls a generic method named a with the type arguments b, c, d, whereas f((a<b), c, (d<e)) would involve two comparisons (and you can in fact leave out one of the two pairs of parentheses).
This leads to a nicer syntax for method calls with explicit type arguments than in the previous two languages, but parsing becomes kind of tricky. Considering that in the above example f(a<b, c, d>(e)) we can actually place an arbitrary number of arguments before d>(e) and a<b is a perfectly valid comparison if not followed by d>(e), we actually need an arbitrary amount of lookahead, backtracking or non-determinism to parse this.
So in summary: x<T> a; is a variable declaration, there is no expression statement that starts with a comparison, f<T>(x) is a method call expression, and (f<T)>(x) or f<(T>(x)) would be (ill-typed) comparisons. It is impossible to parse C# with Menhir.
C++
In C++ a < b; is a valid (albeit useless) expression statement, the syntax for template function calls with explicit template arguments is f<T>(x) and a<b>c can be a perfectly valid (even well-typed) comparison. So statements like a<b>c; and expressions like a<b>(c) are actually ambiguous without additional information. Further, template arguments in C++ don't have to be types. That is, Foo<42> x; or even Foo<c> x; where c is defined as const int x = 42;, for example, could be perfectly valid instantiations of the Foo template if Foo is defined to take an integer as a template argument. So that's a bummer.
To resolve this ambiguity, the C++ grammar refers to the rule template-name instead of identifier in places where the name of a template is expected. So if we treated these as distinct entities, there'd be no ambiguity here. But of course template-name is defined simply as template-name: identifier in the grammar, so that seems pretty useless, ... except that the standard also says that template-name should only be matched when the given identifier names a template in the current scope. Similarly it says that identifiers should only be interpreted as variable names when they don't refer to a template (or type name).
Note that, unlike the previous three languages, C++ requires all types and templates to be declared before they can be used. So when we see the statement a<b>c;, we know that it can only be a template instantiation if we've previously parsed a declaration for a template named a and it is currently in scope.
So, if we keep track of scopes while parsing, we can simply use if-statements to check whether the name a refers to a previously parsed template or not in a hand-written parser. In parser generators that allow semantic predicates, we can do the same thing. Doing this does not even require any lookahead or backtracking.
But what about parser generators like yacc or menhir that don't support semantic predicates? For these we can use something known as the lexer hack, meaning we make the lexer generate different tokens for type names, template names and ordinary identifiers. Then we have a nicely unambiguous grammar that we can feed our parser generator. Of course the trick is getting the lexer to actually do that. In order to accomplish that, we need to keep track of which templates and types are currently in scope using a symbol table and then access that symbol table from the lexer. We'll also need to tell the lexer when we're reading the name of a definition, like the x in int x;, because then we want to generate a regular identifier even if a template named x is currently in scope (the definition int x; would shadow the template until the variable goes out of scope).
This same approach is used to resolve the casting ambiguity (is (T)(x) a cast of x to type T or a function call of a function named T?) in C and C++.
So in summary, foo<T> a; and foo<T>(x) are template instantiations if and only if foo is a template. Parsing's a bitch, but possible without arbitrary lookahead or backtracking, and even using Menhir when applying the lexer hack.
AFAIK C++'s template syntax is a well-known example of a real-world non-LR grammar. Strictly speaking, it is not LR(k) for any finite k. So C++ parsers are usually hand-written with hacks (like Clang's) or generated from a GLR grammar (LR with branching). So in theory it is impossible to implement a complete C++ parser in Menhir, which is LR.
However, even the same syntax for generics can differ in parsing difficulty. If generic types and expressions involving comparison operators never appear in the same context, the grammar may still be LR-compatible. For example, consider the Rust syntax for variable declarations (for this part only):
let x : Vec<T> = ...
The : token indicates that a type, rather than an expression, follows, so in this case the grammar can be LR, or even LL (not verified).
So the final answer is, it depends. But for the C++ case it should be impossible to implement the syntax in Menhir.

Why should I use 'static_cast' for numeric casts in C++?

Sometimes, we need to explicitly cast one numeric type to another (e.g. to avoid a warning when we lose precision or range). Given some:
double foo;
we could write:
(float)foo
but most C++ programmers say it's evil 'old-style cast' and you should use the 'new-style cast':
static_cast<float>(foo)
There is an alternative of boost::numeric_cast which is quite interesting (at least in not-performance-critical code) but this question is about static_cast.
A similar question was already asked, but the most important arguments used there are not valid for numeric casts (in my opinion; am I wrong?). Here they are:
You have to explicitly tell the compiler what you want to do. Is it upcast, or downcast? Maybe reinterpret cast?
No. It is a simple numeric cast. There is no ambiguity. If I know how static_cast<float> works, I know how the 'old-style' cast works.
When you write (T)foo you don't know what T is!
I'm not writing (T)foo but (float)foo. I know what float is.
Are there any important reasons for using the new, better casts for numeric types?
In a general scenario (which you have mentioned) you'd want an explicit C++ cast to avoid the possible issues mentioned in When should static_cast, dynamic_cast, const_cast and reinterpret_cast be used? (the ability of the C-style cast to devolve into reinterpret_cast).
In numeric scenario you get two benefits at least:
you don't need to remember that a C-style cast for numerics safely devolves to static_cast and never reaches reinterpret_cast - I'd call it the "ease of use" and "no surprises" part
you can textually search for cast operations (grep 'static_cast<double>' vs grep '(double)', which can also match part of a function signature)
Like many things inherited from C, the more specific C++ variants are there to inform readers of the code, not the compiler.
You use static_cast<double> because it doesn't do all the things that (double) can.
The conversions performed by
a const_cast,
a static_cast,
a static_cast followed by a const_cast,
a reinterpret_cast, or
a reinterpret_cast followed by a const_cast,
can be performed using the cast notation of explicit type conversion.
[expr.cast/4]
Specifying static_cast alerts you with a compile time error, rather than silently having undefined behaviour, if the expression you think is safe isn't.

Does the C++ specification say how types are chosen in the static_cast/const_cast chain to be used in a C-style cast?

This question concerns something I noticed in the C++ spec when I was trying to answer this earlier, intriguing question about C-style casts and type conversions.
The C++ spec talks about C-style casts in §5.4. It says that the cast notation will try the following casts, in this order, until one is found that is valid:
const_cast
static_cast
static_cast followed by const_cast
reinterpret_cast
reinterpret_cast followed by const_cast.
While I have a great intuitive idea of what it means to use a static_cast followed by a const_cast (for example, to convert a const Derived* to a Base* by going through a const_cast<Base*>(static_cast<const Base*>(expr))), I don't see any wording in the spec saying how, specifically, the types used in the static_cast/const_cast series are to be deduced. In the case of simple pointers it's not that hard, but as seen in the linked question the cast might succeed if an extra const is introduced in one place and removed in another.
Are there any rules governing how a compiler is supposed to determine what types to use in the casting chain? If so, where are they? If not, is this a defect in the language, or are there sufficient implicit rules to uniquely determine all the possible casts to try?
If not, is this a defect in the language, or are there sufficient implicit rules to uniquely determine all the possible casts to try?
What about constructing all types that can be cast to the target type using only const_cast, i.e. all "intermediate types"?
Given target type T, if static_cast doesn't work, identify all positions where one could add cv-qualifiers such that the resulting type can be cast back to T by const_cast. Sketch of the algorithm: take the cv-decomposition ([conv.qual]/1) of T; each cv_j can be augmented. If T is a reference, we can augment the referee type's cv-qualification.
Now add const volatile to all these places. Call the resulting type CT. Try static_casting the expression to CT instead. If that works, our cast chain is const_cast<T>(static_cast<CT>(e)).
If this doesn't work, there very probably is no conversion using static_cast followed by const_cast (I haven't delved into the deep corners of overload resolution (well I have, but not for this question)). But we could use brute force to repeatedly remove const/volatiles and check each type if we really wanted. So in theory, there is no ambiguity or underspecification; if there is some cast chain, it can be determined. In practice, the algorithm can be made very simple, because the "most cv-qualified" type we can construct from T is (assuredly) sufficient.

retrieving type returned by function using "typeof" operator in gcc

We can get the type returned by function in gcc using the typeof operator as follows:
typeof(container.begin()) i;
Is it possible to do something similar for functions taking some arguments, but not giving them? E.g. when we have function:
MyType foo(int, char, bool, int);
I want to retrieve this "MyType" (probably using typeof operator) assuming I know only the name of function ("foo") and have no knowledge about arguments it takes. Is it possible?
typeof is a non-standard extension, so don't use it if you want your code to be portable.
Its syntax is typeof(expression), so you need to give it an expression calling the function (whose type is therefore MyType), like this:
typeof(foo(int(),char(),bool(),int()))
In C++ the return value's type is not part of the method signature. Even if there is a way to get the return type of a method, you would have to deal with the possibility of getting multiple methods back and not knowing which one went with the return type you want.
C++0x is going to introduce type inference using the decltype and auto keywords.
In C++ you can use typeof (as suggested by Mike Seymour) or the SFINAE principle.