What's the purpose of std::to_integer?

As far as I know, std::to_integer<T> is equivalent to T(value) where value is a variable having type std::byte.
I looked into some implementations from the major compilers and found that in this case equivalent literally means implemented as. In other words, most of the time to_integer is actually implemented as:
return T(value);
And that's all.
What I don't understand is what's the purpose of such a function?
Ironically, the cons seem to outweigh the pros: I have to include a whole header for such a function, just to avoid a C-style cast that is most likely inlined directly anyway.
Is there any other reason for it, or is it really just a nicer-looking alternative to a C-style cast and nothing more?

is it really just a nicer-looking alternative to a C-style cast and nothing more?
You say that as though it's some trivial detail.
Casts are dangerous. It's easy to cast something to the wrong type, and often compilers won't stop you from doing exactly that. Furthermore, because std::byte is not an integral type in C++, working with numerical byte values often requires a fair amount of casting. Having a function that explicitly converts to integers makes for a safer user experience.
For example, float(some_byte) is perfectly legal, while to_integer<float>(some_byte) is explicitly forbidden. to_integer<T> requires that T is an integral type.
to_integer is a safer alternative.
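For a concrete, if tiny, illustration (my example, not taken from the answer): std::byte deliberately has no arithmetic or stream operators, so even printing its numeric value needs an explicit conversion.
#include <cstddef>
#include <iostream>

int main() {
    std::byte b{0x2A};
    // std::cout << b;                            // error: no operator<< for std::byte
    std::cout << std::to_integer<int>(b) << '\n'; // prints 42
}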
I have to include a whole header for such a function
If by "whole header" you mean the same header you got std::byte from, which is therefore already included by definition...

std::to_integer<T>(some_byte) is equivalent to T(some_byte) if it actually compiles. T(some_byte) is equivalent to the unsafe C-style cast of (T)some_byte, which can do scary things. On the other hand, std::to_integer is appropriately constrained to only work when it is safe:
This overload only participates in overload resolution if std::is_integral_v<IntegerType> is true.
If T is not actually an integer type, then rather than risking undefined behavior, the code simply won't compile. Likewise, if some_byte is not actually a std::byte, the code won't compile.

Beyond the expression of intent and safety issues already mentioned, I get the idea from the committee discussion on the paper that it’s meant to be like std::to_string and might have more overloads in the future.

A C style cast is not equivalent to std::to_integer<T>. See the below example.
std::to_integer<T> only participates in overload resolution if std::is_integral_v<T> is true.
#include <cstddef>
#include <iostream>

template <typename T>
auto only_return_int_type_foo(std::byte& b)
{
    return std::to_integer<T>(b);
}

template <typename T>
auto only_return_int_type_bar(std::byte& b)
{
    return T(b);
}

int main()
{
    std::byte test{64};

    // compiles
    std::cout << only_return_int_type_foo<int>(test) << std::endl;

    // compiler error
    std::cout << only_return_int_type_foo<float>(test) << std::endl;

    // compiles
    std::cout << only_return_int_type_bar<int>(test) << std::endl;

    // compiles
    std::cout << only_return_int_type_bar<float>(test) << std::endl;
}

Related

Is there a "safe" static_cast alternative?

Is there a "safe" alternative to static_cast in C++11/14 or a library which implements this functionality?
By "safe" I mean the cast should only allow casts which do not lose precision. So a cast from int64_t to int32_t would only be allowed if the number fits into a int32_t and else an error is reported.
There's gsl::narrow
narrow // narrow<T>(x) is static_cast<T>(x) if static_cast<T>(x) == x or it throws narrowing_error
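A minimal usage sketch, assuming the Microsoft GSL is installed (the header name has varied between releases, e.g. <gsl/narrow> or the umbrella <gsl/gsl>):
#include <gsl/narrow>   // or <gsl/gsl> in older GSL releases
#include <cstdint>
#include <iostream>

int main() {
    int64_t small = 42;
    int64_t big   = 5'000'000'000;             // does not fit in int32_t

    int32_t ok = gsl::narrow<int32_t>(small);  // fits, returns 42
    std::cout << ok << '\n';

    try {
        int32_t bad = gsl::narrow<int32_t>(big);
        std::cout << bad << '\n';
    } catch (const gsl::narrowing_error&) {
        std::cout << "narrowing_error: value does not fit\n";
    }
}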
You've got the use-case reversed.
The intended use of static_cast (and the other c++-style casts) is to indicate programmer intentions. When you write auto value = static_cast<int32_t>(value_64);, you're saying "Yes, I very much *intend* to downcast this value, possibly truncating it, when I perform this assignment". As a result, a compiler, which might have been inclined to complain about this conversion under normal circumstances (like if you'd have written int32_t value = value_64;) instead observes "well, the programmer has told me that this is what they intended; why would they lie to me?" and will silently compile the code.
If you want your C++ code to warn or throw an error on unsafe conversions, you need to explicitly not use static_cast, const_cast, reinterpret_cast, and let the compiler do its job. Compilers have flags that change how warnings are treated (downcasting int64_t to int32_t usually only results in a Warning), so make sure you're using the correct flags to force warnings to be treated as errors.
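For example, with GCC or Clang (the flags shown are one possible combination; the exact diagnostic text varies by compiler and version), the implicit narrowing below can be promoted to a hard error without writing any cast:
// build with e.g.: g++ -Wconversion -Werror=conversion narrowing.cpp
#include <cstdint>

int32_t shrink(int64_t value_64) {
    int32_t value = value_64;  // -Wconversion: conversion may change value
    return value;
}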
Assuming the question is about compile-time detection of potentially lossy conversions...
A simple tool not mentioned yet here is that list initialization doesn't allow narrowing, so you can write:
void g(int64_t n)
{
    int32_t x{n}; // error, narrowing
    int32_t g;
    g = {n};      // error, narrowing
}
NB: Some compilers in their default mode might only issue a warning and continue compiling this ill-formed code; usually you can configure this behaviour via compilation flags.
You can create your own with SFINAE. Here's an example:
#include <cstdint>
#include <iostream>
#include <type_traits>

template <typename T, typename U>
typename std::enable_if<(sizeof(T) >= sizeof(U)), T>::type
safe_static_cast(U&& val)
{
    return static_cast<T>(val);
}

int main()
{
    int32_t y = 2;
    std::cout << safe_static_cast<int32_t>(y) << std::endl;
    std::cout << safe_static_cast<int16_t>(y) << std::endl; // compile error
}
This will compile only if the size you cast to is >= the source size.
You can complicate this further using numeric_limits for other types and type_traits.
Notice that my solution is a compile-time solution, because you asked about static_cast, where static here refers to "determined at compile-time".
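One way to take that idea further at run time, in the spirit of gsl::narrow, is a round-trip check plus a sign check. This is a sketch I am adding here, not part of the original answer, and the name checked_static_cast is made up:
#include <cstdint>
#include <iostream>
#include <stdexcept>
#include <type_traits>

template <typename To, typename From>
To checked_static_cast(From val)
{
    static_assert(std::is_integral<To>::value && std::is_integral<From>::value,
                  "integral types only");
    To converted = static_cast<To>(val);
    // Reject conversions that do not survive the round trip,
    // and sign flips caused by mixing signed and unsigned types.
    if (static_cast<From>(converted) != val ||
        ((converted < To{}) != (val < From{})))
        throw std::range_error("lossy conversion");
    return converted;
}

int main()
{
    std::cout << checked_static_cast<int16_t>(42) << '\n';      // fine
    try {
        std::cout << checked_static_cast<int16_t>(100000) << '\n';
    } catch (const std::range_error& e) {
        std::cout << e.what() << '\n';                           // lossy conversion
    }
}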

Is it safe to take the address of a temporary?

In my program, I would like to take the address of a temporary. Here is an example:
#include <iostream>

struct Number {
    int value;

    Number(int n) {
        value = n;
    }
};

void print(Number *number) {
    std::cout << number->value << std::endl;
}

int main() {
    Number example(123);
    print(&example);
    print(&Number(456)); // Is this safe and reliable?
}
This would output:
123
456
To compile, the -fpermissive flag is required.
Here is my question: is this safe and reliable? In what possible case could something go wrong?
If your definition of "safe and reliable" includes "will compile and produce the same results if the compiler is updated" then your example is invalid.
Your example is ill-formed in all C++ standards.
This means, even if a compiler can be coerced to accept it now, there is no guarantee that a future update of your compiler will accept it or, if the compiler does accept the code, will produce the same desired effect.
Most compiler vendors have a history of supporting non-standard features in their compilers, and of either removing or altering support for those features in later releases.
Consider changing your function so it accepts a const Number & rather than a pointer. A const reference CAN be implicitly bound to a temporary without needing to bludgeon the compiler into submission (e.g. with command line options). A non-const reference cannot.
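A reworked version of the question's example along those lines (my sketch of the suggestion, not the answerer's code):
#include <iostream>

struct Number {
    int value;
    Number(int n) : value(n) {}
};

void print(const Number& number) {   // const reference binds to temporaries
    std::cout << number.value << std::endl;
}

int main() {
    Number example(123);
    print(example);       // lvalue
    print(Number(456));   // temporary: fine, no -fpermissive needed
}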
&Number(456) is an error because the built-in & operator cannot be applied to an rvalue. Since it is an error, it is neither safe nor reliable. "What could go wrong" is that the code could be rejected and/or behave unexpectedly by a compiler which follows the C++ Standard. You are relying on your compiler supporting some C++-like dialect in which this code is defined.
You can obtain the address of the temporary object in various ways. For example, add a member function auto operator&() { return this; }. The overloaded operator& can be applied to prvalues of class type.
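For instance (my sketch of that idea; whether you should overload operator& at all is another question):
#include <iostream>

struct Number {
    int value;
    Number(int n) : value(n) {}
    Number* operator&() { return this; }   // overloaded & is callable on a prvalue
};

void print(Number* number) {
    std::cout << number->value << std::endl;
}

int main() {
    print(&Number(456));   // calls the member operator&; the temporary lives until
                           // the end of the full expression, so the call is fine
}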
Another way would be to have a function that is like the opposite of move:
template <typename T>
T& make_lvalue(T&& n)
{
    return n;
}
and then you can do print(&make_lvalue(Number(456)));
If you are feeling evil, you could make a global template overload of operator&.
This is fine, but consider:
Number *a;
print(a); // dereferencing an uninitialized pointer: undefined behaviour
Here is how I would change it:
void print(const Number& num) // take the parameter by const reference so it can't be changed
{
    std::cout << num.value << std::endl; // note the . instead of -> because num is a reference, not a pointer
}
You would remove the "&" from your code like:
Number example(123);
print(example);
print(Number(456));
and if what you have is a pointer, you just put a "*" in front of it to dereference it, e.g. print(*ptr).

auto rules for direct-list-initialization

I was just trying to learn about the new C++17 changes related to "auto rules for direct-list-initialization".
A few Stack Overflow question threads have answers saying it's not a safe thing to do:
Why is direct-list-initialization with auto considered bad or not preferred?
I tried the code from one of the selected answers to understand:
#include <typeinfo>
#include <iostream>

struct Foo {};

void eatFoo(const Foo& f) {}

int main() {
    Foo a;
    auto b{a};
    eatFoo(b);

    std::cout << "a " << typeid(a).name() << '\n';
    std::cout << "b " << typeid(b).name() << '\n';
}
But to my surprise it compiled without any warning or compile error
output
a 3Foo
b 3Foo
Program ended with exit code: 0
Does this mean it is now safe to use auto for direct-list-initialization,
for example like this
auto x { 1 };
"Safe" is in the eye of the beholder.
C++14's behavior was "safe", in that it was well-defined what the result would be. auto var_name{expr} would create an initializer_list of one element.
C++17 makes it so that auto var_name{expr} results in the same type deduction as auto var_name = expr; would have. So long as you expect this to be the case, then it is "safe". But it is no more "safe" than the old behavior.
Then again, this is a language change that is backwards-incompatible with the C++14 behavior. By that standard, it is not "safe", in that a C++14 user will expect your code to be doing something different than it is under a C++17 compiler. So it creates some subtle confusion, which may not be safe.
Then there's the issue that "uniform initialization" is supposed to be "uniform" (ie: performs the same kind of initialization under the same rules in all places), yet auto var_name{expr} will do something very different from auto var_name = {expr}. The latter will still be an initializer_list; the former will be a copy/move. By that token, it is not "safe" to assume that "uniform initialization" is uniform.
Then again, the C++17 behavior is sometimes the thing you really wanted. That initializing a variable from a single value, to you, means to copy/move it. After all, if you had done decltype(expr) var_name{expr}, that's the behavior you'd get. So from that perspective, it could seem to be "safe" behavior.
At the very least, we can say that the situation is more teachable through two rules (illustrated in the sketch after them):
Direct-list-initialization of auto variables means "copy/move the expression into the variable, and if you provide more than one expression, you're using the wrong syntax."
Copy-list-initialization of auto variables means "make an initializer_list, and if you provide values that can't do that, you're using the wrong syntax."
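A side-by-side sketch of the two rules under a C++17 compiler (my illustration):
#include <initializer_list>

int main() {
    auto a{42};       // C++17: int (in C++14 this was std::initializer_list<int>)
    auto b = {42};    // std::initializer_list<int> in both C++14 and C++17
    // auto c{1, 2};  // error in C++17: direct-list-init of auto takes exactly one element
    auto d = {1, 2};  // std::initializer_list<int>
    (void)a; (void)b; (void)d;
}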
Perhaps the simplicity creates a sense of "safety".
The document that instigated these changes makes a case that, by sticking to the above rules, you're less likely to get accidental UB from returning a stack-bound initializer_list. That's one definition of "safe".
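The kind of accident that paper worries about looks roughly like this (my sketch):
#include <initializer_list>

std::initializer_list<int> dangling() {
    auto lst = {1, 2, 3};  // the list refers to a temporary backing array
    return lst;            // the backing array dies here; using the result is UB
}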
So it all depends on what you mean by "safe".

When a C++ lambda expression has a lot of captures by reference, the size of the unnamed function object becomes large

The following code:
#include <iostream>
using namespace std;

int main() {
    int a, b, c, d, e, f, g;
    auto func = [&]() { cout << a << b << c << d << e << f << g << endl; };
    cout << sizeof(func) << endl;
    return 0;
}
outputs 56 when compiled with g++ 4.8.2.
Since all local variables are stored in the same stack frame, remembering one pointer would be sufficient to locate the addresses of all of them. Why does the lambda expression construct such a big unnamed function object?
I do not understand why you seem surprised.
The C++ Standard gives a set of requirements, and every single implementation is free to pick any strategy that meets the requirements.
Why would an implementation optimize the size of the lambda object?
Specifically, do you realize how that would tie the generated code of this lambda to the generated code for the surrounding function?
It's easy to say "Hey! This could be optimized!", but it's much more difficult to actually optimize and make sure it works in all edge cases. So, personally, I much prefer having a simple and working implementation to a botched attempt at optimizing it...
... especially when the work-around is so easy:
#include <iostream>

struct S { int a, b, c, d, e, f, g; };

int main() {
    S s = {};
    auto func = [&]() {
        std::cout << s.a << s.b << s.c << s.d << s.e << s.f << s.g << "\n";
    };
    std::cout << sizeof(func) << "\n";
    return 0;
}
Look Ma: 4 bytes only!
It is legal for a compiler to capture by reference via stack pointer. There is a slight downside (in that offsets have to be added to said stack pointer).
Under the current C++ standard with defects included, you also have to capture reference variables by pseudo-pointer, as the lifetime of the binding must last as long as the referred-to data, not the reference it directly binds to.
The simpler implementation, where each captured variable corresponds to a constructor argument and a class member variable, has the serious advantage that it lines up with "more normal" C++ code. Some extra work has to be done for the magic this capture, but other than that the lambda closure is a bog-standard object instance with an inline operator(). Optimization strategies for "more normal" C++ code will work, bugs are going to be mostly in common with "more normal" code, etc.
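Roughly, the closure the question asks about behaves like the hand-written class below (a hypothetical illustration; the real closure type is unnamed and compiler-specific):
#include <iostream>

struct Closure {                     // stands in for the type of [&](){ ... }
    int& a; int& b; int& c; int& d; int& e; int& f; int& g;
    void operator()() const {
        std::cout << a << b << c << d << e << f << g << std::endl;
    }
};
// With 8-byte references/pointers, sizeof(Closure) is typically 7 * 8 = 56,
// matching the 56 reported in the question.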
Had the compiler writers gone with the stack-frame implementation, probably reference capture of references in that implementation would have failed to work like it did in every other compiler. When the defect was resolved (in favor of it working), the code would have to be changed again. In essence, the compilers that would have used a simpler implementation would almost certainly have had fewer bugs and more working code than those who used a fancy implementation.
With the stack-frame capture, all optimization for a lambda would have to be customized for that lambda. It would be equivalent to a class that captured a void*, does pointer arithmetic on it, and casts the resulting data to typed pointers. That is going to be extremely hard to optimize, as pointer arithmetic tends to block optimization, especially pointer arithmetic between stack variables (which is usually undefined). What is worse is that such pointer arithmetic means that the optimization of stack variable state (eliminating variables, overlapping lifetime, registers) now has to interact with the optimization of lambdas in entangled ways.
Working on such an optimization would be a good thing. As a bonus, because lambda types are tied to compilation units, messing with the implementation of a lambda will not break binary compatibility between compilation units. So you can do such changes relatively safely, once they are a proven stable improvement. However, if you do implement that optimization, you really really will want the ability to revert to the simpler proven one.
I encourage you to provide patches to your favorite open-source compiler to add this functionality.
Because that's how it's implemented. I don't know if the standard says anything about how it should be implemented but I guess it's implementation defined how big a lambda object will be in that situation.
There would be nothing wrong for a compiler to store a single pointer and use the offsets, to do what you suggest, as an optimization. Perhaps some compilers do that, I don't know.

Why doesn't void take a void value in C++?

I'm curious why C++ does not define void via:
typedef struct { } void;
I.e. what is the value in a type that cannot be instantiated, even if that instantiation must produce no code?
If we use gcc -O3 -S, then both of the following produce identical assembler:
int main() { return 0; }
and
template <class T> T f(T a) { }
typedef struct { } moo;
int main() { moo a; f(a); return 0; }
This makes perfect sense. A struct { } simply takes an empty value, easy enough to optimize away. In fact, the strange part is that they produce different code without the -O3.
You cannot, however, pull this same trick with simply typedef void moo, because void cannot assume any value, not even an empty one. Does this distinction have any utility?
There are various other strongly typed languages, like Haskell and presumably the MLs, that have a value for their void type but offer no valueless types overtly, although some possess native pointer types, which act like void *.
I see the rationale for void being unable to be instantiated coming from the C roots of C++. In the old gory days, type safety wasn't that big a deal and void*s were constantly passed around. However, you could always be looking at code that does not literally say void* (due to typedefs, macros, and in C++, templates) but still means it.
It is a tiny bit of safety if you cannot dereference a void* but have to cast it to a pointer to a proper type first. Whether you accomplish that by using an incomplete type, as Ben Voigt suggests, or by using the built-in void type doesn't matter. You're protected from incorrectly guessing that you are dealing with a type which you are not.
Yes, void introduces annoying special cases, especially when designing templates. But it's a good thing (i.e. intentional) that the compiler doesn't silently accept attempts to instantiate void.
Because that wouldn't make void an incomplete type, now would it?
Try your example code again with:
struct x; // incomplete
typedef x moo;
Why should void be an incomplete type?
There are many reasons.
The simplest is this: moo FuncName(...) must still return something. Whether it is a neutral value, a "not-a-value" value, or whatever, it still must say return value;, where value is some actual value. It must say return moo();.
Why write a function that technically returns something, when it isn't actually returning something? The compiler can't optimize the return out, because it's returning a value of a complete type. The caller might actually use it.
C++ isn't all templates, you know. The compiler doesn't necessarily have the knowledge to throw everything away. It often has to perform in-function optimizations that have no knowledge of the external use of that code.
void in this case means "I don't return anything." There is a fundamental difference between that and "I return a meaningless value." It is legal to do this:
moo FuncName(...) { return moo(); }
moo x = FuncName(...);
This is at best misleading. A cursory scan suggests that a value is being returned and stored. The identifier x now has a value and can be used for something.
Whereas this:
void FuncName(...) {}
void x = FuncName(...);
is a compiler error. So is this:
void FuncName(...) {}
moo x = FuncName(...);
It's clear what's going on: FuncName has no return value.
Even if you were designing C++ from scratch, with no hold-overs from C, you would still need some keyword to indicate the difference between a function that returns a value and one that does not.
Furthermore, void* is special in part because void is not a complete type. Because the standard mandates that it isn't a complete type, the concept of a void* can mean "pointer to untyped memory." This has specialized casting semantics. Pointers to typed memory can be implicitly cast to untyped memory. Pointers to untyped memory can be explicitly cast back to the typed pointer that it used to be.
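Those semantics are the familiar ones (a small illustration of my own, not from the answer):
#include <iostream>

int main() {
    int x = 42;
    void* untyped = &x;                        // implicit: any object pointer converts to void*
    int* typed = static_cast<int*>(untyped);   // explicit cast back to the original type
    // double* wrong = untyped;                // error: no implicit conversion from void* in C++
    std::cout << *typed << '\n';               // 42
}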
If you used moo* instead, then the semantics get weird. You have a pointer to typed memory. Currently, C++ makes accessing an object through a pointer to an unrelated type (except for certain special cases) undefined behavior. Now the standard has to add another exception for the moo type. It has to say what happens when you do this:
moo *m = new moo;
*m; // dereference and discard; pointless, but the compiler must accept it
With void, this is a compiler error. What is it with moo? It's still useless and meaningless, but the compiler has to allow it.
To be honest, I would say that the better question would be "Why should void be a complete type?"