C++ Safely and Efficiently Casting std::weak_ordering to int

C++20 is introducing a new comparison type: std::weak_ordering.
It allows for representing less than, equal to, or greater than.
However, some older functions use an int for a similar purpose, such as qsort, which expects a comparator with the signature
int compar (const void* p1, const void* p2);
How can I cast std::weak_ordering to int for the use in a function such as qsort?
Here is an example situation:
#include <compare>
#include <iostream>
int main() {
long a = 2354, b = 1234;
std::weak_ordering cmp = a <=> b;
if (cmp > 0) std::cout << "a is greater than b" << std::endl;
if (cmp == 0) std::cout << "a is equal to b" << std::endl;
if (cmp < 0) std::cout << "a is less than b" << std::endl;
int equivalent_cmp = cmp; // errors
}
In testing, I noticed that using a reinterpret_cast to int8_t type does work, but I am not sure if this would be portable.
int equivalent_cmp = *(int8_t *)&cmp;
or equivalently,
int equivalent_cmp = *reinterpret_cast<int8_t*>(&cmp);
Is this safe?
Furthermore, there are some other solutions that can work, but are inefficient compared to this "unsafe" method. All of these would be slower than the above solutions:
int equivalent_cmp = (a > b) - (a < b);
or
int equivalent_cmp;
if (cmp < 0) equivalent_cmp = -1;
else if (cmp == 0) equivalent_cmp = 0;
else equivalent_cmp = 1;
Is there a better solution that would be guaranteed to work?

Is there a better solution that would be guaranteed to work?
No.
The standard does not specify the contents or representation of the ordering classes. Barry's answer is based on reasonable assumptions, that are likely to hold, but they are not guaranteed.
Should you need it, your best bet is to write something like your last snippet
constexpr int ordering_as_int(std::weak_ordering cmp) noexcept {
return (cmp < 0) ? -1 : ((cmp == 0) ? 0 : 1);
}

How can I cast std::weak_ordering to int for the use in a function such as qsort?
The easy answer is: don't use qsort, use std::sort, it'll perform better anyway.
That said, we know that std::weak_ordering has to have some integral type member, and C++20 does come with a mechanism to pull it out: std::bit_cast:
static_assert(std::bit_cast<int8_t>(0 <=> 1) == -1);
The rule is that the type you're casting to (in this case int8_t) has to be the same size as the type you're casting from (in this case std::strong_ordering). That's a constraint on bit_cast, so it's safe - if the implementation actually stores an int instead of an int8_t, this won't compile.
So more generally, you'd have to write a short metaprogram to determine the correct signed integer type to cast into.
Note that while weak_ordering and strong_ordering will just be implemented as storing an integer (though not int as illustrated in the standard), partial_ordering will probably not be implemented as storing an int and a bool - it will likely still be implemented as a single integer. So this trick won't work.

Related

If a function has wrong parameters, how to prevent it from returning any value?

I am sure this question has been asked already but I couldn't find the answer.
If I have a function, let's say:
int Power(int number, int degree){
if(degree==0){
return 1;
}
return number*Power(number, degree-1);
}
It works only when the degree is a non-negative int. How can I prevent this function from being called with wrong parameters?
For example, if the programmer writes cout<<Power(2, -1);, I want the compiler to refuse to compile the code and return some kind of an error message (e.g. "function Power accepts only non-negative integers").
Another alternative would be for the function to not return any value in this case. For example:
int Power(int number, unsigned int degree){
if(degree<0){
//return nothing
}
if(degree==0){
return 1;
}
return number*Power(number, degree-1);
}
There is an alternative to returning a value: Throw a value. A typical example:
if(degree<0){
throw std::invalid_argument("degree may not be negative!");
}
I want the compiler to refuse to compile the code
In general, arguments are unknown until runtime, so this is not typically possible.
Your answer does the job for me nicely. But I am curious: 'throw' terminates the program and prevents anything after Power() from being executed.
If you catch the thrown object, then you can continue immediately after the function from which the object was thrown.
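To illustrate the catch side, here is a minimal sketch. Power follows the question's recursive version with the check added; PowerOr is a hypothetical wrapper invented for this example.

```cpp
#include <stdexcept>

// The question's Power, with the negative-degree check added.
int Power(int number, int degree) {
    if (degree < 0) {
        throw std::invalid_argument("degree may not be negative!");
    }
    if (degree == 0) {
        return 1;
    }
    return number * Power(number, degree - 1);
}

// Hypothetical wrapper: returns `fallback` instead of letting the
// exception propagate, so the caller keeps running.
int PowerOr(int number, int degree, int fallback) {
    try {
        return Power(number, degree);
    } catch (const std::invalid_argument&) {
        return fallback;  // execution continues normally from here
    }
}
```

Only an exception that nobody catches ends the program; with the try/catch in place, the code after the call runs as usual.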
The mere fact, that C++ does implicit type conversions, leaves you no way out of the predicament, that if you write unsigned int x = -1;, no matter which warnings you turn on with your compiler, you won't see any problem with that.
The only rule coming to mind, which might help you with that, is the notorious "max zero or one implicit conversions" rule. But I doubt it can be exploited in this case. (-1 would need to be converted to unsigned int, then to another type, implicitly). But I think from what I read on the page I linked above, numeric implicit conversions do not really count under some circumstances.
This leaves you but one other, also imperfect option. In the code below, I outline the basic idea. But there is endless room to refine the idea (more on that, later). This option is to resort to optional types in combination with your own integer type. The code below also only hints to what is possible. All that could be done in some fancy monadic framework or whatnot...
Obviously, in the code posted in the question, it is a bad idea to declare the argument degree as an unsigned int, because then a negative value gets implicitly converted and the function cannot protect itself from the hostile degree 0xFFFFFFFF (the maximum value of a 32-bit unsigned int). If it wanted to check, it should have chosen int. Then it could check for negative values.
The code in the question also calls for a stack overflow, given it does not implement power in a tail recursive way. But this is just an aside and not subject to the question at hand. Let's get that one quickly out of the way.
// This version at least has a chance to benefit from tail call optimizations.
int internalPower_1 (int acc, int number, int degree) {
// Base case at 0, so that Power_1(x, 0) terminates and returns 1.
if (0 == degree)
return acc;
return internalPower_1(acc*number, number, degree - 1);
}
int Power_1 (int number, int degree) {
if (degree < 0)
throw std::invalid_argument("degree < 0");
return internalPower_1( 1, number, degree);
}
Now, would it not be nice if we could have integer types, which depended on the valid value range? Other languages have it (e.g. Common Lisp). Unless there is already something in boost (I did not check), we have to roll such a thing ourselves.
Code first, excuses later:
#include <iostream>
#include <stdexcept>
#include <limits>
#include <optional>
#include <string>
template <int MINVAL= std::numeric_limits<int>::min(),
int MAXVAL = std::numeric_limits<int>::max()>
struct Integer
{
int value;
static constexpr int MinValue() {
return MINVAL; }
static constexpr int MaxValue() {
return MAXVAL; }
using Class_t = Integer<MINVAL,MAXVAL>;
using Maybe_t = std::optional<Class_t>;
// Values passed in during run time get handled
// and punished at run time.
// No way to work around this, because we are
// feeding our thing of beauty from the nasty
// outside world.
explicit Integer (int v)
: value{v}
{
if (v < MINVAL || v > MAXVAL)
throw std::invalid_argument("Value out of range.");
}
static Maybe_t Init (int v) {
if (v < MINVAL || v > MAXVAL) {
return std::nullopt;
}
return Maybe_t(v);
}
};
using UInt = Integer<0>;
using Int = Integer<>;
std::ostream& operator<< (std::ostream& os,
const typename Int::Maybe_t & v) {
if (v) {
os << v->value;
} else {
os << std::string("NIL");
}
return os;
}
template <class T>
auto operator* (const T& x,
const T& y)
-> T {
if (x && y)
return T::value_type::Init(x->value * y->value);
return std::nullopt;
}
Int::Maybe_t internalPower_3 (const Int::Maybe_t& acc,
const Int::Maybe_t& number,
const UInt::Maybe_t& degree) {
if (!acc) return std::nullopt;
if (!degree) return std::nullopt;
// Base case at 0, so that degree 0 yields 1 instead of NIL.
if (0 == degree->value) {
return acc;
}
return internalPower_3(acc * number,
number,
UInt::Init(degree->value - 1));
}
Int::Maybe_t Power_3 (const Int::Maybe_t& number,
const UInt::Maybe_t& degree) {
if (!number) return std::nullopt;
return internalPower_3 (Int::Init(1),
number,
degree);
}
int main (int argc, const char* argv[]) {
std::cout << Power_1 (2, 3) << std::endl;
std::cout << Power_3 (Int::Init(2),
UInt::Init(3)) << std::endl;
std::cout << Power_3 (Int::Init(2),
UInt::Init(-2)) << std::endl;
std::cout << "UInt min value = "
<< UInt::MinValue() << std::endl
<< "Uint max value = "
<< UInt::MaxValue() << std::endl;
return 0;
}
The key here is that the function Int::Init() returns Int::Maybe_t. Thus, before the error can propagate, the user gets a std::nullopt very early if they try to init with a value which is out of range. Using the constructor of Int instead would result in an exception.
In order for the code to be able to check, both signed and unsigned instances of the template (e.g. Integer<-10,10> or Integer<0,20>) use a signed int as storage, thus being able to check for invalid values, sneaking in via implicit type conversions. At the expense, that our unsigned on a 32 bit system would be only 31 bit...
What this code does not show, but which could be nice, is the idea that the result type of an operation on two (possibly different) instances of Integer could be yet another instance of Integer. Example: auto x = Integer<0,5>::Init(3) - Integer<0,5>::Init(5). In our current implementation, this would result in a nullopt, preserving the type Integer<0,5>. In a maybe better world, the result could instead be an Integer<-5,5>, the range of all possible differences (here holding the value -2).
Anyway, as it is, some might find my little Integer<,> experiment interesting. After all, using types to be more expressive is good, right? A function signature like typename Integer<-10,0>::Maybe_t foo(Integer<0,5>::Maybe_t x) is quite self-explanatory as to which range of values is valid for x.
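The idea of letting the result type track the possible range can be sketched with interval arithmetic on the template parameters. Ranged below is a simplified, hypothetical stand-in for the Integer template (no throwing constructor), and the operator- is my assumption about how such an operation could be typed, not part of the code above.

```cpp
#include <optional>

// Simplified stand-in for Integer<MINVAL, MAXVAL>.
template <int MINVAL, int MAXVAL>
struct Ranged {
    static_assert(MINVAL <= MAXVAL, "empty range");
    int value;
    static constexpr int MinValue() { return MINVAL; }
    static constexpr int MaxValue() { return MAXVAL; }
    // Range-checked factory: nullopt if v is outside [MINVAL, MAXVAL].
    static constexpr std::optional<Ranged> Init(int v) {
        if (v < MINVAL || v > MAXVAL) return std::nullopt;
        return Ranged{v};
    }
};

// [A, B] - [C, D] always lies in [A - D, B - C], so the result type can
// carry that range and the subtraction itself needs no runtime check.
template <int A, int B, int C, int D>
constexpr Ranged<A - D, B - C> operator-(Ranged<A, B> x, Ranged<C, D> y) {
    return Ranged<A - D, B - C>{x.value - y.value};
}
```

With this, *Ranged<0,5>::Init(3) - *Ranged<0,5>::Init(5) has type Ranged<-5,5> and holds -2. If the bounds themselves overflow int (for example with the default full-range type), the template arguments are no longer valid constant expressions and the code is rejected at compile time.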

Why are bool values compared bitwise in C++ rather than by their meaning?

I would expect the following code to produce equality, but bool values are shown to be different.
#include <iostream>
union crazyBool
{
unsigned char uc;
bool b;
};
int main()
{
crazyBool a, b;
a.uc = 1;
b.uc = 5;
if(a.b == b.b)
{
std::cout << "==" << std::endl;
}
else
{
std::cout << "!=" << std::endl;
}
bool x, y;
void *xVP = &x, *yVP = &y;
unsigned char *xP = static_cast<unsigned char*>(xVP);
unsigned char *yP = static_cast<unsigned char*>(yVP);
(*xP) = (unsigned char)1;
(*yP) = (unsigned char)5;
if(x == y)
{
std::cout << "==" << std::endl;
}
else
{
std::cout << "!=" << std::endl;
}
return 0;
}
Note that here we are not only changing the value through union (which was pointed out as being undefined), but also accessing memory directly via void pointer.
If you assign to a member of a union, that member becomes the active member.
Reading any member other than the active one is undefined behavior, meaning that anything could happen. There are a few specific exceptions to this rule, depending on the version of the C++ standard your compiler is following; but none of these exceptions apply in your case.
(One of the exceptions is that if the union has multiple members of class type where the classes have the same initial members, these shared members can be read through any of the union's members.)
Edit to address the question clarification:
The standard defines bool as having the values true and false, no others. (C++11, 3.9.1/6) It never defines what it means to copy some other bit pattern over the storage (which is what you are doing). Since it doesn't define it, the behavior is undefined too.
Yes, when converting an integer to a bool 0 becomes false and anything else becomes true, but that is just a conversion rule, not a statement about bool's representation.
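If the goal is to get a bool out of a raw byte, going through that conversion rule is the well-defined route. A tiny sketch (to_bool is a made-up helper name):

```cpp
// Hypothetical helper: obtain a bool from a raw byte through the
// integer-to-bool conversion instead of overwriting the bool's storage.
constexpr bool to_bool(unsigned char raw) {
    return raw != 0;  // 0 becomes false, anything else becomes true
}
```

Both to_bool(1) and to_bool(5) are true, so they compare equal, which is the result the question expected.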

std::string comparison, lexicographical or not

The following code comes from the article C++ quirks, part 198276
#include <iostream>
#include <string>
using namespace std;
int main()
{
std::string a = "\x7f";
std::string b = "\x80";
cout << (a < b) << endl;
cout << (a[0] < b[0]) << endl;
return 0;
}
Surprisingly the output is
1
0
Shouldn't string comparison be lexicographical ? If yes how is the output explained?
There is nothing in the C++ specification to say if char is signed or unsigned, it's up to the compiler. For your compiler it seems that char defaults to signed char which is why the second comparison returns false.
So I'm just going to quote directly from your link:
It turns out that this behavior is required by the standard, in section 21.2.3.1 [char.traits.specializations.char]: “The two-argument members eq and lt shall be defined identically to the built-in operators == and < for type unsigned char.”
So:
(a < b) is required to use unsigned char comparisons.
(a[0] < b[0]) is required to use char comparisons, which may or may not be signed.
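To get the same answer from a single-character comparison as from the string comparison, the chars can be compared through char_traits or after an explicit cast. A small sketch; both function names are made up for this example:

```cpp
#include <string>

// Compare two chars the way std::char_traits<char> does,
// i.e. as unsigned char values.
constexpr bool less_like_string(char x, char y) {
    return std::char_traits<char>::lt(x, y);
}

// The same comparison spelled out with explicit casts.
constexpr bool less_as_unsigned(char x, char y) {
    return static_cast<unsigned char>(x) < static_cast<unsigned char>(y);
}
```

For '\x7f' and '\x80' both functions return true regardless of whether plain char is signed, which agrees with (a < b) on the strings.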

How to copy an int to a boost/std::array of char?

What is the simplest and most efficient way to copy an int to a boost/std::array?
The following seems to work, but I'm not sure if this is the most appropriate way to do it:
int r = rand();
boost::array<char, sizeof(int)> send_buf;
std::copy(reinterpret_cast<char*>(&r), reinterpret_cast<char*>(&r + sizeof(int)), &send_buf[0]);
Just for comparison, here's the same thing with memcpy:
#include <cstring>
int r = rand();
boost::array<char, sizeof(int)> send_buf;
std::memcpy(&send_buf[0], &r, sizeof(int));
Your call whether an explosion of casts (and the opportunity to get them wrong) is better or worse than the C++ "sin" of using a function also present in C ;-)
Personally I think memcpy is quite a good "alarm" for this kind of operation, for the same reason that C++-style casts are a good "alarm" (easy to spot while reading, easy to search for). But you might prefer to have the same alarm for everything, in which case you can cast the arguments of memcpy to void*.
Btw, I might use sizeof r for both sizes rather than sizeof(int), but it sort of depends whether the context demands that the array "is big enough for r (which happens to be an int)" or "is the same size as an int (which r happens to be)". Since it's a send buffer, I guess the buffer is the size that the wire protocol demands and r is supposed to match the buffer, rather than the other way around. So sizeof(int) is probably appropriate but 4 or PROTOCOL_INTEGER_SIZE might be more appropriate still.
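Wrapped into functions, the memcpy approach could look like this. It is a sketch that uses std::array rather than boost::array, and to_bytes / from_bytes are names invented for the example; the bytes are in the host's native order, so nothing here handles endianness for a wire protocol.

```cpp
#include <array>
#include <cstring>

// Copy the object representation of an int into a char array.
std::array<char, sizeof(int)> to_bytes(int value) {
    std::array<char, sizeof(int)> buf{};
    std::memcpy(buf.data(), &value, sizeof value);
    return buf;
}

// Inverse operation: rebuild the int from the stored bytes.
int from_bytes(const std::array<char, sizeof(int)>& buf) {
    int value;
    std::memcpy(&value, buf.data(), sizeof value);
    return value;
}
```

A round trip such as from_bytes(to_bytes(r)) gives back r on the same machine.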
The idea is correct, but you have a bug:
reinterpret_cast<char*>(&r + sizeof(int))
Should be:
reinterpret_cast<char*>(&r) + sizeof(int)
or
reinterpret_cast<char*>(&r+1)
These or a memcpy equivalent are OK. Anything else risks alignment issues.
It's common currency to use reinterpret_cast for those purposes but the Standard makes it pretty clear that static_cast via void* is perfectly acceptable. In fact in the case of a type like int then reinterpret_cast<char*>(&r) is defined to have the semantics of static_cast<char*>(static_cast<void*>(&r)). Why not be explicit and use that outright?
If you get into the habit, you have less chance in the future of using a reinterpret_cast which will end up having implementation-defined semantics rather than a static_cast chain which will always have well-defined semantics.
Do note that you're allowed to treat a pointer to a single object as if it were a pointer into an array of one (cf. 5.7/4). This is convenient for obtaining the second pointer.
int r = rand();
boost::array<char, sizeof(int)> send_buf;
auto cast = [](int* p) { return static_cast<char*>(static_cast<void*>(p)); };
std::copy(cast(&r), cast(&r + 1), &send_buf[0]);
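Minor bug, as pointed out by Michael Anderson.
But you could do this (strictly speaking, reading a union member other than the one last written is undefined behavior in C++, although some compilers, such as GCC, document it as an extension):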
#include <iostream>
union U
{
int intVal;
char charVal[sizeof(int)];
};
int main()
{
U val;
val.intVal = 6;
std::cout << (int)val.charVal[0] << ":" << (int)val.charVal[1] << ":" << (int)val.charVal[2] << ":" << (int)val.charVal[3] << "\n";
}

What is wrong with my for loops? I get warnings: comparison between signed and unsigned integer expressions [-Wsign-compare]

#include <iostream>
#include <string>
#include <vector>
#include <sstream>
using namespace std;
int main() {
vector<double> vector_double;
vector<string> vector_string;
...
while (cin >> sample_string)
{
...
}
for(int i = 0; i <= vector_string.size(); i++)
{
....
}
for (int i = 0; i < vector_double.size(); i++)
....
return 0;
}
Why is there a warning with -Wsign-compare ?
As the name of the warning, and its text, imply, the issue is that you are comparing a signed and an unsigned integer. It is generally assumed that this is an accident.
In order to avoid this warning, you simply need to ensure that both operands of < (or any other comparison operator) are either both signed or both unsigned.
How could I do better ?
The idiomatic way of writing a for loop is to initialize both the counter and the limit in the first statement:
for (std::size_t i = 0, max = vec.size(); i != max; ++i)
This saves recomputing size() at each iteration.
You could also (and probably should) use iterators instead of indices:
for (auto it = vec.begin(), end = vec.end(); it != end; ++it)
auto here is a shorthand for std::vector<int>::iterator. Iterators work for any kind of containers, whereas indices limit you to C-arrays, deque and vector.
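Put into complete functions, the two loop styles might look like this. A sketch only: the function names are invented, and since the vector is taken by const reference, auto deduces a const_iterator here.

```cpp
#include <cstddef>
#include <vector>

// Index-based loop: counter and limit share the unsigned type std::size_t,
// so the comparison does not trigger -Wsign-compare.
double sum_by_index(const std::vector<double>& vec) {
    double total = 0.0;
    for (std::size_t i = 0, max = vec.size(); i != max; ++i) {
        total += vec[i];
    }
    return total;
}

// Iterator-based loop: no integer index at all.
double sum_by_iterator(const std::vector<double>& vec) {
    double total = 0.0;
    for (auto it = vec.begin(), end = vec.end(); it != end; ++it) {
        total += *it;
    }
    return total;
}
```

Both handle an empty vector correctly, because the loop condition is checked before the first iteration.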
It is because the .size() function from the vector class is not of type int but of type vector::size_type
Use that or auto i = 0u and the messages should disappear.
int is signed by default - it is equivalent to writing signed int. The reason you get a warning is because size() returns a vector::size_type which is more than likely unsigned.
This is potentially dangerous, since signed int and unsigned int hold different ranges of values. A signed int can hold values from -2147483648 to 2147483647, while an unsigned int can hold values from 0 to 4294967295 (assuming int is 32 bits).
I usually solve it like this:
for(int i = 0; i < (int)vector_string.size(); i++)
I use the C-style cast because it's shorter and more readable than the C++ static_cast<int>(), and accomplishes the same thing.
There's a potential for overflow here, but only if your vector size is larger than the largest int, typically 2147483647. I've never in my life had a vector that large. If there's even a remote possibility of using a larger vector, one of the answers suggesting size_type would be more appropriate.
I don't worry about calling size() repeatedly in the loop, since it's likely an inline access to a member variable that introduces no overhead.
You get this warning because the size of a container in C++ is an unsigned type and mixing signed/unsigned types is dangerous.
What I do normally is
for (int i=0,n=v.size(); i<n; i++)
....
In my opinion this is the best way to use indexes, because using an unsigned type for an index (or the size of a container) is a logical mistake.
Unsigned types should be used only when you care about the bit representation and when you are going to use the modulo-(2**n) behavior on overflow. Using unsigned types just because a value is never negative is nonsense.
A typical bug of using unsigned types for sizes or indexes is for example
// Draw all lines between adjacent points
for (size_t i=0; i<pts.size()-1; i++)
drawLine(pts[i], pts[i+1]);
The above code has undefined behavior when the point array is empty, because in C++ 0u-1 is a huge positive number, so the loop indexes far past the end of the array.
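If you do stay with size_t, the usual way around this particular trap is to move the arithmetic to the other side of the comparison. A sketch, where count_segments is a made-up stand-in for the drawLine loop:

```cpp
#include <cstddef>
#include <vector>

// Counts the adjacent pairs (i, i+1) the loop would visit;
// each increment stands in for one drawLine(pts[i], pts[i+1]) call.
std::size_t count_segments(const std::vector<int>& pts) {
    std::size_t segments = 0;
    // i + 1 < size() cannot underflow, unlike i < size() - 1.
    for (std::size_t i = 0; i + 1 < pts.size(); ++i) {
        ++segments;
    }
    return segments;
}
```

For an empty vector the condition is 1 < 0, which is false, so the body never runs.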
The reason for which C++ uses an unsigned type for size of containers is because of an historical heritage from 16-bit computers (and IMO given C++ semantic with unsigned types it was the wrong choice even back then).
Your variable i is an integer while the size member function of vector which returns an Allocator::size_type is most likely returning a size_t, which is almost always implemented as an unsigned int of some size.
Make your int i as size_type i.
std::vector::size() returns a size_type, which is an unsigned integer type, as a size cannot be negative.
The warning is obviously because you are comparing signed integer with unsigned integer.
Answering after so many answers, but no one noted the loop end.. So, here's my full answer:
To remove the warning, change the i's type to be unsigned, auto (for C++11), or std::vector< your_type >::size_type
Your for loops will seg-fault, if you use this i as index - you must loop from 0 to size-1, inclusive. So, change it to be
for( std::vector< your_type >::size_type i = 0; i < vector_xxx.size(); ++i )
(note the <, not <=; my advice is not to use <= with .size() - 1, because you can have a 0 size vector and you will have issues with that :) ).
To make this more generic, as you're using a container and you're iterating through it, you can use iterators. This will make easier future change of the container type (if you don't need the exact position as number, of course). So, I would write it like this:
for( std::vector< your_type >::iterator iter = vector_XXX.begin();
iter != vector_XXX.end();
++iter )
{
//..
}
Declaring 'size_t i' works well for me.
std::cout << -1U << std::endl;
std::cout << (unsigned)-1 << std::endl;
4294967295
std::cout << 1 << std::endl;
std::cout << (signed)1 << std::endl;
1
std::cout << (unsigned short)-1 << std::endl;
65535
std::cout << (signed)-1U << std::endl;
std::cout << (signed)4294967295 << std::endl;
-1
unsign your index variable
unsigned int index;
index < vecArray.size() // size() would never be negative
Some answers suggest using auto, but that won't work, as int is the default type deduced from integer literals. Before C++23 you have to explicitly specify the type std::size_t, defined in the <cstddef> header:
for(std::size_t i = 0; i < vector_string.size(); i++)
{
....
}
In C++23 the integer literal suffix zu (equivalently uz) was added; the motivation was indeed to allow the correct type to be deduced.
for(auto i = 0zu; i < vector_string.size(); i++)
{
....
}
When this answer was written, no compiler supported this feature yet; more recent releases such as GCC 11 and Clang 13 accept it in C++23 mode.