C++ enum flags vs bitset - c++

What are the pros and cons of using bitsets over enum flags?
#include <bitset>
#include <iostream>
namespace Flag {
enum State {
Read = 1 << 0,
Write = 1 << 1,
Binary = 1 << 2,
};
}
namespace Plain {
enum State {
Read,
Write,
Binary,
Count
};
}
int main()
{
{
unsigned int state = Flag::Read | Flag::Binary;
std::cout << state << std::endl;
state |= Flag::Write;
state &= ~(Flag::Read | Flag::Binary);
std::cout << state << std::endl;
} {
std::bitset<Plain::Count> state;
state.set(Plain::Read);
state.set(Plain::Binary);
std::cout << state.to_ulong() << std::endl;
state.flip();
std::cout << state.to_ulong() << std::endl;
}
return 0;
}
As far as I can see, bitsets have more convenient set/clear/flip functions, but enum flags are the more widespread approach.
What are the possible downsides of bitsets, and which of the two should I use in my daily code, and when?

Both std::bitset and C-style enums have important downsides for managing flags. First, let's consider the following example code:
namespace Flag {
enum State {
Read = 1 << 0,
Write = 1 << 1,
Binary = 1 << 2,
};
}
namespace Plain {
enum State {
Read,
Write,
Binary,
Count
};
}
void f(int);
void g(int);
void g(Flag::State);
void h(std::bitset<sizeof(Flag::State)>);
namespace system1 {
Flag::State getFlags();
}
namespace system2 {
Plain::State getFlags();
}
int main()
{
f(Flag::Read); // Flag::Read is implicitly converted to `int`, losing type safety
f(Plain::Read); // Plain::Read is also implicitly converted to `int`
auto state = Flag::Read | Flag::Write; // type is not `Flag::State` as one could expect, it is `int` instead
g(state); // This function calls the `int` overload rather than the `Flag::State` overload
auto system1State = system1::getFlags();
auto system2State = system2::getFlags();
if (system1State == system2State) {} // Compiles, but the semantics are broken: a `Flag::State` is compared with a `Plain::State`
std::bitset<sizeof(Flag::State)> flagSet; // Notice that the type of bitset only indicates the amount of bits, there's no type safety here either
std::bitset<sizeof(Plain::State)> plainSet;
// f(flagSet); bitset doesn't implicitly convert to `int`, so this wouldn't compile which is slightly better than c-style `enum`
flagSet.set(Flag::Read); // No type safety, which means that bitset
flagSet.reset(Plain::Read); // is willing to accept values from any enumeration
h(flagSet); // Both kinds of sets can be
h(plainSet); // passed to the same function
}
Even though you may think those problems are easy to spot on simple examples, they end up creeping up in every code base that builds flags on top of c-style enum and std::bitset.
So what can you do for better type safety? First, C++11's scoped enumeration is an improvement for type safety. But it hinders convenience a lot. Part of the solution is to use template-generated bitwise operators for scoped enums. Here is a great blog post which explains how it works and also provides working code : https://www.justsoftwaresolutions.co.uk/cplusplus/using-enum-classes-as-bitfields.html
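For readers who don't want to follow the link, here is a minimal sketch of how such template-generated operators can work. It is written in the spirit of the linked post, not copied from it: the trait name enable_bitmask_operators matches the example further down, while the fixed unsigned underlying type and the exact operator set are assumptions of this sketch.

```cpp
#include <type_traits>

// Opt-in trait: specialise it with enable = true for each enum that should act as a bitmask.
template <typename E>
struct enable_bitmask_operators {
    static constexpr bool enable = false;
};

// These operators take part in overload resolution only for opted-in enums (SFINAE).
template <typename E>
constexpr typename std::enable_if<enable_bitmask_operators<E>::enable, E>::type
operator|(E lhs, E rhs) {
    using U = typename std::underlying_type<E>::type;
    return static_cast<E>(static_cast<U>(lhs) | static_cast<U>(rhs));
}

template <typename E>
constexpr typename std::enable_if<enable_bitmask_operators<E>::enable, E>::type
operator&(E lhs, E rhs) {
    using U = typename std::underlying_type<E>::type;
    return static_cast<E>(static_cast<U>(lhs) & static_cast<U>(rhs));
}

// The underlying type is fixed to unsigned here (an assumption of this sketch).
enum class FlagState : unsigned {
    Read = 1u << 0,
    Write = 1u << 1,
    Binary = 1u << 2,
};

template <>
struct enable_bitmask_operators<FlagState> {
    static constexpr bool enable = true;
};

// The result of combining two flags keeps the enum type.
static_assert(std::is_same<decltype(FlagState::Read | FlagState::Write), FlagState>::value,
              "operator| should return FlagState");
```

An enum that has no specialisation of the trait simply gets no operators, so the convenience is added one type at a time.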
Now let's see what this would look like :
enum class FlagState {
Read = 1 << 0,
Write = 1 << 1,
Binary = 1 << 2,
};
template<>
struct enable_bitmask_operators<FlagState>{
static const bool enable=true;
};
enum class PlainState {
Read,
Write,
Binary,
Count
};
void f(int);
void g(int);
void g(FlagState);
FlagState h();
namespace system1 {
FlagState getFlags();
}
namespace system2 {
PlainState getFlags();
}
int main()
{
f(FlagState::Read); // Compile error, FlagState is not an `int`
f(PlainState::Read); // Compile error, PlainState is not an `int`
auto state = FlagState::Read | FlagState::Write; // type is `FlagState` as one could expect
g(state); // This function calls the `FlagState` overload
auto system1State = system1::getFlags();
auto system2State = system2::getFlags();
if (system1State == system2State) {} // Compile error, there is no `operator==(FlagState, PlainState)`
auto someFlag = h();
if (someFlag == FlagState::Read) {} // This compiles fine, but this is another type of recurring bug
}
The last line of this example shows one problem that still cannot be caught at compile time. In some cases, comparing for equality may be what's really desired. But most of the time, what is really meant is if ((someFlag & FlagState::Read) == FlagState::Read).
In order to solve this problem, we must differentiate the type of an enumerator from the type of a bitmask. Here's an article which details an improvement on the partial solution I referred to earlier : https://dalzhim.github.io/2017/08/11/Improving-the-enum-class-bitmask/
Disclaimer: I'm the author of this latter article.
When using the template-generated bitwise operators from the last article, you will get all of the benefits we demonstrated in the last piece of code, while also catching the mask == enumerator bug.
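To make the idea of separating the enumerator type from the mask type concrete, here is a rough sketch. It is not the code from either article: the Bitmask class, its contains member and the is_eq_comparable helper are hypothetical names used only for illustration.

```cpp
#include <type_traits>
#include <utility>

enum class FlagState : unsigned { Read = 1u << 0, Write = 1u << 1, Binary = 1u << 2 };

// Hypothetical mask type: a combination of enumerators is a Bitmask<E>, never an E.
template <typename E>
class Bitmask {
    using U = typename std::underlying_type<E>::type;

public:
    constexpr Bitmask() = default;
    // explicit, so an enumerator never silently turns into a mask
    constexpr explicit Bitmask(E e) : bits_(static_cast<U>(e)) {}

    constexpr Bitmask operator|(E e) const {
        return Bitmask(static_cast<E>(bits_ | static_cast<U>(e)));
    }
    // The question most callers actually mean: is this flag set?
    constexpr bool contains(E e) const {
        return (bits_ & static_cast<U>(e)) == static_cast<U>(e);
    }
    constexpr bool operator==(Bitmask other) const { return bits_ == other.bits_; }
    constexpr bool operator!=(Bitmask other) const { return bits_ != other.bits_; }

private:
    U bits_ = 0;
};

// Combining two enumerators yields a mask.
constexpr Bitmask<FlagState> operator|(FlagState a, FlagState b) {
    return Bitmask<FlagState>(a) | b;
}

// Detection helper, only here to show that mask == enumerator is rejected.
template <typename A, typename B, typename = void>
struct is_eq_comparable : std::false_type {};
template <typename A, typename B>
struct is_eq_comparable<A, B, decltype(void(std::declval<A>() == std::declval<B>()))>
    : std::true_type {};

static_assert(!is_eq_comparable<Bitmask<FlagState>, FlagState>::value,
              "comparing a mask with a single enumerator should not compile");
```

With this shape, someFlag == FlagState::Read is a compile error, and the caller has to write either someFlag.contains(FlagState::Read) or an explicit mask comparison.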

Some observations:
std::bitset< N > supports an arbitrary number of bits (e.g., more than 64 bits), whereas underlying integral types of enums are restricted to 64 bits;
std::bitset< N > can implicitly (depending on the std implementation) use the underlying integral type with the minimal size fitting the requested number of bits, whereas underlying integral types for enums need to be explicitly declared (otherwise, int is the default for scoped enums, and the choice is implementation-defined for unscoped enums);
std::bitset< N > represents a generic sequence of N bits, whereas scoped enums provide type safety that can be exploited for method overloading;
If std::bitset< N > is used as a bit mask, a typical implementation depends on an additional enum type for indexing (!= masking) purposes;
Note that the latter two observations can be combined to define a strong std::bitset type for convenience:
template< typename E, std::size_t N >
class BitSet : public std::bitset< N >
{
...
[[nodiscard]]
constexpr bool operator[](E pos) const;
...
};
and if the code supports some reflection to obtain the number of explicit enum values, then the number of bits can be deduced directly from the enum type.
scoped enum types do not have bitwise operator overloads (these can easily be defined once for all scoped and unscoped enum types using SFINAE or concepts, but they need to be included before use), and unscoped enum types decay to the underlying integral type;
bitwise operator overloads for enum types require less boilerplate than std::bitset< N > (e.g., auto flags = Depth | Stencil;);
enum types support both signed and unsigned underlying integral types, whereas std::bitset< N > internally uses unsigned integral types (shift operators).
FWIW, in my own code I mostly use std::bitset (and eastl::bitvector) as private bit/bool containers for setting/getting single bits/bools. For masking operations, I prefer scoped enum types with explicitly defined underlying types and bitwise operator overloads.
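As a rough illustration of the strong bitset idea from the observations above, here is a small sketch. Note that it wraps std::bitset by composition instead of inheriting from it, and that the Buffer enum, its Count enumerator and the EnumBitSet name are all made up for the example.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

// Hypothetical enum; the trailing Count enumerator doubles as the number of bits.
enum class Buffer { Depth, Stencil, Color, Count };

// Sketch of a bitset that can only be indexed with enumerators of E.
template <typename E, std::size_t N = static_cast<std::size_t>(E::Count)>
class EnumBitSet {
public:
    EnumBitSet& set(E pos, bool value = true) {
        bits_.set(static_cast<std::size_t>(pos), value);
        return *this;
    }
    bool test(E pos) const { return bits_.test(static_cast<std::size_t>(pos)); }
    std::size_t count() const { return bits_.count(); }

private:
    std::bitset<N> bits_;  // composition, so the untyped std::bitset interface stays hidden
};

// Small usage check.
inline void enum_bitset_example() {
    EnumBitSet<Buffer> buffers;
    buffers.set(Buffer::Depth).set(Buffer::Color);
    assert(buffers.test(Buffer::Depth));
    assert(!buffers.test(Buffer::Stencil));
    assert(buffers.count() == 2);
    // buffers.set(1);  // would not compile: an int is not a Buffer
}
```

Hiding the std::bitset member is a deliberate choice here: with public inheritance the size_t overloads of set and test would remain reachable and the type safety would be easy to bypass.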

Do you compile with optimization on? It is very unlikely that there is a 24x speed factor.
To me, bitset is superior, because it manages space for you:
can be extended as much as wanted. If you have a lot of flags, you may run out of space in the int/long long version.
may take less space if you use only a few flags (it could fit in an unsigned char/unsigned short, though I'm not sure that implementations apply this optimization)
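To illustrate the first point, here is a tiny sketch with more flags than a 64-bit integer could hold. The flag names and the count of 128 are invented for the example.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

// Hypothetical flag indices; index 100 would not fit in a 64-bit integer mask.
constexpr std::size_t kFlagCount = 128;
constexpr std::size_t kVerbose = 3;
constexpr std::size_t kExperimental = 100;

inline std::bitset<kFlagCount> make_flags() {
    std::bitset<kFlagCount> flags;
    flags.set(kVerbose);
    flags.set(kExperimental);
    return flags;
}

// Small usage check.
inline void wide_flags_example() {
    const std::bitset<kFlagCount> flags = make_flags();
    assert(flags.test(kVerbose));
    assert(flags.test(kExperimental));
    assert(flags.count() == 2);
}
```

On the second point, the standard library implementations I am aware of round the storage up to whole machine words, so a bitset with only a few flags is usually no smaller than an int; checking sizeof on your own toolchain settles it.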

(Ad mode on)
You can get both: a convenient interface and max performance. And type-safety as well. https://github.com/oliora/bitmask

Related

Weird Data Types - C++

I know this sounds like a silly question, but I would like to know if it is possible in any way to make a custom variable size like this rather than using plain 8, 16, 32 and 64 bit integers:
uint15_t var; //as an example
I did some research online and I found nothing that works on all sizes (only stuff like 12 bit and 24 bit). (I was also wondering if bitfields would work with other data sizes too).
And yes, I know you could use a 16 bit integer to store a 15 bit value, but I just wanted to know if it is possible to do something like this.
How can I make and implement custom integer sizes?
Inside a struct or class, you can use the bitfields feature to declare an integer member variable of the size you want:
unsigned int var : 15;
... it won't be very CPU-efficient (since the compiler will have to generate bitwise-operations on most accesses to that variable) but it will give you the desired 15-bit behavior.
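A quick sketch of what that 15-bit behavior looks like in practice, assuming an unsigned bit-field (the Packet struct and the helper function are just names for the example):

```cpp
#include <cassert>

// Hypothetical struct holding a 15-bit unsigned member.
struct Packet {
    unsigned int var : 15;  // can hold 0 .. 32767
};

// Stores v in the bit-field and reads it back.
inline unsigned int store_and_read(unsigned int v) {
    Packet p{};
    p.var = v;  // bits above the 15th are dropped for an unsigned bit-field
    return p.var;
}

// Small usage check.
inline void bitfield_example() {
    assert(store_and_read(32767) == 32767);  // largest value that fits
    assert(store_and_read(32768) == 0);      // wraps around like a 15-bit counter
}
```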
To be able to use your bitfield int as a normal int but still get the behavior of a 15 bit int you can do it like this :
#include <cassert>
#include <cstddef>
template<typename type_t, std::size_t N>
struct bits_t final
{
bits_t() = default;
~bits_t() = default;
// deliberately non-explicit so it can convert automatically from the underlying type
bits_t(const type_t& value) :
m_value{ value }
{
}
// implicit conversion back to the underlying type
operator type_t () const
{
return m_value;
}
private:
type_t m_value : N;
};
int main()
{
bits_t<int, 15> value;
value = 16383; // 0x3FFF
assert(value == 16383);
// overflow now at bit 15 :)
value = 16384; // 0x4000, the 15th bit (the sign bit of the 15-bit field) is set.
assert(value == -16384);
return 0;
}
The bitfields feature will do the trick:
uint32_t customInt : 15;
You can try using bitfields, like many people mentioned. However, bitfields don't have a proper type. If you want to make your arbitrary-sized integers object-oriented, you can stuff the bitfield into a template:
template <int size> struct my_uint
{
uint32_t value: size;
};
typedef my_uint<13> uint13_t; // some people use "using" syntax to do this
typedef my_uint<14> uint14_t;
typedef my_uint<15> uint15_t;
However, you have now lost the arithmetic operators, and you have to implement (overload) them yourself. You have to ask yourself many questions about what you really want to do with these new types:
Do you want to overload operators like +, *, etc? Which ones?
Do you want to support arrays?
What is the maximum size you want to support? In my example, it's 32.
Do you want to support implicit constructors, e.g. uint15_t(uint32_t)?
How to support overflow?
There is no way to make your new types behave like built-in types - you can come close but cannot quite do it. That is, if you write a big program where you work with uint15_t and later you decide to switch to uint16_t, there will be subtle changes caused by uint16_t being a built-in type (e.g. consider rules about implicit conversions).
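To give an idea of the work involved, here is a sketch of one possible set of answers to those questions. It swaps the bit-field for explicit masking on a wider integer, makes the constructor explicit, and wraps on overflow; the uint15 class name and all of these choices are assumptions of the example, not the only reasonable design.

```cpp
#include <cstdint>

// Hypothetical 15-bit unsigned type; arithmetic wraps modulo 2^15.
class uint15 {
public:
    // explicit: no silent conversion from wider integers
    constexpr explicit uint15(std::uint32_t v = 0)
        : value_(static_cast<std::uint16_t>(v & mask)) {}

    constexpr std::uint16_t get() const { return value_; }

    // Compute in 32 bits, then let the constructor mask the result back to 15 bits.
    friend constexpr uint15 operator+(uint15 a, uint15 b) {
        return uint15(std::uint32_t{a.value_} + b.value_);
    }
    friend constexpr uint15 operator*(uint15 a, uint15 b) {
        return uint15(std::uint32_t{a.value_} * b.value_);
    }
    friend constexpr bool operator==(uint15 a, uint15 b) { return a.value_ == b.value_; }
    friend constexpr bool operator!=(uint15 a, uint15 b) { return a.value_ != b.value_; }

private:
    static constexpr std::uint16_t mask = 0x7FFF;
    std::uint16_t value_;
};

static_assert((uint15(32767) + uint15(1)).get() == 0, "addition wraps at 2^15");
```

Even this small sketch leaves out subtraction, shifts, comparisons other than equality, and mixed arithmetic with built-in integers, which is exactly the maintenance cost described above.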

Continuous enum C++11

Is there a way to check in C++11 if an enum is continuous?
It is perfectly valid to give an enum values that are not continuous. Is there a feature, such as a type trait in C++14, C++17 or maybe C++20, to check whether the enum is continuous? It would be used in a static_assert.
A small example follows:
enum class Types_Discontinuous {
A = 10,
B = 1,
C = 100
};
enum class Types_Continuous {
A = 0,
B = 1,
C = 2
};
static_assert(SOME_TEST<Types_Discontinuous>::value, "Enum should be continuous"); // Fails
static_assert(SOME_TEST<Types_Continuous>::value, "Enum should be continuous"); // Passes
This is not possible in pure C++, because there is no way to enumerate the enum values, or discover the number of the values and minimum and maximum values. But you could try using the help of your compiler to implement something close to what you want. For example, in gcc it is possible to enforce a compilation error if a switch statement does not handle all values of an enum:
#include <algorithm> // std::minmax
enum class my_enum {
A = 0,
B = 1,
C = 2
};
#pragma GCC diagnostic push
#if __GNUC__ < 5
#pragma GCC diagnostic error "-Wswitch"
#else
#pragma GCC diagnostic error "-Wswitch-enum"
#endif
constexpr bool is_my_enum_continuous(my_enum t = my_enum())
{
// Check that we know all enum values. Effectively works as a static assert.
switch (t)
{
// Intentionally no default case.
// The compiler will give an error if not all enum values are listed below.
case my_enum::A:
case my_enum::B:
case my_enum::C:
break;
}
// Check that the enum is continuous
auto [min, max] = std::minmax({my_enum::A, my_enum::B, my_enum::C});
return static_cast< int >(min) == 0 && static_cast< int >(max) == 2;
}
#pragma GCC diagnostic pop
Obviously, this is specialized for a given enum, but the definition of such functions can be automated with the preprocessor.
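One way that automation could look is an X-macro that lists the enumerators once and generates both the enum and a check from that list. Note that this sketch replaces the -Wswitch trick with the X-macro, so it no longer depends on a compiler diagnostic; all macro and function names are invented for the example, and the values are assumed to fit in an int.

```cpp
#include <cstddef>

// True when every integer between the smallest and the largest value appears in the list.
template <std::size_t N>
constexpr bool covers_whole_range(const int (&values)[N]) {
    int lowest = values[0];
    int highest = values[0];
    for (std::size_t i = 1; i < N; ++i) {
        if (values[i] < lowest) lowest = values[i];
        if (values[i] > highest) highest = values[i];
    }
    for (int expected = lowest; expected <= highest; ++expected) {
        bool found = false;
        for (std::size_t i = 0; i < N; ++i) {
            if (values[i] == expected) found = true;
        }
        if (!found) return false;
    }
    return true;
}

// Hypothetical helper macros: one expansion declares enumerators, the other lists values.
#define CHECKED_ENUM_DECLARE(name, value) name = value,
#define CHECKED_ENUM_VALUE(name, value) value,

// Generates the enum and an is_continuous overload from a single enumerator list.
#define DEFINE_CHECKED_ENUM(EnumName, LIST)                 \
    enum class EnumName { LIST(CHECKED_ENUM_DECLARE) };     \
    constexpr bool is_continuous(EnumName) {                \
        const int values[] = { LIST(CHECKED_ENUM_VALUE) };  \
        return covers_whole_range(values);                  \
    }

#define TYPES_CONTINUOUS(X) X(A, 0) X(B, 1) X(C, 2)
#define TYPES_DISCONTINUOUS(X) X(A, 10) X(B, 1) X(C, 100)

DEFINE_CHECKED_ENUM(Types_Continuous, TYPES_CONTINUOUS)
DEFINE_CHECKED_ENUM(Types_Discontinuous, TYPES_DISCONTINUOUS)

static_assert(is_continuous(Types_Continuous{}), "Enum should be continuous");
```

The price is that the enum has to be declared through the macro, which is intrusive and does not help with enums you do not own.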
For a number of enums you can probably hack your way through this using the Magic Enum library. For example:
#include "magic_enum.hpp"
template <typename Enum>
constexpr bool is_continuous(Enum = Enum{}) {
// make sure we're actually testing an enum
if constexpr (!std::is_enum_v<Enum>)
return false;
else {
// get a sorted list of values in the enum
const auto values = magic_enum::enum_values<Enum>();
if (std::size(values) == 0)
return true;
// for every value, either it's the same as the last one or it's one larger
auto prev = values[0];
for (auto x : values) {
auto next = static_cast<Enum>(magic_enum::enum_integer(prev) + 1);
if (x != prev && x != next)
return false;
else
prev = x;
}
return true;
}
}
Note that this is indeed, as the library name implies, "magic": the library relies on a number of compiler-specific hacks. As such it doesn't really meet your requirement of "pure C++", but it is probably as good as we can get until we have reflection facilities in the language.
I'd love to see an answer on this. I've been needing it as well.
Unfortunately, I don't think this is possible using the existing utilities. If you want to implement a type trait on this, you need support from your compiler, so writing a template for it doesn't sound feasible.
I've already extended the enumeration with a specific tag to indicate that it is contiguous, which also immediately gives you the size (see the question "enum class constructor c++ , how to pass specific value?").
Alternatively, you can write your own trait:
template<typename T> struct IsContiguous : std::false_type {};
This needs to be specialized whenever you define a contiguous enum that you want to use it with. Unfortunately, this requires some maintenance and attention if the enum gets changed.
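For completeness, this is roughly how the manual opt-in would be used with the enums from the question:

```cpp
#include <type_traits>

// Primary template: enums are assumed not to be contiguous unless stated otherwise.
template <typename T>
struct IsContiguous : std::false_type {};

enum class Types_Discontinuous { A = 10, B = 1, C = 100 };
enum class Types_Continuous { A = 0, B = 1, C = 2 };

// Manual opt-in: must be kept in sync by hand if the enum ever changes.
template <>
struct IsContiguous<Types_Continuous> : std::true_type {};

static_assert(IsContiguous<Types_Continuous>::value, "Enum should be continuous");
```

Nothing verifies the claim made by the specialization, so it documents intent more than it checks anything.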
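All enums are continuous. For an enum without a fixed underlying type, 0 is always allowed; the highest value allowed is the highest enumerator rounded up to the next (1 << N) - 1 (all bits one), and all values in between are allowed too ([dcl.enum] 9.7.1/5). If there are negative enumerators defined, the lowest value allowed is similarly defined by rounding down the lowest enumerator. For an enum with a fixed underlying type, which includes every enum class, all values of the underlying type are allowed.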
The enumerators defined in the enum are constant expressions with a value in range and the correct type, but you can define additional constants outside the enum which have the same properties:
constexpr Types_Discontinuous Two = static_cast<Types_Discontinuous>(2); // "Two" is just an illustrative name

Comparing enums to integers

I've read that you shouldn't rely on the underlying type of an enum being either signed or unsigned. From this I have concluded that you should always cast the enum value to the type that it's being compared against. Like this:
enum MyEnum { MY_ENUM_VALUE = 0 };
int i = 1;
if (i > static_cast<int>(MY_ENUM_VALUE))
{
// do stuff
}
unsigned int u = 2;
if (u > static_cast<unsigned int>(MY_ENUM_VALUE))
{
// do more stuff
}
Is this the best practice?
Edit: Does the situation change if the enum is anonymous?
An unscoped enum converts implicitly to an integer, so you can compare it against any other integer, and even against floats. The compiler applies the usual arithmetic conversions before the comparison: both operands are converted to a common integer type, or the enum is converted to a double.
Now, if your enumeration is not supposed to represent a number per se, you may want to consider creating an enum class instead:
enum class some_name { MY_ENUM_VALUE, ... };
int i;
if(i == static_cast<int>(some_name::MY_ENUM_VALUE))
{
...
}
In that case you need a cast because an enum class is not viewed as an integer by default. This helps quite a bit to avoid bugs in case you were to misuse an enum value...
Update: also, you can now specify the type of integer of an enum. This was available in older compilers too, but it was often not working quite right (in my own experience).
enum class some_name : uint8_t { ... };
That means the enumeration uses uint8_t to store those values. Practical if you are using enumeration values in a structure used to send data over a network or save in a binary file where you need to know the exact size of the data.
When not specified, the underlying type of an enum class defaults to int.
As brought up by others, if the point of using enum is just to declare numbers, then using constexpr is probably better.
constexpr int MY_CONSTANT_VALUE = 0;
This has the same effect, only the type of MY_CONSTANT_VALUE is now an int. You could go a little further and use typedef as in:
typedef int my_type_t;
constexpr my_type_t MY_CONSTANT_VALUE = 0;
I often use an enum even for a single value when that value is not generally considered an integer. There is no set-in-stone rule in this case.
Short answer: Yes
An unscoped enum usually has a signed int underlying type (the exact type is implementation-defined), and its value is implicitly converted when it is compared with an unsigned int. Your compiler might give a warning without an explicit cast, but such comparisons are still very common. However, you should cast explicitly to make the intent clear to maintainers.
And of course, an explicit cast is a must when it's a strongly typed enum.
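If you end up writing many of these casts, a small helper keeps them uniform and always casts to the enum's real underlying type. This is only a sketch: to_underlying is a hand-written helper here (C++23 ships std::to_underlying with the same intent), and the Scoped enum is invented for the example.

```cpp
#include <type_traits>

enum MyEnum { MY_ENUM_VALUE = 0 };
enum class Scoped : unsigned char { Low = 1, High = 200 };  // hypothetical enum

// Converts an enumerator to its underlying integer type, whatever that type is.
template <typename E>
constexpr typename std::underlying_type<E>::type to_underlying(E e) noexcept {
    return static_cast<typename std::underlying_type<E>::type>(e);
}

// Works for scoped enums, where a cast is mandatory...
static_assert(to_underlying(Scoped::High) == 200, "High is 200");
// ...and for unscoped ones, where it mainly documents the intent.
static_assert(to_underlying(MY_ENUM_VALUE) == 0, "MY_ENUM_VALUE is 0");
```

Because the helper follows the underlying type, it will not silently change signedness the way a hard-coded static_cast<int> or static_cast<unsigned int> can.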
Best practice is not to write
int i = 1;
if (i > static_cast<int>(MY_ENUM_VALUE))
{
// do stuff
}
instead write
MyEnum i = MY_ENUM_VALUE;
...
if ( i > MY_ENUM_VALUE ) {..}
But if - as in your example - you only have one value in your enum it is better to declare it as a constant instead of an enum.

Type safe enum bit flags

I'm looking to use a set of bit flags for my current issue. These flags are (nicely) defined as part of an enum, however I understand that when you OR two values from an enum the return type of the OR operation has type int.
What I'm currently looking for is a solution which will allow the users of the bit mask to remain type safe, as such I have created the following overload for operator |
enum ENUM
{
ONE = 0x01,
TWO = 0x02,
THREE = 0x04,
FOUR = 0x08,
FIVE = 0x10,
SIX = 0x20
};
ENUM operator | ( ENUM lhs, ENUM rhs )
{
// Cast to int first otherwise we'll just end up recursing
return static_cast< ENUM >( static_cast< int >( lhs ) | static_cast< int >( rhs ) );
}
void enumTest( ENUM v )
{
}
int main( int argc, char **argv )
{
// Valid calls to enumTest
enumTest( ONE | TWO | FIVE );
enumTest( TWO | THREE | FOUR | FIVE );
enumTest( ONE | TWO | THREE | FOUR | FIVE | SIX );
return 0;
}
Does this overload really provide type safety? Does casting an int containing values not defined in the enum cause undefined behaviour? Are there any caveats to be aware of?
Does this overload really provide type safety?
In this case, yes. The valid range of values for the enumeration goes at least up to (but not necessarily including) the next largest power of two after the largest named enumerator, in order to allow it to be used for bitmasks like this. So any bitwise operation on two values will give a value representable by this type.
Does casting an int containing values not defined in the enum cause undefined behaviour?
No, as long as the values are representable by the enumeration, which they are here.
Are there any caveats to be aware of?
If you were doing operations such as arithmetic, which could take the value out of range, then the result of converting back to the enumeration was unspecified before C++17 and is undefined behaviour since C++17 (for an enumeration without a fixed underlying type).
If you think about type safety, it is better to use std::bitset
enum BITS { A, B, C, D };
std::bitset<4> bset, bset1;
bset.set(A); bset.set(C);
bset1[B] = 1;
assert(bset[A] == bset[C]);
assert(bset[A] != bset[B]);
assert(bset1 != bset);
The values of your constants are not closed under OR. In other words, it's possible that the result of an OR of two ENUM constants will result in a value that is not an ENUM constant:
0x30 == (FIVE | SIX);
The standard says that this is ok: an enumeration can have a value not equal to any of its enumerators (constants). Presumably it's to allow this type of usage.
In my opinion this is not type safe because if you were to look at the implementation of enumTest you have to be aware that the argument type is ENUM but it might have a value that's not an ENUM enumerator.
I think that if these are simply bit flags then do what the compiler wants you to: use an int for the combination of flags.
With a simple enum such as yours:
enum ENUM
{
ONE = 0x01,
TWO = 0x02,
...
};
it is implementation-defined what the underlying type is (most likely int) [1], but as long as you are going to use | (bitwise or) for creating masks, the result will never require a wider type than the largest value from this enum.
[1] "The underlying type of an enumeration is an integral type that can represent all the enumerator values defined in the enumeration. It is implementation-defined which integral type is used as the underlying type for an enumeration except that the underlying type shall not be larger than int unless the value of an enumerator cannot fit in an int or unsigned int."
This is my approach to bit flags:
template<typename E>
class Options {
unsigned long values;
constexpr Options(unsigned long v, int) : values{v} {}
public:
constexpr Options() : values(0) {}
constexpr Options(unsigned n) : values{1UL << n} {}
constexpr bool operator==(Options const& other) const {
return (values & other.values) == other.values;
}
constexpr bool operator!=(Options const& other) const {
return !operator==(other);
}
constexpr Options operator+(Options const& other) const {
return {values | other.values, 0};
}
Options& operator+=(Options const& other) {
values |= other.values;
return *this;
}
Options& operator-=(Options const& other) {
values &= ~other.values;
return *this;
}
};
#define DECLARE_OPTIONS(name) class name##__Tag; using name = Options<name##__Tag>
#define DEFINE_OPTION(name, option, index) constexpr name option(index)
You can use it like so:
DECLARE_OPTIONS(ENUM);
DEFINE_OPTION(ENUM, ONE, 0);
DEFINE_OPTION(ENUM, TWO, 1);
DEFINE_OPTION(ENUM, THREE, 2);
DEFINE_OPTION(ENUM, FOUR, 3);
Then ONE + TWO is still of type ENUM. And you can re-use the class to define multiple bit flag sets that are of different, incompatible types.
I personally don't like using | and & to set and test bits. It's the logical operation that needs to be done to set and test, but they don't express the meaning of the operation unless you think about bitwise operations. If you read out ONE | TWO you might think that you want either ONE or TWO, not necessarily both. This is why I prefer using + to add flags together and == to test if a flag is set.
See this blog post for more details on my suggested implementation.

Is using enum for integer bit oriented operations in C++ reliable/safe?

Consider the following (simplified) code:
enum eTestMode
{
TM_BASIC = 1, // 1 << 0
TM_ADV_1 = 1 << 1,
TM_ADV_2 = 1 << 2
};
...
int m_iTestMode; // a "bit field"
bool isSet( eTestMode tsm )
{
return ( (m_iTestMode & tsm) == tsm );
}
void setTestMode( eTestMode tsm )
{
m_iTestMode |= tsm;
}
Is this reliable, safe and/or good practice? Or is there a better way of achieving what I want to do, apart from using const ints instead of enum? I would really prefer enums, but code reliability is more important than readability.
I can't see anything bad in that design.
However, keep in mind that enum types can hold unspecified values. Depending on who uses your functions, you might want to check first that the value of tsm is a valid enumeration value.
Since enums are integer values, one could do something like:
eTestMode tsm = static_cast<eTestMode>(17); // We consider here that 17 is not a valid value for your enumeration.
However, doing this is ugly and you might just consider that doing so results in undefined behavior.
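If you do want to guard against such values, one option is to validate the raw integer before it is ever cast to the enum type. A minimal sketch, where kAllTestModes and isValidTestMode are names made up for the example and where rejecting zero is a choice of the sketch:

```cpp
enum eTestMode
{
    TM_BASIC = 1,  // 1 << 0
    TM_ADV_1 = 1 << 1,
    TM_ADV_2 = 1 << 2
};

// Union of every known flag; must be updated when an enumerator is added.
constexpr int kAllTestModes = TM_BASIC | TM_ADV_1 | TM_ADV_2;

// True when value is non-zero and only uses bits that belong to known enumerators.
constexpr bool isValidTestMode(int value)
{
    return value != 0 && (value & ~kAllTestModes) == 0;
}

static_assert(isValidTestMode(TM_ADV_1), "a single known flag is valid");
static_assert(!isValidTestMode(17), "17 uses a bit that no enumerator defines");
```

Checking the int first means the cast to eTestMode only ever happens for values the enumeration can represent.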
There is no problem. You can even use a variable of type eTestMode (and define the bit manipulation operators for that type), as it is guaranteed to hold all possible values in that case.
See also
What is the size of an enum in C?
For some compilers (e.g. VC++) this width specifier can be used (the enum name : type syntax has been standard since C++11, but unsigned __int32 is specific to VC++):
enum eTestMode : unsigned __int32
{
TM_BASIC = 1, // 1 << 0
TM_ADV_1 = 1 << 1,
TM_ADV_2 = 1 << 2
};
Using enums for representing bit patterns, masks and flags is not always a good idea because enums generally promote to signed integer type, while for bit-based operation unsigned types are almost always preferable.
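If you still want the enum, fixing an unsigned underlying type (standard since C++11) avoids the signed promotion. Here is a sketch of the question's code adapted that way; the TestModes wrapper class, clearTestMode and the TM_TOP flag are additions made for the example, with TM_TOP there to show that the top bit is usable without touching the sign bit of an int.

```cpp
#include <cstdint>
#include <type_traits>

// Fixed unsigned underlying type: the flags promote to an unsigned type in expressions.
enum eTestMode : std::uint32_t
{
    TM_BASIC = 1u << 0,
    TM_ADV_1 = 1u << 1,
    TM_ADV_2 = 1u << 2,
    TM_TOP = 1u << 31  // hypothetical flag using the highest bit
};

// Hypothetical wrapper around the question's m_iTestMode member.
class TestModes
{
public:
    constexpr bool isSet(eTestMode tsm) const { return (m_iTestMode & tsm) == tsm; }
    constexpr TestModes& setTestMode(eTestMode tsm)
    {
        m_iTestMode |= tsm;
        return *this;
    }
    constexpr TestModes& clearTestMode(eTestMode tsm)
    {
        m_iTestMode &= ~static_cast<std::uint32_t>(tsm);
        return *this;
    }

private:
    std::uint32_t m_iTestMode = 0;  // unsigned storage for the "bit field"
};

static_assert(std::is_same<std::underlying_type<eTestMode>::type, std::uint32_t>::value,
              "the underlying type is the one we asked for");
```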