Are constant "flags" considered good style? (C++)

I am currently working on a library and I wonder if it's considered good style to have all-caps constants like:
constexpr int ONLY_LOWERCASE = 1;
constexpr int ONLY_UPPERCASE = 2;
And so on. I plan on using it to let the library user control the behavior of the functions like:
doSomeThing(var, ONLY_UPPERCASE);
Thanks

It seems like you are using an integer type to store boolean data. For memory-usage reasons it may be better to use a boolean-sized type for this purpose.
One way to achieve this is an enum or enum class whose underlying type is bool:
enum class string_case : bool {
    ONLY_LOWERCASE,
    ONLY_UPPERCASE
};
This way the flag occupies a single byte, instead of the 4 bytes each int constant takes in your example.
Usage example:
doSomeThing(var, string_case::ONLY_UPPERCASE);
Edit
In case you have more than 2 flags, you can still use an enum (just without the bool underlying type):
enum class string_case {
    ONLY_LOWERCASE = 1,
    ONLY_UPPERCASE = 2,
    FLAG_3 = 3,
    FLAG_4 = 4
};
Even so, a variable of this enum type uses only 4 bytes (instead of 4 bytes * flag_count for separate int constants).
Another approach, if multiple flags can be set at the same time (and you don't want to do bit manipulation in your enum calculations), is to use a struct with bit-fields:
struct options {
    bool option_1 : 1;
    bool option_2 : 1;
    bool option_3 : 1;
    bool option_4 : 1;
};
This way you only use as many bytes as are needed to store those bits.
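A minimal, self-contained usage sketch of the bit-field struct above (my own; the printed size is typical, not guaranteed by the standard):
#include <iostream>

struct options {
    bool option_1 : 1;
    bool option_2 : 1;
    bool option_3 : 1;
    bool option_4 : 1;
};

int main() {
    options opts{};        // value-initialize: all flags start as false
    opts.option_1 = true;  // set individual flags by name
    opts.option_3 = true;

    // On most implementations the four 1-bit fields pack into one byte,
    // although the exact layout is implementation-defined.
    std::cout << sizeof(options) << '\n';  // typically prints 1
    return 0;
}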

Related

Weird Data Types - C++

I know this sounds like a silly question, but I would like to know if it is possible in any way to make a custom variable size like this rather than using plain 8, 16, 32 and 64 bit integers:
uint15_t var; //as an example
I did some research online and I found nothing that works on all sizes (only stuff like 12 bit and 24 bit). (I was also wondering if bitfields would work with other data sizes too).
And yes, I know you could use a 16 bit integer to store a 15 bit value, but I just wanted to know if it is possible to do something like this.
How can I make and implement custom integer sizes?
Inside a struct or class, you can use the bit-field feature to declare an integer of the width you want for a member variable:
unsigned int var : 15;
... it won't be very CPU-efficient (since the compiler will have to generate bitwise-operations on most accesses to that variable) but it will give you the desired 15-bit behavior.
To be able to use your bit-field int as a normal int, but still get the behavior of a 15-bit int, you can do it like this:
#include <cassert>
#include <utility>
template<typename type_t, std::size_t N>
struct bits_t final
{
    bits_t() = default;
    ~bits_t() = default;

    // intentionally not explicit, so it converts automatically from the underlying type
    bits_t(const type_t& value) :
        m_value{ value }
    {
    }

    // implicit conversion back to underlying type
    operator type_t()
    {
        return m_value;
    }

private:
    type_t m_value : N;
};

int main()
{
    bits_t<int, 15> value;
    value = 16383; // 0x3FFF
    assert(value == 16383);

    // overflow now at bit 15 :)
    value = 16384; // 0x4000, 16th bit is set.
    assert(value == -16384);

    return 0;
}
The bit-field feature will do the trick ...
uint32_t customInt : 15;
You can try using bitfields, like many people mentioned. However, bitfields don't have a proper type. If you want to make your arbitrary-sized integers object-oriented, you can stuff the bitfield into a template:
template <int size> struct my_uint
{
uint32_t value: size;
};
typedef my_uint<13> uint13_t; // some people use "using" syntax to do this
typedef my_uint<14> uint14_t;
typedef my_uint<15> uint15_t;
However, now you have lost the arithmetic operators, and you have to implement (overload) them yourself (a sketch of one such overload follows at the end of this answer). You have to ask yourself many questions about what you really want to do with these new types:
Do you want to overload operators like +, *, etc? Which ones?
Do you want to support arrays?
What is the maximum size you want to support? In my example, it's 32.
Do you want to support implicit constructors, e.g. uint15_t(uint32_t)?
How to support overflow?
There is no way to make your new types behave like built-in types - you can come close but cannot quite do it. That is, if you write a big program where you work with uint15_t and later you decide to switch to uint16_t, there will be subtle changes caused by uint16_t being a built-in type (e.g. consider rules about implicit conversions).
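To give a feel for the work involved, here is a minimal sketch (mine, not part of the answer) of overloading a single operator for the my_uint template above; the wrap-around semantics are chosen arbitrarily for illustration:
#include <cstdint>

template <int size> struct my_uint
{
    uint32_t value : size;
};

template <int size>
my_uint<size> operator+(my_uint<size> a, my_uint<size> b)
{
    my_uint<size> r;
    r.value = a.value + b.value;  // assignment to the bit-field truncates modulo 2^size
    return r;
}

typedef my_uint<15> uint15_t;

int main()
{
    uint15_t a{ 20000 }, b{ 20000 };
    uint15_t c = a + b;           // 40000 mod 2^15 == 7232
    return c.value == 7232 ? 0 : 1;
}
Every other operator you decide to support needs a similar decision about overflow behaviour.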

How can I make classes easily configurable without run-time overhead?

I recently started playing with Arduinos, and, coming from the Java world, I am struggling to contend with the constraints of microcontroller programming. I am slipping ever closer to the Arduino 2-kilobyte RAM limit.
A puzzle I face constantly is how to make code more reusable and reconfigurable, without increasing its compiled size, especially when it is used in only one particular configuration in a particular build.
For example, a generic driver class for 7-segment number displays will need, at minimum, configuration for the I/O pin number for each LED segment, to make the class usable with different circuits:
class SevenSeg {
  private:
    byte pinA;   // top
    byte pinB;   // upper right
    byte pinC;   // lower right
    byte pinD;   // bottom
    byte pinE;   // lower left
    byte pinF;   // upper left
    byte pinG;   // middle
    byte pinDP;  // decimal point

  public:
    void setSegmentPins(byte a, byte b, byte c, byte d, byte e, byte f, byte g, byte dp) {
        /* ... init fields ... */
    }
    ...
};
SevenSeg display;
display.setSegmentPins(12, 10, 7, 6, 5, 9, 8, 13);
...
The price I'm paying for flexibility here is 8 extra RAM bytes for extra fields, and more code bytes and overhead every time the class accesses those fields. But during any particular compilation of this class on any particular circuit, this class is only instantiated with one set of values, and those values are initialized before ever being read. They are effectively constant, as if I had written:
class SevenSeg {
private:
static const byte pinA = 12;
static const byte pinB = 10;
static const byte pinC = 7;
static const byte pinD = 6;
static const byte pinE = 5;
static const byte pinF = 9;
static const byte pinG = 8;
static const byte pinDP = 13;
...
};
Unfortunately, GCC does not share this understanding.
I considered using a "template":
template <byte pinA, byte pinB, byte pinC, byte pinD, byte pinE, byte pinF, byte pinG, byte pinDP> class SevenSeg {
...
};
SevenSeg<12, 10, 7, 6, 5, 9, 8, 13> display;
For this reduced example, where the particular parameters are homogeneous, and always specified, this is not too cumbersome. But I want more parameters: For example I also need to be able to configure the numbers of the common pins for the display's digits (for a configurable amount of digits), and configure the LED polarity: common anode or common cathode. And maybe more options in the future. It will get ugly cramming that into the template initialization line. And this problem is not limited to this one class: I am falling into this rift everywhere.
I want to make my code configurable, reusable, beautiful, but every time I add configurable fields to something, it eats up more RAM bytes just to get back to the same level of functionality.
Watching the free memory number creep down feels like being punished for writing code, and that's not fun.
I feel like I'm missing some tricks.
I've added a bounty to this question because although I quite like the template config struct thing shown by #alterigel, I don't like that it forces respecification of the precise types of each field, which is verbose and feels brittle. It's particularly icky with arrays (compounded by some Arduino limitations, such as not supporting constexpr inline or std::array, apparently).
The config struct ends up consisting almost entirely of structural boilerplate, rather than what I would ideally like: just a concise description of keys and values.
I must be missing some alternatives due to not knowing C++. More templates? Macros? Inheritance? Inlining tricks? To avoid this question becoming too broad, I'm specifically interested in ways of doing this that have zero run-time overhead.
EDIT: I have removed the rest of the example code from here. I included it to avoid getting shut down by the "too broad" police, but it seemed to be distracting people. My question has nothing to do with 7-segment displays, or even Arduinos necessarily. I just want to know the ways in C++ to configure class behavior at compile time that have zero run-time overhead.
You can use a single struct to encapsulate these constants as named static constants, rather than as individual template parameters. You can then pass this struct type as a single template parameter, and the template can expect to find each constant by name. For example:
struct YesterdaysConfig {
static const byte pinA = 3;
static const byte pinB = 4;
static const byte pinC = 5;
static const byte pinD = 6;
static const byte pinE = 7;
static const byte pinF = 8;
static const byte pinG = 9;
static const byte pinDP = 10;
};
struct TodaysConfig {
static const byte pinA = 12;
static const byte pinB = 10;
static const byte pinC = 7;
static const byte pinD = 6;
static const byte pinE = 5;
static const byte pinF = 9;
static const byte pinG = 8;
static const byte pinDP = 13;
// Easy to extend:
static const byte extraData = 0xFF;
using customType = double;
};
Your template can expect any type which provides the required fields as named static variables within the struct's scope.
An example template implementation:
template<typename ConfigT>
class SevenSeg {
  public:
    SevenSeg() {
        theHardware.setSegmentPins(
            ConfigT::pinA,
            ConfigT::pinB,
            ConfigT::pinC,
            ConfigT::pinD,
            ConfigT::pinE,
            ConfigT::pinF,
            ConfigT::pinG,
            ConfigT::pinDP
        );
    }
};
And an example usage:
auto display = SevenSeg<TodaysConfig>{};
If I understand your situation correctly, whenever you compile your program, you target a single, specific architecture/device with one specific setting. There is never a case where you program would deal with multiple settings at the same time, is that right?
I also assume that your whole project is ultimately relatively small.
If that is the case, I would probably forgo any fancy templates or objects. Instead, for every device you desire to compile for, create a separate header file with all settings given as global constexpr constants or enums. If you change your target, you need to supply a different config header file and recompile the whole program.
The only missing piece is how to make your program include the appropriate config header. That can be solved with the preprocessor: depending on the desired device, pass a different -D<setting_identification_macro> on the command line when invoking the compiler. Then create a header file which acts as a selector. In there you list all supported devices in the form:
#ifdef setting_identification_macro
#include "corresponding_config.h"
#endif
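For concreteness, a sketch of what such a selector header might look like (the macro and file names here are hypothetical, not from the answer):
// config_selector.h
#if defined(BOARD_UNO_DISPLAY_A)
#include "config_uno_display_a.h"
#elif defined(BOARD_NANO_DISPLAY_B)
#include "config_nano_display_b.h"
#else
#error "No board configuration selected; pass -DBOARD_... when invoking the compiler"
#endif
Building with -DBOARD_UNO_DISPLAY_A then pulls in the matching set of constexpr constants, and a missing or unknown selection fails loudly at compile time.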
You might cringe at this "hacky" solution, but it has many advantages:
No run-time overhead as you desired
Absolutely no boilerplate code. No structs to pass around or template arguments.
No change in code required when switching between settings. You just change the command line parameter when invoking the compiler.
Can be done in old/limited C++ or plain C
This doesn't address the whole problem, but it improves the pgm_read helper:
#include <type_traits>  // for std::is_same

template<class T>
auto pgm_read(const T *p) {
    if constexpr (std::is_same<T, float>::value) {
        return pgm_read_float(p);
    } else if constexpr (sizeof(T) == 1) {
        return pgm_read_byte(p);
    } else if constexpr (sizeof(T) == 2) {
        return pgm_read_word(p);
    } else if constexpr (sizeof(T) == 4) {
        return pgm_read_dword(p);
    }
}
This has to be a template for the if constexpr to work correctly.
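A possible usage sketch, combining it with the helper above (mine; it assumes an AVR target where <avr/pgmspace.h> provides the pgm_read_* macros, and the table contents are illustrative):
#include <avr/pgmspace.h>
#include <stdint.h>

const uint16_t sine_table[] PROGMEM = { 0, 784, 1534, 2214 };

uint16_t lookup(uint8_t i)
{
    // T is deduced as uint16_t, so this dispatches to pgm_read_word at compile time.
    return pgm_read(&sine_table[i]);
}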

C++. Struct padding / alignment on different platforms and automatic check of layout compatibility

I have an embedded device connected to a PC,
and some big struct S with many fields and arrays of a custom-defined type FixedPoint_t.
FixedPoint_t is a templated POD class with exactly one data member, whose size varies from char to long depending on the template parameters. In any case it passes static_assert((std::is_pod<FixedPoint_t<0,8,8> >::value == true),"");
It would be good if this big struct had a compatible underlying memory representation on both the embedded system and the controlling PC. This allows significant simplification of the communication protocol, down to commands like "set word/byte with offset N to value V". Assume endianness is the same on both platforms.
I see 3 solutions here:
Use something like #pragma pack on both sides.
BUT I got a warning when I put __attribute__((packed)) on the struct S declaration:
warning: ignoring packed attribute because of unpacked non-POD field.
This is because FixedPoint_t is not declared as packed.
I don't want to declare it as packed because this type is widely used in the whole program and packing could lead to a performance drop.
Make correct struct serialization. This is not acceptable because of code bloat, additional RAM usage... The protocol will also be more complicated because I need random access to the struct. Right now I think this is not an option.
Control padding manually. I can add some fields, reorder others... just to achieve no padding on both platforms. This would satisfy me at the moment. But I need a good way to write a test that shows me whether padding is there or not.
I can compare the sum of sizeof() of each field to sizeof(struct).
I can compare offsetof() of each struct field on both platforms.
Both variants are ugly enough...
What do you recommend? I am especially interested in manual padding control and automatic padding detection in tests.
EDIT: Is it sufficient to compare sizeof(big struct) on the two platforms to detect layout compatibility (assuming endianness is equal)? I think the sizes should not match if the padding differs.
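As a sketch of the kind of automatic check being asked about (mine, not from the thread; it needs C++11 static_assert and uses the serialize_me_t layout from EDIT2 below, with the no-padding offsets expected on the 8-bit target):
#include <cstddef>
#include <cstdint>

typedef struct
{
    uint8_t f8;
    uint32_t f32;
    uint8_t arr[5];
} serialize_me_t;

// Compilation fails on any platform whose layout deviates from the expected one:
static_assert(offsetof(serialize_me_t, f8)  == 0, "unexpected padding before f8");
static_assert(offsetof(serialize_me_t, f32) == 1, "unexpected padding before f32");
static_assert(offsetof(serialize_me_t, arr) == 5, "unexpected padding before arr");
static_assert(sizeof(serialize_me_t) == 10, "trailing padding or size mismatch");
On a platform that inserts padding, the corresponding assertion fires at compile time, which is the detection being asked for. Note that sizeof alone can, at least in principle, match even when member offsets differ, so checking each offsetof is the safer test.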
EDIT2:
//this struct should have padding on 32bit machine
//and has no padding on 8bit
typedef struct
{
uint8_t f8;
uint32_t f32;
uint8_t arr[5];
} serialize_me_t;
//count of members in struct
#define SERTABLE_LEN 3
//one table entry for each serialize_me_t data member
static const struct {
size_t width;
size_t offset;
// size_t cnt; //why we need cnt?
} ser_des_table[SERTABLE_LEN] =
{
{ sizeof(serialize_me_t::f8), offsetof(serialize_me_t, f8)},
{ sizeof(serialize_me_t::f32), offsetof(serialize_me_t, f32)},
{ sizeof(serialize_me_t::arr), offsetof(serialize_me_t, arr)},
};
void serialize(void* serialize_me_ptr, char* buf)
{
    const char* struct_ptr = (const char*)serialize_me_ptr;
    for(int i = 0; i < SERTABLE_LEN; i++)
    {
        // offsets are relative to the start of the struct, so don't accumulate them
        memcpy(buf, struct_ptr + ser_des_table[i].offset, ser_des_table[i].width);
        buf += ser_des_table[i].width;
    }
}
I strongly recommend using option 2:
You are safe against future changes (new PCD/ABI, compiler, platform, etc.)
Code bloat can be kept to a minimum if well thought out. Only one function is needed per direction.
You can create the required tables/code (semi-)automatically (I use Python for this). That way both sides stay in sync.
You definitely should add a CRC to the data anyway. As you likely do not want to calculate it in the rx/tx interrupt, you'll have to provide an array anyway.
Using the struct directly will soon become a maintenance nightmare. Even worse if someone else has to track this code.
Protocols, etc. tend to be reused. If it is a platform with different endianness, the other approach goes bang.
To create the data structs and ser/des tables, you can use offsetof to get the offset of each field in the struct. If that table is made an include file, it can be used on both sides. You can even create the struct and the table, e.g. with a Python script. Adding that to the build process ensures they are always up to date and avoids additional typing.
For instance (in C, just to get idea):
// protocol.inc
typedef struct {
    uint32_t i;
    uint16_t s[5];
    uint32_t j;
} ProtocolType;

static const struct {
    size_t width;
    size_t offset;
    size_t cnt;
} ser_des_table[] = {
    { sizeof(((ProtocolType *)0)->i),    offsetof(ProtocolType, i), 1 },
    { sizeof(((ProtocolType *)0)->s[0]), offsetof(ProtocolType, s), 5 },
    ...
};
If not created automatically, I'd use macros to generate the data, possibly by including the file twice: once to generate the struct definition and once for the table. This is possible by redefining the macros in between.
You should care about the representation of signed integers and floats (implementation defined, floats are likely IEEE754 as proposed by the standard).
As an alternative to the width field, you can use a "type" code (e.g. a char which maps to an implementation-defined type). This way you can add custom types with the same width but different encoding (e.g. uint32_t and an IEEE754 float). This completely abstracts the network protocol encoding from the physical machine (the best solution). Note that nothing stops you from using common encodings, which do not complicate the code a single bit (literally).
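A sketch of that "type code" variant (mine; the tag names and the example struct are illustrative):
#include <cstddef>
#include <cstdint>

typedef struct {
    uint32_t i;
    uint16_t s[5];
    uint32_t j;
} ProtocolType;

enum FieldType : uint8_t { T_U16, T_U32, T_F32_IEEE754 };  // wire encodings

static const struct {
    FieldType type;   // encodes both width and encoding on the wire
    size_t offset;
    size_t cnt;
} ser_des_table2[] = {
    { T_U32, offsetof(ProtocolType, i), 1 },
    { T_U16, offsetof(ProtocolType, s), 5 },
    { T_U32, offsetof(ProtocolType, j), 1 },
};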

Is using enum for integer bit oriented operations in C++ reliable/safe?

Consider the following (simplified) code:
enum eTestMode
{
TM_BASIC = 1, // 1 << 0
TM_ADV_1 = 1 << 1,
TM_ADV_2 = 1 << 2
};
...
int m_iTestMode; // a "bit field"
bool isSet( eTestMode tsm )
{
return ( (m_iTestMode & tsm) == tsm );
}
void setTestMode( eTestMode tsm )
{
m_iTestMode |= tsm;
}
Is this reliable, safe and/or good practice? Or is there a better way of achieving what I want to do, apart from using const ints instead of an enum? I would really prefer enums, but code reliability is more important than readability.
I can't see anything bad in that design.
However, keep in mind that enum types can hold unspecified values. Depending on who uses your functions, you might want to check first that the value of tsm is a valid enumeration value.
Since enums are integer values, one could do something like:
eTestMode tsm = static_cast<eTestMode>(17); // We consider here that 17 is not a valid value for your enumeration.
However, doing this is ugly and you might just consider that doing so results in undefined behavior.
There is no problem. You can even use a variable of type eTestMode (and define the bit manipulation operators for that type), as it is guaranteed to be able to hold all combinations of the enumerator values in this case.
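For illustration, a minimal sketch (mine) of what "defining the bit manipulation for that type" might look like, reusing the eTestMode enum from the question:
enum eTestMode
{
    TM_BASIC = 1,      // 1 << 0
    TM_ADV_1 = 1 << 1,
    TM_ADV_2 = 1 << 2
};

inline eTestMode operator|(eTestMode a, eTestMode b)
{
    return static_cast<eTestMode>(static_cast<int>(a) | static_cast<int>(b));
}

inline eTestMode& operator|=(eTestMode& a, eTestMode b) { return a = a | b; }

int main()
{
    eTestMode mode = TM_BASIC;   // the combined mask stays strongly typed
    mode |= TM_ADV_1;
    return mode == (TM_BASIC | TM_ADV_1) ? 0 : 1;
}
operator& and operator&= follow the same pattern.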
See also
What is the size of an enum in C?
For some compilers (e.g. VC++) this width specifier can be used (it was a non-standard extension at the time; specifying the underlying type of an enum is standard since C++11):
enum eTestMode : unsigned __int32
{
TM_BASIC = 1, // 1 << 0
TM_ADV_1 = 1 << 1,
TM_ADV_2 = 1 << 2
};
Using enums for representing bit patterns, masks and flags is not always a good idea, because enums generally promote to a signed integer type, while for bit-based operations unsigned types are almost always preferable.
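A small demonstration of that promotion caveat (my example; the second enum uses the fixed underlying type that is standard since C++11):
#include <cstdio>

enum eFlags { F_A = 1, F_B = 2 };                // promotes to (signed) int
enum eFlagsU : unsigned { FU_A = 1, FU_B = 2 };  // C++11: promotes to unsigned

int main()
{
    std::printf("%d\n", ~F_A);           // -2: the complement is a signed int
    std::printf("%u\n", ~FU_A & 0xFFu);  // 254: the arithmetic stays unsigned
    return 0;
}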

Should I use #define, enum or const?

In a C++ project I'm working on, I have a flag kind of value which can have four values. Those four flags can be combined. Flags describe the records in database and can be:
new record
deleted record
modified record
existing record
Now, for each record I wish to keep this attribute, so I could use an enum:
enum { xNew, xDeleted, xModified, xExisting }
However, in other places in code, I need to select which records are to be visible to the user, so I'd like to be able to pass that as a single parameter, like:
showRecords(xNew | xDeleted);
So, it seems I have three possible approaches:
#define X_NEW 0x01
#define X_DELETED 0x02
#define X_MODIFIED 0x04
#define X_EXISTING 0x08
or
typedef enum { xNew = 1, xDeleted, xModified = 4, xExisting = 8 } RecordType;
or
namespace RecordType {
static const uint8 xNew = 1;
static const uint8 xDeleted = 2;
static const uint8 xModified = 4;
static const uint8 xExisting = 8;
}
Space requirements are important (byte vs int) but not crucial. With defines I lose type safety, and with enum I lose some space (integers) and probably have to cast when I want to do a bitwise operation. With const I think I also lose type safety since a random uint8 could get in by mistake.
Is there some other cleaner way?
If not, what would you use and why?
P.S. The rest of the code is rather clean modern C++ without #defines, and I have used namespaces and templates in a few places, so those aren't out of the question either.
Combine the strategies to reduce the disadvantages of a single approach. I work in embedded systems so the following solution is based on the fact that integer and bitwise operators are fast, low memory & low in flash usage.
Place the enum in a namespace to prevent the constants from polluting the global namespace.
namespace RecordType {
An enum declares and defines a compile-time checked type. Always use compile-time type checking to make sure arguments and variables are given the correct type. There is no need for the typedef in C++.
enum TRecordType { xNew = 1, xDeleted = 2, xModified = 4, xExisting = 8,
Create another member for an invalid state. This can be useful as an error code; for example, when you want to return the state but the I/O operation fails. It is also useful for debugging; use it in initialisation lists and destructors to know whether the variable's value should be used.
xInvalid = 16 };
Consider that you have two purposes for this type. To track the current state of a record and to create a mask to select records in certain states. Create an inline function to test if the value of the type is valid for your purpose; as a state marker vs a state mask. This will catch bugs as the typedef is just an int and a value such as 0xDEADBEEF may be in your variable through uninitialised or mispointed variables.
inline bool IsValidState( TRecordType v) {
switch(v) { case xNew: case xDeleted: case xModified: case xExisting: return true; }
return false;
}
inline bool IsValidMask( TRecordType v) {
return v >= xNew && v < xInvalid ;
}
Add a using declaration if you want to use the type often.
using RecordType::TRecordType;
The value checking functions are useful in asserts to trap bad values as soon as they are used. The quicker you catch a bug when running, the less damage it can do.
Here are some examples to put it all together.
void showRecords(TRecordType mask) {
assert(RecordType::IsValidMask(mask));
// do stuff;
}
void wombleRecord(TRecord rec, TRecordType state) {
    assert(RecordType::IsValidState(state));
    if (state == RecordType::xNew) {
        // ...
    }
}

TRecordType updateRecord(TRecord rec, TRecordType newstate) {
    assert(RecordType::IsValidState(newstate));
    //...
    if (!access_was_successful) return RecordType::xInvalid;
    return newstate;
}
The only way to ensure correct value safety is to use a dedicated class with operator overloads and that is left as an exercise for another reader.
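As a starting point for that exercise, a bare-bones sketch (mine) of such a dedicated class; a real version would add the remaining operators and validity checks:
class RecordState {
public:
    enum Value { xNew = 1, xDeleted = 2, xModified = 4, xExisting = 8 };

    RecordState() : m_bits(0) {}
    explicit RecordState(Value v) : m_bits(v) {}

    RecordState& operator|=(Value v) { m_bits |= v; return *this; }
    bool has(Value v) const          { return (m_bits & v) == v; }

private:
    unsigned m_bits;   // only ever holds ORs of Value, enforced by the interface
};

// Usage: masks and states are built through the class, never as raw ints:
//   RecordState visible;
//   visible |= RecordState::xNew;
//   if (visible.has(RecordState::xDeleted)) { /* ... */ }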
Forget the defines
They will pollute your code.
bitfields?
struct RecordFlag {
unsigned isnew:1, isdeleted:1, ismodified:1, isexisting:1;
};
Don't ever use that. You are more concerned with speed than with economizing 4 ints. Using bit fields is actually slower than access to any other type.
However, bit members in structs have practical drawbacks. First, the ordering of bits in memory varies from compiler to compiler. In addition, many popular compilers generate inefficient code for reading and writing bit members, and there are potentially severe thread safety issues relating to bit fields (especially on multiprocessor systems) due to the fact that most machines cannot manipulate arbitrary sets of bits in memory, but must instead load and store whole words; for example, updating one bit field can race with an update of a neighbouring field even when each is protected by its own mutex.
Source: http://en.wikipedia.org/wiki/Bit_field
And if you need more reasons to not use bitfields, perhaps Raymond Chen will convince you in his The Old New Thing Post: The cost-benefit analysis of bitfields for a collection of booleans at http://blogs.msdn.com/oldnewthing/archive/2008/11/26/9143050.aspx
const int?
namespace RecordType {
static const uint8 xNew = 1;
static const uint8 xDeleted = 2;
static const uint8 xModified = 4;
static const uint8 xExisting = 8;
}
Putting them in a namespace is cool. If they are declared in your CPP or header file, their values will be inlined. You'll be able to use switch on those values, but it will slightly increase coupling.
Ah, yes: remove the static keyword. static is deprecated in C++ when used like this (and redundant, since a namespace-scope const already has internal linkage), and if uint8 is a built-in type you won't need it to declare these in a header included by multiple sources of the same module. In the end, the code should be:
namespace RecordType {
const uint8 xNew = 1;
const uint8 xDeleted = 2;
const uint8 xModified = 4;
const uint8 xExisting = 8;
}
The problem with this approach is that your code knows the value of your constants, which slightly increases coupling.
enum
The same as const int, with a somewhat stronger typing.
typedef enum { xNew = 1, xDeleted, xModified = 4, xExisting = 8 } RecordType;
They are still polluting the global namespace, though.
By the way... Remove the typedef. You're working in C++. Those typedefs of enums and structs are polluting the code more than anything else.
The result is kinda:
enum RecordType { xNew = 1, xDeleted, xModified = 4, xExisting = 8 } ;
void doSomething(RecordType p_eMyEnum)
{
if(p_eMyEnum == xNew)
{
// etc.
}
}
As you see, your enum is polluting the global namespace.
If you put this enum in an namespace, you'll have something like:
namespace RecordType {
enum Value { xNew = 1, xDeleted, xModified = 4, xExisting = 8 } ;
}
void doSomething(RecordType::Value p_eMyEnum)
{
if(p_eMyEnum == RecordType::xNew)
{
// etc.
}
}
extern const int ?
If you want to decrease coupling (i.e. being able to hide the values of the constants, and so, modify them as desired without needing a full recompilation), you can declare the ints as extern in the header, and as constant in the CPP file, as in the following example:
// Header.hpp
namespace RecordType {
extern const uint8 xNew ;
extern const uint8 xDeleted ;
extern const uint8 xModified ;
extern const uint8 xExisting ;
}
And:
// Source.cpp
namespace RecordType {
const uint8 xNew = 1;
const uint8 xDeleted = 2;
const uint8 xModified = 4;
const uint8 xExisting = 8;
}
You won't be able to use switch on those constants, though. So in the end, pick your poison...
:-p
Have you ruled out std::bitset? Sets of flags is what it's for. Do
typedef std::bitset<4> RecordType;
then
static const RecordType xNew(1);
static const RecordType xDeleted(2);
static const RecordType xModified(4);
static const RecordType xExisting(8);
Because there are a bunch of operator overloads for bitset, you can now do
RecordType rt = whatever; // unsigned long or RecordType expression
rt |= xNew; // set
rt &= ~xDeleted; // clear
if ((rt & xModified) != 0) ... // test
Or something very similar to that - I'd appreciate any corrections since I haven't tested this. You can also refer to the bits by index, but it's generally best to define only one set of constants, and RecordType constants are probably more useful.
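For reference, a compilable version of the above (mine; behaviour as described, with .any() used for the test so it does not rely on an implicit conversion):
#include <bitset>

typedef std::bitset<4> RecordType;

static const RecordType xNew(1);
static const RecordType xDeleted(2);
static const RecordType xModified(4);
static const RecordType xExisting(8);

int main()
{
    RecordType rt;                           // starts with all bits clear
    rt |= xNew;                              // set
    rt &= ~xDeleted;                         // clear (a no-op here)
    return (rt & xModified).any() ? 1 : 0;   // test
}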
Assuming you have ruled out bitset, I vote for the enum.
I don't buy that casting the enums is a serious disadvantage - OK so it's a bit noisy, and assigning an out-of-range value to an enum is undefined behaviour so it's theoretically possible to shoot yourself in the foot on some unusual C++ implementations. But if you only do it when necessary (which is when going from int to enum iirc), it's perfectly normal code that people have seen before.
I'm dubious about any space cost of the enum, too. uint8 variables and parameters probably won't use any less stack than ints, so only storage in classes matters. There are some cases where packing multiple bytes in a struct will win (in which case you can cast enums in and out of uint8 storage), but normally padding will kill the benefit anyhow.
So the enum has no disadvantages compared with the others, and as an advantage gives you a bit of type-safety (you can't assign some random integer value without explicitly casting) and clean ways of referring to everything.
For preference I'd also put the "= 2" in the enum, by the way. It's not necessary, but a "principle of least astonishment" suggests that all 4 definitions should look the same.
Here are a couple of articles on const vs. macros vs. enums:
Symbolic Constants
Enumeration Constants vs. Constant Objects
I think you should avoid macros especially since you wrote most of your new code is in modern C++.
If possible do NOT use macros. They aren't too much admired when it comes to modern C++.
With defines I lose type safety
Not necessarily...
// unsigned defines
#define X_NEW 0x01u
#define X_NEW (unsigned(0x01)) // if you find this more readable...
and with enum I lose some space (integers)
Not necessarily - but you do have to be explicit at points of storage...
struct X
{
RecordType recordType : 4; // use exactly 4 bits...
RecordType recordType2 : 4; // use another 4 bits, typically in the same byte
// of course, the overall record size may still be padded...
};
and probably have to cast when I want to do bitwise operation.
You can create operators to take the pain out of that:
RecordType operator|(RecordType lhs, RecordType rhs)
{
return RecordType((unsigned)lhs | (unsigned)rhs);
}
With const I think I also lose type safety since a random uint8 could get in by mistake.
The same can happen with any of these mechanisms: range and value checks are normally orthogonal to type safety (though user-defined types - i.e. your own classes - can enforce "invariants" about their data). With enums, the compiler is free to pick a larger type to host the values, and an uninitialised, corrupted or just mis-set enum variable could still end up interpreting its bit pattern as a number you wouldn't expect - comparing unequal to any of the enumeration identifiers, any combination of them, and 0.
Is there some other cleaner way? / If not, what would you use and why?
Well, in the end the tried-and-trusted C-style bitwise OR of enumerations works pretty well once you have bit fields and custom operators in the picture. You can further improve your robustness with some custom validation functions and assertions as in mat_geek's answer; techniques often equally applicable to handling string, int, double values etc..
You could argue that this is "cleaner":
enum RecordType { New, Deleted, Modified, Existing };
showRecords([](RecordType r) { return r == New || r == Deleted; });
I'm indifferent: the data bits pack tighter but the code grows significantly... it depends on how many objects you've got, and the lambdas - beautiful as they are - are still messier and harder to get right than bitwise ORs.
BTW, the argument about thread safety is pretty weak IMHO - it is best remembered as a background consideration rather than a dominant decision-driving force; sharing a mutex across the bitfields is the more likely practice even if one is unaware of their packing (mutexes are relatively bulky data members - I have to be really concerned about performance to consider having multiple mutexes on the members of one object, and I'd look carefully enough to notice they were bit fields). Any sub-word-size type could have the same problem (e.g. a uint8_t). Anyway, you could try atomic compare-and-swap style operations if you're desperate for higher concurrency.
Enums would be more appropriate as they provide "meaning to the identifiers" as well as type safety. You can clearly tell "xDeleted" is of "RecordType" and that it represents the "type of a record" (wow!) even after years. Consts would require comments for that, and they would also require jumping up and down in the code.
Even if you have to use 4 bytes to store an enum (I'm not that familiar with C++ -- I know you can specify the underlying type in C#), it's still worth it -- use enums.
In this day and age of servers with GBs of memory, things like 4 bytes vs. 1 byte of memory at the application level in general don't matter. Of course, if in your particular situation, memory usage is that important (and you can't get C++ to use a byte to back the enum), then you can consider the 'static const' route.
At the end of the day, you have to ask yourself, is it worth the maintenance hit of using 'static const' for the 3 bytes of memory savings for your data structure?
Something else to keep in mind -- IIRC, on x86, data structures are 4-byte aligned, so unless you have a number of byte-width elements in your 'record' structure, it might not actually matter. Test and make sure it does before you make a tradeoff in maintainability for performance/space.
If you want the type safety of classes, with the convenience of enumeration syntax and bit checking, consider Safe Labels in C++. I've worked with the author, and he's pretty smart.
Beware, though. In the end, this package uses templates and macros!
Do you actually need to pass around the flag values as a conceptual whole, or are you going to have a lot of per-flag code? Either way, I think having this as class or struct of 1-bit bitfields might actually be clearer:
struct RecordFlag {
unsigned isnew:1, isdeleted:1, ismodified:1, isexisting:1;
};
Then your record class could have a struct RecordFlag member variable, functions can take arguments of type struct RecordFlag, etc. The compiler should pack the bitfields together, saving space.
I probably wouldn't use an enum for this kind of a thing where the values can be combined together, more typically enums are mutually exclusive states.
But whichever method you use, to make it more clear that these are values which are bits which can be combined together, use this syntax for the actual values instead:
#define X_NEW (1 << 0)
#define X_DELETED (1 << 1)
#define X_MODIFIED (1 << 2)
#define X_EXISTING (1 << 3)
Using a left-shift there helps to indicate that each value is intended to be a single bit, and makes it less likely that later on someone will do something wrong like add a new value and assign it a value of 9.
Based on KISS, high cohesion and low coupling, ask these questions -
Who needs to know? my class, my library, other classes, other libraries, 3rd parties
What level of abstraction do I need to provide? Does the consumer understand bit operations?
Will I have to interface from VB/C# etc.?
There is a great book, "Large-Scale C++ Software Design", which promotes exposing base types externally; if you can avoid another header file/interface dependency, you should try to.
If you are using Qt you should have a look for QFlags.
The QFlags class provides a type-safe way of storing OR-combinations of enum values.
I would rather go with
typedef enum { xNew = 1, xDeleted, xModified = 4, xExisting = 8 } RecordType;
Simply because:
It is cleaner and it makes the code readable and maintainable.
It logically groups the constants.
Programmer's time is more important, unless your job is to save those 3 bytes.
Not that I like to over-engineer everything but sometimes in these cases it may be worth creating a (small) class to encapsulate this information.
If you create a class RecordType then it might have functions like:
void setDeleted();
void clearDeleted();
bool isDeleted();
etc... (or whatever convention suits)
It could validate combinations (in the case where not all combinations are legal, eg if 'new' and 'deleted' could not both be set at the same time). If you just used bit masks etc then the code that sets the state needs to validate, a class can encapsulate that logic too.
The class may also give you the ability to attach meaningful logging info to each state, you could add a function to return a string representation of the current state etc (or use the streaming operators '<<').
For all that if you are worried about storage you could still have the class only have a 'char' data member, so only take a small amount of storage (assuming it is non virtual). Of course depending on the hardware etc you may have alignment issues.
You could have the actual bit values not visible to the rest of the 'world' if they are in an anonymous namespace inside the cpp file rather than in the header file.
If you find that the code using the enum/#define/ bitmask etc has a lot of 'support' code to deal with invalid combinations, logging etc then encapsulation in a class may be worth considering. Of course most times simple problems are better off with simple solutions...
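A bare-bones sketch of what such a class might look like (my own reading of the suggestion above; the new/deleted rule is only an example of the validation it could encapsulate):
#include <cassert>

class RecordType {
public:
    void setNew()          { assert(!isDeleted()); m_flags |= New; }  // example validation rule
    void setDeleted()      { assert(!isNew()); m_flags |= Deleted; }
    void clearDeleted()    { m_flags &= ~Deleted; }
    bool isNew() const     { return (m_flags & New) != 0; }
    bool isDeleted() const { return (m_flags & Deleted) != 0; }

private:
    enum { New = 1, Deleted = 2, Modified = 4, Existing = 8 };
    unsigned char m_flags = 0;   // a single byte of storage, as suggested above
};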