How can I make classes easily configurable without run-time overhead?


I recently started playing with Arduinos, and, coming from the Java world, I am struggling to contend with the constraints of microcontroller programming. I am slipping ever closer to the Arduino 2-kilobyte RAM limit.
A puzzle I face constantly is how to make code more reusable and reconfigurable, without increasing its compiled size, especially when it is used in only one particular configuration in a particular build.
For example, a generic driver class for 7-segment number displays will need, at minimum, configuration for the I/O pin number for each LED segment, to make the class usable with different circuits:
class SevenSeg {
private:
byte pinA; // top
byte pinB; // upper right
byte pinC; // lower right
byte pinD; // bottom
byte pinE; // lower left
byte pinF; // upper left
byte pinG; // middle
byte pinDP; // decimal point
public:
void setSegmentPins(byte a, byte b, byte c, byte d, byte e, byte f, byte g, byte dp) {
/* ... init fields ... */
}
...
};
SevenSeg display;
display.setSegmentPins(12, 10, 7, 6, 5, 9, 8, 13);
...
The price I'm paying for flexibility here is 8 extra RAM bytes for extra fields, and more code bytes and overhead every time the class accesses those fields. But during any particular compilation of this class on any particular circuit, this class is only instantiated with one set of values, and those values are initialized before ever being read. They are effectively constant, as if I had written:
class SevenSeg {
private:
static const byte pinA = 12;
static const byte pinB = 10;
static const byte pinC = 7;
static const byte pinD = 6;
static const byte pinE = 5;
static const byte pinF = 9;
static const byte pinG = 8;
static const byte pinDP = 13;
...
};
Unfortunately, GCC does not share this understanding.
I considered using a "template":
template <byte pinA, byte pinB, byte pinC, byte pinD, byte pinE, byte pinF, byte pinG, byte pinDP> class SevenSeg {
...
};
SevenSeg<12, 10, 7, 6, 5, 9, 8, 13> display;
For this reduced example, where the particular parameters are homogeneous, and always specified, this is not too cumbersome. But I want more parameters: For example I also need to be able to configure the numbers of the common pins for the display's digits (for a configurable amount of digits), and configure the LED polarity: common anode or common cathode. And maybe more options in the future. It will get ugly cramming that into the template initialization line. And this problem is not limited to this one class: I am falling into this rift everywhere.
I want to make my code configurable, reusable, beautiful, but every time I add configurable fields to something, it eats up more RAM bytes just to get back to the same level of functionality.
Watching the free memory number creep down feels like being punished for writing code, and that's not fun.
I feel like I'm missing some tricks.
I've added a bounty to this question because although I quite like the template config struct thing shown by @alterigel, I don't like that it forces respecification of the precise types of each field, which is verbose and feels brittle. It's particularly icky with arrays (compounded by some Arduino limitations, such as not supporting constexpr inline or std::array, apparently).
The config struct ends up consisting almost entirely of structural boilerplate, rather than what I would ideally like: just a concise description of keys and values.
I must be missing some alternatives due to not knowing C++. More templates? Macros? Inheritance? Inlining tricks? To avoid this question becoming too broad, I'm specifically interested in ways of doing this that have zero run-time overhead.
EDIT: I have removed the rest of the example code from here. I included it to avoid getting shut down by the "too broad" police, but it seemed to be distracting people. My question has nothing to do with 7-segment displays, or even Arduinos necessarily. I just want to know the ways in C++ to configure class behavior at compile time that have zero run-time overhead.

You can use a single struct to encapsulate these constants as named static constants, rather than as individual template parameters. You can then pass this struct type as a single template parameter, and the template can expect to find each constant by name. For example:
struct YesterdaysConfig {
static const byte pinA = 3;
static const byte pinB = 4;
static const byte pinC = 5;
static const byte pinD = 6;
static const byte pinE = 7;
static const byte pinF = 8;
static const byte pinG = 9;
static const byte pinDP = 10;
};
struct TodaysConfig {
static const byte pinA = 12;
static const byte pinB = 10;
static const byte pinC = 7;
static const byte pinD = 6;
static const byte pinE = 5;
static const byte pinF = 9;
static const byte pinG = 8;
static const byte pinDP = 13;
// Easy to extend:
static const byte extraData = 0xFF;
using customType = double;
};
Your template can expect any type which provides the required fields as named static variables within the struct's scope.
An example template implementation:
template<typename ConfigT>
class SevenSeg {
public:
SevenSeg() {
theHardware.setSegmentPins(
ConfigT::pinA,
ConfigT::pinB,
ConfigT::pinC,
ConfigT::pinD,
ConfigT::pinE,
ConfigT::pinF,
ConfigT::pinG,
ConfigT::pinDP
);
}
};
And an example usage:
auto display = SevenSeg<TodaysConfig>{};
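One way to cut down the boilerplate in each config struct is to inherit shared defaults from a base struct and override only what differs. The following is a minimal sketch of that idea, not part of the answer above; DefaultDisplayConfig, BenchConfig, SevenSegSketch, commonAnode and digitCount are hypothetical names, and the byte typedef stands in for the Arduino type.

```cpp
#include <stdint.h>

typedef uint8_t byte; // stand-in for Arduino's byte type (assumption)

// Hypothetical defaults shared by every board.
struct DefaultDisplayConfig {
    static const bool commonAnode = false;
    static const byte digitCount = 1;
};

// A board config states only what differs; everything else is inherited.
struct BenchConfig : DefaultDisplayConfig {
    static const byte pinA = 12;
    static const byte digitCount = 4; // hides the inherited default
};

template <typename ConfigT>
struct SevenSegSketch {
    // Level written to a segment pin to light it; resolved at compile time.
    static constexpr bool litLevel() { return !ConfigT::commonAnode; }
    static constexpr byte digits() { return ConfigT::digitCount; }
};

static_assert(SevenSegSketch<BenchConfig>::digits() == 4, "override is used");
static_assert(SevenSegSketch<BenchConfig>::litLevel(), "default is inherited");
// The class has no data members, so the configuration costs no RAM per object.
static_assert(sizeof(SevenSegSketch<BenchConfig>) == 1, "empty class");
```

Because every value is a compile-time constant, the compiler can fold the reads into the generated code instead of loading them from RAM.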

If I understand your situation correctly, whenever you compile your program, you target a single, specific architecture/device with one specific setting. There is never a case where your program would deal with multiple settings at the same time, is that right?
I also assume that your whole project is ultimately relatively small.
If that is the case, I would probably forgo any fancy templates or objects. Instead, for every device you desire to compile for, create a separate header file with all settings given as global constexpr constants or enums. If you change your target, you need to supply a different config header file and recompile the whole program.
The only missing piece is how to make your program include the appropriate config header. That can be solved with the preprocessor: depending on the desired device, you pass a different command line option -D<setting_identification_macro> when invoking the compiler. Then, create a header file which acts as a selector. In there you list all supported devices in the form of
#ifdef setting_identification_macro
#include "corresponding_config.h"
#endif
You might cringe at this "hacky" solution, but it has many advantages:
No run-time overhead as you desired
Absolutely no boilerplate code. No structs to pass around or template arguments.
No change in code required when switching between settings. You just change the command line parameter when invoking the compiler.
Can be done in old/limited C++ or plain C
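A selector header along these lines might look like the sketch below. The macro names BOARD_BENCH and BOARD_PROTO, the file names, and the config namespace are assumptions made for illustration; the per-board constants are written inline here only so that the sketch compiles as a single file.

```cpp
// config_select.h (hypothetical file name)

// Normally the board macro comes from the compiler command line, e.g.
// -DBOARD_BENCH. It is defined here only so the sketch compiles on its own;
// a real selector would more likely stop with #error when nothing is chosen.
#if !defined(BOARD_BENCH) && !defined(BOARD_PROTO)
#define BOARD_BENCH
#endif

#if defined(BOARD_BENCH) && defined(BOARD_PROTO)
#error "Select exactly one board"
#elif defined(BOARD_BENCH)
// In a real project this branch would be: #include "config_bench.h"
namespace config {
constexpr unsigned char pinA = 12;
constexpr bool commonAnode = false;
}
#elif defined(BOARD_PROTO)
// In a real project this branch would be: #include "config_proto.h"
namespace config {
constexpr unsigned char pinA = 3;
constexpr bool commonAnode = true;
}
#endif

// Driver code reads the constants directly; nothing is stored per object.
constexpr bool litLevel() { return !config::commonAnode; }
```

Switching targets then means changing only the -D option and recompiling.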

This does nothing for the whole problem, but improves the pgm_read:
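#include <type_traits> // std::is_same
// Picks the matching avr-libc pgm_read_* call from the pointed-to type.
template<class T>
auto pgm_read(const T *p) {
if constexpr (std::is_same<T, float>::value) {
return pgm_read_float(p);
} else if constexpr (sizeof(T) == 1) {
return pgm_read_byte(p);
} else if constexpr (sizeof(T) == 2) {
return pgm_read_word(p);
} else if constexpr (sizeof(T) == 4) {
return pgm_read_dword(p);
} else {
// Fails only if this branch is instantiated, i.e. for an unsupported size.
static_assert(sizeof(T) == 0, "pgm_read: unsupported type size");
}
}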
This has to be a template for the if constexpr to work correctly.

Related

Are constant "flags" considered good style? (c++)

I am currently working on a library and I wonder if it's considered good style to have all-caps constants like:
constexpr int ONLY_LOWERCASE = 1;
constexpr int ONLY_UPPERCASE = 2;
And so on. I plan on using it to let the library user control the behavior of the functions like:
doSomeThing(var, ONLY_UPPERCASE);
Thanks

It seems like you are using an integer type to store boolean data. It might be considered better to use a boolean type for this purpose, for memory usage reasons.
One way to achieve this is to use an enum or enum class with bool as its underlying type:
enum class string_case : bool {
ONLY_LOWERCASE,
ONLY_UPPERCASE
};
This way you use a single byte to indicate the choice, instead of the 8 bytes taken by the two int constants in your example.
Usage example:
doSomeThing(var, string_case::ONLY_UPPERCASE);
Edit
In case you have more than 2 flags, you can still use enum (just without inheritance from bool):
enum class string_case {
ONLY_LOWERCASE = 1,
ONLY_UPPERCASE = 2,
FLAG_3 = 3,
FLAG_4 = 4
};
Even then, this uses only 4 bytes (instead of 4 bytes * flags_count).
Alternatively, if several flags can be set at the same time (and you don't want to do bit arithmetic on enum values), you can use a struct with bit-fields:
struct options {
bool option_1: 1;
bool option_2: 1;
bool option_3: 1;
bool option_4: 1;
};
And in this way you'll only use the amount of bytes that you need to store those bits.
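The size claims above can be checked at compile time. This is a small sketch with hypothetical type names (case_two, case_many, options_bits, options_bools); the last check reflects what mainstream compilers do rather than a guarantee of the standard, since bit-field layout is implementation-defined.

```cpp
// Two-valued choice stored in a bool-sized enum.
enum class case_two : bool { only_lowercase, only_uppercase };

// More than two values: the default underlying type of a scoped enum is int.
enum class case_many { only_lowercase = 1, only_uppercase = 2, flag_3 = 3, flag_4 = 4 };

// Four independent flags, packed as bit-fields versus one bool each.
struct options_bits {
    bool option_1 : 1;
    bool option_2 : 1;
    bool option_3 : 1;
    bool option_4 : 1;
};
struct options_bools {
    bool option_1;
    bool option_2;
    bool option_3;
    bool option_4;
};

static_assert(sizeof(case_two) == sizeof(bool), "enum takes the size of its underlying type");
static_assert(sizeof(case_many) == sizeof(int), "scoped enums default to int");
// Implementation-defined layout; holds on mainstream compilers.
static_assert(sizeof(options_bits) <= sizeof(options_bools), "bit-fields are no larger");
```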

Bool Flags vs. Unsigned Char Flags

Disclaimer: Please correct me in the event that I make any false claims in this post.
Consider a struct that contains eight bool member variables.
/*
* Struct uses one byte for each flag.
*/
struct WithBools
{
bool f0 = true;
bool f1 = true;
bool f2 = true;
bool f3 = true;
bool f4 = true;
bool f5 = true;
bool f6 = true;
bool f7 = true;
};
The space allocated to each variable is a byte in length, which seems like a waste if the variables are used solely as flags. One solution to reduce this wasted space, as far as the variables are concerned, is to encapsulate the eight flags into a single member variable of unsigned char.
/*
* Struct uses a single byte for eight flags; retrieval and
* manipulation of data is achieved through accessor functions.
*/
struct WithoutBools
{
unsigned char getFlag(unsigned index)
{
return flags & (1 << (index % 8));
}
void toggleFlag(unsigned index)
{
flags ^= (1 << (index % 8));
}
private:
unsigned char flags = 0xFF;
};
The flags are retrieved and manipulated via bitwise operators, and the struct provides an interface for the user to retrieve and manipulate the flags. While flag sizes have been reduced, we now have the two additional methods that add to the size of the struct. I do not know how to benchmark this difference, therefore I could not be certain of any fluctuation between the above structs.
My questions are:
1) Would the difference in space between these two structs be negligible?
2) Generally, is this approach of "optimising" a collection of bools by compacting them into a single byte a good idea? Either in an embedded systems context or otherwise.
3) Would a C++ compiler make such an optimisation that compacts a collection of bools wherever possible and appropriate?

we now have the two additional methods that add to the size of the struct
Methods are code and do not increase the size of the struct. Only data members contribute to the size of the structure.
3) Would a C++ compiler make such an optimisation that compacts a collection of bools wherever possible and appropriate.
That is a resounding no. The compiler is not allowed to change data types.
1) Would the difference in space between these two structs be negligible?
No, there definitely is a size difference between the two approaches.
2) Generally, is this approach of "optimising" a collection of bools by compacting them into a single byte a good idea? Either in an embedded systems context or otherwise.
Generally yes, the idiomatic way to model flags is with bit-wise manipulation inside an unsigned integer. Depending on the number of flags needed you can use std::uint8_t, std::uint16_t and so on.
However the most common way to model this is not via index as you've done, but via masks.
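A mask-based version of the flag byte could be sketched as follows. The flag names in the Flag namespace and the helper function names are hypothetical; the helpers are written as free constexpr functions so the behaviour can be checked at compile time.

```cpp
#include <cstdint>

// Hypothetical flag names; each mask selects exactly one bit.
namespace Flag {
constexpr std::uint8_t Visible = 1u << 0;
constexpr std::uint8_t Dirty   = 1u << 1;
constexpr std::uint8_t Locked  = 1u << 2;
}

// Returns bits with every flag in mask switched on.
constexpr std::uint8_t setFlags(std::uint8_t bits, std::uint8_t mask) {
    return static_cast<std::uint8_t>(bits | mask);
}

// Returns bits with every flag in mask switched off.
constexpr std::uint8_t clearFlags(std::uint8_t bits, std::uint8_t mask) {
    return static_cast<std::uint8_t>(bits & ~mask);
}

// True if at least one flag in mask is set in bits.
constexpr bool anySet(std::uint8_t bits, std::uint8_t mask) {
    return (bits & mask) != 0;
}

static_assert(anySet(setFlags(0, Flag::Dirty), Flag::Dirty), "set then test");
static_assert(!anySet(clearFlags(0xFF, Flag::Locked), Flag::Locked), "clear then test");
```

Masks can be combined, so setFlags(bits, Flag::Visible | Flag::Dirty) changes two flags in one operation, which an index-based interface cannot express.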

Would the difference in space between these two structs be negligible?
That depends on how many values you are storing and how much space you have to store them in. Here the packed struct needs 1 byte where the struct of bools needs 8.
Generally, is this approach of "optimising" a collection of bools by compacting them into a single byte a good idea? Either in an embedded systems context or otherwise.
Again, it depends on how many values and how much space. Also note that dealing with bits instead of bytes increases code size and execution time.
Many embedded systems have relatively little RAM and plenty of Flash. Code is stored in Flash, so the increased code size can be ignored, and the saved memory could be important on small RAM systems.
Would a C++ compiler make such an optimisation that compacts a collection of bools wherever possible and appropriate.
Hypothetically it could. I would consider that an aggressive space optimization, at the expense of execution time.
STL has a specialization for vector<bool> that I frequently avoid for performance reasons - vector<char> is much faster.

Portable bit fields for Handles

I want to use and store "Handles" to data in an object buffer to reduce allocation overhead. The handle is simply an index into an array with the object. However I need to detect use-after-reallocations, as this could slip in quite easily. The common approach seems to be using bit fields. However this leads to 2 problems:
Bit fields are implementation defined
Bit shifting is not portable across big/little endian machines.
What I need:
Store handle to file (file handler can manage either integer types (byte swapping) or byte arrays)
Store 2 values in the handle with minimum space
What I got:
template<class T_HandleDef, typename T_Storage = uint32_t>
struct Handle
{
typedef T_HandleDef HandleDef;
typedef T_Storage Storage;
Handle(): handle_(0){}
private:
const T_Storage handle_;
};
template<unsigned T_numIndexBits = 16, typename T_Tag = void>
struct HandleDef{
static const unsigned numIndexBits = T_numIndexBits;
};
template<class T_Handle>
struct HandleAccessor{
typedef typename T_Handle::Storage Storage;
typedef typename T_Handle::HandleDef HandleDef;
static const unsigned numIndexBits = HandleDef::numIndexBits;
static const unsigned numMagicBits = sizeof(Storage) * 8 - numIndexBits;
/// "Magic" struct that splits the handle into values
union HandleData{
struct
{
Storage index : numIndexBits;
Storage magic : numMagicBits;
};
T_Handle handle;
};
};
A usage would be for example:
typedef Handle<HandleDef<24> > FooHandle;
FooHandle Create(unsigned idx, unsigned m){
HandleAccessor<FooHandle>::HandleData data;
data.index = idx;
data.magic = m;
return data.handle;
}
My goal was to keep the handle as opaque as possible, adding a bool check but nothing else. Users of the handle should not be able to do anything with it but pass it around.
So problems I run into:
Union is UB -> Replace its T_Handle by Storage and add a ctor to Handle from Storage
How does the compiler lay out the bit field? I fill the whole union/type so there should be no padding. So probably the only thing that can differ is which member comes first, depending on endianness, correct?
How can I store handle_ to a file and load it on a machine with possibly different endianness and still have index and magic be correct? I think I can store the containing Storage 'endian-correct' and get correct values, IF both members occupy exactly half the space (2 shorts in a uint). But I always want more space for the index than for the magic value.
Note: There are already questions about bitfields and unions. Summary:
Bitfields may have unexpected padding (impossible here as the whole type is occupied)
Order of "members" depends on the compiler (only 2 possible ways here, should be safe to assume the order depends entirely on endianness, so this may or may not actually help here)
Specific binary layout of bits can be achieved by manual shifting (or e.g. wrappers http://blog.codef00.com/2014/12/06/portable-bitfields-using-c11/) -> This is not an answer here. I also need a specific layout of the values IN the bitfield. So I'm not sure what I get if I e.g. create a handle as handle = (magic << numIndexBits) | index and save/load this as binary (no endianness conversion). I am missing a big-endian machine for testing.
Note: No C++11, but boost is allowed.

The answer is pretty simple (based on another question I forgot the link to and comments by @Jeremy Friesner):
Since "numbers" are already an abstraction in C++, a value behaves the same in calculations regardless of how the machine stores it in memory. Bit shifts in C++ are likewise defined on values, not on memory layout: for an unsigned x, x << 1 always equals x * 2 (as long as the result fits), whatever the byte order of the machine.
The only time one gets endianness problems is when saving to a file, sending/receiving over a network, or accessing the value in memory in a different way (e.g. via pointers...)
One cannot use C++ bitfields here, as one cannot be 100% sure about the order of the "entries". Bitfield containers might be OK if they allow access to the data as a "number".
The safest option is (still) using bit shifts, which are very simple in this case (only 2 values). During storing/serialization the number must then be stored in an endian-agnostic way.
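The shift-based approach can be sketched like this, assuming a 32-bit handle split into 24 index bits and 8 magic bits and a file format that stores the least significant byte first. All names are hypothetical. constexpr is used only so that the checks run at compile time; for a pre-C++11 compiler it can be dropped and the functions made inline.

```cpp
#include <stdint.h>

constexpr unsigned kIndexBits = 24; // assumed split: 24 index bits, 8 magic bits
constexpr uint32_t kIndexMask = (uint32_t(1) << kIndexBits) - 1u;

// Pack and unpack with shifts only; the result depends on values, not on
// how the host stores integers in memory.
constexpr uint32_t makeHandle(uint32_t index, uint32_t magic) {
    return (magic << kIndexBits) | (index & kIndexMask);
}
constexpr uint32_t handleIndex(uint32_t h) { return h & kIndexMask; }
constexpr uint32_t handleMagic(uint32_t h) { return h >> kIndexBits; }

// Endian-agnostic serialization: byte i of the file is bits 8*i .. 8*i+7.
constexpr unsigned char byteAt(uint32_t v, unsigned i) {
    return static_cast<unsigned char>((v >> (8u * i)) & 0xFFu);
}
constexpr uint32_t fromBytesLE(unsigned char b0, unsigned char b1,
                               unsigned char b2, unsigned char b3) {
    return uint32_t(b0) | (uint32_t(b1) << 8) | (uint32_t(b2) << 16) | (uint32_t(b3) << 24);
}

static_assert(handleIndex(makeHandle(0x123456u, 0xABu)) == 0x123456u, "index round-trips");
static_assert(handleMagic(makeHandle(0x123456u, 0xABu)) == 0xABu, "magic round-trips");
```

Writing byteAt(h, 0) through byteAt(h, 3) to the file and reading them back with fromBytesLE gives the same handle on little-endian and big-endian hosts, because no step reinterprets memory.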

C++. Struct padding / alignment on different platforms and automatic check of layout compatibility

I have an embedded device connected to a PC, and a big struct S with many fields and arrays of a custom type FixedPoint_t.
FixedPoint_t is a templated POD class with exactly one data member that varies in size from char to long depending on the template parameters. In any case, it passes static_assert((std::is_pod<FixedPoint_t<0,8,8> >::value == true),"");
It would be good if this big struct had a compatible underlying memory representation on both the embedded system and the controlling PC. This would allow a significant simplification of the communication protocol to commands like "set word/byte at offset N to value V". Assume endianness is the same on both platforms.
I see 3 solutions here:
1) Use something like #pragma pack on both sides.
BUT I got a warning when I put __attribute__((packed)) on the struct S declaration:
warning: ignoring packed attribute because of unpacked non-POD field.
This is because FixedPoint_t is not declared as packed.
I don't want to declare it as packed because this type is widely used throughout the program and packing can lead to a performance drop.
2) Write proper struct serialization. This is not acceptable because of code bloat, additional RAM usage... The protocol would also be more complicated because I need random access to the struct. For now I think this is not an option.
3) Control padding manually. I can add some fields, reorder others... just to achieve no padding on both platforms. This would satisfy me for the moment. But I need a good way to write a test that shows me whether padding is there or not.
I can compare the sum of sizeof() of each field to sizeof(struct).
I can compare offsetof() of each struct field on both platforms.
Both variants are ugly enough...
What do you recommend? I am especially interested in controlling padding manually and detecting padding automatically in tests.
EDIT: Is it sufficient to compare sizeof(big struct) on two platforms to detect layout compatibility (assume endianness is equal)? I think the sizes should not match if the padding is different.
EDIT2:
//this struct should have padding on 32bit machine
//and has no padding on 8bit
typedef struct
{
uint8_t f8;
uint32_t f32;
uint8_t arr[5];
} serialize_me_t;
//count of members in struct
#define SERTABLE_LEN 3
//one table entry for each serialize_me_t data member
static const struct {
size_t width;
size_t offset;
// size_t cnt; //why we need cnt?
} ser_des_table[SERTABLE_LEN] =
{
{ sizeof(serialize_me_t::f8), offsetof(serialize_me_t, f8)},
{ sizeof(serialize_me_t::f32), offsetof(serialize_me_t, f32)},
{ sizeof(serialize_me_t::arr), offsetof(serialize_me_t, arr)},
};
void serialize(void* serialize_me_ptr, char* buf)
{
const char* struct_ptr = (const char*)serialize_me_ptr;
for(int i=0; i<SERTABLE_LEN; i++)
{
// Offsets are relative to the start of the struct, so they must not accumulate.
memcpy(buf, struct_ptr + ser_des_table[i].offset, ser_des_table[i].width);
buf += ser_des_table[i].width;
}
}

I strongly recommend option 2:
You are safe against future changes (new PCD/ABI, compiler, platform, etc.)
Code bloat can be kept to a minimum if well thought out. There is just one function needed per direction.
You can create the required tables/code (semi-)automatically (I use Python for such). This way both sides will stay in sync.
You definitely should add a CRC to the data anyway. As you likely do not want to calculate this in the rx/tx interrupt, you'll have to provide an array anyway.
Using a struct directly will soon become a maintenance nightmare. Even worse if someone else has to track this code.
Protocols, etc. tend to be reused. If the other side is a platform with different endianness, the direct-struct approach goes bang.
To create the data structs and ser/des tables, you can use offsetof to get the offset of each field in the struct. If that table is made an include file, it can be used on both sides. You can even create the struct and table e.g. by a Python script. Adding that to the build process ensures it is always up to date and you avoid additional typing.
For instance (in C, just to get idea):
// protocol.inc
typedef struct {
uint32_t i;
uint16_t s[5];
uint32_t j;
} ProtocolType;
static const struct {
size_t width;
size_t offset;
size_t cnt;
} ser_des_table[] = {
{ sizeof(((ProtocolType *)0)->i), offsetof(ProtocolType, i), 1 },
{ sizeof(((ProtocolType *)0)->s[0]), offsetof(ProtocolType, s), 5 },
...
};
If not created automatically, I'd use macros to generate the data. Possibly by including the file twice: one to generate the struct definition and one for the table. This is possible by redefining the macros in-between.
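The macro approach described here is often called an X-macro. A minimal sketch, using a single field list expanded twice instead of a file included twice; the macro names PROTOCOL_FIELDS, AS_MEMBER and AS_ROW and the FieldInfo type are hypothetical, and every field is declared as an array (count 1 for scalars) to keep the macros simple.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical field list: X(type, name, element count).
#define PROTOCOL_FIELDS(X) \
    X(std::uint32_t, i, 1) \
    X(std::uint16_t, s, 5) \
    X(std::uint32_t, j, 1)

// First expansion: the struct members.
#define AS_MEMBER(type, name, count) type name[count];
struct ProtocolType {
    PROTOCOL_FIELDS(AS_MEMBER)
};
#undef AS_MEMBER

struct FieldInfo {
    std::size_t width;  // size of one element
    std::size_t offset; // offset of the field inside ProtocolType
    std::size_t count;  // number of elements
};

// Second expansion: one table row per field, always in sync with the struct.
#define AS_ROW(type, name, count) { sizeof(type), offsetof(ProtocolType, name), count },
constexpr FieldInfo ser_des_table[] = {
    PROTOCOL_FIELDS(AS_ROW)
};
#undef AS_ROW

static_assert(sizeof(ser_des_table) / sizeof(ser_des_table[0]) == 3, "one row per field");
```

Adding a field means adding one line to PROTOCOL_FIELDS; the struct and the table cannot drift apart.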
You should care about the representation of signed integers and floats (implementation defined, floats are likely IEEE754 as proposed by the standard).
As an alternative to the width field, you can use a "type" code (e.g. a char which maps to an implementation-defined type). This way you can add custom types with the same width but different encoding (e.g. uint32_t and IEEE754 float). This will completely abstract the network protocol encoding from the physical machine (the best solution). Note that nothing hinders you from using common encodings which do not complicate the code a single bit (literally).
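For the manual-padding route mentioned in the question (its option 3), the layout can be pinned down with compile-time checks instead of a run-time test. The sketch below uses a hypothetical WireStruct with explicit padding members; the expected offsets are assumptions chosen for this example. Note that comparing sizeof alone is not enough: two platforms can insert padding in different places and still arrive at the same total size, so the offsets need checking as well.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical wire struct: padding is spelled out as members so that an
// 8-bit target (alignment 1) and a 32-bit target agree on the layout.
struct WireStruct {
    std::uint8_t  f8;
    std::uint8_t  pad0[3]; // explicit, instead of compiler-inserted padding
    std::uint32_t f32;
    std::uint8_t  arr[5];
    std::uint8_t  pad1[3]; // rounds the size up to a multiple of 4
};

// Compiled on both sides; any layout mismatch becomes a build error.
static_assert(offsetof(WireStruct, f8) == 0, "f8 offset");
static_assert(offsetof(WireStruct, f32) == 4, "f32 offset");
static_assert(offsetof(WireStruct, arr) == 8, "arr offset");
static_assert(sizeof(WireStruct) == 1 + 3 + 4 + 5 + 3, "no hidden padding");
```

Because the same header is compiled for the device and for the PC, the build itself becomes the compatibility test.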

Should I use #define, enum or const?

In a C++ project I'm working on, I have a flag kind of value which can have four values. Those four flags can be combined. Flags describe the records in database and can be:
new record
deleted record
modified record
existing record
Now, for each record I wish to keep this attribute, so I could use an enum:
enum { xNew, xDeleted, xModified, xExisting };
However, in other places in code, I need to select which records are to be visible to the user, so I'd like to be able to pass that as a single parameter, like:
showRecords(xNew | xDeleted);
So, it seems I have three possible approaches:
#define X_NEW 0x01
#define X_DELETED 0x02
#define X_MODIFIED 0x04
#define X_EXISTING 0x08
or
typedef enum { xNew = 1, xDeleted, xModified = 4, xExisting = 8 } RecordType;
or
namespace RecordType {
static const uint8 xNew = 1;
static const uint8 xDeleted = 2;
static const uint8 xModified = 4;
static const uint8 xExisting = 8;
}
Space requirements are important (byte vs int) but not crucial. With defines I lose type safety, and with enum I lose some space (integers) and probably have to cast when I want to do a bitwise operation. With const I think I also lose type safety since a random uint8 could get in by mistake.
Is there some other cleaner way?
If not, what would you use and why?
P.S. The rest of the code is rather clean modern C++ without #defines, and I have used namespaces and templates in a few places, so those aren't out of the question either.

Combine the strategies to reduce the disadvantages of a single approach. I work in embedded systems so the following solution is based on the fact that integer and bitwise operators are fast, low memory & low in flash usage.
Place the enum in a namespace to prevent the constants from polluting the global namespace.
namespace RecordType {
An enum declares and defines a compile-time checked type. Always use compile-time type checking to make sure arguments and variables are given the correct type. There is no need for the typedef in C++.
enum TRecordType { xNew = 1, xDeleted = 2, xModified = 4, xExisting = 8,
Create another member for an invalid state. This can be useful as error code; for example, when you want to return the state but the I/O operation fails. It is also useful for debugging; use it in initialisation lists and destructors to know if the variable's value should be used.
xInvalid = 16 };
Consider that you have two purposes for this type: to track the current state of a record, and to create a mask to select records in certain states. Create an inline function to test if the value of the type is valid for your purpose; as a state marker vs a state mask. This will catch bugs, as the enum is just an int underneath and a value such as 0xDEADBEEF may end up in your variable through uninitialised or mispointed variables.
inline bool IsValidState( TRecordType v) {
switch(v) { case xNew: case xDeleted: case xModified: case xExisting: return true; }
return false;
}
inline bool IsValidMask( TRecordType v) {
return v >= xNew && v < xInvalid ;
}
} // namespace RecordType
Add a using declaration if you want to use the type often.
using RecordType ::TRecordType ;
The value checking functions are useful in asserts to trap bad values as soon as they are used. The quicker you catch a bug when running, the less damage it can do.
Here are some examples to put it all together.
void showRecords(TRecordType mask) {
assert(RecordType::IsValidMask(mask));
// do stuff;
}
void wombleRecord(TRecord rec, TRecordType state) {
assert(RecordType::IsValidState(state));
if (state == RecordType::xNew) {
// ...
}
}
TRecordType updateRecord(TRecord rec, TRecordType newstate) {
assert(RecordType::IsValidState(newstate));
//...
if (! access_was_successful) return RecordType ::xInvalid;
return newstate;
}
The only way to ensure correct value safety is to use a dedicated class with operator overloads and that is left as an exercise for another reader.
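One possible shape for such a dedicated class is sketched below. It is not the answer author's code: RecordState, RecordMask and contains are hypothetical names, and C++11 is assumed for enum class and constexpr. States and masks become distinct types, so an arbitrary integer cannot be passed where a mask is expected.

```cpp
#include <cstdint>

// A single record state; scoped, so it never converts to an integer silently.
enum class RecordState : std::uint8_t { New = 1, Deleted = 2, Modified = 4, Existing = 8 };

// A set of states. It can only be built from RecordState values.
class RecordMask {
public:
    constexpr RecordMask() : bits_(0) {}
    constexpr RecordMask(RecordState s) : bits_(static_cast<std::uint8_t>(s)) {}

    // True if state s is part of this mask.
    constexpr bool contains(RecordState s) const {
        return (bits_ & static_cast<std::uint8_t>(s)) != 0;
    }

    friend constexpr RecordMask operator|(RecordMask a, RecordMask b) {
        return RecordMask(static_cast<std::uint8_t>(a.bits_ | b.bits_));
    }

private:
    // Raw construction stays private so callers cannot inject arbitrary bits.
    explicit constexpr RecordMask(std::uint8_t bits) : bits_(bits) {}
    std::uint8_t bits_;
};

// Lets two states be combined directly into a mask.
constexpr RecordMask operator|(RecordState a, RecordState b) {
    return RecordMask(a) | RecordMask(b);
}

static_assert((RecordState::New | RecordState::Deleted).contains(RecordState::Deleted),
              "combined mask contains its members");
static_assert(sizeof(RecordMask) == 1, "still one byte of storage");
```

A function declared as void showRecords(RecordMask mask) then accepts showRecords(RecordState::New | RecordState::Deleted) but rejects showRecords(42) at compile time.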

Forget the defines
They will pollute your code.
bitfields?
struct RecordFlag {
unsigned isnew:1, isdeleted:1, ismodified:1, isexisting:1;
};
Don't ever use that. You are more concerned with speed than with economizing 4 ints. Using bit fields is actually slower than access to any other type.
However, bit members in structs have practical drawbacks. First, the ordering of bits in memory varies from compiler to compiler. In addition, many popular compilers generate inefficient code for reading and writing bit members, and there are potentially severe thread safety issues relating to bit fields (especially on multiprocessor systems) due to the fact that most machines cannot manipulate arbitrary sets of bits in memory, but must instead load and store whole words. e.g the following would not be thread-safe, in spite of the use of a mutex
Source: http://en.wikipedia.org/wiki/Bit_field:
And if you need more reasons to not use bitfields, perhaps Raymond Chen will convince you in his The Old New Thing Post: The cost-benefit analysis of bitfields for a collection of booleans at http://blogs.msdn.com/oldnewthing/archive/2008/11/26/9143050.aspx
const int?
namespace RecordType {
static const uint8 xNew = 1;
static const uint8 xDeleted = 2;
static const uint8 xModified = 4;
static const uint8 xExisting = 8;
}
Putting them in a namespace is cool. If they are declared in your CPP or header file, their values will be inlined. You'll be able to use switch on those values, but it will slightly increase coupling.
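Ah, yes: remove the static keyword. It is redundant here: in C++ a const object at namespace scope already has internal linkage, so if uint8 is a built-in type, you won't need static to declare these in a header included by multiple sources of the same module. (C++03 deprecated this use of static; C++11 reversed that, but it is still unnecessary.) In the end, the code should be: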
namespace RecordType {
const uint8 xNew = 1;
const uint8 xDeleted = 2;
const uint8 xModified = 4;
const uint8 xExisting = 8;
}
The problem of this approach is that your code knows the value of your constants, which increases slightly the coupling.
enum
The same as const int, with a somewhat stronger typing.
typedef enum { xNew = 1, xDeleted, xModified = 4, xExisting = 8 } RecordType;
They are still polluting the global namespace, though.
By the way... Remove the typedef. You're working in C++. Those typedefs of enums and structs are polluting the code more than anything else.
The result is kinda:
enum RecordType { xNew = 1, xDeleted, xModified = 4, xExisting = 8 } ;
void doSomething(RecordType p_eMyEnum)
{
if(p_eMyEnum == xNew)
{
// etc.
}
}
As you see, your enum is polluting the global namespace.
If you put this enum in an namespace, you'll have something like:
namespace RecordType {
enum Value { xNew = 1, xDeleted, xModified = 4, xExisting = 8 } ;
}
void doSomething(RecordType::Value p_eMyEnum)
{
if(p_eMyEnum == RecordType::xNew)
{
// etc.
}
}
extern const int ?
If you want to decrease coupling (i.e. being able to hide the values of the constants, and so, modify them as desired without needing a full recompilation), you can declare the ints as extern in the header, and as constant in the CPP file, as in the following example:
// Header.hpp
namespace RecordType {
extern const uint8 xNew ;
extern const uint8 xDeleted ;
extern const uint8 xModified ;
extern const uint8 xExisting ;
}
And:
// Source.cpp
#include "Header.hpp" // the extern declarations give these definitions external linkage
namespace RecordType {
const uint8 xNew = 1;
const uint8 xDeleted = 2;
const uint8 xModified = 4;
const uint8 xExisting = 8;
}
You won't be able to use switch on those constants, though. So in the end, pick your poison...
:-p

Have you ruled out std::bitset? Sets of flags is what it's for. Do
typedef std::bitset<4> RecordType;
then
static const RecordType xNew(1);
static const RecordType xDeleted(2);
static const RecordType xModified(4);
static const RecordType xExisting(8);
Because there are a bunch of operator overloads for bitset, you can now do
RecordType rt = whatever; // unsigned long or RecordType expression
rt |= xNew; // set
rt &= ~xDeleted; // clear
if ((rt & xModified) != 0) ... // test
Or something very similar to that - I'd appreciate any corrections since I haven't tested this. You can also refer to the bits by index, but it's generally best to define only one set of constants, and RecordType constants are probably more useful.
Assuming you have ruled out bitset, I vote for the enum.
I don't buy that casting the enums is a serious disadvantage - OK so it's a bit noisy, and assigning an out-of-range value to an enum is undefined behaviour so it's theoretically possible to shoot yourself in the foot on some unusual C++ implementations. But if you only do it when necessary (which is when going from int to enum iirc), it's perfectly normal code that people have seen before.
I'm dubious about any space cost of the enum, too. uint8 variables and parameters probably won't use any less stack than ints, so only storage in classes matters. There are some cases where packing multiple bytes in a struct will win (in which case you can cast enums in and out of uint8 storage), but normally padding will kill the benefit anyhow.
So the enum has no disadvantages compared with the others, and as an advantage gives you a bit of type-safety (you can't assign some random integer value without explicitly casting) and clean ways of referring to everything.
For preference I'd also put the "= 2" in the enum, by the way. It's not necessary, but a "principle of least astonishment" suggests that all 4 definitions should look the same.

Here are a couple of articles on const vs. macros vs. enums:
Symbolic Constants
Enumeration Constants vs. Constant Objects
I think you should avoid macros especially since you wrote most of your new code is in modern C++.

If possible do NOT use macros. They are not well regarded in modern C++.

With defines I lose type safety
Not necessarily...
// unsigned defines (two alternative spellings)
#define X_NEW 0x01u
#define X_NEW (unsigned(0x01)) // if you find this more readable...
and with enum I lose some space (integers)
Not necessarily - but you do have to be explicit at points of storage...
struct X
{
RecordType recordType : 4; // use exactly 4 bits...
RecordType recordType2 : 4; // use another 4 bits, typically in the same byte
// of course, the overall record size may still be padded...
};
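A rough sketch of how that storage might be used, assuming the `RecordType` enumerators from the question (`makeX` and `example` are hypothetical helpers). Four bits are enough to hold every value of this enum, so an enumerator stored in the field should read back equal, but it is worth checking on your own compiler, since some treat enum bit-fields as signed.

```cpp
#include <cassert>

enum RecordType { xNew = 1, xDeleted = 2, xModified = 4, xExisting = 8 };

struct X
{
    RecordType recordType  : 4;  // values 0..15 fit in 4 bits
    RecordType recordType2 : 4;  // typically packed next to recordType
};

// Hypothetical helper that fills both fields.
inline X makeX()
{
    X x;
    x.recordType  = xExisting;   // 8 needs the top bit of the field
    x.recordType2 = xNew;
    return x;
}

// Hypothetical usage: the stored enumerators read back unchanged.
inline void example()
{
    X x = makeX();
    assert(x.recordType == xExisting);
    assert(x.recordType2 == xNew);
}
```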
and probably have to cast when I want to do bitwise operation.
You can create operators to take the pain out of that:
RecordType operator|(RecordType lhs, RecordType rhs)
{
return RecordType((unsigned)lhs | (unsigned)rhs);
}
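A slightly fuller sketch along the same lines, again assuming the enumerators from the question. The `&` and `|=` overloads are my additions for illustration, not something the question asked for.

```cpp
#include <cassert>

enum RecordType { xNew = 1, xDeleted = 2, xModified = 4, xExisting = 8 };

inline RecordType operator|(RecordType lhs, RecordType rhs)
{
    return RecordType((unsigned)lhs | (unsigned)rhs);
}

inline RecordType operator&(RecordType lhs, RecordType rhs)
{
    return RecordType((unsigned)lhs & (unsigned)rhs);
}

inline RecordType& operator|=(RecordType& lhs, RecordType rhs)
{
    lhs = lhs | rhs;   // reuses the overload above
    return lhs;
}

// Hypothetical usage: no casts at the call site.
inline void example()
{
    RecordType r = xNew | xDeleted;
    r |= xExisting;
    assert((r & xExisting) == xExisting);
    assert((r & xModified) == 0);
}
```

The casts live in one place, and everything else reads like ordinary bit manipulation while keeping the `RecordType` type.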
With const I think I also lose type safety since a random uint8 could get in by mistake.
The same can happen with any of these mechanisms: range and value checks are normally orthogonal to type safety (though user-defined types - i.e. your own classes - can enforce "invariants" about their data). With enums, the compiler's free to pick a larger type to host the values, and an uninitialised, corrupted or just mis-set enum variable could still end up interpreting its bit pattern as a number you wouldn't expect - comparing unequal to any of the enumeration identifiers, any combination of them, and 0.
Is there some other cleaner way? / If not, what would you use and why?
Well, in the end the tried-and-trusted C-style bitwise OR of enumerations works pretty well once you have bit fields and custom operators in the picture. You can further improve your robustness with some custom validation functions and assertions as in mat_geek's answer; these techniques are often equally applicable to handling string, int, double values etc.
You could argue that this is "cleaner":
enum RecordType { New, Deleted, Modified, Existing };
showRecords([](RecordType r) { return r == New || r == Deleted; });
I'm indifferent: the data bits pack tighter but the code grows significantly... depends how many objects you've got, and the lambdas - beautiful as they are - are still messier and harder to get right than bitwise ORs.
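For what it's worth, the predicate style could be fleshed out roughly like this. `selectRecords` is a hypothetical stand-in for whatever `showRecords` would do with the predicate, and the `std::function` signature is an assumption on my part.

```cpp
#include <cassert>
#include <functional>
#include <vector>

enum RecordType { New, Deleted, Modified, Existing };

// Hypothetical helper: returns the record types the predicate accepts.
inline std::vector<RecordType> selectRecords(
        const std::vector<RecordType>& records,
        const std::function<bool(RecordType)>& wanted)
{
    std::vector<RecordType> out;
    for (RecordType r : records) {
        if (wanted(r)) out.push_back(r);
    }
    return out;
}

// Hypothetical usage with the lambda from above.
inline void example()
{
    std::vector<RecordType> all = { New, Deleted, Modified, Existing };
    std::vector<RecordType> picked = selectRecords(
        all, [](RecordType r) { return r == New || r == Deleted; });
    assert(picked.size() == 2);
}
```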
BTW - the argument about thread safety's pretty weak IMHO - best remembered as a background consideration rather than becoming a dominant decision-driving force; sharing a mutex across the bitfields is a more likely practice even if unaware of their packing (mutexes are relatively bulky data members - I have to be really concerned about performance to consider having multiple mutexes on members of one object, and I'd look carefully enough to notice they were bit fields). Any sub-word-size type could have the same problem (e.g. a uint8_t). Anyway, you could try atomic compare-and-swap style operations if you're desperate for higher concurrency.

Enums would be more appropriate as they provide "meaning to the identifiers" as well as type safety. You can clearly tell that "xDeleted" is a "RecordType" and that it represents the "type of a record" (wow!) even after years. Consts would require comments for that, and they would also require going up and down in the code.

Even if you have to use 4 bytes to store an enum (I'm not that familiar with C++ -- I know you can specify the underlying type in C#), it's still worth it -- use enums.
In this day and age of servers with GBs of memory, things like 4 bytes vs. 1 byte of memory at the application level in general don't matter. Of course, if in your particular situation, memory usage is that important (and you can't get C++ to use a byte to back the enum), then you can consider the 'static const' route.
At the end of the day, you have to ask yourself, is it worth the maintenance hit of using 'static const' for the 3 bytes of memory savings for your data structure?
Something else to keep in mind -- IIRC, on x86, data structures are 4-byte aligned, so unless you have a number of byte-width elements in your 'record' structure, it might not actually matter. Test and make sure it does before you make a tradeoff in maintainability for performance/space.

If you want the type safety of classes, with the convenience of enumeration syntax and bit checking, consider Safe Labels in C++. I've worked with the author, and he's pretty smart.
Beware, though. In the end, this package uses templates and macros!

Do you actually need to pass around the flag values as a conceptual whole, or are you going to have a lot of per-flag code? Either way, I think having this as class or struct of 1-bit bitfields might actually be clearer:
struct RecordFlag {
unsigned isnew:1, isdeleted:1, ismodified:1, isexisting:1;
};
Then your record class could have a struct RecordFlag member variable, functions can take arguments of type struct RecordFlag, etc. The compiler should pack the bitfields together, saving space.
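A small sketch of how per-flag code reads with this struct. `needsSaving` and `example` are hypothetical, invented only to show the field access.

```cpp
#include <cassert>

struct RecordFlag {
    unsigned isnew:1, isdeleted:1, ismodified:1, isexisting:1;
};

// Hypothetical consumer: each flag is tested by name, with no masks.
inline bool needsSaving(RecordFlag f) {
    return (f.isnew || f.ismodified) && !f.isdeleted;
}

// Hypothetical usage.
inline void example() {
    RecordFlag f = RecordFlag();   // value-initialise: all bits zero
    f.ismodified = 1;
    assert(needsSaving(f));
    f.isdeleted = 1;
    assert(!needsSaving(f));
}
```

The trade-off is that passing "new or deleted" as a single argument now means building a struct rather than OR-ing two constants.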

I probably wouldn't use an enum for this kind of a thing where the values can be combined together, more typically enums are mutually exclusive states.
But whichever method you use, to make it more clear that these are values which are bits which can be combined together, use this syntax for the actual values instead:
#define X_NEW (1 << 0)
#define X_DELETED (1 << 1)
#define X_MODIFIED (1 << 2)
#define X_EXISTING (1 << 3)
Using a left-shift there helps to indicate that each value is intended to be a single bit, so it is less likely that someone will later do something wrong, like adding a new value and assigning it a value of 9.
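If you want the compiler to enforce the "one bit each" intent rather than just hint at it, something like the following could work. This is a sketch assuming C++11; `isSingleBit` is a made-up helper.

```cpp
#include <cassert>

#define X_NEW      (1 << 0)
#define X_DELETED  (1 << 1)
#define X_MODIFIED (1 << 2)
#define X_EXISTING (1 << 3)

// Hypothetical helper: true when exactly one bit of v is set.
constexpr bool isSingleBit(unsigned v) {
    return v != 0 && (v & (v - 1)) == 0;
}

// Fails to compile if someone later defines a flag as, say, 9.
static_assert(isSingleBit(X_NEW) && isSingleBit(X_DELETED) &&
              isSingleBit(X_MODIFIED) && isSingleBit(X_EXISTING),
              "each flag must be exactly one bit");
```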

Based on KISS, high cohesion and low coupling, ask these questions -
Who needs to know? my class, my library, other classes, other libraries, 3rd parties
What level of abstraction do I need to provide? Does the consumer understand bit operations?
Will I have to interface from VB/C# etc?
There is a great book, "Large-Scale C++ Software Design", which promotes using base types externally; if you can avoid another header file/interface dependency, you should try to.

If you are using Qt you should have a look at QFlags.
The QFlags class provides a type-safe way of storing OR-combinations of enum values.

I would rather go with
typedef enum { xNew = 1, xDeleted, xModified = 4, xExisting = 8 } RecordType;
Simply because:
It is cleaner and it makes the code readable and maintainable.
It logically groups the constants.
Programmer's time is more important, unless your job is to save those 3 bytes.

Not that I like to over-engineer everything but sometimes in these cases it may be worth creating a (small) class to encapsulate this information.
If you create a class RecordType then it might have functions like:
void setDeleted();
void clearDeleted();
bool isDeleted();
etc... (or whatever convention suits)
It could validate combinations (in the case where not all combinations are legal, eg if 'new' and 'deleted' could not both be set at the same time). If you just used bit masks etc then the code that sets the state needs to validate, a class can encapsulate that logic too.
The class may also give you the ability to attach meaningful logging info to each state, you could add a function to return a string representation of the current state etc (or use the streaming operators '<<').
For all that if you are worried about storage you could still have the class only have a 'char' data member, so only take a small amount of storage (assuming it is non virtual). Of course depending on the hardware etc you may have alignment issues.
You could have the actual bit values not visible to the rest of the 'world' if they are in an anonymous namespace inside the cpp file rather than in the header file.
If you find that the code using the enum/#define/ bitmask etc has a lot of 'support' code to deal with invalid combinations, logging etc then encapsulation in a class may be worth considering. Of course most times simple problems are better off with simple solutions...
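A minimal sketch of such a class, assuming the "new and deleted cannot both be set" rule mentioned above. `setNew`, `isNew`, `toString` and the `bool` return values are my own choices for illustration. For a single-file example the bit values sit in a private enum rather than in an anonymous namespace in the cpp file.

```cpp
#include <cassert>
#include <string>

class RecordType {
public:
    RecordType() : bits_(0) {}

    // Setters refuse illegal combinations and report whether they succeeded.
    bool setNew()     { if (isDeleted()) return false; bits_ |= kNew;     return true; }
    bool setDeleted() { if (isNew())     return false; bits_ |= kDeleted; return true; }
    void clearDeleted() { bits_ = static_cast<unsigned char>(bits_ & ~kDeleted); }

    bool isNew() const     { return (bits_ & kNew) != 0; }
    bool isDeleted() const { return (bits_ & kDeleted) != 0; }

    // String form for logging.
    std::string toString() const {
        if (isNew())     return "new";
        if (isDeleted()) return "deleted";
        return "none";
    }

private:
    enum { kNew = 1, kDeleted = 2 };  // bit values hidden from callers
    unsigned char bits_;              // one byte of storage, no virtuals
};

// Hypothetical usage: the invalid combination is rejected in one place.
inline void example() {
    RecordType r;
    assert(r.setNew());
    assert(!r.setDeleted());
    assert(r.toString() == "new");
}
```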
