Detecting duplicate work at compile time within C++ code - c++

Let's consider the following code:
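struct Bar {
    void work() { /* something */ }
};
struct Foo {
    // Take references: binding the reference members to by-value parameters would leave them dangling
    Foo(Bar& b1, Bar& b2) : b1(b1), b2(b2) {}
    void work() {
        b1.work();
        b2.work();
        //something
    }
    Bar& b1;
    Bar& b2;
};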
int main() {
Bar a, b;
a.work();
//something...
Foo c(a,b);
c.work();
//something...
}
As written (or as I intended to write it), a.work() gets executed twice. But suppose I, as the programmer, know that executing it twice is a waste of execution time, and suppose this is part of a far more complex piece of software where it would be too troublesome to keep track manually of which work has and hasn't been done.
Obviously I could store a boolean flag in Bar and check every time whether the work has already been done, but I want to know whether there is some way to catch this at compile time, because at compile time it is already clear that the work has been done.

Another approach: keep a pointer to a member function in the Bar object and have work() call through it. In the constructor, point it at the function that does the actual work. At the end of that function, reassign the pointer to an empty function.
In this case the first call does the job, and later calls do nothing (without checking a boolean flag).
struct Bar {
    typedef void (Bar::*fptr_t)();  // pointer to a member function of Bar
    Bar() : fptr(&Bar::actual_work) {}
    void actual_work() {
        /*something*/;
        fptr = &Bar::empty_work;  // later calls become no-ops
    }
    void empty_work() {}
    void work() { (this->*fptr)(); }
    fptr_t fptr;
};
Something like above.

No, not really.
The compiler is capable of some static analysis, and if you could ask it to diagnose this condition, it may be able to do so in some simple cases. But as soon as you have a non-trivial flow (runtime if conditions, for example), that goes out of the window very quickly. That's probably part of the reason that nobody has created such a programmable feature for compilers: high complexity, with negligible utility.
It may be possible to program some third-party static analysers (or create one!) to diagnose your simple case, but again that's a lot of work for handling only the most trivial cases that you can already spot with your eyes.
Instead, you could make work() happen in the Bar constructor. Then it's impossible to do the work twice on the same object. However, performing large quantities of work in a constructor is often frowned upon.
I would indeed keep a state flag within Bar, and return false from a subsequent work(), by maintaining the value of that flag accordingly. As a bonus, stick an assertion in the function before returning false so that you catch violations during your testing.
The state flag doesn't have to be a boolean; it can be an enum. Robust state machines inside your objects can be very helpful.
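A minimal sketch of the state-flag idea above, under some assumptions: the enum values, the runs() counter and the BAR_ASSERT_ON_REPEAT switch are hypothetical names invented for illustration, and the counter merely stands in for the real work.

```cpp
#include <cassert>

class Bar {
public:
    enum class State { NotStarted, Done };

    // Returns true if the work ran, false if it had already been done.
    bool work() {
        if (state_ == State::Done) {
#ifdef BAR_ASSERT_ON_REPEAT  // hypothetical opt-in switch for test builds
            assert(false && "Bar::work() called more than once");
#endif
            return false;
        }
        ++runs_;  // stand-in for the real work
        state_ = State::Done;
        return true;
    }

    State state() const { return state_; }
    int runs() const { return runs_; }

private:
    State state_ = State::NotStarted;
    int runs_ = 0;
};
```

With this shape, a second call to work() on the same object returns false and leaves runs() at 1; defining the switch in test builds turns the silent no-op into a loud failure.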
That being said, I'd advise revisiting your current approach where you pass references to things into other things that do work on them; it's not an easy-to-follow design, and this is only a simple example of your design! You may wish to consider passing some single-use proxy type instead.

Actually, at compile time it isn't obvious whether the code has been executed or not. Suppose you call a.work() inside an if statement. How does the compiler know whether a.work() has already been executed at that point? And don't assume the condition is simple: suppose it waits for some external signal and runs the code depending on that signal. The best way to avoid the duplicate work is to keep a boolean.

Related

Compile time hints/warnings on struct changes

I have a basic POD struct with some fields
struct A{
    int a;
    int b;
};
The nature of my use case requires that these fields change every so often (like 1-2 months, regular but not often). This means that I want to check the field usages of the struct after the changes to make sure everything is still fine.
The compiler checks that all field usages are valid, something like a.c will fail at compile time.
However, some of my functions should access and handle ALL of the fields of A. So while the compiler verifies that all usages are valid, it doesn't validate that all the fields are used.
This work/checking must be done manually (if there is a way to do this at compile time, please enlighten me). So our current design tries to make this as easy as possible. We grouped most of the relevant functions into one folder/library so we could check over them in one place. However, some usages are embedded in private class functions that would honestly be more of a pain to refactor out into the common lib than the benefits it brings.
It's reasonable to just rely on documentation saying "Hey, after changing struct A, check the function foo in class FooThing". But I'm looking to see if we can get some type of compile time warnings.
My idea was to drop a static_assert next to each relevant function that would check the size of A. Most changes should change the size, so an unchanged static_assert would fail at compile time, pointing me to the general area of the function. Then I could just change the function and the assert.
So beside the function foo, for example, I would have something like static_assert(sizeof(A) == 16) or whatever the size is. This isn't foolproof, as it's possible that changes to the struct might not change its total size, but I'm not looking for something really rigorous here, just something that could be helpful 90% of the time.
The main reason why this doesn't work for me is that the standard does not fix the size of int and other fundamental types. This is a problem for me since my project is cross-platform.
In short, I am looking for a way to signal at compile time to check certain functions after a struct's definition has been changed.
One possibility is to put a version number into the struct itself, like so:
struct A{
int a;
int b;
static constexpr int major_version = 1;
};
Then, in calling code, you place assertions that check the value of the major version:
void doSomething(A a)
{
static_assert(A::major_version == 1, "Unexpected A major version");
// Do something with a
}
Then, any time you make an update to A that you think merits re-inspection of all calling code, you increment A::major_version, and then the static_assert will fire anywhere you haven't changed it.
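As a complementary sketch, assuming C++17 is available: a structured binding must name exactly as many variables as the struct has non-static data members, so a function that is supposed to handle every field can unpack the struct that way and will stop compiling when a field is added or removed. The function name sum_all_fields is hypothetical, and this only works while all data members of A are public.

```cpp
#include <cassert>

struct A {
    int a;
    int b;
};

// Hypothetical function that is meant to handle every field of A.
int sum_all_fields(const A& value) {
    // The binding list must match the number of data members in A exactly;
    // adding or removing a field makes this line a compile error.
    const auto& [a, b] = value;
    return a + b;
}
```

This does not catch a change that keeps the number of fields the same (for example, a field whose meaning changes), so it complements the version-number approach rather than replacing it.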

Run-time exceptions when using std::functions. Why do they not point to valid code?

I am working on putting together a library to enable easy implementation of Finite State Machines.
My library is based upon arduino-fsm, a library that achieves this by defining Fsm, State and Transition objects. arduino-fsm uses raw function pointers to set the functions that States and Transitions will call (functions that cover what to do when entering a state, while in a state, when exiting a state, and when making a particular state transition). This library allows very intuitive state machine definition. However, its use of raw function pointers means that an Fsm can't usefully be an instance variable of an object: raw function pointers can't point to non-static member functions, and static member functions can't access instance variables, which makes it impossible to have multiple parent objects of the same class running different instances of the same Fsm.
My implementation replaces the use of raw function pointers with std::functions to define the States' and Transitions' functions in order to allow this (following answers to this question).
I appear to be falling at the first hurdle, however. I'm having runtime exceptions that my stack backtrace shows are due to my std::functions not pointing to valid code (reference). I can't figure out why this is happening though.
The full code is here, but I'll attempt to describe the code in short and the issue more clearly below.
The library defines the struct FunctionState and class FunctionFsm (FunctionFsm also defines the struct Transition).
FunctionState holds three std::functions called on_enter, on_state and on_exit. The functions they point to are called whenever the FunctionState is first entered, is active and is left, respectively.
FunctionFsm has functions to setup the fsm's transitions, to run the state machine, and to trigger transitions between states. It tracks the current state, and keeps a list of Transitions that define the state machine.
Transitions hold pointers to the state they move from and the state they move to, an 'event' integer (which allows particular Transitions to be triggered) and a std::function that points to a function to be called in that transition (after the first state's exit function is called).
In all, std::functions are used to tell each state what its three functions are and to tell each transition which function it should use. (If no action is required for any particular state function or transition, the std::function may be set to nullptr - and the library will handle it appropriately.)
My simple example appears to fail in the fsm's setup, when it defines the first transition for the state machine.
Important bits from FunctionFsm in the library:
struct Transition {
FunctionState* state_from;
FunctionState* state_to;
int event;
std::function<void()> on_transition;
};
Transition create_transition(FunctionState* state_from,
FunctionState* state_to,
int event,
std::function<void()> on_transition){
Transition t;
t.state_from = state_from;
t.state_to = state_to;
t.event = event;
t.on_transition = on_transition;
return t;
}
void add_transition(FunctionState* state_from,
FunctionState* state_to,
int event,
std::function<void()> on_transition){
if(state_from == NULL || state_to == NULL) return;
Transition transition = FunctionFsm::create_transition(state_from,
state_to,
event,
on_transition);
//stuff to keep track of number of transitions and add transition to the list
//m_transitions is just a Transition*, manual memory management copied
//like-for-like from arduino-fsm (which doesn't use stdlib features)
m_transitions = (Transition*) realloc (m_transitions,
(m_num_transitions + 1)
* sizeof(Transition));
m_transitions[m_num_transitions] = transition;
m_num_transitions++;
}
Example:
char a = 'a';
char b = 'b';
//state functions
void a_on_enter(){ Serial.print("Entering a: "); }
void a_on(){ Serial.print(a); }
void a_on_exit(){ Serial.println(" - exitting a. "); }
void a_on_trans_b(){ Serial.println("Moving from a to b."); }
void b_on_enter(){ Serial.print("Entering b: "); }
void b_on(){ Serial.print(b); }
void b_on_exit(){ Serial.println(" - exitting b. "); }
void b_on_trans_a(){ Serial.println("Moving from b to a."); }
//states
FunctionState state_a(&a_on_enter, &a_on, &a_on_exit);
FunctionState state_b(&b_on_enter, &b_on, &b_on_exit);
//fsm
FunctionFsm fsm(&state_a); //state_a is initial state
//...
//add transitions
fsm.add_transition(&state_a, &state_b, TOGGLE_SWITCH, &a_on_trans_b);//crashes here
fsm.add_transition(&state_b, &state_a, TOGGLE_SWITCH, &b_on_trans_a);
//... code to run fsm, and trigger transitions as appropriate (hasn't had a chance to run yet)
I note that the crash occurs on the first attempt to give a std::function a value, which suggests that I'm handling it wrong somewhere in the library functions... But I really can't understand it. I don't think the function itself has been called anywhere; it has only been assigned to a std::function.
The error itself is certainly this (as my microcontroller tells me it is) - I just can't figure out why...
I had wondered if I was asking too much of implicit conversion by providing functions that called for std::function<void()> with raw function pointers, but my testing shows no improvement if I explicitly create std::functions and then pass those instead.
I also worried about whether I'm passing my functions by reference or by value, but I don't think that makes a difference here either.
If anyone with more experience using std::function has any helpful suggestions, I'd be really grateful. I'm quite stuck and don't know how to move forward from here.
Big thank you to @molbdnilo for reminding me to create Minimal, Complete, and Verifiable examples and for pointing out the likely cause, which is now confirmed.
It turns out the issue was with the use of a Transition pointer and realloc to keep a list of the fsm's transitions in a very C way rather than using a std::vector<Transition> (much more C++). I'd flagged this initially as poor practice, but hadn't considered the effect it may have had.
I haven't gone into understanding the full details of why using realloc has messed up my std::functions, but I can certainly see the vagaries of where this issue comes from.
Having updated the library to use a std::vector instead, everything works perfectly.
If anyone would like to provide a real explanation of the issue with using realloc in this case I'd be very interested to hear it! (I'd certainly mark it as an answer!)
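A likely explanation, offered as a sketch rather than a confirmed diagnosis of the library: realloc only hands back raw bytes, so no Transition (and therefore no std::function) is ever constructed in the new slot. The statement m_transitions[m_num_transitions] = transition; then runs std::function's copy assignment on an object that was never constructed, which is undefined behaviour. In addition, realloc may move existing elements with a plain byte copy, which is only valid for trivially copyable types, and std::function is not one. The reduced version below shows the std::vector replacement; FunctionState is cut down to a stand-in, and num_transitions() and transition() are hypothetical accessors added only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Reduced stand-in for the library's FunctionState.
struct FunctionState {
    std::function<void()> on_enter, on_state, on_exit;
};

class FunctionFsm {
public:
    struct Transition {
        FunctionState* state_from;
        FunctionState* state_to;
        int event;
        std::function<void()> on_transition;
    };

    void add_transition(FunctionState* state_from, FunctionState* state_to,
                        int event, std::function<void()> on_transition) {
        if (state_from == nullptr || state_to == nullptr) return;
        // push_back constructs a real Transition inside the vector's storage,
        // and the vector moves or copies elements properly when it grows.
        m_transitions.push_back(
            Transition{state_from, state_to, event, std::move(on_transition)});
    }

    // Hypothetical accessors, for illustration only.
    std::size_t num_transitions() const { return m_transitions.size(); }
    const Transition& transition(std::size_t i) const { return m_transitions[i]; }

private:
    std::vector<Transition> m_transitions;
};
```

If the manual buffer had to stay, each new element would need to be constructed in place (placement new) and existing elements moved one by one instead of through realloc; std::vector does exactly that bookkeeping.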

How to disable a class API when the class cannot operate under some conditions?

Say I have the following:
class Processor
{
public:
void Activate();
/* some more interface functions*/
};
void main()
{
Processor().Activate();
}
class Processor is just an example of any class that provides a public interface.
Problem
What if class Processor is operational only if certain conditions are met? For example, class Processor performs some file system operations on a directory X, and if directory X does not exist it can't operate at all.
Issue
Who is responsible for validating that the conditions are met and the class is operational?
Let's encapsulate the evaluation of those conditions in one function called Enabled().
Suggestion 1 - Caller responsibility
void main()
{
if (Enabled() )
Processor().Activate();
}
In this case, the code that instantiates class Processor is responsible for making sure these conditions are met before creating the object.
Cons
The caller may not know what the conditions are.
This doesn't resolve the bigger issue: what if other callers don't verify the conditions?
Suggestion 2 - Class responsibility
class Processor
{
public:
Processor()
{
// init m_bIsEnabled based on conditions
}
void Activate()
{
if (!m_bIsEnabled)
return;
// do something
}
/* some more interface functions*/
private:
bool m_bIsEnabled;
};
In this case, all public interface functions are disabled if the class is not enabled.
Cons
What if class Processor has numerous interface functions? Do we check the value of m_bIsEnabled at the beginning of each one?
What if, in the future, some developer extends the interface and forgets to check the value of m_bIsEnabled?
What default value does each function return when m_bIsEnabled == false?
Suggestion 3 - Factory
#include <memory>
class Processor
{
public:
    void Activate();
    bool IsEnabled();
    friend class ProcessorFactory;
private:
    Processor();  // private: only the factory can create instances
    bool m_bIsEnabled;
};
using Processor_ptr = std::shared_ptr<Processor>;
class ProcessorFactory
{
public:
    static Processor_ptr create()
    {
        Processor_ptr p(new Processor);  // allowed because the factory is a friend
        if (!p->IsEnabled())
            p.reset();  // hand back an empty pointer when the class cannot operate
        return p;
    }
};
This method is my favorite so far, since it prevents the object from being created if it cannot operate.
Cons
1. Perhaps an overkill?
Question
Which of the suggestions is preferable in terms of best practice? Are there other options?
I'd go for option #3, as options #1 and #2 are hard to track and force the developer to always check the enabled flag before each function runs. Moreover, you can extend option #3 by returning an empty object that implements the interface but does nothing.
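The "empty object" extension could look roughly like the sketch below. Everything here is an assumption made for illustration: the names IProcessor, RealProcessor, NullProcessor and create_processor are hypothetical, the conditions are reduced to a bool parameter, and Activate() returns bool (unlike the void version in the question) only so the difference is observable.

```cpp
#include <cassert>
#include <memory>

// Hypothetical interface extracted from Processor.
class IProcessor {
public:
    virtual ~IProcessor() = default;
    virtual bool Activate() = 0;  // true if any work was actually done
};

// Does the real work when the conditions are met.
class RealProcessor : public IProcessor {
public:
    bool Activate() override { return true; }  // stand-in for the real work
};

// Implements the same interface but deliberately does nothing.
class NullProcessor : public IProcessor {
public:
    bool Activate() override { return false; }
};

// Hypothetical factory: callers always get a usable object and
// never need to check for a null pointer.
std::unique_ptr<IProcessor> create_processor(bool conditions_met) {
    if (conditions_met) return std::make_unique<RealProcessor>();
    return std::make_unique<NullProcessor>();
}
```

The trade-off is that a failure to meet the conditions becomes silent, so this fits cases where doing nothing is an acceptable outcome.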
I am also a fan of #3, like igalk. But it is worth mentioning that this is a typical situation for the RAII pattern (Resource Acquisition Is Initialization).
You throw an exception in your constructor when the condition for creating the instance is not met:
class Processor
{
public:
    Processor()
    {
        // based on conditions
        if(!folderexist) // pseudocode
            throw cannotcreate; // pseudocode: throw a suitable exception object
    }
    // void Activate() not needed anymore
    /* some more interface functions*/
private:
    // bool m_bIsEnabled; not needed anymore
};
This is a common pattern in libraries that already use exceptions. I have no problem with exceptions as long as they are used properly. Unfortunately, I often see exceptions used as longjmp-style jumps or as shortcuts to save a few lines of code.
A failure can be a valid state of an instance. In that case, IMHO, it is better to create the instance and keep a "valid" flag (case #2). But most of the time the created object is worthless and the failure is of interest only to the creator. In that case RAII may be the better choice. The factory pattern avoids the exception in an elegant way if you do not want to, or cannot, use exceptions.
Actually, any or all of those strategies can be employed, depending on what code is able to detect the problem.
Generally, though, I would not encourage usage of an "enabled" or "valid state" flag in the object. If a problem is detected, such a flag is an opportunity to continue despite the problem (e.g. forget to check the flag, forget to set the flag, inappropriately ignore the value of the flag). Programmers are human and, as code gets more complicated, it becomes easier to make a mistake that is very hard to track down.
If the caller detects a problem then it should not create the object or (if the object already exists) it should destroy the object. It can then give an indication to its caller of the problem.
If the object itself detects a problem, then it should recover as best as possible, and give an indication to the caller.
Either way, this means that, if the object exists, it remains in a valid state until a problem is detected, and detection of a problem will cause the object not to exist.
What indications need to be given depend on severity of the problem, but include various approaches between an error code (which can be ignored by the caller) and an exception (which must be handled, and the problem corrected, if the program is to continue).
BTW: main() returns int, not void.

Mimicking C# 'new' (hiding a virtual method) in a C++ code generator

I'm developing a system which takes a set of compiled .NET assemblies and emits C++ code which can then be compiled to any platform having a C++ compiler. Of course, this involves some extensive trickery due to various things .NET does that C++ doesn't.
One such situation is the ability to hide virtual methods, such as the following in C#:
class A
{
virtual void MyMethod()
{ ... }
}
class B : A
{
override void MyMethod()
{ ... }
}
class C : B
{
new virtual void MyMethod()
{ ... }
}
class D : C
{
override void MyMethod()
{ ... }
}
I came up with a solution to this that seemed clever and did work, as in the following example:
namespace impdetails
{
template<class by_type>
struct redef {};
}
struct A
{
virtual void MyMethod( void );
};
struct B : A
{
virtual void MyMethod( void );
};
struct C : B
{
virtual void MyMethod( impdetails::redef<C> );
};
struct D : C
{
virtual void MyMethod( impdetails::redef<D> );
};
This does of course require that all the call sites for C::MyMethod and D::MyMethod construct and pass the dummy object, as in this example:
C *c_d = &d;
c_d->MyMethod( impdetails::redef<C>() );
I'm not worried about this extra source code overhead; the output of this system is mainly not intended for human consumption.
Unfortunately, it turns out this actually causes runtime overhead. Intuitively, one would expect that because impdetails::redef<> is empty, it would take no space and passing it would involve no code.
However, the C++ standard, for reasons I understand but don't totally agree with, mandates that objects cannot have zero size. This leaves us with a situation where the compiler actually emits code to create and pass the object.
In fact, at least on VC2008, I found that it even went to the trouble of zeroing the dummy byte, even in release builds! I'm not sure why that was necessary, but it makes me even more not want to do it this way.
If all else fails I could always change the actual name of the function, such as perhaps having MyMethod, MyMethod$1, and MyMethod$2. However, this causes more problems. For instance, $ is actually not legal in C++ identifiers (although compilers I've tested will allow it.) A totally acceptable identifier in the output program could also be an identifier in the input program, which suggests a more complex approach would be needed, making this a less attractive option.
It also turns out that there are other situations in this project where it would be nice to be able to modify method signatures using arbitrary type arguments, similar to how I'm passing a type to impdetails::redef<>.
Is there any other clever way to get around this, or am I stuck between adding overhead at every call site or mangling names?
After considering some other aspects of the system as well such as interfaces in .NET, I am starting to think maybe it's better - perhaps even more-or-less necessary - to not even use the C++ virtual calling mechanism at all. The more I consider, the messier using that mechanism is getting.
In this approach, each user object class would have a separate struct for the vtable (perhaps kept in a separate namespace like vtabletype::). The generated class would have a pointer member that would be initialized through some trickery to point to a static instance of the vtable. Virtual calls would explicitly use a member pointer from that vtable.
If done properly this should have the same performance as the compiler's own implementation would. I've confirmed it does on VC2008. (By contrast, just using straight C, which is what I was planning on earlier, would likely not perform as well, since compilers often optimize this into a register.)
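A rough sketch of what such explicit vtables might look like for the A/B/C/D hierarchy above, assuming C++17 (for aggregate initialization with a base class). The names ObjA, ObjC, ObjD, the helper functions and the int return values are all hypothetical and exist only to make the dispatch observable; a real generator would emit something more elaborate.

```cpp
#include <cassert>

struct ObjA;  // generated object type, defined below

namespace vtabletype {
// Slots are plain function-pointer data members, numbered rather than named.
struct A {
    int (*_0)(ObjA* self);  // slot introduced by A.MyMethod
};
struct C : A {              // keeps slot _0 from A (as overridden by B)
    int (*_1)(ObjA* self);  // new slot from `new virtual MyMethod` in C
};
}  // namespace vtabletype

struct ObjA { const vtabletype::A* vptr; };
struct ObjC : ObjA {};
struct ObjD : ObjC {};

// Method bodies; the return values only identify which body ran.
int B_MyMethod(ObjA*) { return 2; }  // B overrides slot _0
int D_MyMethod(ObjA*) { return 4; }  // D overrides slot _1

// Static vtable instance for D: slot _0 still holds B's override.
const vtabletype::C D_vtable = {{&B_MyMethod}, &D_MyMethod};

ObjD make_d() {
    ObjD d;
    d.vptr = &D_vtable;  // written in during allocation
    return d;
}

// Call through A's slot: what ((A)d).MyMethod() would do in C#.
int call_A_MyMethod(ObjA* o) { return o->vptr->_0(o); }

// Call through C's new slot: what ((C)d).MyMethod() would do in C#.
int call_C_MyMethod(ObjC* o) {
    return static_cast<const vtabletype::C*>(o->vptr)->_1(o);
}
```

For a D object, the call through A's slot reaches B's body and the call through C's slot reaches D's body, which matches the hiding behaviour of the C# example.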
It would be hellish to write code like this manually, but of course this isn't a concern for a generator. This approach does have some advantages in this application:
Because it's a much more explicit approach, one can be more sure that it's doing exactly what .NET specifies it should be doing with respect to newslot as well as selection of interface implementations.
It might be more efficient (depending on some internal details) than a more traditional C++ approach to interfaces, which would tend to invoke multiple inheritance.
In .NET, objects are considered to be fully constructed when their .ctor runs. This impacts how virtual functions behave. With explicit knowledge of the vtables, this could be achieved by writing it in during allocation. (Although putting the .ctor code into a normal member function is another option.)
It might avoid redundant data when implementing reflection.
It provides better control and knowledge of object layout, which could be useful for the garbage collector.
On the downside, it totally loses the C++ compiler's overloading feature with regard to the vtable entries: those entries are data members, not functions, so there is no overloading. In this case it would be tempting to just number the members (say _0, _1...) This may not be so bad when debugging, since once the pointer is followed, you'll see an actual, properly-named member function anyway.
I think I may end up doing it this way but by all means I'd like to hear if there are better options, as this is admittedly a rather complex approach (and problem.)

What is easiest way to force compiler to throw error?

To the folks marking this as duplicate: it is not; the other question addresses enums which are compile-time constants. This is not a constant integral expression thus the solution would be very different. Please see my code below more carefully before suggesting this has already been answered in another question, as it has not in any way. I am checking the value of a member variable on an object, information created at runtime, and I'm curious what I can do with that in this context.
I'm at a point where I need to use something to make the compiler fail if the user of my API does something she should not.
I don't know if that's possible. Is it? The options I mention above are primarily run-time, right?
For example, suppose you have a function:
void doSomethingIncredible(AwesomeClass amazingObject)
{
//perform life-changing work here except:
if (amazingObject.isntAmazing) //a bool property of object
//uh oh, life sucks, I refuse to compile this
Now calling this function will change how you live your life in all respects, except for occasions in which amazingObject has a particular property switched on, for example, in which case, I want the compiler to not even allow this to pass, i.e. cannot run the program.
Somewhere in the body of the function is a c++ mechanism that forces compiling to fail, which alerts the user that you cannot use this function for such an inferior un-amazing object.
Is this possible?
To clarify, this is something I want to do at compile time based on the contents of a variable, as shown in my example above. The suggestion to use static_assert does not apply here.
You can either static_assert() a condition at compile time (C++11)
static_assert(false, "Hey user! You suck!");
or use
#if (some_erroneous_condition_to_be_avoided)
#error "Hey user! You suck!"
#endif
with the preprocessor. #error is a standard directive, so this is not limited to GNU-compatible compilers (g++, clang++, etc.).
The only way I can see to get this checked at compile time is to subclass AwesomeClass and restrict the new class's construction so that it can only create objects for which amazingObject.isntAmazing is never true. Then change the signature to:
void doSomethingIncredible(AwesomeAndAmazingClass amazingObject)
That will prevent the call to the method for objects that are simply awesome but not amazing.
As a perhaps more illustrative example (not compiled, so consider it pseudocode):
class Thing {
protected:
    Color _color;
    Shape _shape;
public:
    Thing(Color color, Shape shape) : _color(color), _shape(shape) {}
};
class GreenThing : public Thing {
public:
    // assumes Color is an enumeration with a Green value
    GreenThing(Shape shape) : Thing(Color::Green, shape) {}
};
void doSomethingIncredible(GreenThing specialThing)
{
    // specialThing here is still a Thing, but is also checked at compile time
    // to always be a GreenThing
}
It is impossible. The value of the variable is decided at runtime, but you want to throw a compile-time error depending on the runtime value.
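The distinction can be made concrete with a sketch in which the property is moved out of a runtime member and into the type, so that it is known at compile time. This is a variation on the question's design, not something the question asked for: the template parameter IsAmazing, the value member and the int return type are all hypothetical.

```cpp
#include <cassert>

// Hypothetical variant of AwesomeClass: the "amazing" property is part of
// the type instead of being a bool member set at run time.
template <bool IsAmazing>
struct AwesomeClass {
    int value = 0;
};

template <bool IsAmazing>
int doSomethingIncredible(AwesomeClass<IsAmazing> amazingObject) {
    // Evaluated when the template is instantiated, so a call with
    // AwesomeClass<false> is rejected by the compiler.
    static_assert(IsAmazing, "doSomethingIncredible requires an amazing object");
    return amazingObject.value + 1;  // stand-in for the life-changing work
}
```

A call with an AwesomeClass<true> compiles, while passing an AwesomeClass<false> fails to compile with the static_assert message. As soon as the property depends on data that only exists at run time, this is no longer available and a run-time check is the only option.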