Building custom theories in Z3

Building custom theories in Z3 - c++

This question is a follow up to the following question.
Procedural Attachment in Z3
I have a predicate (I use the name "heavier" in this case) over two integers that I need to evaluate using a custom algorithm. I have written the following piece of code to do it. But I see that the parameters that get passed into the function CMTh_reduce_app() are not actual integers, but consts of type integer. What I need is 2 integers, so that I can evaluate the predicate and return the result (The operations done in the function CMTh_reduce_app() right now are meaningless).
#include<stdio.h>
#include<stdlib.h>
#include<string.h>
#include<stdarg.h>
#include<memory.h>
#include<z3.h>
#include "z3++.h"
#include<iostream>
using namespace z3;
using namespace std;
struct _CMTheoryData {
Z3_func_decl heavier;
};
typedef struct _CMTheoryData CMTheoryData;
Z3_context ctx;
//Exit function
void exitf(const char* message)
{
fprintf(stderr,"BUG: %s.\n", message);
exit(1);
}
//Check and print model if available
void check(Z3_context ctx)
{
Z3_model m = 0;
Z3_lbool result = Z3_check_and_get_model(ctx, &m);
switch (result) {
case Z3_L_FALSE:
printf("unsat\n");
break;
case Z3_L_UNDEF:
printf("unknown\n");
printf("potential model:\n%s\n", Z3_model_to_string(ctx, m));
break;
case Z3_L_TRUE:
printf("sat\n%s\n", Z3_model_to_string(ctx, m));
break;
}
if (m) {
Z3_del_model(ctx, m);
}
}
//Create logical context. Enable model generation, and set error handler
void error_handler(Z3_error_code e)
{
printf("Error code: %d\n", e);
exitf("incorrect use of Z3");
}
Z3_context mk_context_custom(Z3_config cfg, Z3_error_handler err)
{
Z3_context ctx;
Z3_set_param_value(cfg, "MODEL", "true");
ctx = Z3_mk_context(cfg);
#ifdef TRACING
Z3_trace_to_stderr(ctx);
#endif
Z3_set_error_handler(ctx, err);
return ctx;
}
Z3_context mk_context()
{
Z3_config cfg;
Z3_context ctx;
cfg = Z3_mk_config();
ctx = mk_context_custom(cfg, error_handler);
Z3_del_config(cfg);
return ctx;
}
//Shortcut for binary fn application
Z3_ast mk_binary_app(Z3_context ctx, Z3_func_decl f, Z3_ast x, Z3_ast y)
{
Z3_ast args[2] = {x, y};
return Z3_mk_app(ctx, f, 2, args);
}
//Shortcut to create an int
Z3_ast mk_int(Z3_context ctx, int v)
{
Z3_sort ty = Z3_mk_int_sort(ctx);
return Z3_mk_int(ctx, v, ty);
}
Z3_ast mk_var(Z3_context ctx, const char * name, Z3_sort ty)
{
Z3_symbol s = Z3_mk_string_symbol(ctx, name);
return Z3_mk_const(ctx, s, ty);
}
Z3_ast mk_int_var(Z3_context ctx, const char * name)
{
Z3_sort ty = Z3_mk_int_sort(ctx);
return mk_var(ctx, name, ty);
}
//Callback when final check is to be carried out
Z3_bool CMTh_final_check(Z3_theory t) {
printf("Final check\n");
return Z3_TRUE;
}
//Callback when theory is to be deleted
void CMTh_delete(Z3_theory t) {
CMTheoryData * td = (CMTheoryData *)Z3_theory_get_ext_data(t);
printf("Delete\n");
free(td);
}
//Callback to reduce a function application(definition of custom functions, predicates)
Z3_bool CMTh_reduce_app(Z3_theory t, Z3_func_decl d, unsigned n, Z3_ast const args[], Z3_ast * result) {
CMTheoryData * td = (CMTheoryData*)Z3_theory_get_ext_data(t);
cout<<Z3_ast_to_string(ctx, args[0])<<' '<<Z3_ast_to_string(ctx,args[1])<<endl;
if (d == td->heavier) {
cout<<"Reducing the fn \'heavier\'"<<endl;
if(Z3_is_eq_ast(ctx,mk_int(ctx, 1),args[0])||Z3_is_eq_ast(ctx,mk_int(ctx,2),args[0]))
{
*result = Z3_mk_true(Z3_theory_get_context(t));
return Z3_TRUE;;
}
else
{
*result = Z3_mk_false(Z3_theory_get_context(t));
return Z3_TRUE;;
}
}
return Z3_FALSE; // failed to simplify
}
Z3_theory mk_cm_theory(Z3_context ctx) {
Z3_sort heavier_domain[2];
Z3_symbol heavier_name = Z3_mk_string_symbol(ctx, "heavier");
Z3_sort B = Z3_mk_bool_sort(ctx);
CMTheoryData * td = (CMTheoryData*)malloc(sizeof(CMTheoryData));
Z3_theory Th = Z3_mk_theory(ctx, "cm_th", td);
heavier_domain[0] = Z3_mk_int_sort(ctx);
heavier_domain[1] = Z3_mk_int_sort(ctx);
td->heavier = Z3_theory_mk_func_decl(ctx, Th, heavier_name, 2, heavier_domain, B); //context, theory, name_of_fn, number of arguments, argument type list, return type
Z3_set_delete_callback(Th, CMTh_delete);
Z3_set_reduce_app_callback(Th, CMTh_reduce_app);
Z3_set_final_check_callback(Th, CMTh_final_check);
return Th;
}
main()
{
Z3_ast a_ast, b_ast, c_ast, f1, f3, r;
Z3_sort i;
Z3_pattern p;
Z3_app bound[2];
Z3_theory Th;
CMTheoryData * td;
printf("\nCustom theory example\n");
ctx = mk_context();
Th = mk_cm_theory(ctx);
td = (CMTheoryData*)Z3_theory_get_ext_data(Th);
a_ast = mk_int_var(ctx, "a");
b_ast = mk_int_var(ctx, "b");
bound[0] = (Z3_app)a_ast;
f1=mk_binary_app(ctx, td->heavier, a_ast, b_ast);
r= Z3_mk_exists_const(ctx, 0, 1, bound, 0, 0,f1);
printf("assert axiom:\n%s\n", Z3_ast_to_string(ctx, r));
Z3_assert_cnstr(ctx, r);
check(ctx);
}
I know the user theory plugin is not supported anymore, but I really need to get this working, so if I could get any information, it would be really helpful. I tried looking at the source code, but I didn't know where to get started with building new theories into it. So, I'd appreciate some help with the theory plugin.

Models are not going to be accessible to you from the
abstraction that the deprecated theory plugin provides.
The problem is going to be that models are constructed later in the game.
It would require rewriting some of the internals to accommodate this
(it is not impossible, but a very fair chunk of work).
My impression is that it would be simpler to use just the basic interaction
with Z3 where you declare the predicates as uninterpreted, check for SAT.
Then if the current constraints are satisfiable,
use the current model to evaluate arguments. If you have values, that contradict your
built-in procedural attachment, then assert new facts that rule these values out (and as many
other infeasible values as possible). I call this the "lazy loop approach".
This interaction model corresponds to how SMT solvers can use
SAT solvers without providing theory propagation (propagating truth values
when new atoms are assigned). You would have to do a bit more work during
conflict analysis/resolution in order to produce strong lemmas. So a hybrid
between the built-in theory and the lazy loop approach may in the end work out.
But before getting there I suggest to just use Z3 as is and use the current model to
calculate new blocking clauses.
Of course you lose something: instantiation of quantifiers will proceed somewhat eagerly
and it could very well be the case that this lazy loop approach will not work well in the presence
of quantifiers.

Related

How to wrap several boolean flags into struct to pass them to a function with a convenient syntax

In some testing code there's a helper function like this:
auto make_condiment(bool salt, bool pepper, bool oil, bool garlic) {
// assumes that first bool is salt, second is pepper,
// and so on...
//
// Make up something according to flags
return something;
};
which essentially builds up something based on some boolean flags.
What concerns me is that the meaning of each bool is hardcoded in the name of the parameters, which is bad because at the call site it's hard to remember which parameter means what (yeah, the IDE can likely eliminate the problem entirely by showing those names when tab completing, but still...):
// at the call site:
auto obj = make_condiment(false, false, true, true); // what ingredients am I using and what not?
Therefore, I'd like to pass a single object describing the settings. Furthermore, just aggregating them in an object, e.g. std::array<bool,4>.
I would like, instead, to enable a syntax like this:
auto obj = make_smart_condiment(oil + garlic);
which would generate the same obj as the previous call to make_condiment.
This new function would be:
auto make_smart_condiment(Ingredients ingredients) {
// retrieve the individual flags from the input
bool salt = ingredients.hasSalt();
bool pepper = ingredients.hasPepper();
bool oil = ingredients.hasOil();
bool garlic = ingredients.hasGarlic();
// same body as make_condiment, or simply:
return make_condiment(salt, pepper, oil, garlic);
}
Here's my attempt:
struct Ingredients {
public:
enum class INGREDIENTS { Salt = 1, Pepper = 2, Oil = 4, Garlic = 8 };
explicit Ingredients() : flags{0} {};
explicit Ingredients(INGREDIENTS const& f) : flags{static_cast<int>(f)} {};
private:
explicit Ingredients(int fs) : flags{fs} {}
int flags; // values 0-15
public:
bool hasSalt() const {
return flags % 2;
}
bool hasPepper() const {
return (flags / 2) % 2;
}
bool hasOil() const {
return (flags / 4) % 2;
}
bool hasGarlic() const {
return (flags / 8) % 2;
}
Ingredients operator+(Ingredients const& f) {
return Ingredients(flags + f.flags);
}
}
salt{Ingredients::INGREDIENTS::Salt},
pepper{Ingredients::INGREDIENTS::Pepper},
oil{Ingredients::INGREDIENTS::Oil},
garlic{Ingredients::INGREDIENTS::Garlic};
However, I have the feeling that I am reinventing the wheel.
Is there any better, or standard, way of accomplishing the above?
Is there maybe a design pattern that I could/should use?

I think you can remove some of the boilerplate by using a std::bitset. Here is what I came up with:
#include <bitset>
#include <cstdint>
#include <iostream>
class Ingredients {
public:
enum Option : uint8_t {
Salt = 0,
Pepper = 1,
Oil = 2,
Max = 3
};
bool has(Option o) const { return value_[o]; }
Ingredients(std::initializer_list<Option> opts) {
for (const Option& opt : opts)
value_.set(opt);
}
private:
std::bitset<Max> value_ {0};
};
int main() {
Ingredients ingredients{Ingredients::Salt, Ingredients::Pepper};
// prints "10"
std::cout << ingredients.has(Ingredients::Salt)
<< ingredients.has(Ingredients::Oil) << "\n";
}
You don't get the + type syntax, but it's pretty close. It's unfortunate that you have to keep an Option::Max, but not too bad. Also I decided to not use an enum class so that it can be accessed as Ingredients::Salt and implicitly converted to an int. You could explicitly access and cast if you wanted to use enum class.

If you want to use enum as flags, the usual way is merge them with operator | and check them with operator &
#include <iostream>
enum Ingredients{ Salt = 1, Pepper = 2, Oil = 4, Garlic = 8 };
// If you want to use operator +
Ingredients operator + (Ingredients a,Ingredients b) {
return Ingredients(a | b);
}
int main()
{
using std::cout;
cout << bool( Salt & Ingredients::Salt ); // has salt
cout << bool( Salt & Ingredients::Pepper ); // doesn't has pepper
auto sp = Ingredients::Salt + Ingredients::Pepper;
cout << bool( sp & Ingredients::Salt ); // has salt
cout << bool( sp & Ingredients::Garlic ); // doesn't has garlic
}
note: the current code (with only the operator +) would not work if you mix | and + like (Salt|Salt)+Salt.
You can also use enum class, just need to define the operators

I would look at a strong typing library like:
https://github.com/joboccara/NamedType
For a really good video talking about this:
https://www.youtube.com/watch?v=fWcnp7Bulc8
When I first saw this, I was a little dismissive, but because the advice came from people I respected, I gave it a chance. The video convinced me.
If you look at CPP Best Practices and dig deeply enough, you'll see the general advice to avoid boolean parameters, especially strings of them. And Jonathan Boccara gives good reasons why your code will be stronger if you don't directly use the raw types, for the very reason that you've already identified.

Elegantly attempt to execute various functions a specific way

I'm attempting to execute various functions sequentially n number of times, only moving forward if previous function did not return false (error) otherwise I reset and start all over again.
An example of a sequence would be :
Turn module ON : module.power(true), 3 attempts
Wait for a signal : module.signal(), 10 attempts
Send a message : module.sendSMS('test'), 3 attempts
Turn module OFF : module.power(false), 1 attempt
Each of those actions are done the same way, only changing the DEBUG text and the function to launch :
DEBUG_PRINT("Powering ON"); // This line changes
uint8_t attempts = 0;
uint8_t max_attempts = 3; // max_attempts changes
while(!module.power(true) && attempts < max_attempts){ // This line changes
attempts++;
DEBUG_PRINT(".");
if(attempts == max_attempts) {
DEBUG_PRINTLN(" - Failed.");
soft_reset(); // Start all over again
}
delay(100);
}
DEBUG_PRINTLN(" - Success");
wdt_reset(); // Reset watchdog timer, ready for next action
Is there an elegant way I can put this process in a function I could call to execute the required functions this particular way, for example something like :
void try_this_action(description, function, n_attempts)
Which would make actions 1-4 above like :
try_this_action("Powering ON", module.power(true), 3);
try_this_action("Waiting for signal", module.signal(), 10);
try_this_action("Sending SMS", module.sendSMS('test'), 3);
try_this_action("Powering OFF", module.power(false), 1);
A difficulty I have is that the functions called have different syntax (some take parameters, some other don't...). Is there a more elegant modulable way of doing this besides copy/paste the chunck of code everywhere I need it ?

A difficulty I have is that the functions called have different syntax
(some take parameters, some other don't...).
That is indeed an issue. Along with it you have the possibility of variation in actual function arguments for the same function.
Is there a more elegant
modulable way of doing this besides copy/paste the chunck of code
everywhere I need it ?
I think you could make a variadic function that uses specific knowledge of the functions to dispatch in order to deal with the differing function signatures and actual arguments. I'm doubtful that I would consider the result more elegant, though.
I would be inclined to approach this job via a macro, instead:
// desc: a descriptive string, evaluated once
// action: an expression to (re)try until it evaluates to true in boolean context
// attempts: the maximum number of times the action will be evaluated, itself evaluated once
#define try_this_action(desc, action, attempts) do { \
int _attempts = (attempts); \
DEBUG_PRINT(desc); \
while(_attempts && !(action)) { \
_attempts -= 1; \
DEBUG_PRINT("."); \
delay(100); \
} \
if (_attempts) { \
DEBUG_PRINTLN(" - Success"); \
} else { \
DEBUG_PRINTLN(" - Failed."); \
soft_reset(); \
} \
wdt_reset(); \
} while (0)
Usage would be just as you described:
try_this_action("Powering ON", module.power(true), 3);
etc.. Although the effect is as if you did insert the code for each action in each spot, using a macro such as this would yield code that is much easier to read, and that is not lexically repetitive. Thus, for example, if you ever need to change the the steps for trying actions, you can do it once for all by modifying the macro.

You need to make the function pointers all have the same signature. I would use something like this;
typedef int(*try_func)(void *arg);
And have a try_this_action(...) signature similar to the following;
void try_this_action(char * msg, int max_trys, try_func func, void *arg)
You would then implement your actions similar to this;
int power(void *pv)
{
int *p = pv;
int on_off = *p;
static int try = 0;
if (on_off && try++)
return 1;
return 0;
}
int signal(void *pv)
{
static int try = 0;
if (try++ > 6)
return 1;
return 0;
}
And call them like this;
int main(int c, char *v[])
{
int on_off = 1;
try_this_action("Powering ON", 3, power, &on_off);
try_this_action("Signaling", 10, signal, 0);
}

Functions of different arity may be abstracted with a generic signature (think about main). Instead of each giving each their own unique arguments, you simply supply them all with:
An argument count.
A vector of pointers to the arguments.
This is how your operating system treats all programs it runs anyways. I've given a very basic example below which you can inspect.
#include <stdio.h>
#include <stdlib.h>
/* Define total function count */
#define MAX_FUNC 2
/* Generic function signature */
typedef void (*func)(int, void **, const char *);
/* Function pointer array (NULL - initialized) */
func functions[MAX_FUNC];
/* Example function #1 */
void printName (int argc, void **argv, const char *desc) {
fprintf(stdout, "Running: %s\n", desc);
if (argc != 1 || argv == NULL) {
fprintf(stderr, "Err in %s!\n", desc);
return;
}
const char *name = (const char *)(argv[0]);
fprintf(stdout, "Name: %s\n", name);
}
/* Example function #2 */
void printMax (int argc, void **argv, const char *desc) {
fprintf(stdout, "Running: %s\n", desc);
if (argc != 2 || argv == NULL) {
fprintf(stderr, "Err in %s!\n", desc);
return;
}
int *a = (int *)(argv[0]), *b = (int *)(argv[1]);
fprintf(stdout, "Max: %d\n", (*a > *b) ? *a : *b);
}
int main (void) {
functions[0] = printName; // Set function #0
functions[1] = printMax; // Set function #1
int f_arg_count[2] = {1, 2}; // Function 0 takes 1 argument, function 1 takes 2.
const char *descs[2] = {"printName", "printMax"};
const char *name = "Natasi"; // Args of function 0
int a = 2, b = 3; // Args of function 1
int *args[2] = {&a, &b}; // Args of function 1 in an array.
void **f_args[2] = {(void **)(&name),
(void **)(&args)}; // All function args.
// Invoke all functions.
for (int i = 0; i < MAX_FUNC; i++) {
func f = functions[i];
const char *desc = descs[i];
int n = f_arg_count[i];
void **args = f_args[i];
f(n, args, desc);
}
return EXIT_SUCCESS;
}

You can use a variadic function, declaring in the parameter list first those parameters that are always present, then the variable part.
In following code we define a type for action functions, void returning having as parameter an argument list:
typedef void (*action)(va_list);
Then define the generic action routine that prepare for the action execution:
void try_this_action(char *szActionName, int trials, action fn_action, ...)
{
va_list args;
va_start(args, fn_action); //Init the argument list
DEBUG_PRINT(szActionName); // This line changes
uint8_t attempts = 0;
uint8_t max_attempts = trials; // max_attempts changes
//Here we call our function through the pointer passed as argument
while (!fn_action(args) && attempts < max_attempts)
{ // This line changes
attempts++;
DEBUG_PRINT(".");
if (attempts == max_attempts)
{
DEBUG_PRINTLN(" - Failed.");
soft_reset(); // Start all over again
}
delay(100);
}
DEBUG_PRINTLN(" - Success");
wdt_reset(); // Reset watchdog timer, ready for next action
va_end(args);
}
Each function must be coded to use an argument list:
int power(va_list args)
{
//First recover all our arguments using the va_arg macro
bool cond = va_arg(args, bool);
if (cond == true)
{
... //do something
return true;
}
return false;
}
The usage will be:
try_this_action("Powering ON", 3, module.power, true);
try_this_action("Waiting for signal", 10, module.signal);
try_this_action("Sending SMS", 3, module.sendSMS, "test");
try_this_action("Powering OFF", 1, module.power, false);
If you need more info on variadic functions and usage of stdarg.h macros google the net. Start from here https://en.cppreference.com/w/c/variadic.
It could be coded also as a macro implementation, as the excellent proposal in the John Bollinger answer, but in that case you must consider that each macro usage will instantiate the whole code, that could be eventually even better for speed (avoiding a function call), but could be not suitable on systems with limited memory (embedded), or where you need reference to the function try_this_action (inexistent).

Retrieve ptr from function call asmjit

I am trying to generate a function call using AsmJit to which I pass an char*. This char* is in itself retrieved from another function call. I tried out this:
typedef
const char* getStr();
const char* getStrImpl() {
return "hello pie";
}
void use_str_impl(int id, const char* c_str) {
// do stuff...
}
int main() {
JitRuntime rt;
CodeHolder code;
code.init(rt.getCodeInfo());
X86Compiler c(&code);
auto jitted_func = c.addFunc(FuncSignature0<const char*>(code.getCodeInfo().getCdeclCallConv()));
auto err = c.getLastError();
auto call = c.call((uint64_t) fooFuncImpl, FuncSignature0<intptr_t>());
X86Gpd res(call->getRet().getId());
auto call2 = c.call((uint64_t) send_input, FuncSignature2<void, int, intptr_t>());
err = !call2->setArg(0, Imm(42));
err = !call2->setArg(1, res);
c.ret();
c.endFunc();
err = c.finalize();
if(err) return 0;
size_t size = code.getCodeSize();
VMemMgr vm;
void* p = vm.alloc(size);
if (!p) return 0;
code.relocate(p);
auto fun = (entrypoint*) p;
fun();
}
It turns out this does not generate any instructions for the second parameter or second call to setArg. I also tried to use .newIntPtr and using move instructions to move the result of call into place. But this generated dec and add instructions which made no sense to me and my small experience with assembly. What is the correct way of doing this type of thing?
Btw I am using the AsmJit next branch.

I have done few corrections to your sample with some comments.
Better Usage of JitRuntime:
JitRuntime rt;
size_t size = code.getCodeSize();
VMemMgr vm;
....
void* p = vm.alloc(size);
if (!p) return 0;
code.relocate(p);
auto fun = (entrypoint*) p;
You have used JitRuntime just to setup the parameters for CodeHolder, but then avoided it and allocated the memory for the function yourself. While that's a valid use case it's not what most people do. Using runtime's add() is sufficient in most cases.
Invalid use of CCFuncCall::getRet():
X86Gpd res(call->getRet().getId());
The call node at this point doesn't have any return register assigned so it would return an invalid id. If you need to create a virtual register you always have to call compiler's newSomething(). AsmJit's compiler provides API to check for that case at runtime, if you are unsure:
// Would print 0
printf("%d", (int)c.isVirtRegValid(call->getRet().getId()));
The solution is to create a new virtual register and ASSIGN it to the function's return value. Assigning return value requires an index (like assigning an argument), the reason is that some functions may return multiple values(like 64-bit value in 32-bit mode), using 0 as index is sufficient most of the time.
X86Gp reg = c.newIntPtr("reg");
call->setRet(0, reg);
You can verify getRet() functionality:
X86Gp reg = c.newIntPtr("reg");
assert(call->getRet(0).isNone());
call->setRet(0, reg);
assert(call->getRet(0) == reg);
Fully working example:
#include <stdio.h>
#include <asmjit/asmjit.h>
const char* func_a() {
printf("func_a(): Called\n");
return "hello pie";
}
void func_b(int id, const char* c_str) {
printf("func_b(%d, %s): Called\n", id, c_str);
}
int main() {
using namespace asmjit;
JitRuntime rt;
CodeHolder code;
code.init(rt.getCodeInfo());
X86Compiler c(&code);
X86Gp reg = c.newIntPtr("reg");
// Compilation step...
c.addFunc(FuncSignature0<void>(code.getCodeInfo().getCdeclCallConv()));
auto call_a = c.call((uint64_t)func_a, FuncSignature0<intptr_t>());
call_a->setRet(0, reg);
auto call_b = c.call((uint64_t)func_b, FuncSignature2<void, int, intptr_t>());
call_b->setArg(0, Imm(42));
call_b->setArg(1, reg);
c.ret();
c.endFunc();
// Finalize does the following:
// - allocates virtual registers
// - inserts prolog / epilog
// - assembles to CodeHolder
auto err = c.finalize();
if (err) {
printf("COMPILER FAILED: %s\b", DebugUtils::errorAsString(err));
return 1;
}
typedef void (*EntryPoint)(void);
EntryPoint entry;
// Adds function to the runtime. Should be freed by rt.release().
// Function is valid until the runtime is valid if not released.
err = rt.add(&entry, &code);
if (err) {
printf("RUNTIME FAILED: %s\b", DebugUtils::errorAsString(err));
return 1;
}
entry();
return 0;
}

I am trying to create a function that receives and returns a double. For the call method I used the approach with Mem. At the end I need to save the result in the variable xmm1.
I can't identify the error. The sine function is called correctly. But for the final assembler generation error occurs.
JitRuntime rt;
CodeHolder code;
code.init(rt.codeInfo());
asmjit::x86::Compiler cc(&code);
asmjit::x86::Gp reg = cc.newIntPtr("reg");
asmjit::Zone zonee(1024);
asmjit::ConstPool constPool(&zonee);
asmjit::Label constPoolLabel = cc.newLabel();
// Compilation step...
// c.addFunc(asmjit::FuncSignatureT<void>(code.codeInfo().getCdeclCallConv()));
cc.addFunc(asmjit::FuncSignatureT<void>());
auto call_a = cc.call((uint64_t)func_a, FuncSignatureT<intptr_t>());
call_a->setRet(0, reg);
auto call_b = cc.call((uint64_t)func_b, FuncSignatureT<void, int, intptr_t>());
call_b->setArg(0, Imm(42));
call_b->setArg(1, reg);
auto seno = [&](double value) {
size_t valueOffset;
double seno = static_cast<double_t>(std::sin(value));
cout << " seno " << seno << endl;
constPool.add(&seno, sizeof(double), valueOffset);
return asmjit::x86::ptr(constPoolLabel, valueOffset);
};
asmjit::x86::Mem mem;
double test = 180.5;
auto call_c = cc.call(seno(test), asmjit::FuncSignatureT<double_t>());
call_c->setArg(0, asmjit::Imm(test));
call_c->_setRet(0, mem);
cc.movsd(asmjit::x86::xmm1, mem);
cc.ret();
cc.endFunc();
// Finalize does the following:
// - allocates virtual registers
// - inserts prolog / epilog
// - assembles to CodeHolder
auto err = cc.finalize();
if (err) {
printf("COMPILER FAILED: %s\b", DebugUtils::errorAsString(err));
return;
}
typedef void (*EntryPoint)(void);
EntryPoint entry;
// Adds function to the runtime. Should be freed by rt.release().
// Function is valid until the runtime is valid if not released.
err = rt.add(&entry, &code);
if (err) {
printf("RUNTIME FAILED: %s\b", DebugUtils::errorAsString(err));
return;
}
entry();
return;

perhaps the memory object should relate to some memory address?
Mem mem = qword_ptr ((uint64_t) &test);

What's the most efficient way to do recursive XPath queries using libxml2?

I've written a C++ wrapper function for libxml2 that makes it easy for me to do queries on an XML document:
bool XPathQuery(
const std::string& doc,
const std::string& query,
XPathResults& results);
But I have a problem: I need to be able to do another XPath query on the results of my first query.
Currently I do this by storing the entire subdocument in my XPathResult object, and then I pass XPathResult.subdoc into the XPathQuery function. This is awfully inefficient.
So I'm wondering ... does libxml2 provide anything that would make it easy to store the context of an xpath query (a reference to a node, perhaps?) and then perform another query using that reference as the xpath root?

You should reuse the xmlXPathContext and just change its node member.
#include <stdio.h>
#include <libxml/xpath.h>
#include <libxml/xmlerror.h>
static xmlChar buffer[] =
"<?xml version=\"1.0\"?>\n<foo><bar><baz/></bar></foo>\n";
int
main()
{
const char *expr = "/foo";
xmlDocPtr document = xmlReadDoc(buffer,NULL,NULL,XML_PARSE_COMPACT);
xmlXPathContextPtr ctx = xmlXPathNewContext(document);
//ctx->node = xmlDocGetRootElement(document);
xmlXPathCompExprPtr p = xmlXPathCtxtCompile(ctx, (xmlChar *)expr);
xmlXPathObjectPtr res = xmlXPathCompiledEval(p, ctx);
if (XPATH_NODESET != res->type)
return 1;
fprintf(stderr, "Got object from first query:\n");
xmlXPathDebugDumpObject(stdout, res, 0);
xmlNodeSetPtr ns = res->nodesetval;
if (!ns->nodeNr)
return 1;
ctx->node = ns->nodeTab[0];
xmlXPathFreeObject(res);
expr = "bar/baz";
p = xmlXPathCtxtCompile(ctx, (xmlChar *)expr);
res = xmlXPathCompiledEval(p, ctx);
if (XPATH_NODESET != res->type)
return 1;
ns = res->nodesetval;
if (!ns->nodeNr)
return 1;
fprintf(stderr, "Got object from second query:\n");
xmlXPathDebugDumpObject(stdout, res, 0);
xmlXPathFreeContext(ctx);
return 0;
}

C state-machine design [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 2 years ago.
Improve this question
I am crafting a small project in mixed C and C++. I am building one small-ish state-machine at the heart of one of my worker thread.
I was wondering if you gurus on SO would share your state-machine design techniques.
NOTE: I am primarily after tried & tested implementation techniques.
UPDATED: Based on all the great input gathered on SO, I've settled on this architecture:

State machines that I've designed before (C, not C++) have all come down to a struct array and a loop. The structure basically consists of a state and event (for look-up) and a function that returns the new state, something like:
typedef struct {
int st;
int ev;
int (*fn)(void);
} tTransition;
Then you define your states and events with simple defines (the ANY ones are special markers, see below):
#define ST_ANY -1
#define ST_INIT 0
#define ST_ERROR 1
#define ST_TERM 2
: :
#define EV_ANY -1
#define EV_KEYPRESS 5000
#define EV_MOUSEMOVE 5001
Then you define all the functions that are called by the transitions:
static int GotKey (void) { ... };
static int FsmError (void) { ... };
All these function are written to take no variables and return the new state for the state machine. In this example global variables are used for passing any information into the state functions where necessary.
Using globals isn't as bad as it sounds since the FSM is usually locked up inside a single compilation unit and all variables are static to that unit (which is why I used quotes around "global" above - they're more shared within the FSM, than truly global). As with all globals, it requires care.
The transitions array then defines all possible transitions and the functions that get called for those transitions (including the catch-all last one):
tTransition trans[] = {
{ ST_INIT, EV_KEYPRESS, &GotKey},
: :
{ ST_ANY, EV_ANY, &FsmError}
};
#define TRANS_COUNT (sizeof(trans)/sizeof(*trans))
What that means is: if you're in the ST_INIT state and you receive the EV_KEYPRESS event, make a call to GotKey.
The workings of the FSM then become a relatively simple loop:
state = ST_INIT;
while (state != ST_TERM) {
event = GetNextEvent();
for (i = 0; i < TRANS_COUNT; i++) {
if ((state == trans[i].st) || (ST_ANY == trans[i].st)) {
if ((event == trans[i].ev) || (EV_ANY == trans[i].ev)) {
state = (trans[i].fn)();
break;
}
}
}
}
As alluded to above, note the use of ST_ANY as wild-cards, allowing an event to call a function no matter the current state. EV_ANY also works similarly, allowing any event at a specific state to call a function.
It can also guarantee that, if you reach the end of the transitions array, you get an error stating your FSM hasn't been built correctly (by using the ST_ANY/EV_ANY combination.
I've used code similar for this on a great many communications projects, such as an early implementation of communications stacks and protocols for embedded systems. The big advantage was its simplicity and relative ease in changing the transitions array.
I've no doubt there will be higher-level abstractions which may be more suitable nowadays but I suspect they'll all boil down to this same sort of structure.
And, as ldog states in a comment, you can avoid the globals altogether by passing a structure pointer to all functions (and using that in the event loop). This will allow multiple state machines to run side-by-side without interference.
Just create a structure type which holds the machine-specific data (state at a bare minimum) and use that instead of the globals.
The reason I've rarely done that is simply because most of the state machines I've written have been singleton types (one-off, at-process-start, configuration file reading for example), not needing to run more than one instance. But it has value if you need to run more than one.

The other answers are good, but a very "lightweight" implementation I've used when the state machine is very simple looks like:
enum state { ST_NEW, ST_OPEN, ST_SHIFT, ST_END };
enum state current_state = ST_NEW;
while (current_state != ST_END)
{
input = get_input();
switch (current_state)
{
case ST_NEW:
/* Do something with input and set current_state */
break;
case ST_OPEN:
/* Do something different and set current_state */
break;
/* ... etc ... */
}
}
I would use this when the state machine is simple enough that the function pointer & state transition table approach is overkill. This is often useful for character-by-character or word-by-word parsing.

Pardon me for breaking every rule in computer science, but a state machine is one of the few (I can count only two off hand) places where a goto statement is not only more efficient, but also makes your code cleaner and easier to read. Because goto statements are based on labels, you can name your states instead of having to keep track of a mess of numbers or use an enum. It also makes for much cleaner code since you don't need all the extra cruft of function pointers or huge switch statements and while loops. Did I mention it's more efficient too?
Here's what a state machine might look like:
void state_machine() {
first_state:
// Do some stuff here
switch(some_var) {
case 0:
goto first_state;
case 1:
goto second_state;
default:
return;
}
second_state:
// Do some stuff here
switch(some_var) {
case 0:
goto first_state;
case 1:
goto second_state;
default:
return;
}
}
You get the general idea. The point is that you can implement the state machine in an efficient way and one that is relatively easy to read and screams at the reader that they are looking at a state machine. Note that if you are using goto statements, you must still be careful as it is very easy to shoot yourself in the foot while doing so.

You might consider the State Machine Compiler http://smc.sourceforge.net/
This splendid open source utility accepts a description of a state machine in a simple language and compiles it to any one of a dozen or so languages - including C and C++. The utility itself is written in Java, and can be included as part of a build.
The reason to do this, rather than hand coding using GoF State pattern or any other approach, is that once your state machine is expressed as code, the underlying structure tends to disappear under the weight of boilerplate that needs to be generated to support it. Using this approach gives you an excellent separation of concerns, and you keep the structure of your state machine 'visible'. The auto-generated code goes into modules that you don't need to touch, so that you can go back and fiddle with the state machine's structure without impacting the supporting code that you have written.
Sorry, I am being over-enthusiastic, and doubtless putting everyone off. But it is a top notch utility, and well-documented too.

Be sure to check the work of Miro Samek (blog State Space, website State Machines & Tools), whose articles at the C/C++ Users Journal were great.
The website contains a complete (C/C++) implementation in both open source and commercial license of a state machine framework (QP Framework), an event handler (QEP), a basic modeling tool (QM) and a tracing tool (QSpy) which allow to draw state machines, create code and debug them.
The book contains an extensive explanation on the what/why of the implementation and how to use it and is also great material to gain understanding of the fundamentals of hierachical and finite state machines.
The website also contains links to several board support packages for use of the software with embedded platforms.

I've done something similar to what paxdiablo describes, only instead of an array of state/event transitions, I set up a 2-dimensional array of function pointers, with the event value as the index of one axis and the current state value as the other. Then I just call state = state_table[event][state](params) and the right thing happens. Cells representing invalid state/event combinations get a pointer to a function that says so, of course.
Obviously, this only works if the state and event values are both contiguous ranges and start at 0 or close enough.

A very nice template-based C++ state machine "framework" is given by Stefan Heinzmann in his article.
Since there's no link to a complete code download in the article, I've taken the liberty to paste the code into a project and check it out. The stuff below is tested and includes the few minor but pretty much obvious missing pieces.
The major innovation here is that the compiler is generating very efficient code. Empty entry/exit actions have no cost. Non-empty entry/exit actions are inlined. The compiler is also verifying the completeness of the statechart. Missing actions generate linking errors. The only thing that is not caught is the missing Top::init.
This is a very nice alternative to Miro Samek's implementation, if you can live without what's missing -- this is far from a complete UML Statechart implementation, although it correctly implements the UML semantics, whereas Samek's code by design doesn't handle exit/transition/entry actions in correct order.
If this code works for what you need to do, and you have a decent C++ compiler for your system, it will probably perform better than Miro's C/C++ implementation. The compiler generates a flattened, O(1) transition state machine implementation for you. If the audit of assembly output confirms that the optimizations work as desired, you get close to theoretical performance. Best part: it's relatively tiny, easy to understand code.
#ifndef HSM_HPP
#define HSM_HPP
// This code is from:
// Yet Another Hierarchical State Machine
// by Stefan Heinzmann
// Overload issue 64 december 2004
// http://accu.org/index.php/journals/252
/* This is a basic implementation of UML Statecharts.
* The key observation is that the machine can only
* be in a leaf state at any given time. The composite
* states are only traversed, never final.
* Only the leaf states are ever instantiated. The composite
* states are only mechanisms used to generate code. They are
* never instantiated.
*/
// Helpers
// A gadget from Herb Sutter's GotW #71 -- depends on SFINAE
template<class D, class B>
class IsDerivedFrom {
class Yes { char a[1]; };
class No { char a[10]; };
static Yes Test(B*); // undefined
static No Test(...); // undefined
public:
enum { Res = sizeof(Test(static_cast<D*>(0))) == sizeof(Yes) ? 1 : 0 };
};
template<bool> class Bool {};
// Top State, Composite State and Leaf State
template <typename H>
struct TopState {
typedef H Host;
typedef void Base;
virtual void handler(Host&) const = 0;
virtual unsigned getId() const = 0;
};
template <typename H, unsigned id, typename B>
struct CompState;
template <typename H, unsigned id, typename B = CompState<H, 0, TopState<H> > >
struct CompState : B {
typedef B Base;
typedef CompState<H, id, Base> This;
template <typename X> void handle(H& h, const X& x) const { Base::handle(h, x); }
static void init(H&); // no implementation
static void entry(H&) {}
static void exit(H&) {}
};
template <typename H>
struct CompState<H, 0, TopState<H> > : TopState<H> {
typedef TopState<H> Base;
typedef CompState<H, 0, Base> This;
template <typename X> void handle(H&, const X&) const {}
static void init(H&); // no implementation
static void entry(H&) {}
static void exit(H&) {}
};
template <typename H, unsigned id, typename B = CompState<H, 0, TopState<H> > >
struct LeafState : B {
typedef H Host;
typedef B Base;
typedef LeafState<H, id, Base> This;
template <typename X> void handle(H& h, const X& x) const { Base::handle(h, x); }
virtual void handler(H& h) const { handle(h, *this); }
virtual unsigned getId() const { return id; }
static void init(H& h) { h.next(obj); } // don't specialize this
static void entry(H&) {}
static void exit(H&) {}
static const LeafState obj; // only the leaf states have instances
};
template <typename H, unsigned id, typename B>
const LeafState<H, id, B> LeafState<H, id, B>::obj;
// Transition Object
template <typename C, typename S, typename T>
// Current, Source, Target
struct Tran {
typedef typename C::Host Host;
typedef typename C::Base CurrentBase;
typedef typename S::Base SourceBase;
typedef typename T::Base TargetBase;
enum { // work out when to terminate template recursion
eTB_CB = IsDerivedFrom<TargetBase, CurrentBase>::Res,
eS_CB = IsDerivedFrom<S, CurrentBase>::Res,
eS_C = IsDerivedFrom<S, C>::Res,
eC_S = IsDerivedFrom<C, S>::Res,
exitStop = eTB_CB && eS_C,
entryStop = eS_C || eS_CB && !eC_S
};
// We use overloading to stop recursion.
// The more natural template specialization
// method would require to specialize the inner
// template without specializing the outer one,
// which is forbidden.
static void exitActions(Host&, Bool<true>) {}
static void exitActions(Host&h, Bool<false>) {
C::exit(h);
Tran<CurrentBase, S, T>::exitActions(h, Bool<exitStop>());
}
static void entryActions(Host&, Bool<true>) {}
static void entryActions(Host& h, Bool<false>) {
Tran<CurrentBase, S, T>::entryActions(h, Bool<entryStop>());
C::entry(h);
}
Tran(Host & h) : host_(h) {
exitActions(host_, Bool<false>());
}
~Tran() {
Tran<T, S, T>::entryActions(host_, Bool<false>());
T::init(host_);
}
Host& host_;
};
// Initializer for Compound States
template <typename T>
struct Init {
typedef typename T::Host Host;
Init(Host& h) : host_(h) {}
~Init() {
T::entry(host_);
T::init(host_);
}
Host& host_;
};
#endif // HSM_HPP
Test code follows.
#include <cstdio>
#include "hsm.hpp"
#include "hsmtest.hpp"
/* Implements the following state machine from Miro Samek's
* Practical Statecharts in C/C++
*
* |-init-----------------------------------------------------|
* | s0 |
* |----------------------------------------------------------|
* | |
* | |-init-----------| |-------------------------| |
* | | s1 |---c--->| s2 | |
* | |----------------|<--c----|-------------------------| |
* | | | | | |
* |<-d-| |-init-------| | | |-init----------------| | |
* | | | s11 |<----f----| | s21 | | |
* | /--| |------------| | | |---------------------| | |
* | a | | | | | | | | |
* | \->| | |------g--------->|-init------| | | |
* | | |____________| | | |-b->| s211 |---g--->|
* | |----b---^ |------f------->| | | | |
* | |________________| | |<-d-|___________|<--e----|
* | | |_____________________| | |
* | |_________________________| |
* |__________________________________________________________|
*/
class TestHSM;
typedef CompState<TestHSM,0> Top;
typedef CompState<TestHSM,1,Top> S0;
typedef CompState<TestHSM,2,S0> S1;
typedef LeafState<TestHSM,3,S1> S11;
typedef CompState<TestHSM,4,S0> S2;
typedef CompState<TestHSM,5,S2> S21;
typedef LeafState<TestHSM,6,S21> S211;
enum Signal { A_SIG, B_SIG, C_SIG, D_SIG, E_SIG, F_SIG, G_SIG, H_SIG };
class TestHSM {
public:
TestHSM() { Top::init(*this); }
~TestHSM() {}
void next(const TopState<TestHSM>& state) {
state_ = &state;
}
Signal getSig() const { return sig_; }
void dispatch(Signal sig) {
sig_ = sig;
state_->handler(*this);
}
void foo(int i) {
foo_ = i;
}
int foo() const {
return foo_;
}
private:
const TopState<TestHSM>* state_;
Signal sig_;
int foo_;
};
bool testDispatch(char c) {
static TestHSM test;
if (c<'a' || 'h'<c) {
return false;
}
printf("Signal<-%c", c);
test.dispatch((Signal)(c-'a'));
printf("\n");
return true;
}
int main(int, char**) {
testDispatch('a');
testDispatch('e');
testDispatch('e');
testDispatch('a');
testDispatch('h');
testDispatch('h');
return 0;
}
#define HSMHANDLER(State) \
template<> template<typename X> inline void State::handle(TestHSM& h, const X& x) const
HSMHANDLER(S0) {
switch (h.getSig()) {
case E_SIG: { Tran<X, This, S211> t(h);
printf("s0-E;");
return; }
default:
break;
}
return Base::handle(h, x);
}
HSMHANDLER(S1) {
switch (h.getSig()) {
case A_SIG: { Tran<X, This, S1> t(h);
printf("s1-A;"); return; }
case B_SIG: { Tran<X, This, S11> t(h);
printf("s1-B;"); return; }
case C_SIG: { Tran<X, This, S2> t(h);
printf("s1-C;"); return; }
case D_SIG: { Tran<X, This, S0> t(h);
printf("s1-D;"); return; }
case F_SIG: { Tran<X, This, S211> t(h);
printf("s1-F;"); return; }
default: break;
}
return Base::handle(h, x);
}
HSMHANDLER(S11) {
switch (h.getSig()) {
case G_SIG: { Tran<X, This, S211> t(h);
printf("s11-G;"); return; }
case H_SIG: if (h.foo()) {
printf("s11-H");
h.foo(0); return;
} break;
default: break;
}
return Base::handle(h, x);
}
HSMHANDLER(S2) {
switch (h.getSig()) {
case C_SIG: { Tran<X, This, S1> t(h);
printf("s2-C"); return; }
case F_SIG: { Tran<X, This, S11> t(h);
printf("s2-F"); return; }
default: break;
}
return Base::handle(h, x);
}
HSMHANDLER(S21) {
switch (h.getSig()) {
case B_SIG: { Tran<X, This, S211> t(h);
printf("s21-B;"); return; }
case H_SIG: if (!h.foo()) {
Tran<X, This, S21> t(h);
printf("s21-H;"); h.foo(1);
return;
} break;
default: break;
}
return Base::handle(h, x);
}
HSMHANDLER(S211) {
switch (h.getSig()) {
case D_SIG: { Tran<X, This, S21> t(h);
printf("s211-D;"); return; }
case G_SIG: { Tran<X, This, S0> t(h);
printf("s211-G;"); return; }
}
return Base::handle(h, x);
}
#define HSMENTRY(State) \
template<> inline void State::entry(TestHSM&) { \
printf(#State "-ENTRY;"); \
}
HSMENTRY(S0)
HSMENTRY(S1)
HSMENTRY(S11)
HSMENTRY(S2)
HSMENTRY(S21)
HSMENTRY(S211)
#define HSMEXIT(State) \
template<> inline void State::exit(TestHSM&) { \
printf(#State "-EXIT;"); \
}
HSMEXIT(S0)
HSMEXIT(S1)
HSMEXIT(S11)
HSMEXIT(S2)
HSMEXIT(S21)
HSMEXIT(S211)
#define HSMINIT(State, InitState) \
template<> inline void State::init(TestHSM& h) { \
Init<InitState> i(h); \
printf(#State "-INIT;"); \
}
HSMINIT(Top, S0)
HSMINIT(S0, S1)
HSMINIT(S1, S11)
HSMINIT(S2, S21)
HSMINIT(S21, S211)

The technique I like for state machines (at least ones for program control) is to use function pointers. Each state is represented by a different function. The function takes an input symbol and returns the function pointer for the next state. The central dispatch loop monitors takes the next input, feeds it to the current state, and processes the result.
The typing on it gets a little odd, since C doesn't have a way to indicate types of function pointers returning themselves, so the state functions return void*. But you can do something like this:
typedef void* (*state_handler)(input_symbol_t);
void dispatch_fsm()
{
state_handler current = initial_handler;
/* Let's assume returning null indicates end-of-machine */
while (current) {
current = current(get_input);
}
}
Then your individual state functions can switch on their input to process and return the appropriate value.

Simplest case
enum event_type { ET_THIS, ET_THAT };
union event_parm { uint8_t this; uint16_t that; }
static void handle_event(enum event_type event, union event_parm parm)
{
static enum { THIS, THAT } state;
switch (state)
{
case THIS:
switch (event)
{
case ET_THIS:
// Handle event.
break;
default:
// Unhandled events in this state.
break;
}
break;
case THAT:
// Handle state.
break;
}
}
Points:
State is private, not only to the compilation unit but also to the event_handler.
Special cases may be handled separately from the main switch using whatever construct deemed necessary.
More complex case
When the switch gets bigger than a couple of screens full, split it into functions that handle each state, using a state table to look up the function directly. The state is still private to the event handler. The state handler functions return the next state. If needed some events can still receive special treatment in the main event handler. I like to throw in pseudo-events for state entry and exit and perhaps state machine start:
enum state_type { THIS, THAT, FOO, NA };
enum event_type { ET_START, ET_ENTER, ET_EXIT, ET_THIS, ET_THAT, ET_WHATEVER, ET_TIMEOUT };
union event_parm { uint8_t this; uint16_t that; };
static void handle_event(enum event_type event, union event_parm parm)
{
static enum state_type state;
static void (* const state_handler[])(enum event_type event, union event_parm parm) = { handle_this, handle_that };
enum state_type next_state = state_handler[state](event, parm);
if (NA != next_state && state != next_state)
{
(void)state_handler[state](ET_EXIT, 0);
state = next_state;
(void)state_handler[state](ET_ENTER, 0);
}
}
I am not sure if I nailed the syntax, especially regarding the array of function pointers. I have not run any of this through a compiler. Upon review, I noticed that I forgot to explicitly discard the next state when handling the pseudo events (the (void) parenthesis before the call to state_handler()). This is something that I like to do even if compilers accept the omission silently. It tells readers of the code that "yes, I did indeed mean to call the function without using the return value", and it may stop static analysis tools from warning about it. It may be idiosyncratic because I do not recall having seen anybody else doing this.
Points: adding a tiny bit of complexity (checking if the next state is different from the current), can avoid duplicated code elsewhere, because the state handler functions can enjoy the pseudo events that occur when a state is entered and left. Remember that state cannot change when handling the pseudo events, because the result of the state handler is discarded after these events. You may of course choose to modify the behaviour.
A state handler would look like so:
static enum state_type handle_this(enum event_type event, union event_parm parm)
{
enum state_type next_state = NA;
switch (event)
{
case ET_ENTER:
// Start a timer to do whatever.
// Do other stuff necessary when entering this state.
break;
case ET_WHATEVER:
// Switch state.
next_state = THAT;
break;
case ET_TIMEOUT:
// Switch state.
next_state = FOO;
break;
case ET_EXIT:
// Stop the timer.
// Generally clean up this state.
break;
}
return next_state;
}
More complexity
When the compilation unit becomes too large (whatever you feel that is, I should say around 1000 lines), put each state handler in a separate file. When each state handler becomes longer than a couple of screens, split each event out in a separate function, similar to the way that the state switch was split. You may do this in a number of ways, separately from the state or by using a common table, or combining various schemes. Some of them have been covered here by others. Sort your tables and use binary search if speed is a requirement.
Generic programming
I should like the preprocessor to deal with issues such as sorting tables or even generating state machines from descriptions, allowing you to "write programs about programs". I believe this is what the Boost people are exploiting C++ templates for, but I find the syntax cryptic.
Two-dimensional tables
I have used state/event tables in the past but I have to say that for the simplest cases I do not find them necessary and I prefer the clarity and readability of the switch statement even if it does extend past one screen full. For more complex cases the tables quickly get out of hand as others have noted. The idioms I present here allow you to add a slew of events and states when you feel like it, without having to maintain a memory consuming table (even if it may be program memory).
Disclaimer
Special needs may render these idioms less useful, but I have found them to be very clear and maintainable.

Saw this somewhere
#define FSM
#define STATE(x) s_##x :
#define NEXTSTATE(x) goto s_##x
FSM {
STATE(x) {
...
NEXTSTATE(y);
}
STATE(y) {
...
if (x == 0)
NEXTSTATE(y);
else
NEXTSTATE(x);
}
}

Extremely untested, but fun to code, now in a more refined version than my original answer; up-to-date versions can be found at mercurial.intuxication.org:
sm.h
#ifndef SM_ARGS
#error "SM_ARGS undefined: " \
"use '#define SM_ARGS (void)' to get an empty argument list"
#endif
#ifndef SM_STATES
#error "SM_STATES undefined: " \
"you must provide a list of comma-separated states"
#endif
typedef void (*sm_state) SM_ARGS;
static const sm_state SM_STATES;
#define sm_transit(STATE) ((sm_state (*) SM_ARGS)STATE)
#define sm_def(NAME) \
static sm_state NAME ## _fn SM_ARGS; \
static const sm_state NAME = (sm_state)NAME ## _fn; \
static sm_state NAME ## _fn SM_ARGS
example.c
#include <stdio.h>
#define SM_ARGS (int i)
#define SM_STATES EVEN, ODD
#include "sm.h"
sm_def(EVEN)
{
printf("even %i\n", i);
return ODD;
}
sm_def(ODD)
{
printf("odd %i\n", i);
return EVEN;
}
int main(void)
{
int i = 0;
sm_state state = EVEN;
for(; i < 10; ++i)
state = sm_transit(state)(i);
return 0;
}

I really liked paxdiable's answer and decided to implement all the missing features for my application like guard variables and state machine specific data.
I uploaded my implementation to this site to share with the community. It has been tested using IAR Embedded Workbench for ARM.
https://sourceforge.net/projects/compactfsm/

Another interesting open source tool is Yakindu Statechart Tools on statecharts.org. It makes use of Harel statecharts and thus provides hierarchical and parallel states and generates C and C++ (as well as Java) code. It does not make use of libraries but follows a 'plain code' approach. The code basically applies switch-case structures. The code generators can also be customized. Additionally the tool provides many other features.

Here is an example of a Finite State Machine for Linux that uses message queues as the events. The events are put on the queue and handled in order. The state changes depending on what happens for each event.
This is an example for a data connection with states like:
Uninitialized
Initialized
Connected
MTU Negotiated
Authenticated
One little extra feature I added was a timestamp for each message/event. The event handler will ignore events that are too old (they have expired). This can happen a lot in the real world where you might get stuck in a state unexpectedly.
This example runs on Linux, use the Makefile below to compile it and play around with it.
state_machine.c
#include <stdio.h>
#include <stdint.h>
#include <assert.h>
#include <unistd.h> // sysconf()
#include <errno.h> // errno
#include <string.h> // strerror()
#include <sys/time.h> // gettimeofday()
#include <fcntl.h> // For O_* constants
#include <sys/stat.h> // For mode constants
#include <mqueue.h>
#include <poll.h>
//------------------------------------------------
// States
//------------------------------------------------
typedef enum
{
ST_UNKNOWN = 0,
ST_UNINIT,
ST_INIT,
ST_CONNECTED,
ST_MTU_NEGOTIATED,
ST_AUTHENTICATED,
ST_ERROR,
ST_DONT_CHANGE,
ST_TERM,
} fsmState_t;
//------------------------------------------------
// Events
//------------------------------------------------
typedef enum
{
EV_UNKNOWN = 0,
EV_INIT_SUCCESS,
EV_INIT_FAIL,
EV_MASTER_CMD_MSG,
EV_CONNECT_SUCCESS,
EV_CONNECT_FAIL,
EV_MTU_SUCCESS,
EV_MTU_FAIL,
EV_AUTH_SUCCESS,
EV_AUTH_FAIL,
EV_TX_SUCCESS,
EV_TX_FAIL,
EV_DISCONNECTED,
EV_DISCON_FAILED,
EV_LAST_ENTRY,
} fsmEvName_t;
typedef struct fsmEvent_type
{
fsmEvName_t name;
struct timeval genTime; // Time the event was generated.
// This allows us to see how old the event is.
} fsmEvent_t;
// Finite State Machine Data Members
typedef struct fsmData_type
{
int connectTries;
int MTUtries;
int authTries;
int txTries;
} fsmData_t;
// Each row of the state table
typedef struct stateTable_type {
fsmState_t st; // Current state
fsmEvName_t evName; // Got this event
int (*conditionfn)(void *); // If this condition func returns TRUE
fsmState_t nextState; // Change to this state and
void (*fn)(void *); // Run this function
} stateTable_t;
// Finite State Machine state structure
typedef struct fsm_type
{
const stateTable_t *pStateTable; // Pointer to state table
int numStates; // Number of entries in the table
fsmState_t currentState; // Current state
fsmEvent_t currentEvent; // Current event
fsmData_t *fsmData; // Pointer to the data attributes
mqd_t mqdes; // Message Queue descriptor
mqd_t master_cmd_mqdes; // Master command message queue
} fsm_t;
// Wildcard events and wildcard state
#define EV_ANY -1
#define ST_ANY -1
#define TRUE (1)
#define FALSE (0)
// Maximum priority for message queues (see "man mq_overview")
#define FSM_PRIO (sysconf(_SC_MQ_PRIO_MAX) - 1)
static void addev (fsm_t *fsm, fsmEvName_t ev);
static void doNothing (void *fsm) {addev(fsm, EV_MASTER_CMD_MSG);}
static void doInit (void *fsm) {addev(fsm, EV_INIT_SUCCESS);}
static void doConnect (void *fsm) {addev(fsm, EV_CONNECT_SUCCESS);}
static void doMTU (void *fsm) {addev(fsm, EV_MTU_SUCCESS);}
static void reportFailConnect (void *fsm) {addev(fsm, EV_ANY);}
static void doAuth (void *fsm) {addev(fsm, EV_AUTH_SUCCESS);}
static void reportDisConnect (void *fsm) {addev(fsm, EV_ANY);}
static void doDisconnect (void *fsm) {addev(fsm, EV_ANY);}
static void doTransaction (void *fsm) {addev(fsm, EV_TX_FAIL);}
static void fsmError (void *fsm) {addev(fsm, EV_ANY);}
static int currentlyLessThanMaxConnectTries (void *fsm) {
fsm_t *l = (fsm_t *)fsm;
return (l->fsmData->connectTries < 5 ? TRUE : FALSE);
}
static int isMoreThanMaxConnectTries (void *fsm) {return TRUE;}
static int currentlyLessThanMaxMTUtries (void *fsm) {return TRUE;}
static int isMoreThanMaxMTUtries (void *fsm) {return TRUE;}
static int currentyLessThanMaxAuthTries (void *fsm) {return TRUE;}
static int isMoreThanMaxAuthTries (void *fsm) {return TRUE;}
static int currentlyLessThanMaxTXtries (void *fsm) {return FALSE;}
static int isMoreThanMaxTXtries (void *fsm) {return TRUE;}
static int didNotSelfDisconnect (void *fsm) {return TRUE;}
static int waitForEvent (fsm_t *fsm);
static void runEvent (fsm_t *fsm);
static void runStateMachine(fsm_t *fsm);
static int newEventIsValid(fsmEvent_t *event);
static void getTime(struct timeval *time);
void printState(fsmState_t st);
void printEvent(fsmEvName_t ev);
// Global State Table
const stateTable_t GST[] = {
// Current state Got this event If this condition func returns TRUE Change to this state and Run this function
{ ST_UNINIT, EV_INIT_SUCCESS, NULL, ST_INIT, &doNothing },
{ ST_UNINIT, EV_INIT_FAIL, NULL, ST_UNINIT, &doInit },
{ ST_INIT, EV_MASTER_CMD_MSG, NULL, ST_INIT, &doConnect },
{ ST_INIT, EV_CONNECT_SUCCESS, NULL, ST_CONNECTED, &doMTU },
{ ST_INIT, EV_CONNECT_FAIL, &currentlyLessThanMaxConnectTries, ST_INIT, &doConnect },
{ ST_INIT, EV_CONNECT_FAIL, &isMoreThanMaxConnectTries, ST_INIT, &reportFailConnect },
{ ST_CONNECTED, EV_MTU_SUCCESS, NULL, ST_MTU_NEGOTIATED, &doAuth },
{ ST_CONNECTED, EV_MTU_FAIL, &currentlyLessThanMaxMTUtries, ST_CONNECTED, &doMTU },
{ ST_CONNECTED, EV_MTU_FAIL, &isMoreThanMaxMTUtries, ST_CONNECTED, &doDisconnect },
{ ST_CONNECTED, EV_DISCONNECTED, &didNotSelfDisconnect, ST_INIT, &reportDisConnect },
{ ST_MTU_NEGOTIATED, EV_AUTH_SUCCESS, NULL, ST_AUTHENTICATED, &doTransaction },
{ ST_MTU_NEGOTIATED, EV_AUTH_FAIL, &currentyLessThanMaxAuthTries, ST_MTU_NEGOTIATED, &doAuth },
{ ST_MTU_NEGOTIATED, EV_AUTH_FAIL, &isMoreThanMaxAuthTries, ST_MTU_NEGOTIATED, &doDisconnect },
{ ST_MTU_NEGOTIATED, EV_DISCONNECTED, &didNotSelfDisconnect, ST_INIT, &reportDisConnect },
{ ST_AUTHENTICATED, EV_TX_SUCCESS, NULL, ST_AUTHENTICATED, &doDisconnect },
{ ST_AUTHENTICATED, EV_TX_FAIL, &currentlyLessThanMaxTXtries, ST_AUTHENTICATED, &doTransaction },
{ ST_AUTHENTICATED, EV_TX_FAIL, &isMoreThanMaxTXtries, ST_AUTHENTICATED, &doDisconnect },
{ ST_AUTHENTICATED, EV_DISCONNECTED, &didNotSelfDisconnect, ST_INIT, &reportDisConnect },
{ ST_ANY, EV_DISCON_FAILED, NULL, ST_DONT_CHANGE, &doDisconnect },
{ ST_ANY, EV_ANY, NULL, ST_UNINIT, &fsmError } // Wildcard state for errors
};
#define GST_COUNT (sizeof(GST)/sizeof(stateTable_t))
int main()
{
int ret = 0;
fsmData_t dataAttr;
dataAttr.connectTries = 0;
dataAttr.MTUtries = 0;
dataAttr.authTries = 0;
dataAttr.txTries = 0;
fsm_t lfsm;
memset(&lfsm, 0, sizeof(fsm_t));
lfsm.pStateTable = GST;
lfsm.numStates = GST_COUNT;
lfsm.currentState = ST_UNINIT;
lfsm.currentEvent.name = EV_ANY;
lfsm.fsmData = &dataAttr;
struct mq_attr attr;
attr.mq_maxmsg = 30;
attr.mq_msgsize = sizeof(fsmEvent_t);
// Dev info
//printf("Size of fsmEvent_t [%ld]\n", sizeof(fsmEvent_t));
ret = mq_unlink("/abcmq");
if (ret == -1) {
fprintf(stderr, "Error on mq_unlink(), errno[%d] strerror[%s]\n",
errno, strerror(errno));
}
lfsm.mqdes = mq_open("/abcmq", O_CREAT | O_RDWR, S_IWUSR | S_IRUSR, &attr);
if (lfsm.mqdes == (mqd_t)-1) {
fprintf(stderr, "Error on mq_open(), errno[%d] strerror[%s]\n",
errno, strerror(errno));
return -1;
}
doInit(&lfsm); // This will generate the first event
runStateMachine(&lfsm);
return 0;
}
static void runStateMachine(fsm_t *fsm)
{
int ret = 0;
if (fsm == NULL) {
fprintf(stderr, "[%s] NULL argument\n", __func__);
return;
}
// Cycle through the state machine
while (fsm->currentState != ST_TERM) {
printf("current state [");
printState(fsm->currentState);
printf("]\n");
ret = waitForEvent(fsm);
if (ret == 0) {
printf("got event [");
printEvent(fsm->currentEvent.name);
printf("]\n");
runEvent(fsm);
}
sleep(2);
}
}
static int waitForEvent(fsm_t *fsm)
{
//const int numFds = 2;
const int numFds = 1;
struct pollfd fds[numFds];
int timeout_msecs = -1; // -1 is forever
int ret = 0;
int i = 0;
ssize_t num = 0;
fsmEvent_t newEv;
if (fsm == NULL) {
fprintf(stderr, "[%s] NULL argument\n", __func__);
return -1;
}
fsm->currentEvent.name = EV_ANY;
fds[0].fd = fsm->mqdes;
fds[0].events = POLLIN;
//fds[1].fd = fsm->master_cmd_mqdes;
//fds[1].events = POLLIN;
ret = poll(fds, numFds, timeout_msecs);
if (ret > 0) {
// An event on one of the fds has occurred
for (i = 0; i < numFds; i++) {
if (fds[i].revents & POLLIN) {
// Data may be read on device number i
num = mq_receive(fds[i].fd, (void *)(&newEv),
sizeof(fsmEvent_t), NULL);
if (num == -1) {
fprintf(stderr, "Error on mq_receive(), errno[%d] "
"strerror[%s]\n", errno, strerror(errno));
return -1;
}
if (newEventIsValid(&newEv)) {
fsm->currentEvent = newEv;
} else {
return -1;
}
}
}
} else {
fprintf(stderr, "Error on poll(), ret[%d] errno[%d] strerror[%s]\n",
ret, errno, strerror(errno));
return -1;
}
return 0;
}
static int newEventIsValid(fsmEvent_t *event)
{
if (event == NULL) {
fprintf(stderr, "[%s] NULL argument\n", __func__);
return FALSE;
}
printf("[%s]\n", __func__);
struct timeval now;
getTime(&now);
if ( (event->name < EV_LAST_ENTRY) &&
((now.tv_sec - event->genTime.tv_sec) < (60*5))
)
{
return TRUE;
} else {
return FALSE;
}
}
//------------------------------------------------
// Performs event handling on the FSM (finite state machine).
// Make sure there is a wildcard state at the end of
// your table, otherwise; the event will be ignored.
//------------------------------------------------
static void runEvent(fsm_t *fsm)
{
int i;
int condRet = 0;
if (fsm == NULL) {
fprintf(stderr, "[%s] NULL argument\n", __func__);
return;
}
printf("[%s]\n", __func__);
// Find a relevant entry for this state and event
for (i = 0; i < fsm->numStates; i++) {
// Look in the table for our current state or ST_ANY
if ( (fsm->pStateTable[i].st == fsm->currentState) ||
(fsm->pStateTable[i].st == ST_ANY)
)
{
// Is this the event we are looking for?
if ( (fsm->pStateTable[i].evName == fsm->currentEvent.name) ||
(fsm->pStateTable[i].evName == EV_ANY)
)
{
if (fsm->pStateTable[i].conditionfn != NULL) {
condRet = fsm->pStateTable[i].conditionfn(fsm->fsmData);
}
// See if there is a condition associated
// or we are not looking for any condition
//
if ( (condRet != 0) || (fsm->pStateTable[i].conditionfn == NULL))
{
// Set the next state (if applicable)
if (fsm->pStateTable[i].nextState != ST_DONT_CHANGE) {
fsm->currentState = fsm->pStateTable[i].nextState;
printf("new state [");
printState(fsm->currentState);
printf("]\n");
}
// Call the state callback function
fsm->pStateTable[i].fn(fsm);
break;
}
}
}
}
}
//------------------------------------------------
// EVENT HANDLERS
//------------------------------------------------
static void getTime(struct timeval *time)
{
if (time == NULL) {
fprintf(stderr, "[%s] NULL argument\n", __func__);
return;
}
printf("[%s]\n", __func__);
int ret = gettimeofday(time, NULL);
if (ret != 0) {
fprintf(stderr, "gettimeofday() failed: errno [%d], strerror [%s]\n",
errno, strerror(errno));
memset(time, 0, sizeof(struct timeval));
}
}
static void addev (fsm_t *fsm, fsmEvName_t ev)
{
int ret = 0;
if (fsm == NULL) {
fprintf(stderr, "[%s] NULL argument\n", __func__);
return;
}
printf("[%s] ev[%d]\n", __func__, ev);
if (ev == EV_ANY) {
// Don't generate a new event, just return...
return;
}
fsmEvent_t newev;
getTime(&(newev.genTime));
newev.name = ev;
ret = mq_send(fsm->mqdes, (void *)(&newev), sizeof(fsmEvent_t), FSM_PRIO);
if (ret == -1) {
fprintf(stderr, "[%s] mq_send() failed: errno [%d], strerror [%s]\n",
__func__, errno, strerror(errno));
}
}
//------------------------------------------------
// end EVENT HANDLERS
//------------------------------------------------
void printState(fsmState_t st)
{
switch(st) {
case ST_UNKNOWN:
printf("ST_UNKNOWN");
break;
case ST_UNINIT:
printf("ST_UNINIT");
break;
case ST_INIT:
printf("ST_INIT");
break;
case ST_CONNECTED:
printf("ST_CONNECTED");
break;
case ST_MTU_NEGOTIATED:
printf("ST_MTU_NEGOTIATED");
break;
case ST_AUTHENTICATED:
printf("ST_AUTHENTICATED");
break;
case ST_ERROR:
printf("ST_ERROR");
break;
case ST_TERM:
printf("ST_TERM");
break;
default:
printf("unknown state");
break;
}
}
void printEvent(fsmEvName_t ev)
{
switch (ev) {
case EV_UNKNOWN:
printf("EV_UNKNOWN");
break;
case EV_INIT_SUCCESS:
printf("EV_INIT_SUCCESS");
break;
case EV_INIT_FAIL:
printf("EV_INIT_FAIL");
break;
case EV_MASTER_CMD_MSG:
printf("EV_MASTER_CMD_MSG");
break;
case EV_CONNECT_SUCCESS:
printf("EV_CONNECT_SUCCESS");
break;
case EV_CONNECT_FAIL:
printf("EV_CONNECT_FAIL");
break;
case EV_MTU_SUCCESS:
printf("EV_MTU_SUCCESS");
break;
case EV_MTU_FAIL:
printf("EV_MTU_FAIL");
break;
case EV_AUTH_SUCCESS:
printf("EV_AUTH_SUCCESS");
break;
case EV_AUTH_FAIL:
printf("EV_AUTH_FAIL");
break;
case EV_TX_SUCCESS:
printf("EV_TX_SUCCESS");
break;
case EV_TX_FAIL:
printf("EV_TX_FAIL");
break;
case EV_DISCONNECTED:
printf("EV_DISCONNECTED");
break;
case EV_LAST_ENTRY:
printf("EV_LAST_ENTRY");
break;
default:
printf("unknown event");
break;
}
}
Makefile
CXX = gcc
COMPFLAGS = -c -Wall -g
state_machine: state_machine.o
$(CXX) -lrt state_machine.o -o state_machine
state_machine.o: state_machine.c
$(CXX) $(COMPFLAGS) state_machine.c
clean:
rm state_machine state_machine.o

Coming to this late (as usual) but scanning the answers to date I thinks something important is missing;
I have found in my own projects that it can be very helpful to not have a function for every valid state/event combination. I do like the idea of effectively having a 2D table of states/events. But I like the table elements to be more than a simple function pointer. Instead I try to organize my design so at it's heart it comprises a bunch of simple atomic elements or actions. That way I can list those simple atomic elements at each intersection of my state/event table. The idea is that you don't have to define a mass of N squared (typically very simple) functions. Why have something so error-prone, time consuming, hard to write, hard to read, you name it ?
I also include an optional new state, and an optional function pointer for each cell in the table. The function pointer is there for those exceptional cases where you don't want to just fire off a list of atomic actions.
You know you are doing it right when you can express a lot of different functionality, just by editing your table, with no new code to write.

Alrght, I think mine's just a little different from everybody else's. A little more separation of code and data than I see in the other answers. I really read up on the theory to write this, which implements a full Regular-language (without regular expressions, sadly). Ullman, Minsky, Chomsky. Can't say I understood it all, but I've drawn from the old masters as directly as possible: through their words.
I use a function pointer to a predicate that determines the transition to a 'yes' state or a 'no' state. This facilitates the creation of a finite state acceptor for a regular language that you program in a more assembly-language-like manner.
Please don't be put-off by my silly name choices. 'czek' == 'check'. 'grok' == [go look it up in the Hacker Dictionary].
So for each iteration, czek calls a predicate function with the current character as argument. If the predicate returns true, the character is consumed (the pointer advanced) and we follow the 'y' transition to select the next state. If the predicate returns false, the character is NOT consumed and we follow the 'n' transition. So every instruction is a two-way branch! I must have been reading The Story of Mel at the time.
This code comes straight from my postscript interpreter, and evolved into its current form with much guidance from the fellows on comp.lang.c. Since postscript basically has no syntax (only requiring balanced brackets), a Regular Language Accepter like this functions as the parser as well.
/* currentstr is set to the start of string by czek
and used by setrad (called by israd) to set currentrad
which is used by israddig to determine if the character
in question is valid for the specified radix
--
a little semantic checking in the syntax!
*/
char *currentstr;
int currentrad;
void setrad(void) {
char *end;
currentrad = strtol(currentstr, &end, 10);
if (*end != '#' /* just a sanity check,
the automaton should already have determined this */
|| currentrad > 36
|| currentrad < 2)
fatal("bad radix"); /* should probably be a simple syntaxerror */
}
/*
character classes
used as tests by automatons under control of czek
*/
char *alpha = "0123456789" "ABCDE" "FGHIJ" "KLMNO" "PQRST" "UVWXYZ";
#define EQ(a,b) a==b
#define WITHIN(a,b) strchr(a,b)!=NULL
int israd (int c) {
if (EQ('#',c)) { setrad(); return true; }
return false;
}
int israddig(int c) {
return strchrnul(alpha,toupper(c))-alpha <= currentrad;
}
int isdot (int c) {return EQ('.',c);}
int ise (int c) {return WITHIN("eE",c);}
int issign (int c) {return WITHIN("+-",c);}
int isdel (int c) {return WITHIN("()<>[]{}/%",c);}
int isreg (int c) {return c!=EOF && !isspace(c) && !isdel(c);}
#undef WITHIN
#undef EQ
/*
the automaton type
*/
typedef struct { int (*pred)(int); int y, n; } test;
/*
automaton to match a simple decimal number
*/
/* /^[+-]?[0-9]+$/ */
test fsm_dec[] = {
/* 0*/ { issign, 1, 1 },
/* 1*/ { isdigit, 2, -1 },
/* 2*/ { isdigit, 2, -1 },
};
int acc_dec(int i) { return i==2; }
/*
automaton to match a radix number
*/
/* /^[0-9]+[#][a-Z0-9]+$/ */
test fsm_rad[] = {
/* 0*/ { isdigit, 1, -1 },
/* 1*/ { isdigit, 1, 2 },
/* 2*/ { israd, 3, -1 },
/* 3*/ { israddig, 4, -1 },
/* 4*/ { israddig, 4, -1 },
};
int acc_rad(int i) { return i==4; }
/*
automaton to match a real number
*/
/* /^[+-]?(d+(.d*)?)|(d*.d+)([eE][+-]?d+)?$/ */
/* represents the merge of these (simpler) expressions
[+-]?[0-9]+\.[0-9]*([eE][+-]?[0-9]+)?
[+-]?[0-9]*\.[0-9]+([eE][+-]?[0-9]+)?
The complexity comes from ensuring at least one
digit in the integer or the fraction with optional
sign and optional optionally-signed exponent.
So passing isdot in state 3 means at least one integer digit has been found
but passing isdot in state 4 means we must find at least one fraction digit
via state 5 or the whole thing is a bust.
*/
test fsm_real[] = {
/* 0*/ { issign, 1, 1 },
/* 1*/ { isdigit, 2, 4 },
/* 2*/ { isdigit, 2, 3 },
/* 3*/ { isdot, 6, 7 },
/* 4*/ { isdot, 5, -1 },
/* 5*/ { isdigit, 6, -1 },
/* 6*/ { isdigit, 6, 7 },
/* 7*/ { ise, 8, -1 },
/* 8*/ { issign, 9, 9 },
/* 9*/ { isdigit, 10, -1 },
/*10*/ { isdigit, 10, -1 },
};
int acc_real(int i) {
switch(i) {
case 2: /* integer */
case 6: /* real */
case 10: /* real with exponent */
return true;
}
return false;
}
/*
Helper function for grok.
Execute automaton against the buffer,
applying test to each character:
on success, consume character and follow 'y' transition.
on failure, do not consume but follow 'n' transition.
Call yes function to determine if the ending state
is considered an acceptable final state.
A transition to -1 represents rejection by the automaton
*/
int czek (char *s, test *fsm, int (*yes)(int)) {
int sta = 0;
currentstr = s;
while (sta!=-1 && *s) {
if (fsm[sta].pred((int)*s)) {
sta=fsm[sta].y;
s++;
} else {
sta=fsm[sta].n;
}
}
return yes(sta);
}
/*
Helper function for toke.
Interpret the contents of the buffer,
trying automatons to match number formats;
and falling through to a switch for special characters.
Any token consisting of all regular characters
that cannot be interpreted as a number is an executable name
*/
object grok (state *st, char *s, int ns,
object *src,
int (*next)(state *,object *),
void (*back)(state *,int, object *)) {
if (czek(s, fsm_dec, acc_dec)) {
long num;
num = strtol(s,NULL,10);
if ((num==LONG_MAX || num==LONG_MIN) && errno==ERANGE) {
error(st,limitcheck);
/* } else if (num > INT_MAX || num < INT_MIN) { */
/* error(limitcheck, OP_token); */
} else {
return consint(num);
}
}
else if (czek(s, fsm_rad, acc_rad)) {
long ra,num;
ra = (int)strtol(s,NULL,10);
if (ra > 36 || ra < 2) {
error(st,limitcheck);
}
num = strtol(strchr(s,'#')+1, NULL, (int)ra);
if ((num==LONG_MAX || num==LONG_MIN) && errno==ERANGE) {
error(st,limitcheck);
/* } else if (num > INT_MAX || num < INT_MAX) { */
/* error(limitcheck, OP_token); */
} else {
return consint(num);
}
}
else if (czek(s, fsm_real, acc_real)) {
double num;
num = strtod(s,NULL);
if ((num==HUGE_VAL || num==-HUGE_VAL) && errno==ERANGE) {
error(st,limitcheck);
} else {
return consreal(num);
}
}
else switch(*s) {
case '(': {
int c, defer=1;
char *sp = s;
while (defer && (c=next(st,src)) != EOF ) {
switch(c) {
case '(': defer++; break;
case ')': defer--;
if (!defer) goto endstring;
break;
case '\\': c=next(st,src);
switch(c) {
case '\n': continue;
case 'a': c = '\a'; break;
case 'b': c = '\b'; break;
case 'f': c = '\f'; break;
case 'n': c = '\n'; break;
case 'r': c = '\r'; break;
case 't': c = '\t'; break;
case 'v': c = '\v'; break;
case '\'': case '\"':
case '(': case ')':
default: break;
}
}
if (sp-s>ns) error(st,limitcheck);
else *sp++ = c;
}
endstring: *sp=0;
return cvlit(consstring(st,s,sp-s));
}
case '<': {
int c;
char d, *x = "0123456789abcdef", *sp = s;
while (c=next(st,src), c!='>' && c!=EOF) {
if (isspace(c)) continue;
if (isxdigit(c)) c = strchr(x,tolower(c)) - x;
else error(st,syntaxerror);
d = (char)c << 4;
while (isspace(c=next(st,src))) /*loop*/;
if (isxdigit(c)) c = strchr(x,tolower(c)) - x;
else error(st,syntaxerror);
d |= (char)c;
if (sp-s>ns) error(st,limitcheck);
*sp++ = d;
}
*sp = 0;
return cvlit(consstring(st,s,sp-s));
}
case '{': {
object *a;
size_t na = 100;
size_t i;
object proc;
object fin;
fin = consname(st,"}");
(a = malloc(na * sizeof(object))) || (fatal("failure to malloc"),0);
for (i=0 ; objcmp(st,a[i]=toke(st,src,next,back),fin) != 0; i++) {
if (i == na-1)
(a = realloc(a, (na+=100) * sizeof(object))) || (fatal("failure to malloc"),0);
}
proc = consarray(st,i);
{ size_t j;
for (j=0; j<i; j++) {
a_put(st, proc, j, a[j]);
}
}
free(a);
return proc;
}
case '/': {
s[1] = (char)next(st,src);
puff(st, s+2, ns-2, src, next, back);
if (s[1] == '/') {
push(consname(st,s+2));
opexec(st, op_cuts.load);
return pop();
}
return cvlit(consname(st,s+1));
}
default: return consname(st,s);
}
return null; /* should be unreachable */
}
/*
Helper function for toke.
Read into buffer any regular characters.
If we read one too many characters, put it back
unless it's whitespace.
*/
int puff (state *st, char *buf, int nbuf,
object *src,
int (*next)(state *,object *),
void (*back)(state *,int, object *)) {
int c;
char *s = buf;
while (isreg(c=next(st,src))) {
if (s-buf >= nbuf-1) return false;
*s++ = c;
}
*s = 0;
if (!isspace(c) && c != EOF) back(st,c,src); /* eat interstice */
return true;
}
/*
Helper function for Stoken Ftoken.
Read a token from src using next and back.
Loop until having read a bona-fide non-whitespace non-comment character.
Call puff to read into buffer up to next delimiter or space.
Call grok to figure out what it is.
*/
#define NBUF MAXLINE
object toke (state *st, object *src,
int (*next)(state *, object *),
void (*back)(state *, int, object *)) {
char buf[NBUF] = "", *s=buf;
int c,sta = 1;
object o;
do {
c=next(st,src);
//if (c==EOF) return null;
if (c=='%') {
if (DUMPCOMMENTS) fputc(c, stdout);
do {
c=next(st,src);
if (DUMPCOMMENTS) fputc(c, stdout);
} while (c!='\n' && c!='\f' && c!=EOF);
}
} while (c!=EOF && isspace(c));
if (c==EOF) return null;
*s++ = c;
*s = 0;
if (!isdel(c)) sta=puff(st, s,NBUF-1,src,next,back);
if (sta) {
o=grok(st,buf,NBUF-1,src,next,back);
return o;
} else {
return null;
}
}

boost.org comes with 2 different state chart implementations:
Meta State Machine
Statechart
As always, boost will beam you into template hell.
The first library is for more performance-critical state machines. The second library gives you a direct transition path from a UML Statechart to code.
Here's the SO question asking for a comparison between the two where both of the authors respond.

You can consider UML-state-machine-in-c, a "lightweight" state machine framework in C. I have written this framework to support both Finite state machine and Hierarchical state machine. Compare to state tables or simple switch cases, a framework approach is more scalable. It can be used for simple finite state machines to complex hierarchical state machines.
State machine is represented by state_machine_t structure. It contains only two members "Event" and a pointer to the "state_t".
struct state_machine_t
{
uint32_t Event; //!< Pending Event for state machine
const state_t* State; //!< State of state machine.
};
state_machine_t must be the first member of your state machine structure. e.g.
struct user_state_machine
{
state_machine_t Machine; // Base state machine. Must be the first member of user derived state machine.
// User specific state machine members
uint32_t param1;
uint32_t param2;
...
};
state_t contains a handler for the state and also optional handlers for entry and exit action.
//! finite state structure
struct finite_state{
state_handler Handler; //!< State handler to handle event of the state
state_handler Entry; //!< Entry action for state
state_handler Exit; //!< Exit action for state.
};
If the framework is configured for a hierarchical state machine then the state_t contains a pointer to parent and child state.
Framework provides an API dispatch_event to dispatch the event to the state machine and switch_state to trigger state transition.
For further details on how to implement a hierarchical state machine refer to the GitHub repository.
code examples,
https://github.com/kiishor/UML-State-Machine-in-C/blob/master/demo/simple_state_machine/readme.md
https://github.com/kiishor/UML-State-Machine-in-C/blob/master/demo/simple_state_machine_enhanced/readme.md

This series of Ars OpenForum posts about a somewhat complicated bit of control logic includes a very easy-to-follow implementation as a state machine in C.

Your question is quite generic,
Here are two reference articles that might be useful,
Embedded State Machine Implementation
This article describes a simple approach to implementing a state machine for an embedded system. For purposes of this article, a state machine is defined as an algorithm that can be in one of a small number of states. A state is a condition that causes a prescribed relationship of inputs to outputs, and of inputs to next states.
A savvy reader will quickly note that the state machines described in this article are Mealy machines. A Mealy machine is a state machine where the outputs are a function of both present state and input, as opposed to a Moore machine, in which the outputs are a function only of state.
Coding State Machines in C and C++
My preoccupation in this article is with state-machine fundamentals and some straightforward programming guidelines for coding state machines in C or C++. I hope that these simple techniques can become more common, so that you (and others) can readily see the state-machine structure right from the source code.

I have used State Machine Compiler in Java and Python projects to with success.

Given that you imply you can use C++ and hence OO code, I would suggest evaluating the 'GoF'state pattern (GoF = Gang of Four, the guys who wrote the design patterns book which brought design patterns into the limelight).
It is not particularly complex and it is widely used and discussed so it is easy to see examples and explanations on line.
It will also quite likely be recognizable by anyone else maintaining your code at a later date.
If efficiency is the worry, it would be worth actually benchmarking to make sure that a non OO approach is more efficient as lots of factors affect performance and it is not always simply OO bad, functional code good. Similarly, if memory usage is a constraint for you it is again worth doing some tests or calculations to see if this will actually be an issue for your particular application if you use the state pattern.
The following are some links to the 'Gof' state pattern, as Craig suggests:
http://en.wikipedia.org/wiki/State_pattern
http://www.vincehuston.org/dp/state.html

This is an old post with lots of answers, but I thought I'd add my own approach to the finite state machine in C. I made a Python script to produce the skeleton C code for any number of states. That script is documented on GituHub at FsmTemplateC
This example is based on other approaches I've read about. It doesn't use goto or switch statements but instead has transition functions in a pointer matrix (look-up table). The code relies on a big multi-line initializer macro and C99 features (designated initializers and compound literals) so if you don't like these things, you might not like this approach.
Here is a Python script of a turnstile example which generates skeleton C-code using FsmTemplateC:
# dict parameter for generating FSM
fsm_param = {
# main FSM struct type string
'type': 'FsmTurnstile',
# struct type and name for passing data to state machine functions
# by pointer (these custom names are optional)
'fopts': {
'type': 'FsmTurnstileFopts',
'name': 'fopts'
},
# list of states
'states': ['locked', 'unlocked'],
# list of inputs (can be any length > 0)
'inputs': ['coin', 'push'],
# map inputs to commands (next desired state) using a transition table
# index of array corresponds to 'inputs' array
# for this example, index 0 is 'coin', index 1 is 'push'
'transitiontable': {
# current state | 'coin' | 'push' |
'locked': ['unlocked', ''],
'unlocked': [ '', 'locked']
}
}
# folder to contain generated code
folder = 'turnstile_example'
# function prefix
prefix = 'fsm_turnstile'
# generate FSM code
code = fsm.Fsm(fsm_param).genccode(folder, prefix)
The generated output header contains the typedefs:
/* function options (EDIT) */
typedef struct FsmTurnstileFopts {
/* define your options struct here */
} FsmTurnstileFopts;
/* transition check */
typedef enum eFsmTurnstileCheck {
EFSM_TURNSTILE_TR_RETREAT,
EFSM_TURNSTILE_TR_ADVANCE,
EFSM_TURNSTILE_TR_CONTINUE,
EFSM_TURNSTILE_TR_BADINPUT
} eFsmTurnstileCheck;
/* states (enum) */
typedef enum eFsmTurnstileState {
EFSM_TURNSTILE_ST_LOCKED,
EFSM_TURNSTILE_ST_UNLOCKED,
EFSM_TURNSTILE_NUM_STATES
} eFsmTurnstileState;
/* inputs (enum) */
typedef enum eFsmTurnstileInput {
EFSM_TURNSTILE_IN_COIN,
EFSM_TURNSTILE_IN_PUSH,
EFSM_TURNSTILE_NUM_INPUTS,
EFSM_TURNSTILE_NOINPUT
} eFsmTurnstileInput;
/* finite state machine struct */
typedef struct FsmTurnstile {
eFsmTurnstileInput input;
eFsmTurnstileCheck check;
eFsmTurnstileState cur;
eFsmTurnstileState cmd;
eFsmTurnstileState **transition_table;
void (***state_transitions)(struct FsmTurnstile *, FsmTurnstileFopts *);
void (*run)(struct FsmTurnstile *, FsmTurnstileFopts *, const eFsmTurnstileInput);
} FsmTurnstile;
/* transition functions */
typedef void (*pFsmTurnstileStateTransitions)(struct FsmTurnstile *, FsmTurnstileFopts *);
enum eFsmTurnstileCheck is used to determine whether a transition was blocked with EFSM_TURNSTILE_TR_RETREAT, allowed to progress with EFSM_TURNSTILE_TR_ADVANCE, or the function call was not preceded by a transition with EFSM_TURNSTILE_TR_CONTINUE.
enum eFsmTurnstileState is simply the list of states.
enum eFsmTurnstileInput is simply the list of inputs.
The FsmTurnstile struct is the heart of the state machine with the transition check, function lookup table, current state, commanded state, and an alias to the primary function that runs the machine.
Every function pointer (alias) in FsmTurnstile should only be called from the struct and has to have its first input as a pointer to itself so as to maintain a persistent state, object-oriented style.
Now for the function declarations in the header:
/* fsm declarations */
void fsm_turnstile_locked_locked (FsmTurnstile *fsm, FsmTurnstileFopts *fopts);
void fsm_turnstile_locked_unlocked (FsmTurnstile *fsm, FsmTurnstileFopts *fopts);
void fsm_turnstile_unlocked_locked (FsmTurnstile *fsm, FsmTurnstileFopts *fopts);
void fsm_turnstile_unlocked_unlocked (FsmTurnstile *fsm, FsmTurnstileFopts *fopts);
void fsm_turnstile_run (FsmTurnstile *fsm, FsmTurnstileFopts *fopts, const eFsmTurnstileInput input);
Function names are in the format {prefix}_{from}_{to}, where {from} is the previous (current) state and {to} is the next state. Note that if the transition table does not allow for certain transitions, a NULL pointer instead of a function pointer will be set. Finally, the magic happens with a macro. Here we build the transition table (matrix of state enums) and the state transition functions look up table (a matrix of function pointers):
/* creation macro */
#define FSM_TURNSTILE_CREATE() \
{ \
.input = EFSM_TURNSTILE_NOINPUT, \
.check = EFSM_TURNSTILE_TR_CONTINUE, \
.cur = EFSM_TURNSTILE_ST_LOCKED, \
.cmd = EFSM_TURNSTILE_ST_LOCKED, \
.transition_table = (eFsmTurnstileState * [EFSM_TURNSTILE_NUM_STATES]) { \
(eFsmTurnstileState [EFSM_TURNSTILE_NUM_INPUTS]) { \
EFSM_TURNSTILE_ST_UNLOCKED, \
EFSM_TURNSTILE_ST_LOCKED \
}, \
(eFsmTurnstileState [EFSM_TURNSTILE_NUM_INPUTS]) { \
EFSM_TURNSTILE_ST_UNLOCKED, \
EFSM_TURNSTILE_ST_LOCKED \
} \
}, \
.state_transitions = (pFsmTurnstileStateTransitions * [EFSM_TURNSTILE_NUM_STATES]) { \
(pFsmTurnstileStateTransitions [EFSM_TURNSTILE_NUM_STATES]) { \
fsm_turnstile_locked_locked, \
fsm_turnstile_locked_unlocked \
}, \
(pFsmTurnstileStateTransitions [EFSM_TURNSTILE_NUM_STATES]) { \
fsm_turnstile_unlocked_locked, \
fsm_turnstile_unlocked_unlocked \
} \
}, \
.run = fsm_turnstile_run \
}
When creating the FSM, the macro FSM_EXAMPLE_CREATE() has to be used.
Now, in the source code every state transition function declared above should be populated. The FsmTurnstileFopts struct can be used to pass data to/from the state machine. Every transition must set fsm->check to be equal to either EFSM_EXAMPLE_TR_RETREAT to block it from transitioning or EFSM_EXAMPLE_TR_ADVANCE to allow it to transition to the commanded state.
A working example can be found at (FsmTemplateC)[https://github.com/ChisholmKyle/FsmTemplateC].
Here is the very simple actual usage in your code:
/* create fsm */
FsmTurnstile fsm = FSM_TURNSTILE_CREATE();
/* create fopts */
FsmTurnstileFopts fopts = {
.msg = ""
};
/* initialize input */
eFsmTurnstileInput input = EFSM_TURNSTILE_NOINPUT;
/* main loop */
for (;;) {
/* wait for timer signal, inputs, interrupts, whatever */
/* optionally set the input (my_input = EFSM_TURNSTILE_IN_PUSH for example) */
/* run state machine */
my_fsm.run(&my_fsm, &my_fopts, my_input);
}
All that header business and all those functions just to have a simple and fast interface is worth it in my mind.

You could use the open source library OpenFST.
OpenFst is a library for constructing, combining, optimizing, and searching weighted finite-state transducers (FSTs). Weighted finite-state transducers are automata where each transition has an input label, an output label, and a weight. The more familiar finite-state acceptor is represented as a transducer with each transition's input and output label equal. Finite-state acceptors are used to represent sets of strings (specifically, regular or rational sets); finite-state transducers are used to represent binary relations between pairs of strings (specifically, rational transductions). The weights can be used to represent the cost of taking a particular transition.

void (* StateController)(void);
void state1(void);
void state2(void);
void main()
{
StateController=&state1;
while(1)
{
(* StateController)();
}
}
void state1(void)
{
//do something in state1
StateController=&state2;
}
void state2(void)
{
//do something in state2
//Keep changing function direction based on state transition
StateController=&state1;
}

I personally use self referencing structs in combination with pointer arrays.
I uploaded a tutorial on github a while back, link:
https://github.com/mmelchger/polling_state_machine_c
Note: I do realise that this thread is quite old, but I hope to get input and thoughts on the design of the state-machine as well as being able to provide an example for a possible state-machine design in C.

Here is a method for a state machine that uses macros such that each function can have its own set of states: https://www.codeproject.com/Articles/37037/Macros-to-simulate-multi-tasking-blocking-code-at
It is titled "simulate multi tasking" but that is not the only use.
This method uses callbacks to pickup in each function where it left off. Each function contains a list of states unique to each function. A central "idle loop" is used to run the state machines. The "idle loop" has no idea how the state machines work, it is the individual functions that "know what to do". In order to write code for the functions, one just creates a list of states and uses the macros to "suspend" and "resume". I used these macros at Cisco when I wrote the Transceiver Library for the Nexus 7000 switch.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js

Building custom theories in Z3 - c++

Related

How to wrap several boolean flags into struct to pass them to a function with a convenient syntax

Elegantly attempt to execute various functions a specific way

Retrieve ptr from function call asmjit

What's the most efficient way to do recursive XPath queries using libxml2?

C state-machine design [closed]

Categories

Resources