I am making an application in C++, and it requires a config file that will be read and interpreted on launch. It will contain things such as:
Module1=true
Now, my original plan was to store it all in variables and simply have
if (module1) {
    DO_STUFF();
}
However, this seems wasteful, as the program would be constantly checking a value that never changes. Any ideas?
Optimize the code only if you find a bottleneck with a profiler. Branch prediction should do its thing here: module1 never changes, so even if you check it in a loop there shouldn't be a noticeable performance loss.
If you want to experiment, you can branch once, and make a pointer point to the right function:
using func_ptr = void (*)();
func_ptr p = [](){};
if (module1)
    p = DO_STUFF;
while (...)
    p();
But this is just something to profile; look at the assembly...
There are also slower but more convenient ways to store the configuration, e.g. in an array with enumerated indexes, or in a map. If I were to read some value in a loop, I'd do:
auto module1 = modules[MODULE1];        // array and enumeration
//auto module1 = modules.at("module1"); // map and string
while (...)
{
    if (module1)
        DO_STUFF;
    ...
}
So I'd end up with what you already have.
Performance-wise, a boolean check is no problem unless you start doing it millions or billions of times. Maybe you can start merging the code that belongs to module1, but other than that you'd have to check for it like you currently do.
This really isn't an issue. If your program requires that Module1 be true, then let it check the value and continue on. It won't affect your performance unless it is being checked a huge number of times.
One thing you could do is wrap the check in an inline function if it is being checked in many places. However, make sure the function isn't too big, otherwise it will become a bigger bottleneck.
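For example (just a sketch; the module_1_work() helper and the parameter name are invented for illustration):
void module_1_work();  // defined elsewhere; hypothetical

// Keeping the check in one small inline function keeps call sites tidy
// without adding any real overhead.
inline void run_module1_if_enabled(bool module1_enabled)
{
    if (module1_enabled)
        module_1_work();
}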
Sorry guys, didn't spot this when I looked it up:
MSDN
So I check the boolean once on launch, and then I don't need to any more, as only the correct functions are launched.
Depending on how your program is set up and how the variables change the behaviour of the code, you might be able to use function pointers:
#include <functional>

std::function<void(int)> DoStuff = [](int) {}; // no-op until configured
if (Module1)
{
    // DoStuff must be declared outside the if, or it would go out of scope here
    DoStuff = Module1Stuff;
}
And then later:
while (true)
{
    DoStuff(ImportantVariable);
}
See http://en.cppreference.com/w/cpp/utility/functional/function for further reference.
Not that I think it'll help all that much but it's an alternative to try out at least.
This can be solved if you know all the use cases of the values you check. For example, if you've read your config file and module1 is true, you do one thing; if it is false, another. Let's start with an example:
#include <memory>

class ConfigFileWorker {
public:
    virtual ~ConfigFileWorker() = default;
    virtual void run() = 0;
};

class WithModule1Worker : public ConfigFileWorker {
public:
    void run() override final {
        // do stuff as if your `Module1` is true
    }
};

class WithoutModule1Worker : public ConfigFileWorker {
public:
    void run() override final {
        // do stuff as if your `Module1` is false
    }
};

int main() {
    std::unique_ptr<ConfigFileWorker> worker;
    const bool Module1 = read_config_file(file, "Module1");
    if (Module1) { // you check this only once during launch and just use `worker` all the time after
        worker.reset(new WithModule1Worker);
    } else {
        worker.reset(new WithoutModule1Worker);
    }
    // here and after just use the pointer with `run()` - then you will not need to check the variable all the time, you'll just perform the action.
}
So you have predefined behaviour for the two cases (true and false) and just create an object for one of them while parsing the config file on launch. This is Java-like code, but of course you may use function pointers, std::function and other abstractions instead of a base class; however, the base-class option has more flexibility in my opinion.
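If you prefer the std::function route mentioned above, a minimal self-contained sketch could look like this (the do_stuff_* function names are invented for the example); the idea is the same: branch once at launch, then just call the stored callable:
#include <functional>
#include <iostream>

// Hypothetical behaviours for illustration only.
void do_stuff_with_module1()    { std::cout << "module1 path\n"; }
void do_stuff_without_module1() { std::cout << "default path\n"; }

int main() {
    const bool Module1 = true; // pretend this came from the config file

    // Branch once at launch; afterwards callers just invoke do_stuff().
    std::function<void()> do_stuff =
        Module1 ? do_stuff_with_module1 : do_stuff_without_module1;

    do_stuff(); // no boolean check here
}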
Related
I have a function which processes data that comes as a sequence. Because of this, I need to know the value of certain variables from the last function call during the current function call.
My current approach to doing this is to use static variables. My function goes something like this:
bool processData(Object message){
    static int lastVar1 = -1;
    int curVar1 = message.var1;
    if (curVar1 > lastVar1){
        // Do something
    }
    lastVar1 = curVar1;
}
This is just a small sample of the code; in reality I have 10+ static variables tracking different things. My gut tells me using so many static variables probably isn't a good idea, though I have nothing to back that feeling up.
My question: Is there a better way to do this?
An alternative I've been looking into is using an object whose fields are lastVar1, lastVar2, etc. However, I'm not sure if keeping an object in memory would be more efficient than using static variables.
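To make that alternative concrete, here is a rough sketch of what I mean (Object and ProcessorState are placeholders; the real code tracks many more fields):
struct Object { int var1; }; // stand-in for the real message type

// Holds everything that currently lives in static locals.
struct ProcessorState {
    int lastVar1 = -1;
    // lastVar2, lastVar3, ... would go here too
};

bool processData(const Object& message, ProcessorState& state) {
    const bool increased = message.var1 > state.lastVar1;
    state.lastVar1 = message.var1; // remember the value for the next call
    return increased;
}
Each caller (or each thread) could then own its own ProcessorState instead of sharing hidden static state.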
Your question sounds as if it were purely about style and opinions, but there are aspects that are not a matter of opinion: multithreading and testing.
Consider this:
bool foo(int x) {
    static int last_val = -1;
    bool result = (x == last_val);
    last_val = x;
    return result;
}
You can call this function concurrently from multiple threads, but it won't do what you expect. Moreover, you can only test the function by asserting that it does the right thing:
foo(1);
assert( foo(1) ); // silently assumes that the last call did the right thing
To setup the preconditions for the test (first line) you already have to assume that foo(1) does the right thing, which somehow defeats the purpose of testing that call in the second line.
If the methods need the current object and the previous object, simply pass both:
bool processData(const Object& message, const Object& previous_message){
    if (message.var1 > previous_message.var1){
        // Do something
        return true;
    }
    return false;
}
Of course this just shifts the issue of keeping track of the previous message to the caller, but that's straightforward and doesn't require messing around with statics:
Object message, old_message;
while ( get_more( message )) {
    processData(message, old_message);
    old_message = message;
}
I have a value which is expensive to calculate and can be asked for ahead of time: something like a lazily initialized value whose initialization actually happens at the moment of definition, but in a different thread. My immediate thought was to use std.parallelism; Task seems purpose-built for this exact use case. So, let's put it in a class:
class Foo
{
    import std.parallelism : Task,task;
    static int calculate(int a, int b)
    {
        return a+b;
    }
    private Task!(calculate,int,int)* ourTask;
    private int _val;
    int val()
    {
        return ourTask.workForce();
    }
    this(int a, int b)
    {
        ourTask = task!calculate(a,b);
    }
}
That seems all well and good... except when I want the task to be based on a non-static method, in which case I want to make the task a delegate, in which case I start having to do stuff like this:
private typeof(task(&classFunc)) working;
And then, as it turns out, typeof(task(&classFunc)), when it's asked for outside of a function body, is actually Task!(run,ReturnType!classFunc function(Parameters!classFunc))*, which you may notice is not the type actually returned by runtime function calls of that. That would be Task!(run,ReturnType!classFunc delegate(Parameters!classFunc))*, which requires me to cast to typeof(working) when I actually call task(&classFunc). This is all extremely hackish feeling.
This was my attempt at a general template solution:
/**
Provides a transparent wrapper that allows for lazy
setting of variables. When lazySet!func(args) is called
on the value, the function will be called in a new thread;
as soon as the value's access is attempted, it'll return the
result of the task, blocking if it's not done calculating.
Accessing the value is as simple as using it like the
type it's templated for--see the unit test.
*/
shared struct LazySet(T)
{
    /// You can set the value directly, as normal--this throws away the current task.
    void opAssign(T n)
    {
        import core.atomic : atomicStore;
        working = false;
        atomicStore(_val,n);
    }

    import std.traits : ReturnType;

    /**
    Called the same way as std.parallelism.task;
    after this is called, the next attempt to access
    the value will result in the value being set from
    the result of the given function before it's returned.
    If the task isn't done, it'll wait on the task to be done
    once accessed, using workForce.
    */
    void lazySet(alias func,Args...)(Args args)
    if(is(ReturnType!func == T))
    {
        import std.parallelism : task,taskPool;
        auto t = task!func(args);
        taskPool.put(t);
        curTask = (() => t.workForce);
        working = true;
    }

    /// ditto
    void lazySet(F,Args...)(F fpOrDelegate, ref Args args)
    if(is(ReturnType!F == T))
    {
        import std.parallelism : task,taskPool;
        auto t = task(fpOrDelegate,args);
        taskPool.put(t);
        curTask = (() => t.workForce);
        working = true;
    }

private:
    T _val;
    T delegate() curTask;
    bool working = false;

    T val()
    {
        import core.atomic : atomicStore,atomicLoad;
        if(working)
        {
            atomicStore(_val,curTask());
            working = false;
        }
        return atomicLoad(_val);
    }

    // alias this is inherently public
    alias val this;
}
This lets me call lazySet using any function, function pointer or delegate that returns T, and it'll then calculate the value in parallel and return it, fully calculated, the next time anything tries to access the underlying value, exactly as I wanted. The unit tests I wrote to describe its functionality pass; it works perfectly.
But one thing's bothering me:
curTask = (() => t.workForce);
Moving the Task around by way of creating a lambda on-the-spot that happens to have the Task in its context still seems like I'm trying to "pull one over" on the language, even if it's less "hackish-feeling" than all the casting from earlier.
Am I missing some obvious language feature that would allow me to do this more "elegantly"?
Templates that take an alias function parameter (such as the Task family) are finicky regarding their actual type, as they can receive any type of function as parameter (including in-place delegates that get inferred themselves). As the actual function that gets called is part of the type itself, you would have to pass it to your custom struct to be able to save the Task directly.
As for the legitimacy of your solution, there is nothing wrong with storing lambdas to interact with complicated (or "hidden") types later.
An alternative is to store a pointer to &t.workForce directly.
Also, in your T val() two threads could enter if(working) at the same time, but I guess due to the atomic store it wouldn't really break anything - anyway, that could be fixed by core.atomic.cas.
For example, I have to ensure that a certain function in a certain real-time system completes in 20 ms or less. I can simply measure the time at the beginning of the function and at the end of it, then assert that the difference is satisfactory. And I do this in C++.
But this looks pretty much like a contract, except that the time check is a post-condition and the time measurement at the beginning is not a condition at all. It would be nice to put it into a contract, not only for the notation but for build reasons as well.
So I wonder, can I use D's contract capabilities to check a function's execution time?
Sort of, but not really well. The reason is that variables declared in the in{} block are not visible in the out{} block. (There's been some discussion about changing this, so it could check pre vs post state by making a copy in the in block, but nothing has been implemented.)
So, this will not work:
void foo()
in { auto before = Clock.currTime(); }
out { assert(Clock.currTime - before < dur!"msecs"(20)); }
body { ... }
The variable from in won't carry over to out, giving you an undefined identifier error. But, I say "sort of" though because there is a potential workaround:
import std.datetime;

struct Foo {
    SysTime test_before;

    void test()
    in {
        test_before = Clock.currTime();
    }
    out {
        assert(Clock.currTime - test_before < dur!"msecs"(20));
    }
    body {
    }
}
Declaring the variable as a regular member of the struct. But this would mean a lot of otherwise useless variables for each function, wouldn't work with recursion, and just pollutes the member namespace.
Part of me is thinking you could do your own stack off to the side and have in{} push the time, then out{} pops it and checks.... but a quick test shows that it is liable to break once inheritance gets involved. If you repeat the in{} block each time, it might work. But this strikes me as awfully brittle. The rule with contract inheritance is ALL of the out{} blocks of the inheritance tree need to pass, but only any ONE of the in{} blocks needs to pass. So if you had a different in{} down the chain, it might forget to push the time, and then when out tries to pop it, your stack would underflow.
// just for experimenting.....
SysTime[] timeStack; // WARNING: use a real stack here in production, a plain array will waste a *lot* of time reallocating as you push and pop on to it

class Foo {
    void test()
    in {
        timeStack ~= Clock.currTime();
    }
    out {
        auto start = timeStack[$-1];
        timeStack = timeStack[0 .. $-1];
        assert(Clock.currTime - start < dur!"msecs"(20));

        import std.stdio;
        // making sure the stack length is still sane
        writeln("stack length ", timeStack.length);
    }
    body { }
}

class Bar : Foo {
    override void test()
    in {
        // had to repeat the in block on the child class for this to work at all
        timeStack ~= Clock.currTime();
    }
    body {
        import core.thread;
        Thread.sleep(10.msecs); // bump that up to force a failure, ensuring the test is actually run
    }
}
That seems to work, but I think it is more trouble than it's worth. I expect it would break somehow as the program got bigger, and if your test breaks your program, that kinda defeats the purpose.
I'd probably do it as a unittest{}, if checking with explicit tests fulfills your requirements. (However, note that contracts, like most asserts in D, are removed if you compile with the -release switch, so they won't actually be checked in release versions either. If you need it to reliably fail, throw an exception rather than assert, since that will always work, in both debug and release modes.)
Or you could do it with an assert in the function or a helper struct or whatever, similar to C++. I'd use a scope guard:
void test() {
    auto before = Clock.currTime();
    scope(exit) assert(Clock.currTime - before < dur!"msecs"(20)); // or import std.exception; and use enforce instead of assert if you want it in release builds too
    /* write the rest of your function */
}
Of course, here you'll have to copy it in the subclasses too, but it seems like you'd have to do that with the in{} blocks anyway, so meh, and at least the before variable is local.
Bottom line, I'd say you're probably best off doing it more or less the same way you have been in C++.
Let's say you have a function in C/C++ that behaves a certain way the first time it runs, and then behaves another way on all subsequent calls (see below for an example). After it runs the first time, the if statement becomes redundant and could be optimized away if speed is important. Is there any way to make this optimization?
bool val = true;

void function1() {
    if (val == true) {
        // do something
        val = false;
    }
    else {
        // do other stuff, val is never set to true again
    }
}
gcc has a builtin function that lets you inform the implementation about the expected outcome of a branch:
__builtin_expect
http://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html
For example in your case:
bool val = true;

void function1()
{
    if (__builtin_expect(val, 0)) {
        // do something
        val = false;
    }
    else {
        // do other stuff, val is never set to true again
    }
}
You should only make the change if you're certain that it truly is a bottleneck. With branch-prediction, the if statement is probably instant, since it's a very predictable pattern.
That said, you can use callbacks:
#include <iostream>
using namespace std;

typedef void (*FunPtr) (void);
FunPtr method;

void subsequentRun()
{
    std::cout << "subsequent call" << std::endl;
}

void firstRun()
{
    std::cout << "first run" << std::endl;
    method = subsequentRun;
}

int main()
{
    method = firstRun;
    method();
    method();
    method();
}
produces the output:
first run
subsequent call
subsequent call
You could use a function pointer, but then it will require an indirect call in any case:
void firstCall();
void otherCalls();

// starts out pointing at the first-run version
void (*yourFunction)(void) = &firstCall;

void firstCall() {
    // ..
    yourFunction = &otherCalls;
}

void otherCalls() {
    // ..
}

int main()
{
    yourFunction();
}
One possible method is to compile two different versions of the function (this can be done from a single function in the source with templates), and use a function pointer or object to decide at runtime. However, the pointer overhead will likely outweigh any potential gains unless your function is really expensive.
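A rough sketch of that idea, just to make it concrete (names invented; whether it wins anything over a plain branch is exactly the kind of thing to measure):
// The flag becomes a compile-time constant inside each instantiation,
// so the compiler can discard the dead branch in each version.
template <bool FirstTime>
void function1_impl() {
    if (FirstTime) {
        // do the first-run work
    } else {
        // do the steady-state work
    }
}

// The runtime choice happens once, through a function pointer.
void (*function1)() = &function1_impl<true>;

void caller() {
    function1();                        // first call runs the <true> version
    function1 = &function1_impl<false>; // every later call runs the other one
}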
You could use a static member variable instead of a global variable.
Or, if the code you're running the first time changes something for all future uses (eg, opening a file?), you could use that change as the check that determines whether or not to run the code (ie, check if the file is open). This would save you the extra variable. Also, it might help with error checking: if for some reason the initial change is undone by another operation (eg, the file is on removable media that is removed improperly), your check could try to re-do the change.
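A minimal sketch of that second suggestion, using the open file itself as the "have we run yet" flag (the file name is made up for the example):
#include <fstream>

std::ofstream log_file; // closed until the first call

void function1() {
    if (!log_file.is_open()) {
        log_file.open("output.log"); // the first-run work doubles as the flag
    } else {
        log_file << "subsequent call\n"; // steady-state work
    }
}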
A compiler can only optimize what is known at compile time.
In your case, the value of val is only known at runtime, so it can't be optimized.
The if test is very quick, you shouldn't worry about optimizing it.
If you'd like to make the code a little bit cleaner you could make the variable local to the function using static:
void function() {
    static bool firstRun = true;
    if (firstRun) {
        firstRun = false;
        ...
    }
    else {
        ...
    }
}
On entering the function for the first time, firstRun is true. Because it is static, it persists between calls, so every call sees the same instance, and it will be false on each subsequent entry.
This could be used well with #ouah's solution.
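Combined, it might look like this (a sketch only; __builtin_expect is GCC/Clang-specific, and whether it helps is something to verify with a profiler):
void function() {
    static bool firstRun = true;
    if (__builtin_expect(firstRun, 0)) { // tell the compiler this branch is unlikely
        firstRun = false;
        // first-run work
    } else {
        // steady-state work
    }
}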
Compilers like g++ (and I'm sure msvc) support generating profile data upon a first run, then using that data to better guess what branches are most likely to be followed, and optimizing accordingly. If you're using gcc, look at the -fprofile-generate option.
The expected behavior is that the compiler will optimize that if statement such that the else branch is laid out first, thus avoiding the jmp instruction on all your subsequent calls and making it pretty much as fast as if the check weren't there, especially if you return somewhere in that else (thus avoiding having to jump past the if body).
One way to make this optimization is to split the function in two. Instead of:
void function1()
{
    if (val == true) {
        // do something
        val = false;
    } else {
        // do other stuff
    }
}
Do this:
void function1()
{
    // do something
}

void function2()
{
    // do other stuff
}
One thing you can do is put the logic into the constructor of an object, which is then defined static. If such a static object occurs in a block scope, the constructor is run the first time that execution passes through that scope. The once-only check is emitted by the compiler.
You can also put static objects at file scope, and then they are initialized before main is called.
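A minimal sketch of the block-scope case (the OneTimeInit name is invented for the example): the constructor body runs exactly once, the first time control reaches the static definition, and the compiler generates the guard check for you.
struct OneTimeInit {
    OneTimeInit() {
        // first-run work goes here
    }
};

void function1() {
    static OneTimeInit once; // constructor runs only on the first call (thread-safe since C++11)
    // steady-state work goes here
}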
I'm giving this answer because perhaps you're not making effective use of C++ classes.
(Regarding C/C++, there is no such language. There is C and there is C++. Are you working in C that has to also compile as C++ (sometimes called, unofficially, "Clean C"), or are you really working in C++?)
What is "Clean C" and how does it differ from standard C?
To remain compiler-independent, you can put the code from the if() part in one function and the code from the else{} part in another. Almost all compilers optimize the if()/else{}, and since the else{} is the most likely branch here, put the code that runs only occasionally in the if() and the rest in a separate function that is called from the else.
I have a setup that looks like this.
class Checker
{
    // member data
    Results m_results; // see below
public:
    bool Check();
private:
    bool Check1();
    bool Check2();
    // .. so on
};
Checker is a class that performs lengthy check computations for engineering analysis. Each type of check has a resultant double that the checker stores. (see below)
bool Checker::Check()
{
    // initialisations etc.
    Check1();
    Check2();
    // ... so on
}
A typical Check function would look like this:
bool Checker::Check1()
{
    double result;
    // lots of code
    m_results.SetCheck1Result(result);
}
And the results class looks something like this:
class Results
{
    double m_check1Result;
    double m_check2Result;
    // ...
public:
    void SetCheck1Result(double d);
    double GetOverallResult()
    { return max(m_check1Result, m_check2Result, ...); }
};
Note: all code is oversimplified.
The Checker and Results classes were initially written to perform all checks and return an overall double result. There is now a new requirement where I only need to know whether any of the results exceeds 1. If one does, the subsequent checks need not be carried out (it's an optimisation). To achieve this, I could either:
Modify every CheckN function to check its result and return early; the parent Check function would keep checking m_results. Or:
In Results::SetCheckNResult(), throw an exception if the value exceeds 1 and catch it at the end of Checker::Check().
The first is tedious, error prone and sub-optimal because every CheckN function further branches out into sub-checks etc.
The second is non-intrusive and quick. One disadvantage I can think of is that the Checker code may not necessarily be exception-safe (although there is no other exception being thrown anywhere else). Is there anything else obvious that I'm overlooking? What about the cost of throwing exceptions and stack unwinding?
Is there a better 3rd option?
I don't think this is a good idea. Exceptions should be limited to, well, exceptional situations. Yours is a question of normal control flow.
It seems you could very well move all the redundant code dealing with the result out of the checks and into the calling function. The resulting code would be cleaner and probably much easier to understand than non-exceptional exceptions.
Change your CheckX() functions to return the double they produce and leave dealing with the result to the caller. The caller can more easily do this in a way that doesn't involve redundancy.
If you want to be really fancy, put those functions into an array of function pointers and iterate over that. Then the code for dealing with the results would all be in a loop. Something like:
bool Checker::Check()
{
    for( std::size_t idx=0; idx<sizeof(check_tbl)/sizeof(check_tbl[0]); ++idx ) {
        double result = check_tbl[idx]();
        if( result > 1 )
            return false; // or whichever way your logic is (an enum might be better)
    }
    return true;
}
Edit: I had overlooked that you need to call any of N SetCheckResultX() functions, too, which would be impossible to incorporate into my sample code. So either you can shoehorn this into an array, too, (change them to SetCheckResult(std::size_t idx, double result)) or you would have to have two function pointers in each table entry:
struct check_tbl_entry {
    check_fnc_t checker;
    set_result_fnc_t setter;
};

check_tbl_entry check_tbl[] = { { &Checker::Check1, &Checker::SetCheck1Result }
                              , { &Checker::Check2, &Checker::SetCheck2Result }
                              // ...
                              };
bool Checker::Check()
{
    for( std::size_t idx=0; idx<sizeof(check_tbl)/sizeof(check_tbl[0]); ++idx ) {
        double result = check_tbl[idx].checker();
        check_tbl[idx].setter(result);
        if( result > 1 )
            return false; // or whichever way your logic is (an enum might be better)
    }
    return true;
}
(And, no, I'm not going to attempt to write down the correct syntax for a member function pointer's type. I've always had to look it up and still never got it right the first time... But I know it's doable.)
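For what it's worth, here is a hedged sketch of what such a member-function-pointer table could look like, under the assumption made above that the CheckN functions are changed to return their double and that setter forwarders exist on Checker (the trivial bodies are stand-ins, just so the snippet compiles):
#include <cstddef>

class Checker {
public:
    bool Check();
    // Stand-ins: CheckN returns its result, SetCheckNResult stores it.
    double Check1() { return 0.5; }
    double Check2() { return 0.7; }
    void SetCheck1Result(double d) { m_check1 = d; }
    void SetCheck2Result(double d) { m_check2 = d; }
private:
    double m_check1 = 0, m_check2 = 0;
};

struct check_tbl_entry {
    double (Checker::*checker)();      // pointer-to-member-function computing a result
    void   (Checker::*setter)(double); // pointer-to-member-function storing it
};

static const check_tbl_entry check_tbl[] = {
    { &Checker::Check1, &Checker::SetCheck1Result },
    { &Checker::Check2, &Checker::SetCheck2Result },
};

bool Checker::Check()
{
    for (std::size_t idx = 0; idx < sizeof(check_tbl)/sizeof(check_tbl[0]); ++idx) {
        double result = (this->*check_tbl[idx].checker)(); // call through `this`
        (this->*check_tbl[idx].setter)(result);
        if (result > 1)
            return false;
    }
    return true;
}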
Exceptions are meant for cases that shouldn't happen during normal operation. They're hardly non-intrusive; their very nature involves unwinding the call stack, calling destructors all over the place, yanking the control to a whole other section of code, etc. That stuff can be expensive, depending on how much of it you end up doing.
Even if it were free, though, using exceptions as a normal flow control mechanism is a bad idea for one other, very big reason: exceptions aren't meant to be used that way, so people don't use them that way, so they'll be looking at your code and scratching their heads trying to figure out why you're throwing what looks to them like an error. Head-scratching usually means you're doing something more "clever" than you should be.