Avoiding checking a likely if - C++

Given the following:
#include <map>

class ReadWrite {
public:
    int Read(size_t address);
    void Write(size_t address, int val);
private:
    std::map<size_t, int> db;
};
In the Read function, when accessing an address to which no previous write was made, I want either to throw an exception designating such an error or to allow it and return 0. In other words, I would like to use either std::map<size_t, int>::at() or std::map<size_t, int>::operator[](), depending on some bool value which the user can set. So I add the following:
class ReadWrite {
public:
    int Read(size_t add) { if (allow) return db[add]; return db.at(add); }
    void Write(size_t add, int val) { db[add] = val; }
    void Allow() { allow = true; }
private:
    bool allow = false;
    std::map<size_t, int> db;
};
The problem with that is:
Usually, the program will have one call to Allow() (or none) at the beginning of the program and then many accesses afterwards. So, performance-wise, this code is bad because it performs the if (allow) check every time, even though the result is usually either always true or always false.
So how would you solve such a problem?
Edit:
While the described use case for this class (one call to Allow() at the start, or none) is very likely, it is not guaranteed, so I must allow the user to call Allow() dynamically.
Another Edit:
Regarding the solutions that use a function pointer: what about the performance overhead incurred by a function pointer call that the compiler cannot inline? If we use std::function instead, will that solve the issue?

Usually, the program will have one call to Allow() (or none) at the beginning of the program and then many accesses afterwards. So, performance-wise, this code is bad because it performs the if (allow) check every time, even though the result is usually either always true or always false. So how would you solve such a problem?
I won't. The CPU will.
The branch predictor will figure out that the answer is most likely to stay the same for a long time, so it can optimize the branch very effectively at the hardware level. It will still incur some overhead, but a negligible one.
If you really need to optimize your program, I think you are better off using std::unordered_map instead of std::map, or moving to some faster map implementation, like google::dense_hash_map. The branch is insignificant compared to the map lookup.
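A minimal sketch of that suggestion, keeping the interface from the question and only swapping the container (nothing beyond the names already shown is assumed):
#include <cstddef>
#include <unordered_map>

class ReadWrite {
public:
    // Same branch as before; the lookup, not the branch, dominates the cost.
    int Read(std::size_t add) { if (allow) return db[add]; return db.at(add); }
    void Write(std::size_t add, int val) { db[add] = val; }
    void Allow() { allow = true; }
private:
    bool allow = false;
    std::unordered_map<std::size_t, int> db; // average O(1) lookup vs O(log n) for std::map
};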

If you want to decrease the time-cost, you have to increase the memory-cost. Accepting that, you can do this with a function pointer. Below is my answer:
class ReadWrite {
public:
    void Write(size_t add, int val) { db[add] = val; }
    // when allowed, make the function pointer point to read2
    void Allow() { Read = &ReadWrite::read2; }
    // function pointer that points to read1 by default
    int (ReadWrite::*Read)(size_t) = &ReadWrite::read1;
private:
    int read1(size_t add) { return db.at(add); }
    int read2(size_t add) { return db[add]; }
    std::map<size_t, int> db;
};
The member function pointer is invoked through the object with the .* operator. As an example:
ReadWrite rwObject;
//some code here
//...
(rwObject.*rwObject.Read)(5); //call through the member function pointer
Note that non-static data member initialization is available since C++11, so int (ReadWrite::*Read)(size_t) = &ReadWrite::read1; may not compile with older compilers. In that case, you have to declare a constructor explicitly and initialize the function pointer there.
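For example, a sketch of that pre-C++11 variant (same class as the answer above, with the member pointer initialized in an explicitly declared constructor):
#include <cstddef>
#include <map>

class ReadWrite {
public:
    ReadWrite() : Read(&ReadWrite::read1) {} // initialize the member pointer here
    void Write(std::size_t add, int val) { db[add] = val; }
    void Allow() { Read = &ReadWrite::read2; }
    int (ReadWrite::*Read)(std::size_t);
private:
    int read1(std::size_t add) { return db.at(add); }
    int read2(std::size_t add) { return db[add]; }
    std::map<std::size_t, int> db;
};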

You can use a pointer to member function.
class ReadWrite {
public:
    void Write(size_t add, int val) { db[add] = val; }
    int Read(size_t add) { return (this->*Rfunc)(add); }
    void Allow() { Rfunc = &ReadWrite::Read2; }
private:
    std::map<size_t, int> db;
    int Read1(size_t add) { return db.at(add); }
    int Read2(size_t add) { return db[add]; }
    int (ReadWrite::*Rfunc)(size_t) = &ReadWrite::Read1;
};

If you want runtime dynamic behaviour you'll have to pay for it at runtime (at the point you want your logic to behave dynamically).
You want different behaviour at the point where you call Read depending on a runtime condition and you'll have to check that condition.
No matter whether your overhead is a function pointer call or a branch, you'll find a jump or call to different places in your program, depending on allow, at the point where Read is called by the client code.
Note: Profile and fix real bottlenecks - not suspected ones. (You'll learn more if you profile by either having your suspicion confirmed or by finding out why your assumption about the performance was wrong.)
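Regarding the std::function idea from the question's second edit, here is a sketch of what it might look like (reusing the names from the question). It does not remove the cost: the call through the type-erased std::function is still a dynamic dispatch at the point where Read is invoked, and it typically cannot be inlined either.
#include <cstddef>
#include <functional>
#include <map>

class ReadWrite {
public:
    ReadWrite() : read([this](std::size_t add) { return db.at(add); }) {}
    int Read(std::size_t add) { return read(add); }
    void Write(std::size_t add, int val) { db[add] = val; }
    void Allow() { read = [this](std::size_t add) { return db[add]; }; }
private:
    std::function<int(std::size_t)> read; // type-erased: still an indirect call
    std::map<std::size_t, int> db;
};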

Related

Unchecked read from a map in a const function

Suppose the following:
struct C {
    ... // lots of other stuff
    int get(int key) const { return m.at(key); } // This will never throw
private:
    std::unordered_map<int, int> m;
};
Due to how the application works, I know that get never throws. I want to make get as fast as possible. So, I would like to make the access unchecked, i.e. I would like to write something like return m[key]. Of course, I cannot write exactly that while keeping get const. However, I want to keep get const, since it is logically const.
Here is the only (ugly) solution I came up with:
struct C {
    ... // lots of other stuff
    int get(int key) const { return const_cast<C *>(this)->m[key]; }
private:
    std::unordered_map<int, int> m;
};
Is there a better way?
One approach would be to use std::unordered_map::find:
struct C {
    ... // lots of other stuff
    int get(int key) const { return m.find(key)->second; }
private:
    std::unordered_map<int, int> m;
};
I object to the very reasoning behind this question. The overhead (of map.at() vs map[]) associated with catching an error due to unknown key is presumably tiny compared to the cost of finding the key in the first place.
Yet, you willingly take the serious risk of a run-time error just for such a marginal efficiency advantage that you presumably have not even validated/measured. You may think that you know that key is always contained in the map, but perhaps future code changes (including bugs introduced by others) may change that?
If you really know, then you should use
map.find(key)->second;
which makes your assumption explicit: if the key is missing, the returned iterator is invalid (i.e. equal to map.end()) and must not be dereferenced. You may use assert in pre-production code, i.e.
auto it = map.find(key);
assert(it!=map.end());
return it->second;
which is removed in production code (where assert is an empty macro).
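Putting those two suggestions together, a sketch of the const getter (reusing the names from the question; the assert checks the precondition in debug builds and compiles away when NDEBUG is defined):
#include <cassert>
#include <unordered_map>

struct C {
    int get(int key) const
    {
        auto it = m.find(key); // const overload of find, so get() stays const
        assert(it != m.end() && "key must already be in the map");
        return it->second;
    }
private:
    std::unordered_map<int, int> m;
};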

Is locking a dereferenced mutex bad behaviour?

C++ pseudocode class:
A simple class which has a member variable and a mutex to control access to it.
I'm curious about the pros and cons of managing the data and its access.
In a multithreaded environment, is it wrong to use the approach of accessing and locking the member mutex shown in cbMethodA()?
I've seen samples where the members are accessed directly, and it seems incorrect to do that. The class exposes access via a public method for a reason.
Also, dereferencing a mutex to then lock it doesn't seem like best practice. Any comments?
Thanks
class A
{
public:
    A() : val(0) {}
    ~A();
    int getVal(void);
    static void cbMethodA(void *ptr);
    static void cbMethodB(void *ptr);
private:
    Mutex m_mutex;
    int val;
};

int A::getVal()
{
    int returnVal = 0;
    lockMutex(m_mutex);
    returnVal = val;
    unlockMutex(m_mutex);
    return returnVal;
}

void A::cbMethodA(void *ptr)
{
    A* obj = static_cast<A*>(ptr);
    //get val by locking the member mutex directly
    lockMutex(obj->m_mutex);
    //read val
    int tempVal = obj->val;
    unlockMutex(obj->m_mutex);
    //do something with data
}

void A::cbMethodB(void *ptr)
{
    A* obj = static_cast<A*>(ptr);
    //get val via the public getter
    int tempVal = obj->getVal();
    //process val....
}
This seems like a direct application of SPOT (Single Point Of Truth), a.k.a. DRY (Don't Repeat Yourself), two names for a single important idea. You've created a function for accessing val that performs some tasks that should always go along with accessing it. Unless there is some private, implementation-specific reason to access the member field directly, you should probably use the getter method you define. That way, if you change the synchronization mechanism that protects val, you only need to update one piece of code.
I can't think of any reason why "dereferencing a mutex to lock it" would be a bad thing; repeating yourself is a bad thing.
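As a minimal sketch of that idea with the standard library (std::mutex and std::lock_guard stand in for the pseudocode Mutex type above; everything else mirrors the question's class):
#include <mutex>

class A
{
public:
    // The one place that knows how val is protected; callbacks such as
    // cbMethodB go through this getter instead of locking on their own.
    int getVal() const
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        return val;
    }
private:
    mutable std::mutex m_mutex; // mutable so const readers may lock it
    int val = 0;
};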

Watching writes from an application at a specified memory pointer received from a DLL

I'll try to explain my problem as best I can. So, I have an application written in C/C++ (the language of the client application doesn't matter) which imports one function from a DLL, for example uint32_t* GetMemoryPointer(). Then it writes to this memory pointer in sequences like this:
uint32_t* ptr = (uint32_t*)GetMemoryPointer();
*ptr = 3;
*ptr = 4;
*ptr = 1050;
It does this sequence without giving the DLL any information that the value was changed. Is it possible to watch this value in the DLL? I tried to create a thread that looks for changes in a loop, but it's not reliable. Is there a better solution? I'm interested in doing it this way: the application writes, the DLL detects that the value was changed, HOLDS the application's execution, interprets the value, then ALLOWS the application to continue. Another way, without holding the application, might be pushing the new value onto a stack, but I need to be informed of every change. The platform I'm interested in is Windows. The language doesn't matter; it may be C or C++. Is it possible to achieve this? It's really important for me and I'm out of ideas. I don't want code, but I would like to know whether it is possible and which way I need to go. Thanks in advance.
One option is to implement a Value type that holds the actual data to be monitored and use the observer pattern to dispatch notifications when the value changes. Start with a simple implementation that holds a value of the desired type (uint32_t in this case) along with an assignment operator that invokes callbacks any time the operator changes the value.
The example below does just that and includes a conversion operator to allow a fair number of operations to be performed with other uint32_t values. You can expand on this to meet your requirements, including providing a full set of operators (operator+, operator/, etc.) to make it a bit more robust.
#include <iostream>
#include <vector>
#include <cstdint>

class Value
{
    uint32_t value;
    std::vector<void(*)()> observers;

public:
    Value() : value(0) {}

    // Allows you to register an observer that gets called when
    // the value changes
    void RegisterListener(void (*f)())
    {
        observers.push_back(f);
    }

    // Conversion operator that allows implicit conversions
    // from Value to uint32_t.
    operator uint32_t() const
    {
        return value;
    }

    Value& operator=(uint32_t newValue)
    {
        // Only alert observers if the value is actually changing.
        if (value != newValue)
        {
            value = newValue;
            for (std::vector<void(*)()>::const_iterator it = observers.begin();
                 it != observers.end();
                 ++it)
            {
                // Call the observer
                (*it)();
            }
        }
        return *this;
    }
};

void Callback()
{
    std::cout << "value changed\n";
}

int main()
{
    Value value;
    value.RegisterListener(Callback);

    // Value held in object can be assigned to a uint32_t due to the
    // conversion operator.
    uint32_t original = value;

    // Change the value and see the callback get invoked.
    value = value + 1;

    // Restore the value to its original and see the callback get invoked.
    value = original;
}
Off the top of my head: if you could mark the memory as read-only, then whenever someone tries to write to it the OS will raise an exception/error, which you have to catch. I don't know whether any libraries exist for this, so try googling it.
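If you control the pointer the DLL hands out, a minimal sketch of that guard-page idea on Windows could look like the following. VirtualAlloc, VirtualProtect and AddVectoredExceptionHandler are real Win32 calls; the g_watched variable and the setup in main are purely illustrative.
#include <windows.h>
#include <cstdint>
#include <cstdio>

// Hypothetical watched value; it gets its own page so protection changes
// affect nothing else.
static uint32_t* g_watched = nullptr;

LONG CALLBACK WatchHandler(PEXCEPTION_POINTERS info)
{
    const EXCEPTION_RECORD* rec = info->ExceptionRecord;
    if (rec->ExceptionCode == EXCEPTION_ACCESS_VIOLATION &&
        rec->ExceptionInformation[0] == 1 && // 1 means the faulting access was a write
        rec->ExceptionInformation[1] == reinterpret_cast<ULONG_PTR>(g_watched))
    {
        std::printf("write to watched value detected\n");
        // Re-enable writes so the application's store succeeds when retried.
        DWORD old;
        VirtualProtect(g_watched, sizeof(uint32_t), PAGE_READWRITE, &old);
        return EXCEPTION_CONTINUE_EXECUTION;
    }
    return EXCEPTION_CONTINUE_SEARCH;
}

int main()
{
    g_watched = static_cast<uint32_t*>(
        VirtualAlloc(nullptr, sizeof(uint32_t),
                     MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE));
    AddVectoredExceptionHandler(1, WatchHandler);

    DWORD old;
    VirtualProtect(g_watched, sizeof(uint32_t), PAGE_READONLY, &old);

    *g_watched = 3; // faults, the handler runs, then the store completes
    std::printf("value is now %u\n", *g_watched);
}
Note that once the handler restores write access, later writes to that page no longer fault; catching every write this way means re-protecting the page after each one (for example via a single-step trap), which is one reason the observer approach above is usually simpler.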

Inline function pointer to avoid if statement

In my jpg decoder I have a loop with an if statement that will always be true or always be false depending on the image. I could make two separate functions to avoid the if statement but I was wondering out of curiosity what the effect on efficiency would be using a function pointer instead of the if statement. It will point to the inline function if true or point to an empty inline function if false.
class jpg{
private:
    // empty function
    inline void nothing();
    // real function
    inline void function();
    // pointer to member function, set before the decode loop
    void (jpg::*functionptr)() = nullptr;
    void dootherstuff();
public:
    void decode();
};

inline void jpg::nothing(){}

// decode loop as a member function so this->*functionptr is valid
void jpg::decode(){
    functionptr = &jpg::nothing;
    if(trueorfalse){
        functionptr = &jpg::function;
    }
    while(kazillion){
        (this->*functionptr)();
        dootherstuff();
    }
}
Could this be faster than an if statement? My guess is no, because the inline will be useless: the compiler won't know which function to inline at compile time, and resolving the function pointer's target is slower than an if statement.
I have profiled my program, and while I expected a noticeable difference one way or the other, I did not experience one. So I'm just wondering out of curiosity.
It is very likely that the if statement would be faster than invoking a function, as the if will just be a short jump vs the overhead of a function call.
This has been discussed here: Which one is faster? Function call or Conditional if Statement?
The "inline" keyword is just a hint asking the compiler to try to emit the function's instructions inline at the call site. If you call an inline function through a function pointer, the inline optimization cannot be used anyway:
Read: Do inline functions have addresses?
If you feel that the if statement is slowing it too much, you could eliminate it altogether by using separate while statements:
if (trueorfalse) {
    while (kazillion) {
        trueFunction();
        dootherstuff();
    }
} else {
    while (kazillion) {
        dootherstuff();
    }
}
Caution 1: I am not really answering the above question, on purpose. If one wants to know what is faster between an if statement and a function call via a pointer in the above example, then mbonneau gives a very good answer.
Caution 2: The following is pseudo-code.
Curiosity aside, I truly think one should not ask what is faster between an if statement and a function call in order to optimize one's code. The gain would certainly be very small, and the resulting code might be twisted in a way that hurts readability AND maintenance.
For my research I do care about performance; it is a fundamental concern I have to stick with. But I care even more about code maintenance, and if I have to choose between a good structure and a slight optimization, I definitely choose the good structure. So, if it were me, I would write the above code as follows (avoiding if statements), using composition through a Strategy pattern.
class MyStrategy {
public:
    virtual ~MyStrategy() {}
    virtual void MyFunction( Stuff& ) = 0;
};

class StrategyOne : public MyStrategy {
public:
    void MyFunction( Stuff& ); // do something
};

class StrategyTwo : public MyStrategy {
public:
    void MyFunction( Stuff &stuff ) { } // do nothing, and if you
                                        // change your mind it could
                                        // do something later.
};

class jpg{
public:
    jpg( MyStrategy &strat ) : strat(strat) { }
    void func( Stuff &stuff ) { strat.MyFunction( stuff ); }
private:
    ...
    MyStrategy &strat; // reference, so the strategy stays polymorphic
};

main(){
    StrategyOne one;
    StrategyTwo two;
    jpg a( one );
    jpg b( two );
    vector<jpg> v { a, b };
    Stuff stuff;
    for( auto &e : v )
    {
        e.func( stuff );
        dootherstuff();
    }
}

Optimize output value using a class and public member

Suppose you have a function that you call a lot of times, and every time the function returns a big object. I've optimized the problem using a functor that returns void and stores the return value in a public member:
#include <vector>

const int N = 100;

std::vector<double> fun(const std::vector<double> & v, const int n)
{
    std::vector<double> output = v;
    output[n] *= output[n];
    return output;
}

class F
{
public:
    F() : output(N) {};
    std::vector<double> output;
    void operator()(const std::vector<double> & v, const int n)
    {
        output = v;
        output[n] *= n;
    }
};

int main()
{
    std::vector<double> start(N,10.);
    std::vector<double> end(N);
    double a;
    // first solution
    for (unsigned long int i = 0; i != 10000000; ++i)
        a = fun(start, 2)[3];
    // second solution
    F f;
    for (unsigned long int i = 0; i != 10000000; ++i)
    {
        f(start, 2);
        a = f.output[3];
    }
}
Yes, I could use inline or optimize this problem in another way, but here I want to stress this point: with the functor I declare and construct the output variable only once, while with the function I do that on every call. The second solution is twice as fast as the first with g++ -O1 or g++ -O2. What do you think about it, is it an ugly optimization?
Edit:
To clarify my aim: I have to evaluate the function >10M times, but I need the output only a few random times. It's important that the input is not changed; in fact, I declared it as a const reference. In this example the input is always the same, but in the real world the input changes and is a function of the function's previous output.
A more common scenario is to create an object with a large enough reserved size outside the function and pass the large object to the function by pointer or by reference. You can reuse this object over several calls to your function and thus avoid continual memory allocation.
In both cases you are allocating a new vector many, many times.
What you should do is to pass both input and output objects to your class/function:
void fun(const std::vector<double> & in, const int n, std::vector<double> & out)
{
    out[n] *= in[n];
}
This way you separate your logic from the algorithm. You only have to create the std::vector once and can pass it to the function as many times as you want. Notice that no unnecessary copy/allocation is made.
P.S. It's been a while since I did C++; it may not compile right away.
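A possible call site for that version, reusing the names from the question (a sketch; the point is that the buffer is created once before the loop, so no per-iteration allocation happens):
std::vector<double> start(N, 10.);
std::vector<double> out = start;   // allocated and initialized once
double a = 0;
for (unsigned long int i = 0; i != 10000000; ++i)
{
    fun(start, 2, out);   // writes into the existing buffer
    a = out[3];
}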
It's not an ugly optimization. It's actually a fairly decent one.
I would, however, hide output and make an operator[] member to access its members. Why? Because you just might be able to perform a lazy evaluation optimization by moving all the math to that function, thus only doing that math when the client requests that value. Until the user asks for it, why do it if you don't need to?
Edit:
Just checked the standard. Behavior of the assignment operator is based on insert(), and the notes for that function state that an allocation occurs if the new size exceeds the current capacity. Of course this does not seem to explicitly disallow an implementation from reallocating even otherwise... I'm pretty sure you'll find none that do, and I'm sure the standard says something about it somewhere else. Thus you've improved speed by removing allocation calls.
You should still hide the internal vector; encapsulation will give you more freedom to change the implementation later. You could also return a (possibly const) reference to the vector from the function and retain the original syntax.
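For instance, a sketch of that last suggestion: the buffer stays private and is reused, the call operator returns a const reference, and the caller keeps the f(start, 2)[3] syntax (the squaring matches the free function fun from the question; the class name F is reused purely for illustration):
#include <cstddef>
#include <vector>

class F
{
public:
    explicit F(std::size_t n) : output(n) {}
    const std::vector<double>& operator()(const std::vector<double>& v, int n)
    {
        output = v;             // reuses the existing capacity after the first call
        output[n] *= output[n];
        return output;          // const reference: no copy, not writable by callers
    }
private:
    std::vector<double> output; // hidden, as suggested above
};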
I played with this a bit, and came up with the code below. I keep thinking there's a better way to do this, but it's escaping me for now.
The key differences:
- I'm allergic to public member variables, so I made output private and put getters around it.
- Having the operator return void isn't necessary for the optimization, so I have it return the value as a const reference so we can preserve return value semantics.
- I took a stab at generalizing the approach into a templated base class, so you can then define derived classes for a particular return type and not re-define the plumbing. This assumes the object you want to create takes a one-arg constructor, and the function you want to call takes in one additional argument. I think you'd have to define other templates if this varies.
Enjoy...
#include <cstddef>
#include <vector>

template<typename T, typename ConstructArg, typename FuncArg>
class ReturnT
{
public:
    ReturnT(ConstructArg arg): output(arg){}
    virtual ~ReturnT() {}

    const T& operator()(const T& in, FuncArg arg)
    {
        output = in;
        this->doOp(arg);
        return this->getOutput();
    }

    const T& getOutput() const {return output;}

protected:
    T& getOutput() {return output;}

private:
    virtual void doOp(FuncArg arg) = 0;
    T output;
};

class F : public ReturnT<std::vector<double>, std::size_t, const int>
{
public:
    F(std::size_t size) : ReturnT<std::vector<double>, std::size_t, const int>(size) {}
private:
    virtual void doOp(const int n)
    {
        this->getOutput()[n] *= n;
    }
};

int main()
{
    const int N = 100;
    std::vector<double> start(N,10.);
    double a;

    // second solution
    F f(N);
    for (unsigned long int i = 0; i != 10000000; ++i)
    {
        a = f(start, 2)[3];
    }
}
It seems quite strange (I mean the need for optimization at all): I think that a decent compiler should perform return value optimization in such cases. Maybe all you need to do is enable it.