C++ Returning a Count - c++

Can someone help me understand what it means to return a count?
I know the function involves a for-loop or a while-loop depending on the situation but I'm having trouble understanding the concept. I'll try to be as clear as possible.
Here's an example: I have two functions from a cryptography class: lock() and unlock(). They are polymorphic, take no parameters, and return no value. Does that mean the functions themselves are empty?
And then I have another function, encryptionLvl(). This one takes no parameters but should return a count of the current encryption level, and that number should be incremented each time lock() is called and decremented each time unlock() is called.
How do I make this work?
Sorry if I'm confusing you. I'm a beginner at this programming, but I appreciate the effort.

A function can take no parameters and return void. You can think of such a function as a procedure. Although there is no return value, a procedure can affect your program through side effects. Like this:
class Counter {
    int i = 0; // start from a known value
public:
    void increment() { i = i + 1; }
    void reset() {
        i = 0;
        return; // The empty return statement is optional
    }
    int get() { return i; }
};
In your case, lock() does not return a value, but it increments the counter. unlock() does the opposite. encryptionLvl() is equivalent to get().
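Applied to the question, a minimal sketch might look like the following. The class name Cipher, the member level_, and the guard that keeps the count from dropping below zero are assumptions for illustration; the question only names lock(), unlock(), and encryptionLvl().

```cpp
// Hypothetical class: only lock(), unlock() and encryptionLvl() come from the question.
class Cipher {
    int level_ = 0;  // the "count": current number of encryption levels
public:
    virtual ~Cipher() = default;

    // No parameters, no return value: the visible effect is the change to level_.
    virtual void lock() { ++level_; }

    // Assumption: the count is never allowed to go negative.
    virtual void unlock() { if (level_ > 0) --level_; }

    // "Returning a count" simply means returning the stored number.
    int encryptionLvl() const { return level_; }
};
```

Calling lock() twice and unlock() once on a fresh Cipher would leave encryptionLvl() returning 1. No loop is needed; the count is just state that the other two functions update.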

Related

Calling map function (Indirection requires pointer operand) [duplicate]

I'm trying to make a table of function pointers within a class. I haven't been able to find any examples of this online, most involve using member function pointers outside of their class.
for example:
class Test
{
typedef void (Test::*FunctionType)();
FunctionType table[0x100];
void TestFunc()
{
}
void FillTable()
{
for(int i = 0; i < 0x100; i++)
table[i] = &Test::TestFunc;
}
void Execute(int which)
{
table[which]();
}
}test;
Gives me the error "term does not evaluate to a function taking 0 arguments".
In this line in the Execute function:
table[which]();
You can't call it like that because it's not a normal function. You have to provide it with an object on which to operate, because it's a pointer to a member function, not a pointer to a function (there's a difference):
(this->*table[which])();
That will make the invoking object whichever object is pointed to by the this pointer (the one that's executing Execute).
Also, when posting errors, make sure to include the line on which the error occurs.
Seth has the right answer. Next time, look up the compiler error number on MSDN and you'll see the same: Compiler Error C2064.
You need a context in which to call your function. In your case, the context is this:
void Execute(int which)
{
(this->*table[which])();
}
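For reference, here is a small self-contained sketch of a member function pointer table that is filled and called from inside its own class. The class name Dispatcher, the handlers One and Two, and the table size of 4 are made up for illustration.

```cpp
#include <cstddef>

class Dispatcher {
    typedef int (Dispatcher::*Handler)();   // pointer-to-member-function type
    static const std::size_t kSize = 4;     // assumption: a small table for illustration
    Handler table[kSize];

    int One() { return 1; }
    int Two() { return 2; }

public:
    Dispatcher() {
        // Even slots call One(), odd slots call Two().
        for (std::size_t i = 0; i < kSize; ++i)
            table[i] = (i % 2 == 0) ? &Dispatcher::One : &Dispatcher::Two;
    }

    int Execute(std::size_t which) {
        // The pointer needs an object to act on: here, *this.
        return (this->*table[which])();
    }
};
```

The parentheses around this->*table[which] are required, because the function call operator binds more tightly than ->*.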

Avoiding Checking likely if

Given the following:
class ReadWrite {
public:
int Read(size_t address);
void Write(size_t address, int val);
private:
std::map<size_t, int> db;
};
In the read function, when accessing an address to which no previous write was made, I want to either throw an exception designating the error or allow the access and return 0. In other words, I would like to use either std::map<size_t, int>::operator[]() or std::map<size_t, int>::at(), depending on some bool value which the user can set. So I add the following:
class ReadWrite {
public:
int Read(size_t add) { if (allow) return db[add]; return db.at(add);}
void Write(size_t add, int val) { db[add] = val; }
void Allow() { allow = true; }
private:
bool allow = false;
std::map<size_t, int> db;
};
The problem with that is:
Usually, the program will have one call of allow or none at the beginning of the program and then afterwards many accesses. So, performance wise, this code is bad because it every-time performs the check if (allow) where usually it's either always true or always false.
So how would you solve such problem?
Edit:
While the described use case (one or none Allow() at first) of this class is very likely it's not definite and so I must allow user call Allow() dynamically.
Another Edit:
Solutions which use a function pointer: what about the performance overhead of calling through a function pointer, which the compiler cannot inline? If we use std::function instead, will that solve the issue?
Usually, the program will have one call of allow or none at the
beginning of the program and then afterwards many accesses. So,
performance wise, this code is bad because it every-time performs the
check if (allow) where usually it's either always true or always
false. So how would you solve such problem?
I won't. The CPU will.
Branch prediction will figure out that the answer is most likely the same for a long time, so the hardware can handle the branch very efficiently. It will still incur some overhead, but a negligible amount.
If you really need to optimize your program, I think you'd be better off using std::unordered_map instead of std::map, or moving to some faster map implementation, like google::dense_hash_map. The branch is insignificant compared to the map lookup.
If you want to decrease the time-cost, you have to increase the memory-cost. Accepting that, you can do this with a function pointer. Below is my answer:
class ReadWrite {
public:
void Write(size_t add, int val) { db[add] = val; }
// when allowed, make the function pointer point to read2
void Allow() { Read = &ReadWrite::read2;}
//function pointer that points to read1 by default
int (ReadWrite::*Read)(size_t) = &ReadWrite::read1;
private:
int read1(size_t add){return db.at(add);}
int read2(size_t add) {return db[add];}
std::map<size_t, int> db;
};
Because Read is a pointer to a member function, it has to be called through an object with the .* operator. As an example:
ReadWrite rwObject;
//some code here
//...
(rwObject.*rwObject.Read)(5); //call through the member function pointer
//
Note that non-static data member initialization is available with c++11, so the int (ReadWrite::*Read)(size_t) = &ReadWrite::read1; may not compile with older versions. In that case, you have to explicitly declare one constructor, where the initialization of the function pointer can be done.
You can use a pointer to function.
class ReadWrite {
public:
void Write(size_t add, int val) { db[add] = val; }
int Read(size_t add) { return (this->*Rfunc)(add); }
void Allow() { Rfunc = &ReadWrite::Read2; }
private:
std::map<size_t, int> db;
int Read1(size_t add) { return db.at(add); }
int Read2(size_t add) { return db[add]; }
int (ReadWrite::*Rfunc)(size_t) = &ReadWrite::Read1;
};
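A complete, compilable sketch of this pointer-to-member approach might look like the following. The names ReadStrict, ReadLenient, and reader are chosen here for illustration, and the default member initializer assumes C++11 or later.

```cpp
#include <cstddef>
#include <map>
#include <stdexcept>

class ReadWrite {
public:
    void Write(std::size_t add, int val) { db[add] = val; }

    // Dispatch through the currently selected reader; no if (allow) here.
    int Read(std::size_t add) { return (this->*reader)(add); }

    // Can be called at any time to switch to the lenient behaviour.
    void Allow() { reader = &ReadWrite::ReadLenient; }

private:
    // Throws std::out_of_range for addresses that were never written.
    int ReadStrict(std::size_t add) { return db.at(add); }

    // Inserts and returns 0 for addresses that were never written.
    int ReadLenient(std::size_t add) { return db[add]; }

    std::map<std::size_t, int> db;
    int (ReadWrite::*reader)(std::size_t) = &ReadWrite::ReadStrict;
};
```

Before Allow() is called, reading an unwritten address throws std::out_of_range; afterwards it returns 0. Whether this is actually faster than the plain branch is something only a profiler can tell you, since the indirect call replaces a well-predicted branch.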
If you want runtime dynamic behaviour you'll have to pay for it at runtime (at the point you want your logic to behave dynamically).
You want different behaviour at the point where you call Read depending on a runtime condition and you'll have to check that condition.
No matter whether your overhead is a function pointer call or a branch, you'll find a jump or call to different places in your program depending on allow at the point Read is called by the client code.
Note: Profile and fix real bottlenecks - not suspected ones. (You'll learn more if you profile by either having your suspicion confirmed or by finding out why your assumption about the performance was wrong.)

An attempt to create atomic reference counting is failing with deadlock. Is this the right approach?

So I'm attempting to create copy-on-write map that uses an attempt at atomic reference counting on the read-side to not have locking.
Something isn't quite right. I see some references getting over-incremented and some are going down negative, so something isn't really atomic. In my tests I have 10 reader threads looping 100 times each doing a get() and 1 writer thread doing 100 writes.
It gets stuck in the writer because some of the references never go down to zero, even though they should.
I'm attempting to use the 128-bit DCAS technique explained in this blog post.
Is there something blatantly wrong with this, or is there an easier way to debug it than playing with it in the debugger?
typedef std::unordered_map<std::string, std::string> StringMap;
static const int zero = 0; //provides an l-value for asm code
class NonBlockingReadMapCAS {
public:
class OctaWordMapWrapper {
public:
StringMap* fStringMap;
//std::atomic<int> fCounter;
int64_t fCounter;
OctaWordMapWrapper(OctaWordMapWrapper* copy) : fStringMap(new StringMap(*copy->fStringMap)), fCounter(0) { }
OctaWordMapWrapper() : fStringMap(new StringMap), fCounter(0) { }
~OctaWordMapWrapper() {
delete fStringMap;
}
/**
* Does a compare and swap on an octa-word - in this case, our two adjacent class members fStringMap
* pointer and fCounter.
*/
static bool inline doubleCAS(OctaWordMapWrapper* target, StringMap* compareMap, int64_t compareCounter, StringMap* swapMap, int64_t swapCounter ) {
bool cas_result;
__asm__ __volatile__
(
"lock cmpxchg16b %0;" // cmpxchg16b sets ZF on success
"setz %3;" // if ZF set, set cas_result to 1
: "+m" (*target),
"+a" (compareMap), //compare target's stringmap pointer to compareMap
"+d" (compareCounter), //compare target's counter to compareCounter
"=q" (cas_result) //results
: "b" (swapMap), //swap target's stringmap pointer with swapMap
"c" (swapCounter) //swap target's counter with swapCounter
: "cc", "memory"
);
return cas_result;
}
OctaWordMapWrapper* atomicIncrementAndGetPointer()
{
if (doubleCAS(this, this->fStringMap, this->fCounter, this->fStringMap, this->fCounter +1))
return this;
else
return NULL;
}
OctaWordMapWrapper* atomicDecrement()
{
while(true) {
if (doubleCAS(this, this->fStringMap, this->fCounter, this->fStringMap, this->fCounter -1))
break;
}
return this;
}
bool atomicSwapWhenNotReferenced(StringMap* newMap)
{
return doubleCAS(this, this->fStringMap, zero, newMap, 0);
}
}
__attribute__((aligned(16)));
std::atomic<OctaWordMapWrapper*> fReadMapReference;
pthread_mutex_t fMutex;
NonBlockingReadMapCAS() {
fReadMapReference = new OctaWordMapWrapper();
}
~NonBlockingReadMapCAS() {
delete fReadMapReference;
}
bool contains(const char* key) {
std::string keyStr(key);
return contains(keyStr);
}
bool contains(std::string &key) {
OctaWordMapWrapper *map;
do {
map = fReadMapReference.load()->atomicIncrementAndGetPointer();
} while (!map);
bool result = map->fStringMap->count(key) != 0;
map->atomicDecrement();
return result;
}
std::string get(const char* key) {
std::string keyStr(key);
return get(keyStr);
}
std::string get(std::string &key) {
OctaWordMapWrapper *map;
do {
map = fReadMapReference.load()->atomicIncrementAndGetPointer();
} while (!map);
//std::cout << "inc " << map->fStringMap << " cnt " << map->fCounter << "\n";
std::string value = map->fStringMap->at(key);
map->atomicDecrement();
return value;
}
void put(const char* key, const char* value) {
std::string keyStr(key);
std::string valueStr(value);
put(keyStr, valueStr);
}
void put(std::string &key, std::string &value) {
pthread_mutex_lock(&fMutex);
OctaWordMapWrapper *oldWrapper = fReadMapReference;
OctaWordMapWrapper *newWrapper = new OctaWordMapWrapper(oldWrapper);
std::pair<std::string, std::string> kvPair(key, value);
newWrapper->fStringMap->insert(kvPair);
fReadMapReference.store(newWrapper);
std::cout << oldWrapper->fCounter << "\n";
while (oldWrapper->fCounter > 0);
delete oldWrapper;
pthread_mutex_unlock(&fMutex);
}
void clear() {
pthread_mutex_lock(&fMutex);
OctaWordMapWrapper *oldWrapper = fReadMapReference;
OctaWordMapWrapper *newWrapper = new OctaWordMapWrapper(oldWrapper);
fReadMapReference.store(newWrapper);
while (oldWrapper->fCounter > 0);
delete oldWrapper;
pthread_mutex_unlock(&fMutex);
}
};
Maybe not the answer but this looks suspicious to me:
while (oldWrapper->fCounter > 0);
delete oldWrapper;
You could have a reader thread just entering atomicIncrementAndGetPointer() when the counter is 0 thus pulling the rug underneath the reader thread by deleting the wrapper.
Edit to sum up the comments below for potential solution:
The best implementation I'm aware of is to move fCounter from OctaWordMapWrapper to fReadMapReference (You don't need the OctaWordMapWrapper class at all actually). When the counter is zero swap the pointer in your writer. Because you can have high contention of reader threads which essentially blocks the writer indefinitely you can have highest bit of fCounter allocated for reader lock, i.e. while this bit is set the readers spin until the bit is cleared. The writer sets this bit (__sync_fetch_and_or()) when it's about to change the pointer, waits for the counter to fall down to zero (i.e. existing readers finish their work) and then swap the pointer and clears the bit.
This approach should be waterproof, though it's obviously blocking readers upon writes. I don't know if this is acceptable in your situation and ideally you would like this to be non-blocking.
The code would look something like this (not tested!):
class NonBlockingReadMapCAS
{
public:
NonBlockingReadMapCAS() :m_ptr(0), m_counter(0) {}
private:
StringMap *acquire_read()
{
while(1)
{
uint32_t counter=atom_inc(m_counter);
if(!(counter&0x80000000))
return m_ptr;
atom_dec(m_counter);
while(m_counter&0x80000000);
}
return 0;
}
void release_read()
{
atom_dec(m_counter);
}
void acquire_write()
{
uint32_t counter=atom_or(m_counter, 0x80000000);
assert(!(counter&0x80000000));
while(m_counter&0x7fffffff);
}
void release_write()
{
atom_and(m_counter, uint32_t(0x7fffffff));
}
StringMap *volatile m_ptr;
volatile uint32_t m_counter;
};
Just call acquire/release_read/write() before & after accessing the pointer for read/write. Replace atom_inc/dec/or/and() with __sync_fetch_and_add(), __sync_fetch_and_sub(), __sync_fetch_and_or() and __sync_fetch_and_and() respectively. You don't need doubleCAS() for this actually.
As correctly noted by @Quuxplusone in a comment below, this is a single-producer, multiple-consumer implementation. I modified the code to assert properly to enforce this.
Well, there are probably lots of problems, but here are the obvious two.
The most trivial bug is in atomicIncrementAndGetPointer. You wrote:
if (doubleCAS(this, this->fStringMap, this->fCounter, this->fStringMap, this->fCounter +1))
That is, you're attempting to increment this->fCounter in a lock-free way. But it doesn't work, because you're fetching the old value twice with no guarantee that the same value is read each time. Consider the following sequence of events:
Thread A fetches this->fCounter (with value 0) and computes argument 5 as this->fCounter +1 = 1.
Thread B successfully increments the counter.
Thread A fetches this->fCounter (with value 1) and computes argument 3 as this->fCounter = 1.
Thread A executes doubleCAS(this, this->fStringMap, 1, this->fStringMap, 1). It succeeds, of course, but we've lost the "increment" we were trying to do.
What you wanted is more like
StringMap* oldMap = this->fStringMap;
int64_t oldCounter = this->fCounter;
if (doubleCAS(this, oldMap, oldCounter, oldMap, oldCounter+1))
...
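The same read-once-then-CAS pattern can be sketched with std::atomic on a plain 64-bit counter. This is only an illustration of the pattern, not a replacement for the 128-bit doubleCAS in the question; the helper name casIncrement is made up.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical helper: increments counter with a CAS loop and returns the new value.
inline std::int64_t casIncrement(std::atomic<std::int64_t>& counter) {
    // Read the old value exactly once; both CAS arguments derive from this copy.
    std::int64_t expected = counter.load();

    // On failure, compare_exchange_weak stores the current value into expected,
    // so the next attempt again uses a consistent (expected, expected + 1) pair.
    while (!counter.compare_exchange_weak(expected, expected + 1)) {
        // retry
    }
    return expected + 1;
}
```

The key point is that the comparand and the new value are both computed from one local snapshot, so a concurrent increment between the two reads cannot make the increment disappear.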
The other obvious problem is that there's a data race between get and put. Consider the following sequence of events:
Thread A begins to execute get: it fetches fReadMapReference.load() and prepares to execute atomicIncrementAndGetPointer on that memory address.
Thread B finishes executing put: it deletes that memory address. (It is within its rights to do so, because the wrapper's reference count is still at zero.)
Thread A starts executing atomicIncrementAndGetPointer on the deleted memory address. If you're lucky, you segfault, but of course in practice you probably won't.
As explained in the blog post:
The garbage collection interface is omitted, but in real applications you would need to scan the hazard pointers before deleting a node.
Another user has suggested a similar approach, but if you are compiling with gcc (and perhaps with clang), you could use the intrinsic __sync_add_and_fetch_4 which does something similar to what your assembly code does, and is likely much more portable.
I have used it when I implemented refcounting in an Ada library (but the algorithm remains the same).
int __sync_add_and_fetch_4 (int* ptr, int value);
// increments the value pointed to by ptr by value, and returns the new value
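As a rough sketch of how such a refcount could look, assuming GCC or Clang: the example below uses the generic, overloaded builtins __sync_add_and_fetch and __sync_sub_and_fetch rather than the size-suffixed name above. The struct RefCounted and its member names are hypothetical.

```cpp
// Assumption: compiled with GCC or Clang; the __sync builtins are compiler extensions.
struct RefCounted {
    int refs;

    RefCounted() : refs(0) {}

    // Atomically adds 1 and returns the new count.
    int acquire() { return __sync_add_and_fetch(&refs, 1); }

    // Atomically subtracts 1 and returns the new count.
    int release() { return __sync_sub_and_fetch(&refs, 1); }
};
```

Because the builtin returns the new value, the caller that sees release() return 0 knows it dropped the last reference. This does not by itself fix the problem of a reader touching a wrapper that has already been deleted.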
Although I'm not sure how your reader threads work, I suspect your problem is that you are not catching and handling possible out_of_range exceptions in your get() method that might arise from this line: std::string value = map->fStringMap->at(key);. Note that if key is not found in the map, this will throw, and exit the function without decrementing the counter, which would lead to the condition you describe (of getting stuck in the while-loop within the writer thread while waiting for the counters to decrement).
In any event, whether this is the cause of the issues you're seeing or not, you definitely need to either handle this exception (and any others) or modify your code such that there's no risk of a throw. For the at() method, I would probably just use find() instead, and then check the iterator it returns. However, more generally, I would suggest using the RAII pattern to ensure that you don't let any unexpected exceptions escape without unlocking/decrementing. For example, you might check out boost::scoped_lock to wrap your fMutex and then write something simple like this for the OctaWordMapWrapper increment/decrement:
class ScopedAtomicMapReader
{
public:
explicit ScopedAtomicMapReader(std::atomic<OctaWordMapWrapper*>& map) : fMap(NULL) {
do {
fMap = map.load()->atomicIncrementAndGetPointer();
} while (NULL == fMap);
}
~ScopedAtomicMapReader() {
if (NULL != fMap)
fMap->atomicDecrement();
}
OctaWordMapWrapper* map(void) {
return fMap;
}
private:
OctaWordMapWrapper* fMap;
}; // class ScopedAtomicMapReader
With something like that, then for example, your contains() and get() methods would simplify to (and be immune to exceptions):
bool contains(std::string &key) {
ScopedAtomicMapReader mapWrapper(fReadMapReference);
return (mapWrapper.map()->fStringMap->count(key) != 0);
}
std::string get(std::string &key) {
ScopedAtomicMapReader mapWrapper(fReadMapReference);
return mapWrapper.map()->fStringMap->at(key); // Now it's fine if this throws...
}
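The exception-safety argument can be checked in isolation with a simplified stand-in. In the sketch below, ScopedCount and guardedGet are hypothetical names, and a plain std::atomic<int> replaces the wrapper's counter.

```cpp
#include <atomic>
#include <map>
#include <stdexcept>
#include <string>

// Increments on construction, decrements on destruction, even during stack unwinding.
class ScopedCount {
public:
    explicit ScopedCount(std::atomic<int>& c) : counter(c) { ++counter; }
    ~ScopedCount() { --counter; }
    ScopedCount(const ScopedCount&) = delete;
    ScopedCount& operator=(const ScopedCount&) = delete;
private:
    std::atomic<int>& counter;
};

// Hypothetical reader: looks up key while holding a reference count.
inline std::string guardedGet(std::atomic<int>& readers,
                              const std::map<std::string, std::string>& m,
                              const std::string& key) {
    ScopedCount guard(readers);
    return m.at(key);  // may throw std::out_of_range; guard still decrements
}
```

After a failed lookup throws, the reader count is back at its previous value, which is exactly what the hand-written increment/decrement pair in get() does not guarantee.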
Finally, although I don't think you should have to do this, you might also try declaring fCounter as volatile (given that your access to it in the while-loop in the put() method will be on a different thread than the writes to it on the reader threads).
Hope this helps!
By the way, one other minor thing: fReadMapReference is leaking. I think you should delete this in your destructor.

C++ inline function & context specific optimization

I have read in Scott Meyers' Effective C++ book that:
When you inline a function you may enable the compiler to perform context specific optimizations on the body of function. Such optimization would be impossible for normal function calls.
Now the question is: what is context specific optimization and why it is necessary?
I don't think "context specific optimization" is a defined term, but I think it basically means the compiler can analyse the call site and the code around it and use this information to optimise the function.
Here's an example. It's contrived, of course, but it should demonstrate the idea:
Function:
int foo(int i)
{
if (i < 0) throw std::invalid_argument("");
return -i;
}
Call site:
int bar()
{
int i = 5;
return foo(i);
}
If foo is compiled separately, it must contain a comparison and exception-throwing code. If it's inlined in bar, the compiler sees this code:
int bar()
{
int i = 5;
if (i < 0) throw std::invalid_argument("");
return -i;
}
Any sane optimiser will evaluate this as
int bar()
{
return -5;
}
If the compiler chooses to inline a function, it replaces the call to that function with the body of the function. It now has more code to optimize inside the caller's body, which often leads to better code.
Imagine that:
bool callee(bool a){
if(a) return false;
else return true;
}
void caller(){
if(callee(true)){
//Do something
}
//Do something
}
Once inlined, the code will look like this (approximately):
void caller(){
bool a = true;
bool ret;
if(a) ret = false;
else ret = true;
if(ret){
//Do something
}
//Do something
}
Which may be optimized further too:
void caller(){
if(false){
//Do something
}
//Do something
}
And then to:
void caller(){
//Do something
}
The function is now much smaller and you don't have the cost of the function call and especially (regarding the question) the cost of branching.
Say the function is
void fun( bool b) { if(b) do_sth1(); else do_sth2(); }
and it is called in the context with pre-defined false parameter
bool param = false;
...
fun( param);
then the compiler may reduce the function body to
...
do_sth2();
I don't think that context specific optimization means something specific, and you probably can't find an exact definition.
A nice example would be a classic getter for a class attribute. Without inlining, the program has to:
jump to the getter body
move the value to a register (eax on x86 under Windows with default Visual Studio settings)
jump back to the caller
move the value from eax to a local variable
With inlining, the program can skip almost all of that work and move the value directly to the local variable.
Optimizations depend strictly on the compiler, but a lot of things can happen (variable allocation may be skipped, code may get reordered, and so on). You always save the call/jump, though, which is an expensive instruction.
More reading on optimisation here.

Is qsort thread safe?

I have some old code that uses qsort to sort an MFC CArray of structures but am seeing the occasional crash that may be down to multiple threads calling qsort at the same time. The code I am using looks something like this:
struct Foo
{
CString str;
time_t t;
Foo(LPCTSTR lpsz, time_t ti) : str(lpsz), t(ti)
{
}
};
class Sorter
{
public:
static void DoSort();
static int __cdecl SortProc(const void* elem1, const void* elem2);
};
...
void Sorter::DoSort()
{
CArray<Foo*, Foo*> data;
for (int i = 0; i < 100; i++)
{
Foo* foo = new Foo("some string", 12345678);
data.Add(foo);
}
qsort(data.GetData(), data.GetCount(), sizeof(Foo*), SortProc);
...
}
int __cdecl SortProc(const void* elem1, const void* elem2)
{
Foo* foo1 = (Foo*)elem1;
Foo* foo2 = (Foo*)elem2;
// 0xC0000005: Access violation reading location blah here
return (int)(foo1->t - foo2->t);
}
...
Sorter::DoSort();
I am about to refactor this horrible code to use std::sort instead but wondered if the above is actually unsafe?
EDIT: Sorter::DoSort is actually a static function but uses no static variables itself.
EDIT2: The SortProc function has been changed to match the real code.
Your problem doesn't necessarily have anything to do with thread safety.
The sort callback function takes in pointers to each item, not the item itself. Since you are sorting Foo* what you actually want to do is access the parameters as Foo**, like this:
int __cdecl SortProc(const void* elem1, const void* elem2)
{
Foo* foo1 = *(Foo**)elem1;
Foo* foo2 = *(Foo**)elem2;
if(foo1->t < foo2->t) return -1;
else if (foo1->t > foo2->t) return 1;
else return 0;
}
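Here is a portable sketch of the same idea without MFC, alongside the std::sort version the question mentions. Item stands in for Foo, std::vector stands in for CArray, and the function names are made up.

```cpp
#include <algorithm>
#include <cstdlib>
#include <ctime>
#include <vector>

struct Item {            // hypothetical stand-in for Foo
    std::time_t t;
};

// qsort passes pointers to the array elements. The elements are Item*,
// so each argument is really a pointer to an Item*.
int compareItemPtrs(const void* a, const void* b) {
    const Item* lhs = *static_cast<const Item* const*>(a);
    const Item* rhs = *static_cast<const Item* const*>(b);
    if (lhs->t < rhs->t) return -1;
    if (lhs->t > rhs->t) return 1;
    return 0;
}

void sortWithQsort(std::vector<Item*>& v) {
    if (!v.empty())
        std::qsort(v.data(), v.size(), sizeof(Item*), compareItemPtrs);
}

// With std::sort the comparator receives the elements themselves, typed.
void sortWithStdSort(std::vector<Item*>& v) {
    std::sort(v.begin(), v.end(),
              [](const Item* x, const Item* y) { return x->t < y->t; });
}
```

The std::sort version avoids the void* casts entirely, so the extra level of indirection cannot be forgotten.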
Your SortProc isn't returning correct results, and this likely leads to memory corruption by something assuming that the data is, well, sorted after you get done sorting it. You could even be leading qsort into corruption as it tries to sort, but that of course varies by implementation.
The comparison function for qsort must return negative if the first object is less than the second, zero if they are equal, and positive otherwise. Your current code only ever returns 0 or 1, and returns 1 when you should be returning negative.
int __cdecl Sorter::SortProc(const void* ap, const void* bp) {
// The array holds Foo*, so each argument points to a Foo*.
Foo const& a = **(Foo const* const*)ap;
Foo const& b = **(Foo const* const*)bp;
if (a.t == b.t) return 0;
return (a.t < b.t) ? -1 : 1;
}
C++ doesn't really make any guarantees about thread safety. About the most you can say is that either multiple readers OR a single writer to a data structure will be OK. Any combination of readers and writers, and you need to serialise the access somehow.
Since you tagged your question with MFC tag I suppose you should select Multi-threaded Runtime Library in Project Settings.
Right now, your code is thread-safe, but useless, as the DoSort method only uses local variables and doesn't even return anything. If the data you are sorting is a member of Sorter, then it is not safe to call the function from multiple threads. In general, read up on reentrancy; this may give you an idea of what you need to look out for.
What makes it thread-safe is whether your objects are thread-safe. For example, to make a qsort call thread-safe you must ensure that everything that reads from or writes to the objects being sorted is itself thread-safe.
The pthreads man page lists the standard functions which are not required to be thread-safe. qsort is not among them, so it is required to be thread-safe in POSIX.
http://www.kernel.org/doc/man-pages/online/pages/man7/pthreads.7.html
I can't find the equivalent list for Windows, though, so this isn't really an answer to your question. I'd be a bit surprised if it was different.
Be aware what "thread-safe" means in this context, though. It means you can call the same function concurrently on different arrays -- it doesn't mean that concurrent access to the same data via qsort is safe (it isn't).
As a word of warning, you may find std::sort is not as fast as qsort. If you do find that try std::stable_sort.
I once wrote a BWT compressor based on the code presented by Mark Nelson in Dr. Dobb's, and when I turned it into classes I found that regular sort was a lot slower. stable_sort fixed the speed problems.