Are C++ exceptions sufficient to implement thread-local storage?

I was commenting on an answer that thread-local storage is nice and recalled another informative discussion about exceptions, where I supposed:
The only special thing about the execution environment within the throw block is that the exception object is referenced by rethrow.
Putting two and two together, wouldn't executing an entire thread inside a function-catch-block of its main function imbue it with thread-local storage?
It seems to work fine, albeit slowly. Is this novel or well-characterized? Is there another way of solving the problem? Was my initial premise correct? What kind of overhead does get_thread incur on your platform? What's the potential for optimization?
#include <iostream>
#include <string>
#include <pthread.h>
using namespace std;
struct thlocal {
    string name;
    thlocal( string const &n ) : name(n) {}
};
struct thread_exception_base {
    thlocal &th;
    thread_exception_base( thlocal &in_th ) : th( in_th ) {}
    thread_exception_base( thread_exception_base const &in ) : th( in.th ) {}
};
// Rethrow the exception currently being handled to recover the per-thread data.
thlocal &get_thread() throw() {
    try {
        throw;
    } catch( thread_exception_base &local ) {
        return local.th;
    }
}
void print_thread() {
    cerr << get_thread().name << endl;
}
void *kid( void *local_v ) try {
    thlocal &local = * static_cast< thlocal * >( local_v );
    throw thread_exception_base( local );
} catch( thread_exception_base & ) {
    print_thread();
    return NULL;
}
int main() {
    thlocal local( "main" );
    try {
        throw thread_exception_base( local );
    } catch( thread_exception_base & ) {
        print_thread();
        pthread_t th;
        thlocal kid_local( "kid" );
        pthread_create( &th, NULL, &kid, &kid_local );
        pthread_join( th, NULL );
        print_thread();
    }
    return 0;
}
This does require defining new exception classes derived from thread_exception_base, initializing the base with get_thread(), but altogether this doesn't feel like an unproductive insomnia-ridden Sunday morning…
EDIT: Looks like GCC makes three calls to pthread_getspecific in get_thread. EDIT: and a lot of nasty introspection into the stack, environment, and executable format to find the catch block I missed on the first walkthrough. This looks highly platform-dependent, as GCC is calling some libunwind from the OS. Overhead on the order of 4000 cycles. I suppose it also has to traverse the class hierarchy but that can be kept under control.

In the playful spirit of the question, I offer this horrifying nightmare creation:
class tls
{
    void push(void *ptr)
    {
        // allocate a string to store the hex ptr
        // and the hex of its own address
        char *str = new char[100];
        sprintf(str, " |%x|%x", ptr, str);
        strtok(str, "|");
    }
    template <class Ptr>
    Ptr *next()
    {
        // retrieve the next pointer token
        return reinterpret_cast<Ptr *>(strtoul(strtok(0, "|"), 0, 16));
    }
    void *pop()
    {
        // retrieve (and forget) a previously stored pointer
        void *ptr = next<void>();
        delete[] next<char>();
        return ptr;
    }
    // private constructor/destructor
    tls() { push(0); }
    ~tls() { pop(); }
public:
    static tls &singleton()
    {
        static tls i;
        return i;
    }
    void *set(void *ptr)
    {
        void *old = pop();
        push(ptr);
        return old;
    }
    void *get()
    {
        // forget and restore on each access
        void *ptr = pop();
        push(ptr);
        return ptr;
    }
};
This takes advantage of the fact that, according to the C++ standard, strtok stashes its first argument so that subsequent calls can pass 0 to retrieve further tokens from the same string. A thread-aware implementation would therefore have to keep that state in TLS. (That last step is implementation-specific: POSIX, for one, does not require strtok to be thread-safe.)
example *e = new example;
tls::singleton().set(e);
example *e2 = reinterpret_cast<example *>(tls::singleton().get());
So as long as strtok is not used in the intended way anywhere else in the program, we have another spare TLS slot.

I think you're onto something here. This might even be a portable way to get data into callbacks that don't accept a user "state" variable, as you've mentioned, even apart from any explicit use of threads.
So it sounds like you've answered the question in your subject: YES.

void *kid( void *local_v ) try {
thlocal &local = * static_cast< thlocal * >( local_v );
throw local;
} catch( thlocal & ) {
return NULL;
void *kid (void *local_v ) { print_thread(local_v); }
I might be missing something here, but it's not a thread local storage, just unnecessarily complicated argument passing. Argument is different for each thread only because it is passed to pthread_create, not because of any exception juggling.
It turned out that I was indeed missing the fact that GCC produces actual thread-local storage calls in this example, which makes the issue interesting. I'm still not sure whether that is the case for other compilers, or how it differs from calling thread-local storage directly.
I still stand by my general argument that the same data can be accessed in a simpler and more straightforward way, be it through arguments, stack walking, or thread-local storage.

Accessing data on the current function call stack is always thread safe. That's why your code is thread safe, not because of the clever use of exceptions. Thread local storage allows us to store per-thread data and reference it outside of the immediate call stack.
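For comparison, here is a minimal sketch of per-thread data that is reachable outside the immediate call stack, using the C++11 thread_local keyword (which the question's pthread code does not use). The names current_name, read_name_deep_in_call_stack and run_in_thread are made up for illustration.

```cpp
#include <cassert>
#include <string>
#include <thread>

// One independent copy of this variable exists per thread.
thread_local std::string current_name = "unset";

// Reads the per-thread value without receiving it as an argument and
// without any enclosing try/catch block on the call stack.
std::string read_name_deep_in_call_stack() {
    return current_name;
}

// Starts a thread that sets its own copy, reads it back, and reports it.
std::string run_in_thread(const std::string &name) {
    std::string seen;
    std::thread t([&seen, name] {
        current_name = name;                    // touches only this thread's copy
        seen = read_name_deep_in_call_stack();
    });
    t.join();
    return seen;
}
```

Calling run_in_thread("kid") from a thread that has set current_name to "main" leaves the caller's copy untouched, which is the property the exception trick is trying to emulate.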


An attempt to create atomic reference counting is failing with deadlock. Is this the right approach?

So I'm attempting to create a copy-on-write map that uses atomic reference counting on the read side to avoid locking.
Something isn't quite right. I see some references getting over-incremented and some are going down negative, so something isn't really atomic. In my tests I have 10 reader threads looping 100 times each doing a get() and 1 writer thread doing 100 writes.
It gets stuck in the writer because some of the references never go down to zero, even though they should.
I'm attempting to use the 128-bit DCAS technique explained in this blog.
Is there something blatantly wrong with this, or is there an easier way to debug it than stepping through it in the debugger?
typedef std::unordered_map<std::string, std::string> StringMap;
static const int zero = 0; //provides an l-value for asm code
class NonBlockingReadMapCAS {
class OctaWordMapWrapper {
StringMap* fStringMap;
//std::atomic<int> fCounter;
int64_t fCounter;
OctaWordMapWrapper(OctaWordMapWrapper* copy) : fStringMap(new StringMap(*copy->fStringMap)), fCounter(0) { }
OctaWordMapWrapper() : fStringMap(new StringMap), fCounter(0) { }
~OctaWordMapWrapper() {
delete fStringMap;
* Does a compare and swap on an octa-word - in this case, our two adjacent class members fStringMap
* pointer and fCounter.
static bool inline doubleCAS(OctaWordMapWrapper* target, StringMap* compareMap, int64_t compareCounter, StringMap* swapMap, int64_t swapCounter ) {
bool cas_result;
__asm__ __volatile__
"lock cmpxchg16b %0;" // cmpxchg16b sets ZF on success
"setz %3;" // if ZF set, set cas_result to 1
: "+m" (*target),
"+a" (compareMap), //compare target's stringmap pointer to compareMap
"+d" (compareCounter), //compare target's counter to compareCounter
"=q" (cas_result) //results
: "b" (swapMap), //swap target's stringmap pointer with swapMap
"c" (swapCounter) //swap target's counter with swapCounter
: "cc", "memory"
return cas_result;
OctaWordMapWrapper* atomicIncrementAndGetPointer()
if (doubleCAS(this, this->fStringMap, this->fCounter, this->fStringMap, this->fCounter +1))
return this;
return NULL;
OctaWordMapWrapper* atomicDecrement()
while(true) {
if (doubleCAS(this, this->fStringMap, this->fCounter, this->fStringMap, this->fCounter -1))
return this;
bool atomicSwapWhenNotReferenced(StringMap* newMap)
return doubleCAS(this, this->fStringMap, zero, newMap, 0);
std::atomic<OctaWordMapWrapper*> fReadMapReference;
pthread_mutex_t fMutex;
NonBlockingReadMapCAS() {
fReadMapReference = new OctaWordMapWrapper();
~NonBlockingReadMapCAS() {
delete fReadMapReference;
bool contains(const char* key) {
std::string keyStr(key);
return contains(keyStr);
bool contains(std::string &key) {
OctaWordMapWrapper *map;
do {
map = fReadMapReference.load()->atomicIncrementAndGetPointer();
} while (!map);
bool result = map->fStringMap->count(key) != 0;
return result;
std::string get(const char* key) {
std::string keyStr(key);
return get(keyStr);
std::string get(std::string &key) {
OctaWordMapWrapper *map;
do {
map = fReadMapReference.load()->atomicIncrementAndGetPointer();
} while (!map);
//std::cout << "inc " << map->fStringMap << " cnt " << map->fCounter << "\n";
std::string value = map->fStringMap->at(key);
return value;
void put(const char* key, const char* value) {
std::string keyStr(key);
std::string valueStr(value);
put(keyStr, valueStr);
void put(std::string &key, std::string &value) {
OctaWordMapWrapper *oldWrapper = fReadMapReference;
OctaWordMapWrapper *newWrapper = new OctaWordMapWrapper(oldWrapper);
std::pair<std::string, std::string> kvPair(key, value);
std::cout << oldWrapper->fCounter << "\n";
while (oldWrapper->fCounter > 0);
delete oldWrapper;
void clear() {
OctaWordMapWrapper *oldWrapper = fReadMapReference;
OctaWordMapWrapper *newWrapper = new OctaWordMapWrapper(oldWrapper);;
while (oldWrapper->fCounter > 0);
delete oldWrapper;
Maybe not the answer but this looks suspicious to me:
while (oldWrapper->fCounter > 0);
delete oldWrapper;
You could have a reader thread just entering atomicIncrementAndGetPointer() when the counter is 0 thus pulling the rug underneath the reader thread by deleting the wrapper.
Edit to sum up the comments below for potential solution:
The best implementation I'm aware of is to move fCounter from OctaWordMapWrapper to fReadMapReference (you don't actually need the OctaWordMapWrapper class at all). When the counter is zero, swap the pointer in your writer. Because high contention among reader threads can block the writer indefinitely, you can reserve the highest bit of fCounter as a reader lock: while this bit is set, readers spin until it is cleared. The writer sets this bit (__sync_fetch_and_or()) when it is about to change the pointer, waits for the counter to fall to zero (i.e. for existing readers to finish their work), then swaps the pointer and clears the bit.
This approach should be watertight, though it obviously blocks readers during writes. I don't know if this is acceptable in your situation; ideally you would like this to be non-blocking.
The code would look something like this (not tested!):
class NonBlockingReadMapCAS
NonBlockingReadMapCAS() :m_ptr(0), m_counter(0) {}
StringMap *acquire_read()
uint32_t counter=atom_inc(m_counter);
return m_ptr;
return 0;
void release_read()
void acquire_write()
uint32_t counter=atom_or(m_counter, 0x80000000);
void release_write()
atom_and(m_counter, uint32_t(0x7fffffff));
StringMap *volatile m_ptr;
volatile uint32_t m_counter;
Just call acquire/release_read/write() before & after accessing the pointer for read/write. Replace atom_inc/dec/or/and() with __sync_fetch_and_add(), __sync_fetch_and_sub(), __sync_fetch_and_or() and __sync_fetch_and_and() respectively. You don't need doubleCAS() for this actually.
As correctly noted by @Quuxplusone in a comment below, this is a single-producer, multiple-consumer implementation. I modified the code to assert that this holds.
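The counter-with-writer-bit scheme described in this answer can be sketched with std::atomic in place of the __sync builtins. This is an untested-in-production illustration, not the answerer's exact code: the class name ReadMostlyPtr, the exchange() and reader_count() members, and the template parameter T (standing in for StringMap) are assumptions made for the example, and it supports only a single writer.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Highest bit of the counter: set while the (single) writer is active.
const std::uint32_t kWriterBit = 0x80000000u;

template <class T>
class ReadMostlyPtr {
public:
    explicit ReadMostlyPtr(T *initial) : m_ptr(initial), m_counter(0) {}

    // Registers a reader and returns the current pointer.
    T *acquire_read() {
        for (;;) {
            std::uint32_t before = m_counter.fetch_add(1);
            if (!(before & kWriterBit))
                return m_ptr.load();
            m_counter.fetch_sub(1);   // back off: the writer is active
            while (m_counter.load() & kWriterBit) { /* spin until the writer is done */ }
        }
    }

    void release_read() { m_counter.fetch_sub(1); }

    // Blocks new readers, waits for current readers to drain, swaps the
    // pointer, then lets readers in again. Returns the old pointer.
    T *exchange(T *replacement) {
        std::uint32_t before = m_counter.fetch_or(kWriterBit);
        assert(!(before & kWriterBit));   // only one writer is supported
        while (m_counter.load() & ~kWriterBit) { /* spin until readers drain */ }
        T *old = m_ptr.exchange(replacement);
        m_counter.fetch_and(~kWriterBit);
        return old;
    }

    // Number of readers currently registered (writer bit masked out).
    std::uint32_t reader_count() const { return m_counter.load() & ~kWriterBit; }

private:
    std::atomic<T *> m_ptr;
    std::atomic<std::uint32_t> m_counter;
};
```

A reader brackets each lookup with acquire_read() and release_read(); the writer builds the new map off to the side, calls exchange(), and may delete the returned old map because no reader can still hold it at that point.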
Well, there are probably lots of problems, but here are the obvious two.
The most trivial bug is in atomicIncrementAndGetPointer. You wrote:
if (doubleCAS(this, this->fStringMap, this->fCounter, this->fStringMap, this->fCounter +1))
That is, you're attempting to increment this->fCounter in a lock-free way. But it doesn't work, because you're fetching the old value twice with no guarantee that the same value is read each time. Consider the following sequence of events:
Thread A fetches this->fCounter (with value 0) and computes argument 5 as this->fCounter +1 = 1.
Thread B successfully increments the counter.
Thread A fetches this->fCounter (with value 1) and computes argument 3 as this->fCounter = 1.
Thread A executes doubleCAS(this, this->fStringMap, 1, this->fStringMap, 1). It succeeds, of course, but we've lost the "increment" we were trying to do.
What you wanted is more like
StringMap* oldMap = this->fStringMap;
int64_t oldCounter = this->fCounter;
if (doubleCAS(this, oldMap, oldCounter, oldMap, oldCounter+1))
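The same "read once, then compare-and-swap" pattern can be sketched with the standard library. This example simplifies the question's 128-bit pointer-plus-counter pair to a single 64-bit counter, and the function name increment_with_cas is made up for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Increments `counter` by taking one snapshot per attempt and deriving both
// the expected value and the desired value from that same snapshot.
inline std::int64_t increment_with_cas(std::atomic<std::int64_t> &counter) {
    std::int64_t old_value = counter.load();
    // On failure, compare_exchange_weak writes the value it actually found
    // into old_value, so the next attempt starts from a fresh snapshot.
    while (!counter.compare_exchange_weak(old_value, old_value + 1)) {
    }
    return old_value + 1;
}
```

Because the expected and desired values come from one read, a concurrent increment between the read and the CAS makes the CAS fail and retry instead of silently losing the update.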
The other obvious problem is that there's a data race between get and put. Consider the following sequence of events:
Thread A begins to execute get: it fetches fReadMapReference.load() and prepares to execute atomicIncrementAndGetPointer on that memory address.
Thread B finishes executing put: it deletes that memory address. (It is within its rights to do so, because the wrapper's reference count is still at zero.)
Thread A starts executing atomicIncrementAndGetPointer on the deleted memory address. If you're lucky, you segfault, but of course in practice you probably won't.
As explained in the blog post:
The garbage collection interface is omitted, but in real applications you would need to scan the hazard pointers before deleting a node.
Another user has suggested a similar approach, but if you are compiling with gcc (and perhaps with clang), you could use the intrinsic __sync_add_and_fetch_4 which does something similar to what your assembly code does, and is likely much more portable.
I have used it when I implemented refcounting in an Ada library (but the algorithm remains the same).
int __sync_add_and_fetch_4 (int* ptr, int value);
// increments the value pointed to by ptr by value, and returns the new value
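A minimal sketch of reference counting with these builtins, assuming GCC or Clang (they are compiler extensions, not standard C++). It uses the generic __sync_add_and_fetch and __sync_sub_and_fetch forms; the Counted struct and the add_ref and release_ref helpers are hypothetical names.

```cpp
#include <cassert>

// Hypothetical object carrying its own reference count.
struct Counted {
    int refs;
    Counted() : refs(0) {}
};

// Atomically adds one reference and returns the new count.
inline int add_ref(Counted &c) {
    return __sync_add_and_fetch(&c.refs, 1);
}

// Atomically drops one reference; returns true if it was the last one.
inline bool release_ref(Counted &c) {
    return __sync_sub_and_fetch(&c.refs, 1) == 0;
}
```

The thread that sees release_ref() return true is the only one allowed to destroy the object, since every other holder has already given up its reference.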
Although I'm not sure how your reader threads work, I suspect your problem is that you are not catching and handling possible out_of_range exceptions in your get() method that might arise from this line: std::string value = map->fStringMap->at(key);. Note that if key is not found in the map, this will throw, and exit the function without decrementing the counter, which would lead to the condition you describe (of getting stuck in the while-loop within the writer thread while waiting for the counters to decrement).
In any event, whether this is the cause of the issues you're seeing or not, you definitely need to either handle this exception (and any others) or modify your code such that there's no risk of a throw. For the at() method, I would probably just use find() instead, and then check the iterator it returns. However, more generally, I would suggest using the RAII pattern to ensure that you don't let any unexpected exceptions escape without unlocking/decrementing. For example, you might check out boost::scoped_lock to wrap your fMutex and then write something simple like this for the OctaWordMapWrapper increment/decrement:
class ScopedAtomicMapReader
{
public:
    explicit ScopedAtomicMapReader(std::atomic<OctaWordMapWrapper*>& map) : fMap(NULL) {
        do {
            fMap = map.load()->atomicIncrementAndGetPointer();
        } while (NULL == fMap);
    }
    ~ScopedAtomicMapReader() {
        if (NULL != fMap)
            fMap->atomicDecrement();
    }
    OctaWordMapWrapper* map(void) {
        return fMap;
    }
private:
    OctaWordMapWrapper* fMap;
}; // class ScopedAtomicMapReader
With something like that, then for example, your contains() and get() methods would simplify to (and be immune to exceptions):
bool contains(std::string &key) {
    ScopedAtomicMapReader mapWrapper(fReadMapReference);
    return (mapWrapper.map()->fStringMap->count(key) != 0);
}
std::string get(std::string &key) {
    ScopedAtomicMapReader mapWrapper(fReadMapReference);
    return mapWrapper.map()->fStringMap->at(key); // Now it's fine if this throws...
}
Finally, although I don't think you should have to do this, you might also try declaring fCounter as volatile (given that the while-loop in put() reads it on a different thread from the reader threads that write to it).
Hope this helps!
By the way, one other minor thing: fReadMapReference is leaking. I think you should delete this in your destructor.

boost::shared_?? for non-pointer resources

Basically I need to do reference counting on certain resources (like an integer index) that do not map directly onto pointer/address semantics; I need to pass the resource around and call a custom function when the count reaches zero. Also, read/write access to the resource is not a simple pointer dereference but something more complex. I don't think boost::shared_ptr will fit the bill here, but maybe I'm missing some other Boost class I might use?
example of what i need to do:
struct NonPointerResource
NonPointerResource(int a) : rec(a) {}
int rec;
int createResource ()
data BasicResource("get/resource");
boost::shared_resource< NonPointerResource > r( BasicResource.getId() ,
boost::function< BasicResource::RemoveId >() );
TypicalUsage( r );
//when r goes out of scope, it will call BasicResource::RemoveId( NonPointerResource& ) or something similar
int TypicalUsage( boost::shared_resource< NonPointerResource > r )
data* d = access_object( r );
// do something with d
Allocate NonPointerResource on the heap and just give it a destructor as normal.
Maybe boost::intrusive_ptr could fit the bill. Here's a RefCounted base class and ancillary functions that I'm using in some of my code. Instead of delete ptr you can specify whatever operation you need.
struct RefCounted {
    int refCount;
    RefCounted() : refCount(0) {}
    virtual ~RefCounted() { assert(refCount==0); }
};
// boost::intrusive_ptr expects the following functions to be defined:
void intrusive_ptr_add_ref(RefCounted* ptr) { ++ptr->refCount; }
void intrusive_ptr_release(RefCounted* ptr) { if (!--ptr->refCount) delete ptr; }
With that in place you can then have
boost::intrusive_ptr<DerivedFromRefCounted> myResource = ...
It is possible to use shared_ptr<void> as a counted handle.
With suitable create/delete functions, shared_ptr<void> can act as a handle for practically any resource.
However, since the handle is weakly typed, using it is somewhat inconvenient.
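A minimal sketch of the counted-handle idea, using std::shared_ptr rather than boost::shared_ptr. The remove_id function stands in for the asker's BasicResource::RemoveId, and make_id_handle and id_of are hypothetical helpers written for this example.

```cpp
#include <cassert>
#include <memory>

// Hypothetical release function; records what it was asked to remove.
static int g_removed_id = -1;
static int g_remove_calls = 0;
inline void remove_id(int id) { g_removed_id = id; ++g_remove_calls; }

// Wraps an integer id in a shared_ptr<void>. The id lives in a small heap
// block; the deleter runs the release function and then frees that block.
inline std::shared_ptr<void> make_id_handle(int id) {
    return std::shared_ptr<void>(new int(id), [](int *stored) {
        remove_id(*stored);
        delete stored;
    });
}

// The weak typing in action: the caller has to know the handle holds an int.
inline int id_of(const std::shared_ptr<void> &handle) {
    return *static_cast<int *>(handle.get());
}
```

Copies of the handle share one count, and remove_id runs exactly once, when the last copy is destroyed or reset.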

Allocating memory for delayed event arguments

Here is my issue.
I have a class to create timed events. It takes in:
A function pointer of void (*func)(void* arg)
A void* to the argument
A delay
The issue is that I may want to create variables on the fly that are neither static members of the class nor globals. If the variable is neither of those, I can't do something like:
void doStuff(void *arg)
somebool = *(bool*)arg;
void makeIt()
bool a = true;
That won't work because the bool gets destroyed when the function returns, so I'd have to allocate these on the heap. The issue then becomes who allocates and who deletes. What I'd like is to be able to take in anything, copy its memory, and manage it in the timed event class. But I don't think I can use memcpy since I don't know the type.
What would be a good way to achieve this, where the timed event is responsible for memory management?
I do not use boost
class AguiTimedEvent {
void (*onEvent)(void* arg);
void* argument;
AguiWidgetBase* caller;
double timeStamp;
void call() const;
bool expired() const;
AguiWidgetBase* getCaller() const;
AguiTimedEvent(void(*Timefunc)(void* arg),void* arg, double timeSec, AguiWidgetBase* caller);
void AguiWidgetContainer::handleTimedEvents()
for(std::vector<AguiTimedEvent>::iterator it = timedEvents.begin(); it != timedEvents.end();)
it = timedEvents.erase(it);
void AguiWidgetBase::createTimedEvent( void (*func)(void* data),void* data,double timeInSec )
void AguiWidgetContainer::addTimedEvent( const AguiTimedEvent &timedEvent )
Why would you not use boost::shared_ptr?
It offers storage duration you require since an underlying object will be destructed only when all shared_ptrs pointing to it will have been destructed.
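Its reference count is also thread-safe, so copies can be handed to other threads; access to the pointed-to object itself still needs its own synchronization.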
Using C++0x unique_ptr is perfect for the job. This is a future standard, but unique_ptr is already supported under G++ and Visual Studio. For C++98 (current standard), auto_ptr works like a harder to use version of unique_ptr... For C++ TR1 (implemented in Visual Studio and G++), you can use std::tr1::shared_ptr.
Basically, you need a smart pointer. Here's how unique_ptr would work:
unique_ptr<bool> makeIt(){ // More commonly, called a "source"
    bool a = true;
    return unique_ptr<bool>(new bool(a)); // heap copy owned by the returned pointer
}
When you use the code later...
void someFunction(){
unique_ptr<bool> stuff = makeIt();
} // stuff is deleted here, because unique_ptr deletes
// things when they leave their scope
You can also use it as a function "sink"
void sink(unique_ptr<bool> ptr){
    // Use the pointer somehow
}
void somewhereElse(){
    unique_ptr<bool> stuff = makeIt();
    sink(std::move(stuff));
    // stuff is now deleted! Stuff points to null now
}
Aside from that, you can use unique_ptr like a normal pointer, aside from the strange movement rules. There are many smart pointers, unique_ptr is just one of them. shared_ptr is implemented in both Visual Studio and G++ and is the more typical ptr. I personally like to use unique_ptr as often as possible however.
If you can't use boost or tr1, then what I'd do is write my own function that behaves like auto_ptr. In fact that's what I've done on a project here that doesn't have any boost or tr1 access. When all of the events who care about the data are done with it it automatically gets deleted.
You can just change your function definition to take in an extra parameter that represents the size of the object passed in. Then just pass the size down. So your new function declarations looks like this:
void (*func)(void* arg, size_t size)
void doStuff(void *arg, size_t size)
somebool = *(bool*)arg;
memcpy( myStorage, arg, size ); // destination first: copy the argument into the event's own storage
void makeIt()
bool a = true;
container->createTimedEvent(doStuff,(void*)&a,sizeof(bool), 5);
Then you can pass variables that are still on the stack and memcpy them in the timed event class. The only problem is that you don't know the type any more... but that's what happens when you cast to void*
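A minimal sketch of that size-plus-memcpy idea, with the event owning a byte copy of the argument. CopyingTimedEvent is a hypothetical, simplified stand-in for AguiTimedEvent (no delay or caller), and the approach is only safe for trivially copyable types such as bool, int, or plain structs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Copies `size` bytes of the argument at construction, so the caller's
// variable may go out of scope before the event fires.
class CopyingTimedEvent {
public:
    typedef void (*Callback)(void *arg, std::size_t size);

    CopyingTimedEvent(Callback fn, const void *arg, std::size_t size)
        : fn_(fn), storage_(size) {
        if (size > 0)
            std::memcpy(&storage_[0], arg, size);   // destination, then source
    }

    // Invokes the callback with a pointer to the owned copy.
    void call() {
        fn_(storage_.empty() ? nullptr : &storage_[0], storage_.size());
    }

private:
    Callback fn_;
    std::vector<unsigned char> storage_;   // owned copy, freed by the vector
};

static bool g_somebool = false;

// The callback copies the bytes back out into a typed variable.
inline void doStuff(void *arg, std::size_t size) {
    bool value = false;
    if (size == sizeof value)
        std::memcpy(&value, arg, sizeof value);
    g_somebool = value;
}

inline CopyingTimedEvent makeIt() {
    bool a = true;   // destroyed when makeIt returns
    return CopyingTimedEvent(doStuff, &a, sizeof a);
}
```

Since the vector owns the bytes, nobody has to decide who deletes the argument; the price is that types with non-trivial copy constructors or destructors cannot be passed this way.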
Hope that helps.
You should re-work your class to use inheritance, not a function pointer.
class AguiEvent {
virtual void Call() = 0;
virtual ~AguiEvent() {}
class AguiTimedEvent {
std::auto_ptr<AguiEvent> event;
double timeSec;
AguiWidgetBase* caller;
AguiTimedEvent(std::auto_ptr<AguiEvent> ev, double time, AguiWidgetBase* base)
: event(ev)
, timeSec(time)
, caller(base) {}
void call() { event->Call(); }
// All the rest of it
void MakeIt() {
class someclass : AguiEvent {
bool MahBool;
someclass() { MahBool = false; }
void Call() {
// access to MahBool through this.
something->somefunc(AguiTimedEvent(new someclass())); // problem solved

Best Practice for Scoped Reference Idiom?

I just got burned by a bug that is partially due to my lack of understanding, and partially due to what I think is suboptimal design in our codebase. I'm curious as to how my 5-minute solution can be improved.
We're using ref-counted objects, where we have AddRef() and Release() on objects of these classes. One particular object is derived from the ref-count object, but a common function to get an instance of these objects (GetExisting) hides an AddRef() within itself without advertising that it is doing so. This necessitates doing a Release at the end of the functional block to free the hidden ref, but a developer who didn't inspect the implementation of GetExisting() wouldn't know that, and someone who forgets to add a Release at the end of the function (say, during a mad dash of bug-fixing crunch time) leaks objects. This, of course, was my burn.
void SomeFunction(ProgramStateInfo *P)
ThreadClass *thread = ThreadClass::GetExisting( P );
// some code goes here
bool result = UseThreadSomehow(thread);
// some code goes here
thread->Release(); // Need to do this because GetExisting() calls AddRef()
So I wrote up a little class to avoid the need for the Release() at the end of these functions.
class ThreadContainer
{
private:
    ThreadClass *m_T;
public:
    ThreadContainer(ThreadClass *T){ m_T = T; }
    ~ThreadContainer() { if(m_T) m_T->Release(); }
    ThreadClass * Thread() const { return m_T; }
};
So that now I can just do this:
void SomeFunction(ProgramStateInfo *P)
ThreadContainer ThreadC(ThreadClass::GetExisting( P ));
// some code goes here
bool result = UseThreadSomehow(ThreadC.Thread());
// some code goes here
// Automagic Release() in ThreadC Destructor!!!
What I don't like is that to access the thread pointer, I have to call a member function of ThreadContainer, Thread(). Is there some clever way that I can clean that up so that it's syntactically prettier, or would anything like that obscure the meaning of the container and introduce new problems for developers unfamiliar with the code?
Use boost::shared_ptr. It lets you supply your own deleter function, which can call Release() instead of delete.
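A minimal sketch of a shared_ptr whose deleter calls Release(), written with std::shared_ptr (boost::shared_ptr accepts a deleter in the same way). The ThreadClass below is a hypothetical stub of the question's class, and release_thread and get_existing_scoped are names invented for the example.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stub of the ref-counted class from the question.
struct ThreadClass {
    int refs;
    ThreadClass() : refs(0) {}
    void AddRef() { ++refs; }
    void Release() { --refs; }
    // Like the question's GetExisting(): returns a pointer after a hidden AddRef().
    static ThreadClass *GetExisting(ThreadClass &existing) {
        existing.AddRef();
        return &existing;
    }
};

// Deleter that drops the reference instead of calling delete.
inline void release_thread(ThreadClass *t) {
    if (t) t->Release();
}

// Pairs the hidden AddRef() with an automatic Release().
inline std::shared_ptr<ThreadClass> get_existing_scoped(ThreadClass &existing) {
    return std::shared_ptr<ThreadClass>(ThreadClass::GetExisting(existing),
                                        release_thread);
}
```

The returned pointer supports -> and * directly, so call sites need no accessor such as Thread().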
Yes, you can implement operator ->() for the class, which will recursively call operator ->() on whatever you return:
class ThreadContainer
{
private:
    ThreadClass *m_T;
public:
    ThreadContainer(ThreadClass *T){ m_T = T; }
    ~ThreadContainer() { if(m_T) m_T->Release(); }
    ThreadClass * operator -> () const { return m_T; }
};
It's effectively using smart pointer semantics for your wrapper class:
ThreadClass *t = new ThreadClass();
ThreadContainer tc(t);
tc->SomeThreadFunction(); // invokes tc->t->SomeThreadFunction() behind the scenes...
You could also write a conversion function to enable your UseThreadSomehow(ThreadContainer tc) type calls in a similar way.
If Boost is an option, I think you can set up a shared_ptr to act as a smart reference as well.
Take a look at ScopeGuard. It allows syntax like this (shamelessly stolen from that link):
{
    FILE* topSecret = fopen("cia.txt");
    ON_BLOCK_EXIT(std::fclose, topSecret);
    ... use topSecret ...
} // topSecret automagically closed
Or you could try Boost::ScopeExit:
void World::addPerson(Person const& aPerson) {
    bool commit = false;
    m_persons.push_back(aPerson); // (1) direct action
    BOOST_SCOPE_EXIT( (&commit)(&m_persons) )
    {
        if(!commit)
            m_persons.pop_back(); // (2) rollback action
    } BOOST_SCOPE_EXIT_END
    // ... // (3) other operations
    commit = true; // (4) turn all rollback actions into no-op
}
I would recommend following bb advice and using boost::shared_ptr<>. If boost is not an option, you can take a look at std::auto_ptr<>, which is simple and probably addresses most of your needs. Take into consideration that the std::auto_ptr has special move semantics that you probably don't want to mimic.
The approach is providing both the * and -> operators together with a getter (for the raw pointer) and a release operation in case you want to release control of the inner object.
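A minimal sketch of a wrapper with that interface: the * and -> operators, a get() accessor, and a release() that gives up ownership. ScopedRef and the RefObject stub are hypothetical names, and the wrapper assumes the wrapped type exposes a Release() member.

```cpp
#include <cassert>

// Hypothetical ref-counted type standing in for ThreadClass.
struct RefObject {
    int refs;
    RefObject() : refs(1) {}   // pretend GetExisting() already called AddRef()
    void Release() { --refs; }
    int value() const { return 7; }
};

// Owns one reference and drops it when the wrapper goes out of scope.
template <class T>
class ScopedRef {
public:
    explicit ScopedRef(T *p) : p_(p) {}
    ~ScopedRef() { if (p_) p_->Release(); }

    T &operator*() const { return *p_; }
    T *operator->() const { return p_; }
    T *get() const { return p_; }

    // Gives up ownership: the caller becomes responsible for Release().
    T *release() {
        T *p = p_;
        p_ = nullptr;
        return p;
    }

    // Non-copyable, so the same reference is never released twice.
    ScopedRef(const ScopedRef &) = delete;
    ScopedRef &operator=(const ScopedRef &) = delete;

private:
    T *p_;
};
```

Unlike std::auto_ptr, this sketch forbids copying outright instead of transferring ownership on copy, which sidesteps the surprising move-on-copy behaviour mentioned above.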
You can add an automatic type-cast operator to return your raw pointer. This approach is used by Microsoft's CString class to give easy access to the underlying character buffer, and I've always found it handy. There might be some unpleasant surprises to be discovered with this method, as in any time you have an implicit conversion, but I haven't run across any.
class ThreadContainer
{
private:
    ThreadClass *m_T;
public:
    ThreadContainer(ThreadClass *T){ m_T = T; }
    ~ThreadContainer() { if(m_T) m_T->Release(); }
    operator ThreadClass *() const { return m_T; }
};
void SomeFunction(ProgramStateInfo *P)
ThreadContainer ThreadC(ThreadClass::GetExisting( P ));
// some code goes here
bool result = UseThreadSomehow(ThreadC);
// some code goes here
// Automagic Release() in ThreadC Destructor!!!

Dynamically created scope guards

I've read the article about scope guards (Generic: Change the Way You Write Exception-Safe Code — Forever) in DDJ and I understand their common use.
However, the common use is to instantiate a particular stack guard on the stack for a particular operation, e.g.:
{
    FILE* topSecret = fopen("cia.txt");
    ON_BLOCK_EXIT(std::fclose, topSecret);
    ... use topSecret ...
} // topSecret automagically closed
but what if I want to schedule cleanup operations in runtime, e.g. when I have a loop:
vector<FILE*> topSecretFiles;
for (int i=0; i<numberOfFiles; ++i)
char filename[256];
sprintf(filename, "cia%d.txt", i);
FILE* topSecret = fopen(filename);
ON_BLOCK_EXIT(std::fclose, topSecret); // no good
Obviously, the above example wouldn't work, since topSecret would be closed along with the for scope. I'd like a scope guard pattern where I can just as easily queue up cleanup operations which I determine to be needed at runtime. Is there something like this available?
I can't push scope guard objects into a standard queue, because the original object (the one I'm pushing) would be dismissed in the process. How about pushing heap-allocated scope guards and using a queue that deletes its members in its destructor? Does anyone have a more clever approach?
It seems you don't appreciate RAII for what it is. These scope guards are nice on occasion for local ("scope") things but you should try to avoid them in favour of what RAII is really supposed to do: encapsulating a resource in an object. The type FILE* is really just not good at that.
Here's an alternative:
void foo() {
typedef std::tr1::shared_ptr<FILE> file_sptr;
vector<file_sptr> bar;
for (...) {
file_sptr fsp ( std::fopen(...), std::fclose );
void foo() {
typedef std::tr1::shared_ptr<std::fstream> stream_sptr;
vector<stream_sptr> bar;
for (...) {
stream_sptr fsp ( new std::fstream(...) );
Or in "C++0x" (upcoming C++ standard):
void foo() {
vector<std::fstream> bar;
for (...) {
// streams will become "movable"
bar.push_back( std::fstream(...) );
Edit: Since I like movable types in C++0x so much and you showed interest in it: Here's how you could use unique_ptr in combination with FILE* without any ref-counting overhead:
struct file_closer {
    void operator()(FILE* f) const { if (f) std::fclose(f); }
};
typedef std::unique_ptr<FILE,file_closer> file_handle;
file_handle source() {
    file_handle fh ( std::fopen(...) );
    return fh;
}
int sink(file_handle fh) {
    return std::fgetc( fh.get() );
}
int main() {
    return sink( source() );
}
Be sure to check out Dave's blog on efficient movable value types
Huh, turns out the DDJ scope guard is "movable", not in the C++0x sense, but in the same sense that an auto_ptr is movable: during the copy ctor, the new guard "dismisses" the old guard (like auto_ptr's copy ctor calls the old one's auto_ptr::release).
So I can simply keep a queue<ScopeGuard> and it'll work:
queue<ScopeGuard> scopeGuards;
// ...
for (...)
// the temporary scopeguard is being neutralized when copied into the queue,
// so it won't cause a double call of cleanupFunc
scopeGuards.push(MakeScopeGuard(cleanupFunc, arg1)); // std::queue offers push(), not push_back()
// ...
By the way, thank you for the answer above. It was informative and educational to me in different ways.
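For cleanup actions that are only known at run time, one option is a small container that stores callables and runs them in reverse order on destruction. The sketch below uses C++11 std::function rather than the DDJ ScopeGuard; the CleanupStack class and its add, dismiss and run_all members are hypothetical names.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Collects cleanup actions at run time and runs them in reverse order of
// registration when the object is destroyed or run_all() is called.
class CleanupStack {
public:
    CleanupStack() {}
    ~CleanupStack() { run_all(); }

    // Registers one more action; it may capture whatever it needs by value.
    void add(std::function<void()> action) {
        actions_.push_back(std::move(action));
    }

    // Cancels every pending action, like ScopeGuard's Dismiss().
    void dismiss() { actions_.clear(); }

    void run_all() {
        while (!actions_.empty()) {
            std::function<void()> action = std::move(actions_.back());
            actions_.pop_back();
            action();   // assumed not to throw, since this also runs from the destructor
        }
    }

    CleanupStack(const CleanupStack &) = delete;
    CleanupStack &operator=(const CleanupStack &) = delete;

private:
    std::vector<std::function<void()>> actions_;
};
```

In the file loop from the question, each iteration would register something like a lambda that calls std::fclose on that iteration's FILE*, and all files would be closed when the CleanupStack leaves scope.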