Kill std::thread while reading a big file - c++

I have a std::thread function that is calling fopen to load a big file into an array:
void loadfile(char *fname, char *fbuffer, long fsize)
{
FILE *fp = fopen(fname, "rb");
fread(fbuffer, 1, fsize, fp);
fclose(fp);
}
This is called by:
std::thread loader(loadfile, fname, fbuffer, fsize);
loader.detach();
At some point, something in my program wants to stop reading that file and asks for another file. The problem is that by the time I delete the fbuffer pointer, the loader thread is still going, and I get a race condition that throws an exception.
How can I kill that thread? My idea was to check for the existence of the fbuffer and maybe split the fread into small chunks:
void loadfile(char *fname, char *fbuffer, long fsize)
{
FILE *fp = fopen(fname, "rb");
long ch = 0;
while (ch < fsize)
{
if (fbuffer == NULL) break;
long n = (fsize - ch < 256) ? (fsize - ch) : 256;
fread(fbuffer + ch, 1, n, fp);
ch += n;
}
fclose(fp);
}
Will this slow down the reading of the file? Do you have a better idea?

You should avoid killing a thread at all costs. Doing so causes evil things to happen, like resources left in a permanently locked state.
The thread must be given a reference to a flag, the value of which can be set from elsewhere, to tell the thread to voluntarily quit.
You cannot use a buffer for this purpose; if one thread deletes the memory of the buffer while the other is writing to it, very evil things will happen. (Memory corruption.) So, pass a reference to a boolean flag.
Of course, in order for the thread to be able to periodically check the flag, it must have small chunks of work to do, so splitting your freads to small chunks was a good idea.
256 bytes might be a bit too small though; definitely use 4k or more, perhaps even 64k.

Killing threads is usually not the way to go - doing this may lead to leaked resources, critical sections you cannot exit and inconsistent program state.
Your idea is almost spot-on, but you need a way to signal the thread to finalize. You can use a boolean shared between your thread and the rest of your code that your thread checks after every read; once it is set, the thread stops reading into the buffer, cleans up the file handles and exits cleanly.
On another note, handling the deletion of pointers with owning semantics by yourself is most of the time frowned upon in modern C++ - unless you have a very good reason not to, I'd recommend using the stl fstream and string classes.

You need proper thread synchronization. The comments about resource leaks and the proposal by @Mike Nakis about making the thread exit voluntarily by setting a boolean are almost correct (well, they're correct, but not complete). You need to go even further than that.
You must ensure not only that the loader thread exits on its own, you must also ensure that it has exited before you delete the buffer it is writing to. Or, at least, you must ensure that it isn't ever touching that buffer in any way after you deleted it. Checking the pointer for null-ness does not work for two reasons. First, it doesn't work anyway, since you are looking at a copy of the original pointer (you would have to use a pointer-pointer or a reference). Second, and more importantly, even if the check worked, there is a race condition between the if statement and fread. In other words, there is no way to guarantee that you aren't freeing the buffer while fread is accessing it (no matter how small you make your chunks).
At the very minimum, you need two boolean flags, but preferably you would use a proper synchronization primitive such as a condition variable to notify the main thread (so you don't have to spin waiting for the loader to exit, but can block).
The correct way of operation would be:
Notify loader thread
Wait for loader thread to signal me (block on cond var)
Loader thread picks up notification, sets condition variable and never touches the buffer afterwards, then exits
Resume (delete buffer, allocate new buffer, etc)
If you do not insist on detaching the loader thread, you could instead simply join it after telling it to exit (so you would not need a cond var).
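The join-based variant can be sketched roughly as follows. This is only a minimal sketch using the standard library, not the asker's actual code: `load_file` and `LoadJob` are hypothetical names, the 64 KiB chunk size is an arbitrary choice, and the buffer is owned by the same object as the thread so that it cannot be freed before the loader has exited.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Reads up to buffer.size() bytes from `path` in 64 KiB chunks and checks
// `cancel` between chunks. Returns the number of bytes actually read.
std::size_t load_file(const char* path, std::vector<char>& buffer,
                      const std::atomic<bool>& cancel)
{
    std::FILE* fp = std::fopen(path, "rb");
    if (!fp) return 0;

    const std::size_t chunk = 64 * 1024;
    std::size_t done = 0;
    while (done < buffer.size() && !cancel.load()) {
        std::size_t want = buffer.size() - done;
        if (want > chunk) want = chunk;
        const std::size_t got = std::fread(buffer.data() + done, 1, want, fp);
        done += got;
        if (got < want) break;  // end of file or read error
    }
    std::fclose(fp);
    return done;
}

// Hypothetical owner of buffer, flag and thread. The destructor always
// stops and joins, so the buffer outlives every write the loader makes.
struct LoadJob {
    std::vector<char> buffer;
    std::atomic<bool> cancel{false};
    std::size_t bytes_read = 0;
    std::thread worker;  // declared last so the members above exist first

    LoadJob(std::string path, std::size_t size) : buffer(size) {
        assert(size > 0);
        worker = std::thread([this, p = std::move(path)] {
            bytes_read = load_file(p.c_str(), buffer, cancel);
        });
    }
    // Wait for the loader to finish on its own.
    void wait() { if (worker.joinable()) worker.join(); }
    // Ask the loader to stop, then wait until it has really exited.
    void stop() { cancel.store(true); wait(); }
    ~LoadJob() { stop(); }
};
```

To switch files, the caller would destroy (or `stop()`) the current `LoadJob` and create a new one; the old buffer is released only after `join()` has returned, which is the ordering the steps above ask for.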

Related

Do I need to call CloseHandle?

I have code that creates a thread and closes it with CloseHandle when the program finishes.
int main()
{
....
HANDLE hth1;
unsigned uiThread1ID;
hth1 = (HANDLE)_beginthreadex( NULL, // security
0, // stack size
ThreadX::ThreadStaticEntryPoint,
o1, // arg list
CREATE_SUSPENDED, // so we can later call ResumeThread()
&uiThread1ID );
....
CloseHandle( hth1 );
....
}
But why do I need to close the handle at all? What will happen if I will not do so?
But why do I need to close handle at all?
Handles are a limited resource that occupy both kernel and userspace memory. Keeping a handle alive not only takes an integer worth of storage, but also means that the kernel has to keep the thread information (such as user time, kernel time, thread ID, exit code) around, and it cannot recycle the thread ID since you might query it using that handle.
Therefore, it is best practice to close handles when they are no longer needed.
This is what you are to do per the API contract (but of course you can break that contract).
What will happen if I will not do so?
Well, to be honest... nothing. You will leak that handle, but for one handle the impact will not be measurable. When your process exits, Windows will close the handle.
Still, in the normal case, you should close handles that are not needed any more just like you would free memory that you have allocated (even though the operating system will release it as well when your process exits).
Although it may even be considered an "optimization" not to explicitly free resources, in order to have a correct program, this should always be done. Correctness first, optimization second.
Also, the sooner you release a resource (no matter how small it is) the sooner it is available again to the system for reuse.
Here is a topic that answers your question:
Can I call CloseHandle() immediately after _beginthreadex() succeeded?
Greetings.
Yes, you need to close the handle at some point, otherwise you will be leaking a finite OS resource.
May I suggest creating an RAII wrapper for your handles? Basically you write a wrapper that stores the handle you created and calls CloseHandle in its destructor. That way you never have to worry about remembering to close it (it will automatically close when it goes out of scope) or leaking a handle if an exception happens between opening the handle and closing it.
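A minimal sketch of such a wrapper is shown here. It is written generically so that it does not depend on `<windows.h>`: `ScopedHandle` is a hypothetical name, and the handle type, the "invalid" value and the close function are all supplied by the caller. On Windows one would presumably instantiate it with `HANDLE`, a null handle as the invalid value, and `CloseHandle` as the closer.

```cpp
#include <cassert>
#include <utility>

// Generic guard for a handle-like resource. HandleT is the handle type and
// CloseFn a callable that releases it; `invalid` marks "no handle held".
template <typename HandleT, typename CloseFn>
class ScopedHandle {
public:
    ScopedHandle(HandleT handle, HandleT invalid, CloseFn close)
        : handle_(handle), invalid_(invalid), close_(std::move(close)) {}

    // Closing happens here, including when an exception unwinds the scope.
    ~ScopedHandle() { reset(); }

    // A handle must be closed exactly once, so copying is disabled.
    ScopedHandle(const ScopedHandle&) = delete;
    ScopedHandle& operator=(const ScopedHandle&) = delete;

    // Moving transfers ownership and leaves the source empty.
    ScopedHandle(ScopedHandle&& other)
        : handle_(other.handle_), invalid_(other.invalid_),
          close_(std::move(other.close_))
    {
        other.handle_ = other.invalid_;
    }

    HandleT get() const { return handle_; }
    bool valid() const { return handle_ != invalid_; }

    // Close early if a handle is held; safe to call more than once.
    void reset() {
        if (valid()) {
            close_(handle_);
            handle_ = invalid_;
        }
    }

private:
    HandleT handle_;
    HandleT invalid_;
    CloseFn close_;
};
```

With something like this in place, the explicit `CloseHandle( hth1 );` call at the end of `main` becomes unnecessary, because the wrapper's destructor performs it when the variable goes out of scope.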
If you don't close the handle, then it will remain open until your process terminates. Depending on what's behind the handle, this can be bad. Resources are often associated to handles and these won't be cleaned up until your program terminates; if you only use a few handles and these happen to be 'lightweight' then it doesn't really matter. Other handles, such as file handles, have other side-effects to keeping the handle opened, for instance locking an opened file until your process exits. This can be very annoying to the user or other applications.
In general, it's best to clean-up all handles, but at the end of the process, all handles are closed by Windows.

C++ - Should data passed to a thread be volatile?

In Microsoft Visual C++ I can call CreateThread() to create a thread by starting a function with one void * parameter. I pass a pointer to a struct as that parameter, and I see a lot of other people do that as well.
My question is if I am passing a pointer to my struct how do I know if the structure members have been actually written to memory before CreateThread() was called? Is there any guarantee they won't be just cached? For example:
struct bigapple { string color; int count; } apple;
apple.count = 1;
apple.color = "red";
hThread = CreateThread( NULL, 0, myfunction, &apple, 0, NULL );
DWORD WINAPI myfunction( void *param )
{
struct bigapple *myapple = (struct bigapple *)param;
// how do I know that apple's struct was actually written to memory before CreateThread?
cout << "Apple count: " << myapple->count << endl;
}
This afternoon while I was reading I saw a lot of Windows code on this website and others that passes in data that is not volatile to a thread, and there doesn't seem to be any memory barrier or anything else. I know C++ or at least older revisions are not "thread aware" so I'm wondering if maybe there's some other reason. My guess would be the compiler sees that I've passed a pointer &apple in a call to CreateThread() so it knows to write out members of apple before the call.
Thanks
No. The relevant Win32 thread functions all take care of the necessary memory barriers. All writes prior to CreateThread are visible to the new thread. Obviously the reads in that newly created thread cannot be reordered before the call to CreateThread.
volatile would not add any extra useful constraints on the compiler, and would merely slow down the code. In practice this wouldn't be noticeable compared to the cost of creating a new thread, though.
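The same guarantee exists in portable C++11 and later: the completion of the `std::thread` constructor synchronizes with the start of the thread function, and the end of the thread synchronizes with `join()` returning. The sketch below uses `std::thread` instead of `CreateThread`, which is a substitution made here for illustration; `BigApple` and `describe_in_thread` are hypothetical names.

```cpp
#include <cassert>
#include <string>
#include <thread>

struct BigApple {
    std::string color;
    int count = 0;
};

// Fills in a struct, starts a thread that reads it, and returns what the
// thread observed. No volatile and no explicit barrier is involved.
std::string describe_in_thread()
{
    BigApple apple;
    apple.count = 1;
    apple.color = "red";

    std::string seen;
    // Writes made before the thread is created are visible inside it.
    std::thread reader([&apple, &seen] {
        seen = apple.color + ":" + std::to_string(apple.count);
    });
    // Writes made by the thread are visible here once join() returns.
    reader.join();
    return seen;
}
```

Calling `describe_in_thread()` returns `"red:1"`: the reader thread sees both members exactly as they were written before it was started.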
No, it should not be volatile. At the same time, you are pointing at a valid issue. The detailed operation of the cache is described in the Intel/ARM/etc. papers.
Nevertheless you can safely assume that the data WILL BE WRITTEN. Otherwise too many things would be broken. Several decades of experience show that this is so.
If the thread scheduler starts the thread on the same core, the state of the cache will be fine; if not, the kernel will flush the cache. Otherwise, nothing would work.
Never use volatile for interaction between threads. It is an instruction on how to handle data inside the thread only (use a register copy or always reread, etc).
First, I think the optimizer cannot change the order at the expense of correctness. CreateThread() is a function, and parameter binding for function calls happens before the call is made.
Secondly, volatile is not very helpful for the purpose you intend. Check out this article.
You're struggling with a non-problem, and are creating at least two others...
Don't worry about the parameters given to CreateThread: if they exist at the time the thread is created, they exist until CreateThread returns. And since the thread that creates them does not destroy them, they are also available to the other thread.
The problem now becomes who will destroy them, and when: you create them with new, so they will exist until delete is called (or until the process terminates: good memory leak!).
The process terminates when its main thread terminates (and all other threads will be terminated by the OS as well!). And there is nothing in your main that makes it wait for the other thread to complete.
Beware when using a low-level API like CreateThread from languages that have their own library that also interfaces with threads. The C runtime has _beginthreadex. It calls CreateThread and also performs other initialization tasks for the C/C++ library that you would otherwise miss. Some C (and C++) library functions may not work properly without those initializations, which are also required to properly free the runtime resources at termination. Using CreateThread is like using malloc in a context where delete is then used to clean up.
The proper main thread behavior should be
// create the data
// create the other thread
// perform other tasks
// wait for the other thread to terminate
// destroy the data
What the Win32 API documentation doesn't say clearly is that a thread HANDLE is waitable, and becomes signaled when the thread terminates.
To wait for the other thread's termination, your main thread will just have to call
WaitForSingleObject(hthread,INFINITE);
So the main thread will be more properly:
{
data* pdata = new data;
HANDLE hthread = (HANDLE)_beginthreadex(0,0,yourprocedure, pdata,0,0);
WaitForSingleObject(hthread,INFINITE);
delete pdata;
}
or even
{
data d;
HANDLE hthread = (HANDLE)_beginthreadex(0,0,yourprocedure, &d,0,0);
WaitForSingleObject(hthread,INFINITE);
}
I think the question is valid in another context.
As others have pointed out, using a struct and its contents is safe (although access to the data should be synchronized).
However, I think that the question is valid if you have an atomic variable (or a pointer to one) that can be changed outside the thread. My opinion is that volatile should be used in that case.
Edit:
I think the examples on the wiki page are a good explanation http://en.wikipedia.org/wiki/Volatile_variable

Asynchronous File I/O in C++

I can't find information about asynchronous reading and writing in C++, so I wrote my own code. The read() function works correctly, but synchronization doesn't: the sync() function doesn't wait for the end of reading.
In my opinion, the variable state_read has an incorrect value in the thread. Please help me understand why.
struct IOParams{
char* buf;
unsigned int nBytesForRead;
FILE* fp;
};
struct AsyncFile {
FILE* fp;
bool state_read;
HANDLE hThreadRead;
IOParams read_params;
void AsyncFile::read(char* buf, unsigned int nBytesForRead){
sync();
read_params.buf = buf;
read_params.fp = fp;
read_params.nBytesForRead = nBytesForRead;
hThreadRead = CreateThread(0,0,ThreadFileRead,this,0,0);
}
void AsyncFile::sync() {
if (state_read) {
WaitForSingleObject(hThreadRead,INFINITE);
CloseHandle(hThreadRead);
}
state_read = false;
}
};
DWORD WINAPI ThreadFileRead(void* lpParameter) {
AsyncFile* asf = (AsyncFile*)lpParameter;
asf->setReadState(true);
IOParams & read_params = *asf->getReadParams();
fread(read_params.buf, 1, read_params.nBytesForRead, read_params.fp);
asf->setReadState(false);
return 0;
}
Maybe you know how to write the asynchronous reading in more reasonable way.
Since your question is tagged "Windows", you might look into FILE_FLAG_OVERLAPPED and ReadFileEx, which do asynchronous reading without extra threads (synchronisation via an event, a callback, or a completion port).
If you insist on using a separate loader thread (there may be valid reasons for that, though few), you do not want to read and write a flag repeatedly from two threads and use that for synchronisation. Although your code looks correct, the mere fact that it does not work as intended shows that it's a bad idea.
Always use a proper synchronisation primitive (event or semaphore) for synchronisation, do not tamper with some flag that's (possibly inconsistently) written and read from different threads.
Alternatively, if you don't want an extra event object, you could always wait on the thread to die, unconditionally (but, read the next paragraph).
Generally, spawning a thread and letting it die for every read is not a good design. Not only is spawning a thread considerable overhead (both for CPU and memory), it can also introduce hard to predict "funny effects" and turn out to be a total anti-optimization. Imagine for example having 50 threads thrashing the harddrive on seeks, all of them trying to get a bit of it. This will be asynchronous for sure, but it will be a hundred times slower, too.
Using a small pool of workers (emphasis on small) will probably be a much superior design, if you do not want to use the operating system's native asynchronous mechanisms.
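As a portable illustration of "use a proper synchronisation primitive instead of a flag", the sketch below hands the read to `std::async` and uses the returned `std::future` as the only synchronisation point. This is an assumption-laden stand-in, not the Windows overlapped I/O recommended above: it still uses a thread per read, and `read_async` is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <future>
#include <string>
#include <utility>
#include <vector>

// Starts reading up to `max_bytes` from `path` on another thread and returns
// a future for the bytes read. An unreadable file yields an empty vector.
std::future<std::vector<char>> read_async(std::string path, std::size_t max_bytes)
{
    return std::async(std::launch::async, [path = std::move(path), max_bytes] {
        std::vector<char> data(max_bytes);
        std::size_t got = 0;
        if (std::FILE* fp = std::fopen(path.c_str(), "rb")) {
            got = std::fread(data.data(), 1, data.size(), fp);
            std::fclose(fp);
        }
        data.resize(got);  // keep only what was actually read
        return data;
    });
}
```

The caller keeps the future, does other work, and calls `get()` when it needs the data; `get()` blocks until the read has finished, so there is no `state_read` flag that could be observed at the wrong moment.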

Updating global variables from a single worker thread: Do I need mutexes?

It seems that this question gets asked frequently, but I am not coming to any definitive conclusion. I need a little help on determining whether or not I should (or must!) implement locking code when accessing/modifying global variables when I have:
global variables defined at file scope
a single "worker" thread reading/writing to global variables
calls from the main process thread calling accessor functions which return these globals
So the question is, should I be locking access to my global variables with mutexes?
More specifically, I am writing a C++ library which uses a webcam to track objects on a page of paper -- computer vision is CPU intensive, so performance is critical. I have a single worker thread which is spun off in an Open() function. This thread handles all of the object tracking. It is terminated (indirectly w/global flag) when a Close() function is called.
It feels like I'm just asking for memory corruption, but I have observed no deadlock issues nor have I experienced any bad values returned from these accessor functions. And after several hours of research, the general impression I get is, "Meh, probably. Whatever. Have fun with that." If I indeed should be using mutexes, why have I not experienced any problems yet?
Here is an over-simplification on my current program:
// *********** lib.h ***********
// Structure definitions
struct Pointer
{
int x, y;
};
// more...
// API functions
Pointer GetPointer();
void Start();
void Stop();
// more...
The implementation looks like this...
// *********** lib.cpp ***********
// Globals
Pointer p1;
bool isRunning = false;
HANDLE hWorkerThread;
// more...
// API functions
Pointer GetPointer()
{
// NOTE: my current implementation is actually returning a pointer to the
// global object in memory, not a copy of it, like below...
// Return copy of pointer data
return p1;
}
// more "getters"...
void Open()
{
// Create worker thread -- continues until Close() is called by API user
hWorkerThread = CreateThread(NULL, 0, DoWork, NULL, 0, NULL);
}
void Close()
{
isRunning = false;
// Wait for the thread to close nicely or else you WILL get nasty
// deadlock issues on close
WaitForSingleObject(hWorkerThread, INFINITE);
}
DWORD WINAPI DoWork(LPVOID lpParam)
{
while (isRunning)
{
// do work, including updating 'p1' about 10 times per sec
}
return 0;
}
Finally, this code is being called from an external executable. Something like this (pseudocode):
// *********** main.cpp ***********
int main()
{
Open();
while ( <esc not pressed> )
{
Pointer p = GetPointer();
<wait 50ms or so>
}
Close();
}
Is there perhaps a different approach I should be taking? This non-issue issue is driving me nuts today :-/ I need to ensure this library is stable and returning accurate values. Any insight would be greatly appreciated.
Thanks
If only one thread accesses an object (both read and write) then no locks are required.
If an object is read only then no locks are required (assuming you can guarantee that only one thread accesses the object during construction).
If any thread writes to (changes the state of) an object and other threads access that object, then ALL accesses (both read and write) must be locked. You may use read locks that allow multiple readers, but write operations must be exclusive, and no readers can access the object while the state is being changed.
I suppose it depends on what you are doing in your DoWork() function. Let's assume it writes a point value to p1. At the very least you have the following possibility of a race condition that will return invalid results to the main thread:
Suppose the worker thread wants to update the value of p1. For example, lets change the value of p1 from (A, B) to (C, D). This will involve at least two operations, store C in x and store D in y. If the main thread decides to read the value of p1 in the GetPointer() function, it must perform at least two operations also, load value for x and load value for y. If the sequence of operations is:
update thread: store C
main thread: load x (main thread receives C)
main thread: load y (main thread receives B)
update thread: store D
The main thread will get the point (C, B), which is not correct.
This particular problem is not a good use of threads, since the main thread isn't doing any real work. I would use a single thread, and an API like WaitForMultipleObjectsEx which allows you to simultaneously wait for input from the keyboard stdin handle, an I/O event from the camera, and a timeout value.
You won't get a deadlock, but you may see some occasional bad value with an extremely low probability: since reading and writing take fractions of a nanosecond, and you only read the variable 50 times per second, the chance of a collision is something like 1 in 50 million.
If this is happening on an Intel 64, "Pointer" is aligned to an 8-byte boundary, and it is read and written in one operation (all 8 bytes with one assembly instruction), then accesses are atomic and you don't need a mutex.
If any of those conditions is not satisfied, there's a possibility that the reader will get bad data.
I'd put a mutex just to be on the safe side, since it's only going to be used 50 times a second and it's not going to be a performance issue.
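A minimal sketch of that mutex, using `std::mutex` from the standard library rather than a Win32 primitive (an assumption made here for portability). `SetPointer`, `g_pointer` and `reads_are_consistent` are hypothetical names; only `Pointer` and `GetPointer` come from the question.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

struct Pointer {
    int x, y;
};

namespace {
std::mutex g_pointer_mutex;  // guards g_pointer
Pointer g_pointer = {0, 0};
}

// Called by the worker thread: both fields change under one lock, so a
// reader can never observe a new x together with an old y.
void SetPointer(int x, int y)
{
    std::lock_guard<std::mutex> lock(g_pointer_mutex);
    g_pointer.x = x;
    g_pointer.y = y;
}

// Called by the API user: returns a copy taken while holding the lock.
Pointer GetPointer()
{
    std::lock_guard<std::mutex> lock(g_pointer_mutex);
    return g_pointer;
}

// Small self-check: the writer publishes (i, i), so a torn read would show
// up as x != y. Returns true if no such read was seen.
bool reads_are_consistent(int iterations)
{
    assert(iterations > 0);
    bool ok = true;
    std::thread writer([iterations] {
        for (int i = 1; i <= iterations; ++i) SetPointer(i, i);
    });
    for (int i = 0; i < iterations; ++i) {
        const Pointer p = GetPointer();
        if (p.x != p.y) ok = false;
    }
    writer.join();
    return ok;
}
```

Returning a copy from `GetPointer` matters here: handing out a pointer to the global, as the question's note says the real implementation does, would let the caller read the fields outside the lock.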
The situation's pretty clear cut - the reader may not see updates until something triggers synchronisation (a mutex, memory barrier, atomic operation...). Many things processes do implicitly trigger such synchronisation, e.g. external function calls (for reasons explained in the Usenet threading FAQ (http://www.lambdacs.com/cpt/FAQ.html); see Dave Butenhof's answer re the need for volatile), so if your code is dealing in values that are small enough that they can't be half-written (e.g. numbers rather than strings, fixed addresses rather than dynamic (re)allocations) then it can limp along without explicit syncs.
If your idea of performance is getting more loops through your write code, then you'll get a nicer number if you leave out the synchronisation. But if you're interested in minimising the average and worst-case latency, and how many distinct updates the reader can actually see, then you should do synchronisation from the writer.
You may not be seeing problems because of the nature of the information in Pointer. If it is tracking coordinates of some object that is not moving very fast, and the position is updated during a read, then the coordinates might be a "little off", but not enough to notice.
For instance, assume that after an update, p.x is 100, and p.y is 100. The object you are tracking moves a bit, so after the next update, p.x is 102 and p.y is 102. If you happen to read in the middle of this update, after x is updated but before y is updated, you will end up getting a pointer value with p.x as 102 and p.y as 100.

C++ Thread question - setting a value to indicate the thread has finished

Is the following safe?
I am new to threading and I want to delegate a time consuming process to a separate thread in my C++ program.
Using the boost libraries I have written code something like this:
thrd = new boost::thread(boost::bind(&myclass::mymethod, this, &finished_flag));
Where finished_flag is a boolean member of my class. When the thread is finished it sets the value and the main loop of my program checks for a change in that value.
I assume that this is okay because I only ever start one thread, and that thread is the only thing that changes the value (except for when it is initialised before I start the thread)
So is this okay, or am I missing something, and need to use locks and mutexes, etc
You never mentioned the type of finished_flag...
If it's a straight bool, then it might work, but it's certainly bad practice, for several reasons. First, some compilers will cache the reads of the finished_flag variable, since the compiler doesn't always pick up the fact that it's being written to by another thread. You can get around this by declaring the bool volatile, but that's taking us in the wrong direction. Even if reads and writes are happening as you'd expect, there's nothing to stop the OS scheduler from interleaving the two threads half way through a read / write. That might not be such a problem here where you have one read and one write op in separate threads, but it's a good idea to start as you mean to carry on.
If, on the other hand, it's a thread-safe type, like a CEvent in MFC (or the equivalent in boost), then you should be fine. This is the best approach: use thread-safe synchronization objects for inter-thread communication, even for simple flags.
Instead of using a member variable to signal that the thread is done, why not use a condition? You are already using the boost libraries, and condition is part of the thread library.
Check it out. It allows the worker thread to 'signal' that it has finished, and the main thread can check during execution if the condition has been signaled and then do whatever it needs to do with the completed work. There are examples in the link.
As a general case I would never make the assumption that a resource will only be modified by the thread. You might know what it is for, however someone else might not - causing no end of grief as the main thread thinks that the work is done and tries to access data that is not correct! It might even delete it while the worker thread is still using it, causing the app to crash. Using a condition will help with this.
Looking at the thread documentation, you could also call thread.timed_join in the main thread. timed_join will wait for a specified amount of time for the thread to 'join' (join means that the thread has finished).
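A rough sketch of the condition-based approach, written with the C++11 standard library (`std::condition_variable`) rather than the boost classes mentioned above; that substitution, and the names `Completion` and `run_worker_and_wait`, are assumptions made for this example. The timed wait plays a role similar to `timed_join`, except that it waits for the signal rather than for the thread to exit.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// One-shot "work is finished" signal shared between two threads.
class Completion {
public:
    // Called by the worker when its result is ready.
    void signal() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            done_ = true;
        }
        cv_.notify_all();
    }

    // Blocks until signal() has been called.
    void wait() {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return done_; });
    }

    // Waits at most `timeout`; returns true if the signal has arrived.
    bool wait_for(std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lock(mutex_);
        return cv_.wait_for(lock, timeout, [this] { return done_; });
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    bool done_ = false;
};

// Runs a trivial job on a worker thread and returns its result.
int run_worker_and_wait()
{
    Completion finished;
    int result = 0;
    std::thread worker([&result, &finished] {
        result = 42;        // stand-in for the time-consuming work
        finished.signal();  // publish: the result is ready
    });
    finished.wait();  // the main loop could call wait_for() each iteration
    worker.join();    // signalled is not the same as exited, so still join
    assert(result == 42);
    return result;
}
```

A main loop that must stay responsive would call `wait_for` with a short timeout once per iteration and only touch the worker's data after it returns true.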
I don't mean to be presumptive, but it seems like the purpose of your finished_flag variable is to pause the main thread (at some point) until the thread thrd has completed.
The easiest way to do this is to use boost::thread::join
// launch the thread...
thrd = new boost::thread(boost::bind(&myclass::mymethod, this, &finished_flag));
// ... do other things maybe ...
// wait for the thread to complete
thrd->join();
If you really want to get into the details of communication between threads via shared memory, even declaring a variable volatile won't be enough, even if the compiler does use appropriate access semantics to ensure that it won't get a stale version of data after checking the flag. The CPU can issue reads and writes out of order (x86 usually doesn't, but PPC definitely does), and there is nothing in C++9x that allows the compiler to generate code to order memory accesses appropriately.
Herb Sutter's Effective Concurrency series has an extremely in depth look at how the C++ world intersects the multicore/multiprocessor world.
Having the thread set a flag (or signal an event) before it exits is a race condition. The thread has not necessarily returned to the OS yet, and may still be executing.
For example, consider a program that loads a dynamic library (pseudocode):
lib = loadLibrary("someLibrary");
fun = getFunction("someFunction");
fun();
unloadLibrary(lib);
And let's suppose that this library uses your thread:
void someFunction() {
volatile bool finished_flag = false;
thrd = new boost::thread(boost::bind(&myclass::mymethod, this, &finished_flag));
while(!finished_flag) { // ignore the polling loop, it's besides the point
sleep();
}
delete thrd;
}
void myclass::mymethod() {
// do stuff
finished_flag = true;
}
When myclass::mymethod() sets finished_flag to true, myclass::mymethod() hasn't returned yet. At the very least, it still has to execute a "return" instruction of some sort (if not much more: destructors, exception handler management, etc.). If the thread executing myclass::mymethod() gets pre-empted before that point, someFunction() will return to the calling program, and the calling program will unload the library. When the thread executing myclass::mymethod() gets scheduled to run again, the address containing the "return" instruction is no longer valid, and the program crashes.
The solution would be for someFunction() to call thrd->join() before returning. This would ensure that the thread has returned to the OS and is no longer executing.