We have a decoding function that runs in its own thread to carry out its job.
The execution time is usually well below a defined timeout value, but on occasion it may take much longer to complete. Hence the need for a timeout, to make sure this function does not delay the rest of the program.
This is currently being developed on Windows OS but I'm also looking at a portable solution to Linux.
The implementation so far has multiple checks within the decoding function to see whether it still has time to continue or must abort processing. This is definitely not great practice, and I'm looking at improving it.
I'm aware that boost provides such facility, but we do not use boost in this project.
Here is an excellent article by Herb Sutter on the subject. The conclusion would be: your current approach is OK. Just have your decoding threads periodically check whether they have run out of time. The important thing is to strike a balance in how frequently you check.
One way is to set a flag on timeout that instructs the thread instance not to report any completion, not to continue, and to delete/terminate itself as soon as possible. Reduce its priority to the lowest possible and forget about it. Create another thread object immediately, overwriting the old instance value, and use the new thread instance for subsequent decoding.
The lowest-priority orphaned thread will eventually die off by itself when it finally gets around to checking its suicide flag.
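For illustration, a minimal sketch of that periodic-check pattern using standard C++ threads; the names (cancel_requested, decode, the chunk loop) are invented for the example and are not from the original post:

    #include <atomic>
    #include <chrono>
    #include <thread>

    // Hypothetical cooperative-timeout decoder.
    std::atomic<bool> cancel_requested{false};

    void decode(std::chrono::steady_clock::time_point deadline) {
        for (int chunk = 0; chunk < 1000; ++chunk) {          // stand-in for the real work loop
            // Check "am I out of time / cancelled?" once per chunk, not per byte,
            // to balance responsiveness against checking overhead.
            if (cancel_requested.load(std::memory_order_relaxed) ||
                std::chrono::steady_clock::now() >= deadline)
                return;                                        // abort quietly, report nothing
            std::this_thread::sleep_for(std::chrono::milliseconds(1)); // "decode one chunk"
        }
        // report completion here
    }

    int main() {
        auto deadline = std::chrono::steady_clock::now() + std::chrono::milliseconds(200);
        std::thread worker(decode, deadline);
        worker.join();
    }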
It is a short question. I believe there is no way to cancel a job submitted to libclang through the Python bindings (for example a code-completion task).
Can anybody prove me wrong? I am interested in using libclang in a multithreaded environment, but it seems it is intended to be accessed from a single thread only. If there is also no mechanism to cancel tasks, then one has to wait until the task finishes even if the results are not needed anymore. Does anybody have any ideas on how to overcome this?
[..] it seems it is intended to be accessed from a single thread only.
I don't have anything that clearly backs this up, but since the documentation never even mentions thread safety, I think all of libclang should be considered not thread-safe.
But: seeing that basically everything libclang does is (indirectly) bound to a CXIndex, I would guess that you could have one CXIndex per thread and then use those (or anything created from them) in parallel (but not "share" anything between threads).
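As a rough sketch of the one-CXIndex-per-thread idea, here is what it might look like with the plain libclang C API (not the Python bindings); the file names are placeholders, and this assumes, as guessed above, that nothing libclang-related is shared between threads:

    #include <clang-c/Index.h>
    #include <string>
    #include <thread>
    #include <vector>

    // Each worker creates and uses its own CXIndex; nothing libclang-related
    // is shared between threads.
    void parse_in_own_index(const std::string& path) {
        CXIndex index = clang_createIndex(/*excludeDeclarationsFromPCH=*/0,
                                          /*displayDiagnostics=*/0);
        CXTranslationUnit tu = clang_parseTranslationUnit(
            index, path.c_str(), nullptr, 0, nullptr, 0, CXTranslationUnit_None);
        if (tu) {
            // ... walk the AST, run code completion, etc. ...
            clang_disposeTranslationUnit(tu);
        }
        clang_disposeIndex(index);
    }

    int main() {
        std::vector<std::string> files = {"a.cpp", "b.cpp"};   // placeholder paths
        std::vector<std::thread> workers;
        for (const auto& f : files)
            workers.emplace_back(parse_in_own_index, f);
        for (auto& w : workers)
            w.join();
    }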
If there is also no mechanism to cancel tasks, then one has to wait until the task finishes even if the results are not needed anymore. Does anybody have any ideas on how to overcome this?
The "safe" solution is to move all libclang related code into a dedicated process. From your main application you then start (or kill) these processes (using OS dependent mechanisms) as you like. This is, of course, "heavy" in terms of both performance (starting processes) and development effort (serializing communication between processes).
The alternative is to hope (or verify in the source code) that the libclang devs keep all data associated with a CXIndex and thus don't introduce possible data races in their code. Then you can give every thread its own index, its own translation units, etc. When you have a "job", you launch a thread (or reuse one) to work on it. If in the meantime the results are no longer needed, then you just discard the results when (if) they are ever ready.
I'm using TinyThread++ to get clean and simple platform-independent control over threading features in my project. I just came upon a situation where I'd like to have responsive synchronized message passing without pegging the CPU, while allowing a thread to continue to do a bit of work on the side while it is idle.

Sure, I could simply spawn a third thread to do this "other work", but all I'm missing is a condition-variable wait(int ms) type function rather than the wait() that already works great. The idea is that it would block only for up to ms milliseconds, so it can time out and perform some actions periodically (during which the thread is not actively waiting on the condition variable).

Even though it's nice to have the thread sitting there waiting to pounce on any incoming messages, if I give it some task to do on the side which takes only 50 microseconds to execute, and I only need to run it once every second, that definitely shouldn't push me to create yet another thread (and message queue and other resources) to get it done.
Does any of this make sense? I'm looking for suggestions on how I might go about implementing this. I'm hoping that adding a couple of lines to the TinyThread code can provide this functionality.
Well, the source code for the wait function isn't very complicated, so making the required modifications looks simple enough:
The Linux implementation relies on the pthread_cond_wait function, which can trivially be changed to the pthread_cond_timedwait function. Do read the documentation carefully in case I forgot about any minutiae.

On the Windows side of things, it's a little more complicated, and I'm no expert on multithreading on Windows. That said, if there's a timed version of the _wait function (I'm pretty sure there is), changing to that should work just fine. Again, read the documentation carefully before making any modifications.
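To illustrate the Linux side, here is the rough shape of a timed wait built on pthread_cond_timedwait (this is not TinyThread++'s actual code, just a sketch; the helper name wait_for_ms is made up):

    #include <pthread.h>
    #include <time.h>

    // Hypothetical helper: wait on `cond` (with `mutex` held) for at most `ms`
    // milliseconds. Returns true if signalled, false if the wait timed out.
    bool wait_for_ms(pthread_cond_t* cond, pthread_mutex_t* mutex, long ms) {
        struct timespec abstime;
        clock_gettime(CLOCK_REALTIME, &abstime);   // pthread_cond_timedwait wants an absolute time
        abstime.tv_sec  += ms / 1000;
        abstime.tv_nsec += (ms % 1000) * 1000000L;
        if (abstime.tv_nsec >= 1000000000L) {      // normalize the timespec
            abstime.tv_sec  += 1;
            abstime.tv_nsec -= 1000000000L;
        }
        int rc = pthread_cond_timedwait(cond, mutex, &abstime);
        return rc == 0;                            // ETIMEDOUT (or another error) -> false
    }

Note that pthread_cond_timedwait can also wake spuriously, so callers still need the usual predicate loop discussed below.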
Now, before you go off and make these modifications: I don't think what you're trying to do is a good idea. The main advantage of using threads is to conceptually separate different tasks. Trying to do multiple things in a single thread is a bit like trying to do multiple things in a single function: it complicates the design and makes things harder to debug. So unless the overhead of creating a new thread is provably too great, or unless the resulting code remains simple and easy to understand, I'd split the work into multiple threads.
Finally, I get the feeling that you might not be aware that condition variables can wake up spuriously (returning without anybody having signalled, or returning when the condition is still false). So just in case, I'd suggest reviewing the usage examples and making sure you understand why those loops are there.
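For reference, the defensive pattern those examples use looks like this, sketched here with the standard C++11 primitives (TinyThread++'s API follows the same model):

    #include <condition_variable>
    #include <mutex>
    #include <queue>

    std::mutex              m;
    std::condition_variable cv;
    std::queue<int>         messages;

    int pop_message() {
        std::unique_lock<std::mutex> lock(m);
        // The loop guards against spurious wakeups: we only proceed once the
        // condition we actually care about (a non-empty queue) really holds.
        while (messages.empty())
            cv.wait(lock);
        int msg = messages.front();
        messages.pop();
        return msg;
    }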
I have a program which runs continuously, and I need to save data every minute.
The program processes data, and every minute I want to save the value of a variable and do some statistical operations to track how this variable varies.
I thought I could do it with a signal, SIGALRM, and alarm(60). My sub-question is: can I set a class method as the handler for SIGALRM?
Any other ideas for executing a method that saves data and does some operations every minute?
The program is written in C++ and runs on Linux on a single-core processor.
Your solution using alarm will work, since both open and write are async-signal-safe. You do have to be aware that interactions between alarm and sleep are undefined, so don't use them in the same program.
A different solution, especially if you already use an epoll, would be to have a timerfd trigger the epoll. That will avoid the possible undefined interactions.
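A minimal sketch of that timerfd approach (Linux-specific, error handling omitted; the 60-second interval matches the question):

    #include <sys/timerfd.h>
    #include <sys/epoll.h>
    #include <unistd.h>
    #include <cstdint>
    #include <cstdio>

    int main() {
        // A timer that fires every 60 seconds.
        int tfd = timerfd_create(CLOCK_MONOTONIC, 0);
        itimerspec spec{};
        spec.it_value.tv_sec    = 60;   // first expiration
        spec.it_interval.tv_sec = 60;   // then every 60 s
        timerfd_settime(tfd, 0, &spec, nullptr);

        int epfd = epoll_create1(0);
        epoll_event ev{};
        ev.events  = EPOLLIN;
        ev.data.fd = tfd;
        epoll_ctl(epfd, EPOLL_CTL_ADD, tfd, &ev);

        for (;;) {
            epoll_event out;
            int n = epoll_wait(epfd, &out, 1, -1);   // other fds would be handled here too
            if (n == 1 && out.data.fd == tfd) {
                uint64_t expirations;
                read(tfd, &expirations, sizeof expirations);  // must drain the counter
                std::puts("one minute elapsed: save data / compute statistics here");
            }
        }
    }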
As for the actual saving, consider forking. This is a technique that I learned from redis (maybe someone else invented it, but that's where I learned it), and which I consider totally cool. The point is that the forked process can take all the time in the universe to finish writing as much data as it wants to disk: it can access the snapshot as it was at the time of the fork, while the other process keeps running and modifying the data. And thanks to the copy-on-write page magic done in the kernel, it all still works seamlessly without any risk of corruption, without ever stalling, and without ever needing to look at something like asynchronous I/O, which is great.
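A hedged sketch of the fork-and-snapshot idea; the data structure and file name are placeholders, not anything redis-specific:

    #include <unistd.h>
    #include <signal.h>
    #include <cstdio>
    #include <vector>

    std::vector<double> stats;   // hypothetical data updated by the main loop

    void save_snapshot() {
        pid_t pid = fork();
        if (pid == 0) {
            // Child: sees a copy-on-write snapshot of `stats` as it was at fork time,
            // and can take as long as it likes to write it out.
            FILE* f = std::fopen("stats.snapshot", "w");
            for (double v : stats)
                std::fprintf(f, "%f\n", v);
            std::fclose(f);
            _exit(0);            // _exit, not exit: don't flush the parent's stdio buffers twice
        }
        // Parent: returns immediately and keeps modifying `stats`.
    }

    int main() {
        signal(SIGCHLD, SIG_IGN);    // let the kernel reap finished snapshot children
        // ... the real processing loop would call save_snapshot() once a minute ...
        stats.push_back(1.0);
        save_snapshot();
    }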
You can call a class method using something like boost::bind.
Apart from that, I wouldn't recommend using signals for this; they are not that reliable and could, for example, make one of your syscalls return prematurely.
I would spawn a thread (assuming your single core doesn't mean no threads) that waits 60 seconds, takes the locks, makes the calculations, produces the output, and releases the locks.
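Something along these lines, sketched with standard C++ threads (the variable and file names are invented):

    #include <chrono>
    #include <fstream>
    #include <mutex>
    #include <thread>

    std::mutex data_mutex;
    double     tracked_value = 0.0;   // the variable being monitored

    void stats_thread() {
        for (;;) {
            std::this_thread::sleep_for(std::chrono::seconds(60));
            std::lock_guard<std::mutex> lock(data_mutex);      // take the lock
            double snapshot = tracked_value;                   // calculate on a consistent value
            std::ofstream("stats.log", std::ios::app) << snapshot << '\n';  // output
        }                                                      // lock released here
    }

    int main() {
        std::thread t(stats_thread);
        t.detach();
        // ... the real program keeps updating tracked_value under data_mutex here ...
    }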
As others have already suggested, if you have an event-driven (async-compatible) system, you could use timerfd to generate the events.
Saving data from a signal handler is a very bad idea. Even if open and write are async-signal-safe, your data could very well be in an inconsistent state due to a signal interrupting a function that was modifying it.
A much better approach would be to add to all functions which modify the data:
if (current_time > last_save_time + 60) save();
This will also avoid useless saves when the data has not been modified. If you don't want the overhead of making a system call to determine the current time on every operation, you could instead install a timer/signal handler that updates current_time, as long as you declare it volatile.
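A sketch of that variant, assuming a one-second SIGALRM tick; the names and the trivial save() body are illustrative only:

    #include <csignal>
    #include <cstdio>
    #include <unistd.h>

    volatile sig_atomic_t current_time   = 0;  // seconds since start, bumped by the handler
    static long           last_save_time = 0;

    void on_alarm(int) {
        ++current_time;
        alarm(1);                              // re-arm the one-second tick
    }

    void save() {
        std::puts("saving data");              // stand-in for the real write-out; called from
    }                                          // normal code, not from the signal handler

    void modify_data() {
        // ... the actual modification of the data would go here ...
        if (current_time > last_save_time + 60) {   // cheap check, no system call
            last_save_time = current_time;
            save();
        }
    }

    int main() {
        std::signal(SIGALRM, on_alarm);
        alarm(1);
        for (;;) {
            modify_data();
            usleep(1000);                      // stand-in for the program's real work
        }
    }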
Another good approach would be to use threads instead of signals. Then you should use a mutex (or better, rwlock) to synchronize access to the data.
I'm using SQLite3 in a Windows application. I have the source code (so-called SQLite amalgamation).
Sometimes I have to execute heavy queries. That is, I call sqlite3_step on a prepared statement, and it takes a lot of time to complete (due to the heavy I/O load).
I wonder if there's a possibility to abort such a call. I would also be glad if there was an ability to do some background processing in the middle of the call within the same thread (since most of the time is spent in waiting for the I/O to complete).
I thought about modifying the SQLite code myself. In the simplest scenario I could check some condition (like an abort event handle for instance) before every invocation of either ReadFile/WriteFile, and return an error code appropriately. And in order to allow the background processing the file should be opened in the overlapped mode (this enables asynchronous ReadFile/WriteFile).
Is there a chance that interrupting WriteFile may, in some circumstances, leave the database in an inconsistent state, even with the journal enabled? I guess not, since the whole idea of the journal file is to be prepared for any kind of error. But I'd like to hear more opinions about this.
Also, has anyone tried something similar?
Thanks in advance.
EDIT:
Thanks to ereOn. I wasn't aware of the existence of sqlite3_interrupt. This probably answers my question.
Now, for all of you who wonder how (and why) one would do some background processing during the I/O within the same thread:
Unfortunately not many people are familiar with so-called "Overlapped I/O".
http://en.wikipedia.org/wiki/Overlapped_I/O
Using it, one issues an I/O operation asynchronously, and the calling thread is not blocked. One then receives the I/O completion status through one of the completion mechanisms: a waitable event, a completion routine queued as an APC, or a completion port.
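To make that concrete, a bare-bones event-based overlapped read might look roughly like this (Windows-only, error handling omitted; the file name is a placeholder):

    #include <windows.h>
    #include <cstdio>

    int main() {
        HANDLE file = CreateFileA("data.bin", GENERIC_READ, FILE_SHARE_READ, nullptr,
                                  OPEN_EXISTING, FILE_FLAG_OVERLAPPED, nullptr);

        char buffer[4096];
        OVERLAPPED ov{};
        ov.hEvent = CreateEventA(nullptr, TRUE, FALSE, nullptr);  // manual-reset event

        // Issue the read; with FILE_FLAG_OVERLAPPED this returns immediately
        // (typically FALSE with GetLastError() == ERROR_IO_PENDING).
        ReadFile(file, buffer, sizeof buffer, nullptr, &ov);

        while (WaitForSingleObject(ov.hEvent, 50) == WAIT_TIMEOUT) {
            // The I/O is still in flight: do a bit of background work here,
            // in the same thread, instead of blocking.
        }

        DWORD bytes_read = 0;
        GetOverlappedResult(file, &ov, &bytes_read, FALSE);
        std::printf("read %lu bytes\n", bytes_read);

        CloseHandle(ov.hEvent);
        CloseHandle(file);
    }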
Using this technique, one doesn't have to create extra threads. In fact, the only real justification for creating threads is when your bottleneck is computation time (i.e. CPU load) and the machine has several CPUs (or cores).
Creating a thread just to let it be blocked by the OS most of the time doesn't make sense. It leads to an unjustified waste of OS resources and complicates the program (need for synchronization, etc.).
Unfortunately, not all libraries/APIs allow an asynchronous mode of operation, which makes creating extra threads a necessary evil.
EDIT2:
I've already found the solution, thanks to ereOn.
For all those who nevertheless insist that it's not worth doing things "in the background" while "waiting" for the I/O to complete using overlapped I/O: I disagree, and I think there's no point arguing about it. At least it's not related to the subject.
I'm a Windows programmer (as you may have noticed), and I have very extensive experience with all kinds of multitasking. I'm also a driver writer, so I know how things work "behind the scenes".
I know that it's a "common practice" to create several threads to do several things "in parallel". But this doesn't mean that this is a good practice. Please allow me not to follow the "common practice".
I don't understand why you want the interruption to come from the same thread, and I don't even understand how that would be possible: if the current thread is blocked waiting for some I/O, you can't execute any other code. (Yes, that's what "blocked" means.)
Perhaps if you give us more hints about why you want this, we might help further.
Usually, I use sqlite3_interrupt() to cancel calls. But this, obviously, requires that the call be made from another thread.
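For illustration, a minimal sketch of interrupting a long-running statement from a second thread; the database, query, and 2-second timeout are placeholders:

    #include <sqlite3.h>
    #include <chrono>
    #include <cstdio>
    #include <thread>

    int main() {
        sqlite3* db = nullptr;
        sqlite3_open("test.db", &db);

        // Watchdog thread: after 2 seconds, ask SQLite to abort whatever is running.
        std::thread watchdog([db] {
            std::this_thread::sleep_for(std::chrono::seconds(2));
            sqlite3_interrupt(db);          // documented as callable from another thread
        });

        sqlite3_stmt* stmt = nullptr;
        sqlite3_prepare_v2(db, "SELECT count(*) FROM big_table a, big_table b",
                           -1, &stmt, nullptr);
        int rc = sqlite3_step(stmt);        // returns SQLITE_INTERRUPT if interrupted
        if (rc == SQLITE_INTERRUPT)
            std::puts("query was interrupted");
        sqlite3_finalize(stmt);

        watchdog.join();
        sqlite3_close(db);
    }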
By default, SQLite is thread-safe. It sounds to me like the easiest thing to do would be to start the SQLite command on a background thread and let SQLite do the necessary locking to make that work.
From your perspective then, the sqlite call looks like an asynchronous bit of I/O, and you can continue normal processing on this thread, such as e.g. using a loop including interruptible sleep and a bit of occasional background processing (e.g. to update a liveness indicator). When the SQLite statement completes, the background thread should set a state variable to indicate this, wake the main thread (if necessary), and terminate.
I'm looking for a way to do asynchronous and thread-safe logging in my C++ project, if possible to a single file. I'm currently using cerr and clog for the task, but since they are synchronous, execution pauses briefly every time something is logged. It's a relatively graphics-heavy app, so this kind of thing is quite annoying.
The new logger should use asynchronous I/O to get rid of these pauses. Thread-safety would also be desirable as I intend to add some basic multithreading soon.
I considered a one-file-per-thread approach, but that seemed like it would make managing the logs a nightmare. Any suggestions?
I noticed this thread, which is over a year old. Maybe the asynchronous logger I wrote could be of interest:
http://www.codeproject.com/KB/library/g2log.aspx
G2log uses a protected message queue to forward log entries to a background worker that does the slow disk accesses.
I have tried it with a lock-free queue, which increased the average time for a LOG call but decreased the worst-case time; however, I am using the protected queue now, as it is cross-platform. It's tested on Windows/Visual Studio 2010 and Ubuntu 11.10/gcc 4.6.
It's released as public domain so you can do with it what you want with no strings attached.
This is VERY possible and practical. How do I know? I wrote exactly that at my last job. Unfortunately (for us), they now own the code. :-) Sadly, they don't even use it.
I intend on writing an open source version in the near future. Meanwhile, I can give you some hints.
I/O manipulators are really just function names. You can implement them for your own logging class so that your logger is cout/cin compatible.
Your manipulator functions can tokenize the operations and store them into a queue.
A thread can be blocked on that queue waiting for chunks of log to come flying through. It then processes the string operations and generates the actual log.
This is intrinsically thread compatible since you are using a queue. However, you still would want to put some mutex-like protection around writing to the queue so that a given log << "stuff" << "more stuff"; type operation remains line-atomic.
Have fun!
I think the proper approach is not one-file-per-thread, but one-thread-per-file. If any one file (or resource in general) in your system is only ever accessed by one thread, thread-safe programming becomes so much easier.
So why not make Logger a dedicated thread (or several threads, one per file, if you're logging different things to different files)? In all other threads, writing to the log would place the message on the input queue of the appropriate Logger thread, which would get to it after it's done writing the previous message. All it takes is a mutex to protect the queue from having an event added while Logger is reading one, and a condvar for Logger to wait on when its queue is empty.
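A minimal sketch of such a Logger thread using C++11 primitives (the class and its members are illustrative, not taken from any existing library):

    #include <condition_variable>
    #include <fstream>
    #include <mutex>
    #include <queue>
    #include <string>
    #include <thread>

    class Logger {
    public:
        explicit Logger(const std::string& path)
            : out_(path), worker_(&Logger::run, this) {}

        ~Logger() {
            { std::lock_guard<std::mutex> lock(m_); done_ = true; }
            cv_.notify_one();
            worker_.join();                    // flush remaining messages, then stop
        }

        void log(std::string msg) {            // callable from any thread; only queues
            { std::lock_guard<std::mutex> lock(m_); queue_.push(std::move(msg)); }
            cv_.notify_one();
        }

    private:
        void run() {                           // the only thread that touches the file
            std::unique_lock<std::mutex> lock(m_);
            for (;;) {
                cv_.wait(lock, [this] { return !queue_.empty() || done_; });
                while (!queue_.empty()) {
                    std::string msg = std::move(queue_.front());
                    queue_.pop();
                    lock.unlock();             // write without holding the queue lock
                    out_ << msg << '\n';
                    lock.lock();
                }
                if (done_) return;
            }
        }

        std::ofstream           out_;
        std::mutex              m_;
        std::condition_variable cv_;
        std::queue<std::string> queue_;
        bool                    done_ = false;
        std::thread             worker_;       // declared last: starts after the rest is ready
    };

    int main() {
        Logger logger("app.log");
        logger.log("hello from the main thread");
    }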
Have you considered using a logging library?
There are several available; I discovered Pantheios recently and it really seems to be quite impressive.
It's more of a front-end logger; you can customize which back-end is used. It can interact with ACE or log4cxx, for example, and it seems really easy to use and configure. The main advantage is that it uses type-safe operators, which is always great.
If you just want a barebones logging library:
ACE
log4c*
Boost.Log
Pick any :)
I should note that it's possible to implement lock-free queues in C++ and that they are great for logging.
I had the same issue and I believe I have found the perfect solution. I present to you, a single-header library called loguru: https://github.com/emilk/loguru
It's simple to use, portable, configurable, macro-based, and by default doesn't #include anything (for those sweet, sweet compile times).