It is a short question. I believe there is no way to cancel a job submitted to libclang through the Python bindings (for example, a code completion task).
Can anybody prove me wrong? I am interested in using libclang in a multithreaded environment, but it seems it is intended to be accessed from a single thread only. If there is also no mechanism to cancel tasks, then one has to wait till the task finishes even if the results are not needed anymore. Does anybody have any ideas on how to overcome this?
[..] it seems it is intended to be accessed from a single thread only.
I don't have anything that clearly backs this up, but since the documentation never mentions thread safety, I think all of libclang should be considered not thread-safe.
But: seeing that basically everything libclang does is (indirectly) bound to a CXIndex, I would guess that you could have one CXIndex per thread and then use those (or anything that's created from them) in parallel, as long as nothing is shared between threads.
If there is also no mechanism to cancel tasks, then one has to wait till the task finishes even if the results are not needed anymore. Does anybody have any ideas on how to overcome this?
The "safe" solution is to move all libclang related code into a dedicated process. From your main application you then start (or kill) these processes (using OS dependent mechanisms) as you like. This is, of course, "heavy" in terms of both performance (starting processes) and development effort (serializing communication between processes).
The alternative is to hope (or verify in the source code) that the libclang devs keep all data associated with a CXIndex and thus don't introduce possible data races in their code. Then you can give every thread its own index, its own translation units, etc. When you have a "job", you launch a thread (or reuse one) to work on it. If in the meantime the results are no longer needed, you simply discard them when (if) they ever become ready.
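The "discard the results" idea can be sketched roughly as follows. This is only an illustration in standard C++, not libclang code: slow_completion is a hypothetical stand-in for a blocking libclang call, and CompletionRunner is an invented helper. Each job gets a generation number, and a result is delivered only if no newer job has been submitted in the meantime.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <functional>
#include <string>
#include <thread>
#include <utility>

// Hypothetical stand-in for a blocking libclang call that cannot be cancelled.
inline std::string slow_completion(const std::string& prefix) {
    std::this_thread::sleep_for(std::chrono::milliseconds(20));
    return prefix + "_completed";
}

// Invented helper: runs each job on its own thread and drops stale results.
class CompletionRunner {
public:
    // Starts a job. `on_result` is called only if this job is still the most
    // recently submitted one when it finishes. The caller joins the thread.
    std::thread submit(std::string prefix,
                       std::function<void(const std::string&)> on_result) {
        assert(on_result);  // an empty callback would make the job pointless
        const unsigned my_generation = ++latest_generation_;
        return std::thread([this, my_generation, prefix = std::move(prefix),
                            on_result = std::move(on_result)] {
            std::string result = slow_completion(prefix);  // runs to completion
            if (my_generation == latest_generation_.load()) {
                on_result(result);  // still wanted
            }
            // Otherwise the result is silently discarded.
        });
    }

private:
    std::atomic<unsigned> latest_generation_{0};
};
```

The job itself is never interrupted; it just finishes unobserved. In a real program each worker thread would also own its own index and translation units, as described above.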
Related
Context: I was looking at how asynchronous programming really works. After some investigation on the topic, the resulting idea was that there are two things to differentiate:
Concurrency (synchronous/asynchronous): About tasks
Multi-threading: About workers
Based on these concepts, we can identify four main ways to parallelize tasks. Since a picture says more than a long explanation, I have made a drawing to illustrate this:
Note: The 4th column (Multi-threaded asynchronous) will not be considered here since it mixes multi-threading and asynchronous programming.
In C++, we have the function template std::async() to allow us to run a function asynchronously.
We can set the launch policy to:
std::launch::async: Run "asynchronously" in a separate thread.
std::launch::deferred: Run when the result is requested.
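The difference between the two policies can be observed by comparing thread ids. The following is a minimal sketch; the helper names run_with_policy, runs_on_calling_thread and demo_policies are made up for illustration.

```cpp
#include <cassert>
#include <future>
#include <thread>

// Returns the id of the thread that actually executes the task.
inline std::thread::id run_with_policy(std::launch policy) {
    std::future<std::thread::id> result =
        std::async(policy, [] { return std::this_thread::get_id(); });
    return result.get();  // a deferred task runs here, inside get()
}

// True if a task launched with `policy` ran on the calling thread.
inline bool runs_on_calling_thread(std::launch policy) {
    return run_with_policy(policy) == std::this_thread::get_id();
}

inline void demo_policies() {
    // deferred: executed lazily by whichever thread asks for the result.
    assert(runs_on_calling_thread(std::launch::deferred));
    // async: executed as if on a new thread.
    assert(!runs_on_calling_thread(std::launch::async));
}
```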
Question: If we take a look at my drawing, the std::launch::async policy seems to behave as Multi-threaded synchronous and the std::launch::deferred policy seems to behave as an isolated case of Single-threaded asynchronous (the function is oneshot executed when the result is requested).
But if I'm not mistaken, the idea behind Single-threaded asynchronous is that in case of waiting for a resource to be available or when struggling with some latency (disk access time, ...), the program should not keep blocking the main thread (and so wasting time) and go on to do the next task instead (and come back later to the previous one).
What I don't understand is that std::async() does not seem to allow this kind of behaviour. We can only either run the task synchronously in another thread or run it once and for all when the result is requested (as late as possible).
If we take a look at my drawing, the Single-threaded asynchronous method is not really implemented since the function runs in "oneshot" no matter if it will have to wait for a resource or not. So we will still waste time in this case.
I'm wondering why. Is my understanding wrong? Is it an oversight in the design of std::async(), or is it intentional (by the standard)?
Edit: I'm not sure if it is the right place to ask this question since it is not really a "coding" issue/question.
We have a decoding function that runs in its own thread to carry out its job.
The time of execution is usually well below a defined timeout value, but on some occasions it may take much longer to complete. Thus the need to have a timeout in order to make sure this function will not cause extra delays to the rest of the program.
This is currently being developed on Windows, but I'm also looking for a solution that is portable to Linux.
The implementation so far has multiple checks within the decoding function to see whether it still has time to continue or should abort processing. This is definitely not great practice, and I'm looking to improve it.
I'm aware that Boost provides such a facility, but we do not use Boost in this project.
Here is an excellent article by Herb Sutter on the subject. The conclusion would be: your current approach is OK. Just have your decoding threads periodically check whether they have run out of time. The important thing is to strike a balance in how frequently you check.
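A periodic check of this kind might look like the sketch below. It assumes the decoding work can be split into small blocks; decode_with_deadline and check_interval are hypothetical names, and the actual decoding is left as a placeholder.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>

using Clock = std::chrono::steady_clock;

// Hypothetical decoder: processes `total_blocks` blocks of work and gives up
// once `deadline` has passed. Returns true if all blocks were decoded and
// false if it ran out of time.
inline bool decode_with_deadline(std::size_t total_blocks,
                                 Clock::time_point deadline,
                                 std::size_t check_interval = 64) {
    assert(check_interval > 0);
    for (std::size_t block = 0; block < total_blocks; ++block) {
        // Reading the clock for every block may cost more than the work
        // itself, so only look at it every `check_interval` blocks.
        if (block % check_interval == 0 && Clock::now() >= deadline) {
            return false;  // timed out: abandon the job cleanly
        }
        // ... decode one block here (placeholder for the real work) ...
    }
    return true;
}
```

A smaller check_interval reacts to the timeout faster but spends more time reading the clock; that is the balance mentioned above. steady_clock is used so that system clock adjustments do not affect the deadline.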
One way is to set a flag on timeout to instruct the thread instance to not report any completion, not continue and to delete/terminate itself ASAP. Reduce its priority to the lowest possible and forget about it. Create another thread object immediately, overwriting the old instance value, and use the new thread instance for subsequent decoding.
The lowest-priority orphaned thread will eventually die off by itself when it finally gets around to checking its suicide flag.
I'm using TinyThread++ to get clean and simple platform-independent control over threading features in my project. I just came upon a situation where I'd like to have responsive synchronized message passing without pegging the CPU, while allowing a thread to do a bit of work on the side while it is idle. Sure, I could simply spawn a third thread to do this "other work", but all I'm missing is a condition variable wait(int ms) type of function alongside the wait() that already works great. I'd like it to block for at most ms milliseconds, so that it can time out and perform some actions periodically (during which the thread will not be actively waiting on the condition variable). Even though it's nice to have the thread sitting there waiting to pounce on any incoming messages, if I give it a side task that takes only 50 microseconds to execute and needs to run only once every second, that definitely shouldn't force me to create yet another thread (and message queue and other resources) to get it done.
Does any of this make sense? I'm looking for suggestions on how I might go about implementing this. I'm hoping that adding a couple of lines to the TinyThread++ code can provide me with this functionality.
Well, the source code for the wait function isn't very complicated, so making the required modifications looks simple enough:
The Linux implementation relies on the pthread_cond_wait function, which can trivially be changed to the pthread_cond_timedwait function. Do read the documentation carefully in case I forgot about any details; in particular, pthread_cond_timedwait takes an absolute time rather than a relative timeout.
On the Windows side of things, it's a little more complicated, and I'm no expert on multithreading on Windows. That being said, if there's a timed version of the _wait function (I'm pretty sure there is), changing that should work just fine. Again, read over the documentation carefully before making any modifications.
Now, before you go off and make these modifications: I don't think what you're trying to do is a good idea. The main advantage of using threads is to conceptually separate different tasks. Trying to do multiple things in a single thread is a bit like trying to do multiple things in a single function: it complicates the design and makes things harder to debug. So unless the overhead of creating a new thread is provably too great, or unless the resulting code remains simple and easy to understand, I'd split it up into multiple threads.
Finally, I get the feeling that you might not be aware that condition variables can return spuriously (returns without anybody having done any signalling or returns when the condition is still false). So just in case, I'd suggest reviewing the usage examples and making sure you understand why those loops are there.
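For comparison, here is what a timed wait with a re-checked condition looks like when written against the standard library's std::condition_variable instead of TinyThread++. It is a sketch only: MessageQueue and pop_for are hypothetical names, and TinyThread++ itself would still need the modification described above.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <deque>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical message queue built on the standard library.
class MessageQueue {
public:
    void push(std::string message) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            messages_.push_back(std::move(message));
        }
        ready_.notify_one();
    }

    // Waits up to `timeout` for a message. Returns false on timeout, so the
    // caller can do its periodic side work and then call this again.
    bool pop_for(std::chrono::milliseconds timeout, std::string& out) {
        std::unique_lock<std::mutex> lock(mutex_);
        // The predicate overload re-checks the condition after every wakeup,
        // which is the loop that guards against spurious wakeups.
        if (!ready_.wait_for(lock, timeout,
                             [this] { return !messages_.empty(); })) {
            return false;  // timed out and the queue is still empty
        }
        assert(!messages_.empty());
        out = std::move(messages_.front());
        messages_.pop_front();
        return true;
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::deque<std::string> messages_;
};
```

The consuming thread would call pop_for in a loop and run its once-per-second side task whenever the call returns false or enough time has passed.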
I know you cannot kill a boost thread, but can you change its task?
Currently I have an array of 8 threads. When a button is pressed, these threads are assigned a task. The task they are assigned is completely independent of the main thread and of the other threads. None of the threads have to wait or anything like that, so an interruption point is never reached.
What I need is to be able to change, at any time, the task that each thread is doing. Is this possible? I have tried looping through the array of threads and replacing each thread object with a new one, but of course that doesn't do anything to the old threads.
I know you can interrupt pthreads, but I cannot find a working link to download the library to check it out.
A thread is not some sort of magical object that can be made to do things. It is a separate path of execution through your code. Your code cannot be made to jump arbitrarily around its codebase unless you specifically program it to do so. And even then, it can only be done within the rules of C++ (ie: calling functions).
You cannot kill a boost::thread because killing a thread would utterly wreck some of the most fundamental assumptions a programmer makes. You now have to take into account the possibility that the next line doesn't execute for reasons that you can neither predict nor prevent.
This isn't like exception handling, where C++ specifically requires destructors to be called, and you have the ability to catch exceptions and do special cleanup. You're talking about executing one piece of code, then suddenly inserting a call to some random function in the middle of already compiled code. That's not going to work.
If you want to be able to change the "task" of a thread, then you need to build that thread with "tasks" in mind. It needs to check every so often that it hasn't been given a new task, and if it has, then it switches to doing that. You will have to define when this switching is done, and what state the world is in when switching happens.
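A thread built "with tasks in mind" could be structured roughly like this. The sketch uses std::thread rather than boost::thread, and reduces a "task" to an integer id; SwitchableWorker, assign and kStopTask are hypothetical names.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

constexpr int kStopTask = -1;  // special id that asks the worker to exit

// Hypothetical worker that polls for a new task between small units of work.
class SwitchableWorker {
public:
    explicit SwitchableWorker(int initial_task) : requested_task_(initial_task) {
        assert(initial_task >= 0);
        thread_ = std::thread([this] { run(); });
    }
    ~SwitchableWorker() { stop(); }

    // Can be called from any thread, at any time.
    void assign(int task) {
        assert(task >= 0);
        requested_task_.store(task);
    }

    // Asks the worker to exit and waits until it has done so.
    void stop() {
        requested_task_.store(kStopTask);
        if (thread_.joinable()) thread_.join();
    }

    int current_task() const { return current_task_.load(); }

private:
    void run() {
        int task = requested_task_.load();
        current_task_.store(task);
        while (task != kStopTask) {
            // ... do one small, bounded unit of work for `task` here ...
            std::this_thread::yield();
            // Switch point: the only place where the task can change, so the
            // state of the world at a switch is well defined.
            const int requested = requested_task_.load();
            if (requested != task) {
                task = requested;
                current_task_.store(task);
            }
        }
    }

    std::atomic<int> requested_task_;
    std::atomic<int> current_task_{kStopTask};
    std::thread thread_;
};
```

How quickly a worker reacts depends entirely on how small each unit of work is; nothing forces the switch from outside.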
I'm using SQLite3 in a Windows application. I have the source code (so-called SQLite amalgamation).
Sometimes I have to execute heavy queries. That is, I call sqlite3_step on a prepared statement, and it takes a lot of time to complete (due to the heavy I/O load).
I wonder if there's a possibility to abort such a call. I would also be glad if there was an ability to do some background processing in the middle of the call within the same thread (since most of the time is spent in waiting for the I/O to complete).
I thought about modifying the SQLite code myself. In the simplest scenario I could check some condition (like an abort event handle for instance) before every invocation of either ReadFile/WriteFile, and return an error code appropriately. And in order to allow the background processing the file should be opened in the overlapped mode (this enables asynchronous ReadFile/WriteFile).
Is there a chance that interrupting WriteFile may in some circumstances leave the database in an inconsistent state, even with the journal enabled? I guess not, since the whole idea of the journal file is to be prepared for errors of any kind. But I'd like to hear more opinions about this.
Also, has anyone tried something similar?
Thanks in advance.
EDIT:
Thanks to ereOn. I wasn't aware of the existence of sqlite3_interrupt. This probably answers my question.
Now, for all of you who wonder how (and why) one expects to do some background processing during the I/O within the same thread.
Unfortunately not many people are familiar with so-called "Overlapped I/O".
http://en.wikipedia.org/wiki/Overlapped_I/O
Using it one issues an I/O operation asynchronously, and the calling thread is not blocked. Then one receives the I/O completion status using one of the completion mechanisms: waitable event, new routine queued into the APC, or the completion port.
Using this technique, one doesn't have to create extra threads. Actually, the only real justification for creating threads is when your bottleneck is computation time (i.e. CPU load) and the machine has several CPUs (or cores).
Creating a thread just to let it be blocked by the OS most of the time doesn't make sense. It wastes OS resources and complicates the program (need for synchronization, etc.).
Unfortunately, not all libraries/APIs allow an asynchronous mode of operation, which makes creating extra threads a necessary evil.
EDIT2:
I've already found the solution, thanks to ereOn.
For all those who nevertheless insist that it's not worth doing things "in background" while "waiting" for the I/O to complete using overlapped I/O. I disagree, and I think there's no point to argue about this. At least this is not related to the subject.
I'm a Windows programmer (as you may have noticed), and I have very extensive experience in all kinds of multitasking. I'm also a driver writer, so I know how things work "behind the scenes".
I know that it's a "common practice" to create several threads to do several things "in parallel". But this doesn't mean that this is a good practice. Please allow me not to follow the "common practice".
I don't understand why you want the interruption to come from the same thread, and I don't even understand how that would be possible: if the current thread is blocked, waiting for some I/O, you can't execute any other code. (Yeah, that's what "blocked" means.)
Perhaps if you give us more hints about why you want this, we might help further.
Usually, I use sqlite3_interrupt() to cancel calls. But this obviously requires sqlite3_interrupt() to be called from a thread other than the one running the query.
By default, SQLite is threadsafe. It sounds to me like the easiest thing to do would be to start the SQLite command on a background thread, and let SQLite do the necessary locking to make that work.
From your perspective, the SQLite call then looks like an asynchronous bit of I/O, and you can continue normal processing on this thread, for example in a loop that combines an interruptible sleep with a bit of occasional background processing (e.g. to update a liveness indicator). When the SQLite statement completes, the background thread should set a state variable to indicate this, wake the main thread (if necessary), and terminate.
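The loop described above could be sketched like this with std::async and a timed wait on the future, which plays the role of both the state variable and the wake-up. No real SQLite calls are made: blocking_query is a hypothetical stand-in for a long sqlite3_step() call, and run_query_with_side_work is an invented name.

```cpp
#include <cassert>
#include <chrono>
#include <future>
#include <thread>

// Hypothetical stand-in for a long-running, blocking call such as
// sqlite3_step(); it just sleeps and returns a fixed value.
inline int blocking_query() {
    std::this_thread::sleep_for(std::chrono::milliseconds(50));
    return 42;
}

// Runs the query on a background thread while the calling thread keeps doing
// small pieces of other work. `ticks` counts how often the side work ran.
inline int run_query_with_side_work(int& ticks) {
    std::future<int> pending = std::async(std::launch::async, blocking_query);
    assert(pending.valid());
    ticks = 0;
    // wait_for acts as the interruptible sleep: it returns as soon as the
    // result is ready, or after 10 ms if it is not.
    while (pending.wait_for(std::chrono::milliseconds(10)) !=
           std::future_status::ready) {
        ++ticks;  // placeholder for e.g. updating a liveness indicator
    }
    return pending.get();
}
```

With a real database connection, the place where ticks is incremented is also where the main thread could decide to call sqlite3_interrupt() if the result is no longer wanted.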