I have a code that looks like this:
myCmd::myCmd(std::string outFile) : _tclInterp(Tcl_CreateInterp()), _outFile(outFile)
{
_myParser = CC::CmdParser(_tclInterp, registerAllCommands);
}
The thread which created the Tcl_CreateInterp() might be different from what is being used in the CC::CmdParser. Internally CC::CmdParser also calls Tcl_DeleteInterp() once all the procedures are complete. It will be a problem if the thread id is different during calls to CreateInterp and DeleteInterp.
How do I associate or tell such that same thread does Tcl_CreateInterp() (in the constructor) and Tcl_DeleteInterp() (in the CmdParser).
NOTE that the CC::CmdParser is a third-party API and I don't have source code.
Tcl interpreter objects internally use thread-specific data extensively in order to avoid most global locks. This goes as far as including thread-specific memory management pools, at least for common allocations, and Tcl code is executed as part of initialising the interpreter state, so it's really not possible to use an interpreter in a different thread from the one in which it was created. The Tcl library checks that you're using it correctly in a few places, at least in some build modes, and will make the process abort() (or Windows's equivalent) if you disobey the rules; that's much friendlier than the crash that would otherwise be coming due to fundamental assumptions not being respected…
This means that you must create, use, and destroy the interpreter on the same thread.
You might be advised to initialise the Tcl interpreter lazily if you can't guarantee that the containing object is used in the same thread that it was created in. You might think that this results in handling of things that isn't really the C++ way, but the thread-bound nature of interpreters means that you've really not got any alternative. (You might want to look at Tcl_CreateExitHandler() for help with cleaning up the interpreter when the thread goes away, but that will only help if Tcl_FinalizeThread() or Tcl_ExitThread() are called; if you're being thread-promiscuous on the C++ side then there's a good chance that you'll be baking in all sorts of problems anyway.)
If the C++ part insists on using multiple threads, you'll have to have a single thread for the Tcl code and use inter-thread messaging to let the other threads tell it things. That's done with Tcl_ThreadQueueEvent(). The canonical way to use that is the thread package, and that includes the canonical example of how to do so (github.com/tcltk/thread/generic/threadCmd.c); most of the complexity in there relates to turning an essentially one-way (but “reliable”) operation into a two-way one that lets you do a procedure call into a different thread; one-way calls are far simpler.
Related
Is there a way for a thread-pool to cancel a task underway? Better yet, is there a safe alternative for on-demand cancelling opaque function calls in thread_pools?
Killing the entire process is a bad idea and using native handle to perform pthread_cancel or similar API is a last resort only.
Extra
Bonus if the cancellation is immediate, but it's acceptable if the cancellation has some time constraint 'guarantees' (say cancellation within 0.1 execution seconds of the thread in question for example)
More details
I am not restricted to using Boost.Thread.thread_pool or any specific library. The only limitation is compatibility with C++14, and ability to work on at least BSD and Linux based OS.
The tasks are usually data-processing related, pre-compiled and loaded dynamically using C-API (extern "C") and thus are opaque entities. The aim is to perform compute intensive tasks with an option to cancel them when the user sends interrupts.
While launching, the thread_id for a specific task is known, and thus some API can be sued to find more details if required.
Disclaimer
I know using native thread handles to cancel/exit threads is not recommended and is a sign of bad design. I also can't modify the functions using boost::this_thread::interrupt_point, but can wrap them in lambdas/other constructs if that helps. I feel like this is a rock and hard place situation, so alternate suggestions are welcome, but they need to be minimally intrusive in existing functionality, and can be dramatic in their scope for the feature-set being discussed.
EDIT:
Clarification
I guess this should have gone in the 'More Details' section, but I want it to remain separate to show that existing 2 answers are based o limited information. After reading the answers, I went back to the drawing board and came up with the following "constraints" since the question I posed was overly generic. If I should post a new question, please let me know.
My interface promises a "const" input (functional programming style non-mutable input) by using mutexes/copy-by-value as needed and passing by const& (and expecting thread to behave well).
I also mis-used the term "arbitrary" since the jobs aren't arbitrary (empirically speaking) and have the following constraints:
some which download from "internet" already use a "condition variable"
not violate const correctness
can spawn other threads, but they must not outlast the parent
can use mutex, but those can't exist outside the function body
output is via atomic<shared_ptr> passed as argument
pure functions (no shared state with outside) **
** can be lambda binding a functor, in which case the function needs to makes sure it's data structures aren't corrupted (which is the case as usually, the state is a 1 or 2 atomic<inbuilt-type>). Usually the internal state is queried from an external db (similar architecture like cookie + web-server, and the tab/browser can be closed anytime)
These constraints aren't written down as a contract or anything, but rather I generalized based on the "modules" currently in use. The jobs are arbitrary in terms of what they can do: GPU/CPU/internet all are fair play.
It is infeasible to insert a periodic check because of heavy library usage. The libraries (not owned by us) haven't been designed to periodically check a condition variable since it'd incur a performance penalty for the general case and rewriting the libraries is not possible.
Is there a way for a thread-pool to cancel a task underway?
Not at that level of generality, no, and also not if the task running in the thread is implemented natively and arbitrarily in C or C++. You cannot terminate a running task prior to its completion without terminating its whole thread, except with the cooperation of the task.
Better
yet, is there a safe alternative for on-demand cancelling opaque
function calls in thread_pools?
No. The only way to get (approximately) on-demand preemption of a specific thread is to deliver a signal to it (that is is not blocking or ignoring) via pthread_kill(). If such a signal terminates the thread but not the whole process then it does not automatically make any provision for freeing allocated objects or managing the state of mutexes or other synchronization objects. If the signal does not terminate the thread then the interruption can produce surprising and unwanted effects in code not designed to accommodate such signal usage.
Killing the entire process is a bad idea and using native handle to
perform pthread_cancel or similar API is a last resort only.
Note that pthread_cancel() can be blocked by the thread, and that even when not blocked, its effects may be deferred indefinitely. When the effects do occur, they do not necessarily include memory or synchronization-object cleanup. You need the thread to cooperate with its own cancellation to achieve these.
Just what a thread's cooperation with cancellation looks like depends in part on the details of the cancellation mechanism you choose.
Cancelling a non cooperative, not designed to be cancelled component is only possible if that component has limited, constrained, managed interactions with the rest of the system:
the ressources owned by the components should be managed externally (the system knows which component uses what resources)
all accesses should be indirect
the modifications of shared ressources should be safe and reversible until completion
That would allow the system to clean up resource, stop operations, cancel incomplete changes...
None of these properties are cheap; all the properties of threads are the exact opposite of these properties.
Threads only have an implied concept of ownership apparent in the running thread: for a deleted thread, determining what was owned by the thread is not possible.
Threads access shared objects directly. A thread can start modifications of shared objects; after cancellation, such modifications that would be partial, non effective, incoherent if stopped in the middle of an operation.
Cancelled threads could leave locked mutexes around. At least subsequent accesses to these mutexes by other threads trying to access the shared object would deadlock.
Or they might find some data structure in a bad state.
Providing safe cancellation for arbitrary non cooperative threads is not doable even with very large scale changes to thread synchronization objects. Not even by a complete redesign of the thread primitives.
You would have to make thread almost like full processes to be able to do that; but it wouldn't be called a thread then!
My function func() is being called by multiple threads. (Each thread will call this function only once.)
From inside the func(), I want each thread to call a Tcl proc named tcl_proc_name (which takes no arguments).
For this, I did like this
Tcl_Eval( Tcl_CreateInterp() , "tcl_proc_name");
But it somehow this code is not able to invoke the Tcl proc.
Am I missing something?
Every Tcl interpreter (i.e., every instance of Tcl_Interp) is strongly bound by design to the thread that creates it; the implementation internally uses thread-specific data extensively so as to virtually completely avoid the need for major locks (like the Global Interpreter Lock that bedevils performant multithreaded Python code). Tcl commands are utterly bound to the interpreter that contains them. You have to either:
Send messages to a single thread to perform the action. (See the Tcl_QueueEvent() function, or use the Thread package's thread::send command from the Tcl level.)
Duplicate the command in multiple interpreters, one per thread. That might be easy or complex in your application.
For avoidance of all doubt: you cannot safely use a Tcl interpreter from multiple threads. It will not work; guaranteed. It will cause crashes.
I'm designing the threading architecture for my game engine, and I have reached a point where I am stumped.
The engine is partially inspired by Grimrock's engine, where they put as much as they could into LuaJIT, with some things, including low level systems, written in C++.
This seemed like a good plan, given that LuaJIT is easy to use, and I can continue to add API functions in C++ and expand it further. Faster iteration is nice, the ability to have a custom IDE attached to the game and edit the code while it runs is an interesting option, and serializing from Lua is also easy.
But I am stumped on how to go about adding threading. I know Lua has coroutines, but that is not true threading; it's basically to keep Lua from stalling as it waits for code that takes too long.
I originally had in mind to have the main thread running in Lua and calling C++ functions which are dispatched to the scheduler, but I can't find enough information on how Lua functions. I do know that when Lua calls a C++ function it runs outside of the state, so theoretically it may be possible.
I also don't know whether, if Lua makes such a call that is not supposed to return anything, it will hang on the function until it's done.
And I'm not sure whether the task scheduler runs in the main thread, or if it is simply all worker threads pulling data from a queue.
Basically meaning that, instead of everything running at once, it waits for the game state update before doing anything.
Does anyone have any ideas, or suggestions for threading?
In general, a single lua_State * is not thread safe. It's written in pure C and meant to go very fast. It's not safe to allow exceptions go through it either. There's no locks in there and no way for it to protect itself.
If you want to run multiple lua scripts simultaneously in separate threads, the most straightforward way is to use luaL_newstate() separately in each thread, initialize each of them, and load and run scripts in each of them. They can talk to the C++ safely as long as your callbacks use locks when necessary. At least, that's how I would try to do it.
There are various things you could do to speed it up, for instance, if you are loading copies of a single script in each of the threads, you could compile it to lua bytecode before you launch any of the threads, then put the buffer into shared memory, and have the scripts load the shared byte code without changing. That's most likely an unnecessary optimization though, depending on your application.
Is it ok to check the current thread inside a function?
For example if some non-thread safe data structure is only altered by one thread, and there is a function which is called by multiple threads, it would be useful to have separate code paths depending on the current thread. If the current thread is the one that alters the data structure, it is ok to alter the data structure directly in the function. However, if the current thread is some other thread, the actual altering would have to be delayed, so that it is performed when it is safe to perform the operation.
Or, would it be better to use some boolean which is given as a parameter to the function to separate the different code paths?
Or do something totally different?
What do you think?
You are not making all too much sense. You said a non-thread safe data structure is only ever altered by one thread, but in the next sentence you talk about delaying any changes made to that data structure by other threads. Make up your mind.
In general, I'd suggest wrapping the access to the data structure up with a critical section, or mutex.
It's possible to use such animals as reader/writer locks to differentiate between readers and writers of datastructures but the performance advantage for typical cases usually wont merit the additional complexity associated with their use.
From the way your question is stated, I'm guessing you're fairly new to multithreaded development. I highly suggest sticking with the simplist and most commonly used approaches for ensuring data integrity (most books/articles you readon the issue will mention the same uses for mutexes/critical sections). Multithreaded development is extremely easy to get wrong and can be difficult to debug. Also, what seems like the "optimal" solution very often doesn't buy you the huge performance benefit you might think. It's usually best to implement the simplist approach that will work then worry about optimizing it after the fact.
There is a trick that could work in case, as you said, the other threads will only make changes only once in a while, although it is still rather hackish:
make sure your "master" thread can't be interrupted by the other ones (higher priority, non fair scheduling)
check your thread
if "master", just change
if other, put off scheduling, if needed by putting off interrupts, make change, reinstall scheduling
really test to see whether there are no issues in your setup.
As you can see, if requirements change a little bit, this could turn out worse than using normal locks.
As mentioned, the simplest solution when two threads need access to the same data is to use some synchronization mechanism (i.e. critical section or mutex).
If you already have synchronization in your design try to reuse it (if possible) instead of adding more. For example, if the main thread receives its work from a synchronized queue you might be able to have thread 2 queue the data structure update. The main thread will pick up the request and can update it without additional synchronization.
The queuing concept can be hidden from the rest of the design through the Active Object pattern. The activ object may also be able to publish the data structure changes through the Observer pattern to other interested threads.
What is the common theory behind thread communication? I have some primitive idea about how it should work but something doesn't settle well with me. Is there a way of doing it with interrupts?
Really, it's just the same as any concurrency problem: you've got multiple threads of control, and it's indeterminate which statements on which threads get executed when. That means there are a large number of POTENTIAL execution paths through the program, and your program must be correct under all of them.
In general the place where trouble can occur is when state is shared among the threads (aka "lightweight processes" in the old days.) That happens when there are shared memory areas,
To ensure correctness, what you need to do is ensure that these data areas get updated in a way that can't cause errors. To do this, you need to identify "critical sections" of the program, where sequential operation must be guaranteed. Those can be as little as a single instruction or line of code; if the language and architecture ensure that these are atomic, that is, can't be interrupted, then you're golden.
Otherwise, you idnetify that section, and put some kind of guards onto it. The classic way is to use a semaphore, which is an atomic statement that only allows one thread of control past at a time. These were invented by Edsgar Dijkstra, and so have names that come from the Dutch, P and V. When you come to a P, only one thread can proceed; all other threads are queued and waiting until the executing thread comes to the associated V operation.
Because these primitives are a little primitive, and because the Dutch names aren't very intuitive, there have been some ther larger-scale approaches developed.
Per Brinch-Hansen invented the monitor, which is basically just a data structure that has operations which are guaranteed atomic; they can be implemented with semaphores. Monitors are pretty much what Java synchronized statements are based on; they make an object or code block have that particular behavir -- that is, only one thread can be "in" them at a time -- with simpler syntax.
There are other modeals possible. Haskell and Erlang solve the problem by being functional languages that never allow a variable to be modified once it's created; this means they naturally don't need to wory about synchronization. Some new languages, like Clojure, instead have a structure called "transactional memory", which basically means that when there is an assignment, you're guaranteed the assignment is atomic and reversible.
So that's it in a nutshell. To really learn about it, the best places to look at Operating Systems texts, like, eg, Andy Tannenbaum's text.
The two most common mechanisms for thread communication are shared state and message passing.
THe most common way for threads to communicate is via some shared data structure, typically a queue. Some threads put information into the queue while others take it out. The queue must be protected by operating system facilities such as mutexes and semaphores. Interrupts have nothing to do with it.
If you're really interested in a theory of thread communications, you may want to look into formalisms like the pi Calculus.
To communicate between threads, you'll need to use whatever mechanism is supplied by your operating system and/or runtime. Interrupts would be unusually low level, although they might be used implicitly if your threads communicate using sockets or named pipes.
A common pattern would be to implement shared state using a shared memory block, relying on an os-supplied synchronization primitive such as a mutex to spare you from busy-waiting when your read from the block. Remember that if you have threads at all, then you must have some kind of scheduler already (whether it's native from the OS or emulated in your language runtime). So this scheduler can provide synchronization objects and a "sleep" function without necessarily having to rely on hardware support.
Sockets, pipes, and shared memory work between processes too. Sometimes a runtime will give you a lighter-weight way of doing synchronization for threads within the same process. Shared memory is cheaper within a single process. And sometimes your runtime will also give you an atomic message-passing mechanism.