How to know if a spinlock is held in kernel space - concurrency

I'm writing a Linux device driver using the NVIDIA API, and I noticed that there's a function that fails if I call it while holding a spinlock. I was asking myself: how does a kernel function know that it's being called with a spinlock held?
Maybe it's better if I give a pseudocode example:
spin_lock_irqsave(&my_lock, flags);
nvidia_p2p_get_pages(...);
spin_unlock_irqrestore(&my_lock, flags);
The function nvidia_p2p_get_pages returns an error in this situation (but it works if I call it without the spinlock... anyway, this is not the problem).
What could be happening inside that function that makes it fail? Maybe it tries to take another spinlock? Or sleep? Or does it check for spinlocks... and how? How can it know about the spinlock?
thank you!

It would be possible to check whether a specific lock is locked by calling spin_is_locked().
However, it is unlikely that the nvidia driver knows about your my_lock.
Many functions care only about the fact that they (might) need to sleep.
This would blow up in an atomic context (e.g., when interrupts are disabled or a spinlock is held).
To detect this, many such functions call might_sleep() to log a warning before bad things actually happen.
(might_sleep() does not know about your locks either; when the kernel is built with CONFIG_DEBUG_ATOMIC_SLEEP, it just checks whether the current context is atomic, i.e., whether interrupts or preemption are disabled.)

Related

Is it safe to kill a thread that is executing a memcpy?

Context:
I am working on an application that needs fast access to large files, so I use memory-mapping. As a consequence, reading and writing become a simple memcpy. I am now trying to add the ability to abort any reads or writes in progress.
The first thing that came to mind (since I don't know of any interruptible memcpy function) was to memcpy a few KB at a time and check between copies whether the operation should be aborted. This should ensure a near-instantaneous abort, if the read is reasonably fast.
If it isn't, however, the application shouldn't take ages to abort, so my second idea was to use multithreading. The memcpy happens in its own thread, and a controlling thread uses WaitForMultipleObjects on the memcpy thread and on an event that signals the abort. It would then kill the memcpy thread if the abort event was signaled. However, the documentation for TerminateThread states that one should be absolutely sure not to leave the system in a bad state, for instance by failing to release resources.
Question:
Does a memcpy do anything that would make it unsafe to kill it when copying mapped memory? Is it safe to do so? Is it implementation dependent (using different operating systems/architectures than Windows x86-64)?
I do realize that using the second approach may be complete overkill, since no 1 KB read/write is ever going to take that long realistically, but I just want to be safe.
If at all possible, you should choose a different design. TerminateThread should not be thought of as a normal function; it is more for debugging and power tools.
I would recommend that you create a wrapper around memcpy that copies in chunks. The chunk size is really up to you and depends on your responsiveness requirements. 1 MiB is probably a good starting point.
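The chunked wrapper recommended above could look roughly like the following. This is only a sketch under assumptions: the name abortable_memcpy, the abort_requested flag and the default chunk size parameter are made up for illustration, and std::atomic requires C++11.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical wrapper: copies `size` bytes in chunks and polls an abort
// flag between chunks. Returns the number of bytes actually copied, which
// equals `size` unless the copy was aborted.
std::size_t abortable_memcpy(void* dst, const void* src, std::size_t size,
                             const std::atomic<bool>& abort_requested,
                             std::size_t chunk_size = 1u << 20) {  // 1 MiB
    assert(chunk_size > 0);
    unsigned char* out = static_cast<unsigned char*>(dst);
    const unsigned char* in = static_cast<const unsigned char*>(src);
    std::size_t copied = 0;
    while (copied < size) {
        // Check the flag before every chunk so an abort takes effect
        // after at most one chunk's worth of copying.
        if (abort_requested.load(std::memory_order_acquire)) {
            break;
        }
        const std::size_t n = std::min(chunk_size, size - copied);
        std::memcpy(out + copied, in + copied, n);
        copied += n;
    }
    return copied;
}
```

The controlling thread would set the flag (for example `abort_requested.store(true)`) instead of terminating the copying thread; the caller can tell from the return value how much of the destination is valid.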
If you absolutely want to kill threads you have to take a couple of things into account:
You don't know how memcpy works internally or how much it has copied, so you have to assume that the contents of the whole destination range are undefined when you abort.
Terminating a thread will leak memory on some versions of Windows. There are workarounds for that.
Don't hold any locks in the thread.

Detecting infinite recursion in v8

I am using Google's V8 JavaScript engine to embed a JS interpreter in my project, which must be able to execute user-provided code. I am wondering whether it is possible to set something up before calling any user code which ensures that if the code tries to recurse indefinitely (or even if it just executes for too long), it can somehow be made to abort, throw an otherwise uncaught exception, and report the issue back to the caller.
Thank you all for responses so far... yes, I realized not long after I posted this that I was basically asking for some kind of solution to the halting problem, which I know is unsolvable, and is actually far more than what I really need.
What I'd need is either some mechanism for detecting when something running in the V8 environment is not returning quickly enough, or else simply a mechanism to detect if recursion is happening at all... my use cases are such that the end user should not be using any recursion anyway, and if I can detect it, then I could reject the code at that point instead of blindly executing it. It would be allowed, however, for different threads, with different isolates, to invoke the same functions at the same time, so I can't just use a static local variable to lock out another call to the same function.
A compiler [V8 is definitely a compiler in this context, even if it isn't "always" a compiler] can detect recursion, but if the code is clever enough (for example depending on variables that aren't known at compile time), it's not possible to detect whether it has infinite or finite recursion.
I would simply state that "execution over X seconds is disallowed", and if the execution takes longer than that, abort it. You can do this by having a "watchdog thread" that gets signalled when the code completes; if the watchdog thread gets to run for X seconds without that signal, kill the main thread and report back to the user code. No, I don't know EXACTLY how to write this code in conjunction with V8.
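The watchdog idea above can be sketched in plain C++11, independent of V8. The class name Watchdog and the on_timeout callback are hypothetical; in a V8 embedding the callback is where the script would be aborted (V8's Isolate has a TerminateExecution() method intended for this, but wiring it up is not shown here and depends on your setup).

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>

// Hypothetical helper: calls on_timeout() unless the object is destroyed
// (i.e. the guarded code finished) before `limit` has elapsed.
class Watchdog {
public:
    Watchdog(std::chrono::milliseconds limit, std::function<void()> on_timeout)
        : fired_(false), done_(false) {
        assert(on_timeout != nullptr);
        worker_ = std::thread([this, limit, on_timeout] {
            std::unique_lock<std::mutex> lock(mutex_);
            // wait_for returns false if the limit elapsed while done_ was
            // still false; the predicate also filters spurious wakeups.
            const bool finished =
                cv_.wait_for(lock, limit, [this] { return done_; });
            lock.unlock();
            if (!finished) {
                fired_ = true;
                on_timeout();
            }
        });
    }

    // Destroying the watchdog signals "the code completed" and joins.
    ~Watchdog() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            done_ = true;
        }
        cv_.notify_one();
        worker_.join();
    }

    bool fired() const { return fired_; }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    std::atomic<bool> fired_;
    bool done_;  // protected by mutex_
    std::thread worker_;
};
```

The caller would create a Watchdog on the stack just before running the user script and let it go out of scope afterwards; the callback runs on the watchdog thread, so whatever it does must be safe to call from another thread.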

How to (unit) test if a function is lock free?

I would like to add several unit tests to my code. Since I also load plug-ins, I don't always have access to the code I'm running.
What I would really like to test is whether the function I'm calling is lock free.
Is there any hook, or other way, to test whether a call to a non-lock-free function happened between points A and B in my program?
Another, less complicated question is how to hook all calls to locking functions (locks, system calls, ...). I know how to hook calls to malloc on Windows but nothing else.
Thank you for your help
You can't.
You could substitute a different implementation of pthread_mutex_lock, but code could make direct calls to e.g. futex, and if you replace that, the code could still call it directly with syscall(SYS_futex,...). You could profile the code or use something like strace to detect all such calls, but that still wouldn't tell you if the code implements its own custom spinlock in assembly.
I'm pretty sure you can't do that without instrumenting the locks, or something similar.
One could come up with a lot of scenarios where the call of a locking function causes different behaviour in testing [possibly only when a "special test mode" is enabled] than in production code. For example, add a 100 ms sleep to the lock method, call another function that takes the same lock, and compare the time against the "no competition for the lock" case.
Or we could keep a count of calls to lock, and see if the count before and after the function is the same (or has increased by the expected amount, if the function is supposed to call lock a certain number of times).
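The counting idea could be sketched like this, assuming the code under test can be made to use an instrumented mutex type. CountingMutex, locks_taken_by and expect_no_locks are hypothetical names, and the counter is global, so lock calls made by other threads during the measurement are counted too.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

// Hypothetical instrumented mutex: behaves like std::mutex but counts
// every successful lock() in a process-wide counter.
class CountingMutex {
public:
    void lock() {
        inner_.lock();
        ++lock_calls();
    }
    void unlock() { inner_.unlock(); }

    static std::atomic<unsigned long>& lock_calls() {
        static std::atomic<unsigned long> count(0);
        return count;
    }

private:
    std::mutex inner_;
};

// Runs fn() and returns how many CountingMutex::lock() calls happened
// while it ran (on any thread).
template <typename Fn>
unsigned long locks_taken_by(Fn fn) {
    const unsigned long before = CountingMutex::lock_calls().load();
    fn();
    return CountingMutex::lock_calls().load() - before;
}

// Test helper: fails if fn() took any instrumented lock.
template <typename Fn>
void expect_no_locks(Fn fn) {
    assert(locks_taken_by(fn) == 0);
}
```

Because CountingMutex provides lock() and unlock(), it works with std::lock_guard<CountingMutex>. As the answer says, this only sees locks that go through the instrumented type; it cannot detect a hand-written spinlock or a direct system call.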
But a generic way that isn't intrusive into the locking mechanism, I'm pretty sure it's impossible.
Of course, code-review and clear documentation as to what code calls locks and which doesn't would also be useful - and good reviewers that spot errors.
As the others have already answered, it is not possible to test whether the algorithm is lock-free or not. However, it is possible to test that it behaves consistently in a multi-threaded environment. My experience in this area is limited to a lock-free queue (which I wrote myself, but based on an academic paper), so my tests are based around a queue, which may or may not be useful to you.
I used multiple threads to hammer the queue and tested three things:
1. Thread safety: the queue must not crash under heavy load.
2. Speed: how do the response times vary under heavy load?
3. Consistency: the queue mustn't lose items.
In my test, I also varied the number of readers and writers. The queue will behave differently depending on the ratio of readers to writers. More readers than writers will generally result in a nearly empty queue, whereas the inverse will result in a queue that continually expands until the writers stop writing.
Point 2 might be of interest to you, as you can generally tell whether the algorithm is lock-free based on the variance of response times under heavy load. If response times remain fast under heavy load, then you can infer that the algorithm is lock-free. Or at least, if it isn't, it behaves as if it is.

What's the proper way to flag a thread to exit using Boost without C++11

After reading various answers on how volatile should not be used to flag a running thread to exit (and the suggestions to use boost::atomic<>), I still cannot find an answer on how to properly do this using Boost without C++11.
Should I use boost::mutex?
If so, do I need to lock on my m_stopThread variable where I change it to true and in my run loop where I check it?
Is boost::mutex lock call going to make a call into the operating system or is it lighter just using memory barriers instructions etc.?
I suppose it is only necessary to call something that issues a write memory barrier after setting the flag and a read memory barrier before testing it. It may be an atomic operation, a mutex access, or anything else.
(I suppose that even entering different mutexes will be OK :)
If you are not in a hurry, you may do nothing, because the right barrier instruction will be issued at some point in the future (at least when a hardware interrupt occurs).
Of course, m_stopThread should be declared volatile
(Although I may be wrong from the Standard's viewpoint)

Multithreading: a blocking wait with timeout

I'm using TinyThread++ to get clean and simple platform independent control over threading features in my project. I just came upon a situation where I'd like to have responsive synchronized message passing without pegging the CPU, while allowing a thread to continue to do a bit of work on the side while it is idle. Sure, I could simply spawn a third thread to do this "other work" but all I'm missing is a condition variable wait(int ms) type function rather than the wait() that already works great. The idea is that I'd like for it to block only for up to ms milliseconds, so it will be able to time out and perform some actions periodically (during which the thread will not be actively waiting on the condition variable). The idea is that even though it's nice to have the thread sitting there waiting to pounce on any incoming messages, if I give it some task to do on the side which takes only 50 microseconds to execute, and I only need to run that once every second, it definitely shouldn't push me to make yet another thread (and message queue and other resources) to get it done.
Does any of this make sense? I'm looking for suggestions on how I might go about implementing this. I'm hoping that adding a couple of lines to the TinyThread code can provide me with this functionality.
Well, the source code for the wait function isn't very complicated, so making the required modifications looks simple enough:
The Linux implementation relies on the pthread_cond_wait function, which can trivially be changed to the pthread_cond_timedwait function. Do read the documentation carefully in case I forgot about any details.
On the Windows side of things, it's a little more complicated, and I'm no expert on multithreading on Windows. That being said, if there's a timed version of the _wait function (I'm pretty sure there is), changing that should work just fine. Again, read over the documentation carefully before doing any modifications.
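For the POSIX side, a timed wait built on pthread_cond_timedwait might look like the sketch below. It is not TinyThread's code: the Mailbox struct and the function names are hypothetical, and it assumes a condition variable using the default CLOCK_REALTIME clock. One detail that is easy to miss is that pthread_cond_timedwait expects an absolute deadline rather than a relative duration.

```cpp
#include <cassert>
#include <cerrno>
#include <pthread.h>
#include <time.h>

// Hypothetical single-slot mailbox guarded by a mutex and condition variable.
struct Mailbox {
    pthread_mutex_t mutex;
    pthread_cond_t cond;
    bool has_message;
};

void mailbox_init(Mailbox& box) {
    pthread_mutex_init(&box.mutex, NULL);
    pthread_cond_init(&box.cond, NULL);
    box.has_message = false;
}

void post_message(Mailbox& box) {
    pthread_mutex_lock(&box.mutex);
    box.has_message = true;
    pthread_mutex_unlock(&box.mutex);
    pthread_cond_signal(&box.cond);
}

// Waits up to timeout_ms for a message. Returns true if a message was
// consumed, false if the wait timed out (or failed) without one.
bool wait_for_message(Mailbox& box, long timeout_ms) {
    assert(timeout_ms >= 0);

    // Convert the relative timeout into an absolute deadline.
    timespec deadline;
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec += 1;
        deadline.tv_nsec -= 1000000000L;
    }

    pthread_mutex_lock(&box.mutex);
    int rc = 0;
    // Re-check the predicate after every wakeup: the wait may return
    // spuriously. Stop on ETIMEDOUT or any other error.
    while (!box.has_message && rc == 0) {
        rc = pthread_cond_timedwait(&box.cond, &box.mutex, &deadline);
    }
    const bool got = box.has_message;
    box.has_message = false;
    pthread_mutex_unlock(&box.mutex);
    return got;
}
```

A worker loop could then call `wait_for_message(box, 1000)` and run its periodic side task whenever the call returns false.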
Now, before you go off and make these modifications: I don't think what you're trying to do is a good idea. The main advantage of using threads is to conceptually separate different tasks. Trying to do multiple things in a single thread is a bit like trying to do multiple things in a single function: it complicates the design and makes things harder to debug. So unless the overhead of creating a new thread is provably too great, or unless the resulting code remains simple and easy to understand, I'd split it up into multiple threads.
Finally, I get the feeling that you might not be aware that condition variables can return spuriously (returns without anybody having done any signalling or returns when the condition is still false). So just in case, I'd suggest reviewing the usage examples and making sure you understand why those loops are there.