What would happen if you called read (or write, or both) from two different threads, on the same file descriptor (let's say we are interested both in a local file and in a socket file descriptor), without explicitly using any synchronization mechanism?
read and write are syscalls, so on a single-core CPU it's unlikely that two reads would be executed "at the same time". But with multiple cores...
What will the Linux kernel do?
And to be a bit more general: is the behavior always the same for other kernels (like the BSDs)?
Edit: According to the close documentation, we should be sure that the file descriptor isn't being used by a syscall in another thread. So it seems that explicit synchronization would be required before closing a file descriptor (and therefore also around read/write, if threads that may call them are still running).
Any system-level (syscall) file descriptor access is thread-safe in all mainstream UNIX-like OSes. Though, depending on their age, they are not necessarily signal-safe.
If you call read, write, accept or similar on a file descriptor from two different tasks, the kernel's internal locking mechanism will resolve the contention. Note, though, that each byte can be read only once, and writes will land in an unspecified order.
The stdio library functions fread, fwrite and co. also have internal locking on their control structures by default, though it is possible to disable that.
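For example, on POSIX systems you can take a FILE's lock explicitly with flockfile()/funlockfile() and pair it with the unlocked variants. A minimal sketch (flockfile is POSIX; fwrite_unlocked is a GNU extension; the function name is illustrative):

    #include <cstdio>
    #include <cstring>

    // Group several stdio calls under one explicit lock so the two pieces
    // are written back to back. flockfile() takes the same internal lock
    // that fwrite() normally acquires per call; fwrite_unlocked() skips it.
    void write_two_parts(FILE *f, const char *hdr, const char *body) {
        flockfile(f);
        fwrite_unlocked(hdr, 1, std::strlen(hdr), f);
        fwrite_unlocked(body, 1, std::strlen(body), f);
        funlockfile(f);
    }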
The comment about close is because it doesn't make a lot of sense to close a file descriptor in any situation in which some other thread might be trying to use it. So while it is 'safe' as far as the kernel is concerned, it can lead to odd, hard to diagnose corner cases.
If a thread closes a file descriptor while a second thread is trying to read from it, the second thread may get an unexpected EBADF error. Worse, if a third thread is simultaneously opening a new file, that might reallocate the same fd, and the second thread might accidentally read from the new file rather than the one it was expecting...
Have a care for those who follow in your footsteps
It's perfectly normal to protect the file descriptor with a mutex. It removes any dependence on kernel behaviour, so your message boundaries are now certain. You then don't have to cite the last paragraph at the bottom of a 15,489-line manpage which explains why the mutex isn't necessary (I exaggerate, but you get my meaning).
It also makes it clear to anyone reading your code that the file descriptor is being used by more than one thread.
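A minimal sketch of that pattern, assuming a shared POSIX file descriptor and C++11's std::mutex (fd_mutex and send_message are illustrative names):

    #include <mutex>
    #include <unistd.h>  // write()

    std::mutex fd_mutex;  // guards every use of the shared descriptor

    // Write one complete message; the mutex, not the kernel, now guarantees
    // the message boundary.
    void send_message(int shared_fd, const void *buf, size_t len) {
        std::lock_guard<std::mutex> guard(fd_mutex);
        (void)write(shared_fd, buf, len);  // error/short-write handling omitted
    }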
Fringe Benefit
There is a fringe benefit to using a mutex that way. Suppose you've got different messages coming from different threads, and some of those messages are more important than others. All you need to do is set the thread priorities to reflect their messages' importance. That way the OS will ensure that your messages are sent in order of importance, for minimal effort on your part.
The result would depend on how the threads are scheduled to run at that particular instant in time.
One way to avoid undefined behavior with multi-threading is to treat the file descriptor like any other shared resource you would protect during memory operations, e.g. updating a linked list or changing a shared variable. If you use a mutex/semaphore/lock or some other synchronization mechanism, it should work as intended.
Related
This is similar to, but a bit different from, existing questions. Say I have many threads that open the same file, but they all do their own fopen and maintain their own FILE pointer.
a) is it necessary to lock fwrite calls if they have their own FILE ptrs?
b) if it is necessary, is locking around fwrite enough or will they potentially flush at different times and end up intermingling when they flush? If yes, would locking on fwrite and then fflush cover it?
This question cannot be answered in the context of programming languages. As far as the programming language is concerned, those file handles are completely independent objects, and whatever you do with one has no effect whatsoever on another.
The question is about the operating system: can it handle multiple write operations to the same underlying file at the same time? In other words, are those writes atomic? I can't speak for all of them, but on Linux, for example, writes of less than PIPE_BUF bytes are atomic (strictly speaking, that guarantee is only documented for pipes and FIFOs).
For the quick measure, yeah, you can put a lock around the I/O part. That'd work, I guarantee it. As for flushing the I/O cache, I'd recommend not doing that. It's always best to let the OS handle I/O timing, because the kernel knows best what's going on. You're not going to have it take effect immediately after calling flush anyway, because it's that complicated (just like other flush operations: Java GC, glFlush and so on). If you choose to stick with this option, please be mindful of the start and end points of the concurrent I/O. You wouldn't want a case where the main thread closes the file while another worker thread is still doing I/O on it.
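A sketch of that quick measure: one mutex shared by all the threads, held across both the fwrite and the fflush so buffered data can't interleave at flush time (this assumes each FILE* was opened in append mode so the kernel serializes the offsets; the names are illustrative):

    #include <cstdio>
    #include <mutex>

    std::mutex file_mutex;  // shared by all threads, even though each has its own FILE*

    // Holding the lock across fwrite *and* fflush ensures each record reaches
    // the kernel as one contiguous chunk before another thread's flush runs.
    void write_record(FILE *fp, const char *data, size_t len) {
        std::lock_guard<std::mutex> guard(file_mutex);
        fwrite(data, 1, len, fp);
        fflush(fp);
    }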
The general solution to this problem is to create a thread that handles the file exclusively. If other threads want to read from or write to the file, they must ask that thread to do it for them. This is tricky, I know. You'd need to compose a simple protocol and a sync mechanism, but in a nutshell it goes like this (a minimal sketch follows the list):
Prepare a queue, a cv (condition variable) and a lock, create a thread, and open the file. It doesn't matter who opens the file.
The thread spawns and waits for the queue to be filled.
Other threads send I/O requests to that thread. A request includes the data for the file and an op code.
The thread handles the requests from the queue. This is where the real I/O happens.
You could use an anonymous FIFO instead of a queue, or skip the op-code part if the file is write-only.
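Here is a minimal sketch of that protocol, assuming C++11 threading primitives and POSIX write(); the Request format, the names, and the lack of error handling are illustrative, not prescriptive:

    #include <condition_variable>
    #include <mutex>
    #include <queue>
    #include <string>
    #include <unistd.h>

    struct Request { enum Op { WRITE, QUIT } op; std::string data; };

    std::queue<Request> requests;        // the queue
    std::mutex queue_mutex;              // the lock
    std::condition_variable queue_cv;    // the cv

    void io_thread(int fd) {             // sole owner of fd; does the real I/O
        for (;;) {
            std::unique_lock<std::mutex> lk(queue_mutex);
            queue_cv.wait(lk, [] { return !requests.empty(); });
            Request req = std::move(requests.front());
            requests.pop();
            lk.unlock();                 // perform the I/O outside the lock
            if (req.op == Request::QUIT) break;
            (void)write(fd, req.data.data(), req.data.size());  // error handling omitted
        }
    }

    void submit(Request req) {           // called by the other threads
        { std::lock_guard<std::mutex> lk(queue_mutex);
          requests.push(std::move(req)); }
        queue_cv.notify_one();
    }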
Unlike network I/O, modern OSes can't do file I/O in a non-blocking manner, so expect significant blocking time (I/O wait). There's also the problem that the queue fills up too quickly and eats a lot of memory when I/O is relatively slow, and there will be cases where the whole program has to wait for the I/O to complete before terminating. Not much you can do about that. You could close the file from another thread while I/O is in progress on Linux (close() is MT-safe); I don't know how that would work on other OSes.
There are alternatives like asynchronous file I/O or overlapped I/O, which involve signal handling or callbacks. Using these doesn't require creating a thread, but each has pros and cons, mostly regarding portability.
I would like to write a multithreading-safe logger using a lock-free queue. Logging threads will push messages to the queue and the logger will pop them and send them to the output. I'm considering how to solve one part of that: sending to the output.
I would like to avoid using mutexes/locks as far as possible.
So, let's assume that I am going to use C++ streams to write to the file/console. We can assume that target system is Linux.
OK, writing to a stream must be just a wrapper (perhaps an advanced wrapper) around the write system call offered by Unix. From what I know, syscalls are atomic (only one process can execute a given syscall at the same time). So it is tempting not to use locks to make writing to the file safe.
But write is a system call that doesn't guarantee writing the whole output: it returns the number of bytes that were successfully written to the file.
Basically, my question is: how can I solve this? Is it possible to avoid a mutex? (I think it is not possible.) And please check my reasoning: am I wrong?
Igor is right: just have one thread do all the log writes. Keep in mind that the kernel has to do locking to synchronize access to the open file descriptor (which keeps track of the file position), so by doing writes from multiple cores you're causing contention inside the kernel. Even worse, you're making system calls from multiple cores, which means the kernel's code / data accesses will dirty your caches on multiple cores.
See this paper for more about the impact of making system calls on the performance of user-space code after the syscall completes. (And about data / instruction cache misses inside the kernel for infrequent syscalls). It definitely makes sense to have one thread doing all the system calls, at least all the write system calls, to keep that part of your process's footprint isolated to one core. As well as the locking contention inside the kernel.
That FlexSC paper is about an idea for batching system calls to reduce user->kernel->user transitions, but they also measure overhead for the normal synchronous system-call method. More important is the discussion of cache-pollution from making system calls.
Alternatively, if you can let multiple threads write to your log file, you could just do that and not use the queue at all.
It's not guaranteed that a large write will finish uninterrupted, but a small to medium sized write should (almost?) always copy its whole buffer on most OSes. Especially if you're writing to a file, not a pipe. IDK how Linux write() behaves when it's preempted, but I expect it usually resumes to finish the write instead of returning without having written all the requested bytes. Partial writes might be more likely when interrupted by a signal.
It is guaranteed that bytes from two write() system calls won't be mixed together; all the bytes from one will be before or after the bytes from the other. You're correct that partial writes are a potential problem, though. I forget if the glibc syscall wrapper will resume the call for you on EINTR. Although in that case, it means no bytes actually got written, or it would have returned success with a byte count.
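A defensive pattern for the partial-write problem is to loop until the whole buffer is out. A sketch, assuming POSIX write() (write_all is an illustrative name):

    #include <cerrno>
    #include <unistd.h>

    // Keep calling write() until every byte is out or a real error occurs.
    // Returns 0 on success, -1 (with errno set) on failure.
    int write_all(int fd, const char *buf, size_t len) {
        while (len > 0) {
            ssize_t n = write(fd, buf, len);
            if (n < 0) {
                if (errno == EINTR) continue;  // interrupted before any byte: retry
                return -1;                     // genuine error
            }
            buf += n;                          // partial write: advance and retry
            len -= static_cast<size_t>(n);
        }
        return 0;
    }

Note that when several threads use such a loop on the same fd, only each individual write() call keeps the no-mixing guarantee; the loop as a whole can still interleave with other writers.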
You should test this, for partial writes and for performance. kernel-space locking might be cheaper than the overhead of your lock-free queue, but making system calls from every thread that generates log messages might be worse for performance. (And when you test this, make sure you do it with some real work happening in your user-space process, not just a loop that only calls write.)
Visual Studio's fread "locks out other threads." There is an alternate version _fread_nolock, which reads "without locking other threads", which should only be used "in thread-safe contexts such as single-threaded applications or where the calling scope already handles thread isolation."
Even after reading other somewhat relevant discussions of the two, I'm confused about whether the locking fread implements is on a specific FILE struct, on a specific actual file, or on all fread calls even to totally different files.
If you use the nolock versions, what level of locking do you need to provide? Can multiple threads in parallel be reading separate files without any locking? Can multiple threads in parallel be writing separate files without any locking? Or are there global or static variables involved that would be corrupted?
So, by using the nolock versions, are you able to potentially achieve better I/O throughput (if you aren't needlessly moving heads, like reading off separate drives, or a SSD drive), or is the potential gain just reducing redundant locks to a single lock (which should be negligible.)
Does VS' ifstream.read function work just like the regular fread? (I don't see a nolock version of it.)
The MS standard library implementation fully supports multi-threading. The C++ standard explains this requirement:
27.2.3: Concurrent access to a stream object, stream buffer object, or C Library stream by multiple threads may result in a data race unless otherwise specified. If one thread makes a library call a that writes a value to a stream and, as a result, another thread reads this value from the stream through a library call b such that this does not result in a data race, then a's write synchronizes with b's read.
This means that if you write to a stream, locking is done (not file locking, but locking of concurrent access to the in-memory stream data structure) to be sure that concurrency is well managed for all the other threads using the same stream.
This locking overhead is always there, even when it isn't needed. This can have a performance cost, according to Microsoft:
the performance of the multithreaded libraries has been improved and is close to the performance of the now-eliminated single-threaded libraries. For those situations when even higher performance is required, there are several new features.
This is why _nolock functions are provided. They access the stream directly without thread locking. It must be used with extreme care, for example:
if your application is single-threaded (another process using the same stream has its own data structure, and the OS manages concurrency there)
if you're sure that no two threads use the same stream (for example, if you have only one reader thread and writing is done outside your program).
if you have another synchronisation mechanism that protects a critical section of your code, for example if you use a mutex lock, or a thread-safe non-blocking algorithm that makes use of atomics.
In such cases, the additional lock for stream access is not needed (it's redundant). For file-intensive functions, it could then be worth using the _nolock variants.
Note: as you've pointed out, it's only worth using the _nolock versions for intensive file access where you make millions of calls.
_fread_nolock() appears to be meant for use once you've made sure the file is locked by an external mechanism (some form of mutex, probably), and then you use it to reduce overhead. Related: What's the intended use of _fread_nolock, _fseek_nolock?
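Schematically, combining a _nolock variant with your own lock might look like this (a sketch against the MSVC CRT; the mutex and function names are illustrative):

    #include <cstdio>
    #include <mutex>

    std::mutex file_mutex;  // you now provide the isolation the CRT skips

    // Read with the CRT's per-FILE locking disabled; correctness depends
    // entirely on file_mutex being taken around every access to fp.
    size_t guarded_read(FILE *fp, void *buf, size_t size, size_t count) {
        std::lock_guard<std::mutex> guard(file_mutex);
        return _fread_nolock(buf, size, count, fp);
    }

And if a FILE is only ever touched by one thread, no lock at all is needed around the _nolock call; that is where the real saving is.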
This may also answer any further questions you might have: it may or may not be possible for your hard drive to actually perform more than one I/O operation at a time, depending on what type of drive you have: https://superuser.com/questions/252959/which-is-faster-copying-everything-at-once-or-one-thing-at-a-time
I'm working on a project under Linux which needs to read from and write to the same fd from multiple threads. And I want to use posix_fadvise to free the page cache.
Can I call posix_fadvise when another thread is reading or writing the same fd?
Read posix_fadvise(2) and syscalls(2). Since posix_fadvise is a genuine syscall (e.g. it wraps fadvise64, which has its own __NR_fadvise64 in <asm/unistd.h>), you should be able to call it while another thread is writing to the same fd, exactly as you may have two threads doing write(2) to the same file descriptor (though what happens then is perhaps non-deterministic).
I imagine that the kernel is internally locking the kernel file object referenced by a file descriptor.
BTW, the man page of posix_fadvise says:
Programs can use posix_fadvise() to announce an intention to access file data in a specific pattern in the future, thus allowing the kernel to perform appropriate optimizations.
The advice applies to a (not necessarily existent) region starting at offset and extending for len bytes (or until the end of the file if len is 0) within the file referred to by fd. The advice is not binding; it merely constitutes an expectation on behalf of the application.
Hence I guess that the kernel may apply the posix_fadvise hint later (or not at all)...
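For instance, a sketch of issuing such a hint after writing a chunk, while other threads keep using the fd (the function name is illustrative):

    #include <fcntl.h>  // posix_fadvise(), POSIX_FADV_DONTNEED

    // After writing the region [offset, offset + len), hint that its
    // page-cache pages may be dropped. The call is safe to make while other
    // threads use fd; the hint is advisory and may be applied late or never.
    void drop_cache_hint(int fd, off_t offset, off_t len) {
        int err = posix_fadvise(fd, offset, len, POSIX_FADV_DONTNEED);
        (void)err;  // returns an errno value on failure rather than setting errno
    }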
So I think you can do that, but I believe you should avoid having several threads working on the same file descriptor, at least for readability reasons (and because of the non-determinism). My feeling is that your code may have some design issues, but it will probably more or less work...
Generally, I would avoid having several threads doing I/O on the same file descriptor (or at the very least, use pwrite(2) or lock the I/O with a mutex...). So while you could do what you are asking, I would avoid doing that.
Remember that I/O operations to a disk file system are much much slower (they may take many milliseconds) that ordinary computations. Locking them with a mutex should not be significant, and will give you more determinism.
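To illustrate the pwrite(2) alternative: because pwrite takes an explicit offset, threads writing to disjoint regions never touch the shared file position. A sketch (the region layout is up to the application):

    #include <unistd.h>

    // Each thread owns a disjoint region of the file; pwrite() neither uses
    // nor updates the shared file offset, so no mutex is needed for it.
    void write_region(int fd, const char *buf, size_t len, off_t region_start) {
        ssize_t n = pwrite(fd, buf, len, region_start);
        (void)n;  // partial-write/error handling omitted for brevity
    }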
I have many POSIX threads: two readers that read from a serial port, and others that write to the same port, all through one file descriptor. How can I share the same descriptor between them? I have synchronized read/write and write/write actions between all threads with semaphores.
Note: I'm supposing a file descriptor should be shareable between threads of the same process, but my code fails with an EBUSY error when the second reader tries to read from the port. (I asked a question about this before.)
Update
This is a little weird: even when only one thread is present at runtime, any call to read() after write() returns -1 with an EBUSY error. Maybe I'm asking the wrong question. Should there be some kind of flush after each write() to make sure that the device is free? Or should I somehow force write() to block?
Clearly, the EBUSY return code signals that the port is in use, and should be queried again later. Your threads should just wait a little bit and try again, until the command passes.
You sort of mention in one of your comments that the system behind the port is a mechanical one, which would explain why it could take a little while for a command to get processed.
I think the "one thread to handle IO" is the best approach. Each read/write would block the thread and avoid the EBUSY problem you are witnessing. All you would have left to do is implement a command queue (very easy with std::queue or similar and just just one mutex to sync all accesses).
UPDATE: reading your update, I guess that the EBUSY errors are just a sign that commands are really slow to execute and finish a little while after the system call returns, to the point that even a single thread doing I/O may experience them. As I said at the beginning of my answer, have the thread wait a bit before reissuing its command, and that should do it.
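Something along these lines (a sketch; the 10 ms delay is an arbitrary placeholder you would tune for your device):

    #include <cerrno>
    #include <unistd.h>

    // Retry a read on the port while the device reports EBUSY, sleeping
    // briefly between attempts so the mechanics have time to catch up.
    ssize_t read_retry(int fd, void *buf, size_t len) {
        for (;;) {
            ssize_t n = read(fd, buf, len);
            if (n >= 0 || errno != EBUSY)
                return n;           // success, or a different error
            usleep(10 * 1000);      // wait 10 ms, then reissue the read
        }
    }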
Open the file with the O_NONBLOCK flag.