Qt QIODevice::write / QTcpSocket::write and bytes written - c++

We are quite confused about the behavior of QIODevice::write in general and the QTcpSocket implementation specifically. There is a similar question already, but its answer is not really satisfactory. The main confusion stems from the bytesWritten signal and the waitForBytesWritten method mentioned there. Both seem to report the bytes that were written from the QIODevice's internal buffer to the actual underlying device (there must be such a buffer, otherwise the method would not make much sense). The question, then, is whether the number returned by QIODevice::write corresponds to this number, or whether it instead indicates the number of bytes that were stored in the internal buffer, not the bytes written to the underlying device. If the returned number indicated the bytes written to the internal buffer, we would need a pattern like the following to ensure all our data is written:
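void writeAll(QIODevice& device, const QByteArray& data) {
    qint64 written = 0; // total bytes accepted so far
    while (written < data.size()) {
        // write() returns the bytes accepted by this call, or -1 on error
        const qint64 n = device.write(data.constData() + written, data.size() - written);
        if (n < 0)
            return; // give up on error instead of looping forever
        written += n; // accumulate instead of overwriting the offset
    }
}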
However, this will insert duplicate data if the return value of QIODevice::write has the same meaning as the bytesWritten signal. The documentation is very confusing about this: both descriptions use the word device, even though it seems logical, and is the general understanding, that one of them actually means written to the buffer and not to the device.
So to summarize, the question is: Is the number returned by QIODevice::write the number of bytes written to the underlying device, so that it is safe to call QIODevice::write without checking the returned number of bytes, because everything is stored in the internal buffer? Or does it indicate how many bytes it could store internally, so that a pattern like the above writeAll has to be employed to safely write all data to the device?
(UPDATE: Looking at the source, the QTcpSocket::write implementation will actually never return fewer bytes than one wanted to write, so the writeAll above is not needed. However, that is specific to the socket and this Qt version; the documentation is still confusing...)

QTcpSocket is a buffered QAbstractSocket. An internal buffer is allocated inside QAbstractSocket, and data is copied in that buffer. The return value of write is the size of the data passed to write().
waitForBytesWritten waits until the data in the internal buffer of QAbstractSocket is written to the native socket.

That previous question answers your question, as does the QIODevice::write(const char * data, qint64 maxSize) documentation:
Writes at most maxSize bytes of data from data to the device. Returns the number of bytes that were actually written, or -1 if an error occurred.
This can (and will in real life) return less than what you requested, and it's up to you to call write again with the remainder.
As for waitForBytesWritten:
For buffered devices, this function waits until a payload of buffered written data has been written to the device...
It applies only to buffered devices. Not all devices are buffered. If they are, and you wrote less than what the buffer can hold, write can return successfully before the device has finished sending all the data.
Devices are not necessarily buffered.

Related

Implementing QIODevice::writeData, confusing documentation

I'm trying to implement a double buffer for a real-time audio application, and QAudioInput requires it to be a subclass of QIODevice. I'm finding the documentation for this method pretty confusing.
First of all, the method signature in the documentation doesn't match the header for Qt 5.9.2, which has virtual qint64 writeData(const char *data, qint64 len) = 0;.
The documentation has this signature though: qint64 QIODevice::writeData(const char *data, qint64 maxSize)
The maxSize parameter confuses me because it implies that I can just buffer some of the data, which the documentation also implies with:
Writes up to maxSize bytes from data to the device. Returns the number of bytes written, or -1 if an error occurred.
However, immediately afterward the documentation says this, which seems contradictory to me:
When reimplementing this function it is important that this function writes all the data available before returning. This is required in order for QDataStream to be able to operate on the class. QDataStream assumes all the information was written and therefore does not retry writing if there was a problem.
So is my QIODevice implementation responsible for buffering all the data in a single call or not?
What they are basically trying to say is: The passed data is maxSize bytes long. Your implementation should write all data and return the number of bytes written.
It is possible to write less data than is available, but you should not. If you do, some classes that use your device (like QDataStream) may not handle it. It depends on how QAudioInput handles write calls. If it checks the result and writes the missing data again when a write was incomplete, not writing all data is fine. If that's not the case, you always have to write all data.
Simply try it out: always write only 1 byte (and return 1). If it works, it's fine; if not, you always have to write all passed data, or fail with -1.

C++ char pointer size exceeds after malloc

I have a char pointer and have allocated memory for it with malloc like this:
char *message;
message=(char *)malloc(4000*sizeof(char));
Later I receive data from a socket into message. What happens if the data exceeds 4000 bytes?
I'll assume you are asking what will happen if you do something like this:
recv(socket,message,5000,0);
and the amount of data read is greater than 4000.
This will be undefined behavior, so you need to make sure that it can't happen. Whenever you read from a socket, you should be able to specify the maximum number of characters to read.
Your question leaves out many details about the network protocol; see the answer by @DavidSchwartz.
But focussing on the buffer in which you store it: if you try to write more than 4K chars into the memory allocated by message, your program could crash.
If you test for the size of the message being received, you could do realloc:
int buf_len = 4000;
char *message;
message = static_cast<char*>(malloc(buf_len));
/* read message, and after you have read 4000 chars, do */
buf_len *= 2;
message = static_cast<char*>(realloc(message, buf_len));
/* rinse and repeat if buffer is still too small */
free(message); // don't forget to clean-up
But this is very labor-intensive. Just use a std::string:
int buf_len = 4000;
std::string message;
message.reserve(buf_len); // allocate 4K to save on repeated allocations
/* read message, std::string will automatically expand, no worries! */
// destructor will automatically clean-up!
It depends on a few factors. Assuming there's no bug in your code, it will depend on the protocol you're using.
If TCP, you will never get more bytes than you asked for. You'll get more of the data the next time you call the receive function.
If UDP, you may get truncation, you may get an error (like MSG_TRUNC). This depends on the specifics of your platform and how you're invoking a receive function. I know of no platform that will save part of a datagram for your next invocation of a receive function.
Of course, if there's a bug in your code and you actually overflow the buffer, very bad things can happen. So make sure you pass only sane values to whatever receive function you're using.
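To make the "sane values" point concrete, here is a minimal sketch for the TCP case. It assumes a connected stream socket and a peer that closes the connection when it is done. The helper name recv_all is hypothetical and not part of any library. Each recv() call is capped at the size of the scratch buffer, so the buffer cannot overflow, and the std::string grows as data is appended.

```cpp
#include <sys/types.h>
#include <sys/socket.h>
#include <cassert>
#include <cerrno>
#include <string>

// Hypothetical helper: reads from a connected stream socket until the peer
// closes the connection, appending everything to `out`.
// Returns true on a clean end-of-stream, false on error.
bool recv_all(int fd, std::string& out) {
    assert(fd >= 0);
    char chunk[4000];  // fixed-size scratch buffer, same size as in the question
    for (;;) {
        // Never ask for more than the scratch buffer can hold.
        ssize_t n = recv(fd, chunk, sizeof(chunk), 0);
        if (n > 0) {
            out.append(chunk, static_cast<std::size_t>(n));  // the string grows as needed
        } else if (n == 0) {
            return true;   // peer closed the connection
        } else if (errno != EINTR) {
            return false;  // real error; retry only if interrupted by a signal
        }
    }
}
```

A real protocol would normally stop at a length prefix or delimiter rather than at connection close; that part depends on details the question does not give.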
In the best case, you get a segmentation fault;
see
What is a segmentation fault?
dangers of heap overflows?

Is reading from an anonymous pipe atomic, in the sense of atomic content?

I am writing a process on Linux with two threads. They communicate using an anonymous pipe, created with the pipe() call.
One end is copying a C structure into the pipe:
struct EventStruct e;
[...]
ssize_t n = write(pipefd[1], &e, sizeof(e));
The other end reads it from the pipe:
struct EventStruct e;
ssize_t n = read(pipefd[0], &e, sizeof(e));
if(n != -1 && n != 0 && n < sizeof(e))
{
// Is a partial read possible here??
}
Can partial reads occur with the anonymous pipe?
The man page (man 7 pipe) stipulates that any write under PIPE_BUF size is atomic. But what they mean is atomic with regard to other writer threads... I am not concerned with multiple-writer issues. I have only one writer thread, and only one reader thread.
As a side note, my structure is 56 bytes long, well below the PIPE_BUF size, which is at least 4096 bytes on Linux. It looks like it's even higher on more recent kernels.
Put differently: on the reading end, do I have to deal with partial reads and store them until I have received a full structure instance?
As long as you are dealing with fixed size units, there isn't a problem. If you write a unit of N bytes on the pipe and the reader requests a unit of N bytes from the pipe, then there will be no issue. If you can't read all the data in one fell swoop (you don't know the size until after you've read its length, for example), then life gets trickier. However, as shown, you should be fine.
That said, you should still detect short reads. There's a catastrophe pending if you get a short read but assume it is full length. However, you should not expect to detect short reads — code coverage will be a problem. I'd simply test n < (ssize_t)sizeof(e) and anything detected is an error or EOF. Note the cast; otherwise, the signed value will be converted to unsigned and -1 won't be spotted properly.
For specification, you'll need to read the POSIX specifications for:
read()
write()
pipe()
and possibly trace links from those pages. For example, for write(), the specification says:
Write requests to a pipe or FIFO shall be handled in the same way as a regular file with the following exceptions:
There is no file offset associated with a pipe, hence each write request shall append to the end of the pipe.
Write requests of {PIPE_BUF} bytes or less shall not be interleaved with data from other processes doing writes on the same pipe. Writes of greater than {PIPE_BUF} bytes may have data interleaved, on arbitrary boundaries, with writes by other processes, whether or not the O_NONBLOCK flag of the file status flags is set.
Or from the specification of read():
Upon successful completion, where nbyte is greater than 0, read() shall mark for update the last data access timestamp of the file, and shall return the number of bytes read. This number shall never be greater than nbyte. The value returned may be less than nbyte if the number of bytes left in the file is less than nbyte, if the read() request was interrupted by a signal, or if the file is a pipe or FIFO or special file and has fewer than nbyte bytes immediately available for reading. For example, a read() from a file associated with a terminal may return one typed line of data.
So, the write() will write atomic units; the read() will only read atomic units because that's what was written. There won't be a problem, which is what I said at the start.
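If you prefer to tolerate short reads rather than treat them as errors, a loop like the following is a common pattern. This is only a sketch; read_full is a hypothetical helper name, and it assumes a blocking descriptor. It keeps calling read() until the requested number of bytes has arrived, EOF is reached, or a real error occurs.

```cpp
#include <unistd.h>
#include <cassert>
#include <cerrno>
#include <cstddef>

// Hypothetical helper: reads exactly `len` bytes unless EOF or an error
// occurs first. Returns the number of bytes read (== len on success,
// smaller if EOF was hit first), or -1 on error.
ssize_t read_full(int fd, void* buf, std::size_t len) {
    assert(buf != nullptr);
    char* p = static_cast<char*>(buf);
    std::size_t got = 0;
    while (got < len) {
        ssize_t n = read(fd, p + got, len - got);
        if (n > 0) {
            got += static_cast<std::size_t>(n);  // possibly a short read; keep going
        } else if (n == 0) {
            break;                               // EOF: the write end was closed
        } else if (errno != EINTR) {
            return -1;                           // real error
        }
    }
    return static_cast<ssize_t>(got);
}
```

The reader from the question would then call read_full(pipefd[0], &e, sizeof(e)) and compare the result against (ssize_t)sizeof(e).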

What does `POLLOUT` event in `poll` Linux function mean?

According to the Linux documentation, POLLOUT means "Normal data may be written without blocking". But this explanation is ambiguous.
How much data is it possible to write without blocking after poll reported this event? 1 byte? 2 bytes? Gigabyte?
After a POLLOUT event on a blocking socket, how can I check how much data I can send to the socket without blocking?
The poll() system call only tells you that something has happened on the file descriptor (for example, on the underlying physical device); it does not tell you how much space or data is available for writing or reading. To find out exactly how many bytes can be read or written, you must call read() or write() and check the return value, which is the number of bytes actually read or written.
Thus, poll() is mainly used by applications that must handle multiple input or output streams without getting stuck on any one of them. Blocking read() or write() calls alone are not enough in this case, since they cannot monitor multiple descriptors at the same time within one thread.
BTW, for a device driver, the underlying implementation of poll in the driver usually looks like this (code from LDD3):
static unsigned int scull_p_poll(struct file *filp, poll_table *wait)
{
poll_wait(filp, &dev->inq, wait);
poll_wait(filp, &dev->outq, wait);
...........
if (spacefree(dev))
mask |= POLLOUT | POLLWRNORM; /* writable */
up(&dev->sem);
return mask;
}
If poll() sets the POLLOUT flag then at least one byte may be written without blocking. You may then find that a write() operation performs only a partial write, so indicated by returning a short count. You must always be prepared for partial reads and writes when multiplexing I/O via poll() and/or select().
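Being prepared for partial writes usually means looping until everything is sent. The sketch below is one way to do that; write_all is a hypothetical helper name. It writes as much as the kernel accepts, advances by the returned count, and waits for POLLOUT only when a non-blocking descriptor reports EAGAIN.

```cpp
#include <poll.h>
#include <unistd.h>
#include <cassert>
#include <cerrno>
#include <cstddef>

// Hypothetical helper: writes all `len` bytes to a (possibly non-blocking)
// descriptor, waiting for POLLOUT whenever the kernel buffer is full.
// Returns true when everything was written, false on error.
bool write_all(int fd, const void* buf, std::size_t len) {
    assert(buf != nullptr || len == 0);
    const char* p = static_cast<const char*>(buf);
    std::size_t sent = 0;
    while (sent < len) {
        ssize_t n = write(fd, p + sent, len - sent);
        if (n >= 0) {
            sent += static_cast<std::size_t>(n);  // may be a partial write
        } else if (errno == EAGAIN || errno == EWOULDBLOCK) {
            // No space right now: block in poll() until the fd is writable again.
            pollfd pfd{fd, POLLOUT, 0};
            if (poll(&pfd, 1, -1) < 0 && errno != EINTR) return false;
        } else if (errno != EINTR) {
            return false;                         // real error
        }
    }
    return true;
}
```

In an event loop that multiplexes many descriptors you would not block inside the helper like this; you would instead remember the unsent remainder and resume when the main poll() call reports POLLOUT.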

C++ how to flush std::stringbuf?

I need to put the standard output of a process (binary data) to a string buffer and consume it in another thread.
Here is the producer:
while (ReadFile(ffmpeg_OUT_Rd, cbBuffer, sizeof(cbBuffer), &byteRead, NULL)){
tByte += byteRead; //total bytes
sb->sputn(cbBuffer, byteRead);
}
m_bIsFinished = true;
printf("%d bytes are generated.\n", tByte);
Here is the consumer:
while (!PCS_DISPATCHER_INSTANCE->IsFinished())
Sleep(200);
Sleep(5000);
Mystringbuf* sb = PCS_DISPATCHER_INSTANCE->sb;
printf("Avail: %d\n", sb->in_avail());
It turns out that the consumer cannot get all the bytes produced by the producer.
( tByte <> sb->in_avail() )
Is it a kind of internal buffering problem? If yes, how to force the stringbuf to flush its internal buffer?
A streambuf has nothing like flush: writes are done directly into the buffer. There is a pubsync() member that could help if you used a derived object such as a filebuf, but this does not apply to your case.
Your issue most likely comes from a data race on sputn() or in_avail(). Protect access to the streambuf either via a mutex or via an atomic. If m_bIsFinished is not atomic, then depending on your implementation of IsFinished(), synchronisation between the threads might not be guaranteed (for example, the producer could write to memory while the consumer still obtains an outdated value from the CPU cache), which could lead to such a data race.
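One way to set this up is sketched below, assuming C++11 or later. The class name SharedBuffer and its members are hypothetical and do not come from the question's code. Every access to the stringbuf goes through a mutex, and the finished flag is a std::atomic<bool>, so the consumer is guaranteed to see all data written before the flag was set.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <sstream>
#include <string>
#include <thread>

// Hypothetical wrapper: a std::stringbuf shared between one producer thread
// and one consumer thread.
class SharedBuffer {
public:
    // Called by the producer for every chunk it reads.
    void append(const char* data, std::size_t n) {
        assert(data != nullptr || n == 0);
        std::lock_guard<std::mutex> lock(mutex_);
        buf_.sputn(data, static_cast<std::streamsize>(n));
    }
    // Called by the consumer: returns a copy of everything written so far.
    std::string snapshot() {
        std::lock_guard<std::mutex> lock(mutex_);
        return buf_.str();
    }
    void finish() { finished_.store(true, std::memory_order_release); }
    bool finished() const { return finished_.load(std::memory_order_acquire); }

private:
    std::mutex mutex_;
    std::stringbuf buf_;
    std::atomic<bool> finished_{false};
};
```

The consumer can then poll finished() (or better, wait on a condition variable) and call snapshot() afterwards, instead of relying on Sleep() calls and in_avail().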
Edit:
If you experience the issue within a single thread, thus eliminating any potential race condition, it may come from the implementation of streambuf. I could reproduce this with MSVC13 in a single-threaded application:
tracing showed that number of bytes read were accurate, but in_avail() result was always smaller or equal to tByte through the whole loop.
when reading the streambuf, the correct total number of bytes were read (thus more than indicated by in_avail()).
This behaviour is compliant. According to the C++ standard, in_avail() shall return egptr() - gptr() if a read position is available, and otherwise showmanyc(). The latter is defined as returning an estimate of the number of characters available in the sequence. The only guarantee given is that you could read at least in_avail() bytes without encountering eof.
Workaround: use sb->pubseekoff(0,ios::end) - sb->pubseekoff(0,ios::beg); to count the number of bytes available, and make sure you are repositioned at sb->pubseekoff(0,ios::beg) before you read your streambuf.
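The workaround can be wrapped in a small function, sketched here under the assumption that the buffer is seekable (as std::stringbuf is). The name stored_bytes is hypothetical. Unlike the one-liner above, this variant passes std::ios_base::in explicitly so that only the read position is moved and later sputn() calls keep appending at the end.

```cpp
#include <cassert>
#include <ios>
#include <sstream>
#include <streambuf>

// Hypothetical helper: returns the total number of bytes stored in a seekable
// stream buffer and leaves the read position at the beginning.
// Returns -1 if the buffer does not support seeking.
std::streamoff stored_bytes(std::streambuf& sb) {
    using pos_type = std::streambuf::pos_type;
    using off_type = std::streambuf::off_type;
    const pos_type failed = pos_type(off_type(-1));

    // Move only the read position: first to the end, then back to the start.
    const pos_type end = sb.pubseekoff(0, std::ios_base::end, std::ios_base::in);
    const pos_type beg = sb.pubseekoff(0, std::ios_base::beg, std::ios_base::in);
    if (end == failed || beg == failed) return -1;
    return end - beg;
}
```

In a multithreaded program this call still needs the same synchronisation as any other access to the buffer.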