stringstream vs ifstream (ofstream) in c++ using socket programming - c++

I have one question about socket programming in C++. Most of the tutorials I found on the web assume the following (binding etc. is omitted):
1. there is a string in the client process
2. it is saved to a file
3. the file is sent to the server by first reading the file into a stream
4. the server receives the stream and writes it into another file.
Then my question is: what if we use stringstream at step 2 instead of saving to a file? File I/O (in C++, ifstream and ofstream are typically used) is generally slow. Is it more efficient to use stringstream directly?

Your Original Question:
"What if we can use stringstream at step 2 instead of saving as a file?"
My Initial Response:
stringstream has nothing to do with sockets or file I/O.
What you are missing is a fundamental idea of I/O: every I/O device is accessed through the abstraction of a file, and there is no way around that. Nothing is "saved" in a logical file stream; the bytes are buffered temporarily in memory and then flushed.
stringstream is a nice C++ library utility that lets you treat a string as a stream. Just as you read from an input file stream byte after byte until EOF or an error, or write to an output file stream byte after byte, with stringstream you can read from and write to a string. This is really helpful when you want to divide a string into small logical units. For example, suppose you read a line of text and want to extract each word from it by treating the line as a stream of words.
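The word-splitting case can be sketched as follows. This is a minimal illustration; splitWords is a hypothetical helper name, not something from the standard library:

```cpp
#include <sstream>
#include <string>
#include <vector>

// Split one line of text into whitespace-separated words by treating
// the string as an input stream.
std::vector<std::string> splitWords(const std::string& line) {
    std::istringstream in(line);
    std::vector<std::string> words;
    std::string word;
    while (in >> word) {   // operator>> skips whitespace and stops at end of string
        words.push_back(word);
    }
    return words;
}
```

Note that nothing here touches a file or a socket: the stream only walks over bytes that are already in memory.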
Further Instructions To Guide You To The Right Direction:
Nothing is "saved" in a logical file stream. Every I/O operation in a computer system is performed through "logical" files. A socket connection has a file descriptor on each end: a client file descriptor and a server file descriptor (the connected file descriptor). The server listens for connection requests on a listening file descriptor, which exists for the lifetime of the server. When the server accepts a connection request, accept returns another file descriptor, the connected file descriptor, which exists only as long as the client-server connection is open.
int accept(int listenfd, struct sockaddr *addr, socklen_t *addrlen);
If you want to read from or write to a file descriptor and also buffer the bytes, you have to do exactly that yourself: buffer the bytes. This matters for servers because of short counts: a read may return fewer bytes than requested, the connection might time out, or the call might be interrupted by a signal. There are several techniques you could implement, and they cannot all be covered in this thread. Based on your question, here is an example of how to buffer reads from a descriptor, handle short counts, and handle signal interruptions.
For example, the following function reads n bytes without buffering:
ssize_t rio_readn(int fd, void *usrbuf, size_t n)
{
    size_t nleft = n;
    ssize_t nread;
    char *bufp = usrbuf;
    while (nleft > 0) {
        if ((nread = read(fd, bufp, nleft)) < 0) {
            if (errno == EINTR) /* Interrupted by sig handler return */
                nread = 0;      /* and call read() again */
            else
                return -1;      /* errno set by read() */
        }
        else if (nread == 0)
            break;              /* EOF */
        nleft -= nread;
        bufp += nread;
    }
    return (n - nleft);         /* Return >= 0 */
}
We can implement the following steps to do buffered and robust IO operations (note RIO means robust IO):
Step 1: Set up empty read buffer and associate an open file descriptor so that we can implement our robust IO operations
#define RIO_BUFSIZE 8192
typedef struct {
    int rio_fd;                /* Descriptor for this internal buf */
    int rio_cnt;               /* Unread bytes in internal buf */
    char *rio_bufptr;          /* Next unread byte in internal buf */
    char rio_buf[RIO_BUFSIZE]; /* Internal buffer */
} rio_t;
// Initialize robust IO buffer
void rio_readinitb(rio_t *rp, int fd)
{
    rp->rio_fd = fd;
    rp->rio_cnt = 0;
    rp->rio_bufptr = rp->rio_buf;
}
Step 2: A robust read utility function to handle short count
static ssize_t rio_read(rio_t *rp, char *usrbuf, size_t n)
{
    int cnt;
    while (rp->rio_cnt <= 0) { /* Refill if buf is empty */
        rp->rio_cnt = read(rp->rio_fd, rp->rio_buf,
                           sizeof(rp->rio_buf));
        if (rp->rio_cnt < 0) {
            if (errno != EINTR) /* Real error; on EINTR just retry the read */
                return -1;
        }
        else if (rp->rio_cnt == 0) /* EOF */
            return 0;
        else
            rp->rio_bufptr = rp->rio_buf; /* Reset buffer ptr */
    }
    /* Copy min(n, rp->rio_cnt) bytes from internal buf to user buf */
    cnt = n;
    if (rp->rio_cnt < n)
        cnt = rp->rio_cnt;
    memcpy(usrbuf, rp->rio_bufptr, cnt);
    rp->rio_bufptr += cnt;
    rp->rio_cnt -= cnt;
    return cnt;
}
Step 3: A robust IO function for buffered reading
ssize_t rio_readnb(rio_t *rp, void *usrbuf, size_t n)
{
    size_t nleft = n;
    ssize_t nread;
    char *bufp = usrbuf;
    while (nleft > 0) {
        if ((nread = rio_read(rp, bufp, nleft)) < 0) {
            if (errno == EINTR) /* Interrupted by sig handler return */
                nread = 0;      /* Call read() again */
            else
                return -1;      /* errno set by read() */
        }
        else if (nread == 0)
            break;              /* EOF */
        nleft -= nread;
        bufp += nread;
    }
    return (n - nleft);         /* Return >= 0 */
}
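To come back to the original question: because a socket is just another descriptor, a string built in memory can be written to it directly, with no intermediate file. The following is a minimal sketch, assuming a POSIX system and an already connected descriptor; writeAll and sendReport are hypothetical names, and the "id=..." payload layout is made up for illustration:

```cpp
#include <cerrno>
#include <cstddef>
#include <sstream>
#include <string>
#include <unistd.h>

// Write all of `data` to `fd`, retrying on short writes and on EINTR.
// Returns true on success, false on any other error (errno set by write()).
bool writeAll(int fd, const std::string& data) {
    std::size_t sent = 0;
    while (sent < data.size()) {
        ssize_t n = write(fd, data.data() + sent, data.size() - sent);
        if (n < 0) {
            if (errno == EINTR) continue; // interrupted by a signal: try again
            return false;
        }
        sent += static_cast<std::size_t>(n);
    }
    return true;
}

// Build the payload in memory with a string stream and hand the bytes
// straight to the descriptor.
bool sendReport(int fd, int id, const std::string& text) {
    std::ostringstream out;
    out << "id=" << id << '\n' << text << '\n';
    return writeAll(fd, out.str());
}
```

Whether this is noticeably faster than going through a file depends on the workload, but it does remove one write and one read of the same data.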

Related

C++ TCP recv unknown buffer size

I want to use the function recv(socket, buf, len, flags) to receive an incoming packet. However, I do not know the length of this packet prior to runtime, so the first 8 bytes are supposed to tell me its length. I don't want to just allocate an arbitrarily large buffer, so is it possible to set len = 8 and have buf be of type uint64_t, and then afterwards call
memcpy(dest, &buf, buf)?
Since TCP is stream-based, I'm not sure what type of packages you mean. I will assume that you are referring to application-level packages, that is, packages defined by your application and not by an underlying protocol like TCP. I will call them messages instead to avoid confusion.
I will show two possibilities. The first reads a message without knowing its length until reading has finished. The second makes two calls: first it reads the size of the message, then it reads the whole message at once.
Read data until the message is complete
Since TCP is stream-based, you will not lose any data when your buffer is not big enough, so you can read a fixed number of bytes at a time. If something is missing, you can call recv again. Here is an extensive example. I wrote it without testing, so treat it as a sketch.
// fd is the connected socket descriptor, assumed to be defined elsewhere.
std::size_t offset = 0; // number of valid bytes currently held in buf
std::vector<char> buf(512);
std::vector<char> readMessage() {
    while (true) {
        ssize_t ret = recv(fd, buf.data() + offset, buf.size() - offset, 0);
        if (ret < 0) {
            if (errno == EINTR) {
                // Interrupted, just try again ...
                continue;
            } else {
                // Error occurred. Throw exception.
                throw IOException(strerror(errno));
            }
        } else if (ret == 0) {
            // No data available anymore.
            if (offset == 0) {
                // Client did just close the connection
                return std::vector<char>(); // return empty vector
            } else {
                // Client closed the connection while sending a message?
                // It is not a clean shutdown. Throw exception.
                throw ProtocolException("Unexpected end of stream");
            }
        } else if (isMessageComplete(buf)) {
            // Message is complete.
            buf.resize(offset + ret); // Truncate buffer to the bytes actually received
            std::vector<char> msg = std::move(buf);
            buf.clear(); // a moved-from vector is in an unspecified state
            offset = 0;
            std::size_t msgLen = getSizeOfMessage(msg);
            if (msg.size() > msgLen) {
                // msg already contains the beginning of the next message.
                // Write it back to buf and remember how many bytes are valid.
                offset = msg.size() - msgLen;
                buf.resize(offset);
                std::memcpy(buf.data(), msg.data() + msgLen, offset);
                msg.resize(msgLen);
            }
            buf.resize(std::max<std::size_t>(2 * buf.size(), 512)); // prepare buffer for next message
            return msg;
        } else {
            // Message is not complete right now. Read more...
            offset += ret;
            buf.resize(std::max(buf.size(), 2 * offset)); // double available memory
        }
    }
}
You have to define bool isMessageComplete(std::vector<char>) and std::size_t getSizeOfMessage(std::vector<char>) yourself.
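What those two helpers look like depends entirely on your protocol. As one possible sketch, assume every message ends with a newline. That delimiter is an assumption for illustration only, and so is the extra valid parameter: it differs from the signatures named above, and is there because the buffer may be larger than the number of bytes received so far.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

constexpr char kDelimiter = '\n'; // assumed end-of-message marker

// Size of the first message within the first `valid` bytes of buf,
// delimiter included. Returns 0 if no complete message is present yet.
std::size_t getSizeOfMessage(const std::vector<char>& buf, std::size_t valid) {
    valid = std::min(valid, buf.size());
    auto end = buf.begin() + static_cast<std::ptrdiff_t>(valid);
    auto it = std::find(buf.begin(), end, kDelimiter);
    return it == end ? 0 : static_cast<std::size_t>(it - buf.begin()) + 1;
}

// True once the valid part of buf holds at least one whole message.
bool isMessageComplete(const std::vector<char>& buf, std::size_t valid) {
    return getSizeOfMessage(buf, valid) != 0;
}
```

A delimiter only works if the payload can never contain it, so for binary data a length header is usually the safer choice.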
Read the header and check the length of the message
The second possibility is to read the header first: just the 8 bytes that contain the size of the message in your case. After that, you know the size of the message, which means you can allocate enough storage and read the whole message at once:
/// Reads n bytes from fd.
bool readNBytes(int fd, void *buf, std::size_t n) {
    std::size_t offset = 0;
    char *cbuf = static_cast<char*>(buf);
    if (n == 0) return true; // nothing to read, e.g. an empty message
    while (true) {
        ssize_t ret = recv(fd, cbuf + offset, n - offset, MSG_WAITALL);
        if (ret < 0) {
            if (errno != EINTR) {
                // Error occurred
                throw IOException(strerror(errno));
            }
            // Interrupted by a signal: just try again
        } else if (ret == 0) {
            // No data available anymore
            if (offset == 0) return false;
            else throw ProtocolException("Unexpected end of stream");
        } else if (offset + ret == n) {
            // All n bytes read
            return true;
        } else {
            offset += ret;
        }
    }
}
/// Reads message from fd
std::vector<char> readMessage(int fd) {
    std::uint64_t size;
    if (readNBytes(fd, &size, sizeof(size))) {
        std::vector<char> buf(size);
        if (readNBytes(fd, buf.data(), size)) {
            return buf;
        } else {
            throw ProtocolException("Unexpected end of stream");
        }
    } else {
        // connection was closed
        return std::vector<char>();
    }
}
The flag MSG_WAITALL requests that the function block until the full amount of data is available. However, you cannot rely on that: the call may still return fewer bytes, for example when a signal is caught. You have to check the count and read again if something is missing, as I did above.
readNBytes(fd, buf, n) reads n bytes. As long as the connection is not closed by the other side, the function will not return without having read n bytes. If the connection was closed by the other side before any byte arrived, the function returns false. If the connection was closed in the middle of a message, an exception is thrown. If an I/O error occurred, a different exception is thrown.
readMessage reads 8 bytes [sizeof(std::uint64_t)] and uses them as the size of the next message. Then it reads the message.
If you want platform independence, you should convert size to a defined byte order. x86 computers use little endian, while big endian is commonly used in network traffic.
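One portable way to do that conversion is to assemble the bytes with shifts, which gives the same result on little-endian and big-endian hosts. A minimal sketch; encodeLength and decodeLength are hypothetical helper names:

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Encode a 64-bit length as 8 bytes, most significant byte first (big endian).
std::array<unsigned char, 8> encodeLength(std::uint64_t value) {
    std::array<unsigned char, 8> out{};
    for (std::size_t i = 0; i < 8; ++i) {
        out[7 - i] = static_cast<unsigned char>(value & 0xFF);
        value >>= 8;
    }
    return out;
}

// Decode 8 big-endian bytes back into a 64-bit length.
std::uint64_t decodeLength(const std::array<unsigned char, 8>& in) {
    std::uint64_t value = 0;
    for (unsigned char byte : in) {
        value = (value << 8) | byte;
    }
    return value;
}
```

The sender would transmit the 8 bytes from encodeLength in front of the message, and the receiver would read 8 bytes into such an array and call decodeLength instead of reading straight into a std::uint64_t.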
Note: With MSG_PEEK it is possible to implement this functionality for UDP. You can request the header while using this flag. Then you can allocate enough space for the whole package.
A fairly common technique is to read a leading message-length field, then issue a read for the exact size of the expected message.
HOWEVER! Do not assume that the first read will give you all eight bytes (see Note), or that the second read will give you the entire message/packet.
You must always check the number of bytes read and issue another read (or two (or three, or...)) to get all the data you want.
Note: Because TCP is a streaming protocol and because the packet size "on the wire" varies in accordance with a very arcane algorithm designed to maximize network performance, you could easily issue a read for eight bytes and the read could return having only read three (or seven or ...) bytes. The guarantee is that unless there is an unrecoverable error you will receive at least one byte and at most the number of bytes you requested. Because of this you must be prepared to do byte address arithmetic and issue all reads in a loop that repeats until the desired number of bytes is returned.
Since TCP is streaming there isn't really any end to the data you receive, not until the connection is closed or there is an error.
Instead you need to implement your own protocol on top of TCP, one that either contains a specific end-of-message marker, a length-of-data header field, or possibly a command-based protocol where the data of each command is of a well-known size.
That way you can read into a small fixed-size buffer and append to a larger (possibly expanding) buffer as needed. The "possibly expanding" part is ridiculously easy in C++, what with std::vector and std::string (depending on the data you have).
There is another important thing to remember: since TCP is stream-based, a single read or recv call may not actually fetch all the data you request. You need to receive the data in a loop until you have received everything.
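The "small fixed buffer, growing container" pattern can be sketched like this for the simplest protocol of all, where the message ends when the peer closes the connection. It assumes a POSIX descriptor; readUntilClosed is a hypothetical name:

```cpp
#include <cerrno>
#include <cstddef>
#include <string>
#include <unistd.h>

// Read from fd until end of stream, appending each chunk to a growing string.
// Returns false on a read error other than EINTR.
bool readUntilClosed(int fd, std::string& out) {
    char chunk[256]; // small fixed-size buffer
    for (;;) {
        ssize_t n = read(fd, chunk, sizeof(chunk));
        if (n < 0) {
            if (errno == EINTR) continue; // interrupted: try again
            return false;
        }
        if (n == 0) return true; // peer closed the connection
        // append(ptr, count) keeps embedded '\0' bytes intact
        out.append(chunk, static_cast<std::size_t>(n));
    }
}
```

With an end marker or a length header, the loop is the same; only the exit condition changes.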
In my personal opinion,
I suggest receiving the "size of message" (a fixed 4-byte integer) first:
recv(socket, "size of message written in integer", "size of integer")
and then
receiving the real message after it:
recv(socket, "real message", "size of message written in integer")
This technique can also be used for sending files, images, and long messages.

How to send image data over linux socket

I have a relatively simple web server I have written in C++. It works fine for serving text/html pages, but the way it is written it seems unable to send binary data and I really need to be able to send images.
I have been searching and searching but can't find an answer specific to this question that is written in real C++ (fstream as opposed to file pointers etc.). Whilst this kind of thing is necessarily low level and may well require handling bytes in a C-style array, I would like the code to be as C++ as possible.
I have tried a few methods, this is what I currently have:
int sendFile(const Server* serv, const ssocks::Response& response, int fd)
{
    // some other stuff to do with headers etc. ........ then:
    // open file
    std::ifstream fileHandle;
    fileHandle.open(serv->mBase + WWW_D + resource.c_str(), std::ios::binary);
    if(!fileHandle.is_open())
    {
        // error handling code
        return -1;
    }
    // send file
    ssize_t buffer_size = 2048;
    char buffer[buffer_size];
    while(!fileHandle.eof())
    {
        fileHandle.read(buffer, buffer_size);
        status = serv->mSock.doSend(buffer, fd);
        if (status == -1)
        {
            std::cerr << "Error: socket error, sending file\n";
            return -1;
        }
    }
    return 0;
}
And then elsewhere:
int TcpSocket::doSend(const char* message, int fd) const
{
    if (fd == 0)
    {
        fd = mFiledes;
    }
    ssize_t bytesSent = send(fd, message, strlen(message), 0);
    if (bytesSent < 1)
    {
        return -1;
    }
    return 0;
}
As I say, the problem is that when the client requests an image it won't work. I get in std::cerr "Error: socket error sending file"
EDIT : I got it working using the advice in the answer I accepted. For completeness and to help those finding this post I am also posting the final working code.
For sending I decided to use a std::vector rather than a char array, primarily because I feel it is a more C++ approach and it makes it clear that the data is not a string. This is probably not necessary but a matter of taste. I then counted the bytes read from the stream and passed that count to the send function like this:
// send file
std::vector<char> buffer(SEND_BUFFER);
// read() reports failure on the last, partial chunk, so also send when gcount() > 0
while(fileHandle.read(&buffer[0], SEND_BUFFER) || fileHandle.gcount() > 0)
{
    status = serv->mSock.doSend(&buffer[0], fd, fileHandle.gcount());
    if (status == -1)
    {
        std::cerr << "Error: socket error, sending file\n";
        return -1;
    }
}
Then the actual send function was adapted like this:
int TcpSocket::doSend(const char* message, int fd, size_t size) const
{
    if (fd == 0)
    {
        fd = mFiledes;
    }
    ssize_t bytesSent = send(fd, message, size, 0);
    if (bytesSent < 1)
    {
        return -1;
    }
    return 0;
}
The first thing you should change is the while (!fileHandle.eof()) loop, because it will not work as you expect: it iterates one time too many, since the eof flag isn't set until after you try to read beyond the end of the file. Instead do e.g. while (fileHandle.read(...)), keeping in mind that the final, partial read reports failure even though gcount() bytes were read, and those bytes still have to be sent.
The second thing you should do is check how many bytes were actually read from the file, and send only that many bytes.
Lastly, you read binary data, not text, so you can't use strlen on the data you read from the file.
A little explanation of the binary file problem: As you should hopefully know, C-style strings (the ones you use strlen to get the length of) are terminated by a zero character '\0' (in short, a zero byte). Random binary data can contain lots of zero bytes anywhere inside it, and there a zero is a valid byte without any special meaning.
When you use strlen to get the length of binary data there are two possible problems:
There's a zero byte in the middle of the data. This will cause strlen to terminate early and return the wrong length.
There's no zero byte in the data. That will cause strlen to go beyond the end of the buffer to look for the zero byte, leading to undefined behavior.
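Putting the three points together, the read loop can be sketched as below. The sink callback stands in for the socket send call, and forEachChunk and countBytesSent are hypothetical names used only for this illustration:

```cpp
#include <cstddef>
#include <functional>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Read `in` in chunks and pass each chunk to `sink` with its exact byte count.
// `sink` stands in for the send call; it returns false to abort the transfer.
bool forEachChunk(std::istream& in, std::size_t chunkSize,
                  const std::function<bool(const char*, std::size_t)>& sink) {
    if (chunkSize == 0) return false;
    std::vector<char> buffer(chunkSize);
    // read() reports failure on the final partial chunk, so also check gcount().
    while (in.read(buffer.data(), static_cast<std::streamsize>(buffer.size()))
           || in.gcount() > 0) {
        if (!sink(buffer.data(), static_cast<std::size_t>(in.gcount()))) {
            return false;
        }
    }
    return true;
}

// Usage: count the bytes delivered for an in-memory "file".
std::size_t countBytesSent(const std::string& data, std::size_t chunkSize) {
    std::istringstream file(data);
    std::size_t total = 0;
    forEachChunk(file, chunkSize, [&](const char*, std::size_t n) {
        total += n;
        return true;
    });
    return total;
}
```

Because the length always comes from gcount() and never from strlen, zero bytes inside the data make no difference to how much is sent.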

missing bytes while transferring files from client to server, the bytes values also represent some of the control characters

I'm writing a file transfer client/server application
where the client runs on Windows 7 and is written in VB.NET
and the server runs on Linux Mint and is written in C++ (I'm using VMware).
My problem is that when I try to upload files to the server (such as images), the received data is missing many bytes, which also represent control characters (such as EOT, ETB, ...). I guess they're read as TCP control characters and ignored by the receiving OS.
I already tested the application with simple text files (size up to 4MB) without any problem.
Is there a way to prevent the system from ignoring those bytes?
This is the C++ function that receives the file:
string readSockBytes(int port,int num,int size)
{
    int dcmbuffSize = 1460;
    int n;
    stringstream temp;
    string strBuffer,Sbuffer;
    char Rbuffer[dcmbuffSize];
    struct socketVar sockets;
    sockets = setSocket(port);
    sockets = sockListen(sockets);
    cout<<"user connected\n";
    strBuffer = readsock(sockets);
    cout<<strBuffer.substr(0,strBuffer.find("$"))<<endl;
    if(num == atoi(strBuffer.substr(0,strBuffer.find("$")).c_str()))
        Sbuffer = "ready$";
    else
    {
        Sbuffer = "exit$";
        close(sockets.newsockfd);
        close(sockets.sockfd);
    }
    n = writesock(sockets, Sbuffer, 100);
    if (n < 0) error("ERROR writing to socket");
    while(strBuffer.length() < fileSize)
    {
        n = read(sockets.newsockfd,Rbuffer,dcmbuffSize-1);
        if (n < 0) error("ERROR reading from socket");
        temp.str(Rbuffer);
        strBuffer = strBuffer+temp.str();
    }
    strBuffer = strBuffer.substr(0,size);
    return strBuffer;
}
The issue is most likely that you send binary data. Binary data can contain zeros, and a zero byte is the normal C string terminator.
This means that when you do temp.str(Rbuffer) (assuming temp is a std::stringstream), it only gets data from Rbuffer up to the first zero.
Instead of going through std::stringstream, append to the std::string directly:
while(strBuffer.length() < fileSize)
{
    char buffer[2048];
    ssize_t n = read(sockets.newsockfd, buffer, sizeof(buffer));
    if (n <= 0)
    {
        // An error, or connection closed
        if (n < 0)
            error("ERROR reading from socket");
        break;
    }
    // Create a string of `n` bytes, including possible string terminators,
    // and add it to our current buffer
    strBuffer += std::string(buffer, n);
}
The important thing to remember here is that you can't use the received data as a C-style string! If it's binary data it will almost certainly contain the string terminator, so you have to treat it as binary data and not as a string (even though you can store it in a std::string).
You also need to be aware that you can't print the data, as many binary values are either unprintable or will print as "garbage".
And lastly, if you read and write binary files, you need to open them in binary mode, or you will get errors with the bytes 0x0d and 0x0a (i.e. carriage return and newline).
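The difference between the two ways of appending can be seen in isolation. A small sketch; appendBytes and appendAsCString are hypothetical helper names:

```cpp
#include <cstddef>
#include <string>

// Append exactly n received bytes, zero bytes included.
void appendBytes(std::string& out, const char* data, std::size_t n) {
    out.append(data, n);
}

// What the C-string route effectively does: stop at the first '\0'.
void appendAsCString(std::string& out, const char* data) {
    out.append(data);
}
```

For a received chunk such as 'P', 'N', 0, 'G', '!', the first function stores all 5 bytes, while the second stores only the 2 bytes in front of the zero, which is exactly the kind of loss described in the question.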

Raspberry Pi C++ Read NMEA Sentences from Adafruit's Ultimate GPS Module

I'm trying to read the GPS NMEA sentences from Adafruit's Ultimate GPS module. I'm using C++ on the Raspberry Pi to read the serial port connection to the module.
Here is my read function:
int Linuxutils::readFromSerialPort(int fd, int bufferSize) {
    /*
    Reading data from a port is a little trickier. When you operate the port in raw data mode,
    each read(2) system call will return however many characters are actually available in the
    serial input buffers. If no characters are available, the call will block (wait) until
    characters come in, an interval timer expires, or an error occurs. The read function can be
    made to return immediately by doing the following:
    fcntl(fd, F_SETFL, FNDELAY);
    The NDELAY option causes the read function to return 0 if no characters are available on the port.
    */
    // Check the file descriptor
    if ( !checkFileDecriptorIsValid(fd) ) {
        fprintf(stderr, "Could not read from serial port - it is not a valid file descriptor!\n");
        return -1;
    }
    // Now, let's wait for an input from the serial port.
    fcntl(fd, F_SETFL, 0); // block until data comes in
    // Now read the data
    int absoluteMax = bufferSize*2;
    char *buffer = (char*) malloc(sizeof(char) * bufferSize); // allocate buffer.
    int rcount = 0;
    int length = 0;
    // Read in each newline
    FILE* fdF = fdopen(fd, "r");
    int ch = getc(fdF);
    while ( (ch != '\n') ) { // Check for end of file or newline
        // Reached end of file
        if ( ch == EOF ) {
            printf("ERROR: EOF!");
            continue;
        }
        // Expand by reallocating if necessary
        if( rcount == absoluteMax ) { // time to expand ?
            absoluteMax *= 2; // expand to double the current size of anything similar.
            rcount = 0; // Re-init count
            buffer = (char*)realloc(buffer, absoluteMax); // Re-allocate memory.
        }
        // Read from stream
        ch = getc(fdF);
        // Stuff in buffer
        buffer[length] = ch;
        // Increment counters
        length++;
        rcount++;
    }
    // Don't care if we return 0 chars read
    if ( rcount == 0 ) {
        return 0;
    }
    // Stick
    buffer[rcount] = '\0';
    // Print results
    printf("Received ( %d bytes ): %s\n", rcount,buffer);
    // Return bytes read
    return rcount;
}
So I kind of get the sentences as you can see below, the problem is I get these "repeated" portions of a complete sentence like this:
Received ( 15 bytes ): M,-31.4,M,,*61
Here is the complete thing:
Received ( 72 bytes ): GPGGA,182452.000,4456.2019,N,09337.0243,W,1,8,1.19,292.6,M,-31.4,M,,*61
Received ( 56 bytes ): GPGSA,A,3,17,07,28,26,08,11,01,09,,,,,1.49,1.19,0.91*00
Received ( 15 bytes ): M,-31.4,M,,*61
Received ( 72 bytes ): GPGGA,182453.000,4456.2019,N,09337.0242,W,1,8,1.19,292.6,M,-31.4,M,,*61
Received ( 56 bytes ): GPGSA,A,3,17,07,28,26,08,11,01,09,,,,,1.49,1.19,0.91*00
Received ( 15 bytes ): M,-31.4,M,,*61
Received ( 72 bytes ): GPGGA,182456.000,4456.2022,N,09337.0241,W,1,8,1.21,292.6,M,-31.4,M,,*64
Received ( 56 bytes ): GPGSA,A,3,17,07,28,26,08,11,01,09,,,,,2.45,1.21,2.13*0C
Received ( 70 bytes ): GPRMC,182456.000,A,4456.2022,N,09337.0241,W,0.40,183.74,110813,,,A*7F
Received ( 37 bytes ): GPVTG,183.74,T,,M,0.40,N,0.73,K,A*34
Received ( 70 bytes ): GPRMC,182453.000,A,4456.2019,N,09337.0242,W,0.29,183.74,110813,,,A*7E
Received ( 37 bytes ): GPVTG,183.74,T,,M,0.29,N,0.55,K,A*3F
Received ( 32 bytes ): 242,W,0.29,183.74,110813,,,A*7E
Received ( 70 bytes ): GPRMC,182452.000,A,4456.2019,N,09337.0243,W,0.33,183.74,110813,,,A*75
Why am I getting the repeated sentences and how can I fix it? I tried flushing the serial port buffers but then things became really ugly! Thanks.
I'm not sure I understand your exact problem. There are a few problems with the function though which might explain a variety of errors.
The lines
int absoluteMax = bufferSize*2;
char *buffer = (char*) malloc(sizeof(char) * bufferSize); // allocate buffer.
seem wrong. You'll decide when to grow the buffer by comparing the number of characters read to absoluteMax so this needs to match the size of the buffer allocated. You're currently writing beyond the end of allocated memory before you reallocate. This results in undefined behaviour. If you're lucky your app will crash, if you're unlucky, things will appear to work but you'll lose the second half of the data you've read since only the data written to memory you own will be moved by realloc (if it relocates your heap cell).
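Also, you can rely on sizeof(char) being 1. In C you shouldn't cast the return from malloc (or realloc); since this is C++, the conversion from void* has to be explicit, and a std::string or std::vector<char> would avoid the manual allocation altogether.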
You lose the first character read (the one that is read just before the while loop). Is this deliberate?
When you reallocate buffer, you shouldn't reset rcount. This causes the same bug as above where you'll write beyond the end of buffer before reallocating again. Again, the effects of doing this are undefined but could include losing portions of output.
Not related to the bug you're currently concerned with but also worth noting is the fact that you leak buffer and fdF. You should free and fclose them respectively before exiting the function.
The following (untested) version ought to fix these issues:
int Linuxutils::readFromSerialPort(int fd, int bufferSize)
{
    if ( !checkFileDecriptorIsValid(fd) ) {
        fprintf(stderr, "Could not read from serial port - it is not a valid file descriptor!\n");
        return -1;
    }
    fcntl(fd, F_SETFL, 0); // block until data comes in
    int absoluteMax = bufferSize; // must match the allocated size
    char *buffer = static_cast<char*>(malloc(bufferSize));
    int length = 0;
    // Read up to the next newline
    FILE* fdF = fdopen(fd, "r");
    for (;;) {
        int ch = getc(fdF); // read inside the loop so the first character is not lost
        if (ch == '\n') {
            break;
        }
        if (ch == EOF) { // Reached end of file
            printf("ERROR: EOF!\n");
            break;
        }
        if (length+1 >= absoluteMax) { // keep room for the terminating '\0'
            absoluteMax *= 2;
            char* tmp = static_cast<char*>(realloc(buffer, absoluteMax));
            if (tmp == NULL) {
                printf("ERROR: OOM\n");
                goto cleanup;
            }
            buffer = tmp;
        }
        buffer[length++] = ch;
    }
    if (length > 0) {
        buffer[length] = '\0';
        // Print results
        printf("Received ( %d bytes ): %s\n", length, buffer);
    }
cleanup:
    free(buffer);
    fclose(fdF); // note: this also closes the underlying fd
    return length;
}
Maybe you could try to flush serial port buffers before reading from it as shown in this link ?
I would also consider not reopening the serial port every time you call Linuxutils::readFromSerialPort - you could keep the file descriptor open for further reading (anyway the call is blocking so from the caller's point of view nothing changes).

Sending a binary file through socket

I'm trying to send a binary file through a socket in C to an embedded platform, but when I run it after it's sent it just gives me a segfault (sending through ftp works fine, but it's very slow).
Sending the binary file within the same system works OK (the embedded platform is little-endian, so I don't think it's an endianness problem).
What can be the problem? the program is mft.cpp
You are assuming that every read returns the number of bytes that you want to read. That is incorrect. You should always check the read return value to see if you got as many bytes as you wanted.
This also means that you can rewrite your send loop as:
int bytesLeft = file_length;
char buf[1024]; // no need to reallocate it in the loop
while(bytesLeft > 0)
{
    int to_read = 1024;
    if(bytesLeft < to_read)
        to_read = bytesLeft;
    int bytesRead = read(new_sock_id, buf, to_read);
    if(error("reading file", false)) continue;
    if(bytesRead <= 0) break; // connection closed or read failed: stop instead of looping forever
    write(file, buf, bytesRead);
    if(error("writing file", false)) continue;
    bytesLeft -= bytesRead;
}