my c++ client/server file exchange implementation is very slow...why? - c++

Hi, I have implemented a simple file exchange over a client/server connection in C++. It works fine except for one problem: it's so damn slow. This is my code:
For sending the file:
int send_file(int fd)
{
    char rec[10];
    struct stat stat_buf;
    fstat (fd, &stat_buf);
    int size=stat_buf.st_size;
    while(size > 0)
    {
        char buffer[1024];
        bzero(buffer,1024);
        bzero(rec,10);
        int n;
        if(size>=1024)
        {
            n=read(fd, buffer, 1024);
            // Send a chunk of data
            n=send(sockFile_, buffer, n, 0 );
            // Wait for an acknowledgement
            n = recv(sockFile_, rec, 10, 0 );
        }
        else // remaining file bytes
        {
            n=read(fd, buffer, size);
            buffer[size]='\0';
            send(sockFile_,buffer, n, 0 );
            n=recv(sockFile_, rec, 10, 0 ); // ack
        }
        size -= 1024;
    }
    // Send a completion string
    int n = send(sockFile_, "COMP",strlen("COMP"), 0 );
    char buf[10];
    bzero(buf,10);
    // Receive an acknowledgement
    n = recv(sockFile_, buf, 10, 0 );
    return(0);
}
And for receiving the file:
int receive_file(int size, const char* saveName)
{
    ofstream outFile(saveName,ios::out|ios::binary|ios::app);
    while(size > 0)
    {
        // buffer for storing incoming data
        char buf[1024];
        bzero(buf,1024);
        if(size>=1024)
        {
            // receive chunk of data
            n=recv(sockFile_, buf, 1024, 0 );
            // write chunk of data to disk
            outFile.write(buf,n);
            // send acknowledgement
            n = send(sockFile_, "OK", strlen("OK"), 0 );
        }
        else
        {
            n=recv(sockFile_, buf, size, 0 );
            buf[size]='\0';
            outFile.write(buf,n);
            n = send(sockFile_, "OK", strlen("OK"), 0 );
        }
        size -= 1024;
    }
    outFile.close();
    // Receive 'COMP' and send acknowledgement
    // ---------------------------------------
    char buf[10];
    bzero(buf,10);
    n = recv(sockFile_, buf, 10, 0 );
    n = send(sockFile_, "OK", strlen("OK"), 0 );
    std::cout<<"File received..."<<std::endl;
    return(0);
}
Now here are my initial thoughts: Perhaps the buffer is too small. I should therefore try increasing the size from I dunno, 1024 bytes (1KB) to 65536 (64KB) blocks, possibly. But this results in file corruption. Ok, so perhaps the code is also being slowed down by the need to receive an acknowledgement after each 1024 byte block of data has been sent, so why not remove them? Unfortunately this results in the blocks not arriving in the correct order and hence file corruption.
Perhaps I could split the file into chunks beforehand, create multiple connections, send each chunk over its own threaded connection, and then reassemble the chunks somehow in the receiver...
Any idea how I could make the file transfer process more efficient (faster)?
Thanks,
Ben.

Skip the acknowledgement of buffers! You insert an artificial round trip (server->client+client->server) for probably each single packet.
This slows down the transfer.
You do not need this ack. You are using TCP, which gives you a reliable stream. Send the number of bytes, then send the whole file. Do not read after send and so on.
EDIT: As a second step, you should increase the buffer size. For internet transfer you can assume an MTU of 1500, which leaves room for a TCP payload of roughly 1460 bytes per IP packet (1500 minus 20 bytes of IPv4 header and 20 bytes of TCP header; a little less when header options are in use). This should be your minimum buffer size. Make it larger and let the operating system slice the buffers into packets for you. On a LAN the MTU can be much higher if jumbo frames are enabled.
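The advice above (announce the size once, then stream the whole file with no per-block acknowledgement) could be sketched roughly as follows. This is only a sketch assuming POSIX sockets: `send_all` and `send_file_stream` are hypothetical names, and the 8-byte big-endian size header is an assumed wire format that is not part of the original code.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <cstdint>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>
#include <vector>

// Hypothetical helper: keep calling send() until all len bytes are written,
// because a single send() may accept fewer bytes than requested.
bool send_all(int sock, const char* data, std::size_t len) {
    assert(data != nullptr || len == 0);
    std::size_t sent = 0;
    while (sent < len) {
        ssize_t n = send(sock, data + sent, len - sent, 0);
        if (n < 0) {
            if (errno == EINTR) continue;  // interrupted, retry
            return false;
        }
        sent += static_cast<std::size_t>(n);
    }
    return true;
}

// Hypothetical sender: an assumed 8-byte big-endian size header, then the
// file contents in 64 KB blocks, with no acknowledgement between blocks.
bool send_file_stream(int sock, int fd, std::uint64_t size) {
    unsigned char header[8];
    for (int i = 0; i < 8; ++i)
        header[i] = static_cast<unsigned char>(size >> (56 - 8 * i));
    if (!send_all(sock, reinterpret_cast<const char*>(header), sizeof header))
        return false;

    std::vector<char> buffer(64 * 1024);
    while (size > 0) {
        std::size_t want = size < buffer.size()
                               ? static_cast<std::size_t>(size)
                               : buffer.size();
        ssize_t n = read(fd, buffer.data(), want);
        if (n < 0) {
            if (errno == EINTR) continue;
            return false;
        }
        if (n == 0) return false;  // file is shorter than the announced size
        if (!send_all(sock, buffer.data(), static_cast<std::size_t>(n)))
            return false;
        size -= static_cast<std::uint64_t>(n);
    }
    return true;
}
```

The receiver would read the 8 header bytes first and then keep reading until it has received that many payload bytes.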

My guess is that you are getting out of sync and some of your reads are less than 1024. It happens all the time with sockets. The "size -= 1024" statement should be "size -= n".
My guess is that n is sometimes less than 1024 from the recv().
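The fix described above (subtract the number of bytes that recv() actually returned, not a fixed 1024) might look like the sketch below. It assumes POSIX sockets and that the receiver already knows the file size; the socket is passed as a parameter here rather than taken from the `sockFile_` member used in the question.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <fstream>
#include <sys/socket.h>
#include <sys/types.h>

// Sketch of a receiver that trusts only recv()'s return value.
// Returns false on error or if the peer closes before `size` bytes arrive.
bool receive_file(int sock, std::size_t size, const char* saveName) {
    std::ofstream outFile(saveName,
                          std::ios::out | std::ios::binary | std::ios::trunc);
    if (!outFile) return false;

    char buf[4096];
    while (size > 0) {
        std::size_t want = size < sizeof buf ? size : sizeof buf;
        ssize_t n = recv(sock, buf, want, 0);
        if (n < 0) {
            if (errno == EINTR) continue;  // interrupted, retry
            return false;
        }
        if (n == 0) return false;          // connection closed too early
        assert(static_cast<std::size_t>(n) <= size);
        outFile.write(buf, n);             // write only the n bytes that arrived
        size -= static_cast<std::size_t>(n);  // not a fixed 1024
    }
    outFile.close();
    return !outFile.fail();
}
```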

You should certainly increase the buffer size, and if this causes corruption it is an error in your code, which you need to fix. Also, if you use a stream protocol (i.e. TCP/IP) the order and delivery of packets is guaranteed.

Read this thread:
Send and Receive a file in socket programming in Linux with C/C++ (GCC/G++)
Oh, and use the sendfile system call (available on Linux; it is not part of POSIX). Here's an example to get you started:
http://tldp.org/LDP/LGNET/91/misc/tranter/server.c.txt
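For orientation, a sendfile-based sender could look roughly like the sketch below. It is Linux-specific and independent of the linked example; `send_whole_file` and `send_path` are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <fcntl.h>
#include <sys/sendfile.h>  // Linux-specific
#include <sys/socket.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical wrapper: send the whole regular file behind `fd` to `sock`.
// sendfile() copies inside the kernel, so no user-space buffer is needed.
bool send_whole_file(int sock, int fd) {
    struct stat st;
    if (fstat(fd, &st) != 0) return false;
    off_t offset = 0;  // sendfile() advances this as bytes are sent
    while (offset < st.st_size) {
        ssize_t n = sendfile(sock, fd, &offset,
                             static_cast<std::size_t>(st.st_size - offset));
        if (n < 0) {
            if (errno == EINTR) continue;  // interrupted, retry
            return false;
        }
        if (n == 0) break;  // nothing more to read, e.g. file was truncated
        assert(offset <= st.st_size);
    }
    return offset == st.st_size;
}

// Hypothetical convenience wrapper: open `path`, send it, close it.
bool send_path(int sock, const char* path) {
    int fd = open(path, O_RDONLY);
    if (fd < 0) return false;
    bool ok = send_whole_file(sock, fd);
    close(fd);
    return ok;
}
```

The receiver still needs to know how many bytes to expect, for example from a size header sent beforehand.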

A couple of things.
1) You are reallocating the buffer each time you go through your while loop:
while(size > 0)
{
char buf[1024];
You can pull it out of the while loop on both sides and you won't be dumping on your stack as much.
2) 1024 is a standard buffer size, and I wouldn't go much above 2048 because then the lower level TCP/IP stack will just have to break it up anyways.
3) If you really need speed, rather than waiting for a recv ack you could just add a packet number to each packet and then check them on the receiving end. This makes your receiving code a little more complex because it has to store packets that are out of order and put them in order. But then you wouldn't need an acknowledgement.
4) It's a little thing, but what if the file that you are sending has a size that is a multiple of 1024... Then you won't send the trailing '\0'. To fix that you just need to change your while to:
while (size >= 0)

Related

I am getting some garbage value while sending file data through socket to server? Why?

I am setting up a client/server connection through sockets in C++. The connection itself works, but when I send the file size and file data, the server also receives some garbage values.
First I send the file size from the client with the send system call, and then I send the file buffer to the server.
The server has a recv system call that receives the file size successfully, but when it reads the file data I get garbage after some bytes.
Client Code
File = fopen("index.jpg", "rb");
if (!File)
{
MessageBox(L"Error while readaing the file");
}
fseek(File, 0, SEEK_END);
Size = ftell(File);
fseek(File, 0, SEEK_SET);
char* Buffer = new char[Size];
fread(Buffer, Size, 1, File);
char cSize[MAX_PATH];
sprintf(cSize, "%lu", Size);
send(Socket, cSize, MAX_PATH, 0); // File size
send(Socket, Buffer, Size, 0); // File Binary
Server Code
unsigned long Size;
char *Filesize = new char[1024];
if (recv(Sub, Filesize, 1024, 0)) // File size
{
Size = strtoul(Filesize, NULL, 0); //getting filesize
}
Buffer = new char[Size];
int reader = recv(Sub, Buffer, Size, 0);
Buffer[Size] = '\0';
if (reader == -1) // File Binary
{
MessageBox(L"Perror Recv");
}
else if (reader == 0)
{
MessageBox(L"Connection is Closed");
}
else
{
FILE *File;
File = fopen("test.jpg", "wb");
fwrite((const char*)Buffer, 1, Size, File);
MessageBox(L"DATA Received");
fclose(File);
}
One problem is that you aren't handling the return values from recv() correctly. For example:
if (recv(Sub, Filesize, 1024, 0)) // File size
... when the function quoted above returns, it has written some number of bytes (more than 0, less than 1025) into Filesize. How many? Your program doesn't know, because you didn't store the return value in a variable to find out (rather, you only checked whether it was non-zero and then discarded the value). Therefore, it's entirely likely that Filesize contains not only your file size value but some portion of your file's data as well... which is why that portion of your file's data won't get written to disk later on in your program.
A similar problem is here:
int reader = recv(Sub, Buffer, Size, 0);
You check reader to see if it is -1 or 0 (which is good), but in your final case you just fwrite() out Size bytes from the array, when Buffer contains reader bytes, not Size bytes (and reader could have any value between 1 and Size, depending on how many bytes the TCP stack decided to deliver to you in that particular recv() call).
One more problem is that you send MAX_PATH bytes for the file size, but you receive (up to) 1024 bytes for file size. Is MAX_PATH equal to 1024? If not, then even if recv() did fill out all 1024 bytes, your sender and receiver would still be out of sync with each other, since the excess bytes would show up in future recv() calls, or (alternatively) you'd get bytes from subsequent send() calls placed into your FileSize buffer.
So that's the direct problem -- I think the underlying problem is that you are making some assumptions about how TCP network works that are not true. In particular:
There is no guarantee of a one-to-one correspondence between send() and recv() calls. (TCP is a byte-stream protocol and doesn't do any data-framing)
You cannot rely on N bytes of data from a single call to send() being delivered via a single call to recv(). The bytes you send() will be delivered in order, but there are no guarantees about how many calls to recv() it will require to receive them all, nor about how many bytes any given call to recv() will write into your receive-buffer.
You cannot rely on recv() to fill up the entire buffer you passed to it. recv() will write as few or as many bytes as it wants to, and it's up to your code to handle it correctly regardless of how many bytes it gets per recv() call.
In practice, this means you'll need to call recv() in a loop, and keep careful track of the return value from each recv() call, so you always know exactly how many bytes you've received so far and therefore where inside your buffer the next recv() call should start writing at.
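A receive loop of the kind described above could be sketched as follows. It assumes POSIX sockets (on Winsock the headers differ and recv() returns int) and an assumed protocol in which both sides agree on a fixed-width size field; `kSizeFieldLen`, `recv_exact` and `recv_sized_blob` are hypothetical names, not part of the original program.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <cstdlib>
#include <sys/socket.h>
#include <sys/types.h>
#include <vector>

// Assumed protocol constant: the size field is always exactly this many
// bytes of NUL-padded ASCII decimal text, on both sender and receiver.
constexpr std::size_t kSizeFieldLen = 32;

// Loop until exactly len bytes are stored in buf, tracking the offset at
// which the next recv() must write. Returns false on error or early close.
bool recv_exact(int sock, char* buf, std::size_t len) {
    assert(buf != nullptr || len == 0);
    std::size_t offset = 0;
    while (offset < len) {
        ssize_t n = recv(sock, buf + offset, len - offset, 0);
        if (n < 0 && errno == EINTR) continue;  // interrupted, retry
        if (n <= 0) return false;               // error or connection closed
        offset += static_cast<std::size_t>(n);
    }
    return true;
}

// Read the fixed-width size field, then exactly that many payload bytes.
// A real program should also reject sizes above some sane maximum.
bool recv_sized_blob(int sock, std::vector<char>& out) {
    char field[kSizeFieldLen + 1] = {};  // +1 keeps a terminating NUL
    if (!recv_exact(sock, field, kSizeFieldLen)) return false;
    unsigned long size = std::strtoul(field, nullptr, 10);
    out.resize(size);
    return size == 0 || recv_exact(sock, out.data(), size);
}
```

Because the header length is a shared constant, the receiver never consumes file bytes while reading the size, and never leaves header bytes behind.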
You are not handling the return values of the send and recv functions. Make sure you store them, since they are the number of bytes actually sent and received, and use them wherever required.

C++ TCP recv unknown buffer size

I want to use the function recv(socket, buf, len, flags) to receive an incoming packet. However, I do not know the length of this packet prior to runtime, so the first 8 bytes are supposed to tell me the length of this packet. I don't want to just allocate an arbitrarily large len to accomplish this, so is it possible to set len = 8 and have buf be of type uint64_t? Then afterwards
memcpy(dest, &buf, buf)?
Since TCP is stream-based, I'm not sure what type of packages you mean. I will assume that you are referring to application level packages. I mean packages which are defined by your application and not by underlying protocols like TCP. I will call them messages instead to avoid confusion.
I will show two possibilities. First I will show how you could read a message without knowing its length before you have finished reading. The second example makes two calls: first it reads the size of the message, then it reads the whole message at once.
Read data until the message is complete
Since TCP is stream-based, you will not lose any data when your buffer is not big enough. So you can read a fixed number of bytes, and if something is missing, you can call recv again. Here is an extensive example. I just wrote it without testing, so I hope everything works.
std::size_t offset = 0;        // number of valid bytes currently held in buf
std::vector<char> buf(512);
std::vector<char> readMessage() {
    while (true) {
        // Check first: buf may already hold a complete message left over
        // from the previous call.
        if (offset > 0) {
            std::vector<char> msg(buf.begin(), buf.begin() + offset);
            if (isMessageComplete(msg)) {
                std::size_t msgLen = getSizeOfMessage(msg);
                // Bytes after msgLen belong to the next message.
                // Write them back to the start of buf.
                std::size_t rest = msg.size() - msgLen;
                std::memcpy(buf.data(), msg.data() + msgLen, rest);
                offset = rest;
                msg.resize(msgLen);
                return msg;
            }
        }
        if (offset == buf.size()) {
            buf.resize(2 * buf.size()); // buffer is full, double available memory
        }
        ssize_t ret = recv(fd, buf.data() + offset, buf.size() - offset, 0);
        if (ret < 0) {
            if (errno == EINTR) {
                // Interrupted, just try again ...
                continue;
            } else {
                // Error occurred. Throw exception.
                throw IOException(strerror(errno));
            }
        } else if (ret == 0) {
            // No data available anymore.
            if (offset == 0) {
                // Client did just close the connection
                return std::vector<char>(); // return empty vector
            } else {
                // Client did close connection while sending package?
                // It is not a clean shutdown. Throw exception.
                throw ProtocolException("Unexpected end of stream");
            }
        } else {
            // Got ret more bytes. The check at the top of the loop
            // decides whether the message is complete now.
            offset += ret;
        }
    }
}
You have to define bool isMessageComplete(std::vector<char>) and std::size_t getSizeOfMessage(std::vector<char>) by yourself.
Read the header and check the length of the package
The second possibility is to read the header first, that is, just the 8 bytes which contain the size of the package in your case. After that, you know the size of the package. This means you can allocate enough storage and read the whole message at once:
/// Reads n bytes from fd.
bool readNBytes(int fd, void *buf, std::size_t n) {
    std::size_t offset = 0;
    char *cbuf = reinterpret_cast<char*>(buf);
    while (true) {
        ssize_t ret = recv(fd, cbuf + offset, n - offset, MSG_WAITALL);
        if (ret < 0) {
            if (errno != EINTR) {
                // Error occurred
                throw IOException(strerror(errno));
            }
            // EINTR: interrupted, loop around and try again
        } else if (ret == 0) {
            // No data available anymore
            if (offset == 0) return false;
            else throw ProtocolException("Unexpected end of stream");
        } else if (offset + ret == n) {
            // All n bytes read
            return true;
        } else {
            offset += ret;
        }
    }
}
/// Reads message from fd
std::vector<char> readMessage(int fd) {
    std::uint64_t size;
    if (readNBytes(fd, &size, sizeof(size))) {
        std::vector<char> buf(size);
        if (readNBytes(fd, buf.data(), size)) {
            return buf;
        } else {
            throw ProtocolException("Unexpected end of stream");
        }
    } else {
        // connection was closed
        return std::vector<char>();
    }
}
The flag MSG_WAITALL requests that the function blocks until the full amount of data is available. However, you cannot rely on that. You have to check it and read again if something is missing. Just like I did it above.
readNBytes(fd, buf, n) reads n bytes. As long as the connection is not closed by the other side, the function will not return without reading n bytes. If the connection was closed by the other side, the function returns false. If the connection was closed in the middle of a message, an exception is thrown. If an I/O error occurred, another exception is thrown.
readMessage reads 8 bytes [sizeof(std::uint64_t)] and uses them as the size of the next message. Then it reads the message.
If you want platform independence, you should convert size to a defined byte order. Computers with x86 architecture use little endian. It is common to use big endian in network traffic.
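The byte-order conversion mentioned above can be done by hand for a 64-bit size, since htonl/ntohl only cover 32 bits. The helpers below are a sketch with hypothetical names (`encode_u64_be`, `decode_u64_be`); they produce the same bytes on little-endian and big-endian hosts.

```cpp
#include <cassert>
#include <cstdint>

// Write a 64-bit value into 8 bytes, most significant byte first.
void encode_u64_be(std::uint64_t value, unsigned char out[8]) {
    assert(out != nullptr);
    for (int i = 0; i < 8; ++i)
        out[i] = static_cast<unsigned char>(value >> (56 - 8 * i));
}

// Rebuild the 64-bit value from 8 big-endian bytes.
std::uint64_t decode_u64_be(const unsigned char in[8]) {
    assert(in != nullptr);
    std::uint64_t value = 0;
    for (int i = 0; i < 8; ++i)
        value = (value << 8) | in[i];
    return value;
}
```

With this, readMessage could read the header into an `unsigned char[8]` via readNBytes and pass it to `decode_u64_be`, rather than reading straight into a `std::uint64_t`.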
Note: With MSG_PEEK it is possible to implement this functionality for UDP. You can request the header while using this flag. Then you can allocate enough space for the whole package.
A fairly common technique is to read leading message length field, then issue a read for the exact size of the expected message.
HOWEVER! Do not assume that the first read will give you all eight bytes(see Note), or that the second read will give you the entire message/packet.
You must always check the number of bytes read and issue another read (or two (or three, or...)) to get all the data you want.
Note: Because TCP is a streaming protocol and because the packet size "on the wire" varies in accordance with a very arcane algorithm designed to maximize network performance, you could easily issue a read for eight bytes and the read could return having only read three (or seven or ...) bytes. The guarantee is that unless there is an unrecoverable error you will receive at least one byte and at most the number of bytes you requested. Because of this you must be prepared to do byte address arithmetic and issue all reads in a loop that repeats until the desired number of bytes is returned.
Since TCP is streaming there isn't really any end to the data you receive, not until the connection is closed or there is an error.
Instead you need to implement your own protocol on top of TCP, one that either contains a specific end-of-message marker, a length-of-data header field, or possibly a command-based protocol where the data of each command is of a well-known size.
That way you can read into a small fixed-sized buffer and append to a larger (possibly expanding) buffer as needed. The "possibly expanding" part is ridiculously easy in C++, what with std::vector and std::string (depending on the data you have)
There is another important thing to remember, that since TCP is stream-based, a single read or recv call may not actually fetch all the data you request. You need to receive the data in a loop until you have received everything.
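As an illustration of the end-of-message marker variant, the sketch below pulls complete messages out of an expanding std::string that the caller appends to after every recv(). `extract_message` is a hypothetical helper, and the newline delimiter is an assumption about the protocol.

```cpp
#include <cassert>
#include <string>

// `pending` accumulates raw bytes appended after each recv(). If it holds a
// complete message ending in `delimiter`, move that message (without the
// delimiter) into `message` and return true; otherwise return false and
// leave `pending` untouched so more data can be appended.
bool extract_message(std::string& pending, std::string& message,
                     char delimiter = '\n') {
    std::string::size_type pos = pending.find(delimiter);
    if (pos == std::string::npos) return false;  // need more data
    message.assign(pending, 0, pos);
    pending.erase(0, pos + 1);                   // drop message and delimiter
    assert(message.find(delimiter) == std::string::npos);
    return true;
}
```

The caller would do something like `pending.append(buf, n)` after each successful recv() and then call `extract_message` in a loop until it returns false.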
In my personal opinion, I suggest receiving the "size of message" (a fixed 4-byte integer) first:
recv(socket, "size of message written in integer", "size of integer")
and then receiving the real message afterwards:
recv(socket, "real message", "size of message written in integer")
This technique can also be used for sending files, images, and long messages.

recv the first few bytes from a socket to determine buffer size

I'm writing a distributed system in c++ using TCP/IP and sockets.
For each of my messages, I need to receive the first 5 bytes to know the full length of the incoming message.
What's the best way to do this?
1. recv() only 5 bytes, then recv() again. If I choose this, would it be safe to assume I'll get 0 or 5 bytes in the recv (aka not write a loop to keep trying)?
2. Use MSG_PEEK.
3. recv() some larger buffer size, then read the first 5 bytes and allocate the final buffer then.
You don't need to know anything. TCP is a stream protocol, and at any given moment you can get as little as one byte, or as much as multiple megabytes of data. The correct and only way to use a TCP socket is to read in a loop.
char buf[4096]; // or whatever
std::deque<char> data;
for (ssize_t res ; ; )
{
    res = recv(fd, buf, sizeof buf, MSG_DONTWAIT);
    if (res == -1)
    {
        if (errno == EAGAIN || errno == EWOULDBLOCK)
        {
            break; // done reading
        }
        else
        {
            // error: report it, then stop reading
            break;
        }
    }
    else if (res == 0)
    {
        // socket closed: finalise, then stop reading
        break;
    }
    else
    {
        data.insert(data.end(), buf, buf + res);
    }
}
The only purpose of the loop is to transfer data from the socket buffer to your application. Your application must then decide independently if there's enough data in the queue to attempt extraction of some sort of higher-level application message.
For example, in your case you would check if the queue's size is at least 5, then inspect the first five bytes, and then check if the queue holds a complete application message. If not, you abort, and if so, you extract the entire message and pop it off the front of the queue.
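The extraction step described above might be sketched like this. The question does not say how the 5 header bytes encode the length, so this sketch assumes they are 5 ASCII decimal digits giving the payload length; `kHeaderLen` and `try_extract` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <stdexcept>
#include <string>

// Assumption: the header is 5 ASCII digits holding the payload length.
constexpr std::size_t kHeaderLen = 5;

// Returns true and fills `message` if `data` holds a complete message;
// otherwise leaves `data` untouched and returns false.
bool try_extract(std::deque<char>& data, std::string& message) {
    if (data.size() < kHeaderLen) return false;  // header incomplete
    std::size_t payloadLen = 0;
    for (std::size_t i = 0; i < kHeaderLen; ++i) {
        char c = data[i];
        if (c < '0' || c > '9')
            throw std::invalid_argument("malformed length header");
        payloadLen = payloadLen * 10 + static_cast<std::size_t>(c - '0');
    }
    if (data.size() < kHeaderLen + payloadLen) return false;  // body incomplete

    auto first = data.begin() + static_cast<std::ptrdiff_t>(kHeaderLen);
    auto last = first + static_cast<std::ptrdiff_t>(payloadLen);
    message.assign(first, last);
    data.erase(data.begin(), last);  // pop header and payload off the front
    assert(message.size() == payloadLen);
    return true;
}
```

After each pass of the receive loop the application would call `try_extract` repeatedly until it returns false, since one read can deliver several messages at once.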
Use a state machine with two states:
First State.
Receive bytes as they arrive into a buffer. When there are 5 or more bytes perform your check on those first 5 and possibly process the rest of the buffer. Switch to the second state.
Second State.
Receive and process bytes as they arrive to the end of the message.
To answer your question specifically:
1. It's not safe to assume you'll get 0 or 5. It is possible to get 1-4 as well. Loop until you get 5 or an error, as others have suggested.
2. I wouldn't bother with PEEK. Most of the time you'll block (assuming blocking calls) or get 5, so skip the extra call into the stack.
3. This is fine too, but adds complexity for little gain.

Partial receipt of packets from socket C++

I have a problem: my server application sends a packet 8 bytes long (AABBCC1122334455), but my application receives it in two parts, AABBCC1122 and 334455, via the "recv" function. How can I fix that?
Thanks!
To sum up a little bit:
TCP connection doesn't operate with packets or messages on the application level, you're dealing with stream of bytes. From this point of view it's similar to writing and reading from a file.
Both send and recv can send and receive less data than provided in the argument. You have to deal with it correctly (usually by applying proper loop around the call).
As you're dealing with streams, you have to find the way to convert it to meaningful data in your application. In other words, you have to design serialisation protocol.
From what you've already mentioned, you most probably want to send some kind of messages (well, it's usually what people do). The key thing is to discover the boundaries of messages properly. If your messages are of fixed size, you simply grab the same amount of data from the stream and translate it to your message; otherwise, you need a different approach:
If you can come up with a character which cannot exist in your message, it could be your delimiter. You can then read the stream until you reach the character and it'll be your message. If you transfer ASCII characters (strings) you can use zero as a separator.
If you transfer binary data (raw integers etc.), all characters can appear in your message, so nothing can act as a delimiter. Probably the most common approach in this case is to use fixed-size prefix containing size of your message. Size of this extra field depends on the max size of your message (you will be probably safe with 4 bytes, but if you know what is the maximum size, you can use lower values). Then your packet would look like SSSS|PPPPPPPPP... (stream of bytes), where S is the additional size field and P is your payload (the real message in your application, number of P bytes is determined by value of S). You know every packet starts with 4 special bytes (S bytes), so you can read them as an 32-bit integer. Once you know the size of the encapsulated message, you read all the P bytes. After you're done with one packet, you're ready to read another one from the socket.
Good news though, you can come up with something completely different. All you need to know is how to deserialise your message from a stream of bytes and how send/recv behave. Good luck!
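To make the SSSS|PPPP... layout described above concrete, here is a sketch of the sending side's framing, assuming a 4-byte size field in network byte order. `make_frame` and `frame_payload_size` are hypothetical names used only for illustration.

```cpp
#include <arpa/inet.h>  // htonl, ntohl
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Build SSSS|PPPP...: a 4-byte big-endian size followed by the payload.
std::vector<char> make_frame(const std::string& payload) {
    assert(payload.size() <= UINT32_MAX);  // must fit in the 4-byte field
    std::uint32_t size = htonl(static_cast<std::uint32_t>(payload.size()));
    std::vector<char> frame(sizeof size + payload.size());
    std::memcpy(frame.data(), &size, sizeof size);
    std::memcpy(frame.data() + sizeof size, payload.data(), payload.size());
    return frame;
}

// Read the size field back from the first 4 bytes of a frame.
std::uint32_t frame_payload_size(const char* header) {
    assert(header != nullptr);
    std::uint32_t size;
    std::memcpy(&size, header, sizeof size);
    return ntohl(size);
}
```

The receiver would first collect exactly 4 bytes, pass them to `frame_payload_size`, and then collect exactly that many payload bytes.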
EDIT:
Example of function receiving arbitrary number of bytes into array:
bool recv_full(int sock, char *buffer, size_t size)
{
size_t received = 0;
while (received < size)
{
ssize_t r = recv(sock, buffer + received, size - received, 0);
if (r <= 0) break;
received += r;
}
return received == size;
}
And an example of receiving a packet with a 2-byte prefix defining the size of the payload (the payload is then limited to 65535 bytes):
uint16_t msgSize = 0;
char msg[0xffff];
if (recv_full(sock, reinterpret_cast<char *>(&msgSize), sizeof(msgSize)) &&
recv_full(sock, msg, msgSize))
{
// Got the message in msg array
}
else
{
// Something bad happened to the connection
}
That's just how recv() works on most platforms. You have to check the number of bytes you receive and continue calling it in a loop until you get the number that you need.
You "fix" that by reading from TCP socket in a loop until you get enough bytes to make sense to your application.
my server application sends packet 8 bytes length
Not really. Your server sends 8 individual bytes, not a packet 8 bytes long. TCP data is sent over a byte stream, not a packet stream. TCP neither respects nor maintains any "packet" boundary that you might have in mind.
If you know that your data is provided in quanta of N bytes, then call recv in a loop:
std::vector<char> read_packet(int N) {
    std::vector<char> buffer(N);
    int total = 0, count;
    // Write each chunk at offset total, right after the bytes already received.
    while ( total < N && (count = recv(sock_fd, &buffer[total], N-total, 0)) > 0 )
        total += count;
    return buffer;
}
std::vector<char> packet = read_packet(8);
If your packet is variable length, try sending its length before the data itself:
int read_int() {
    std::vector<char> buffer = read_packet(sizeof (int));
    int result;
    memcpy((void*)&result, (void*)&buffer[0], sizeof(int));
    return result;
}
int length = read_int();
std::vector<char> data = read_packet(length);

Determining the size of the next UDP datagram in system's queue

I want to know the size of the next UDP datagram in the system's queue.
I found this question with a similar doubt, but using boost. The last answer (as of 2010/09/23) says something about using getsockopt with the SO_NREAD option in OS X, but I can't find anything about this with Windows (using Winsock).
Here I found that I can use ioctlsocket with FIONREAD to find out what is the size of the entire queue, but I didn't find anything about the first datagram.
So my question is:
Is there a way to determine what is the size of the next UDP datagram in the queue using the sockets API? (I'm not using boost).
I want my code to look like this:
char BigBuffer[ 64 * 1024 ];
void Read( void *Buf, size_t Size ) {
    size_t LengthInQueue = WhatTheSizeOfTheNextDatagram();
    if( Size < LengthInQueue ) {
        recvfrom( Socket, BigBuffer, 64*1024, /*...*/ );
        memcpy( Buf, BigBuffer, Size );
    }
    else {
        recvfrom( Socket, Buf, Size, /*...*/ );
    }
}
I left out error checking and some parameters for the sake of space and readability.
I want to avoid copying to an intermediary buffer when it's not needed.
If I understand correctly, you just want to ensure that Buf doesn't overflow, but you don't care about any extra data beyond Size, since you're discarding it anyway. In that case, all you need is this:
recvfrom( Socket, Buf, Size, /*...*/ );
The remainder of the packet is automatically discarded.
Quoted from the docs:
For message-oriented sockets, data is extracted from the first enqueued message, up to the size of the buffer specified. If the datagram or message is larger than the buffer specified, the buffer is filled with the first part of the datagram, and recvfrom generates the error WSAEMSGSIZE. For unreliable protocols (for example, UDP) the excess data is lost.
You can read the datagram and take its length from the return value. To find out whether a datagram is waiting, use the select function; to get the whole datagram, call recvfrom with a 64 KB buffer, and its return value is the size you want.
Call the ioctl function (ioctlsocket on Windows) with FIONREAD:
#ifdef WIN32
    u_long ul = 0;
    if(ioctlsocket(socket, FIONREAD, &ul) == SOCKET_ERROR) {
        return -1;
    }
    return (int)ul;
#else
    int i = 0; // FIONREAD writes its result into an int
    if(ioctl(socket, FIONREAD, &i) == -1) {
        return -1;
    }
    return i;
#endif
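On Linux specifically (this does not apply to Winsock, which the question asks about), recv() with MSG_PEEK | MSG_TRUNC reports the real length of the next queued datagram without removing it. A sketch, with the hypothetical name `next_datagram_size`:

```cpp
#include <cassert>
#include <sys/socket.h>
#include <sys/types.h>

// Linux-only sketch: returns the size in bytes of the next queued datagram
// without consuming it, or -1 on error. Blocks while no datagram is queued,
// unless the socket is non-blocking.
ssize_t next_datagram_size(int sock) {
    assert(sock >= 0);
    char probe;  // 1-byte buffer; MSG_TRUNC still reports the full length
    return recv(sock, &probe, sizeof probe, MSG_PEEK | MSG_TRUNC);
}
```

Because of MSG_PEEK the datagram stays in the queue, so a following recvfrom() with a buffer of the reported size receives it in full.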