Pointer in C++ - Need explanation how it works

Pointer in C++ - Need explanation how it works - c++

http://www.codeproject.com/KB/IP/SocketFileTransfer.aspx?artkw=socket%20send%20a%20file
I don't clearly understand this line :
// get the file's size first
cbLeftToReceive = sizeof( dataLength );
do
{
BYTE* bp = (BYTE*)(&dataLength) + sizeof(dataLength) - cbLeftToReceive;
cbBytesRet = sockClient.Receive( bp, cbLeftToReceive );
// test for errors and get out if they occurred
if ( cbBytesRet == SOCKET_ERROR || cbBytesRet == 0 )
{
int iErr = ::GetLastError();
TRACE( "GetFileFromRemoteSite returned
a socket error while getting file length\n"
"\tNumber of Bytes received (zero means connection was closed) = %d\n"
"\tGetLastError = %d\n", cbBytesRet, iErr );
/* you should handle the error here */
bRet = FALSE;
goto PreReturnCleanup;
}
// good data was retrieved, so accumulate
// it with already-received data
cbLeftToReceive -= cbBytesRet;
}
while ( cbLeftToReceive > 0 );
I want to know how get it get size of the file to dataLength :)
This line : BYTE* bp = (BYTE*)(&dataLength) + sizeof(dataLength) - cbLeftToReceive; does it right that bp is a byte pointer of dataLength addr but what + sizeof(dataLength) - cbLeftToReceive mean ?
I don't think the file is that small : 4 bytes, just onc Receive how can they receive dataLength and data ? Does it send dataLength first and data after ?

Oh. The funny array arithmetic. The idea is to count from the end, so that when you reach the end you know you're done. In pieces:
1. Find the address of dataLength (BYTE*)(&dataLength)
2. Skip to the end of dataLength + sizeof(dataLength)
3. Back up by the number of bytes we expect to receive - cbLeftToReceive
This is where we are writing the bytes we get from the network.
As we get bytes from the network, we reduce cbLeftToReceive (cbLeftToReceive -= cbBytesRet;) and continue trying to receive bytes until we are done. So every time through the loop, bp points to where we need to write the next bytes we Receive().
EDIT:
So now that we know how many bytes we're going to get, how to we receive them without potentially filling all of RAM with hunks of the data? We get a buffer, repeatedly fill it, and flush that buffer to disk whenever it's not empty. When there's still a lot of data (more than a buffer) left to receive, we try to Receive() a fill buffer. When there's less than a full buffer to go, we only request to the end of the file.
iiGet = (cbLeftToReceive<RECV_BUFFER_SIZE) ?
cbLeftToReceive : RECV_BUFFER_SIZE ;
iiRecd = sockClient.Receive( recdData, iiGet );
The we catch and handle errors. If there was no error, write however many bytes we got and reduce the number of bytes we expect to receive by the number we got.
destFile.Write( recdData, iiRecd); // Write it
cbLeftToReceive -= iiRecd;
If we're still not done receiving bytes, go back to the top and keep going.
while ( cbLeftToReceive > 0 );
General advice:
It's good to practice reading code where you don't pay too much attention to the error handling and exception handling code. Typically what's left is much easier to understand.

He/She means that he sets aside the size of int in the start of the buffer, where the size of the file will be placed (it will be later read from the socket)

Related

C++ TCP recv unknown buffer size

I want to use the function recv(socket, buf, len, flags) to receive an incoming packet. However I do not know the length of this packet prior to runtime so the first 8 bytes are supposed to tell me the length of this packet. I don't want to just allocate an arbitrarily large len to accomplish this so is it possible to set len = 8 have buf be a type of uint64_t. Then afterwards
memcpy(dest, &buf, buf)?

Since TCP is stream-based, I'm not sure what type of packages you mean. I will assume that you are referring to application level packages. I mean packages which are defined by your application and not by underlying protocols like TCP. I will call them messages instead to avoid confusion.
I will show two possibilities. First I will show, how you could read a message without knowing the length before you have finished reading. The second example will do two calls. First it reads the size of the message. Then it read the whole message at once.
Read data until the message is complete
Since TCP is stream-based, you will not loss any data when your buffer is not big enough. So you can read a fixed amount of bytes. If something is missing, you can call recv again. Here is a extensive example. I just wrote it without testing. I hope everything would work.
std::size_t offset = 0;
std::vector<char> buf(512);
std::vector<char> readMessage() {
while (true) {
ssize_t ret = recv(fd, buf.data() + offset, buf.size() - offset, 0);
if (ret < 0) {
if (errno == EINTR) {
// Interrupted, just try again ...
continue;
} else {
// Error occured. Throw exception.
throw IOException(strerror(errno));
}
} else if (ret == 0) {
// No data available anymore.
if (offset == 0) {
// Client did just close the connection
return std::vector<char>(); // return empty vector
} else {
// Client did close connection while sending package?
// It is not a clean shutdown. Throw exception.
throw ProtocolException("Unexpected end of stream");
}
} else if (isMessageComplete(buf)) {
// Message is complete.
buf.resize(offset + ret); // Truncate buffer
std::vector<char> msg = std::move(buf);
std::size_t msgLen = getSizeOfMessage(msg);
if (msg.size() > msgLen) {
// msg already contains the beginning of the next message.
// write it back to buf
buf.resize(msg.size() - msgLen)
std::memcpy(buf.data(), msg.data() + msgLen, msg.size() - msgLen);
msg.resize(msgLen);
}
buf.resize(std::max(2*buf.size(), 512)) // prepare buffer for next message
return msg;
} else {
// Message is not complete right now. Read more...
offset += ret;
buf.resize(std::max(buf.size(), 2 * offset)); // double available memory
}
}
}
You have to define bool isMessageComplete(std::vector<char>) and std::size_t getSizeOfMessage(std::vector<char>) by yourself.
Read the header and check the length of the package
The second possibility is to read the header first. Just the 8 bytes which contains the size of the package in your case. After that, you know the size of the package. This mean you can allocate enough storage and read the whole message at once:
/// Reads n bytes from fd.
bool readNBytes(int fd, void *buf, std::size_t n) {
std::size_t offset = 0;
char *cbuf = reinterpret_cast<char*>(buf);
while (true) {
ssize_t ret = recv(fd, cbuf + offset, n - offset, MSG_WAITALL);
if (ret < 0) {
if (errno != EINTR) {
// Error occurred
throw IOException(strerror(errno));
}
} else if (ret == 0) {
// No data available anymore
if (offset == 0) return false;
else throw ProtocolException("Unexpected end of stream");
} else if (offset + ret == n) {
// All n bytes read
return true;
} else {
offset += ret;
}
}
}
/// Reads message from fd
std::vector<char> readMessage(int fd) {
std::uint64_t size;
if (readNBytes(fd, &size, sizeof(size))) {
std::vector buf(size);
if (readNBytes(fd, buf.data(), size)) {
return buf;
} else {
throw ProtocolException("Unexpected end of stream");
}
} else {
// connection was closed
return std::vector<char>();
}
}
The flag MSG_WAITALL requests that the function blocks until the full amount of data is available. However, you cannot rely on that. You have to check it and read again if something is missing. Just like I did it above.
readNBytes(fd, buf, n) reads n bytes. As far as the connection was not closed from the other side, the function will not return without reading n bytes. If the connection was closed by the other side, the function returns false. If the connection was closed in the middle of a message, an exception is thrown. If an i/o-error occurred, another exception is thrown.
readMessage reads 8 bytes [sizeof(std::unit64_t)] und use them as size for the next message. Then it reads the message.
If you want to have platform independency, you should convert size to a defined byte order. Computers (with x86 architecture) are using little endian. It is common to use big endian in network traffic.
Note: With MSG_PEEK it is possible to implement this functionality for UDP. You can request the header while using this flag. Then you can allocate enough space for the whole package.

A fairly common technique is to read leading message length field, then issue a read for the exact size of the expected message.
HOWEVER! Do not assume that the first read will give you all eight bytes(see Note), or that the second read will give you the entire message/packet.
You must always check the number of bytes read and issue another read (or two (or three, or...)) to get all the data you want.
Note: Because TCP is a streaming protocol and because the packet size "on the wire" varies in accordance with a very arcane algorithm designed to maximize network performance, you could easily issue a read for eight bytes and the read could return having only read three (or seven or ...) bytes. The guarantee is that unless there is an unrecoverable error you will receive at least one byte and at most the number of bytes you requested. Because of this you must be prepared to do byte address arithmetic and issue all reads in a loop that repeats until the desired number of bytes is returned.

Since TCP is streaming there isn't really any end to the data you receive, not until the connection is closed or there is an error.
Instead you need to implement your own protocol on top of TCP, one that either contains a specific end-of-message marker, a length-of-data header field, or possibly a command-based protocol where the data of each command is of a well-known size.
That way you can read into a small fixed-sized buffer and append to a larger (possibly expanding) buffer as needed. The "possibly expanding" part is ridiculously easy in C++, what with std::vector and std::string (depending on the data you have)
There is another important thing to remember, that since TCP is stream-based, a single read or recv call may not actually fetch all the data you request. You need to receive the data in a loop until you have received everything.

In my Personal opinion.
I suggest receive "size of message"(integer 4 byte fixed) first.
recv(socket, "size of message written in integer" , "size of integer")
then
receive real message after.
recv(socket, " real message" ,"size of message written in integer")
This techinique also can be used on "sending files, images ,long messages"

error on tcp sending buffer of a Mat

I am trying to send out Mat image by TCP. Firstly the Mat has been transferred into uchar and then into char format. The whole image in char format will be send out buffer by buffer whose size is 1024 byte. The following is my code.
Mat decodeImg = imdecode(Mat(bufferFrame), 1);
uchar *transferImg = decodeImg.data;
char* charImg = (char*) transferImg;
int length = strlen(charImg);
int offset = 0;
while (true)
{
bzero(bufferSend, BUFFER_SIZE);
if (offset + BUFFER_SIZE <= length)
{
for (int i = 0; i < BUFFER_SIZE; i++)
{
bufferSend[i] = charImg[i + offset];
}
// memcpy(charImg+offset, bufferSend,BUFFER_SIZE);
if (send(sockfd, bufferSend, sizeof(bufferSend), 0) < 0)
{
printf("Send FIle Failed,total length is%d,failed offset is%d\n",
length,
offset);
break;
}
}
else
{
for (int i = 0; i < length - offset; i++)
{
bufferSend[i] = charImg[i + offset];
}
if (send(sockfd, bufferSend, sizeof(bufferSend), 0) < 0)
{
printf("Send FIle Failed,total length is%d,failed offset is%d\n",
length,
offset);
break;
}
break;
}
offset += BUFFER_SIZE;
}
The output of the code shows : send file failed, total length is 251035, failed offset is 182272.
I am really appreciated on your help. Thank you in advance!

Pulling out the crystal ball here. This might be OP's problem, but if it isn't, this is certainly a problem that needs to be addressed.
Mat decodeImg = imdecode(Mat(bufferFrame), 1);
uchar *transferImg = decodeImg.data;
Get data. Not a bad idea if that's what you need to send.
char* charImg = (char*) transferImg;
Take the array of bytes from above and treat it as an array of characters.
int length = strlen(charImg);
And Boom. Matrix data is not ascii formated data, a string, so it should not be treated like a string.
strlen counts data until it reaches a null character, a character with the numerical value 0, which does not exist in the normal alpha numeric world and thus can be used as a canary value to signal the end of a string. The count is the number of characters before the first null character in the string.
In this case we don't have a string. We have a blob of binary numbers, any one of which could bee 0. There could be a null value anywhere. Could be right at the beginning. Could be a hundred bytes in. There might not be a null value in the until long after all of the valid image data has been read.
Anyway, strlen will almost certainly return the wrong value. Too few bytes and the receiver doesn't get all of the image data and I have no idea what it does. That code's not available to us. It probably gets upset and discards the result. Maybe it crashes. There's no way to know. If there is too much information, we also don't know what happens. Maybe it processes the file happily and ignores the extra crap that's sent. Maybe it crashes.
But what if it closes the TCP/IP connection when it has enough bytes? That leaves the sender trying to write a handful of unsent and unwanted bytes into a closed socket. send will fail and set the error code to socket closed.
Solution:
Get the right size of the data.
What I'm reading from the openCV documentation is inside a Mat is Mat::elemSize which will give you the size of each item in the matrix and Mat::size which returns a Size object containing the rows and columns. Multiply rows * columns * elemSize and you should have the number of bytes to send.
EDIT
This looks to be a better way to get the size.

Garbage values and Buffers differences in TCP

First question: I am confused between Buffers in TCP. I am trying to explain my proble, i read this documentation TCP Buffer, author said a lot about TCP Buffer, thats fine and a really good explanation for a beginner. What i need to know is this TCP Buffer is same buffer with the one we use in our basic client server program (Char *buffer[Some_Size]) or its some different buffer hold by TCP internally ?
My second question is that i am sending a string data with prefix length (This is data From me) from client over socket to server, when i print my data at console along with my string it prints some garbage value also like this "This is data From me zzzzzz 1/2 1/2....." ?. However i fixed it by right shifting char *recvbuf = new char[nlength>>3]; nlength to 3 bits but why i need to do it in this way ?
My third question is in relevance with first question if there is nothing like TCP Buffer and its only about the Char *buffer[some_size] then whats the difference my program will notice using such static memory allocation buffer and by using dynamic memory allocation buffer using char *recvbuf = new char[nlength];. In short which is best and why ?
Client Code
int bytesSent;
int bytesRecv = SOCKET_ERROR;
char sendbuf[200] = "This is data From me";
int nBytes = 200, nLeft, idx;
nLeft = nBytes;
idx = 0;
uint32_t varSize = strlen (sendbuf);
bytesSent = send(ConnectSocket,(char*)&varSize, 4, 0);
assert (bytesSent == sizeof (uint32_t));
std::cout<<"length information is in:"<<bytesSent<<"bytes"<<std::endl;
// code to make sure all data has been sent
while (nLeft > 0)
{
bytesSent = send(ConnectSocket, &sendbuf[idx], nLeft, 0);
if (bytesSent == SOCKET_ERROR)
{
std::cerr<<"send() error: " << WSAGetLastError() <<std::endl;
break;
}
nLeft -= bytesSent;
idx += bytesSent;
}
std::cout<<"Client: Bytes sent:"<< bytesSent;
Server code:
int bytesSent;
char sendbuf[200] = "This string is a test data from server";
int bytesRecv;
int idx = 0;
uint32_t nlength;
int length_received = recv(m_socket,(char*)&nlength, 4, 0);//Data length info
char *recvbuf = new char[nlength];//dynamic memory allocation based on data length info
//code to make sure all data has been received
while (nlength > 0)
{
bytesRecv = recv(m_socket, &recvbuf[idx], nlength, 0);
if (bytesRecv == SOCKET_ERROR)
{
std::cerr<<"recv() error: " << WSAGetLastError() <<std::endl;
break;
}
idx += bytesRecv;
nlength -= bytesRecv;
}
cout<<"Server: Received complete data is:"<< recvbuf<<std::endl;
cout<<"Server: Received bytes are"<<bytesRecv<<std::endl;
WSACleanup();
system("pause");
delete[] recvbuf;
return 0;
}

You send 200 bytes from the client, unconditionally, but in the server you only receive the actual length of the string, and that length does not include the string terminator.
So first of all you don't receive all data that was sent (which means you will fill up the system buffers), and then you don't terminate the string properly (which leads to "garbage" output when trying to print the string).
To fix this, in the client only send the actual length of the string (the value of varSize), and in the receiving server allocate one more character for the terminator, which you of course needs to add.

First question: I am confused between Buffers in TCP. I am trying to
explain my proble, i read this documentation TCP Buffer, author said a
lot about TCP Buffer, thats fine and a really good explanation for a
beginner. What i need to know is this TCP Buffer is same buffer with
the one we use in our basic client server program (Char
*buffer[Some_Size]) or its some different buffer hold by TCP internally ?
When you call send(), the TCP stack will copy some of the bytes out of your char array into an in-kernel buffer, and send() will return the number of bytes that it copied. The TCP stack will then handle the transmission of those in-kernel bytes to its destination across the network as quickly as it can. It's important to note that send()'s return value is not guaranteed to be the same as the number of bytes you specified in the length argument you passed to it; it could be less. It's also important to note that sends()'s return value does not imply that that many bytes have arrived at the receiving program; rather it only indicates the number of bytes that the kernel has accepted from you and will try to deliver.
Likewise, recv() merely copies some bytes from an in-kernel buffer to the array you specify, and then drops them from the in-kernel buffer. Again, the number of bytes copied may be less than the number you asked for, and generally will be different from the number of bytes passed by the sender on any particular call of send(). (E.g if the sender called send() and his send() returned 1000, that might result in you calling recv() twice and having recv() return 500 each time, or recv() might return 250 four times, or (1, 990, 9), or any other combination you can think of that eventually adds up to 1000)
My second question is that i am sending a string data with prefix
length (This is data From me) from client over socket to server, when
i print my data at console along with my string it prints some garbage
value also like this "This is data From me zzzzzz 1/2 1/2....." ?.
However i fixed it by right shifting char *recvbuf = new
char[nlength>>3]; nlength to 3 bits but why i need to it in this way ?
Like Joachim said, this happens because C strings depend on the presence of a NUL-terminator byte (i.e. a zero byte) to indicate their end. You are receiving strlen(sendbuf) bytes, and the value returned by strlen() does not include the NUL byte. When the receiver's string-printing routine tries to print the string, it keeps printing until if finds a NUL byte (by chance) somewhere later on in memory; in the meantime, you get to see all the random bytes that are in memory before that point. To fix the problem, either increase your sent-bytes counter to (strlen(sendbuf)+1), so that the NUL terminator byte gets received as well, or alternatively have your receiver manually place the NUL byte at the end of the string after it has received all of the bytes of the string. Either way is acceptable (the latter way might be slightly preferable as that way the receiver isn't depending on the sender to do the right thing).
Note that if your sender is going to always send 200 bytes rather than just the number of bytes in the string, then your receiver will need to always receive 200 bytes if it wants to receive more than one block; otherwise when it tries to receive the next block it will first get all the extra bytes (after the string) before it gets the next block's send-length field.
My third question is in relevance with first question if there is
nothing like TCP Buffer and its only about the Char *buffer[some_size]
then whats the difference my program will notice using such static
memory allocation buffer and by using dynamic memory allocation buffer
using char *recvbuf = new char[nlength];. In short which is best and
why ?
In terms of performance, it makes no difference at all. send() and receive() don't care a bit whether the pointers you pass to them point at the heap or the stack.
In terms of design, there are some tradeoffs: if you use new, there is a chance that you can leak memory if you don't always call delete[] when you're done with the buffer. (This can particularly happen when exceptions are thrown, or when error paths are taken). Placing the buffer on the stack, on the other hand, is guaranteed not to leak memory, but the amount of space available on the stack is finite so a really huge array could cause your program to run out of stack space and crash. In this case, a single 200-byte array on the stack is no problem, so that's what I would use.

receive from unix local socket and buffer size

I'm having a problem with unix local sockets. While reading a message that's longer than my temp buffer size, the request takes too long (maybe indefinitely).
Added after some tests:
there is still problem with freeze at ::recv. when I send (1023*8) bytes or less to the UNIX socket - all ok, but when sended more than (1023*9) - i get freeze on recv command.
maybe its FreeBSD default UNIX socket limit or C++ default socket settings? Who know?
i made some additational tests and I am 100% sure that its "freeze" on the last 9th itteration when executing ::recv command, when trying to read message >= (1023*9) bytes long. (first 8th itterationg going well.)
What I'm doing:
The idea is to read in a do/while loop from a socket with
::recv (current_socket, buf, 1024, 0);
and check buf for a SPECIAL SYMBOL. If not found:
merge content of buffer to stringxxx += buf;
bzero temp buf
continue the ::recv loop
How do I fix the issue with the request taking too long in the while loop?
Is there a better way to clear the buffer? Currently, it's:
char buf [1025];
bzero(buf, 1025);
But I know bzero is deprecated in the new c++ standard.
EDIT:
*"Why need to clean the buffer*
I see questions at comments with this question. Without buffer cleanup on the next(last) itteration of reading to the buffer, it will contain the "tail" of first part of the message.
Example:
// message at the socket is "AAAAAACDE"
char buf [6];
::recv (current_socket, buf, 6, 0); // read 6 symbols, buf = "AAAAAA"
// no cleanup, read the last part of the message with recv
::recv (current_socket, buf, 6, 0);
// read 6 symbols, but buffer contain only 3 not readed before symbols, therefore
// buf now contain "CDEAAA" (not correct, we waiting for CDE only)

When your recv() enters an infinite loop, this probably means that it's not making any progress whatsoever on the iterations (i.e., you're always getting a short read of zero size immediately, so your loop never exits, because you're not getting any data). For stream sockets, a recv() of zero size means that the remote end has disconnected (it's something like read()ing from a file when the input is positioned at EOF also gets you zero bytes), or at least that it has shut down the sending channel (that's for TCP specifically).
Check whether your PHP script is actually sending the amount of data you claim it sends.
To add a small (non-sensical) example for properly using recv() in a loop:
char buf[1024];
std::string data;
while( data.size() < 10000 ) { // what you wish to receive
::ssize_t rcvd = ::recv(fd, buf, sizeof(buf), 0);
if( rcvd < 0 ) {
std::cout << "Failed to receive\n"; // Receive failed - something broke, see errno.
std::abort();
} else if( !rcvd ) {
break; // No data to receive, remote end closed connection, so quit.
} else {
data.append(buf, rcvd); // Received into buffer, attach to data buffer.
}
}
if( data.size() < 10000 ) {
std::cout << "Short receive, sender broken\n";
std::abort();
}
// Do something with the buffer data.

Instead of bzero, you can just use
memset(buf, 0, 1025);

These are 2 separate issues. The long time is probably some infinite loop due to a bug in your code and has nothing to do with the way you clear your buffer. As a matter of fact you shouldn't need to clear the buffer; receive returns the number of bytes read, so you can scan the buffer for your SPECIAL_SYMBOL up to that point.
If you paste the code maybe I can help. more.

Just to clarify: bzero is not deprecated in C++ 11. Rather, it's never been part of any C or C++ standard. C started out with memset 20+ years ago. For C++, you might consider using std::fill_n instead (or just using std::vector, which can zero-fill automatically). Then again, I'm not sure there's a good reason to zero-fill the buffer in this case at all.

my c++ client/server file exchange implementation is very slow...why?

Hi have implemented simple file exchange over a client/server connection in c++. Works fine except for the one problem that its so damn slow. This is my code:
For sending the file:
int send_file(int fd)
{
char rec[10];
struct stat stat_buf;
fstat (fd, &stat_buf);
int size=stat_buf.st_size;
while(size > 0)
{
char buffer[1024];
bzero(buffer,1024);
bzero(rec,10);
int n;
if(size>=1024)
{
n=read(fd, buffer, 1024);
// Send a chunk of data
n=send(sockFile_, buffer, n, 0 );
// Wait for an acknowledgement
n = recv(sockFile_, rec, 10, 0 );
}
else // reamining file bytes
{
n=read(fd, buffer, size);
buffer[size]='\0';
send(sockFile_,buffer, n, 0 );
n=recv(sockFile_, rec, 10, 0 ); // ack
}
size -= 1024;
}
// Send a completion string
int n = send(sockFile_, "COMP",strlen("COMP"), 0 );
char buf[10];
bzero(buf,10);
// Receive an acknowledgemnt
n = recv(sockFile_, buf, 10, 0 );
return(0);
}
And for receiving the file:
int receive_file(int size, const char* saveName)
{
ofstream outFile(saveName,ios::out|ios::binary|ios::app);
while(size > 0)
{
// buffer for storing incoming data
char buf[1024];
bzero(buf,1024);
if(size>=1024)
{
// receive chunk of data
n=recv(sockFile_, buf, 1024, 0 );
// write chunk of data to disk
outFile.write(buf,n);
// send acknowledgement
n = send(sockFile_, "OK", strlen("OK"), 0 );
}
else
{
n=recv(sockFile_, buf, size, 0 );
buf[size]='\0';
outFile.write(buf,n);
n = send(sockFile_, "OK", strlen("OK"), 0 );
}
size -= 1024;
}
outFile.close();
// Receive 'COMP' and send acknowledgement
// ---------------------------------------
char buf[10];
bzero(buf,10);
n = recv(sockFile_, buf, 10, 0 );
n = send(sockFile_, "OK", strlen("OK"), 0 );
std::cout<<"File received..."<<std::endl;
return(0);
}
Now here are my initial thoughts: Perhaps the buffer is too small. I should therefore try increasing the size from I dunno, 1024 bytes (1KB) to 65536 (64KB) blocks, possibly. But this results in file corruption. Ok, so perhaps the code is also being slowed down by the need to receive an acknowledgement after each 1024 byte block of data has been sent, so why not remove them? Unfortunately this results in the blocks not arriving in the correct order and hence file corruption.
Perhaps I could split the file into chunks before hand and create multiple connections and send each chunk over its own threaded connection and then reassemble the chunks somehow in the receiver....
Any idea how I could make the file transfer process more efficient (faster)?
Thanks,
Ben.

Skip the acknowledgement of buffers! You insert an artificial round trip (server->client+client->server) for probably each single packet.
This slows down the transfer.
You do not need this ack. You are using TCP, which gives you a reliable stream. Send the number of bytes, then send the whole file. Do not read after send and so on.
EDIT: As a second step, you should increase the buffer size. For internet transfer you can assume an MTU of 1500, so there will be space for a payload of 1452 bytes in each IP packet. This should be your minimal buffer size. Make it larger and let the operating system slice the buffers into packets for you. For LAN you have a much higher MTU.

My guess is that you are getting out of sync and some of your reads are less than 1024. It happens all the time with sockets. The "size -= 1024" statement should be "size -= n".
My guess is that n is sometimes less than 1024 from the recv().

You should certainly increase the buffer size, and if this causes corruption it is an error in your code, which you need to fix. Also, if you use a stream protocol (i.e. TCP/IP) the order and delivery of packets is guaranteed.

Read this thread:
Send and Receive a file in socket programming in Linux with C/C++ (GCC/G++)
Oh, and use sendfile POSIX command, here's an example to get you started:
http://tldp.org/LDP/LGNET/91/misc/tranter/server.c.txt

A couple of things.
1) You are reallocating the buffer each time you go through your while loop:
while(size > 0)
{
char buf[1024];
You can pull it out of the while loop on both sides and you won't be dumping on your stack as much.
2) 1024 is a standard buffer size, and I wouldn't go much above 2048 because then the lower level TCP/IP stack will just have to break it up anyways.
3) If you really need speed, rather than waiting for a recv ack you could just add a packet number to each packet and then check them on the receiving end. This makes your receiving code a little more complex because it has to store packets that are out of order and put them in order. But then you wouldn't need an acknowledgement.
4) It's a little thing, but what if the file that you are sending has a size that is a multiple of 1024... Then you won't send the trailing '/0'. To fix that you just need to change your while to:
while (size >= 0)

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js