zlib inflate error: Z_DATA_ERROR when the received packets are out-of-order or lost - compression

I have worked on this for weeks and really hope for your help! Please forgive my poor English.
First, I think it is necessary to describe the application scenario:
What data do I want to decompress? The data comes from network traffic captured from the internet. Some of that traffic carries data compressed with gzip inside HTTP/TCP packets; if the data is larger than the maximum TCP payload, it is split across several packets before transmission. I can extract the compressed data from these packets and group it by TCP stream, so I can be sure that the data extracted from the packets of one specific TCP stream belongs to the same source. The data therefore consists of many compressed chunks, and the application scenario requires each chunk to be decompressed immediately when its packet is received. For each TCP stream we maintain one z_stream structure.
When does the program report an error? Every error is "Z_DATA_ERROR: invalid distance too far back", and I found that it happens whenever the received packets are out of order or some packets are lost.
A simple case:
The compressed data is split into multiple blocks stored in network packets (p1, p2, p3, p4, p5, p6, p7) and transmitted in one specific TCP stream. For each TCP stream we maintain a z_stream structure. Obviously p1 contains the gzip header (0x1f 0x8b 0x08...), but because of the uncertainty of network transmission the packets may arrive out of order or get lost, for example (p1, p2, p5, p6, p7, p3, p4). The first two packets decompress normally, but when decompressing p5 the error occurs (Z_DATA_ERROR).
So I have these questions:
Because of the application scenario, I need to decompress the data as soon as a packet with gzip content-encoding is received. So I want to know whether zlib supports such a function: directly decompressing a compressed block without having to consider the packet arrival order?
I also tested the influence of the packet arrival order: if I sort the data into its original order and then decompress it sequentially, it decompresses normally (a sketch of what I mean by re-ordering follows after this list).
Thirdly, for the arrival order (p1, p2, p5, p6, p7, p3, p4), when I decompress the packets sequentially, p1 and p2 decompress successfully and p5, p6, p7 fail. The next packet received is p3; logically it should decompress successfully, but when I tested this case it failed, and I don't understand why.
I also found a confusing problem, which does not happen often: if I sort the packets as (p1, p2, p3, p5, p4, ...), then logically decompressing p5 should report an error, but it decompresses successfully, and I don't understand that either.
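To make the re-ordering idea concrete, here is a rough sketch (not my real capture code; the StreamReassembler type and its on_chunk() interface are only for illustration) of buffering chunks under their receive-order number and feeding inflate only contiguous data:

#include <map>
#include <vector>
#include <zlib.h>

// Hypothetical reassembly helper: chunks are stored under their order number
// and handed to inflate only once they are contiguous with what was already fed.
struct StreamReassembler {
    z_stream strm{};                                   // one z_stream per tcp stream
    std::map<int, std::vector<unsigned char>> pending; // chunks that arrived too early
    int next_expected = 1;                             // receive-order number we can feed next

    StreamReassembler() { inflateInit2(&strm, 32 | MAX_WBITS); } // accept gzip or zlib header

    void on_chunk(int order_num, const unsigned char *data, int len) {
        pending[order_num].assign(data, data + len);
        // feed every chunk that is now contiguous, in order
        while (!pending.empty() && pending.begin()->first == next_expected) {
            std::vector<unsigned char> &chunk = pending.begin()->second;
            strm.next_in  = chunk.data();
            strm.avail_in = (uInt)chunk.size();
            unsigned char out[4096];
            do {
                strm.next_out  = out;
                strm.avail_out = sizeof(out);
                inflate(&strm, Z_NO_FLUSH);            // error handling omitted in this sketch
            } while (strm.avail_out == 0);
            pending.erase(pending.begin());
            ++next_expected;
        }
    }
};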
The following is the source code:
/**
 * buf: the gzip-compressed data extracted from a tcp packet
 */
void dowithGzipDataByZlib(z_stream * p_zlib_strm, unsigned char * buf, int buflen)
{
    int zlib_status = Z_OK;
    int bytes_dc_now = 0;
    unsigned char pNowResBuff[4096];
    printf("-------\n");
    (*p_zlib_strm).avail_in = buflen;
    (*p_zlib_strm).next_in = buf;
    do {
        memset(pNowResBuff, 0, 4096);
        (*p_zlib_strm).avail_out = 4096;
        (*p_zlib_strm).next_out = pNowResBuff;
        zlib_status = inflate(p_zlib_strm, Z_NO_FLUSH);
        printf("inflate status:%d\n", zlib_status);
        if (Z_OK != zlib_status && Z_STREAM_END != zlib_status) {
            printf("(*p_zlib_strm).avail_in:%d\n", (*p_zlib_strm).avail_in);
            printf("err msg:%s\n", p_zlib_strm->msg);
            return;
        }
        bytes_dc_now = 4096 - (*p_zlib_strm).avail_out;
        // printf("bytes_dc_now:%d\n", bytes_dc_now);
    } while (0 == (*p_zlib_strm).avail_out);
    printf("(*p_zlib_strm).avail_in:%d\n", (*p_zlib_strm).avail_in);
}
// Under dirpath there are compressed data chunks extracted from the packets of one specific tcp stream, stored in files named "file_basename_%d" (%d is the receive-order number: 1, 2, 3, 4, ...).
void read(char* dirpath, char* file_basename)
{
    char filelist[99][255];
    int file_count = listDir(dirpath, filelist, 99, 255);
    char filepath[255];
    z_stream zlib_strm = {0};
    zlib_strm.zalloc = Z_NULL;
    zlib_strm.zfree = Z_NULL;
    zlib_strm.opaque = Z_NULL;
    zlib_strm.next_in = Z_NULL;
    zlib_strm.avail_in = 0;
    inflateInit2(&zlib_strm, 32 | MAX_WBITS);
    FILE* fp;
    char buf[2048];
    // sort_file_ind: the array that stores the original order of the compressed chunks.
    int sort_file_ind[99] = {0,1,2,3,15,16,17,18,19,20,21,4,5,6,7,8,9,10,11,12,13,14};
    for (int i = 1; i <= file_count - 2; i++)
    {
        memset(filepath, 0, sizeof(filepath));
        // snprintf(filepath, sizeof(filepath), "%s%s%d", dirpath, file_basename, sort_file_ind[i]);
        snprintf(filepath, sizeof(filepath), "%s%s%d", dirpath, file_basename, i);
        printf("%s\n", filepath);
        fp = fopen(filepath, "r");
        if (fp == NULL) {
            return;
        }
        fseek(fp, 0, SEEK_END);
        int flen = ftell(fp);
        fseek(fp, 0, SEEK_SET);
        memset(buf, 0, sizeof(buf));
        int dlen = fread(buf, 1, flen, fp);
        if (dlen != flen) {
            fclose(fp);
            return;
        }
        printf("dlen:%d\n", dlen);
        dowithGzipDataByZlib(&zlib_strm, (unsigned char *)buf, dlen);
        fclose(fp);
    }
}
char * dir = "/data/GzipDC/softDC/DocumentAnalyze/testbyzs/data/119.40.37.65.42050/";
char * base_filename = "119.40.37.65.42050>180.76.22.49.80_1_";
int main()
{
read(dir,base_filename);
return 0;
}
I've asked around and tried many things for days, and I really need someone with knowledge on the subject to weigh in here. Thanks for your time!

Related

reading .wav and using http post to transfer contents with esp32 esp8266

Background:
I have a .wav file saved on an SD card. I would like to transfer that file to a server using my esp32. I am using node red to handle the server side activities.
Method Employed:
open the file in binary mode.
evaluate the size of the file
decide on a max upload size and allocate a buffer
Read the file and store to the buffer.
use http post to send data to the server.
if file is too large to send in a single buffer then divide the file up and send multiple http posts.
Problem:
I can successfully send text files. When I try to send .wav files, the size of the sent wave file increases and the file is corrupted. Analyzing the file is difficult as it's not all text; what I have done is open the file in Notepad++ to see if I can spot anything. In theory everything should be the same, but several characters come up as blank squares in the transferred file while others come through exactly the same.
Analysis/Theory:
I am quite lost as to what the issue is. My leading theory is that a wave file is written as int16_t, but in order to POST the data it needs to be a uint8_t*, so maybe data is lost when the int16_t is cast to uint8_t. I looked at trying to split an int16_t into two int8_t bytes as done here https://stackoverflow.com/a/53374797/14050333 but had no luck; maybe I'm jumping to conclusions. Any help would be hugely appreciated!
Code:
Full code used to send text files.
void loop()
{
    WiFiClient client;
    Serial.println("starting file upload");
    IPAddress host(192, 168, 0, 37);
    int port = 1880;
    if (!client.connect(host, port))
    { // check connection to host; if untrue, internet connection could be down
        Serial.println("couldn't connect to host");
    }
    HTTPClient http;
    const char* serverName = "http://192.168.0.37:1880/sensor_file";
    http.begin(client, serverName);
    char *fname = "/sdcard/test_text.txt";
    FILE *fp = fopen(fname, "rb"); // read in bytes
    //get file size
    fseek(fp, 0, SEEK_END); //send file pointer to end of file
    int file_size = ftell(fp); //get end position of file
    fseek(fp, 0, SEEK_SET); //send pointer back to start
    int max_upload_size = 10; // array size, larger = fewer uploads but too large can cause memory issues
    int num_of_uploads = file_size / max_upload_size; // figure out how many evenly sized upload chunks we need
    int num_of_uploads_mod = file_size % max_upload_size; //find out size of remaining upload chunk if needed
    int i;
    //upload file in even chunks
    if (num_of_uploads > 0)
    {
        char buff1[max_upload_size+1] = {}; // array to save the file to; add 1 for the terminating '\0'
        for (i = 0; i < num_of_uploads; i++)
        {
            fread(buff1, sizeof(buff1)-1, 1, fp); // -1 as we don't want to count the '\0'
            http.addHeader("File_name", "test file"); //header to say what the file name is
            int httpResponseCode = http.POST((uint8_t *)buff1, sizeof(buff1)-1); //send data. Datatype is (uint8_t *)
        }
    }
    //upload any remaining data
    if (num_of_uploads_mod > 0)
    {
        int remainder = file_size - num_of_uploads * max_upload_size;
        char buff2[remainder+1] = {};
        fread(buff2, sizeof(buff2)-1, 1, fp); //read from file and store to buff2
        http.addHeader("File_name", "test file");
        int httpResponseCode = http.POST((uint8_t *)buff2, sizeof(buff2)-1); //send buff2 to server
    }
    http.end(); // Close connection
    delay(10 * 1000);
}
Adjustments made for .wav files
int remainder = file_size - num_of_uploads * max_upload_size;
int16_t buff2[remainder+1] = {};
fread(buff2, sizeof(buff2)-1, 1, fp); //remainder
http.addHeader("File_name", "test file");
int httpResponseCode = http.POST((uint8_t *)buff2, sizeof(buff2)-1);
It's working!
There were 2 main issues with the code, as outlined by heap underrun. The first issue is that I was reading in the wav file as int16_t; the correct datatype to use was uint8_t.
Why are you using an array of int16_t-type elements as a buffer? You are reading a file in binary mode, so be it .wav, .jpg, .ttf, or anything else, it's just a sequence of bytes (uint8_t, not int16_t) anyway. Another thing, fread() expects the size of each object to read as the second parameter and the number of objects to read as the third parameter, so, in case of objects being bytes, first define buffer as uint8_t buff1[max_upload_size] = {}; (no need for +1/-1 games), and then fread(buff1, sizeof *buff1, sizeof buff1 / sizeof *buff1, fp);. The same for buff2. –
heap underrun
The second issue was that I did not include a header in the POST specifying the content type. It wasn't needed for the text file, since node-red lets you choose the encoding when writing the file, so I didn't think I would need it; however, as it turns out, I needed to add:
http.addHeader("Content-Type", "application/octet-stream");
Below is the working code for the file upload section:
if (num_of_uploads > 0)
{
    uint8_t buff1[max_upload_size] = {};
    for (i = 0; i < num_of_uploads; i++)
    {
        fread(buff1, sizeof *buff1, sizeof buff1 / sizeof *buff1, fp);
        http.addHeader("File_name", "test file"); //header to say what the file name is
        http.addHeader("Content-Type", "application/octet-stream");
        int httpResponseCode = http.POST(buff1, sizeof(buff1));
    }
}
if (num_of_uploads_mod > 0)
{
    int remainder = file_size - num_of_uploads * max_upload_size;
    uint8_t buff2[remainder] = {};
    fread(buff2, sizeof *buff2, sizeof buff2 / sizeof *buff2, fp);
    http.addHeader("File_name", "test file");
    http.addHeader("Content-Type", "application/octet-stream");
    int httpResponseCode = http.POST(buff2, sizeof(buff2));
}
On a slightly interesting side note out of curiosity I tried running the above code but with
uint16_t buff1[max_upload_size] = {};
and
http.POST((uint8_t *)buff1, sizeof(buff1));
The file uploaded but the size was 2x what it should be, curiously however the file wasn't corrupted, and played the audio as it was recorded. Just thought that was interesting.
I'll close out this answer as the original question was successfully answered. Again, thank you for the help; I've been at this for literally weeks and you solved my problems in hours!

Sending files in socket programming tcp

I am trying to implement a simple file transfer. Below are two methods that I have been testing:
Method one: sending and receiving without splitting the file.
I hard coded the file size for easier testing.
sender:
send(sock,buffer,107,NULL); //sends a file with 107 size
receiver:
char * buffer = new char[107];
recv(sock_CONNECTION,buffer,107,0);
std::ofstream outfile (collector,std::ofstream::binary);
outfile.write (buffer,107);
The output is as expected, the file isn't corrupted because the .txt file that i sent contains the same content as the original.
Method two: sending and receiving by splitting the contents on the receiver's side, 5 bytes per loop iteration.
sender:
send(sock,buffer,107,NULL);
Receiver:
char * buffer = new char[107]; //total file buffer
char * ptr = new char[5]; //buffer
int var = 5;
int sizecpy = size; //orig size
while (size > var) { //collect bytes
    recv(sock_CONNECTION, ptr, 5, 0);
    strcat(buffer, ptr); //concatenate
    size = size - var;   //decrease
    std::cout << "Transferring.." << std::endl;
}
std::cout << "did it reach here?" << std::endl;
char *last = new char[size];
recv(sock_CONNECTION, last, 2, 0); //last two bytes
strcat(buffer, last);
std::ofstream outfile(collector, std::ofstream::binary);
outfile.write(buffer, 107);
Output: the text file contains invalid characters, especially at the beginning and the end.
Questions: How can I make method 2 work? The sizes are the same but they yield different results; the similarity between the original file and the new file with method 2 is about 98-99%, while it is 100% with method one. What's the best method for transferring files?
What's the best method for transferring files?
Usually I don't answer questions like "what's the best method", but in this case it's obvious:
Send the file size and a checksum in network byte order when starting a transfer
Optionally send more header data (e.g. the filename)
The client reads the file size and the checksum and decodes them to host byte order
Send the file's data in reasonably sized chunks (5 bytes is not a reasonable size); chunks should match the maximum payload available in a TCP/IP frame
The client receives chunk by chunk until the previously sent file size has been reached
The client calculates the checksum for the received data and checks whether it matches the one received beforehand
Note: You don't need to combine all chunks in memory at the client side; just append them to a file on a storage medium. Also, the checksum (CRC) can usually be calculated incrementally while running through the data chunks.
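As an illustration of that incremental-checksum idea, a sketch only (using zlib's crc32(), which keeps a running CRC; the 8192-byte chunk size and the header layout are just example choices):

#include <arpa/inet.h>   // htonl
#include <cstdint>
#include <sys/socket.h>
#include <zlib.h>        // crc32()

// Loop over send() until 'len' bytes have been written (send() may write less).
static void send_all(int sock, const char *buf, size_t len) {
    while (len > 0) {
        ssize_t n = send(sock, buf, len, 0);
        if (n <= 0) return;                     // error handling omitted in this sketch
        buf += n; len -= (size_t)n;
    }
}

// Send a header (data size + CRC, both in network byte order), then the data in chunks.
void send_buffer_with_checksum(int sock, const unsigned char *data, uint32_t len) {
    uint32_t crc = crc32(0L, Z_NULL, 0);        // zlib's running CRC, initial value
    crc = crc32(crc, data, len);                // could equally be updated chunk by chunk

    uint32_t header[2] = { htonl(len), htonl(crc) };
    send_all(sock, (const char *)header, sizeof(header));

    const uint32_t CHUNK = 8192;                // "reasonably sized" chunks, not 5 bytes
    for (uint32_t off = 0; off < len; off += CHUNK) {
        uint32_t n = (len - off < CHUNK) ? (len - off) : CHUNK;
        send_all(sock, (const char *)data + off, n);
    }
}

The receiver would read the 8-byte header first, loop on recv() until that many bytes have arrived, and compare its own crc32() of the received data against the transmitted value.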
Disagree with Galik. Better not to use strcat, strncat, or anything but the intended output buffer.
TCP is kinda fun. You never really know how much data you are going to get, but you will get it, or an error.
This will read up to MAX bytes at a time. #define MAX to whatever you want.
std::unique_ptr<char[]> buffer(new char[size]);
int total = size;  // remember the full size for the final write
int loc = 0;       // where in buffer to write the next batch of data
int bytesread;     // how much data was read? recv will return -1 on error
while (size > MAX)
{ //collect bytes
    bytesread = recv(sock_CONNECTION, &buffer[loc], MAX, 0);
    if (bytesread < 0)
    {
        //handle error.
    }
    loc += bytesread;
    size = size - bytesread; //decrease
    std::cout << "Transferring.." << std::endl;
}
bytesread = recv(sock_CONNECTION, &buffer[loc], size, 0);
if (bytesread < 0)
{
    //handle error
}
std::ofstream outfile(collector, std::ofstream::binary);
outfile.write(buffer.get(), total);
Even more fun: write straight into the output file so you don't have to store the whole file in memory. In this case MAX should be a bigger number.
std::ofstream outfile(collector, std::ofstream::binary);
char buffer[MAX];
int bytesread; //how much data was read? recv will return -1 on error
while (size)
{ //collect bytes
    bytesread = recv(sock_CONNECTION, buffer, MAX > size ? size : MAX, 0);
    // MAX>size?size:MAX is like a compact if-else: if (MAX>size){size}else{MAX}
    if (bytesread < 0)
    {
        //handle error.
    }
    outfile.write(buffer, bytesread);
    size -= bytesread; //decrease
    std::cout << "Transferring.." << std::endl;
}
The initial problems I see are with std::strcat. You can't use it on an uninitialized buffer. Also, you are not copying a null-terminated C string; you are copying a sized buffer. Better to use std::strncat for that:
char * buffer = new char[107]; //total file buffer
char * ptr = new char[5]; //buffer
int var = 5;
int sizecpy = size; //orig size
// initialize buffer
*buffer = '\0'; // add null terminator
while (size > var) { //collect bytes
    recv(sock_CONNECTION, ptr, 5, 0);
    strncat(buffer, ptr, 5); // strncat only 5 chars
    size = size - var; //decrease
    std::cout << "Transferring.." << std::endl;
}
Beyond that, you should really add error checking so the sockets library can tell you if anything went wrong with the communication.

How to send and receive large amounts of data in udp c++

I am trying to send and receive large amounts of data at once over UDP in C++ with the following code. I can only send 16000 chars at once. How can one send/receive millions of bytes of data without closing the socket?
//sends the data contained in aliceBuf, which is a char array of size 16000.
if (sendto(aliceSocket, aliceBuf, strlen(aliceBuf), 0, (struct sockaddr *)&bobAddr, sizeof(bobAddr)) == -1) {
    perror("sendto");
    exit(1);
}
// receiver code: it receives just 16000 chars.
recvlen = recvfrom(aliceSocket, aliceBuf1, receiveBuffer, 0, (struct sockaddr*)&bobAddr, &bobAddrSize);
if (recvlen >= 0) {
    aliceBuf1[recvlen] = 0; /* expect a printable string - terminate it */
}
You can send a large amount of data in one go, but the question you have to ask yourself is: how will the receiver know how much data to expect?
I normally handle these cases either by encoding the length explicitly, prefixing the data with the length so the receiver loops until that amount of data has arrived, or by having some sort of end-of-data marker (like 'C' strings, or more implicitly like JSON data) so the receiver loops looking for something in the data itself.
You will have to add a protocol on top of UDP, just as if you were using TCP. I'm sorry that you have to do some work, but that's just how things are. Some of the datagrams may get lost, so you may have to add a layer for that too. 1M bits is roughly twice as large as the largest possible UDP datagram anyway, so even if you reconfigure your network stack to allow larger datagrams, you will still hit the 64k limit, so a protocol is required.
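A minimal sketch of such a protocol layer (the ChunkHeader layout, field names, and 1400-byte payload size are just illustrative choices, and loss handling is not shown): each datagram carries a sequence number and its payload length so the receiver can reassemble the data and detect gaps.

#include <arpa/inet.h>   // htonl
#include <cstdint>
#include <cstring>
#include <netinet/in.h>
#include <sys/socket.h>

// Illustrative per-datagram header; the receiver uses seq to reorder/reassemble
// and total_chunks to know when everything has arrived.
struct ChunkHeader {
    uint32_t seq;          // chunk index, network byte order
    uint32_t total_chunks; // total number of chunks, network byte order
    uint32_t payload_len;  // bytes of payload in this datagram, network byte order
};

void send_in_chunks(int sock, const sockaddr_in &dest, const char *data, size_t len) {
    const size_t MAX_PAYLOAD = 1400;   // stays well under the 64k datagram limit (and a typical MTU)
    uint32_t total = (uint32_t)((len + MAX_PAYLOAD - 1) / MAX_PAYLOAD);
    char packet[sizeof(ChunkHeader) + MAX_PAYLOAD];

    for (uint32_t seq = 0; seq < total; ++seq) {
        size_t off = (size_t)seq * MAX_PAYLOAD;
        size_t n = (len - off < MAX_PAYLOAD) ? (len - off) : MAX_PAYLOAD;
        ChunkHeader h{ htonl(seq), htonl(total), htonl((uint32_t)n) };
        memcpy(packet, &h, sizeof(h));
        memcpy(packet + sizeof(h), data + off, n);
        sendto(sock, packet, sizeof(h) + n, 0,
               (const struct sockaddr *)&dest, sizeof(dest));   // error handling omitted
    }
}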
I did it with looping, like this:
int totalGoLength = no of blocks you want to send
int dataLengthOneGo = length of data in one block you want to send
//start loop
int iii = 1;
while (iii <= totalGoLength) { //send by dividing into packets
    ////--SEND/by converting to char * for less memory occupation----
    // theString has the string data to send
    std::string part(theString.substr(0, dataLengthOneGo));
    char * s4;
    s4 = new char[part.size() + 1];
    memcpy(s4, part.c_str(), part.size() + 1);
    if (sendto(aliceSocket, s4, strlen(s4), 0, (struct sockaddr *)&bobAddr, sizeof(bobAddr)) == -1) {
        perror("sendto");
        exit(1);
    }
    delete [] s4;
    ////----------------------Receiving------------
    // receive buffer should have sufficient memory allocation
    char *aliceBuf1;
    aliceBuf1 = new char[receiveBuffer];
    recvlen = recvfrom(aliceSocket, aliceBuf1, receiveBuffer, 0, (struct sockaddr *)&bobAddr, &bobAddrSize);
    std::vector<int> ints; // declared outside the if so it stays in scope below
    if (recvlen >= 0) {
        aliceBuf1[recvlen] = 0; /* expect a printable string - terminate it */
        //convert char to string
        string s1(aliceBuf1);
        //erase the white space
        s1.erase(remove_if(s1.begin(), s1.end(), isspace), s1.end());
        //convert string into integer vector
        ints.reserve(s1.size());
        std::transform(std::begin(s1), std::end(s1), std::back_inserter(ints), [](char c) {
            return c - '0'; });
    }
    delete[] aliceBuf1;
    justCopy = ints;
    KeepData.insert(KeepData.end(), justCopy.begin(), justCopy.end());
    justCopy.erase(justCopy.begin(), justCopy.end()); //erase for next time
    ints.erase(ints.begin(), ints.end()); //erase for next time
    theString.erase(theString.begin(), theString.begin() + dataLengthOneGo); //keep the remaining
    iii = iii + 1;
} //end of the while

C++ Inflate gzip char array

I'm attempting to use zlib to uncompress (inflate) some IP packet payload data that is compressed with gzip. However, I'm having some difficulty understanding the documentation provided by zlib that covers inflation. I have a char array that my program fills, but I can't seem to inflate it with the following code:
const u_char *payload; /*contains gzip data,
captured prior to this point in the program*/
/*read compressed contents*/
int ret; //return val
z_stream stream;
unsigned char out[MEM_CHUNK]; //output array, MEM_CHUNK defined as 65535
/* allocate inflate state */
stream.zalloc = Z_NULL;
stream.zfree = Z_NULL;
stream.opaque = Z_NULL;
stream.avail_in = size_payload; // size of input
stream.next_in = (Bytef *)payload; // input char array
stream.avail_out = (uInt)sizeof(out); // size of output
stream.next_out = (Bytef *)out; // output char array
ret = inflateInit(&stream);
inflate(&stream, Z_NO_FLUSH);
inflateEnd(&stream);
printf("Inflate: %s\n\n", out);
In the zlib documentation, inflate is continually called in a do/while loop, checking for the Z_STREAM_END flag. I'm a bit confused here, because they seem to be working from a file while I'm not. Do I need this loop as well, or can I pass a char array without looping over inflate?
Any guidance here would really be appreciated. I'm pretty new to both working with compression and C++.
Thanks.
Assuming you are giving inflate an appropriate and complete "compressed stream", and there is enough space to output the data, you would only need to call inflate once.
Edit: It is not written out as clearly as that in the zlib documentation, but it does say:
inflate decompresses as much data as possible, and stops when the
input buffer becomes empty or the output buffer becomes full. It may
introduce some output latency (reading input without producing any
output) except when forced to flush.
Of course, for any stream that isn't already "in memory and complete", you want to run it block by block, since that gives less total runtime (you can decompress one block while the data for the next block is still being received, from the network or from filesystem pre-fetch caching).
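For the "complete stream already in memory" case, the single call can look roughly like this (a sketch only; it assumes the payload really is one whole gzip stream and that the output buffer is big enough, and it uses inflateInit2() with 16 + MAX_WBITS so the gzip wrapper is accepted, which plain inflateInit() will not do):

#include <cstddef>
#include <zlib.h>

// Decompress a complete gzip stream held in memory with one inflate() call.
int inflate_once(const unsigned char *payload, size_t payload_len,
                 unsigned char *out, size_t out_len, size_t *written) {
    z_stream strm{};
    strm.zalloc = Z_NULL;
    strm.zfree  = Z_NULL;
    strm.opaque = Z_NULL;
    strm.next_in   = (Bytef *)payload;
    strm.avail_in  = (uInt)payload_len;
    strm.next_out  = out;
    strm.avail_out = (uInt)out_len;

    int ret = inflateInit2(&strm, 16 + MAX_WBITS);  // 16 + MAX_WBITS: expect a gzip wrapper
    if (ret != Z_OK)
        return ret;
    ret = inflate(&strm, Z_FINISH);                 // Z_STREAM_END means the whole stream fit
    *written = out_len - strm.avail_out;
    inflateEnd(&strm);
    return ret;
}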
Here's the whole function from your example code. I've removed the text components from the page to concentrate on the code, marked sections with letters (// A, // B, etc.), and then tried to explain the sections below.
int inf(FILE *source, FILE *dest)
{
    int ret;
    unsigned have;
    z_stream strm;
    unsigned char in[CHUNK];  // A
    unsigned char out[CHUNK];
    /* allocate inflate state */
    strm.zalloc = Z_NULL;     // B
    strm.zfree = Z_NULL;
    strm.opaque = Z_NULL;
    strm.avail_in = 0;
    strm.next_in = Z_NULL;
    ret = inflateInit(&strm); // C
    if (ret != Z_OK)
        return ret;
    /* decompress until deflate stream ends or end of file */
    do {
        strm.avail_in = fread(in, 1, CHUNK, source);  // D
        if (ferror(source)) {
            (void)inflateEnd(&strm);                  // E
            return Z_ERRNO;
        }
        if (strm.avail_in == 0)                       // F
            break;
        strm.next_in = in;                            // G
        /* run inflate() on input until output buffer not full */
        do {
            strm.avail_out = CHUNK;                   // H
            strm.next_out = out;
            ret = inflate(&strm, Z_NO_FLUSH);         // I
            assert(ret != Z_STREAM_ERROR);  /* state not clobbered */
            switch (ret) {
            case Z_NEED_DICT:
                ret = Z_DATA_ERROR;     /* and fall through */
            case Z_DATA_ERROR:
            case Z_MEM_ERROR:
                (void)inflateEnd(&strm);
                return ret;
            }
            have = CHUNK - strm.avail_out;            // J
            if (fwrite(out, 1, have, dest) != have || ferror(dest)) {
                (void)inflateEnd(&strm);
                return Z_ERRNO;
            }
        } while (strm.avail_out == 0);                // K
        /* done when inflate() says it's done */
    } while (ret != Z_STREAM_END);                    // L
    /* clean up and return */
    (void)inflateEnd(&strm);
    return ret == Z_STREAM_END ? Z_OK : Z_DATA_ERROR;
}
A: in is the input buffer (we read from a file into this buffer, then pass it to inflate a little later). out is the output buffer, which inflate uses to store the output data.
B: Set up a z_stream object called strm. This holds various fields, most of which are not important here (thus set to Z_NULL). The important ones are the avail_in and next_in as well as avail_out and next_out (which are set later).
C: Start inflation process. This sets up some internal data structures and just makes the inflate function itself "ready to run".
D: Read a "CHUNK" amount of data from file. Store the number of bytes read in strm.avail_in, and the actual data goes into in.
E: If we errored out, finish the inflate by calling inflateEnd. Job done.
F: No data available, we're finished.
G: Set where our data is coming from (next_in is set to the input buffer, in).
H: We're now in the loop to inflate things. Here we set the output buffer up: next_out and avail_out indicate where the output goes and how much space there is, respectively.
I: Call inflate itself. This will uncompress a portion of the input buffer, until the output is full.
J: Calculate how much data is available in this step (have is the number of bytes).
K: Loop until there is space left in the output buffer when inflate finishes - that indicates the output for the data in the in buffer is complete, rather than inflate having run out of space in the out buffer. So it's time to read some more data from the input file.
L: If the return code from the inflate call does not yet say the stream has ended (Z_STREAM_END), go round again.
Now, obviously, if you are reading from a network and uncompressing into memory, you need to replace the fread and fwrite with suitable reads from the network and memcpy-type calls instead. I can't tell you EXACTLY what those are, since you haven't provided anything to explain where your data comes from - are you calling recv or read or WSARecv, or something else? - or where it is going.
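For example, the shape could be roughly this (a sketch, assuming a blocking recv() on a connected socket and a z_stream that has already been set up with inflateInit/inflateInit2; real code still needs the full error handling shown in the function above):

#include <sys/socket.h>
#include <vector>
#include <zlib.h>

// Same loop structure as inf(), but fed from a socket and writing to memory.
int inflate_from_socket(int sock, z_stream &strm, std::vector<unsigned char> &result) {
    unsigned char in[16384], out[16384];
    int ret = Z_OK;
    do {
        ssize_t n = recv(sock, in, sizeof(in), 0);        // replaces fread()
        if (n <= 0)
            break;                                        // connection closed or error
        strm.next_in  = in;
        strm.avail_in = (uInt)n;
        do {
            strm.next_out  = out;
            strm.avail_out = sizeof(out);
            ret = inflate(&strm, Z_NO_FLUSH);
            if (ret != Z_OK && ret != Z_STREAM_END)
                return ret;                               // Z_DATA_ERROR etc.
            size_t have = sizeof(out) - strm.avail_out;
            result.insert(result.end(), out, out + have); // replaces fwrite()
        } while (strm.avail_out == 0);
    } while (ret != Z_STREAM_END);
    return ret;
}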

my c++ client/server file exchange implementation is very slow...why?

Hi, I have implemented a simple file exchange over a client/server connection in C++. It works fine except for one problem: it is very slow. This is my code:
For sending the file:
int send_file(int fd)
{
    char rec[10];
    struct stat stat_buf;
    fstat(fd, &stat_buf);
    int size = stat_buf.st_size;
    while (size > 0)
    {
        char buffer[1024];
        bzero(buffer, 1024);
        bzero(rec, 10);
        int n;
        if (size >= 1024)
        {
            n = read(fd, buffer, 1024);
            // Send a chunk of data
            n = send(sockFile_, buffer, n, 0);
            // Wait for an acknowledgement
            n = recv(sockFile_, rec, 10, 0);
        }
        else // remaining file bytes
        {
            n = read(fd, buffer, size);
            buffer[size] = '\0';
            send(sockFile_, buffer, n, 0);
            n = recv(sockFile_, rec, 10, 0); // ack
        }
        size -= 1024;
    }
    // Send a completion string
    int n = send(sockFile_, "COMP", strlen("COMP"), 0);
    char buf[10];
    bzero(buf, 10);
    // Receive an acknowledgement
    n = recv(sockFile_, buf, 10, 0);
    return(0);
}
And for receiving the file:
int receive_file(int size, const char* saveName)
{
    ofstream outFile(saveName, ios::out | ios::binary | ios::app);
    while (size > 0)
    {
        // buffer for storing incoming data
        char buf[1024];
        bzero(buf, 1024);
        if (size >= 1024)
        {
            // receive chunk of data
            n = recv(sockFile_, buf, 1024, 0);
            // write chunk of data to disk
            outFile.write(buf, n);
            // send acknowledgement
            n = send(sockFile_, "OK", strlen("OK"), 0);
        }
        else
        {
            n = recv(sockFile_, buf, size, 0);
            buf[size] = '\0';
            outFile.write(buf, n);
            n = send(sockFile_, "OK", strlen("OK"), 0);
        }
        size -= 1024;
    }
    outFile.close();
    // Receive 'COMP' and send acknowledgement
    // ---------------------------------------
    char buf[10];
    bzero(buf, 10);
    n = recv(sockFile_, buf, 10, 0);
    n = send(sockFile_, "OK", strlen("OK"), 0);
    std::cout << "File received..." << std::endl;
    return(0);
}
Now here are my initial thoughts: perhaps the buffer is too small, so I should try increasing the size from 1024 bytes (1KB) to, say, 65536-byte (64KB) blocks. But this results in file corruption. OK, so perhaps the code is also being slowed down by the need to receive an acknowledgement after each 1024-byte block of data has been sent, so why not remove the acks? Unfortunately this results in the blocks not arriving in the correct order and hence file corruption.
Perhaps I could split the file into chunks beforehand, create multiple connections, send each chunk over its own threaded connection, and then reassemble the chunks somehow on the receiver...
Any idea how I could make the file transfer process more efficient (faster)?
Thanks,
Ben.
Skip the acknowledgement of buffers! You insert an artificial round trip (server->client + client->server) for practically every single packet.
This slows down the transfer.
You do not need this ack. You are using TCP, which gives you a reliable stream. Send the number of bytes, then send the whole file. Do not read after send and so on.
EDIT: As a second step, you should increase the buffer size. For internet transfer you can assume an MTU of 1500, so there will be space for a payload of 1452 bytes in each IP packet. This should be your minimal buffer size. Make it larger and let the operating system slice the buffers into packets for you. For LAN you have a much higher MTU.
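Put together, the sending side could look roughly like this (a sketch only, not a drop-in replacement for the code above: it sends the size first as a 4-byte prefix, uses a larger buffer, handles partial send() results, and never waits for an application-level ack):

#include <arpa/inet.h>   // htonl
#include <cstdint>
#include <sys/socket.h>
#include <sys/stat.h>
#include <unistd.h>

// Length-prefixed transfer with no per-chunk acknowledgements.
int send_file_noack(int sock, int fd)
{
    struct stat stat_buf;
    fstat(fd, &stat_buf);
    uint32_t size = (uint32_t)stat_buf.st_size;

    uint32_t net_size = htonl(size);                 // tell the receiver how much to expect
    send(sock, &net_size, sizeof(net_size), 0);

    char buffer[8192];                               // larger than one MTU; the OS splits it into packets
    ssize_t n;
    while ((n = read(fd, buffer, sizeof(buffer))) > 0) {
        ssize_t sent = 0;
        while (sent < n) {                           // send() may write less than asked
            ssize_t m = send(sock, buffer + sent, n - sent, 0);
            if (m <= 0)
                return -1;                           // error handling kept minimal
            sent += m;
        }
    }
    return 0;
}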
My guess is that you are getting out of sync and some of your reads return less than 1024 bytes. It happens all the time with sockets. The "size -= 1024" statement should be "size -= n".
My guess is that n is sometimes less than 1024 when it comes back from recv().
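In other words, the receive loop should look more like this (a sketch of just that change, reusing the outFile and sockFile_ from the code in the question):

// Decrease by what recv() actually returned, not by a fixed 1024.
while (size > 0)
{
    char buf[1024];
    int n = recv(sockFile_, buf, size >= 1024 ? 1024 : size, 0);
    if (n <= 0)
        break;                                  // error or connection closed
    outFile.write(buf, n);                      // write exactly the bytes that arrived
    size -= n;                                  // NOT size -= 1024
    send(sockFile_, "OK", strlen("OK"), 0);     // ack kept here, though the answer above suggests dropping it
}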
You should certainly increase the buffer size, and if this causes corruption it is an error in your code, which you need to fix. Also, if you use a stream protocol (i.e. TCP/IP) the order and delivery of packets is guaranteed.
Read this thread:
Send and Receive a file in socket programming in Linux with C/C++ (GCC/G++)
Oh, and use sendfile POSIX command, here's an example to get you started:
http://tldp.org/LDP/LGNET/91/misc/tranter/server.c.txt
A couple of things.
1) You are reallocating the buffer each time you go through your while loop:
while(size > 0)
{
char buf[1024];
You can pull it out of the while loop on both sides and you won't be dumping on your stack as much.
2) 1024 is a standard buffer size, and I wouldn't go much above 2048 because then the lower level TCP/IP stack will just have to break it up anyways.
3) If you really need speed, rather than waiting for a recv ack you could just add a packet number to each packet and then check them on the receiving end. This makes your receiving code a little more complex because it has to store packets that are out of order and put them in order. But then you wouldn't need an acknowledgement.
4) It's a little thing, but what if the file you are sending has a size that is an exact multiple of 1024? Then you won't send the trailing '\0'. To fix that you just need to change your while to:
while (size >= 0)