zlib inflate error: Z_DATA_ERROR randomly - C++

I have an application that compresses data and sends it via a socket, and the received data is written on a remote machine. During recovery, this data is decompressed and retrieved. Compression/decompression is done using zlib. But during decompression I face the following problem randomly:
zlib inflate() fails with the error "Z_DATA_ERROR" for binary files like .xls, .qbw, etc.
The application compresses data in blocks of, say, 1024 bytes in a loop over data read from the file, and decompresses it the same way. From forum posts, I found that one reason for Z_DATA_ERROR is data corruption. For now, to avoid this problem, we have introduced a CRC check comparing the compressed data that is sent with what is received.
Any possible reasons why this happens would be really appreciated! (It occurs randomly, and the same file works fine on another attempt.) Is it because of incorrect handling of zlib inflate() and deflate()?
Note: If needed,will post the exact code snippet for further analysis!
Thanks...Udhai

You didn't mention if the socket was TCP or UDP; but based on the blocking and checksumming, I'm going out on a limb and guessing it's UDP.
If you're sending the compressed packets over UDP they could be received out-of-order on the other end, or the packets could be lost in transit.
Handling things like out-of-order delivery and lost packets correctly ends up being a lot of work, and all of it is taken care of by using TCP - you get a simple pipe that guarantees the data arrives in order and as expected.
Also I'd make sure that the code on the receiving side is simple, and receives into buffers allocated on the heap and not on the stack (I've seen many a bug triggered by this).
Again, this is just an educated guess based on the detail of the question.
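Separately, since the question asks about inflate()/deflate() handling: below is a minimal sketch of a block-wise inflate() loop using the plain zlib C API (illustrative buffer size, not the poster's code). Checking every return value matters, because corrupted or reordered compressed bytes show up exactly as Z_DATA_ERROR from inflate().

#include <zlib.h>
#include <cstdio>

// Decompress 'srcLen' bytes of deflate data from 'src' into the file 'dst'.
// Returns 0 on success, -1 on any zlib error (including Z_DATA_ERROR).
int inflate_to_file(const unsigned char* src, size_t srcLen, FILE* dst)
{
    z_stream strm = {};                       // zalloc/zfree/opaque left null
    if (inflateInit(&strm) != Z_OK)
        return -1;

    unsigned char out[1024];                  // same 1024-byte block size as the question
    strm.next_in  = const_cast<Bytef*>(src);
    strm.avail_in = static_cast<uInt>(srcLen);

    int ret;
    do {
        strm.next_out  = out;
        strm.avail_out = sizeof(out);
        ret = inflate(&strm, Z_NO_FLUSH);
        if (ret != Z_OK && ret != Z_STREAM_END) {   // Z_DATA_ERROR lands here
            inflateEnd(&strm);
            return -1;
        }
        fwrite(out, 1, sizeof(out) - strm.avail_out, dst);
    } while (ret != Z_STREAM_END);

    inflateEnd(&strm);
    return 0;
}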

Related

usrsctp send buffer does not free itself

We're working with a C++ WebRTC data channels library, and in our test application, after sending a few small packets amounting to about 256 kB in total, the usrsctp_sendv() call returns -1 (with errno set to EWOULDBLOCK/EAGAIN, which means "Resource temporarily unavailable"). We believe this is because we're hitting usrsctp's send buffer limit, which is 256 kB by default. We've tried adding several sleep delays between each send call hoping it clears that buffer, but nothing works.
The receiving side (a JS web page) does indeed receive all the bytes that we've sent up until it errors out. It's also worth noting that this only happens when we try to send data from the C++ application to the JS side and not the other way around. We tried looking around Mozilla's data channels implementation, but can't seem to draw any conclusions about what the issue could be.
It is hard to answer such a question straight away. I would start by looking at Wireshark traces to see whether your remote side (the JS page) actually acknowledges the data you send (i.e. whether SACK chunks are sent back) and what receiver window value (a_rwnd) is reported in those SACKs. It might be that the issue is not on your side: you may be getting EWOULDBLOCK simply because the sending-side SCTP cannot flush the data from its buffers while it is still awaiting delivery confirmation from the remote end.
Please provide more details about your case and, if possible, sample code for your JS page.

Internal socket receive buffer implementation

I’m working on an embedded application where I receive some sensor values over UDP. The board I’m using runs the 2.4 kernel on an ARM processor. The problem is the following: once my internal socket buffer is full, only the newest value gets replaced. So the internal buffer is not implemented as a circular buffer, which it should be, as I found out studying some articles. Can I somehow change the behaviour of the internal receive buffer?
I already found out that there is no way to "flush" that buffer from the application side. The best idea I’ve got is to check whether the receive buffer is full before receiving any packets, and if so, first read out all the old packets manually. Is there any better approach?
I hope it's somehow clear what I mean, any help is appreciated.
The best idea I’ve got is checking whether the receive buffer is full, before receiving any packets and if so first read out all the old packets manually.
I'd not bother checking whether the receive buffer is full; rather, always read packets until no more are there, and use the last one received, which contains the newest value.
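A minimal sketch of that approach with plain POSIX sockets (the function name is illustrative): drain everything currently queued without blocking and keep only the last datagram, which carries the newest value.

#include <sys/types.h>
#include <sys/socket.h>
#include <cerrno>

// Reads all queued datagrams from 'fd' without blocking and leaves the newest
// one in 'buf'. Returns its length, or -1 if nothing was queued.
ssize_t read_newest(int fd, char* buf, size_t buflen)
{
    ssize_t last = -1;
    for (;;) {
        ssize_t n = recv(fd, buf, buflen, MSG_DONTWAIT);
        if (n < 0) {
            if (errno == EAGAIN || errno == EWOULDBLOCK)
                break;          // queue is empty: keep whatever arrived last
            return -1;          // real error
        }
        last = n;               // a newer datagram overwrites the older one in buf
    }
    return last;
}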

Limitations on sending through UDP sockets

I have a big 1 GB file which I am trying to send to another node. After the sender sends about 200 packets (well before sending the complete file), the code bails out with the error "Sendto no send space available". What can the problem be, and how do I take care of it?
Apart from this, we need maximum throughput in this transfer. So what send buffer size should we use to be efficient?
What is the maximum MTU which we can use to transfer the file without fragmentation?
Thanks
Ritu
Thank you for the answers. Actually, our project specifies using UDP, plus some additional code to take care of lost packets.
Now I am able to send the complete file, using blocking UDP sockets.
I am running the whole setup on an Emulab-like environment called DETER. I have set the link loss to 0, but some of my packets are still getting lost. What could be the possible reason for that? Even if I add a delay after sending every packet (assuming the receiver drops packets when its buffer is full), the packet loss persists.
It's possible to use UDP for high speed data transfer, but you have to make sure not to send() the data out faster than your network card can pump it onto the wire. In practice that means either using blocking I/O, or blocking on select() and only sending the next packet when select() indicates that the socket is ready-for-write. (ideally you'd also not send the data faster than the receiving machine can receive it, but that's less of an issue these days since modern CPU speeds are generally much faster than modern network I/O speeds)
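A rough sketch of that select()-gated sending pattern with plain POSIX sockets (socket setup, addressing and the file-reading loop are omitted; adapt as needed):

#include <sys/types.h>
#include <sys/select.h>
#include <sys/socket.h>

// Sends one datagram, first blocking in select() until the kernel reports
// space in the socket's send buffer. Returns false on error.
bool send_when_ready(int fd, const char* data, size_t len,
                     const sockaddr* to, socklen_t tolen)
{
    fd_set wfds;
    FD_ZERO(&wfds);
    FD_SET(fd, &wfds);
    if (select(fd + 1, nullptr, &wfds, nullptr, nullptr) <= 0)
        return false;                       // interrupted or error
    return sendto(fd, data, len, 0, to, tolen) == (ssize_t)len;
}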
Once you have that logic working properly, the size of your send-buffer isn't terribly important. (i.e. your send buffer will never be large enough to hold a 1GB file anyway, so making sure your program doesn't overflow the send buffer is the key issue whether the send buffer is large or small) The size of the receive-buffer on the receiver is important though... best to make that as large as possible, so the receiving computer won't drop packets if the receiving process gets held off of the CPU by another program.
Regarding MTU: if you want to avoid packet fragmentation (and assuming your packets are traveling over Ethernet), you shouldn't place more than 1472 bytes into each UDP packet (or 1452 bytes if you're using IPv6). (Calculated by subtracting the size of the necessary IP and UDP headers - 20 or 40 bytes plus 8 bytes - from Ethernet's 1500-byte MTU.)
Also agree with @jonfen: no UDP for high-speed file transfer.
UDP incurs less protocol overhead. However, at the maximum transfer rate, transmission errors (such as packet loss) are inevitable, so one must incorporate a TCP-like error-correction scheme. The end result is performance lower than TCP's.

Capture server-client communication with tcpdump

I wrote simple server and client apps where I can switch between the TCP, DCCP and UDP protocols. The goal was to transfer a file from one to the other and measure the traffic for each protocol, so I can compare them for different network setups (I know roughly what the results should be, but I need exact numbers/graphs). Anyway, after starting both apps on different computers and starting tcpdump, I only get the first few MB (~50 MB) of my 4 GB file in the tcpdump log. The apps are written in standard C/C++ code of the kind that can be found anywhere on the web.
What may be the problem or what could I be doing wrong here?
-- Edit
The command line I use is:
tcpdump -s 1500 -w mylog
tcpdump then captures packets only for the first ~55 seconds. That's the time the client needs to send the file to the socket. Afterwards it stops, even though the server continues receiving and writing the file to the hard drive.
-- Edit2
Source code:
client.cpp
server.cpp
common.hpp
common.cpp
-- Edit final
As many of you pointed out (and as I suspected), there were several misconceptions/bugs in the source code. After I cleaned it up (or almost rewrote it), it works as expected with tcpdump. I will accept the answer from @Laurent Parenteau, but only for point 5, as it was the only one relevant to the problem. If someone is interested in the corrected code, here it is:
Source code edited
client.cpp
server.cpp
common.hpp
common.cpp
There are many things wrong in the code.
The file size / transfer size is hardcoded to 4294967295 bytes. So, if the file supplied isn't that many bytes, you'll have problems.
In the sender, you aren't checking whether the file read is successful. So if the file is smaller than 4294967295 bytes, you won't know it and will send junk data (or nothing at all) over the network.
When you use UDP and DCCP, the packet order isn't guaranteed, so the data may be received out of order (i.e. as junk).
When you use UDP, there's no retransmission of lost packets, so some data may never be received.
In the receiver, you aren't checking how many bytes you received; you always write MAX_LINE bytes to the file. So even if you receive 0 bytes, you'll still be writing to the file, which is wrong (see the sketch below this list).
When you use UDP, since you're sending in a tight loop, a lot of data will probably be dropped by the network stack or the network interface even if the write() call returns the same number of bytes you requested, because there's no congestion control in place. So you will need to put some congestion control in place yourself.
And this is just from a quick scan of the code; there are probably more problems in there...
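To illustrate point 5, here is a minimal sketch of a receive loop that writes only what recv() actually returned and stops on end-of-file or error (plain TCP sockets; MAX_LINE here is just a stand-in for the constant in the posted code):

#include <sys/types.h>
#include <sys/socket.h>
#include <cstdio>

static const size_t MAX_LINE = 4096;   // placeholder for the constant in the posted code

void receive_to_file(int sock, FILE* out)
{
    char buf[MAX_LINE];
    for (;;) {
        ssize_t n = recv(sock, buf, sizeof(buf), 0);
        if (n == 0)        // peer closed the connection: transfer is done
            break;
        if (n < 0) {       // real error: report it and stop
            perror("recv");
            break;
        }
        fwrite(buf, 1, (size_t)n, out);   // write only the bytes actually received
    }
}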
My suggestion is:
Try the transfer with TCP, compute an md5sum of the file you read/send and an md5sum of the file you receive/save, and compare the two. Once you have this case working, you can move on to testing (still using the md5sum comparison) with UDP and DCCP...
For the tcpdump command, you should change -s 1500 to -s 0, which means unlimited. With that tcpdump command you can trust that data it hasn't seen really wasn't sent/received. Another good thing to do is to compare the tcpdump output on the sender with the one on the receiver. This way you'll know whether any packet loss occurred between the two network stacks.
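For example, keeping the output file name from the question, the capture command becomes:
tcpdump -s 0 -w mylog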
Do you have X terminal access? Switch to Wireshark instead and try with that - it's free, open source, and probably more widely used than tcpdump today. (It was formerly known as Ethereal.)
Also, do try the following tcpdump options:
-xx print the link-level header and packet data as well (does -w write the data?)
-C specify the max capture file size explicitly.
-U write packet by packet to the file instead of buffering the output.
-p don't put the NIC in promiscuous mode
-O don't use the packet-matching optimizer, since yours is a new application-level protocol.
Are you using verbose output in tcpdump? This can make the buffers fill quickly, so redirect stdout/stderr to a file when you run it.
Are these Gigabit Ethernet cards on both ends?
tcpdump is used as a diagnostic and forensics tool by tens of thousands (at least) of programmers and computer security professionals worldwide. When a tool like this seems to be mishandling a very common task, the first thing to suspect is the code you wrote, not the tool.
In this particular case your code has a wide variety of significant errors. In particular, with TCP, your server will continue to write data to the file regardless of whether or not the client is sending any.
This code has race conditions that will result in non-deterministic behavior in some situations, improperly treats '\0' as being a special value in network data, ignores error conditions, and ignores end-of-file conditions. And that's just a brief reading.
In this case I am nearly certain that tcpdump is functioning perfectly and telling you that your application does not do what you think it does.
"That's the time the client needs to
send the file to the socket.
Afterwards it stops, even though the
server continues receiving and writing
the file to the hard drive."
This sounds really weird. The socket buffers are way too small to allow this to happen. I really think that your server code only seems to receive data, while the sender has actually already stopped sending.
I know this might sound silly, but are you sure it is not a problem of the file not being flushed? I.e. the data are still in memory and not yet written to disk (because they do not yet amount to a sufficient quantity).
Try sync, or just wait a bit until you are certain that enough data has been transmitted.

Question about file transfer for socket programming

Is there a good method for transferring a file from, say, a client to a server?
Probably just images, but my professor was asking about any type of file.
I've looked around and am a little confused as to the general idea.
So if we have a large file, we can split that file into segments...? Then send each segment off to the server.
Should I also use a while loop to receive all the files/segments on the server side? Also, how will my server know that all the segments were received without knowing beforehand how many segments there are?
I was looking on the Cplusplus website and found that there is something like a binary transfer mode for files...
Thanks for all the help =)
If you are using TCP:
You are right, there is no way to "know" how much data you will be receiving. This gives you a few options:
1) Before transmitting the image data, first send the number of bytes to be expected. So your first 4 bytes might be the 4-byte integer "4096". Then the receiving side can read the first 4 bytes, "know" that it is expecting 4096 bytes, and malloc(4096) to hold the rest. Then the sending side can send() 4096 bytes worth of image data.
When you do this, be aware that you might have to recv() multiple times - for one reason or another, you might not have received all 4096 bytes in one call. So you will need to check the return value of recv() to make sure you have gotten everything (there is a sketch of this after option 2 below).
2) If you are just sending one file, you could just have your receiver read it, and keep recv()ing from the socket until the sender closes the connection. This is a bit harder - you will have to keep track of how much you have received, and if your buffer is full you will have to reallocate it. I don't recommend this method, but it would technically accomplish the task.
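A minimal sketch of the receiving side of option 1 (plain POSIX TCP sockets; recv_all and recv_file are illustrative names, not a standard API). The sending side mirrors it: send() the length as htonl(len), then loop over send() for the data.

#include <sys/types.h>
#include <sys/socket.h>
#include <arpa/inet.h>
#include <cstdint>
#include <cstdlib>

// Keep calling recv() until exactly 'len' bytes have arrived (or error/EOF).
bool recv_all(int sock, char* buf, size_t len)
{
    size_t got = 0;
    while (got < len) {
        ssize_t n = recv(sock, buf + got, len - got, 0);
        if (n <= 0)
            return false;                 // error, or connection closed early
        got += (size_t)n;
    }
    return true;
}

// Read the 4-byte length prefix, then the payload it announces.
char* recv_file(int sock, uint32_t* out_len)
{
    uint32_t netlen;
    if (!recv_all(sock, (char*)&netlen, sizeof(netlen)))
        return nullptr;
    uint32_t len = ntohl(netlen);         // the sender wrote it with htonl()
    char* data = (char*)malloc(len);
    if (data && !recv_all(sock, data, len)) {
        free(data);
        data = nullptr;
    }
    *out_len = len;
    return data;
}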
If you are using UDP:
This means that you don't have reliable transfer: packets might be dropped, and they might also arrive out of order. So if you are going to use UDP, you must fragment your data into small segments, and both the sender and receiver must agree on how large a segment is (100 bytes? 1000 bytes?).
Not only that, but you must also transmit a sequence number with each packet - that is, label each packet #1, #2, etc. - because your receiver must be able to tell whether any packets are missing (if you receive packets 1, 2 and 4, you are missing #3) and to put them back in order (if you receive 3, 2, then 1, you must make sure the packets are saved to the file in the correct order: 1, 2, then 3).
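A tiny sketch of the per-packet header such a scheme needs (illustrative field names and sizes; both ends have to agree on them):

#include <cstdint>

// Prepended to every UDP datagram, converted with htonl()/htons() before sending.
struct ChunkHeader {
    uint32_t seq;         // sequence number, so the receiver can reorder and spot gaps
    uint32_t totalChunks; // total chunk count, so the receiver knows when it has everything
    uint16_t payloadLen;  // number of file bytes following this header in the datagram
};
// The receiver writes each payload at offset seq * segment_size and tracks which
// sequence numbers it has seen; missing ones must be retransmitted.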
So for your assignment, well, it will depend on what protocol you have to/are allowed to use.
If you use a UDP-based transfer protocol, you will have to break the file up into chunks for network transmission. You'll also have to reassemble them in the correct order on the receiving end and verify the results. If you use a TCP-based transfer protocol, all of this will be taken care of under the hood.
You should consult Beej's Guide to Network Programming for how best to send and receive data and use sockets in general. It explains most of the things about which you are asking.
There are many ways of transferring files. If you're transferring files in a lossless manner, then you're basically going to divide the file into chunks. Tag each chunk with a sequence number, send the chunks to the other side, and reconstitute the file. Stream-oriented protocols are simpler, since lost packets are retransmitted for you. If you're using an unreliable protocol, then you will need to retransmit missing chunks and re-sequence chunks that arrive out of order.
If lossy transfer is acceptable (like transferring video or online game data), then use an unreliable protocol. Lossy transfer is simpler because you don't have to retransmit missing chunks. All you need to do is make sure the chunks are processed in the proper sequence.
Many protocols send a terminator packet to indicate the end of transmission. You could use this strategy if you don't want to send the number of chunks to the other side before transmission.