zlib stops on buffer expansion - c++

When I attempt to decompress data of a size greater than 2048, the zlib uncompress call returns Z_OK. To clarify: if I decompress data of size 2980, it will decompress up to 2048 (two loops) and then return Z_OK.
What am I missing?
Bytes is a vector< unsigned char >;
Bytes uncompressIt( const Bytes& data )
{
size_t buffer_length = 1024;
Byte* buffer = nullptr;
int status = 0;
do
{
buffer = ( Byte* ) calloc( buffer_length + 1, sizeof( Byte ) );
int status = uncompress( buffer, &buffer_length, &data[ 0 ], data.size( ) );
if ( status == Z_OK )
{
break;
}
else if ( status == Z_MEM_ERROR )
{
throw runtime_error( "GZip decompress ran out of memory." );
}
else if ( status == Z_DATA_ERROR )
{
throw runtime_error( "GZip decompress input data was corrupted or incomplete." );
}
else //if ( status == Z_BUF_ERROR )
{
free( buffer );
buffer_length *= 2;
}
} while ( status == Z_BUF_ERROR ); //then the output buffer wasn't large enough
Bytes result;
for( size_t index = 0; index != buffer_length; index++ )
{
result.push_back( buffer[ index ] );
}
return result;
}
EDIT:
Thanks @Michael for catching the realloc. I'd been mucking around with the implementation and missed it; still no excuse for posting it that way.

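I got it: int status is declared both inside and outside of the loop. The inner declaration shadows the outer one, so the while condition only ever sees the outer status, which stays 0. The lesson here is never drink & develop.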
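For what it's worth, here is a minimal sketch of the grow-and-retry loop with a single status variable per attempt, so the exit condition sees the real return code. To keep it buildable without zlib, the decoder is passed in as a callable with the same shape as uncompress() once the source arguments are bound. The names growUntilFits, Decoder, kOk, kBufError, and makeFakeDecoder are hypothetical; the two constants only mirror the numeric values of Z_OK and Z_BUF_ERROR.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <functional>
#include <stdexcept>
#include <vector>

using Bytes = std::vector<unsigned char>;

// Hypothetical local names; the values mirror zlib's Z_OK and Z_BUF_ERROR.
const int kOk = 0;
const int kBufError = -5;

// Same shape as uncompress() with the source arguments already bound:
// on entry *destLen is the buffer capacity, on success it is the byte count written.
using Decoder = std::function<int(unsigned char* dest, unsigned long* destLen)>;

Bytes growUntilFits(const Decoder& decode, std::size_t initialSize = 1024)
{
    assert(decode != nullptr);
    Bytes out(initialSize ? initialSize : 1);
    for (;;) {
        unsigned long len = static_cast<unsigned long>(out.size());
        const int status = decode(out.data(), &len); // the only status in scope
        if (status == kOk) {
            out.resize(len);        // keep only the bytes actually produced
            return out;
        }
        if (status != kBufError) {
            throw std::runtime_error("decompression failed");
        }
        out.resize(out.size() * 2); // output did not fit: double and retry
    }
}

// Stand-in decoder for experimenting: "decompresses" to a fixed payload.
Decoder makeFakeDecoder(const Bytes& payload)
{
    return [payload](unsigned char* dest, unsigned long* destLen) -> int {
        if (*destLen < payload.size()) return kBufError;
        if (!payload.empty()) std::memcpy(dest, payload.data(), payload.size());
        *destLen = static_cast<unsigned long>(payload.size());
        return kOk;
    };
}
```

With zlib itself, the callable would presumably be a lambda that forwards to uncompress(dest, destLen, &data[0], data.size()). Using a vector for the output also avoids the calloc/free pairing and the final byte-by-byte copy.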

From the zlib manual: "In the case where there is not enough room, uncompress() will fill the output buffer with the uncompressed data up to that point."
I.e., up to 1024 bytes have already been uncompressed, then you get Z_BUF_ERROR and double the buffer size, giving you room for 2048 bytes, and once you've uncompressed the second time you've got a total of up to 3072 bytes of uncompressed data.
Also, it looks like you're unnecessarily doing a calloc right after realloc when you get Z_BUF_ERROR.

I find nothing apparent that is wrong with your code. You may be mis-predicting the length of your uncompressed data. uncompress() will only return Z_OK if it has decompressed a complete zlib stream and the check value of the uncompressed data matched the check value at the end of the stream.

Related

How to use ZLIB deflate method?

I am trying to use zlib to compress a text file. It seems to kinda work, except I'm pretty sure my calculation of the number of bytes to write to the output is wrong. My code (guided by http://zlib.net/zlib_how.html) is below:
int
deflateFile(
char *infile,
char *outfile)
{
#define CHUNKSIZE 1000
int n,nr,nw,towrite;
z_stream strm;
FILE *fin,*fout;
BYTE *inbuf,*outbuf;
int ntot=0;
printf( "Start doDeflateFile:\n" );
// ALLOC BUFFERS
inbuf = malloc( CHUNKSIZE+1 );
outbuf = malloc( CHUNKSIZE+1 );
// OPEN FILES
fin = fopen( infile, "rb" );
fout = fopen( outfile, "wb" );
// SETUP Z STREAM
strm.zalloc = Z_NULL;
strm.zfree = Z_NULL;
strm.opaque = Z_NULL;
strm.avail_in = CHUNKSIZE; // size of input
strm.next_in = inbuf; // input buffer
strm.avail_out = CHUNKSIZE; // size of output
strm.next_out = outbuf; // output buffer
deflateInit( &strm, Z_BEST_COMPRESSION ); // init stream level
while( TRUE ) { // loop til EOF on input file
// READ NEXT INPUT CHUNK
nr = fread( inbuf, 1, CHUNKSIZE, fin );
if( nr <= 0 ) {
printf( "End of input\n" );
break;
}
printf( "\nread chunk of %6d bytes\n", nr );
printf( "calling deflate...\n" );
n = deflate(&strm, Z_FINISH); // call ZLIB deflate
towrite = CHUNKSIZE - strm.avail_out; // calc # bytes to write (FIXME???)
printf( "#bytes to write %6d bytes\n", towrite );
nw = fwrite( outbuf, 1, towrite, fout );
if( nw != towrite ) break;
printf( "wrote chunk of %6d bytes\n", nw );
ntot += nw;
}
deflateEnd(&strm); // end deflate
printf( "wrote total of %d bytes\n", ntot );
printf( "End deflateFile.\n" );
return( 0 );
}
The output for a 1010-byte input file with a CHUNKSIZE of 1000 is:
Start deflateFile:
read chunk of 1000 bytes
calling deflate...
#bytes to write 200 bytes
wrote chunk of 200 bytes
read chunk of 10 bytes
calling deflate...
#bytes to write 200 bytes
wrote chunk of 200 bytes
End of input
wrote total of 400 bytes
End deflateFile.
SO #4538586 sort of addressed this but not quite and it's very old..
Can anybody point out my problem?
You should read that page again. Much more carefully this time.
You are not setting avail_in properly at the start, and you are not resetting next_in, avail_in, next_out, and avail_out in the loop. The only thing you are doing correctly is the thing you think is wrong, which is the calculation of how many bytes to write out. What you have will not even "kinda work".
First off, avail_in must always be set to the amount of available input at next_in. Hence the name avail_in. You are setting it to CHUNKSIZE and calling deflateInit(), even though there is no available input in that buffer yet.
Then after you read data into the input buffer, you ignore nr! You need to set avail_in to nr, to indicate how much data is actually in the buffer. It might be less than CHUNKSIZE.
You should read data into the input buffer only if you have processed all of the data that was there from the last read, indicated by avail_in being zero.
When a call of deflate() completes inside the loop, it has updated next_in, avail_in, next_out, and avail_out. To use the inbuf and outbuf buffers again, you need to reset next_in, next_out, and avail_out to the values you set initially. avail_in will be set at the top of the loop from nr.
You are calling deflate() with Z_FINISH every time! The way this works is that you call deflate() with Z_NO_FLUSH until the last of the input is provided, and then you use Z_FINISH, to let it know to finish. (That's why it's called that.)
Your loop will exit prematurely, since you need to finish compressing and writing the output, not just finish reading the input.
You are not checking the return code of deflate(). Always check return codes. That's why they're there.
Good luck.

How to read and write a ppm file?

I am trying to read a ppm file and create an identical new one. But when I open them with GIMP2 the images are not the same.
Where is the problem with my code?
int main()
{
FILE *in, *out;
in = fopen("parrots.ppm","r");
if( in == NULL )
{
std::cout<<"Error.\n";
return 0;
}
unsigned char *buffer = NULL;
long size = 0;
fseek(in, 0, 2);
size = ftell(in);
fseek(in, 0, 0);
buffer = new unsigned char[size];
if( buffer == NULL )
{
std::cout<<"Error\n";
return 0;
}
if( fread(buffer, size, 1, in) < 0 )
{
std::cout<<"Error.\n";
return 0 ;
}
out = fopen("out.ppm","w");
if( in == NULL )
{
std::cout<<"Error.\n";
return 0;
}
if( fwrite(buffer, size, 1, out) < 0 )
{
std::cout<<"Error.\n";
return 0;
}
delete[] buffer;
fcloseall();
return 0;
}
Before that I read the ppm file in a structure and when I wrote it I get the same image but the green was more intense than in the original picture. Then I tried this simple reading and writing but I get the same result.
int main()
Missing includes.
FILE *in, *out;
C style I/O in a C++ program, why? Also, declare at point of initialization, close to first use.
in = fopen("parrots.ppm","r");
This is opening the file in text mode, which is most certainly not what you want. Use "rb" for mode.
unsigned char *buffer = NULL;
Declare at point of initialization, close to first use.
fseek(in, 0, 2);
You are supposed to use SEEK_END, which is not guaranteed to be defined as 2.
fseek(in, 0, 0);
See above, for SEEK_SET not guaranteed to be defined as 0.
buffer = new unsigned char[size];
if( buffer == NULL )
By default, new will not return a NULL pointer, but throw a std::bad_alloc exception. (With overallocation being the norm on most current operating systems, checking for NULL would not protect you from out-of-memory even with malloc(), but good to see you got into the habit of checking anyway.)
C++11 brought us smart pointers. Use them. They are an excellent tool to avoid memory leaks (one of the very few weaknesses of C++).
if( fread(buffer, size, 1, in) < 0 )
Successful use of fread returns the number of objects read, which should be checked for equality with the third parameter (that is, test != 1), not < 0. fread returns a size_t, so < 0 can never be true.
out = fopen("out.ppm","w");
Text mode again, you want "wb" here.
if( fwrite(buffer, size, 1, out) < 0 )
See the note about the fread return value above. Same applies here.
fcloseall();
Not a standard function. Use fclose( in ); and fclose( out );.
A C++11-ified solution (omitting the error checking for brevity) would look somewhat like this:
#include <iostream>
#include <fstream>
#include <memory>
int main()
{
std::ifstream in( "parrots.ppm", std::ios::binary );
std::ofstream out( "out.ppm", std::ios::binary );
in.seekg( 0, std::ios::end );
auto size = in.tellg();
in.seekg( 0 );
std::unique_ptr< char[] > buffer( new char[ size ] );
in.read( buffer.get(), size );
out.write( buffer.get(), size );
in.close();
out.close();
return 0;
}
Of course, a smart solution would do an actual filesystem copy, either through Boost.Filesystem or the standard functionality (experimental at the point of this writing).
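Since the C++11 version above leaves out error checking, here is a hedged sketch of the same binary copy with the checks put back in. The function name copyBinary and the 64 KiB chunk size are arbitrary choices for illustration; it copies in chunks so the file size does not need to be known up front.

```cpp
#include <cassert>
#include <fstream>
#include <string>
#include <vector>

// Copies `from` to `to` byte for byte. Returns true only if everything was copied.
bool copyBinary(const std::string& from, const std::string& to)
{
    assert(from != to); // opening the destination would truncate the source

    std::ifstream in(from, std::ios::binary);
    if (!in) return false;
    std::ofstream out(to, std::ios::binary | std::ios::trunc);
    if (!out) return false;

    std::vector<char> chunk(64 * 1024);
    while (in) {
        in.read(chunk.data(), static_cast<std::streamsize>(chunk.size()));
        const std::streamsize got = in.gcount(); // short on the final chunk
        if (got > 0) out.write(chunk.data(), got);
        if (!out) return false;
    }
    out.flush();
    // The loop must have ended because of end-of-file, not a read error.
    return in.eof() && !in.bad() && out.good();
}
```

The std::ios::binary flag on both streams is the part that matters for the original problem: on platforms that distinguish text and binary mode, a text-mode stream may translate line-ending bytes inside the pixel data.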

Raspberry Pi C++ Read NMEA Sentences from Adafruit's Ultimate GPS Module

I'm trying to read the GPS NMEA sentences from Adafruit's Ultimate GPS module. I'm using C++ on the Raspberry Pi to read the serial port connection to the module.
Here is my read function:
int Linuxutils::readFromSerialPort(int fd, int bufferSize) {
/*
Reading data from a port is a little trickier. When you operate the port in raw data mode,
each read(2) system call will return however many characters are actually available in the
serial input buffers. If no characters are available, the call will block (wait) until
characters come in, an interval timer expires, or an error occurs. The read function can be
made to return immediately by doing the following:
fcntl(fd, F_SETFL, FNDELAY);
The NDELAY option causes the read function to return 0 if no characters are available on the port.
*/
// Check the file descriptor
if ( !checkFileDecriptorIsValid(fd) ) {
fprintf(stderr, "Could not read from serial port - it is not a valid file descriptor!\n");
return -1;
}
// Now, let's wait for an input from the serial port.
fcntl(fd, F_SETFL, 0); // block until data comes in
// Now read the data
int absoluteMax = bufferSize*2;
char *buffer = (char*) malloc(sizeof(char) * bufferSize); // allocate buffer.
int rcount = 0;
int length = 0;
// Read in each newline
FILE* fdF = fdopen(fd, "r");
int ch = getc(fdF);
while ( (ch != '\n') ) { // Check for end of file or newline
// Reached end of file
if ( ch == EOF ) {
printf("ERROR: EOF!");
continue;
}
// Expand by reallocating if necessary
if( rcount == absoluteMax ) { // time to expand ?
absoluteMax *= 2; // expand to double the current size of anything similar.
rcount = 0; // Re-init count
buffer = (char*)realloc(buffer, absoluteMax); // Re-allocate memory.
}
// Read from stream
ch = getc(fdF);
// Stuff in buffer
buffer[length] = ch;
// Increment counters
length++;
rcount++;
}
// Don't care if we return 0 chars read
if ( rcount == 0 ) {
return 0;
}
// Stick
buffer[rcount] = '\0';
// Print results
printf("Received ( %d bytes ): %s\n", rcount,buffer);
// Return bytes read
return rcount;
}
So I kind of get the sentences as you can see below, the problem is I get these "repeated" portions of a complete sentence like this:
Received ( 15 bytes ): M,-31.4,M,,*61
Here is the complete thing:
Received ( 72 bytes ): GPGGA,182452.000,4456.2019,N,09337.0243,W,1,8,1.19,292.6,M,-31.4,M,,*61
Received ( 56 bytes ): GPGSA,A,3,17,07,28,26,08,11,01,09,,,,,1.49,1.19,0.91*00
Received ( 15 bytes ): M,-31.4,M,,*61
Received ( 72 bytes ): GPGGA,182453.000,4456.2019,N,09337.0242,W,1,8,1.19,292.6,M,-31.4,M,,*61
Received ( 56 bytes ): GPGSA,A,3,17,07,28,26,08,11,01,09,,,,,1.49,1.19,0.91*00
Received ( 15 bytes ): M,-31.4,M,,*61
Received ( 72 bytes ): GPGGA,182456.000,4456.2022,N,09337.0241,W,1,8,1.21,292.6,M,-31.4,M,,*64
Received ( 56 bytes ): GPGSA,A,3,17,07,28,26,08,11,01,09,,,,,2.45,1.21,2.13*0C
Received ( 70 bytes ): GPRMC,182456.000,A,4456.2022,N,09337.0241,W,0.40,183.74,110813,,,A*7F
Received ( 37 bytes ): GPVTG,183.74,T,,M,0.40,N,0.73,K,A*34
Received ( 70 bytes ): GPRMC,182453.000,A,4456.2019,N,09337.0242,W,0.29,183.74,110813,,,A*7E
Received ( 37 bytes ): GPVTG,183.74,T,,M,0.29,N,0.55,K,A*3F
Received ( 32 bytes ): 242,W,0.29,183.74,110813,,,A*7E
Received ( 70 bytes ): GPRMC,182452.000,A,4456.2019,N,09337.0243,W,0.33,183.74,110813,,,A*75
Why am I getting the repeated sentences and how can I fix it? I tried flushing the serial port buffers but then things became really ugly! Thanks.
I'm not sure I understand your exact problem. There are a few problems with the function though which might explain a variety of errors.
The lines
int absoluteMax = bufferSize*2;
char *buffer = (char*) malloc(sizeof(char) * bufferSize); // allocate buffer.
seem wrong. You'll decide when to grow the buffer by comparing the number of characters read to absoluteMax so this needs to match the size of the buffer allocated. You're currently writing beyond the end of allocated memory before you reallocate. This results in undefined behaviour. If you're lucky your app will crash, if you're unlucky, things will appear to work but you'll lose the second half of the data you've read since only the data written to memory you own will be moved by realloc (if it relocates your heap cell).
Also, you can rely on sizeof(char) being 1. In C you shouldn't cast the return from malloc (or realloc); in C++, which this code appears to be, the cast is required.
You lose the first character read (the one that is read just before the while loop). Is this deliberate?
When you reallocate buffer, you shouldn't reset rcount. This causes the same bug as above where you'll write beyond the end of buffer before reallocating again. Again, the effects of doing this are undefined but could include losing portions of output.
Not related to the bug you're currently concerned with but also worth noting is the fact that you leak buffer and fdF. You should free and fclose them respectively before exiting the function.
The following (untested) version ought to fix these issues
int Linuxutils::readFromSerialPort(int fd, int bufferSize)
{
if ( !checkFileDecriptorIsValid(fd) ) {
fprintf(stderr, "Could not read from serial port - it is not a valid file descriptor!\n");
return -1;
}
fcntl(fd, F_SETFL, 0); // block until data comes in
int absoluteMax = bufferSize;
char *buffer = (char*) malloc(absoluteMax); // the cast is required in C++
int length = 0;
// Read one line
FILE* fdF = fdopen(fd, "r");
for (;;) {
int ch = getc(fdF); // the only read, so the first character is not lost
if (ch == '\n') {
break;
}
if (ch == EOF) { // Reached end of file
printf("ERROR: EOF!\n");
break;
}
if (length+1 >= absoluteMax) { // keep room for the terminating '\0'
absoluteMax *= 2;
char* tmp = (char*) realloc(buffer, absoluteMax);
if (tmp == NULL) {
printf("ERROR: OOM\n");
goto cleanup;
}
buffer = tmp;
}
buffer[length++] = (char) ch;
}
if (length > 0) {
buffer[length] = '\0';
// Print results
printf("Received ( %d bytes ): %s\n", length, buffer);
}
cleanup: // reached on every path, so nothing leaks
free(buffer);
fclose(fdF);
return length;
}
Maybe you could try to flush serial port buffers before reading from it as shown in this link ?
I would also consider not reopening the serial port every time you call Linuxutils::readFromSerialPort - you could keep the file descriptor open for further reading (anyway the call is blocking so from the caller's point of view nothing changes).
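Since this is C++, the buffer-growth problems discussed above can also be sidestepped by letting std::string manage the memory. A minimal sketch, assuming the stream was obtained once via fdopen and is kept open across calls; the function name readLine is hypothetical, and dropping '\r' assumes the module terminates sentences with "\r\n":

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Reads characters up to (not including) the next '\n' into `line`.
// Returns false only when EOF or an error occurs before any character is stored.
bool readLine(std::FILE* stream, std::string& line)
{
    assert(stream != nullptr);
    line.clear();
    int ch;
    while ((ch = std::getc(stream)) != EOF) {
        if (ch == '\n') return true;
        if (ch != '\r') line.push_back(static_cast<char>(ch)); // string grows as needed
    }
    return !line.empty(); // a final line without a trailing newline still counts
}
```

Because every character read is either stored or deliberately dropped, there is no separate rcount/length bookkeeping to get out of sync.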

Efficient means of null terminating an unsigned char buffer in a string append function?

I've been writing a "Byte Buffer" utility module - just a set of functions for personal use in low level development.
Unfortunately, my ByteBuffer_Append(...) function doesn't work properly when it null terminates the character at the end, and/or adds extra room for the null termination character. One result, when this is attempted, is when I call printf() on the buffer's data (a cast to (char*) is performed): I'll get only a section of the string, as the first null termination character within the buffer will be found.
So, what I'm looking for is a means to incorporate some kind of null terminating functionality within the function, but I'm kind of drawing a blank in terms of what would be a good way of going about this, and could use a point in the right direction.
Here's the code, if that helps:
void ByteBuffer_Append( ByteBuffer_t* destBuffer, uInt8* source, uInt32 sourceLength )
{
if ( !destBuffer )
{
puts( "[ByteBuffer_Append]: param 'destBuffer' received is NULL, bailing out...\n" );
return;
}
if ( !source )
{
puts( "[ByteBuffer_Append]: param 'source' received is NULL, bailing out...\n" );
return;
}
size_t byteLength = sizeof( uInt8 ) * sourceLength;
// check to see if we need to reallocate the buffer
if ( destBuffer->capacity < byteLength || destBuffer->length >= sourceLength )
{
destBuffer->capacity += byteLength;
uInt8* newBuf = ( uInt8* ) realloc( destBuffer->data, destBuffer->capacity );
if ( !newBuf )
{
Mem_BadAlloc( "ByteBuffer_Append - realloc" );
}
destBuffer->data = newBuf;
}
uInt32 end = destBuffer->length + sourceLength;
// use a separate pointer for the source data as
// we copy it into the destination buffer
uInt8* pSource = source;
for ( uInt32 iBuffer = destBuffer->length; iBuffer < end; ++iBuffer )
{
destBuffer->data[ iBuffer ] = *pSource;
++pSource;
}
// the commented code below
// is where the null termination
// was happening
destBuffer->length += sourceLength; // + 1;
//destBuffer->data[ destBuffer->length - 1 ] = '\0';
}
Many thanks to anyone providing input on this.
Looks like your issue is caused by memory corruption.
You have to fix the following three problems:
1 check if allocated space is enough
if ( destBuffer->capacity < byteLength || destBuffer->length >= sourceLength )
does not properly check if buffer reallocation is needed,
replace with
if ( destBuffer->capacity <= destBuffer->length+byteLength )
2 allocating enough space
destBuffer->capacity += byteLength;
is better to become
destBuffer->capacity = destBuffer->length + byteLength + 1;
3 properly null terminating
destBuffer->data[ destBuffer->length - 1 ] = '\0';
should become
destBuffer->data[ destBuffer->length ] = '\0';
In C/C++, a sequence of chars terminated by a '\0' is a string. There is a set of string functions, such as strcpy() and strcmp(), that take a char * as a parameter and treat the first '\0' they find as the end of the string. In your case, printf("%s", buf) treats buf as a string, so it stops printing at the first '\0'.
If you are implementing a byte buffer, any byte, including '\0', is ordinary data in the buffer. So you should avoid the string functions. To print a buffer, you need to implement your own function.
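To tie the two answers together, here is a hedged sketch of an append that always reserves one extra byte for a terminator that is not counted in length, plus a print helper that does not stop at embedded '\0' bytes. The struct layout and the names ByteBuffer, appendBytes, and printBytes are assumptions for illustration; the question does not show the real ByteBuffer_t definition.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <cstdlib>
#include <cstring>

// Hypothetical layout: `length` counts payload bytes only, never the terminator.
struct ByteBuffer {
    std::uint8_t* data;
    std::uint32_t length;
    std::uint32_t capacity;
};

bool appendBytes(ByteBuffer* buf, const std::uint8_t* source, std::uint32_t sourceLength)
{
    if (!buf || !source) return false;

    const std::uint32_t needed = buf->length + sourceLength + 1; // +1 for '\0'
    if (buf->capacity < needed) {
        void* grown = std::realloc(buf->data, needed);
        if (!grown) return false;          // the old block is still valid here
        buf->data = static_cast<std::uint8_t*>(grown);
        buf->capacity = needed;
    }

    std::memcpy(buf->data + buf->length, source, sourceLength);
    buf->length += sourceLength;
    buf->data[buf->length] = '\0';         // terminator sits just past the payload
    assert(buf->length < buf->capacity);
    return true;
}

// Writes exactly `length` bytes, so embedded '\0' bytes do not cut the output short.
void printBytes(const ByteBuffer& buf)
{
    if (buf.length > 0) std::fwrite(buf.data, 1, buf.length, stdout);
}
```

Keeping the terminator outside of length means the next append overwrites it, so no '\0' ends up in the middle of the data unless the caller put one there.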

Reading SDL_RWops from a std::istream

I'm quite surprised that Google didn't find a solution. I'm searching for a solution that allows SDL_RWops to be used with std::istream. SDL_RWops is the alternative mechanism for reading/writing data in SDL.
Any links to sites that tackle the problem?
An obvious solution would be to pre-read enough data to memory and then use SDL_RWFromMem. However, that has the downside that I'd need to know the filesize beforehand.
Seems like the problem could somehow be solved by "overriding" SDL_RWops functions...
I feel bad answering my own question, but it preoccupied me for some time, and this is the solution I came up with:
int istream_seek( struct SDL_RWops *context, int offset, int whence)
{
std::istream* stream = (std::istream*) context->hidden.unknown.data1;
if ( whence == SEEK_SET )
stream->seekg ( offset, std::ios::beg );
else if ( whence == SEEK_CUR )
stream->seekg ( offset, std::ios::cur );
else if ( whence == SEEK_END )
stream->seekg ( offset, std::ios::end );
return stream->fail() ? -1 : stream->tellg();
}
int istream_read(SDL_RWops *context, void *ptr, int size, int maxnum)
{
if ( size == 0 ) return -1;
std::istream* stream = (std::istream*) context->hidden.unknown.data1;
stream->read( (char*)ptr, size * maxnum );
return stream->bad() ? -1 : stream->gcount() / size;
}
int istream_close( SDL_RWops *context )
{
if ( context ) {
SDL_FreeRW( context );
}
return 0;
}
SDL_RWops *SDL_RWFromIStream( std::istream& stream )
{
SDL_RWops *rwops;
rwops = SDL_AllocRW();
if ( rwops != NULL )
{
rwops->seek = istream_seek;
rwops->read = istream_read;
rwops->write = NULL;
rwops->close = istream_close;
rwops->hidden.unknown.data1 = &stream;
}
return rwops;
}
This works under the assumption that the istream is never freed by SDL (and that it outlives the SDL_RWops). Only istream is supported so far; a separate function would be needed for ostream. I know I could accept an iostream instead, but then a plain istream could not be passed to the conversion function :/.
Any tips on errors or upgrades welcome.
If you're trying to get an SDL_RWops struct from an istream, you could do it by reading the whole istream into memory and then using SDL_RWFromMem to get a struct to represent it.
Following is a quick example; note that it's unsafe, as no sanity checks are done. For example, if the file's size is 0, accessing buffer[0] may throw an exception or assert in debug builds.
// Open a bitmap
std::ifstream bitmap("bitmap.bmp", std::ios::binary);
// Find the bitmap file's size
bitmap.seekg(0, std::ios_base::end);
std::istream::pos_type fileSize = bitmap.tellg();
bitmap.seekg(0);
// Allocate a buffer to store the file in
std::vector<unsigned char> buffer(fileSize);
// Copy the istream into the buffer (the iterator's char type must match the stream's)
std::copy(std::istreambuf_iterator<char>(bitmap), std::istreambuf_iterator<char>(), buffer.begin());
// Get an SDL_RWops struct for the file
SDL_RWops* rw = SDL_RWFromMem(&buffer[0], buffer.size());
// Do stuff with the SDL_RWops struct
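If the concern is having to know the file size beforehand, the stream can be read to its end without seeking at all. A small sketch that has no SDL dependency; the names readAll and readAllDemo are hypothetical:

```cpp
#include <cassert>
#include <istream>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Reads everything that is left in the stream, without needing its size up front.
std::vector<char> readAll(std::istream& in)
{
    return std::vector<char>(std::istreambuf_iterator<char>(in),
                             std::istreambuf_iterator<char>());
}

// Works on any istream; here an in-memory one holding 4 bytes, one of them '\0'.
std::vector<char> readAllDemo()
{
    std::istringstream in(std::string("BM\0\x01", 4));
    std::vector<char> bytes = readAll(in);
    assert(bytes.size() == 4);
    return bytes;
}
```

The resulting vector could then be passed to SDL_RWFromMem along the lines of the example above, as long as it is non-empty and outlives the SDL_RWops. This also works for streams that cannot seek, at the cost of holding the whole content in memory.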