I am trying to unzip a file by calling the inflate function but it always fails with Z_DATA_ERROR even when I use the example program from the website. I am thinking that maybe the zip file I have is not supported. I have attached a picture of the zip header below.
And here is the function that I wrote to perform the unzipping. I read in the whole file at once (about 34KB) and pass it into this function. Note I have tried passing the whole zip file with the zip header as well as skipping over the zip file header and only passing the zipped data both fail with Z_DATA_ERROR when inflate() is called.
int CHttpDownloader::unzip(unsigned char * pDest, unsigned long * ulDestLen, unsigned char * pSource, int iSourceLen){
int ret = 0;
unsigned int uiUncompressedBytes = 0; // Number of uncompressed bytes returned from inflate() function
unsigned char * pPositionDestBuffer = pDest; // Current position in dest buffer
unsigned char * pLastSource = &pSource[iSourceLen]; // Last position in source buffer
z_stream strm;
// Skip over local file header
SLocalFileHeader * header = (SLocalFileHeader *) pSource;
pSource += sizeof(SLocalFileHeader) + header->sFileNameLen + header->sExtraFieldLen;
// We should now be at the beginning of the stream data
/* allocate inflate state */
strm.zalloc = Z_NULL;
strm.zfree = Z_NULL;
strm.opaque = Z_NULL;
strm.avail_in = 0;
strm.next_in = Z_NULL;
ret = inflateInit2(&strm, 16+MAX_WBITS);
if (ret != Z_OK){
return -1;
}
// Uncompress the data
strm.avail_in = header->iCompressedSize; //iSourceLen;
strm.next_in = pSource;
do {
strm.avail_out = *ulDestLen;
strm.next_out = pPositionDestBuffer;
ret = inflate(&strm, Z_NO_FLUSH);
assert(ret != Z_STREAM_ERROR); /* state not clobbered */
switch (ret) {
case Z_NEED_DICT:
ret = Z_DATA_ERROR; /* and fall through */
case Z_DATA_ERROR:
case Z_MEM_ERROR:
(void)inflateEnd(&strm);
return -2;
}
uiUncompressedBytes = *ulDestLen - strm.avail_out;
*ulDestLen -= uiUncompressedBytes; // ulDestSize holds number of free/empty bytes in buffer
pPositionDestBuffer += uiUncompressedBytes;
} while (strm.avail_out == 0);
// Close the decompression stream
inflateEnd(&strm);
ASSERT(ret == Z_STREAM_END);
return 0;
}
So my question is, is the type of zip file I am reading in not supported by ZLib's inflate() function? Or is there something wrong with my CHttpDownloader::unzip() function? Thanks for any help :)
Inflate() was failing because it was looking for GZip headers which were not present. If you initialize the stream with:
ret = inflateInit2(&strm, -MAX_WBITS);
Passing a negative window bits value prevents inflate from checking for gzip or zlib headers and unzipping works as expected.
That file that begins with 50 4B 03 04 is a zip file. The zlib library does not process zip files directly. zlib can help with the compression, decompression, and crc calculations. However you need other code to process the zip file format.
You can look at contrib/minizip in the zlib distribution, or libzip.
Related
Using zlib version 1.2.7 uncompress gzip data, but I couldn't know how to get the files' name in the compression package, or some one you are extracting.The method I find,it looks like read all data to buffer, and then return it.
like this:
int gzdecompress(Byte *zdata, uLong nzdata, Byte *data, uLong *ndata)
{
int err = 0;
z_stream d_stream = {0}; /* decompression stream */
static char dummy_head[2] = {
0x8 + 0x7 * 0x10,
(((0x8 + 0x7 * 0x10) * 0x100 + 30) / 31 * 31) & 0xFF,
};
d_stream.zalloc = NULL;
d_stream.zfree = NULL;
d_stream.opaque = NULL;
d_stream.next_in = zdata;
d_stream.avail_in = 0;
d_stream.next_out = data;
//only set value "MAX_WBITS + 16" could be Uncompress file that have header or trailer text
if(inflateInit2(&d_stream, MAX_WBITS + 16) != Z_OK) return -1;
while(d_stream.total_out < *ndata && d_stream.total_in < nzdata) {
d_stream.avail_in = d_stream.avail_out = 1; /* force small buffers */
if((err = inflate(&d_stream, Z_NO_FLUSH)) == Z_STREAM_END) break;
if(err != Z_OK) {
if(err == Z_DATA_ERROR) {
d_stream.next_in = (Bytef*) dummy_head;
d_stream.avail_in = sizeof(dummy_head);
if((err = inflate(&d_stream, Z_NO_FLUSH)) != Z_OK) {
return -1;
}
} else return -1;
}
}
if(inflateEnd(&d_stream) != Z_OK) return -1;
*ndata = d_stream.total_out;
return 0;
}
Using Example:
// file you want to extract
filename = "D:\\gzfile";
// read file to buffer
ifstream infile(filename, ios::binary);
if(!infile)
{
cerr<<"open error!"<<endl;
}
int begin = infile.tellg();
int end = begin;
int FileSize = 0;
infile.seekg(0,ios_base::end);
end = infile.tellg();
FileSize = end - begin;
char* buffer_bin = new char[FileSize];
char buffer_bin2 = new char[FileSize * 2];
infile.seekg(0,ios_base::beg);
for(int i=0;i<FileSize;i++)
infile.read(&buffer_bin[i],sizeof(buffer_bin[i]));
infile.close( );
// uncompress
uLong ts = (FileSize * 2);
gzdecompress((Byte*)buffer_bin, FileSize, (Byte*)buffer_bin2, &ts);
Array "buffer_bin2" get the extracted data.Attribute "ts" is the data length.
The question is, I don't know what is it name, is there only one file.How can I get the infomation?
Your question is not at all clear, but if you are trying to get the file name that is stored in the gzip header, then it would behoove you to read the zlib documentation in zlib.h. In fact that would be good idea if you plan to use zlib in any capacity.
In the documentation, you will find that the inflate...() functions will decompress gzip data, and that there is an inflateGetHeader() function that will return the gzip header contents.
Note that when gzip decompresses a .gz file, it doesn't even look at the name in the header, unless explicitly asked to. gzip will decompress to the name of the .gz file, e.g. foo.gz becomes foo when decompressed, even if the gzip header says the name is bar. If you use gzip -dN foo.gz, then it will call it bar. It is not clear why you even care what the name in the gzip header is.
I using libzip to work with zip files and everything goes fine, until i need to read file from zip
I need to read just a whole text files, so it will be great to achieve something like PHP "file_get_contents" function.
To read file from zip there is a function "int
zip_fread(struct zip_file *file, void *buf, zip_uint64_t nbytes)".
Main problem what i don't know what size of buf must be and how many nbytes i must read (well i need to read whole file, but files have different size). I can just do a big buffer to fit them all and read all it's size, or do a while loop until fread return -1 but i don't think it's rational option.
You can try using zip_stat to get file size.
http://linux.die.net/man/3/zip_stat
I haven't used the libzip interface but from what you write it seems to look very similar to a file interface: once you got a handle to the stream you keep calling zip_fread() until this function return an error (ir, possibly, less than requested bytes). The buffer you pass in us just a reasonably size temporary buffer where the data is communicated.
Personally I would probably create a stream buffer for this so once the file in the zip archive is set up it can be read using the conventional I/O stream methods. This would look something like this:
struct zipbuf: std::streambuf {
zipbuf(???): file_(???) {}
private:
zip_file* file_;
enum { s_size = 8196 };
char buffer_[s_size];
int underflow() {
int rc(zip_fread(this->file_, this->buffer_, s_size));
this->setg(this->buffer_, this->buffer_,
this->buffer_ + std::max(0, rc));
return this->gptr() == this->egptr()
? traits_type::eof()
: traits_type::to_int_type(*this->gptr());
}
};
With this stream buffer you should be able to create an std::istream and read the file into whatever structure you need:
zipbuf buf(???);
std::istream in(&buf);
...
Obviously, this code isn't tested or compiled. However, when you replace the ??? with whatever is needed to open the zip file, I'd think this should pretty much work.
Here is a routine I wrote that extracts data from a zip-stream and prints out a line at a time. This uses zlib, not libzip, but if this code is useful to you, feel free to use it:
#
# compile with -lz option in order to link in the zlib library
#
#include <zlib.h>
#define Z_CHUNK 2097152
int unzipFile(const char *fName)
{
z_stream zStream;
char *zRemainderBuf = malloc(1);
unsigned char zInBuf[Z_CHUNK];
unsigned char zOutBuf[Z_CHUNK];
char zLineBuf[Z_CHUNK];
unsigned int zHave, zBufIdx, zBufOffset, zOutBufIdx;
int zError;
FILE *inFp = fopen(fName, "rbR");
if (!inFp) { fprintf(stderr, "could not open file: %s\n", fName); return EXIT_FAILURE; }
zStream.zalloc = Z_NULL;
zStream.zfree = Z_NULL;
zStream.opaque = Z_NULL;
zStream.avail_in = 0;
zStream.next_in = Z_NULL;
zError = inflateInit2(&zStream, (15+32)); /* cf. http://www.zlib.net/manual.html */
if (zError != Z_OK) { fprintf(stderr, "could not initialize z-stream\n"); return EXIT_FAILURE; }
*zRemainderBuf = '\0';
do {
zStream.avail_in = fread(zInBuf, 1, Z_CHUNK, inFp);
if (zStream.avail_in == 0)
break;
zStream.next_in = zInBuf;
do {
zStream.avail_out = Z_CHUNK;
zStream.next_out = zOutBuf;
zError = inflate(&zStream, Z_NO_FLUSH);
switch (zError) {
case Z_NEED_DICT: { fprintf(stderr, "Z-stream needs dictionary!\n"); return EXIT_FAILURE; }
case Z_DATA_ERROR: { fprintf(stderr, "Z-stream suffered data error!\n"); return EXIT_FAILURE; }
case Z_MEM_ERROR: { fprintf(stderr, "Z-stream suffered memory error!\n"); return EXIT_FAILURE; }
}
zHave = Z_CHUNK - zStream.avail_out;
zOutBuf[zHave] = '\0';
/* copy remainder buffer onto line buffer, if not NULL */
if (zRemainderBuf) {
strncpy(zLineBuf, zRemainderBuf, strlen(zRemainderBuf));
zBufOffset = strlen(zRemainderBuf);
}
else
zBufOffset = 0;
/* read through zOutBuf for newlines */
for (zBufIdx = zBufOffset, zOutBufIdx = 0; zOutBufIdx < zHave; zBufIdx++, zOutBufIdx++) {
zLineBuf[zBufIdx] = zOutBuf[zOutBufIdx];
if (zLineBuf[zBufIdx] == '\n') {
zLineBuf[zBufIdx] = '\0';
zBufIdx = -1;
fprintf(stdout, "%s\n", zLineBuf);
}
}
/* copy some of line buffer onto the remainder buffer, if there are remnants from the z-stream */
if (strlen(zLineBuf) > 0) {
if (strlen(zLineBuf) > strlen(zRemainderBuf)) {
/* to minimize the chance of doing another (expensive) malloc, we double the length of zRemainderBuf */
free(zRemainderBuf);
zRemainderBuf = malloc(strlen(zLineBuf) * 2);
}
strncpy(zRemainderBuf, zLineBuf, zBufIdx);
zRemainderBuf[zBufIdx] = '\0';
}
} while (zStream.avail_out == 0);
} while (zError != Z_STREAM_END);
/* close gzip stream */
zError = inflateEnd(&zStream);
if (zError != Z_OK) {
fprintf(stderr, "could not close z-stream!\n");
return EXIT_FAILURE;
}
if (zRemainderBuf)
free(zRemainderBuf);
fclose(inFp);
return EXIT_SUCCESS;
}
With any streaming you should consider the memory requirements of your app.
A good buffer size is large, but you do not want to have too much memory in use depending on your RAM usage requirements. A small buffer size will require you call your read and write operations more times which are expensive in terms of time performance. So, you need to find a buffer in the middle of those two extremes.
Typically I use a size of 4096 (4KB) which is sufficiently large for many purposes. If you want, you can go larger. But at the worst case size of 1 byte, you will be waiting a long time for you read to complete.
So to answer your question, there is no "right" size to pick. It is a choice you should make so that the speed of your app and the memory it requires are what you need.
I'm writing Qt-based client application. It connects to remote server using QTcpSocket. Before sending any actual data it needs to send login info, which is zlib-compressed json.
As far as I know from server sources, to make everything work I need to send X bytes of compressed data following 4 bytes with length of uncompressed data.
Uncompressing on server-side looks like this:
/* look at first 32 bits of buffer, which contains uncompressed len */
unc_len = le32toh(*((uint32_t *)buf));
if (unc_len > CLI_MAX_MSG)
return NULL;
/* alloc buffer for uncompressed data */
obj_unc = malloc(unc_len + 1);
if (!obj_unc)
return NULL;
/* decompress buffer (excluding first 32 bits) */
comp_p = buf + 4;
if (uncompress(obj_unc, &dest_len, comp_p, buflen - 4) != Z_OK)
goto out;
if (dest_len != unc_len)
goto out;
memcpy(obj_unc + unc_len, &zero, 1); /* null terminate */
I'm compressing json using Qt built-in zlib (I've just downloaded headers and placed it in mingw's include folder):
char json[] = "{\"version\":1,\"user\":\"test\"}";
char pass[] = "test";
std::auto_ptr<Bytef> message(new Bytef[ // allocate memory for:
sizeof(ubbp_header) // + msg header
+ sizeof(uLongf) // + uncompressed data size
+ strlen(json) // + compressed data itself
+ 64 // + reserve (if compressed size > uncompressed size)
+ SHA256_DIGEST_LENGTH]);//+ SHA256 digest
uLongf unc_len = strlen(json);
uLongf enc_len = strlen(json) + 64;
// header goes first, so server will determine that we want to login
Bytef* pHdr = message.get();
// after that: uncompressed data length and data itself
Bytef* pLen = pHdr + sizeof(ubbp_header);
Bytef* pDat = pLen + sizeof(uLongf);
// hash of compressed message updated with user pass
Bytef* pSha;
if (Z_OK != compress(pLen, &enc_len, (Bytef*)json, unc_len))
{
qDebug("Compression failed.");
return false;
}
Complete function code here: http://pastebin.com/hMY2C4n5
Even though server correctly recieves uncompressed length, uncompress() returning Z_BUF_ERROR.
P.S.: I'm actually writing pushpool's client to figure out how it's binary protocol works. I've asked this question on official bitcoin forum, but no luck there. http://forum.bitcoin.org/index.php?topic=24257.0
Turns out it was server-side bug. More details in bitcoin forum thread.
By default i output a file that is 120mb. Here i have a input and output buffer thats double that. When i run this code i get an output of 10mb (default gives me 11mb). When i zip the raw 128mb file i get 700kb. Why am i getting 11mb instead of <1mb like zip gives me? Using 7-zip manager i asked it to compress with gzip using deflate and it give me a 4.6mb file which is still much smaller. I'm very curious why this is happening. It feels like i am doing something wrong.
static UInt32 len=0;
static char buf[1024*1024*256];
static char buf2[1024*1024*256];
static char *curbuf=buf;
z_stream strm;
void initzstuff()
{
strm.zalloc = 0;
strm.zfree = 0;
strm.opaque = 0;
int ret = deflateInit(&strm, Z_BEST_COMPRESSION);
if (ret != Z_OK)
return;
}
void flush_file(MyOstream o, bool end){
strm.avail_in = len;
strm.next_in = (UInt8*)buf;
strm.avail_out = sizeof(buf2);
strm.next_out = (UInt8*)buf2;
int ret = deflate(&strm, (end ? Z_FINISH : Z_NO_FLUSH));
assert(ret != Z_STREAM_ERROR);
int have = sizeof(buf2) - strm.avail_out;
fwrite(buf2, 1, have, o);
if(end)
{
(void)deflateEnd(&strm);
}
len=0;
curbuf=buf;
/*
fwrite(buf, 1, len, o);
len=0;
curbuf=buf;
//*/
}
Zip can use Deflate64 or other compression algorithm (like BZip2), and when your file is very sparce that can result in such difference.
Also, standard for ZLib tells only about the format of compressed data, and how the data is compressed is chosen by archivators, so 7-zip can use some heuristics which makes the ouput smaller.
Probably chunk-size? zlib.net/zpipe.c gives a fairly good example.
You'll probably get better performance too if you chunk rather than try to do the entire stream.
How to write bitmaps as frames to Ogg Theora in C\C++?
Some Examples with source would be grate!)
The entire solution is a little lengthy to post on here as a code sample, but if you download libtheora from Xiph.org, there is an example png2theora. All of the library functions I am about to mention can be found in the documentation on Xiph.org for theora and ogg.
Call th_info_init() to initialise a th_info structure, then set up you output parameters by assigning the appropriate members in that.
Use that structure in a call to th_encode_alloc() to get an encoder context
Initialise an ogg stream, with ogg_stream_init()
Initialise a blank th_comment structure using th_comment_init
Iterate through the following:
Call th_encode_flushheader with the the encoder context, the blank comment structure and an ogg_packet.
Send the resulting packet to the ogg stream with ogg_stream_packetin()
Until th_encode_flushheader returns 0 (or an error code)
Now, repeatedly call ogg_stream_pageout(), every time writing the page.header and then page.body to an output file, until it returns 0. Now call ogg_stream_flush and write the resulting page to the file.
You can now write frames to the encoder. Here is how I did it:
int theora_write_frame(int outputFd, unsigned long w, unsigned long h, unsigned char *yuv_y, unsigned char *yuv_u, unsigned char *yuv_v, int last)
{
th_ycbcr_buffer ycbcr;
ogg_packet op;
ogg_page og;
unsigned long yuv_w;
unsigned long yuv_h;
/* Must hold: yuv_w >= w */
yuv_w = (w + 15) & ~15;
/* Must hold: yuv_h >= h */
yuv_h = (h + 15) & ~15;
//Fill out the ycbcr buffer
ycbcr[0].width = yuv_w;
ycbcr[0].height = yuv_h;
ycbcr[0].stride = yuv_w;
ycbcr[1].width = yuv_w;
ycbcr[1].stride = ycbcr[1].width;
ycbcr[1].height = yuv_h;
ycbcr[2].width = ycbcr[1].width;
ycbcr[2].stride = ycbcr[1].stride;
ycbcr[2].height = ycbcr[1].height;
if(encoderInfo->pixel_fmt == TH_PF_420)
{
//Chroma is decimated by 2 in both directions
ycbcr[1].width = yuv_w >> 1;
ycbcr[2].width = yuv_w >> 1;
ycbcr[1].height = yuv_h >> 1;
ycbcr[2].height = yuv_h >> 1;
}else if(encoderInfo->pixel_fmt == TH_PF_422)
{
ycbcr[1].width = yuv_w >> 1;
ycbcr[2].width = yuv_w >> 1;
}else if(encoderInfo->pixel_fmt != TH_PF_422)
{
//Then we have an unknown pixel format
//We don't know how long the arrays are!
fprintf(stderr, "[theora_write_frame] Unknown pixel format in writeFrame!\n");
return -1;
}
ycbcr[0].data = yuv_y;
ycbcr[1].data = yuv_u;
ycbcr[2].data = yuv_v;
/* Theora is a one-frame-in,one-frame-out system; submit a frame
for compression and pull out the packet */
if(th_encode_ycbcr_in(encoderContext, ycbcr)) {
fprintf(stderr, "[theora_write_frame] Error: could not encode frame\n");
return -1;
}
if(!th_encode_packetout(encoderContext, last, &op)) {
fprintf(stderr, "[theora_write_frame] Error: could not read packets\n");
return -1;
}
ogg_stream_packetin(&theoraStreamState, &op);
ssize_t bytesWritten = 0;
int pagesOut = 0;
while(ogg_stream_pageout(&theoraStreamState, &og)) {
pagesOut ++;
bytesWritten = write(outputFd, og.header, og.header_len);
if(bytesWritten != og.header_len)
{
fprintf(stderr, "[theora_write_frame] Error: Could not write to file\n");
return -1;
}
bytesWritten = write(outputFd, og.body, og.body_len);
if(bytesWritten != og.body_len)
{
bytesWritten = fprintf(stderr, "[theora_write_frame] Error: Could not write to file\n");
return -1;
}
}
return pagesOut;
}
Where encoderInfo is the th_info structure used to initialise the encoder (static in the data section for me).
On your last frame, setting the last frame on th_encode_packetout() will make sure the stream terminates properly.
Once your done, just make sure to clean up (closing fds mainly). th_info_clear() will clear the th_info structure, and th_encode_free() will free your encoder context.
Obviously, you'll need to convert your bitmap into YUV planes before you can pass them to theora_write_frame().
Hope this is of some help. Good luck!
Here's the libtheora API and example code.
Here's a micro howto that shows how to use the theora binaries. As the encoder reads raw, uncompressed 'yuv4mpeg' data for video you could use that from your app, too by piping the video frames to the encoder.