I have to read a dat-file byte by byte from a zip-file into a char[] buffer. The zip-file contains only one dat-file. I guess unzipping chunk by chunk would be good. I am using Visual Studio 2013 with C++.
I have found zip-utils (http://www.codeproject.com/Articles/7530/Zip-Utils-clean-elegant-simple-C-Win); would this be OK, given that it's nearly 10 years old? Would Minizip be a good way? I guess zlib alone would not be enough for this use case, right?
My question is, what's the best way to do the unzipping? I have no experience with handling zip-files and would like to hear a suggestion from somebody with experience.
Thank you,
Friedrich
Minizip would work. Please note that it still requires the zlib source code to link against.
A zip file is not just chunks of zlib compressed content.
It's an archive.
There is a directory header and a per-element header, which you must decode too, even if the archive only contains a single file. Typically, the header will tell you at which offset in the zip file you'll find your compressed DAT content. Then you'll likely use zlib to decode chunk by chunk starting at the given offset.
Please also note that the zip file format does not always imply zlib (deflate) as the compressor (many different compression methods are possible). If you control the code that creates the zip file, it's not an issue. But if the file comes from a hostile user, then you should actually check which compressor was used and verify that it is one zlib can handle; otherwise you should refuse to decompress the file, because you won't be able to do so.
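To illustrate the header decoding and the compressor check, here is a minimal sketch that parses a zip local file header from a byte buffer using only the standard library. The names LocalHeaderInfo, parseLocalHeader and isStoredOrDeflate are made up for this example; the field offsets follow the published zip format (compression method 0 is "stored", 8 is "deflate"). It ignores ZIP64 and data descriptors, so treat it as a starting point rather than a complete reader.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Fields of interest from a zip local file header (little-endian on disk).
// All names here are hypothetical, chosen for this sketch.
struct LocalHeaderInfo {
    std::uint16_t method;          // 0 = stored, 8 = deflate
    std::uint32_t compressedSize;  // may be 0 when a data descriptor is used
    std::size_t   dataOffset;      // where the entry's (compressed) data begins
};

static std::uint16_t readLE16(const unsigned char* p) {
    return static_cast<std::uint16_t>(p[0] | (p[1] << 8));
}

static std::uint32_t readLE32(const unsigned char* p) {
    return static_cast<std::uint32_t>(p[0]) |
           (static_cast<std::uint32_t>(p[1]) << 8) |
           (static_cast<std::uint32_t>(p[2]) << 16) |
           (static_cast<std::uint32_t>(p[3]) << 24);
}

// Parses the local file header starting at `offset` inside `buf`.
// Returns false if the signature is wrong or the buffer is too short.
bool parseLocalHeader(const std::vector<unsigned char>& buf,
                      std::size_t offset, LocalHeaderInfo& out) {
    const std::size_t kFixedSize = 30;  // size of the fixed part of the header
    if (offset > buf.size() || buf.size() - offset < kFixedSize) return false;
    const unsigned char* p = buf.data() + offset;
    if (readLE32(p) != 0x04034b50u) return false;  // bytes "PK\x03\x04"

    out.method = readLE16(p + 8);
    out.compressedSize = readLE32(p + 18);
    const std::size_t nameLen = readLE16(p + 26);
    const std::size_t extraLen = readLE16(p + 28);
    out.dataOffset = offset + kFixedSize + nameLen + extraLen;
    return out.dataOffset <= buf.size();
}

// True only for the two methods a zlib-based reader can handle directly.
bool isStoredOrDeflate(const LocalHeaderInfo& h) {
    return h.method == 0 || h.method == 8;
}
```

If isStoredOrDeflate returns false, the safe reaction is to reject the file. For deflate entries, the bytes starting at dataOffset are a raw deflate stream, which zlib can inflate chunk by chunk.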
Related
I'm looking to replace the zip library that I am using in a small utility with something a bit better.
One of the deficiencies in the library I am currently using is that it doesn't appear to validate zip files very well - I can corrupt the file by changing random characters and the library doesn't notice.
I am looking for a C++ zip library that has a function to validate the zip file without extracting all the files in the archive.
Someone recommended ziplib to me, but I don't see anything in there about checking the integrity of a zip file.
Does anyone know if ziplib has this capability? Or have a better recommendation?
Libraries like libzip and libarchive allow you to read archive entries a chunk at a time. You can simply read the entire archive to verify it, repeatedly overwriting the same buffer in memory with the decompressed data and thereby discarding it.
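One of the things such a read-through can detect is a mismatch with the CRC-32 stored for each entry. The following sketch is not the libzip or libarchive API; it only shows the idea of streaming data through one small, reused buffer while updating a checksum. The function names crc32Update and crc32OfStream are invented for this example, and the bitwise CRC loop is the slow but simple variant of the CRC-32 used by zip.

```cpp
#include <cstddef>
#include <cstdint>
#include <istream>
#include <sstream>  // std::istringstream is handy for trying this out

// Updates a running CRC-32 (the zip/zlib polynomial) with `len` more bytes.
// Start with crc = 0 and feed the chunks in order.
std::uint32_t crc32Update(std::uint32_t crc, const unsigned char* data,
                          std::size_t len) {
    crc = ~crc;
    for (std::size_t i = 0; i < len; ++i) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; ++bit) {
            // Shift right; XOR in the polynomial if the low bit was set.
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return ~crc;
}

// Reads the stream to the end through one fixed buffer, keeping only the
// checksum. Each chunk overwrites the previous one, so memory use is constant.
std::uint32_t crc32OfStream(std::istream& in) {
    char buffer[4096];
    std::uint32_t crc = 0;
    while (in.read(buffer, sizeof buffer) || in.gcount() > 0) {
        crc = crc32Update(crc,
                          reinterpret_cast<const unsigned char*>(buffer),
                          static_cast<std::size_t>(in.gcount()));
    }
    return crc;
}
```

With a real zip library the reading and decompression come from the library's own read call, and the comparison against the stored CRC is typically done for you; check the documentation of the library you pick for how it reports a mismatch.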
Looking around, I have found the question being asked, but no great answers. Apologies if this is a Stack Overflow duplicate.
My goal is to have a zlib compressed file that I append to using C/C++ at different intervals (such as a log file). Due to buffer size constraints I was hoping to avoid having to keep the entire file in memory for appending new items.
Mark Adler's answer was very close to what I needed, but due to already being entrenched in the zlib library and on an embedded device with limited resources I was/am stuck.
I ended up simply appending a delimiter to each section of data (ex: ##delimiter##) and once ready to read the finished file, (different application) it seeks these sections and creates an array object of the compressed sections that are then individually decompressed.
I am still marking Adler's answer as correct, as it was useful info that will be of more help to other programmers.
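For anyone curious, the reading side of the delimiter approach can be sketched roughly like this. The function name splitSections is made up for this example, and std::string is used only as a byte container. Each returned section would then be handed to zlib's decompression separately (not shown here).

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Splits `data` into the pieces that were each followed by `delimiter`.
// Empty pieces are skipped. std::string is used as a plain byte container,
// so embedded NUL bytes are fine.
std::vector<std::string> splitSections(const std::string& data,
                                       const std::string& delimiter) {
    std::vector<std::string> sections;
    if (delimiter.empty()) {
        if (!data.empty()) sections.push_back(data);
        return sections;
    }
    std::size_t start = 0;
    for (;;) {
        const std::size_t pos = data.find(delimiter, start);
        if (pos == std::string::npos) break;
        if (pos > start) sections.push_back(data.substr(start, pos - start));
        start = pos + delimiter.size();
    }
    // Keep any trailing bytes that were not terminated by a delimiter.
    if (start < data.size()) sections.push_back(data.substr(start));
    return sections;
}
```

One caveat: compressed output is arbitrary binary, so the delimiter bytes can occur inside a section by coincidence. Writing a length prefix in front of each section avoids that problem.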
It sounds like you are trying to keep something like a compressed log, appending small amounts of data each time. For that you can look at gzlog.h and gzlog.c for an example of how to do this.
You can also look at gzappend, which appends data to a gzip file.
These are all easily adaptable to a zlib stream.
I've seen a lot of examples of I/O with text files; I'm just wondering if you can do the same with other file types like MP3s, JPGs, zip files, etc.?
Will iostream and fstream work for all of these or do I need another library? Do I need a new sdk?
It's all binary data so I'd think it would be that simple. But I've been unpleasantly surprised before.
Could I convert all files to text or binary?
It depends on what you mean by "work".
You can think of those files as a book written in Greek.
If you want to just mess with binary representation (display text in Greek on screen) then yes, you can do that.
If you want to actually extract some info: edit video stream, remove voice from audio (actually understand what is written), then you would need to either parse file format yourself (learn Greek) or use some specialized library (hire a translator).
Either way, file streams are well suited to accessing the data in those files (and many libraries do use them under the hood).
You can work on binary streams by opening them with openmode binary :
ifstream ifs("mydata.mp3", ios_base::binary);
Then you can read and write any binary content. However, if you need to generate or modify such content, play a video or display a picture, then you need to know the inner details of the format you are using. This can be extremely complex, so a library would be recommended. And even with a library, advanced programming skills are required.
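As a rough sketch of the plain read/write part, the two helpers below copy raw bytes in and out of a file through streams opened in binary mode. The function names readBinaryFile and writeBinaryFile are invented for this example.

```cpp
#include <fstream>
#include <ios>
#include <string>
#include <vector>

// Reads a whole file as raw bytes, one 4 KiB chunk at a time.
// Returns an empty vector if the file cannot be opened.
std::vector<char> readBinaryFile(const std::string& path) {
    std::ifstream ifs(path, std::ios_base::binary);
    std::vector<char> bytes;
    char chunk[4096];
    // read() fails on the last partial chunk, so also check gcount().
    while (ifs.read(chunk, sizeof chunk) || ifs.gcount() > 0) {
        bytes.insert(bytes.end(), chunk, chunk + ifs.gcount());
    }
    return bytes;
}

// Writes raw bytes to a file; returns false if the stream reports an error.
bool writeBinaryFile(const std::string& path, const std::vector<char>& bytes) {
    std::ofstream ofs(path, std::ios_base::binary);
    ofs.write(bytes.data(), static_cast<std::streamsize>(bytes.size()));
    ofs.flush();
    return static_cast<bool>(ofs);
}
```

Binary mode matters mostly on Windows, where text mode translates line endings and can treat the byte 0x1A as end of file.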
Examples of open source libraries: FFmpeg for common audio/video formats, PortAudio for audio, CImg for image processing (in C++), libpng for the PNG graphics format, libjpeg for JPEG. Note that most libraries offer a C API.
Some operating systems also support some file types natively (for example, Windows bitmaps).
You can open these files using fstream, but the important thing to note is you must be intricately aware of what is contained within the file in order to process it.
If you just want to open it and spit out junk, then you can definitely just start at the first line of the file and exhaustively push all data into your console.
If you know what the file looks like on the inside, then you can process it just as you would any other file.
There may be specific libraries for processing specific files, but the fstream library will allow you to access any file you'd like.
All files are just bytes. There's nothing stopping you from reading/writing those bytes however you see fit.
The trick is doing something useful with those bytes. You could read the bytes from a .jpg file, for example, but you have to know what those bytes mean, and that's complicated. Usually it's best to use libraries written by people who know about the format in question, and let them deal with that complexity.
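A very small example of "knowing what the bytes mean" is recognizing a format from its first few bytes. The sketch below checks a handful of well-known signatures; the function name guessFormat and the returned labels are made up for this example, and real format detection needs far more than this.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// True if `data` begins with the bytes in `magic`.
static bool startsWith(const std::vector<unsigned char>& data,
                       const std::vector<unsigned char>& magic) {
    if (data.size() < magic.size()) return false;
    for (std::size_t i = 0; i < magic.size(); ++i) {
        if (data[i] != magic[i]) return false;
    }
    return true;
}

// Guesses a file format from its leading bytes. Only a few signatures are
// covered; anything else is reported as "unknown".
std::string guessFormat(const std::vector<unsigned char>& head) {
    if (startsWith(head, {0xFF, 0xD8, 0xFF})) return "jpeg";
    if (startsWith(head, {0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A})) return "png";
    if (startsWith(head, {'P', 'K', 0x03, 0x04})) return "zip";
    if (startsWith(head, {'I', 'D', '3'})) return "mp3 (ID3 tag)";
    return "unknown";
}
```

Reading the first dozen bytes of a file with fstream and passing them to guessFormat is enough to tell these formats apart, but everything after the signature is where the format-specific libraries earn their keep.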
I have a rather large ZIP file, which gets downloaded (cannot change the file). The quest now is to unzip the file while it is downloading instead of having to wait till the central directory end is received.
Does such a library exist?
I wrote "pinch" a while back. It's in Objective-C but the method to decode files from a zip might be a way to get it in C++? Yeah, some coding will be necessary.
http://forrst.com/posts/Now_in_ObjC_Pinch_Retrieve_a_file_from_inside-I54
https://github.com/epatel/pinch-objc
I'm not sure such a library exists. Unless you are on a very fast line [or have a very slow processor], it's unlikely to save you a huge amount of time. Decompressing several gigabytes only takes a few seconds if all the data is in ram [it may then take a while to write the uncompressed data to the disk, and loading it from the disk may add to the total time].
However, assuming the sending end supports "range" downloading, you could possibly write something that downloads the directory first [by reading the fixed header first, then reading the directory and then downloading the rest of the file from start to finish]. Presumably that's how "pinch" linked in epatel's answer works.
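To make the "read the fixed header first" step concrete: the zip format ends with an "end of central directory" record that says where the directory starts and how big it is. Here is a sketch that locates that record in the last bytes of the file, assuming you have already fetched that tail (for example with a range request, which is not shown). The names CentralDirLocation and findCentralDirectory are made up for this example, and ZIP64 archives are not handled.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Where the central directory lives, as stated by the end-of-central-
// directory (EOCD) record. Names are hypothetical, for this sketch only.
struct CentralDirLocation {
    std::uint16_t entryCount;  // total number of entries in the archive
    std::uint32_t size;        // size of the central directory in bytes
    std::uint32_t offset;      // offset of the directory from the file start
};

static std::uint16_t le16(const unsigned char* p) {
    return static_cast<std::uint16_t>(p[0] | (p[1] << 8));
}

static std::uint32_t le32(const unsigned char* p) {
    return static_cast<std::uint32_t>(p[0]) |
           (static_cast<std::uint32_t>(p[1]) << 8) |
           (static_cast<std::uint32_t>(p[2]) << 16) |
           (static_cast<std::uint32_t>(p[3]) << 24);
}

// Scans `tail` (the last bytes of the zip file) backwards for the EOCD
// record. Assumes the record's comment runs exactly to the end of the file.
bool findCentralDirectory(const std::vector<unsigned char>& tail,
                          CentralDirLocation& out) {
    const std::size_t kMinEocd = 22;  // EOCD size without a comment
    if (tail.size() < kMinEocd) return false;
    for (std::size_t i = tail.size() - kMinEocd + 1; i-- > 0;) {
        const unsigned char* p = tail.data() + i;
        if (le32(p) != 0x06054b50u) continue;  // bytes "PK\x05\x06"
        const std::size_t commentLen = le16(p + 20);
        if (i + kMinEocd + commentLen != tail.size()) continue;
        out.entryCount = le16(p + 10);
        out.size = le32(p + 12);
        out.offset = le32(p + 16);
        return true;
    }
    return false;
}
```

Since the comment can be up to 65535 bytes long, fetching the last 22 + 65535 bytes is always enough for a non-ZIP64 archive. The offset and size found this way tell you which byte range to request next to get the directory itself.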
I'm working on an application that must encrypt and zip files. So, I create some data in memory (text, binary or whatever), encrypt it and save it to disk (file1 and file2). Then I call e.g. "zip out.zip file1 file2".
I do not want to save these files to disk, but to create the zip immediately and pack these files from memory.
How should I do that?
Thanks a lot!
You could try to use the zlib library to be able to create zip files from memory buffers.
Boost.Iostreams could also be a good solution.
For zlib there is an extension for zip called minizip in the contrib directory. For minizip you can find code to work with in-memory buffers on the author's page:
Justin Fletcher wrote a very simple implementation of a memory access method for the ioapi code (ioapi_mem_c.zip).
Note that you must compress first and then encrypt. Encrypted data can't be compressed anymore.
Interestingly enough, I wasn't able to find a library to create ZIP files from C. zlib only allows you to (de-)compress individual entries in a ZIP archive.
It comes with contrib/minizip; maybe that can get you started.