C/C++ Arbitrary output size when decompressing GZIP - c++

I'm using this to decompress a GZIP compressed file "input.gz" into the uncompressed "output.file". It works wonderfully, except I need a fixed size for the buffer (in this case 1MB) and if the output becomes larger the bytes get cut off. Is there a way to get this to work with any output size?
#include "zlib.h"
#include <stdio.h>
int main()
{
char buf[1024*1024];
gzFile in = gzopen("input.gz","rb8");
int len = gzread(in,buf,sizeof(buf));
gzclose(in);
FILE* out = fopen("output.file", "wb");
fwrite(buf,1,len,out);
fclose(out);
return 0;
}

gzread works the same way as fread: consecutive calls to gzread just read more data from the file. I haven't tested the code, but this should work fine.
#include "zlib.h"
#include <stdio.h>
int main() {
    char buf[1024];
    gzFile in = gzopen("input.gz","rb8");
    FILE* out = fopen("output.file", "wb");
    if (!in || !out)
        return 1; // one of the files could not be opened
    int len;
    // gzread returns 0 at end of file and -1 on error, so stop on anything <= 0
    while ((len = gzread(in, buf, sizeof(buf))) > 0)
        fwrite(buf, 1, len, out);
    gzclose(in);
    fclose(out);
    return 0;
}

Related

How can I get the latest changes in a file using ifstream?

It's a real-time capture system. I need to get the latest changes from a file which is occasionally edited (mostly by appending content) by other applications.
In other words, how can I get content that was added while I have the file open, without reopening the file?
#include <iostream>
#include <fstream>
#include <string>
using namespace std;
int main(){
ifstream tfile("temp.txt",ios::in);
if(!tfile){
cout<<"open failed"<<endl;
return 0;
}
string str;
while(1){
if(tfile.eof())
continue;
getline(tfile,str);
cout<<str<<endl;
}
tfile.close();
}
C++ / C Solution
If you are looking for a c++ solution you can use the following functions that I had created a while back:
#include <cstdio>
#include <iostream>
#include <vector>
// For the sleep function
#ifdef _WIN32
#include <Windows.h>
#else
#include <unistd.h>
#endif
using namespace std;
void watchLogs(const char *FILENAME) {
    FILE *f = fopen(FILENAME, "r");
    if (!f) {
        perror(FILENAME);
        return;
    }
    // Bytes printed so far. Starting at 0 prints the current content of your log file first.
    // If you just want the updates, initialise it to the current length of the file instead.
    long size = 0;
    while (true) {
        clearerr(f);           // reset any end-of-file or error state left by the previous read
        fseek(f, 0, SEEK_END); // reach end of file
        long end = ftell(f);
        if (end > size) {      // the file has grown since the last check
            vector<char> buffer(end - size); // prepare a buffer for the update
            fseek(f, size, SEEK_SET);        // go back to the first byte not printed yet
            size_t n = fread(buffer.data(), 1, buffer.size(), f); // copy to buffer
            cout.write(buffer.data(), n);
            cout << endl;
            size = end;        // remember how much of the file has been printed
        }
        // Updates are checked every 3 seconds to avoid running the CPU at full speed.
        // You could check every minute or every second, up to you.
#ifdef _WIN32
        Sleep(3000);
#else
        sleep(3);
#endif
    }
    // The loop runs until the program is terminated, so there is no fclose here.
}
And you can test it with:
int main(int argc, char **argv) {
if (argc < 2)
return 1;
const char *FILENAME = argv[1];
watchLogs(FILENAME);
return 0;
}
./a.out mysql_binary.log
I could have used stringstream, but I like that this version would also work with C files with some minor tweaks (can't use string).
I hope you will find it helpful!
NB: This assumes that your file will only grow and that the changes will be appended to the end of your file.
NB2: This program is not segfault proof; you may want to check the return values of the stdio calls (fopen, fseek, ftell, etc.).
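Since the question asks about ifstream, the same polling idea can be sketched with a stream instead of a FILE*. This is a minimal sketch, assuming the file only grows by appending; the helper name read_new_lines is hypothetical. The key step is calling clear() once getline has hit end-of-file, which resets eofbit and failbit so that the next read can see data appended in the meantime.

```cpp
#include <cassert>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: returns every line that has become available since the
// previous call, then re-arms the stream for the next poll.
std::vector<std::string> read_new_lines(std::ifstream &in) {
    assert(in.is_open());
    std::vector<std::string> lines;
    std::string line;
    while (std::getline(in, line)) {
        lines.push_back(line);
    }
    // getline stopped at end-of-file, which left eofbit/failbit set.
    // Clearing the state lets later reads pick up appended content.
    in.clear();
    return lines;
}
```

A polling loop would call read_new_lines(tfile), print whatever it returns, then sleep for a while before the next call. One caveat: if the writer has flushed only part of a line, that fragment is returned as a line of its own and the remainder arrives on a later call.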
Inotify
If you use Linux you could also potentially go for inotify:
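Install inotify-tools: sudo apt-get install -y inotify-tools
Then create the following script mywatch.sh
while inotifywait -e close_write "$1"; do ./"$1"; done
Note that the loop body ./"$1" executes the watched file itself, which only makes sense if that file is a script; for a log file, replace it with the command you want to run on each change.
Give permission to execute:
chmod +x mywatch.sh
and call it with ./mywatch.sh mysql_binary.log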

C++ reading a file prints nothing to console

I'm having trouble printing the contents of a file to console.
file.bin contents are "abc".
data holds value, but it just doesn't print it...
#include <Windows.h>
#include <iostream>
int main()
{
wchar_t *data;
FILE* file;
int err = _wfopen_s(&file, L"file.bin", L"rb");
if (err != 0)
{
std::cout << "Error";
return 0;
}
fseek(file, 0, SEEK_END);
long lSize;
lSize = ftell(file);
rewind(file);
data = (wchar_t *)malloc(lSize + 1);
fread(data, 1, lSize, file);
//dereference pointer
wchar_t data2 = *data;
std::wcout << data2; // prints nothing...
system("PAUSE");
return 0;
}
EDIT
I know about fstream but I would really prefer C style opening/reading files.
#include <fstream>
#include <string>
#include <iostream>
int main()
{
std::ifstream ifs("file.bin");
std::string content( (std::istreambuf_iterator<char>(ifs) ),
(std::istreambuf_iterator<char>() ) );
std::cout<<content;
return 0;
}
Use std::ifstream if you're using C++. You're making this much more complicated than you need to. See this former answer.

Extract hexdump or RAW data of a file to text

I was wondering if there is a way to output the hexdump or raw data of a file to txt file.
for example
I have a file let's say "data.jpg" (the file type is irrelevant) how can I export the HEXdump (14ed 5602 etc) to a file "output.txt"?
also how I can I specify the format of the output for example, Unicode or UTF?
in C++
You can use a loop, fread and fprintf: with fread you get the values of the bytes, then with fprintf you can use %x to print them in hexadecimal to a file.
http://www.cplusplus.com/reference/clibrary/cstdio/fread/
http://www.cplusplus.com/reference/clibrary/cstdio/fprintf/
If you want this to be fast, load whole machine words (int or long long) instead of single bytes, keeping in mind that printing a whole word with %x follows the machine's byte order. If you want it to be even faster, fread a whole array, sprintf the whole array into a text buffer, then write that buffer to the file in one go.
Maybe something like this?
#include <sstream>
#include <iostream>
#include <iomanip>
#include <iterator>
#include <algorithm>
int main()
{
std::stringstream buffer( "testxzy" );
std::istreambuf_iterator<char> it( buffer.rdbuf( ) );
std::istreambuf_iterator<char> end; // eof
std::cout << std::hex << std::showbase;
std::copy(it, end, std::ostream_iterator<int>(std::cout));
std::cout << std::endl;
return 0;
}
You just have to replace buffer with an ifstream that reads the binary file, and write the output to a textfile using an ofstream instead of cout.
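As a rough illustration of that substitution, here is a sketch that reads through an ifstream and writes through an ofstream. The helper name hex_dump_stream and the space-separated two-digit format are assumptions made for the example. It also converts each char through unsigned char, since on platforms where char is signed, bytes of 0x80 and above would otherwise be printed as negative integers.

```cpp
#include <fstream>
#include <iomanip>
#include <iterator>

// Hypothetical helper: hex-dumps input_path into the text file output_path.
// Returns false if either file could not be opened.
bool hex_dump_stream(const char *input_path, const char *output_path) {
    std::ifstream in(input_path, std::ios::binary);
    std::ofstream out(output_path);
    if (!in || !out) return false;
    out << std::hex << std::setfill('0');
    std::istreambuf_iterator<char> it(in), end; // default-constructed iterator marks eof
    for (; it != end; ++it) {
        // setw applies to the next output only, so it is set for every byte
        out << std::setw(2) << static_cast<int>(static_cast<unsigned char>(*it)) << ' ';
    }
    return true;
}
```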
This is pretty old -- if you want Unicode, you'll have to add that yourself.
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char **argv) {
unsigned long offset = 0;
FILE *input;
int bytes, i, j;
unsigned char buffer[16];
char outbuffer[60];
if ( argc < 2 ) {
fprintf(stderr, "\nUsage: dump filename [filename...]");
return EXIT_FAILURE;
}
for (j=1;j<argc; ++j) {
if ( NULL ==(input=fopen(argv[j], "rb")))
continue;
printf("\n%s:\n", argv[j]);
while (0 < (bytes=fread(buffer, 1, 16, input))) {
sprintf(outbuffer, "%8.8lx: ", offset); /* offset of the first byte on this line */
offset += bytes;
for (i=0;i<bytes;i++) {
sprintf(outbuffer+10+3*i, "%2.2X ",buffer[i]);
if (!isprint(buffer[i]))
buffer[i] = '.';
}
printf("%-60s %*.*s\n", outbuffer, bytes, bytes, buffer);
}
fclose(input);
}
return 0;
}

How to compress a buffer with zlib?

There is a usage example at the zlib website: http://www.zlib.net/zlib_how.html
However, in the example they are compressing a file. I would like to compress binary data stored in a buffer in memory. I don't want to save the compressed buffer to disk either.
Basically here is my buffer:
fIplImageHeader->imageData = (char*)imageIn->getFrame();
How can I compress it with zlib?
I would appreciate some code example of how to do that.
zlib.h has all the functions you need: compress (or compress2) and uncompress. See the source code of zlib for an answer.
ZEXTERN int ZEXPORT compress OF((Bytef *dest, uLongf *destLen, const Bytef *source, uLong sourceLen));
/*
Compresses the source buffer into the destination buffer. sourceLen is
the byte length of the source buffer. Upon entry, destLen is the total size
of the destination buffer, which must be at least the value returned by
compressBound(sourceLen). Upon exit, destLen is the actual size of the
compressed buffer.
compress returns Z_OK if success, Z_MEM_ERROR if there was not
enough memory, Z_BUF_ERROR if there was not enough room in the output
buffer.
*/
ZEXTERN int ZEXPORT uncompress OF((Bytef *dest, uLongf *destLen, const Bytef *source, uLong sourceLen));
/*
Decompresses the source buffer into the destination buffer. sourceLen is
the byte length of the source buffer. Upon entry, destLen is the total size
of the destination buffer, which must be large enough to hold the entire
uncompressed data. (The size of the uncompressed data must have been saved
previously by the compressor and transmitted to the decompressor by some
mechanism outside the scope of this compression library.) Upon exit, destLen
is the actual size of the uncompressed buffer.
uncompress returns Z_OK if success, Z_MEM_ERROR if there was not
enough memory, Z_BUF_ERROR if there was not enough room in the output
buffer, or Z_DATA_ERROR if the input data was corrupted or incomplete. In
the case where there is not enough room, uncompress() will fill the output
buffer with the uncompressed data up to that point.
*/
This is an example to pack a buffer with zlib and save the compressed contents in a vector.
void compress_memory(void *in_data, size_t in_data_size, std::vector<uint8_t> &out_data)
{
std::vector<uint8_t> buffer;
const size_t BUFSIZE = 128 * 1024;
uint8_t temp_buffer[BUFSIZE];
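z_stream strm;
// zalloc, zfree and opaque must all be initialized before calling deflateInit
strm.zalloc = Z_NULL;
strm.zfree = Z_NULL;
strm.opaque = Z_NULL;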
strm.next_in = reinterpret_cast<uint8_t *>(in_data);
strm.avail_in = in_data_size;
strm.next_out = temp_buffer;
strm.avail_out = BUFSIZE;
deflateInit(&strm, Z_BEST_COMPRESSION);
while (strm.avail_in != 0)
{
int res = deflate(&strm, Z_NO_FLUSH);
assert(res == Z_OK);
if (strm.avail_out == 0)
{
buffer.insert(buffer.end(), temp_buffer, temp_buffer + BUFSIZE);
strm.next_out = temp_buffer;
strm.avail_out = BUFSIZE;
}
}
int deflate_res = Z_OK;
while (deflate_res == Z_OK)
{
if (strm.avail_out == 0)
{
buffer.insert(buffer.end(), temp_buffer, temp_buffer + BUFSIZE);
strm.next_out = temp_buffer;
strm.avail_out = BUFSIZE;
}
deflate_res = deflate(&strm, Z_FINISH);
}
assert(deflate_res == Z_STREAM_END);
buffer.insert(buffer.end(), temp_buffer, temp_buffer + BUFSIZE - strm.avail_out);
deflateEnd(&strm);
out_data.swap(buffer);
}
You can easily adapt the example by replacing fread() and fwrite() calls with direct pointers to your data. For zlib compression (referred to as deflate as you "take out all the air of your data") you allocate z_stream structure, call deflateInit() and then:
1. fill next_in with the next chunk of data you want to compress
2. set avail_in to the number of bytes available in next_in
3. set next_out to where the compressed data should be written, which should usually be a pointer inside your buffer that advances as you go along
4. set avail_out to the number of bytes available in next_out
5. call deflate
6. repeat steps 3-5 until avail_out is non-zero (i.e. there's more room in the output buffer than zlib needs - no more data to write)
7. repeat steps 1-6 while you have data to compress
Eventually you call deflateEnd() and you're done.
You're basically feeding it chunks of input and output until you're out of input and it is out of output.
The classic way, made more convenient with C++ features
Here's a full example which demonstrates compression and decompression using C++ std::vector objects:
#include <cstdio>
#include <cstdlib> // malloc, free, EXIT_SUCCESS
#include <iostream>
#include <vector>
#include <zconf.h>
#include <zlib.h>
#include <iomanip>
#include <cassert>
void add_buffer_to_vector(std::vector<char> &vector, const char *buffer, uLongf length) {
for (int character_index = 0; character_index < length; character_index++) {
char current_character = buffer[character_index];
vector.push_back(current_character);
}
}
int compress_vector(std::vector<char> source, std::vector<char> &destination) {
unsigned long source_length = source.size();
uLongf destination_length = compressBound(source_length);
char *destination_data = (char *) malloc(destination_length);
if (destination_data == nullptr) {
return Z_MEM_ERROR;
}
Bytef *source_data = (Bytef *) source.data();
int return_value = compress2((Bytef *) destination_data, &destination_length, source_data, source_length,
Z_BEST_COMPRESSION);
add_buffer_to_vector(destination, destination_data, destination_length);
free(destination_data);
return return_value;
}
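int decompress_vector(std::vector<char> source, std::vector<char> &destination) {
// The zlib stream does not record the uncompressed size, and compressBound() only
// bounds the compressed size, so start from a guess and grow the buffer until it fits.
uLongf capacity = 4 * source.size() + 64;
while (true) {
char *destination_data = (char *) malloc(capacity);
if (destination_data == nullptr) {
return Z_MEM_ERROR;
}
uLongf destination_length = capacity;
Bytef *source_data = (Bytef *) source.data();
int return_value = uncompress((Bytef *) destination_data, &destination_length, source_data, source.size());
if (return_value == Z_OK) {
add_buffer_to_vector(destination, destination_data, destination_length);
}
free(destination_data);
if (return_value != Z_BUF_ERROR) {
return return_value;
}
capacity *= 2; // Z_BUF_ERROR: the output buffer was too small, retry with a larger one
}
}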
void add_string_to_vector(std::vector<char> &uncompressed_data,
const char *my_string) {
int character_index = 0;
while (true) {
char current_character = my_string[character_index];
uncompressed_data.push_back(current_character);
if (current_character == '\00') {
break;
}
character_index++;
}
}
// https://stackoverflow.com/a/27173017/3764804
void print_bytes(std::ostream &stream, const unsigned char *data, size_t data_length, bool format = true) {
stream << std::setfill('0');
for (size_t data_index = 0; data_index < data_length; ++data_index) {
stream << std::hex << std::setw(2) << (int) data[data_index];
if (format) {
stream << (((data_index + 1) % 16 == 0) ? "\n" : " ");
}
}
stream << std::endl;
}
void test_compression() {
std::vector<char> uncompressed(0);
auto *my_string = (char *) "Hello, world!";
add_string_to_vector(uncompressed, my_string);
std::vector<char> compressed(0);
int compression_result = compress_vector(uncompressed, compressed);
assert(compression_result == Z_OK);
std::vector<char> decompressed(0);
int decompression_result = decompress_vector(compressed, decompressed);
assert(decompression_result == Z_OK);
printf("Uncompressed: %s\n", uncompressed.data());
printf("Compressed: ");
std::ostream &standard_output = std::cout;
print_bytes(standard_output, (const unsigned char *) compressed.data(), compressed.size(), false);
printf("Decompressed: %s\n", decompressed.data());
}
In your main.cpp simply call:
int main(int argc, char *argv[]) {
test_compression();
return EXIT_SUCCESS;
}
The output produced:
Uncompressed: Hello, world!
Compressed: 78daf348cdc9c9d75128cf2fca495164000024e8048a
Decompressed: Hello, world!
The Boost way
#include <iostream>
#include <sstream>
#include <string>
#include <boost/iostreams/filtering_streambuf.hpp>
#include <boost/iostreams/copy.hpp>
#include <boost/iostreams/device/array.hpp>
#include <boost/iostreams/filter/zlib.hpp>
std::string compress(const std::string &data) {
boost::iostreams::filtering_streambuf<boost::iostreams::output> output_stream;
output_stream.push(boost::iostreams::zlib_compressor());
std::stringstream string_stream;
output_stream.push(string_stream);
boost::iostreams::copy(boost::iostreams::basic_array_source<char>(data.c_str(),
data.size()), output_stream);
return string_stream.str();
}
std::string decompress(const std::string &cipher_text) {
std::stringstream string_stream;
string_stream << cipher_text;
boost::iostreams::filtering_streambuf<boost::iostreams::input> input_stream;
input_stream.push(boost::iostreams::zlib_decompressor());
input_stream.push(string_stream);
std::stringstream unpacked_text;
boost::iostreams::copy(input_stream, unpacked_text);
return unpacked_text.str();
}
TEST_CASE("zlib") {
std::string plain_text = "Hello, world!";
const auto cipher_text = compress(plain_text);
const auto decompressed_plain_text = decompress(cipher_text);
REQUIRE(plain_text == decompressed_plain_text);
}
This is not a direct answer to your question about the zlib API, but you may be interested in the boost::iostreams library paired with zlib.
It lets you use zlib-driven packing algorithms with ordinary stream notation: your data can be compressed simply by opening a memory stream and doing the << data operation on it.
With boost::iostreams, the corresponding packing filter is invoked automatically for all data that passes through the stream.

How to write to a memory buffer with a FILE*?

Is there any way to create a memory buffer as a FILE*? TiXml can print the XML to a FILE*, but I can't seem to make it print to a memory buffer.
There is a POSIX way to use memory as a FILE descriptor: fmemopen or open_memstream, depending on the semantics you want: Difference between fmemopen and open_memstream
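To make the two options concrete, here is a hedged sketch of both. The helper names capture_to_string and print_into are hypothetical, and the print callback stands in for any API that only accepts a FILE*. open_memstream grows its buffer on demand, while fmemopen writes into a fixed buffer supplied by the caller. Both are POSIX functions, so this assumes a POSIX platform such as Linux.

```cpp
#include <cstdio>  // open_memstream and fmemopen are POSIX extensions declared in <stdio.h>
#include <cstdlib>
#include <string>

// Hypothetical helper: runs print against an in-memory FILE* and returns what it wrote.
template <typename PrintFn>
std::string capture_to_string(PrintFn print) {
    char *data = nullptr;
    std::size_t size = 0;
    FILE *f = open_memstream(&data, &size); // the buffer grows automatically as needed
    if (!f) return std::string();
    print(f);
    std::fclose(f);                 // data and size are final after fclose (or fflush)
    std::string result(data, size);
    std::free(data);                // the caller owns the buffer allocated by the stream
    return result;
}

// Fixed-size alternative: fmemopen writes into a caller-supplied buffer.
bool print_into(char *buffer, std::size_t capacity) {
    FILE *f = fmemopen(buffer, capacity, "w");
    if (!f) return false;
    std::fprintf(f, "Number %d\n", 42);
    std::fclose(f);                 // flushing appends a NUL byte if there is room for it
    return true;
}
```

With a helper like this, the call would be along the lines of capture_to_string([&](FILE *f) { doc.Print(f); }), where doc is whatever object offers a Print(FILE*) style method.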
I guess the proper answer is that by Kevin. But here is a hack to do it with FILE *. Note that if the buffer size (here 100000) is too small then you lose data, as it is written out when the buffer is flushed. Also, if the program calls fflush() you lose the data.
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char **argv)
{
FILE *f = fopen("/dev/null", "w");
int i;
int written = 0;
char *buf = malloc(100000);
setbuffer(f, buf, 100000);
for (i = 0; i < 1000; i++)
{
written += fprintf(f, "Number %d\n", i);
}
for (i = 0; i < written; i++) {
printf("%c", buf[i]);
}
}
fmemopen can create FILE from buffer, does it make any sense to you?
I wrote a simple example of how I would create an in-memory FILE:
#include <unistd.h>
#include <stdio.h>
int main(){
int p[2]; pipe(p); FILE *f = fdopen( p[1], "w" );
if( !fork() ){
fprintf( f, "working" );
return 0;
}
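fclose(f); /* also closes the parent's copy of the write end p[1] */
char buff[100]; int len;
while( (len=read(p[0], buff, 100))>0 )
printf(" from child: '%.*s'", len, buff ); /* %.*s prints at most len bytes; buff is not NUL-terminated */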
puts("");
}
C++ basic_streambuf inheritance
In C++, you should avoid FILE* if you can.
Using only the C++ stdlib, it is possible to make a single interface that transparently uses file or memory IO.
This uses techniques mentioned at: Setting the internal buffer used by a standard stream (pubsetbuf)
#include <cassert>
#include <cstring>
#include <fstream>
#include <iostream>
#include <ostream>
#include <sstream>
/* This can write either to files or memory. */
void write(std::ostream& os) {
os << "abc";
}
template <typename char_type>
struct ostreambuf : public std::basic_streambuf<char_type, std::char_traits<char_type> > {
ostreambuf(char_type* buffer, std::streamsize bufferLength) {
this->setp(buffer, buffer + bufferLength);
}
};
int main() {
/* To memory, in our own externally supplied buffer. */
{
char c[3];
ostreambuf<char> buf(c, sizeof(c));
std::ostream s(&buf);
write(s);
assert(memcmp(c, "abc", sizeof(c)) == 0);
}
/* To memory, but in a hidden buffer. */
{
std::stringstream s;
write(s);
assert(s.str() == "abc");
}
/* To file. */
{
std::ofstream s("a.tmp");
write(s);
s.close();
}
/* I think this is implementation defined.
* pubsetbuf calls basic_filebuf::setbuf(). */
{
char c[3];
std::ofstream s;
s.rdbuf()->pubsetbuf(c, sizeof c);
write(s);
s.close();
//assert(memcmp(c, "abc", sizeof(c)) == 0);
}
}
Unfortunately, it does not seem possible to interchange FILE* and fstream: Getting a FILE* from a std::fstream
You could use the CStr method of TiXmlPrinter, of which the documentation states:
The TiXmlPrinter is useful when you need to:
Print to memory (especially in non-STL mode)
Control formatting (line endings, etc.)
https://github.com/Snaipe/fmem is a wrapper for different platform/version specific implementations of memory streams
It tries in sequence the following implementations:
open_memstream.
fopencookie, with growing dynamic buffer.
funopen, with growing dynamic buffer.
WinAPI temporary memory-backed file.
When no other means is available, fmem falls back to tmpfile().