zlib's compress function is not doing anything. Why? - c++

before = new unsigned char[mSizeNeeded*4];
uLong value = compressBound(mSizeNeeded*4);
after = new unsigned char[value];
compress(after, &value, before, mSizeNeeded*4);
fwrite(&after, 1, value, file);
'before' has a bunch of audio data stored into it and I am trying to compress it and store it into 'after'. I then write it into a file. The file is the same size as the original file, it also contains the same data that was in before (as far as I can tell).
Compress also returns OK so I know that the compression is not failing.
Okay, so it looks like my only problem is somewhere in the compression (I think). I am able to run compress and then I can uncompress and get the correct data out. Also, it is writing into the file and fwrite returns 561152 but the count (value) is 684964. So it looks like something is wrong with fwrite. I looked more carefully and the after data is different than the before data.
561152 is the same size as the original audio data in a .wav file that I have (stripped of the .wav headers of course).

Based on your original text:
fwrite (&before, ...
I am trying to compress it and store it into 'after'. I then write it into a file.
I think not. You are writing the original data to the file; you should probably be writing after instead.
The other thing you should get in the habit of doing is checking return values from functions that you care about. In other words, compress() will tell you if a problem occurs yet you seem to be totally ignoring the possibility.
Similarly, fwrite() also uses its return value to indicate whether it was successful or not. Since you haven't included the code showing how that's set up, this is also a distinct possibility. In particular fwrite is under no obligation to write your entire block to the file in one hit (device may be full, etc), that's why it has a return value, so you can detect and adjust for that situation. Often, a better option than:
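fwrite (after, 1, value, file);
is:
fwrite (after, value, 1, file);
since the latter will always give you one for a fully successful write and something else for a failure of some description. Note also that after is already a pointer to the buffer, so pass after rather than &after: &after is the address of the pointer variable itself, not of the compressed data.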
That would be my first step in establishing where the problem lies.
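A minimal sketch of that return-value check, assuming the compressed data sits in a plain byte buffer. The helper name write_block is hypothetical and not part of zlib or the C library. It passes the buffer pointer itself and writes the block as a single item, so anything other than 1 from fwrite means failure:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Hypothetical helper: writes `size` bytes from `data` as ONE item,
// so fwrite returns 1 on a complete write and 0 otherwise.
bool write_block(std::FILE* file, const unsigned char* data, std::size_t size) {
    assert(file != nullptr && data != nullptr);
    if (size == 0) {
        return true;  // nothing to write
    }
    // Pass the buffer itself (data), not the address of the pointer variable.
    return std::fwrite(data, size, 1, file) == 1;
}
```

With the names from the question, the call would look like write_block(file, after, value), with the result checked before assuming the file is complete.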
On top of that, there are numerous other (generally-applicable) methods you can use to track down the issue, such as:
outputting all variables after they change or are set (like the return values of functions, after, before, value and so on).
delete the output file before running your program, to ensure it's created afresh.
run the code through a debugger so you can see what's happening under the covers.
clearing after to all zero bytes (or a known pattern) to ensure you don't get stale data in there.
And, as a final approach (given that the zlib source code is freely available), you can also modify (or debug into) it so that you can clearly see what's going on under the covers.

Get raw buffer for in-memory dataset in GDAL C++ API

I have generated a GeoTiff dataset in-memory using GDALTranslate() with a /vsimem/ filepath. I need access to the buffer for the actual GeoTiff file to put it in a stream for an external API. My understanding is that this should be possible with VSIGetMemFileBuffer(), however I can't seem to get this to return anything other than nullptr.
My code is essentially as follows:
//^^ GDALDataset* srcDataset created somewhere up here ^^
//psOptions struct has "-b 4" and "-of GTiff" settings.
const char* filep = "/vsimem/foo.tif";
GDALDataset* gtiffData = GDALTranslate(filep, srcDataset, psOptions, nullptr);
vsi_l_offset size = 0;
GByte* buf = VSIGetMemFileBuffer(filep, &size, true); //<-- returns nullptr
gtiffData seems to be a real dataset on inspection, it has all the appropriate properties (number of bands, raster size, etc). When I provide a real filesystem location to GDALTranslate() rather than the /vsimem/ path and load it up in QGIS it renders correctly too.
Looking at the source for VSIGetMemFileBuffer(), this should really only be returning nullptr if the file can't be found. This suggests I'm using it incorrectly. Does anyone know what the correct usage is?
Bonus points: Is there a better way to do this (stream the file out)?
Thanks!
I don't know anything about the C++ API. But in Python, the snippet below is what I sometimes use to get the contents of an in-mem file. In my case mainly VRTs, but it shouldn't be any different for other formats.
But as said, I don't know if the VSI API translates one-to-one to C++.
from osgeo import gdal
filep = "/vsimem/foo.tif"
# get the file size
stat = gdal.VSIStatL(filep, gdal.VSI_STAT_SIZE_FLAG)
# open file
vsifile = gdal.VSIFOpenL(filep, 'r')
# read entire contents
vsimem_content = gdal.VSIFReadL(1, stat.size, vsifile)
In the case of a VRT the content would be text, shown with something like print(vsimem_content.decode()). For a tiff it would of course be binary data.
I came back to this after putting in a workaround, and upon swapping things back over it seems to work fine. @mmomtchev suggested looking at the CPL_DEBUG output, which showed nothing unusual (and was silent during the actual VSIGetMemFileBuffer call).
In particular, for other reasons I had to put a GDALWarp call in between calling GDALTranslate and accessing the buffer, and it seems that this is what makes the difference. My guess is that GDALWarp is calling VSIFOpenL internally - although I can't find this in the source - and this does some kind of initialisation for VSIGetMemFileBuffer. Something to try for anyone else who encounters this.

How to reduce the size of a fstream file in C++

What is the best way to cut the end off of a fstream file in C++11?
I am writing a data persistence class to store audio for my audio editor. I have chosen to use fstream (possibly a bad idea) to create a random access binary read write file.
Each time I record a little sound into my file I simply tack it onto the end of this file. Another internal data structure / file, contains pointers into the audio file and keeps track of edits.
When I undo a recording action and then do something else, the last bit of the audio file becomes irrelevant. It is not referenced in the current state of the document and you cannot redo yourself back to a state where you can ever see it again. So I want to chop this part of the file off and start recording at the new end. I don’t need to cut out bits in the middle, just off the end.
When the user quits this file will remain and be reloaded when they open the project up again.
In my application I expect the user to do this all the time and being able to do this might save me as much as 30% of the file size. This file will be long, potentially very, very long, so rewriting it to another file every time this happens is not a viable option.
Rewriting it when the user saves could be an option but it is still not that attractive.
I could stick a value at the start that says how long the file is supposed to be and then overwrite the end to recycle the space. But if I wanted to continually update the data store file in case of a crash, this would mean rewriting the start over and over again. I worry that this might be bad for flash drives. I could also recompute the end of the useful part of the file on load by analyzing the pointer file, but in the meantime I would potentially be wasting all that space, and that is complicated.
Is there a simple call for this in the fstream API?
Am I using the wrong library? Note that I want to stick to something generic, preferably the STL, so I can keep the code as cross-platform as possible.
I can’t seem to find it in the documentation and have looked for many hours. It is not the end of the earth but would make this a little simpler and potentially more efficient. Maybe I am just missing it somehow.
Thanks for your help
Andre’
Is there a simple call for this in the fstream API?
If you have C++17 compiler then use std::filesystem::resize_file. In previous standards there was no such thing in standard library.
With older compilers, on Windows you can use SetFilePointer or SetFilePointerEx to set the current position to the size you want, then call SetEndOfFile. On Unixes you can use truncate or ftruncate. If you want portable code, you can use Boost.Filesystem. It is also the simplest option to migrate to std::filesystem from later, because std::filesystem was mostly specified based on it.
If you have a variable that contains your current position in the file, you can seek back by the length of your "unneeded chunk" and just continue writing from there.
// Somewhere at the beginning of your code:
std::ofstream *file = new std::ofstream();
file->open("/home/user/my-audio/my-file.dat");
// ...... long story of writing data .......
// Let's say we are at the one-millionth byte now (in the file)
int current_file_pos = 1000000;
// Your last chunk size:
int last_chunk_size = 12345;
// Your chunk, that you are saving
char *last_chunk = get_audio_chunk_to_save();
// Writing chunk
file->write(last_chunk, last_chunk_size);
// Moving pointer:
current_file_pos += last_chunk_size;
// Let's undo it now!
current_file_pos -= last_chunk_size;
file->seekp(current_file_pos);
// Now you can write new chunks from the place where you were before writing and undoing the last one!
// .....
// When you want to finally write the file to disk, you just close it
file->close();
// And then truncate it to the size of current_file_pos
truncate("/home/user/my-audio/my-file.dat", current_file_pos);
Unfortunately, you'll have to write a cross-platform truncate function that calls SetEndOfFile on Windows and truncate on Linux. That is easy enough using preprocessor macros.
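One possible shape for such a wrapper, as a sketch only. The name truncate_file is hypothetical (chosen so it does not clash with the POSIX truncate), and the Windows branch has not been exercised here; it simply follows the SetFilePointerEx then SetEndOfFile sequence mentioned above:

```cpp
#include <cassert>

#ifdef _WIN32
#include <windows.h>
#else
#include <sys/types.h>
#include <unistd.h>
#endif

// Hypothetical wrapper: shrinks (or extends) the file at `path` to `size` bytes.
// The file should be closed before calling this.
bool truncate_file(const char* path, long long size) {
    assert(path != nullptr && size >= 0);
#ifdef _WIN32
    HANDLE h = CreateFileA(path, GENERIC_WRITE, 0, nullptr, OPEN_EXISTING,
                           FILE_ATTRIBUTE_NORMAL, nullptr);
    if (h == INVALID_HANDLE_VALUE) {
        return false;
    }
    LARGE_INTEGER pos;
    pos.QuadPart = size;
    // Move the file pointer to the new end, then cut the file there.
    const bool ok = SetFilePointerEx(h, pos, nullptr, FILE_BEGIN) && SetEndOfFile(h);
    CloseHandle(h);
    return ok;
#else
    return ::truncate(path, static_cast<off_t>(size)) == 0;
#endif
}
```

In the example above, the final call would become truncate_file("/home/user/my-audio/my-file.dat", current_file_pos).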

Append to a JSON array in a JSON file on disk, every second using C++

This is my first post here, so please bear with me.
I have searched high and low on the internet for an answer, but I've not been able to resolve my issue, so I have decided to write a post here.
I am trying to write (append) to a JSON array on file using C++ and JZON, at intervals of one write each second. The JSON file is initially written by a “Prepare” function. Another function is then called each second to add an array to the JSON file and append a new object to the array.
I have tried many things, most of which resulted in all sorts of issues. My latest attempt gave me the best results and this is the code that I have included below. However, the approach I took is very inefficient as I am writing an entire array every second. This is having a massive hit on CPU utilisation as the array grows, but not so much on memory as I had first anticipated.
What I really would like to be able to do is to append to an existing array contained in a JSON file on disk, line by line, rather than having to clear the entire array from the JSON object and rewriting the entire file, each and every second.
I am hoping that some of the geniuses on this website will be able to point me in the right direction.
Thank you very much in advance.
Here is my code:
//Create some object somewhere at the top of the cpp file
Jzon::Object jsonFlight;
Jzon::Array jsonFlightPath;
Jzon::Object jsonCoordinates;
int PrepareFlight(const char* jsonfilename) {
//...SOME PREPARE FUNCTION STUFF GOES HERE...
//Add the Flight Information to the jsonFlight root JSON Object
jsonFlight.Add("Flight Number", flightnum);
jsonFlight.Add("Origin", originicao);
jsonFlight.Add("Destination", desticao);
jsonFlight.Add("Pilot in Command", pic);
//Write the jsonFlight object to a .json file on disk. Filename is passed in as a param of the function.
Jzon::FileWriter::WriteFile(jsonfilename, jsonFlight, Jzon::NoFormat);
return 0;
}
int UpdateJSON_FlightPath(ACFT_PARAM* pS, const char* jsonfilename) {
//Add the current returned coordinates to the jsonCoordinates jzon object
jsonCoordinates.Add("altitude", pS-> altitude);
jsonCoordinates.Add("latitude", pS-> latitude);
jsonCoordinates.Add("longitude", pS-> longitude);
//Add the Coordinates to the FlightPath then clear the coordinates.
jsonFlightPath.Add(jsonCoordinates);
jsonCoordinates.Clear();
//Now add the entire flightpath array to the jsonFlight object.
jsonFlight.Add("Flightpath", jsonFlightPath);
//write the jsonFlight object to a JSON file on disk.
Jzon::FileWriter::WriteFile(jsonfilename, jsonFlight, Jzon::NoFormat);
//Remove the entire jsonFlightPath array from the jsonFlight object to avoid duplication next time the function executes.
jsonFlight.Remove("Flightpath");
return 0;
}
For sure you can do "flat file" storage yourself.. but this is a symptom of needing a database. Something very light like SQLite, or mid-weight & open-source like MySQL, FireBird, or PostgreSQL.
But as to your question:
1) Leave the closing ] bracket off, and just keep the file open & appending -- but if you don't close the file correctly, it will be damaged & need repair to be readable.
2) Your current option -- writing a complete file each time -- isn't safe from data loss either, as the moment you "open to overwrite" you lose all data previously stored in the file. The workaround here, is to rename the old file as a backup before you start writing.
You should also make backup copies of your file, with the first option. (Say at daily intervals). Otherwise data loss is likely to occur eventually -- on Ctrl-C, power loss, program error or system crash.
Of course, if you use any of SQLite, MySQL, Firebird or PostgreSQL, all the data-integrity problems will be handled for you.
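If you stay with a flat file, here is a variation on option 1, sketched with plain fstream rather than Jzon. Instead of leaving the closing bracket off, it keeps the file ending in ]} and overwrites those two bytes on every append, so the file stays valid JSON between writes. It assumes the "Flightpath" array is the last member of the root object and that nothing (not even a newline) follows the final ]}. The function names are made up for illustration:

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Hypothetical: writes the root object with an empty "Flightpath" array last.
// `header_fields` is pre-formatted JSON such as "\"Origin\":\"EGLL\"".
bool start_flight_file(const std::string& path, const std::string& header_fields) {
    assert(!path.empty());
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    out << "{" << header_fields << ",\"Flightpath\":[]}";
    out.flush();
    return static_cast<bool>(out);
}

// Hypothetical: appends one coordinate object by overwriting the trailing "]}".
bool append_point(const std::string& path, double altitude, double latitude,
                  double longitude) {
    std::fstream f(path, std::ios::in | std::ios::out | std::ios::binary);
    if (!f) {
        return false;
    }
    // The byte before "]}" tells us whether the array is still empty.
    char prev = 0;
    f.seekg(-3, std::ios::end);
    f.get(prev);
    f.seekp(-2, std::ios::end);  // position on the closing ']'
    if (prev != '[') {
        f << ',';
    }
    f << "{\"altitude\":" << altitude << ",\"latitude\":" << latitude
      << ",\"longitude\":" << longitude << "}]}";
    f.flush();
    return static_cast<bool>(f);
}
```

Each call writes only the new object plus the two closing bytes, so the cost per second no longer grows with the length of the flight path. The data-loss caveats above still apply if the process dies in the middle of a write.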

How to cut a file without using another file?

Is it possible to delete part of a file (let's say from the beginning to its half), without having to use another file?
Thanks!
Yes, it is possible, but still you'll have to rewrite most of the file.
The rough idea is as follows:
open the file
beg = find the start of the fragment to be removed
len = length of the fragment to be removed
blocksize = 4096 -- example block size, may be any
datamoved = 0
loop {
fseek(beg + len + datamoved);
actualread = fread(buffer, blocksize)
if( actualread == 0 ) break; -- end of file, finished!
fseek(beg + datamoved)
fwrite(buffer, actualread)
datamoved += actualread
}
The last step after the loop is to truncate the file to the beg+datamoved size. If the underlying filesystem does not support a truncate operation, then you have to rewrite the file, but most filesystems and libraries do support it.
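The rough idea above could look like this in C++17. This is a sketch, not a hardened implementation: remove_range is a hypothetical name, the truncation step uses std::filesystem::resize_file, and a crash midway through leaves the file partly shifted:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <filesystem>
#include <fstream>
#include <system_error>
#include <vector>

// Hypothetical helper: removes `len` bytes starting at offset `beg` by
// shifting the tail of the file forward block by block, then truncating.
bool remove_range(const std::filesystem::path& path, std::uintmax_t beg,
                  std::uintmax_t len, std::size_t block_size = 4096) {
    assert(block_size > 0);
    std::error_code ec;
    const std::uintmax_t size = std::filesystem::file_size(path, ec);
    if (ec || beg > size || len > size - beg) {
        return false;  // range does not fit inside the file
    }
    {
        std::fstream f(path, std::ios::in | std::ios::out | std::ios::binary);
        if (!f) {
            return false;
        }
        std::vector<char> buffer(block_size);
        const std::uintmax_t tail = size - beg - len;  // bytes to keep after the gap
        std::uintmax_t moved = 0;
        while (moved < tail) {
            const auto want = static_cast<std::streamsize>(
                std::min<std::uintmax_t>(block_size, tail - moved));
            f.seekg(static_cast<std::streamoff>(beg + len + moved));
            f.read(buffer.data(), want);
            f.seekp(static_cast<std::streamoff>(beg + moved));
            f.write(buffer.data(), want);
            if (!f) {
                return false;
            }
            moved += static_cast<std::uintmax_t>(want);
        }
    }  // stream is flushed and closed here, before the file is resized
    std::filesystem::resize_file(path, size - len, ec);
    return !ec;
}
```

For the case in the question (dropping the first half), the call would be remove_range(path, 0, size / 2).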
The short answer is that no, most file systems don't attempt to support operations like that.
That leaves you with two choices. The obvious one is to create a copy of the data, leaving out the parts you don't want. You can do this either in-place (i.e., moving the data around in the same file) or by using an auxiliary file, typically copying the data to the new file, then doing something like renaming the new file to the old name.
The other major choice is to simply re-structure your file and data so you don't have to get rid of the old data at all. For example, if you want to keep the most recent N amount of data from a process, you might structure (most of) the file as a circular buffer, with a couple of "pointers" at the beginning that tell you the head and tail points, so you know where to read data from/write data to. With a structure like this, you don't erase or remove the old data, you just overwrite it as needed.
If you have enough memory, read the part you want to keep fully into memory, write it back at the front of the file, and truncate the file.
If you do not have enough memory, copy in blocks, and only when you are done truncate the file.

What is the best way to return an image or video file from a function using c++?

I am writing a C++ library that fetches and returns either image data or video data from a cloud server using libcurl. I've started writing some test code but I am still stuck on designing the API, because I'm not sure what the best way to handle these media files is. Storing the data in a char/string variable as binary data seems to work, but I wonder if that would take up too much RAM if the files are too big. I'm new to this, so please suggest a solution.
You can use something like zlib to compress it in memory, and then uncompress it only when it needs to be used; however, most modern computers have quite a lot of memory, so you can handle quite a lot of images before you need to start compressing. With videos, which are effectively a LOT of images, it becomes a bit more important -- you tend to decompress as you go, and possibly even stream-from-disk as you go.
The usual way to handle this, from an API point of view, is to have something like an Image object and a Video object (classes). These objects would have functions to "get" the uncompressed image/frame. The "get" function would check to see if the data is currently compressed; if it is, it would decompress it before returning it; if it's not compressed, it can return it immediately. The way the data is actually stored (compressed/uncompressed/on disk/in memory) and the details of how to work with it are thus hidden behind the "get" function. Most importantly, this model lets you change your mind later, adding additional types of compression, adding disk-streaming support, etc., without changing how the code that calls the get() function is written.
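A small sketch of that "get" pattern. The codec here is a toy run-length decoder standing in for zlib or a real image decoder (an assumption made so the example is self-contained), and the class layout is illustrative only:

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

using Bytes = std::vector<unsigned char>;

// Stand-in codec (assumption): input is a list of (count, value) pairs.
// A real library would call zlib, a JPEG decoder, etc. at this point.
Bytes decode_rle(const Bytes& encoded) {
    assert(encoded.size() % 2 == 0);
    Bytes out;
    for (std::size_t i = 0; i < encoded.size(); i += 2) {
        out.insert(out.end(), static_cast<std::size_t>(encoded[i]), encoded[i + 1]);
    }
    return out;
}

class Image {
public:
    explicit Image(Bytes compressed) : compressed_(std::move(compressed)) {}

    // Returns the uncompressed pixels, decoding on first use only.
    const Bytes& get() {
        if (!decoded_) {
            pixels_ = decode_rle(compressed_);
            decoded_ = true;
        }
        return pixels_;
    }

    bool is_decoded() const { return decoded_; }

private:
    Bytes compressed_;  // how the data is stored is hidden from callers
    Bytes pixels_;
    bool decoded_ = false;
};
```

Callers only ever see get(), so swapping the toy decoder for zlib or for streaming from disk would not change their code.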
The other challenge is how you return an Image or Video object from a function. You can do it like this:
Image getImageFromURL( const std::string &url );
But this has the interesting problem that the image is "copied" during the return process (sometimes; depends how the compiler optimizes things). This way is more memory efficient:
void getImageFromURL( const std::string &url, Image &result );
This way, you pass in the image object into which you want your image loaded. No copies are made. You can also change the 'void' return value into some kind of error/status code, if you aren't using exceptions.
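As a side note on the copy concern: with a C++11 or later compiler and a type that has a move constructor, returning a local object by value is either elided or moved, so the copy mainly matters for older compilers or for types without a cheap move. The sketch below counts copy-constructions to illustrate this. The Image type and the function body are placeholders, not a real download:

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

struct Image {
    std::vector<unsigned char> pixels;
    static int copies;  // counts copy-constructions, for demonstration only

    Image() = default;
    Image(const Image& other) : pixels(other.pixels) { ++copies; }
    Image(Image&& other) noexcept : pixels(std::move(other.pixels)) {}
    Image& operator=(const Image&) = default;
    Image& operator=(Image&&) noexcept = default;
};
int Image::copies = 0;

// Placeholder body: a real version would fetch the bytes from `url`.
Image getImageFromURL(const std::string& url) {
    Image img;
    img.pixels.assign(url.size(), 0);  // fake payload
    return img;  // elided or moved, not copy-constructed
}

// Fetches one image by value and reports how many copies were made.
int copies_after_fetch() {
    Image img = getImageFromURL("https://example.invalid/a.png");
    assert(!img.pixels.empty());
    return Image::copies;
}
```

The out-parameter form is still useful when you want to return a status code alongside the image or reuse an existing buffer.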
If you're worried about what to do, write code both for returning the data in an array and for writing the data to a file, and pass the responsibility of choosing to the caller. Make your function something like
/* one of dst and outfile should be NULL */
/* if dst is not NULL, dstlen specifies the size of the array */
/* if outfile is not NULL, data is written to that file */
/* the return value indicates success (0) or reason for failure */
int getdata(unsigned char *dst, size_t dstlen,
const char *outfile,
const char *resource);