GetDiskFreeSpaceEx with compressed disk - c++

I want to get the free space on a compressed disk to show it to an end user. I'm using C++ and MFC on Windows 2000 and later. The Windows API offers the GetDiskFreeSpaceEx() function.
However, this function seems to return the "uncompressed" size of the data, which causes a problem.
For example :
- Disk size is 100 GB
- Data size is 90 GB
- Compressed data size is 80 GB
The user will see that the disk is 90% full, but in reality, it is only 80% full.
EDIT
As Gleb pointed out, the function returns the correct information.
So here is the new question: is there a way to get both the compressed size and the uncompressed one?

I think you would have to walk over all files, query each with GetFileSize() and GetCompressedFileSize(), and sum the results. Use GetFileAttributes() to find out whether a file is compressed, in case only part of the volume is compressed, which may well be the case.
Hmm, so that's not a trivial operation. I suppose I must implement some mechanism to avoid querying every file's size all the time. I mean, if I have an 800 GB hard drive, it could take a very long time to get all the file sizes.
True.
Perhaps start with a full scan (at application startup) and populate a custom data structure, e.g. a hash/map from file name to a file-data struct/class, then watch the drive with FindFirstChangeNotification() and update your internal structure accordingly.
You might also want to read about "Change Journals". I have never used them myself, so I don't know how they work, but they might be worth checking out.
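To make the caching idea a bit more concrete, here is a minimal sketch of such a structure. The names VolumeUsageCache and FileSizes are made up for illustration; the two sizes per file are assumed to come from GetFileSize() and GetCompressedFileSize() during the initial scan or after a change notification, and that Windows-specific part is left out here.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>

// Sizes recorded for one file. In the real program these would be filled
// from GetFileSize() and GetCompressedFileSize(); here they are plain inputs.
struct FileSizes {
    std::uint64_t logicalBytes;  // uncompressed size
    std::uint64_t onDiskBytes;   // size actually occupied after compression
};

// Hypothetical cache: path -> sizes, with running totals kept in sync so
// that reading the totals never requires rescanning the volume.
class VolumeUsageCache {
public:
    // Insert or refresh one file (after a scan or a change notification).
    void update(const std::string& path, const FileSizes& sizes) {
        remove(path);  // drop any stale entry first
        files_[path] = sizes;
        logicalTotal_ += sizes.logicalBytes;
        onDiskTotal_ += sizes.onDiskBytes;
    }

    // Forget a file that was deleted; unknown paths are ignored.
    void remove(const std::string& path) {
        auto it = files_.find(path);
        if (it == files_.end()) return;
        assert(logicalTotal_ >= it->second.logicalBytes);
        assert(onDiskTotal_ >= it->second.onDiskBytes);
        logicalTotal_ -= it->second.logicalBytes;
        onDiskTotal_ -= it->second.onDiskBytes;
        files_.erase(it);
    }

    std::uint64_t logicalTotal() const { return logicalTotal_; }
    std::uint64_t onDiskTotal() const { return onDiskTotal_; }

private:
    std::unordered_map<std::string, FileSizes> files_;
    std::uint64_t logicalTotal_ = 0;
    std::uint64_t onDiskTotal_ = 0;
};
```

With this, a changed file costs one update() call, and both totals are available at any time for display.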

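The function returns the amount of free space correctly. This can be demonstrated with a simple program:
#include <stdio.h>
#include <windows.h>
int main() {
    // p1: bytes free for the caller, p2: total bytes, p3: total free bytes
    ULARGE_INTEGER p1, p2, p3;
    if (!GetDiskFreeSpaceEx(".", &p1, &p2, &p3))
        return 1;
    // ULARGE_INTEGER is a union; print its 64-bit QuadPart member
    printf("%llu %llu %llu\n", p1.QuadPart, p2.QuadPart, p3.QuadPart);
    return 0;
}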
After compressing a previously uncompressed directory the free space grows.
So what are you talking about?

Related

Append to a JSON array in a JSON file on disk, every second using C++

This is my first post here, so please bear with me.
I have searched high and low on the internet for an answer, but I've not been able to resolve my issue, so I have decided to write a post here.
I am trying to write (append) to a JSON array in a file using C++ and JZON, at a rate of one write per second. The JSON file is initially written by a “Prepare” function. Another function is then called each second to add an array to the JSON file and append a new object to the array.
I have tried many things, most of which resulted in all sorts of issues. My latest attempt gave me the best results and this is the code that I have included below. However, the approach I took is very inefficient as I am writing an entire array every second. This is having a massive hit on CPU utilisation as the array grows, but not so much on memory as I had first anticipated.
What I really would like to be able to do is to append to an existing array contained in a JSON file on disk, line by line, rather than having to clear the entire array from the JSON object and rewriting the entire file, each and every second.
I am hoping that some of the geniuses on this website will be able to point me in the right direction.
Thank you very much in advance.
Here is my code:
//Create some object somewhere at the top of the cpp file
Jzon::Object jsonFlight;
Jzon::Array jsonFlightPath;
Jzon::Object jsonCoordinates;
int PrepareFlight(const char* jsonfilename) {
//...SOME PREPARE FUNCTION STUFF GOES HERE...
//Add the Flight Information to the jsonFlight root JSON Object
jsonFlight.Add("Flight Number", flightnum);
jsonFlight.Add("Origin", originicao);
jsonFlight.Add("Destination", desticao);
jsonFlight.Add("Pilot in Command", pic);
//Write the jsonFlight object to a .json file on disk. Filename is passed in as a param of the function.
Jzon::FileWriter::WriteFile(jsonfilename, jsonFlight, Jzon::NoFormat);
return 0;
}
int UpdateJSON_FlightPath(ACFT_PARAM* pS, const char* jsonfilename) {
//Add the current returned coordinates to the jsonCoordinates jzon object
jsonCoordinates.Add("altitude", pS->altitude);
jsonCoordinates.Add("latitude", pS->latitude);
jsonCoordinates.Add("longitude", pS->longitude);
//Add the Coordinates to the FlightPath then clear the coordinates.
jsonFlightPath.Add(jsonCoordinates);
jsonCoordinates.Clear();
//Now add the entire flightpath array to the jsonFlight object.
jsonFlight.Add("Flightpath", jsonFlightPath);
//write the jsonFlight object to a JSON file on disk.
Jzon::FileWriter::WriteFile(jsonfilename, jsonFlight, Jzon::NoFormat);
//Remove the entire jsonFlightPath array from the jsonFlight object to avoid duplication next time the function executes.
jsonFlight.Remove("Flightpath");
return 0;
}
For sure you can do "flat file" storage yourself, but this is a symptom of needing a database. Something very light like SQLite, or mid-weight and open-source like MySQL, Firebird, or PostgreSQL.
But as to your question:
1) Leave the closing ] bracket off and keep the file open for appending -- but if you don't close the file correctly, it will be damaged and need repair before it is readable.
2) Your current option -- writing a complete file each time -- isn't safe from data loss either, as the moment you "open to overwrite" you lose all data previously stored in the file. The workaround here is to rename the old file as a backup before you start writing.
You should also make backup copies of your file with the first option (say, at daily intervals). Otherwise data loss is likely to occur eventually -- on Ctrl-C, power loss, program error or system crash.
Of course, if you use any of SQLite, MySQL, Firebird or PostgreSQL, all the data-integrity problems will be handled for you.
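A variation on option 1 that keeps the file valid JSON between writes: instead of leaving the bracket off, overwrite the closing "]}" with the new element followed by a fresh "]}". Below is a rough sketch using only standard file streams, not JZON. The function name appendToTrailingArray is made up, and it assumes the array is the last member of the root object and that the file was written without trailing whitespace (as Jzon::NoFormat appears to do, but check that).

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Appends one already-serialized JSON value to the array that ends the file.
// Assumption: the file ends with exactly "]}" (array is the last member of
// the root object, no trailing whitespace). Returns false if the file cannot
// be opened, does not have that shape, or the write fails.
bool appendToTrailingArray(const std::string& path, const std::string& elementJson) {
    assert(!elementJson.empty());
    std::fstream f(path, std::ios::in | std::ios::out | std::ios::binary);
    if (!f) return false;

    f.seekg(0, std::ios::end);
    const std::streamoff size = f.tellg();
    if (size < 3) return false;

    // Inspect the last three bytes: "[]}" means the array is still empty.
    char tail[3];
    f.seekg(size - 3);
    f.read(tail, 3);
    if (!f || tail[1] != ']' || tail[2] != '}') return false;
    const bool arrayIsEmpty = (tail[0] == '[');

    // Overwrite the closing "]}" with the new element and close again.
    f.seekp(size - 2);
    if (!arrayIsEmpty) f << ',';
    f << elementJson << "]}";
    f.flush();
    return static_cast<bool>(f);
}
```

Each call writes only the new element, so the cost per second stays constant as the array grows. A crash in the middle of a write can still leave a truncated file, so the backup advice above still applies.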

What is the best way to return an image or video file from a function using c++?

I am writing a C++ library that fetches and returns either image data or video data from a cloud server using libcurl. I've started writing some test code but am still stuck on designing the API because I'm not sure what the best way to handle these media files is. Storing the data in a char/string variable as binary data seems to work, but I wonder if that would take up too much RAM if the files are too big. I'm new to this, so please suggest a solution.
You can use something like zlib to compress it in memory, and then uncompress it only when it needs to be used; however, most modern computers have quite a lot of memory, so you can handle quite a lot of images before you need to start compressing. With videos, which are effectively a LOT of images, it becomes a bit more important -- you tend to decompress as you go, and possibly even stream-from-disk as you go.
The usual way to handle this, from an API point of view, is to have something like an Image object and a Video object (classes). These objects would have functions to "get" the uncompressed image/frame. The "get" function would check to see if the data is currently compressed; if it is, it would decompress it before returning it; if it's not compressed, it can return it immediately. The way the data is actually stored (compressed/uncompressed/on disk/in memory) and the details of how to work with it are thus hidden behind the "get" function. Most importantly, this model lets you change your mind later, adding additional types of compression, adding disk-streaming support, etc., without changing how the code that calls the get() function is written.
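As a rough illustration of the "get" idea, here is a small sketch. The Image class below is hypothetical, and the decoder passed in is just a placeholder for whatever decompression you end up using (a zlib wrapper, for instance); nothing here is tied to a specific library.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Hypothetical Image class: keeps the encoded bytes and decodes lazily.
class Image {
public:
    // 'decoder' stands in for a real decompressor; it is an assumption
    // of this sketch, not part of any particular library.
    Image(Bytes encoded, std::function<Bytes(const Bytes&)> decoder)
        : encoded_(std::move(encoded)), decoder_(std::move(decoder)) {
        assert(decoder_);  // an empty std::function cannot decode anything
    }

    // Returns the decoded pixels, running the decoder only on the first call.
    const Bytes& get() {
        if (!decoded_) {
            pixels_ = decoder_(encoded_);
            decoded_ = true;
        }
        return pixels_;
    }

private:
    Bytes encoded_;
    std::function<Bytes(const Bytes&)> decoder_;
    Bytes pixels_;
    bool decoded_ = false;
};
```

Callers only ever see get(), so switching to a different codec or to disk streaming later would change the class internals but not the calling code.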
The other challenge is how you return an Image or Video object from a function. You can do it like this:
Image getImageFromURL( const std::string &url );
But this has the interesting problem that the image is "copied" during the return process (sometimes; it depends on how the compiler optimizes things). This way is more memory-efficient:
void getImageFromURL( const std::string &url, Image &result );
This way, you pass in the image object into which you want your image loaded. No copies are made. You can also change the 'void' return value into some kind of error/status code, if you aren't using exceptions.
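One note on the return-by-value form: with C++11 or later, returning a local object by name is either elided or turned into a move, so the pixel buffer is not deep-copied as long as the class has a move constructor. The sketch below uses a toy Image with a copy counter to show this; the struct and the dummy getImageFromURL body are stand-ins for illustration, not a real download.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Toy Image that counts deep copies, for illustration only.
struct Image {
    std::vector<std::uint8_t> pixels;
    static int copies;

    Image() = default;
    Image(const Image& other) : pixels(other.pixels) { ++copies; }
    Image(Image&&) noexcept = default;
    Image& operator=(const Image& other) {
        pixels = other.pixels;
        ++copies;
        return *this;
    }
    Image& operator=(Image&&) noexcept = default;
};
int Image::copies = 0;

// Hypothetical stand-in for the real download: fills the image with dummy bytes.
Image getImageFromURL(const std::string& url) {
    assert(!url.empty());
    Image img;
    img.pixels.assign(url.size(), 0xFF);
    return img;  // elided or moved; the copy constructor is not called
}
```

After `Image a = getImageFromURL(someUrl);` the counter Image::copies is still 0, whereas `Image b = a;` raises it to 1. The out-parameter version remains useful if you want to reuse one Image object across many calls.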
If you're worried about what to do, code for both returning the data in an array and writing the data to a file, and pass the responsibility for choosing to the caller. Make your function something like
/* one of dst and outfile should be NULL */
/* if dst is not NULL, dstlen specifies the size of the array */
/* if outfile is not NULL, data is written to that file */
/* the return value indicates success (0) or reason for failure */
int getdata(unsigned char *dst, size_t dstlen,
const char *outfile,
const char *resource);

FSCTL_GET_RETRIEVAL_POINTERS failure on very small file on a NT File System

My question is: how is it possible to get a file's disk offset if the file is small (less than one cluster, only a few bytes)? That last point is very important.
Currently I use this Windows API function:
DeviceIOControl(FileHandle, FSCTL_GET_RETRIEVAL_POINTERS, @InBuffer, SizeOf(InBuffer), @OutBuffer, SizeOf(OutBuffer), Num, Nil);
FirsExtent.Start := OutBuffer.Pair[0].LogicalCluster ;
It works perfectly with files bigger than a cluster but fails with smaller files, as it always returns a null offset.
What is the procedure to follow with small files? Where are they located on an NTFS volume? Is there an alternative way to find a file's offset? This subtlety doesn't seem to be documented anywhere.
Note: the question is tagged as Delphi but C++ samples or examples would be appreciated as well.
The file is probably resident, meaning that its data is small enough to fit in its MFT entry. See here for a slightly longer description:
http://www.disk-space-guide.com/ntfs-disk-space.aspx
So you'd basically need to find the location of the MFT entry in order to know where the data is on disk. Do you control this file? If so the easiest thing to do is make sure that it's always larger than the size of an MFT entry (not a documented value, but you could always just do 4K or something).

How can GetDiskFreeSpaceEx return the (seemingly) wrong amount of disk space?

So I work on a device that outputs large images (anywhere from 30MB to 2GB+). Before we begin creating one of these images we check to see if there is sufficient disk space via GetDiskFreeSpaceEx. Typically (and in this case) we are writing to a shared folder on the same network. There are no user quotas on disk space at play.
Last night, in preparation for a demo, we kicked off a test run. During the run we experienced a failure. We needed 327391776 bytes and were told that we only had 186580992 available. The numbers from GetDiskFreeSpaceEx were:
User free space available: 186580992
Total free space available: 186580992
Those correspond to the QuadPart members of the two (output) arguments lpFreeBytesAvailable and lpTotalNumberOfFreeBytes to GetDiskFreeSpaceEx.
This code has been in use for years now and I have never seen a false negative. Here is the complete function:
long IsDiskSpaceAvailable( const char* inDirectory,
const _int64& inRequestedSize,
_int64& outUserFree,
_int64& outTotalFree,
_int64& outCalcRequest )
{
ULARGE_INTEGER fba;
ULARGE_INTEGER tnb;
ULARGE_INTEGER tnfba;
ULARGE_INTEGER reqsize;
string dir;
size_t len;
dir = inDirectory;
len = strlen( inDirectory );
outUserFree = 0;
outTotalFree = 0;
outCalcRequest = 0;
if( inDirectory[len-1] != '\\' )
dir += "\\";
// this is the value of inRequestedSize that was passed in
// inRequestedSize = 3273917760;
if( GetDiskFreeSpaceEx( dir.c_str(), &fba, &tnb, &tnfba ) )
{
outUserFree = fba.QuadPart;
outTotalFree = tnfba.QuadPart;
// this is computed dynamically given a specific compression
// type, but for simplicity I had hard-coded the value that was used
float compressionRatio = 10.0;
reqsize.QuadPart = (ULONGLONG) (inRequestedSize / compressionRatio);
outCalcRequest = reqsize.QuadPart;
// this is what was triggered to cause the failure,
// i.e., user free space was < the request size
if( fba.QuadPart < reqsize.QuadPart )
return( RetCode_OutOfSpace );
}
else
{
return( RetCode_Failure );
}
return( RetCode_OK );
}
So, a value of 3273917760 was passed to the function which is the total amount of disk space needed before compression. The function divides this by the compression ratio of 10 to get the actual size needed.
When I checked the disk that the share resides on it had ~177GB free, far more than what was reported. After starting the test again without changing anything it worked.
So my question here is: what could cause something like this? As far as I can tell it is not a programming error and, as I mentioned earlier, this code has been in use for a very long time now with no problems.
I checked the event log of the remote machine and found nothing of interest around the time of the failure. I'm hoping that someone out there has seen something similar before, thanks in advance.
Might not be of any use, but it's "strange" that:
177GB ~= 186580992 * 1000.
This could be explained by stack corruption happening elsewhere in the code (since you don't initialize your local variables).
The expression "inRequestedSize / compressionRatio" doesn't need to use float for the division, and since you've silenced the "conversion may lose precision" warning with the cast, you might actually hit an error there too (though the number given in the example should work). You could simply do "inRequestedSize / 10".
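For what it's worth, here is a minimal integer-only version of that calculation. The function name compressedSizeEstimate is made up, and rounding up is my own choice (so the estimate never comes out one byte too small); plain "inRequestedSize / 10" would truncate instead.

```cpp
#include <cassert>
#include <cstdint>

// Bytes needed after compression, computed without any floating point.
// Rounds up, so a remainder never makes the estimate too small.
std::uint64_t compressedSizeEstimate(std::uint64_t rawBytes, std::uint64_t ratio) {
    assert(ratio != 0);
    return rawBytes / ratio + (rawBytes % ratio != 0 ? 1 : 0);
}
```

With the numbers from the question, compressedSizeEstimate(3273917760, 10) gives 327391776, which matches the request size that was compared against the 186580992 bytes reported as free.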
Last but not least, you don't say where the code is running. On Mobile, the documentation of GetDiskFreeSpaceEx states:
When Mobile Encryption is enabled, the reporting behavior of this function changes. Each encrypted file has at least one 4-KB page of overhead associated. This function takes this overhead into account when it reports the amount of space available. That is, if a 128-KB disk contains a single 60-KB file, this function reports that 64 KB is available, subtracting the space occupied by both the file and its associated overhead.
Although this function reports the total available space, keep the space requirement for encrypted files in mind when estimating whether multiple new files will fit into the remaining space. Include the amount of space required for overhead when Mobile Encryption is enabled. Each file requires at least an additional 4 KB. For example, a single 60-KB file requires 64 KB, but two 30-KB files actually require 68 KB.
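The arithmetic in that quote can be written down as a small helper. This is only a sketch of the lower bound the documentation describes (file size plus one 4-KB page per file); the names are made up, and real overhead may be larger since the text says "at least".

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

constexpr std::uint64_t kKiB = 1024;
// "At least one 4-KB page of overhead" per encrypted file, per the quote.
constexpr std::uint64_t kOverheadPerFile = 4 * kKiB;

// Lower bound on the space a set of encrypted files needs:
// each file's size plus the minimum per-file overhead.
std::uint64_t spaceNeededWithEncryption(const std::vector<std::uint64_t>& fileSizes) {
    std::uint64_t total = 0;
    for (std::uint64_t size : fileSizes) {
        // Guard against 64-bit overflow while summing.
        assert(total <= UINT64_MAX - kOverheadPerFile);
        assert(size <= UINT64_MAX - kOverheadPerFile - total);
        total += size + kOverheadPerFile;
    }
    return total;
}
```

One 60-KB file comes out at 64 KB and two 30-KB files at 68 KB, the same figures as in the documentation excerpt.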

appending to a memory-mapped file

I'm constantly appending to a file of stock quotes (ints, longs, doubles, etc.). I have this file mapped into memory with mmap.
What's the most efficient way to make newly appended data available as part of the memory mapping?
I understand that I can open the file again (new file descriptor) and then mmap it to get the new data but that seems to be inefficient. Another approach that has been suggested to me is to pre-allocate the file in 1mb chunks, write to a specific position until reaching the end then ftruncate the file to +1mb.
Are there other approaches?
Does Boost help with this?
Boost.IOStreams only has fixed-size memory-mapped files, so it won't help with your specific problem. Linux has an interface, mremap, which works as follows:
void *new_mapping = mremap(mapping, size, size + GROWTH, MREMAP_MAYMOVE);
if (new_mapping == MAP_FAILED)
// handle error
mapping = new_mapping;
This is non-portable, however (and poorly documented). Mac OS X seems not to have mremap.
In any case, you don't need to reopen the file, just munmap it and mmap it again:
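void *append(int fd, char const *data, size_t nbytes, void *map, size_t &len)
{
    // Append to the file; if that fails, the old mapping is left in place.
    ssize_t written = write(fd, data, nbytes);
    if (written < 0)
        return MAP_FAILED;
    munmap(map, len);
    len += written;
    // mmap needs MAP_SHARED or MAP_PRIVATE; a flags value of 0 fails with EINVAL.
    // TODO: the caller must still check the result against MAP_FAILED.
    return mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
}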
A pre-allocation scheme may be very useful here. Be sure to keep track of the file's actual length and truncate it once more before closing.
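Here is one way such a pre-allocation scheme could look on POSIX, as a sketch only. The class name GrowableMapping is made up; it creates (and truncates) the file itself, grows it in whole chunks with ftruncate, writes through a MAP_SHARED mapping, and trims the file back to the bytes actually used when it is destroyed. Error handling is minimal.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: an append-only file that is pre-allocated in fixed
// chunks and kept mapped, so appended bytes are readable via data() at once.
class GrowableMapping {
public:
    GrowableMapping(const char* path, std::size_t chunk) : chunk_(chunk) {
        assert(chunk > 0);
        fd_ = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
        if (fd_ >= 0 && !remap(chunk_)) { close(fd_); fd_ = -1; }
    }
    ~GrowableMapping() {
        if (fd_ < 0) return;
        if (map_ != MAP_FAILED) munmap(map_, capacity_);
        // Truncate once more so the file holds only the real data.
        if (ftruncate(fd_, static_cast<off_t>(used_)) != 0) { /* best effort */ }
        close(fd_);
    }
    GrowableMapping(const GrowableMapping&) = delete;
    GrowableMapping& operator=(const GrowableMapping&) = delete;

    bool ok() const { return fd_ >= 0 && map_ != MAP_FAILED; }
    const char* data() const { return static_cast<const char*>(map_); }
    std::size_t size() const { return used_; }

    // Copies n bytes to the end of the data, growing by whole chunks if needed.
    // Note: growing may move the mapping, so re-read data() afterwards.
    bool append(const void* bytes, std::size_t n) {
        if (!ok()) return false;
        if (used_ + n > capacity_) {
            std::size_t needed = ((used_ + n + chunk_ - 1) / chunk_) * chunk_;
            if (!remap(needed)) return false;
        }
        std::memcpy(static_cast<char*>(map_) + used_, bytes, n);
        used_ += n;
        return true;
    }

private:
    // Resizes the file to newCapacity and replaces the mapping.
    bool remap(std::size_t newCapacity) {
        if (ftruncate(fd_, static_cast<off_t>(newCapacity)) != 0) return false;
        if (map_ != MAP_FAILED) munmap(map_, capacity_);
        map_ = mmap(nullptr, newCapacity, PROT_READ | PROT_WRITE, MAP_SHARED, fd_, 0);
        capacity_ = (map_ == MAP_FAILED) ? 0 : newCapacity;
        return map_ != MAP_FAILED;
    }

    int fd_ = -1;
    void* map_ = MAP_FAILED;
    std::size_t chunk_;
    std::size_t capacity_ = 0;
    std::size_t used_ = 0;
};
```

With a 1 MB chunk, the munmap/mmap pair runs once per megabyte of quotes rather than once per append, which is the point of pre-allocating.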
I know the answer has already been accepted, but maybe it will help someone else if I provide my answer.
Allocate a large file ahead of time, say 10 GiB in size. Create three of these files ahead of time; I call them volumes. Keep track of your last known location somewhere (in the header, in another file, etc.) and keep appending from that point. If you reach the maximum size of the file and run out of room, switch to the next volume. If there are no more volumes, create another one. Note that you would probably create volumes a few ahead, so that your appends never block waiting for a new volume to be created.
That's how we implement it where I work, for storing continuous incoming video/audio in a DVR system for surveillance. We don't waste space storing file names for video clips, which is why we don't use a real file system; instead we go flat file and just track offsets, frame information (fps, frame type, width/height, etc.), time recorded and camera channel.
For you, storage space is cheap for the kind of work you are doing, whereas your time is invaluable. So grab as much as you want ahead of time. You're basically implementing your own file system optimized for your needs. The needs that general-use file systems serve aren't the same as the needs we have in other fields.
Looking at the man page for mremap, it should be possible.
My 5 cents, but they are more C-specific.
Create a normal file, but mmap a huge size: e.g. the file is, say, 100K, but you mmap 1GB or more. Then you can safely access everything up to the file size. Access beyond the file size will result in an error.
If you are on a 32-bit OS, just don't make the mmap too big, because it will eat your address space.
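A small sketch of that over-sized mapping idea, assuming Linux and a MAP_SHARED mapping. The two helper names are made up. On Linux, data appended with write() shows up through the existing shared mapping without remapping; POSIX leaves some of this unspecified, so treat it as platform behaviour rather than a guarantee.

```cpp
#include <cassert>
#include <cstddef>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

// Maps 'reserve' bytes of 'fd' read-only, regardless of the file's current
// size. Only offsets below the current file size may be touched; whole pages
// beyond the end of the file fault (SIGBUS on Linux) when accessed.
// Returns nullptr on failure.
const char* mapWithHeadroom(int fd, std::size_t reserve) {
    assert(fd >= 0);
    void* p = mmap(nullptr, reserve, PROT_READ, MAP_SHARED, fd, 0);
    return p == MAP_FAILED ? nullptr : static_cast<const char*>(p);
}

// Appends to the file with write(). With the shared mapping above, the new
// bytes become readable through the existing mapping on Linux.
bool appendBytes(int fd, const char* data, std::size_t n) {
    assert(fd >= 0);
    if (lseek(fd, 0, SEEK_END) < 0) return false;
    return write(fd, data, n) == static_cast<ssize_t>(n);
}
```

The reader has to learn the current file length some other way (fstat, or a length field it trusts), because the mapping itself gives no indication of where the valid data ends.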
If you're using boost/iostreams/device/mapped_file.hpp on Windows:
boost::filesystem::resize_file throws an exception if a reading mapping object is open, due to lack of sharing privileges.
Instead, use the Windows API to resize the file on disk; the reading mapped_files can then stay open.
bool resize_file_wapi(string path, __int64 new_file_size) //boost::uintmax_t size
{
    // CreateFileA because path.c_str() is a narrow (char) string
    HANDLE handle = CreateFileA(path.c_str(), GENERIC_WRITE, FILE_SHARE_READ | FILE_SHARE_WRITE, 0, OPEN_EXISTING,
        FILE_ATTRIBUTE_NORMAL, 0);
    if (handle == INVALID_HANDLE_VALUE)
        return false;
    LARGE_INTEGER sz;
    sz.QuadPart = new_file_size;
    // Move the file pointer to the new size and set the end of file there
    bool resized = ::SetFilePointerEx(handle, sz, 0, FILE_BEGIN)
        && ::SetEndOfFile(handle);
    // Always close the handle, even when resizing failed
    return ::CloseHandle(handle) && resized;
}