Is there a way to allocate a file of a predetermined size with Qt?
The reason is to avoid or minimize fragmentation. I don't want to zero-write a large file (unwanted overhead), but just allocate it from the file system.
I'd like a solution that works on Windows/OS X/Linux. I know there are file-system-specific solutions for all of these platforms, but digging them up and testing on each platform takes some time.
I'm not sure about fragmentation, but Qt has the QFile::resize() method, which pre-allocates (or truncates) the file. The process is fast (about 1 s for 800 MB on my machine), so the file contents are clearly not being written out explicitly. Tested on Windows 7.
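For reference, here is a minimal sketch of that approach (the file name and the 800 MB size are only examples):

QFile file("big.dat");                          // example path
if (file.open(QIODevice::ReadWrite)) {
    file.resize(qint64(800) * 1024 * 1024);     // pre-allocate (or truncate to) 800 MB
    file.close();
}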
Related
I have a file split into many segments, and I have to combine them back into a single file. The simple code I came up with is:
QFile file;
file.setFileName(fileUrl);
file.open(QIODevice::WriteOnly);

for (int j = 0; j < totalSegments; j++)
{
    Segment[j]->fileSegment.close();
    if (!Segment[j]->fileSegment.open(QIODevice::ReadOnly))
    {
        qDebug() << "Segment not found";
        continue;
    }
    file.write(Segment[j]->fileSegment.readAll()); // is this really efficient and safe
    Segment[j]->fileSegment.close();
    Segment[j]->fileSegment.remove();
}
The above code snippet works fine on both Windows and Linux, but I have some questions:
1- Is this method really efficient? If the segment sizes are in the GB range, will this badly affect system performance, or could it even corrupt the file or fail due to insufficient RAM?
2- The above method fails on some Linux distros, especially Fedora, if the total size is more than 2 GB. I haven't tested this myself, but it was reported to me by many users.
3- On Linux, can it fail if the segments are on an ext4 filesystem and the target file is being written to an NTFS filesystem? It didn't fail on Ubuntu, but many users complain that it does, and I can't replicate it. Am I doing something wrong?
In general, please avoid asking multiple sub-questions in one question, but I will try to answer them regardless.
1- Is this method really efficient? If the segment sizes are in the GB range, will this badly affect system performance, or could it even corrupt the file or fail due to insufficient RAM?
It is a very bad idea for large files; readAll() pulls the entire segment into memory at once. You should instead read and write the file in chunks.
2- The above method fails on some Linux distros, especially Fedora, if the total size is more than 2 GB. I haven't tested this myself, but it was reported to me by many users.
Files larger than 2 GB (or was it 4 GB?) count as large files on 32-bit systems, so it is possible that those users are running a build without large file support. You need to make sure that support is enabled when building; Qt used to have a -largefile configure option for this.
3- On Linux, can it fail if the segments are on an ext4 filesystem and the target file is being written to an NTFS filesystem? It didn't fail on Ubuntu, but many users complain that it does, and I can't replicate it. Am I doing something wrong?
Yes, it can be the same issue. You also need to pay attention to address-space fragmentation: you may not be able to allocate a contiguous 2 GB buffer even if 2 GB of memory is nominally free, because the address space is too fragmented. On Windows, you may wish to use the /LARGEADDRESSAWARE linker option for a 32-bit process.
Overall, the best approach would be to set up a loop that reads and writes in chunks; then you could forget about the large-address-aware and related issues. You would still need to make sure that your Qt build can handle large files if you wish to support them for your clients. This is of course only necessary on 32-bit systems, since 64-bit systems have no practical limit for the file sizes in use today.
Since you requested some code in the comments to get you going, here is a simple, untested version that reads the input file in chunks and immediately writes each chunk to the output file. I am sure this will get you going so that you can figure out the rest.
QFile myInputFile("/path/to/my/file");
QFile myOutputFile("/path/to/my/output");
if (!myInputFile.open(QIODevice::ReadOnly) || !myOutputFile.open(QIODevice::WriteOnly)) {
    // Error check
}

qint64 size = myInputFile.size();
QByteArray data;
const int chunkSize = 4096;

for (qint64 bytes = 0; bytes < size; bytes += data.size()) {
    data = myInputFile.read(chunkSize);
    if (data.isEmpty())
        break; // read error or premature end of file
    myOutputFile.write(data);
}
Disclaimer: I apologize for the verbosity of this question (I think it's an interesting problem, though!), yet I cannot figure out how to more concisely word it.
I have done hours of research into the apparent myriad of ways to solve the problem of accessing multi-GB files from a 32-bit process on 64-bit Windows 7, ranging from /LARGEADDRESSAWARE to VirtualAllocEx AWE. I am somewhat comfortable with writing a multi-view memory-mapped system in Windows (CreateFileMapping, MapViewOfFile, etc.), yet I can't quite escape the feeling that there is a more elegant solution to this problem. Also, I'm quite aware of Boost's interprocess and iostream templates, although they appear to be rather lightweight, requiring a similar amount of effort to writing a system using only Windows API calls (not to mention that I already have a memory-mapped architecture semi-implemented using Windows API calls).
I'm attempting to process large datasets. The program depends on pre-compiled 32-bit libraries, which is why, for the moment, the program itself also runs in a 32-bit process, even though the machine and OS are both 64-bit. I know there are ways I could add wrapper libraries around this, but seeing as it's part of a larger codebase, it would be a bit of an undertaking. I set the binary headers to allow /LARGEADDRESSAWARE (at the expense of decreasing my kernel space?), so that I get up to around 2-3 GB of addressable memory per process, give or take (depending on heap fragmentation, etc.).
Here's the issue: the datasets are 4+ GB, and have DSP algorithms run on them that require essentially random access across the file. A pointer to the object generated from the file is handled in C#, yet the file itself is loaded into memory (with this partial memory-mapped system) in C++ (it's P/Invoked). Thus, I believe the solution is unfortunately not as simple as adjusting the windowing to access the portion of the file I need, as I essentially still want the entire file abstracted behind a single pointer, from which I can call methods to access data almost anywhere in the file.
Apparently, most memory-mapped architectures rely upon splitting the single process into multiple processes; so, for example, I'd access a 6 GB file with three processes, each holding a 2 GB window into the file. I would then need to add a significant amount of logic to pull and recombine data from across these different windows/processes. VirtualAllocEx apparently provides a way of increasing the virtual address space, but I'm still not entirely sure whether this is the best way of going about it.
But let's say I want this program to function just as "easily" as a single 64-bit process on a 64-bit system. Assume that I don't care about thrashing; I just want to be able to manipulate a large file on the system, even if only, say, 500 MB were loaded into physical RAM at any one time. Is there any way to obtain this functionality without having to write a somewhat ridiculous, manual memory system by hand? Or is there some better way than what I have found thus far by combing SO and the internet?
This lends itself to a secondary question: is there a way of limiting how much physical RAM would be used by this process? For example, what if I wanted to limit the process to only having 500 MB loaded into physical RAM at any one time (whilst keeping the multi-GB file paged on disk)?
I'm sorry for the long question, but I feel as though it's a decent summary of what appear to be many questions (with only partial answers) that I've found on SO and the net at large. I'm hoping that this can be an area wherein a definitive answer (or at least some pros/cons) can be fleshed out, and we can all learn something valuable in the process!
You could write an accessor class to which you give a base address and a length. It returns the data, or throws an exception (or signals failure however else you prefer) if an error condition arises (out of bounds, etc.).
Then, any time you need to read from the file, the accessor object can call SetFilePointerEx() followed by ReadFile(). You can pass the accessor to the constructor of whatever objects you create when you read the file; those objects then use the accessor to read their data from the file and parse it into their own state.
If, later down the line, you're able to compile to 64-bit, you can just change (or extend) the accessor class to read from memory instead.
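For illustration, here is a rough, untested sketch of such an accessor (the class name and interface are my own invention, not an existing API):

#include <windows.h>
#include <stdexcept>

// Hypothetical accessor: wraps an open file HANDLE and reads an arbitrary
// range on demand.
class FileAccessor {
public:
    explicit FileAccessor(HANDLE file) : file_(file) {}

    // Read 'length' bytes starting at absolute 'offset' into 'buffer';
    // throws if the seek or read fails.
    void read(unsigned long long offset, void* buffer, DWORD length) const {
        LARGE_INTEGER pos;
        pos.QuadPart = static_cast<LONGLONG>(offset);
        if (!SetFilePointerEx(file_, pos, NULL, FILE_BEGIN))
            throw std::runtime_error("SetFilePointerEx failed");
        DWORD bytesRead = 0;
        if (!ReadFile(file_, buffer, length, &bytesRead, NULL) || bytesRead != length)
            throw std::runtime_error("ReadFile failed or short read");
    }

private:
    HANDLE file_;
};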
As for limiting the amount of RAM used by the process, that's mostly a matter of making sure that
A) you don't have memory leaks (especially egregious ones), and
B) you destroy objects you don't need at the moment. Even if you will need an object later and its data won't change, just destroy it; then recreate it later when you do need it, letting it re-read its data from the file.
How can I reserve and allocate shared memory without the backing of a file? I'm trying to reserve a large (many tens of GiB) chunk of shared memory and use it in multiple processes as a form of IPC. However, most of this chunk won't be touched at all (the access will be really sparse; maybe a few hundred megabytes throughout the life of the processes), and I don't care about the data when the applications end.
So preferably, the method to do this should have the following properties:
Doesn't commit the whole range. I will choose which parts to commit (actually use), but the access pattern is quite unpredictable.
Doesn't need a memory-mapped file or anything like that. I don't need to preserve the data.
Lets me access the memory area from multiple processes (I'll handle the locking explicitly.)
Works in both Linux and Windows (obviously a 64-bit OS is needed.)
Actually uses shared memory. I need the performance.
(NEW) The OS or the library doesn't try to initialize the whole reserved region (to zero or anything else); for a region this size that would be impractical and unnecessary.
I've been experimenting with boost::interprocess::shared_memory_object, but that causes a large file to be created on the filesystem (with the same size as my mapped memory region.) It does remove the file afterwards, but that hardly helps.
Any help/advice/pointers/reference is appreciated.
P.S. I do know how to do this on Windows using the native API. And POSIX seems to have the same functionality (only with a cleaner interface!) I'm looking for a cross-platform way here.
UPDATE: I did a bit of digging, and it turns out that the support that I thought existed in Windows was only a special case of memory-mapped files, using the system page file as a backing. (I'd never noticed it before because I had used at most a few megabytes of shared memory in the past projects.)
Also, I now have a new requirement (number 6 above).
On Windows, all memory has to be backed by the disk one way or another.
The closest you can come on Windows, I think, is to memory-map a sparse file. I think this will work in your case for two reasons:
On Windows, only the shared memory that you actually touch will become resident. This meets the first requirement.
Because the file is sparse, only the parts that have been modified will actually be stored on disk. If you looked at the properties of the file, it would say something like, "Size: 500 MB, Size on disk: 32 KB".
I realize that this technically doesn't meet your 2nd requirement, but, unfortunately, I don't think that is possible on Windows. At least with this approach, only the regions you actually use will take up space. (And only when Windows decides to commit the memory to disk, which it may or may not do at its discretion.)
As for turning this into a cross-platform solution, one option would be to modify boost::interprocess so that it creates sparse files on Windows. I believe boost::interprocess already meets your requirements on Linux, where POSIX shared memory is available.
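As a rough, untested sketch of the Windows half of that idea (the file name, mapping name and 64 GiB size are only illustrative, and error handling is omitted):

#include <windows.h>
#include <winioctl.h>

// Create a sparse backing file and map a large, named view of it.
void* mapSparseSharedRegion() {
    HANDLE file = CreateFileW(L"shared_region.bin", GENERIC_READ | GENERIC_WRITE,
                              FILE_SHARE_READ | FILE_SHARE_WRITE, NULL,
                              OPEN_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);

    // Mark the file sparse so untouched regions occupy no space on disk.
    DWORD bytesReturned = 0;
    DeviceIoControl(file, FSCTL_SET_SPARSE, NULL, 0, NULL, 0, &bytesReturned, NULL);

    // A named mapping lets other processes open the same region with OpenFileMapping.
    const unsigned long long size = 64ULL * 1024 * 1024 * 1024; // 64 GiB, example only
    HANDLE mapping = CreateFileMappingW(file, NULL, PAGE_READWRITE,
                                        (DWORD)(size >> 32), (DWORD)(size & 0xFFFFFFFFu),
                                        L"Local\\MySharedRegion");
    return MapViewOfFile(mapping, FILE_MAP_ALL_ACCESS, 0, 0, 0); // map the whole region
}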
I am trying to count the number of lines in a huge file. This ASCII file is anywhere from 12-15 GB. Right now, I am using something along the lines of readline() to count each line of the file, but of course this is extremely slow. I've also tried to implement lower-level reading using seekg() and tellg(), but due to the size of the file I am unable to allocate a large enough array to store every character and run a '\n' comparison (I have 8 GB of RAM). What would be a faster way of reading this ridiculously large file? I've looked through many posts here, and most people don't seem to have trouble with the 32-bit system limitation, but here I see that as a problem (correct me if I'm wrong).
Also, if anyone can recommend me a good way of splitting something this large, that would be helpful as well.
Thanks!
Don't try to read the whole file at once. If you're counting lines, just read in chunks of a given size. A couple of MB should be a reasonable buffer size.
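Something along these lines should work (the function name and the 4 MB buffer size are only illustrative):

#include <cstdio>
#include <cstdint>
#include <vector>
#include <algorithm>

// Count '\n' characters by reading the file in fixed-size chunks.
std::uint64_t countLines(const char* path) {
    std::FILE* f = std::fopen(path, "rb");
    if (!f) return 0;
    std::vector<char> buffer(4 * 1024 * 1024);   // 4 MB chunk
    std::uint64_t lines = 0;
    std::size_t n;
    while ((n = std::fread(buffer.data(), 1, buffer.size(), f)) > 0)
        lines += std::count(buffer.begin(), buffer.begin() + n, '\n');
    std::fclose(f);
    return lines;
}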
Try Boost memory-mapped files: one codebase works for both Windows and POSIX platforms.
Memory-mapping a file does not require that you actually have enough RAM to hold the whole file. I've used this technique successfully with files up to 30 GB (I think I had 4 GB of RAM in that machine). You will need a 64-bit OS and 64-bit tools (I was using Python on FreeBSD) in order to be able to address that much.
Using a memory mapped file significantly increased the performance over explicitly reading chunks of the file.
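For example, a small untested sketch using boost::iostreams::mapped_file_source (one of the Boost memory-mapping APIs), assuming a 64-bit build:

#include <boost/iostreams/device/mapped_file.hpp>
#include <algorithm>
#include <cstdint>
#include <string>

// Map the whole file and count newlines directly in the mapped range.
std::uint64_t countLinesMapped(const std::string& path) {
    boost::iostreams::mapped_file_source file(path);
    return std::count(file.data(), file.data() + file.size(), '\n');
}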
What OS are you on? Is there no wc -l or equivalent command on that platform?
I have a program written in C++ that opens a binary file (test.bin), reads it object by object, and puts each object into a new file (it opens the new file, writes into it in append mode, and closes it).
I use fopen/fclose, fread and fwrite.
test.bin contains 20,000 objects.
This program runs under Linux with g++ in 1 second, but built with VS2008 (in either debug or release mode) it takes 1 minute!
There are reasons why I don't do them in batches or don't keep them in memory or any other kind of optimizations.
I just wonder why it is so much slower under Windows.
Thanks,
I believe that when you close a file in Windows, it flushes the contents to disk each time. In Linux, I don't think that is the case. The flush on each operation would be very expensive.
Unfortunately file access on Windows isn't renowned for its brilliant speed, particularly if you're opening lots of files and only reading and writing small amounts of data. For better results, the (not particularly helpful) solution would be to read large amounts of data from a small number of files. (Or switch to Linux entirely for this program?!)
Other random suggestions to try:
turn off the virus checker if you have one (I've got Kaspersky on my PC, and writing 20,000 files quickly drove it bananas)
use an NTFS disk if you have one (FAT32 will be even worse)
make sure you're not accidentally using text mode with fopen (easily done)
use setvbuf to increase the buffer size for each FILE (see the sketch after this list)
try CreateFile/ReadFile/etc. instead of fopen and friends, which won't solve your problem but may shave a few seconds off the running time (since the stdio functions do a bit of extra work that you probably don't need)
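Regarding the setvbuf suggestion above, a small illustrative sketch (the path and the 1 MB buffer size are just examples):

#include <cstdio>

// Open a file for appending with a larger stdio buffer; setvbuf must be
// called before any I/O on the stream.
std::FILE* openBuffered(const char* path) {
    std::FILE* f = std::fopen(path, "ab");
    if (f)
        std::setvbuf(f, NULL, _IOFBF, 1 << 20); // let the library allocate a 1 MB buffer
    return f;
}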
I think it is not a matter of VS2008. It is a matter of differences between the Linux and Windows file systems, and of how the C++ runtime works with files on each system.
I'm seeing a lot of guessing here.
You're running under the VS2008 IDE, so you can always use the "poor man's profiler" and find out exactly what's going on.
In that minute, hit the "pause" button and look at what it's doing, including the call stack. Do this several times. Every single pause is almost certain (Prob = 59/60) to catch it doing precisely what it doesn't do under Linux.