So I read this interview with John Carmack in Gamasutra, in which he talks about what he calls "live C++ objects that live in memory mapped files". Here are some quotes:
JC: Yeah. And I actually get multiple benefits out of it in that... The last iOS Rage project, we shipped with some new technology that's using some clever stuff to make live C++ objects that live in memory mapped files, backed by the flash file system on here, which is how I want to structure all our future work on PCs.
...
My marching orders to myself here are, I want game loads of two seconds on our PC platform, so we can iterate that much faster. And right now, even with solid state drives, you're dominated by all the things that you do at loading times, so it takes this different discipline to be able to say "Everything is going to be decimated and used in relative addresses," so you just say, "Map the file, all my resources are right there, and it's done in 15 milliseconds."
(Full interview can be found here)
Does anybody have any idea what Carmack is talking about and how you would set up something like this? I've searched the web for a bit but I can't seem to find anything on this.
The idea is that all or part of your program state is serialized into a file at all times by accessing that file via memory mapping. This requires that you avoid ordinary pointers, because pointers are only valid for the lifetime of your process. Instead you store offsets from the start of the mapping, so that when you restart the program and remap the file you can continue working with it. The advantage of this scheme is that there is no separate serialization step, which means no extra code for it, and you never need to save all of the state at once; instead all (or most) of your program state is backed by the file at all times.
You'd use placement new, either directly or via custom allocators.
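A minimal sketch of the idea on POSIX (the GameState struct and the file name are made up for illustration; error handling is omitted, and a freshly created file is zero-filled, which is what the version check relies on):

#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>
#include <cstddef>
#include <cstdint>
#include <new>

struct GameState {            // hypothetical, trivially-copyable state
    std::uint32_t version;
    std::uint32_t score;
    // ... only offsets, never raw pointers, inside mapped data
};

int main()
{
    const std::size_t kSize = sizeof(GameState);
    int fd = open("state.bin", O_RDWR | O_CREAT, 0644);
    ftruncate(fd, kSize);                        // make sure the file is big enough
    void* base = mmap(nullptr, kSize, PROT_READ | PROT_WRITE,
                      MAP_SHARED, fd, 0);        // changes go back to the file

    // First run: construct the object in place. Later runs: just reuse it.
    GameState* state = static_cast<GameState*>(base);
    if (state->version == 0)
        state = new (base) GameState{1, 0};      // placement new into the mapping

    state->score += 10;                          // "saving" is just writing memory

    munmap(base, kSize);
    close(fd);
}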
Look at EASTL for an implementation of a subset of the STL that is specifically geared towards working well with custom allocation schemes (such as those required for games running on embedded systems or game consoles).
A free subset of EASTL is here:
http://gpl.ea.com/
A clone is available at https://github.com/paulhodge/EASTL
For years we have used something we call "relative pointers", which is a kind of smart pointer. It is inherently nonstandard, but works nicely on most platforms. It is structured like:
#include <cstddef>   // size_t

template<class T>
class rptr
{
    size_t offset;   // distance from this rptr object to the pointee
public:
    // Resolve the pointee relative to the address of the rptr itself,
    // so the value stays valid wherever the mapping happens to be placed.
    T* operator->() { return reinterpret_cast<T*>(reinterpret_cast<char*>(this) + offset); }
};
This requires that all objects are stored in the same shared memory (which can be a file mapping too). It also usually requires that we only store our own compatible types in there, and that we write our own allocators to manage that memory.
To always have consistent data, we take snapshots via copy-on-write mmap tricks (which work in user space on Linux; no idea about other OSs).
With the big move to 64-bit we also sometimes just use fixed mappings, since the relative pointers incur some runtime overhead. With typically 48 bits of address space, we choose a reserved memory area for our applications and always map such a file there.
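A rough sketch of that fixed-mapping variant on Linux (the address and file name are arbitrary, MAP_FIXED_NOREPLACE needs a reasonably recent kernel and glibc, and error handling is omitted):

#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>
#include <cstddef>

int main()
{
    // A reserved region that we assume is never used for anything else.
    void* const kReservedBase = reinterpret_cast<void*>(0x600000000000ull);
    const std::size_t kSize = 1 << 20;

    int fd = open("world.bin", O_RDWR);
    void* base = mmap(kReservedBase, kSize, PROT_READ | PROT_WRITE,
                      MAP_SHARED | MAP_FIXED_NOREPLACE, fd, 0);
    // If base == kReservedBase, any raw pointers stored inside the file
    // (that were created against this same base address) are directly usable.

    munmap(base, kSize);
    close(fd);
}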
This reminds me of a file system I came up with that loaded level files off CD in an amazingly short time (it improved the load time from tens of seconds to near instantaneous), and it works on non-CD media as well. It consisted of three versions of a class wrapping the file I/O functions, all with the same interface:
class IFile
{
public:
    IFile (class FileSystem &owner);
    virtual ~IFile ();
    virtual void Seek (...);
    virtual size_t Read (...);
    virtual long GetFilePosition ();
};
and an additional class:
class FileSystem
{
public:
    void BeginStreaming (const char *filename);
    void EndStreaming ();
    IFile *CreateFile (const char *filename);
};
and you'd write the loading code like:
void LoadLevel (const char *levelname)
{
    FileSystem fs;
    fs.BeginStreaming (levelname);
    IFile *file = fs.CreateFile (level_map_name);   // the level's map file
    ReadLevelMap (fs, file);
    delete file;
    fs.EndStreaming ();
}
void ReadLevelMap (FileSystem &fs, IFile *file)
{
    // read some data from the file
    // get the names of other files to load (textures, object definitions, etc...)
    for each texture file            // pseudocode
    {
        IFile *texture_file = fs.CreateFile (some_other_file_name);
        CreateTexture (texture_file);
        delete texture_file;
    }
}
Then, you'd have three modes of operation: debug mode, stream file build mode and release mode.
In each mode, the FileSystem object would create different IFile objects.
In debug mode, the IFile object just wrapped the standard IO functions.
In stream-file-building mode, the IFile object also wrapped the standard I/O, but additionally it wrote every byte that was read to the stream file (the owning FileSystem opened the stream file), and it wrote out the return value of any file-position query (so if anything needed to know a file size, that information was recorded in the stream file). This effectively concatenates the various files into one big file, but containing only the data that was actually read.
The release mode would create an IFile that did not open files or seek within files; it just read from the stream file (as opened by the owning FileSystem object).
This means that in release mode, all data is read in one sequential series of reads (which the OS buffers nicely) rather than lots of seeks and reads. This is ideal for CDs, where seek times are really slow. Needless to say, this was developed for a CD-based console system.
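A rough sketch of what the build-mode and release-mode file objects might look like (class names and signatures here are made up for illustration, shown standalone without the FileSystem plumbing and without error handling):

#include <cstddef>
#include <cstdio>

// Build mode: read through to the real file, but append everything that is
// actually read (and the answer to every position query) to the stream file.
class RecordingFile
{
    std::FILE *m_real;     // the original source file
    std::FILE *m_stream;   // the combined stream file, owned by the FileSystem
public:
    RecordingFile (std::FILE *real, std::FILE *stream) : m_real (real), m_stream (stream) {}
    std::size_t Read (void *buffer, std::size_t size)
    {
        std::size_t got = std::fread (buffer, 1, size, m_real);
        std::fwrite (buffer, 1, got, m_stream);        // record the bytes we read
        return got;
    }
    long GetFilePosition ()
    {
        long pos = std::ftell (m_real);
        std::fwrite (&pos, sizeof pos, 1, m_stream);   // record the answer too
        return pos;
    }
};

// Release mode: never open or seek the original files, just replay the stream
// file front to back.
class StreamedFile
{
    std::FILE *m_stream;
public:
    explicit StreamedFile (std::FILE *stream) : m_stream (stream) {}
    std::size_t Read (void *buffer, std::size_t size)
    {
        return std::fread (buffer, 1, size, m_stream);
    }
    long GetFilePosition ()
    {
        long pos = 0;
        std::fread (&pos, sizeof pos, 1, m_stream);    // replay the recorded answer
        return pos;
    }
};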
A side effect is that the data is stripped of unnecessary metadata that would normally be skipped.
It does have drawbacks: all the data for a level lives in one file, and these files can get quite large. The data can't be shared between stream files either; if a set of textures, say, were common across two or more levels, the data would be duplicated in each stream file. Also, the load process must be identical every time the data is loaded; you can't conditionally skip or add elements to a level.
As Carmack indicates, the loading code of many games (and other applications) is structured as a lot of small reads and allocations.
Instead of doing that, you do a single fread (or equivalent) of, say, a level file into memory and just fix up the pointers afterwards.
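A minimal sketch of the pointer-fixup idea (the on-disk layout here is invented for illustration: offsets are stored in the file and converted to pointers after one big read; error handling is omitted):

#include <cstdio>
#include <cstdint>
#include <vector>

// Hypothetical on-disk layout: the header stores offsets from the start of
// the blob instead of pointers.
struct LevelHeader {
    std::uint32_t nameOffset;      // offset of a NUL-terminated level name
    std::uint32_t vertexOffset;    // offset of the vertex data
};

int main()
{
    // One big sequential read of the whole level file.
    std::FILE *f = std::fopen ("level.bin", "rb");
    std::fseek (f, 0, SEEK_END);
    long size = std::ftell (f);
    std::fseek (f, 0, SEEK_SET);
    std::vector<char> blob (static_cast<std::size_t>(size));
    std::fread (blob.data (), 1, blob.size (), f);
    std::fclose (f);

    // Fix up: turn stored offsets into real pointers into the blob.
    char *base = blob.data ();
    LevelHeader *hdr = reinterpret_cast<LevelHeader *>(base);
    const char *name = base + hdr->nameOffset;
    const float *verts = reinterpret_cast<const float *>(base + hdr->vertexOffset);

    std::printf ("level %s, first vertex x = %f\n", name, verts[0]);
}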
Related
Regarding the code below: I want to explicitly "save" the file without calling close(). I know there is no need to call close(), since fstream will call the destructor and save the file when the fstream object goes out of scope.
But I want to explicitly save the file without waiting for the fstream object to go out of scope. Is there a way to do this in C++?
Is there anything like
flHtmlFile.save()
The only option I know of is to close it and open it again?
#include <fstream>
#include <string>
int main()
{
std::ofstream flHtmlFile;
std::string fname = "some.txt";
flHtmlFile.open(fname);
flHtmlFile << "text1"; // gets written
flHtmlFile.close(); // My purpose is to EXPLICITLY SAVE THE FILE. Is there anything like flHtmlFile.save()
flHtmlFile << "text2"; // doesn't get written 'coz i called close()
return 1;
}
Files are often some stream of bytes, and could be much bigger than your virtual address space (e.g. you can have a terabyte sized file on a machine with only a few gigabytes of RAM).
In general a program won't keep all the content of a file in memory.
Some libraries enable you to read or write all the content at once in memory (if it fits there!). E.g. Qt has a QFile class with an inherited readAll member function.
However, file streams (either FILE from the C standard library, or std::ostream from the C++ standard library) are buffered. You may want to flush the buffer. Use std::flush (in C++) or fflush (in C); in practice they often issue some system call (probably write(2) on Linux) to ask the operating system to write some data to some file (but they don't guarantee that the data has reached the disk).
What exactly happens is file-system-, operating-system- and hardware-specific. On Linux, the page cache may keep the data before it is written to disk (so if the computer loses power, data might be lost). Disk controller hardware also has RAM and does its own buffering. See also sync(2) and fsync(2) (and even posix_fadvise(2)...). So even if you flush some stream, you cannot be sure that the bytes are permanently written to the disk (and you usually don't care).
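For the rare cases where you really do want the data pushed towards the disk, a POSIX-specific sketch combining a stream flush with fsync might look like this (file name is arbitrary, error handling omitted):

#include <stdio.h>    // fopen, fputs, fflush, fileno
#include <unistd.h>   // fsync

int main()
{
    FILE *f = fopen ("some.txt", "w");
    fputs ("text1", f);

    fflush (f);             // push the C library's buffer to the kernel
    fsync (fileno (f));     // ask the kernel to push its page cache to the device

    fclose (f);
}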
(there are many layers and lots of buffering between your C++ code and the real hardware)
BTW you might write into memory through std::ostringstream in C++ (or open_memstream in C on POSIX), flush that stream, then do something with its in-memory data (e.g. write(2) it to disk).
If all you want is for the content you wrote to reach the file system as soon as possible, then call flush on the file:
flHtmlFile.flush();
No closing or re-opening required.
I have multiple threads, and I want each of them to process a part of my file. Can I have a single ifstream object for that and have them concurrently read different parts? The parts are non-overlapping, so the same line will not be processed by two threads. If yes, how do I get multiple cursors?
A single std::ifstream is associated with exactly one cursor (there's a seekg and tellg method associated with the std::ifstream directly).
If you want the same std::ifstream object to be shared across multiple threads, you'll have to have some sort of synchronization mechanism between the threads, which might defeat the purpose (in each thread, you'd have to lock, seek, read and unlock each time).
To solve your problem, you can open one std::ifstream to the same file per thread. In each thread, you'd seek to whatever position you want to start reading from. This would only require you to be able to "easily" compute the seek position for each thread though (Note: this is a pretty strong requirement).
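A sketch of that one-stream-per-thread approach, assuming the file can simply be split into equal byte ranges (finding clean line boundaries is left out, and file/path names are placeholders):

#include <fstream>
#include <string>
#include <thread>
#include <vector>

// Each thread opens its own ifstream and seeks to its own starting offset.
void processChunk (const std::string &path, std::streamoff begin, std::streamoff end)
{
    std::ifstream in (path, std::ios::binary);
    in.seekg (begin);
    std::vector<char> buffer (static_cast<std::size_t>(end - begin));
    in.read (buffer.data (), static_cast<std::streamsize>(buffer.size ()));
    // ... process buffer ...
}

int main()
{
    const std::string path = "data.txt";
    std::ifstream probe (path, std::ios::binary | std::ios::ate);
    const std::streamoff size = probe.tellg ();      // total file size

    const int nThreads = 4;
    std::vector<std::thread> workers;
    for (int i = 0; i < nThreads; ++i) {
        std::streamoff begin = size * i / nThreads;
        std::streamoff end   = size * (i + 1) / nThreads;
        workers.emplace_back (processChunk, path, begin, end);
    }
    for (auto &t : workers) t.join ();
}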
C++ file streams are not guaranteed to be thread safe (see e.g. this answer).
The typical solution is anyway to open separate streams on the same file; each instance comes with its own "cursor". However, you need to ensure shared access, and concurrency becomes platform-specific.
For ifstream (i.e. only reading from the file), the concurrency issues are usually tame. Even if someone else modifies the file, both streams might see different content, but you do have some kind of eventual consistency.
Reads and writes are usually not atomic, i.e. you might read only part of a write. Writes might not even execute in the order they are issued (see write combining).
Looking at the FILE struct, it seems there is a pointer inside FILE, char* curp, pointing to the current position, which may mean that each FILE object tracks its own particular position in the file.
This being C, I don't know how ifstream works internally or whether it uses a FILE object / is built like one. It might not help you at all, but I thought this little piece of information was worth sharing, and it may help someone.
My question is all about tips and tricks. I'm currently working on a project where I have one very big (~1 GB) file of data. First, I need to extract the data. This extraction takes 10 minutes. Then I do calculations. Each calculation depends on the previous one. Let's call them calculation1, calculation2 and so on. Assuming that I've done the extraction part right, I currently face two problems:
Every time I launch the program it runs for at least 10 minutes, and I cannot avoid that, so I have to plan my debugging around it.
Every subsequent calculation takes more time.
Thinking about the first problem, I assumed that some sort of database might help, if a database is faster than reading the file, which I doubt.
The second problem might be overcome if I split my big program into smaller programs, each of which does: read file, do stuff, write file. That way each stage can always read the previous stage's file, which helps debugging, but it introduces a lot of wasted code for file I/O.
I think both problems could be solved by a strategy like this: write and test the extraction module, then launch it and let it extract all the data into RAM. Then write calculation1 and launch it so it somehow grabs the data directly from the extraction module's RAM, and so on for every subsequent calculation. So my questions are:
Are there tips and tricks to minimize loads from files?
Are there ways to share RAM and objects between programs?
By the way, I'm writing this task in Perl because I need it quickly, but I'll rewrite it in C++ or C# later, so both language-specific and language-agnostic answers are welcome.
Thank you!
[EDIT]
The data file does not change; it is like a big immutable source of knowledge. And it is not exactly 1 GB, nor does it take exactly 10 minutes to read; I just wanted to say that the file is big and the time to read it is considerable. On my machine, reading and parsing a 1 GB file into the right objects takes about a minute, which is still pretty bad.
[/EDIT]
On my current system, Perl copies the whole 1 GB file into memory in 2 seconds. So I believe your problem is not reading the file but parsing it.
So the straightforward solution I can think of is to pre-parse it, for instance by converting your data into actual source code. I mean, you can prepare your data and hardcode it in your script directly (using another file, of course).
However, if reading really is the problem (which I doubt), you can use a database that stores the data in memory. It will be faster anyway, simply because the database reads the data once at startup, and you don't restart your database as often as your program.
The idea for solving this type of problem could be as follows:
Go for 3 programs:
Reader
Analyzer
Writer
and exchange data between them using shared memory.
For a file that big, I guess you have a considerable amount of data of a single object type, which you can store in a circular buffer in shared memory (I recommend using boost::interprocess).
The Reader will continuously read data from the input file and store it in the shared memory.
In the meantime, once enough data has been read for the calculations, the Analyzer will start processing it and store the results in another circular buffer in shared memory.
Once there are some results in the second shared memory region, the Writer will read them and store them in the final output file.
You need to make sure all the processes are synchronized properly, so that they do their jobs simultaneously and you don't lose data (i.e. data is never overwritten before it has been processed or saved to the final file).
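A very small sketch of the shared-memory part with boost::interprocess (the segment name, size and the single "slot" are placeholders; a real circular buffer would hold many records plus read/write indices and an interprocess condition variable):

#include <boost/interprocess/managed_shared_memory.hpp>
#include <boost/interprocess/shared_memory_object.hpp>
#include <boost/interprocess/sync/interprocess_mutex.hpp>
#include <boost/interprocess/sync/scoped_lock.hpp>

namespace bip = boost::interprocess;

// Placeholder record type shared between the processes.
struct Slot {
    bip::interprocess_mutex mutex;
    double value;
};

int main()
{
    // Reader process: create the segment and construct the shared object in it.
    bip::shared_memory_object::remove ("ReaderToAnalyzer");   // clean up leftovers
    bip::managed_shared_memory segment (bip::create_only, "ReaderToAnalyzer", 65536);
    Slot *slot = segment.construct<Slot> ("slot0") ();

    {
        bip::scoped_lock<bip::interprocess_mutex> lock (slot->mutex);
        slot->value = 42.0;                 // "publish" a result
    }

    // The Analyzer process would instead do:
    //   bip::managed_shared_memory segment (bip::open_only, "ReaderToAnalyzer");
    //   Slot *slot = segment.find<Slot> ("slot0").first;

    bip::shared_memory_object::remove ("ReaderToAnalyzer");
}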
I like the answer doqtor gives, but to prevent data from being overwritten, a nice helper class to enable and disable critical sections of code within a thread will do the trick.
// Note: "sealed" is specific to the Visual C++ compiler; standard C++ uses "final".
// CRITICAL_SECTION is defined in Windows.h - if on another OS,
// look for a similar structure (e.g. std::mutex).
class BlockThread final {
private:
    CRITICAL_SECTION* m_pCriticalSection;
public:
    explicit BlockThread( CRITICAL_SECTION& criticalSection );
    ~BlockThread();
private:
    BlockThread( const BlockThread& c );             // Not implemented
    BlockThread& operator=( const BlockThread& c );  // Not implemented
};

BlockThread::BlockThread( CRITICAL_SECTION& criticalSection ) {
    m_pCriticalSection = &criticalSection;
    EnterCriticalSection( m_pCriticalSection );   // acquire the lock on construction
}

BlockThread::~BlockThread() {
    LeaveCriticalSection( m_pCriticalSection );   // release the lock on destruction
}
A class such as this lets you guard a critical section of code where shared memory is being used: if another thread currently holds the section, the calling thread is blocked until that thread finishes its work and its BlockThread goes out of scope.
Using it from another class is fairly simple: in the .cpp file of the class you want to protect, create a static CRITICAL_SECTION variable and call the API function to initialize it. Then you can use the BlockThread class to lock that section.
SomeClass.cpp
#include "SomeClass.h"
#include "BlockThread.h"
static CRITICAL_SECTION s_criticalSection;
SomeClass::SomeClass {
// Do This First
InitializeCriticalSection( &s_criticalSection );
// Class Stuff Here
}
SomeClass::~SomeClass() {
// Class Stuff Here
// Do This Last
DeleteCriticalSection( &s_criticalSection );
}
// To Use The BlockThread
SomeClass::anyFunction() {
// When Your Condition Is Met & You Know This Is Critical
// Call This Before The Critical Computation Code.
BlockThread blockThread( s_criticalSection );
}
And that is about it: the static CRITICAL_SECTION is cleaned up in the object's destructor, and when the function returns the BlockThread goes out of scope and its destructor releases the lock, so the shared memory can then be used by other threads. You would usually want to use this class when traversing containers to add, insert, find or access elements while that data is shared between threads.
As for the 3 different threads working over the same data set, a good approach is to have 3 or 4 buffers, each about 4 MB in size, and have them work in rotating order. Buff1 gets data, then Buff2 gets data; while Buff2 is being filled, Buff1 is either parsing its data or passing it off to be stored for computation; then Buff1 waits until Buff3 or Buff4 is done, depending on how many buffers you have, and the process starts again. This is the same principle used with sound buffers when streaming audio files, or when sending batches of triangles to a graphics card. In other words, it is a batch-type process.
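A stripped-down sketch of that rotating-buffer idea with standard threads (the buffer count, sizes and the fixed 10-chunk loop are arbitrary; the actual file reading and parsing are left as comments):

#include <array>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

constexpr std::size_t kBufferCount = 3;
constexpr std::size_t kBufferSize  = 4 * 1024 * 1024;   // ~4 MB each

struct Buffer {
    std::vector<char> data = std::vector<char> (kBufferSize);
    std::size_t used = 0;
    bool ready = false;          // filled by the reader, not yet consumed
};

std::array<Buffer, kBufferCount> g_buffers;
std::mutex g_mutex;
std::condition_variable g_cv;
bool g_done = false;

void reader ()
{
    for (std::size_t i = 0; i < 10; ++i) {               // pretend there are 10 chunks
        Buffer &b = g_buffers[i % kBufferCount];
        std::unique_lock<std::mutex> lock (g_mutex);
        g_cv.wait (lock, [&] { return !b.ready; });       // wait until this slot is free
        // ... fill b.data from the file, set b.used ...
        b.ready = true;
        g_cv.notify_all ();
    }
    std::lock_guard<std::mutex> lock (g_mutex);
    g_done = true;
    g_cv.notify_all ();
}

void parser ()
{
    for (std::size_t i = 0; ; ++i) {
        Buffer &b = g_buffers[i % kBufferCount];
        std::unique_lock<std::mutex> lock (g_mutex);
        g_cv.wait (lock, [&] { return b.ready || g_done; });
        if (!b.ready && g_done) break;                    // nothing left to consume
        // ... parse b.data[0 .. b.used) ...
        b.ready = false;                                  // hand the slot back
        g_cv.notify_all ();
    }
}

int main()
{
    std::thread r (reader), p (parser);
    r.join ();
    p.join ();
}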
I'm trying to understand something about HGLOBALs, because I just found out that what I thought was simply wrong.
In app A I GlobalAlloc() data (with GMEM_SHARE|GMEM_MOVEABLE) and place the string "Test" in it. Now, what can I give to another application so it can get to that data?
I thought (wrongly!) that HGLOBALs are valid in all processes. That is obviously wrong, because an HGLOBAL is a HANDLE to the global data, not a pointer to the global data (that's where I said "OHHHH!").
So how can I pass the HGLOBAL to another application?
Notice: I want to pass just a "pointer" to the data, not the data itself, like in the clipboard.
Thanks a lot! :-)
(This is just a very long comment, as others have already explained that Win32 takes a different approach to memory sharing.)
I would say that you are reading books (or tutorials) on Windows programming that are quite old and obsolete, as Win16 has been virtually dead for quite some time.
16-bit Windows (3.x) didn't have the concept of memory isolation (or a virtual, flat address space) that 32-bit (and later) Windows versions provide. Memory there was divided into local (to the process) and global sections, both living in the same global address space. Descriptors like HGLOBAL were used so that memory blocks could be moved around in physical memory and still be accessed correctly despite their new location in the address space (after being locked with LocalLock()/GlobalLock()).
Win32 uses pointers instead, since physical memory pages can be moved without affecting their location in the virtual address space. It still provides all of the Global* and Local* API functions for compatibility reasons, but they should not be used anymore; normal heap management should be used instead (e.g. malloc() in C or the new operator in C++).
Also, several different kinds of pointers existed on Win16 to reflect the several different addressing modes available on x86 - near (same segment), far (segment:offset) and huge (normalised segment:offset). You can still see things like FARPTR in legacy Win16 code that got ported to Win32, but these are defined to expand to nothing, as in flat mode only near pointers are used.
Read the documentation. With the introduction of 32-bit processing, GlobalAlloc() does not actually allocate global memory anymore.
To share a memory block with another process, you could allocate the block with GlobalAlloc() and put it on the clipboard, then have the other process retrieve it. Or you can allocate a block of shared memory using CreateFileMapping() and MapViewOfFile() instead.
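A sketch of the CreateFileMapping()/MapViewOfFile() route (the mapping name and size are placeholders, and error handling is omitted):

#include <windows.h>
#include <cstring>

int main()
{
    // Process A: create a named, pagefile-backed shared memory block.
    HANDLE hMap = CreateFileMappingW (INVALID_HANDLE_VALUE, nullptr,
                                      PAGE_READWRITE, 0, 4096,
                                      L"Local\\MySharedBlock");
    char *view = static_cast<char *> (MapViewOfFile (hMap, FILE_MAP_ALL_ACCESS,
                                                     0, 0, 4096));
    std::strcpy (view, "Test");

    // Process B would instead do:
    //   HANDLE hMap = OpenFileMappingW (FILE_MAP_ALL_ACCESS, FALSE,
    //                                   L"Local\\MySharedBlock");
    //   char *view = static_cast<char *> (MapViewOfFile (hMap, FILE_MAP_ALL_ACCESS,
    //                                                    0, 0, 4096));
    //   // view now points at the same "Test" bytes.

    UnmapViewOfFile (view);
    CloseHandle (hMap);
}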
Each process "thinks" that it owns the full memory space available on the computer. No process can "see" the memory space of another process. As such, normally, nothing a process stores can be seen by another process.
Because it can be necessary to pass information between processes, certain mechanisms exist to provide this functionality.
One approach is message passing: one process issues a message to another, for example over a pipe, a socket, or via a Windows message.
Another is shared memory, where a given block of memory is made available to two or more processes, such that whatever one process writes can be seen by the others.
Don't be confused by the GMEM_SHARE flag. It does not work the way you might suppose. From MSDN:
The following values are obsolete, but are provided for compatibility
with 16-bit Windows. They are ignored.
GMEM_SHARE
The GMEM_SHARE flag is explained by Raymond Chen:
In 16-bit Windows, the GMEM_SHARE flag controlled whether the memory
should outlive the process that allocated it.
To share memory with another process/application, you should instead take a look at file mappings: Memory-mapped files and how they work.
I'm making a File class that uses fstream to read/write to a file. I have no issues in terms of functionality, but rather a question of best practice regarding the lifetime of the fstream object.
Is it better to have an fstream object stored as a member variable that gets created for each new File(path), and use that fstream over the lifetime of each File instance?
Or, for each individual function that I can call on a File instance (readBytes(), writeBytes(), exists(), isDirectory(), etc.), should I declare a local ifstream/ofstream, do what needs to be done, and, when the function exits, let it go out of scope and be auto-closed?
In the first case, I fear that if I have many, many files "open", there will be a penalty for having that many streams active at the same time.
In the second case, it just seems inefficient to continually create and destroy fstream objects.
Anyone with experience in the matter who can comment would be greatly appreciated!
Thanks,
Jon.
You have nailed the two issues right on the head. Generally the most efficient approach is to keep the files around (open) until you run the risk of running out of file descriptors. On some systems file descriptors aren't recycled immediately, so you need to limit your use of descriptors by closing some files before you would run out.
If you know something more about which files are read/written more often, which are only read/written in large chunks, etc. you could close down those for which the penalty of having to open them again is relatively small.
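One way to act on that advice is a small cache of open streams with a cap, closing the least recently used stream when the cap is hit. A hedged sketch (the cap of 64, the class shape and the open mode are arbitrary choices):

#include <fstream>
#include <list>
#include <memory>
#include <string>
#include <unordered_map>

// Keeps at most kMaxOpen fstreams open; reopens on demand and evicts the
// least recently used stream when the cap is reached.
class StreamCache {
    static constexpr std::size_t kMaxOpen = 64;      // arbitrary cap
    std::list<std::string> m_lru;                    // front = most recently used
    std::unordered_map<std::string,
        std::pair<std::list<std::string>::iterator,
                  std::unique_ptr<std::fstream>>> m_open;
public:
    std::fstream &get (const std::string &path)
    {
        auto it = m_open.find (path);
        if (it != m_open.end ()) {                   // already open: move to front
            m_lru.splice (m_lru.begin (), m_lru, it->second.first);
            return *it->second.second;
        }
        if (m_open.size () >= kMaxOpen) {            // evict the least recently used
            m_open.erase (m_lru.back ());
            m_lru.pop_back ();
        }
        m_lru.push_front (path);
        auto stream = std::make_unique<std::fstream> (
            path, std::ios::in | std::ios::out | std::ios::binary);
        auto &entry = m_open[path];
        entry = { m_lru.begin (), std::move (stream) };
        return *entry.second;
    }
};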