Shared memory or named pipes in RAM? - C++

I want to communicate between two different programs: a modded Ambilight program that outputs LED information, and my own program that reads this information.
I have read about named pipes and shared memory, but it is not clear to me where the data is stored. Since I will be exchanging a lot of data, I do not want to write it to disk every time. I am using a Raspberry Pi and the SD card should last for some more time ;)
So the basic question is: which method lets me exchange information with the other program without writing to disk? I am not sure whether shared memory lives in RAM; I want to clear this up.
Another idea I read about is /dev/shm, which should be a RAM disk. Can I also place named pipes in this location, and will the information then be kept in RAM?
What's the best way to go? Thanks :)

I read about named pipes and shared memory. But for me it is not clear
where the data is stored.
In both cases, the data is stored in memory (named pipes look like they reside on the filesystem, but the actual data is kept in memory).
Which method is better depends on the actual application. Pipes have a fairly limited buffer (typically 64 KB), and writing to them blocks when the buffer is full. Shared memory can be arbitrarily large, but on the downside, shared memory is, well, just that: plain memory. You have to take care of synchronization etc. yourself.
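If it helps, here is a minimal sketch of POSIX shared memory in C++ (on Linux the segment lives in RAM, typically under /dev/shm); the segment name "/ambilight_leds" and the LedFrame layout are assumptions invented for this example:

    // Minimal sketch: POSIX shared memory (lives in RAM / tmpfs on Linux).
    // The segment name and LedFrame layout are made up for illustration.
    #include <fcntl.h>      // shm_open, O_* flags
    #include <sys/mman.h>   // mmap, munmap
    #include <unistd.h>     // ftruncate, close
    #include <cstring>
    #include <cstdio>

    struct LedFrame {
        unsigned char rgb[100][3];  // example: 100 LEDs, one RGB triple each
    };

    int main() {
        // Writer side: create (or open) the segment and give it a size.
        int fd = shm_open("/ambilight_leds", O_CREAT | O_RDWR, 0600);
        if (fd == -1) { perror("shm_open"); return 1; }
        if (ftruncate(fd, sizeof(LedFrame)) == -1) { perror("ftruncate"); return 1; }

        void* addr = mmap(nullptr, sizeof(LedFrame),
                          PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        if (addr == MAP_FAILED) { perror("mmap"); return 1; }

        LedFrame* frame = static_cast<LedFrame*>(addr);
        std::memset(frame->rgb, 0, sizeof frame->rgb);  // write LED data here

        // The reader process would shm_open the same name with O_RDONLY and
        // mmap it with PROT_READ. Synchronization (e.g. a process-shared
        // mutex or semaphore) is up to you, as noted above.
        munmap(addr, sizeof(LedFrame));
        close(fd);
        return 0;
    }

(On older glibc you may need to link with -lrt for shm_open.)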

Shared memory and named pipes (and Unix domain sockets) won't write to your SD card unless you allocate more memory than the available physical RAM, which is either 256 MB or 512 MB depending on your Raspberry Pi model. If you do, the system will start swapping and will probably slow down.
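For completeness, a minimal named-pipe (FIFO) sketch; the path /tmp/ledpipe is only an example, and the bytes written through it pass through a kernel buffer rather than the SD card:

    // Minimal sketch: writer side of a named pipe (FIFO). The directory entry
    // is just a name; the data itself is buffered in kernel memory.
    // The path "/tmp/ledpipe" is only an example.
    #include <sys/stat.h>   // mkfifo
    #include <fcntl.h>      // open
    #include <unistd.h>     // write, close
    #include <cstdio>

    int main() {
        if (mkfifo("/tmp/ledpipe", 0600) == -1)
            perror("mkfifo (it may already exist)");

        // Opening for write blocks until a reader opens the other end.
        int fd = open("/tmp/ledpipe", O_WRONLY);
        if (fd == -1) { perror("open"); return 1; }

        const char msg[] = "led data...";
        if (write(fd, msg, sizeof msg) == -1) perror("write");

        close(fd);
        return 0;
    }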

Related

C++ memory mapped files [duplicate]

POSIX environments provide at least two ways of accessing files. There's the standard system calls open(), read(), write(), and friends, but there's also the option of using mmap() to map the file into virtual memory.
When is it preferable to use one over the other? What're their individual advantages that merit including two interfaces?
mmap is great if you have multiple processes accessing data in a read only fashion from the same file, which is common in the kind of server systems I write. mmap allows all those processes to share the same physical memory pages, saving a lot of memory.
mmap also allows the operating system to optimize paging operations. For example, consider two programs; program A which reads in a 1MB file into a buffer created with malloc, and program B which mmaps the 1MB file into memory. If the operating system has to swap part of A's memory out, it must write the contents of the buffer to swap before it can reuse the memory. In B's case any unmodified mmap'd pages can be reused immediately because the OS knows how to restore them from the existing file they were mmap'd from. (The OS can detect which pages are unmodified by initially marking writable mmap'd pages as read only and catching seg faults, similar to a copy-on-write strategy.)
mmap is also useful for inter process communication. You can mmap a file as read / write in the processes that need to communicate and then use synchronization primitives in the mmap'd region (this is what the MAP_HASSEMAPHORE flag is for).
One place mmap can be awkward is if you need to work with very large files on a 32 bit machine. This is because mmap has to find a contiguous block of addresses in your process's address space that is large enough to fit the entire range of the file being mapped. This can become a problem if your address space becomes fragmented, where you might have 2 GB of address space free, but no individual range of it can fit a 1 GB file mapping. In this case you may have to map the file in smaller chunks than you would like to make it fit.
Another potential awkwardness with mmap as a replacement for read / write is that you have to start your mapping on offsets of the page size. If you just want to get some data at offset X you will need to fixup that offset so it's compatible with mmap.
And finally, read / write are the only way you can work with some types of files. mmap can't be used on things like pipes and ttys.
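For a concrete picture of the mmap() side of that comparison, here is a minimal read-only mapping sketch; the file name data.bin is just an example:

    // Minimal sketch: map a file read-only and scan it as ordinary memory.
    #include <fcntl.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>
    #include <cstdio>

    int main() {
        int fd = open("data.bin", O_RDONLY);  // example file name
        if (fd == -1) { perror("open"); return 1; }

        struct stat st;
        if (fstat(fd, &st) == -1) { perror("fstat"); return 1; }

        void* p = mmap(nullptr, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
        if (p == MAP_FAILED) { perror("mmap"); return 1; }
        close(fd);  // the mapping stays valid after the descriptor is closed

        // Access the file like a plain array of bytes; pages fault in on demand.
        const unsigned char* bytes = static_cast<const unsigned char*>(p);
        unsigned long sum = 0;
        for (off_t i = 0; i < st.st_size; ++i) sum += bytes[i];
        std::printf("checksum: %lu\n", sum);

        munmap(p, st.st_size);
        return 0;
    }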
One area where I found mmap() to not be an advantage was when reading small files (under 16K). The overhead of page faulting to read the whole file was very high compared with just doing a single read() system call. This is because the kernel can sometimes satisfy a read entirely within your time slice, meaning your code doesn't switch away. With a page fault, it seemed more likely that another program would be scheduled, giving the file operation a higher latency.
mmap has the advantage when you have random access on big files. Another advantage is that you access it with memory operations (memcpy, pointer arithmetic), without bothering with the buffering. Normal I/O can sometimes be quite difficult when using buffers when you have structures bigger than your buffer. The code to handle that is often difficult to get right, mmap is generally easier. This said, there are certain traps when working with mmap.
As people have already mentioned, mmap is quite costly to set up, so it is only worth it above a certain size (which varies from machine to machine).
For pure sequential accesses to the file, it is also not always the better solution, though an appropriate call to madvise can mitigate the problem.
You have to be careful with the alignment restrictions of your architecture (SPARC, Itanium); with read/write I/O the buffers are often properly aligned and do not trap when dereferencing a cast pointer.
You also have to be careful that you do not access outside of the map. It can easily happen if you use string functions on your map, and your file does not contain a \0 at the end. It will work most of the time when your file size is not a multiple of the page size as the last page is filled with 0 (the mapped area is always in the size of a multiple of your page size).
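To illustrate the madvise() point, a minimal sketch that hints sequential access on a mapping (the file name is again just an example):

    // Minimal sketch: hint sequential access on a mapping with madvise().
    #include <fcntl.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>
    #include <cstdio>

    int main() {
        int fd = open("data.bin", O_RDONLY);  // example file name
        if (fd == -1) { perror("open"); return 1; }
        struct stat st;
        if (fstat(fd, &st) == -1) { perror("fstat"); return 1; }

        void* p = mmap(nullptr, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
        if (p == MAP_FAILED) { perror("mmap"); return 1; }

        // Ask the kernel to read ahead aggressively for a front-to-back scan.
        madvise(p, st.st_size, MADV_SEQUENTIAL);

        // ... scan the mapping sequentially, staying within st.st_size bytes ...

        munmap(p, st.st_size);
        close(fd);
        return 0;
    }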
In addition to other nice answers, a quote from Linux system programming written by Google's expert Robert Love:
Advantages of mmap( )
Manipulating files via mmap( ) has a handful of advantages over the standard read( ) and write( ) system calls. Among them are:
Reading from and writing to a memory-mapped file avoids the extraneous copy that occurs when using the read( ) or write( ) system calls, where the data must be copied to and from a user-space buffer.
Aside from any potential page faults, reading from and writing to a memory-mapped file does not incur any system call or context switch overhead. It is as simple as accessing memory.
When multiple processes map the same object into memory, the data is shared among all the processes. Read-only and shared writable mappings are shared in their entirety; private writable mappings have their not-yet-COW (copy-on-write) pages shared.
Seeking around the mapping involves trivial pointer manipulations. There is no need for the lseek( ) system call.
For these reasons, mmap( ) is a smart choice for many applications.
Disadvantages of mmap( )
There are a few points to keep in mind when using mmap( ):
Memory mappings are always an integer number of pages in size. Thus, the difference between the size of the backing file and an integer number of pages is "wasted" as slack space. For small files, a significant percentage of the mapping may be wasted. For example, with 4 KB pages, a 7 byte mapping wastes 4,089 bytes.
The memory mappings must fit into the process' address space. With a 32-bit address space, a very large number of various-sized mappings can result in fragmentation of the address space, making it hard to find large free contiguous regions. This problem, of course, is much less apparent with a 64-bit address space.
There is overhead in creating and maintaining the memory mappings and associated data structures inside the kernel. This overhead is generally obviated by the elimination of the double copy mentioned in the previous section, particularly for larger and frequently accessed files.
For these reasons, the benefits of mmap( ) are most greatly realized when the mapped file is large (and thus any wasted space is a small percentage of the total mapping), or when the total size of the mapped file is evenly divisible by the page size (and thus there is no wasted space).
Memory mapping has a potential for a huge speed advantage compared to traditional IO. It lets the operating system read the data from the source file as the pages in the memory mapped file are touched. This works by creating faulting pages, which the OS detects and then the OS loads the corresponding data from the file automatically.
This works the same way as the paging mechanism and is usually optimized for high-speed I/O by reading data on system page boundaries and in page-sized chunks (usually 4K), a size for which most file-system caches are optimized.
An advantage that isn't listed yet is the ability of mmap() to keep a read-only mapping as clean pages. If one allocates a buffer in the process's address space, then uses read() to fill the buffer from a file, the memory pages corresponding to that buffer are now dirty since they have been written to.
Dirty pages cannot be dropped from RAM by the kernel. If there is swap space, then they can be paged out to swap. But this is costly, and on some systems, such as small embedded devices with only flash memory, there is no swap at all. In that case, the buffer will be stuck in RAM until the process exits, or perhaps gives it back with madvise().
mmap() pages that have not been written to are clean. If the kernel needs RAM, it can simply drop them and reuse the RAM the pages were in. If the process that had the mapping accesses it again, it causes a page fault and the kernel re-loads the pages from the file they came from originally, the same way they were populated in the first place.
This doesn't require more than one process using the mapped file to be an advantage.

how to cache 1000s of large C++ objects

Environment:
Windows 8 64 bit, Windows 2008 server 64 bit
Visual Studio (professional) 2012 64 bits
std::list<CMyObject> L; // I have 1000s of large CMyObject in my program that I cache, shared by different threads in my Windows service program.
For our SaaS middleware product, we cache in memory 1000s of large C++ objects (read only const objects, each about 4MB in size), which runs the system out of memory. Can we associate a disk file (or some other persistent mechanism that is OS managed) to our C++ objects? There is no need for sharing / inter-process communication.
The disk file will suffice if it works for the duration of the process (our windows service program). The read-only const C++ objects are shared by different threads in the same windows service.
I was even considering using object databases (like MongoDB) to store the objects, which would then be loaded/unloaded at each use. Though hopefully faster than reading our serialized file, it would still hurt performance.
The purpose is to retain caching of C++ objects for performance reason and avoid having to load / unload the serialized C++ object every time. It would be great if this disk file is OS managed and requires minimal tweaking in our code.
Thanks in advance for your responses.
The only thing that is OS-managed in the manner you describe is the swap file. You can create a separate application (call it a "cache helper") that loads all the objects into memory and waits for requests. Since it does not use its memory pages, the OS will eventually displace the pages to the swap file, recalling them only if/when needed.
Communication with the application can be done through named pipes or sockets.
Disadvantages of such approach are that the performance of such cache will be highly volatile, and it may degrade performance of the whole server.
I'd recommend writing your own caching algorithm/application, as you may later need to adjust its properties.
One solution is of course to simply load every object, and let the OS deal with swapping it in from/out to disk as required. (Or load dynamically, but never discard unless the object is actually being destroyed.) This approach will work well if there are a number of objects that are used more frequently than others. And loading from swap space is almost certainly faster than anything you can write yourself. The exception is if you know beforehand which objects are more or less likely to be used next, and can "throw out" the right objects when memory is low.
You can certainly also use a memory-mapped file - this will allow you to read from and write to the file as if it were memory (and the OS will cache the content in RAM as memory is available). On Windows, you would use CreateFileMapping or OpenFileMapping to create/open the file mapping, and then MapViewOfFile to map the file into memory. When finished, use UnmapViewOfFile to "unmap" the memory, and then CloseHandle to close the file mapping.
The only worry about a filemapping is that it may not appear at the same address in memory next time around, so you can't have pointers within the filemapping and load the same data as binary next time. It would of course work fine to create a new filemapping each time.
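A minimal sketch of that CreateFileMapping / MapViewOfFile sequence, assuming an existing file objects.dat that holds the serialized objects (the name is just an example):

    // Minimal sketch of the Windows file-mapping calls mentioned above.
    // The file name "objects.dat" is only an example.
    #include <windows.h>
    #include <cstdio>

    int main() {
        HANDLE file = CreateFileA("objects.dat", GENERIC_READ, FILE_SHARE_READ,
                                  nullptr, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, nullptr);
        if (file == INVALID_HANDLE_VALUE) { std::printf("CreateFile failed\n"); return 1; }

        HANDLE mapping = CreateFileMappingA(file, nullptr, PAGE_READONLY, 0, 0, nullptr);
        if (!mapping) { std::printf("CreateFileMapping failed\n"); return 1; }

        // Map the whole file; the OS pages it in and caches it as memory allows.
        const void* view = MapViewOfFile(mapping, FILE_MAP_READ, 0, 0, 0);
        if (!view) { std::printf("MapViewOfFile failed\n"); return 1; }

        // ... read the serialized objects directly out of 'view' ...

        UnmapViewOfFile(view);
        CloseHandle(mapping);
        CloseHandle(file);
        return 0;
    }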
So your thousands of massive objects have constructors, destructors, virtual functions and pointers. This means you can't easily page them out. The OS can do it for you though, so your most practical approach is simply to add more physical memory, possibly an SSD swap volume, and use that 64-bit address space. (I don't know how much is actually addressable on your OS, but presumably enough to fit your ~4G of objects.)
Your second option is to find a way to just save some memory. This might be using a specialized allocator to reduce slack, or removing layers of indirection. You haven't given enough information about your data for me to make concrete suggestions on this.
A third option, assuming you can fit your program in memory, is simply to speed up your deserialization. Can you change the format to something you can parse more efficiently? Can you somehow deserialize objects quickly on-demand?
The final option, and the most work, is to manually manage a swapfile. It would be sensible as a first step to split your massive polymorphic classes into two: a polymorphic flyweight (with one instance per concrete subtype), and a flattened aggregate context structure. This aggregate is the part you can swap in and out of your address space safely.
Now you just need a memory-mapped paging mechanism, some kind of cache tracking which pages are currently mapped, possibly a smart pointer replacing your raw pointer with a page+offset which can map data in on-demand, etc. Again, you haven't given enough information on your data structure and access patterns to make more detailed suggestions.

Is memory-mapped memory possible?

I know that is possible to use memory-mapped files i.e. real files on disk that are transparently mapped to memory. As far as I understand (I haven't used these yet) the mapping takes place immediately, the file is partly read on the first memory access while the OS starts "caching" the whole file in the background.
Now: Is it possible to somewhat abuse this concept and memory-map another block of memory? Assuming the OS provides such indirection, one could create a kind of compressed_malloc() that returns a mapping from memory to memory. The memory returned to the caller is simply the memory-mapped range, which is transparently compressed in memory and also eventually kept in memory. Thus, for large buffers it could be possible that only parts get decompressed on the fly (on access) while the remaining blocks are kept compressed.
Is that concept technically possible at the moment or - if already realized (in software) - what are the things to look at?
Update 1: I am more or less looking for something that is technically achievable without modifying the OS kernel itself or which requires a virtualization platform.
Update 2: I am hoping for something which allows me to implement the compression and related logic in my own user-space code. I would just use the facilities of the operating system to create the memory-mapping.
Very much so. The VM (virtual memory) system is designed to handle different kinds of objects that can be mapped. There is in fact a filesystem called cramfs that does something similar, in the sense that it keeps compressed data in storage but enables transparent, uncompressed access.
You would not be modifying the kernel per se, but you will have to work in the kernel space, implementing VM handlers for this new kind of a memory mapped object.
This is possible, e.g.:
http://pubs.vmware.com/vsphere-4-esx-vcenter/index.jsp?topic=/com.vmware.vsphere.resourcemanagement.doc_41/managing_memory_resources/c_memory_compression.html
It is not correctly implemented in kernel space in Linux, but something like this could be implemented in user space.

Swapping objects out to file

My C++ application occasionally runs out of memory due to large amounts of data being retrieved from a database. It has to run on 32bit WinXP machines.
Is it possible to transparently (for most of the existing code) swap out the data objects to disk and read them into memory only on demand, so I'm not limited to the 2GB that 32bit Windows gives to the process?
I've looked at VirtualAlloc and Address Window Extensions but I'm not sure it's what I want.
I also found this SO question where the questioner creates a file mapping and wants to create objects in there. One answer suggests using placement new which sounds like it would be pretty transparent to the rest of the code.
Will this prevent my application from running out of physical memory? I'm not entirely sure, because after all there is still the 32-bit address space limit. Or is this a different kind of problem that will occur when trying to create a lot of objects?
So long as you are using a 32-bit operating system there is nothing you can do about this. There is no way to have more than 3GB (2GB in the case of Windows) of data in virtual memory, whether or not it's actually swapped out to disk.
Historically databases have always handled this problem by using read, write and seek. So rather than accessing data directly from memory, they use a fake (64-bit) pointer. Data is split into blocks (normally around 4kb), and a number of these blocks are allocated in memory. When they want to access data from a fake pointer address they check if the block is loaded into memory and if it is they access it from there. If it is not then they find an empty slot and copy it in, then return the address. If there are no slots free then a piece of data will be written back out to disk (if it's been modified) and that slot will be reused.
The real beauty of this is that if your system has enough RAM, the operating system will cache much more than 2GB of this data in RAM at any point in time, and when you think you are actually reading and writing from disk the operating system will probably just be copying data around in memory. This, of course, requires a 32-bit operating system that supports more than 3GB of physical memory, such as Linux or Windows Server with PAE.
SQLite has a nice self-contained implementation of this, which you could probably make use of with little effort.
If you do not wish to do this then your only alternatives are to either use a 64-bit operating system or to work with less data at any given point in time.
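As a very rough sketch of that block/fake-pointer scheme (fixed 4 KB blocks, a small in-memory cache, crude LRU eviction; every name here is invented for illustration, and a real implementation such as SQLite's pager is far more sophisticated):

    // Rough sketch of a block cache over a file: data is addressed by a fake
    // 64-bit offset, fetched in 4 KB blocks, and evicted when the cache is full.
    #include <algorithm>
    #include <cstdint>
    #include <cstdio>
    #include <cstring>
    #include <list>
    #include <unordered_map>
    #include <vector>

    class BlockCache {
        static const size_t kBlockSize = 4096;
        static const size_t kMaxBlocks = 1024;   // ~4 MB of cached blocks

        std::FILE* file_;
        std::unordered_map<uint64_t, std::vector<char>> blocks_;  // block index -> data
        std::list<uint64_t> lru_;                                 // most recent at front

    public:
        explicit BlockCache(const char* path) : file_(std::fopen(path, "rb")) {}
        ~BlockCache() { if (file_) std::fclose(file_); }

        // Read 'len' bytes at fake address 'addr' into 'out'.
        void read(uint64_t addr, void* out, size_t len) {
            char* dst = static_cast<char*>(out);
            while (len > 0) {
                uint64_t block = addr / kBlockSize;
                size_t offset = static_cast<size_t>(addr % kBlockSize);
                size_t chunk = std::min(len, kBlockSize - offset);
                std::memcpy(dst, fetch(block).data() + offset, chunk);
                addr += chunk; dst += chunk; len -= chunk;
            }
        }

    private:
        const std::vector<char>& fetch(uint64_t block) {
            auto it = blocks_.find(block);
            if (it == blocks_.end()) {
                if (blocks_.size() >= kMaxBlocks) {  // evict the least recently used block
                    blocks_.erase(lru_.back());
                    lru_.pop_back();
                }
                std::vector<char> buf(kBlockSize, 0);
                std::fseek(file_, static_cast<long>(block * kBlockSize), SEEK_SET);
                std::fread(buf.data(), 1, kBlockSize, file_);
                it = blocks_.emplace(block, std::move(buf)).first;
            }
            lru_.remove(block);                      // crude O(n) LRU bookkeeping
            lru_.push_front(block);
            return it->second;
        }
    };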

Memory Based Data Server for local IPC

I am going to be running an app (or apps) that requires about 200MB of market data each time it runs.
This is a trivial amount of data to store in memory these days, so for speed that's what I want to do.
Over the course of a day's session I will probably run, re-run, rewrite and re-run one or more applications over and over.
So, the question is how to hold the data in memory all day, such that even if the app crashes I do not have to reload the data by opening the data file on disk and re-loading it?
My initial idea is to write a data server app that does nothing more than read the data into shared memory so that it is available for use. If I do that I guess I could use memory mapping for the IPC by calling
CreateFile()
CreateFileMapping()
MapViewOfFile()
Is there a better IPC/approach?
If you have enough memory and nothing else asks for memory, that might reduce your startup time. To guarantee access to the memory, you probably want to have a memory mapped file in named shared memory, as described here. You can have a simple program create the share and manage it so you can guarantee it remains in memory.
Just memory map the data file. Unless your computer is low on memory, the file will stay in file cache even when the program exits. The next time it starts up, access will be fast.
If your in-memory data is different from the on-disk data, just use two files. On restart, check a timestamp and a file revision written into the memory file to compare to the disk file and that way your program will know which one has the most recent data.
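If you go the data-server route, a minimal sketch of page-file-backed named shared memory on Windows could look like the following; the mapping name "Local\MarketDataShare" and the 200 MB size are only examples, and client apps would open the same name with OpenFileMappingA:

    // Minimal sketch: named shared memory on Windows, backed by the paging file.
    // A small "server" process creates it and stays alive; readers open the
    // same name with OpenFileMappingA. The name and size are only examples.
    #include <windows.h>
    #include <cstdio>

    int main() {
        const DWORD kSize = 200 * 1024 * 1024;  // example: ~200 MB of market data

        // INVALID_HANDLE_VALUE => backed by the system paging file, not a disk file.
        HANDLE mapping = CreateFileMappingA(INVALID_HANDLE_VALUE, nullptr,
                                            PAGE_READWRITE, 0, kSize,
                                            "Local\\MarketDataShare");
        if (!mapping) { std::printf("CreateFileMapping failed: %lu\n", GetLastError()); return 1; }

        void* view = MapViewOfFile(mapping, FILE_MAP_ALL_ACCESS, 0, 0, kSize);
        if (!view) { std::printf("MapViewOfFile failed: %lu\n", GetLastError()); return 1; }

        // ... load the market data into 'view' once, then keep this process
        // running so the data outlives any client app that crashes ...
        std::printf("share ready; press Enter to exit\n");
        std::getchar();

        UnmapViewOfFile(view);
        CloseHandle(mapping);
        return 0;
    }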