What happens if I remove shared memory of other processes? - c++

I was experimenting with shared memory and writing multiprocessing programs in C. I was using the ipcrm command to remove shared memory, but I accidentally deleted a shared memory segment that was not allocated by me or my program. Nothing happened, except that its key became 0x00000000. I was wondering whether this is a dangerous thing to do, because that memory may be critical for other processes.
On the other hand, what is the best way to store critical data so noobs like me won't crash processes?

In the current POSIX standard, shared memory is just a file for each process. Let's look at the interface shm_open(), which sets up the shared memory:
/* shm_open - open a shared memory file */
int shm_open (const char *name, int oflag, mode_t mode)
{
  int fd;
  char shm_name[PATH_MAX + 20] = "/dev/shm/";

  /* skip opening slash */
  if (*name == '/')
    ++name;

  /* create special shared memory file name and leave enough space to
     cause a path/name error if name is too long */
  strlcpy (shm_name + 9, name, PATH_MAX + 10);

  fd = open (shm_name, oflag, mode);
  if (fd != -1)
    {
      /* once open we must add FD_CLOEXEC flag to file descriptor */
      int flags = fcntl (fd, F_GETFD, 0);

      if (flags >= 0)
        {
          flags |= FD_CLOEXEC;
          flags = fcntl (fd, F_SETFD, flags);
        }

      /* on failure, just close file and give up */
      if (flags == -1)
        {
          close (fd);
          fd = -1;
        }
    }

  return fd;
}
We can see that shm_open() just creates a file that backs the shared memory. When one process removes shared memory that it opened itself, other processes still using that memory are not affected; the memory is only released once the last mapping to it is gone. That also means that access to the shared memory is not synchronized between processes.
For the second question, only root and the owner of a segment can use ipcrm to delete it, so it is a reasonably safe operation.
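To see that "removal does not break existing users" behaviour in action, here is a minimal sketch using POSIX shared memory (the object name /demo_region is just an example; link with -lrt on older glibc). The name is unlinked while the mapping is still in use, and the mapping keeps working until munmap():
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>
#include <cstdio>
#include <cstring>

int main()
{
    /* create and size a POSIX shared memory object */
    int fd = shm_open("/demo_region", O_CREAT | O_RDWR, 0600);
    if (fd == -1) { perror("shm_open"); return 1; }
    if (ftruncate(fd, 4096) == -1) { perror("ftruncate"); return 1; }

    char *p = static_cast<char *>(mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                                       MAP_SHARED, fd, 0));
    if (p == MAP_FAILED) { perror("mmap"); return 1; }

    /* remove the name; existing mappings stay valid until munmap() */
    shm_unlink("/demo_region");

    strcpy(p, "still usable after unlink");
    printf("%s\n", p);

    munmap(p, 4096);
    close(fd);
    return 0;
}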

Related

mmap open and read from file

I am mapping a huge file to avoid my app thrashing main virtual memory and to be able to run the app with more data than the RAM I have. The code is C++ but partly follows the old C APIs. When I work with the allocated pointer, the memory does get backed to the file as desired. However, when I run the app the next time, I want the memory to be read back from this same file, which already holds the prepared data. For some reason, on the next run I read back all zeros. What am I doing wrong? Is it the ftruncate call? Is it the fopen call with the wrong flag? Is it the mmap flags?
int64_t mmbytes = 1<<36;
FILE *file = fopen(filename, "w+");
int fd = fileno(file);
int r = ftruncate(fd, mmbytes);
if (file == NULL || r) {
    perror("Failed: ");
    throw std::runtime_error(std::strerror(errno));
} //
if ((mm = mmap(0, mmbytes,
               PROT_READ | PROT_WRITE, MAP_FILE | MAP_SHARED, fd, 0)) == MAP_FAILED)
{
    fprintf(stderr, "mmap error for output, errno %d\n", errno);
    exit(-1);
}
FILE *file = fopen(filename, "w+");
I refer you to fopen's manual page, which describes "w+" as follows:
w+     Open for reading and writing.  The file is created if it does
       not exist, otherwise it is truncated.  The stream is positioned
       at the beginning of the file.
I specifically draw your attention to the "it is truncated" part. In other words, if there's anything in an existing file this ends up nuking it from high orbit.
Depending on what else you're doing, "a+" will work better ("a" alone opens the file write-only, which a PROT_READ | PROT_WRITE, MAP_SHARED mapping will reject).
Even better would be to forget fopen entirely, and simply use open:
int fd=open(filename, O_RDWR|O_CREAT, 0666);
There's your file descriptor, without jumping through any hoops. The file gets created, and left untouched if it already exists.
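As a rough sketch of the intended "keep the data for the next run" flow (the helper name map_persistent is mine, not from the question): open without truncation, grow the file only if it is smaller than the requested size, and map it shared so writes are carried back to the file.
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>
#include <cstdio>

void *map_persistent(const char *filename, off_t bytes)
{
    int fd = open(filename, O_RDWR | O_CREAT, 0666);   /* no truncation */
    if (fd == -1) { perror("open"); return MAP_FAILED; }

    struct stat sb;
    if (fstat(fd, &sb) == -1) { perror("fstat"); close(fd); return MAP_FAILED; }

    /* grow the file if needed, but never shrink it */
    if (sb.st_size < bytes && ftruncate(fd, bytes) == -1) {
        perror("ftruncate"); close(fd); return MAP_FAILED;
    }

    void *mm = mmap(NULL, bytes, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);   /* the mapping keeps its own reference to the file */
    return mm;   /* caller checks for MAP_FAILED and later calls munmap() */
}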

c++ close a open() file read with mmap

I am working with mmap() to read big files quickly, basing my code on the answer to this question (Fast textfile reading in c++).
I am using the second version from sehe's answer:
#include <algorithm>
#include <iostream>
#include <cstring>
#include <cstdint>
#include <cstdio>
#include <cstdlib>

// for mmap:
#include <sys/mman.h>
#include <sys/stat.h>
#include <fcntl.h>

const char* map_file(const char* fname, size_t& length);

int main()
{
    size_t length;
    auto f = map_file("test.cpp", length);
    auto l = f + length;

    uintmax_t m_numLines = 0;
    while (f && f != l)
        if ((f = static_cast<const char*>(memchr(f, '\n', l - f))))
            m_numLines++, f++;

    std::cout << "m_numLines = " << m_numLines << "\n";
}
void handle_error(const char* msg) {
    perror(msg);
    exit(255);
}

const char* map_file(const char* fname, size_t& length)
{
    int fd = open(fname, O_RDONLY);
    if (fd == -1)
        handle_error("open");

    // obtain file size
    struct stat sb;
    if (fstat(fd, &sb) == -1)
        handle_error("fstat");

    length = sb.st_size;

    const char* addr = static_cast<const char*>(mmap(NULL, length, PROT_READ, MAP_PRIVATE, fd, 0u));
    if (addr == MAP_FAILED)
        handle_error("mmap");

    // TODO close fd at some point in time, call munmap(...)
    return addr;
}
and it works just great.
But if I use it in a loop over several files (I just change the main() function to:
void readFile(std::string &nomeFile) {
and then get the file content into the "f" object with:
size_t length;
auto f = map_file(nomeFile.c_str(), length);
auto l = f + length;
and call it from main() in a loop over a list of filenames), after a while I get:
open: Too many open files
I imagine there is a way to close the open() call after working on a file, but I cannot figure out how and where exactly to put it. I tried:
int fc = close(fd);
at the end of the readFile() function, but it changed nothing.
Thanks a lot in advance for any help!
EDIT:
after the important suggestions I received, I made some performance comparisons between different approaches with mmap() and std::cin(); check out fast file reading in C++, comparison of different strategies with mmap() and std::cin() results interpretation for the results.
Limit to the number of concurrently open files
As you can imagine, keeping a file open consumes resources. So there is in any case a practical limit to the number of open file descriptors on your system. This is why it's highly recommended to close files that you no longer need.
The exact limit depends on the OS and the configuration. If you want to know more, there are already a lot of answers available for this kind of question.
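If you want to check that limit from inside a program, a small sketch using the POSIX getrlimit() call looks like this:
#include <sys/resource.h>
#include <cstdio>

int main()
{
    struct rlimit rl;
    if (getrlimit(RLIMIT_NOFILE, &rl) == 0)
        printf("open-file limit: soft %llu, hard %llu\n",
               (unsigned long long)rl.rlim_cur,
               (unsigned long long)rl.rlim_max);
    return 0;
}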
Special case of mmap
Obviously, with mmap() you open a file. And doing so repeatedly in a loop risks reaching, sooner or later, the fatal file descriptor limit, as you experienced.
The idea of trying to close the file is not bad. The problem is that it does not work. This is specified in the POSIX documentation:
The mmap() function adds an extra reference to the file associated
with the file descriptor fildes which is not removed by a subsequent
close() on that file descriptor. This reference is removed when there
are no more mappings to the file.
Why? Because mmap() links the file in a special way to the virtual memory management of your system. And this file will be needed as long as you use the address range to which it was mapped.
So how do you remove those mappings? The answer is to use munmap():
The function munmap() removes any mappings for those entire pages
containing any part of the address space of the process starting at
addr and continuing for len bytes.
And of course, close() the file descriptor that you no longer need. A prudent approach would be to close it after munmap(), but in principle, at least on a POSIX-compliant system, it should not matter when you close. Nevertheless, check your OS documentation to be on the safe side :-)
Note: file mapping is also available on Windows; the documentation about closing the handles is ambiguous on potential memory leaks if there are remaining mappings. This is why I recommend prudence about the moment of closing.
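Putting the two calls together with the code from the question, one possible sketch (map_file gains an out-parameter for the descriptor; readFile and nomeFile are the asker's names, and the includes are the same as above) is:
const char* map_file(const char* fname, size_t& length, int& fd)
{
    fd = open(fname, O_RDONLY);
    if (fd == -1)
        handle_error("open");

    struct stat sb;
    if (fstat(fd, &sb) == -1)
        handle_error("fstat");
    length = sb.st_size;

    const char* addr = static_cast<const char*>(
        mmap(NULL, length, PROT_READ, MAP_PRIVATE, fd, 0));
    if (addr == MAP_FAILED)
        handle_error("mmap");
    return addr;
}

void readFile(const std::string& nomeFile)
{
    size_t length;
    int fd;
    const char* f = map_file(nomeFile.c_str(), length, fd);

    // ... process the mapped contents ...

    munmap(const_cast<char*>(f), length);   // release the mapping
    close(fd);                              // release the file descriptor
}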

File read() hangs on binary large file

I'm working on a benchmark program. Upon making the read() system call, the program appears to hang indefinitely. The target file is 1 GB of binary data and I'm attempting to read directly into buffers that can be 1, 10 or 100 MB in size.
I'm using std::vector<char> to implement dynamically-sized buffers and handing off &vec[0] to read(). I'm also calling open() with the O_DIRECT flag to bypass kernel caching.
The essential coding details are captured below:
std::string fpath{"/path/to/file"};
size_t tries{};
int fd{};

while (errno == EINTR && tries < MAX_ATTEMPTS) {
    fd = open(fpath.c_str(), O_RDONLY | O_DIRECT | O_LARGEFILE);
    tries++;
}

// Throw exception if error opening file
if (fd == -1) {
    ostringstream ss {};
    switch (errno) {
        case EACCES:
            ss << "Error accessing file " << fpath << ": Permission denied";
            break;
        case EINVAL:
            ss << "Invalid file open flags; system may also not support O_DIRECT flag, required for this benchmark";
            break;
        case ENAMETOOLONG:
            ss << "Invalid path name: Too long";
            break;
        case ENOMEM:
            ss << "Kernel error: Out of memory";
    }
    throw invalid_argument {ss.str()};
}

size_t buf_sz{1024*1024};          // 1 MiB buffer
std::vector<char> buffer(buf_sz);  // Creates vector pre-allocated with buf_sz chars (bytes)
                                   // Result is 0-filled buffer of size buf_sz
auto bytes_read = read(fd, &buffer[0], buf_sz);
Poking through the executable with gdb shows that buffers are allocated correctly, and the file I've tested with checks out in xxd. I'm using g++ 7.3.1 (with C++11 support) to compile my code on a Fedora Server 27 VM.
Why is read() hanging on large binary files?
Edit: Code example updated to more accurately reflect error checking.
There are multiple problems with your code.
This code will never work properly if errno ever has a value equal to EINTR:
while (errno == EINTR && tries < MAX_ATTEMPTS) {
fd = open(fpath.c_str(), O_RDONLY | O_DIRECT | O_LARGEFILE);
tries++;
}
That code won't stop when the file has been successfully opened. It will keep reopening the file over and over, leaking file descriptors, for as long as errno stays EINTR.
This would be better:
do
{
    fd = open(fpath.c_str(), O_RDONLY | O_DIRECT | O_LARGEFILE);
    tries++;
}
while ((-1 == fd) && (EINTR == errno) && (tries < MAX_ATTEMPTS));
Second, as noted in the comments, O_DIRECT can impose alignment restrictions on memory. You might need page-aligned memory:
So
size_t buf_sz{1024*1024}; // 1 MiB buffer
std::vector<char> buffer(buf_sz); // Creates vector pre-allocated with buf_sz chars (bytes)
// Result is 0-filled buffer of size buf_sz
auto bytes_read = read(fd, &buffer[0], buf_sz);
becomes
size_t buf_sz{1024*1024}; // 1 MiB buffer
// page-aligned buffer from an anonymous mapping (check for MAP_FAILED in real code)
char *buffer = static_cast<char *>(mmap(nullptr, buf_sz, PROT_READ | PROT_WRITE,
                                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0));
auto bytes_read = read(fd, buffer, buf_sz);
Note also that the Linux implementation of O_DIRECT can be very dodgy. It's been getting better, but there are still potential pitfalls that aren't very well documented at all. Along with alignment restrictions, if the last amount of data in the file isn't a full page, for example, you may not be able to read it if the filesystem's implementation of direct IO doesn't allow you to read anything but full pages (or some other block size). Likewise for write() calls: you may not be able to write just any number of bytes, you might be constrained to something like a 4k page.
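If you would rather keep an ordinary heap allocation than use an anonymous mapping, a sketch using posix_memalign() could look like the following (the 4096-byte alignment and the file path are assumptions; the required alignment is filesystem-dependent):
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <fcntl.h>
#include <unistd.h>

int main()
{
    constexpr size_t kAlign = 4096;        // assumed block size; verify for your filesystem
    constexpr size_t kBufSz = 1024 * 1024; // 1 MiB, a multiple of kAlign

    int fd = open("/path/to/file", O_RDONLY | O_DIRECT);   // placeholder path
    if (fd == -1) { perror("open"); return 1; }

    void *buf = nullptr;
    int rc = posix_memalign(&buf, kAlign, kBufSz);          // aligned heap buffer
    if (rc != 0) { fprintf(stderr, "posix_memalign: %s\n", strerror(rc)); close(fd); return 1; }

    ssize_t bytes_read = read(fd, buf, kBufSz);
    if (bytes_read == -1) perror("read");

    free(buf);
    close(fd);
    return 0;
}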
This is also critical:
Most examples of read() hanging appear to be when using pipes or non-standard I/O devices (e.g., serial). Disk I/O, not so much.
Some devices simply do not support direct IO. They should return an error, but again, the O_DIRECT implementation on Linux can be very hit-or-miss.
Pasting your program and running it on my Linux system, I get a working, non-hanging program.
The most likely cause of the failure is that the file is not a regular file-system item, or that a hardware element involved is not working.
Try with a smaller size to confirm, and try on a different machine to help diagnose the problem.
My complete code (with no error checking):
#include <vector>
#include <string>
#include <unistd.h>
#include <stdio.h>
#include <fcntl.h>

int main(int argc, char **argv)
{
    std::string fpath{"myfile.txt"};
    auto fd = open(fpath.c_str(), O_RDONLY | O_DIRECT | O_LARGEFILE);

    size_t buf_sz{1024*1024};          // 1 MiB buffer
    std::vector<char> buffer(buf_sz);  // Creates vector pre-allocated with buf_sz chars (bytes)
                                       // Result is 0-filled buffer of size buf_sz
    auto bytes_read = read(fd, &buffer[0], buf_sz);
}
myfile.txt was created with
dd if=/dev/zero of=myfile.txt bs=1024 count=1024
If the file is not 1Mb in size, it may fail.
If the file is a pipe, it can block until the data is available.
Most examples of read() hanging appear to be when using pipes or non-standard I/O devices (e.g., serial). Disk I/O, not so much.
The O_DIRECT flag is useful for filesystems and block devices. With this flag people normally map pages into user space.
For sockets, pipes and serial devices it is plainly useless, because the kernel does not cache that data.
Your updated code hangs because fd is initialized to 0, which is STDIN_FILENO; the open loop never runs, so the file is never opened and the read() then blocks on stdin.
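A sketch of the combined fix, reusing the names from the question (fpath, tries, MAX_ATTEMPTS): start fd at -1 so it can never be mistaken for stdin, and retry only while open() actually failed with EINTR.
std::string fpath{"/path/to/file"};   // placeholder path
size_t tries{};
int fd{-1};                           // -1 = "not opened yet"; never a valid descriptor

do {
    errno = 0;
    fd = open(fpath.c_str(), O_RDONLY | O_DIRECT | O_LARGEFILE);
    tries++;
} while (fd == -1 && errno == EINTR && tries < MAX_ATTEMPTS);

if (fd == -1) {
    // report the error from errno and bail out instead of calling read() on fd 0
}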

Shared Memory on Linux without ftruncate & physical files?

When I want to map some shared memory in linux, I do:
hFileMap = open(MapName, O_RDWR | O_CREAT, 438);
pData = mmap(NULL, Size, PROT_READ | PROT_WRITE, MAP_FILE | MAP_SHARED, hFileMap, 0);
and it works just fine. It maps the memory properly. However, three things arise that I don't like.
It creates a physical file on the disc. I'd have to remove this file manually or by using the remove() function. I like that on Windows there is no physical file unless I map a file physically myself. I'd like to do the same on Linux.
I have to use ftruncate to set the length of the file. Otherwise memcpy will segfault when copying data into the file. I mean, it doesn't make much sense for the file to have 0 space when I had to specify the size to mmap in the first place.
The size is fixed. I don't need it to resize, so there should be no need for ftruncate?
Is there any way at all to map memory without a physical file and still have other processes be able to access it? What would be the disadvantages of such a solution?
I don't really care too much about the ftruncate, but is there a way to also remove that call? It just bothers me a tiny bit that I have to do this when I don't have to on Windows.
shm_open will still create a file in the file system to represent the shared memory object.
You can call mmap with MAP_ANONYMOUS and MAP_SHARED, and it will not create any files. However, the other processes must be children of the current process, and the mmap must be set up before fork is called.
If that won't work then shm_open is your best bet.
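A minimal sketch of that anonymous-mapping approach: the mapping is created before fork(), so parent and child share the same page without any file being created (the 4096-byte size is arbitrary).
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cstdio>
#include <cstring>

int main()
{
    /* shared, anonymous mapping: no file is created anywhere */
    char *shared = static_cast<char *>(mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                                            MAP_SHARED | MAP_ANONYMOUS, -1, 0));
    if (shared == MAP_FAILED) { perror("mmap"); return 1; }

    pid_t pid = fork();
    if (pid == 0) {                  /* child writes into the shared page */
        strcpy(shared, "hello from the child");
        _exit(0);
    }

    waitpid(pid, NULL, 0);           /* parent sees the child's write */
    printf("%s\n", shared);

    munmap(shared, 4096);
    return 0;
}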
You can use shm_open to create a shared memory region. The following code segment demonstrates the use of shm_open() to create a shared memory object which is then sized using ftruncate() before being mapped into the process address space using mmap():
#include <unistd.h>
#include <sys/mman.h>
...

#define MAX_LEN 10000

struct region {        /* Defines "structure" of shared memory */
    int len;
    char buf[MAX_LEN];
};

struct region *rptr;
int fd;

/* Create shared memory object and set its size */
fd = shm_open("/myregion", O_CREAT | O_RDWR, S_IRUSR | S_IWUSR);
if (fd == -1)
    /* Handle error */;

if (ftruncate(fd, sizeof(struct region)) == -1)
    /* Handle error */;

/* Map shared memory object */
rptr = mmap(NULL, sizeof(struct region),
            PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
if (rptr == MAP_FAILED)
    /* Handle error */;

/* Now we can refer to mapped region using fields of rptr;
   for example, rptr->len */
...
ftruncate is there to set the size of the file. So if you don't want to call it, you can manually write zero-valued bytes to fill the file up to the desired size.

Creating a file of arbitrary size using Windows C++ API

I would like to create a file of arbitrary size using the Windows C/C++ API. I am using Windows XP Service Pack 2 with a 32-bit virtual address space. I am familiar with CreateFile.
However, CreateFile does not have a size argument. The reason I want to pass in a size argument is to allow me to create memory-mapped files which let the user access data structures of a predetermined size. Could you please advise me of the proper Windows C/C++ API function that would allow me to create a file of arbitrary predetermined size? Thank you.
You CreateFile as usual, SetFilePointerEx to the desired size and then call SetEndOfFile.
To do this on UNIX, seek to (RequiredFileSize - 1) and then write a byte. The value of the byte can be anything, but zero is the obvious choice.
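A sketch of that Windows sequence (the path, the size, and the helper name create_file_of_size are placeholders of mine):
#include <windows.h>

// Create (or overwrite) a file and set its length to the requested size.
bool create_file_of_size(const wchar_t *path, LONGLONG bytes)
{
    HANDLE h = CreateFileW(path, GENERIC_READ | GENERIC_WRITE, 0, NULL,
                           CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);
    if (h == INVALID_HANDLE_VALUE)
        return false;

    LARGE_INTEGER size;
    size.QuadPart = bytes;

    // Move the file pointer to the desired size, then declare that the end of the file.
    BOOL ok = SetFilePointerEx(h, size, NULL, FILE_BEGIN) && SetEndOfFile(h);

    CloseHandle(h);
    return ok != 0;
}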
You don't need a file; you can use the pagefile as the backing for your memory-mapped file. From the MSDN CreateFileMapping function page:
If hFile is INVALID_HANDLE_VALUE, the calling process must also specify a size for the file mapping object in the dwMaximumSizeHigh and dwMaximumSizeLow parameters. In this scenario, CreateFileMapping creates a file mapping object of a specified size that is backed by the system paging file instead of by a file in the file system.
You can still share the mapping object by use of DuplicateHandle.
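A sketch of a pagefile-backed mapping along those lines (the mapping name Local\MyMapping and the 1 MiB size are made up for the example):
#include <windows.h>

int main()
{
    const DWORD size = 1 << 20;   // 1 MiB, backed by the system paging file

    // INVALID_HANDLE_VALUE means: no file on disk, the paging file backs the mapping.
    HANDLE hMap = CreateFileMappingW(INVALID_HANDLE_VALUE, NULL, PAGE_READWRITE,
                                     0, size, L"Local\\MyMapping");
    if (hMap == NULL)
        return 1;

    void *pData = MapViewOfFile(hMap, FILE_MAP_ALL_ACCESS, 0, 0, size);
    if (pData == NULL) {
        CloseHandle(hMap);
        return 1;
    }

    // ... use pData; another process can attach with
    // OpenFileMappingW(FILE_MAP_ALL_ACCESS, FALSE, L"Local\\MyMapping") ...

    UnmapViewOfFile(pData);
    CloseHandle(hMap);
    return 0;
}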
According to your comments, you actually need a cross-platform solution, so check the Boost.Interprocess library; it provides cross-platform shared memory facilities and more.
To do this on Linux, you can do the following:
/**
 * Clear the umask permissions so we
 * have full control of the file creation (see man umask on Linux)
 */
mode_t origMask = umask(0);
int fd = open("/tmp/file_name", O_RDWR | O_CREAT, 00666);
umask(origMask);

if (fd < 0)
{
    perror("open fd failed");
    return;
}

if (ftruncate(fd, size) == 0)
{
    int result = lseek(fd, size - 1, SEEK_SET);
    if (result == -1)
    {
        perror("lseek fd failed");
        close(fd);
        return;
    }

    /* Something needs to be written at the end of the file to
     * have the file actually have the new size.
     * Just writing an empty string at the current file position will do.
     *
     * Note:
     * - The current position in the file is at the end of the stretched
     *   file due to the call to lseek().
     * - An empty string is actually a single '\0' character, so a zero-byte
     *   will be written at the last byte of the file.
     */
    result = write(fd, "", 1);
    if (result != 1)
    {
        perror("write fd failed");
        close(fd);
        return;
    }
}