I need to create a shared memory segment which contains some secret data. I use the shmget and shmat functions to access the segment, created with 0600 permissions. I want to share this segment of memory with forked processes only. I created another application which tried to access this segment and it was unsuccessful, so it looks like it's working as I want.
But when I run the application which created the segment again, it can access the segment. How is that possible? Is it a good idea to store secret data in shared memory?
You can mmap() a shared and anonymous memory region in the parent process by providing the MAP_SHARED and MAP_ANONYMOUS flags. That memory will be accessible only to that process and its children. As the memory segment is anonymous, no other process will be able to refer to it, let alone access or map it:
void *shared_mem = mmap(NULL, n_bytes, PROT_READ|PROT_WRITE, MAP_SHARED|MAP_ANONYMOUS, -1, 0);
The parent process should create the shared memory segment using mmap() before forking. That memory segment is inherited by any child process created by fork(), and a child can simply use the shared_mem pointer inherited from the parent to refer to it:
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

int main()
{
    void *shared_mem = mmap(NULL, 4096, PROT_READ|PROT_WRITE,
                            MAP_SHARED|MAP_ANONYMOUS, -1, 0);
    if (shared_mem == MAP_FAILED)
        return 1;

    pid_t pid = fork();
    if (pid > 0) {
        // parent
        // use shared_mem here
    } else if (pid == 0) {
        // child
        // use shared_mem here
    } else {
        // fork failed
    }
    return 0;
}
The shared segment doesn't belong to a process; it belongs to a user. Effectively, setting 0600 only allows that user (and root) read/write access; however, any other process running as this user will have the same access.
Create a specific user, to be "used" (logged in) only for this purpose.
Is it a good idea to have secret data in a shared memory segment?
Think of the segment as a file - maybe a bit less easy to access (you need to know IPC) - except it will disappear when the system shuts down.
Is it a good idea to store secrets in a file? Maybe not if the data is clear text.
In a file or in a shared memory segment, encrypting the data would be an improvement.
See this page for detailed explanations on how you can control a shmem segment.
OTOH, if all you need is for a process to exchange information with its children, look at pipes between processes (sketched below). In this case the secret data is stored within the processes' heap/stack memory, and is more difficult to reach for external processes owned by the same user. But the user "owning" the process may also read a process's memory (via a core dump, for instance) and search for the secret data. Much less easy, but still possible.
Note that in this case, if the secret data is available in the parent process before fork() is performed, children will automatically inherit it.
Again, anyway, think encryption.
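For illustration, a minimal sketch of the pipe approach, assuming the parent holds the secret before fork() and the child reads it once (the names secret and buf are made up for the example):

#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    int fds[2];                      /* fds[0]: read end, fds[1]: write end */
    if (pipe(fds) == -1)
        return 1;

    const char secret[] = "hypothetical secret";
    pid_t pid = fork();
    if (pid == 0) {
        /* child: read the secret from the pipe */
        char buf[64] = {0};
        close(fds[1]);               /* child only reads */
        read(fds[0], buf, sizeof(buf) - 1);
        close(fds[0]);
        /* use buf, then wipe it */
        memset(buf, 0, sizeof(buf));
        _exit(0);
    }
    /* parent: write the secret, then close both ends */
    close(fds[0]);
    write(fds[1], secret, sizeof(secret));
    close(fds[1]);
    waitpid(pid, NULL, 0);
    return 0;
}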
I am implementing a reader of huge compressed raster files. Decompression is performed partially, on the fly: only requested regions of the raster are decompressed and stored in a memory cache. The reader works similarly to memory-mapping a file, but the data is not mapped to memory 1:1; it is decompressed.
It is implemented using anonymous memory mapping:
char* raster_cache = static_cast<char*>(mmap(0, UNCOMPRESSED_RASTER_SIZE, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0));
Reading an area which is not cached yet raises a segmentation violation signal, which is caught and handled using libsigsegv (see my previous question):
struct CacheHandlerData
{
    std::mutex mutex;
    // other data needed for decompression
};

int cache_sigsegv_handler(void* fault_address, void* user_data)
{
    void* page_address = reinterpret_cast<void*>(
        reinterpret_cast<uintptr_t>(fault_address) & ~(PAGE_SIZE - 1));
    CacheHandlerData* data = static_cast<CacheHandlerData*>(user_data);
    std::lock_guard<std::mutex> lock(data->mutex);

    unsigned char cached = 0;
    mincore(page_address, 1, &cached);
    if (!cached)
    {
        mprotect(page_address, PAGE_SIZE, PROT_WRITE);
        // decompress whole page
        mprotect(page_address, PAGE_SIZE, PROT_READ);
    }
    return 1;
}
The problem is that cached pages stay in memory forever. Because I write to the pages, they are marked as dirty and never invalidated.
QUESTION: Is there some possibility to mark pages as not dirty?
If the system is running out of memory, the pages should then be removed from memory, similarly to a normal disk cache. It would also be necessary to call mprotect(page_address, PAGE_SIZE, PROT_NONE) for the removed pages in order to cause a segmentation violation when such a page is accessed again.
Thank you.
EDIT: I could use a temporary-file-backed mapping instead of an anonymous one. Pages would be swapped to disk in case the system is out of memory. But this solution loses the benefits of using compressed data (smaller disk footprint, probably faster reading).
I read a file into a vector, as in:
int readBytes(string filename, vector<uint32_t> &v)
{
    // open the file, fstat it to get filesize, etc.
    uint32_t *filebuf = (uint32_t*)mmap(0, filesize, PROT_READ,
                                        MAP_FILE | MAP_PRIVATE,
                                        fhand, 0);
    v = std::vector<uint32_t>(filebuf, filebuf + numrecords);
    munmap(filebuf, filesize);
    return 0;  // success
}
in main() I have two successive calls (purely as a test):
vector<uint32_t> v(10000);
readBytes(filename, v);
readBytes(filename, v);
// ...
The second call almost always gives a faster clock time:
Profile time [1st call]: 0.000214141 sec
Profile time [2nd call]: 0.000094109 sec
A look at the system calls indicates the memory chunks are different:
mmap(NULL, 40000, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7fe843ac8000
mmap(NULL, 40000, PROT_READ, MAP_PRIVATE, 4, 0) = 0x7fe843ac7000
Why is the second call faster? Coincidence? What, if anything, is cached?
Assuming you're talking about something *NIX-ish, there's probably a page cache, whose job is precisely to cache this sort of data to get this speedup. Unless something else came along between calls to evict those pages from the cache, they'll still be there.
So, the first call potentially has to:
allocate pages
map the pages into your process address space
copy the data from those pages into your vector (possibly faulting the data from disk as it goes)
The second call probably finds the pages still in the cache, and only has to:
map the pages into your process address space
copy the data from those pages into your vector (they're pre-faulted this time, so it's a simple memory operation)
In fact, I've skipped a step: the open/fstat step in your comment is probably also accelerated, via the inode cache.
Remember that your program sees virtual memory. There is a mapping table ("page tables") that maps virtual addresses seen by your program to the real physical memory. And the OS will ensure that the two mmap() calls map two different virtual addresses seen by your program to the same physical memory. So the data only has to be loaded from disk once.
More detail:
First mmap(): OS just records the mapping
When you actually try to read the data: A "page fault" happens, since the data isn't in memory. The OS catches that, reads data from disk to its disk cache, and updates the page tables so that your program can read directly from that disk cache, then it resumes your program automatically.
First munmap(): OS disables the mapping, and updates your page tables so you can't read the file any more. Note that the file is still in the OS's disk cache.
Second mmap(): OS just records the mapping
When you actually try to read the data: A "page fault" happens, since the data isn't mapped. The OS catches that, notices that the data is already in its disk cache, and updates the page tables so that your program can read directly from that disk cache, then it resumes your program automatically.
Second munmap(): OS disables the mapping, and updates your page tables so you can't read the file any more. Note that the file is still in the OS's disk cache.
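If you want to see this yourself, here is a minimal sketch (my own illustration, not part of the answers above) that uses mincore() to check whether the first page of the file is already resident in the page cache before it is touched; the file name data.bin is hypothetical:

#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void)
{
    int fd = open("data.bin", O_RDONLY);    /* hypothetical test file */
    struct stat st;
    if (fd == -1 || fstat(fd, &st) == -1)
        return 1;

    for (int pass = 1; pass <= 2; pass++) {
        char *p = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
        unsigned char vec = 0;
        mincore(p, 1, &vec);                /* is the first page in core? */
        printf("pass %d: first page %s resident before any access\n",
               pass, (vec & 1) ? "already" : "not");
        volatile char c = p[0];             /* fault the page in */
        (void)c;
        munmap(p, st.st_size);              /* the page cache keeps the data */
    }
    close(fd);
    return 0;
}

On the second pass the page is typically reported resident immediately, because munmap() removed only the mapping, not the cached data.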
I need to know how many instances of my process are running on a local Windows system. I need to be able to do it using C++/MFC/WinAPIs. So what is a reliable method to do this?
I was thinking of using process IDs for that, stored as a list in a shared memory array that can be accessed by the processes. But the question is: when a process is closed or crashes, how soon will its process ID be reused?
The process and thread identifiers may be reused any time after closure of all handles. See When does a process ID become available for reuse? for more information on this.
However, if you store a pair of { identifier, process start time } you can resolve these ambiguities and detect identifier reuse. You can create a named file mapping to share information between the processes, and use IPC to synchronize access to this shared data.
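As a rough sketch (my illustration, not from the linked answer), the start-time half of that pair can be obtained with GetProcessTimes():

#include <windows.h>
#include <stdio.h>

int main(void)
{
    // Build the { id, creation time } pair for the current process.
    FILETIME creation, exitTime, kernel, user;
    if (GetProcessTimes(GetCurrentProcess(), &creation, &exitTime, &kernel, &user))
    {
        printf("pid=%lu creation=%08lx%08lx\n",
               GetCurrentProcessId(),
               creation.dwHighDateTime, creation.dwLowDateTime);
    }
    return 0;
}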
You can snag the process handles by the name of the process using the method described in this question. It's called process walking. That'll be more reliable than process IDs or file paths.
A variation of this answer is what you're looking for. Just loop through the processes with Process32Next, and look for processes with the same name using MatchProcessName. Unlike the example in the link I provided, you'll be looking to count or build a list of the processes with the same name, but that's a trivial addition; a rough sketch follows below.
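For reference, a minimal sketch of such a loop using the Tool Help API; the MatchProcessName helper from the linked answer is approximated here with a plain case-insensitive comparison, and myprocess.exe is a placeholder:

#include <windows.h>
#include <tlhelp32.h>
#include <stdio.h>
#include <wchar.h>

// Count running processes whose executable name matches `name`.
int CountProcessesByName(const wchar_t* name)
{
    int count = 0;
    HANDLE snap = CreateToolhelp32Snapshot(TH32CS_SNAPPROCESS, 0);
    if (snap == INVALID_HANDLE_VALUE)
        return -1;

    PROCESSENTRY32W pe;
    pe.dwSize = sizeof(pe);
    if (Process32FirstW(snap, &pe))
    {
        do
        {
            if (_wcsicmp(pe.szExeFile, name) == 0)
                count++;
        } while (Process32NextW(snap, &pe));
    }
    CloseHandle(snap);
    return count;
}

int main(void)
{
    printf("instances: %d\n", CountProcessesByName(L"myprocess.exe"));
    return 0;
}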
If you are trying to limit the number of instances of your process to some number you can use a Semaphore.
You can read in detail here:
http://msdn.microsoft.com/en-us/library/windows/desktop/ms686946(v=vs.85).aspx
In a nutshell, the semaphore is initialized with a current count and max count. Each instance of your process decrements the count when it acquires the semaphore. When another process tries to acquire it but the count has reached zero, that process will fail to acquire it and can terminate or take appropriate action.
The following code should give you the gist of what you have to do:
#include <windows.h>
#include <stdio.h>

// maximum number of instances of your process
#define MAX_INSTANCES 10
// name shared by all your processes. See http://msdn.microsoft.com/en-us/library/windows/desktop/aa382954(v=vs.85).aspx
#define SEMAPHORE_NAME "Global\\MyProcess"
// access rights for semaphore, see http://msdn.microsoft.com/en-us/library/windows/desktop/ms686670(v=vs.85).aspx
#define MY_SEMAPHORE_ACCESS SEMAPHORE_ALL_ACCESS

int main( void )
{
    HANDLE semaphore;

    // Create a semaphore with initial and max counts of MAX_INSTANCES
    semaphore = CreateSemaphore(
        NULL,           // default security attributes
        MAX_INSTANCES,  // initial count
        MAX_INSTANCES,  // maximum count
        SEMAPHORE_NAME );
    if (semaphore == NULL)
    {
        semaphore = OpenSemaphore(
            MY_SEMAPHORE_ACCESS,
            FALSE,      // don't inherit the handle for child processes
            SEMAPHORE_NAME );
        if (semaphore == NULL)
        {
            printf("Error creating/opening semaphore: %lu\n", GetLastError());
            return 1;
        }
    }

    // acquire semaphore and decrement count
    DWORD acquireResult = WaitForSingleObject(
        semaphore,
        0L );           // give up immediately if the count is zero
    if (acquireResult == WAIT_TIMEOUT)
    {
        printf("Too many processes have the semaphore. Exiting.");
        CloseHandle(semaphore);
        return 1;
    }

    // do your application's business here

    // now that you're done, release the semaphore
    LONG prevCount = 0;
    BOOL releaseResult = ReleaseSemaphore(
        semaphore,
        1,              // increment count by 1
        &prevCount );
    if (!releaseResult)
    {
        printf("Error releasing semaphore");
        CloseHandle(semaphore);
        return 1;
    }

    printf("Semaphore released, prev count is %ld", prevCount);
    CloseHandle(semaphore);
    return 0;
}
Well, your solution is not very reliable. PIDs can be reused by the OS at any later time.
I did it once by going through all the processes and comparing their command-line string (the path of the executable) with the one for my process. It works pretty well.
Extra care should be taken for programs that are started via batch files (like some Java apps/servers).
Other solutions involve IPC, maybe through named pipes, sockets, shared memory (as you mentioned). But none of them are that easy to implement and maintain.
Is there a way for a forked child to examine another forked child, so that if the other forked child takes more time than usual to perform its chores, the first child may perform predefined steps?
If so, sample code will be greatly appreciated.
Yes. Simply fork the process to be watched, from the process to watch it.
if (fork() == 0) {
    // we are the watcher
    pid_t watchee_pid = fork();
    if (watchee_pid != 0) {
        // poll the watched process and handle a timeout here;
        // with WNOHANG, waitpid() returns immediately instead of blocking
        int status;
        waitpid(watchee_pid, &status, WNOHANG);
    } else {
        // we're being watched. do stuff
    }
} else {
    // original process
}
To emphasise: There are 3 processes. The original, the watcher process (that handles timeout etc.) and the actual watched process.
To do this, you'll need to use some form of IPC, and a named shared memory segment makes perfect sense here. Your first child could read a value in a named segment which the other child will set once it has completed its work. Your first child could set a timeout and, once that timeout expires, check the value; if the value is not set, then do what you need to do.
The code can vary greatly depending on whether you use C or C++; you need to choose. With C++ you can use boost::interprocess for this, which has lots of examples of shared memory usage. With C you'll have to put this together using native calls for your OS; again, this should be fairly straightforward. Start at shmget(); a rough sketch follows below.
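A minimal C sketch of that idea, with error checking omitted; here the parent forks both children, so an anonymous System V segment inherited across fork() stands in for the named segment, and the flag name done is made up:

#include <stdio.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    /* segment holding a single "done" flag, inherited by both children */
    int shmid = shmget(IPC_PRIVATE, sizeof(int), IPC_CREAT | 0600);
    volatile int *done = (int *)shmat(shmid, NULL, 0);
    *done = 0;

    if (fork() == 0) {              /* the watched child */
        sleep(2);                   /* pretend to do some work */
        *done = 1;                  /* signal completion */
        _exit(0);
    }
    if (fork() == 0) {              /* the watching child */
        sleep(5);                   /* hypothetical timeout */
        if (!*done)
            fprintf(stderr, "watched child timed out\n"); /* predefined steps */
        _exit(0);
    }

    wait(NULL);
    wait(NULL);
    shmdt((void *)done);
    shmctl(shmid, IPC_RMID, NULL);  /* mark the segment for destruction */
    return 0;
}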
Here is some skeleton code that could help you solve the problem in a Linux environment:
pid_t pid = fork();
if (pid == -1) {
    printf("fork: %s", strerror(errno));
    exit(1);
} else if (pid > 0) {
    /* parent process */
    int i = 0;
    int secs = 60; /* 60 secs for the process to finish */
    while (1) {
        /* check if the process with that pid still exists */
        if (exist(pid) && i > secs) {
            /* do something accordingly */
        }
        sleep(1);
        i++;
    }
} else {
    /* child process */
    /* child logic here */
    exit(0);
}
... those 60 seconds are not very strict. You could use a timer if you want stricter timing, but if your system doesn't need critical real-time processing it should be just fine like this.
exist(pid) refers to a function you should write that looks into /proc/pid, where pid is the process ID of the child process.
Optionally, you can implement exist(pid) using other libraries designed to extract information from the /proc directory, like procps. A possible implementation is sketched below.
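For example, a minimal exist() could simply stat the /proc entry (a sketch; calling kill(pid, 0) would be another common approach):

#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Returns 1 if a process with this pid has an entry under /proc, else 0. */
int exist(pid_t pid)
{
    char path[64];
    struct stat st;
    snprintf(path, sizeof(path), "/proc/%d", (int)pid);
    return stat(path, &st) == 0;
}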
The only processes you can wait on are your own direct child processes: not siblings, not your parent, not grandchildren, and so on. Depending on your program's needs, Matt's solution may work for you. If not, here are some other alternatives:
Forget about waiting and use another form of IPC. For robustness, it needs to be something where unexpected termination of the process you're waiting on results in your receiving an event. The best one I can think of is opening a pipe which both processes share, and giving the writing end of the pipe to the process you want to wait for (make sure no other processes keep the writing end open!). When the process holding the writing end terminates, it will be closed, and the reading end will then indicate EOF (read will block on it until the writing end is closed, then return a zero-length read); see the sketch after this list.
Forget about IPC and use threads. One advantage of threads is that the atomicity of a "process" is preserved. It's impossible for individual threads to be killed or otherwise terminate outside of the control of your program, so you don't have to worry about race conditions with process ids and shared resource allocation in the system-global namespace (IPC objects, filenames, sockets, etc.). All synchronization primitives exist purely within your process's address space.
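A minimal sketch of the pipe-EOF technique from the first alternative (error handling omitted; here the waiter is the parent, but any process holding the read end works the same way):

#include <stdio.h>
#include <unistd.h>

int main(void)
{
    int fds[2];
    pipe(fds);                        /* fds[0]: read end, fds[1]: write end */

    if (fork() == 0) {
        /* watched process: holds the write end but never writes to it */
        close(fds[0]);
        sleep(3);                     /* work normally, or crash; either way... */
        _exit(0);                     /* ...the write end is closed on exit */
    }

    /* watcher: close our copy of the write end, or EOF never arrives */
    close(fds[1]);
    char c;
    if (read(fds[0], &c, 1) == 0)     /* blocks until the write end is closed */
        printf("watched process terminated\n");
    close(fds[0]);
    return 0;
}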
I have a Linux System V IPC shared memory segment that is populated by one process and read by many others. All the processes use an interface to the shared memory segment, in the form of a class which takes care of looking up, attaching to, and detaching from the segment as part of its constructor/destructor methods.
The problem here is that from time to time I'm seeing that the segment has "split". What I mean here is that looking in the "ipcs -m -s" output I see that I've got two segments listed: one which has been marked for destruction but still has some processes attached to it, and a second which appears to get all new attempts to attach to the segment. However, I'm never actually asking the kernel to destroy the segment. What's happening here?!
One other thing to note is that, unfortunately, the system this is running on is seriously overcommitted in the memory department. There is 1 GB of physical memory, no swap, and the Committed_AS in /proc/meminfo is reporting about 2.5 GB of committed memory. Fortunately the system processes are not actually using this much memory; they're just asking for it (I still have about 660 MB "free" memory as reported by vmstat). While I know this is far from ideal, for the time being there is nothing I can do about the overcommitted memory. However, browsing the kernel/libc source, I don't see anything in there that would mark a shared memory segment for deletion for any reason other than a user request (but perhaps I've missed it hidden in there somewhere).
For reference here's the shared memory interface class' constructor:
const char* shm_ftok_pathname = "/usr/bin";
int shm_ftok_proj_id = 21;

// creates a key from a file path so different processes will get the same key
key_t m_shm_key = ftok(shm_ftok_pathname, shm_ftok_proj_id);
if ( m_shm_key == -1 )
{
    fprintf(stderr,"Couldn't get the key for the shared memory\n%s\n",strerror(errno));
    exit ( status );
}

m_shm_id = shmget(m_shm_key, sizeof(shm_data_s), (IPC_CREAT | 0666));
if (m_shm_id < 0)
{
    fprintf(stderr,"Couldn't get the shared memory ID\nerrno = %s \n",strerror(errno));
    exit ( status );
}

// get a ptr to shared memory, which is a shared mem struct
// second arg of 0 says let OS choose the shm address
m_shm_data_ptr = (shm_data_s *)shmat(m_shm_id, 0, 0);
if ( m_shm_data_ptr == (shm_data_s *)-1 )  // shmat returns (void *)-1 on error
{
    fprintf(stderr,"Couldn't get the shared memory pointer\n");
    exit ( status );
}
And here's my uname output:
Linux 2.6.18-5-686 #1 SMP Fri Jun 1 00:47:00 UTC 2007 i686 GNU/Linux
My first guess is that you probably are calling shmctl(..., IPC_RMID, ...) somewhere.
Can you show the shared memory interface class' destructor?
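For comparison, a destructor for such a class would normally only detach; a hedged sketch, with the class name ShmInterface made up and the member names taken from the constructor in the question:

// Detach from the segment; do NOT destroy it here.
ShmInterface::~ShmInterface()
{
    if (shmdt(m_shm_data_ptr) == -1)
        fprintf(stderr, "Couldn't detach the shared memory\n%s\n", strerror(errno));
    // shmctl(m_shm_id, IPC_RMID, 0) is the only call that marks the
    // segment for destruction; make sure it appears nowhere unintended.
}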
The only reason for the kernel to mark the segment for deletion is an explicit user call. You could try strace (or truss on Solaris) to find out whether there is a user call to the function mentioned in the first answer above (shmctl with IPC_RMID).
Raman Chalotra