dynamically monitor program status - c++

In Unix systems, it is possible to dynamically monitor the system by reading data from /proc. I am hoping to implement this kind of monitoring in my application by dynamically saving its "current status" to a file. However, I do not want I/O to delay my program, so it would be best to make the file virtual, i.e. not stored on disk but kept in memory. Is there a way of doing that? Thanks for the hint!

Why not use shared memory and semaphores? Do a 'man shmget' as a starting point.
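In case it helps, here is a minimal sketch of that approach (untested; the key, the struct layout, and the update logic are invented for illustration): the monitored program creates a System V segment and updates a small fixed-size status struct in it, and a monitor attaches to the same key and reads it. Real code would add a semaphore (see 'man semget') or atomics to keep reads consistent.

    // Hypothetical status layout published by the program being monitored.
    #include <sys/ipc.h>
    #include <sys/shm.h>
    #include <cstring>
    #include <ctime>

    struct Status {
        time_t last_update;
        long   items_processed;
        char   phase[64];
    };

    int main() {
        key_t key = ftok("/tmp", 'S');                      // any agreed-upon key
        int shmid = shmget(key, sizeof(Status), IPC_CREAT | 0666);
        if (shmid == -1) return 1;

        Status* st = static_cast<Status*>(shmat(shmid, nullptr, 0));
        if (st == reinterpret_cast<Status*>(-1)) return 1;

        // Inside the main loop, updating the status is just a memory write:
        st->last_update = time(nullptr);
        st->items_processed = 42;
        std::strncpy(st->phase, "crunching", sizeof(st->phase) - 1);

        shmdt(st);                                          // detach on shutdown
        return 0;
    }

A monitor process calls shmget() with the same key and shmat() with SHM_RDONLY, then simply reads the struct.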

As an alternative, you could make your application a socket server. That way it responds with status information only when asked (there is not even a need to keep updating a memory area with the current status), and you can also control your program from a remote machine. If the status itself is not a huge amount of data, I think this is the most flexible solution.
If you also make your application respond to HTTP requests (I don't mean handling everything the HTTP protocol allows, just the requests you want to support), then you can avoid writing a client at all, and if you want to write one anyway it's probably easier to find libraries and programmers able to do that.
Make it listen on port 80 and you can check your program over the internet, getting through firewalls without effort :-) (well... assuming the program itself can be reached from the internet, but even that is a simple and common thing to ask sysadmins for).
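As a rough sketch of the socket-server idea (not the poster's code; the port number and the status callback are placeholders), the program can run something like this in a background thread and build the status string only when a client actually connects:

    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <sys/socket.h>
    #include <unistd.h>
    #include <string>

    // current_status is a hypothetical function supplied by the application.
    void serve_status(std::string (*current_status)()) {
        int srv = socket(AF_INET, SOCK_STREAM, 0);
        int yes = 1;
        setsockopt(srv, SOL_SOCKET, SO_REUSEADDR, &yes, sizeof(yes));

        sockaddr_in addr{};
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_ANY);
        addr.sin_port = htons(9090);                   // arbitrary port
        bind(srv, reinterpret_cast<sockaddr*>(&addr), sizeof(addr));
        listen(srv, 4);

        for (;;) {
            int client = accept(srv, nullptr, nullptr);
            if (client < 0) continue;
            std::string reply = current_status();      // built only when asked
            write(client, reply.data(), reply.size());
            close(client);
        }
    }

You can then check the status with something as simple as 'nc localhost 9090', and turning the reply into a minimal HTTP response is only a few extra header lines.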

Try FUSE. It is designed precisely for writing virtual file systems, and there are already many filesystems built on top of it.

I have no idea about your exact requirements, so I can only guess, but under Linux every file placed in /dev/shm lives in RAM. That doesn't mean it avoids I/O, just that the I/O is faster. If you don't want to do I/O via file descriptors or similar, do as someone else suggested and use shared memory segments, but those are a bit harder for everyone else to read. Having other programs just open and read a file which then calls some functions in your program (which is what /proc does, in kernel space) is not possible. A filesystem socket or FIFO might also suit your needs (e.g. if you have a select/(e)poll routine anyway). If you have full control over the system, tmpfs might also be useful for you.
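If the plain-file route is enough, a small sketch of the /dev/shm idea looks like this (the path is just an example); writing to a temporary file and then rename()-ing it into place means readers never see a half-written status:

    #include <cstdio>
    #include <string>

    bool publish_status(const std::string& status) {
        const char* tmp = "/dev/shm/myapp.status.tmp";
        const char* dst = "/dev/shm/myapp.status";

        std::FILE* f = std::fopen(tmp, "w");
        if (!f) return false;
        std::fputs(status.c_str(), f);
        std::fclose(f);

        // rename() is atomic within a filesystem, so a reader doing
        // "cat /dev/shm/myapp.status" always sees a complete snapshot.
        return std::rename(tmp, dst) == 0;
    }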

Related

How to monitor processes on linux

When an executable runs on Linux, it creates processes, threads, I/O, etc., and uses libraries from languages like C/C++; sometimes there are timers involved as well. Is it possible to monitor all of this? How can I get a deep dive into this software and these processes and see what is going on in the background?
I know this stuff is abstracted from me because I shouldn't be worrying about it as a regular user, but I'm curious what I would see.
What I need to see are:
System calls for this process/thread.
Open/closed sockets.
Memory management and utilization, what block is being accessed.
Memory instructions.
If a process is depending on the results of another one.
If a process/thread terminates, why, and was it successful?
I/O operations and DB read/write if any.
The different things you wanted to monitor may require different tools. All tools I will mention below have extensive manual pages where you can find exactly how to use them.
System calls for this process/thread.
The strace command does exactly this: it lists the system calls invoked by your program. The ltrace tool is similar, but focuses on calls to library functions, not just system calls (which involve the kernel).
Open/closed sockets.
The strace/ltrace commands will list among other things socket creation, but if you want to know which sockets are open - connected, listening, and so on - right now, there is the netstat utility, which lists all the connected (or with "-a", also listening) sockets in the system, and which process they belong to.
Memory management and utilization, what block is being accessed.
Memory instructions.
Again, ltrace will let you see all malloc()/free() calls, but to see exactly what memory is being accessed and where, you'll need a debugger like gdb. The thing is, almost everything your program does is a "memory instruction", so you need to know exactly what you are looking for, with breakpoints, tracepoints, single-stepping, and so on; you usually don't want to see every memory access in your program.
If you aren't trying to trace all memory accesses but rather are searching for bugs in this area - like accessing memory after it's freed and so on - there are tools that help you find those more easily. One of them, ASan ("AddressSanitizer"), is built into the C++ compiler, so you can build with it enabled and get messages on bad access patterns. Another one you can use is valgrind.
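For a feel of what AddressSanitizer reports, here is a deliberately buggy toy program (compile with something like g++ -fsanitize=address -g bug.cpp); running it makes ASan abort with a heap-use-after-free report pointing at the offending line:

    #include <iostream>

    int main() {
        int* p = new int[4];
        delete[] p;
        std::cout << p[0] << "\n";   // heap-use-after-free: ASan reports this access
        return 0;
    }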
Finally, if by "memory utilization" you meant to just check how much memory your process or thread is using, well, both ps and top can tell you that.
If a process is depending on the results of another one.
If a process/thread terminates, why, and was it successful?
Various tools I mentioned like strace/ltrace will let you know when the process they follow exits. Any process can print the exit code of one of its sub-processes, but I'm not aware of a tool which can print the exit status of all processes in the system.
I/O operations
There is iostat, which can give you periodic summaries of how much I/O was done to each disk. netstat -s gives you network statistics, so you can see how many network operations were done. vmstat gives you, among other things, statistics on I/O caused by swap in/out (in case swapping is an issue for you).
and DB read/write if any.
This depends on your DB, I guess, and how you monitor it.

High performance ways to stream local files as they're being written to network

Today a system exists that will write packet-capture files to the local disk as they come in. Dropping these files to local disk as the first step is deemed desirable for fault-tolerance reasons. If a client dies and needs to reconnect or be brought up somewhere else, we enjoy the ability to replay from the disk.
The next step in the data pipeline is trying to get this data that was landed to disk out to remote clients. Assuming sufficient disk space, it strikes me as very convenient to use the local disk (and the page-cache on top of it) as a persistent boundless-FIFO. It is also desirable to use the file system to keep the coupling between the producer and consumer low.
In my research, I have not found a lot of guidance on this type of architecture. More specifically, I have not seen well-established patterns in popular open-source libraries/frameworks for reading the file as it is being written to stream out.
My questions:
Is there a flaw in this architecture that I am not noting or indirectly downplaying?
Are there recommendations for consuming a file as it is being written, and efficiently blocking and/or asynchronously being notified when more data is available in the file?
A goal would be to either explicitly or implicitly have the consumer benefit from page-cache warmth. Are there any recommendations on how to optimize for this?
The file-based solution sounds clunky but could work, similar to how tail -f does it:
read the file until EOF, but do not close it
set up an inode watch (with inotify) and wait for more writes
repeat
The difficulty is usually with file rotation and cleanup, i.e. you need to watch for new files and/or truncation.
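A rough C++ sketch of that tail -f style loop (error handling, rotation handling, and the actual network forwarding are omitted; the path is just an example):

    #include <sys/inotify.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <cstdio>

    int main() {
        const char* path = "/var/capture/current.pcap";   // illustrative path
        int fd = open(path, O_RDONLY);
        if (fd < 0) return 1;

        int in_fd = inotify_init();
        inotify_add_watch(in_fd, path, IN_MODIFY);

        char buf[64 * 1024];
        char events[4096];
        for (;;) {
            ssize_t n = read(fd, buf, sizeof(buf));
            if (n > 0) {
                fwrite(buf, 1, n, stdout);                 // stand-in for sending to the client
            } else if (n == 0) {
                read(in_fd, events, sizeof(events));       // at EOF: block until more writes
            } else {
                break;                                     // read error
            }
        }
        close(in_fd);
        close(fd);
        return 0;
    }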
Having said that, it might be more efficient to connect to the packet-capture interface directly, or to set up a queue to which clients can subscribe.

Linux non-persistent backing store for mmap()

First, a little motivating background info: I've got a C++-based server process that runs on an embedded ARM/Linux-based computer. It works pretty well, but as part of its operation it creates a fairly large fixed-size array (e.g. dozens to hundreds of megabytes) of temporary/non-persistent state information, which it currently keeps on the heap, and it accesses and/or updates that data from time to time.
I'm investigating how far I can scale things up, and one problem I'm running into is that eventually (as I stress-test the server by making its configuration larger and larger), this data structure gets big enough to cause out-of-memory problems, and then the OOM killer shows up, and general unhappiness ensues. Note that this embedded configuration of Linux doesn't have swap enabled, and I can't (easily) enable a swap partition.
One idea I have on how to ameliorate the issue is to allocate this large array on the computer's local flash partition, instead of directly in RAM, and then use mmap() to make it appear to the server process like it's still in RAM. That would reduce RAM usage considerably, and my hope is that Linux's filesystem-cache would mask most of the resulting performance cost.
My only real concern is file management -- in particular, I'd like to avoid any chance of filling up the flash drive with "orphan" backing-store files (i.e. old files whose processes don't exist any longer, but the file is still present because its creating process crashed or by some other mistake forgot to delete it on exit). I'd also like to be able to run multiple instances of the server simultaneously on the same computer, without the instances interfering with each other.
My question is, does Linux have any built-in facility for handling this sort of use case? I'm particularly imagining some way to flag a file (or an mmap() handle or similar) so that when the process that created the file exits or crashes, the OS automagically deletes the file (similar to the way Linux already automagically recovers all of the RAM that was allocated by a process when the process exits or crashes).
Or, if Linux doesn't have any built-in auto-temp-file-cleanup feature, is there a "best practice" that people use to ensure that large temporary files don't end up filling up a drive due to unintentionally becoming persistent?
Note that AFAICT simply placing the file in /tmp won't help me, since /tmp is using a RAM-disk and therefore doesn't give me any RAM-usage advantage over simply allocating in-process heap storage.
Yes, and I do this all the time...
open the file, unlink it, use ftruncate or (better) posix_fallocate to make it the right size, then use mmap with MAP_SHARED to map it into your address space. You can then close the descriptor immediately if you want; the memory mapping itself will keep the file around.
For speed, you might find you want to help Linux manage its page cache. You can use posix_madvise with POSIX_MADV_WILLNEED to advise the kernel to page data in and POSIX_MADV_DONTNEED to advise the kernel to release the pages.
You might find that the last one does not work the way you want, especially for dirty pages. You can use sync_file_range to explicitly control flushing to disk. (Although in that case you will want to keep the file descriptor open.)
All of this is perfectly standard POSIX except for the Linux-specific sync_file_range.
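A sketch of that recipe (the directory is just an example; mkstemp() is an addition here so multiple server instances get distinct backing files):

    #include <sys/mman.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <cstdlib>
    #include <cstddef>

    void* create_backing_store(std::size_t bytes) {
        char path[] = "/mnt/flash/myserver-XXXXXX";   // somewhere on the flash partition
        int fd = mkstemp(path);                       // unique file per instance
        if (fd < 0) return nullptr;
        unlink(path);                                 // storage is reclaimed once the mapping
                                                      // and descriptor are gone, even on a crash
        if (posix_fallocate(fd, 0, bytes) != 0) { close(fd); return nullptr; }

        void* p = mmap(nullptr, bytes, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        close(fd);                                    // the mapping keeps the file alive
        return p == MAP_FAILED ? nullptr : p;
    }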
Yes. You create/open the file, then you remove() the file by its filename.
The file will still be open in your process and you can read/write it just like any opened file, and it will disappear when the process that has it open exits.
I believe this behavior is mandated by POSIX, so it will work on any Unix-like system. Even after a hard reboot, the space will be reclaimed.
I believe this is filesystem-specific, but most Linux filesystems allow deletion of open files. The file will still exist until the last handle to it is closed. I would recommend that you open the file then delete it immediately and it will be automatically cleaned up when your process exits for any reason.
For further details, see this post: What happens to an open file handle on Linux if the pointed-to file gets moved or deleted

How a process can broadcast data locally

I am looking for an existing way of broadcasting data locally (like IPC, but in an unconnected way).
The need:
I currently have a computation program that has no HMI (and won't have one), and I would like this program to send information about its progress so that another program can display it (for example in an HMI). But if no other program is "listening", the computation must not be interrupted. I would also like to embed as little logic as possible in the computation program.
I have found things about IPC, but it seems to work only in a client-server configuration.
So I have identified that my need is to find a way of broadcasting the data, and clients may or may not listen to this broadcast.
How can I do this?
EDIT:
I would like either a very light solution (like a standalone set of .h files, not more than 5) or even a way of doing it myself: as I said, IPC seems OK but it works in a connected way.
For example, 0MQ (http://zguide.zeromq.org/page:all#Getting-the-Message-Out) does exactly what I need, but embeds too much functionality.
You can try the MPI library for this purpose.
Have a look at this
For now, shared memory (on UNIX) seems to do the job.
Several points remain that I have not investigated yet:
compatibility between OSes (it's C++ and I would like it to be buildable on any platform without having to change the code)
sharing complex objects whose size is undetermined at compilation time; dynamic size might make it really complicated to have something efficient
So I am still open to, and waiting for, a better solution.
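For what it's worth, a minimal sketch of the shared-memory route described above, using POSIX shm_open (the segment name and the fields of the progress struct are invented, and this does not address the dynamic-size concern):

    #include <sys/mman.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <atomic>

    struct Progress {
        std::atomic<int> percent;
        char             step[64];
    };

    Progress* open_progress() {
        int fd = shm_open("/mycomputation", O_CREAT | O_RDWR, 0644);
        if (fd < 0) return nullptr;
        if (ftruncate(fd, sizeof(Progress)) != 0) { close(fd); return nullptr; }
        void* p = mmap(nullptr, sizeof(Progress),
                       PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        close(fd);
        return p == MAP_FAILED ? nullptr : static_cast<Progress*>(p);
    }

The computation writes its progress into the struct whenever it likes; a viewer (if any is running) does the same shm_open()/mmap() read-only and displays the fields, so nothing blocks when nobody is listening.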

Fastest technique to pass messages between processes on Linux?

What is the fastest technology to send messages between C++ application processes, on Linux? I am vaguely aware that the following techniques are on the table:
TCP
UDP
Sockets
Pipes
Named pipes
Memory-mapped files
Are there any more ways, and which is the fastest?
Whilst all the above answers are very good, I think we'd have to discuss what "fastest" means [and does it have to be "fastest", or just fast enough for the purpose at hand?]
For LARGE messages, there is no doubt that shared memory is a very good technique, and very useful in many ways.
However, if the messages are small, there are drawbacks of having to come up with your own message-passing protocol and method of informing the other process that there is a message.
Pipes and named pipes are much easier to use in this case - they behave pretty much like a file, you just write data at the sending side, and read the data at the receiving side. If the sender writes something, the receiver side automatically wakes up. If the pipe is full, the sending side gets blocked. If there is no more data from the sender, the receiving side is automatically blocked. Which means that this can be implemented in fairly few lines of code with a pretty good guarantee that it will work at all times, every time.
Shared memory on the other hand relies on some other mechanism to inform the other thread that "you have a packet of data to process". Yes, it's very fast if you have LARGE packets of data to copy - but I would be surprised if there is a huge difference to a pipe, really. Main benefit would be that the other side doesn't have to copy the data out of the shared memory - but it also relies on there being enough memory to hold all "in flight" messages, or the sender having the ability to hold back things.
I'm not saying "don't use shared memory", I'm just saying that there is no such thing as "one solution that solves all problems 'best'".
To clarify: I would start by implementing a simple method using a pipe or named pipe [depending on which suits the purpose], and measure the performance of that. If significant time is spent actually copying the data, then I would consider using other methods.
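As a starting point for that kind of measurement, a minimal named-pipe producer might look like this (the FIFO path and message are arbitrary; the consumer opens the same path O_RDONLY and read()s):

    #include <sys/types.h>
    #include <sys/stat.h>
    #include <fcntl.h>
    #include <unistd.h>

    int main() {
        const char* path = "/tmp/ipc_demo_fifo";
        mkfifo(path, 0666);                  // ignore EEXIST in real code

        int fd = open(path, O_WRONLY);       // blocks until a reader opens the FIFO
        if (fd < 0) return 1;

        const char msg[] = "hello from the producer";
        write(fd, msg, sizeof(msg));
        close(fd);
        return 0;
    }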
Of course, another consideration should be "are we ever going to use two separate machines [or two virtual machines on the same system] to solve this problem?" In that case, a network solution is a better choice - even if it's not THE fastest. I've run a local TCP stack on my machines at work for benchmark purposes and got some 20-30 Gbit/s (2-3 GB/s) with sustained traffic. A raw memcpy within the same process gets around 50-100 GBit/s (5-10 GB/s) (unless the block size is REALLY tiny and fits in the L1 cache). I haven't measured a standard pipe, but I expect that's somewhere roughly in the middle of those two numbers. [These numbers are about right for a number of different medium-sized, fairly modern PCs - obviously, on an ARM, MIPS, or other embedded-style controller, expect a lower number for all of these methods.]
I would suggest looking at this also: How to use shared memory with Linux in C.
Basically, I'd drop network protocols such as TCP and UDP when doing IPC on a single machine. These have packet overhead and are bound to even more resources (e.g. ports, the loopback interface).
NetOS Systems Research Group from Cambridge University, UK has done some (open-source) IPC benchmarks.
Source code is located at https://github.com/avsm/ipc-bench .
Project page: http://www.cl.cam.ac.uk/research/srg/netos/projects/ipc-bench/ .
Results: http://www.cl.cam.ac.uk/research/srg/netos/projects/ipc-bench/results.html
This research has been published using the results above: http://anil.recoil.org/papers/drafts/2012-usenix-ipc-draft1.pdf
Check CMA and kdbus:
https://lwn.net/Articles/466304/
I think the fastest stuff these days is based on AIO.
http://www.kegel.com/c10k.html
As you tagged this question with C++, I'd recommend Boost.Interprocess:
Shared memory is the fastest interprocess communication mechanism. The operating system maps a memory segment in the address space of several processes, so that several processes can read and write in that memory segment without calling operating system functions. However, we need some kind of synchronization between processes that read and write shared memory.
Source
One caveat I've found is the portability limitation of the synchronization primitives. Neither OS X nor Windows has a native implementation of interprocess condition variables, for example, so the library emulates them with spin locks.
Now if you use a *nix which supports POSIX process-shared primitives, there will be no problems.
Shared memory with synchronization is a good approach when considerable data is involved.
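A condensed sketch along the lines of the Boost.Interprocess documentation (the segment name and size are arbitrary): the writer creates a named segment and fills it, and a reader would open the same name with open_only.

    #include <boost/interprocess/shared_memory_object.hpp>
    #include <boost/interprocess/mapped_region.hpp>
    #include <cstring>

    namespace bip = boost::interprocess;

    int main() {
        bip::shared_memory_object::remove("ipc_demo");    // clean up any leftover segment
        bip::shared_memory_object shm(bip::create_only, "ipc_demo", bip::read_write);
        shm.truncate(4096);                                // size of the segment

        bip::mapped_region region(shm, bip::read_write);
        std::memcpy(region.get_address(), "hello", 6);     // write into the segment

        bip::shared_memory_object::remove("ipc_demo");     // remove the name when the writer shuts down
        return 0;
    }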
Well, you could simply have a shared memory segment between your processes, using Linux shared memory, aka SHM.
It's quite easy to use, look at the link for some examples.
POSIX message queues are pretty fast, but they have some limitations.
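For completeness, a minimal send-side sketch (the queue name and attributes are arbitrary; link with -lrt on older glibc, and the system-wide limits live under /proc/sys/fs/mqueue):

    #include <mqueue.h>
    #include <fcntl.h>

    int main() {
        mq_attr attr{};
        attr.mq_maxmsg  = 10;      // queue depth limit
        attr.mq_msgsize = 256;     // per-message size limit

        mqd_t q = mq_open("/ipc_demo_queue", O_CREAT | O_WRONLY, 0644, &attr);
        if (q == (mqd_t)-1) return 1;

        const char msg[] = "hello";
        mq_send(q, msg, sizeof(msg), 0);   // last argument is the message priority

        mq_close(q);
        // A receiver does mq_open(..., O_RDONLY) and mq_receive() into a buffer
        // of at least mq_msgsize bytes.
        return 0;
    }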