How to perform cache operations in C++? - c++

I want to run my C++ program after flushing the cache, Before running my program I do not know what is there in the cache. Is there some other manner in C++ on Ubuntu via which I may flush my cache before running my program.
EDIT: The motive for flushing the cache is... that each time I run my program I do not want my present data structures to be there in the cache... I mean I want a cold cache... whereby all the accesses are made from the disk.
One way of achieving this is to restart the computer... but considering the number of experiments that I have to run, this is not feasible for me. So, can anyone be kind enough to guide me as to how I can achieve this.

You have no need to flush the cache from your user-mode (non-kernel-mode) program. The OS (Linux, in the case of ubuntu) provides your application with a fresh virtual address space, with no "leftover stuff" from other programs. Without executing special OS system calls, your program can't even get to memory that's used for other applications. So from a cache perspective, your application start from a clean slate, as far as it's concerned. There are cacheflush() system calls (syntax differs by OS), but unless you're doing something out-of-the-ordinary for typical user-mode applications, you can just forget that the cache even exists; it's just there to speed up your program, and the OS manages it via the CPU's MMU, your app does not need to manage it.
You may have also heard about "memory leaks" (memory allocated to your application that your application forgets to free/delete, which is "lost forever" once your application forgets about it). If you're writing a (potentially) long-running program, leaked memory is definitely a concern. But leaked memory is only an issue for the application that leaks it; in modern virtual-memory environments, if application A leaks memory, it doesn't affect application B. And when application A exits, the OS clears out its virtual address space, and any leaked memory is at that point reclaimed by the system and no longer consumes any system resources whatsoever. In many cases, programmers specifically choose to NOT free/delete a memory allocation, knowing that the OS will automatically reclaim the entire amount of the memory when the application exits. There's nothing wrong with that strategy, as long as the program doesn't keep doing that on a repeating basis, exhausting its virtual address space.

This is a common question.
Firstly you have to understand that the caches are never really empty, just like a register is never really empty, it's always there, and it always has a value. The phrase "Flushing the cache" actually refers to writing the cache contents to memory, also called a memory barrier.
see https://en.wikipedia.org/wiki/Memory_barrier
This is not your problem, and so you are using the wrong terminology.
What you really want is to fill the cache with the wrong values. This is harder than it sounds, because you are fighting all the optimisations that normally are your friend. Memcpy'ing a large block of memory (several MB - given the size of todays caches) should normally work though.
However...
You also have file caches and other things that will give your application an unfair advantages. This can be a very complex subject, and is a small project in it's own right.

Related

Is memory allocated even when program ends? [duplicate]

I've run into memory leaks many times. Usually when I'm malloc-ing like there's no tomorrow, or dangling FILE *s like dirty laundry. I generally assume (read: hope desperately) that all memory is cleaned up at least when the program terminates. Are there any situations where leaked memory won't be collected when the program terminates, or crashes?
If the answer varies widely from language-to-language, then let's focus on C(++).
Please note hyperbolic usage of the phrase, 'like there's no tomorrow', and 'dangling ... like dirty laundry'. Unsafe* malloc*ing can hurt the ones you love. Also, please use caution with dirty laundry.
No. Operating systems free all resources held by processes when they exit.
This applies to all resources the operating system maintains: memory, open files, network connections, window handles...
That said, if the program is running on an embedded system without an operating system, or with a very simple or buggy operating system, the memory might be unusable until a reboot. But if you were in that situation you probably wouldn't be asking this question.
The operating system may take a long time to free certain resources. For example the TCP port that a network server uses to accept connections may take minutes to become free, even if properly closed by the program. A networked program may also hold remote resources such as database objects. The remote system should free those resources when the network connection is lost, but it may take even longer than the local operating system.
The C Standard does not specify that memory allocated by malloc is released when the program terminates. This done by the operating system and not all OSes (usually these are in the embedded world) release the memory when the program terminates.
As all the answers have covered most aspects of your question w.r.t. modern OSes, but historically, there is one that is worth mentioning if you have ever programmed in the DOS world. Terminant and Stay Resident (TSR) programs would usually return control to the system but would reside in memory which could be revived by a software / hardware interrupt. It was normal to see messages like "out of memory! try unloading some of your TSRs" when working on these OSes.
So technically the program terminates, but because it still resides on memory, any memory leak would not be released unless you unload the program.
So you can consider this to be another case apart from OSes not reclaiming memory either because it's buggy or because the embedded OS is designed to do so.
I remember one more example. Customer Information Control System (CICS), a transaction server which runs primarily on IBM mainframes is pseudo-conversational. When executed, it processes the user entered data, generates another set of data for the user, transferring to the user terminal node and terminates. On activating the attention key, it again revives to process another set of data. Because the way it behaves, technically again, the OS won't reclaim memory from the terminated CICS Programs, unless you recycle the CICS transaction server.
Like the others have said, most operating systems will reclaim allocated memory upon process termination (and probably other resources like network sockets, file handles, etc).
Having said that, the memory may not be the only thing you need to worry about when dealing with new/delete (instead of raw malloc/free). The memory that's allocated in new may get reclaimed, but things that may be done in the destructors of the objects will not happen. Perhaps the destructor of some class writes a sentinel value into a file upon destruction. If the process just terminates, the file handle may get flushed and the memory reclaimed, but that sentinel value wouldn't get written.
Moral of the story, always clean up after yourself. Don't let things dangle. Don't rely on the OS cleaning up after you. Clean up after yourself.
This is more likely to depend on operating system than language. Ultimately any program in any language will get it's memory from the operating system.
I've never heard of an operating system that doesn't recycle memory when a program exits/crashes. So if your program has an upper bound on the memory it needs to allocate, then just allocating and never freeing is perfectly reasonable.
If the program is ever turned into a dynamic component ("plugin") that is loaded into another program's address space, it will be troublesome, even on an operating system with tidy memory management. We don't even have to think about the code being ported to less capable systems.
On the other hand, releasing all memory can impact the performance of a program's cleanup.
One program I was working on, a certain test case required 30 seconds or more for the program to exit, because it was recursing through the graph of all dynamic memory and releasing it piece by piece.
A reasonable solution is to have the capability there and cover it with test cases, but turn it off in production code so the application quits fast.
All operating systems deserving the title will clean up the mess your process made after termination. But there are always unforeseen events, what if it was denied access somehow and some poor programmer did not foresee the possibility and so it doesn't try again a bit later?
Always safer to just clean up yourself IF memory leaks are mission critical - otherwise not really worth the effort IMO if that effort is costly.
Edit:
You do need to clean up memory leaks if they are in place where they will accumulate, like in loops. The memory leaks I speak of are ones that build up in constant time throughout the course of the program, if you have a leak of any other sort it will most likely be a serious problem sooner or later.
In technical terms if your leaks are of memory 'complexity' O(1) they are fine in most cases, O(logn) already unpleasant (and in some cases fatal) and O(N)+ intolerable.
Shared memory on POSIX compliant systems persists until shm_unlink is called or the system is rebooted.
If you have interprocess communication, this can lead to other processes never completing and consuming resources depending on the protocol.
To give an example, I was once experimenting with printing to a PDF printer in Java when I terminated the JVM in the middle of a printer job, the PDF spooling process remained active, and I had to kill it in the task manager before I could retry printing.

Allocating memory that can be freed by the OS if needed

I'm writing a program that generates thumbnails for every page in a large document. For performance reasons I would like to keep the thumbnails in memory for as long as possible, but I would like the OS to be able to reclaim that memory if it decides there is another more important use for it (e.g. the user has started running a different application.)
I can always regenerate the thumbnail later if the memory has gone away.
Is there any cross-platform method for flagging memory as can-be-removed-if-needed? The program is written in C++.
EDIT: Just to clarify, rather than being notified when memory is low or regularly monitoring the system's amount of memory, I'm thinking more along the lines of allocating memory and then "unlocking" it when it's not in use. The OS can then steal unlocked memory if needed (even for disk buffers if it thinks that would be a better use of the memory) and all I have to do as a programmer is just "lock" the memory again before I intend to use it. If the lock fails I know the memory has been reused for something else so I need to regenerate the thumbnail again, and if the lock succeeds I can just keep using the data from before.
The reason is I might be displaying maybe 20 pages of a document on the screen, but I may as well keep thumbnails of the other 200 or so pages in case the user scrolls around a bit. But if they go do something else for a while, that memory might be better used as a disk cache or for storing web pages or something, so I'd like to be able to tell the OS that it can reuse some of my memory if it wants to.
Having to monitor the amount of free system-wide memory may not achieve the goal (my memory will never be reclaimed to improve disk caching), and getting low-memory notifications will only help in emergencies. I was hoping that by having a lock/unlock method, this could be achieved in more of a lightweight way and benefit the performance of the system in a non-emergency situation.
Is there any cross-platform method for flagging memory as can-be-removed-if-needed? The program is written in C++
For Windows, at least, you can register for a memory resource notification.
HANDLE WINAPI CreateMemoryResourceNotification(
_In_ MEMORY_RESOURCE_NOTIFICATION_TYPE NotificationType
);
NotificationType
LowMemoryResourceNotification Available physical memory is running low.
HighMemoryResourceNotification Available physical memory is high.
Just be careful responding to both events. You might create a feedback loop (memory is low, release the thumbnails! and then memory is high, make all the thumbnails!).
In AIX, there is a signal SIGDANGER that is send to applications when available memory is low. You may handle this signal and free some memory.
There is a discussion among Linux people to implement this feature into Linux. But AFAIK it is not yet implemented in Linux. Maybe they think that application should not care about low level memory management, and it could be transparently handled in OS via swapping.
In posix standard there is a function posix_madvise might be used to mark an area of memory as less important. There is an advice POSIX_MADV_DONTNEED specifies that the application expects that it will not access the specified range in the near future.
But unfortunately, current Linux implementation will immediately free the memory range when posix_madvise is called with this advice.
So there's no portable solution to your question.
However, on almost every OS you are able to read the current available memory via some OS interface. So you can routinely read such value and manually free memory when available memory in OS is low.
There's nothing special you need to do. The OS will remove things from memory if they haven't been used recently automatically. Some OSes have platform-specific ways to improve this, but generally, nothing special is needed.
This question is very similar and has answers that cover things not covered here.
Allocating "temporary" memory (in Linux)
This shouldn't be too hard to do because this is exactly what the page cache does, using unused memory to cache the hard disk. In theory, someone could write a filesystem such that when you read from a certain file, it calculated something, and the page cache would cache it automatically.
All the basics of automatically freed cache space are already there in any OS with a disk cache, and It's hard to imagine there not being an API for something that would make a huge difference especially in things like mobile web browsers.

How far can memory leaks go?

I've run into memory leaks many times. Usually when I'm malloc-ing like there's no tomorrow, or dangling FILE *s like dirty laundry. I generally assume (read: hope desperately) that all memory is cleaned up at least when the program terminates. Are there any situations where leaked memory won't be collected when the program terminates, or crashes?
If the answer varies widely from language-to-language, then let's focus on C(++).
Please note hyperbolic usage of the phrase, 'like there's no tomorrow', and 'dangling ... like dirty laundry'. Unsafe* malloc*ing can hurt the ones you love. Also, please use caution with dirty laundry.
No. Operating systems free all resources held by processes when they exit.
This applies to all resources the operating system maintains: memory, open files, network connections, window handles...
That said, if the program is running on an embedded system without an operating system, or with a very simple or buggy operating system, the memory might be unusable until a reboot. But if you were in that situation you probably wouldn't be asking this question.
The operating system may take a long time to free certain resources. For example the TCP port that a network server uses to accept connections may take minutes to become free, even if properly closed by the program. A networked program may also hold remote resources such as database objects. The remote system should free those resources when the network connection is lost, but it may take even longer than the local operating system.
The C Standard does not specify that memory allocated by malloc is released when the program terminates. This done by the operating system and not all OSes (usually these are in the embedded world) release the memory when the program terminates.
As all the answers have covered most aspects of your question w.r.t. modern OSes, but historically, there is one that is worth mentioning if you have ever programmed in the DOS world. Terminant and Stay Resident (TSR) programs would usually return control to the system but would reside in memory which could be revived by a software / hardware interrupt. It was normal to see messages like "out of memory! try unloading some of your TSRs" when working on these OSes.
So technically the program terminates, but because it still resides on memory, any memory leak would not be released unless you unload the program.
So you can consider this to be another case apart from OSes not reclaiming memory either because it's buggy or because the embedded OS is designed to do so.
I remember one more example. Customer Information Control System (CICS), a transaction server which runs primarily on IBM mainframes is pseudo-conversational. When executed, it processes the user entered data, generates another set of data for the user, transferring to the user terminal node and terminates. On activating the attention key, it again revives to process another set of data. Because the way it behaves, technically again, the OS won't reclaim memory from the terminated CICS Programs, unless you recycle the CICS transaction server.
Like the others have said, most operating systems will reclaim allocated memory upon process termination (and probably other resources like network sockets, file handles, etc).
Having said that, the memory may not be the only thing you need to worry about when dealing with new/delete (instead of raw malloc/free). The memory that's allocated in new may get reclaimed, but things that may be done in the destructors of the objects will not happen. Perhaps the destructor of some class writes a sentinel value into a file upon destruction. If the process just terminates, the file handle may get flushed and the memory reclaimed, but that sentinel value wouldn't get written.
Moral of the story, always clean up after yourself. Don't let things dangle. Don't rely on the OS cleaning up after you. Clean up after yourself.
This is more likely to depend on operating system than language. Ultimately any program in any language will get it's memory from the operating system.
I've never heard of an operating system that doesn't recycle memory when a program exits/crashes. So if your program has an upper bound on the memory it needs to allocate, then just allocating and never freeing is perfectly reasonable.
If the program is ever turned into a dynamic component ("plugin") that is loaded into another program's address space, it will be troublesome, even on an operating system with tidy memory management. We don't even have to think about the code being ported to less capable systems.
On the other hand, releasing all memory can impact the performance of a program's cleanup.
One program I was working on, a certain test case required 30 seconds or more for the program to exit, because it was recursing through the graph of all dynamic memory and releasing it piece by piece.
A reasonable solution is to have the capability there and cover it with test cases, but turn it off in production code so the application quits fast.
All operating systems deserving the title will clean up the mess your process made after termination. But there are always unforeseen events, what if it was denied access somehow and some poor programmer did not foresee the possibility and so it doesn't try again a bit later?
Always safer to just clean up yourself IF memory leaks are mission critical - otherwise not really worth the effort IMO if that effort is costly.
Edit:
You do need to clean up memory leaks if they are in place where they will accumulate, like in loops. The memory leaks I speak of are ones that build up in constant time throughout the course of the program, if you have a leak of any other sort it will most likely be a serious problem sooner or later.
In technical terms if your leaks are of memory 'complexity' O(1) they are fine in most cases, O(logn) already unpleasant (and in some cases fatal) and O(N)+ intolerable.
Shared memory on POSIX compliant systems persists until shm_unlink is called or the system is rebooted.
If you have interprocess communication, this can lead to other processes never completing and consuming resources depending on the protocol.
To give an example, I was once experimenting with printing to a PDF printer in Java when I terminated the JVM in the middle of a printer job, the PDF spooling process remained active, and I had to kill it in the task manager before I could retry printing.

Risk of damaging your computer by altering memory in C++

I know some Java and am right now trying C++ instead and apparently in C++ you can do things like declare an int array of size 6, then change the 10th element of that array, which I'm understanding to be simply the 4th byte after the end of the section of memory that was allocated for the 6-integer array.
So my question is, if I'm careless is it possible to accidentally alter memory in my C++ program that is being used by other programs on my system? Is there an actual risk of seriously messing something up this way? I mean I know you can just restart your computer and clear the memory if you have to, but if I don't do that, there could be some lasting damage.
It depends on your system. Formally, an out of bounds access is
undefined behavior. On a modern general purpose system, each user
process has its own address space, and one process can't modify, or even
read, that of another process (barring shared memory), so unless you're
writing kernel code, you shouldn't be able to break anything outside of
your own process (and non-kernel code can't normally do physical IO, so
I don't see how anything in the hardware could break).
If you're writing kernel code, however, or working on an embedded
processor with no memory mapping or protection, you can literally
destroy hardware with an out of bounds write; if the program is
controlling something like a nuclear power plant, you can even destroy a
lot more than just the machine your code is running on.
Each process is given its own virtual address space, so naturally processes don't see each others memory. Don't forget that even a buffer overrun that is local to your program can have dire consequences - the overrun may cause the program to misbehave and do something that has lasting effect (like deleting all files for example).
This depends on what operating system and environment you are in:
Normal OS (Windows, Linux etc) userspace program: You can only mess up your own process memory. However having really bad luck this can be enough. Imagine for example that you make a call to some function that deletes files. If your memory is corrupted at the time of the call, the parameters to the function might be messed up to mean deletion of something else than you intended. As long as you keep from calling delete file routines etc. in the programs where you test memory handling, this risk is non-existent.
Normal OS, kernel space device driver: You can access system memory and the memory of the currently running process, possibly destroying everything.
Simple embedded OS without memory protection: You can access everything and destroy anything.
Legacey OS without memory protection (Win 3.X, MS-DOS): You can access everyting and destroy anything.
Every program runs in its own address space and one program cannot access (read / modify) any other programs address space (this is a memory management technique called paging).
If you try to access an address in memory that your program cannot read it will cause a segmentation or page fault and your program will crash.
In answer to your question, no permanent damage will be caused.
I'm not sure modern OS (especially win7) allows you to do that. The OS will block the buffer overrun action as you've described
Back in the day (DOS times) some viruses would try to damage hardware by directly programming video or hard drive controller, but even then it was not easy or sure thing. Modern hardware and OSs make it practicaly impossible for user level applications to damage hardware. So program away :) you wont break anything.
There is a different possibility. Having a buffer overrun might allow ill reputed people to exploit that bug and execute arbitrary code in your clients computer. I guess thats bad enough. And the most dangerous part about overrun is that you may not find it even after a seriously excessive test.

Memory usage and minimizing

We have a fairly graphical intensive application that uses the FOX toolkit and OpenSceneGraph, and of course C++. I notice that after running the application for some time, it seems there is a memory leak. However when I minimize, a substantial amount of memory appears to be freed (as witnessed in the Windows Task Manager). When the application is restored, the memory usage climbs but plateaus to an amount less than what it was before the minimize.
Is this a huge indicator that we have a nasty memory leak? Or might this be something with how Windows handles graphical applications? I'm not really sure what is going on.
What you are seeing is simply memory caching. When you call free()/delete()/delete, most implementations won't actually return this memory to the OS. They will keep it to be returned in a much faster fashion the next time you request it. When your application is minimized, they will free this memory because you won't be requesting it anytime soon.
It's unlikely that you have an actual memory leak. Task Manager is not particularly accurate, and there's a lot of behaviour that can change the apparent amount of memory that you're using- even if you released it properly. You need to get an actual memory profiler to take a look if you're still concerned.
Also, yes, Windows does a lot of things when minimizing applications. For example, if you use Direct3D, there's a device loss. There's thread timings somethings. Windows is designed to give the user the best experience in a single application at a time and may well take extra cached/buffered resources from your application to do it.
No, there effect you are seeing means that your platform releases resources when it's not visible (good thing), and that seems to clear some cached data, which is not restored after restoring the window.
Doing this may help you find memory leaks. If the minimum amount of memory (while minimized) used by the app grows over time, that would suggest a leak.
You are looking at the working set size of your program. The sum of the virtual memory pages of your program that are actually in RAM. When you minimize your main window, Windows assumes the user won't be interested in the program for a while and aggressively trims the working set. Copying the pages in RAM to the paging file and chucking them out, making room for the other process that the user is likely to start or to switch to.
This number will also go down automatically when the user starts another program that needs a lot of RAM. Windows chucks out your pages to make room for this program. It picks pages that your program hasn't used for a while, making it likely that this doesn't affect the perf of your program much.
When you switch back to your program, Windows needs to swap pages back into RAM. But this is on-demand, it only pages-in pages that your program actually uses. Which will normally be less than what it used before, no need to swap the initialization code of your program back in for example.
Needless to say perhaps, the number has absolutely nothing to do with the memory usage of your program, it is merely a statistical number.
Private bytes would be a better indicator for a memory leak. Taskmgr doesn't show that, SysInternals' ProcMon tool does. It still isn't a great indicator because that number also includes any blocks in the heap that were freed by your program and were added to the list of free blocks, ready to be re-used. There is no good way to measure actual memory in use, read the small print for the HeapWalk() API function for the kind of trouble that causes.
The memory and heap manager in Windows are far too sophisticated to draw conclusions from the available numbers. Use a leak detection tool, like the VC debug allocator (crtdbg.h).