Memory safety for encrypted, sensitive data in C++

I'm writing a server in C++ that will handle secure connections over which sensitive data will be sent.
The goal is to never save the data in unencrypted form anywhere outside memory, and to keep it in a defined region of memory (to be overwritten once it is no longer needed).
Will allocating a large chunk of memory and using it to store the sensitive data be sufficient, and ensure that there is no leakage of data?

From the manual of a tool that handles passwords:
It's also debatable whether mlock() is a proper way to protect sensitive
information. According to POSIX, mlock()-ing a page guarantees that it
is in memory (useful for realtime applications), not that it isn't
in the swap (useful for security applications). Possibly an encrypted
swap partition (or no swap partition) is a better solution.
However, Linux does guarantee that mlock()-ed pages are not written to swap, and its manpage specifically discusses the security use case. It also mentions:
But be aware that the suspend mode on laptops and some desktop computers will
save a copy of the system's RAM to disk, regardless of memory locks.

Why don't you use SELinux? Then no process can access another's resources unless you explicitly allow it.
I think if you are securing a program handling sensitive data, you should start by using a secure OS. If the OS is not secure enough, then there is nothing your application can do to fix that.
And maybe when using SELinux you don't have to do anything special in your application, making it smaller, simpler, and also more secure?

What you want is to lock a region of memory into RAM. See the manpage for mlock(2).

Locking the memory (or, on Linux, using huge pages, since these cannot be paged out) is a good start. All other considerations aside, this at least ensures that plaintext is not written to the hard disk in unpredictable ways.
Overwriting memory when it is no longer needed does not hurt, but is probably useless, because
- any pages that are reclaimed and later given to another process will be zeroed out by the operating system anyway (every modern OS does that)
- as long as some data is on a computer, you must assume that someone will be able to steal it, one way or another
- there are more exploits in the operating system and in your own code than you are aware of (this happens to the best programmers, and it happens again and again)
There are countless concerns when attempting to prevent someone from stealing sensitive data, and it is by no means an easy endeavour. Encrypting data, trying not to have any obvious exploits, and trying to avoid the most stupid mistakes is as good as you will get. Beyond that, nothing is really safe, because for every N things you plan for, there exists an (N+1)-th thing.
Take my wife's work laptop as a prime example. The intern setting up the machines at her company (at least my guess is that he's an intern) takes every possible measure and configures everything in paranoia mode to ensure that data on the computer cannot be stolen, and that working becomes as much of an ordeal as possible. What you end up with is a BitLocker-protected computer that takes 3 passwords to even boot up, on which you can do practically nothing, and a screensaver that locks the workstation every time you pick up the phone and forget to shake the mouse. At the same time, this super-secure computer has an enabled FireWire port over which anybody can read and write anything in the computer's memory without a password.

Related

Is there a way to code data directly to the hard drive (similar to how one can do with RAM)?

My question concerns C/C++. It is possible to manipulate the data on the RAM with pretty great flexibility. You can also give the GPU direct commands using OpenGL, allowing one to manipulate VRAM as well.
My curiosity is whether it is possible to do this with the hard drive (even though this would likely be a horrible idea with many, many possibilities of corrupting existing data). The logic of my question comes from an assumption that the hard drive is similar to RAM and VRAM (bytes of data), just with slower access.
I'm not asking about how to perform file IO, but instead how to directly modify bytes of memory on the hard drive (maybe via some sort of "hard-drive pointer").
If my assumption is totally off, a detailed correction about how the hard drive's data storage is different from RAM or VRAM would be very helpful. Thank you!
Modern operating systems in combination with modern CPUs offer the ability to memory-map disk clusters to memory pages.
The memory pages are initially marked as invalid, and as soon as you try to access them an invalid page "trap" or "interrupt" occurs, which is handled by the operating system, which loads the corresponding cluster into that memory page.
If you write to that page there is either a hardware-supported "dirty" bit, or another interrupt mechanism: the memory page is initially marked as read-only, so the first time you try to write to it there is another interrupt, which simply marks the page as dirty and turns it read-write. Then, you know that the page needs to be flushed to disk at a convenient time.
Note that reading and writing are usually done via Direct Memory Access (DMA), so the CPU is free to do other things while the pages are being transferred.
So, yes, you can do it, either with the help of the operating system, or by writing all that very complex code yourself.
Not for you. Being able to write directly to the hard drive would give you infinite potential to mess things up beyond all recognition. (The technical term is FUBAR, and the F doesn't stand for Mess.)
And if you write hard disk drivers, I sincerely hope you are not trying to ask for help here.

Allocating memory that can be freed by the OS if needed

I'm writing a program that generates thumbnails for every page in a large document. For performance reasons I would like to keep the thumbnails in memory for as long as possible, but I would like the OS to be able to reclaim that memory if it decides there is another more important use for it (e.g. the user has started running a different application.)
I can always regenerate the thumbnail later if the memory has gone away.
Is there any cross-platform method for flagging memory as can-be-removed-if-needed? The program is written in C++.
EDIT: Just to clarify, rather than being notified when memory is low or regularly monitoring the system's amount of memory, I'm thinking more along the lines of allocating memory and then "unlocking" it when it's not in use. The OS can then steal unlocked memory if needed (even for disk buffers if it thinks that would be a better use of the memory) and all I have to do as a programmer is just "lock" the memory again before I intend to use it. If the lock fails I know the memory has been reused for something else so I need to regenerate the thumbnail again, and if the lock succeeds I can just keep using the data from before.
The reason is I might be displaying maybe 20 pages of a document on the screen, but I may as well keep thumbnails of the other 200 or so pages in case the user scrolls around a bit. But if they go do something else for a while, that memory might be better used as a disk cache or for storing web pages or something, so I'd like to be able to tell the OS that it can reuse some of my memory if it wants to.
Having to monitor the amount of free system-wide memory may not achieve the goal (my memory will never be reclaimed to improve disk caching), and getting low-memory notifications will only help in emergencies. I was hoping that by having a lock/unlock method, this could be achieved in more of a lightweight way and benefit the performance of the system in a non-emergency situation.
Is there any cross-platform method for flagging memory as can-be-removed-if-needed? The program is written in C++
For Windows, at least, you can register for a memory resource notification.
HANDLE WINAPI CreateMemoryResourceNotification(
_In_ MEMORY_RESOURCE_NOTIFICATION_TYPE NotificationType
);
NotificationType
LowMemoryResourceNotification: available physical memory is running low.
HighMemoryResourceNotification: available physical memory is high.
Just be careful responding to both events. You might create a feedback loop (memory is low, release the thumbnails! and then memory is high, make all the thumbnails!).
In AIX, there is a signal SIGDANGER that is sent to applications when available memory is low. You may handle this signal and free some memory.
There is a discussion among Linux people about implementing this feature in Linux, but AFAIK it is not yet implemented. Maybe they think applications should not care about low-level memory management, and that it can be handled transparently by the OS via swapping.
In the POSIX standard there is a function posix_madvise that can be used to mark an area of memory as less important. The advice POSIX_MADV_DONTNEED specifies that the application expects it will not access the specified range in the near future.
But unfortunately, the Linux implementation will immediately free the memory range when posix_madvise is called with this advice.
So there's no portable solution to your question.
However, on almost every OS you can read the currently available memory via some OS interface. So you can poll that value routinely and free memory manually when available memory in the OS is low.
There's nothing special you need to do. The OS will automatically remove things from memory if they haven't been used recently. Some OSes have platform-specific ways to improve this, but generally nothing special is needed.
This question is very similar and has answers that cover things not covered here.
Allocating "temporary" memory (in Linux)
This shouldn't be too hard to do because this is exactly what the page cache does, using unused memory to cache the hard disk. In theory, someone could write a filesystem such that when you read from a certain file, it calculated something, and the page cache would cache it automatically.
All the basics of automatically freed cache space are already there in any OS with a disk cache, and it's hard to imagine there not being an API for something that would make a huge difference, especially in things like mobile web browsers.

How to keep C++ variables in RAM securely?

I'm working on a C++ application which keeps some user secret keys in RAM. These secret keys are highly sensitive and I must minimize the risk of any kind of attack against them.
I'm using a character array to store these keys. I've read some material about storing variables in CPU registers or even the CPU cache (i.e., using the C++ register keyword), but it seems there is no guaranteed way to force an application to store some of its variables outside of RAM (I mean in CPU registers or cache).
Can anybody suggest a good way to do this, or any other solution to keep these keys securely in RAM (I'm looking for an OS-independent solution)?
Your intentions may be noble, but they are also misguided. The short answer is that there's really no way to do what you want on a general purpose system (i.e. commodity processors/motherboard and general-purpose O/S). Even if you could, somehow, force things to be stored on the CPU only, it still would not really help. It would just be a small nuisance.
More generally on the issue of protecting memory, there are O/S-specific solutions to indicate that blocks of memory should not be written out to the pagefile, such as the VirtualLock function on Windows. Those are worth using if you are doing crypto and holding sensitive data in that memory.
One last thing: I will point out that it worries me that you have a fundamental misunderstanding of the register keyword and its security implications; remember, it's a hint and it won't (indeed, it cannot) force anything to actually be stored in a register or anywhere else.
Now, that, by itself, isn't a big deal, but it is a concern here because it indicates that you do not really have a good grasp on security engineering or risk analysis, which is a big problem if you are designing or implementing a real-world cryptographic solution. Frankly, your posts suggests (to me, at least) that you aren't quite ready to architect or implement such a system.
You can't eliminate the risk, but you can mitigate it.
Create a single area of static memory that will be the only place that you ever store cleartext keys. And create a single buffer of random data that you will use to xor any keys that are not stored in this one static buffer.
Whenever you read a key into memory, from a keyfile or something, you only read it directly into this one static buffer, xor with your random data and copy it out wherever you need it, and immediately clear the buffer with zeroes.
You can compare any two keys by just comparing their masked versions. You can even compare hashes of masked keys.
If you need to operate on the cleartext key, e.g. to generate a hash or validate the key somehow, load the masked xor'ed key into this one static buffer, xor it back to cleartext, and use it. Then write zeroes back into that buffer.
The operation of unmasking, operating and remasking should be quick. Don't leave the buffer sitting around unmasked for a long time.
If someone were to try a cold-boot attack, pulling the plug on the hardware, and inspecting the memory chips there would be only one buffer that could possibly hold a cleartext key, and odds are during that particular instant of the coldboot attack the buffer would be empty.
When operating on the key, you could even unmask just one word of the key at a time just before you need it to validate the key such that a complete key is never stored in that buffer.
Update: I just wanted to address some criticisms in the comments below:
The phrase "security through obscurity" is commonly misunderstood. In the formal analysis of security algorithms, "obscurity", or methods of hiding data that are not cryptographically secure, does not increase the formal security of a cryptographic algorithm. And that is true in this case. Given that keys are stored on the user's machine and must be used by that program on that machine, there is nothing that can be done to make the keys on this machine cryptographically secure. No matter what process you use to hide or lock the data, at some point the program must use it, and a determined hacker can put breakpoints in the code and watch when the program uses the data. No suggestion in this thread can eliminate that risk.
Some people have suggested that the OP find a way to use special hardware with locked memory chips or some operating system method of locking a chip. This is cryptographically no more secure. Ultimately if you have physical access to the machine a determined enough hacker could use a logic analyzer on the memory bus and recover any data. Besides the OP has stated that the target systems don't have such specialized hardware.
But this doesn't mean that there aren't things you can do to mitigate risk. Take the simplest of access keys: the password. If you have physical access to a machine, you can put in a key logger, or get memory dumps of running programs, etc. So formally the password is no more secure than if it were written in plaintext on a sticky note glued to the keyboard. Yet everyone knows keeping a password on a sticky note is a bad idea, and that it is bad practice for programs to echo passwords back to the user in plaintext, because practically speaking this dramatically lowers the bar for an attacker. Yet formally, a sticky note with a password is no less secure.
The suggestion I make above has real security advantages. None of the details matter except the xor masking of the security keys, and there are ways of making this process a little better. Xor'ing the keys will limit the number of places the programmer must consider as attack vectors. Once the keys are xor'd, you can have different keys all over your program; you can copy them, write them to a file, send them over the network, etc. None of these things will compromise your program unless the attacker has the xor buffer. So there is a SINGLE BUFFER that you have to worry about. You can then relax about every other buffer in the system. (And you can mlock or VirtualLock that one buffer.)
Once you clear out that xor buffer, you permanently and securely eliminate any possibility that an attacker can recover any keys from a memory dump of your program. You are limiting your exposure both in terms of the number of places and the times that keys can be recovered. And you are putting in place a system that allows you to work with keys easily without worrying during every operation on an object that contains keys about possible easy ways the keys can be recovered.
So you can imagine, for example, a system where keys refcount the xor buffer, and when all keys are no longer needed, you zero and delete the xor buffer and all keys become invalidated and inaccessible, without you having to track them down and worry about whether a memory page got swapped out and still holds plaintext keys.
You also don't have to literally keep around a buffer of random data. You could for example use a cryptographically secure random number generator, and use a single random seed to generate the xor buffer as needed. The only way an attacker can recover the keys is with access to the single generator seed.
You could also allocate the plaintext buffer on the stack as needed, and zero it out when done, such that it is extremely unlikely that the stack ever leaves on-chip cache. If the complete key is never decoded, but decoded one word at a time as needed, even access to the stack buffer won't reveal the key.
There is no platform-independent solution. All the threats you're addressing are platform specific and thus so are the solutions. There is no law that requires every CPU to have registers. There is no law that requires CPUs to have caches. The ability for another program to access your program's RAM, in fact the existence of other programs at all, are platform details.
You can create some functions like "allocate secure memory" (that by default calls malloc) and "free secure memory" (that by default calls memset and then free) and then use those. You may need to do other things (like lock the memory to prevent your keys from winding up in swap) on platforms where other things are needed.
Aside from the very good comments above, you have to consider that even IF you succeed in getting the key stored in registers, that register content will most likely get stored in memory when an interrupt comes in, and/or when another task gets to run on the machine. And of course, someone with physical access to the machine can run a debugger and inspect the registers. That debugger may be an "in-circuit emulator" if the key is important enough that someone will spend a few thousand dollars on such a device, which means no software on the target system at all.
The other question is, of course, how much this matters. Where do the keys originate? Is someone typing them in? If not, and they are stored somewhere else (in the code, on a server, etc.), then they will get stored in memory at some point, even if you succeed in keeping them out of memory while you actually use them. If someone is typing them in, isn't the security risk that someone, one way or another, forces the person(s) knowing the keys to reveal them?
As others have said, there is no secure way to do this on a general purpose computer. The alternative is to use a Hardware Security Module (HSM).
These provide:
greater physical protection for the keys than normal PCs/servers (protecting against direct access to RAM);
greater logical protection as they are not general purpose - no other software is running on the machine so no other processes/users have access to the RAM.
You can use the HSM's API to perform the cryptographic operations you need (assuming they are somewhat standard) without ever exposing the unencrypted key outside of the HSM.
If your platform supports POSIX, you would want to use mlock to prevent your data from being paged to the swap area. If you're writing code for Windows, you can use VirtualLock instead.
Keep in mind that there's no absolute way to protect the sensitive data from getting leaked, if you require the data to be in its unencrypted form at any point in time in the RAM (we're talking about plain ol' RAM here, nothing fancy like TrustZone). All you can do (and hope for) is to minimize the amount of time that the data remains unencrypted so that the adversary will have lesser time to act upon it.
If yours is a user-mode application and the memory you are trying to protect is from other user-mode processes, try the CryptProtectMemory API (not for persistent data).
As the other answers mentioned, you may implement a software solution, but if your program runs on a general-purpose machine and OS and the attacker has access to the machine, it will not protect your sensitive data. If your data is really very sensitive and an attacker can physically access the machine, a general software solution won't be enough.
I once saw platforms dealing with very sensitive data which had sensors to detect when someone was physically accessing the machine, and which would actively delete the data when that was the case.
You already mentioned the cold boot attack; the problem is that data in ordinary RAM can still be read minutes after shutdown.

How to optimize paging for a large in-memory database

I have an application where the entire database is implemented in memory using a stl-map for each table in the database.
Each item in the stl-map is a complex object with references to other items in the other stl-maps.
The application works with a large amount of data, so it uses more than 500 MByte RAM. Clients are able to contact the application and get a filtered version of the entire database. This is done by running through the entire database, and finding items relevant for the client.
When the application has been running for an hour or so, Windows 2003 SP2 starts to page out parts of the application's RAM (even though there is 16 GB of RAM on the machine).
After the application has been partly paged out, a client logon takes a long time (10 minutes) because it now generates a page fault for each pointer lookup in the stl-map. If the client logs on a second time right afterwards, it is fast (a few seconds) because all the memory is then back in RAM.
I can see it is possible to tell Windows to lock memory in RAM, but this is generally only recommended for device drivers, and only for "small" amounts of memory.
I guess a poor man's solution could be to loop through the entire memory database, thus telling Windows we are still interested in keeping the data model in RAM.
I guess another poor man's solution could be to disable the pagefile completely on Windows.
I guess the expensive solution would be an SQL database, and then rewriting the entire application to use a database layer. Then hopefully the database system will have implemented means for fast access.
Are there other, more elegant solutions?
This sounds like either a memory leak or a serious fragmentation problem. It seems to me that the first step would be to figure out what's causing 500 MB of data to use up 16 GB of RAM and still want more.
Edit: Windows has a working set trimmer that actively attempts to page out idle data. The basic idea is that it goes through and marks pages as being available, but leaves the data in them (and the virtual memory manager knows what data is in them). If, however, you attempt to access that memory before it's allocated to other purposes, it'll be marked as being in use again, which will normally prevent it from being paged out.
If you really think this is the source of your problem, you can indirectly control the working set trimmer by calling SetProcessWorkingSetSize. At least in my experience, this is only rarely of much use, but you may be in one of those unusual situations where it's really helpful.
As @Jerry Coffin said, it really sounds like your actual problem is a memory leak. Fix that.
But for the record, none of your "poor mans solutions" would work. At all.
Windows pages out some of your data because there's not room for it in RAM.
Looping through the entire memory database would load in every byte of the data model, yes... which would cause other parts of it to be paged out. In the end, you'd generate a lot of page faults, and the only difference in the end would be which parts of the data structure are paged out.
Disabling the page file? Yes, if you think a hard crash is better than low performance. Windows doesn't page data out because it's fun. It does that to handle situations where it would otherwise run out of memory. If you disable the pagefile, the app will just crash when it would otherwise page out data.
If your dataset really is so big it doesn't fit in memory, then I don't see why an SQL database would be especially "expensive". Unlike your current solution, databases are optimized for this purpose. They're meant to handle datasets too large to fit in memory, and to do this efficiently.
It sounds like you have a memory leak. Fixing that would be the elegant, efficient and correct solution.
If you can't do that, then either
throw more RAM at the problem (the app ends up using 16GB? Throw 32 or 64GB at it then), or
switch to a format that's optimized for efficient disk access (A SQL database probably)
We had a similar problem, and the solution we chose was to allocate everything in a shared memory block. AFAIK, Windows doesn't page this out. However, using a stl-map here is not for the faint of heart either, and was beyond what we required.
We are using Boost Shared Memory to implement this for us and it works well. Follow examples closely and you will be up and running quickly. Boost also has Boost.MultiIndex that will do a lot of what you want.
For a no-cost SQL solution, have you looked at SQLite? It has an option to run as an in-memory database.
Good luck, sounds like an interesting application.
I have an application where the entire database is implemented in memory using a stl-map for each table in the database.
That's the beginning of the end: STL's std::map is extremely memory-inefficient. The same applies to std::list. Every element is allocated separately, causing rather serious memory waste. I often use std::vector + sort() + find() instead of std::map in applications where that is possible (more searches than modifications) and where I know in advance that memory usage might become an issue.
When the application has been running for an hour or so, Windows 2003 SP2 starts to page out parts of the application's RAM (even though there is 16 GB of RAM on the machine).
Hard to tell without knowing how your application is written. Windows has a feature that evicts from RAM whatever memory of idle applications can be evicted, but that normally affects memory-mapped files and the like.
Otherwise, I would strongly suggest reading the Windows memory-management documentation. It is not very easy to understand, yet Windows offers all sorts and types of memory to applications. I never had luck with it, but a custom std::allocator might work in your application.
I can believe it is the fault of flawed pagefile behaviour; I've run my laptops mostly with the pagefile turned off since NT 4.0. In my experience, at least up to XP Pro, Windows intrusively swaps pages out just to provide the dubious benefit of a really, really slow extension to the maximum working-set space.
Ask what benefit swapping to hard disk achieves with 16 gigabytes of real RAM available. If your working set is so big as to need more than ~10 GB of virtual memory, then once swapping is actually required, processes will take anything from a bit longer to thousands of times longer to complete. On Windows, the untameable file-system cache seems to antagonise the relationship.
Now when I (very) occasionally run out of working set on my XP laptops, there is no traffic jam; the guilty app just crashes. A utility that suspends memory-hogging processes before that point and raises an alert would be nice, but there is no such thing: just an access violation, a crash, and sometimes explorer.exe goes down too.
Pagefiles: who needs 'em?
---- Edit
Given snakefoot's explanation, the problem is that memory not used for a longer period is swapped out, and therefore the data is not in memory when needed. This is the same as:
Can I tell Windows not to swap out a particular processes’ memory?
and the VirtualLock function should do the job:
http://msdn.microsoft.com/en-us/library/aa366895(VS.85).aspx
---- Previous answer
First of all you need to distinguish between memory leak and memory need problems.
If you have a memory leak, then converting the entire application to SQL would be a bigger effort than debugging it.
SQL cannot be faster than a well-designed, domain-specific in-memory database, and if you have bugs, chances are you will have different ones in an SQL version as well.
If this is a memory need problem, then you will need to switch to SQL anyway and this sounds like a good moment.

Does using SecureZeroMemory() really help to make the application more secure?

There's a SecureZeroMemory() function in WinAPI that is designed for erasing the memory used for storing passwords/encryption keys/similar stuff when the buffer is no longer needed. It differs from ZeroMemory() in that its call will not be optimized out by the compiler.
Is it really so necessary to erase the memory used for storing sensitive data? Does it really make the application more secure?
I understand that data could be written into swapfile or into hibernation file and that other processes could possibly read my program's memory. But the same could happen with the data when it is still in use. Why is use, then erase better than just use?
It does. The hibernation file is not encrypted, for example, and if you don't securely clear the memory, you might end up in trouble. That's just a single example, though. You should hold secret data in memory only as long as needed.
It exists for a reason. :)
If you keep sensitive data in memory, then other processes can potentially read it.
Of course, in your application, passwords or other secure data may not be so critical that this is required. But in some applications, it's pretty essential that malicious code can't just snoop your passwords or credit card numbers or whatever other data the application uses.
Also note that some OSes might not zero memory before giving it to an application, which means an application could simply request memory, scan it for possibly interesting content, and do something with it.
If that application only ever received zeroed memory, it would of course have a harder time finding interesting data.
SecureZeroMemory() will certainly not make your application perfectly secure. The fact that the password was already in memory is already a security hole. Using SecureZeroMemory() will definitely make it less likely that your password can be retrieved. I don't see any reason not to use it, so why not? Just remember that there are many other things you have to worry about too.
If you actually have password data or other secrets, you'll also want to make sure the memory they're in doesn't get swapped out; otherwise the swap file can become a problem (I think the function you want is VirtualLock for a Windows app). Furthermore, you'll need to detect Windows going into hibernation and wipe the data at that point too. I believe Windows sends a message to every app when it's about to hibernate.