Within an application, I've got Secret Keys uses to calculate a hash for an API call. In a .NET application it's fairly easy to use a program like Reflector to pull out information from the assembly to include these keys.
Is obfuscating the assembly a good way of securing these keys?
Probably not.
Look into cryptography and Windows' built-in information-hiding mechanisms (DPAPI and storing the keys in an ACL-restricted registry key, for example). That's as good as you're going to get for security you need to keep on the same system as your application.
If you are looking for a way to stop someone physically sitting at the machine from getting your information, forget it. If someone is determined, and has unrestricted access to a computer that is not under your control, there is no way to be 100% certain that the data is protected under all circumstances. Someone who is determined will get at it if they want to.
I wouldn't think so, as obfuscating (as I understand it at least) will simply mess around with the method names to make it hard (but not impossible) to understand the code. This won't change the data of the actual key (which I'm guessing you have stored in a constant somewhere).
If you just want to make it somewhat harder to see, you could run a simple cipher on the plaintext (like ROT-13 or something) so that it's at least not stored in the clear in the code itself. But that's certainly not going to stop any determined hacker from accessing your key. A stronger encryption method won't help because you'd still need to store the key for THAT in the code, and there's nothing protecting that.
The only really secure thing I can think of is to keep the key outside of the application somehow, and then restrict access to the key. For instance, you could keep the key in a separate file and then protected the file with an OS-level user-based restriction; that would probably work. You could do the same with a database connection (again, relying on the user-based access restriction to keep non-authorized users out of the database).
I've toyed with the idea of doing this for my apps but I've never implemented it.
DannySmurf is correct that you can't hide keys from the person running an application; if the application can get to the keys, so can the person.
However, What you are trying to accomplish exactly?
Depending on what it is, there are often ways to accomplish your goal that don't simply rely on keeping a secret "secret", on your user's machine.
Late to the game here...
The approach of storing the keys in the assembly / assembly config is fundamentally insecure. There is no possible ironclad way to store it as a determined user will have access. I don't care if you use the best / most expensive obfuscation product on the planet. I don't care if you use PDAPI to secure the data (although this is better). I don't care if you use a local OS-protected key store (this is even better still). None are ideal as all suffer from the same core issue: the user has access to the keys, and they are there, unchanging for days, weeks, possibly even months and years.
A far more secure approach would be is to secure your API calls with tried and true PKI. However, this has obvious performance overhead if your API calls are chatty, but for the vast majority of applications this is a non-issue.
If performance is a concern, you can use Diffie-Hellman over asymmetric PKI to establish a shared secret symmetric key for use with a cipher such as AES. "shared" in this case means shared between client and server, not all clients / users. There is no hard-coded / baked in key. Anywhere.
The keys are transient, regenerated every time the user runs the program, or if you are truly paranoid, they could time-out and require recreation.
The computed shared secret symmetric keys themselves get stored in memory only, in SecureString. They are hard to extract, and even if you do, they are only good for a very short time, and only for communication between that particular client (ie that session). In other words, even if somebody does hack their local keys, they are only good for interfering with local communication. They can't use this knowledge to affect other users, unlike a baked-in key shared by all users via code / config.
Furthermore, the entire keys themselves are never, ever passed over the network. The client Alice and server Bob independently compute them. The information they pass in order to do this could in theory be intercepted by third party Charlie, allowing him to independently calculate the shared secret key. That is why you use that (significantly more costLy) asymmetric PKI to protect the key generation between Alice and Bob.
In these systems, the key generation is quite often coupled with authentication and thus session creation. You "login" and create your "session" over PKI, and after that is complete, both the client and the server independently have a symmetric key which can be used for order-of-magnitude faster encryption for all subsequent communication in that session. For high-scale servers, this is important to save compute cycles on decryption over using say TLS for everything.
But wait: we're not secure yet. We've only prevented reading the messages.
Note that it is still necessary to use a message digest mechanism to prevent man-in-the-middle manipulation. While nobody can read the data being transmitted, without a MD there is nothing preventing them from modifying it. So you hash the message before encryption, then send the hash along with the message. The server then re-hashes the payload upon decryption and verifies that it matches the hash that was part of the message. If the message was modified in transit, they won't, and the entire message is discarded / ignored.
The final mechanism needed to guard against is replay attacks. At this point, you have prevented people from reading your data, as well as modifying your data, but you haven't prevented them from simply sending it again. If this is a problem for your application, it's protocol must provide data and both client and server must have enough stateful information to detect a replay. This could be something as simple as a counter that is part of the encrypted payload. Note that if you are using a transport such as UDP, you probably already have a mechanism to deal with duplicated packets, and thus can already deal with replay attacks.
What should be obvious is getting this right is not easy. Thus, use PKI unless you ABSOLUTELY cannot.
Note that this approach is used heavily in the games industry where it is highly desirable to spend as little compute on each player as possible to achieve higher scalability, while at the same time providing security from hacking / prying eyes.
So in conclusion, if this is really something that is a concern, instead of trying to find a securely store the API keys, don't. Instead, change how your app uses this API (assuming you have control of both sides, naturally). Use a PKI, or use a PKI-shared symmetric hybrid if PKI will be too slow (which is RARELY a problem these days). Then you won't have anything stored that is a security concern.
Related
I'm working on a C++ application which is keeping some user secret keys in the RAM. This secret keys are highly sensitive & I must minimize risk of any kind of attack against them.
I'm using a character array to store these keys, I've read some contents about storing variables in CPU registers or even CPU cache (i.e using C++ register keyword), but seems there is not a guaranteed way to force application to store some of it's variables outside of RAM (I mean in CPU registers or cache).
Can anybody suggest a good way to do this or suggest any other solution to keep these keys securely in the RAM (I'm seeking for an OS-independent solution)?
Your intentions may be noble, but they are also misguided. The short answer is that there's really no way to do what you want on a general purpose system (i.e. commodity processors/motherboard and general-purpose O/S). Even if you could, somehow, force things to be stored on the CPU only, it still would not really help. It would just be a small nuisance.
More generally to the issue of protecting memory, there are O/S specific solutions to indicate that blocks memory should not be written out to the pagefile such as the VirtualLock function on Windows. Those are worth using if you are doing crypto and holding sensitive data in that memory.
One last thing: I will point out that it worries me is that you have a fundamental misunderstanding of the register keyword and its security implications; remember it's a hint and it won't - indeed, it cannot - force anything to actually be stored in a register or anywhere else.
Now, that, by itself, isn't a big deal, but it is a concern here because it indicates that you do not really have a good grasp on security engineering or risk analysis, which is a big problem if you are designing or implementing a real-world cryptographic solution. Frankly, your posts suggests (to me, at least) that you aren't quite ready to architect or implement such a system.
You can't eliminate the risk, but you can mitigate it.
Create a single area of static memory that will be the only place that you ever store cleartext keys. And create a single buffer of random data that you will use to xor any keys that are not stored in this one static buffer.
Whenever you read a key into memory, from a keyfile or something, you only read it directly into this one static buffer, xor with your random data and copy it out wherever you need it, and immediately clear the buffer with zeroes.
You can compare any two key by just comparing their masked versions. You can even compare hashes of masked keys.
If you need to operate on the cleartext key - e.g. to generate a hash or validate they key somehow load the masked xor'ed key into this one static buffer, xor it back to cleartext and use it. Then write zeroes back into that buffer.
The operation of unmasking, operating and remasking should be quick. Don't leave the buffer sitting around unmasked for a long time.
If someone were to try a cold-boot attack, pulling the plug on the hardware, and inspecting the memory chips there would be only one buffer that could possibly hold a cleartext key, and odds are during that particular instant of the coldboot attack the buffer would be empty.
When operating on the key, you could even unmask just one word of the key at a time just before you need it to validate the key such that a complete key is never stored in that buffer.
#update: I just wanted to address some criticisms in the comments below:
The phrase "security through obscurity" is commonly misunderstood. In the formal analysis of security algorithms "obscurity" or methods of hiding data that are not crytpographically secure do not increase the formal security of a cryptographic algorithm. And it is true in this case. Given that keys are stored on the users machine, and must be used by that program on that machine there is nothing that can be done to make the keys on this machine cryptographically secure. No matter what process you use to hide or lock the data at some point the program must use it, and a determined hacker can put breakpoints in the code and watch when the program uses the data. But no suggestion in this thread can eliminate that risk.
Some people have suggested that the OP find a way to use special hardware with locked memory chips or some operating system method of locking a chip. This is cryptographically no more secure. Ultimately if you have physical access to the machine a determined enough hacker could use a logic analyzer on the memory bus and recover any data. Besides the OP has stated that the target systems don't have such specialized hardware.
But this doesn't mean that there aren't things you can do to mitigate risk. Take the simplest of access keys- the password. If you have physical access to a machine you can put in a key logger, or get memory dumps of running programs etc. So formally the password is no more secure than if it was written in plaintext on a sticky note glued to the keyboard. Yet everyone knows keeping a password on a sticky note is a bad idea, and that is is bad practice for programs to echo back passwords to the user in plaintext. Because of course practically speaking this dramatically lowers the bar for an attacker. Yet formally a sticky note with a password is no less secure.
The suggestion I make above has real security advantages. None of the details matter except the 'xor' masking of the security keys. And there are ways of making this process a little better. Xor'ing the keys will limit the number of places that the programmer must consider as attack vectors. Once the keys are xord, you can have different keys all over your program, you can copy them, write them to a file, send them over the network etc. None of these things will compromise your program unless the attacker has the xor buffer. So there is a SINGLE BUFFER that you have to worry about. You can then relax about every other buffer in the system. ( and you can mlock or VirtualLock that one buffer )
Once you clear out that xor buffer, you permanently and securely eliminate any possibility that an attacker can recover any keys from a memory dump of your program. You are limiting your exposure both in terms of the number of places and the times that keys can be recovered. And you are putting in place a system that allows you to work with keys easily without worrying during every operation on an object that contains keys about possible easy ways the keys can be recovered.
So you can imagine for example a system where keys refcount the xor buffer, and when all key are no longer needed, you zero and delete the xor buffer and all keys become invalidated and inaccessible without you having to track them down and worry about if a memory page got swapped out and still holds plaintext keys.
You also don't have to literally keep around a buffer of random data. You could for example use a cryptographically secure random number generator, and use a single random seed to generate the xor buffer as needed. The only way an attacker can recover the keys is with access to the single generator seed.
You could also allocate the plaintext buffer on the stack as needed, and zero it out when done such that it is extremely unlikely that the stack ever leaves on chip cache. If the complete key is never decoded, but decoded one word at a time as needed even access to the stack buffer won't reveal the key.
There is no platform-independent solution. All the threats you're addressing are platform specific and thus so are the solutions. There is no law that requires every CPU to have registers. There is no law that requires CPUs to have caches. The ability for another program to access your program's RAM, in fact the existence of other programs at all, are platform details.
You can create some functions like "allocate secure memory" (that by default calls malloc) and "free secure memory" (that by default calls memset and then free) and then use those. You may need to do other things (like lock the memory to prevent your keys from winding up in swap) on platforms where other things are needed.
Aside from the very good comments above, you have to consider that even IF you succeed in getting the key to be stored in registers, that register content will most likely get stored in memory when an interrupt comes in, and/or when another task gets to run on the machine. And of course, someone with physical access to the machine can run a debugger and inspect the registers. Debugger may be an "in circuit emulator" if the the key is important enough that someone will spent a few thousand dollars on such a device - which means no software on the target system at all.
The other question is of course how much this matters. Where are the keys originating from? Is someone typing them in? If not, and are stored somewhere else (in the code, on a server, etc), then they will get stored in the memory at some point, even if you succeed in keeping them out of the memory when you actually use the keys. If someone is typing them in, isn't the security risk that someone in one way or another, forces the person(s) knowing the keys to reveal the keys?
As others have said, there is no secure way to do this on a general purpose computer. The alternative is to use a Hardware Security Module (HSM).
These provide:
greater physical protection for the keys than normal PCs/servers (protecting against direct access to RAM);
greater logical protection as they are not general purpose - no other software is running on the machine so no other processes/users have access to the RAM.
You can use the HSM's API to perform the cryptographic operations you need (assuming they are somewhat standard) without ever exposing the unencrypted key outside of the HSM.
If your platform supports POSIX, you would want to use mlock to prevent your data from being paged to the swap area. If you're writing code for Windows, you can use VirtualLock instead.
Keep in mind that there's no absolute way to protect the sensitive data from getting leaked, if you require the data to be in its unencrypted form at any point in time in the RAM (we're talking about plain ol' RAM here, nothing fancy like TrustZone). All you can do (and hope for) is to minimize the amount of time that the data remains unencrypted so that the adversary will have lesser time to act upon it.
If yours is an user mode application and the memory you are trying to protect is from other user mode processes try CryptProtectMemory api (not for persistant data).
As the other answers mentioned, you may implement a software solution but if your program runs on a general purpose machine and OS and the attacker has access to your machine it will not protect your sensitive data. If you data is really very sensitive and an attacker can physically access the machine a general software solution won't be enough.
I once saw some platforms dealing with very sensible data which had some sensors to detect when someone was accessing the machine physically, and which would actively delete the data when that was the case.
You already mentioned cold boot attack, the problem is that the data in RAM can be accessed until minutes after shut down on general RAM.
I'm making a commercial product that will have a client and server side. The client is totally dependent on the server , just to make it harder to crack/pirate . Problem is , even so there is a chance that someone will reverse engineer the protocol and make their own server.
I've thought about encrypting the connection either with ssl or with another algorithm so it won't be so easy to figure out the protocol just from sniffing the traffic between the client and the server.
Now the only thing I can think of that pirates would use is to decompile the program, remove the encryption and try to see the "plain text" protocol in order to reverse engineer it.
I have read previous topics and I know that it's impossible to make it impossible to crack , but what tweaks can we programmers bring to our code to make it a huge headache for crackers?
Read how Skype did it:
The binary is decrypted into memory at startup.
The import table is overwritten.
The startup code is erased from memory.
Code integrity checks bust most debuggers: in random points in the code it computes a checksum of some other chunk of code and uses the checksum for an indirect jump to the next instruction. (Explanation: most debuggers implement breakpoints by changing the instruction at the breakpoint address. This check detects that.)
If debugger is detected -- it scrambles the registers and jumps to a random page.
Obfuscates code: call destination addresses are dynamically computed; dummy branches that are never executed; raises SEH where the handler sets some registers and resumes execution.
Keep in mind that these or other techniques would make reverse engineering harder, but not impossible. Also you shall never rely on any of these for security.
IMO your best option is to design your servers to provide some useful functionality (SaS). Your clients will essentially be paying for using that functionality. If your client-app is dumb enough, you won't care about it being open-source.
One thing you need to be aware of is that most packers/cryptors cause false positives with virus scanners. And that can be pretty annoying because people complain all the time that your software contains a virus(they don't get the concept of false positives).
And for protocol-obfuscation don't use SSL. It is trivial for an attacker to intercept the plaintext when you call Send with the plain-text. Use SSL for securing the connection and obfuscate the data before sending them. The obfuscation algorithm doesn't need to be cryptographically secure.
This might be helpful: http://www.woodmann.com/crackz/Tutorials/Protect.htm
IMHO, it's difficult to hide the actual plain code. What most packers do is to make it difficult to patch. However, in your case, Themida could do the trick.
Here are some nice tips about writing a good protection: http://www.inner-smile.com/nocrack.phtml
I'm working with a listview control which saves the data using AES encryption to a file. I need to keep the data of every item in listview in std::list class of std::string. should I just keep the data encrypted in std::list and decrypt to a local variable when its needed? or is it enough to keep it encrypted in file only?
To answer this question you need to consider who your attackers are (i.e. who are you trying to hide the data from?).
For this purpose, it helps if you work up a simple Threat Model (basically: Who you are worried about, what you want to protect, the types of attacks they may carry out, and the risks thereof).
Once this is done, you can determine if it is worth your effort to protect the data from being written to the disk (even when held decrypted only in memory).
I know this answer may not seem useful, but I hope it helps you to become aware that you need to specifically state (and hence know) you your attackers are, before you can correctly defend against them (i.e, you may end up implementing completely useless defenses, and so on).
Will you be decrypting the same item more than once? If you aren't concerned about in-memory attacks then performance might be the other issue to consider.
If you have time it may be worth coding your solution to allow for both eventualities. Therefore if you choose to cache encrypted, then it's not too much work to change to a decrypted in-memory solution later on if performance becomes an issue.
It is unclear what attack you are attempting to defend against. If the attacker has local access to the system then they can attach a debugger like OllyDBG to the process to observe its memory. The attack would be to set a break point at the call to AES and then observe the data being passed in and returned, very simple.
I agree with answer from silky that you have to start with a basic threat model. Just wanted to pointed out that when handling sensitive information in memory, you have every right to be concerned that the information may end up on disk even if you do not write it out.
For example, the data in memory could be written to swap space or could end up in a core file and from there on elsewhere (such as an email attachment or copied to other places). You can deal with these without encrypting data in memory since that may just shift the problem to dealing with key to decrypt that data...
I've got some data that I want to save on Amazon S3. Some of this data is encrypted and some is compressed. Should I be worried about single bit flips? I know of the MD5 hash header that can be added. This (from my experience) will prevent flips in the most unreliable portion of the deal (network communication), however I'm still wondering if I need to guard against flips on disk?
I'm almost certain the answer is "no", but if you want to be extra paranoid you can precalculate the MD5 hash before uploading, compare that to the MD5 hash you get after upload, then when downloading calculate the MD5 hash of the downloaded data and compare it to your stored hash.
I'm not sure exactly what risk you're concerned about. At some point you have to defer the risk to somebody else. Does "corrupted data" fall under Amazon's Service Level Agreement? Presumably they know what the file hash is supposed to be, and if the hash of the data they're giving you doesn't match, then it's clearly their problem.
I suppose there are other approaches too:
Store your data with an FEC so that you can detect and correct N bit errors up to your choice of N.
Store your data more than once in Amazon S3, perhaps across their US and European data centers (I think there's a new one in Singapore coming online soon too), with RAID-like redundancy so you can recover your data if some number of sources disappear or become corrupted.
It really depends on just how valuable the data you're storing is to you, and how much risk you're willing to accept.
I see your question from two points of view, a theoretical and practical.
From a theoretical point of view, yes, you should be concerned - and not only about bit flipping, but about several other possible problems. In particular section 11.5 of the customer agreements says that Amazon
MAKE NO REPRESENTATIONS OR WARRANTIES OF ANY KIND, WHETHER EXPRESS, IMPLIED, STATUTORY OR OTHERWISE WITH RESPECT TO THE SERVICE OFFERINGS. (..omiss..) WE AND OUR LICENSORS DO NOT WARRANT THAT THE SERVICE OFFERINGS WILL FUNCTION AS DESCRIBED, WILL BE UNINTERRUPTED OR ERROR FREE, OR FREE OF HARMFUL COMPONENTS, OR THAT THE DATA YOU STORE WITHIN THE SERVICE OFFERINGS WILL BE SECURE OR NOT OTHERWISE LOST OR DAMAGED.
Now, in practice, I'd not be concerned. If your data will be lost, you'll blog about it and (although they might not face any legal action), their business will be pretty much over.
On the other hand, that depends on how much vital your data is. Suppose that you were rolling your own stuff in your own data center(s). How would you plan for disaster recovery there? If you says: I'd just keep two copies in two different racks, just use the same technique with Amazon, maybe keeping two copies in two different datacenters (since you wrote that you are not interested in how to protect against bit flips, I'm providing only a trivial example here)
Probably not: Amazon is using checksums to protect against bit flips, regularly combing through data at rest, ensuring that no bit flips have occurred. So, unless you have corruption in all instances of the data within the interval of integrity check loops you should be fine.
Internally, S3 uses MD5 checksums throughout the system to detect/protect against bitflips. When you PUT an object into S3, we compute the MD5 and store that value. When you GET an object we recompute the MD5 as we stream it back. If our stored MD5 doesn't match the value we compute as we're streaming the object back we'll return an error for the GET request. You can then retry the request.
We also continually loop through all data at rest, recomputing checksums and validating them against the MD5 we saved when we originally stored the object. This allows us to detect and repair bit flips that occur in data at rest. When we find a bit flip in data at rest, we repair it using the redundant data we store for each object.
You can also protect yourself against bitflips during transmission to and from S3 by providing an MD5 checksum when you PUT the object (we'll error if the data we received doesn't match the checksum) and by validating the MD5 when GET an object.
Source:
https://forums.aws.amazon.com/thread.jspa?threadID=38587
There are two ways of reading your question:
"Is Amazon S3 perfect?"
"How do I handle the case where Amazon S3 is not perfect?"
The answer to (1) is almost certainly "no". They might have lots of protection to get close, but there is still the possibility of failure.
That leaves (2). The fact is that devices fail, sometimes in obvious ways and other times in ways that appear to work but give an incorrect answer. To deal with this, many databases use a per-page CRC to ensure that a page read from disk is the same as the one that was written. This approach is also used in modern filesystems (for example ZFS, which can write multiple copies of a page, each with a CRC to handle raid controller failures. I have seen ZFS correct single bit errors from a disk by reading a second copy; disks are not perfect.)
In general you should have a check to verify that your system is operating is you expect. Using a hash function is a good approach. What approach you take when you detect a failure depends on your requirements. Storing multiple copies is probably the best approach (and certainly the easiest) because you can get protection from site failures, connectivity failures and even vendor failures (by choosing a second vendor) instead of just redundancy in the data itself by using FEC.
My app keeps track of the state of about 1000 objects. Those objects are read from and written to a persistent store (serialized) in no particular order.
Right now the app uses the registry to store each object's state. This is nice because:
It is simple
It is very fast
Individual object's state can be read/written without needing to read some larger entity (like pulling out a snippet from a large XML file)
There is a decent editor (RegEdit) which allow easily manipulating individual items
Having said that, I'm wondering if there is a better way. SQLite seems like a possibility, but you don't have the same level of multiple-reader/multiple-writer that you get with the registry, and no simple way to edit existing entries.
Any better suggestions? A bunch of flat files?
If what you mean by 'multiple-reader/multiple-writer' is that you keep a lot of threads writing to the store concurrently, SQLite is threadsafe (you can have concurrent SELECTs and concurrent writes are handled transparently). See the [FAQ [1]] and grep for 'threadsafe'
[1]: http://www.sqlite.org/faq.html/ FAQ
If you do begin to experiment with SQLite, you should know that "out of the box" it might not seem as fast as you would like, but it can quickly be made to be much faster by applying some established optimization tips:
SQLite optimization
Depending on the size of the data and the amount of RAM available, one of the best performance gains will occur by setting sqlite to use an all-in-memory database rather than writing to disk.
For in-memory databases, pass NULL as the filename argument to sqlite3_open and make sure that TEMP_STORE is defined appropriately
On the other hand, if you tell sqlite to use the harddisk, then you will get a similar benefit to your current usage of RegEdit to manipulate the program's data "on the fly."
The way you could simulate your current RegEdit technique with sqlite would be to use the sqlite command-line tool to connect to the on-disk database. You can run UPDATE statements on the sql data from the command-line while your main program is running (and/or while it is paused in break mode).
I doubt any sane person would go this route these days, however some of what you describe could be done with Window's Structured/Compound Storage. I only mention this since you're asking about Windows - and this is/was an official Windows way to do this.
This is how DOC files were put together (but not the new DOCX format). From MSDN it'll appear really complicated, but I've used it, it isn't the worst API in Win32.
it is not simple
it is fast, I would guess it's faster then the registry.
Individual object's state can be read/written without needing to read some larger entity.
There is no decent editor, however there are some real basic stuff (VC++ 6.0 had the "DocFile Viewer" under Tools. (yeah, that's what that thing did) I found a few more online.
You get a file instead of registry keys.
You gain some old-school Windows developer geek-cred.
Other random thoughts:
I think XML is the way to go (despite the random access issue). Heck, INI files may work. The registry gives you very fine grain security if you need it - people seem to forget this when the claim using files are better. An embedded DB seems like overkill if I'm understanding what you're doing.
Do you need to persist the objects on each change event or just in memory and store on shutdown? If so, just load them up and serialize them at the end, assuming your app runs for a long time (and you don't share that state with another program) then in memory is going to be a winner.
If you've got fixed size structures then you could consider just using a memory mapped file and allocate memory from that?
If the only thing you do is serialize/deserialize individual objects (no fancy queries), then use a btree database, for example Berkeley DB. It is very fast at storing and retrieving chunks of data by key (I assume your objects have some id that can be used as a key) and access by multiple processes is supported.