How to prevent a file from being tampered with - c++

I want to store confidential data in a digitally signed file, so that I know when its contents have been tampered with.
My initial thought is that the data will be stored in NVPs (name value pairs), with some kind of CRC or other checksum to verify the contents.
I am thinking of implementing the creating (i.e. writing) and verification (reading) of such a file, using ANSI C++.
Assuming this is the data I want to store:
// Unencrypted, raw data to be stored in the file
struct PrivateInfo {
    double age, weight;
    FitnessScale fitness;
    Location loc;
    OtherStuff stuff;
};
// 128-bit encrypted data (payload to be stored in the file)
struct EncryptedData {
    // unknown fields/format ??
};
[After I have read a few responses to this question]
Judging by the comments I have received so far, I fear people are getting sidetracked by the word "licensing", which seems to be a red flag to most people. I suspected that might be the case, but in today's atmosphere of heightened security and general nervousness, I thought I'd better detail what I needed to be "hiding", lest someone think I was planning to pass on the "Nuke password" to some terrorists or something. I will now remove the word "license" from my question.
View it more as a technical question. Imagine I am a student (which I am), and that I am trying to find out about recommended (or best practices) for encoding information that needs to be secure.
Mindful of the above, I will reformat my questions thus:
Given a struct with fields of different data types, what is the "recommended" algorithm to give it "reasonably secure" encryption? (I still prefer to use 128 bit, but that's just me.)
What is a recommended way of providing a ROBUST check on the encrypted data, so that I can use the check value to tell whether the contents of the file (the payload of encrypted data) differ from the original?

First, note that "signing" data (to notice when it has been tampered with) is a completely separate and independent operation from "encrypting" data (to prevent other people from reading it).
That said, the OpenPGP standard does both. GnuPG is a popular implementation: http://www.gnupg.org/gph/en/manual.html
Basically you need to:
Generate a keypair, but don't bother publishing the public part.
Sign and encrypt your data (this is a single operation in gpg)
... storage ...
Decrypt and check the signature (this is also a single operation).
But, beware that this is only any use if you can store your private key more securely than you store the rest of the data. If you can't guarantee the security of the key, then GPG can't help you against a malicious attempt to read or tamper with your data. And neither can any other encryption/signing scheme.
Forgetting encryption, you might think that you can sign the data on some secure server using the private key, then validate it on some user's machine using the public key. This is fine as far as it goes, but if the user is malicious and clever, then they can invent new data, sign it using their own private key, and modify your code to replace your public key with theirs. Their data will then validate. So you still need the storage of the public key to be tamper-proof, according to your threat-model.
You can implement an equivalent yourself, something along the lines of:
Choose a longish string of random characters. This is your key.
Concatenate your data with the key. Hash this with a secure hash function (SHA-256). Then concatenate the resulting hash with your data, and encrypt it using the key and a secure symmetric cipher (AES).
... storage ...
Decrypt the data, chop off the hash value, put back the key, hash it, and compare the result to the hash value to verify that it has not been modified.
This will likely be faster and use less code in total than gpg: for starters, PGP is public key cryptography, and that's more than you require here. But rolling your own means you have to do some work, and write some of the code, and check that the protocol I've just described doesn't have some stupid error in it. For example, it has potential weaknesses if the data is not of fixed length, which HMAC solves.
Good security avoids doing work that some other, smarter person has done for you. This is the virtuous kind of laziness.
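The verification step in the do-it-yourself scheme above ends with comparing the recomputed hash against the stored one. Below is a minimal sketch of just that comparison, assuming both digests are already available as byte vectors (for example from a library SHA-256 or HMAC call). The name `digests_equal` is hypothetical, not part of any library. It looks at every byte instead of returning at the first mismatch, so the running time does not reveal how many leading bytes matched.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: compares two digests without an early exit.
bool digests_equal(const std::vector<std::uint8_t>& a,
                   const std::vector<std::uint8_t>& b) {
    if (a.size() != b.size()) return false;  // digest length is not a secret
    std::uint8_t diff = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        // Accumulate any differing bits; never branch on the data.
        diff |= static_cast<std::uint8_t>(a[i] ^ b[i]);
    }
    return diff == 0;
}
```

This only covers the comparison; the hashing and encryption themselves are best left to an established library, as the answer recommends.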

Err, why not use a well known encryption system like GPG?

The answers to the edited question depend on the specific scenario.
For q1 (encryption): if you encrypt and decrypt at your servers you can use a symmetric key algorithm. Otherwise you may want to use public key cryptography.
For q2, if you simply want to check whether a file has changed, you can use a cryptographic hash such as SHA-256 (SHA-1 is no longer a good choice, since practical collisions have been demonstrated), assuming that you can make sure the hash itself wasn't changed.
If the data generator and the verifier are both secure, you can use a MAC algorithm such as HMAC to verify that the data and the MAC match. But this works only if the secret key remains secret.
Otherwise, you may be able to use digital signatures.

I'm going to change the phrasing of the question and see if it makes people happier (or I get downvoted). There are really two types of questions being asked:
You are making some computer game and you want to know if someone has been messing with your save files. (data signing)
You are writing a messaging program and want to keep people's message logs private. (data encryption)
I will deal with the second one (data encryption). It's a massively difficult topic and you should be looking for pre-built programs (such as PGP/GPG); even then, it's going to take you a lot of time to understand and use them properly. Think about encryption like this: it will be broken; your job is to make it not worth the effort. In other words, make the effort required to break it greater than the value of the information.
As for the first one, again, it can be broken. But a checksum is a good idea. See Amnon's answer for some links on that.
Hope this points you in the right direction. I'm not an expert on either topic, but I hope this gives you a starting point. (You might want to rephrase the question and see if you get some better answers.)

Related

What is the best way to encrypt hardcoded strings in C++?

Warning: C++ noob
I've read multiple posts on StackOverflow about string encryption, but they don't answer my questions.
I must insert one or two hardcoded strings in my code, but I would like to make them difficult to read in plain text when debugging/reverse engineering. That's not all: my strings are URLs, so a simple packet analyzer (Wireshark) can read them.
I say "difficult" because I know that, when the code runs, the string is decrypted somewhere (in RAM?) as plain text and somebody can read it. So, assuming that it is not possible to completely secure my string, what is the best way of encrypting/decrypting it in C++?
I was thinking of something like this:
// All #include lines and main() are omitted for brevity.
// decrypt() is a placeholder for whatever decryption routine is chosen.
std::string encryptedUrl = "Ajdu67gGHhbh34590Hb6vfu6gu"; // URL encrypted with some known algorithm
URLDownloadToFile(NULL, decrypt(encryptedUrl).c_str(), "C:\\temp.txt", 0, NULL);
What about packet analysis? I'm sure there's no way to hide the URL, but maybe I'm missing something? Thank you, and sorry for my poor English!
Edit 1: What my application does?
It's a simple login script. My application downloads a text file from a URL. This file contains an encrypted string that is read using the fstream library. The string is then decrypted and used to log in to another site. It is very weak, because there's no database, no salt, no hashing. My goal is to ensure that neither the URL nor the login string is "easy" to read from a static analysis of the binary, and that they are as hard as possible to recover with dynamic analysis (debugging, reverse engineering, etc.).
If you want to stymie packet inspectors, the bare minimum requirement is to use HTTPS with a hard-coded server certificate baked into your app (certificate pinning).
There is no panacea for encrypting in-app content. A determined hacker with the right skills will get at the plain url, no matter what you do. The best you can hope for is to make it difficult enough that most people will just give up. The way to achieve this is to implement multiple diverse obfuscation and tripwire techniques. Including, but not limited to:
Store parts of the encrypted url and the password (preferably a one-time key) in different locations and bring them together in code.
Hide the encrypted parts in large strings of random data that are indistinguishable from the parts.
Bring the parts together piecemeal. E.g., Concatenate the first and second third of the encrypted url into a single buffer from one initialisation function, concatenate this buffer with the last third in a different unrelated init function, and use the final concatenation in yet another function, all called from different random places in your code.
Detect when the app is running under a debugger and have different functions trash the encrypted contents at different times.
Detection should be done at various call sites using different techniques, not by calling a single "DetectDebug" function or testing a global bool, both of which create a single point of attack.
Don't use obvious names, like, "DecryptUrl" for the relevant functions.
Harvest parts of the key from seemingly unrelated, but consistent sources. E.g., read the clock and only use a few of the high bits (high enough that they won't change for the foreseeable future, but low enough that they're not all zero), or use a random sampling of non-volatile results from initialisation code.
This is just the tip of the iceberg and will only throw novices off the scent. None of it is going to stop, or even significantly slow down, a skillful attacker, who will simply intercept calls to the SSL library using a stealth debugger. You therefore have to ask yourself:
How much is it worth to me to protect this url, and from what kind of attacker?
Can I somehow change the system design so that I don't need to secure the url?
Try XorStr [1, 2]. It's what I used to use when trying to hamper static analysis. Most results will come from game cheat forums; there is an HTML generator too.
However, as others have mentioned, getting the strings is still easy for anyone who puts a breakpoint on URLDownloadToFile. Still, you will have made their life a bit harder if they are trying to do static analysis.
I am not sure what your URLs do, or what your goal is in all this, but XorStr + anti-debug + packing the binary will stop most amateurs from reverse engineering your application.
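To illustrate the general idea behind XorStr-style tools, here is a minimal sketch that XOR-scrambles a string literal at compile time and unscrambles it at run time. It is not the actual XorStr code; `XorString`, `make_xor`, `kHiddenUrl`, the key `0x5A` and the example URL are all made up for illustration, and it assumes a C++14 compiler. The key sits right next to the data, so this only hampers a casual `strings`-style scan of the binary, and whether the plaintext literal is really absent from the binary depends on the compiler.

```cpp
#include <cstddef>
#include <string>

// Holds an XOR-scrambled copy of a string literal (terminator included).
template <std::size_t N>
struct XorString {
    char data[N];
    char key;

    constexpr XorString(const char (&plain)[N], char k) : data{}, key(k) {
        for (std::size_t i = 0; i < N; ++i) {
            data[i] = static_cast<char>(plain[i] ^ k);
        }
    }

    // Undo the XOR at run time; the plaintext exists only in the returned string.
    std::string decode() const {
        std::string out(N - 1, '\0');
        for (std::size_t i = 0; i + 1 < N; ++i) {
            out[i] = static_cast<char>(data[i] ^ key);
        }
        return out;
    }
};

// Helper so the array size is deduced from the literal.
template <std::size_t N>
constexpr XorString<N> make_xor(const char (&plain)[N], char k) {
    return XorString<N>(plain, k);
}

// constexpr forces the scrambling to be evaluated at compile time.
constexpr auto kHiddenUrl = make_xor("https://example.invalid/login.txt", 0x5A);
```

Calling `kHiddenUrl.decode()` just before the download call yields the original URL, which is exactly the moment at which a breakpoint on that call would reveal it.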

how to deal with passwords securely within your application

I have found a similar question here, "Saving passwords inside an application", but it didn't really answer my concerns.
I am dealing with an application that will receive a password (securely) from the user. Once I receive the password, I need to store it in some variable and send it through transactions to other systems (this logic is safe and secure and already implemented).
My worry is that I don't want to be able to see the password in a core dump, so I would like to encrypt any password before saving it to any variable.
Questions:
Is encrypting it before saving it to a variable enough? Or am I missing some security loopholes?
Are there any simple header-only libraries that can do encryption? Can you point me to where I can start looking?
Note to answer commenters:
The password will not be stored long term; Only for the lifespan of the transactions.
Unfortunately, the participants of the transactions cannot decrypt the password, therefore I would need to decrypt it before I send it to them.
My main concern right now is to find a way to encrypt and decrypt the password locally - in an easy manner...
I found the OpenSSL library and Crypto++, but it seems that I would need to link with them; I can't just include and call them (i.e. they are not header-only libraries)...
Thanks,
(Note: I'm sure there are rigorous checklists and official guidelines about how to treat passwords in secure software out there, from people and authorities that actually know something about security. This is not one of those!)
I don't think there is a cryptographically secure way to have passwords in your process memory and be able to use them without also giving access to them to a user who can run your application under a debugger or inspect your core dumps.
What you can do is obscure the password. Here are some techniques you can use:
Do not keep the password as a simple string anywhere in your memory (scatter the characters around, etc.)
Scrub all the variables that the password is stored in after they are used (e.g. if you pass the password to a function, you should set all the characters of that variable to NUL inside the function after you are done with it).
Encrypt the password.
Vary the encryption key at each run of the application (or periodically if it's a long-running app.)
Generate the encryption key procedurally based on some aspect of the system/hardware and not store the encryption key for the password anywhere in your process memory.
Use hardware like the Trusted Platform Module (TPM) if available.
Implementing the above consistently and effectively is quite hard and impacts all of your code that deals with the password. And sometimes you even have to intentionally make your code more obscure and go against all your instincts as a programmer (e.g. not passing the password into functions as a parameter, but using hard-coded addresses inside the function.)
Once again, I have to emphasize that it is probably impossible (likely provably so) to secure your passwords in software alone when the adversary has complete access to the physical machine.
As for the second part of your question, I don't know of any header-only encryption libraries, but encrypting a password will probably only need a cipher and probably a hash. And all of the best algorithms have public-domain or otherwise free implementations in the wild. You can get one of those and copy/paste into your own application. Don't forget to seriously test it though!
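The "scrub the variables" technique listed above can be sketched as follows. This is a minimal illustration, not a vetted implementation: `secure_zero` and `scrub` are hypothetical names, and the approach assumes that writing through a `volatile` pointer is enough to stop the compiler from optimising the wipe away. Where available, platform facilities such as `SecureZeroMemory` (Windows) or `explicit_bzero` (glibc, BSD) are intended for this job.

```cpp
#include <cstddef>
#include <string>

// Overwrites n bytes at p with zeros. The volatile pointer discourages the
// compiler from removing the writes as "dead stores".
void secure_zero(void* p, std::size_t n) {
    volatile unsigned char* bytes = static_cast<volatile unsigned char*>(p);
    for (std::size_t i = 0; i < n; ++i) {
        bytes[i] = 0;
    }
}

// Wipes the characters of a std::string in place, then empties it.
void scrub(std::string& s) {
    if (!s.empty()) {
        secure_zero(&s[0], s.size());
    }
    s.clear();
}
```

Note that this only wipes the buffer the string currently owns. Copies made earlier (pass-by-value, reallocation while the string grew, temporaries) are not touched, which is one reason the answer calls doing this consistently "quite hard".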

Advice about the Encryption Method I should Use

Ok, so I need some advice on which encryption method I should use for my current project. All the questions about this subject on here are to do with networking and passing encrypted data from one machine to another.
A brief summary of how the system works is:
I have some data that is held in tables in text format. I then use a tool to parse this data and serialize it to a .dat file. This works fine, but I need to encrypt the data, as it will be stored with the application in a public place. The data won't be sent anywhere; it is simply read by the application. I just need it to be encrypted so that, if it were to fall into the wrong hands, it would not be possible to read it.
I am using the Crypto++ library for my encryption, and I have read that it can perform most types of encryption algorithms. I have noticed, however, that most algorithms use a public and private key to encrypt/decrypt the data. This would mean I would have to store the private key with the data, which seems counterintuitive to me. Are there any ways that I can perform the encryption without storing a private key with the data?
I see no reason to use asymmetric crypto in your case. I see two decent solutions depending on the availability of internet access:
Store the key on a server. The user gets the key back to local storage only after logging in to the server.
Use a Key-Derivation-Function such as PBKDF2 to derive the key from a password.
Of course all of this fails if the attacker is patient and installs a keylogger and waits until you access the files the next time. There is no way to secure your data once your machine has been compromised.
Short answer: don't bother.
Long answer: If you store your .DAT file with the application, you'll have to store the key somewhere too. Most probably in the same place (maybe hidden in the code). So if a malicious user wants to break your encryption all he has to do is to look for that key, and that's it. It doesn't really matter which method or algorithm you use. Even if you don't store the decryption key with the application, it will get there eventually, and the malicious user can catch it with the debugger at run time (unless you're using a dedicated secured memory chip and running on a device that has the necessary protections)
That said, many times the mere fact that the data is encrypted is enough protection because the data is just not worth the trouble. If this is your case - then you can just embed the key in the code and use any symmetric algorithm available (AES would be the best pick).
A common way to solve your issue is:
Use a symmetric key algorithm to encrypt your data; common algorithms are AES and Twofish. Most probably, you want to use CBC chaining.
Use a digest (SHA-256) and sign it with an asymmetric algorithm (RSA), using your private key. This way you embed a signature and a public key to check it, making sure that if your scrambling key is compromised, other people won't be able to forge your personal data. Of course, if the application needs to update the data, then you can't use this private key mechanism.
In any case, you should read up on:
symmetric ciphers vs asymmetric ones
signing vs encrypting
modes of operation, meaning how one block is chained to the next for block ciphers like AES and 3DES (CBC vs ECB)
As previously said, if your data is read and written by the same application, it will in any case be very hard to prevent malicious users from stealing the data. There are ways to hide keys in the code (you can search for white-box cryptography), but it will definitely be fairly complex (and obviously cannot rely on a simple external crypto library, which can easily be intercepted to steal the key).
If your application can read the data and people have access to that application, someone with enough motivation and time will eventually figure out (by disassembling your application) how to read the data.
In other words, all the information that is needed to decipher the encrypted data is already in the hands of the attacker. You have the consumer=attacker problem in all DRM-related designs, and this is why people can easily decrypt DVDs, Blu-rays, M4As, encrypted eBooks, and so on.
Using a public/private key pair is called asymmetric encryption.
You could use a symmetric encryption algorithm instead; that way you would only require one key.
That key will still need to be stored somewhere (it could be in the executable). But if the user has access to the .dat file, he probably also has access to the exe, meaning he could still extract that information. And if he has access to the PC (and the needed rights), he could read all the information from memory anyway.
You could ask the user for a passphrase (aka password) and use that to encrypt symmetrically. This way you don't need to store the passphrase anywhere.

Is RIJNDAEL encryption safe to use with small amounts of text given to users?

I am thinking about making the switch to storing session data in encrypted cookies rather than somewhere on my server. While this will result in more bandwidth used for each request - it will save extra database server load and storage space.
Anyway, I plan on encrypting the cookie contents using RIJNDAEL 256.
function encrypt($text, $key)
{
    return mcrypt_encrypt(
        MCRYPT_RIJNDAEL_256, $key, $text, MCRYPT_MODE_ECB,
        mcrypt_create_iv(mcrypt_get_iv_size(MCRYPT_RIJNDAEL_256, MCRYPT_MODE_ECB), MCRYPT_RAND)
    );
}
Which in use would produce something like this (base64 encoded for display)
print base64_encode(encrypt('text', 'key'));
7s6RyMaYd4yAibXZJ3C8EuBtB4F0qfJ31xu1tXm8Xvw=
I'm not worried about a single user's cookie being compromised as much as I am worried that an attacker would discover the key and be able to construct any session for any user, since they would know what I use to sign the data.
Is there a way I can verify estimated cracking times in relation to the parameters used? Or is there a standard measure of time in relation to the size of the text or key used?
I heard someone say that the keys themselves need to exceed 256 bits to be safe enough to be used with RIJNDAEL. I'm also wondering whether the encrypted text needs to be a certain length so as not to give away the key.
The data will generally be about 200 characters
a:3:{s:7:"user_id";i:345;s:5:"token";s:32:"0c4a14547ad221a5d877c2509b887ee6";s:4:"lang";s:2:"en";}
So is this safe?
Yes, Rijndael (AES) is safe; however, your implementation is far from safe. There are two outstanding issues with it: the use of ECB mode, and an IV that is a static variable used for all messages. An IV must always be a cryptographic nonce. Your code is in clear violation of CWE-329.
ECB mode should never be used. CBC mode must be used instead, and this is why:
[Images omitted: the original picture, the same picture encrypted with ECB mode, and the same picture encrypted using CBC mode.]
Avoid using ECB. It can reveal information about what's encrypted. Any two blocks with the same plaintext will have the same ciphertext. CBC would avoid this, but requires an IV to be generated or saved.
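The difference between the two modes can be shown with a small sketch. To stay self-contained it uses a made-up 64-bit toy permutation (`toy_encrypt_block`) in place of AES; that function is NOT a secure cipher and every name here is hypothetical. Only the chaining logic in `ecb_encrypt` and `cbc_encrypt` is the point.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

using Block = std::uint64_t;  // toy 64-bit block; AES uses 128-bit blocks

// TOY keyed permutation standing in for a real block cipher. NOT secure.
Block toy_encrypt_block(Block x, Block key) {
    x ^= key;
    x = (x << 13) | (x >> 51);         // rotate left by 13 bits
    return x * 0x9E3779B97F4A7C15ULL;  // odd multiplier, invertible mod 2^64
}

// ECB: every block is encrypted independently of the others.
std::vector<Block> ecb_encrypt(const std::vector<Block>& plain, Block key) {
    std::vector<Block> out;
    for (Block p : plain) out.push_back(toy_encrypt_block(p, key));
    return out;
}

// CBC: each plaintext block is XORed with the previous ciphertext block
// (the IV for the first block) before it is encrypted.
std::vector<Block> cbc_encrypt(const std::vector<Block>& plain, Block key, Block iv) {
    std::vector<Block> out;
    Block prev = iv;
    for (Block p : plain) {
        prev = toy_encrypt_block(p ^ prev, key);
        out.push_back(prev);
    }
    return out;
}
```

Feeding two identical plaintext blocks to `ecb_encrypt` gives two identical ciphertext blocks, which is the leak described above. With `cbc_encrypt` the second block is mixed with the first ciphertext block, so the repetition is hidden, and the same plaintext encrypted under a different IV gives a different first block.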
Avoid simply saving a key and IV. Generate a 256-bit master key using a cryptographically strong random number generator and save it in your application somewhere safe. Use that to generate session keys for use in encryption. The IV can be derived from the session key. When generating the session key, include any and all available data that can be used to narrow the scope of the session key (e.g. the scope of the cookie, the remote host address, a random nonce stored with the encrypted data, and/or a user ID if it isn't within the encrypted data).
Depending on how the data is to be used, you may have to include a MAC. ECB and CBC are not designed to detect any changes to the ciphertext, and such changes will result in garbage in the plaintext. You might want to include an HMAC with the encrypted data to allow you to authenticate it before taking it as canon. A session HMAC key must be derived from the session encryption key. Alternatively, you could use PCBC mode. PCBC was made to detect changes in the ciphertext, but its ability to do so is limited by the size of the padding, which depends on the data that is encrypted, and not all crypto APIs will have it as an option.
Once you have gone so far as to include a MAC, you should consider taking steps against replay attacks. Any time someone can resend old data within the scope of a session, there is a chance for a replay attack. Making a session key's usage as narrow as possible without causing issues for the user is one way to thwart replay attacks. Another thing you could do is include a date and time in the encrypted data to create a window during which the data is considered valid.
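The validity window mentioned above boils down to a simple check once the cookie has been authenticated and decrypted. A minimal sketch, assuming the issue time is stored inside the MAC-protected data as seconds since the Unix epoch; `is_fresh` and its parameter names are hypothetical.

```cpp
#include <cstdint>

// Returns true if a cookie issued at `issued_at` is still inside its validity
// window at time `now`. Both values are seconds since the Unix epoch.
bool is_fresh(std::int64_t issued_at, std::int64_t now, std::int64_t max_age_seconds) {
    if (issued_at > now) return false;            // timestamp from the future: reject
    return (now - issued_at) <= max_age_seconds;  // too old: possible replay
}
```

The check is only meaningful if the timestamp cannot be altered by the client, which is why it belongs inside the data covered by the MAC.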
In summary, protecting the key is just the tip of the iceberg.
If you use a long key, I'd say the key was pretty safe. Some things to concern yourself with:
You are offloading data storage to the client. NEVER TRUST THE CLIENT. This doesn't mean you can't do this, just that you either have to treat the data in the cookie as untrusted (don't make any decisions more serious than what 'theme' to show the user based on it) or provide for a way to validate the data.
Some examples of how to validate the data would be to:
include a salt (so that people with the same session data don't get the same cookie) and
a checksum (so that someone who changes even one bit of the cookie makes it useless).
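Rijndael was standardized as AES (strictly, AES is Rijndael with a 128-bit block size; MCRYPT_RIJNDAEL_256 selects a 256-bit block, so MCRYPT_RIJNDAEL_128 is the AES variant). Yes, it is safe to use.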
That said, you should consider carefully what you put in the cookie. It depends on what you have available in the way of storage on your system, but you could simply choose a random number (say a 64-bit number), and store that in the cookie. In your server-side system, you'd keep a record of who that number was associated with, and the other details. This avoids encryption altogether. You use the other details to validate (to the extent anything can be validated) whether the cookie was sent back from the browser you originally sent it to.
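The "random number in the cookie, details on the server" approach could look roughly like this. It is a sketch under assumptions: `SessionStore` and `SessionInfo` are hypothetical names, the store is a plain in-memory map, and `std::random_device` is used for brevity even though the standard does not guarantee it is cryptographically strong; a production system would draw IDs from the operating system's CSPRNG.

```cpp
#include <cstdint>
#include <random>
#include <string>
#include <unordered_map>

// Details kept on the server; none of this is sent to the browser.
struct SessionInfo {
    int user_id;
    std::string lang;
};

class SessionStore {
public:
    // Creates a session and returns the opaque ID to place in the cookie.
    std::uint64_t create(int user_id, const std::string& lang) {
        std::uint64_t id;
        do {
            // Combine two 32-bit draws into one 64-bit ID.
            id = (static_cast<std::uint64_t>(rd_()) << 32) | rd_();
        } while (sessions_.count(id) != 0);  // retry on the unlikely collision
        sessions_[id] = SessionInfo{user_id, lang};
        return id;
    }

    // Returns the session for an ID, or nullptr if the ID is unknown.
    const SessionInfo* lookup(std::uint64_t id) const {
        auto it = sessions_.find(id);
        return it == sessions_.end() ? nullptr : &it->second;
    }

private:
    std::random_device rd_;
    std::unordered_map<std::uint64_t, SessionInfo> sessions_;
};
```

Because the cookie carries nothing but the ID, a forged or modified value simply fails the lookup, and no encryption key has to be protected.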
Alternatively, you can use a different encryption key for each session, keeping a track of which key was used with which session.
Even if you go with straight encryption with a fixed key, consider including a random number in with the data to be encrypted - this makes it harder to crack using a known plaintext attack because, by definition, the random number can't be known.
AES-128 should be more than sufficient, with no need for longer keys, provided the key is chosen randomly.
However, there are other issues. The first is that you should not use ECB. With ECB, a given 128-bit block of plaintext always maps to the same 128-bit ciphertext if the key is the same. This means that adversaries can surgically modify the ciphertext by splicing in ciphertext blocks whose corresponding plaintext they know. For example, they could mix the data of two different users. With other modes (CBC, for example), the ciphertext also depends on the IV (initialization vector), which should be different at every execution of the algorithm. This way, the same plaintext is encrypted differently each time and the adversary cannot gain any advantage. You also need to save the IV somewhere with the ciphertext; there is no need to protect it. Whenever the chance of reusing the same IV becomes non-negligible, you should also change the key.
The second issue is that you should also append a message authentication code. Otherwise you would not be able to distinguish the forged cookies from the good ones.

Easiest way to sign/certify text file in C++?

I want to verify whether the text log files created by my program, which runs at my customer's site, have been tampered with. How do you suggest I go about doing this? I searched a bunch here and on Google but couldn't find my answer. Thanks!
Edit: After reading all the suggestions so far, here are my thoughts. I want to keep it simple, and since the customer isn't that computer savvy, I think it is safe to embed the salt in the binary. I'll continue to search for a simple solution using the keywords "salt checksum hash" etc. and post back here once I find one.
Obligatory preamble: How much is at stake here? You must assume that tampering will be possible, but that you can make it very difficult if you spend enough time and money. So: how much is it worth to you?
That said:
Since it's your code writing the file, you can write it out encrypted. If you need it to be human readable, you can keep a second encrypted copy, or a second file containing only a hash, or write a hash value for every entry. (The hash must contain a "secret" key, of course.) If this is too risky, consider transmitting hashes or checksums or the log itself to other servers. And so forth.
This is a quite difficult thing to do, unless you can somehow protect the keypair used to sign the data. Signing the data requires a private key, and if that key is on a machine, a person can simply alter the data or create new data, and use that private key to sign the data. You can keep the private key on a "secure" machine, but then how do you guarantee that the data hadn't been tampered with before it left the original machine?
Of course, if you are protecting only data in motion, things get a lot easier.
Signing data is easy, if you can protect the private key.
Once you've worked out the higher-level theory that ensures security, take a look at GPGME to do the signing.
You may put a checksum as a prefix on each line of your file, using an algorithm like Adler-32 or something similar.
If you do not want to put binary data in your log files, use Base64 encoding to convert the checksum to non-binary data. That way, you can discard only the lines that have been tampered with.
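A per-line checksum prefix along these lines might be sketched as follows. The helper names `tag_line` and `verify_line` are hypothetical, and for brevity the checksum is written as eight hexadecimal digits rather than Base64. Adler-32 has no secret key, so anyone who knows the scheme can recompute it; this catches accidental or naive edits, not a determined attacker.

```cpp
#include <cstdint>
#include <cstdio>
#include <cstdlib>
#include <string>

// Adler-32 as specified in RFC 1950.
std::uint32_t adler32(const std::string& text) {
    const std::uint32_t kMod = 65521;  // largest prime below 2^16
    std::uint32_t a = 1, b = 0;
    for (unsigned char c : text) {
        a = (a + c) % kMod;
        b = (b + a) % kMod;
    }
    return (b << 16) | a;
}

// Prefixes a log line with its checksum as 8 hex digits and a space.
std::string tag_line(const std::string& text) {
    char prefix[10];
    std::snprintf(prefix, sizeof prefix, "%08x ", static_cast<unsigned>(adler32(text)));
    return std::string(prefix) + text;
}

// Returns true if the line's prefix matches the checksum of the rest.
bool verify_line(const std::string& line) {
    if (line.size() < 9 || line[8] != ' ') return false;
    const std::string hex = line.substr(0, 8);
    char* end = nullptr;
    const unsigned long stored = std::strtoul(hex.c_str(), &end, 16);
    if (end != hex.c_str() + 8) return false;  // prefix did not parse fully as hex
    return stored == adler32(line.substr(9));
}
```

A reader would call `verify_line` on each line and skip the ones that fail, keeping the rest of the log usable.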
It really depends on what you are trying to achieve, what is at stake, and what the constraints are.
Fundamentally: what you are asking for is just plain impossible (in isolation).
Now, it's a matter of complicating the life of the people trying to modify the file, so that it will cost them more to modify it than they could earn by doing so. Of course, this means that hackers motivated solely by the goal of cracking your protection measures will not be deterred that much...
Assuming it should work on a standalone computer (no network), it is, as I said, impossible. Whatever process you use, whatever the key or algorithm, it is ultimately embedded in the binary, which is exposed to the scrutiny of the would-be hacker. It's possible to disassemble it, to examine it with hex editors, to probe it with different inputs, to plug in a debugger, etc. Your only option is thus to make debugging and examination a pain by breaking down the logic, using debug detection to change the paths, and, if you are very good, using self-modifying code. This does not mean it will become impossible to tamper with the process; it merely means it should become difficult enough that any attacker will give up.
If you have a network at your disposal, you can store a hash on a remote drive (under your control), and then compare the hash. Two difficulties here:
Storing (how to ensure it is your binary?)
Retrieving (how to ensure you are talking to the right server?)
And of course, in both cases, beware of man-in-the-middle attacks...
One last bit of advice: if you need security, you'll need to consult a real expert, don't rely on some strange guys (like myself) talking on a forum. We're amateurs.
It's your file, and your program is the one allowed to modify it. That being the case, there is one simple solution (if you can afford to put your log files into a separate folder).
Note:
You can have all your log files placed into a separate folder. For example, in my application we have a lot of DLLs, each having its own log file, and of course the application has its own.
So have a separate process running in the background that monitors the folder for change notifications such as:
change in file size
attempt to rename the file or folder
delete the file
etc...
Based on this notification, you can certify whether the file is changed or not!
(As you and others may be guessing, your own process and DLLs will also change these files, which can also lead to a notification. You need to synchronize this smartly. That's it.)
The Windows API for monitoring a folder is given below:
HANDLE FindFirstChangeNotification(
    LPCTSTR lpPathName,
    BOOL bWatchSubtree,
    DWORD dwNotifyFilter
);
lpPathName:
Path to the log directory.
bWatchSubtree:
Whether to watch subfolders as well (TRUE or FALSE)
dwNotifyFilter:
Filter conditions that satisfy a change notification wait. This parameter can be one or more of the following values.
FILE_NOTIFY_CHANGE_FILE_NAME
FILE_NOTIFY_CHANGE_DIR_NAME
FILE_NOTIFY_CHANGE_SIZE
FILE_NOTIFY_CHANGE_SECURITY
etc...
(Check MSDN)
How to make it work?
Suspect A: Our process
Suspect X: Other process or user
Inspector: The process that we created to monitor the folder.
Inspector sees a change in the folder and asks Suspect A whether it made the change.
If so, the change is taken as VALID.
If not, that is a clear indication that the change was made by Suspect X, so it is NOT VALID!
The file is certified to be TAMPERED.
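The Inspector's decision above can be sketched as a small bookkeeping class. This is only an illustration of the logic: `ChangeLedger` is a hypothetical name, the sketch lives in one process, and it assumes one notification per announced write. A real setup would need inter-process communication between the application and the monitor, and would have to cope with a single write producing several notifications.

```cpp
#include <set>
#include <string>

// The application ("Suspect A") announces each write before making it;
// the monitor ("Inspector") checks observed changes against that list.
class ChangeLedger {
public:
    void announce_write(const std::string& path) { pending_.insert(path); }

    // Called by the inspector when a change notification arrives.
    // Returns true (VALID) if the application announced the change,
    // false (TAMPERED) otherwise.
    bool classify_change(const std::string& path) {
        auto it = pending_.find(path);
        if (it == pending_.end()) return false;
        pending_.erase(it);  // each announcement covers exactly one change
        return true;
    }

private:
    std::multiset<std::string> pending_;
};
```

The application would call `announce_write` just before touching a log file, and the monitoring process would call `classify_change` with the path taken from each change notification.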
Other than that, below are some of the techniques that may (or may not :)) help you!
Store the timestamp along with the file size whenever the application closes the file.
The next time you open the file, check the last modified time of the file and its size. If both are the same, the file has not been tampered with.
Change the file permission to read-only after you write logs into it. If some program or person wants to tamper with it, they must first change the read-only property. This action changes the date/time modified for the file.
Write only encrypted data to your log file. If someone tampers with it, then when we decrypt the data, we may find some text that does not decrypt properly.
Use a compress and uncompress mechanism (compression may help you protect the file with a password).
Each way may have its own pros and cons. Strengthen the logic based on your needs. You can even try a combination of the techniques proposed.