I need to store sensitive information (a symmetric encryption key that I want to keep private) in my C++ application. The simple approach is to do this:
std::string myKey = "mysupersupersecretpasswordthatyouwillneverguess";
However, running the application through the strings process (or any other that extracts strings from a binary app) will reveal the above string.
What techniques should be used to obscure such sensitive data?
Edit:
OK, so pretty much all of you have said "your executable can be reverse engineered" - of course! This is a pet peeve of mine, so I'm going to rant a bit here:
Why is it that 99% (OK, so perhaps I exaggerate a little) of all security-related questions on this site are answered with a torrent of "there is no possible way to create a perfectly secure program" - that is not a helpful answer! Security is a sliding scale between perfect usability and no security at one end, and perfect security but no usability at the other.
The point is that you pick your position on that sliding scale depending on what you're trying to do and the environment in which your software will run. I'm not writing an app for a military installation, I'm writing an app for a home PC. I need to encrypt data across an untrusted network with a pre-known encryption key. In these cases, "security through obscurity" is probably good enough! Sure, someone with enough time, energy and skill could reverse-engineer the binary and find the password, but guess what? I don't care:
The time it takes me to implement a top-notch secure system is more expensive than the loss of sales due to the cracked versions (not that I'm actually selling this, but you get my point). This blue-sky "let's do it the absolute best way possible" trend in programming amongst new programmers is foolish to say the least.
Thank you for taking the time to answer this question - the answers were most helpful. Unfortunately I can only accept one answer, but I've up-voted all the useful answers.
Basically, anyone with access to your program and a debugger can and will find the key in the application if they want to.
But, if you just want to make sure the key doesn't show up when running strings on your binary, you could for instance make sure that the key is not within the printable range.
Obscuring key with XOR
For instance, you could use XOR to split the key into two byte arrays:
key = key1 XOR key2
If you create key1 with the same byte-length as key you can use (completely) random byte values and then compute key2:
key1[n] = crypto_grade_random_number(0..255)
key2[n] = key[n] XOR key1[n]
You can do this in your build environment, and then store only key1 and key2 in your application.
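A minimal sketch of the recombination step, assuming the two shares were generated at build time. The names kShare1, kShare2 and recombine are made up for illustration, and the four-byte key "key!" stands in for a real one:

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical build-time output: two shares whose XOR is the key "key!".
// In a real build, kShare1 would be freshly generated random bytes.
constexpr std::array<std::uint8_t, 4> kShare1 = {0xA7, 0x3C, 0xF1, 0x58};
constexpr std::array<std::uint8_t, 4> kShare2 = {0xCC, 0x59, 0x88, 0x79};

// Rebuild the key at runtime: key[n] = share1[n] XOR share2[n].
template <std::size_t N>
std::string recombine(const std::array<std::uint8_t, N>& a,
                      const std::array<std::uint8_t, N>& b)
{
    std::string key(N, '\0');
    for (std::size_t i = 0; i < N; ++i) {
        key[i] = static_cast<char>(a[i] ^ b[i]);
    }
    return key;
}
```

Calling recombine(kShare1, kShare2) yields "key!". An optimizer may fold the XOR at compile time, so it is worth running strings on the final binary to check that the key did not reappear.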
Protecting your binary
Another approach is to use a tool to protect your binary. For instance, there are several security tools that obfuscate your binary and run it inside a virtual machine of their own. This makes it hard(er) to debug, and it is also the conventional way many commercial-grade secure applications (and, alas, malware) are protected.
One of the premier tools is Themida, which does an awesome job of protecting your binaries. It is often used by well-known programs, such as Spotify, to protect against reverse engineering. It has features to prevent debugging in programs such as OllyDbg and IDA Pro.
There is also a larger list, maybe somewhat outdated, of tools to protect your binary.
Some of them are free.
Password matching
Someone here discussed hashing password+salt.
If you need to store the key to match it against some kind of user-submitted password, you should use a one-way hashing function, preferably by combining username, password and a salt. The problem with this, though, is that your application has to know the salt to be able to compute the one-way hash and compare the resulting hashes. So you still need to store the salt somewhere in your application. But, as @Edward points out in the comments below, this will effectively protect against a dictionary attack using, e.g., rainbow tables.
Finally, you can use a combination of all the techniques above.
There is a (very light) header-only project, obfuscate, made by adamyaxley that works perfectly. It is based on lambda functions and macros, and it encrypts string literals with an XOR cipher at compile time. If needed, we can change the seed for each string.
The following code will not store the string "Hello World" in the compiled binary.
#include <iostream>
#include "obfuscate.h"
int main()
{
    std::cout << AY_OBFUSCATE("Hello World") << std::endl;
    return 0;
}
I have tested it with C++17 and Visual Studio 2019, checked the result in IDA, and I confirm the string is hidden. One precious advantage compared to ADVobfuscator is that the result is convertible to a std::string (while still being hidden in the compiled binary):
std::string var = AY_OBFUSCATE("string");
First of all, realise that there is nothing you can do that will stop a sufficiently determined hacker, and there are plenty of those around. The protection on every game and console around is cracked eventually, so this is only a temporary fix.
There are 4 things you can do that will increase your chances of staying hidden for a while.
1) Hide the elements of the string in some way -- something obvious like XORing (the ^ operator) the string with another string will be good enough to make the string impossible to search for.
2) Split the string into pieces -- split up your string and pop bits of it into strangely named methods in strange modules. Don't make it easy to search through and find the method with the string in it. Of course some method will have to call all these bits, but it still makes it a little harder.
3) Don't ever build the string in memory -- most hackers use tools that let them see the string in memory after you have encoded it. If possible, avoid this. If for example you are sending the key off to a server, send it character by character, so the whole string is never around. Of course, if you are using it from something like RSA encoding, then this is trickier.
4) Do an ad-hoc algorithm -- on top of all this, add a unique twist or two. Maybe just add 1 to everything you produce, or do any encryption twice, or add a sugar. This just makes it a little harder for the hacker who already knows what to look for when someone is using, for example, vanilla md5 hashing or RSA encryption.
Above all, make sure it isn't too important when (and it will be "when" if your application becomes popular enough) your key is discovered!
A strategy I've used in the past is to create an array of seemingly random characters. You initially insert your characters, and later locate them with an algebraic process where each step from 0 to N yields a number smaller than the size of the array, which is the index of the next char in your obfuscated string. (This answer is feeling obfuscated now!)
Example:
Given an array of chars (numbers and dashes are for reference only)
0123456789
----------
ALFHNFELKD
LKFKFLEHGT
FOKRKLFRFK
FJFJJFJ!JL
And an equation whose first six results are: 3, 6, 7, 10, 21, 37
Would yield the word "HELLO!" from the array above.
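The lookup can be sketched as follows, using a table laid out like the one above with 'O' at index 21 and '!' at index 37. The index list is hard-coded here as a stand-in for the equation; kTable, kIndices and decode are illustrative names:

```cpp
#include <cstddef>
#include <string>

// Illustrative 40-character table; the message characters sit at the indices below.
const char kTable[] =
    "ALFHNFELKD"
    "LKFKFLEHGT"
    "FOKRKLFRFK"
    "FJFJJFJ!JL";

// Hard-coded stand-in for "an equation"; a real one would compute these values.
const std::size_t kIndices[] = {3, 6, 7, 10, 21, 37};

// Picks one character per index and concatenates them.
std::string decode()
{
    std::string out;
    for (std::size_t idx : kIndices) {
        out += kTable[idx];
    }
    return out;
}
```

The table itself still shows up in a strings dump, but the message does not appear in it as a contiguous run.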
I agree with @Checkers, your executable can be reverse-engineered.
A slightly better way is to build the key dynamically, for example:
std::string myKey = part1() + part2() + ... + partN();
Of course, storing private data in software which is shipped to the user is always a risk. Any sufficiently educated (and dedicated) engineer could reverse engineer the data.
That being said, you can often make things secure enough by raising the barrier which people need to overcome to reveal your private data. That's usually a good compromise.
In your case, you could clutter your strings with non-printable data, and then decode that at runtime using a simple helper function, like this:
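#include <string>

// Keeps every second character (the payload) and drops the junk byte before it.
void unscramble( char *s )
{
    // Read (junk, payload) pairs; checking both bytes keeps us inside the buffer.
    for ( const char *str = s; str[0] != 0 && str[1] != 0; str += 2 ) {
        *s++ = str[1];
    }
    *s = '\0';
}
void f()
{
    char privateStr[] = "\001H\002e\003l\004l\005o";
    unscramble( privateStr ); // privateStr is 'Hello' now.
    std::string s = privateStr;
    // ...
}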
I've created a simple encryption tool for strings. It can automatically generate encrypted strings and has a few extra options. A few examples:
String as a global variable:
// myKey = "mysupersupersecretpasswordthatyouwillneverguess";
unsigned char myKey[48] = { 0xCF, 0x34, 0xF8, 0x5F, 0x5C, 0x3D, 0x22, 0x13, 0xB4, 0xF3, 0x63, 0x7E, 0x6B, 0x34, 0x01, 0xB7, 0xDB, 0x89, 0x9A, 0xB5, 0x1B, 0x22, 0xD4, 0x29, 0xE6, 0x7C, 0x43, 0x0B, 0x27, 0x00, 0x91, 0x5F, 0x14, 0x39, 0xED, 0x74, 0x7D, 0x4B, 0x22, 0x04, 0x48, 0x49, 0xF1, 0x88, 0xBE, 0x29, 0x1F, 0x27 };
myKey[30] -= 0x18;
myKey[39] -= 0x8E;
myKey[3] += 0x16;
myKey[1] += 0x45;
myKey[0] ^= 0xA2;
myKey[24] += 0x8C;
myKey[44] ^= 0xDB;
myKey[15] ^= 0xC5;
myKey[7] += 0x60;
myKey[27] ^= 0x63;
myKey[37] += 0x23;
myKey[2] ^= 0x8B;
myKey[25] ^= 0x18;
myKey[12] ^= 0x18;
myKey[14] ^= 0x62;
myKey[11] ^= 0x0C;
myKey[13] += 0x31;
myKey[6] -= 0xB0;
myKey[22] ^= 0xA3;
myKey[43] += 0xED;
myKey[29] -= 0x8C;
myKey[38] ^= 0x47;
myKey[19] -= 0x54;
myKey[33] -= 0xC2;
myKey[40] += 0x1D;
myKey[20] -= 0xA8;
myKey[34] ^= 0x84;
myKey[8] += 0xC1;
myKey[28] -= 0xC6;
myKey[18] -= 0x2A;
myKey[17] -= 0x15;
myKey[4] ^= 0x2C;
myKey[9] -= 0x83;
myKey[26] += 0x31;
myKey[10] ^= 0x06;
myKey[16] += 0x8A;
myKey[42] += 0x76;
myKey[5] ^= 0x58;
myKey[23] ^= 0x46;
myKey[32] += 0x61;
myKey[41] ^= 0x3B;
myKey[31] ^= 0x30;
myKey[46] ^= 0x6C;
myKey[35] -= 0x08;
myKey[36] ^= 0x11;
myKey[45] -= 0xB6;
myKey[21] += 0x51;
myKey[47] += 0xD9;
As unicode string with decryption loop:
// myKey = "mysupersupersecretpasswordthatyouwillneverguess";
wchar_t myKey[48];
myKey[21] = 0x00A6;
myKey[10] = 0x00B0;
myKey[29] = 0x00A1;
myKey[22] = 0x00A2;
myKey[19] = 0x00B4;
myKey[33] = 0x00A2;
myKey[0] = 0x00B8;
myKey[32] = 0x00A0;
myKey[16] = 0x00B0;
myKey[40] = 0x00B0;
myKey[4] = 0x00A5;
myKey[26] = 0x00A1;
myKey[18] = 0x00A5;
myKey[17] = 0x00A1;
myKey[8] = 0x00A0;
myKey[36] = 0x00B9;
myKey[34] = 0x00BC;
myKey[44] = 0x00B0;
myKey[30] = 0x00AC;
myKey[23] = 0x00BA;
myKey[35] = 0x00B9;
myKey[25] = 0x00B1;
myKey[6] = 0x00A7;
myKey[27] = 0x00BD;
myKey[45] = 0x00A6;
myKey[3] = 0x00A0;
myKey[28] = 0x00B4;
myKey[14] = 0x00B6;
myKey[7] = 0x00A6;
myKey[11] = 0x00A7;
myKey[13] = 0x00B0;
myKey[39] = 0x00A3;
myKey[9] = 0x00A5;
myKey[2] = 0x00A6;
myKey[24] = 0x00A7;
myKey[46] = 0x00A6;
myKey[43] = 0x00A0;
myKey[37] = 0x00BB;
myKey[41] = 0x00A7;
myKey[15] = 0x00A7;
myKey[31] = 0x00BA;
myKey[1] = 0x00AC;
myKey[47] = 0x00D5;
myKey[20] = 0x00A6;
myKey[5] = 0x00B0;
myKey[38] = 0x00B0;
myKey[42] = 0x00B2;
myKey[12] = 0x00A6;
for (unsigned int fngdouk = 0; fngdouk < 48; fngdouk++) myKey[fngdouk] ^= 0x00D5;
String as a global variable with decryption loop:
// myKey = "mysupersupersecretpasswordthatyouwillneverguess";
unsigned char myKey[48] = { 0xAF, 0xBB, 0xB5, 0xB7, 0xB2, 0xA7, 0xB4, 0xB5, 0xB7, 0xB2, 0xA7, 0xB4, 0xB5, 0xA7, 0xA5, 0xB4, 0xA7, 0xB6, 0xB2, 0xA3, 0xB5, 0xB5, 0xB9, 0xB1, 0xB4, 0xA6, 0xB6, 0xAA, 0xA3, 0xB6, 0xBB, 0xB1, 0xB7, 0xB9, 0xAB, 0xAE, 0xAE, 0xB0, 0xA7, 0xB8, 0xA7, 0xB4, 0xA9, 0xB7, 0xA7, 0xB5, 0xB5, 0x42 };
for (unsigned int dzxykdo = 0; dzxykdo < 48; dzxykdo++) myKey[dzxykdo] -= 0x42;
It depends somewhat on what you are trying to protect, as joshperry points out.
From experience, I would say that if it is part of some licensing scheme to protect your software, then don't bother. They will eventually reverse engineer it. Simply use a simple cipher like ROT-13 to protect it from simple attacks (like running strings over it).
If it is to secure users' sensitive data, I would question whether protecting that data with a private key stored locally is a wise move. Again, it comes down to what you are trying to protect.
EDIT: If you are going to do it then a combination of techniques that Chris points out will be far better than rot13.
As was said before, there's no way to totally protect your string. But there are ways to protect it with a reasonable safety.
When I had to do this, I put some innocent-looking string into the code (a copyright notice, for example, or some faked user prompt or anything else that won't be changed by someone fixing unrelated code), encrypted that using itself as a key, hashed that (adding some salt), and used the result as a key to encrypt what I actually wanted to encrypt.
Of course this could be hacked, but it does take a determined hacker to do so.
If you are on Windows, use DPAPI: http://msdn.microsoft.com/en-us/library/ms995355.aspx
As a previous post said, if you are on a Mac, use the keychain.
Basically all of these cute ideas about how to store your private key inside your binary are sufficiently poor from a security perspective that you should not do them. Anyone getting your private key is a big deal, so don't keep it inside your program. Depending on how important your app is, you can keep your private keys on a smart card, on a remote computer your code talks to, or you can do what most people do and keep it in a very secure place on the local computer (the "key store", which is kind of like a weird secure registry) that is protected by permissions and all the strength of your OS.
This is a solved problem and the answer is NOT to keep the key inside your program :)
Try this. The source code explains how to encrypt and decrypt on the fly all strings in a given Visual Studio c++ project.
One method I recently tried is:
Take hash (SHA256) of the private data and populate it in code as part1
Take XOR of private data and its hash and populate it in code as part2
Populate data: Don't store it as char str[], but populate it at runtime using assignment instructions (as shown in the macro below)
Now, generate the private data on run time by taking the XOR of part1 and part2
Additional step: Calculate hash of generated data and compare it with part1. It will verify the integrity of private data.
MACRO to populate data:
Suppose the private data is 4 bytes long. We define a macro for it which saves the data with assignment instructions in some random order.
#define POPULATE_DATA(str, i0, i1, i2, i3)\
{\
char *p = str;\
p[3] = i3;\
p[2] = i2;\
p[0] = i0;\
p[1] = i1;\
}
Now use this macro in code where you need to save part1 and part2, as follows:
char part1[4] = {0};
char part2[4] = {0};
POPULATE_DATA(part1, 1, 2, 3, 4);
POPULATE_DATA(part2, 5, 6, 7, 8);
Instead of storing private key in your executable, you may want to request it from the user and store it by means of an external password manager, something similar to Mac OS X Keychain Access.
This is context dependent, but you could just store the hash of the key plus a salt (a constant string, easy to obscure).
Then when (if) the user enters the key, you add the salt, calculate the hash and compare.
The salt is probably unnecessary in this case; its job is to stop a brute-force dictionary attack if the hash can be isolated (a Google search for the hash has also been known to work).
A hacker still only has to insert a jmp instruction somewhere to bypass the whole lot, but that's rather more complicated than a simple text search.
In a previous question, I asked if it was possible to write and execute assembly commands in memory. I got some nice responses, and after a bit more research, I figured out how to do it. Now that I can do it, I am having trouble figuring out what to write to memory (and how to do it correctly). I know some assembly and how the mnemonics translate to opcodes, but I can't figure out how to use the opcodes correctly.
Here's an example I'm trying to get working:
void(*test)() = NULL; //create function pointer, initialize to NULL
void* hold_address = VirtualAlloc(NULL, 5*1024, MEM_COMMIT, PAGE_EXECUTE_READWRITE); //allocate memory, make writable/ readable/ executable
unsigned char asm_commands[] = {0x55, 0x89, 0xE5, 0x83, 0xEC, 0x18, 0xC7, 0x04, 0x24, 0x41, 0xE8, 0x1E, 0xB3, 0x01, 0x00, 0xC9, 0xC3}; //create array of assembly commands, hex values
memcpy(hold_address, asm_commands, sizeof(asm_commands)[0]*10); //copy the array into the reserved memory
test = (void(*)())hold_address; //set the function pointer to start of the allocated memory
test(); //call the function
Just placing 0xC3 into the asm_commands array works (and the function just returns), but that's boring. The series of opcodes (and addresses) I have in there right now is supposed to print out the character "A" (capital a). I got the opcodes and addresses from debugging a simple program that calls printf("A") and finding the call in memory. Right now, the program fails with error 0xC0000096, "privileged command". I think the error stems from trying to call the system putchar address directly, which the system doesn't like. I also think I can bypass that by giving my program Ring 0 access, but I hardly know what that entails other than a lot of potential problems.
So is there any way to call the printf() function (in assembly opcodes) without needing higher privileges?
I'm using Windows 7, 64-bit, Code::Blocks 10.05 (GNU GCC Compiler).
Here's a screenshot of the debugged printf() call (in OllyDebug):
unsigned char asm_commands[] = {0x55, 0x89E5…
Whoa, hang on, stop right there. 0x89E5 isn't a valid value for an unsigned char, and your compiler should probably be complaining about this. (If not, check your settings; you've probably disabled some very important warnings.)
You'll need to split your code in this initializer up into individual bytes, e.g.
{0x55, 0x89, 0xE5, …
Nothing gets printed because you forgot the three zero bytes of the dword 0x00000041 and mistakenly wrote 0x1A instead of 0x1E.
unsigned char asm_commands[] = {0x55, 0x89, 0xE5, 0x83, 0xEC, 0x18, 0xC7, 0x04, 0x24, 0x41, 0x00, 0x00, 0x00, 0xE8, 0x1E, 0xB3, 0x01, 0x00, 0xC9, 0xC3}; //create array of assembly commands, hex values
In addition to what @duskwuff and @user3144770 wrote: did you change the following line to include every byte?
memcpy(hold_address, asm_commands, sizeof(asm_commands)[0]*10);
I have counted 20 bytes of assembly code!
memcpy(hold_address, asm_commands, sizeof(asm_commands)[0]*20);
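Note that sizeof(asm_commands)[0] parses as sizeof(asm_commands[0]), which is 1, so the multiplier has to be kept in sync by hand. A sketch of letting sizeof on the whole array supply the length instead; copy_commands is a made-up helper that only copies the bytes and never executes them:

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// The 20 bytes from the corrected array above; here they are only copied.
const unsigned char asm_commands[] = {
    0x55, 0x89, 0xE5, 0x83, 0xEC, 0x18, 0xC7, 0x04, 0x24, 0x41,
    0x00, 0x00, 0x00, 0xE8, 0x1E, 0xB3, 0x01, 0x00, 0xC9, 0xC3};

// Copies the whole array into a freshly sized buffer and returns it.
std::vector<unsigned char> copy_commands()
{
    // sizeof on the array itself is its total size in bytes,
    // so the copy length follows the initializer automatically.
    std::vector<unsigned char> dest(sizeof(asm_commands));
    std::memcpy(dest.data(), asm_commands, sizeof(asm_commands));
    return dest;
}
```

In the original program the equivalent call would be memcpy(hold_address, asm_commands, sizeof(asm_commands)).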
I am having a problem with a school assignment. The assignment is to write a metamorphic Hello World program. This program will produce 10 .com files that print "Hello World!" when executed. Each of the 10 .com files must be different from the others. I understand the concept of metamorphic vs oligomorphic vs polymorphic. My program currently creates 10 .com files and then writes the machine code to the files. I began by simply writing only the machine code to print hello world and tested it. It worked just fine. I then tried to add a decryption routine to the beginning of the machine code. Here is my current byte array:
#define ARRAY_SIZE(array) (sizeof((array))/sizeof((array[0])))
typedef unsigned char BYTE;
BYTE pushCS = 0x0E;
BYTE popDS = 0x1F;
BYTE movDX = 0xBA;
BYTE helloAddr1 = 0x1A;
BYTE helloAddr2 = 0x01;
BYTE movAH = 0xB4;
BYTE nine = 0x09;
BYTE Int = 0xCD;
BYTE tOne = 0x21;
BYTE movAX = 0xB8;
BYTE ret1 = 0x01;
BYTE ret2 = 0x4C;
BYTE movBL = 0xB3;
BYTE keyVal = 0x03; // Encrypt/Decrypt key
BYTE data[] = { 0x8D, 0x0E, 0x01, 0xB7, 0x1D, 0xB3, keyVal, 0x30, 0x1C, 0x46, 0xFE, 0xCF, 0x75, 0xF9,
movDX, helloAddr1, helloAddr2, movAH, nine, Int, tOne, movAX, ret1, ret2, Int, tOne,
0x48, 0x65, 0x6C, 0x6C, 0x6F, 0x20, 0x57, 0x6F, 0x72, 0x6C, 0x64, 0x21, 0x0D, 0x0D, 0x0A, 0x24 };
The decryption portion of the machine code is the first 14 bytes of "data". This decryption routine would take the obfuscated machine code bytes and decrypt them by xor-ing the bytes with the same key that was used to encrypt them. I am encrypting the bytes in my C++ code with this:
for (int i = 15; i < ARRAY_SIZE(data); i++)
{
data[i] ^= keyVal;
}
I have verified over and over again that my addressing is correct, considering that the code begins at offset 0x100. What I have noticed is that when keyVal is 0x00, my code runs fine and I get 10 .com files that print Hello World!. However, this does me no good, as 0x00 leaves everything unchanged. When I provide an actual key like 0x02, my program no longer works. It simply hangs until I close out DosBox. Any hints as to the cause of this would be a great help. I have some interesting plans for junk insertion (the actual metamorphic part), but I don't want to move on to that until I figure out this encrypt/decrypt issue.
The decryption portion of the machine code is the first 14 bytes of "data".
and
for (int i = 15; i < ARRAY_SIZE(data); i++)
do not match since in C++ array indexes start at 0.
In your array data[15] == helloAddr1 which means you are not encrypting the data[14] == movDX element. Double-check which elements should be encrypted and start at i = 14 if required.
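One way to avoid this kind of off-by-one is to name the stub length and start the loop there. A small sketch, assuming a 14-byte decryption stub; kStubSize and xor_range are illustrative names, not part of the original program:

```cpp
#include <cstddef>
#include <vector>

typedef unsigned char BYTE;

// Assumed length of the decryption stub; indices 0..13 stay in the clear.
const std::size_t kStubSize = 14;

// XORs every byte from index `first` to the end with `key`.
// Applying it twice with the same key restores the original bytes.
void xor_range(std::vector<BYTE>& bytes, std::size_t first, BYTE key)
{
    for (std::size_t i = first; i < bytes.size(); ++i) {
        bytes[i] ^= key;
    }
}
```

Calling xor_range(data, kStubSize, keyVal) then touches data[14] and everything after it, and leaves the stub itself untouched.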
I have a buffer type like this:
unsigned char buffer[] = {
0xB8, 0xB8, 0x00, 0xB8, 0xB8, 0x00, 0xB8, 0xB8, 0x00, 0xB8, 0xB8, 0x00,..
};
So I need to remove the null byte at every X-th position (after every 2 bytes in this example). I don't want to remove every null byte, because other null bytes in the buffer are part of the data.
So I just need to remove bytes at a regular interval, using plain C++ or WinAPI. How can I do that?
I'm still not very comfortable with C++, and the buffer can be big.
I think the right way is to copy the buffer with memcpy in a loop, but I can't find the syntax.
It seems that you don't want to use any of the more powerful features of C++ so I suspect that you are really looking for a C style routine. That would look like this:
#include <stddef.h>

/* Copies src to dest, dropping every skip-th byte (skip must be non-zero). */
void copyskip(void *dest, const void *src, size_t srclen, size_t skip)
{
    size_t destidx = 0;
    for (size_t srcidx = 0; srcidx < srclen; srcidx++)
    {
        if ((srcidx + 1) % skip != 0)
        {
            ((char *)dest)[destidx] = ((const char *)src)[srcidx];
            destidx++;
        }
    }
}
You'd need to allocate the destination buffer before calling. And for your example you would pass 3 for the skip parameter.
Personally I'd much rather do it using C++ standard containers, but this is what I think you asked for.
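For comparison, a sketch of the same routine using std::vector, which sizes the destination for you. The name copy_skip and the choice to return the input unchanged when skip is 0 are assumptions made here:

```cpp
#include <cstddef>
#include <vector>

// Returns a copy of `src` with every `skip`-th byte dropped
// (for skip == 3, the bytes at indices 2, 5, 8, ... are removed).
std::vector<unsigned char> copy_skip(const std::vector<unsigned char>& src,
                                     std::size_t skip)
{
    if (skip == 0) {
        return src;  // nothing to drop; also avoids a modulo by zero
    }
    std::vector<unsigned char> dest;
    dest.reserve(src.size());
    for (std::size_t i = 0; i < src.size(); ++i) {
        if ((i + 1) % skip != 0) {
            dest.push_back(src[i]);
        }
    }
    return dest;
}
```

The caller gets the new length for free through dest.size(), which the C-style version does not report.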
I need to create a program that extracts an image file when it runs. To do this I used a char array to store the data, e.g.:
char data[]="ÿØÿà......";
I opened the image with a hex editor, copied the data and pasted it as above, but it gives many errors (that may be because the image data contains bytes that have no ASCII character, e.g. NUL).
Can someone give me some advice on how to do this, i.e. how to create a byte array?
Thanks in advance for any advice.
You should use a numeric initializer instead of a string literal... for example
const unsigned char data[] = { 0x01, 0x02, 0x03, 0x04,
0x05, 0x06, 0x07, 0x08 };
A simple way is writing a small script that generates the source code by reading the file... in Python it would be something like
data = open("datafile", "rb").read()
i = 0
while i < len(data):
    chunk = data[i:i+8]
    # In Python 3, iterating over bytes yields integers, so no ord() is needed.
    print("".join("0x%02x, " % b for b in chunk))
    i += 8
Read the data from the file using fopen or fstream. If you want to embed the file in the exe, use a resource compiler.
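For the original goal of extracting an embedded image at runtime, a rough sketch with std::ofstream could look like this. The six bytes and the name extract_to are placeholders, not real image data:

```cpp
#include <cstddef>
#include <fstream>
#include <string>

// Placeholder bytes; a real program would list the image's bytes here.
const unsigned char data[] = {0xFF, 0xD8, 0xFF, 0xE0, 0x00, 0x10};

// Writes the embedded bytes to `path`; returns true on success.
bool extract_to(const std::string& path)
{
    // Binary mode matters: text mode may translate 0x0A bytes on Windows.
    std::ofstream out(path, std::ios::binary);
    if (!out) {
        return false;
    }
    out.write(reinterpret_cast<const char*>(data), sizeof(data));
    out.flush();
    return out.good();
}
```

Because the array is a numeric initializer rather than a string literal, embedded zero bytes are written out like any other byte.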
I have the following code:
static unsigned char S0_gif[] = {
0x47, 0x49, 0x46, 0x38, 0x39, 0x61, 0x0f, 0x00, 0x0f, 0x00, 0x91, 0x02,
..
};
It's a hex representation of a gif file. I have 500 gifs that I need to store like that so I want to use a vector to make it easier for access.
Something like:
vector<char[]> gifs;
gif.push_back( {0x47, 0x49,..} );
Then in the loop:
{
MakeImage(gif[i], sizeof gif[i] );
}
I cannot find the right code for that. Any help would be greatly appreciated.
Petry
You can't do that, because vectors store constant-sized elements, and yours are variable-sized. What you can do, however, is store a vector of vectors :)
vector<vector<unsigned char> > gifs; // note the necessary space between > > (before C++11)
gifs.push_back( vector<unsigned char>( S0_gif, S0_gif + sizeof(S0_gif) ) );
Then in the loop:
{
MakeImage( gifs[i] );
}
Another idea, if they are indeed stored as static variables, is not to store the data twice:
vector< unsigned char * > gifs;
vector< size_t > gifsizes;
gifs.push_back( S0_gif );
gifsizes.push_back( sizeof(S0_gif) );
Then in the loop:
{
MakeImage( gifs[i], gifsizes[i] );
}
Disclaimer : I probably forgot some &'s, feel free to correct me.
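The pointer-plus-size idea can also be kept in a single vector, so the two lists cannot get out of step. A sketch assuming C++11; Blob, make_blob and all_gifs are hypothetical names, and the two short arrays are stand-ins for the real GIF data:

```cpp
#include <cstddef>
#include <vector>

// Pairs a pointer to a static array with its size in bytes.
struct Blob {
    const unsigned char* data;
    std::size_t size;
};

// Deduces N from the array type, so the size cannot drift from the data.
template <std::size_t N>
Blob make_blob(const unsigned char (&arr)[N])
{
    return Blob{arr, N};
}

// Stand-ins for the real GIF arrays (hypothetical, truncated contents).
static const unsigned char S0_gif[] = {0x47, 0x49, 0x46, 0x38, 0x39, 0x61};
static const unsigned char S1_gif[] = {0x47, 0x49, 0x46, 0x38, 0x37, 0x61, 0x01};

// Collects every image as a (pointer, size) pair without copying the bytes.
std::vector<Blob> all_gifs()
{
    return {make_blob(S0_gif), make_blob(S1_gif)};
}
```

The loop then becomes MakeImage( gifs[i].data, gifs[i].size ), assuming MakeImage accepts a const pointer.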
Looks like you are storing all 500 GIF files in a row. You cannot detect size of each without parsing its header. If your function MakeImage could parse GIF header you could return pointer to the next image from it.
Then the loop will look like:
unsigned char* img_ptr = S0_gif;
while ( img_ptr ) img_ptr = MakeImage( img_ptr );
I believe that the best solution is to generate a C/CPP file that declares a vector of images. All the rest means writing code, which is not generally recommended for a lot of initialization (my opinion).
unsigned char *Array[]={
S0_gif,
S1_gif,
S2_gif,
S3_gif,
...
};
The code for generating this can be easily written in a scripting language (bash, perl, python, etc). It should be something like this:
print("unsigned char *Array[] = {")
for i in range(0, 500):
    print("    S" + str(i) + "_gif,")
print("};")
Is this a solution to your question?