Safely embedding a string in C code (Secure string, Secure char*) - c++

I have a DLL (ANSI C) that has some string literals defined.
__declspec(dllexport) char* GetSomeString()
{
return "This is a test string from TestLib.dll";
}
When compiled this string is still visible in "notepad" for example. I'm fairly new to C, so I was wondering, is there a way to safely store string literals?
Should I do it with a resx file (for example), that has some encrypted values, or what would be the best way?
Thanks
EDIT 1:
The scenario is basically the following in pseudo code:
if(hostname)
return hostname
else
return "Literal String";
It's this "literal string" that I would like to see "secured" in some way..

Don't put your secrets on anyone else's computer if you want them to stay secret.
See my related answer, The #1 Law of Software Licensing
And Eric Lippert's similar answer

First of all, since your executable [1] needs to decode that literal in memory, any sufficiently determined attacker will be able to do the same; often it is as easy as freezing the process after startup (or after it has used the string in question), creating a memory dump and running a utility like strings over it. There are ways to mitigate the issue (e.g. zeroing the memory used by a sensitive string immediately after using it), but since your code runs on a machine where the potential attacker has all the privileges, you can only put up roadblocks: in the end your executable is completely in the attacker's hands.
That being said, if your concern is just "not leaving important strings en plein air", you may just run an executable packer/encrypter over your whole dll. This is as easy as adding a post-build step to your solution: the packer will compress/encrypt the whole executable image and build an executable that, when launched, will decrypt and run it in memory.
This method has the great advantage of not requiring any change to your code: you just run upx over the compiled dll and you get your compressed dll; no XORs or weird literals spread across your code are needed.
Of course, this is quite weak security (basically it will just protect against snooping around in the executable with notepad or a hex editor), but again, storing critical "secrets" in an executable that is going to be distributed is a bad idea in the first place.
[1] Throughout this answer, "executable" is meant in the broad sense, i.e. dlls are included too.

You probably want to store hardcoded passwords in the library, right? You can XOR the string with some value, and store it, then read it and XOR again. It's the simplest way, but it doesn't protect your string from any kind of disassembling/reverse engineering.
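A minimal sketch of the XOR idea, assuming a single-byte key: the bytes are XORed ahead of time and only the encoded array is compiled in, then decoded at runtime. The names kXorKey, kEncoded and xor_decode are hypothetical, and the encoded bytes correspond to the sample text "Test". As the answer says, this only hides the text from a casual look with a text editor; the key sits right next to the data.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical single-byte key chosen for this sketch.
constexpr unsigned char kXorKey = 0x5A;

// The text "Test" XORed byte by byte with kXorKey, computed ahead of time
// so that the plain text is not stored as a literal in the binary.
constexpr unsigned char kEncoded[] = {0x0E, 0x3F, 0x29, 0x2E};

// XORs each byte with the key. Because XOR is its own inverse, the same
// routine is used for encoding and decoding.
std::string xor_decode(const unsigned char* data, std::size_t len, unsigned char key)
{
    assert(data != nullptr || len == 0);
    std::string out;
    out.reserve(len);
    for (std::size_t i = 0; i < len; ++i) {
        out.push_back(static_cast<char>(data[i] ^ key));
    }
    return out;
}
```

Calling xor_decode(kEncoded, sizeof kEncoded, kXorKey) yields the original text, which then lives in process memory like any other string.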

Related

Test environment for an Online Judge

I am planning to build an Online Judge on the lines of CodeChef, TechGig, etc. Initially, I will be accepting solutions only in C/C++.
Have thought through a security model for the same, but my concern as of now is how to model the execution and testing part.
Method 1
The method that seems to be more popular is to redirect standard input to the executable and redirect standard output to a file, for example:
./submission.exe < input.txt > output.txt
Then compare the output.txt file with some solution.txt file character by character and report the results.
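The character-by-character comparison could be sketched roughly as follows. The function name same_contents is hypothetical; it takes generic input streams, so it would work with std::ifstream objects opened on output.txt and solution.txt as well as with in-memory streams. A real judge would probably also want to decide how to treat trailing whitespace, which this sketch does not do.

```cpp
#include <cassert>
#include <istream>
#include <sstream>

// Returns true when both streams hold exactly the same sequence of
// characters, reading them in lockstep one character at a time.
bool same_contents(std::istream& actual, std::istream& expected)
{
    char a = 0;
    char b = 0;
    while (true) {
        const bool got_a = static_cast<bool>(actual.get(a));
        const bool got_b = static_cast<bool>(expected.get(b));
        if (!got_a || !got_b) {
            return got_a == got_b;  // equal only if both ended together
        }
        if (a != b) {
            return false;           // first differing character
        }
    }
}
```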
Method 2
A second approach that I have seen is not to allow the users to write main(). Instead, write a function that accepts some arguments in the form of strings and set a global variable as the output. For example:
//This variable should be set before returning from submissionAlgorithm()
char * output;
void submissionAlgorithm(char * input1, char * input2)
{
//Write your code here.
}
At each step, and for a test case to be executed, the function submissionAlgorithm() is repeatedly called and the output variable is checked for results.
From an initial analysis I found that Method 2 would not only be secure (I would prevent all read and write access to the filesystem from the submitted code), but also make the execution of test cases faster (maybe?) since the computation of test results would occur in memory.
I would like to know if there is any reason as to why Method 1 would be preferred over Method 2.
P.S: Of course, I would be hosting the online judge engine on a Linux Server.
Don't take this the wrong way, but you will need to look at security from a much higher perspective. The problem will not be the input and output being written to a file, and that should not affect performance too much either. But you will need to manage submissions that can actually take down your process (in the second case) or the whole system (with calls to the OS to write to disk, acquire too much memory...).
Disclaimer: I am by no means a security expert.

C++ Win32 Replace Strings in an Executable

I'm looking for a good way to replace several strings inside a native win32 compiled exe. For example, I have the following in my code:
const char *updateSite = "http://www.place.com";
const char *updateURL = "/software/release/updater.php";
I need to modify these strings with other arbitrary length strings within the exe. I realize I could store this type of configuration elsewhere, but keeping it in the exe meets the portability requirements for my app. I would appreciate any help and/or advice on the best way to do this.
Thanks!
Update: I found some code in the Metasploit project that seems to do this:
MSF:Util:Exe
I would not mess around with the EXE itself. If you really need a single file, then do the old zip append trick and put your configs in there.
Could look like this:
> BINARY DATA
> ZIP FILE DATA
> 32-bit unsigned int whose value is the size of the appended zip file
Pros:
easy to extend / maintain
you don't mess with the exe itself
you can put lots of stuff in there
Cons:
You need to link some compression lib
If you don't want to zip it, then just write a simple uncompressed archive format of your own.
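A rough sketch of reading such a layout back, assuming the trailing 32-bit size is stored little-endian (the answer does not specify a byte order) and that the whole file has already been read into memory. The function name extract_appended_payload is hypothetical; the returned bytes would then be handed to whatever zip or custom archive reader you use.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Extracts the payload appended to an executable image laid out as
//   [binary data][payload][4-byte little-endian payload size]
// Returns an empty vector if the trailer is missing or inconsistent.
std::vector<std::uint8_t> extract_appended_payload(const std::vector<std::uint8_t>& file)
{
    constexpr std::size_t kTrailer = 4;
    if (file.size() < kTrailer) {
        return {};
    }

    const std::size_t n = file.size();
    const std::uint32_t payload_size =
          static_cast<std::uint32_t>(file[n - 4])
        | (static_cast<std::uint32_t>(file[n - 3]) << 8)
        | (static_cast<std::uint32_t>(file[n - 2]) << 16)
        | (static_cast<std::uint32_t>(file[n - 1]) << 24);

    if (payload_size > n - kTrailer) {
        return {};  // claimed size is larger than what precedes the trailer
    }

    const auto payload_end = file.end() - static_cast<std::ptrdiff_t>(kTrailer);
    const auto payload_begin = payload_end - static_cast<std::ptrdiff_t>(payload_size);
    return std::vector<std::uint8_t>(payload_begin, payload_end);
}
```

Reading the size from the end of the file means the program never needs to know where its own image stops, so the same build can carry payloads of any length.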
A PE file contains a global relocation table: a list of addresses (for example, of global variables or constants that must be stored at runtime, such as strings) that must be altered by the PE loader. If you knew which entry this particular variable was, you could get its address and then alter it manually. However, this would be a total bitch and you'd need in-depth knowledge of your favourite compiler and the PE format. It's easier just to use XML or Lua or something else that's totally portable; they were invented for exactly this kind of purpose.
Edit:
Why not just use a const char**? Is there something wrong with this being a normal runtime variable?
IMO the best place to store those strings is in a string table resource. It's incorporated into your .EXE file, so portability will not be compromised.
Use the Visual Studio editor to alter those values.
Use the LoadString WinAPI function, or better, the CString::LoadString method, in your code to load the values.
There is also third-party software that allows you to modify the strings in the compiled .EXE without recompilation.

Encrypting password in compiled C or C++ code

I know how to compile C and C++ source files using GCC and CC in the terminal; however, I would like to know if it's safe to include passwords in these files once compiled.
For example, I check user input against a certain password, e.g. 123, but it appears that compiled C/C++ programs can be decompiled.
Is there any way to compile a C/C++ source file while keeping the source completely hidden?
If not, could anyone provide a small example of hashing the input and then checking it against the password, e.g. with SHA1 or MD5?
No, you can't securely include a password in your source file. Strings in an executable file are stored as plain text; anyone with a text editor can easily look at your password.
A not-so-secure approach that would nonetheless trip up some people is to store a hash of the password instead. So, basically:
enc = "03ac674216f3e15c761ee1a5e255f067953623c8b388b4459e13f978d7c846f4"
bool check() {
    pass = getPassFromUser();
    encpass = myHashingFunction(pass);
    return encpass == enc;
}
This will deter some people, but it isn't really much more secure: it is relatively trivial for an assembly hacker to replace the 'enc' string in your executable with another sha256-encoded string with a known cleartext value.
Even if you use a separate authentication server, it is not difficult to setup a bogus authentication server and fool your program connect to this bogus authentication server.
Even if you use SHA1 to generate a hash, it is not really all that safe if you do it in the normal way (write a function to check a password). Any determined or knowledgeable hacker given access to the executable will be able to get around it (replace your hash with a known hash, or just replace the checkPassword() call with a call that returns true).
The question is who are you trying to protect against? Your little brother, a hacker, international spies, industrial espionage?
Using SHA1 with the hash just contained in the code (or a config file) will only protect against your little brother (read: casual computer users who can't be bothered to try to hack your program instead of paying the shareware price). In this case, using a plain text password or a SHA1 hash makes little difference (maybe a couple of percent more people will not bother).
If you want to make your code safe against anything else then you will need to do a lot more. A book on security is a good starting point but the only real way to do this is to take a security class where protection techniques are taught. This is a very specialized field and rolling your own version is likely to be counter productive and give you no real protection (using a hash is only the first step).
It is not recommended to keep any sensitive static data inside code. You can use configuration files for that. There you can store whatever you like.
But if you really want to do that first remember that the code can be easily changed by investigating with a debugger and modifying it. Only programs that user doesn't have access to can be considered safer (web sites for example).
The majority of login passwords (of different sites) are not stored in the clear in the database but are processed with algorithms such as MD5, SHA1, Blowfish, etc.
I'd suggest you use one of these algorithms from OpenSSL library.
What I would do is using some public-key cryptographic algorithm. This will probably take a little longer to be cracked because in my opinion there is nothing 100% sure when talking about software protection.
It's not safe if you store them as plain text, you can just dump the file or use a utility like strings to find text in the executable.
You will have to encode them in some manner.
Here is a code sample that might help you, using OpenSSL.
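#include <openssl/evp.h>
#include <cstddef>

// Writes the 32-byte SHA-256 digest of buf into res.
// Returns false if res is too small or any OpenSSL call fails.
bool SHA256Hash(const char* buf, size_t buflen, char* res, size_t reslen)
{
    if (reslen < 32)
        return false;

    // EVP_MD_CTX_new/EVP_MD_CTX_free need OpenSSL 1.1.0 or later;
    // older releases use EVP_MD_CTX_create/EVP_MD_CTX_destroy instead.
    EVP_MD_CTX* mdctx = EVP_MD_CTX_new();
    if (mdctx == NULL)
        return false;

    unsigned int len = 0;
    const bool ok =
        EVP_DigestInit_ex(mdctx, EVP_sha256(), NULL) == 1 &&
        EVP_DigestUpdate(mdctx, buf, buflen) == 1 &&
        EVP_DigestFinal_ex(mdctx, reinterpret_cast<unsigned char*>(res), &len) == 1;
    EVP_MD_CTX_free(mdctx);
    return ok && len == 32;
}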
I took this sample from the systools library and had to adapt it, so I'm not sure it compiles without modifications. However, it should help you.
Please note that, to determine if storing a hash value of some password in your binary is safe, we must know what you want it for.
If you expect it to forbid some functionalities of your program unless some special password is given, then it is useless: an attacker is likely to remove the whole password-check code instead of trying to guess or reverse the stored password.
Try finding out Hashing Functions and Ciphering Methods for securing your passwords and their storage.

Including huge string in our c++ programs?

I am trying to include huge string in my c++ programs, Its size is 20598617 characters , I am using #define to achieve it. I have a header file which contains this statement
#define "<huge string containing 20598617 characters>"
When I try to compile the program I get error as fatal error C1060: compiler is out of heap space
I tried following command line options with no success
/Zm200
/Zm1000
/Zm2000
How can I make successful compilation of this program?
Platform: Windows 7
You can't, not reliably. Even if it will compile, it's liable to break the runtime library, or the OS assumptions, and so forth.
If you tell us why you're trying to do it, we can offer lots of alternatives. Deciding how to handle arbitrarily large data is a major part of programming.
Edited to add:
Rather than guess, I looked into MSDN:
Prior to adjacent strings being
concatenated, a string cannot be
longer than 16380 single-byte
characters.
A Unicode string of about one half
this length would also generate this
error.
The page concludes:
You may want to store exceptionally
large string literals (32K or more) in
a custom resource or an external file.
What do other compilers say?
Further edited to add:
I created a file like this:
char s[] = {'x','x','x','x'};
I kept doubling the occurrences of 'x', testing each one as an #include file.
An 8388608 byte string succeeded; 16777216 bytes failed, with the "out of heap space" error.
I suspect you are running into a design limit on the size of a character string.
Most people really think that a million characters is long enough :-}
To avoid such design limits, I'd try not to put the whole thing into a single literal string. On the suspicion that #define macro bodies have similar limits, I'd try not to put the entire thing in a single #define, either.
Most C compilers will accept pretty big lists of individual characters as initializers. If you write
char c[]={ c1, c2, ... c20598617 };
with the c_i being your individual characters, you may succeed. I've seen GCC2 applications where there were 2 million elements like this (apparently they were loading some type of ROM image). You might even be able to group the c_i into blocks of K characters for K=100, 1000, 10000 as suits your tastes, and that might actually help the compiler.
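Nobody would type such an initializer by hand, so a small generator is the usual route. The sketch below is one possible way to produce it; the function name to_c_array and the choice of twelve bytes per line are assumptions for illustration. It emits hex byte values rather than character literals so that quotes, backslashes and non-printable bytes need no escaping.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// Emits C source text declaring `name` as an unsigned char array holding
// the bytes of `data`, twelve bytes per line. `name` must be a valid C
// identifier, and `data` should be non-empty to get a valid declaration.
std::string to_c_array(const std::string& name, const std::string& data)
{
    assert(!name.empty());
    static const char kHex[] = "0123456789abcdef";
    std::ostringstream out;
    out << "unsigned char " << name << "[] = {";
    for (std::size_t i = 0; i < data.size(); ++i) {
        const unsigned char byte = static_cast<unsigned char>(data[i]);
        out << (i % 12 == 0 ? "\n  " : " ");
        out << "0x" << kHex[byte >> 4] << kHex[byte & 0x0F];
        if (i + 1 != data.size()) {
            out << ',';
        }
    }
    out << "\n};\n";
    return out.str();
}
```

The generated text can be written to a header during the build and included like any other source file. The array carries no terminating zero byte unless one is part of the input.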
You might also consider running your string through a compression algorithm,
putting the compressed result into your C++ file by any of the above methods,
and decompressing after the program was loaded.
I suspect you can get a decompression algorithm into a few thousand bytes.
Store the string to a file and just open and read it...
It's much cleaner and more organized that way [I'm assuming that right now you have a file named blargh.h which contains that one #define...]
Um, store the string in a separate resource of some sort and load it in? Seriously, in embedded land, you would have this as a separate resource and not hold it in RAM. On windows, I believe you can use .dlls or other external resources to handle this for you. Compilers aren't designed to hold this size of resources for you and they will fail.
Increase the compiler heap space.
If your string comes from a large text or binary file, you may have luck with either the xxd -i command (to get everything in an array, per Ira Baxter's answer) or a variant of the bin2obj command (to get everything into a .o file you can link into the program).
Note that the string may not be null terminated in this case.
See answers to the earlier question, "How can I get the contents of a file at build time into my C++ string?"
(Also, as an aside: note the existence of the .xbm format.)
This is a very old question, but since there's no definitive answer yet: C++11's raw string literals seem to do the job.
This compiles nicely on GCC 4.8:
#include <string>
std::string data = R"(
... <1.4 MB of base85-encoded string> ...
)";
As said in other posts in this thread, this is definitely not the preferred way of handling large amounts of data.

Is there a 'catch' with FastFormat?

I just read about the FastFormat C++ i/o formatting library, and it seems too good to be true: Faster even than printf, typesafe, and with what I consider a pleasing interface:
// prints: "This formats the remaining arguments based on their order - in this case we put 1 before zero, followed by 1 again"
fastformat::fmt(std::cout, "This formats the remaining arguments based on their order - in this case we put {1} before {0}, followed by {1} again", "zero", 1);
// prints: "This writes each argument in the order, so first zero followed by 1"
fastformat::write(std::cout, "This writes each argument in the order, so first ", "zero", " followed by ", 1);
This looks almost too good to be true. Is there a catch? Have you had good, bad or indifferent experiences with it?
Last time I checked, there was one annoying catch:
You can only use either the narrow string version or the wide string version of this library. (The functions for wchar_t and char are the same -- which type is used is a compile time switch.)
With iostreams, stdio or Boost.Format you can use both.
Found one "catch", though for most people it will never manifest. From the project page:
Atomic operation. It doesn't write out statement elements one at a time, like the IOStreams, so has no atomicity issues
The only way I can see this happening is if it buffers the whole write() call's output itself, then writes it out to the ostream in one step. This means it needs to allocate memory, and if an object passed into the write() call produces a lot of output (several megabytes or more), it can consume up to twice that much memory in internal buffers (assuming it uses the grow-a-buffer-by-doubling-its-size-each-time trick).
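To make the speculated mechanism concrete, here is a sketch of "format everything into a private buffer, then write once". This is not FastFormat's actual implementation, only an illustration of the guess above using standard streams; write_buffered and append_all are hypothetical names.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Base case of the recursion: nothing left to append.
inline void append_all(std::ostringstream&) {}

// Appends the first argument to the buffer, then recurses on the rest.
template <typename T, typename... Rest>
void append_all(std::ostringstream& buf, const T& first, const Rest&... rest)
{
    buf << first;
    append_all(buf, rest...);
}

// Formats every argument into a private buffer, then hands the finished
// text to the destination stream in a single write call.
template <typename... Args>
void write_buffered(std::ostream& dest, const Args&... args)
{
    std::ostringstream buf;
    append_all(buf, args...);
    const std::string text = buf.str();
    dest.write(text.data(), static_cast<std::streamsize>(text.size()));
}
```

The memory cost mentioned above shows up here directly: the full output exists in buf, and again in text, before anything reaches the destination stream.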
If you're just using it for logging, and not, say, dumping huge amounts of XML, you'll never see this problem.
The only other "catch" I'm seeing is:
Highly portable. It will work with all good modern C++ compilers; it even works with Visual C++ 6!
So it won't work with an old C++ compiler, like cfront, whereas iostreams is backward compatible to the late 80's. Again, I'd be surprised if anyone ever had a problem with this.
Although FastFormat is a good library there are a number of issues with it:
Limited formatting support, in particular the following features are not supported:
Leading zeros (or any other non-space padding)
Octal/hexadecimal encoding
Runtime width/alignment specification
The library is quite big for a relatively small task of formatting and has even bigger dependency (STLSoft).
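For comparison, this is how the formatting features listed above look with plain standard iostreams, which is what you would fall back on if you needed them. The helper names zero_padded, as_hex and as_octal are hypothetical.

```cpp
#include <cassert>
#include <iomanip>
#include <sstream>
#include <string>

// Pads a non-negative value with leading zeros up to `width` characters.
// The width is an ordinary runtime value.
std::string zero_padded(unsigned value, int width)
{
    assert(width >= 0);
    std::ostringstream out;
    out << std::setw(width) << std::setfill('0') << value;
    return out.str();
}

// Formats the value in lowercase hexadecimal, without a prefix.
std::string as_hex(unsigned value)
{
    std::ostringstream out;
    out << std::hex << value;
    return out.str();
}

// Formats the value in octal, without a prefix.
std::string as_octal(unsigned value)
{
    std::ostringstream out;
    out << std::oct << value;
    return out.str();
}
```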
It looks pretty interesting indeed! Good tip regardless, and +1 for that!
I've been playing with it for a bit. The main drawback I see is that FastFormat supports fewer formatting options for the output. This is, I think, a direct consequence of the way the higher type safety is achieved, and a good tradeoff depending on your circumstances.
If you look in detail at his performance benchmark page, you'll notice that good old C printf-family functions are still winning on Linux. In fact, the only test case where they perform poorly is the test case that should be static string concatenations, where I would expect printf to be wasteful. Moreover, GCC provides static type-checking on printf-style function calls, so the benefit of type-safety is reduced. So: if you are running on Linux and if you need the absolute best performance, FastFormat is probably not the optimal solution.
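The static checking mentioned above also extends to your own printf-style wrappers through GCC's format function attribute (Clang accepts it as well). A minimal sketch, in which format_string and the PRINTF_LIKE macro are hypothetical names and the 256-byte buffer is an arbitrary choice:

```cpp
#include <cassert>
#include <cstdarg>
#include <cstddef>
#include <cstdio>
#include <string>

// Expands to the GCC/Clang attribute where available, and to nothing elsewhere.
#if defined(__GNUC__)
#  define PRINTF_LIKE(fmt_index, first_arg) \
       __attribute__((format(printf, fmt_index, first_arg)))
#else
#  define PRINTF_LIKE(fmt_index, first_arg)
#endif

// Argument 1 is the format string and the variadic arguments start at 2,
// so with -Wformat (part of -Wall) the compiler checks them against it.
std::string format_string(const char* fmt, ...) PRINTF_LIKE(1, 2);

std::string format_string(const char* fmt, ...)
{
    char buf[256];
    va_list args;
    va_start(args, fmt);
    const int n = std::vsnprintf(buf, sizeof buf, fmt, args);
    va_end(args);
    if (n < 0) {
        return std::string();  // encoding error reported by vsnprintf
    }
    // vsnprintf returns the untruncated length; clamp to what fits in buf.
    const std::size_t len = static_cast<std::size_t>(n) < sizeof buf
                                ? static_cast<std::size_t>(n)
                                : sizeof buf - 1;
    return std::string(buf, len);
}
```

With the attribute in place, a mismatched call such as format_string("%d", "text") draws a compile-time warning instead of misbehaving at runtime.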
The library depends on a couple of environment variables, as mentioned in the docs.
That might be no biggie to some people, but I'd prefer my code to be as self-contained as possible. If I check it out from source control, it should work and compile. It won't, if it requires you to set environment variables.