how to encrypt data files in C++?

how to encrypt data files in C++? - c++

now I am writing a app in C++, and currently my app reads models or parameters from several data files. Those files, i.e. self-define dictionary, are currently stored in plain text and to be loaded dynamically by C++ while runtime.
Yet, I don't want those files to be easily seen by my client while they get the released application, so I need to encrypt the file first. What's the general practice for this situation?
And those file are huge in size, so compile to a resource file is not a good option.
Actually I just need a simple 'encryption', at least not plain text stored in released version. And I dont want the encryption libraries which will load the whole file into the memory first in order to perform decryption, since the files are huge and no need to load its whole body into memory at one time.
Thanks!

Usually when you want to deal with encryption in C++ people tend to go for Open SSL libraries which encompass all of the functionality in a pretty standard way.
You'd have to get yourself a copy of the library and some code samples, but it's a pretty common thing and there's lots of documentation around.

Related

Local/dedicated sql database for a program

I have a program in C++ under development, which handles a lot of different types of objects which need to be retrieved from files on start-up. Total file size is a little less than 1GB. The program must run at least on Windows 7 and Linux.
My goal is to provide the functionality for reading from files, and storing them into objects (including data validation), so the solutions I can think of are:
simply provide at least one function per each object type to read from files, and copy the data to newly created objects. This is what I've been doing so far.
Use a database to store data and retrieve it using SQL.
I would like to switch to the second option, however I have a few limitations:
I am not allowed to connect to the internet with my program, so a remote server database is out of the question.
I am not allowed to make a program that forces the user to install additional software, in this case, a local database.
Access to the local data can be done only by running the program: no additional executables that the user can run to modify or read the files.
I have searched the internet for a c++ library that allows a program to store/retrieve specific data to local files, using SQL syntax, but so far I've got nothing. Database software I know, like MySQL, Postgresql etc. won't do because of the restrictions I mentioned.
My main question is: is there such library for c++, or do I have to implement my own?
I would prefer it were free, but I can do with anything I can find.
If there are no libraries like this, does anyone know any other alternatives for storing data locally for a c++ program? And if I were to implement my own, what way (as in file data structures) is there to store my objects (without it being extremely slow for less than 1GB of data)?

SQLite fits all your requirements except the last one, that prohibit users from modifying the data outside your program. Unfortunately, this one would be tough to overcome: a sufficiently proficient and determined user will always be able to break this restriction, even if you store your data in plain files of custom format.
Everything else fits perfectly: the database lives in a file, does not need a network connection, and can be installed as part of your program (it's a library that you link to your code).

Override c library file functions?

I am working on a game, and one of the requirements per the licence agreement of the sound assets I am using is that they be distributed in a way that makes them inaccessible to the end user. So, I am thinking about aggregating them into a flat file, encrypting them, or some such. The problem is that the sound library I am using (Hekkus Sound System) only accepts a 'char*' file path and handles file reading internally. So, if I am to continue to use it, I will have to override the c stdio file functions to handle encryption or whatever I decide to do. This seems doable, but it worries me. Looking on the web I am seeing people running into strange frustrating problems doing this on platforms I am concerned with(Win32, Android and iOS).
Does there happen to be a cross-platform library out there that takes care of this? Is there a better approach entirely you would recommend?

Do you have the option of using a named pipe instead of an ordinary file? If so, you can present the pipe to the sound library as the file to read from, and you can decrypt your data and write it to the pipe, no problem. (See Beej's Guide for an explanation of named pipes.)

Override stdio in a way that a lib you not knowing how it works exactly works in a way the developer hasn't in mind do not look like the right approach for me, as it isn't really easy. Implement a ramdrive needs so much effort that I recommend to search for another audio lib.
The Hekkus Sound System I found was build by a single person and last updated 2012. I wouldn't rely on a lib with only one person working on it without sharing the sources.
My advice, invest your time in searching for a proper sound lib instead of searching for a fishy work around for this one.

One possibility is to use a encrypted loopback filesystem (google for additional resources).
The way this works is that you put your assets on a encrypted filesystem, which actually lives in a simple file. This filesystem gets mounted someplace as a loopback device. Password needs to be supplied at attach / mount time. Once mounted, all files are available as regular files to your software. But otherwise, the files are encrypted and inaccessible.

It's compiler-dependent and not a guaranteed feature, but many allow you to embed files/resources directly into the exe and read them in your code as if from disk. You could embed your sound files that way. It will significantly increase the size of your exe however.

Another UNIX-based approach:
The environment variable LD_PRELOAD can be used to override any shared library an executable has been linked against. All symbols exported by a library mentioned in LD_PRELOAD are resolved to that library, including calls to libc functions like open, read, and close. Using the libdl, it is also possible for the wrapping library to call through to the original implementation.
So, all you need to do is to start the process which uses the Hekkus Sound System in an environment that has LD_PRELOAD set appropriately, and you can do anything you like to the file that it reads.
Note, however, that there is absolutely no way that you can keep the data inaccessible from the user: the very fact that he has to be able to hear it means he has to have access. Even if all software in the chain would use encryption, and your user is not willing to hack hardware, it would not be exactly difficult to connect the audio output jack with an audio input jack, would it? And you can't forbid you user to use earphones, can you? And, of course, the kernel can see all audio output unencrypted and can send a copy somewhere else...

The solution to your problem would be a ramdisk.
http://en.wikipedia.org/wiki/RAM_drive
Using a piece of memory in ram as if it was a disk.
There is software available for this too. Caching databases in ram is becoming popular.
And it keeps the file from being on the disk that would make it easy accessible to the user.

multiple access to file in r/w

I'm planning to write a programm which has to access to a certain file many times in r/w.
So I decided to use fstream, since I can use this class for both reading and writing purpose.
My idea is to open the file at the startup of the application and then close it as the application is closed too.
Since the file can be arbitrarily big, I was planning to use a "paging" structure, in which:
1) preallocate a fixed amount of memory for each page and a fixed number of page
2) load part of the file in to the first free page
3) if there is no free page, I select one non empty with a certain criterion, I commit all edit in it (if there are any) and then load the part of file in the page.
That's not so hard to code. But I was wondering If I'm going to reinvent the wheel... maybe the fstream itself is written in a smart way so that it also implements a similar paging mechanism. In that case, I would not take care about, just write and read at any time.
Some suggestion?

Don't do this by yourself. Unless you are using very exotic implementation, the fstream class already implement such a mechanism efficiently.
Checkout http://www.cplusplus.com/doc/tutorial/files/ "Buffers and Synchronization"
There are possible issues if you are seek-ing into file larger than 2GB with a old kernel or implementation of the standard library. Check this
Large file support in C++
or use Boost.Filesystem

Internal working of the standard C++ library vary by implementation. Hence a test would be needed to get some real data on your preferred platform. Generally memory mapped files are considered to be the fastest way to access data stored in a file (as Uflex has mentioned in his comment, but it has some drawbacks as well (see the linked wiki page). You can either use the standard (POSIX) C functions mmap() and munmap(), or the Boost C++ libraries which also have a portable C++ interface for memory mapped files.

Virtual Files for dynamic linking

my problem is pretty complicated and potentially impossible but here we go:
Using C++,
I'm currently working on an universal server engine for a game project of mine. Universal, because every part of the engine will be loaded dynamically after startup. Now, also game objects will inherit from a base object and have overloaded "Simulate" functions. In that way, every object would have it's specific behavior and I can do something I call "C++ Scripting" which is alot faster than interpreted lua script files. Also it's more dynamic.
(Please no solutions which would kill the c++ "scripting" part, like "forget the dynamic linking, that's insane". This performance boost is totally necessary, since I'm working with large voxel maps)
My Problem:
That are indeed alot of .dll/.so files and I wanted to pack those into a simple archive so I can use zlib on said source code and maybe pack everything together with textures and sounds in little "object packages".
Now the Windows DLL API and the Linux SO API won't allow me to load a dll/so file from a memory address, which is a shame.(Am I right there, or can I bypass that? :) ) I don't want to unzip and temp save those files on the filesystem because there are hundreds to thousands of them and that would increase the loading time alot.
Also I'm not interested in more external dependencies like boost.
So here are my Questions:
Is there a cross platform-method to create virtual files IN memory with a real path?
That way I could bypass the slow IO speeds of HDDs.
Or is it really not such a big deal to use temp files, because the file buffers of modern operating systems are fast/intelligent enough to NOT write all those files to disc?
(Actually Linux supports virtual file systems, but windows does not...)
I hope you guys can help me there :)

Not with winapi, that's for sure, but you can do it manually. You can load it into the memory, fill it's import table and call exported functions (after you called DllMain). I saw a program, where someone actually created a new process with that method ... See the PE documentation for details, but it works.
Also it's relatively easy to do, since you only need to find the PE import tables, and do what the dynamic linker does, fill it with jumps and addresses. Dlls contains position independent code, so no relocation needed.
It sould be the same on linux (only using the elf structure), but if you have a better solution with virtual file systems, you should use that.

How to create a virtual file?

I'd like to simulate a file without writing it on disk. I have a file at the end of my executable and I would like to give its path to a dll. Of course since it doesn't have a real path, I have to fake it.
I first tried using named pipes under Windows to do it. That would allow for a path like \\.\pipe\mymemoryfile but I can't make it works, and I'm not sure the dll would support a path like this.
Second, I found CreateFileMapping and GetMappedFileName. Can they be used to simulate a file in a fragment of another ? I'm not sure this is what this API does.
What I'm trying to do seems similar to boxedapp. Any ideas about how they do it ? I suppose it's something like API interception (Like Detour ), but that would be a lot of work. Is there another way to do it ?
Why ? I'm interested in this specific solution because I'd like to hide the data and for the benefit of distributing only one file but also for geeky reasons of making it works that way ;)
I agree that copying data to a temporary file would work and be a much easier solution.

Use BoxedApp and do not worry.

You can store the data in an NTFS stream. That way you can get a real path pointing to your data that you can give to your dll in the form of
x:\myfile.exe:mystreamname
This works precisely like a normal file, however it only works if the file system used is NTFS. This is standard under Windows nowadays, but is of course not an option if you want to support older systems or would like to be able to run this from a usb-stick or similar. Note that any streams present in a file will be lost if the file is sent as an attachment in mail or simply copied from a NTFS partition to a FAT32 partition.
I'd say that the most compatible way would be to write your data to an actual file, but you can of course do it one way on NTFS systems and another on FAT systems. I do recommend against it because of the added complexity. The appropriate way would be to distribute your files separately of course, but since you've indicated that you don't want this, you should in that case write it to a temporary file and give the dll the path to that file. Make sure you write the temporary file to the users' temp directory (you can find the path using GetTempPath in C/C++).
Your other option would be to write a filesystem filter driver, but that is a road that I strongly advise against. That sort of defeats the purpose of using a single file as well...
Also, in case you want only a single file for distribution, how about using a zip file or an installer?

Pipes are for communication between processes running concurrently. They don't store data for later access, and they don't have the same semantics as files (you can't seek or rewind a pipe, for instance).
If you're after file-like behaviour, your best bet will always be to use a file. Under Windows, you can pass FILE_ATTRIBUTE_TEMPORARY to CreateFile as a hint to the system to avoid flushing data to disk if there's sufficient memory.
If you're worried about the performance hit of writing to disk, the above should be sufficient to avoid the performance impact in most cases. (If the system is low enough on memory to force the file data out to disk, it's probably also swapping heavily anyway -- you've already got a performance problem.)
If you're trying to avoid writing to disk for some other reason, can you explain why? In general, it's quite hard to stop data from ever hitting the disk -- the user can always hibernate the machine, for instance.

Since you don't have control over the DLL you have to assume that the DLL expects an actual file. It probably at some point makes that assumption which is why named pipes are failing on you.
The simplest solution is to create a temporary file in the temp directory, write the data from your EXE to the temp file and then delete the temporary file.
Is there a reason you are embedding this "pseudo-file" at the end of your EXE instead of just distributing it with our application? You are obviously already distributing this third party DLL with your application so one more file doesn't seem like it is going to hurt you?
Another question, will this data be changing? That is are you expecting to write back data this "pseudo-file" in your EXE? I don't think that will work well. Standard users may not have write access to the EXE and that would probably drive anti-virus nuts.
And no CreateFileMapping and GetMappedFileName definitely won't work since they don't give you a file name that can be passed to CreateFile. If you could somehow get this DLL to accept a HANDLE then that would work.
And I wouldn't even bother with API interception. Just hand the DLL a path to an acutal file.

Reading your question made me think: if you can pretend an area of memory is a file and have kind of "virtual path" to it, then this would allow loading a DLL directly from memory which is what LoadLibrary forbids by design by asking for a path name. And this is why people write their own PE loader when they want to achieve that.
I would say you can't achieve what you want with file mapping: the purpose of file mapping is to treat a portion of a file as if it was physical memory, and you're wanting the reciprocal.
Using Detours implies that you would have to replicate everything the intercepted DLL function does except from obtaining data from a real file; hence it's not generic. Or, even more intricate, let's pretend the DLL uses fopen; then you provide your own fopen that detects a special pattern in the path and you mimmic the C runtime internals... Hmm is it really worth all the pain? :D

Please explain why you can't extract the data from your EXE and write it to a temporary file. Many applications do this -- it's the classic solution to this problem.
If you really must provide a "virtual file", the cleanest solution is probably a filesystem filter driver. "clean" doesn't mean "good" -- a filter is a fully documented and supported solution, so it's cleaner than API hooking, injection, etc. However, filesystem filters are not easy.
OSR Online is the best place to find Windows filesystem information. The NTFSD mailing list is where filesystem developers hang out.

How about using a some sort of RamDisk and writing the file to this disk? I have tried some ramdisks myself, though never found a good one, tell me if you are successful.

Well, if you need to have the virtual file allocated in your exe, you will need to create a vector, stream or char array big enough to hold all of the virtual data you want to write.
that is the only solution I can think of without doing any I/O to disk (even if you don't write to file).
If you need to keep a file like path syntax, just write a class that mimics that behaviour and instead of writing to a file write to your memory buffer. It's as simple as it gets. Remember KISS.
Cheers

Open the file called "NUL:" for writing. It's writable, but the data are silently discarded. Kinda like /dev/null of *nix fame.
You cannot memory-map it though. Memory-mapping implies read/write access, and NUL is write-only.

I'm guessing that this dll cant take a stream? Its almost to simple to ask BUT if it can you could just use that.

Have you tried using the \?\ prefix when using named pipes? Many APIs support using \?\ to pass the remainder of the path directly through without any parsing/modification.
http://msdn.microsoft.com/en-us/library/aa365247(VS.85,lightweight).aspx

Why not just add it as a resource - http://msdn.microsoft.com/en-us/library/7k989cfy(VS.80).aspx - the same way you would add an icon.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js