Include static data/text file

I have a text file (>50k lines) of ascii numbers, with string identifiers, that can be thought of as a collection of data vectors. Based on user input, the application only needs one of these data vectors at runtime.
As far as I can see, I have 3 options for getting the information from this text file:
1. Keep it as a text file and extract the required vector at run-time. I believe the downside is that you can't have a relative path in the code, so the user would have to point to the file's correct location (?). Alternatively, get the configure script to inject the absolute path as a macro.
2. Convert it to a static unsigned char array using xxd (as explained here) and then include the resulting file. The downside is that a 5MB file turns into a 25MB include file. Am I correct in thinking that this 25MB is loaded into memory for the duration of the runtime?
3. Convert it to an object and link it using objcopy, as explained here. This seems to keep the file size about the same. Are there other trade-offs?
Is there a standard/recommended method for doing this? I can use C or C++ if that makes a difference.
Thanks.
(Running on linux with gcc)

I would go with number 1 and pass the file path into the program as an argument. There's nothing wrong with doing that, and it is simple and straightforward.
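A minimal sketch of option 1, under an assumption the question does not confirm: each line of the file holds one vector as an identifier followed by whitespace-separated numbers. The function names `find_vector` and `load_vector` are hypothetical.

```cpp
#include <fstream>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Assumed layout (not stated in the question): one vector per line,
// "<identifier> <n1> <n2> ...". Scans the stream and fills `out` with the
// numbers of the first line whose identifier equals `id`.
bool find_vector(std::istream& in, const std::string& id, std::vector<double>& out) {
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        std::string name;
        if (!(fields >> name) || name != id) continue;  // not the requested vector
        out.clear();
        double value;
        while (fields >> value) out.push_back(value);
        return true;
    }
    return false;  // identifier not present
}

// Convenience wrapper: open the file whose path the user supplied (e.g. argv[1]).
bool load_vector(const std::string& path, const std::string& id, std::vector<double>& out) {
    std::ifstream file(path);
    if (!file) return false;
    return find_vector(file, id, out);
}
```

Only the requested vector is kept in memory; the rest of the file is read line by line and discarded.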

You should have a look at the answers here:
Directory of running program
The top-voted answer gives you a clue about how to handle your data file. But instead of the home folder, I would suggest saving it under /usr/share, as explained in the link.

I'd prefer to use zlib (both ways are possible: a side file, or including the compressed data).

Related

How to convert a string into .exe and then execute using system()?

I have a small myTest.exe file. I opened this in a text editor and copied the text.
std::string exeBinaryCode = "Copied text from exe";
Now I want it to execute when I pass this string to system(exeBinaryCode), giving the same result that myTest.exe gives.
If anyone knows how to achieve this, please post the answer.

To begin with, executable files are binary files. You can't open them in text editors, or copy/paste them as text, or store them in a string variable.
(That last part isn't 100% true, since std::string basically just stores a string of bytes that don't necessarily have to be text, but you really shouldn't use it as such.)
There are a few different ways to achieve similar results; which one you choose depends on what you're actually trying to accomplish.
Notice that none of these involve directly running the binary data. Though there may be some obscure system call that allows you to do that, you'll likely end up with loads of trouble (anti-virus, incompatibility across platforms, etc.).
Refer to the external executable by path
Simplest, just pass the path to the executable to system. If you intend to distribute your application you'd just package the external executable as well (so if you have your own code compiled into bin/myapp.exe in a zip-file you'd also have bin/whatineedtocall.exe in the same zip).
Unless you have very specific requirements this is what I'd recommend.
Use your build system to embed the data and write it to the file system
Some build systems and frameworks (for example CMake, see Embed resources (eg, shader code; images) into executable/library with CMake) have the ability to embed binary data such as executables into code. You can then, in your code, write this binary data to the file system when it is needed (preferably into some temporary location) and run it from there using system.
Embed as hexadecimal data and write to file system
Similar to the previous option, but you insert the contents into your code manually. Note that you'd need to copy the executable binary not from a text editor but in its hexadecimal representation (see the previously linked question for examples; you'd want to end up with pretty much the same file).
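As a rough sketch of the "embed and write to the file system" idea: the array below is a hypothetical stand-in for what a tool such as `xxd -i` would generate from the real executable (here it only holds the bytes of "hello\n"), and `write_blob` is an illustrative helper name.

```cpp
#include <cstddef>
#include <fstream>
#include <string>

// Hypothetical stand-in for a generated array; a real one would contain
// every byte of the executable.
static const unsigned char embedded_blob[] = {0x68, 0x65, 0x6c, 0x6c, 0x6f, 0x0a};
static const std::size_t embedded_blob_len = sizeof(embedded_blob);

// Write the embedded bytes to `path` in binary mode; returns true on success.
bool write_blob(const std::string& path, const unsigned char* data, std::size_t len) {
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    if (!out) return false;
    out.write(reinterpret_cast<const char*>(data), static_cast<std::streamsize>(len));
    out.flush();
    return out.good();
}
```

After writing the file, the program would pass its path to system. On POSIX systems the file also needs execute permission before it can be run.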

How to distinguish between movie and image

Is there any "good" way to distinguish between a movie file and an image file?
I would like to know what exactly my "std::wstring filePath" is: a movie or an image.
That way, I could proceed with strong assurance that I am working with a known file type.
In other words, I have two classes MyImage and MyMovie both need path to file in their constructors. I would like to verify path to file somehow before creating one of those classes.
bool isMovie(const std::wstring & filePath);
bool isImage(const std::wstring & filePath);
Of course I thought about file extensions, but I'm not sure that is a good solution that isn't prone to errors. So is it good to use the file extension, or is there another feasible solution?
Thanks in advance

You can use libmagic to detect what kind of file it is. You pass the file path in and it'll give you a textual description or MIME type for the file.

Files usually start with special so-called magic bytes. If you have control over the format specification, I would use this. If you open zip, gif, or other binary files, you can usually find some distinctive strings there.
There is a Unix utility called file that provides such functionality, so some sort of standard probably exists.
SQLite 3 provides a nice example. Look at 1.2.1 and 1.2.5. The header identifies the file as a SQLite 3 DB and also carries an application id, so other tools can recognize which application's DB it is.
I personally used the first few bytes of a file to encode type and version info for my files when I was playing with binary formats.
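To illustrate the magic-bytes idea, here is a small sketch that checks a handful of well-known signatures. The names `FileKind`, `classify_header`, and `classify_file` are made up for this example, the list of formats is far from complete, and it takes a narrow std::string path rather than the std::wstring from the question. A library such as libmagic covers many more cases.

```cpp
#include <cstddef>
#include <cstring>
#include <fstream>
#include <string>

enum class FileKind { Unknown, Image, Movie };

// True if the `len` bytes at `offset` in the buffer equal `sig`.
static bool has_sig(const unsigned char* buf, std::size_t n, std::size_t offset,
                    const void* sig, std::size_t len) {
    return n >= offset + len && std::memcmp(buf + offset, sig, len) == 0;
}

// Classify the first bytes of a file by a few common signatures.
FileKind classify_header(const unsigned char* buf, std::size_t n) {
    static const unsigned char png[]  = {0x89, 'P', 'N', 'G', '\r', '\n', 0x1a, '\n'};
    static const unsigned char jpeg[] = {0xff, 0xd8, 0xff};
    static const unsigned char ebml[] = {0x1a, 0x45, 0xdf, 0xa3};  // Matroska / WebM

    if (has_sig(buf, n, 0, png, sizeof png)) return FileKind::Image;
    if (has_sig(buf, n, 0, jpeg, sizeof jpeg)) return FileKind::Image;
    if (has_sig(buf, n, 0, "GIF8", 4)) return FileKind::Image;
    if (has_sig(buf, n, 0, "RIFF", 4)) {
        if (has_sig(buf, n, 8, "WEBP", 4)) return FileKind::Image;
        if (has_sig(buf, n, 8, "AVI ", 4)) return FileKind::Movie;
    }
    // Simplification: "ftyp" marks the MP4/MOV family, but some image
    // formats (e.g. HEIF) use the same container.
    if (has_sig(buf, n, 4, "ftyp", 4)) return FileKind::Movie;
    if (has_sig(buf, n, 0, ebml, sizeof ebml)) return FileKind::Movie;
    return FileKind::Unknown;
}

// Read the first bytes of the file and classify them; an unreadable file
// yields FileKind::Unknown.
FileKind classify_file(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    unsigned char head[16] = {};
    in.read(reinterpret_cast<char*>(head), sizeof head);
    return classify_header(head, static_cast<std::size_t>(in.gcount()));
}
```

Functions like isMovie and isImage from the question could then be thin wrappers that compare the result against FileKind::Movie or FileKind::Image.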

Is it a good idea to include a large text variable in compiled code?

I am writing a program that produces a formatted file for the user, but it's not only producing the formatted file, it does more.
I want to distribute a single binary to the end user and when the user runs the program, it will generate the xml file for the user with appropriate data.
In order to achieve this, I want to give the file contents to a char array variable that is compiled in code. When the user runs the program, I will write out the char file to generate an xml file for the user.
char* buffers = "a xml format file contents, \
this represent many block text \
from a file,...";
I have two questions.
Q1. Do you have any other ideas for how to compile my file contents into binary, i.e, distribute as one binary file.
Q2. Is this even a good idea as I described above?

What you describe is by far the norm for C/C++. For large amounts of text data, or for arbitrary binary data (or indeed any data you can store in a file, e.g. a zip file), you can write the data to a file and link it into your program directly.
An example may be found on sites like this one

I'd recommend keeping the data in a separate file rather than putting it into the binary, unless you have your own reasons. I don't know of other portable ways to put strings into a binary file, but your solution seems OK.
However, note that when you use \ at the end of a line to form a multi-line string, you have to watch the indentation, because the string continues from the beginning of the next line:
char* buffers = "a xml format file contents, \
this represent many block text \
from a file,...";
Or you can use another form:
const char *buffers =
"a xml format file contents, "
"this represent many block text "
"from a file,...";
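For C++11 and later (not C), a raw string literal is a further option that this answer does not mention. The sketch below shows both styles side by side. The XML content and the variable names are placeholders.

```cpp
#include <string>

// Adjacent literals are joined by the compiler with nothing in between,
// so any space or newline has to be written explicitly.
const char* const joined = "line one\n"
                           "line two\n";

// C++11 raw string literal: line breaks and quotes are kept verbatim and no
// escapes are needed. "xml" is just a delimiter chosen for this example.
const char* const xml_template = R"xml(<?xml version="1.0"?>
<config>
  <name>example</name>
</config>
)xml";
```

With the raw literal, the indentation inside the literal becomes part of the string, which suits XML since the file can be pasted in unchanged.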

My answer probably contains more than the original poster needs, but here is what I'm aware of:
Embedding in source code: the plain C/C++ solution. It is a bad idea because each time you want to change your content, you will need to:
- recompile
- relink
It is acceptable only if your content changes very rarely or never, or if build time is not an issue (if your app is small).
Embedding in the binary: a few somewhat more flexible solutions for embedding content in executables exist, but none of them is cross-platform (you have not stated your target platform):
- Windows: resource files. With most IDEs this is very simple.
- Linux: objcopy.
- macOS: application bundles. Even simpler than on Windows.
You will not need to recompile the C++ file(s), only relink.
Application virtualization: there are special utilities that wrap all your application resources into a single executable, which then runs much as it would on a virtual machine.
I'm only aware of such utilities for Windows (ThinApp, BoxedApp), but there are probably similar tools for other OSes too, or even cross-platform ones.
Consider distributing your application in some form of installer: when the installer starts, it creates all the resources and unpacks the executable. This is similar to having the main executable generate everything. It can be a large and complex package or just a simple self-extracting archive.
Of course, the choice depends on what kind of application you are creating, who your target audience is, how you will ship the package to end users, etc. A game aimed at children is not the same as a Unix console utility for C++ coders =)

It depends. If you are writing some small Unix-style utility with no prospect of internationalization, then it's probably fine. You don't want to bloat a distribution with a file no one would ever touch anyway.
But in general it is bad practice, because eventually someone might want to modify this data, and he or she would have to rebuild the whole thing just to fix a typo.
The decision is really up to you.
If you just want to keep your distribution in one piece, you might also find this thread interesting: Store data in executable

Why don't you distribute your application with an additional configuration file? E.g. package your application executable and config file together.
If you do want to make it a single file, try embedding your config file into the executable as a resource.

I see it more of an OS than C/C++ issue. You can add the text to the resource part of your binary/program. In Windows programs HTML, graphics and even movie files are often compiled into resources that make part of the final binary.
That is handy for possible future translation into another language, plus you can modify resource part of the binary without recompiling the code.

Hiding application resources

I'm making a simple game with SFML 1.6 in C++. Of course, I have a lot of picture, level, and data files. Problem is, I don't want these files visible. Right now they're just plain picture files in a res/ subdirectory, and I want to either conceal them or encrypt them. Is it possible to put the raw data from the files into a resource file or something? Any solution is okay to me, I just don't want the files exposed to the user.
EDIT
Cross platform solutions best, but if they don't exist, that's okay, I'm working on windows. But I don't really want to use a library if it's not needed.

Most environments come with a resource compiler that converts images/icons/etc into string data and includes them in the source.
Another common technique is to copy them onto the end of the final .exe as the last part of the build process. Then, at run time, open the .exe as a file and read the data from some determined offset. See Embedding a filesystem in an executable?
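The append-and-read-back technique could look roughly like this. The trailer layout (payload followed by an 8-byte little-endian length) and the function names are assumptions made for this sketch, not something the answer specifies.

```cpp
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <string>

// Build step: append `payload` to the file at `path`, followed by an
// 8-byte little-endian length so the reader can locate it.
bool append_payload(const std::string& path, const std::string& payload) {
    std::ofstream out(path, std::ios::binary | std::ios::app);
    if (!out) return false;
    out.write(payload.data(), static_cast<std::streamsize>(payload.size()));
    const std::uint64_t len = payload.size();
    for (int i = 0; i < 8; ++i) out.put(static_cast<char>((len >> (8 * i)) & 0xff));
    out.flush();
    return out.good();
}

// Run time: the last 8 bytes give the payload length; the payload sits
// immediately before them.
bool read_payload(const std::string& path, std::string& payload) {
    std::ifstream in(path, std::ios::binary | std::ios::ate);
    if (!in) return false;
    const std::streamoff total = in.tellg();
    if (total < 8) return false;
    in.seekg(-8, std::ios::end);
    unsigned char trailer[8];
    if (!in.read(reinterpret_cast<char*>(trailer), 8)) return false;
    std::uint64_t len = 0;
    for (int i = 7; i >= 0; --i) len = (len << 8) | trailer[i];
    if (len > static_cast<std::uint64_t>(total - 8)) return false;  // no valid payload
    payload.resize(static_cast<std::size_t>(len));
    if (len == 0) return true;
    in.seekg(-static_cast<std::streamoff>(len) - 8, std::ios::end);
    return static_cast<bool>(in.read(&payload[0], static_cast<std::streamsize>(len)));
}
```

At run time the program would call read_payload on its own executable path. This keeps the files out of plain sight but does not encrypt them, so a determined user can still extract the data.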

The ideal way to do this is to create your own archive format, which would contain all of your files' data along with the extra information needed to tell the files apart within it.

How to enable a shared object accessing a data file in runtime (UNIX)

I have a class method (implemented in a shared object in UNIX environment) which needs to access a text data file in runtime (using ifstream). Currently the method assumes that the data file is available for opening without any relative path, i.e something like
ifstream dataFile("data.txt");
The shared object is loaded from Python code, and to make it available for loading, it is copied to the /usr/lib/ folder as a post-build step of the makefile. My question is how to make the text data file available to the shared object. I have considered the following possibilities:
1. Use some relative path, but that method is not totally foolproof (the project is hosted on various instances, and I cannot be sure the directory tree will stay the same, e.g. a month from now).
2. Copy the data file to /usr/lib as well, but I feel this is the wrong approach.
Any suggestions are welcome.

The proper way to go about this is to make the location of the text file a configurable value that will be set when your project is installed. Using a configuration file in /etc/ is a common way to store that value.
That way you can put the text file in e.g. /usr/share/ with all the machine-independent files (that data file is machine-independent, right?) and your code would "know" where to find it.
Note that if the data file is going to be modified as part of your code's operation, then it should probably be placed somewhere under /var (/var/lib or perhaps /var/cache) according to the Filesystem Hierarchy Standard (FHS) and most other Unix filesystem standards.
If the data file could be considered a configuration file, as you mentioned in one of your comments, you could just hard-code its path to somewhere under /etc/ (e.g. /etc/MyProject/data.cfg) and go on.
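One way the "configurable location" idea might look in code is sketched below. The key name `data_file`, the config file path, and the /usr/share default are all placeholders chosen for this example, not locations given in the answer.

```cpp
#include <fstream>
#include <istream>
#include <sstream>
#include <string>

// Look for a line of the form "data_file=/some/path" in a config stream and
// return the path; return `fallback` if no such line exists.
// The key name "data_file" is a hypothetical choice for this sketch.
std::string data_path_from_config(std::istream& cfg, const std::string& fallback) {
    const std::string key = "data_file=";
    std::string line;
    while (std::getline(cfg, line)) {
        if (line.size() > key.size() && line.compare(0, key.size(), key) == 0)
            return line.substr(key.size());
    }
    return fallback;
}

// Read the installed config file; if it is missing or has no entry, fall
// back to a default under /usr/share. Both paths are placeholders.
std::string resolve_data_path() {
    std::ifstream cfg("/etc/MyProject/myproject.conf");
    return data_path_from_config(cfg, "/usr/share/MyProject/data.txt");
}
```

The shared object would then open `resolve_data_path()` instead of the bare "data.txt", so the result no longer depends on the working directory of the Python process that loads it.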

I can think of two solutions:
1. When you load your shared object, you somehow give it the path to your file.
2. Instead of copying the file to /usr/lib, you could create a symbolic link to it in /usr/lib, but that is not the best thing to do, imho.
The first solution is the best one for me.
