How to enable a shared object accessing a data file in runtime (UNIX) - c++

I have a class method (implemented in a shared object in a UNIX environment) that needs to access a text data file at runtime (using ifstream). Currently the method assumes that the data file can be opened without any path prefix, i.e. something like
ifstream dataFile("data.txt");
The shared object is loaded from Python code, and to make it available for loading, it is copied to the /usr/lib/ folder as a post-build step of the makefile. My question is how to make the text data file available to the shared object. I have considered the following possibilities:
Use some relative path, but that method is not foolproof (the project is hosted on various instances, and I cannot be sure the directory tree will stay the same, e.g. a month from now).
Copy the data file to /usr/lib as well, but this feels like the wrong approach.
Any suggestions are welcomed.

The proper way to go about this is to make the location of the text file a configurable value that will be set when your project is installed. Using a configuration file in /etc/ is a common way to store that value.
That way you can put the text file in e.g. /usr/share/ with all the machine-independent files (that data file is machine-independent, right?) and your code would "know" where to find it.
Note that if the data file is going to be modified as part of your code's operation, then it should probably be placed somewhere under /var (/var/lib or perhaps /var/cache) according to the Filesystem Hierarchy Standard (FHS) and most other Unix filesystem standards.
If the data file could be considered a configuration file, as you mentioned in one of your comments, you could just hard-code its path to somewhere under /etc/ (e.g. /etc/MyProject/data.cfg) and go on.
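A minimal sketch of the configurable-location idea, assuming a one-line configuration file whose first non-empty line is the data file path. The names /etc/MyProject/data.conf and /usr/share/MyProject/data.txt and the function resolveDataPath are hypothetical, chosen only for illustration.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: returns the data file path stored in a config file.
// The default config location and the fallback path are illustrative only.
std::string resolveDataPath(const std::string& configFile = "/etc/MyProject/data.conf")
{
    // The first non-empty line of the config file is taken as the path.
    std::ifstream cfg(configFile);
    std::string line;
    while (std::getline(cfg, line)) {
        if (!line.empty()) {
            return line;
        }
    }
    // Config file missing or empty: fall back to a conventional location.
    return "/usr/share/MyProject/data.txt";
}
```

The method in the shared object could then open std::ifstream dataFile(resolveDataPath()) instead of relying on the current working directory.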

I can think of two solutions:
When you load your shared object, you somehow give it the path to your file.
Instead of copying the file to /usr/lib, you could create a symbolic link to it in /usr/lib, but that is not the best thing to do, IMHO.
The first solution is the best one for me.
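One way the first solution might look, as a sketch: the shared object exports a C-linkage setter that the loader calls before using the class. The function names set_data_path and data_file_available are hypothetical and not part of the original code.

```cpp
#include <fstream>
#include <string>

namespace {
// Path used by the library; the default matches the original hard-coded name.
std::string g_dataPath = "data.txt";
}

// Hypothetical exported setter: the loader calls this once after loading.
extern "C" void set_data_path(const char* path)
{
    if (path != nullptr && *path != '\0') {
        g_dataPath = path;
    }
}

// Hypothetical check: returns 1 if the configured file can be opened, else 0.
extern "C" int data_file_available()
{
    std::ifstream dataFile(g_dataPath);
    return dataFile.is_open() ? 1 : 0;
}
```

From the Python side, loading the library with ctypes.CDLL and calling set_data_path with a bytes object holding the path should be enough to hand the location over, assuming the library is loaded through ctypes.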

Related

Methods for opening a specific file inside the project WITHOUT knowing what the working directory will be

I've had trouble with this issue across many languages, most recently with C++.
The Issue Exemplified
Let's say we're working with C++ and have the following file structure for a project:
("Project" main folder with three [modules, data, etc] subfolders)
Now say:
Our maincode.cpp is in the Project folder
moduleA.cpp is in modules folder
data.txt is in data folder
moduleA.cpp wants to read data.txt
So the way I'd currently do it would be to assume maincode.cpp gets compiled & executed inside the Project folder, and so hardcode the path data/data.txt in moduleA.cpp to do the reading (say I used fstream fs("data/data.txt") to do so).
But what if the code was, for some reason, executed from inside the etc folder?
Is there a way around this?
The Questions
Is this a valid question? Or am I missing something with the wd (working directory) concept fundamentals?
Are there any methods for working around absolute paths so as to solve this issue in C++?
Are there any universal methods for doing the same with any language?
If there are no reasonable methods, how would you approach this issue?
Please leave a comment if I missed any important details with the problem's illustration!
At some point the program has to make an assumption about where the file(s) are, either by getting the location from user input or by using a relative path with the presumed filename. As already said in the comments, C++17 added std::filesystem, which can help you write cross-platform code that interacts with the host's filesystem.
That being said, every program, big or small, has to make certain assumptions at some point. Deleting or moving files is a problem for any program that requires them to be at a certain location under a certain name. This is not solvable other than by presenting the user with an error message.
As @Hatted Rooster said, it's not generally solvable for some arbitrary file without making some assumptions. However, there are frameworks that allow you to "store" some files as resources embedded into the executable (or otherwise). Those frameworks usually allow you to handle such files in an opaque way, without relying on the current working directory or relative paths.
For example, see the Qt Resource System.
Your program can deduce the path from argv[0] in the main call, if you know that it is always relative to your executable or you use an absolute path like "C:\myProgram\data\data.txt".
The second approach works in every language.
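A rough sketch of the argv[0] idea using std::filesystem (C++17). The function name pathNextToExecutable is hypothetical. Note that argv[0] is whatever the caller passed: if the program was found through PATH it may be a bare name with no directory part, so this only works under the assumption that argv[0] actually leads to the executable.

```cpp
#include <filesystem>

namespace fs = std::filesystem;

// Hypothetical helper: builds <directory of argv[0]>/<relative>.
// A relative argv[0] is resolved against the current working directory.
fs::path pathNextToExecutable(const char* argv0, const fs::path& relative)
{
    const fs::path exeDir = fs::absolute(fs::path(argv0)).parent_path();
    // lexically_normal() removes "." and ".." components without touching the disk.
    return (exeDir / relative).lexically_normal();
}
```

In main, something like pathNextToExecutable(argv[0], "data/data.txt") could then be passed down to moduleA instead of the hard-coded relative path.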

How to get real full path of a file or directory having one of its paths?

The same file system entry can be accessible in several paths.
real full path - /home/user/dir1/file1
path which contains parent dirs - /home/user/dir1/../dir1/file1
path with direct symlinks - /home/user/dir1/symlink_to_file1
path with indirect symlinks - /home/user/symlink_to_dir1/file1
...
I want to write a function that, given two paths, tells whether the file or directory specified by the second path is inside (including sub-directories) the directory specified by the first path.
I think the most obvious solution is to find real full paths of both file system entries then check whether the first real path is a prefix of the second. That is why the title of question is about finding real full paths.
NOTE: I want to write the function for both Windows and POSIX compatible systems.
NOTE: boost::filesystem cannot be used.
In Windows and Unix-land alike there is no single “real path”. In particular a file can have many different directory entries, called hardlinks, in Unix-land created via ln and in Windows 7 and later via mklink. But also, in Windows you can very simply define a local logical drive mapped to some directory, via the subst command, and drives mapped to file server directories via e.g. net use, and you can mount a drive as a directory, e.g. via the mountvol command.
However, the “real path” problem is just an imagined solution to the real problem, which is to establish whether a file or directory is inside a directory specified via a path.
For that, establish a system-specific ID for the filesystem entity that you're searching for, and scan up the parent directory chain looking for that ID. Sorry, I misread the question. I can't think of any efficient way to do this; it sounds like a brute-force ID search through all possible directories, unless you can avail yourself of indexing information.
The question you need to know up front is this: How many ways are there to get to /path/to/filename? With symbolic links the answer is infinite (well, within the bound of the filesystem size). Any symbolic link anywhere on any portion of the filesystem could redirect to the file (or some portion of the path above the file). Even without considering hard links the search space must be the entire filesystem under /base/path/of/interest/ (which may be the entire filesystem).
Allowing symbolic links, and without further limitations, there is no non-brute-force method for establishing whether /path/to/filename is reachable within /base/path/of/interest/.
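For reference, the approach proposed in the question (canonicalize both paths, then do a prefix check) can be sketched with std::filesystem from C++17, which is not boost::filesystem. The function name isInside is hypothetical. This resolves ".." components and symbolic links along the given paths; it does not address the hard link, subst, or mount cases raised above.

```cpp
#include <algorithm>
#include <filesystem>

namespace fs = std::filesystem;

// Hypothetical helper: true if `entry` resolves to `dir` itself or to
// something below it. May throw fs::filesystem_error on I/O errors.
bool isInside(const fs::path& dir, const fs::path& entry)
{
    // weakly_canonical resolves symlinks and ".." for the part that exists.
    const fs::path base = fs::weakly_canonical(dir);
    const fs::path target = fs::weakly_canonical(entry);

    // Compare component by component, so "/a/bc" is not treated as being
    // inside "/a/b" the way a plain string prefix test would.
    const auto diff = std::mismatch(base.begin(), base.end(),
                                    target.begin(), target.end());
    return diff.first == base.end();
}
```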

Linked directory not found

I have the following scenario:
The main software I wrote uses a database created by a simulator. This database is around 10 GB at the moment, so I want to keep only one copy of that data per system.
Assume I have the following projects:
Main Software using the data, located at /SimData
DLL using the data for debugging, searching for data at /SimData
Debugging tool to parse the image database, searching for the data at /SimData
I do not want all those programs to have their own copy of SimData, not only to save space, but also to ensure that the simulation data is always up to date for all programs.
So for the DLL and the Debugging Utility I created a link named SimData pointing to MainSoftware/SimData, but when opening a file with "SimData\MyFile.data" it cannot be found. Only the MainSoftware, with the ACTUAL SimData folder, can find it.
How can I use the MainSoftware/SimData folder without setting absolute paths?
This is on Windows 7 x64
I agree with Peter about adding the DB location as a configurable parameter. A common place to store that is in the registry.
However, if you want to create links that will be recognized by your software, try hardlinks. fsutil should do the trick, as described here.
You need a way to configure the database location. You could use an INI or other configuration file, a registry setting, a command-line input, or an environment variable. Or you could write your program to search a directory hierarchy. For example, if the various modules are usually siblings of each other in your directory tree, you could search for SimData/MyFile.data, ../SimData/MyFile.data, ../../MainSoftware/SimData/MyFile.data, and use the first one found.
Which answer is the "right one" depends on your situation.

Include static data/text file

I have a text file (>50k lines) of ascii numbers, with string identifiers, that can be thought of as a collection of data vectors. Based on user input, the application only needs one of these data vectors at runtime.
As far as I can see, I have 3 options for getting the information from this text file:
Keep it as a text file, extract the required vector at run-time. I believe the downside is that you can't have a relative path in the code, so the user would have to point to the file's correct location (?). Or alternatively, get the configure script to inject the absolute path as a macro.
Convert it to a static unsigned char using xxd (as explained here) and then include the resulting file. Downside is that a 5MB file turns into a 25MB include file. Am I correct in thinking that this 25MB is loaded into memory for the duration of the runtime?
Convert it to an object and link using objcopy as explained here. This seems to keep the file size about the same -- are there other trade-offs?
Is there a standard/recommended method for doing this? I can use C or C++ if that makes a difference.
Thanks.
(Running on linux with gcc)
I would go with number 1 and pass the filepath into the program as an argument. There's nothing wrong with doing that and it is simple and straight-forward.
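As a sketch of what option 1 might look like at run time: the file layout is not given in the question, so this assumes one vector per line in the form "identifier number number ...". The function name loadVector is hypothetical.

```cpp
#include <istream>
#include <optional>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: scans the stream line by line and returns the numbers
// following the first line whose leading token equals wantedId.
// Assumed line format: "<identifier> <number> <number> ..."
std::optional<std::vector<double>> loadVector(std::istream& in, const std::string& wantedId)
{
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        std::string id;
        if (!(fields >> id) || id != wantedId) {
            continue;  // blank line or a different vector
        }
        std::vector<double> values;
        double value = 0.0;
        while (fields >> value) {
            values.push_back(value);
        }
        return values;
    }
    return std::nullopt;  // identifier not present in the file
}
```

With the file path taken from argv[1], the program would open a std::ifstream on it and pass that stream to loadVector together with the identifier the user asked for. Only the requested vector is kept in memory.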
You should have a look at the answers here:
Directory of running program
The top-voted answer gives you a clue about how to handle your data file. But instead of the home folder, I would suggest saving it under /usr/share, as explained in the link.
I'd prefer to use zlib (both ways are possible: a side file, or an include with compressed data).

Hiding application resources

I'm making a simple game with SFML 1.6 in C++. Of course, I have a lot of picture, level, and data files. Problem is, I don't want these files visible. Right now they're just plain picture files in a res/ subdirectory, and I want to either conceal them or encrypt them. Is it possible to put the raw data from the files into a resource file or something? Any solution is okay to me, I just don't want the files exposed to the user.
EDIT
Cross-platform solutions are best, but if they don't exist, that's okay; I'm working on Windows. But I don't really want to use a library if it's not needed.
Most environments come with a resource compiler that converts images/icons/etc into string data and includes them in the source.
Another common technique is to copy them into the end of the final .exe as the last part of the build process. Then at run time, open the .exe as a file and read the data from some determined offset, see Embedding a filesystem in an executable?
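A sketch of reading data appended to the end of a file. The layout here is an assumption made for illustration, not a standard: the build step appends the payload followed by an 8-byte little-endian payload size, so the reader can find the offset by looking at the last 8 bytes. The function name readAppendedPayload is hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <ios>
#include <string>
#include <vector>

// Assumed (hypothetical) layout: [executable bytes][payload][8-byte LE size].
// Returns the payload bytes, or an empty vector if the file is missing or
// the footer does not describe a payload that fits in the file.
std::vector<char> readAppendedPayload(const std::string& exePath)
{
    std::ifstream in(exePath, std::ios::binary | std::ios::ate);
    if (!in) {
        return {};
    }
    const std::streamoff fileSize = in.tellg();
    const std::streamoff footerSize = 8;
    if (fileSize < footerSize) {
        return {};
    }

    // Read the size footer and decode it as little-endian.
    unsigned char raw[8];
    in.seekg(fileSize - footerSize, std::ios::beg);
    if (!in.read(reinterpret_cast<char*>(raw), sizeof raw)) {
        return {};
    }
    std::uint64_t payloadSize = 0;
    for (int i = 7; i >= 0; --i) {
        payloadSize = (payloadSize << 8) | raw[i];
    }
    if (payloadSize > static_cast<std::uint64_t>(fileSize - footerSize)) {
        return {};
    }

    // Seek to the start of the payload and read it in one go.
    std::vector<char> payload(static_cast<std::size_t>(payloadSize));
    in.seekg(fileSize - footerSize - static_cast<std::streamoff>(payloadSize), std::ios::beg);
    if (payloadSize > 0 &&
        !in.read(payload.data(), static_cast<std::streamsize>(payload.size()))) {
        return {};
    }
    return payload;
}
```

This only hides the files from casual browsing; anyone who inspects the executable can still extract the bytes unless they are also encrypted.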
The ideal way for this is to make your own archive format, which would contain all of your files' data along with some extra info needed to split files distinctly within it.