How to properly navigate directory paths in C++ - c++

I'm working on a solution within Visual Studio. It currently has two projects.
I will represent directories or folders with capital letters, and filenames will be all lower case. My solution structure is as follows:
SolutionDir
    ProjectLib
        source files
        Shaders
            shader files
    ProjectApp
        source files
    x64
        Debug
            app.exe // debug build
        Release
            app.exe // release build
Within ProjectLib I have a function to open and read my Shader files. Here is what my function looks like:
std::vector<char> VRXShader::readFile(std::string_view shadername) {
    std::string filename = std::string("Shaders/");
    filename.append(shadername);
    std::ifstream file(filename.data(), std::ios::ate | std::ios::binary);
    if (!file.is_open()) {
        throw std::runtime_error("failed to open file!");
    }
    size_t fileSize = static_cast<size_t>(file.tellg());
    std::vector<char> buffer(fileSize);
    file.seekg(0);
    file.read(buffer.data(), fileSize);
    file.close();
    return buffer;
}
This function is being called within my VRXDevices::createPipeline function and here is the relevant code:
void VRXDevices::createPipeline(
    VkDevice device, VkExtent2D swapChainExtent, VkRenderPass renderPass,
    const std::vector<std::string_view>& shaderNames,
    VkPipelineLayout& pipelineLayout, VkPipeline& pipeline
) {
    std::vector<std::vector<char>> shaderCodes;
    shaderCodes.resize(shaderNames.size());
    for (auto& name : shaderNames) {
        auto shaderCode = VRXShader::readFile(name.data());
    }
    // .... more code
}
The names are being created and passed to this function from my VRXEngine::initVulkan function which can be seen here:
void VRXEngine::initVulkan(
    std::string_view app_name, std::string_view engine_name,
    glm::ivec3 app_version, glm::ivec3 engine_version
) {
    //... code
    std::vector<std::string_view> shaderFilenames{ "vert.spv", "frag.spv" };
    VRXDevices::createPipeline(device_, swapChainExtent_, renderPass_, shaderFilenames, pipelineLayout_, graphicsPipeline_);
}
I'm using just the name of the shader files such as vert.spv, frag.spv, geom.spv etc. I'm not including the paths here because these will be used as the key to a std::map<string_view, object>. So I'm passing a vector of these names from my ::initVulkan function into ::createPipeline().
Within ::createPipeline() is where ::readFile() is being called passing in the string_view.
Now as for my question... within ::readFile() I'm creating a local string and trying to initialize it with the appropriate path... then append to it the string_view for the shader's filename as can be seen from these two lines...
std::string filename = std::string("Shaders/");
filename.append(shadername);
I'm trying to figure out the appropriate string to initialize filename with... Shaders/ will be a part of the name, but it's not finding the file and I'm not sure what the appropriate prefix should be...
My working directories within both projects are as follows:
ProjectApp -> $(SolutionDir)x64/Release AND $(SolutionDir)x64/Debug
ProjectLib -> $(SolutionDir)x64/Release AND $(SolutionDir)x64/Debug
So I need to go back 2 directories then into VRX Engine/Shader...
What is the correct string value for navigating back directories?
Would I initialize filename with "../../VRX Engine/Shaders/" or is it "././" also, should I have quotes around VRX Engine since there is a space in the folder name? What do I need to initialize filename with before I append the shader name to it?

How to properly navigate directory paths in C++
It depends on which C++ standard your implementation claims to comply with, or else on which additional libraries you can use.
C++ is useful on computers without directories (e.g. inside some operating system kernel coded in C++ and compiled with GCC, see OSDEV for examples).
Look on en.cppreference.com for details.
Licensing constraints could matter when using extra open source libraries.
If your implementation is C++17 compliant (in a "hosted" not "freestanding" way), use the std::filesystem part of the standard library.
If your operating system supports the Qt or POCO frameworks and you are allowed to use them (e.g. on C++11), you could use appropriate APIs. So QDir and related classes with Qt, Poco::Path and related classes with POCO.
Perhaps you want to code just for the WinAPI. Then read its documentation (I never coded on Windows myself, just on POSIX or Unix -e.g. Linux- and MSDOS....).
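For the std::filesystem route mentioned above, here is a minimal sketch, assuming a C++17 hosted implementation. The helper name shaderPath is made up for illustration, and the folder names "VRX Engine" and "Shaders" are taken from the question. It goes up two directories from the working directory, descends into the shader folder, and collapses the ".." components lexically. No quotes are needed around a folder name that contains a space.

```cpp
#include <filesystem>
#include <string_view>

namespace fs = std::filesystem;

// Hypothetical helper: builds <workingDir>/../../VRX Engine/Shaders/<shaderName>
// and removes the ".." components without touching the disk.
fs::path shaderPath(const fs::path& workingDir, std::string_view shaderName) {
    fs::path p = workingDir / ".." / ".." / "VRX Engine" / "Shaders";
    p /= fs::path(shaderName);   // append the file name as the last component
    return p.lexically_normal(); // purely lexical; symlinks are not resolved
}
```

Calling shaderPath(fs::current_path(), "vert.spv") would then give a path that can be passed directly to std::ifstream. Note that lexically_normal() does not check that the file exists.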

I was originally initializing my local temp string properly with "../../VRX Engine/Shaders/" before appending the string_view to it to be able to open the file. This was actually correct, but because it didn't initially work, I was assuming that it was wrong.
The correct string value for going back one directory should be "../" at least on Windows, I'm not sure about Linux, Mac, Android, etc...
My problem wasn't with the string at all, it pertained to settings within my projects. Within my project that builds into an executable, I had its working directory set to $(SolutionDir)x64/Debug and $(SolutionDir)x64/Release respectively which is correct for my solutions structure.
The issue was within my Engine project, which is built as a static library. In its working directory settings, I had forgotten to modify both the Debug and Release build options. These were still set to the Visual Studio default, which I believe is $(ProjectDir). Once I changed these to $(SolutionDir)x64/Debug and $(SolutionDir)x64/Release to match my ApplicationProject, I was able to open and read the contents of the files.
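When a relative path that looks correct still fails to open, it can help to print what it resolves to at run time. The following is a small diagnostic sketch, assuming C++17; the function names are invented for illustration. It shows the current working directory and the absolute path a relative name maps to.

```cpp
#include <filesystem>
#include <iostream>

namespace fs = std::filesystem;

// Hypothetical helper: resolves a relative path against the process's
// current working directory, which is what std::ifstream effectively does.
fs::path resolveFromWorkingDir(const fs::path& relative) {
    return fs::absolute(relative);
}

// Prints the working directory and the resolved path to stderr.
void reportPath(const fs::path& relative) {
    std::cerr << "working directory: " << fs::current_path() << '\n'
              << "resolved path:     " << resolveFromWorkingDir(relative) << '\n';
}
```

Calling reportPath("../../VRX Engine/Shaders/vert.spv") just before opening the file makes a wrong working directory visible immediately.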

Related

How to read a file name containing 'œ' as character in C/C++ on windows

This post is not a duplicate of this one: dirent not working with unicode
Because here I'm using it on a different OS and I also don't want to do the same thing. The other thread is trying to simply count the files, and I want to access the file name which is more complex.
I'm trying to retrieve data information through files names on a windows 10 OS.
For this purpose I use dirent.h (an external C library, but still very useful in C++ too).
DIR* directory = opendir(path);
struct dirent* direntStruct;
if (directory != NULL)
{
    while (direntStruct = readdir(directory))
    {
        cout << direntStruct->d_name << endl;
    }
}
This code is able to retrieve all the file names located in a specific folder (one by one). And it works pretty well!
But when it encounters a file name containing the character 'œ', things go wrong:
Example:
grosse blessure au cœur.txt
is read in my program as:
GUODU0~6.TXT
I'm not able to find the original data in the string name because as you can see my string variable has nothing to do with the current file name!
I can rename the file and it works, but I don't want to do this, I just need to read the data from that file name and it seems impossible. How can I do this?
On Windows you can use FindFirstFile() or FindFirstFileEx() followed by FindNextFile() to read the contents of a directory with Unicode in the returned file names.
Short File Name
The name you receive is the 8.3 short file name that NTFS generates for non-ASCII file names, so that they can be accessed by programs that don't support Unicode.
clinging to dirent
If dirent doesn't support UTF-16, your best bet may be to change your library.
However, depending on the implementation of the library you may have luck with:
adding / changing the manifest of your application to support UTF-8 in char-based Windows APIs. This requires a very recent version of Windows 10.
see MSDN:
Use the UTF-8 code page under Windows - Apps - UWP - Design and UI - Usability - Globalization and localization.
setting the C++ Runtime's code page to UTF-8 using setlocale
I do not recommend this, and I don't know if this will work.
life is change
Use std::filesystem to enumerate directory content.
A simple example can be found here (see the "Update 2017").
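As a rough sketch of the std::filesystem approach, assuming a C++17 compiler, the function below collects the names of all entries in a directory. The name listEntryNames is made up for illustration. The names are kept as std::filesystem::path objects, which on Windows store wide characters natively, so a name such as "cœur.txt" should come back as the long name rather than the 8.3 alias.

```cpp
#include <filesystem>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical helper: returns the last path component of every entry
// (files and subdirectories) directly inside `directory`.
// Throws fs::filesystem_error if the directory cannot be opened.
std::vector<fs::path> listEntryNames(const fs::path& directory) {
    std::vector<fs::path> names;
    for (const fs::directory_entry& entry : fs::directory_iterator(directory)) {
        names.push_back(entry.path().filename());
    }
    return names;
}
```

To display a name on Windows, convert it with wstring() or u8string() rather than string(), since the narrow conversion depends on the active code page.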
Windows only
You can use FindFirstFileW and FindNextFileW as platform APIs that support UTF-16 strings. However, with std::filesystem there's little reason to do so (at least for your use case).
If you're in C, use the OS functions directly, specifically FindFirstFileW and FindNextFileW. Note the W at the end, you want to use the wide versions of these functions to get back the full non-ASCII name.
In C++ you have more options, specifically with Boost. You have classes like recursive_directory_iterator which allow cross-platform file searching, and they provide UTF-8/UTF-16 file names.
Edit: Just to be absolutely clear, the file name you get back from your original code is correct. Due to backwards compatibility in Windows filesystems (FAT32 and NTFS), every file has two names: the "full", Unicode aware name, and the "old" 8.3 name from DOS days.
You can absolutely use the 8.3 name if you want, just don't show it to your users or they'll be (correctly) confused. Or just use the proper, modern API to get the real name.

Initialize tesseract without any external resources (languages/dictionaries)

I am currently writing a C++ program that should read hex data from JPEG images. I have to compile it into one single windows executable without any external resources (like the "tessdata" directory or config files). As I am not reading any words or sentences, I don't need any dictionaries or languages.
My problem is now that I could not find a way to initialize the API without any language files. Every example uses something like this:
tesseract::TessBaseAPI api;
if (api.Init(NULL, "eng")) {
    // error handling
    return -1;
}
// do stuff
I also found that I can call the init function without language argument and with OEM_TESSERACT_ONLY:
if (api.Init(NULL, NULL, tesseract::OcrEngineMode::OEM_TESSERACT_ONLY)) {
    // ...
}
This should disable the language/dictionary, but NULL just defaults to "eng". It seems like tesseract still wants a language file to initialize and will disable it afterwards.
This also seems to be the case for any other solutions I found so far: I always need .traineddata files to initialize the api and can disable them afterwards or using config files.
My question is now:
Is there any way to initialize the tesseract API in C++ using just the executable and no other resource files?
No. Tesseract always needs some language (the default is eng) plus osd (.traineddata) files. Without a language data file, Tesseract is useless.
Your post suggests that you made several wrong assumptions (e.g. about OEM_TESSERACT_ONLY), so if you describe what you are trying to achieve with Tesseract, you may get better advice.

Limit log size with existing logging library (C++)

Being a beginner in C++, I found myself facing a problem when attempting to limit log file size using the ezlogger library: http://axter.com/ezlogger/
My take at it was to:
1) check log file size every n seconds
2) if size is too big, start logging to a second file (clearing it beforehand) Then switch between the files every n seconds.
I did 1. And I tackled 2 by changing the symlink used by the logging library as logging output file location (the app is running on Linux). However, it seems that the library retains a reference to the original file and never starts logging to the new file after changing the link.
The reason I decided to go this way was because I didn't want to touch the library. For an experienced programmer it would probably make more sense to somehow modify the library to enable switching log files. But with all the static variables and methods and hpp files containing actual code, I couldn't make sense of it and didn't know where to start.
So I guess I'm looking for opinions on my current approach, help with getting it to work and/or advice on how to do it differently/better.
Thanks.
Edit: I'm working on an existing older project which already uses ezlogger so I'd like to avoid using a different library if possible.
Either use logrotate (if you are on a Unix-like system), as was suggested, or modify your logging library. The static variables you mention appear to be located in get_log_stream(). The modification would require checking the size of the current log file on each get_log_stream() call. If the size exceeds some number of bytes, reopen the stream. I don't see anything that makes this logging library thread safe, so it probably isn't, and you don't have to worry about preserving that. But if your application is multithreaded, make a note of it.
The modification of get_log_stream would look as follows (it's pseudocode):
// ...
if (logfile_is_open) {
    if (logfile.tellp() > 1024*1024*10 /*10MB*/) {
        logfile.close();
        logfile.clear(); //clears flags
        // TODO: update FileName accordingly, ie. add a count to it.
        // TODO: remove older log files, etc.
        logfile.open(FileName.c_str(), std::ios_base::out);
    }
}
// Below is old code.
if (logfile_is_open) return logfile;
return std::cout;
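The same idea can be sketched outside the library as a small standalone class. This is not ezlogger code: the class name RotatingLog and the file naming scheme (base name plus a counter) are assumptions made for illustration. It checks the stream position before each write and switches to a new numbered file once the limit is reached. It is not thread safe.

```cpp
#include <cstddef>
#include <fstream>
#include <ios>
#include <string>
#include <utility>

// Hypothetical helper, not part of ezlogger: a log sink that moves on to a
// new numbered file once the current file has reached maxBytes.
class RotatingLog {
public:
    RotatingLog(std::string baseName, std::size_t maxBytes)
        : baseName_(std::move(baseName)), maxBytes_(maxBytes) {
        open();
    }

    // Appends one line, rotating first if the current file is already full.
    void write(const std::string& line) {
        const std::streamoff pos = file_.tellp(); // -1 if the stream failed
        if (pos >= 0 && static_cast<std::size_t>(pos) >= maxBytes_) {
            file_.close();
            ++index_;
            open();
        }
        file_ << line << '\n';
    }

    // Name of the file currently being written, e.g. "app.0.log".
    std::string currentName() const {
        return baseName_ + "." + std::to_string(index_) + ".log";
    }

private:
    void open() {
        file_.clear(); // reset error flags before reusing the stream
        file_.open(currentName(), std::ios::out | std::ios::trunc);
    }

    std::string baseName_;
    std::size_t maxBytes_;
    unsigned index_ = 0;
    std::ofstream file_;
};
```

Deleting old files, or alternating between just two files as in the question, could be added inside the rotation branch.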

Accessing resources from program in Debian package structure

I've made a DEB package of a C++ app that I've created. I want this app to use resources in the "data" directory, which in my tests (for convenience) is in the same location as the program binary, and I refer to it from inside the code by its relative path. In Debian there are standard locations for data files (something like /usr/share/...) and other locations for binaries (probably /usr/bin). I'd rather not hard-code the paths in my program; I think it's better practice to access an image by "data/img.png" than by "/usr/share/.../data/img.png". All the classic GNU programs respect the directory structure, and I imagine they do it in a good manner. I tried to use dpkg to find out the structure of the apps, but that didn't help me. Is there a better way to do this than what I'm doing?
PS: I also want my code to be portable to Windows (cross-platform) avoiding using workarounds like "if WIN32" as much as possible.
In your Debian package you should indeed install your data in /usr/share/. When accessing your data, you should use the XDG standard, which states that $XDG_DATA_DIRS is a colon-separated list of data directories to search (also, "if $XDG_DATA_DIRS is either not set or empty, a value equal to /usr/local/share/:/usr/share/ should be used.").
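A minimal sketch of that lookup, assuming C++17: the function names splitDataDirs and findDataFile are invented for illustration, and the fallback value is the one quoted above. The first function splits the colon-separated list; the second returns the first directory in which a given relative file exists. This covers the Unix side only.

```cpp
#include <cstdlib>
#include <filesystem>
#include <optional>
#include <sstream>
#include <string>
#include <system_error>
#include <vector>

namespace fs = std::filesystem;

// Splits an XDG_DATA_DIRS-style value; an empty value falls back to the
// default "/usr/local/share/:/usr/share/".
std::vector<fs::path> splitDataDirs(const std::string& value) {
    const std::string effective =
        value.empty() ? "/usr/local/share/:/usr/share/" : value;
    std::vector<fs::path> dirs;
    std::istringstream in(effective);
    std::string item;
    while (std::getline(in, item, ':')) {
        if (!item.empty()) dirs.emplace_back(item);
    }
    return dirs;
}

// Returns the first existing <dir>/<relative>, searching in list order.
std::optional<fs::path> findDataFile(const fs::path& relative) {
    const char* env = std::getenv("XDG_DATA_DIRS");
    for (const fs::path& dir : splitDataDirs(env ? env : "")) {
        const fs::path candidate = dir / relative;
        std::error_code ec; // non-throwing overload: errors count as "not found"
        if (fs::exists(candidate, ec)) return candidate;
    }
    return std::nullopt;
}
```

A call such as findDataFile("programname/img.png") would then find the image under /usr/share/ after installation, or under another directory if the user overrides XDG_DATA_DIRS.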
This is not entirely Linux-specific or Debian-specific. I think it has something to do with the Linux Standard Base or POSIX specifications, but I was unable to find the relevant specification quickly.
But you should not use some "base" directory with subdirectories in it for each type of data. Platform-dependent code belongs in /usr/lib/programname, platform-independent read-only data in /usr/share/programname/img.png. Data changed by the application goes in /var/lib/programname/cache.db, or in ~/.programname/cache.db, depending on what kind of application it is and what it does. Note: there is no need for a "data" directory when /usr/share is already there for non-executable data.
You may want to check http://www.debian.org/doc/manuals/developers-reference/best-pkging-practices.html if you are packaging for Debian. But these are not resources like on Android, iPhone, or Windows. These files are extracted on package install into the target file system as real files.
Edit: see http://www.debian.org/doc/packaging-manuals/fhs/fhs-2.3.html
Edit2: As for a multiplatform solution, I suggest you write some wrapper functions. On Windows it depends on the installer; programs usually store the path of their install directory in the registry. On Unix the place for data is more or less given; you may consider a build option for changing the target prefix, or use an environment variable to override the default paths. On Windows a prefix would also be sufficient, if it does not need to be too flexible.
I suggest some functions to which you pass the name of an object and which return the path of the file. It depends on the toolkit used; the Qt library may have something similar already implemented.
#include <string>
#ifdef _WIN32 // predefined by Windows compilers
#define ROOT_PREFIX "c:/Program Files/"
const char DATA_PREFIX[] = ROOT_PREFIX "program/data";
#else
#define ROOT_PREFIX "/usr/"
/* #define ROOT_PREFIX "/usr/local/" */
const char DATA_PREFIX[] = ROOT_PREFIX "share/program";
#endif
std::string GetImageBasePath()
{
    return std::string(DATA_PREFIX) + "/images";
}
std::string GetImagePath(const std::string &imagename)
{
    // multiple directories and/or file types could be tried here, depends on how sophisticated
    // it should be.
    // you may check if such file does exist here for example and return only image type that does exist, if you can load multiple types.
    // the "/" separates the base directory from the file name
    return GetImageBasePath() + "/" + imagename + ".png";
}
class Image;
extern Image * LoadImage(const char *path);
int main(int argc, char *argv[])
{
    Image *img1 = LoadImage(GetImagePath("toolbox").c_str());
    Image *img2 = LoadImage(GetImagePath("openfile").c_str());
    return 0;
}
It might be wise to make class Settings, where you can initialize platform dependent root paths once per start, and then use Settings::GetImagePath() as method.

Write a file in a specific path in C++

I have this code that writes successfully a file:
ofstream outfile (path);
outfile.write(buffer,size);
outfile.flush();
outfile.close();
buffer and size are ok in the rest of code.
How can I put the file in a specific path?
Specify the full path in the constructor of the stream, this can be an absolute path or a relative path. (relative to where the program is run from)
The stream's destructor closes the file for you at the end of the function where the object was created (since ofstream is a class).
Explicit closes are good practice when you want to reuse the same stream object for another file. If this is not needed, you can let the destructor do its job.
#include <fstream>
#include <string>
int main()
{
    const char *path="/home/user/file.txt";
    std::ofstream file(path); //open in constructor
    std::string data("data to write to file");
    file << data;
}//file destructor
Note that since C++11 you can pass a std::string to the file stream constructor, which is preferred to a const char* in most cases.
Rationale for posting another answer
I'm posting because none of the other answers cover the problem space.
The answer to your question depends on how you get the path. If you are building the path entirely within your application then see the answer from @James Kanze. However, if you are reading the path or components of the path from the environment in which your program is running (e.g. environment variable, command-line, config files etc.) then the solution is different. In order to understand why, we need to define what a path is.
Quick overview of paths
On the operating systems (that I am aware of), a path is a string which conforms to a mini-language specified by the operating-system and file-system (system for short). Paths can be supplied to IO functions on a given system in order to access some resource. For example here are some paths that you might encounter on Windows:
\file.txt
\\bob\admin$\file.txt
C:..\file.txt
\\?\C:\file.txt
.././file.txt
\\.\PhysicalDisk1\bob.txt
\\;WebDavRedirector\bob.com\xyz
C:\PROGRA~1\bob.txt
.\A:B
Solving the problem via path manipulation
Imagine the following scenario: your program supports a command line argument, --output-path=<path>, which allows users to supply a path into which your program should create output files. A solution for creating files in the specified directory would be:
1. Parse the user specified path based on the mini-language for the system you are operating in.
2. Build a new path in the mini-language which specifies the correct location to write the file using the filename and the information you parsed in step 1.
3. Open the file using the path generated in step 2.
An example of doing this:
On Linux, say the user has specified --output-path=/dir1/dir2
Parse this mini-language:
/dir1/dir2
--> "/" root
--> "dir1" directory under root
--> "/" path separator
--> "dir2" directory under dir1
Then when we want to output a file in the specified directory we build a new path. For example, if we want to output a file called bob.txt, we can build the following path:
/dir1/dir2/bob.txt
--> "/" root
--> "dir1" directory under root
--> "/" path separator
--> "dir2" directory under dir1
--> "/" path separator
--> "bob.txt" file in directory dir2
We can then use this new path to create the file.
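The parse-and-rebuild steps above can be sketched with std::filesystem::path, assuming C++17, which delegates the mini-language details to the standard library for the platform it was built for. The function names components and outputFilePath are made up for illustration.

```cpp
#include <filesystem>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// Step 1: let the library split the path into its components.
std::vector<std::string> components(const fs::path& p) {
    std::vector<std::string> parts;
    for (const fs::path& part : p) {
        parts.push_back(part.generic_string());
    }
    return parts;
}

// Step 2: build the output path; operator/ inserts the separator itself.
fs::path outputFilePath(const fs::path& outputDir, const fs::path& fileName) {
    return outputDir / fileName;
}
```

One caveat: if fileName is itself an absolute path, operator/ returns fileName and discards outputDir, so a file name taken from user input should be checked first. The library only understands the path syntax of its own platform, so the limits discussed in this answer still apply.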
In general it is impossible to implement this solution fully. Even if you could write code that could successfully decode all path mini-languages in existence and correctly represent the information about each system so that a new path could be built correctly - in the future your program may be built or run on new systems which have new path mini-languages that your program cannot handle. Therefore, we need to use a careful strategy for managing paths.
Path handling strategies
1. Avoid path manipulation entirely
Do not attempt to manipulate paths that are input to your program. You should pass these strings directly to api functions that can handle them correctly. This means that you need to use OS specific api's directly avoiding the C++ file IO abstractions (or you need to be absolutely sure how these abstractions are implemented on each OS). Make sure to design the interface to your program carefully to avoid a situation where you might be forced into manipulating paths. Try to implement the algorithms for your program to similarly avoid the need to manipulate paths. Document the api functions that your program uses on each OS to the user - this is because OS api functions themselves become deprecated over time so in future your program might not be compatible with all possible paths even if you are careful to avoid path manipulation.
2. Document the functions your program uses to manipulate paths
Document to the user exactly how paths will be manipulated. Then make it clear that it is the user's responsibility to specify paths that will work correctly with the documented program behavior.
3. Only support a restricted set of paths
Restrict the path mini-languages your program will accept until you are confident that you can correctly manipulate the subset of paths that meet this set of restrictions. Document this to the user. Error if paths are input that do not conform.
4. Ignore the issues
Do some basic path manipulation without worrying too much. Accept that your program will exhibit undefined behavior for some paths that are input. You could document to the user that the program may or may not work when they input paths to it, and that it is the user's responsibility to ensure that the program has handled the input paths correctly. However, you could also not document anything. Users will commonly expect that your program will not handle some paths correctly (many don't) and therefore will cope well even without documentation.
Closing thoughts
It is important to decide on an effective strategy for working with paths early on in the life-cycle of your program. If you have to change how paths are handled later, it may be difficult to avoid a change in behaviour that might break your program for existing users.
Try this:
ofstream outfile;
string createFile = "";
string path="/FULL_PATH";
createFile = path + "/" + "SAMPLE_FILENAME" + ".txt"; // path is already a std::string
outfile.open(createFile.c_str());
outfile.close();
//It works like a charm.
That needs to be done when you open the file, see std::ofstream constructor or open() member.
It's not too clear what you're asking; if I understand correctly, you're
given a filename, and you want to create the file in a specific
directory. If that's the case, all that's necessary is to specify the
complete path to the constructor of ofstream. You can use string
concatenation to build up this path, but I'd strongly recommend
boost::filesystem::path. It has all of the functions to do this
portably, and a lot more; otherwise, you'll not be portable (without a
lot of effort), and even simple operations on the filename will require
considerable thought.
I was stuck on this for a while and have since figured it out. The path is based off where your executable is and varies a little. For this example assume you do a ls while in your executable directory and see:
myprogram.out Saves
Where Saves is a folder and myprogram.out is the program you are running.
In your code, if you are building the path from strings and converting it with c_str() like this:
string file;
getline(cin, file, '\n');
ifstream thefile;
thefile.open( ("Saves/" + file + ".txt").c_str() );
and the user types in savefile, it would be
"Saves/savefile.txt"
which will work to get to savefile.txt in your Saves folder. Notice there are no leading slashes; you just start with the folder name.
However if you are using a string literal like
ifstream thefile;
thefile.open("./Saves/savefile.txt");
it would be like this to get to the same folder:
"./Saves/savefile.txt"
Notice you start with a ./ in front of the foldername.
If you are using Linux, try execl() with the mv command.