Efficiently List All Sub-Directories in a Directory - c++

Please see edit with advice taken so far...
I am attempting to list all the directories(folders) in a given directory using WinAPI & C++.
Right now my algorithm is slow & inefficient:
- Use FindFirstFileEx() to open the folder I am searching
- I then look at every file in the directory (using FindNextFile()); if it's a directory then I store its absolute path in a vector, if it's just a file I do nothing.
This seems extremely inefficient because I am looking at every file in the directory.
Is there a WinAPI function that I can use that will tell me all the sub-directories in a given directory?
Do you know of an algorithm I could use to efficiently locate & identify folders in a directory(folder)?
EDIT:
So after taking the advice I have searched using FindExSearchLimitToDirectories but for me it still prints out all the files(.txt, etc.) & not just folders. Am I doing something wrong?
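WIN32_FIND_DATA dirData;
HANDLE dir = FindFirstFileEx( "c:/users/soribo/desktop\\*", FindExInfoStandard, &dirData,
FindExSearchLimitToDirectories, NULL, 0 );
if ( dir != INVALID_HANDLE_VALUE )
{
    do // FindFirstFileEx has already filled dirData with the first entry
    {
        printf( "FileName: %s\n", dirData.cFileName );
    } while ( FindNextFile( dir, &dirData ) != 0 );
    FindClose( dir ); // release the search handle
}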

In order to see a performance boost there must be support at the file system level. If this does not exist then the system must enumerate every single object in the directory.
In principle, you can use FindFirstFileEx specifying the FindExSearchLimitToDirectories flag. However, the documentation states (emphasis mine):
This is an advisory flag. If the file system supports directory filtering, the function searches for a file that matches the specified name and is also a directory. If the file system does not support directory filtering, this flag is silently ignored.
If directory filtering is desired, this flag can be used on all file systems, but because it is an advisory flag and only affects file systems that support it, the application must examine the file attribute data stored in the lpFindFileData parameter of the FindFirstFileEx function to determine whether the function has returned a handle to a directory.
However, from what I can tell (and information is sparse), the FindExSearchLimitToDirectories flag is not widely supported on desktop file systems.
Your best bet is to use FindFirstFileEx with FindExSearchLimitToDirectories. You must still perform your own filtering in case you meet a file system that doesn't support directory filtering at file system level. If you get lucky and hit upon a file system that does support it then you will get the performance benefit.
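For comparison, here is a minimal portable sketch of the "enumerate everything, keep only the directories" filtering described above. It assumes C++17 and swaps in std::filesystem for the WinAPI calls, so it does not use FindExSearchLimitToDirectories at all; the name listSubdirectories is made up for illustration.

```cpp
#include <filesystem>
#include <string>
#include <system_error>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical helper: returns the absolute paths of the immediate
// sub-directories of `root`. Every entry is still visited; anything that is
// not a directory is skipped. An unreadable or missing `root` yields an
// empty vector.
std::vector<std::string> listSubdirectories(const fs::path& root)
{
    std::vector<std::string> dirs;
    std::error_code ec;
    for (fs::directory_iterator it(root, ec), end; !ec && it != end; it.increment(ec))
    {
        std::error_code statEc;
        if (it->is_directory(statEc) && !statEc)
            dirs.push_back(fs::absolute(it->path()).string());
    }
    return dirs;
}
```

Calling listSubdirectories("c:/users/soribo/desktop") would be the rough equivalent of the loop in the question, with the attribute check done by is_directory().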

If you're using FindFirstFileEx, then you should be able to specify the _FINDEX_SEARCH_OPS::FindExSearchLimitToDirectories option (to be used as the fSearchOp param in FindFirstFileEx) to limit the first search (and any subsequent FindNextFile() calls) to directories.

Related

Are file names allowed to have '/' in them?

I am confused by a piece of Python code:
with open('/dev/null', 'w+') as null:
It may be because I do not have knowledge of other operating systems, but I thought file names are forbidden to have the '/' character. If so, I do not understand how this is a valid command.
Now I do understand that when using the open function in Python, if the file exists in a directory other than the current working directory, one has to prepend the path to the file name argument. However, this does not seem to be the case here because the file name argument for the open function is simply '/dev/null'. Is 'null' the file name?
Is this related to this:
https://en.wikipedia.org/wiki/Null_device
"in some operating systems, the null device is a device file that discards all data written to it but reports that the write operation succeeded. This device is called /dev/null on Unix and Unix-like systems"
On Unix systems, a file name cannot contain a forward slash because it is used as the directory separator. A file also can't be named exactly one or two dots, as those names are used for "current directory" and "parent directory". A path starting with a forward slash is an absolute path, resolved from the root directory down the directory tree.
In that code, it opens /dev/null, a special character device that discards everything written to it and reports write success. It's possible that in some cases one wants to discard the output from a specific function, like subprocess.run. In this case, opening a handle to the null device is useful.
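The same idea as a minimal C++ sketch (the question uses Python, this is only an illustration): the string "/dev/null" is read as the entry null inside the directory /dev, not as a file name containing slashes. It assumes a Unix-like system where /dev/null exists; writeToNullDevice is a hypothetical name.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: writes `text` to the null device and reports whether
// the stream is still in a good state afterwards.
// Assumes a Unix-like system that provides /dev/null.
bool writeToNullDevice(const std::string& text)
{
    std::ofstream null("/dev/null");   // directory "/dev", entry "null"
    if (!null.is_open())
        return false;
    null << text;                      // the data is discarded by the device
    null.flush();
    return null.good();
}
```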

iOS file size during write using only C/C++ APIs

Purpose: I am monitoring file writes in a particular directory on iOS using BSD kernel queues, and poll for file sizes to determine write ends (when the size stops changing). The basic idea is to refresh a folder only after any number of file copies coming from iTunes sync. I have a completely working Objective-C implementation for this but I have my reasons for needing to implement the same thing in C++ only.
Problem: The one thing stopping me is that I can't find a C or C++ API that will get the correct file size during a write. Presumably, one must exist because Objective-C's [NSFileManager attributesOfItemAtPath:] seems to work and we all know it is just calling a C API underneath.
Failed Solutions:
I have tried using stat() and lstat() to get st_size and even st_blocks for allocated block count, and they return correct sizes for most files in a directory, but when there is a file write happening that file's size never changes between poll intervals, and every subsequent file iterated in that directory has a bad size.
I have tried using fseek and ftell but they are also resulting in a very similar issue.
I have also tried modified date instead of size using stat() and st_mtimespec, and the date doesn't appear to change during a write - not that I expected it to.
Going back to NSFileManager's ability to give me the right values, does anyone have an idea what C API call that [NSFileManager attributesOfItemAtPath:] is actually using underneath?
Thanks in advance.
Update:
It appears that this has less to do with in-progress write operations and more with specific files. After closer inspection there are some files which always return a size, and other files that never return a size when using the C API (but will work fine with the Objective-C API). Even creating a copy of the "good" files the C API does not want to give a size for the copy but works fine with the original "good" file. I have both failures and successes with text (xml) files and binary (zip) files. I am using iTunes to add these files to the iPad's app's Documents directory. It is an iPad Mini Retina.
Update 2 - Answer:
Probably any of the above file size methods will work, if your path isn't invisibly trashed, like mine was. See accepted answer on why the path was trashed.
Well this weird behavior turned out to be a problem with the paths, which result in strings that will print normally, but are likely trashed in memory enough that file descriptors sometimes didn't like it (thus only occurring in certain file paths). I was using the dirent API to iterate over the files in a directory and concatenating the dir path and file name erroneously.
Bad Path Concatenation: Obviously (or apparently not-so-obvious at runtime) str-copying over three times is not going to end well.
char* fullPath = (char*)malloc(strlen(dir) + strlen(file) + 2);
strcpy(fullPath, dir);
strcpy(fullPath, "/");
strcpy(fullPath, file);
long sizeBytes = getSize(fullPath);
free(fullPath);
Correct Path Concatenation: Use proper str-concatenation.
char* fullPath = (char*)malloc(strlen(dir) + strlen(file) + 2);
strcpy(fullPath, dir);
strcat(fullPath, "/");
strcat(fullPath, file);
long sizeBytes = getSize(fullPath);
free(fullPath);
Long story short, it was sloppy work on my part, via two typos.
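In C++ the same join is harder to get wrong with std::string, since there is no buffer size to compute and no strcpy/strcat to mix up. A rough sketch, assuming a POSIX system; joinPath and fileSizeBytes are hypothetical names (fileSizeBytes stands in for the getSize() above, whose real implementation is not shown):

```cpp
#include <string>
#include <sys/stat.h>

// Hypothetical helper: joins a directory and a file name with a single '/'.
std::string joinPath(const std::string& dir, const std::string& file)
{
    if (dir.empty() || dir.back() == '/')
        return dir + file;
    return dir + "/" + file;
}

// Hypothetical stand-in for getSize(): size in bytes as reported by stat(),
// or -1 if stat() fails (for example because the path is wrong).
long long fileSizeBytes(const std::string& path)
{
    struct stat info;
    if (stat(path.c_str(), &info) != 0)
        return -1;
    return static_cast<long long>(info.st_size);
}
```

Checking for the -1 result would also have surfaced the bad path earlier, instead of a size that silently never changes.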

Correctly creating and running a win32 service with file I/O

I've written a very simple service application based on this code example.
The application as part of its normal running assumes there exists a file in the directory it is found, or in its execution path.
When I 'install' the service and then 'start' it from the service manager in Control Panel, the application fails because it can't find the file to open and read from (even though the file is in the same directory as the installed executable).
My question is: when a Windows service is run, what is its working directory expected to be?
When calling 'CreateService' there only seems to be a path parameter for the binary, not for the working directory. Is there some way to indicate where the binary should be executed from?
I've tried this on windows vista and windows 7. Getting the same issues.
Since Windows services are run from a different context than normal user-mode applications, it's best if you don't make any assumptions about working directories or relative paths. Aside from differences in working directories, a service could run using a completely different set of permissions, etc.
Using an absolute path to the file that your service needs should avoid this problem entirely. Absolute paths will be interpreted the same regardless of the working directory, so this should make the working directory of your service irrelevant. There are several ways to go about this:
Hard-code the absolute path - This is perhaps the easiest way to avoid the problem, however it's also the least flexible. This method is probably fine for basic development and testing work, but you probably want something a bit more sophisticated before other people start using your program.
Store the absolute path in an environment variable - This gives you an extra layer of flexibility since the path can now be set to any arbitrary value and changed as needed. Since a service can run as a different user with a different set of environment variables, there are still some gotchas with this approach.
Store an absolute path in the registry - This is probably the most fool-proof method. Retrieving the path from the registry will give you the same result for all user accounts, plus this is relatively easy to set up at install time.
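As a rough illustration of the environment-variable option, here is a small sketch using only the standard library. The function name and the variable name used in the usage note are hypothetical, and the caveat above still applies: the service account may not see the same variables as your own user.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper: returns the value of the environment variable
// `variableName`, or `fallback` if the variable is unset or empty.
std::string configPathFromEnv(const char* variableName, const std::string& fallback)
{
    const char* value = std::getenv(variableName);
    if (value == nullptr || *value == '\0')
        return fallback;
    return value;
}
```

A service could call something like configPathFromEnv("MYSERVICE_CONFIG", "C:\\ProgramData\\MyService\\login.cfg") at start-up; both the variable name and the fallback path are made up for the example.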
By default, the current directory for your Windows service is the System32 folder.
A promising solution is creating an environment variable that keeps the full path of your input location and retrieving the path from this variable at runtime.
If the file lives in the same directory as the binary, you could just read the binary's path and modify it accordingly. But this is a quick fix rather than a designed solution. If I were you, I would either create a system-wide environment variable and store the value there, or (even better) use the Windows registry to store the service configuration.
Note:
You will need to grant yourself some privileges using the AdjustTokenPrivileges function; you can see an example here in the ModifyPrivilege function.
Also be sure to use HKEY_LOCAL_MACHINE and not HKEY_CURRENT_USER. Services run under a different user account, so their HKCU will be different from what you see in your registry editor.
Today I solved this problem as it was needed for some software I was developing.
As people above have said; you can hardcode the directory to a specific file - but that would mean whatever config files are needed to load would have to be placed there.
For me, this service was being installed on > 50,000 computers.
We designed it to load from the directory in which the service executable is located.
Now, this is easy enough to set up and achieve as a non-system process (I did most of my testing as a non-system process). But the system wrapper that you used (and I used as well) uses Unicode formatting (and depends on it), so the traditional ways of doing it don't work as well.
Commented parts of the code should explain this. There are some redundancies, I know, but I just wanted a working version when I wrote this.
Fortunately, you can just use GetModuleFileNameA to get the path as a narrow (ANSI) string.
The code I used is:
char buffer[MAX_PATH]; // create buffer
DWORD size = GetModuleFileNameA(NULL, buffer, MAX_PATH); // Get file path in ASCII
std::string configLoc; // make string
for (int i = 0; i < strlen(buffer); i++) // iterate through characters of buffer
{
if (buffer[i] == '\\') // if buffer has a '\' in it, replace with doubles
{
configLoc = configLoc + "\\\\"; // doubles needed for parsing. 4 = 2(str)
}
else
{
configLoc = configLoc + buffer[i]; // else just add char as normal
}
}
// Complete location
configLoc = configLoc.substr(0, configLoc.length() - 17); //cut the .exe off the end
//(change this to fit needs)
configLoc += "\\\\login.cfg"; // add config file to end of string
From here on, you can simply pass configLoc to a new ifstream and then process the contents.
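A variation on the code above, as a sketch only: instead of cutting a fixed 17 characters (which only fits one particular exe name), search for the last path separator. Doubling the backslashes should not be needed at runtime either, since "\\" is only how a single backslash is written inside a source-code string literal. The helper name configPathBesideExe is hypothetical; it takes the path as a plain string so it can be fed from the GetModuleFileNameA buffer.

```cpp
#include <string>

// Hypothetical helper: given the full path of the executable (for example the
// buffer filled by GetModuleFileNameA), returns the path of `configName` in
// the same directory. Accepts both '\\' and '/' as separators.
std::string configPathBesideExe(const std::string& exePath, const std::string& configName)
{
    const std::string::size_type lastSeparator = exePath.find_last_of("\\/");
    if (lastSeparator == std::string::npos)
        return configName;                        // no directory part at all
    return exePath.substr(0, lastSeparator + 1) + configName;
}
```

With the buffer from above this would be called as configPathBesideExe(buffer, "login.cfg").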
Use this function to set the working directory of the service to the directory that contains its exe.
void AdjustCurrentWorkingDir() {
    TCHAR szBuff[1024];
    DWORD dwRet = GetModuleFileName(NULL, szBuff, 1024); //gets path of exe
    if (dwRet != 0 && dwRet < 1024) { //0 = failure, 1024 = path was truncated
        TCHAR* lastSlash = _tcsrchr(szBuff, _T('\\'));
        if (lastSlash != NULL) {
            *(lastSlash + 1) = 0; //keep only the parent directory of exe
            if (SetCurrentDirectory(szBuff) == 0) {
                //Error
            }
        }
    }
}

Write a file in a specific path in C++

I have this code that successfully writes a file:
ofstream outfile (path);
outfile.write(buffer,size);
outfile.flush();
outfile.close();
buffer and size are ok in the rest of code.
How can I put the file in a specific path?
Specify the full path in the constructor of the stream, this can be an absolute path or a relative path. (relative to where the program is run from)
The stream's destructor closes the file for you at the end of the scope in which the object was created (ofstream is a class, so its destructor runs automatically).
Explicit closes are good practice when you want to reuse the same stream object for another file. If this is not needed, you can let the destructor do its job.
#include <fstream>
#include <string>
int main()
{
const char *path="/home/user/file.txt";
std::ofstream file(path); //open in constructor
std::string data("data to write to file");
file << data;
}//file destructor
Note that since C++11 you can pass a std::string to the stream constructor, which is preferable to a const char* in most cases.
Rationale for posting another answer
I'm posting because none of the other answers cover the problem space.
The answer to your question depends on how you get the path. If you are building the path entirely within your application then see the answer from @James Kanze. However, if you are reading the path or components of the path from the environment in which your program is running (e.g. environment variable, command-line, config files etc.) then the solution is different. In order to understand why, we need to define what a path is.
Quick overview of paths
On the operating systems (that I am aware of), a path is a string which conforms to a mini-language specified by the operating-system and file-system (system for short). Paths can be supplied to IO functions on a given system in order to access some resource. For example here are some paths that you might encounter on Windows:
\file.txt
\\bob\admin$\file.txt
C:..\file.txt
\\?\C:\file.txt
.././file.txt
\\.\PhysicalDisk1\bob.txt
\\;WebDavRedirector\bob.com\xyz
C:\PROGRA~1\bob.txt
.\A:B
Solving the problem via path manipulation
Imagine the following scenario: your program supports a command line argument, --output-path=<path>, which allows users to supply a path into which your program should create output files. A solution for creating files in the specified directory would be:
1. Parse the user specified path based on the mini-language for the system you are operating in.
2. Build a new path in the mini-language which specifies the correct location to write the file using the filename and the information you parsed in step 1.
3. Open the file using the path generated in step 2.
An example of doing this:
On Linux, say the user has specified --output-path=/dir1/dir2
Parse this mini-language:
/dir1/dir2
--> "/" root
--> "dir1" directory under root
--> "/" path separator
--> "dir2" directory under dir1
Then when we want to output a file in the specified directory we build a new path. For example, if we want to output a file called bob.txt, we can build the following path:
/dir1/dir2/bob.txt
--> "/" root
--> "dir1" directory under root
--> "/" path separator
--> "dir2" directory under dir1
--> "/" path separator
--> "bob.txt" file in directory dir2
We can then use this new path to create the file.
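The parse-and-rebuild steps in this example can be sketched with C++17 std::filesystem::path, which implements the mini-language of the platform the program is compiled for. This is only an illustration under that assumption; pathComponents and outputFilePath are hypothetical names, and the component listing described in the comments is what a POSIX build produces.

```cpp
#include <filesystem>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical helper: splits a path into the components the library sees.
// On a POSIX build "/dir1/dir2" yields "/", "dir1", "dir2"; the separators
// between names are not reported as components of their own.
std::vector<std::string> pathComponents(const fs::path& p)
{
    std::vector<std::string> parts;
    for (const fs::path& part : p)
        parts.push_back(part.string());
    return parts;
}

// Hypothetical helper: builds the path of `fileName` inside `outputDir`,
// letting operator/ insert the separator where one is needed.
std::string outputFilePath(const fs::path& outputDir, const std::string& fileName)
{
    return (outputDir / fileName).string();
}
```

This does not remove the limitation discussed next: the library only understands the path forms its implementers chose to support.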
In general it is impossible to implement this solution fully. Even if you could write code that could successfully decode all path mini-languages in existence and correctly represent the information about each system so that a new path could be built correctly - in the future your program may be built or run on new systems which have new path mini-languages that your program cannot handle. Therefore, we need to use a careful strategy for managing paths.
Path handling strategies
1. Avoid path manipulation entirely
Do not attempt to manipulate paths that are input to your program. You should pass these strings directly to API functions that can handle them correctly. This means that you need to use OS-specific APIs directly, avoiding the C++ file IO abstractions (or you need to be absolutely sure how these abstractions are implemented on each OS). Make sure to design the interface to your program carefully to avoid a situation where you might be forced into manipulating paths. Try to implement the algorithms for your program to similarly avoid the need to manipulate paths. Document to the user which API functions your program uses on each OS; OS API functions themselves become deprecated over time, so in future your program might not be compatible with all possible paths even if you are careful to avoid path manipulation.
2. Document the functions your program uses to manipulate paths
Document to the user exactly how paths will be manipulated. Then make it clear that it is the user's responsibility to specify paths that will work correctly with the documented program behavior.
3. Only support a restricted set of paths
Restrict the path mini-languages your program will accept until you are confident that you can correctly manipulate the subset of paths that meet this set of restrictions. Document this to the user. Error if paths are input that do not conform.
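To make the "restricted set" idea concrete, here is a small sketch of such a check. The particular restriction is an assumption chosen purely for illustration (absolute POSIX-style paths with a conservative character set); isSupportedPath is a hypothetical name, and a real program would pick and document its own rules.

```cpp
#include <cctype>
#include <string>

// Hypothetical restriction, chosen only for illustration: absolute
// POSIX-style paths whose components use letters, digits, '.', '_' and '-',
// with no empty components (so no doubled or trailing '/') and no "." or "..".
bool isSupportedPath(const std::string& path)
{
    if (path.size() < 2 || path.front() != '/')
        return false;
    std::string component;
    for (std::string::size_type i = 1; i <= path.size(); ++i)
    {
        if (i == path.size() || path[i] == '/')
        {
            // End of a component: reject the forms we do not support.
            if (component.empty() || component == "." || component == "..")
                return false;
            component.clear();
        }
        else
        {
            const unsigned char c = static_cast<unsigned char>(path[i]);
            if (!std::isalnum(c) && c != '.' && c != '_' && c != '-')
                return false;
            component += path[i];
        }
    }
    return true;
}
```

Anything that fails the check would be reported as an error rather than manipulated.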
4. Ignore the issues
Do some basic path manipulation without worrying too much. Accept that your program will exhibit undefined behavior for some paths that are input. You could document to the user that the program may or may not work when they input paths to it, and that it is the user's responsibility to ensure that the program has handled the input paths correctly. However, you could also not document anything. Users will commonly expect that your program will not handle some paths correctly (many don't) and therefore will cope well even without documentation.
Closing thoughts
It is important to decide on an effective strategy for working with paths early on in the life-cycle of your program. If you have to change how paths are handled later it may be difficult to avoid a change in behaviour that might break your program for existing users.
Try this:
ofstream outfile;
string createFile = "";
string path="/FULL_PATH";
createFile = path + "/" + "SAMPLE_FILENAME" + ".txt"; // path is already a std::string
outfile.open(createFile.c_str());
outfile.close();
//It works like a charm.
That needs to be done when you open the file, see std::ofstream constructor or open() member.
It's not too clear what you're asking; if I understand correctly, you're
given a filename, and you want to create the file in a specific
directory. If that's the case, all that's necessary is to specify the
complete path to the constructor of ofstream. You can use string
concatenation to build up this path, but I'd strongly recommend
boost::filesystem::path. It has all of the functions to do this
portably, and a lot more; otherwise, you'll not be portable (without a
lot of effort), and even simple operations on the filename will require
considerable thought.
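A minimal sketch of that approach, with one substitution: it uses C++17 std::filesystem::path, which was modelled on boost::filesystem::path, so that no external library is needed. The function name writeFileInDirectory is hypothetical, and the directory is assumed to exist already.

```cpp
#include <filesystem>
#include <fstream>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper: writes `data` to `fileName` inside `directory`.
// Returns false if the file cannot be opened (for example because the
// directory does not exist) or if the write fails.
bool writeFileInDirectory(const fs::path& directory,
                          const std::string& fileName,
                          const std::string& data)
{
    std::ofstream out(directory / fileName, std::ios::binary);
    if (!out)
        return false;
    out.write(data.data(), static_cast<std::streamsize>(data.size()));
    return static_cast<bool>(out.flush());
}
```

The operator/ call takes care of the separator, so the caller never concatenates "/" by hand.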
I was stuck on this for a while and have since figured it out. A relative path is resolved against the current working directory, which is usually the directory you launch the executable from. For this example assume you do an ls while in your executable directory and see:
myprogram.out Saves
Where Saves is a folder and myprogram.out is the program you are running.
In your code, if you are building the file name as a string and converting it with c_str() like this:
string file;
getline(cin, file, '\n');
ifstream thefile;
thefile.open( ("Saves/" + file + ".txt").c_str() );
and the user types in savefile, it would be
"Saves/savefile.txt"
which will work to get to savefile.txt in your Saves folder. Notice there is no leading slash; you just start with the folder name.
However if you are using a string literal like
ifstream thefile;
thefile.open("./Saves/savefile.txt");
it would be like this to get to the same folder:
"./Saves/savefile.txt"
Notice you start with a ./ in front of the folder name. Both spellings are relative to the current directory and refer to the same file.
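For what it's worth, "Saves/savefile.txt" and "./Saves/savefile.txt" name the same location relative to the current directory. A small sketch that checks this purely textually, assuming C++17 std::filesystem (sameRelativeLocation is a hypothetical name, and no file system access is involved):

```cpp
#include <filesystem>

namespace fs = std::filesystem;

// Hypothetical helper: compares two paths after lexical normalisation, which
// among other things removes "." components. Nothing is looked up on disk.
bool sameRelativeLocation(const fs::path& a, const fs::path& b)
{
    return a.lexically_normal() == b.lexically_normal();
}
```

A leading "/" is a different matter: "/Saves/savefile.txt" is an absolute path and would not compare equal to either of the two relative spellings.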
If you are using Linux, try execl() with the mv command.

CreateHardLink and CreateSymbolicLink Win32 Functions

I am completing a project to create dummy file systems for backup testing and need to develop a method of creating hard links and soft links within the structures.
The CreateHardLink and CreateSymbolicLink functions in windows.h receive file location and names based upon the current working directory.
The source code now changes directory, but those two functions do not successfully execute.
wstring hltarg;
hltarg = L"sym";
hltarg += ExistingFileName;
CreateHardLinkW(hltarg.c_str(), ExistingFileName.c_str(), NULL);
where hltarg concatenates the existing file name to the end of sym.
Because I moved my working directory to my target directory, neither of these strings contains a full path, but rather just the target file names.
Any advice on a different route to take rather than changing current directory?
The application will need to be portable so no hard references to file paths can be made, although desired file paths will be provided.