File search APIs on Linux - c++

In my project, I need to list all files on the user's drive whose names match a given text string. Are there any APIs for this?
On Windows, I know there are the FindFirstFile and FindNextFile functions in WinAPI.
I use C++/Qt.

There's ftw() (and its newer variant nftw()), and Linux has fts().
Besides those, you can iterate over directories yourself, using e.g. opendir()/readdir().
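As a rough illustration of the nftw() route, here is a minimal sketch that walks a tree and keeps the entries whose file name matches a shell-style wildcard via fnmatch(). The names find_by_name and on_entry are made up for this example, and thread-local variables are used only because nftw() offers no way to pass user data to its callback.

```cpp
#include <fnmatch.h>  // fnmatch(): shell-style wildcard matching
#include <ftw.h>      // nftw(): POSIX file tree walk
#include <string>
#include <vector>

namespace {
// nftw() has no user-data argument, so the callback reaches its
// input and output through thread-local state.
thread_local const char* g_pattern = nullptr;
thread_local std::vector<std::string>* g_matches = nullptr;

int on_entry(const char* path, const struct stat*, int type, struct FTW* info) {
    // info->base is the offset of the file name inside path.
    if (type == FTW_F && fnmatch(g_pattern, path + info->base, 0) == 0) {
        g_matches->push_back(path);
    }
    return 0;  // returning 0 tells nftw() to keep walking
}
}  // namespace

// Hypothetical helper: collects non-directory entries under `root`
// whose file name matches the wildcard `pattern` (e.g. "*.txt").
std::vector<std::string> find_by_name(const std::string& root,
                                      const std::string& pattern) {
    std::vector<std::string> matches;
    g_pattern = pattern.c_str();
    g_matches = &matches;
    // FTW_PHYS: do not follow symbolic links; 16: max descriptors nftw() may hold open.
    nftw(root.c_str(), on_entry, 16, FTW_PHYS);
    g_matches = nullptr;
    g_pattern = nullptr;
    return matches;
}
```

Calling find_by_name("/home", "*report*") would return the full paths of matching entries; a root that cannot be opened simply yields an empty vector in this sketch.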

Qt provides the QDirIterator class:
QDirIterator iter("/", QDirIterator::Subdirectories);
while (iter.hasNext()) {
QString current = iter.next();
// Do something with 'current'...
}

If you are looking for a Unix command, you could do this:
find source_dir -name 'pattern'
(-name takes a shell wildcard pattern such as '*.txt', not a regular expression; GNU find has -regex for that.)
If you want to do it C++ style, I'd suggest using boost::filesystem. It's a very powerful cross-platform library.
Of course, you will have to add an additional library.
Here is an example:
#include <boost/filesystem.hpp>
#include <regex>
#include <stdexcept>
#include <string>
#include <vector>

std::vector<std::string> list_files(const std::string& root, bool recursive, const std::string& filter, bool regularFilesOnly)
{
    namespace fs = boost::filesystem;
    fs::path rootPath(root);
    // Throw an exception if the path doesn't exist or isn't a directory.
    // (std::exception has no string constructor in standard C++, hence std::runtime_error.)
    if (!fs::exists(rootPath)) {
        throw std::runtime_error("rootPath does not exist");
    }
    if (!fs::is_directory(rootPath)) {
        throw std::runtime_error("rootPath is not a directory.");
    }
    // List all the files in the directory
    const std::regex regexFilter(filter);
    std::vector<std::string> fileList;
    fs::directory_iterator end_itr;
    for (fs::directory_iterator it(rootPath); it != end_itr; ++it) {
        const std::string filepath(it->path().string());
        if (fs::is_directory(it->status())) {
            // For a directory: directory_iterator never yields "." or "..",
            // so no extra check is needed before recursing
            if (recursive) {
                // List the files in the directory
                auto currentDirFiles = list_files(filepath, recursive, filter, regularFilesOnly);
                // Add to the end of the current vector
                fileList.insert(fileList.end(), currentDirFiles.begin(), currentDirFiles.end());
            }
        } else if (fs::is_regular_file(it->status())) {
            // For a regular file: the regex has to match the whole path, not just the file name
            if (!filter.empty() && !std::regex_match(filepath, regexFilter)) {
                continue;
            }
        }
        if (regularFilesOnly && !fs::is_regular_file(it->status())) {
            continue;
        }
        // Add the file or directory to the list
        fileList.push_back(filepath);
    }
    return fileList;
}
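If an extra library is not wanted, roughly the same thing can be sketched with std::filesystem from C++17. This is an illustration rather than a drop-in replacement: list_matching is a made-up name, the regex is matched against the file name only (not the whole path), and errors are swallowed via std::error_code instead of being thrown.

```cpp
#include <filesystem>
#include <regex>
#include <string>
#include <system_error>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical helper: entries under `root` whose file name matches
// `name_regex`. Directories are included unless regular_files_only is set.
std::vector<fs::path> list_matching(const fs::path& root,
                                    const std::string& name_regex,
                                    bool regular_files_only) {
    std::vector<fs::path> result;
    const std::regex re(name_regex);
    std::error_code ec;
    fs::recursive_directory_iterator it(
        root, fs::directory_options::skip_permission_denied, ec);
    const fs::recursive_directory_iterator end;
    while (!ec && it != end) {
        std::error_code type_ec;
        const bool is_file = it->is_regular_file(type_ec);
        if ((is_file || !regular_files_only) &&
            std::regex_match(it->path().filename().string(), re)) {
            result.push_back(it->path());
        }
        it.increment(ec);  // non-throwing advance; the loop stops on error
    }
    return result;
}
```

For example, list_matching("/home/user", ".*\\.txt", true) would collect regular files ending in .txt. With older GCC releases, linking may additionally need -lstdc++fs.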

You can also use glob():
http://man7.org/linux/man-pages/man3/glob.3.html
It has the advantage of existing on a lot of Unices (Solaris for sure), as it is part of POSIX.
OK, it's not C++ but pure C.
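A small C++ wrapper around glob() might look like the sketch below. glob_paths is a name invented for this example. Note that glob() expands a pattern such as "/var/log/*.log" within the directories named in the pattern; it does not descend into subdirectories on its own.

```cpp
#include <glob.h>
#include <string>
#include <vector>

// Hypothetical wrapper around POSIX glob(3): returns the paths matching
// `pattern`, or an empty vector when nothing matches or an error occurs.
std::vector<std::string> glob_paths(const std::string& pattern) {
    std::vector<std::string> paths;
    glob_t results{};  // zero-initialised so globfree() is safe on every path
    if (glob(pattern.c_str(), 0, nullptr, &results) == 0) {
        for (std::size_t i = 0; i < results.gl_pathc; ++i) {
            paths.emplace_back(results.gl_pathv[i]);
        }
    }
    globfree(&results);  // release the memory allocated by glob()
    return paths;
}
```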

Look at man find. find supports filtering by a mask (the -name option, for example).

Related

C++: browse filenames in Windows which have dots inside the names

Let's take as example the following file name: abc.def.txt. I'm using FindFirstFileA() and FindNextFileA() in order to browse a directory. The function looks like this:
std::vector<std::string>* readDir(std::string pattern) {
auto v = new std::vector<std::string>;
pattern.append("\\*");
WIN32_FIND_DATAA data;
HANDLE hFind;
if ((hFind = FindFirstFileA(pattern.c_str(), &data)) != INVALID_HANDLE_VALUE) {
do {
v->push_back(data.cFileName);
} while (FindNextFileA(hFind, &data) != 0);
FindClose(hFind);
}
return v;
}
The problem is that the filename mentioned earlier will appear as abcdef.txt or abc def.txt. What can I do to get the names of files with their dots?
It looks like you're finding the old 8.3 name. You'd want FindFirstFileEx with level FindExInfoBasic. This ignores 8.3 names.

Using cv::glob for multiple extensions

I'm trying to use cv::glob to find images in a folder system. What I now want is to search for multiple file extensions at once (let's say .jpg and .png). Is there a way to do this?
The opencv documentation on this method doesn't specify the pattern parameter.
At the moment I'm using the ugly and inefficient method of searching for each extension separately and combining the results. See:
vector<cv::String> imageNames;
vector<string> allowedExtensions = { ".jpg", ".png" };
for (int i = 0; i < allowedExtensions.size(); i++) {
vector<cv::String> imageNamesCurrentExtension;
cv::glob(
inputFolder + "*" + allowedExtensions[i],
imageNamesCurrentExtension,
true
);
imageNames.insert(
imageNames.end(),
imageNamesCurrentExtension.begin(),
imageNamesCurrentExtension.end()
);
}
Neither OpenCV nor the OS file system API has a built-in way to do that. You can improve your code by eliminating the multiple iterations over the folder(s). For example, you can use boost::filesystem::recursive_directory_iterator and a filter function to retrieve all the files in a single iteration.
Here is a sample:
#include <boost/filesystem.hpp>
#include <set>
#include <string>
#include <vector>
namespace fs = ::boost::filesystem;
void GetPictures(const fs::path& root, const std::set<std::string>& exts, std::vector<fs::path>& result)
{
    if(!fs::exists(root) || !fs::is_directory(root))
    {
        return;
    }
    fs::recursive_directory_iterator it(root);
    fs::recursive_directory_iterator endit;
    while(it != endit)
    {
        // extension() returns a path (e.g. ".jpg"); convert it to a string for the set lookup
        if(fs::is_regular_file(*it) && exts.find(it->path().extension().string()) != exts.end())
        {
            result.push_back(it->path());
        }
        ++it;
    }
}
It's only a sample, you should take care of string casing, error handling, etc.
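To make the remark about casing and error handling concrete, here is a sketch of the same single-pass idea written against std::filesystem (C++17) rather than Boost. The names has_wanted_extension and collect_by_extension are hypothetical, the case folding is ASCII-only, and errors are reported through std::error_code and simply end the walk.

```cpp
#include <algorithm>
#include <cctype>
#include <filesystem>
#include <set>
#include <string>
#include <system_error>
#include <vector>

namespace fs = std::filesystem;

// True if the path's extension, lowercased (ASCII only), is in `exts`.
// `exts` is expected to hold lowercase entries including the leading dot.
bool has_wanted_extension(const fs::path& p, const std::set<std::string>& exts) {
    std::string ext = p.extension().string();
    std::transform(ext.begin(), ext.end(), ext.begin(), [](unsigned char c) {
        return static_cast<char>(std::tolower(c));
    });
    return exts.count(ext) > 0;
}

// Walks `root` once and collects regular files with a wanted extension.
std::vector<fs::path> collect_by_extension(const fs::path& root,
                                           const std::set<std::string>& exts) {
    std::vector<fs::path> result;
    std::error_code ec;
    fs::recursive_directory_iterator it(
        root, fs::directory_options::skip_permission_denied, ec);
    const fs::recursive_directory_iterator end;
    while (!ec && it != end) {
        std::error_code type_ec;
        if (it->is_regular_file(type_ec) && has_wanted_extension(it->path(), exts)) {
            result.push_back(it->path());
        }
        it.increment(ec);  // non-throwing advance
    }
    return result;
}
```

The resulting paths could then be converted with .string() and handed to cv::imread one by one.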

The best way to ignore files with other extensions when using the C++ experimental <filesystem>?

With the future C++, is there a better way to ignore files with other than wanted extensions than the one shown in the code snippet below?
I am learning the C++ experimental <filesystem> (http://en.cppreference.com/w/cpp/experimental/fs) while writing a simple program that transforms text files from one directory into text files in another directory. The program takes the input and output directories via command-line arguments. Only files with certain extensions (like .csv, .txt, ...) should be processed. The output files should have the .xxx extension.
#include <filesystem>
namespace fs = std::tr2::sys; // the implementation from Visual Studio 2015
...
fs::path srcpath{ argv[1] };
fs::path destpath{ argv[2] };
...
for (auto name : fs::directory_iterator(srcpath))
{
if (!fs::is_regular_file(name))
continue; // ignore the non-files
fs::path fnameIn{ name }; // input file name
// Ignore unwanted extensions (here lowered because of Windows).
string ext{ lower(fnameIn.extension().string()) };
if (ext != ".txt" && ext != ".csv")
continue;
// Build the output filename path.
fs::path fnameOut{ destpath / fnameIn.filename().replace_extension(".xxx") };
... processing ...
}
Basically, your question boils down to, "given a string, how do I determine if it matches one of a number of possibilities?" That's pretty trivial: put the possibilities in a std::set:
//Before loop
std::set<std::string> wanted_exts = {".txt", ".csv"};
//In loop
string ext{ lower(fnameIn.extension().string()) };
if (wanted_exts.find(ext) == wanted_exts.end())
continue;
You can of course keep wanted_exts around for as long as you like, since it probably won't change. Also, if you have Boost.Container, I would suggest making wanted_exts a flat_set. That will help minimize allocations.
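Without Boost, the idea behind a flat_set (a sorted contiguous sequence searched with binary search) can be approximated with the standard library alone. The sketch below is an assumption-laden illustration: the question never shows its lower helper, so an ASCII-only stand-in is defined here, and kWantedExts and is_wanted are names invented for the example.

```cpp
#include <algorithm>
#include <array>
#include <cctype>
#include <string>
#include <string_view>

// ASCII-only stand-in for the `lower` helper used in the question
// (its real definition is not shown there).
std::string lower(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
        return static_cast<char>(std::tolower(c));
    });
    return s;
}

// Sorted, fixed-size list of wanted extensions: no heap allocation.
// The entries must stay in ascending order for binary_search to work.
constexpr std::array<std::string_view, 2> kWantedExts{".csv", ".txt"};

// True if `ext` (e.g. ".TXT") is one of the wanted extensions, ignoring ASCII case.
bool is_wanted(const std::string& ext) {
    const std::string folded = lower(ext);
    return std::binary_search(kWantedExts.begin(), kWantedExts.end(),
                              std::string_view(folded));
}
```

For two or three extensions the difference from std::set is negligible; the layout mostly pays off when the list grows or the lookup sits in a hot loop.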
std::tr2::sys was the namespace MSVC used in VS2013 to ship the filesystem TS, but that is actually supposed to be in the std::experimental::v1 namespace; the old namespace has been retained for backwards compatibility. v1 is an inline namespace, so you can drop that from the name and say
namespace fs = std::experimental::filesystem;
Assuming using boost is an option, you can perform filtering of the directory entries using Boost.Range adaptors. And testing for any one of several extensions can be done using boost::algorithm::any_of_equal.
#include <boost/algorithm/cxx11/any_of.hpp>
#include <boost/range/adaptors.hpp>
for(auto const& p :
boost::make_iterator_range(fs::directory_iterator(srcpath), {})
| boost::adaptors::transformed([](auto const& d) {
return fs::path(d); })
| boost::adaptors::filtered([](auto const& p) {
return fs::is_regular_file(p); })
| boost::adaptors::filtered([](auto const& p) {
auto const& exts = { ".txt", ".csv" };
return boost::algorithm::any_of_equal(exts, p.extension().string()); })
) {
// all filenames here will have one of the extensions you tested for
}
The loop that I finally settled on:
#include <filesystem>
namespace fs = std::experimental::filesystem;
...
set<string> extensions{ ".txt", ".csv" };
for (auto const& name : fs::directory_iterator(srcpath))
{
if (!fs::is_regular_file(name))
continue;
fs::path fnameIn{ name };
string ext{ lower(fnameIn.extension().string()) };
if (extensions.find(ext) != extensions.end())
{
fs::path fnameOut{ destpath / fnameIn.filename().replace_extension(".xxx") };
processing(fnameIn, fnameOut);
}
}

directory structures C++

C:\Projects\Logs\RTC\MNH\Debug
C:\Projects\Logs\FF
Is there an expression/string that would say go back until you find "Logs" and open it? (assuming you were always below it)
The same executable is run out of "Debug", "MNH" or "FF" at different times, and it should always save its log files into "Logs".
What expression would get there WITHOUT referring to the entire path C:\Projects\Logs?
Thanks.
You might have luck using the boost::filesystem library.
Without a compiler (and ninja-copies from boost documentation), something like:
#include <boost/filesystem.hpp>
#include <string>
namespace fs = boost::filesystem;
bool contains_folder(const fs::path& path, const std::string& folder)
{
    // replace with a recursive iterator to check within
    // sub-folders. in your case you just want to continue
    // up through the parent paths, though
    typedef fs::directory_iterator dir_iter;
    dir_iter end_iter; // default construction yields past-the-end
    for (dir_iter iter(path); iter != end_iter; ++iter)
    {
        if (fs::is_directory(iter->status()))
        {
            if (iter->path().filename() == folder)
            {
                return true;
            }
        }
    }
    return false;
}
fs::path find_folder(const fs::path& path, const std::string& folder)
{
    if (contains_folder(path, folder))
    {
        return path / folder; // operator/ inserts the directory separator
    }
    fs::path searchPath = path.parent_path();
    while (!searchPath.empty())
    {
        if (contains_folder(searchPath, folder))
        {
            return searchPath / folder;
        }
        searchPath = searchPath.parent_path();
    }
    return fs::path(); // empty path: not found
}
int main(void)
{
    fs::path logPath = find_folder(fs::initial_path(), "Logs");
    if (logPath.empty())
    {
        // not found
    }
}
For now this is completely untested :)
It sounds like you're asking about a relative path.
If the working directory is C:\Projects\Logs\RTC\MNH\Debug\, the path ..\..\..\file represents a file in the Logs directory.
If you might be in either C:\Projects\Logs\RTC\MNH\ or C:\Projects\Logs\RTC\MNH\Debug\, then no single expression will get you back to Logs from either place. You could try checking for the existence of ..\..\..\..\Logs and, if that doesn't exist, try ..\..\..\Logs, ..\..\Logs and ..\Logs. Whichever one exists tells you how "deep" you are and how many ..s are required to get back to Logs.
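Since the executable always runs somewhere below Logs, another option is to walk up the working directory's own path until a component named Logs is reached, which needs no directory listing at all. Here is a minimal sketch using std::filesystem (C++17); find_ancestor_named is a made-up name, and the comparison is case-sensitive, which may or may not be what is wanted on Windows.

```cpp
#include <filesystem>
#include <optional>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper: returns `start` or its nearest ancestor whose last
// component equals `name`, or std::nullopt if no such ancestor exists.
std::optional<fs::path> find_ancestor_named(const fs::path& start,
                                            const std::string& name) {
    fs::path p = start;
    while (!p.empty()) {
        if (p.filename().string() == name) {
            return p;
        }
        fs::path parent = p.parent_path();
        if (parent == p) {
            break;  // reached the root: it is its own parent
        }
        p = parent;
    }
    return std::nullopt;
}
```

In the program itself this would be called as find_ancestor_named(fs::current_path(), "Logs"), and the log file name appended to the result with operator/.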

Recursive file search using C++ MFC?

What is the cleanest way to recursively search for files using C++ and MFC?
EDIT: Do any of these solutions offer the ability to use multiple filters in one pass? I guess with CFileFind I could filter on *.* and then write custom code to further filter into different file types. Does anything offer built-in multiple filters (i.e. *.exe,*.dll)?
EDIT2: Just realized that an obvious assumption I was making invalidates my previous EDIT. If I am trying to do a recursive search with CFileFind, I have to use *.* as my wildcard, because otherwise subdirectories won't be matched and no recursion will take place. So filtering on different file extensions will have to be handled separately regardless.
Using CFileFind.
Take a look at this example from MSDN:
void Recurse(LPCTSTR pstr)
{
CFileFind finder;
// build a string with wildcards
CString strWildcard(pstr);
strWildcard += _T("\\*.*");
// start working for files
BOOL bWorking = finder.FindFile(strWildcard);
while (bWorking)
{
bWorking = finder.FindNextFile();
// skip . and .. files; otherwise, we'd
// recur infinitely!
if (finder.IsDots())
continue;
// if it's a directory, recursively search it
if (finder.IsDirectory())
{
CString str = finder.GetFilePath();
cout << (LPCTSTR) str << endl;
Recurse(str);
}
}
finder.Close();
}
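Because the search itself has to use *.* for the recursion to work, filtering by several masks ends up being a per-file check on the name. Below is a portable sketch of such a check in plain C++17; wildcard_match and matches_any are invented names, only '*' and '?' are supported, the comparison ignores ASCII case, and Windows special cases (such as *.* also matching names without a dot) are not reproduced.

```cpp
#include <cctype>
#include <cstddef>
#include <string_view>

// Case-insensitive (ASCII) wildcard match supporting '*' and '?'.
bool wildcard_match(std::string_view pattern, std::string_view text) {
    const auto same = [](char a, char b) {
        return std::tolower(static_cast<unsigned char>(a)) ==
               std::tolower(static_cast<unsigned char>(b));
    };
    std::size_t p = 0, t = 0, star = std::string_view::npos, mark = 0;
    while (t < text.size()) {
        if (p < pattern.size() && pattern[p] == '*') {
            star = p++;  // remember the '*', first try it matching nothing
            mark = t;
        } else if (p < pattern.size() &&
                   (pattern[p] == '?' || same(pattern[p], text[t]))) {
            ++p;
            ++t;
        } else if (star != std::string_view::npos) {
            p = star + 1;  // backtrack: the last '*' absorbs one more character
            t = ++mark;
        } else {
            return false;
        }
    }
    while (p < pattern.size() && pattern[p] == '*') ++p;
    return p == pattern.size();
}

// True if `name` matches any pattern in a ';'-separated list such as "*.exe;*.dll".
bool matches_any(std::string_view name, std::string_view patterns) {
    std::size_t start = 0;
    while (start <= patterns.size()) {
        std::size_t end = patterns.find(';', start);
        if (end == std::string_view::npos) end = patterns.size();
        const std::string_view one = patterns.substr(start, end - start);
        if (!one.empty() && wildcard_match(one, name)) return true;
        start = end + 1;
    }
    return false;
}
```

Inside a loop like the one above, the file name from finder.GetFileName() would be converted to a narrow string and passed to matches_any for every non-directory entry.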
Use Boost's Filesystem implementation!
The recursive example is even on the filesystem homepage:
bool find_file( const path & dir_path, // in this directory,
const std::string & file_name, // search for this name,
path & path_found ) // placing path here if found
{
if ( !exists( dir_path ) ) return false;
directory_iterator end_itr; // default construction yields past-the-end
for ( directory_iterator itr( dir_path );
itr != end_itr;
++itr )
{
if ( is_directory(itr->status()) )
{
if ( find_file( itr->path(), file_name, path_found ) ) return true;
}
else if ( itr->path().filename() == file_name ) // older Boost.Filesystem versions spelled this itr->leaf()
{
path_found = itr->path();
return true;
}
}
return false;
}
I know it is not your question, but it is also easy to do without recursion by using a CStringArray:
void FindFiles(CString srcFolder)
{
CStringArray dirs;
dirs.Add(srcFolder + "\\*.*");
while(dirs.GetSize() > 0) {
CString dir = dirs.GetAt(0);
dirs.RemoveAt(0);
CFileFind ff;
BOOL good = ff.FindFile(dir);
while(good) {
good = ff.FindNextFile();
if(!ff.IsDots()) {
if(!ff.IsDirectory()) {
//process file
} else {
//new directory (and not . or ..)
dirs.InsertAt(0, ff.GetFilePath() + "\\*.*"); // queue the subdirectory that was just found
}
}
}
ff.Close();
}
}
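The same work-list idea is not tied to MFC. Here is a sketch of it using std::filesystem (C++17) instead of CFileFind and CStringArray; walk_iterative is a hypothetical name, and the function returns every entry it sees (files and directories alike) so that the caller can filter afterwards.

```cpp
#include <filesystem>
#include <system_error>
#include <vector>

namespace fs = std::filesystem;

// Visits every entry under `root` without recursion, using an explicit
// stack of directories that still have to be scanned.
std::vector<fs::path> walk_iterative(const fs::path& root) {
    std::vector<fs::path> found;
    std::vector<fs::path> pending{root};
    while (!pending.empty()) {
        const fs::path dir = pending.back();
        pending.pop_back();
        std::error_code ec;
        fs::directory_iterator it(dir, ec);  // unreadable directories are skipped
        const fs::directory_iterator end;
        while (!ec && it != end) {
            found.push_back(it->path());
            std::error_code type_ec;
            // symlink_status(): symlinked directories are listed but not
            // entered, which avoids cycles.
            if (fs::is_directory(it->symlink_status(type_ec))) {
                pending.push_back(it->path());
            }
            it.increment(ec);
        }
    }
    return found;
}
```

Using the back of the vector as a stack gives a depth-first order; taking entries from the front instead would give breadth-first traversal.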
Check out the recls library (the name stands for recursive ls), a recursive search library that works on UNIX and Windows. It's a C library with adaptations for different languages, including C++. From memory, you can use it something like the following:
using recls::search_sequence;
CString dir = "C:\\mydir";
CString patterns = "*.doc;abc*.xls";
CStringArray paths;
search_sequence files(dir, patterns, recls::RECURSIVE);
for(search_sequence::const_iterator b = files.begin(); b != files.end(); b++) {
paths.Add((*b).c_str());
}
It'll find all .doc files, and all .xls files beginning with abc in C:\mydir or any of its subdirectories.
I haven't compiled this, but it should be pretty close to the mark.
CString strNextFileName , strSaveLog= "C:\\mydir";
Find.FindFile(strSaveLog);
BOOL l = Find.FindNextFile();
if(!l)
MessageBox("");
strNextFileName = Find.GetFileName();
It's not working. Find.FindNextFile() returns false even though files are present in the same directory.