Getting a list of files and folders on the user's computer, with filenames filtered by a text string - C++

Currently I'm developing a project that should do the thing described above on Windows. My idea is to recursively go through all of the user's drives and collect all the information on them, but that seems to be really time-consuming. Is there a better way to do this (maybe using the OS's index file or the NTFS MFT)?
I use C++/Qt.

You can search for any of the many code examples for this and use one.
The library functions you would use, FindFirstFile and FindNextFile, are optimized and will go directly to the FAT. They are written by Microsoft, and I doubt that there is a faster way.
Btw, what do you mean by "filtered by the text line"? Do you mean you want only filenames matching a certain pattern (use the above) or files containing a string?

Related

get modified files after given timestamp in windows file system in Cpp code

Is there any way that I can get the files/folders modified after a given timestamp in the Windows file system? I don't want to traverse the entire file system and check which files/folders were modified in my code. Does Windows provide any API that returns the files/folders modified after a given timestamp?
No, there is no direct WinAPI to accomplish this.
I'd suggest traversing only certain folders (excluding folders such as Windows and ProgramData). Traverse only the folders that make sense, e.g. Users.
Why? Because the system files in Windows and such folders are accessed very frequently and are modified after system updates. Unless you're keen to see when the system files were modified, I'd say the data is going to be irrelevant and of no meaning.

Is there a faster alternative to enumerating folders than FindFirstFile/FindNextFile with C++?

I need to get all paths to subfolders within a folder (with WinAPIs and C++.) So far the only solution that I found is recursively calling FindFirstFile / FindNextFile but it takes a significant amount of time to do this on a folder with a deeper hierarchy.
So I was wondering, just to get folder names, is there a faster approach?
If you really just need subfolders you should be able to use FindFirstFileEx with search options to filter out non-directories.
The docs suggest this is an advisory flag only, but your filesystem may support this optimization - give it a try.
FindExSearchLimitToDirectories
"This is an advisory flag. If the file system supports directory filtering, the function searches for a file that matches the specified name and is also a directory. If the file system does not support directory filtering, this flag is silently ignored."
A faster approach would be to bypass the FindFirstFile...() API and go straight to the file system directly. You can use DeviceIoControl() with the FSCTL_ENUM_USN_DATA control to access the master file table, at least on NTFS formatted volumes. With that information, you can directly access the records for files/folders, which includes their attributes, parent info, etc. Yes, it would be more work, but it should also be faster since you can optimize the code to access just the pieces you need.
That is the fastest approach you can come across. Also, you may consider using another thread to manage directory enumeration, as it takes a lot of time. Even Microsoft's File Explorer spends some time on it if the directory has a lot of subfolders/files.
One more thing: you can enumerate the directories once and then register for updates, so the cost of enumerating the folder is paid only once, at startup.

Opening a File with different text editors

Apparently this is supposed to be possible: for example, opening and operating on a file with Notepad or HxD. But aren't they all text files? How would one specify which editor should open and operate on the file using the Windows API? It is certainly not in "CreateFile".
Hopefully I'm understanding your question... The easiest way to do this is to launch the desired editor and pass the filename as an argument, rather than "invoking" the file (which will launch the default program associated with the file type).
For example, notepad.exe mytextfile.txt or gvim.exe mytextfile.txt.
If the editor is not on your %PATH%, you'll need to use a full path file name.
What are you trying to do, exactly? You could:
Maintain a list of editors that you expect to be installed and have entries for in the system's PATH (bad idea)
Have an editor or editors that you want to use, query the Windows registry to find their installation paths (using RegGetValue), and launch the editor with CreateProcess (a slightly better idea)
Query the registry to get the default editor for a given file type and then launch that editor using CreateProcess. (best idea)
But it all depends on what your goal is really.
Edit based on requirements
So, just so we're on the same page, from C++, you want to:
Take a command line parameter to your C++ application (filename)
Open that file in an arbitrary editor
Detect when the user has made changes to that file
Operate on the file contents
Is that correct?
If so, you could:
Use Boost libs to compute a CRC for the current data in the file
Launch an editor using one of the methods I initially described
Loop, sleeping between iterations so you don't chew up resources, for as long as the initially computed CRC matches the one recalculated on each iteration
Of course, there are all kinds of issues that you'd have to deal with (that's just a super simple way of describing the algorithm I might use), such as:
What happens if the user doesn't change the file?
What happens if the file isn't found?
I'm sure that there are a number of different methods of doing this, but this is the easiest method that I can think of at the moment (while still being able to be fairly certain of the changes).
Disclaimer: I haven't implemented something like this, so I might be completely off base ;)
Are you looking for the ShellExecute() or ShellExecuteEx() APIs on Windows? They'll launch whatever program is registered for a file (generally based on the filename extension).

Methods of storing application data/settings without the registry?

I need some methods of storing and getting data from a file (in WIN32 api c++ application, not MFC or .NET)
e.g. saving the x, y, width and height of the window when you close it, and loading the data when you open the window.
I have tried .ini files, with the functions WritePrivateProfileString and GetPrivateProfileString/Int, but on MSDN it says
"This function is provided only for compatibility with 16-bit Windows-based applications. Applications should store initialization information in the registry."
and when I tried to read an .ini file on my Windows 7 64-bit machine, I got a blue screen! (in debug mode with Visual Studio) O.O
I notice that most other applications use XML to store data, but I don't have a clue how to read/write XML data in C++. Are there any libraries or Windows functions which will allow me to use XML data?
Any other suggestions would be good too, thanks.
There is nothing wrong with .ini files; the only problem with them is where to write them. CIniFile from CodeProject is a good enough class. The .ini file should be placed in %APPDATA%\<Name Of Your Application> (or %LOCALAPPDATA%\<Same Name Here>, as described below).
EDIT: If we are talking about the Windows family of operating systems from Windows 2000 onward, then the function SHGetFolderPath is a portable way to retrieve the user-specific folder where application configuration files should be stored. To store data in the roaming folder, use CSIDL_APPDATA with SHGetFolderPath. To store data in the local folder, use CSIDL_LOCAL_APPDATA.
The difference between the local and roaming folders lies in the nature of the data to be stored. If the data is too large or machine-specific, store it in the local folder. Your data (coordinates and size of the window) are local in nature (on another machine you may have a different resolution), so you should actually use CSIDL_LOCAL_APPDATA.
Windows Vista and later provide the extended function SHGetKnownFolderPath with its own set of constants, but if you seek compatibility, stick to the former SHGetFolderPath.
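If you only want to see the shape of the path handling, here is a small sketch, assuming C++17. It reads the LOCALAPPDATA environment variable instead of calling SHGetFolderPath, on the assumption that the variable is set for the user (it normally is on Vista and later, but that is not guaranteed). settings_file, local_appdata_dir and the settings.ini name are invented for the example, and MSVC may warn about std::getenv.

```cpp
#include <cstdlib>
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Build "<base_dir>/<app_name>/settings.ini". Pure path arithmetic: nothing
// is created on disk here.
fs::path settings_file(const fs::path& base_dir, const std::string& app_name) {
    return base_dir / app_name / "settings.ini";
}

// Per-user local data directory taken from the environment, or an empty
// path if the variable is not set (for example on non-Windows systems).
fs::path local_appdata_dir() {
    const char* value = std::getenv("LOCALAPPDATA");
    return value ? fs::path(value) : fs::path();
}
```

Before writing, check that local_appdata_dir() is not empty and call fs::create_directories on the parent of the returned settings path. For production code the SHGetFolderPath route described above is the safer choice.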
TinyXML is a popular and simple XML parser for C++.
Apart from that, you can really use any format you want to store your settings, though it's considered good practice to keep settings in text format so that they can be hand-edited if necessary.
It's fairly simple to write your own functions for reading/writing a file in INI or similar format. The format is entirely up to you, as long as it's easily comprehensible to humans. Some possibilities are:
; Comment
# Comment
Key = Value (standard INI format)
Key Value
Key: Value
You could use Boost.PropertyTree for this.
"Property trees are versatile data structures, but are particularly suited for holding configuration data. The tree provides its own, tree-specific interface, and each node is also an STL-compatible Sequence for its child nodes."
It supports serialization, and so is well-suited to managing and persisting changeable configuration data. There is an example here on how to load and save using the XML data format that this library supports.
The library uses RapidXML internally but hides the parsing and encoding details, which would save you some implementation time (XML libraries all have their idiosyncrasies), while still allowing you to use XML as the data representation on disk.
libxml2. I have seen it used in quite a lot of places. It is easy to use, has loads of examples to get you started, and is not a vast library as such. And it is written in C, so you can take it wherever you want.
pugixml is another good (and well documented) XML parser library. And if you care about portability, XML is a better option.
While INI files may not be the best format, I think you can safely ignore the warning MSDN puts on WritePrivateProfileString and GetPrivateProfileString.
Those two functions are NEVER going away. It would break THOUSANDS of applications.
That warning has been there for years and I suspect was added when the registry was all the rage and someone naively thought it would one day completely replace INI files.
I might be wrong but it would be very unlike Microsoft to break so many existing apps like this for no good reasons. (Not that they do not occasionally break backwards compatibility, but this would cause huge problems for zero benefit.)
Ohhh My GOD! Have you ever thought of a straightforward solution rather than thinking of the Super-Duper-all-can-do framework way?
Sorry...
You want to store two numbers between restarts???
Save: Open a file, write these two numbers, close the file:
std::ofstream out(file_name);
out << x << ' ' << y;
out.close();
Load: Open a file, read these two numbers, close the file:
std::ifstream in(file_name);
if(!in) return error...
in >> x >> y;
if(!in) return error...
in.close();
Libconfig is the best solution in C++ as far as I have tried.
Works multi platform with minimum coding.
You must try that!
I like the TinyXML solution suggested.
But for Windows, I like .ini even more.
So I'll suggest the inih library, free and open source on GitHub. Very simple and easy to use - 1 header file library iirc.

HD Regular Expression Search

I am working on a project for my computer security class and I have a couple questions. I had an idea to write a program that would search the whole hard drive looking for email addresses. I am just looking for addresses stored in plain text since it would be hard to find anything otherwise. I figured the best way to find addresses would be to use a regular expression.
I wrote an application in C# that works fairly well, but I would like to see if anyone has any better ideas. I am completely up for writing this in another language since I'm assuming C# isn't the best for this type of thing. So far the application I created just starts at C:\ and recursively locates all files on the drive, skipping those that aren't accessible. It also skips all common image, video, audio, and compressed files, and files over 512 MB. This speeds it up quite a bit, but there is a small chance that a large file could contain something useful. It takes about 12 seconds to generate the list of files and I'm guessing about an hour to check them all. One downside is that it uses about 50% CPU while scanning.
I'm looking for ideas on how to improve the search. Is there a faster way, a more efficient way, a more thorough way, things like that? I was trying to think if there was any way that you could tell if the file would contain plain text strings or not. Just let me know if you have any cool ideas. Thanks.
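For reference, the per-file matching step described above could look roughly like this in C++17. The pattern is a deliberately loose approximation of an email address (not a full RFC 5322 matcher), and extract_emails and scan_file are hypothetical names. Be aware that std::regex is not known for speed and can struggle with very long lines, so this is a sketch of the idea, not a tuned scanner.

```cpp
#include <fstream>
#include <regex>
#include <set>
#include <string>

// Return the distinct substrings of `text` that look like email addresses.
std::set<std::string> extract_emails(const std::string& text) {
    static const std::regex pattern(
        R"([A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,})");
    std::set<std::string> found;
    const std::sregex_iterator end;
    for (std::sregex_iterator it(text.begin(), text.end(), pattern); it != end; ++it) {
        found.insert(it->str());
    }
    return found;
}

// Scan a file one line at a time. This keeps memory use low only when the
// file actually contains reasonably short lines. A file that cannot be
// opened simply yields an empty result.
std::set<std::string> scan_file(const std::string& path) {
    std::set<std::string> found;
    std::ifstream in(path);
    std::string line;
    while (std::getline(in, line)) {
        const auto hits = extract_emails(line);
        found.insert(hits.begin(), hits.end());
    }
    return found;
}
```

Feeding scan_file the paths from the existing file list (after the extension and size filters) reproduces the pipeline described in the question.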
To be honest, the easiest existing way to do this is to use grep. As you improve your program, compare your speeds to it, and when you get close, stop worrying about optimizing. Alternatively, take a look at its source for an example of an existing product that does what you're looking for.
As noted elsewhere, tools already exist for this if you install Win32 ports of UNIX tools. Alternatively, the Windows equivalent is:
for /r c:\ %i in (*.*) do findstr /i /r "regular expression" "%i"
You should just use grep + find. grep is optimized for searching files fast, and find is optimized for providing lists of appropriate files for things like this. People have spent a long time optimizing these tools, so there is no need to reinvent the wheel.