C++ multiple files with common name beginning

Is there any way to open files that share a common name prefix without specifying their full names or how many there are? They should be opened one by one, not all at once.
For example, I have files:
certainstr_123.txt, certainstr_5329764.txt, certainstr_1323852.txt.
Or, maybe, it can be done more easily in another language?
Thank you.

C++ doesn't define a standard way for listing files in that way.
The best cross-platform approach is to use a library such as the boost filesystem module. I don't think boost::filesystem has wildcard search, so you have to filter the files yourself, but that isn't difficult.
You could use regular expressions, like in the other answer (it's the perfect-fit solution).
It is probably enough to check the file extension (i->path().extension()) and that the filename starts with "certainstr_" (boost::starts_with or std::string::substr).
If you choose the C++11 standard regex library, make sure you have a recent libstdc++ (std::regex is only fully implemented from GCC 4.9 onwards).
There are a lot of system specific functions. E.g. see:
How can I get the list of files in a directory using C or C++?
How do I get a list of files in a directory in C++?
for some Unix/Windows examples.
You could also try something like (Windows):
std::system("dir /b certainstr_*.txt > list.txt");
or (Unix):
std::system("ls -1 certainstr_*.txt > list.txt");
parsing the list.txt output file (of course this is a hack).
Anyway, depending on your needs, Python-based (or script-based) solutions could be simpler (see also How to list all files of a directory?):
glob.glob('certainstr_*.txt')
or also:
files = [f for f in os.listdir('.') if re.match(r'certainstr_\d+\.txt$', f)]
This is the Python equivalent of https://stackoverflow.com/a/26585425/3235496
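Since the question asks for the files to be opened one by one, here is a minimal sketch of that loop in Python. The helper name iter_matching_files and its default pattern are illustrative, not part of any library; it assumes the files are plain text and small enough to read whole.

```python
import glob
import os


def iter_matching_files(directory, pattern="certainstr_*.txt"):
    """Yield (path, contents) for each file matching the glob pattern, one at a time."""
    # sorted() only makes the order predictable; glob itself promises no order.
    for path in sorted(glob.glob(os.path.join(directory, pattern))):
        if not os.path.isfile(path):
            continue  # skip directories that happen to match the pattern
        # Each file is closed again before the next one is opened.
        with open(path, "r") as handle:
            yield path, handle.read()


if __name__ == "__main__":
    for path, text in iter_matching_files("."):
        print(path, len(text))
```

Because this is a generator, only one file is open at any moment, and neither the names nor the number of files has to be known in advance.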

I suggest just using a regular expression. Pseudo code:
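#include <boost/filesystem.hpp>
#include <boost/regex.hpp>
#include <iostream>
namespace fs = boost::filesystem;

int main()
{
    // Match the file name only: the full path starts with "./".
    boost::regex reg("^certainstr_\\d+\\.txt$");
    for (fs::recursive_directory_iterator it("."), end; it != end; ++it)
        if (boost::regex_search(it->path().filename().string(), reg))
            std::cout << it->path() << std::endl;
}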

Related

Simpler way to find the file that define my C/C++ function / macro than 'grep'

I have started working on a huge project with tons of C and C++ files, already written by someone else.
Is there a faster or simpler way to find which file a macro or function is defined in, other than grep -r? It is rather slow.
Some IDEs have this magical thing where you right-click and choose "go to definition". But I'm currently using Emacs, and I don't know whether there is any customisation that can do this.
Each time, I have to copy and paste the name into my terminal, run grep, and then copy and paste the file path back into Emacs. (And you know, I am lazy...)
CTags. You can try using Ctags with emacs and it will help you to navigate to the function declaration directly. For its usage, please refer to https://www.emacswiki.org/emacs/EmacsTags
You can also explore Cscope. It has a richer feature set than ctags, which works purely by pattern matching. But sometimes you just need to navigate through code, and more often than not ctags does the job.
Each time, I have to copy and paste the name into my terminal, run grep, and then copy and paste the file path back into Emacs.
You can improve on this by using M-x rgrep inside Emacs. It asks for the regular expression, a glob pattern of files to look in, and a directory to start in. It then does a recursive grep, outputs the results in a buffer, and you can jump directly from the hits to the corresponding file.
For the glob pattern, you could type something like *.c, or you could use one of the aliases defined in the variable grep-files-aliases. For example, ch is equivalent to *.[ch] (C source and header files) and cchh is equivalent to *.cc *.[ch]xx *.[ch]pp *.[CHh] *.CC *.HH *.[ch]++ (C++ source and header files).
You might find that this works well enough that you don't need ctags and other tools suggested in the other answers and comments.
For ease of finding function definitions in C, some projects use the convention that the function name in the definition starts in the first column:
/* this is just a declaration */
int main(int argc, char *argv[]);
/* in the definition, the function name starts on its own line */
int
main(int argc, char *argv[])
{
...
That means that you can find the definition, excluding any calls to the function, with the regex ^main.
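As a rough illustration of why that convention helps, the sketch below searches a source tree for lines that begin with the function name. The function find_definitions and its default extension list are hypothetical, and it only works on code that follows the name-in-first-column convention.

```python
import os
import re


def find_definitions(root, name, extensions=(".c", ".h")):
    """Return (path, line_number) pairs where `name` starts a line and is followed by '('."""
    # Anchoring at column 0 skips prototypes such as "int main(...);" and indented calls.
    pattern = re.compile(r"^" + re.escape(name) + r"\s*\(")
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            if not filename.endswith(extensions):
                continue
            path = os.path.join(dirpath, filename)
            with open(path, "r", errors="replace") as source:
                for number, line in enumerate(source, start=1):
                    if pattern.match(line):
                        hits.append((path, number))
    return hits
```

For example, find_definitions(".", "main") would report only the line where main( sits in the first column, not the declaration that has the return type in front of it.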
grep should be really fast if you limit the search to the directories and file types (generally .h and .hpp) that are likely to contain the definition. For example, if you know it is in your application, just search there; if you know it comes from FreeType (generally FT_*), search there.
More RAM will help the system cache files better, and if you're on an HDD, it is best to get an SSD. If you're working on a VM, especially one with remote disks, working locally instead will often be faster.
Otherwise, many full-featured IDEs (Visual Studio, Xcode, Eclipse, etc.) have C++ integration that keeps track of these things, and will for example offer "Go to declaration" and "Go to definition" as a shortcut or context-menu entry when the cursor is over a symbol.

List only files but not directories using list.files

How can I list only files, but not directories using list.files (not recursively)? It has an include.dirs argument, but this is ignored when not being used recursively.
I had been thinking something like
list.files(path=myDir, pattern="[^/]$")
but that doesn't seem to work, nor do a few variations on it. Is there a regex that I can plug in here, or a function? I know I can do list.dirs and take a setdiff, but this is already slow enough; I want this to be quicker.
PS: currently on linux, but need something that works cross-platform.
PPS: file.info is really slow, so I think that is also not going to work.
PPPS: It doesn't need to be list.files, that is just the function I had thought should do it.
Consider a regex pattern that matches names made of letters or digits followed by a dot and an extension. This leaves out subdirectories, but unfortunately also files without extensions:
# WITH ANCHORING
files <- list.files(path, pattern=("[a-zA-Z0-9]*[.][a-zA-Z0-9]*$"))
# MATCHING LETTER AND/OR NUMBER FILES WITH EXTENSION
files = list.files(myDir, pattern=("[a-zA-Z0-9]*[.]"))
# WILDCARD FILE MATCHING WITH EXTENSION
files = list.files(myDir, pattern=("*[.]"))
Some other regex variations to catch files with periods (note these also get directories with periods and miss files with no extensions)
list.files(pattern="\\..+$")
list.files(pattern="\\.[[:alnum:]]+$")
Using system2 with ls also seems to work pretty well (thanks to @42- in the comments as well),
system2("ls", args=c("-al", "|", "grep", "^-"))
should get only regular files (including ones without extensions), or
system2("ls", args=c("--classify"))
should return files with directories having a "/" appended so they can be determined.
As an alternative open-source solution, consider Python, which lets you test whether each item is a regular file; os.path.join() keeps the path handling platform-agnostic.
import os
files = [f for f in os.listdir(myDir) if os.path.isfile(os.path.join(myDir, f))]
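Since speed was the concern in the question, a variant worth trying is os.scandir, which can usually tell files from directories using the directory listing itself. This is only a sketch; the function name list_regular_files is made up for illustration, and no timing claims are made here.

```python
import os


def list_regular_files(directory):
    """Return the names of regular files directly inside `directory` (no recursion)."""
    with os.scandir(directory) as entries:
        # entry.is_file() can usually answer from the directory listing,
        # without a separate stat call for every entry.
        return sorted(entry.name for entry in entries if entry.is_file())
```

Unlike the regex approaches above, this keeps files without extensions and drops directories whose names contain a dot.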

Iterating through files in a folder in D

In D programming, how can I iterate through all files in a folder?
Is there a D counterpart to python's glob.iglob?
http://dlang.org/phobos/std_file.html#dirEntries
So like
import std.file;
foreach(string filename; dirEntries("folder_name", "*.txt", SpanMode.shallow)) {
// do something with filename
}
See the docs for more info. The second string, the *.txt filter, is optional; if you leave it out, you see all files.
The SpanMode can be shallow to skip going into subfolders or something like SpanMode.depth to descend into them.
Take a look at std.file.dirEntries. It will allow you to iterate over all of the files in a directory either shallowly (so it doesn't iterate any subdirectories), with breadth-first search, or with depth-first search. And you can tell it whether you want it to follow symlinks or not. It also supports wildcard strings using std.path.globMatch. A basic example would be something like
foreach(DirEntry de; dirEntries(myDirectory, SpanMode.shallow))
{
...
}
However, because dirEntries returns a range of DirEntrys, it can be used in the various range-based functions in Phobos and not just with foreach.
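For readers coming from Python, the sketch below shows the glob.iglob behaviour the question refers to, with a flag that plays roughly the role of choosing between a shallow and a descending SpanMode. The wrapper text_files is hypothetical and exists only for this comparison.

```python
import glob
import os


def text_files(folder, recursive=False):
    """Return a lazy iterator over *.txt paths under `folder`."""
    if recursive:
        # "**" matches any number of nested directories when recursive=True.
        pattern = os.path.join(folder, "**", "*.txt")
    else:
        pattern = os.path.join(folder, "*.txt")
    return glob.iglob(pattern, recursive=recursive)
```

Like the range returned by dirEntries, the result is produced lazily, so it can be consumed in a loop without first building the whole list.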

Reorganizing large amount of files with regex?

I have a large amount of files organized in a hierarchy of folders and particular file name notations and extensions. What I need to do, is write a program to walk through the tree of files and basically rename and reorganize them. I also need to generate a report of the changes and information about the transformed organization along with statistics.
The solution that I can see is to walk through the tree of files just like any other tree data structure, and use regular expressions on the path names of the files. This seems very doable and not a huge amount of work. My questions are: are there tools I should be using other than just C# and regex? Perl comes to mind since I know it was originally designed for report generation, but I have no experience with the language. Also, is using regex viable in this situation? I have only used it on file CONTENTS, not on file names and organization.
Yes, Perl can do this. Here's something pretty simple:
#! /usr/bin/env perl
use strict;
use warnings;
use File::Find;
my $directory = "."; # Or whatever directory tree you're looking for...
find(\&wanted, $directory);
sub wanted {
    print "Full File Name = <$File::Find::name>\n";
    print "Directory Name = <$File::Find::dir>\n";
    print "Basename = <$_>\n";
    # find() has already changed into $File::Find::dir, so run file tests on $_
    if (-f $_) {
        print "File <$File::Find::name> is a file\n";
    }
    if (-d $_) {
        print "Directory <$File::Find::name> is a directory\n";
    }
    # Using regular expressions on the file name
    if ($File::Find::name =~ /beans/) {
        print "The file <$File::Find::name> contains the string <beans>\n";
    }
}
The find function takes the directory and calls the wanted subroutine for each file and directory in the entire directory tree. It is up to that subroutine to figure out what to do with that file.
As you can see, you can do various tests on the file, and use regular expressions to parse the file's name. You can also move, rename, or delete the file to your heart's content.
Perl will do exactly what you want. Now, all you have to do is learn it.
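To show that regular expressions on file names are viable for this, here is the same walk-and-match idea sketched in Python, split into a planning step and an applying step so that the report can be reviewed before anything is touched. Both function names are illustrative, and the sketch only renames within a directory; moving files between folders is left out.

```python
import os
import re


def plan_renames(root, pattern, replacement):
    """Return (old_path, new_path) pairs for files whose name changes under re.sub."""
    regex = re.compile(pattern)
    plan = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in sorted(filenames):
            new_name = regex.sub(replacement, filename)
            if new_name != filename:
                plan.append((os.path.join(dirpath, filename),
                             os.path.join(dirpath, new_name)))
    return plan


def apply_renames(plan):
    """Carry out a plan and return report lines describing what was done."""
    report = []
    for old_path, new_path in plan:
        if os.path.exists(new_path):
            # Never overwrite an existing file; record the conflict instead.
            report.append("SKIPPED (target exists): %s" % old_path)
            continue
        os.rename(old_path, new_path)
        report.append("%s -> %s" % (old_path, new_path))
    return report
```

Printing the plan without calling apply_renames gives a dry run, and the length of the plan and of the report are a starting point for the statistics the question mentions.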
If you can live with glob patterns instead of regular expressions, mmv might be an option.
> ls
a1.txt a2.txt b34.txt
> mmv -v "?*.txt" "#2 - #1.txt"
a1.txt -> 1 - a.txt : done
a2.txt -> 2 - a.txt : done
b34.txt -> 34 - b.txt : done
Directories at any depth can be reorganized, too. Check out the manual. If you run Windows, you can find the tool in Cygwin.

Programmatically search + replace in a .doc

If I'm given a .doc file with special tags in it such as [first_name], how do I go about replacing all occurrences of it with something like "Clark"? A simple binary replacement only works if the replacement string is the exact same length.
Haskell, C, and C++ answers would be best, but any compiled language would do. I'd also prefer to do this without an external library since it has to be deployed on Windows and Linux and cross-platform dependency handling is a bitch.
To summarize...
.doc -> magic program -> .doc with strings replaced
You could use the Word COM component ("Word.Application") on Windows to open the file, do the replacements, save the file, and close it. However, this is Windows-only and can be buggy.
Another thing you could do is use the OpenOffice.org command line interface to convert the file to the ODF format, unzip the file (ODF is mostly zipped XML), do the replacements with the files inside, re-zip the file, and re-convert it to .doc format. However, OpenOffice.org doesn't always read Word files correctly (especially if there is a lot of complex formatting) and it can make it harder to distribute (users must either have OpenOffice.org or you must distribute it with your program).
Also, if you have a file in the .docx format, you can unzip it, do the replacements, and re-zip it.
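The .docx route can be sketched with nothing but a zip library. The Python example below is a minimal illustration, not a robust tool: the function name replace_in_docx is made up, it assumes the main body lives in word/document.xml encoded as UTF-8, and it assumes each tag sits in a single text run (Word sometimes splits text such as [first_name] across several XML elements, in which case a plain string replace will miss it).

```python
import zipfile


def replace_in_docx(source_path, target_path, replacements):
    """Copy a .docx, substituting text in word/document.xml along the way."""
    with zipfile.ZipFile(source_path) as source, \
            zipfile.ZipFile(target_path, "w", zipfile.ZIP_DEFLATED) as target:
        for item in source.infolist():
            data = source.read(item.filename)
            if item.filename == "word/document.xml":
                text = data.decode("utf-8")
                for tag, value in replacements.items():
                    # Values are inserted as-is; escape &, < and > beforehand if needed.
                    text = text.replace(tag, value)
                data = text.encode("utf-8")
            # Every other part of the package is copied through unchanged.
            target.writestr(item, data)
```

A call such as replace_in_docx("in.docx", "out.docx", {"[first_name]": "Clark"}) writes a new file, so the replacement can be any length, unlike the in-place binary replacement described in the question.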
First read the Word Document Specification.
If that hasn't terrified you, then you should find it fairly straightforward to figure out how to read and write it. It must be possible; Word manages to do it most of the time.
You probably have to use .NET programming (VB or C#) to create a Word.Application object and then use the MS Word object model to manipulate your document.
Why do you want to be using C/C++/Haskell or another compiled language? I'm not too familiar with Haskell, but in general I would say that C is not a great language for performing text processing. A lot of interpreted languages (Perl, Python, etc.) also have powerful regular expression libraries that are suited for finding and replacing phrases.
With that said, as the other posters have noted, you will still have to deal with the eccentricities of the .doc format.