Loading files that meet certain criteria into hidden buffers in vim - c++

I'd like to do some code refactoring in vim. I have found the following gem to apply transformations to all buffers.
:bufdo %s/match/replace/gc
My code is laid out so that the root directory contains a directory for the dependencies and a build directory. I want to load all .cc, .h and .proto files from ./src, ./include and ./tests into background/hidden buffers, but not the files from the dependencies and build directories. I want to do this so that I can perform the refactoring with the command above.
If someone knows of a cleaner way to perform the use case, please show it.
Note: I know you can string together find and sed to do this from the shell, but I prefer doing it in vim if at all possible. The c flag in the /gc suffix of the command above asks for confirmation on each match. I need this because I often don't want to replace certain matches. The find and sed solution is too restrictive and finicky for my use case, and it is easy to destroy files when doing in-place replacements.
For reference using sed and find:
List candidate replacements:
find src include tests -name '*.h' -or -name '*.cc' -or -name '*.proto' |
xargs sed -n 's/ListServices/list_services/p'
Perform replacements:
find src include tests -name '*.h' -or -name '*.cc' -or -name '*.proto' |
xargs sed -i 's/ListServices/list_services/'

You can use :argadd to add the files you need to vim's argument list. This will load them as inactive buffers (you can see them afterwards with :ls). In your case, it might look like this:
argadd src/**/*.cc
argadd src/**/*.h
argadd src/**/*.proto
And so on, for the include and tests directories. You might want to make a command for that or experiment with glob patterns to make it a bit simpler. Afterwards, your command should work, although I'd recommend running it with :argdo instead:
argdo %s/match/replace/gc
This will only execute it for the buffers you explicitly specified, not for any of the other ones you might have opened at the time. Check :help argadd and :help argdo for more information.
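If you'd rather build the argument list from the shell before starting vim, a minimal sketch might look like the following. The function name list_refactor_files is made up for illustration, and it assumes you run it from the project root, where src, include and tests live.

```shell
# Hypothetical helper (name invented for this example): print every
# .cc, .h and .proto file under src, include and tests. The dependency
# and build directories are never visited because they are not listed.
list_refactor_files() {
    find src include tests -type f \
        \( -name '*.cc' -o -name '*.h' -o -name '*.proto' \)
}
```

Starting vim with vim $(list_refactor_files) should then put those files in the argument list (this simple form breaks on file names containing whitespace). Inside vim, :argdo %s/match/replace/gc | update saves each file before moving to the next one, which avoids the error you otherwise get when a modified buffer cannot be abandoned.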

Related

Find and replace pattern in large number of files

I want to replace text in about 80.000 log files using a regex. I love the batch search and replace of VSCode. I was unable to do this with VSCode, because it did not seem to handle this amount of data well. Any suggestion how I could do this with VSCode? Are there suggestions for alternatives?
Instead of depending on a GUI-based tool, it might be easier to use a CLI tool for this.
If you're using Linux, or willing to install any of the tools like sed and find if you're on Windows then it should be relatively simple.
You can use sed which is a command line tool on all (or at least most) distributions of Linux, and can be installed on Windows.
Usage (for this use case):
sed -i s/{pattern}/{replacement}/g {file}
Use sed to replace the matched pattern with a replacement, using the global modifier to match all results, and the file to do the replacement and overwrite.
To target all files in a directory you can do:
find . -type f -name "*.log" -exec sed -i 's/{pattern}/{replacement}/g' {} \;
This finds items recursively, starting from the current directory, whose type is file and whose name ends with .log. It then runs sed on each matched file to replace the pattern with the contents you want.
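With around 80,000 files, starting one sed process per file can get slow. Here is a rough sketch of a variant that batches the files and keeps a backup of each one. The function name replace_in_logs and the .bak suffix are my own choices, not something from the question.

```shell
# Hypothetical helper: replace_in_logs PATTERN REPLACEMENT [DIR]
# PATTERN and REPLACEMENT are pasted as-is into s/.../.../g, so they
# must not contain an unescaped "/".
replace_in_logs() {
    pattern=$1
    replacement=$2
    dir=${3:-.}
    # "-exec ... {} +" hands many files to each sed invocation instead
    # of starting one process per file.
    # -i.bak edits in place and keeps the original as <name>.bak.
    find "$dir" -type f -name '*.log' \
        -exec sed -i.bak "s/$pattern/$replacement/g" {} +
}
```

Note that sed writes a .bak copy for every file it processes, whether or not anything matched, so you need roughly twice the disk space until you have checked the result and removed the backups.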
You can find how to get tools like sed and find for Windows on the following question:
https://stackoverflow.com/a/127567/6277798

Use [msys] bash to remove all files whose name matches a pattern, regardless of file-name letter-case

I need a way to clean up a directory, which is populated with C/C++ built-files (.o, .a, .EXE, .OBJ, .LIB, etc.) produced by (1) some tools which always create files having UPPER-CASE names, and (2) other tools which always create lower-case file names. (I have no control over the tools.)
I need to do this from a MinGW 'msys' bash.exe shell script (or bash command prompt). I understand piping (|), but haven't come up with the right combination of exec's yet. I have successfully filtered the file names, using commands like this example:
ls | grep '.\.[eE][xX][eE]'
to list all files having any case-combination of letters in the file-extension--this example gets all the executable (e.g. ".EXE") files.
(I'll be doing similar for .o, .a, .OBJ, .LIB, .lib, .MAP, etc., which all share the same directory as the C/C++ source files. I don't want to delete the source files, only the built-files. And yes, I probably should rework the directory structure, to use a separate directory for the built-files [only], but that will take time, and I need a quick solution now.)
How can I merge the above command with "something" else (e.g., like the 'rm -f' command???), to carry this the one step further, to actually delete [only] those filtered-out files from the current directory? (I'm hopeful for a solution which does not require a temporary file to hold the filtered file names.)
Adding this answer because the accepted answer is suggesting practices which are not-recommended in actual scripts. (Please don't feel bad, I was also on that track once..)
Parsing ls output is a NO-NO! See http://mywiki.wooledge.org/ParsingLs for more detailed explanation on why.
In short, ls separates the filenames with newline; which can be present in the filename itself. (Plus, ls does not handle other special characters properly. ls prints the output in human readable form.) In unix/linux, it's perfectly valid to have a newline in the filename.
A unix filename cannot have a NULL character though. Hence below command should work.
find /path/to/some/directory -iname '*.exe' -print0 | xargs -0 rm -f
find: is a tool used to, well, find files matching the required pattern/criterion.
-iname: search using particular names, case insensitive. Note that the argument to -iname is wildcard, not regex.
-print0: Print the file names separated by NULL character.
xargs: Takes the input from stdin & runs the commands supplied (rm -f in this case) on them. The input is separated by white-space by default.
-0 specifies that the input is separated by null character.
Or even better approach,
find /path/to/some/directory -iname '*.exe' -delete
-delete is a built-in feature of find, which deletes the files found with the pattern.
Note that if you want to do some other operation, like move them to particular directory, you'd need to use first option with xargs.
Finally, this command find /path/to/some/directory -iname '*.exe' -delete would recursively find the *.exe files/directories. You can restrict the search to current directory with -maxdepth 1 & filetype to simple file (not directory, pipe etc.) using -type f. Check the manual link I provided for more details.
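Putting -maxdepth 1, -type f and several case-insensitive extensions together, a sketch for the original question could look like this. The function name clean_built_files is made up, and the extension list is only my reading of the question, so adjust it to your tools.

```shell
# Hypothetical helper: delete built files in DIR only (no recursion),
# matching the extensions case-insensitively.
clean_built_files() {
    dir=${1:-.}
    # The parentheses matter: without them -delete would bind only to
    # the last -iname test.
    find "$dir" -maxdepth 1 -type f \
        \( -iname '*.exe' -o -iname '*.o' -o -iname '*.obj' \
           -o -iname '*.a' -o -iname '*.lib' -o -iname '*.map' \) \
        -delete
}
```

It is probably worth replacing -delete with -print for a first run, to see what would be removed before anything is actually deleted.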
Is this what you mean?
rm -f `ls | grep '.\.[eE][xX][eE]'`
But if you use "ls -l | grep ...", the output will have other fields, such as the date, that you have to strip out so that only the file name itself remains.
Try something like:
rm -f `ls -l | grep '.\.[eE][xX][eE]' | awk '{print $9}'`
where your file name is in the 9th field, like:
-rwxr-xr-x 1 Administrators None 283 Jul 2 2014 search.exe
You can use following command:
ls | grep '.\.[eE][xX][eE]' | xargs rm -f
Use of "xargs" would turn standard input ( in this case output of the previous command) as arguments for "rm -f" command.

Automatically fix filename cases in C++ codebase?

I am porting a C++ codebase which was developed on a Windows platform to Linux/GCC. It seems that the author didn't care for the case of filenames, so he used
#include "somefile.h"
instead of
#include "SomeFile.h"
to include the file which is actually called "SomeFile.h". I was wondering if there is any tool out there to automatically fix these includes? The files are all in one directory, so it would be easy for the tool to find the correct names.
EDIT: Before doing anything, note that I'm assuming you either have copies of the files off to the side or, preferably, a baseline version in source control, should you need to roll back for any reason.
You should be able to do this with sed. Something like sed -i 's/somefile\.h/SomeFile.h/I' *.[Ch]
This means take a case-insensitive somefile.h (trailing /I) and do an in-place (same file) replacement (-i) with the other text, SomeFile.h.
You can even do it in a loop (totally untested):
for file in *.[Ch]
do
sed -i "s/$file/$file/I" *.[Ch]
done
I should note that although I don't believe this applies to you, Solaris sed doesn't support -i and you'd have to install GNU sed or redirect to a file and rename.
Forgive me, I'm away from my Linux environment right now, so I can't test this myself, but I can tell you which utilities you would need to use to do it.
Open a terminal and use cd to navigate to the correct directory.
cd ~/project
Get a list of all of the .h files you need. You should be able to accomplish this with the shell's wildcard expansion without any effort.
ls include/*.h libs/include/*.h
Get a list of all of the files in the entire project (.c, .cpp, .h, .whatever), anything that can #include "header.h". Again, wildcard expansion.
ls include/*.h libs/include/*.h *.cpp libs/*.cpp
Iterate over each file in the project with a for loop
for f in ... # wildcard file list
do
echo "Looking in $f"
done
Iterate over each header file with a for loop
for h in ... # wildcard header list
do
echo "Looking for $h"
done
For each header in each project file, use sed to search for #include "headerfilename.h", and replace with #include "HeaderFileName.h" or whatever the correct case is.
Warning: Untested and probably dangerous: This stuff is a place to start and should be thoroughly tested before use.
h_escaped=$(echo $h | sed -e 's/\([[\/.*]\|\]\)/\\&/g') # escapes characters in file name
argument="(^\s*\#include\s*\")$h_escaped(\"\s*\$)" # I think this is right
sed -i -E -e "s/$argument/\1$h_escaped\2/gI" "$f"
Yes, I know it looks awful.
Things to consider:
Rather than going straight to running this on your production codebase, test it thoroughly first.
sed can eat files like a VCR can eat tapes.
Make a backup.
Make another backup.
This is an O(N^2) operation involving hard disk access, and if your project is large it will run slowly. If your project is not gigantic, don't bother, but if it is, consider doing something to pipe sed's output to other seds.
Your search should be case insensitive: it should match #include, #INCLUDE, #iNcLuDe, and any combination of case present in the existing header filename, as well as any amount of whitespace between the include and the header. Bonus points if you preserve whitespace.
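For the simple case from the question, where all files sit in one directory, the pieces above can be sketched as a single loop. This is only a starting point under several assumptions: GNU sed (for -i and the I flag), header names with no regex metacharacters other than the dot, and includes written with double quotes. The function name fix_include_case is invented.

```shell
# Hypothetical helper: run it from the directory that holds the sources.
# Assumes GNU sed and header names made of letters, digits, "_" and ".".
fix_include_case() {
    for header in *.[hH]; do
        [ -e "$header" ] || continue   # the glob matched nothing
        # Escape the dots so they match literally in the pattern.
        escaped=$(printf '%s\n' "$header" | sed 's/\./\\./g')
        # Match the include line in any letter case (I flag), keep the
        # original '#include "' prefix (\1), and write the header name
        # with the case it has on disk.
        sed -i "s/^\([[:space:]]*#[[:space:]]*include[[:space:]]*\"\)$escaped\"/\1$header\"/I" \
            *.[cChH]
    done
}
```

Because \1 reuses whatever was matched, the whitespace and the spelling of #include on each line are preserved. Only the file name between the quotes changes.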
Use Notepad++ to do a 'Find in Files' and replace.
From toolbar:
Search - Find in Files.
Then complete the 'Find what' and 'Replace with'.

How to use shell magic to create a recursive etags using GNU etags?

The standard GNU etags does not support a recursive walk of directories the way Exuberant Ctags does with ctags -R. If I only have access to GNU etags, how can I use bash shell magic to get etags to produce a TAGS table for all the C++ files (*.cpp and *.h) in the current directory and all directories below it, recursively? The TAGS table should be created in the current directory and have the proper path names for emacs to resolve the TAGS table entries.
The Emacs Wiki is often a good source for answers to common problems or best practices. For your specific problem there is a solution for both Windows and Unixen:
http://www.emacswiki.org/emacs/RecursiveTags#toc2
Basically you run a command to find all .cpp and all .h files (change file selectors if you use different file endings, such as e.g., .C) and pipe the result into etags. Since Windows does not seem to have xargs, you need a more recent version of etags that can read from stdin (note the dash at the end of the line which symbolizes stdin). Of course, if you use a recent version of etags, you can use the dash parameter instead of xargs there, too.
Windows:
cd c:\source-root
dir /b /s *.cpp *.h *.hpp | etags --your_options -
Unix:
cd /path/to/source-root
find . -name "*.cpp" -print -or -name "*.h" -print | xargs etags --append
This command creates etags file with default name "TAGS" for .c, .cpp, .Cpp, .hpp, .Hpp .h files recursively
find . -regex ".*\.[cChH]\(pp\)?" -print | etags -
Most of the answers posted here pipe the find output to xargs. This breaks if there are spaces in filenames inside the directory tree.
A more general solution that works if there are spaces in filenames (for .c and .h files) could be:
find . -name "*.[cChH]" -exec etags --append {} \;
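Running etags once per file works, but it can be slow on a big tree. A sketch that keeps xargs and still copes with spaces in file names could use NUL separators instead. The two function names are made up, and I'm assuming a find and xargs that support -print0 and -0 (GNU and BSD both do).

```shell
# Hypothetical helpers; run them from the source root.
list_cpp_sources() {
    # NUL-separated output, so names with spaces or newlines survive
    # the trip through the pipe.
    find . -type f \( -name '*.cpp' -o -name '*.h' \) -print0
}

build_tags() {
    # xargs may call etags more than once, hence --append; remove the
    # old table first so stale entries do not pile up.
    rm -f TAGS
    list_cpp_sources | xargs -0 etags --append
}
```

Calling build_tags in the source root should leave a TAGS file there with paths relative to that directory.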
Use find. man find if you need to.

What's the most compact version of "match everything but these strings" in the shell or regex?

Linux: I want to list all the files in a directory and within its subdirectories, except some strings. For that, I've been using a combination of find/grep/shell globbing. For instance, I want to list all files except those in the directories
./bin
./lib
./resources
I understand this can be done as shown in this question and this other. But both versions are not solving the case "everything, but this pattern" in general terms.
It seems that it is much easier to use a conditional for filtering the results, but I wonder if there is any compact and elegant way of describing this in regexp or in the shell extended globbing.
Thanks.
yourcommand | egrep -v "pattern1|pattern2|pattern3"
Use prune option of find.
find . -path './bin' -prune -o -path './lib' -prune -o -path './resources' -prune -o «rest of your find params»
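The three -prune clauses can also be grouped into one, which reads a bit more compactly. A small sketch, with a made-up function name and -type f -print standing in for the rest of your find parameters:

```shell
# Hypothetical helper: list regular files below the current directory,
# skipping ./bin, ./lib and ./resources entirely.
list_outside_excluded() {
    # Left of -o: if the path is one of the excluded directories, prune
    # it (do not descend). Right of -o: print every other regular file.
    find . \( -path ./bin -o -path ./lib -o -path ./resources \) -prune \
        -o -type f -print
}
```

Since -path compares against the whole path, only the top-level ./bin is skipped. A nested directory such as ./src/bin is still listed.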
With bash's extglob shopt setting enabled, you can exclude files with ! in your wildcard pattern. Some examples:
Everything except bin, lib, and resources
shopt -s extglob
ls -Rl !(bin|lib|resources)
Everything with an i in it, except bin and lib
ls -Rl !(bin|lib|!(*i*))
(Everything that has an i in it is the same as everything except the things that don't have i's in them.)