Find text string or part of text with dot in grepWin - regex

I am using grepWin on Windows 7 64. http://tools.tortoisesvn.net/grepWin.html
I have a folder with files and their duplicate copies.
The original files are named "FILENAME DOT FILETYPE" (without spaces), for example "cartonbox.shelf".
The copies of these files are named "FILENAME DOT 1 DOT FILETYPE" (without spaces), for example "cartonbox.1.shelf".
I am trying to find all files that contain the exact string:
"DOT 1 DOT FILETYPE" (without spaces), so all files that have for example ".1.shelf" in them.
How can I do that in grepWin please?
If I try "\.1\shelf" or "\.1\.shelf" for example I do not get any results.
What is my mistake please? I have been reading http://www.regular-expressions.info/ but cannot come up with the correct pattern.
How can I generally search for an exact part of the filename regardless of symbols?
Basically if the file I want to find has for example "garden_1.1.4-JE50.tree" in it how do I tell grepWin to find this exact string of text including underscore, dots or other characters?

Grep stands for g/re/p (global / regular expression / print)
It searches IN files, not file names. That text would need to be text-readable in the file for which you are searching.
In the directory you want to search you could do something like:
dir *.* /b/s > my_file.txt
Then you can perform your regular expressions checks with grepWin on my_file.txt
In Unix and Linux you normally pipe the commands via the command line:
ls -a | grep '\.tree$'
In Windows you would use
dir * /b | findstr \.tree$

I learned that grepWin is for searching IN files. I am looking to search for parts of the file names themselves, not the file contents. Hence I am now reading this: https://superuser.com/questions/209231/what-search-utilities-can-search-by-file-name-in-windows-7
Thanks for explaining this crucial misunderstanding to me, cpattersonv1.

Related

Sed - How to read a file line by line and go the path mentioned in the file then replace string?

I am on a new project where I need to append a string to the names of all exported APIs.
Someone hinted this can be done with simple sed commands.
What is really needed, by example:
My project has about 100 files, and many of them contain something like the pattern below.
In file1 it is mentioned on some line: export(xyz);
In file2 it is mentioned on some line: export (abc);
What is needed here is to replace the
xyz with xyz_temp and
abc with abc_temp.
Now the problem is these APIs are in different folders and different files.
Fortunately, I got to know we can redirect the result of cscope tool to some file with matching patterns.
So I redirected the result of a cscope search for the "export" string to a file, export_api.txt, and got the following:
/path1/file1.txt export(xyz);
/path2/file2.txt export(abc);
Now, I am not sure how to use sed to automate the following:
Reading this export_api.txt
Reading each line
Replacing the string as above.
Any direction would be highly appreciated.
Thanks in advance.
If you have a list of files which need to be changed and your replacement only needs to append _tmp, then this can be accomplished with a single sed call:
sed -i 's/export(\(abc\|xyz\));/export(\1_tmp);/' files...
-i will modify the files in-place, overwriting them.
If you don't care which names you are replacing and simply want to append a suffix to all export expressions, match any identifier. Here is one such example:
export(\([^)]*\))
Depending on your expressions and valid identifier names, you might want to or need to change this to one of:
export(\(.*\))
export(\([_a-zA-Z][_a-zA-Z0-9]*\))
export(\([_a-zA-Z"'][_a-zA-Z0-9"']*\))
export(\([_a-zA-Z]*\))
…
Another option would be to only match lines containing "export(" and then replace the closing parenthesis (given that your input lines contain the token ");" only once):
sed -i '/export(/s/);/_tmp);/' files...
# or reusing the complete match:
sed -i '/export(/s/);/_tmp&/' files...
This avoids the backreference and makes the regular expressions simpler, because both patterns are now fixed strings.
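To see what the generic identifier pattern does before touching any file, the substitution can be run on sample text without -i. This is only a sketch: it assumes the exported names look like C identifiers, it uses the _temp suffix from the question, and the input lines and the `add_suffix` helper are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: append a suffix to every export(<identifier>); read from stdin.
# Assumes C-style identifiers; no file is modified (no -i).
add_suffix() {
    sed 's/export(\([_a-zA-Z][_a-zA-Z0-9]*\));/export(\1_temp);/'
}

result=$(printf '%s\n' 'export(xyz);' 'int export_count;' 'export(abc);' | add_suffix)
echo "$result"
```

Lines that merely contain the word "export" without the parenthesised form are passed through unchanged.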
You can use the read builtin to parse the line in your export_api.txt file, then call sed on each file. Pattern match the export snippet to choose the correct sed invocation. The way read is invoked here assumes that your path and snippet are delimited by IFS and that path does not contain any whitespace or separators:
while read -r path snippet; do
    case "$snippet" in
        *abc*) sed -i 's/export(abc);/export(abc_tmp);/' "$path" ;;
        *xyz*) sed -i 's/export(xyz);/export(xyz_tmp);/' "$path" ;;
    esac
done < export_api.txt
NOTE: this will change/overwrite any of your files. Your files might be left in a broken state.
PS I wonder why you cannot use your IDE to search/replace those occurrences?
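If the list contains many different API names, writing one case branch per name gets tedious. A possible variation, sketched under the assumption that each line of export_api.txt is "path snippet" and that paths contain no whitespace, is to take only the path column and apply one generic substitution to each file. The scratch files below are made up for illustration, and the suffix is the _temp from the question:

```shell
#!/usr/bin/env bash
# Sketch: apply one generic substitution to every file listed in export_api.txt.
# Assumes each line is "<path> <snippet>" and paths contain no whitespace.
work=$(mktemp -d)
printf 'int a;\nexport(xyz);\n' > "$work/file1.txt"
printf 'export(abc);\nint b;\n' > "$work/file2.txt"
printf '%s export(xyz);\n%s export(abc);\n' \
    "$work/file1.txt" "$work/file2.txt" > "$work/export_api.txt"

# Take the path column, drop duplicate paths, and edit each file once.
cut -d' ' -f1 "$work/export_api.txt" | sort -u | while read -r path; do
    # -i.bak edits in place but keeps a backup copy next to each file;
    # the attached-suffix form is accepted by both GNU and BSD sed.
    sed -i.bak 's/export(\([_a-zA-Z][_a-zA-Z0-9]*\));/export(\1_temp);/' "$path"
done

out1=$(cat "$work/file1.txt")
out2=$(cat "$work/file2.txt")
echo "$out1"
echo "$out2"
rm -rf "$work"
```

Keeping the .bak copies makes it possible to diff and roll back if the pattern turns out to match more than intended.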

Command Line findstr with a regular expression

I need to search through all the files in a directory and its subdirectories to match any of the numbers in the regular expression. Basically, in our code we have blocks of code based on certain project numbers. I need to find these blocks by project number. This regular expression does what I need, but I cannot get it to work at the command line:
([^0-9]|^)(56|14|2)([^0-9]|$)
I tested this on https://www.freeformatter.com/regex-tester.html against this string "If session.projid = 56 and then again 14 or something else"
I am trying this at the command line
findstr /s /R /C:"([^0-9]|^)(56|14|2)([^0-9]|$)" *.*
But no results and I know there should be. Thanks in advance for any help on this.
See these docs:
FINDSTR does not support alternation with the pipe character (|). Multiple regular expressions can be separated with spaces, just the same as separating multiple words (assuming you have not specified a literal search with /C), but this might not be useful if the regex itself contains spaces.
In your case, you may wrap each number in the \< and \> word boundaries and separate the alternatives with spaces:
findstr /s /r "\<56\> \<14\> \<2\>" *.*
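For comparison only: on a system where grep is available (an assumption, since the question is about findstr), extended regular expressions do support alternation, so the pattern from the question works unchanged. A small sketch using the test string from the question plus a few made-up lines:

```shell
#!/usr/bin/env bash
# Sketch: the alternation pattern from the question, run with grep -E
# instead of findstr. All sample lines except the first are made up.
pattern='([^0-9]|^)(56|14|2)([^0-9]|$)'

matches=$(printf '%s\n' \
    'If session.projid = 56 and then again 14 or something else' \
    'projid = 156' \
    'projid = 2' \
    'projid = 214' | grep -E "$pattern")
echo "$matches"
```

The lines with 156 and 214 are not reported, because there the digits 56, 14 and 2 only occur as part of a longer number.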

Linux: rename files containing ASCII-Code for capital letters

I have a collection of files where the capital letters are replaced by their ASCII-code (example ;065 for A). How can I most effectively recursively rename them from the command line?
I don't want to make the mess worse, and unfortunately I don't know how to test any commands...
For me it would be no problem to modify the command for each letter.
Many Linux distributions ship some variant or another of the Perl rename script, sometimes as prename, sometimes as rename. Any variant will do, but not the Linux rename utility that isn't written in Perl (run it with no argument and see if the help text mentions perl anywhere). This script runs Perl code on file names, typically a regex replacement.
prename -n 's/;(03[2-9]|0[4-9][0-9]|1[01][0-9]|12[0-6])/chr($1)/eg' *
I made a regular expression that matches three-digit numbers that are the character code of a printable ASCII character. You may need to adjust it depending on exactly what can follow a semicolon. The * at the end is just a normal shell wildcard; it says to rename all files in the current directory. It's ok to include files that don't contain anything to rename: prename will just skip them.
The -n option says to show what would be done, but don't actually rename any file. Review the output. If you're happy with it, run the command again without -n to actually rename the files.

Using Grep and Regex to generate a text file which lists all files that have specific strings in them

I am having trouble figuring out a very specific use case for Grep and Regex. I have searched a lot and can't seem to find the answer.
I want to run Grep recursively from within a root directory which contains directories for hundreds of different programs. I want the output to print to a text file and list the file names of all the files which match some regex. Specifically, here's what I need to do in English:
find all files that contain the text [stringA] AND ["stringB"] OR ['stringB'] and print the file paths for each of these files to an output text file with some context added. For the two stringB pieces, the quotes are important and I only want to find files that have stringB with single or double quotes around it. In other words,
all files that contain stringA with no quotes and stringB with either single or double quotes.
Any help is appreciated, thanks!
Edit: Sorry for asking a terrible question :)
I have done some research and have figured out part of it. I have figured out the regex for finding the string I am looking for. The actual string is "ATA". To find this globally, case-insensitive, and with either type of quotes around it with some or no whitespace on either side of the string, we can use:
/['"]\s*ATA\s*['"]/gi
The first string I need to also find is a function call: call wflnkmod( and the regex for that is:
/call\swflnkmod\(/gi
Now I just need to figure out how to search a file and ensure that it has an instance of each of these in it, then figure out the grep command to use to search all files recursively and print them to an output file!
Solution
The following should do the trick (<dir> needs to be replaced with the directory to search in):
grep -r -i -l "stringA" <dir> | xargs grep -i -l -E "'stringB'|\"stringB\""
Explanation
The above line searches for stringA first. The resulting list of files is piped into xargs and the specified command (grep -i -l -E "'stringB'|\"stringB\"") is invoked with the additional parameters provided through stdin (the list of files, that contain stringA).
grep options:
-r, --recursive (self explanatory)
-i, --ignore-case search case insensitive
-l, --files-with-matches only print the names of files that match the given pattern
-E interpret pattern as extended regular expression
The two main issues with the expressions you've tried are:
Inappropriate use of []. [] lets you define a character set, e.g. [abcd] matches one of specified characters.
Incorrect quoting or escaping of quotes in your expressions. If you quote something in double quotes ("), double quotes inside the quoted string need to be escaped: \" yields a " in a quoted string. If you're not quoting your expression at all, ( needs to be escaped as well.
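Applied to the actual strings from the question, the chained search could look like the sketch below. The directory layout and file contents are made up, [[:space:]] stands in for \s (which not every grep understands), and the file list is written to a text file as the question asks:

```shell
#!/usr/bin/env bash
# Sketch: list files containing BOTH "call wflnkmod(" and a quoted ATA,
# case-insensitively, and write the list to a text file.
# Directory layout and file names are made up for illustration.
root=$(mktemp -d)
out=$(mktemp)
mkdir -p "$root/progA" "$root/progB" "$root/progC"
printf 'CALL WFLNKMOD(x)\nname = " ata "\n' > "$root/progA/a.src"
printf "call wflnkmod(y)\nname = 'ATA'\n"   > "$root/progB/b.src"
printf 'call wflnkmod(z)\nname = ATA\n'     > "$root/progC/c.src"

# --null and -0 pass NUL-separated file names, so paths with spaces survive.
# Both are extensions found in GNU and BSD tools, not plain POSIX.
( cd "$root" &&
  grep -rilE --null 'call[[:space:]]wflnkmod\(' . |
  xargs -0 grep -ilE "['\"][[:space:]]*ATA[[:space:]]*['\"]" |
  sort ) > "$out"

found=$(cat "$out")
echo "$found"
rm -rf "$root" "$out"
```

Only the first two files are listed. The third calls the function but has ATA without quotes, so the second grep drops it.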
For an AND search it's better to chain greps; OR can be handled within a single pattern using alternation.
Assume we are looking for words cat and "ata" or 'ata'. If you're only interested in the filenames, not the actual matches...
grep -wilm1 cat file* | xargs grep -wilEm1 "'ata'|\"ata\""
file1
file2
This searches for full-word matches with -w (not substrings), stops reading each file after the first match with -m1, and ignores case with -i. The OR requires the -E flag. -l prints only the file names.
Here are the files used for testing.
==> file1 <==
cat
fat hat "ata"
==> file2 <==
cat fat hat 'ata'
==> file3 <==
cat fat hat ata
==> file4 <==
category fat hat ata

Rename Files Mac Command Line

I have a bunch of files in a directory that were produced with rather unfortunate names. I want to change two of the characters in the name.
For example I have:
>ch:sdsn-sdfs.txt
and I want to remove the ">" and change the ":" to a "_".
Resulting in
ch_sdsn-sdfs.txt
I tried to just say mv \\>ch\:* ch_* but that didn't work.
Is there a simple solution to this?
For a command-line script to rename files, this Stack Overflow question has good answers.
On a Mac, the Finder GUI comes with bulk rename capabilities. If the source list of files has a pattern to find and replace, it comes in very handy.
Select all the files that need to be renamed, right-click and select Rename.
In the rename dialog, enter the find and replace strings.
The rename dialog also has options to sequence the file names and to prefix or suffix text.
First, I should say that the easiest way to do this is to use the
prename or rename commands.
Homebrew package rename, MacPorts package renameutils :
rename s/0000/000/ F0000*
That's a lot more understandable than the equivalent sed command.
But as for understanding the sed command, the sed manpage is helpful. If
you run man sed and search for & (using the / command to search),
you'll find it's a special character in s/foo/bar/ replacements.
s/regexp/replacement/
Attempt to match regexp against the pattern space. If successful,
replace that portion matched with replacement. The
replacement may contain the special character & to refer to that
portion of the pattern space which matched, and the special
escapes \1 through \9 to refer to the corresponding matching
sub-expressions in the regexp.
Therefore, \(.\) matches the first character, which can be referenced by \1.
Then . matches the next character, which is always 0.
Then \(.*\) matches the rest of the filename, which can be referenced by \2.
The replacement string puts it all together using & (the original
filename) and \1\2 which is every part of the filename except the 2nd
character, which was a 0.
This is a pretty cryptic way to do this, IMHO. If for
some reason the rename command was not available and you wanted to use
sed to do the rename (or perhaps you were doing something too complex
for rename?), being more explicit in your regex would make it much
more readable. Perhaps something like:
ls F00001-0708-*|sed 's/F0000\(.*\)/mv & F000\1/' | sh
Being able to see what's actually changing in the
s/search/replacement/ makes it much more readable. Also it won't keep
sucking characters out of your filename if you accidentally run it
twice or something.
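For the example from the question (>ch:sdsn-sdfs.txt), a plain shell loop with parameter expansion is another option when rename is not installed. This is only a sketch: it works in a scratch directory with made-up file names, and it assumes that every file to rename starts with ">".

```shell
#!/usr/bin/env bash
# Sketch: strip a leading '>' and turn every ':' into '_' in file names.
# Runs in a scratch directory; the sample names are made up.
dir=$(mktemp -d)
touch "$dir/>ch:sdsn-sdfs.txt" "$dir/>ab:cd:ef.txt" "$dir/untouched.txt"

for f in "$dir"/\>*; do
    [ -e "$f" ] || continue      # the glob matched nothing
    name=${f##*/}                # base name, e.g. >ch:sdsn-sdfs.txt
    new=${name#\>}               # drop the leading '>'
    new=${new//:/_}              # replace every ':' with '_'
    mv -- "$f" "$dir/$new"
done

renamed=$(ls "$dir")
echo "$renamed"
rm -rf "$dir"
```

To preview the renames first, put echo in front of mv and review the printed commands.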