grep regex multiple replacements - regex

I will probably have done it "manually" by the time I get an answer for this.
I have two variables (varA, varB) that I want to replace with (a, b) respectively. This currently requires two separate find-and-replace passes.
With regex grep I know how to search for both at once using
varA | varB
but there is no replace function that will similarly do a respective replacement.
Unless you know better? Thanks for any insight.

grep is used for searching for patterns in a given input; it does not modify text. You should use sed for text replacements. For multiple replacements in a single sed command, just use it like this:
sed -e 's/varA/foo/g' -e 's/varB/bar/g' file.txt
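As a minimal sketch of the same idea applied to the question's own names (varA to a, varB to b), the following runs both substitutions in one sed call. The sample input line is invented for illustration.

```shell
# Sample input, made up for this example.
input='x = varA + varB * varA'

# Both -e expressions are applied to every line, in the order given.
result=$(printf '%s\n' "$input" | sed -e 's/varA/a/g' -e 's/varB/b/g')
printf '%s\n' "$result"
```

Because the expressions run in sequence, the order matters whenever one replacement text could itself match a later pattern.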

Related

Grep multiple files using regex for specifying filenames to search for

Let's say I have n files with names like link123.txt, link345.txt, link645.txt, etc.
I'd like to grep a subset of these n files for a keyword. For example:
grep 'searchtext' link123.txt link345.txt ...
I'd like to do something like
grep 'searchtext' link[123\|345].txt
How can I mention the filenames as regex in this case?
You can use find and grep together like this:
find . -regex '.*/link\(123\|345\)\.txt' -exec grep 'searchtext' {} \;
Thanks for ghoti's comment.
You can use the bash option extglob, which allows extended use of globbing, including | separated pattern lists.
@(123|456)
Matches one of 123 or 456 once.
shopt -s extglob
grep 'searchtext' link@(123|345).txt
shopt -u extglob
I think you're probably asking for find functionality to search for filenames with regex.
You can easily use find . -regex '.*/link\([0-9]\{3\}\)\.txt' to list all three of these files. From there you only have to adjust the regex.
PS: Don't forget to specify .*/ at the beginning of the pattern, because -regex matches against the whole path.
It seems you don't need a regex to determine the files to grep, since you enumerate them all (or rather, you enumerate the minimal unique part without repeating the common prefix and suffix).
If regex functionality is not needed and the only aim is to avoid repeating the common prefix and suffix, then a simple loop would be an option:
for i in 123 345 645; do grep searchpattern link$i.txt; done
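A self-contained sketch of that loop follows. The temporary directory and the file contents are invented for the example, and passing /dev/null as an extra file argument is just one common way to make grep print the file name next to each match.

```shell
# Throwaway directory with made-up sample files.
workdir=$(mktemp -d)
printf 'has searchtext\n' > "$workdir/link123.txt"
printf 'searchtext too\n' > "$workdir/link345.txt"
printf 'searchtext but not selected\n' > "$workdir/link645.txt"

# Loop over the varying part of the name only. With two file
# arguments (the second being /dev/null), grep prefixes each
# match with the name of the file it came from.
matches=$(
    cd "$workdir" &&
    for i in 123 345; do
        grep 'searchtext' "link$i.txt" /dev/null
    done
)
printf '%s\n' "$matches"

# Clean up the sample files.
rm -rf "$workdir"
```

Only link123.txt and link345.txt are searched here; link645.txt is skipped because 645 is not in the loop list.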

Sed dynamic backreference replacement

I am trying to use sed for transforming wikitext into latex code. I am almost done, but I would like to automate the generation of the labels of the figures like this:
[[Image(mypicture.png)]]
... into:
\includegraphics{mypicture.png}\label{img-1}
I would like to keep using sed for this. The current regex and bash code I am using is the following:
__tex_includegraphics="\\\\includegraphics[width=0.95\\\\textwidth]{$__images_dir\/"
__tex_figure_pre="\\\\begin{figure}[H]\\\\centering$__tex_includegraphics"
__tex_figure_post="}\\\\label{img-$__images_counter}\\\\end{figure}"
sed -e "s/\[\[Image(\([^)]*\))\]\].*/$__tex_figure_pre\1$__tex_figure_post/g"\
... but I cannot get that counter to increase. Any ideas?
More generally, my question is the following: can I use a backreference in sed to create a replacement that is different for each match? That is, each time sed matches the pattern, can I use \1 as the input of a function and use the result of that function as the replacement?
I know it is a tricky question and I might have to use AWK for this. However, if somebody has a solution, I would appreciate the help.
This might work for you (GNU sed):
sed -r ':a;/'"$PATTERN"'/{x;/./s/.*/echo $((&+1))/e;/./!s/^/1/;x;G;s/'"$PATTERN"'(.*)\n(.*)/'"$PRE"'\2'"$POST"'\1/;ba}' file
This looks for a PATTERN held in a shell variable; if it is not present, the current line is printed unchanged. If the pattern is present, the counter in the hold space is incremented (or primed to 1 on the first match) and appended to the current line. The pattern is then replaced using the shell variables PRE and POST and the counter. Lastly, the current line is checked for further occurrences of the pattern, and the procedure is repeated if necessary.
You could read the file line-by-line using shell features, and use a separate sed command for each line. Something like
exec 0<input_file
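while IFS= read -r line; do
printf '%s\n' "$line" | sed -e "s/\[\[Image(\([^)]*\))\]\].*/$__tex_figure_pre\1$__tex_figure_post/g"
__images_counter=$((__images_counter + 1))
# Rebuild the suffix so the next line picks up the new counter value.
__tex_figure_post="}\\\\label{img-$__images_counter}\\\\end{figure}"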
done
(This won't work if there are multiple matches in a line, though, and the counter advances on every line, whether it matched or not.)
For the second part, my best idea is to run sed or grep to find what is being matched, and then run sed again with the value of the function of the matched text substituted into the command.
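The line-by-line approach can be sketched as a small shell function that bumps the counter only on lines containing an image tag. The function name number_images is hypothetical, the output is simplified to the \includegraphics{...}\label{img-N} form from the question (without the figure environment), and it still assumes at most one image per line.

```shell
# number_images (hypothetical name): reads wikitext on stdin and
# writes it to stdout with each [[Image(...)]] tag replaced by
# \includegraphics{...}\label{img-N}, where N counts the tags seen.
number_images() {
    counter=0
    while IFS= read -r line; do
        case $line in
            *'[[Image('*')]]'*)
                # Matching line: advance the counter, then substitute.
                counter=$((counter + 1))
                printf '%s\n' "$line" |
                    sed -e 's/\[\[Image(\([^)]*\))\]\]/\\includegraphics{\1}\\label{img-'"$counter"'}/'
                ;;
            *)
                # Non-matching line: pass through unchanged.
                printf '%s\n' "$line"
                ;;
        esac
    done
}

# Example run on invented input.
printf '%s\n' 'intro' '[[Image(a.png)]]' 'text' '[[Image(b.png)]]' | number_images
```

Starting one sed process per matching line is slow on large files, so this is a sketch of the idea rather than something tuned for speed.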

sed conditional replace of a variable

I have within a file a bunch of codenumbers that in general are of the form integer.integer. The first integer is required; the second may be empty. For example, 123.45, 12.345 and 12 are all valid codenumbers.
I want to use sed to change each of these lines into
job{123}subjob{45}
job{12}subjob{345}
job{12}
So far I have
sed -e 's/codenumber{\([0-9]*\)\.*\([0-9]*\)}/job{\1}subjob{\2}/g'
which results in
job{123}subjob{45}
job{12}subjob{345}
job{12}subjob{}
Is there a way for sed to recognise when \2 is empty and print a default value instead, say 0? The last line of the given example would then say
job{12}subjob{0}
I suppose this could be possible via two sed runs, but I am interested if it was possible with one.
You could simply extend your sed command to patch up empty subjob numbers:
sed -e 's/codenumber{\([0-9]*\)\.*\([0-9]*\)}/job{\1}subjob{\2}/g' \
-e 's/subjob{}/subjob{0}/g'
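To see the two expressions working together in a single sed run, here is a small sketch on the three sample codenumbers from the question. The variable names codes and converted are only for illustration.

```shell
# The three sample inputs from the question, one per line.
codes='codenumber{123.45}
codenumber{12.345}
codenumber{12}'

# First expression: split into job and subjob.
# Second expression: fill in 0 where the subjob came out empty.
converted=$(printf '%s\n' "$codes" |
    sed -e 's/codenumber{\([0-9]*\)\.*\([0-9]*\)}/job{\1}subjob{\2}/g' \
        -e 's/subjob{}/subjob{0}/g')
printf '%s\n' "$converted"
```

Both expressions are applied within the same sed invocation, so the file is read only once.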
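I don't think this is possible with a single substitution in sed. But you can indeed do two sed runs (they're really fast, so it shouldn't be a problem), the second one dropping the empty subjob. The braces are left unescaped, since \{ starts an interval expression in sed's basic regex:
sed -e 's/subjob{}//g'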
This might work for you (GNU sed):
sed 's/codenumber/job/;s/\./}subjob{/;/subjob/!s/$/subjob{0}/' file
I know this is a bit late and does not strictly answer the question, but if someone lands here as I did and does not mind using Perl instead, the expression is:
s/codenumber{([0-9]*)\.*([0-9]*)}/job{$1}subjob{${\($2?$2:"0")}}/g
e.g. in:
echo -e 'codenumber{123.45}\ncodenumber{12.345}\ncodenumber{12}' | perl -e 'while(<STDIN>) { print s/codenumber{([0-9]*)\.*([0-9]*)}/job{$1}subjob{${\($2?$2:"0")}}/gr;}'
So, the point is that Perl allows you to use the string interpolation ${\(EXPRESSION)} in the replacement part of the substitution as well, and so calculate the replacement from the matched value using a Perl expression.

Grep regular expression to find words in any order

Context: I want to find a class definition within a lot of source code files, but I do not know the exact name.
Question: I know a number of words which must appear on the line I want to find, but I do not know the order in which they will appear. Is there a quick way to look for a number of words in any order on the same line?
For situations where you need to search on a large number of words, you can use awk as follows:
awk "/word1/&&/word2/&&/word3/" *.c
(If you are a cygwin user, the command is gawk.)
If you're trying to find foo, bar, and baz, you can just do:
grep foo *.c | grep bar | grep baz
That will find anything that has all three in any order. You can use word boundaries if you use egrep, otherwise that will match substrings.
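A minimal sketch of the piped form with whole-word matching follows. It uses grep -w (available in GNU and BSD grep) instead of egrep word boundaries, and the sample lines are invented.

```shell
# Made-up sample lines; the last one contains foo and bar only as
# parts of the longer word "foobar".
lines='foo bar baz
baz bar foo
foo bar
foobar baz'

# Each grep keeps only the lines that contain its word as a whole
# word, so the pipeline acts as a logical AND in any order.
all_three=$(printf '%s\n' "$lines" | grep -w foo | grep -w bar | grep -w baz)
printf '%s\n' "$all_three"
```

Without -w, the line "foobar baz" would also pass all three filters, since foo and bar both occur in it as substrings.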
While this is not an exact answer to your grep question, you should check the "ctags" command for generating a tags file from the source code. For source code objects this should help you much more than a simple grep. See: http://ctags.sourceforge.net/ctags.html
Using basic regex, recursively match any of the indicated words in .c files, starting from the current directory (case insensitive; the \| alternation is a GNU extension to basic regex):
grep -r -i 'word1\|word2\|word3' ./*.c
Using standard extended regex:
grep -r -i -E 'word1|word2|word3' ./*.c
You can also use perl regex:
grep -r -i -P 'word1|word2|word3' ./*.c
If you need to search with a single grep command (for example, you are searching for multiple pattern alternatives on stdin), you could use:
grep -e 'word1.*word2' -e 'word2.*word1' -e 'alternative-word'
This would find anything which has word1 and word2 in either order, or alternative-word.
(Note that this method gets exponentially complicated as the number of words in arbitrary order increases.)

grep egrep multiple-strings

Suppose I have several strings: str1 and str2 and str3.
How to find lines that have all the strings?
How to find lines that can have any of them?
And how to find lines that have str1 and either of str2 and str3 [but not both?]?
This looks like three questions. The easiest way to put these sorts of expressions together is with multiple pipes. There's no shame in that, particularly because a regular expression (using egrep) would be ungainly since you seem to imply you want order independence.
So, in order,
grep str1 | grep str2 | grep str3
egrep '(str1|str2|str3)'
grep str1 | egrep '(str2|str3)'
You can do the "and" form in an order-independent way using egrep, but I think you'll find it easier to remember to do order-independent ands using piped greps and order-independent ors using regular expressions.
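The three pipelines can be tried on a few invented lines as follows. This sketch uses grep -E in place of egrep, which is an assumption about the environment rather than part of the answer: recent GNU grep releases print a warning when egrep is invoked.

```shell
# Made-up sample input.
sample='str1 str2 str3
str1 str2
str3 only
str1 alone
unrelated'

# All three strings, in any order.
all=$(printf '%s\n' "$sample" | grep str1 | grep str2 | grep str3)

# Any one of the strings.
any=$(printf '%s\n' "$sample" | grep -E 'str1|str2|str3')

# str1 together with str2 or str3 (or both).
one_plus=$(printf '%s\n' "$sample" | grep str1 | grep -E 'str2|str3')

printf 'all:\n%s\n\nany:\n%s\n\nstr1 plus either:\n%s\n' "$all" "$any" "$one_plus"
```

Note that the third pipeline also keeps lines that contain both str2 and str3; it does not implement the "but not both" reading of the question.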
You can't reasonably do the "all" or "this plus either of those" cases because grep doesn't support lookahead. Use Perl. For the "any" case, it's egrep '(str1|str2|str3)' file.
The unreasonable way to do the "all" case is:
egrep '(str1.*str2.*str3|str1.*str3.*str2|str2.*str1.*str3|str2.*str3.*str1|str3.*str1.*str2|str3.*str2.*str1)' file
i.e. you build out the permutations. This is, of course, a ridiculous thing to do.
For the "this plus either of those", similarly:
egrep '(str1.*(str2|str3)|(str2|str3).*str1)' file
grep -E --color "string1|string2|string3...."
For example, to find out whether a system uses an AMD (svm) or Intel (vmx) processor, and whether it is 64-bit (lm stands for long mode, which means 64-bit):
Command example:
grep -E --color "lm|svm|vmx" /proc/cpuinfo
-E is required so that | is treated as alternation between the strings.
Personally, I do this in perl rather than trying to cobble together something with grep.
For instance, for the first one:
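# Reads the files named on the command line, or stdin if none are given.
while (<>)
{
# Skip the line as soon as one required pattern is missing.
next if ! m/pattern1/;
next if ! m/pattern2/;
next if ! m/pattern3/;
print $_;
}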