Sed and grep regex syntaxes differ

This is a part of my shell script, which I use to perform a recursive find and replace in the working directory. Backup and other utilities are in other functions, which are irrelevant to my problem.
#!/bin/bash
# backup function goes here
# @param $1 The find pattern.
# @param $2 The replace pattern.
function findAndReplace {
bufferFile=/tmp/tmp.$$
filesToReplace=`find . -type f | grep -vi cvs | grep -v '#'`
sedPattern="s/$1/$2/g"
echo "Using pattern $sedPattern"
for f in $filesToReplace; do
echo "sedding file $f"
sed "$sedPattern" "$f" > "$bufferFile"
exitCode=$?
if [ $exitCode -ne 0 ] ; then
echo "sed $sedPattern exited with $exitCode"
exit 1
fi
chown --reference=$f $bufferFile
mv $bufferFile $f
done
}
backup
findAndReplace "$1" "$2"
Here's a sample usage: recursive-replace.sh "function _report" "function report".
It works, but there is one problem. It uses sed on ALL files in the working directory. I would like to sed only those files, which contain the find pattern.
Then, I modified the line:
filesToReplace=`find . -type f | grep -vi cvs | grep -v '#'`
to:
filesToReplace=`grep -rl "$1" . | grep -vi cvs | grep -v '#'`
And it works too, but not for all find patterns. E.g. for the pattern \$this->report\((.*)\) I receive the error: grep: Unmatched ( or \(. This pattern is correct for sed, but not for grep.
Regex syntaxes for grep and sed differ. What can I do?

Use grep -E (the "extended" regexp option); it usually solves the problem.
(It is also sometimes available as egrep.)
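To make the difference concrete, here is a small sketch of how the two dialects treat parentheses. The sample line is made up for illustration; the behaviour shown is that of a POSIX-style grep, where ( is literal in the default (basic) syntax and a grouping operator under -E.

```shell
#!/bin/bash
# Hypothetical sample line, only for illustration.
sample='$this->report(42)'

# BRE (default): ( and ) are literal characters; \( and \) would open/close a group.
bre=$(printf '%s\n' "$sample" | grep -c 'report(.*)')

# ERE (-E): ( and ) open/close a group; \( and \) are the literal characters.
ere=$(printf '%s\n' "$sample" | grep -Ec 'report\((.*)\)')

echo "BRE matches: $bre, ERE matches: $ere"
```

So the same pattern string can mean two different things depending on whether the tool is reading it as basic or extended syntax; giving both grep and sed the -E flag keeps them in the same dialect.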
Also, why not keep using find?
filesToReplace=`find . -name CVS -prune -o -type f -exec grep -l "$1" {} \; | grep -v '#'`
Also note the -i option of sed, which allows in-place changes in files and the removal of the bufferFile/chown/mv logic.
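Putting those suggestions together, a rough sketch might look like the following. It assumes GNU grep and GNU sed (-Z and --exclude-dir are GNU grep options, and -i without a suffix argument is GNU sed behaviour); the function name and the optional directory argument are my own, not from the question.

```shell
#!/bin/bash
# Sketch only: edit just the files that contain the pattern, in place.
# Assumes GNU grep and GNU sed. Both get -E so they read the pattern as ERE.
# Caveat: a "/" inside the pattern or replacement would break the s/// command.
find_and_replace() {
    local pattern=$1 replacement=$2 dir=${3:-.}
    # -r recurse, -l list matching files, -Z NUL-terminate the names
    grep -rlZE --exclude-dir=CVS -e "$pattern" "$dir" |
        while IFS= read -r -d '' f; do
            sed -E -i "s/$pattern/$replacement/g" "$f"
        done
}
```

Usage would be something like find_and_replace "function _report" "function report". Because sed -i rewrites the file itself, the bufferFile/chown/mv steps are no longer needed in this variant.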

Why not compare source and buffer files before overwriting the source file:
#!/bin/bash
# backup function goes here
# @param $1 The find pattern.
# @param $2 The replace pattern.
function findAndReplace {
bufferFile=/tmp/tmp.$$
filesToReplace=`find . -type f | grep -vi cvs | grep -v '#'`
sedPattern="s/$1/$2/g"
echo "Using pattern $sedPattern"
for f in $filesToReplace; do
echo "sedding file $f"
sed "$sedPattern" "$f" > "$bufferFile"
exitCode=$?
if [ $exitCode -ne 0 ] ; then
echo "sed $sedPattern exited with $exitCode"
exit 1
fi
cmp -s $f $bufferFile
if [ $? -ne 0 ]; then
chown --reference=$f $bufferFile
mv $bufferFile $f
fi
done
}
backup
findAndReplace "$1" "$2"

Related

SED: replace semvers of multiple files

[context] My script needs to replace the semvers in multiple .car file names with the commit sha. In short, I would like every dev_CA_1.0.0.car to become dev_CA_6a8zt5d832.car
ADDING commit sha right before .car was pretty trivial. With this, I end up with dev_CA_1.0.0_6a8zt5d832.car
find . -depth -name "*.car" -exec sh -c 'f="{}"; \
mv -- "$f" $(echo $f | sed -e 's/.car/_${CI_COMMIT_SHORT_SHA}.car/g')' \;
But I find it incredibly difficult to REPLACE. What aspect of sed am I misconceiving trying this:
find . -depth -name "*.car" -exec sh -c 'f="{}"; \
mv -- "$f" $(echo $f | sed -r -E 's/[0-9\.]+.car/${CI_COMMIT_SHORT_SHA}.car/g')
or this
find . -depth -name "*.car" -exec sh -c 'f="{}"; \
mv -- "$f" $(echo $f | sed -r -E 's/^(.*_)[0-9\.]+\.car/\1${CI_COMMIT_SHORT_SHA}\.car/g')' \;
no matches found: f="{}"; mv -- "$f" $(echo $f | sed -r -E ^(.*_)[0-9.]+.car/1684981321531.car/g)
or multiple variants:
\ escaping (e.g. \\.)
( and ) escaping (e.g. \() (I read somewhere that regex grouping with sed requires some care with ())
Is there a more direct way to do it?
Edit
The $f values passed to sed are paths that look like
./somewhere/some_project_CA_1.2.3.car
./somewhere_else/other_project_CE_9.2.3.car
You may use
sed "s/_[0-9.]\{1,\}\.car$/_${CI_COMMIT_SHORT_SHA}.car/g"
Here, sed is used with a POSIX BRE expression (the default syntax, in which the interval is written \{1,\}) that matches
_ - an underscore
[0-9.]\{1,\} - 1 or more digits or dots
\.car - .car (note that a literal . must be escaped! a . pattern matches any char)
$ - end of string.
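For comparison, here is the same substitution in both syntaxes, run against the two sample paths from the question. The function names are just for illustration, and the sha value is the example one from the question.

```shell
#!/bin/bash
# Sample value from the question; in CI this would already be set.
CI_COMMIT_SHORT_SHA=6a8zt5d832

# BRE (sed default): "one or more" is written \{1,\}
swap_version_bre() {
    sed "s/_[0-9.]\{1,\}\.car$/_${CI_COMMIT_SHORT_SHA}.car/"
}

# ERE (sed -E): the same quantifier can be written +
swap_version_ere() {
    sed -E "s/_[0-9.]+\.car$/_${CI_COMMIT_SHORT_SHA}.car/"
}

echo ./somewhere/some_project_CA_1.2.3.car | swap_version_bre
echo ./somewhere_else/other_project_CE_9.2.3.car | swap_version_ere
```

Note the double quotes around the sed script: inside single quotes, as in the attempts above, ${CI_COMMIT_SHORT_SHA} is not expanded by the shell that builds the sed command.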
Can you try this :
export CI_COMMIT_SHORT_SHA=6a8zt5d832
find . -depth -name "*.car" -exec sh -c \
'f="{}"; echo mv "$f" "${f%_*}_${CI_COMMIT_SHORT_SHA}.car"' \;
Remove echo once you are satisfied of the result.
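A slightly more defensive sketch of the same idea passes the file names to sh as positional arguments instead of pasting {} into the script text, which avoids trouble with unusual file names. ${f%_*} removes the shortest suffix that starts with an underscore, i.e. everything from the last _ onwards. The function name and the directory argument are assumptions of mine.

```shell
#!/bin/bash
# The variable must be exported so the sh started by find can see it.
# The fallback value is the sample sha from the question.
export CI_COMMIT_SHORT_SHA=${CI_COMMIT_SHORT_SHA:-6a8zt5d832}

rename_cars() {
    local dir=${1:-.}
    # {} + hands many file names to one sh; "for f do" loops over them.
    find "$dir" -depth -type f -name '*.car' -exec sh -c '
        for f do
            # Strip from the last "_" to the end, then append the sha.
            mv -- "$f" "${f%_*}_${CI_COMMIT_SHORT_SHA}.car"
        done' sh {} +
}
```

This relies on every .car file name containing an underscore before the version, as in the examples given; a name without one would need separate handling.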

Bash check if file matching regex exists and assign filename to variable

I'm looking for a fast, short and portable way to check if a file matching the regex (env(ironment)?|requirements).ya?ml exists in the current working directory and if so assign its basename to a variable $FILE for further processing.
Basically, I'd like to combine getting the handle in
for FILE in environment.yml env.yml requirements.yml environment.yaml env.yaml requirements.yaml; do
if [ -e $FILE ]; then
...
fi
done
with using a regex as in
if test -n "$(find -E . -maxdepth 1 -regex '.*(env(ironment)?|requirements).ya?ml' -print -quit)"
then
...
fi
Stick it in a variable:
file="$(find -E . -maxdepth 1 -regex '.*(env(ironment)?|requirements).ya?ml' -print -quit)"
if [ -n "$file" ]
then
echo "I found $file"
else
echo "No such file."
fi
Alternatively, you can keep your loop and shorten it using brace expansion:
for file in {env{,ironment},requirements}.{yml,yaml}
do
if [ -e "$file" ]
then
echo "Found $file"
else
echo "There is no $file"
fi
done
or match files directly using bash's extglob:
shopt -s extglob nullglob
for file in @(env?(ironment)|requirements).y?(a)ml
do
echo "Found $file"
done
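If you would rather keep an actual regex, bash's own [[ =~ ]] operator can do the matching without calling find at all. This is only a sketch: the function name find_config is made up, and it assumes bash (not plain sh). The dot is escaped here, which the regex in the question did not do.

```shell
#!/bin/bash
# Print the basename of the first file in a directory whose name matches
# the regex, and return non-zero if there is none.
find_config() {
    local dir=${1:-.} path name
    local re='^(env(ironment)?|requirements)\.ya?ml$'
    for path in "$dir"/*; do
        name=${path##*/}              # strip the directory part
        if [[ -f $path && $name =~ $re ]]; then
            printf '%s\n' "$name"
            return 0
        fi
    done
    return 1
}
```

It could then be used as: if FILE=$(find_config); then echo "I found $FILE"; fi. When several candidates exist, the one that sorts first in glob order wins.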

need to rename many files in directory using sed and find

I would like to rename all files named *-6.0.dll to *-6.1.dll
I tried:
find . -name '*-6.0.dll*' -exec mv {} $(echo {} | sed -e 's/-6.0.dll/-6.1.dll/g') \;
but this didn't work; the file names didn't change.
Any ideas?
for x in *-6.0.dll; do y=$(echo $x | sed -e 's/-6\.0\.dll$/-6.1.dll/'); echo mv $x $y; done
Remove the echo once you are satisfied the results are correct.
use this:
find . -name '*-6.0.dll*' -exec sh -c 'mv -- "$1" "$(printf "%s\n" "$1" | sed -e "s/-6\.0\.dll/-6.1.dll/")"' sh {} \;
an explanation of using the sh -c vs mv can be found here http://linuxplayer.org/2010/05/shell-programming-trap-batch-rename-with-find
I also modified your regex, some of the characters need to be escaped for proper matching.
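The reason the original command did nothing is that $(echo {} | sed ...) is expanded by your shell before find ever runs, so sed only sees the literal text {} and find ends up running mv {} {}. As a sketch of an alternative that avoids sed entirely, shell parameter expansion can cut off the old suffix. The function name and directory argument here are my own.

```shell
#!/bin/bash
rename_dlls() {
    local dir=${1:-.}
    # The inner sh receives the file names as arguments, so the
    # expansion happens per file, after find has found it.
    find "$dir" -type f -name '*-6.0.dll' -exec sh -c '
        for f do
            # ${f%-6.0.dll} removes the literal suffix "-6.0.dll".
            mv -- "$f" "${f%-6.0.dll}-6.1.dll"
        done' sh {} +
}
```

As with the other answers, it is worth putting echo in front of mv for a dry run first.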

How to match once per file in grep?

Is there any grep option that lets me control the total number of matches but stops at the first match in each file?
Example:
If I do this grep -ri --include '*.coffee' 're' . I get this:
./app.coffee:express = require 'express'
./app.coffee:passport = require 'passport'
./app.coffee:BrowserIDStrategy = require('passport-browserid').Strategy
./app.coffee:app = express()
./config.coffee: session_secret: 'nyan cat'
And if I do grep -ri -m2 --include '*.coffee' 're' ., I get this:
./app.coffee:config = require './config'
./app.coffee:passport = require 'passport'
But, what I really want is this output:
./app.coffee:express = require 'express'
./config.coffee: session_secret: 'nyan cat'
Doing -m1 does not work as I get this for grep -ri -m1 --include '*.coffee' 're' .
./app.coffee:express = require 'express'
Tried not using grep e.g. this find . -name '*.coffee' -exec awk '/re/ {print;exit}' {} \; produced:
config = require './config'
session_secret: 'nyan cat'
UPDATE: As noted below the GNU grep -m option treats counts per file whereas -m for BSD grep treats it as global match count
So, using grep, you just need the option -l, --files-with-matches.
All those answers about find, awk or shell scripts are away from the question.
I think you can just do something like
grep -ri -m1 --include '*.coffee' 're' . | head -n 2
to e.g. pick the first match from each file, and pick at most two matches total.
Note that this requires your grep to treat -m as a per-file match limit; GNU grep does do this, but BSD grep apparently treats it as a global match limit.
I would do this in awk instead.
find . -name \*.coffee -exec awk '/re/ {print FILENAME ":" $0;exit}' {} \;
If you didn't need to recurse, you could just do it with awk:
awk '/re/ {print FILENAME ":" $0;nextfile}' *.coffee
Or, if you're using a current enough bash, you can use globstar:
shopt -s globstar
awk '/re/ {print FILENAME ":" $0;nextfile}' **/*.coffee
using find and xargs.
Find every .coffee file and run grep -m1 on each of them:
find . -name '*.coffee' -print0 | xargs -0 grep -m1 -i 're'
test
without -m1
linux# find . -name '*.txt'|xargs grep -ri 'oyss'
./test1.txt:oyss
./test1.txt:oyss1
./test1.txt:oyss2
./test2.txt:oyss1
./test2.txt:oyss2
./test2.txt:oyss3
add -m1
linux# find . -name '*.txt'|xargs grep -m1 -ri 'oyss'
./test1.txt:oyss
./test2.txt:oyss1
find . -name \*.coffee -exec grep -m1 -i 're' {} \;
find's -exec option runs the command once for each matched file (unless you use + instead of \;, which makes it act like xargs).
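Since grep is then started once per file, it makes no difference whether your grep counts -m per file or globally. A sketch of that, with -H added so the file name is still printed even though grep only sees one file at a time (the function name is made up):

```shell
#!/bin/bash
# Print the first matching line of every .coffee file under a directory.
first_match_per_file() {
    local pattern=$1 dir=${2:-.}
    # \; runs one grep per file, so -m1 means "first match in this file"
    # on both GNU and BSD grep. -H forces the "file:" prefix.
    find "$dir" -type f -name '*.coffee' \
        -exec grep -H -i -m1 -e "$pattern" {} \;
}
```

To also cap the total number of lines, the output can be piped through head, e.g. first_match_per_file re . | head -n 2.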
You can do this easily in perl, and no messy cross platform issues!
use strict;
use warnings;
use autodie;
my $match = shift;
# Compile the match so it will run faster
my $match_re = qr{$match};
FILES: for my $file (@ARGV) {
open my $fh, "<", $file;
FILE: while(my $line = <$fh>) {
chomp $line;
if( $line =~ $match_re ) {
print "$file: $line\n";
last FILE;
}
}
}
The only difference is you have to use Perl style regular expressions instead of GNU style. They're not much different.
You can do the recursive part in Perl using File::Find, or use find to feed it files.
find /some/path -name '*.coffee' -print0 | xargs -0 perl /path/to/your/program

how can I make this sed capture accomplish a more complex substitution

When fixing mass spelling errors in my code base I have used this:
find . -path '*/.svn' -prune -o -name "*min.js" -prune -o -name "*min.css" -prune -o -name "flashLocaleXml.xml" -prune -o -type f -print0 | xargs -0 egrep -n "priority=" -exec sed -i 's/replace/newval/' {} \;
to fix a specific spelling error in all the files in my repo.
however, i am not very good with sed captures, i want to do something like:
X.addEventListener(LevelUpEvent.GENERIC_LEVEL_UP, updateLevels);
becomes:
EventUtil.addEventListener(X, LevelUpEvent.GENERIC_LEVEL_UP, updateLevels);
I have read up extensively but I would appreciate someone explaining how sed captures work with this as a specific example.
I have given it a few shots, but nothing I come up with works: here are my tries
echo "X.addEventListener(LevelUpEvent.GENERIC_LEVEL_UP, updateLevels);" | sed 's/\(.*\)EventUtil\(.*EventUtil\)/\1X\2/'
echo "X.addEventListener(LevelUpEvent.GENERIC_LEVEL_UP, updateLevels);" | sed -r 's/(....),(....),(*\.addEventListener)(LevelUpEvent.*)/\1,\2\n\1,\2,/g'
echo "X.addEventListener(LevelUpEvent.GENERIC_LEVEL_UP, updateLevels);" | sed 's/\([\.$]*\) \([\.$]*\)/\2 \1/'
thank you in advance!
Try with:
sed 's/\([^.]*\)\([^(]*(\)/EventUtil\2\1, /'
Output:
EventUtil.addEventListener(X, LevelUpEvent.GENERIC_LEVEL_UP, updateLevels);
Explanation:
\([^.]*\) # Content until first '.'
\([^(]*(\) # Content until first '('
EventUtil\2\1, # Literal 'EventUtil' plus grouped content in previous expression.
This sed command will do.
sed 's/\(X\).\(addEventListener\)(\(LevelUpEvent.GENERIC_LEVEL_UP, updateLevels\));/EventUtil.\2(\1, \3);/'
Example
$ echo "X.addEventListener(LevelUpEvent.GENERIC_LEVEL_UP, updateLevels);" | sed 's/\(X\).\(addEventListener\)(\(LevelUpEvent.GENERIC_LEVEL_UP, updateLevels\));/EventUtil.\2(\1, \3);/'
EventUtil.addEventListener(X, LevelUpEvent.GENERIC_LEVEL_UP, updateLevels);
Try this
echo "X.addEventListener(LevelUpEvent.GENERIC_LEVEL_UP, updateLevels);" | sed -e "s/\([A-Za-z]\{1,\}\)\.addEventListener(/EventUtil.addEventListener(\1, /"
This regexp will recognize a variable name using
\([A-Za-z]\{1,\}\)
Then .addEventListener(
\.addEventListener(
And replace it with
EventUtil.addEventListener(\1
In which \1 represents the variable name
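The same substitution can also be written with sed -E, where the capture group uses plain ( ) and the literal parenthesis is the one that gets escaped. This is a sketch: the function name is made up, and the identifier pattern (letters, digits and underscores, not starting with a digit) is my assumption about what X can look like.

```shell
#!/bin/bash
# Rewrite "X.addEventListener(" as "EventUtil.addEventListener(X, ".
wrap_listener() {
    # ([A-Za-z_][A-Za-z0-9_]*)  group 1: the object name
    # \.addEventListener\(      the literal ".addEventListener("
    # \1 in the replacement puts the captured name back in.
    sed -E 's/([A-Za-z_][A-Za-z0-9_]*)\.addEventListener\(/EventUtil.addEventListener(\1, /'
}

echo "X.addEventListener(LevelUpEvent.GENERIC_LEVEL_UP, updateLevels);" | wrap_listener
```

Everything after the opening parenthesis is left untouched, so the original arguments simply follow the inserted name.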