How to match once per file in grep?


Is there any grep option that lets me limit the total number of matches but stop at the first match in each file?
Example:
If I do this grep -ri --include '*.coffee' 're' . I get this:
./app.coffee:express = require 'express'
./app.coffee:passport = require 'passport'
./app.coffee:BrowserIDStrategy = require('passport-browserid').Strategy
./app.coffee:app = express()
./config.coffee: session_secret: 'nyan cat'
And if I do grep -ri -m2 --include '*.coffee' 're' ., I get this:
./app.coffee:config = require './config'
./app.coffee:passport = require 'passport'
But, what I really want is this output:
./app.coffee:express = require 'express'
./config.coffee: session_secret: 'nyan cat'
Doing -m1 does not work as I get this for grep -ri -m1 --include '*.coffee' 're' .
./app.coffee:express = require 'express'
Tried not using grep e.g. this find . -name '*.coffee' -exec awk '/re/ {print;exit}' {} \; produced:
config = require './config'
session_secret: 'nyan cat'
UPDATE: As noted below, GNU grep's -m option counts matches per file, whereas BSD grep's -m treats it as a global match count.

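So, using grep, you just need the option -l (--files-with-matches), which stops at the first match in each file and prints only the file name, not the matching line.
All those answers about find, awk or shell scripts miss the point of the question.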

I think you can just do something like
grep -ri -m1 --include '*.coffee' 're' . | head -n 2
to e.g. pick the first match from each file, and pick at most two matches total.
Note that this requires your grep to treat -m as a per-file match limit; GNU grep does do this, but BSD grep apparently treats it as a global match limit.

I would do this in awk instead.
find . -name \*.coffee -exec awk '/re/ {print FILENAME ":" $0;exit}' {} \;
If you didn't need to recurse, you could just do it with awk:
awk '/re/ {print FILENAME ":" $0;nextfile}' *.coffee
Or, if you're using a current enough bash, you can use globstar:
shopt -s globstar
awk '/re/ {print FILENAME ":" $0;nextfile}' **/*.coffee
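If your awk does not support nextfile, a per-file flag gives the same output. The following is a minimal sketch, not part of the original answer: the function name first_match_per_file and the sample files are made up for illustration, and it assumes mktemp -d is available.

```shell
# Portable variant: print the first line matching /re/ in each file,
# prefixed with the file name, without relying on nextfile.
first_match_per_file() {
    awk 'FNR == 1 { found = 0 }   # first line of a new file: reset the flag
         !found && /re/ { print FILENAME ":" $0; found = 1 }' "$@"
}

# Hypothetical sample files modelled on the output shown in the question.
demo_dir=$(mktemp -d)
printf '%s\n' "express = require 'express'" "passport = require 'passport'" > "$demo_dir/app.coffee"
printf '%s\n' "port: 3000" " session_secret: 'nyan cat'" > "$demo_dir/config.coffee"

(cd "$demo_dir" && first_match_per_file *.coffee)
```

Unlike nextfile, this version keeps reading each file to the end after the first match, so it is slower on large files but should run on any POSIX awk.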

Using find and xargs:
find every .coffee file and run grep -m1 on each of them.
find . -name '*.coffee' -print0 | xargs -0 grep -m1 -i 're'
Test without -m1:
linux# find . -name '*.txt'|xargs grep -ri 'oyss'
./test1.txt:oyss
./test1.txt:oyss1
./test1.txt:oyss2
./test2.txt:oyss1
./test2.txt:oyss2
./test2.txt:oyss3
With -m1 added:
linux# find . -name '*.txt'|xargs grep -m1 -ri 'oyss'
./test1.txt:oyss
./test2.txt:oyss1

find . -name \*.coffee -exec grep -m1 -i 're' {} \;
find's -exec option runs the command once for each matched file (unless you use + instead of \;, which makes it act like xargs).
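Because grep is given a single file on each run, it leaves out the file name prefix that the question's expected output shows. A possible sketch that adds -H to restore it (the function name first_match_with_names and the sample files are assumptions for illustration; sort is only there to make the output order predictable):

```shell
# Hypothetical sample tree with one .coffee file in a subdirectory.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/sub"
printf '%s\n' "express = require 'express'" "passport = require 'passport'" > "$demo_dir/app.coffee"
printf '%s\n' "port: 3000" "session_secret: 'nyan cat'" > "$demo_dir/sub/config.coffee"

first_match_with_names() {
    # -H forces the "file:" prefix, which grep omits when given one file.
    # \; runs grep once per file, so -m1 applies to each file separately.
    (cd "$1" && find . -name '*.coffee' -exec grep -H -m1 -i 're' {} \; | sort)
}

first_match_with_names "$demo_dir"
```

Since each grep process sees exactly one file, this should behave the same whether -m is implemented as a per-file or a global limit.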

You can do this easily in perl, and no messy cross platform issues!
use strict;
use warnings;
use autodie;
my $match = shift;
# Compile the match so it will run faster
my $match_re = qr{$match};
FILES: for my $file (@ARGV) {
    open my $fh, "<", $file;
    FILE: while (my $line = <$fh>) {
        chomp $line;
        if ($line =~ $match_re) {
            print "$file: $line\n";
            last FILE;
        }
    }
}
The only difference is that you have to use Perl-style regular expressions instead of GNU-style ones. They're not much different.
You can do the recursive part in Perl using File::Find, or use find to feed it files.
find /some/path -name '*.coffee' -print0 | xargs -0 perl /path/to/your/program

Related

Nesting Find and sed command in if else

I am using find and sed to replace characters in a file. See code 1 below:
find . -type f -exec sed -i '/Subject/{:a;s/(Subject.*)Subject/\1SecondSubject/;tb;N;ba;:b}' {} +
I have multiple files to process, and in some of them the Subject I am trying to replace is not present.
Is there a way to first check whether the file contains 'Subject' and, if not, execute another command? I.e.:
Check if the file contains the string 'Subject'
If it does, execute code 1 above
If there is no instance of Subject, execute code 2 below
find . -name "*.html" -exec rename 's/.html$/.xml/' {} \;
Any ideas? Thanks in advance.

Something like this should work.
find . -type f \( \
-exec grep -q "Subject" {} \; \
-exec sed -i '/Subject/{:a;s/(Subject.*)Subject/\1SecondSubject/;tb;N;ba;:b}' {} \; \
-o \
-exec rename 's/.html$/.xml/' {} \; \)
-exec evaluates to true only when the command it runs exits with status 0, so -exec grep -q "Subject" {} \; is true only if grep finds a match. And since -o (or) short-circuits and has lower precedence than the implied -a (and) between the other operators, the rename is executed only if the grep (or the sed after it) fails.
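To see the short-circuit behaviour without modifying any files, here is a small sketch that swaps the sed and rename commands for printf. The function name classify_files and the sample files are hypothetical.

```shell
# Two hypothetical files: one contains "Subject", the other does not.
demo_dir=$(mktemp -d)
printf 'Subject: hello\n' > "$demo_dir/a.txt"
printf 'no match here\n' > "$demo_dir/b.txt"

classify_files() {
    # grep -q acts as the test; the second -exec runs only when it succeeds,
    # and the branch after -o runs only when the left-hand side is false.
    (cd "$1" && find . -type f \( \
        -exec grep -q 'Subject' {} \; \
        -exec printf 'has Subject: %s\n' {} \; \
        -o \
        -exec printf 'no Subject: %s\n' {} \; \) | sort)
}

classify_files "$demo_dir"
```

Here printf always succeeds, so exactly one branch runs per file. With a command that can fail in the first branch, the fallback branch would also run for that file.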

You can use find in a process substitution like this:
while IFS= read -r -d '' file; do
    echo "processing $file ..."
    # check for the plain string, then apply the substitution from code 1
    if grep -q "Subject" "$file"; then
        sed -i '/Subject/{:a;s/(Subject.*)Subject/\1SecondSubject/;tb;N;ba;:b}' "$file"
    elif [[ $file == *.html ]]; then
        rename 's/.html$/.xml/' "$file"
    fi
done < <(find . -type f -print0)

How to recursively change files in directories whose name matches a string in Perl?

I have many directories for different projects. Under some project directories, there are subdirectories named "matlab_programs". In only subdirectories named matlab_programs, I would like to replace the string 'red' with 'blue' in files ending with *.m.
The following perl code will recursively replace the strings in all *.m files, regardless of what subdirectories the files are in.
find . -name "*.m" | xargs perl -p -i -e "s/red/blue/g"
And to find the full paths of all directories called matlab_programs,
find . -type d -name "matlab_programs"
How can I combine these so I only replace strings if the files are in a subdirectory called matlab_programs?

Perl has the excellent File::Find module, that lets you specify a callback to be called on each file.
So you can specify complex compound criteria, like this:
#!/usr/bin/env perl
use strict;
use warnings;
use File::Find;
sub find_files {
    return unless m/\.m\z/; # skip any files that don't end in .m
    if ( $File::Find::dir =~ m/matlab_programs$/ ) {
        print $File::Find::name, " found\n";
    }
}
find( \&find_files, "." );
And then you can do whatever you wish with the files you find - like opening/text replacing and closing.

You want to find all directories named matlab_programs using
find . -type d -name "matlab_programs"
and then execute
find $f -name "*.m" | xargs perl -p -i -e "s/red/blue/g"
on all results $f. Judging by your use of xargs, there are no special characters such as spaces in your file names, so the following should work:
find `find . -type d -name "matlab_programs"` -name "*.m" |
xargs perl -p -i -e "s/red/blue/g"
or
find . -type d -name "matlab_programs" |
while read f
do
    find "$f" -name "*.m"
done |
xargs perl -p -i -e "s/red/blue/g"
Incidentally, I'd use single quotes here; I always use them whenever the quoted string is to be taken literally.

Do you have bash? The $(...) syntax works like backticks (the way both the shell and Perl use them) but they can be nested.
perl -pi -e s/red/blue/g $(find $(find . -type d -name matlab_programs) -type f -name \*.m)
Many flavors of find also support a -path pattern test, so you can just combine your filename conditions into that argument
perl -pi -e s/red/blue/g $(find . -type f -path \*/matlab_programs/\*.m)
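Before running the replacement, it may be worth checking which files the -path pattern actually selects. A minimal sketch with a made-up directory tree (the function name list_matlab_files and all file names are assumptions):

```shell
# Hypothetical layout: only proj1 has a matlab_programs directory.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/proj1/matlab_programs/util" "$demo_dir/proj2/src"
touch "$demo_dir/proj1/matlab_programs/plot.m" \
      "$demo_dir/proj1/matlab_programs/util/helper.m" \
      "$demo_dir/proj1/matlab_programs/notes.txt" \
      "$demo_dir/proj2/src/other.m"

list_matlab_files() {
    # In a -path pattern, * also matches "/", so .m files in directories
    # nested below matlab_programs are selected as well.
    (cd "$1" && find . -type f -path '*/matlab_programs/*.m' | sort)
}

list_matlab_files "$demo_dir"
```

In this tree the listing contains plot.m and util/helper.m but neither notes.txt nor proj2/src/other.m.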

Regular expression search and replace across whole directory in Linux terminal

In my PHP project, I have a PHP function that does some language stuff and is being called as:
<?php echo __('STRING'); ?>
I would like to switch from the consistent usage of uppercase string indexes to a consistent usage of lowercase string indexes, so I would like to replace all these occurrences:
__('SOMETHING')
With:
__('something')
What would be the command to do this?
I have a command ready for easy search & replace functionality, but I don't know how to write the regex.
find . -name "*.php" -print | xargs sed -i 's/search/replace/g'

You can use -print0 with xargs -0:
find . -name "*.php" -print0 | xargs -0 -I {} sed -i.bak 's/search/replace/g' {}
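The command above still needs an actual expression in place of search/replace. One possible sketch, assuming GNU sed (the \L and \E case-conversion escapes are a GNU extension) and assuming the keys consist only of uppercase letters, digits and underscores; the function name lowercase_keys is made up for this example:

```shell
lowercase_keys() {
    # \( ... \) captures the uppercase key; \L ... \E (GNU sed only)
    # lowercases the captured text in the replacement.
    sed "s/__('\([A-Z0-9_]*\)')/__('\L\1\E')/g"
}

# Try it on a sample line before touching any files.
printf '%s\n' "<?php echo __('SOME_STRING'); ?>" | lowercase_keys
```

Keys containing any lowercase letter are left alone because the bracket expression does not match them. To apply the same expression to the project, it could be dropped into the find/xargs command in place of 's/search/replace/g', keeping -i.bak so that backups are written.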

Use strtolower function.
Example:-
<?php
echo strtolower("Hello WORLD.");
?>
Result:-
hello world.

search and replace files in linux(sed)

I'm trying to search and replace the following:
<?php
<!DOCTYPE HTML>
with
<!DOCTYPE HTML>
so far I have tried this:
find . \( -name "*.php" \) -exec grep -Hn "<?php <\!DOCTYPE HTML>" {} \; -exec sed -i 's/<?php <\!DOCTYPE HTML>/<\!DOCTYPE HTML>/g' {} \;
But it's not finding any instances of files with my needle string which exists on my server.

find . -name "*.php" -exec grep -lZz '^<?php[[:space:]]\+<!DOCTYPE HTML>' {} + |
xargs -r0 sed -i '/^<?php[[:space:]]*$/,1d'
Edit: The previous version didn't work due to the character \n in the pattern. The updated version avoids this character.

With GNU awk (for RS='\0' to read the whole file as one record) and assuming your file names don't contain newlines all you need is the clear, simple:
find . -name '*.php' -print |
while IFS= read -r file; do
gawk -v RS='\0' '{gsub(/<\?php\n<!DOCTYPE HTML>/,"<!DOCTYPE HTML>"); print}' "$file" > tmp &&
mv tmp "$file"
done

Command line perl regex to find dangling Javascript commas

Hello I'm seeking a Perl one-liner if possible, to scan all of our Javascript files, to find so-called "rogue commas". That is, commas that come at the end of an array or object data structure, and therefore commas that come immediately before either an ']' or '}' character.
The main challenge I'm encountering is how to make the regex that checks for ] or } non-greedy. The regex needs to span multiple lines, since the comma could end one line, followed by the } or ] on the next line, but I've figured out how to do that with the help of the book Minimal Perl.
Also, I'd like to be able to pipe a number of files to this Perl regex (via find/xargs), and so I'd like to print the name of the input file, and the line number within that file.
Below are various attempts of mine that are not particularly close to working straight from my bash history. Thanks in advance:
find winhome/workspace/SsuExt4Zoura/quotetool/js -name "*.js" | xargs perl -00 -wnl -e '/,\s+$/ and print $_;'
find winhome/workspace/SsuExt4Zoura/quotetool/js -name "*.js" | xargs perl -00 -wnl -e '/,\s+/ and print $_;'
find winhome/workspace/SsuExt4Zoura/quotetool/js -name "*.js" | xargs perl -00 -wnl -e '/,\s+\]/ and print $_;'
find winhome/workspace/SsuExt4Zoura/quotetool/js -name "*.js" | xargs perl -00 -wnl -e '/,\s+[\]\}]/ and print $_;'
find winhome/workspace/SsuExt4Zoura/quotetool/js -name "*.js" | xargs perl -00 -wnl -e '/,\s+[\]\}]/ and print $_;' | wc -l
find winhome/workspace/SsuExt4Zoura/quotetool/js -name "*.js" | xargs perl -00 -wnl -e '/,\s+[\]\}]/ and print $_;' | wc -l
find winhome/workspace/SsuExt4Zoura/quotetool/js -name "*.js" | xargs perl -00 -wnl -e '/,\s+}/ and print $_;' | wc -l
find winhome/workspace/SsuExt4Zoura/quotetool/js -name "*.js" | xargs perl -00 -wnl -e '/,\s+}?/ and print $_;' | wc -l
find winhome/workspace/SsuExt4Zoura/quotetool/js -name "*.js" | xargs perl -00 -wnl -e '/,\s+}+?/ and print $_;' | wc -l
find winhome/workspace/SsuExt4Zoura/quotetool/js -name "*.js" | xargs perl -00 -wnl -e '/,$/' and print $_;'
find winhome/workspace/SsuExt4Zoura/quotetool/js -name "*.js" | xargs perl -00 -wnl -e '/,$/ and print $_;'
find winhome/workspace/SsuExt4Zoura/quotetool/js -name "*.js" | xargs perl -00 -wnl -e '/\,$/ and print $_;'

With the -00 switch, you change the record separator to paragraph mode, so each record is a block of lines up to the next blank line (use -0777 if you want the whole file as a single record). That allows you to find multi-line trailing commas. However, it also makes print $_ print the whole record. What you probably want is to print the file name:
print $ARGV if /,\s*[\]\}]/;

Most of these look like a decent approach to the problem, with one small issue. You probably want ,\s*(?:$|[\]\}]) rather than ,\s+(?:$|[\]\}]) as there may not be even one space. Your + quantifier might miss forms like ,].
Having said that, JavaScript can be pretty subtle, and you might well encounter comments and other stuff, which might legitimately end with a comma before something unexpected, like the end of the file or a }. A cheap solution might be to use a perl s/// form to simply remove all the comments before applying your tests.
If you're handling JSON, JSON::XS can enforce validity with its relaxed option.
If you need real validation, something like JSLint is probably the way to go. I've had a lot of success with using Rhino to embed JavaScript (a bit less using Perl with SpiderMonkey) and using this as a set of tests against JavaScript code would be a nice way to ensure reliability over time.

An easy solution to this problem is to use comma-first style. Since commas never come at the end of a line, there is never a 'trailing comma'.
For example:
var myObj = { foo: 1
, bar: 2
, baz: 4
}
You can easily detect if a comma is missing, it's obvious which elements belong to what set of braces, and there's never a 'trailing comma problem'.
See also https://gist.github.com/357981
