Using egrep regex to capture part of line - regex

I'm trying to commit git patches via a bash script. This is not a git question! Here is what I want to do: I have a list of files in a directory. I want to read those files one by one, extract a particular line out of each, and then commit.
Here is what I have so far:
patches=/{location}/*.patch
for patch in $patches
do
echo "Processing $patch file..."
git apply $patch
git add --all
git commit -m | egrep -o "(^Subject: \[PATCH [0-9]\/[0-9]\].)(.*)$" $f
echo "Committed $patch file..."
done
I couldn't get the egrep regex working to pass the proper commit message.
Here is an example line from a patch file:
.....
Subject: [PATCH 1/3] XSR-2756 Including ldap credentials in property file.
......
I just want to capture "XSR-2756 Including ldap credentials in property file." and use it as the commit message for git.

Assuming you have GNU grep, use a Perl look-behind:
git commit -m "$(grep -Po '(?<=Subject: \[PATCH \d/\d\].).*') $patch"

Don't use -o with egrep in this case (since your pattern matches a bunch of text you don't want printed). Instead, just match the whole line and pipe it to cut (or sed, or something else that will trim a prefix from a line).
Also, you're piping the output of git commit into egrep, not providing the output of egrep as a command line option to git commit... I think you want something like:
git commit -m "$(egrep '<your regexp here>' $f | cut -d] -f2-)"

I'd use sed for this
git commit -m "$(sed -rn 's#^Subject: \[PATCH [0-9]/[0-9]\] ##p' "$patch")"
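Putting it together, a minimal sketch of the corrected loop (assuming GNU grep; {location} is the OP's placeholder) could look like this:
patches=/{location}/*.patch
for patch in $patches
do
echo "Processing $patch file..."
git apply "$patch"
git add --all
# Extract everything after "Subject: [PATCH x/y] " and use it as the commit message
msg=$(grep -Po '(?<=Subject: \[PATCH \d/\d\] ).*' "$patch")
git commit -m "$msg"
echo "Committed $patch file..."
done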

Related

Regex with inotifywait to compile two types of file in golang

I use a script to auto-compile Go code with inotifywait, but this script only checks files with the extension .go. I want to also add the .tmpl extension, but the script uses regular expressions. What changes do I have to make to this line to get the desired result?
inotifywait -q -m -r -e close_write -e moved_to --exclude '[^g][^o]$' $1
I've tried concatenating with | or &, and other things like ([^t][^m][^p][^l]|[^g][^o])$, but nothing seems to work.
Rather than trying to use a regex to exclude two types of file, why don't you just watch only those files?
inotifywait -q -m -r -e close_write -e moved_to /path/**/*.{go,tmpl}
To use the ** (which does a recursive match), you may have to enable bash's globstar:
shopt -s globstar
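A minimal sketch of a watch-and-rebuild loop built on that glob approach (the go build ./... step and the directory argument are assumptions; adapt them to your project):
#!/usr/bin/env bash
# Watch .go and .tmpl files and rebuild on change.
shopt -s globstar nullglob   # ** recursion; unmatched patterns expand to nothing
dir="${1:-.}"
inotifywait -q -m -e close_write -e moved_to "$dir"/**/*.{go,tmpl} |
while read -r _event; do
go build ./... && echo "rebuilt at $(date +%T)"   # assumed build step; replace with your compile command
done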

git clean exclude with regex

I'd like to find a way to use git clean with a regex.
Without regex:
git clean -dfx --exclude=".idea/"
With regex (tried; not working):
git clean -dfx --exclude='(.*\/)*(\.idea\/.*)(.*)'
git clean -dfx --exclude="(.*\/)*(\.idea\/.*)(.*)"
git clean -dfx --exclude=r'(.*\/)*(\.idea\/.*)(.*)'
git clean -dfx --exclude=r"(.*\/)*(\.idea\/.*)(.*)"
How do you use git clean with regex?
git clean has no support for regular expressions.
A workaround would be something like this:
$ git clean -n | cut -f3 -d' ' | grep -v -E --color=never '<PATTERN>' | ifne xargs git clean
Breakdown of things happening here:
git clean -n produces a list of files that would be removed if git clean were executed (you can use flags like -d, -x or -X here too)
-n dry-run (do not actually do anything)
cut -f3 -d' ' cuts the third field from those lines (delimited by a space)
-f3 third field
-d' ' use a space as the delimiter
grep -v -E --color=never '<PATTERN>'
-v invert the matches from grep
-E interpret PATTERN as an extended regular expression
--color=never to prevent colored grep output from interfering with the following commands (may be omitted)
'<PATTERN>' a regular expression
ifne xargs git clean will run git clean (only if there are files), with xargs passing the file list as arguments
ifne a utility from moreutils (installable via Homebrew or other package managers) that runs the given command only when standard input is not empty
git clean will take those paths and clean them (use -n first to make sure no files get removed that you did not expect)
That is the magic of small command-line programs, each doing one simple, specific task.
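For the OP's .idea/ case, a concrete instance of this pipeline might look like the following (the pattern and flags are illustrative; keep a -n dry run until you are sure of the result):
$ git clean -ndx | cut -f3 -d' ' | grep -v -E --color=never '^\.idea/' | ifne xargs git clean -dfx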

Safe search&replace on linux

Suppose I have files located in different subfolders and I would like to search, test, and replace something in these files.
I would like to do it in three steps:
Search for a specific pattern (with or without regexp)
Test to replace it with something (with or without regexp)
Apply the changes only to the concerned files
My current solution is to define some aliases in my .bashrc in order to easily use grep and sed:
alias findsrc='find . -name "*.[ch]" -or -name "*.asm" -or -name "*.inc"'
alias grepsrc='findsrc | xargs grep -n --color '
alias sedsrc='findsrc | xargs sed '
Then I use
grepsrc <pattern> to search my pattern
(no solution found yet)
sedsrc -i 's/<pattern>/replace/g'
Unfortunately this solution does not satisfy me. The first issue is that sed touches all the files even when nothing changes. Second, the need to use aliases does not look very clean to me.
Ideally I would like have a workflow similar to this one:
Register a new context:
$ fetch register 'mysrcs' --recurse *.h *.c *.asm *.inc
Context list:
$ fetch context
1. mysrcs --recurse *.h *.c *.asm *.inc
Extracted from ~/.fetchrc
Find something:
$ fetch files mysrcs /0x[a-f0-9]{3}/
./foo.c:235 Yeah 0x245
./bar.h:2 Oh yeah 0x2ac hex
Test a replacement:
$ fetch test mysrcs /0x[a-f0-9]{3}/0xabc/
./foo.c:235 Yeah 0xabc
./bar.h:2 Oh yeah 0xabc hex
Apply the replacement:
$ fetch subst --backup mysrcs /0x[a-f0-9]{3}/0xabc/
./foo.c:235 Yeah 0xabc
./bar.h:2 Oh yeah 0xabc hex
Backup number: 242
Restore in case of mistake:
$ fetch restore 242
This kind of tool looks pretty standard to me. Everybody needs to search and replace. What standard Linux alternative can I use?
#!/bin/ksh
# Call the script with the search and replace patterns as arguments,
# assuming both patterns are sed-compliant regexes
SearchStr="$1"
ReplaceStr="$2"
# Assuming the search starts from the current folder and considers any file;
# if more filtering is needed, use a find in front and pipe it in
grep -l -r "$SearchStr" . | while IFS= read -r ThisFile
do
sed -i -e "s/${SearchStr}/${ReplaceStr}/g" "${ThisFile}"
done
This should give you a base script to adapt to your needs.
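A hypothetical invocation, assuming the script above is saved as replace.ksh and made executable:
./replace.ksh 'oldFunctionName' 'newFunctionName'   # script name and patterns are placeholders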
I often have to perform such maintenance tasks. I use a mix of find, grep, sed, and awk.
And instead of aliases, I use functions.
For example:
# i. and ii.
function grepsrc {
find . \( -name "*.[ch]" -or -name "*.asm" -or -name "*.inc" \) -exec grep -Hn "$1" {} +
}
# iii.
function sedsrc {
grepsrc "$1" | awk -F: '{print $1}' | uniq | while read f; do
sed -i s/"$1"/"$2"/g $f
done
}
Usage example:
sedsrc "foo[bB]ar*" "polop"
for F in $(grep -Rl <pattern>) ; do sed 's/search/replace/' "$F" | sponge "$F" ; done
grep with the -l argument just lists files that match
We then loop over just those matching files and run each one through sed
We use the sponge program from the moreutils package to write the processed stream back to the same file
This is simple and requires no additional shell functions or complex scripts.
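A concrete run of that one-liner might look like this (the pattern and replacement are purely illustrative):
for F in $(grep -Rl 'oldName' .) ; do sed 's/oldName/newName/g' "$F" | sponge "$F" ; done   # oldName/newName are placeholders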
If you want to make it safe as well... check the folder into a Git repository. That's what version control is for.
Yes, there is a tool that does exactly what you are looking for: Git. Why manage backups of your files in case of mistakes when specialized tools can do that job for you?
You split your request into three sub-questions:
How to quickly search a subset of my files?
How to apply a substitution temporarily, then go back to the original state?
How to apply a substitution to that subset of files?
We first need to do some setup in your workspace: initialize a Git repository, then add all your files to it:
$ cd my_project
$ git init
$ git add **/*.h **/*.c **/*.inc
$ git commit -m "My initial state"
Now, you can quickly get the list of your files with:
$ git ls-files
To do a replacement, you can use sed, perl, or awk. Here is an example using sed:
$ git ls-files | xargs sed -i -e 's/search/replace/'
If you are not happy with this change, you can roll back at any time with:
$ git checkout HEAD -- .
This allows you to test your change and step back any time you want to.
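A short sketch of the resulting search / test / apply cycle (the hex pattern is just the OP's example):
$ git ls-files | xargs grep -n '0x[a-f0-9]\{3\}'    # search
$ git ls-files | xargs sed -i -e 's/0x[a-f0-9]\{3\}/0xabc/g'    # trial replacement
$ git diff    # review the change
$ git checkout HEAD -- .    # roll back, or:
$ git commit -am "Replace magic constants"    # keep it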
We have not simplified the commands yet, so I suggest adding an alias to your Git configuration file, usually located at ~/.gitconfig:
[alias]
sed = ! git grep -z --full-name -l '.' | xargs -0 sed -i -e
So now you can just type:
$ git sed s/a/b/
It's magic...

Remove duplicate filename extensions

I have thousands of files named something like filename.gz.gz.gz.gz.gz.gz.gz.gz.gz.gz.gz
I am using the find command like this: find . -name "*.gz*" to locate these files, and I want to either use -exec or pipe to xargs with some magic command to clean up this mess, so that I end up with filename.gz.
Someone please help me come up with this magic command that would remove the unneeded instances of .gz. I have tried experimenting with sed 's/\.gz//' and sed 's/(\.gz)//', but they do not seem to work (or, to be more honest, I am not very familiar with sed). I do not have to use sed, by the way; any solution that would help solve this problem would be welcome :-)
One way with find and awk:
find $(pwd) -name '*.gz'|awk '{n=$0;sub(/(\.gz)+$/,".gz",n);print "mv",$0,n}'|sh
Note:
I assume there are no special characters (like spaces) in your filenames. If there were, you would need to quote the filenames in the mv command.
I added $(pwd) to get the absolute path of each found name.
You can remove the trailing |sh to check the generated mv commands first.
If everything looks good, add the |sh back to execute the mv.
You may use
ls a.gz.gz.gz |sed -r 's/(\.gz)+/.gz/'
or without the regex flag
ls a.gz.gz.gz |sed 's/\(\.gz\)\+/.gz/'
ls *.gz | perl -ne '/((.*?.gz).*)/; print "mv $1 $2\n"'
It will print shell commands to rename your files; it won't execute those commands, so it is safe. To execute them, you can save the output to a file and run it, or simply pipe it to a shell:
ls *.gz | ... | sh
sed is great for replacing text inside files.
You can do that with bash string substitution:
for file in *.gz.gz; do
mv "${file}" "${file%%.*}.gz"
done
This might work for you (GNU sed):
printf '%s\n' *.gz | sed -r 's/^([^.]*)(\.gz){2,}$/mv -v & \1\2/e'
find . -name "*.gz.gz" |
while IFS= read -r f; do echo mv "$f" "$(sed -r 's/(\.gz)+$/.gz/' <<<"$f")"; done
This only previews the renaming (mv) command; remove the echo to perform actual renaming.
Processes matching files in the current directory tree, as in the OP (and not just files located directly in the current directory).
Limits matching to files that end in at least 2 .gz extensions (so as not to needlessly process files that end in just one).
When determining the new name with sed, makes sure that substring .gz doesn't just match anywhere in the filename, but only as part of a contiguous sequence of .gz extensions at the end of the filename.
Handles filenames with special characters such as embedded spaces correctly (with the exception of filenames with embedded newlines).
Using bash string substitution:
for f in *.gz.gz; do
mv "$f" "${f%%.gz.gz*}.gz"
done
This is a slight modification of jaypal's nice answer (which would fail if any of your files had a period as part of its name, such as foo.c.gz.gz). (Mine is not perfect, either) Note the use of double-quotes, which protects against filenames with "bad" characters, such as spaces or stars.
If you wish to use find to process an entire directory tree, the variant is:
find . -name \*.gz.gz | \
while read f; do
mv "$f" "${f%%.gz.gz*}.gz"
done
And if you are fussy and need to handle filenames with embedded newlines, change the while read to while IFS= read -r -d $'\0', and add a -print0 to find; see How do I use a for-each loop to iterate over file paths output by the find utility in the shell / Bash?.
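A sketch of what that newline-safe variant might look like:
find . -name '*.gz.gz' -print0 | \
while IFS= read -r -d $'\0' f; do
mv "$f" "${f%%.gz.gz*}.gz"
done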
But is this renaming a good idea? How was your filename.gz.gz created? gzip has guards against accidentally doing so. If you circumvent these via something like gzip -c $1 > $1.gz, buried in some script, then renaming these files will give you grief.
Another way with rename:
find . -iname '*.gz.gz' -exec rename -n 's/(\.\w+)\1+$/$1/' {} +
When happy with the results, remove the -n (dry-run) option.

Git add lines to index by grep/regex

I have a giant patch that I would like to break into multiple logical git commits. A large number of the changes are simply changing variable names or function calls, such that they could easily be located with a grep. If I could add to the index any changes that match a regex then clean up in git gui, it would save me a lot of manual work. Is there a good way to update the index on a line-by-line basis using a regex within git or from some output of grep (e.g. line numbers)?
I found a similar question, but I'm not sure how to build the temporary file from a regex-type search.
patchutils has a command, grepdiff, that can be used to achieve this.
# check that the regex search correctly matches the changes you want.
git diff -U0 | grepdiff 'regex search' --output-matching=hunk
# then apply the changes to the index
git diff -U0 | grepdiff 'regex search' --output-matching=hunk | git apply --cached --unidiff-zero
I use -U0 on the diff to avoid picking up unrelated changes. You might want to adjust this value to suit your situation.
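For the variable-rename case in the question, a hypothetical concrete run, followed by a check of what actually got staged (oldName is a stand-in identifier):
# stage only hunks that touch the renamed identifier, then review the staged diff
git diff -U0 | grepdiff 'oldName' --output-matching=hunk | git apply --cached --unidiff-zero
git diff --cached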
More simply, you can use git add -p and use the / option to search through your diff for the hunks to add. It's not totally automated, but it's easier than the other alternatives I've found.
You could first run:
git status | \grep "your_pattern"
If the output is as intended, then add the files to the index:
git add $(git status | \grep "your_pattern")
I'm working with Git Bash on Windows, and I had a similar problem: I did not need to add a few of the files from the "not staged for commit" file list:
$ git status
On branch Bug_#292400_buggy
Changes not staged for commit:
(use "git add <file>..." to update what will be committed)
(use "git checkout -- <file>..." to discard changes in working directory)
modified: the/path/to/the/file333.NO
modified: the/path/to/the/file334.NO
modified: the/path/to/the/file1.ok
modified: the/path/to/the/file2.ok
modified: the/path/to/the/file3.ok
modified: the/path/to/the/file4.ok
....................................
modified: the/path/to/the/file666.ok
First, I checked if the file selection was what I was looking for:
$ git status | grep ok
modified: the/path/to/the/file1.ok
modified: the/path/to/the/file2.ok
modified: the/path/to/the/file3.ok
modified: the/path/to/the/file4.ok
....................................
modified: the/path/to/the/file666.ok
I tried one of the ideas described in this forum to add the same file list with git:
$ git add $(git status | \grep "your_pattern")
But it didn't work for me (remember: Git Bash on Windows 10).
In the end, I tried a more direct way, and it worked fine:
$ git add *ok
$ git status
On branch Bug_#292400_buggy
Changes to be committed:
(use "git reset HEAD <file>..." to unstage)
modified: the/path/to/the/file1.ok
modified: the/path/to/the/file2.ok
modified: the/path/to/the/file3.ok
modified: the/path/to/the/file4.ok
....................................
modified: the/path/to/the/file666.ok
Changes not staged for commit:
(use "git add <file>..." to update what will be committed)
(use "git checkout -- <file>..." to discard changes in working directory)
modified: the/path/to/the/file333.NO
modified: the/path/to/the/file334.NO
So it is ready to commit.
I found an answer.
It takes a few steps:
git status --porcelain gives the status in an easy-to-parse format for scripts and tools like grep.
sed s/^...// strips the first three characters (the status flags and the following space) from each line, leaving just the path.
xargs runs the given command with those paths as arguments.
In my case, using Django where migrations need to be ignored, my script is git status --porcelain | sed s/^...// | grep -v migrations | xargs git add.
You can customize the grep options to fit your needs.
Documentation: xargs, git-status, sed
xargs is what you're looking for. Try this:
grep -irl 'regex_term_to_find' * | xargs -I FILE git add FILE
Everything up to the pipe | is a standard grep command searching all files (*). The options are:
i - case insensitive
r - recursive through directories
l - list names of files only
In the xargs part of the statement, FILE is the placeholder name used for each argument/match passed by grep. Then write the desired command, using the placeholder where appropriate.