vimrc search and replace all excluding previous matches - regex

So I have two functions in my vimrc which I use a lot:
function! FindAndReplaceAllConfirm(from, to)
exec '%s/' . a:from . '/' . a:to . '/gc'
endfunction
function! FindAndReplaceAll(from, to)
exec '%s/' . a:from . '/' . a:to . '/g'
endfunction
The problem: suppose I'm replacing Foo with FooBar. Sometimes FooBar already exists in the file, and I don't want it to become FooFooBar. How can I exclude matches like this?

You can add word boundaries \< and \> to match and replace only exact words as in the following function:
function! FindAndReplaceAll(from, to)
exec '%s/\<' . a:from . '\>/' . a:to . '/g'
endfunction
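The same idea can be sketched outside Vim. The Python snippet below is only an illustration, not part of the vimrc: the function names are hypothetical, and Python's \b is assumed to behave like Vim's \< and \> for plain identifiers. The second function shows an alternative (a negative lookahead) for the case where the replacement merely extends the search text.

```python
import re


def replace_whole_word(text, old, new):
    """Replace `old` only where it stands as a whole word (like \\< and \\> in Vim)."""
    pattern = r"\b" + re.escape(old) + r"\b"
    # A function replacement avoids backslash interpretation in `new`.
    return re.sub(pattern, lambda m: new, text)


def replace_unless_already_done(text, old, new):
    """Replace `old` unless it is already followed by the rest of `new`.

    Only adds a guard when `new` starts with `old` (e.g. Foo -> FooBar).
    """
    suffix = new[len(old):] if new.startswith(old) else ""
    guard = "(?!" + re.escape(suffix) + ")" if suffix else ""
    return re.sub(re.escape(old) + guard, lambda m: new, text)


if __name__ == "__main__":
    print(replace_whole_word("Foo FooBar Foo", "Foo", "FooBar"))
    print(replace_unless_already_done("Foo FooBar FooBaz", "Foo", "FooBar"))
```

The word-boundary version leaves FooBaz alone, while the lookahead version rewrites it to FooBarBaz; which behavior is wanted depends on the file.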


Replace text between 2 particular lines in a text file using sed

Similar questions have been asked but they are for Powershell.
I have a Markdown file like:
.
.
.
## See also
- [a](./A.md)
- [A Child](./AChild.md)
.
.
.
- [b](./B.md)
.
.
.
## Introduction
.
.
.
I wish to replace all occurrences of .md) with .html) between ## See also and ## Introduction :
.
.
.
## See also
- [a](./A.html)
- [A Child](./AChild.html)
.
.
.
- [b](./B.html)
.
.
.
## Introduction
.
.
.
I tried like this in Bash
orig="\.md)"; new="\.html)"; sed "s~$orig~$new~" t.md -i
But this replaces everywhere in the file. I want the replacement to happen only between ## See also and ## Introduction.
Could you please suggest changes? I am using awk and sed as I am a little familiar with those. I also know a little Python; is it recommended to do such scripting in Python (if it is too complicated for sed or awk)?
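Restrict the substitution to the range of lines between the two headings by putting an address range in front of the s command:
$ sed '/## See also/,/## Introduction/s/\.md)/.html)/g' file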
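Since the question also asks about Python, here is a minimal sketch of the same range-restricted replacement. The function name replace_between is hypothetical, and it uses plain substring tests rather than regular expressions. It mirrors the sed address range: the range opens on the line containing the start marker and closes on the next line containing the end marker, both inclusive.

```python
def replace_between(lines, start, end, old, new):
    """Replace `old` with `new` only on lines from `start` through `end`."""
    out = []
    in_range = False
    for line in lines:
        if not in_range and start in line:
            # The range opens on this line; the end marker is only
            # checked on later lines, as sed does.
            in_range = True
            out.append(line.replace(old, new))
            continue
        if in_range:
            out.append(line.replace(old, new))
            if end in line:
                in_range = False
        else:
            out.append(line)
    return out


if __name__ == "__main__":
    sample = [
        "- [x](./X.md)",
        "## See also",
        "- [a](./A.md)",
        "## Introduction",
        "- [y](./Y.md)",
    ]
    for line in replace_between(sample, "## See also", "## Introduction", ".md)", ".html)"):
        print(line)
```

For a one-off edit the sed one-liner is shorter; a Python version is mainly useful when the rule needs to grow.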

perl non-greedy replace not working at the start

I have XML similar to this:
<Level1Node>
.
.
<Level2Node val="Retain"/>
.
.
</Level1Node>
<Level1Node>
.
.
<Level2Node val="Replace"/>
.
.
</Level1Node>
<Level1Node>
.
.
<Level2Node val="Retain"/>
.
.
</Level1Node>
I need to remove only the below node,
<Level1Node>
.
.
<Level2Node val="Replace"/>
.
.
</Level1Node>
To have it replaced in non-greedy manner, I used the below regex,
perl -0 -pe "s|<Level1Node>.*?<Level2Node val="Retain"/>.*?</Level1Node>||gs" myxmlfile
But the non-greedy quantifier only makes the match end as early as possible; it does not move the start of the match. How can I make it start at the last <Level1Node> before the target?
You will need to use a negative lookahead to make sure you do not match closing Level1Node tags where you don't want to:
perl -0 -pe 's|<Level1Node>(?:(?!<\/Level1Node>).)*<Level2Node val="Retain"\/>(?:(?!<\/Level1Node>).)*<\/Level1Node>||gs' tmp.txt
Details:
<Level1Node>
(?:(?!<\/Level1Node>).)* # Everything except </Level1Node>
<Level2Node val="Retain"\/>
(?:(?!<\/Level1Node>).)* # Everything except </Level1Node>
<\/Level1Node>
The ?: is only there so that the parentheses are not interpreted as a capturing group.
If you plan to run this on a large file, you should probably check the cost of the negative lookahead, since it is evaluated at every character and might be expensive.
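To see the lookahead technique in isolation, here is a rough Python equivalent. It is a sketch only: the function name remove_blocks is hypothetical, and the val to remove is passed as a parameter so the same pattern can target either "Retain" or "Replace". It also swallows one trailing newline after the removed block, which the Perl command does not do.

```python
import re


def remove_blocks(xml_text, val):
    """Remove every <Level1Node> block containing <Level2Node val="..."/>."""
    # Any character, as long as it does not begin a closing </Level1Node> tag.
    not_close = r"(?:(?!</Level1Node>).)*"
    pattern = re.compile(
        "<Level1Node>" + not_close
        + '<Level2Node val="' + re.escape(val) + '"/>'
        + not_close + r"</Level1Node>\n?",
        re.DOTALL,  # let . match newlines, like the /s flag in Perl
    )
    return pattern.sub("", xml_text)


if __name__ == "__main__":
    keep = '<Level1Node>\n<Level2Node val="Retain"/>\n</Level1Node>\n'
    drop = '<Level1Node>\n<Level2Node val="Replace"/>\n</Level1Node>\n'
    print(remove_blocks(keep + drop + keep, "Replace"))
```

Because the guarded part can never cross a closing tag, a match that starts at the first block fails and the engine retries from the next opening tag.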
Use a proper parser! It's way simpler.
perl -MXML::LibXML -e'
my $doc = XML::LibXML->new->parse_file($ARGV[0]);
$_->unbindNode() for $doc->findnodes(q{//Level1Node[Level2Node[@val!="Retain"]]});
$doc->toFH(\*STDOUT);
' tmp.txt
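A comparable parser-based sketch in Python, using only the standard library, might look like the following. It assumes the Level1Node elements are direct children of a single root element (called Root here purely for illustration, since the sample shows no root), and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def remove_level1_nodes(xml_text, keep_val="Retain"):
    """Drop each Level1Node that has a Level2Node whose val differs from keep_val."""
    root = ET.fromstring(xml_text)
    # Iterate over a copy, because the tree is modified inside the loop.
    for node in list(root.findall("Level1Node")):
        vals = [child.get("val") for child in node.findall("Level2Node")]
        # A missing val attribute is ignored, as the XPath test @val != "..." does.
        if any(v is not None and v != keep_val for v in vals):
            root.remove(node)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    sample = (
        "<Root>"
        '<Level1Node><Level2Node val="Retain"/></Level1Node>'
        '<Level1Node><Level2Node val="Replace"/></Level1Node>'
        '<Level1Node><Level2Node val="Retain"/></Level1Node>'
        "</Root>"
    )
    print(remove_level1_nodes(sample))
```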

bash recursive manipulation of strings in a subset of files in all subdirectories

In a directory there are a lot of (sub)subdirectories with different files. The string manipulation shall be executed on one file type (e.g. *.c) only.
The string I'd like to manipulate has the following structure:
[text][string before specific underscore]_[string after specific underscore]_[string rest][text]
[text] can be [a-z], [A-Z], [0-9], _ or space.
[string before specific underscore] can be [a-z], [A-Z], [0-9].
[string after specific underscore] is known. Let's assume it is 'MOVE'.
[string rest] can be [a-z], [A-Z], [0-9] or _.
My goal is to swap the two strings to the left and right of the first underscore:
[text][string after specific underscore]_[string before specific underscore]_[string rest][text]
Example of one c file:
h_a1Ha MOVE_Ab1_rest h _4Aihi
bl_aa abc123ABC_MOVE_rest bl_ub
blu_b abcABC_MOVE_rest bla_a
foo _o Abc_MOVE_rest tes _t
I want to swap MOVE with the expression before the first underscore:
h_a1Ha MOVE_Ab1_rest h _4Aihi
bl_aa MOVE_abc123ABC_rest bl_ub
blu_b MOVE_abcABC_rest bla_a
foo _o MOVE_Abc_rest tes _t
When all expressions before first underscore are known this works:
find . -name "*.c" -exec sed -i "s/abc123ABC_MOVE_/MOVE_abc123ABC_/g" '{}' \;
find . -name "*.c" -exec sed -i "s/abcABC_MOVE/MOVE_abcABC/g" '{}' \;
find . -name "*.c" -exec sed -i "s/Abc_MOVE_/MOVE_Abc_/g" '{}' \;
How can I do this string manipulation without explicitly writing the string before the first underscore? I think I need a regular expression that looks for this token
_MOVE_ (_MOVE should also be sufficient, I guess.)
and swaps what is before and after the first underscore.
Question 2:
If someone has an idea for solving the problem above, that would be perfect. Even better (yeah, even better than perfect ;) would be to exclude one specific string (e.g. Abc_), so that the result becomes:
h_a1Ha MOVE_Ab1_rest h _4Aihi
bl_aa MOVE_abc123ABC_rest bl_ub
blu_b MOVE_abcABC_rest bla_a
foo _o Abc_MOVE_rest tes _t
Thanks and cheers,
David
I think the other two answers are too fancy. Maybe you can try this one; it's simple enough to solve your problem:
sed -r -e 's/([a-zA-Z0-9]+)_(MOVE)/\2_\1/g; s/(MOVE)_(Abc)/\2_\1/g'
Check this command:
sed -r 's/([^_]*)_([^_]*)/1st: \1\n2nd: \2/' <<< 'foo_bar'
it gives you:
1st: foo
2nd: bar
You can match a sequence of non underscores by [^_]*. Using the parentheses () you can capture them individually and access them in the replacement pattern like \1, \2 and so on.
You can try this:
$ sed '/[^ ]* Abc/!{/[^ ]* MOVE/! s/\([^ ]* \)\([^_]*\)_\([^_]*\)_\(.*\)/\1\3_\2_\4/} ' file
haha MOVE_Ab1_rest hihi
blaa MOVE_abc123ABC_rest blub
blub MOVE_abcABC_rest blaa
fooo Abc_MOVE_rest test
It swaps the strings around the first underscore, except when the string before the first underscore starts with MOVE or Abc.
It may be a bit more readable if you have extended regex support (-r option):
sed -r '/[^ ]* Abc/!{/[^ ]* MOVE/! s/([^ ]* )([^_]*)_([^_]*)_(.*)/\1\3_\2_\4/}' file
The idea is to use the spaces and underscores themselves to delimit the capture groups. This is more generic than listing character classes, which would have to be updated whenever an allowed character had been omitted.
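For comparison, the swap and the exclusion from Question 2 can be sketched in Python with one substitution. This is an illustration only: the function name and its parameters are hypothetical, and it assumes the string before the underscore is a run of letters and digits, as the question describes.

```python
import re


def swap_around_keyword(text, keyword="MOVE", exclude=()):
    """Turn <name>_<keyword>_ into <keyword>_<name>_, skipping excluded names."""
    # One negative lookahead per excluded name, e.g. (?!Abc_).
    skip = "".join("(?!" + re.escape(name) + "_)" for name in exclude)
    pattern = re.compile(
        r"(?<![A-Za-z0-9])"   # start at the beginning of the alphanumeric run
        + skip
        + r"([A-Za-z0-9]+)_" + re.escape(keyword) + "_"
    )
    return pattern.sub(lambda m: keyword + "_" + m.group(1) + "_", text)


if __name__ == "__main__":
    for line in [
        "h_a1Ha MOVE_Ab1_rest h _4Aihi",
        "bl_aa abc123ABC_MOVE_rest bl_ub",
        "blu_b abcABC_MOVE_rest bla_a",
        "foo _o Abc_MOVE_rest tes _t",
    ]:
        print(swap_around_keyword(line, exclude=("Abc",)))
```

To apply it to every *.c file under a directory, the function could be called on the contents of each path returned by pathlib.Path(".").rglob("*.c"), writing the result back only when it differs.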

Regex for find command to match files with a non-empty file extension

I have a series of files that I want to clean up that are .log files that have been rotated. Examples:
error.log
access.log
error.log-2016-02-05
access.log.1
debug.log
debug.log--2
Regex is matching all of the log files with:
find . -regextype posix-extended -regex '^.*.log.*'
How can I match ONLY the files that have characters after .log?
Replace the last occurrence of .* with .+.
* matches 0 or more instances of the previous character.
+ matches 1 or more instances.
You also need to escape the . before log with a \, otherwise it will match any character rather than just a literal period.
In summary, use this:
find . -regextype posix-extended -regex '^.*\.log.+'
A few other adjustments might also be useful:
You probably don't want to match files with empty filenames, so you should also switch the first .* to .+ (Thanks, Jan!).
You probably don't want to allow files with the extension .log. (a single . character after .log), so you could switch the final .+ to \..+. Note that this also stops matching names such as error.log-2016-02-05 and debug.log--2, where the suffix does not start with a period.
This would give you the final command:
find . -regextype posix-extended -regex '^.+\.log\..+'
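The effect of each pattern on the sample names can be checked with a short Python sketch. It assumes that Python's re module behaves like find's posix-extended syntax for these simple patterns, and it uses fullmatch because find -regex has to match the whole path, including the leading ./ prefix. The helper name matching is hypothetical.

```python
import re

PATHS = [
    "./error.log",
    "./access.log",
    "./error.log-2016-02-05",
    "./access.log.1",
    "./debug.log",
    "./debug.log--2",
]


def matching(pattern, paths=PATHS):
    """Return the paths that the pattern matches in full, as find -regex requires."""
    rx = re.compile(pattern)
    return [p for p in paths if rx.fullmatch(p)]


if __name__ == "__main__":
    for pattern in (r".*.log.*", r".*\.log.+", r".+\.log\..+"):
        print(pattern, "->", matching(pattern))
```

With these sample names, the first pattern matches everything, the second matches only the three rotated files, and the third matches only ./access.log.1.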

Pull value for HostName for IPconfig command

I have a text file containing the output of the IPCONFIG command, and I want to extract the value of Host Name (i.e. S4333AAB45) using a regex.
Windows IP Configuration
Host Name . . . . . . . . . . . . : S4333AAB45
Primary Dns Suffix . . . . . . . :
Node Type . . . . . . . . . . . . : Hybrid
IP Routing Enabled. . . . . . . . : No
I tried the following, and it didn't work:
/\bHost Name\s+(\d+)/
Here is what I would use:
/\s+Host Name.*: (\w+)$/
Use Field Splitting with AWK
You don't say what regular expression engine you're using, or why you need to use a regular expression to match the host name portion. If you have access to AWK, you can treat this as a field-splitting issue instead. For example:
awk '/\<Host Name\>/ { print $NF }' /tmp/foo
Use Known Line Positions
Assuming you've got Cygwin or similar installed, you can use the position of the interesting record to get the data you want without a regular expression at all. For example, if the Host Name record is the third line of the output:
cat /tmp/foo | head -n3 | tail -n1 | cut -d: -f2 | tr -d ' '
Just replace the cat command with your call to ipconfig, and you should get the results you want.
Use sed Instead
You can also use sed to find the line you're interested in, and print out just the trailing word on the line. For example:
sed -n '/\<Host Name\>/ s/.*[[:space:]]\([[:alnum:]]\+\)$/\1/p' /tmp/foo
Your host name starts with the letter "S", so "(\d+)" cannot match it. Your pattern also doesn't account for the dots and the colon on the Host Name line. So the answer from weexpectedTHIS should do the trick. But for your information, here's how you could get the host name without first creating an intermediate file.
$ipconfig = `ipconfig /all`;
($host) = $ipconfig =~ /^\s*Host Name.*:\s*(\w+)/m;
You would need the "/m" in there so that the "^" will match the start of any line in the multi-line contents of $ipconfig. I tend to use "\s*" instead of "\s+" as a sort of insurance against future changes in the output format (where white space is often removed or expanded in newer versions of a command).
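The same multi-line match can be sketched in Python. This is an illustration under a few assumptions: the function name is hypothetical, the text is already in a string (for example read from the saved file), and the capture uses \S+ rather than \w+ so that host names containing a hyphen are not cut short.

```python
import re

# MULTILINE lets ^ match at the start of every line, like /m in Perl.
HOST_RE = re.compile(r"^[ \t]*Host Name[ .]*:[ \t]*(\S+)", re.MULTILINE)


def host_name(ipconfig_text):
    """Return the Host Name value from ipconfig output, or None if absent."""
    match = HOST_RE.search(ipconfig_text)
    return match.group(1) if match else None


if __name__ == "__main__":
    sample = (
        "Windows IP Configuration\n"
        "Host Name . . . . . . . . . . . . : S4333AAB45\n"
        "Primary Dns Suffix . . . . . . . :\n"
        "Node Type . . . . . . . . . . . . : Hybrid\n"
    )
    print(host_name(sample))
```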