Vim substitute - encase `\ref{eq:x}` with brackets - regex

I have a LaTeX document which has a bunch of strings of the form
Eq.~\ref{eq:x}
where x is in general a different string for each occurrence. I want to replace the above with
Eq.~(\ref{eq:x})
I can match some of the occurrences searching with /\\ref{eq:.*\} but this doesn't work if you have something like
blah Eq.~\ref{eq:x} something something \cite{this}
Note that I don't want to replace \ref{eq: with a latex macro which handles the brackets internally.

* is a greedy quantifier that will match as many characters as possible. So, if you have several } on the line, .*} will match every character up to the last } on the line.
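You should use a non-greedy quantifier instead, which in Vim is written \{-}:
/\\ref{eq:.\{-}\}
See :help \{.
To do the actual replacement, the whole match can be reused as & in the substitution, for example :%s/\\ref{eq:.\{-}}/(&)/g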
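For readers more used to PCRE-style syntax, here is a minimal Python sketch of the same greedy versus non-greedy difference. It is only an illustration of the idea, not a Vim command; the variable names are made up for the example, and `.*?` plays the role of Vim's `.\{-}`.

```python
import re

line = r"blah Eq.~\ref{eq:x} something something \cite{this}"

# Greedy: .* runs to the LAST } on the line, swallowing \cite{this} too.
greedy_match = re.search(r"\\ref\{eq:.*\}", line).group(0)

# Non-greedy: .*? stops at the FIRST } after \ref{eq:
lazy_match = re.search(r"\\ref\{eq:.*?\}", line).group(0)

# Wrap every non-greedy match in brackets; \g<0> is the whole match.
wrapped = re.sub(r"\\ref\{eq:.*?\}", r"(\g<0>)", line)

print(greedy_match)
print(lazy_match)
print(wrapped)
```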

Related

Notepad++ regular expressions and replace

I have a couple of sentences that need processing using regular expressions. They're in a text file and I'm opening it in Notepad++.
<tag>There are two tags here</tag>
<tag>How am i supposed to
feel when this is happening?</tag>
<tag>I'm not sure.
But oh well</tag>
Is it possible to use Notepad++'s regular expressions and replace functionality to produce an output like so:
<tag>There are two tags here</tag>
<tag>How am i supposed to feel when this is happening?</tag>
<tag>I'm not sure. But oh well</tag>
So that sentences that span over two or more lines are joined based on the fact that there is a > at the end of the sentence. Thanks.
Replace this:
[\r\n]+(?!<)
with a space
Explanation:
[\r\n]+ - matches 1+ occurrences of a \r or \n
(?!<) - negative lookahead to validate that the above match is not followed by an opening tag <
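The same pattern can be tried outside Notepad++. Below is a small Python sketch, assuming LF-only line endings and sample text modelled on the question. One caveat I'm fairly sure of: with CRLF line endings the `+` can backtrack to match only the `\r`, so a line break before `<` may lose its `\r`.

```python
import re

# Sample text modelled on the question (LF line endings assumed).
text = (
    "<tag>There are two tags here</tag>\n"
    "<tag>How am i supposed to\n"
    "feel when this is happening?</tag>\n"
    "<tag>I'm not sure.\n"
    "But oh well</tag>"
)

# Replace every run of line breaks that is NOT followed by "<" with a space.
joined = re.sub(r"[\r\n]+(?!<)", " ", text)

print(joined)
```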

vim delete regex: pattern not found

This is my first time trying to use a regex for deletion.
The regex:
/net=.+\.net/
as shown here matches a string that starts with net= some random characters and ends with .net
However, when using it in vim:
:g/net=.+\.net/d
I simply get Pattern not found: net=.+\.net
I am guessing that vim uses a slightly different format, or do I need to escape the characters =, . and + ?
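:help pattern is your friend. In your case, you need to escape + (that is, :g/net=.\+\.net/d) or prefix your whole pattern with \v to make it “very magic”.
Do not escape =: in Vim, \= is a quantifier equivalent to \{0,1} (what other regexp engines write as ?), namely a greedy optional-atom matcher. For the same reason, under \v a bare = becomes that quantifier, so with \v a literal = has to be written \= (:g/\vnet\=.+\.net/d).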
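To see that the pattern itself is fine in a PCRE-style engine and that only Vim's escaping rules differ, here is a rough Python equivalent of `:g/pattern/d`. The sample lines are invented for the illustration.

```python
import re

# Hypothetical sample lines, not taken from the question.
lines = [
    "keep me",
    "url net=example.net here",
    "net=.net",
    "another line",
]

# In Python, + is a quantifier without escaping (Vim's default mode needs \+).
pattern = re.compile(r"net=.+\.net")

# Emulate :g/pattern/d by keeping only the lines that do not match.
kept = [ln for ln in lines if not pattern.search(ln)]

print(kept)
```

Note that "net=.net" survives: `.+` needs at least one character between `net=` and `.net`.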

NOTEPAD++ REGEX - I can't get what's in between two strings, I don't get it

I'm so close to understanding regex. I'm a bit stumped; I thought I understood lazy and greedy.
Here is my current regex: <g_n><!\[CDATA\[([^]]+)(?=]]><\/g_n>)
My current regex makes:
<g_n><![CDATA[xxxxxxxxxx]]></g_n>
match to:
<g_n><![CDATA[xxxxxxxxxx
But I want to make it match like this:
xxxxxxxxxx
You want
<g_n><!\[CDATA\[(.*?)]]></g_n>
then if you want to replace it use
\1
in the replacement box
You're matching the whole string; the parentheses around the .*? capture that part and put it in the \1 variable.
So the match will be all of the string, with \1 referring to what you want.
To change the xxxxx
Regex:
(<g_n><!\[CDATA\[)(?:.*?)(]]></g_n>)
Replacement
\1WHAT YOU WANT TO CHANGE TO\2
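A quick way to check both ideas (pulling out the inner text, and swapping it while keeping the wrapper) is a short Python sketch like the one below. It uses Python's `re` rather than Notepad++'s engine, and the replacement text `NEW` is just a placeholder.

```python
import re

source = "<g_n><![CDATA[xxxxxxxxxx]]></g_n>"

# The overall match is the whole element; group 1 is only the CDATA content.
match = re.search(r"<g_n><!\[CDATA\[(.*?)\]\]></g_n>", source)
inner = match.group(1)

# Keep the wrapper via groups 1 and 2 and replace what lies between them.
# \g<1> is used instead of \1 so that digits in the new text cannot be
# misread as part of the group number.
replaced = re.sub(
    r"(<g_n><!\[CDATA\[).*?(\]\]></g_n>)",
    r"\g<1>NEW\g<2>",
    source,
)

print(inner)
print(replaced)
```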
It looks like you need to add escape slashes to the two closing square brackets, as they are literals from the string you're parsing.
<g_n><!\[CDATA\[.*+?\]\]><\/g_n>
                    ^ ^
Any square brackets not being escaped by backslashes will be treated as regex operational brackets, which in this case won't catch the input string.
EDIT, I think the +? is redundant.
\[.*\]\]> ...
should suffice, since .* means any character, any amount of times.
Tested with notepad++ 6.3.2:
find: (<g_n><!\[CDATA\[)([^]]+)(?=]]></g_n>)
replace: $1WhatYouWant
You can replace + by * in the pattern to also match an empty CDATA section:
<g_n><![CDATA[]]></g_n>

Regular expression find/replace notepad++

I've a huge text file with lines like this:
080012;Bovalino;RC;CAL;0964;89034;B098;9021;http://www.website-most.en/000/000/
And I would like to extract only the fields marked below:
080012;***Bovalino***;***RC***;CAL;***0964***;***89034***;B098;9021;http://www.website-most.en/000/000/
And delete all other text.
Can this be done with regular expressions?
You can capture the stuff you want to keep and use a backreference in the replacement string:
Find what: ^\d*;(\w*;\w*);\w*;(\d*;\d*).*
Replace with: \1;\2
And make sure you do not tick the . matches newline option.
With Notepad++ 6 you can also use $1;$2 for the replacement (with the same meaning).
If the different fields may contain all sorts of characters and not just digits and letters, this is probably your best bet:
Find what: ^[^;]*;([^;]*;[^;]*);[^;]*;([^;]*;[^;]*).*
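Both patterns can be checked against the sample line with a few lines of Python. This is only a sketch using Python's `re`, where `\1;\2` in the replacement means the same as in Notepad++.

```python
import re

line = (
    "080012;Bovalino;RC;CAL;0964;89034;B098;9021;"
    "http://www.website-most.en/000/000/"
)

# Variant 1: assumes fields contain only word characters / digits.
strict = re.sub(r"^\d*;(\w*;\w*);\w*;(\d*;\d*).*", r"\1;\2", line)

# Variant 2: a field is "anything except a semicolon".
loose = re.sub(
    r"^[^;]*;([^;]*;[^;]*);[^;]*;([^;]*;[^;]*).*", r"\1;\2", line
)

print(strict)
print(loose)
```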

Regex to change the number of spaces in an indent level

Let's say you have some lines that look like this
int some_function() {
  int x = 3;  // Some silly comment
And so on. The indentation is done with spaces, and each indent is two spaces.
You want to change each indent to be three spaces. The simple regex
s/ {2}/   /g
Doesn't work for you, because that changes some non-indent spaces; in this case it changes the two spaces before // Some silly comment into three spaces, which is not desired. (This gets far worse if there are tables or comments aligned at the back end of the line.)
You can't simply use
/^( {2})+/
Because what would you replace it with? I don't know of an easy way to find out how many times a + was matched in a regex, so we have no idea how many altered indents to insert.
You could always go line-by-line and cut off the indents, measure them, build a new indent string, and tack it onto the line, but it would be oh so much simpler if there was a regex.
Is there a regular expression to replace indent levels as described above?
In regex flavors that support variable-length lookbehind (.NET, for example), you can use a lookbehind:
s/(?<=^ *)  /   /g
In other flavors, you can reverse the string, use a lookahead (which most flavors support) and reverse again:
s/  (?= *$)/   /g
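Here is a small Python sketch of the reverse, lookahead, reverse trick, applied to one line at a time (no trailing newline assumed). The function name is made up for the example.

```python
import re


def reindent_reversed(line: str) -> str:
    """Turn each leading two-space indent into three spaces.

    After reversing, the indent sits at the END of the string, so a
    plain lookahead "only spaces until the end" identifies it.
    """
    reversed_line = line[::-1]
    widened = re.sub(r"  (?= *$)", "   ", reversed_line)
    return widened[::-1]


print(repr(reindent_reversed("  int x = 3;  // Some silly comment")))
print(repr(reindent_reversed("    return x;")))
```

The two spaces before the comment are left alone because, in the reversed string, they are followed by something other than spaces.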
Here's another one, instead utilizing \G, which has .NET, PCRE (C, PHP, R…), Java, Perl and Ruby support:
s/(^|\G) {2}/   /g
\G [...] can match at one of two positions:
✽ The beginning of the string,
✽ The position that immediately follows the end of the previous match.
Source: http://www.rexegg.com/regex-anchors.html#G
We utilize its ability to match at the position that immediately follows the end of the previous match: each match is either two spaces at the start of a line, or two spaces directly after a previous match that followed the same rule.
See example: https://regex101.com/r/qY6dS0/1
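Python's built-in `re` module has neither \G nor variable-length lookbehind, so the sketch below swaps in a different technique: match the whole run of indents once and let a replacement callback count the levels. This is an alternative I'm suggesting, not what the answers above do; the names are illustrative.

```python
import re

# One or more two-space indent units at the start of a line.
INDENT = re.compile(r"^(?: {2})+", re.MULTILINE)


def widen_indent(text: str) -> str:
    """Replace every leading two-space indent level with three spaces."""

    def repl(m: re.Match) -> str:
        levels = len(m.group(0)) // 2  # how many times ( {2}) matched
        return " " * (levels * 3)

    return INDENT.sub(repl, text)


src = (
    "int some_function() {\n"
    "  int x = 3;  // Some silly comment\n"
    "    return x;\n"
    "}"
)
print(widen_indent(src))
```

The callback answers the "how many times did + match" question directly: the length of the match divided by the indent width.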
I needed to halve the amount of spaces on indentation. That is, if indentation was 4 spaces, I needed to change it to 2 spaces.
I couldn't come up with a regex. But, thankfully, someone else did:
//search for
^( +)\1
//replace with (or \1, in some programs, like geany)
$1
From source: "^( +)\1 means 'any nonzero-length sequence of spaces at the start of the line, followed by the same sequence of spaces'. The \1 in the pattern, and the $1 in the replacement, are both back-references to the initial sequence of spaces. Result: indentation halved."
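For what it's worth, the halving pattern carries over to Python almost unchanged. This is a minimal sketch; `re.MULTILINE` is needed so that ^ matches at every line start, and Python writes the replacement back-reference as `\1`.

```python
import re


def halve_indent(text: str) -> str:
    """Halve leading-space indentation on every line."""
    return re.sub(r"^( +)\1", r"\1", text, flags=re.MULTILINE)


code = "def f():\n    if x:\n        return 1"
print(halve_indent(code))
```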
You can try this:
^(\s{2})|((?<=\n(\s)+))(\s{2})
Breakdown:
^(\s{2}) = Searches for two spaces at the beginning of the line
((?<=\n(\s)+))(\s{2}) = Searches for two spaces
but only if a new line followed by any number of spaces is in front of it.
(This prevents two spaces within the line being replaced)
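I'm not completely familiar with Perl, but I would try this to see if it works (it needs an engine that allows variable-length lookbehind, and the replacement is three literal spaces, because \s is not a whitespace escape in a replacement string):
s/^(\s{2})|((?<=\n(\s)+))(\s{2})/   /g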
As @Jan pointed out, there can be other non-space whitespace characters, which \s would also match. If that is an issue, try this:
s/^( {2})|((?<=\n( )+))( {2})/   /g