Remove dashes surrounded by numbers on both sides

I'm trying to search and replace using regex in TextWrangler (https://gist.github.com/ccstone/5385334, http://www.barebones.com/products/textwrangler/textwranglerpower.html).
I have rows like this:
56-84 29 STRINGOFLETTERS -2.54
I'd like to replace the dash in "56-84" with a tab, so I get:
56 84 29 STRINGOFLETTERS -2.54
But without replacing the dash in "-2.54".
How do I remove only the dashes that are surrounded by numbers on both sides?
My regex knowledge is extremely limited. I tried to find [0-9]-[0-9] and replace it with [0-9][0-9], but that didn't work.

Your link says "The PCRE engine (Perl Compatible Regular Expressions) is what BBEdit and TextWrangler use", so you should be able to use lookarounds in your regex.
Search for:
(?<=\d)-(?=\d)
Replace with a tab (\t).
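To see what the lookarounds do outside the editor, here is a minimal sketch using Python's re module. Python's engine is not PCRE, but it handles this particular pattern the same way. The sample row is the one from the question, with the columns assumed to be separated by spaces.

```python
import re

# Sample row from the question; column separators are assumed to be spaces.
row = "56-84 29 STRINGOFLETTERS -2.54"

# Match a dash only when a digit sits immediately before and after it.
# The lookarounds check for the digits without consuming them, so the
# replacement only has to supply the tab.
pattern = re.compile(r"(?<=\d)-(?=\d)")

result = pattern.sub("\t", row)
print(repr(result))  # '56\t84 29 STRINGOFLETTERS -2.54'
```

The dash in "-2.54" survives because the character before it is a space, not a digit.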

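If it's plain text, you may not need TextWrangler at all. You can use the Unix "sed" command. Note that sed does not understand \d, and the digits have to be captured and written back with \1 and \2:
$ sed 's/\([0-9]\)-\([0-9]\)/\1\t\2/g' a.txt > b.txt
GNU sed interprets \t in the replacement as a tab; with the BSD sed shipped on macOS, type a literal tab character there instead.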

You actually need to capture the digits you want to keep. So the regex would be:
^([0-9]+)-([0-9])
I'm assuming here that the number starts at the beginning of the line (the + lets the first group take all the digits before the dash). If it doesn't, you can remove the ^.
Based on your link, the flavor of regex is PCRE, so backreferences look like \1 and \2 in the replacement pattern. Your replacement pattern simply becomes:
\1\t\2
Here \1 refers to the first group (the digits before the dash) and \2 refers to the second group (the digit after the dash).
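The difference between the attempt in the question and the capture-group approach can be sketched in Python, whose re module also uses \1 and \2 in replacement strings. This is only an illustration; the editor's own replace dialog may behave slightly differently.

```python
import re

row = "56-84 29 STRINGOFLETTERS -2.54"

# The attempt from the question: the replacement is inserted literally,
# so the matched digits are lost and "[0-9][0-9]" ends up in the text.
literal = re.sub(r"[0-9]-[0-9]", "[0-9][0-9]", row)
print(literal)  # 5[0-9][0-9]4 29 STRINGOFLETTERS -2.54

# Capturing the digit on each side lets the replacement write them back
# with \1 and \2, with a tab in between.
captured = re.sub(r"([0-9])-([0-9])", r"\1\t\2", row)
print(repr(captured))  # '56\t84 29 STRINGOFLETTERS -2.54'
```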

Related

Notepad++: add parentheses to timestamps

I have text with timestamps for gaps in audio, e.g. "We met 51:33 at the bar". I need to add parentheses around the timestamps to make them more readable: "We met (51:33) at the bar". How do I do that in Notepad++?
I already created a regex search and replace. It searches for \d\d:\d\d and replaces it with (\d\d:\d\d). Unfortunately, I get (dd:dd) everywhere.

You can tighten the regex a bit with the lookarounds (?<!\d) and (?!\d) to make sure you only match xx:xx that is not adjacent to other digits, and use the backreference to the whole match ($&) in the replacement:
Search: (?<!\d)\d\d:\d\d(?!\d)
Replace: \($&\)
Note: the ( and ) must be escaped in the replacement pattern, since Notepad++ uses the Boost conditional replacement syntax, in which parentheses are special.
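The same idea can be tried in Python as a rough sketch. Python writes the whole-match backreference as \g<0> rather than $&, and parentheses need no escaping in its replacement strings. The second and third inputs are made-up examples to show the effect of the lookarounds.

```python
import re

# Two digits, a colon, two digits, with no further digit on either side.
pattern = re.compile(r"(?<!\d)\d\d:\d\d(?!\d)")

# \g<0> stands for the entire match (Notepad++ writes this as $&).
result = pattern.sub(r"(\g<0>)", "We met 51:33 at the bar")
print(result)  # We met (51:33) at the bar

# Hypothetical inputs: longer digit runs are left untouched.
unchanged_left = pattern.sub(r"(\g<0>)", "code 151:33 here")
unchanged_right = pattern.sub(r"(\g<0>)", "code 51:334 here")
```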

Need regular expression to replace before and after while keeping the numbers in the middle

I have the following:
itemid=44'>Red Flower</a>
I need it to be this:
_ITEMID_START_44_ITEMID_END_
Can this be done with regular expressions? I need to keep the id (44 in the example), and replace everything on the left with _ITEMID_START_ and everything on the right with _ITEMID_END_.
Note: The itemid is one or two digits, never more than two.
I found something about tagged regular expressions and backreferences which seems like it would work but the syntax is killing me.
I tried this (and other attempts):
Find What: ^(\bitemid=\b)^([0-9][0-9]^)\b'>\b[a-z]+\b</a>\b)
Replace With: ^(\b_ITEMID_START_\b^2^(\b_ITEMID_END_\b
I am using UltraEdit to do the find and replace in over 20,000 *.html files. Any help would be very much appreciated.

The solution of Casimir et Hippolyte and the first solution of Avinash Raj both work in UltraEdit when Perl is selected as the regular expression engine. The second search string of Avinash Raj only works in UltraEdit after removing the backslash to the left of the ' character.
UltraEdit has 3 regular expression engines: UltraEdit, Unix and Perl.
The search string in the question is a mixture of UltraEdit and Perl regular expression syntax and therefore does not work.
With the UltraEdit regular expression engine:
Find what: itemid=^([0-9]+^)*</a>
Replace with: _ITEMID_START_^1_ITEMID_END_
With Unix or Perl regular expression engine:
Find what: itemid=([0-9]+).*</a>
Replace with: _ITEMID_START_\1_ITEMID_END_
Safer because it is non-greedy, but only with the Perl regex engine:
Find what: itemid=(\d+).*?</a>
Replace with: _ITEMID_START_\1_ITEMID_END_
IDM has published power tips on tagged expressions for the UltraEdit regex engine and on backreferences in Perl regular expressions for the Perl regex engine.
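Why the non-greedy version is safer shows up as soon as a line contains more than one link. The following Python sketch illustrates it; the two-link line is a made-up example, and Python's re module stands in for the Perl engine here.

```python
import re

# Hypothetical line with two links, to show where greedy matching goes wrong.
line = "itemid=44'>Red Flower</a> itemid=7'>Blue Flower</a>"
replacement = r"_ITEMID_START_\1_ITEMID_END_"

# Greedy .* runs to the LAST </a> on the line, swallowing the second link.
greedy = re.sub(r"itemid=([0-9]+).*</a>", replacement, line)
print(greedy)  # _ITEMID_START_44_ITEMID_END_

# Non-greedy .*? stops at the FIRST </a>, so each link is replaced separately.
lazy = re.sub(r"itemid=(\d+).*?</a>", replacement, line)
print(lazy)  # _ITEMID_START_44_ITEMID_END_ _ITEMID_START_7_ITEMID_END_
```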

You can try this:
Find What: \bitemid=([0-9][0-9]?)'>[^<]*</a>
Replace With: _ITEMID_START_\1_ITEMID_END_
A replacement string is a normal string, and all the regex special characters (except for the backreference) lose their special meaning.
\b, the word boundary, is the position between a character from the \w character class (a shortcut for [A-Za-z0-9_]) and any other character.
Note: I can't try it with UltraEdit. If you obtain a literal \1, replace it with $1.
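For anyone who wants to check the pattern without UltraEdit, here is a minimal Python sketch. The href text around the fragment and the "subitemid" input are invented for illustration and are not from the question.

```python
import re

pattern = re.compile(r"\bitemid=([0-9][0-9]?)'>[^<]*</a>")
replacement = r"_ITEMID_START_\1_ITEMID_END_"

# Hypothetical link markup around the fragment shown in the question.
html = "<a href='page.php?itemid=44'>Red Flower</a>"
result = pattern.sub(replacement, html)
print(result)  # <a href='page.php?_ITEMID_START_44_ITEMID_END_

# \b needs a word boundary before "itemid", so "subitemid" is left alone.
untouched = pattern.sub(replacement, "subitemid=44'>Red Flower</a>")

# [0-9][0-9]? accepts one or two digits only; a three-digit id does not match.
three_digits = pattern.sub(replacement, "itemid=123'>Red Flower</a>")
```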

The regex below matches the whole line and captures only the digits that come right after itemid=. In the replacement, the whole line is replaced with _ITEMID_START_\1_ITEMID_END_ (\1 stands for the first captured group; the notation may vary between tools and languages).
.*(?<=\bitemid=)([0-9]{1,2}).*
And the substitution would be:
_ITEMID_START_\1_ITEMID_END_
If you only want to replace
itemid=44'>Red Flower</a>
with
_ITEMID_START_44_ITEMID_END_
then your regex would be:
\bitemid=([0-9]{1,2})\'>[^<]*<\/a>
And the substitution would be:
_ITEMID_START_\1_ITEMID_END_

Is there a smarter way to keep the indentation when replacing characters with a regex?

I want to replace the asterisks in a Markdown list with hyphens.
Example:
1.0
1.1
1.2
2
2.1
2.2
Currently I have a separate regex pattern for each of up to three levels of indentation, set up in Keyboard Maestro for Mac.
I wonder whether there is a smarter way to do this that addresses all levels of indentation.

In many regular expression search-and-replace systems, you can refer to a parenthesized group of the regular expression in the replacement, using \1, \2, etc. for each successive group. So for example, in sed you could do:
sed -e 's/^\([[:blank:]]*\)\*/\1-/'
I'm not sure if Keyboard Maestro gives you that option. It mentions that it uses ICU regular expressions; if it also uses their replacement syntax, then you can use $1, $2, etc. to refer to the captured groups.
If not, all is not lost. You can use a lookbehind assertion to match the whitespace before the asterisk without making it part of the match, and then use a single dash as your replacement. ICU requires a lookbehind to have a bounded length, so the whitespace run is capped here (at 20 characters, an arbitrary limit):
Search for: (?<=^[\t ]{0,20})\*
Replace with: -

You can use capturing groups and reference them in the replacement string like this (the ^ keeps the pattern from touching asterisks that are not at the start of a list item):
Regular expression matching your lines with list items: ^([\t ]*)\*(.*)
The string used for replacement: \1-\2
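The capture-the-indentation approach can be sketched in Python as follows. The Markdown sample and its indentation widths are made up for illustration; re.MULTILINE is needed so that ^ matches at the start of every line rather than only at the start of the text.

```python
import re

# Hypothetical Markdown list mixing spaces and a tab as indentation.
markdown = "* 1.0\n    * 1.1\n\t* 1.2\n* 2 * not a bullet\n"

# Group 1 captures whatever indentation precedes the bullet asterisk.
pattern = re.compile(r"^([\t ]*)\*", re.MULTILINE)

# \1 writes the captured indentation back, followed by the hyphen.
result = pattern.sub(r"\1-", markdown)
print(result)
```

Because the indentation is captured rather than spelled out, one pattern covers every nesting level, and the asterisk in the middle of the last line is left alone.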

While replacing using regex, How to keep a part of matched string?

I have
12.hello.mp3
21.true.mp3
35.good.mp3
...
and so on, as file names listed in a text file.
I need to replace only the dots (.) that directly follow the numbers with a space (e.g. 12.hello.mp3 => 12 hello.mp3).
If I use the regex "[0-9].", it replaces the number as well.
Please help me.

Replace
^(\d+)\.(.*mp3)$
with
\1 \2
Also, recent versions of Notepad++ accept the following, which is also accepted by other IDEs/editors (e.g. JetBrains products like IntelliJ IDEA):
$1 $2
This assumes that the Notepad++ regex engine supports groups. What the regex basically means is: match the digits in front of the first dot as group 1 and everything after it as group 2 (but only if the line ends with mp3).
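As a rough illustration outside Notepad++, the same search and replace can be written in Python. The file list is taken from the question, with one extra made-up name (notes.txt) to show that non-matching lines are left alone.

```python
import re

# File names from the question plus one hypothetical non-mp3 entry.
names = "12.hello.mp3\n21.true.mp3\n35.good.mp3\nnotes.txt"

# The attempt from the question: the unescaped dot means "any character",
# so the pattern eats a digit plus whatever follows it.
attempt = re.sub(r"[0-9].", " ", "12.hello.mp3")
print(repr(attempt))  # ' .hello.mp3'

# Group 1 keeps the leading number, group 2 keeps the rest of the name;
# only the escaped dot between them is dropped and replaced by a space.
pattern = re.compile(r"^(\d+)\.(.*mp3)$", re.MULTILINE)
result = pattern.sub(r"\1 \2", names)
print(result)
```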

I tested this with VS Code. You need to use capturing groups (parentheses).
Practical example
Start with this sample data:
1 a text
2 another text
3 yet more text
Use a regex to search for a digit followed by whitespace. The group here is the digit, since it is enclosed in parentheses:
(\d)\s
Then run the replace. It swaps the space for a dash but keeps the digit on each line:
$1-
Output:
1-a text
2-another text
3-yet more text

Using the basic pattern, which is well described in the accepted answer, here is an example that adds class="odd" and class="even" to alternating <tr> elements in Notepad++ or any other regex-capable editor:
Find what: (<tr><td>)(.*?\r\n)(<tr><td>)(.*?\r\n)
Replace with: <tr class="odd"><td>\2<tr class="even"><td>\4
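The pairwise matching can be checked with a short Python sketch. The table rows are invented sample data with Windows line endings (\r\n), which the pattern requires.

```python
import re

# Hypothetical rows, each ending in \r\n as the pattern expects.
rows = (
    "<tr><td>a</td></tr>\r\n"
    "<tr><td>b</td></tr>\r\n"
    "<tr><td>c</td></tr>\r\n"
)

# Each match consumes two consecutive rows; groups 2 and 4 hold the row bodies.
pattern = re.compile(r"(<tr><td>)(.*?\r\n)(<tr><td>)(.*?\r\n)")
result = pattern.sub(r'<tr class="odd"><td>\2<tr class="even"><td>\4', rows)
print(result)
```

Since rows are consumed in pairs, a trailing unpaired row (the third one here) is not matched and keeps its plain <tr>.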

How to find occurrences of same subsequent characters in a string with a regular expression?

How can I find occurrences of the same character repeated consecutively in a string, using a regular expression or a function?
Example:
I am leet and I have a three pizzas. That noob right there has only one pizza. Poor boy.

You can use a backreference:
/(.)\1/
Change \1 to \1+ if you want to find sequences of length two or more.
Note that the syntax can vary depending on the regular expression engine you are using.
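Applied to the example sentence from the question, the backreference can be tried in Python like this. It is only a sketch of one engine's behavior; as noted above, other engines may write the pattern differently.

```python
import re

text = ("I am leet and I have a three pizzas. "
        "That noob right there has only one pizza. Poor boy.")

# (.) captures any single character; \1+ requires that same character
# to repeat at least once more right after it.
pattern = re.compile(r"(.)\1+")

# finditer is used instead of findall, because findall would return only
# the captured single character rather than the whole run.
runs = [m.group(0) for m in pattern.finditer(text)]
print(runs)  # ['ee', 'ee', 'zz', 'oo', 'zz', 'oo']
```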

Not sure which flavor of regex you're working with, but for egrep, this works:
egrep '(.)\1' < file
That will show all lines that have two of some character in a row. If you want just letters:
egrep '([A-Za-z])\1' < file
would work.

Like this in the Perl flavor. \w matches a word character, and \2 refers back to whatever the second pair of parentheses matched.
m/((\w)\2+)/g

Google it: 'double characters regex'
Here's a re-fiddle I made with your regex: http://refiddle.com/2fa

This should work: (.)\1+
