Replacing digits immediately after a saved pattern - regex

Searched pattern looks like text9
I search for (text)9
I want to replace with \15 so that I would get text5 but instead it's just giving me text.
Any other character works except for digits.

As it turns out, the PCRE-style back-references do not work.
So you have to use \015: the text captured by the first capturing group (\01) followed by a literal 5.
Because there cannot be more than 99 capturing groups, the two digits after \ are read as the back-reference group number: \01 is interpreted as the reference to the first group, and any digits that follow are literal.
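The same ambiguity shows up outside TextWrangler. Purely as a comparison (Python's re module is not what the question uses, and its syntax differs from the \01 form above), here is a minimal sketch: \15 is read as group 15 and rejected, while \g<1>5 marks explicitly where the group number ends.

```python
import re

text = "text9"

# r"\15" is parsed as "group 15", which this pattern does not have,
# so Python raises re.error rather than silently dropping the digit.
try:
    re.sub(r"(text)9", r"\15", text)
    ambiguous_failed = False
except re.error:
    ambiguous_failed = True

# \g<1> ends the group reference explicitly; the 5 after it is a literal.
fixed = re.sub(r"(text)9", r"\g<1>5", text)
print(ambiguous_failed, fixed)
```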

The replacement term \15 is being interpreted as "group 15" - you must escape the "5":
Try replacing with \1\\5 or, if that doesn't work (I don't have TextWrangler handy), use a lookbehind:
Search: (?<=text)9
Replace: 5
The lookbehind doesn't consume input, so only the "9" is matched.
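A quick way to check the lookbehind idea, sketched in Python rather than TextWrangler (the sample string is my own assumption): only a 9 that directly follows "text" is replaced, and no back-reference is needed.

```python
import re

# (?<=text) asserts that "text" precedes the 9 without making it part of the match.
sample = "text9 next9"
result = re.sub(r"(?<=text)9", "5", sample)
print(result)
```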


How can I remove something from the middle of a string with regex?

I have strings which look like this:
/xxxxx/xxxxx-xxxx-xxxx-338200.html
With my regex:
(?<=-)(\d+)(?=\.html)
It matches just the numbers before .html.
Is it possible to write a regex that matches everything that surrounds the numbers (matches the .html part and the part before the numbers)?
In your current pattern you already use a capturing group. In that case you might also match what comes before and after, instead of using the lookarounds:
-(\d+)\.html
To get what comes before and after the digits, you could use 2 capturing groups:
^(.*-)\d+(\.html)$
In the replacement use the 2 groups.
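For instance, a minimal Python sketch of that replacement, assuming the goal is to drop the digits (as the question title suggests) and using the sample path from the question:

```python
import re

path = "/xxxxx/xxxxx-xxxx-xxxx-338200.html"

# Group 1 is everything up to and including the last hyphen, group 2 is ".html".
# The digits sit between the two groups, so they are left out of the result.
stripped = re.sub(r"^(.*-)\d+(\.html)$", r"\1\2", path)
print(stripped)
```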
This should do the job:
.*-\d+\.html
Explanation: .* matches anything up to -\d+, which matches a - followed by a sequence of digits, and \.html matches the literal .html (where \. represents the character .).
To capture the parts, use (.*-)(\d+)(\.html). This puts everything before the number in one group, the number in a second group, and everything after the number in a third.
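To see what each of the three groups holds, here is a small Python sketch (Python is my choice here, the question does not name a language):

```python
import re

path = "/xxxxx/xxxxx-xxxx-xxxx-338200.html"

match = re.search(r"(.*-)(\d+)(\.html)", path)
# .* is greedy, so group 1 runs up to the last hyphen before the digits.
before, number, after = match.groups()
print(before, number, after)
```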

RegEx for adding a zero between a dash and number [duplicate]

This question already has answers here:
Replacing digits immediately after a saved pattern
(2 answers)
Closed 3 years ago.
I want to find a way to add a leading zero "0" in front of numbers, but BBEdit thinks it's substitution #10. Example:
Original string: Video 2-1: Title Goes Here
Desired result: Video 2-01: Title Goes Here
My find regex is: (-)(\d:)
My replace regex is: \10\2. The first substitution is NOT 10. I simply intend to replace the first position, then add a "0", then replace the second position.
Kindly tell me how to tell BBEdit that I want to add a zero and that I don't mean 10th position.
If you simply need a number preceded by a dash, then I recommend using the regex lookbehind for this one.
Try this out:
(?<=-)(\d+:)
As seen here: regex101.com
It tells the regex that the match should be preceded by a dash -, and the - itself won't be matched!
You really don't need to capture the hyphen in group 1: it is a fixed string, so there is no benefit in capturing it and replacing it with \1. Instead, match the hyphen and capture only the digits and colon with -(\d+:), then replace with -0\1
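A quick check of that find/replace pair, sketched in Python rather than BBEdit (so the replacement syntax is Python's, which happens to accept -0\1 as written):

```python
import re

title = "Video 2-1: Title Goes Here"

# The literal 0 comes before \1, so there is no digit after the
# back-reference that could be mistaken for part of the group number.
padded = re.sub(r"-(\d+:)", r"-0\1", title)
print(padded)
```

Note that \d+ also matches numbers that already have two digits; -(\d:) limits the match to a single digit, as in the original find regex.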
Also, there are other better ways to make the replacement where you don't need to deal with back references at all.
Another alternate solution is to use this look around based regex,
(?<=-)(?=\d+:)
and replace it with just 0 which will just insert a zero before the digit.
Another alternative, for when lookbehind is not supported (as in JavaScript prior to ECMAScript 2018), is a solution based on positive lookahead. Basically, match a hyphen - that is followed by digits and a colon using this regex,
-(?=\d+:)
and replace it with -0
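Both lookaround variants from this answer can be compared side by side. This is only a sketch in Python (not BBEdit), using the title from the question:

```python
import re

title = "Video 2-1: Title Goes Here"

# Zero-width match: the position between the hyphen and the digits.
# Replacing that empty match inserts the 0 without touching anything else.
zero_width = re.sub(r"(?<=-)(?=\d+:)", "0", title)

# Lookahead only: the hyphen itself is matched, so it has to be put back.
lookahead_only = re.sub(r"-(?=\d+:)", "-0", title)

print(zero_width)
print(lookahead_only)
```

Neither replacement contains a back-reference, so the \10 ambiguity never comes up.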
Try \1\x30\2 as the replacement. \x30 is the hex escape for the 0 character, so the replacement is \1, then 0, then \2, and cannot be interpreted as \10 then 2. I don't know if BBEdit supports hex escapes in the replacement string though.
This expression might help you to do so, if Video 2- is a fixed input:
(Video 2-)(.+)
If you have other instances, you can add left boundary to this expression, maybe something similar to this:
([A-Za-z]+\s[0-9]+-)(.+)
Then you can replace it, adding a leading zero after capturing group $1.
If you wish, you can add additional boundaries to the expression.
Replacement
For replacing, you can simply use \U0030 or \x30 instead of zero, whichever your program might support, in between $1 and $2.
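If the replacement is done in a script rather than in the editor, another way to avoid the $1-followed-by-0 problem is to build the replacement in a function. A minimal Python sketch, where the helper name add_leading_zero is hypothetical and Python itself is my assumption:

```python
import re

def add_leading_zero(match: re.Match) -> str:
    # Concatenate group 1, a literal "0", and group 2; no replacement
    # string is parsed, so "\10" can never be misread.
    return match.group(1) + "0" + match.group(2)

title = "Video 2-1: Title Goes Here"
padded = re.sub(r"([A-Za-z]+\s[0-9]+-)(.+)", add_leading_zero, title)
print(padded)
```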

Regex to match and replace a character in a pattern

I would like to replace a character "?" with "fi" in a string.
I could write a generic str replace for this. But I want to replace the "?" only if it appears in between two A-Za-z character and avoid the rest
Eg., "Okay?" should be "Okay?" and not "Okayfi"
but
Modi?es should be Modifies since it has ? in middle
What have I tried?
sentence = re.sub(r"(\?)\b", "fi", sentence)
Please see here.
https://regexr.com/3nvk3
Seems to work fine in regexr, but doesn't work well in code. Am I doing something wrong?
The best approach here is to find the original text with the fi ligature and read it in with proper encoding.
Otherwise, you will have to use some workarounds.
You may use (?<=[a-zA-Z]) / (?=[A-Za-z]) lookarounds:
sentence = re.sub(r"(?<=[a-zA-Z])\?(?=[a-zA-Z])", "fi", sentence)
See the regex demo. The (?<=[a-zA-Z]) positive lookbehind matches a position immediately after an ASCII letter, and the (?=[A-Za-z]) positive lookahead matches a position immediately before an ASCII letter.
Or, you may also use a capturing group with backreferences:
sentence = re.sub(r"([a-zA-Z])\?([a-zA-Z])", r"\1fi\2", sentence)
See another regex demo. Note that \1 references the value captured with the first ([a-zA-Z]) group and \2 references the value captured into Group 2 (([a-zA-Z])).
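Here is a small runnable comparison of the two approaches. The sample sentences are my own, and the "a?b?c" case is only there to show where the two differ:

```python
import re

LOOKAROUND = r"(?<=[a-zA-Z])\?(?=[a-zA-Z])"
GROUPS = r"([a-zA-Z])\?([a-zA-Z])"

sentence = "Modi?es the con?g. Okay?"
fixed_lookaround = re.sub(LOOKAROUND, "fi", sentence)
fixed_groups = re.sub(GROUPS, r"\1fi\2", sentence)

# The group version consumes the letters on both sides, so in "a?b?c"
# the "b" is already used up when the second "?" is reached.
overlap_lookaround = re.sub(LOOKAROUND, "fi", "a?b?c")
overlap_groups = re.sub(GROUPS, r"\1fi\2", "a?b?c")

print(fixed_lookaround, fixed_groups)
print(overlap_lookaround, overlap_groups)
```

For ordinary text both give the same result; the lookaround version is the safer choice if two ligatures can be separated by a single letter.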

Notepad++ Replace regex match for same text plus appending character

I have a file with text and numbers with a length of five (i.e. 12000, 11153, etc.). I want to append all of these numbers with a 0. So 11153 becomes 111530. Is this possible in Notepad++?
I know I can find all numbers with the following regex: [0-9]{5}, but how can I replace these with the same number, plus an appending 0?
In the replacement box I tried the following things:
[0-9]{5}0 - Which it took literally, so 11153 was replaced with [0-9]{5}0
\10 - I read somewhere that \1 would take the match, but it doesn't seem to work. This will replace 11153 with 0
EDIT: \00 - Based on this SO answer I see I need to use \0 instead of \1. It still doesn't work though. This will replace 11153 with
So, I've got the feeling I'm close with the \1 or \0, but not close enough.
You are very near to the answer! What you missed is a capturing group.
Use this regex in "Find what" section:
([0-9]{5})
In "Replace with", use this:
\10
The ( and ) represent a capturing group. This essentially means that you capture your number, and then replace it with the same followed by a zero.
You are very close. You need to add a capturing group to your regex by surrounding it with parentheses: ([0-9]{5})
Then use \10 as the replacement. This is replacing the match with the text from group 1 followed by a zero.
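If you end up scripting this instead of using Notepad++, be aware that the replacement syntax differs. In Python's re module, for example, \10 is read as group 10, so a sketch there would use \g<1>0, or \g<0>0 to reuse the whole match without any capturing group. The sample text and the \b anchors are my own assumptions:

```python
import re

text = "ids 12000 and 11153, but not 123456"

# \g<1> refers to group 1; the 0 after it is a literal character.
with_group = re.sub(r"\b([0-9]{5})\b", r"\g<1>0", text)

# \g<0> is the entire match, so no parentheses are needed in the pattern.
whole_match = re.sub(r"\b[0-9]{5}\b", r"\g<0>0", text)

print(with_group)
print(whole_match)
```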
You can use \K to reset the start of the match, so that only the position after the five digits is replaced:
\b\d{5}\b\K
And replace with 0
\b matches a word boundary
\d is a short for digit [0-9]
See demo at regex101

regex - 4 digit number match and replace

I want to match and replace a number of four digit numbers in a csv file
1,1456,2,3,4,5
2,1455,2,3,4,5
so that all 1400 numbers in the second column are mapped to the range of two hundred
1456 -> 256
1455 -> 255
I have this regex to match the 1400 numbers
',[1][4][0-9][0-9],'
but how can i define the matched substring regex to retain the last two digits of the match?
EDIT
Ended up changing the match regex to
,[1][4]([0-9][0-9])
and the match defined as
,2\1
in Notepad++
Replace /14(\d{2})/ with 2\1, where \1 is a back reference to the first capturing group. Adapt to your regex flavor of choice.
sed -e 's/,[1][4]\([0-9][0-9]\),/,2\1,/'
Notice how the \( \) syntax captures a part of the matched expression, and \1 is used to say "the first captured data".
You need to use a backreference - by surrounding one or more parts of a regex in parentheses, you can later reference them in the output. Here is my final version (works with sed -r).
's/,[1][4]([0-9][0-9])/,2\1/'
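If you would rather script this, here is a rough Python equivalent that also restricts the change to the second column. The third sample row and the column anchoring are my own additions, not part of the question:

```python
import re

rows = "1,1456,2,3,4,5\n2,1455,2,3,4,5\n3,7,1456,9"

# Group 1 is the first column plus its comma, group 2 is the last two digits.
# \g<1>2 is needed because "\12" would be read as a reference to group 12.
pattern = r"^([^,\n]*,)14([0-9][0-9])(?=,)"
remapped = re.sub(pattern, r"\g<1>2\2", rows, flags=re.MULTILINE)
print(remapped)
```

The 1456 in the third column of the last row is left alone because the pattern is anchored to the start of each line.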
You should use a group, i.e. something like
',[1][4]([0-9][0-9]),'
Some regex dialects will let you name groups, e.g. in .NET
',[1][4](?<LastTwoDigits>[0-9][0-9]),'
If you specify which language you are using, it will be easier to help you.
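As an illustration of named groups in one specific language: Python spells them (?P<name>...) rather than the .NET form shown above, and refers to them as \g<name> in the replacement. The group name last_two is just a placeholder:

```python
import re

line = "1,1456,2,3,4,5"

# (?P<last_two>...) names the group; \g<last_two> inserts its text.
result = re.sub(r",14(?P<last_two>[0-9][0-9]),", r",2\g<last_two>,", line)
print(result)
```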