RegEx for adding a zero between a dash and number [duplicate] - regex

I want to add a leading zero "0" in front of numbers, but BBEdit interprets my replacement as substitution #10. Example:
Original string: Video 2-1: Title Goes Here
Desired result: Video 2-01: Title Goes Here
My find regex is: (-)(\d:)
My replace pattern is: \10\2. I do NOT mean group 10. I simply intend to insert the first group, then add a "0", then insert the second group.
Kindly tell me how to tell BBEdit that I want to add a zero and that I don't mean 10th position.

If you simply need a number preceded by a dash, then I recommend using the regex lookbehind for this one.
Try this out:
(?<=-)(\d+:)
As seen here: regex101.com
It tells the regex engine that the match must be preceded by a dash -, while the - itself is not part of the match. Replacing with 0\1 then puts the zero in front of the digits.

You really don't need to capture the hyphen in group 1: it is a fixed string, so there is no benefit in capturing it and putting it back with \1. Instead, capture only the digits and colon using -(\d+:) and replace with -0\1
Regex Demo
There are also ways to make the replacement without dealing with back references at all.
One alternative is this lookaround-based regex,
(?<=-)(?=\d+:)
and replace it with just 0 which will just insert a zero before the digit.
Regex Demo with lookaround
Where lookbehind is not supported (as in JavaScript prior to ECMAScript 2018), you can use a solution based on positive lookahead only. Match a hyphen - that is followed by digits and a colon using this regex,
-(?=\d+:)
and replace it with -0
Regex Demo with only positive look ahead
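To try the two lookaround variants outside BBEdit, here is a minimal sketch using Python's re module. Python is an assumption here, since the question is about BBEdit. The helper names are hypothetical and only serve the illustration.

```python
import re

SAMPLE = "Video 2-1: Title Goes Here"

def insert_zero_lookaround(text: str) -> str:
    # Zero-width match: the position after a dash and before "digits:".
    # Nothing is consumed, so the replacement is just the inserted "0".
    return re.sub(r"(?<=-)(?=\d+:)", "0", text)

def insert_zero_lookahead(text: str) -> str:
    # Lookahead only: the dash is consumed, so the replacement puts it back
    # together with the zero.
    return re.sub(r"-(?=\d+:)", "-0", text)

print(insert_zero_lookaround(SAMPLE))
print(insert_zero_lookahead(SAMPLE))
```

Because no capture group is involved, the replacement contains no back reference and the \10 ambiguity cannot arise. Note that \d+ also pads numbers that already have two or more digits; using \d: instead limits the change to single digits.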

Try \1\x30\2 as the replacement. \x30 is the hex escape for the 0 character, so the replacement is \1, then 0, then \2, and cannot be interpreted as \10 then 2. I don't know if BBEdit supports hex escapes in the replacement string though.
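The same ambiguity exists in other engines, and each has its own way around it. As a rough illustration only (Python's re module, not BBEdit), the \g<1> form marks where the group number ends. The function names below are hypothetical.

```python
import re

SAMPLE = "Video 2-1: Title Goes Here"
PATTERN = r"(-)(\d:)"

def add_leading_zero(text: str) -> str:
    # \g<1> ends the group reference explicitly, so the following "0"
    # is a literal character rather than part of the group number.
    return re.sub(PATTERN, r"\g<1>0\g<2>", text)

def ambiguous_replacement_fails(text: str) -> bool:
    # In Python, r"\10\2" is read as a reference to group 10. The pattern
    # only has two groups, so re.sub raises re.error.
    try:
        re.sub(PATTERN, r"\10\2", text)
    except re.error:
        return True
    return False

print(add_leading_zero(SAMPLE))
print(ambiguous_replacement_fails(SAMPLE))
```

Whether BBEdit offers an equivalent delimiter syntax is not something this sketch establishes; check BBEdit's grep documentation for that.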

This expression might help you to do so, if Video 2- is a fixed input:
(Video 2-)(.+)
If you have other instances, you can add left boundary to this expression, maybe something similar to this:
([A-Za-z]+\s[0-9]+-)(.+)
Then, you can simply insert a leading zero after capturing group $1.
If you wish, you can add additional boundaries to the expression.
Replacement
For replacing, you can simply use \U0030 or \x30 instead of zero, whichever your program might support, in between $1 and $2.

Related

Notepad++: Can I use regex to find some values and remove only one character instead of the whole pattern?

I want to use regex in notepad to find this pattern: "[0-9]+[\.][0-9]+[,][0-9]+" e.g. 1.010,80260
However from these kind of numbers I just want to remove the '.' , so the new value should be 1010,80260 .
So far I can only replace the whole pattern. Is there a way to do it?
Thank you in advance!
You can make use of the \K meta escape since PCRE doesn't support variable width lookbehinds:
regex:
[0-9]+\K[\.](?=[0-9]+[,][0-9]+)
[0-9]+ - match digits
\K - forget what we've matched so far
[\.] - match a period; just \. can be used, no need for the char class brackets
(?=[0-9]+[,][0-9]+) - ahead of me should be digits followed by a comma and digits
replace:
Nothing
\K is bugged in Notepad++ so you could use this regex instead since you only care that at least one digit is behind the period:
(?<=\d)\.(?=[0-9]+[,][0-9]+)
You can use \K, which basically says throw away whatever was matched up until that point, then add a lookahead. Like so
[0-9]+\K\.(?=[0-9]+[,][0-9]+)
Change the regular expression to: ([0-9]+)[\.]([0-9]+[,][0-9]+)
The () pieces are groups which you can refer to in the replace with \1 for the first group, and \2 for the second group.
The docs also explain this here: https://npp-user-manual.org/docs/searching/#substitution-grouping (even better, and in more detail, than my usage in this answer...)
EDIT: I just wanted to share the animated gif showing that 'Replace' in Notepad++ 7.9.5. does not seem to work.
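For comparison, the lookaround and capture-group answers can be sketched in Python. This is an assumption for illustration, since the question targets Notepad++, and Python's standard re module does not implement \K, so only the other two approaches are shown. The function names are hypothetical.

```python
import re

def drop_dot_lookaround(text: str) -> str:
    # Only the dot itself is matched (digit behind, "digits,digits" ahead),
    # so replacing the match with "" removes just that character.
    return re.sub(r"(?<=\d)\.(?=[0-9]+,[0-9]+)", "", text)

def drop_dot_groups(text: str) -> str:
    # The whole number is matched; both captured halves are written back
    # without the dot between them.
    return re.sub(r"([0-9]+)\.([0-9]+,[0-9]+)", r"\1\2", text)

print(drop_dot_lookaround("1.010,80260"))
print(drop_dot_groups("1.010,80260"))
```

Both variants leave a plain decimal such as 1.5 alone, because no comma follows the digits after the dot.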

Regex: how do I match a character before other capture characters?

I'm trying to match on a list of strings where I want to make sure the character before the match is not an equals sign, without capturing that character. So, for a list (excerpted from pip freeze) like:
ply==3.10
powerline-status===2.6.dev9999-git.b-e52754d5c5c6a82238b43a5687a5c4c647c9ebc1-
psutil==4.0.0
ptyprocess==0.5.1
I want the captured output to look like this:
==3.10
==4.0.0
==0.5.1
I first thought using a negative lookahead (?![^=]) would work, but with a regular expression of (?![^=])==[0-9]+.* it ends up capturing the line I don't want:
==3.10
==2.6.dev9999-git.b-e52754d5c5c6a82238b43a5687a5c4c647c9ebc1-
==4.0.0
==0.5.1
I also tried using a non-capturing group (?:[^=]) with a regex of (?:[^=])==[0-9]+.* but that ends up capturing the first character which I also don't want:
y==3.10
l==4.0.0
s==0.5.1
So the question is this: How can one match but not capture a string before the rest of the regex?
Negative look behind would be the go:
(?<!=)==[0-9.]+
Also, here is the site I like to use:
http://www.rubular.com/
Of course it does sometimes help if you say which engine/software you are using, so we know what limitations there might be.
If you want to remove the version numbers from the text, you could capture a character that is not an equals sign ([^=]) in the first capturing group, followed by matching == and the version numbers \d+(?:\.\d+)+. Then in the replacement you would use your capturing group.
Regex
([^=])==\d+(?:\.\d+)+
Replacement
Group 1 $1
Note
You could also use ==[0-9]+.* or ==[0-9.]+ to match the double equals signs and version numbers but that would be a very broad match. The first would also match ====1test and the latter would also match ==..
There's another regex operator called a 'lookbehind assertion' (also called positive lookbehind), written (?<=...). Using it in my example above, the expression (?<=[^=])==[0-9]+.* produces the expected output:
==3.10
==4.0.0
==0.5.1
At the time of this writing, it took me a while to discover this - notably the lookbehind assertion currently isn't supported in the popular regex tool regexr.
If there are alternatives to lookbehind for solving this, I'd love to hear them.
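A small sketch of the behaviour discussed in this thread, assuming Python's re module (the question does not name an engine). It runs the negative-lookbehind pattern and the asker's first attempt against the pip freeze excerpt. The constant and function names are hypothetical.

```python
import re

FREEZE_OUTPUT = """\
ply==3.10
powerline-status===2.6.dev9999-git.b-e52754d5c5c6a82238b43a5687a5c4c647c9ebc1-
psutil==4.0.0
ptyprocess==0.5.1
"""

def pinned_versions(text: str) -> list:
    # (?<!=) rejects any "==" that has another "=" directly before it,
    # and the character checked is never part of the match.
    return re.findall(r"(?<!=)==[0-9.]+", text)

def first_attempt(text: str) -> list:
    # (?![^=]) looks AHEAD, not behind: it only requires that the next
    # character is "=", so the last two "=" of "===" still match.
    return re.findall(r"(?![^=])==[0-9]+.*", text)

print(pinned_versions(FREEZE_OUTPUT))
print(len(first_attempt(FREEZE_OUTPUT)))
```

The lookahead version returns four entries, including the powerline-status line, which is the result described in the question.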

Capture number between two whitespaces (RegEx)

I have the following data:
SOMEDATA .test 01/45/12 2.50 THIS IS DATA
and I want to extract the number 2.50 out of this. I have managed to do this with the following RegEx:
(?<=\d{2}\/\d{2}\/\d{2} )\d+.\d+
However that doesn't work for input like this:
SOMEDATA .test 01/45/12 2500 THIS IS DATA
In this case, I want to extract the number 2500.
I can't seem to figure out a regex rule for that. Is there a way to extract something between two spaces ? So extract the text/number after the date until the next whitespace ? All I know is that the date will always have the same format and there will always be a space after the text and then a space after the number I want to extract.
Can someone help me out on this ?
Capture number between two whitespaces
A whitespace is matched with \s, and non-whitespace with \S.
So, what you can use is:
\d{2}\/\d{2}\/\d{2} +(\S+)
                      ^^^
See the regex demo
The 1+ non-whitespace symbols are captured into Group 1.
If - for some reason - you need to only get the value as a whole match, use your lookbehind approach:
(?<=\d{2}\/\d{2}\/\d{2} )\S+
Or - if you are using PCRE - you may leverage the match reset operator \K:
\d{2}\/\d{2}\/\d{2} +\K\S+
                     ^^
See another demo
NOTE: the \K and a capture group approaches allow 1 or more spaces after the date and are thus more flexible.
I see some people helped you already, but if you would want an alternative working one for some reason, here's what works too :)
.+ \d+\/\d+\/\d+ (\d+[\.\d]*)
The .+ matches anything up to the date, including the space before it;
then \d+\/\d+\/\d+ matches the date, followed by a space;
the capturing group is the number. As you can see, I made the fractional part optional, so both floating point values and whole numbers are matched. Hope this helps!
Proof: https://regex101.com/r/fY3nJ2/1
Just make the fractional part optional:
(?<=\d{2}\/\d{2}\/\d{2} )\d+(?:\.\d+)?
Demo: https://regex101.com/r/jH3pU7/1
Update following clarifications in comments:
To match anything (but space) surrounded by spaces and prepended by date use:
(?<=\d{2}\/\d{2}\/\d{2} )\S+
Demo: https://regex101.com/r/jH3pU7/3
Rather than capture, you can make your entire match be the target text by using a look behind:
(?<=\d\d(\/\d\d){2} )\S+
This matches the first series of non-whitespace that follows a "date like" part.
Note also the reduction in the length of the "date like" pattern. You may consider using this part of the regex in whatever solution you use.
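The capture-group approach from the first answer can be sketched as follows, assuming Python's re module (the question does not say which engine is in use). The names DATE_THEN_TOKEN and token_after_date are hypothetical.

```python
import re
from typing import Optional

# A date-like part, one or more spaces, then the next run of non-whitespace.
DATE_THEN_TOKEN = re.compile(r"\d{2}/\d{2}/\d{2} +(\S+)")

def token_after_date(line: str) -> Optional[str]:
    # Group 1 holds the token; the date itself is matched but not returned.
    match = DATE_THEN_TOKEN.search(line)
    return match.group(1) if match else None

print(token_after_date("SOMEDATA .test 01/45/12 2.50 THIS IS DATA"))
print(token_after_date("SOMEDATA .test 01/45/12 2500 THIS IS DATA"))
```

Using \S+ rather than a digit pattern means the token is whatever sits between the two spaces, which is what the question asks for; tighten it to \d+(?:\.\d+)? if only numbers should be accepted.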

Notepad++ Replace regex match for same text plus appending character

I have a file with text and numbers with a length of five (i.e. 12000, 11153, etc.). I want to append all of these numbers with a 0. So 11153 becomes 111530. Is this possible in Notepad++?
I know I can find all numbers with the following regex: [0-9]{5}, but how can I replace these with the same number, plus an appending 0?
In the replacement box I tried the following things:
[0-9]{5}0 - Which it took literally, so 11153 was replaced with [0-9]{5}0
\10 - I read somewhere that \1 would take the match, but it doesn't seem to work. This will replace 11153 with 0
EDIT: \00 - Based on this SO answer I see I need to use \0 instead of \1. It still doesn't work though. This will replace 11153 with
So, I've got the feeling I'm close with the \1 or \0, but not close enough.
You are very near to the answer! What you missed is a capturing group.
Use this regex in "Find what" section:
([0-9]{5})
In "Replace with", use this:
\10
The ( and ) represent a capturing group. This essentially means that you capture your number, and then replace it with the same followed by a zero.
You are very close. You need to add a capturing group to your regex by surrounding it with parentheses: ([0-9]{5})
Then use \10 as the replacement. This is replacing the match with the text from group 1 followed by a zero.
You can use \K to reset.
\b\d{5}\b\K
And replace with 0
\b matches a word boundary
\d is a short for digit [0-9]
See demo at regex101
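How a replacement like \10 is read depends on the engine. As a hedged comparison, in Python's re module (an assumption; the thread is about Notepad++) \10 means group 10, so the explicit \g<1> form is needed. The sketch combines the capture group with the \b boundaries from the last answer; append_zero is a hypothetical name.

```python
import re

def append_zero(text: str) -> str:
    # \b on both sides keeps the match to standalone five-digit numbers.
    # \g<1>0 means "group 1, then a literal 0" with no ambiguity.
    return re.sub(r"\b([0-9]{5})\b", r"\g<1>0", text)

print(append_zero("11153 and 12000"))
```

Without the boundaries, ([0-9]{5}) would also match the first five digits of a longer number.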

How to replace only part of found text?

I have a file with a some comma separated names and some comma separated account numbers.
Names will always be something like Dow, John and numbers like 012394,19862.
Using Notepad++'s "Regex Find" feature, I'd like to replace commas between numbers with pipes |.
Basically :
turn:          into:
Dow,John       Dow,John
12345,09876    12345|09876
13568,08642    13568|08642
I've been using [0-9], to find the commas, but I can't get it to properly leave the number's last digit and replace just the comma.
Any ideas?
Search for ([0-9]), and replace it with \1|. Does that work?
use this regex
(\d),(\d)
and replace it with
$1|$2
OR
\1|\2
(?<=\d), should work. Oddly enough, this only works if I use replace all, but not if I use replace single. As an alternative, you can use (\d), and replace with $1|
General thoughts about replacing only part of a match
In order to replace a part of a match, you need to either 1) use capturing groups in the regex pattern and backreferences to the kept group values in the replacement pattern, or 2) lookarounds, or 3) a \K operator to discard left-hand context.
So, if you have a string like a = 10, and you want to replace the number after a = with, say, 500, you can
find (a =)\d+ and replace with \1500 / ${1}500 (if you use $n backreference syntax and it is followed with a digit, you should wrap it with braces)
find (?<=a =)\d+ and replace with 500 (since (?<=...) is a non-consuming positive lookbehind pattern and the text it matches is not added to the match value, and hence is not replaced)
find a =\K\d+ and replace with 500 (where \K makes the regex engine "forget" the text it has matched up to the \K position, making it similar to the lookbehind solution, but allowing any quantifiers, e.g. a\h*=\K\d+ will match a = even if there are any zero or more horizontal whitespaces between a and =).
Current problem solution
In order to replace any comma in between two digits, you should use lookarounds:
Find What: (?<=\d),(?=\d)
Replace With: |
Details:
(?<=\d) - a positive lookbehind that requires a digit immediately to the left of the current location
, - a comma
(?=\d) - a positive lookahead that requires a digit immediately to the right of the current location.
See the regex demo.
Variations:
Find What: (\d),(?=\d)
Replace With: \1|
Find What: \d\K,(?=\d)
Replace With: |
Note: if there are comma-separated single digits, e.g. 1,2,3,4, you can't use (\d),(\d): each match consumes the digit to the right of the comma, so that digit cannot start the next match and only every other comma is replaced.
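The difference between the lookaround pattern and the two-group pattern is easy to see in a short sketch. Python's re module is assumed here for illustration, while the thread itself concerns Notepad++. The function names are hypothetical.

```python
import re

def commas_to_pipes(text: str) -> str:
    # Lookarounds check the neighbouring digits without consuming them,
    # so every comma between two digits is matched on its own.
    return re.sub(r"(?<=\d),(?=\d)", "|", text)

def commas_to_pipes_groups(text: str) -> str:
    # Both digits are consumed by each match, so adjacent matches
    # cannot share a digit.
    return re.sub(r"(\d),(\d)", r"\1|\2", text)

print(commas_to_pipes("Dow,John"))
print(commas_to_pipes("12345,09876"))
print(commas_to_pipes("1,2,3,4"))
print(commas_to_pipes_groups("1,2,3,4"))
```

On 1,2,3,4 the lookaround version gives 1|2|3|4, while the two-group version gives 1|2,3|4.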