I have addresses in two formats:
SomeHouse,
Holbrook,
Belper,
Derbyshire,
DE56 0RR
and
SomeHouse,
Holbrook,
Belper,
Derbyshire,
DE56 0RR(123123123123)
The number only ever appears right at the end, is always in brackets and always 12 digits.
I am trying to get a regex to match two groups ... the address and the number (if it is there).
It is a head banger (for my inregexperienced self) since I can't get my expression to work on both types of address.
I have
(?<address>.*)(?<bracketsandnum>\((?<num>[0-9]{12})\))$
which also uses a group to match the brackets - not so sure I need that bit :) certainly not as a named group anyway.
Please advise!
Cheers,
James.
Update
I have used the answer provided by Martinho, Qtax. Many thanks to them.
Now I understand a bit more, I see my question is similar to the following:
Ignoring an optional suffix with a greedy regex
Make the second group optional with ?, and use a non-greedy match in the first group (by modifying * with ?). Something like this:
^(?<address>.*?)(?:\((?<num>\d{12})\))?$
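For a multi-line address like the ones in the question, the dot also has to be allowed to match line breaks. Here is a minimal sketch in Python (an assumption, since the question does not name a language), where named groups are spelled (?P<name>...) and re.DOTALL plays the role of single-line mode. ADDRESS_RE and split_address are hypothetical names used only for illustration:

```python
import re

# Same pattern as above, in Python's named-group syntax.
# re.DOTALL lets "." match newlines so the lazy group can span the whole address.
ADDRESS_RE = re.compile(r"^(?P<address>.*?)(?:\((?P<num>\d{12})\))?$", re.DOTALL)


def split_address(text):
    """Return (address, number); number is None when there is no bracketed suffix."""
    match = ADDRESS_RE.match(text)
    return match.group("address"), match.group("num")


plain = "SomeHouse,\nHolbrook,\nBelper,\nDerbyshire,\nDE56 0RR"
print(split_address(plain))
print(split_address(plain + "(123123123123)"))
```

The lazy .*? gives up characters only as far as it must, so the optional suffix group gets the first chance to claim the bracketed number before $ is tested.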
So this is the regex I've made:
^(([01]?\d{1,2})|(2(([0-4]\d)|(5[0-5])))\.){3}(([01]?\d{1,2})|(2(([0-4]\d)|(5[0-5]))))$
I have used several sites to break it down and it seems that it should work, but it doesn't. The desired result is to match any IPv4 - four numbers between 0 and 255 delimited by dots.
As an example, 1.1.1.1 won't give you a match.
The purpose of this question is not to find out a regex for IPv4 address, but to find out why this one, which seems correct, is not.
The literal . is only part of the 200-255 section of the capture group: railroad diagram.
Here's (([01]?\d{1,2})|(2([0-4]\d)|(5[0-5]))\.) formatted differently to help you spot the reason:
(
([01]?\d{1,2})
|
(2([0-4]\d)|(5[0-5])) \.
)
You're matching 0-199 or 200-255 with a dot. The dot is conditional on matching 200-255.
Additionally, as @SebastianProske pointed out, 2([0-4]\d)|(5[0-5]) matches 200-249 or 50-55, not 200-255.
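The effect of that grouping can be checked directly. This is a small sketch assuming Python's re module (the question does not name a regex flavour); ORIGINAL is the pattern from the question and BROKEN_OCTET is the fragment quoted just above, both names being mine:

```python
import re

# The pattern from the question: the "\." sits inside the 2xx alternative only.
ORIGINAL = re.compile(
    r"^(([01]?\d{1,2})|(2(([0-4]\d)|(5[0-5])))\.){3}"
    r"(([01]?\d{1,2})|(2(([0-4]\d)|(5[0-5]))))$"
)

# Octets below 200 have no way to consume the dot that follows them.
print(ORIGINAL.match("1.1.1.1"))  # None
# When each of the first three octets is 2xx, the dot-carrying alternative applies.
print(ORIGINAL.match("200.200.200.1") is not None)  # True

# Without an enclosing group, "|" splits the whole fragment in two.
BROKEN_OCTET = re.compile(r"2([0-4]\d)|(5[0-5])")
print(BROKEN_OCTET.fullmatch("55") is not None)   # True
print(BROKEN_OCTET.fullmatch("255") is not None)  # False
```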
You can fix your regex by adding capturing groups, but ultimately I would recommend not reinventing the wheel: either A) use a pre-existing regex solution or B) parse the IPv4 address by splitting on dots. The latter is easier to read and understand.
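Option B could look something like the following. It is only a sketch, assuming Python and a hypothetical helper name is_ipv4; it accepts leading zeros (such as 001) the same way the regex in the question would, which may or may not be what you want:

```python
def is_ipv4(text):
    """Return True if text is four dot-separated decimal numbers, each 0-255."""
    parts = text.split(".")
    if len(parts) != 4:
        return False
    for part in parts:
        # Rejects empty strings, signs, spaces and non-ASCII digits.
        if not (part.isascii() and part.isdigit()):
            return False
        # At most three digits, and the value must fit in one byte.
        if len(part) > 3 or int(part) > 255:
            return False
    return True


for candidate in ("1.1.1.1", "255.255.255.255", "256.1.1.1", "1.1.1"):
    print(candidate, is_ipv4(candidate))
```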
To fix yours up, just account for the dot after each of the first three groups:
((2[0-4]\d|25[0-5]|[01]?\d{1,2})\.){3}(2[0-4]\d|25[0-5]|[01]?\d{1,2})
(*note that I reversed the order of the 2xx vs 1xx tests as well - prefer SPECIAL|...|NORMAL, or more restrictive first, when using alternations like this)
see it in action
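Note that the pattern above has no ^ and $ anchors, so on its own it will also find an address inside a longer string. Here is a quick way to try it, assuming Python's re module; OCTET and IPV4 are names I made up, and I used non-capturing groups purely to keep the match result tidy:

```python
import re

# One octet, most restrictive alternatives first (200-249, 250-255, then 0-199).
OCTET = r"(?:2[0-4]\d|25[0-5]|[01]?\d{1,2})"
# Three "octet + dot" groups followed by a final octet.
IPV4 = re.compile(rf"(?:{OCTET}\.){{3}}{OCTET}")

# fullmatch() requires the whole string to match, standing in for ^...$.
for candidate in ("1.1.1.1", "255.255.255.255", "256.1.1.1", "1.1.1"):
    print(candidate, IPV4.fullmatch(candidate) is not None)

# Unanchored, the same pattern still finds a valid-looking tail.
print(IPV4.search("256.1.1.1").group())  # 56.1.1.1
```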
I'm a regex noob and for the life of me I can't figure out how to match the third IPv4 address on a line that contains three IPv4 addresses.
The line in question:
ip route 214.25.48.547 255.255.255.255 16.48.75.46 name Chicago-VPN
The regex I have so far that matches one IP:
([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})
If I put a {3} at the end of it, it breaks. I think it has something to do with the spaces between the addresses but I can't figure out how to handle that. I need to capture the third address.
https://regex101.com/r/mN3cR6/1
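You just need to add the global modifier (g) to the regex, so that it finds every match on the line rather than stopping at the first one. The third match is then the address you want.
Your new code should look like this (keep the dots escaped, otherwise they match any character):
/([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})/g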
See this demo https://regex101.com/r/mN3cR6/2
Try
([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\s?)+
This should match one, two, or three, or even more "IPs".
Or
([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})\s([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})\s([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})
for exactly 3.
Or
([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\s?){3}
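for a shorter pattern, with caveats: because \s? is optional it also accepts addresses that run together with no space between them, and in most engines a repeated capture group keeps only its last repetition (here the third address plus any trailing whitespace).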
Note that the basic idea is problematic too, as it matches "999.999.999.999" when it is definitely not a valid IP address.
The following should match the third ip
(?:[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\s){2}([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})
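To see it against the line from the question, here is a short sketch assuming Python's re module (the question does not say which language is in use); IP and THIRD_IP are illustrative names:

```python
import re

IP = r"[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}"
# Skip two addresses (each followed by whitespace) without capturing,
# then capture the third.
THIRD_IP = re.compile(rf"(?:{IP}\s){{2}}({IP})")

line = "ip route 214.25.48.547 255.255.255.255 16.48.75.46 name Chicago-VPN"
match = THIRD_IP.search(line)
print(match.group(1))  # 16.48.75.46
```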
It's possible to be more compact depending on what language you're using - for instance in Ruby
string.scan(/([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})/)[2]
would give you what you want. You could also collapse the multiple [0-9]{1,3}\. instances using non-capturing groups and counts.
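For comparison, a rough Python equivalent of the scan approach, using the collapsed form of the pattern. This is a sketch only; Python is my assumption, and IP and addresses are made-up names:

```python
import re

# Three "digits + dot" repetitions followed by a final run of digits.
IP = re.compile(r"(?:[0-9]{1,3}\.){3}[0-9]{1,3}")

line = "ip route 214.25.48.547 255.255.255.255 16.48.75.46 name Chicago-VPN"
# With no capturing groups, findall() returns each whole match as a string.
addresses = IP.findall(line)
print(addresses[2])  # 16.48.75.46
```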
The problem is that the regex needs to contain not only the IPs but also the spaces between them.
So adding a space into the repeated group should do the trick:
([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3} ){3}
If you don't want that space in the final match, you can make it non-greedy, using ?? (or *?):
([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3} ??){3}
Also note that your regex matches more than just valid IPs, e.g. 999.999.999.999 would match nicely.
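Both variants can be tried on the line from the question. The sketch below assumes Python's re module, where (as in most engines) a repeated capture group ends up holding only its last repetition, which here happens to be the third address; the variable names are mine:

```python
import re

IP = r"[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}"
line = "ip route 214.25.48.547 255.255.255.255 16.48.75.46 name Chicago-VPN"

# Mandatory space inside the repeated group: group 1 keeps the trailing space.
with_space = re.search(rf"({IP} ){{3}}", line)
# Lazy optional space: taken only where the next repetition needs it.
lazy_space = re.search(rf"({IP} ??){{3}}", line)

print(repr(with_space.group(1)))  # '16.48.75.46 '
print(repr(lazy_space.group(1)))  # '16.48.75.46'
```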
You are already matching all three IPs with that regex.
([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})
Match 1
214.25.48.547
Match 2
255.255.255.255
Match 3
16.48.75.46
You can test it here:
http://rubular.com/
The problem may be with how you are trying to access them.
In Ruby, your regex works perfectly:
regex = /([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})/
"ip route 214.25.48.547 255.255.255.255 16.48.75.46 name Chicago-VPN".scan(regex)
=> [["214.25.48.547"], ["255.255.255.255"], ["16.48.75.46"]]
I am using a system which takes a PCRE compatible regular expression.
The system stores capture group 1 into a database.
I need to capture two halves of a string with a delimiter, excluding the delimiter, as a single capture group.
Given the string: "I want to capture this bit but not this bit and definitely this bit"
I get that I could create a regex like:
([A-Za-z\s]*) but not this bit([A-Za-z\s]*)
This would give me two capture groups:
Group 1: "I want to capture this bit"
Group 2: " and definitely this bit"
However, I miss out on half my result, as group 1 is all that is stored.
You may be thinking about the branch reset feature. But this is only an assumption.
(?|([a-zA-Z\s]+) but not this bit|([a-zA-Z\s]+))
As stated in the comments, you can fix this by using the correct syntax.
([A-Za-z\s]+) but not this bit([A-Za-z\s]+)
So it turned out I had to do this programmatically, rather than relying on a single regex. Turns out Casimir was correct that it wasn't possible to do this with a single capture group, even following hwnd's suggestion, as below:
branch-reset does not result in a combined capture group
Also, yes, I had the wrong slash :-P
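For anyone wondering what "programmatically" might look like: a capture group is always one contiguous slice of the input, so the two halves have to be joined after matching. The sketch below is not the code I actually ran, just an illustration in Python with hypothetical names (DELIMITED, capture_around), and it assumes you can post-process outside the system that takes the PCRE pattern:

```python
import re

# Two groups, one on each side of the text that must be dropped.
DELIMITED = re.compile(r"([A-Za-z\s]*) but not this bit([A-Za-z\s]*)")


def capture_around(text):
    """Return both halves joined together, or None if the delimiter is absent."""
    match = DELIMITED.fullmatch(text)
    if match is None:
        return None
    return match.group(1) + match.group(2)


sample = "I want to capture this bit but not this bit and definitely this bit"
print(capture_around(sample))  # I want to capture this bit and definitely this bit
```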