How to make a regular expression for IPv6 with prefix? - regex

I created this very complex regular expression (RegEx101) for IPv4 and IPv6:
((^\s*((([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){3}([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5]))\s*$)|(^\s*((([0-9A-Fa-f]{1,4}:){7}([0-9A-Fa-f]{1,4}|:))|(([0-9A-Fa-f]{1,4}:){6}(:[0-9A-Fa-f]{1,4}|((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3})|:))|(([0-9A-Fa-f]{1,4}:){5}(((:[0-9A-Fa-f]{1,4}){1,2})|:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3})|:))|(([0-9A-Fa-f]{1,4}:){4}(((:[0-9A-Fa-f]{1,4}){1,3})|((:[0-9A-Fa-f]{1,4})?:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){3}(((:[0-9A-Fa-f]{1,4}){1,4})|((:[0-9A-Fa-f]{1,4}){0,2}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){2}(((:[0-9A-Fa-f]{1,4}){1,5})|((:[0-9A-Fa-f]{1,4}){0,3}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){1}(((:[0-9A-Fa-f]{1,4}){1,6})|((:[0-9A-Fa-f]{1,4}){0,4}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(:(((:[0-9A-Fa-f]{1,4}){1,7})|((:[0-9A-Fa-f]{1,4}){0,5}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:)))(%.+)?\s*$))|(^\*$)
Below are three examples of data that can be checked by this regular expression.
2001:db8:abcd:0012:0000:0000:0000:0000 (ipv6)
0000:0000:2001:DB8:ABCD:12:: (condensed notation)
255.255.255.0 (ipv4)
However, this regular expression does not work for IPv6 addresses with a prefix.
For example, this address does not match:
2001:db8:abcd:0012::0/112
How can this problem be fixed?

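And if anybody in the future wants optional subnet masks for IPv4 as well as IPv6, this expression adds them. Note that the IPv6 prefix group is attached only to the alternative with four leading groups, so it covers the example above but not every IPv6 form (2001:db8::/32, for instance, is rejected); placing the group just before the final \s*$ of the IPv6 part would cover all of them.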
/((^\s*((([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){3}([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5]))\s*(\/(\d|1\d|2\d|3[0-2]))?$)|(^\s*((([0-9A-Fa-f]{1,4}:){7}([0-9A-Fa-f]{1,4}|:))|(([0-9A-Fa-f]{1,4}:){6}(:[0-9A-Fa-f]{1,4}|((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3})|:))|(([0-9A-Fa-f]{1,4}:){5}(((:[0-9A-Fa-f]{1,4}){1,2})|:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3})|:))|(([0-9A-Fa-f]{1,4}:){4}(((:[0-9A-Fa-f]{1,4}){1,3}(\/(\d{1,2}|1[0-1]\d|12[0-8]))?)|((:[0-9A-Fa-f]{1,4})?:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){3}(((:[0-9A-Fa-f]{1,4}){1,4})|((:[0-9A-Fa-f]{1,4}){0,2}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){2}(((:[0-9A-Fa-f]{1,4}){1,5})|((:[0-9A-Fa-f]{1,4}){0,3}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){1}(((:[0-9A-Fa-f]{1,4}){1,6})|((:[0-9A-Fa-f]{1,4}){0,4}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(:(((:[0-9A-Fa-f]{1,4}){1,7})|((:[0-9A-Fa-f]{1,4}){0,5}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:)))(%.+)?\s*$))|(^\*$)/g
https://regex101.com/r/lS4Cjo/1

My understanding is that you want to optionally match a 'prefix' (a number at the end of the address, always preceded by a forward slash) p such that 0≤p≤128. Let's try breaking this up.
Optionally match the following block:
Match a forward slash /
Either match a one- or two-digit number
Or match a number between 100 and 119 (inclusive)
Or match a number between 120 and 128 (inclusive)
The above is equivalent to this regex: (\/(\d{1,2}|1[0-1]\d|12[0-8]))?.
https://regex101.com/r/5cBm5a/3
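To sanity-check the prefix group on its own, here is a minimal Python sketch. It assumes the group is tested against just the "/NN" suffix rather than a full address, and the names PREFIX_SUFFIX and is_valid_prefix are made up for this illustration.

```python
import re

# The prefix group from the answer, made mandatory so it can be tested alone.
PREFIX_SUFFIX = re.compile(r"/(\d{1,2}|1[0-1]\d|12[0-8])")

def is_valid_prefix(suffix: str) -> bool:
    """Return True if suffix is a slash followed by a number from 0 to 128."""
    return PREFIX_SUFFIX.fullmatch(suffix) is not None

# Collect every value below 200 that the group accepts.
accepted = [n for n in range(0, 200) if is_valid_prefix(f"/{n}")]
print(accepted == list(range(0, 129)))
```

Because the first alternative is \d{1,2}, a leading zero such as /07 is accepted as well; whether that is acceptable depends on the input you expect.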

Related

regex optional string in the middle followed by negative lookahead

I have following entries of 3 allowed ip in a config file:
logging host 10.1.1.1
logging host ipv4 10.1.1.2
logging host 10.1.1.3
ipv4 is an optional string. I need to make sure that there are no entries with an unallowed IP. For example, if there is a line:
logging host 10.1.1.4
then the file is invalid because 10.1.1.4 is not one of the three allowed IPs. I have come up with a Java regex to check for the existence of any unallowed IP:
^logging host (ipv4\s)?(?!10.1.1.1|10.1.1.2|10.1.1.3)
It only works when the optional string ipv4 is absent, not when it is present, as in the second entry: "logging host ipv4 10.1.1.2". On the first attempt the regex engine greedily matches up to "logging host ipv4 ", and the remaining string 10.1.1.2 is one of the options in the negative lookahead, so that attempt fails. The engine then backtracks and matches only up to "logging host ", since ipv4 is optional. The remaining string becomes "ipv4 10.1.1.2", which is not in the negative lookahead, so the whole line is reported as an unallowed IP, which is wrong.
What am I missing?
You get a partial match because you are not matching anything after the lookahead.
For example, in logging host 10.1.1.1 the lookahead sees the value that is not allowed after matching host and there are no other options to explore so the match fails.
In logging host ipv4 10.1.1.2 the ipv4 part is matched first. Then the lookahead sees a value that is not allowed. This time the engine can backtrack, as the ipv4 part is optional. So it gets a match from the position before ipv4, and the match is "logging host ".
You could shorten the pattern for the specific ip numbers to 10\.1\.1\.[123]
For example
^logging host (ipv4\s)?(?!10\.1\.1\.[123])\d{1,3}(?:\.\d{1,3}){3}$
Regex demo
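The backtracking described above can be reproduced with a short Python sketch (the question uses Java, but the behaviour of the optional group and the lookahead is the same here). ORIGINAL and FIXED are the two patterns from this thread; STRICT and the helper is_unallowed are hypothetical additions that put a $ inside the lookahead, because without it 10.1.1.10 is also treated as allowed (it starts with 10.1.1.1).

```python
import re

# The asker's pattern: nothing is consumed after the lookahead.
ORIGINAL = re.compile(r"^logging host (ipv4\s)?(?!10.1.1.1|10.1.1.2|10.1.1.3)")

# The answer's pattern: the address itself must follow the lookahead.
FIXED = re.compile(
    r"^logging host (ipv4\s)?(?!10\.1\.1\.[123])\d{1,3}(?:\.\d{1,3}){3}$"
)

# Hypothetical variant: "$" inside the lookahead so that only the three
# exact addresses are exempt, not everything that starts with them.
STRICT = re.compile(
    r"^logging host (ipv4\s)?(?!10\.1\.1\.[123]$)\d{1,3}(?:\.\d{1,3}){3}$"
)

def is_unallowed(pattern, line):
    """Return True if the pattern flags the line as an unallowed entry."""
    return pattern.search(line) is not None

allowed_line = "logging host ipv4 10.1.1.2"

# The original pattern backtracks over "ipv4 " and reports a partial match.
partial = ORIGINAL.search(allowed_line)
print(repr(partial.group(0)))

# With the address consumed after the lookahead, the allowed line is not flagged.
print(is_unallowed(FIXED, allowed_line))
print(is_unallowed(FIXED, "logging host 10.1.1.4"))

# Only the strict variant flags 10.1.1.10.
print(is_unallowed(FIXED, "logging host 10.1.1.10"))
print(is_unallowed(STRICT, "logging host 10.1.1.10"))
```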
Thanks very much to 'The fourth bird' for leading me to the answer with his important hints.
To summarize, I need to ensure that the config file does not contain any unallowed logging host entries. The following are the allowed host entries in the config file:
logging host 10.1.1.1
logging host 10.1.1.2
logging host ipv6 EFD7:DEA8:AEE4::11:3
The tricky bit here is using an optional for ipv6 did not solve the problem due to backtracking at the optional:
^logging host (ipv6\s)?(?!10.1.1.1|10.1.1.2|ipv6 EFD7:DEA8:AEE4::11:3)
The first solution below uses an atomic group to stop the backtracking; the second solution is much simpler:
^logging (?>host ipv6|host)?\s(?!10.1.1.1|10.1.1.2|EFD7:DEA8:AEE4::11:3)
^logging host\s(?!10.1.1.1|10.1.1.2|ipv6 EFD7:DEA8:AEE4::11:3)
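As a rough illustration of the second, simpler solution applied to a whole file, here is a Python sketch. The escaped dots and the $ inside the lookahead are assumptions of this sketch rather than part of the pattern above, and find_unallowed and the sample text are hypothetical.

```python
import re

# Second solution, with dots escaped and "$" added inside the lookahead
# (both are assumptions of this sketch, not part of the original pattern).
UNALLOWED = re.compile(
    r"^logging host\s(?!(?:10\.1\.1\.1|10\.1\.1\.2|ipv6 EFD7:DEA8:AEE4::11:3)$)"
)

def find_unallowed(config_text):
    """Return the logging host lines whose target is not on the allow list."""
    return [
        line for line in config_text.splitlines()
        if UNALLOWED.search(line.strip())
    ]

sample = """\
logging host 10.1.1.1
logging host 10.1.1.2
logging host ipv6 EFD7:DEA8:AEE4::11:3
logging host 10.1.1.4
logging host ipv6 EFD7:DEA8:AEE4::11:4
"""
print(find_unallowed(sample))
```

Treating "ipv6 <address>" as part of the lookahead alternatives, instead of as an optional group before it, is what removes the backtracking point.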

Regex for finding domains in a sentence but not IP addresses

I am trying to write a regular expression that will match domains in a sentence.
I found this post which was very useful and helped me create the following to match domains, but it also unfortunately matches IP addresses too which I do not want:
((?!-))(xn--)?[a-z0-9][a-z0-9-_]{0,61}[a-z0-9]{0,1}\.(xn--)?([a-z0-9\._-]{1,61}|[a-z0-9-]{1,30})
I want to update my expression so that the following can still be found (in a sentence, between brackets, etc.):
www.example.com
subdomain.example.com
subdomain.example.co.uk
But not:
192.168.0.0
127.0.0.1
Is there a way to do this?
We could use a simple lookahead that excludes combinations of numbers and dots only: (?![\d.]+)
(?![\d.]+)((?!-))(xn--)?[a-z0-9][a-z0-9-_]{0,61}[a-z0-9]{0,1}\.(xn--)?([a-z0-9\._-]{1,61}|[a-z0-9-]{1,30})
Demo
The answer from @wp78de is correct; however, it would not detect domains starting with numerical digits, e.g. 123reg.com.
So remove the first group in the regex, like this (note that this brings back the original pattern, which also matches IP addresses):
((?!-))(xn--)?[a-z0-9][a-z0-9-_]{0,61}[a-z0-9]{0,1}\.(xn--)?([a-z0-9\._-]{1,61}|[a-z0-9-]{1,30})
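One way to keep digit-leading domains such as 123reg.com while still skipping IP addresses is to make the lookahead reject only tokens that consist entirely of digits and dots. The Python sketch below is a hypothetical variant, not something proposed in the answers: the inner negative lookahead, the non-capturing groups, and the name find_domains are all assumptions of this example.

```python
import re

# Hypothetical variant: reject a start position only when a run of digits and
# dots is NOT followed by a further hostname character, i.e. the whole token
# is numeric. [a-z0-9-_] from the original is written as [a-z0-9_-] here.
DOMAIN = re.compile(
    r"(?![\d.]+(?![a-z0-9._-]))"
    r"(?!-)(?:xn--)?[a-z0-9][a-z0-9_-]{0,61}[a-z0-9]{0,1}"
    r"\.(?:xn--)?(?:[a-z0-9._-]{1,61}|[a-z0-9-]{1,30})"
)

def find_domains(text):
    """Return every substring of text that the DOMAIN pattern matches."""
    return [m.group(0) for m in DOMAIN.finditer(text)]

sentence = (
    "See www.example.com (or subdomain.example.co.uk) and 123reg.com, "
    "but not 192.168.0.0 or 127.0.0.1."
)
print(find_domains(sentence))
```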

RegEx for IP Address with no range limit

I'm looking for a regex that finds IP Addresses with no range limit (i.e. 0-999). This is "simpler" than a regular IP Address regex but I'm learning regex and am stumped on how to essentially end the regex and not match IP Addresses with more than 4 periods or characters before/after it.
This is what I have: "/\b(\d{1,3}\.){3}(\d{1,3})\b/"
So, with this regex it will find most IP Addresses but will fail when there is an IP Address like this:
1.2.3.4.5
Appreciate the help. It doesn't matter what flavor of regex; I just need to know how to not match the case above.
You may use lookarounds to restrict the context around your expected matches:
\b(?<!\d\.)(?:\d{1,3}\.){3}\d{1,3}\b(?!\.\d)
  ^^^^^^^^^                         ^^^^^^^^
See the regex demo
Here,
(?<!\d\.) is a negative lookbehind that fails the match if, immediately to the left of the current location, there is a digit + .
(?!\.\d) is a negative lookahead that fails the match if, immediately to the right of the current location, there is a . + a digit.
To also make sure each octet is limited to 1 to 3 digits, you may add a further restriction:
\b(?<!\d\.|\d)(?:\d{1,3}\.){3}\d{1,3}\b(?!\.?\d)
  ^^^^^^^^^^^^                         ^^^^^^^^^
See another regex demo.
Here, (?<!\d\.|\d) also fails the match if there is a digit immediately before the current location, and the lookahead (?!\.?\d) also fails the match if a digit follows the expected match directly, without a dot in between.
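The lookaround approach can be tried out with the following Python sketch. One assumption is needed: Python's re module only accepts fixed-width lookbehinds, so the alternation (?<!\d\.|\d) is split into two separate lookbehinds, which should be equivalent. LOOSE_IP and find_loose_ips are names invented for this example.

```python
import re

# Same idea as the second pattern; the alternation inside the lookbehind is
# split into two fixed-width lookbehinds because Python's re module requires
# fixed-width lookbehind patterns.
LOOSE_IP = re.compile(r"\b(?<!\d\.)(?<!\d)(?:\d{1,3}\.){3}\d{1,3}\b(?!\.?\d)")

def find_loose_ips(text):
    """Return all dotted quads (octets 0-999) that are not part of a longer run."""
    return LOOSE_IP.findall(text)

print(find_loose_ips("host 999.1.22.333 up"))  # a standalone dotted quad
print(find_loose_ips("1.2.3.4.5"))             # five groups: nothing is returned
print(find_loose_ips("1.2.3.4567"))            # over-long last octet: nothing
```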
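You can also use this one, provided the address is the entire string (the dots must be escaped, otherwise they match any character):
^[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}$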

XSD pattern restriction for IP, excluding 0.0.0.0

I'm writing an XSD with an IP type.
I have the regex for 0.0.0.0 - 255.255.255.255, but so far I've failed to exclude 0.0.0.0.
I've tried ?!0.0.0.0, but XSD doesn't support ?!
As part of your current regex, you have a subexpression (repeated four times perhaps) accepting the range 0 to 255. I'll refer to this as &re-0;. Make a similar regex which accepts 1 to 255; I'll refer to this one as &re-1;.
Construct a regex as a choice among:
&re-1;\.&re-0;\.&re-0;\.&re-0; (if the first value is non-zero, then it's not 0.0.0.0)
0\.&re-1;\.&re-0;\.&re-0; (even if the first value is zero, the second value being non-zero saves the overall expression from being 0.0.0.0)
0\.0\.&re-1;\.&re-0; (ditto for two leading zeroes ...)
0\.0\.0\.&re-1; (if you have three leading zeroes, the final value must be non-zero)
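The construction can be checked outside of XSD with a small Python sketch. The two sub-expressions RE0 and RE1 below are assumptions (the answer does not spell them out), and they deliberately use only syntax that XSD patterns also accept; is_nonzero_ip is a hypothetical helper.

```python
import re

# Assumed sub-expressions (not given in the answer): RE0 accepts 0-255,
# RE1 accepts 1-255. Only syntax that XSD patterns also support is used.
RE0 = r"(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])"
RE1 = r"(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]|[1-9])"

# The four alternatives from the answer, joined into one choice.
NONZERO_IP = "|".join([
    RE1 + r"\." + RE0 + r"\." + RE0 + r"\." + RE0,
    r"0\." + RE1 + r"\." + RE0 + r"\." + RE0,
    r"0\.0\." + RE1 + r"\." + RE0,
    r"0\.0\.0\." + RE1,
])

def is_nonzero_ip(value):
    """Return True for any dotted quad from 0.0.0.1 to 255.255.255.255."""
    # XSD patterns are implicitly anchored, which fullmatch emulates.
    return re.fullmatch(NONZERO_IP, value) is not None

for candidate in ["0.0.0.0", "0.0.0.1", "0.1.0.0", "255.255.255.255", "256.0.0.1"]:
    print(candidate, is_nonzero_ip(candidate))
```

In the schema itself, the joined expression would go into a single xs:pattern value.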

Regex: How can I match third IPv4 address?

I'm a regex noob and for the life of me I can't figure out how to match the third IPv4 address on a line that contains three IPv4 addresses.
The line in question:
ip route 214.25.48.547 255.255.255.255 16.48.75.46 name Chicago-VPN
The regex I have so far that matches one IP:
([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})
If I put a {3} at the end of it, it breaks. I think it has something to do with the spaces between the addresses but I can't figure out how to handle that. I need to capture the third address.
https://regex101.com/r/mN3cR6/1
You just need to add the global modifier (g) to the regex, so that it returns every match and not only the first.
Your new code should look like this:
/([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})/g
See this demo https://regex101.com/r/mN3cR6/2
Try
([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\s?)+
This should match one, two, or three, or even more "IPs".
Or
([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})\s([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})\s([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})
for exactly 3.
Or
([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\s?){3}
for a shorter pattern, at the cost of some possible false matches, since the whitespace is optional.
Note that the basic idea is problematic too, as it matches "999.999.999.999" when it is definitely not a valid IP address.
The following should match the third ip
(?:[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\s){2}([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})
It's possible to be more compact depending what language you're using - for instance in ruby
string.scan(/([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})/)[2]
would give you what you want. You could also collapse the multiple [0-9]{1,3}\. instances using non-capturing groups and counts.
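The collapsed form mentioned above could look like the following Python sketch. It keeps the same loose octet matching (0-999) as the rest of the thread; the names IP, THIRD_IP and third_ip are made up for this illustration.

```python
import re

# One loosely matched address: 1-3 digits, then three ".ddd" groups.
IP = r"\d{1,3}(?:\.\d{1,3}){3}"

# Skip two addresses (each followed by whitespace) and capture the third.
THIRD_IP = re.compile(r"(?:" + IP + r"\s){2}(" + IP + r")")

def third_ip(line):
    """Return the third of three consecutive addresses, or None if absent."""
    match = THIRD_IP.search(line)
    return match.group(1) if match else None

line = "ip route 214.25.48.547 255.255.255.255 16.48.75.46 name Chicago-VPN"
print(third_ip(line))
```

The first two addresses sit in a non-capturing group, so group 1 is the only capture and holds the third address.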
The problem is that the regex needs to account not only for the IPs but also for the spaces between them.
So adding a space into the repeated group should do the trick:
([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3} ){3}
If you don't want that space in the final match, make it non-greedy using ?? (or *?):
([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3} ??){3}
Also note that your regex matches more than just valid IPs, e.g. 999.999.999.999 would match nicely.
You are already matching all three IPs with that regex.
([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})
Match 1
214.25.48.547
Match 2
255.255.255.255
Match 3
16.48.75.46
You can test it here:
http://rubular.com/
The problem may be with how you are trying to access them.
In Ruby, your regex works perfectly:
regex = /([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})/
"ip route 214.25.48.547 255.255.255.255 16.48.75.46 name Chicago-VPN".scan(regex)
=> [["214.25.48.547"], ["255.255.255.255"], ["16.48.75.46"]]