I'm writing an XSD with an IP type.
I have the regex for 0.0.0.0 - 255.255.255.255, but so far I haven't managed to exclude 0.0.0.0.
I've tried ?!0.0.0.0, but XSD doesn't support ?!
As part of your current regex, you have a subexpression (repeated four times perhaps) accepting the range 0 to 255. I'll refer to this as &re-0;. Make a similar regex which accepts 1 to 255; I'll refer to this one as &re-1;.
Construct a regex as a choice among:
&re-1;\.&re-0;\.&re-0;\.&re-0; (if the first value is non-zero, then it's not 0.0.0.0)
0\.&re-1;\.&re-0;\.&re-0; (even if the first value is zero, the second value being non-zero saves the overall expression from being 0.0.0.0)
0\.0\.&re-1;\.&re-0; (ditto for two leading zeroes ...)
0\.0\.0\.&re-1; (if you have three leading zeroes, the final value must be non-zero)
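As a rough sketch, the four-way choice above can be assembled and checked in Python. The concrete sub-patterns bound to RE0 and RE1 are assumptions standing in for &re-0; and &re-1;. They use only plain groups, character classes and alternation, since lookahead is unavailable in XSD, and re.fullmatch is used to imitate the implicit anchoring of an XSD pattern.

```python
import re

# Assumed stand-ins for &re-0; (0 to 255) and &re-1; (1 to 255).
RE0 = r"(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])"
RE1 = r"(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]?)"

# One alternative per possible position of the first non-zero value.
NONZERO_IP = "|".join([
    rf"{RE1}\.{RE0}\.{RE0}\.{RE0}",
    rf"0\.{RE1}\.{RE0}\.{RE0}",
    rf"0\.0\.{RE1}\.{RE0}",
    rf"0\.0\.0\.{RE1}",
])


def is_nonzero_ipv4(text):
    """Return True for a dotted quad in range that is not 0.0.0.0."""
    return re.fullmatch(NONZERO_IP, text) is not None


for sample in ["0.0.0.0", "0.0.0.1", "10.0.0.0", "256.1.1.1"]:
    print(sample, is_nonzero_ipv4(sample))
```

The value of NONZERO_IP is what would go into the pattern facet; the Python wrapper is only there to test it.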
I have a bunch of 4XX status codes and I want to match all except 400 and 411.
What I currently have so far:
/4\d{2}(?!400|411)/g
Which according to regexer:
Matches a 4XX
Should specify a group that cannot match after the main expression (but it is here where my expression is failing).
The expression 4\d{2}(?!400|411) first matches a 4 and then 2 digits. Only after that does it assert that neither 400 nor 411 follows directly to the right, so the three digits already matched are never checked.
Instead, you can match 4 and first assert not 00 or 11 directly after it.
4(?!00|11)\d{2}
Or without partial word matches using word boundaries \b:
\b4(?!00|11)\d{2}\b
If this is for a RegEx embedded in a programming language, I'd recommend using string comparisons or converting the status code to a number if it isn't already to then do number comparisons.
If you're forced to use RegEx, another, arguably less readable option (which uses fewer RegEx features and is thus more portable) is 4(0[1-9]|1[02-9]|[2-9]\d):
If the second character is a zero, the third character must be a nonzero digit;
If the second character is a one, the third character must be any digit other than one;
Otherwise (second character 2 to 9), the third character can be any digit.
If you don't want/need a capturing group, change the RegEx to 4(?:...).
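To compare the two approaches, here is a small Python sketch that runs both over every code from 400 to 499. The names LOOKAHEAD, PORTABLE and allowed are made up for illustration. The portable pattern used here is a version of the alternation that spells out all three cases for the second digit (0, 1, or 2 to 9).

```python
import re

# Lookahead version: reject "00" or "11" right after the leading 4.
LOOKAHEAD = re.compile(r"4(?!00|11)\d{2}")

# Alternation-only version: second digit 0, second digit 1, or anything else.
PORTABLE = re.compile(r"4(?:0[1-9]|1[02-9]|[2-9]\d)")


def allowed(pattern, code):
    """Return True if the whole status code matches the pattern."""
    return pattern.fullmatch(str(code)) is not None


codes = range(400, 500)
rejected_lookahead = [c for c in codes if not allowed(LOOKAHEAD, c)]
rejected_portable = [c for c in codes if not allowed(PORTABLE, c)]
print(rejected_lookahead, rejected_portable)
```

Both lists should contain only 400 and 411, which is a quick way to confirm that the two patterns accept the same set of codes.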
I created this very complex regular expression (RegEx101) for IPv4 and IPv6
((^\s*((([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){3}([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5]))\s*$)|(^\s*((([0-9A-Fa-f]{1,4}:){7}([0-9A-Fa-f]{1,4}|:))|(([0-9A-Fa-f]{1,4}:){6}(:[0-9A-Fa-f]{1,4}|((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3})|:))|(([0-9A-Fa-f]{1,4}:){5}(((:[0-9A-Fa-f]{1,4}){1,2})|:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3})|:))|(([0-9A-Fa-f]{1,4}:){4}(((:[0-9A-Fa-f]{1,4}){1,3})|((:[0-9A-Fa-f]{1,4})?:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){3}(((:[0-9A-Fa-f]{1,4}){1,4})|((:[0-9A-Fa-f]{1,4}){0,2}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){2}(((:[0-9A-Fa-f]{1,4}){1,5})|((:[0-9A-Fa-f]{1,4}){0,3}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){1}(((:[0-9A-Fa-f]{1,4}){1,6})|((:[0-9A-Fa-f]{1,4}){0,4}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(:(((:[0-9A-Fa-f]{1,4}){1,7})|((:[0-9A-Fa-f]{1,4}){0,5}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:)))(%.+)?\s*$))|(^\*$)
Below are three examples of data that can be checked by this regular expression.
2001:db8:abcd:0012:0000:0000:0000:0000 (ipv6)
0000:0000:2001:DB8:ABCD:12:: (condensed notation)
255.255.255.0 (ipv4)
but this regular expression does not work for IPv6 addresses with prefix.
For example:
2001:db8:abcd:0012::0/112
does not work.
How can this problem be fixed?
In case anybody in the future wants optional subnet masks for IPv4 as well as IPv6:
/((^\s*((([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){3}([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5]))\s*(\/(\d|1\d|2\d|3[0-2]))?$)|(^\s*((([0-9A-Fa-f]{1,4}:){7}([0-9A-Fa-f]{1,4}|:))|(([0-9A-Fa-f]{1,4}:){6}(:[0-9A-Fa-f]{1,4}|((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3})|:))|(([0-9A-Fa-f]{1,4}:){5}(((:[0-9A-Fa-f]{1,4}){1,2})|:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3})|:))|(([0-9A-Fa-f]{1,4}:){4}(((:[0-9A-Fa-f]{1,4}){1,3}(\/(\d{1,2}|1[0-1]\d|12[0-8]))?)|((:[0-9A-Fa-f]{1,4})?:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){3}(((:[0-9A-Fa-f]{1,4}){1,4})|((:[0-9A-Fa-f]{1,4}){0,2}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){2}(((:[0-9A-Fa-f]{1,4}){1,5})|((:[0-9A-Fa-f]{1,4}){0,3}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){1}(((:[0-9A-Fa-f]{1,4}){1,6})|((:[0-9A-Fa-f]{1,4}){0,4}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(:(((:[0-9A-Fa-f]{1,4}){1,7})|((:[0-9A-Fa-f]{1,4}){0,5}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:)))(%.+)?\s*$))|(^\*$)/g
https://regex101.com/r/lS4Cjo/1
My understanding is that you want to optionally match a 'prefix' (a number at the end of the address, always preceded by a forward slash) p such that 1≤p≤128. Let's try breaking this up.
Optionally match the following block:
Match a forward slash /
Either match a one- or two-digit number (0 to 99)
Or match a number between 100 and 119 (inclusive)
Or match a number between 120 and 128 (inclusive)
The above is equivalent to this regex: (\/(\d{1,2}|1[0-1]\d|12[0-8]))?.
https://regex101.com/r/5cBm5a/3
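The prefix block can be checked on its own with a short Python sketch. This uses Python's re module rather than the flavor selected on regex101, and the helper name valid_prefix_suffix is hypothetical.

```python
import re

# The optional "/prefix" block on its own.
PREFIX = re.compile(r"(\/(\d{1,2}|1[0-1]\d|12[0-8]))?")


def valid_prefix_suffix(suffix):
    """Return True if the text after the address is empty or a /0 to /128 prefix."""
    return PREFIX.fullmatch(suffix) is not None


for suffix in ["", "/64", "/112", "/128", "/129", "/07"]:
    print(repr(suffix), valid_prefix_suffix(suffix))
```

Note that \d{1,2} also admits /0 and zero-padded forms such as /07. If those should be rejected, that branch would need to be tightened.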
Hi, I have the NSRegularExpression below, meant to pull out coordinates from a string such as "167628,79009\r" delivered via a serial port using ORSSerial. The expression, however, matches 8,79009 instead of delivering the full first coordinate. The regex is also used internally by ORSSerial to validate incoming data on the serial port and delivers the truncated string.
If I replace the regex with "(\d{5}),(\d+)\r" it works, but this will only be useful when the coordinates delivered are a 5 digit number. If I use \d{1,5} I get the same result as when using the line-start anchor.
The regex is ignoring the anchors. Any suggestions?
Code
coordinatePacketRegex = [[NSRegularExpression alloc] initWithPattern:@"^(\\d+),(\\d+)\r"
options:NSRegularExpressionAnchorsMatchLines
error:&regexError];
Alright, having said that:
If I replace the regex with "(\d{5}),(\d+)\r" it works but this will
only be useful when the coordinates delivered are a 5 digit number.
Your actual problem is that you use NSRegularExpressionAnchorsMatchLines. It will fail on a string like " 167628,79009\r". Don't use this option, use the zero option instead.
I would propose using (\d{5,}),(\d+)\r (note the extra comma after the 5). It covers the case of "at least 5 digits in the integral part, capturing as many as possible before a comma". If you are not bound to a minimum of 5 characters before the comma, just use ((\d+),(\d+)\r). The whole match is always available as capture group 0; the enclosing parentheses additionally expose it as group 1.
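For illustration only, here is the unanchored pattern exercised in Python. Python's re module is not the ICU engine behind NSRegularExpression, so this shows the intended capture behaviour rather than reproducing the ORSSerial setup. The function name parse_coordinates is hypothetical.

```python
import re

# Two digit runs separated by a comma and terminated by a carriage return.
PACKET = re.compile(r"(\d+),(\d+)\r")


def parse_coordinates(buffer):
    """Return (x, y) from the first complete packet in buffer, or None."""
    match = PACKET.search(buffer)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


print(parse_coordinates("167628,79009\r"))
print(parse_coordinates(" 167628,79009\r"))
print(parse_coordinates("167628,79009"))
```

The third call has no terminating carriage return, so it is treated as an incomplete packet.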
I'm looking for a regular expression that would match anything that could be a valid RFC1123 hostname in a string that can contain anything. The idea is to extract everything that could possibly be a hostname (by checking that the substring follows all requirements to be one) - except for the maximum length of 255 characters, which is easy to check on the results afterwards.
I initially came up with:
/(^|[^a-z0-9-])([a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?(\.[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?)*)([^a-z0-9-]|$)/i
While this matches some hostnames in parenthesized expression 2 (as intended), it seems to skip others. Looking the problem up on Stack Overflow, I found this related question:
Regular expression to match DNS hostname or IP Address?
Judging by the positive feedback the answer should be correct (although it doesn't verify label size), so I thought I'd give it a try. I converted their expression to an extractable format similar to my previous one:
/(^|[^a-z0-9-])((([a-z0-9]|[a-z0-9][a-z0-9-]*[a-z0-9])\.)*([a-z0-9]|[a-z0-9][a-z0-9-]*[a-z0-9]))([^a-z0-9-]|$)/i
Again, it should return the desired results in parenthesized expression 2, but it appears to skip some valid substrings. I believe there may be a problem with the way I'm checking for delimiters that are not part of the hostname.
Any ideas?
Figured it out. When scanning a string for sequential matches, using delimiters both before and after the desired expression means two characters must be consumed between each pair of hostnames. So when hostnames are only one character apart, the second one is skipped!
To obtain correct results one must simply remove the leading delimiter:
/([a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?(\.[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?)*)([^a-z0-9-]|$)/i
The leading delimiter is only necessary for validation, not scanning.
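The skipping effect is easy to reproduce. The sketch below uses Python's re module and non-capturing groups for brevity, so the group numbering differs from the expressions above. The names LABELS, BOTH, TRAILING and hostnames are made up for the example.

```python
import re

# Hostname body: labels of 1 to 63 characters separated by dots.
LABELS = r"[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?(?:\.[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?)*"

# Delimiter required on both sides versus on the trailing side only.
BOTH = re.compile(rf"(?:^|[^a-z0-9-])({LABELS})(?:[^a-z0-9-]|$)", re.I)
TRAILING = re.compile(rf"({LABELS})(?:[^a-z0-9-]|$)", re.I)


def hostnames(pattern, text):
    """Collect the hostname group from every sequential match."""
    return [m.group(1) for m in pattern.finditer(text)]


text = "foo bar baz"
print(hostnames(BOTH, text))      # "bar" is skipped: its leading space was already consumed
print(hostnames(TRAILING, text))  # all three candidates are found
```

With BOTH, the space after "foo" is consumed as the trailing delimiter, so nothing is left to serve as the leading delimiter of "bar".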
so basically I want to detect if in these strings:
Hello 123 My 222 dear 112 troll 12 8889
192.1.1.254:10000
the numbers are in a format like this:
[0 to 255][ANYTHING][0 to 255][ANYTHING][0 to 255][ANYTHING][0 to 255][ANYTHING][0 to 65535]
Does anyone know how I can build such a regex?
It is for detecting if anyone posts an IP:Port in unusual format to bypass default ip:port filters.
Edit: As for the first comment: I do not know regex and what I have tried is:
if(regex_match("192.168 najlepszy serwer SAMP!!1 1 join1!! 8080","/^[0-2](*)?[0-5](*)?[0-5](*).(*)[0-2](*)?[0-5](*)?[0-5](*).(*)[0-2](*)?[0-5](*)?[0-5](*).(*)[0-2](*)?[0-5](*)?[0-5](*)?$/"))
{
print("Cannot send message");
}
else
{
print("New message for everyone! :)");
}
and some other not working regexes.
If you don't want to complicate your life checking the exact ranges, the simple regex would be:
/^.*(\d)+.+(\d)+.+(\d)+.+(\d)+.+(\d)+.*$/
Each of the first four (\d)+ parts can be replaced with a stricter check for the 0-255 range:
(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)
and the last (\d)+ with the following check for the port range:
(6553[0-5]|655[0-2]\d|65[0-4]\d\d|6[0-4]\d\d\d|[1-5]\d\d\d\d|[1-9]\d{0,3})
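Put together, the substitution described above might look like the following Python sketch. The names OCTET, PORT, PATTERN and looks_like_ip_port are invented here. The regex delimiters / ... / are dropped because Python's re does not use them.

```python
import re

# Range checks taken from the answer above.
OCTET = r"(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)"
PORT = r"(6553[0-5]|655[0-2]\d|65[0-4]\d\d|6[0-4]\d\d\d|[1-5]\d\d\d\d|[1-9]\d{0,3})"

# Four octets and a port, each pair separated by at least one arbitrary character.
PATTERN = re.compile(r"^.*" + r".+".join([OCTET] * 4 + [PORT]) + r".*$")


def looks_like_ip_port(message):
    """Return True if the message could hide an IP:port in some format."""
    return PATTERN.match(message) is not None


print(looks_like_ip_port("Hello 123 My 222 dear 112 troll 12 8889"))
print(looks_like_ip_port("192.1.1.254:10000"))
print(looks_like_ip_port("no digits here"))
```

Because nothing forces a number to start or end at a digit boundary, this is a very permissive filter: a single run such as 123456789 also matches, since the separators are allowed to be digits themselves.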
An exact, simple, and direct representation of your pattern as a regular expression is not possible in the general case. The reason is the number ranges. Something like "at this place any integral number with a value from a to b" is just too complex. A regular expression is executed by a finite state machine and these (theoretical) beasts are (basically) only able to look at strings character by character. Therefore you can match something like "ignore all characters until you find the first digit, then check whether the first digit is followed by at most two more digits".
As a workaround you may try to build a list of alternations of possible digit patterns that covers your desired range of values (in the extreme case list every single value like \b(?:1|2|3|4|...|154|155|...|255)\b). I have a pattern for the range 0-255, but I have none for the range of possible port numbers. So a first approximation may be (really, this is only an approximation and not thoroughly tested):
\b(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])\b.*\b(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])\b.*\b(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])\b.*\b(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])\b[^0-9]*[0-9]{1,5}
In the above pattern, (?: ... ) is a shy (non-capturing) group, which is not remembered for back references, and \b is a word boundary.
I'd suggest you read up on Regex syntax. For starters . is special and matches any character. Also doing something like [0-2][0-5][0-5] won't catch something like 192 as 9 is not within 0-5.
According to your requirements here's a Regex that should roughly do what you want
([0-2]?\d{1,2}).*([0-2]?\d{1,2}).*([0-2]?\d{1,2}).*([0-2]?\d{1,2}).*(\d{1,5})?
Each of the ([0-2]?\d{1,2}) portions will match 1 or 2 digits preceded optionally with a 0,1, or 2. Each () will capture a group which you can then examine using a Regex engine. You will need to examine this group as the Regex for each of those portions will match numbers above 255 (specifically 256-299).
The last group (\d{1,5})? is to catch the port number, again you will have to examine this as it will catch any 1 to 5 digit number (hence the {1,5}). The ? makes the group optional, remove it if you want it to have to match against a port number.
As far as doing Regex in C, I haven't had much experience but there should be a way to get all the grouped matches and inspect them. Unfortunately they will be strings so you will have to convert them to integers to examine them.
Are you sure you need regex for this? In my opinion, you do not need regex for this.
Just split the numbers into groups separated by non-numeric characters, then analyze them.
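A minimal sketch of that idea in Python, assuming that "analyze" means looking for five consecutive numbers where the first four fit an octet and the fifth fits a port. The function name contains_ip_port is hypothetical.

```python
import re


def contains_ip_port(message):
    """Return True if five consecutive numbers look like octet x4 + port."""
    # Every maximal run of digits becomes one number; everything else is a separator.
    numbers = [int(n) for n in re.findall(r"\d+", message)]
    for i in range(len(numbers) - 4):
        *octets, port = numbers[i:i + 5]
        if all(o <= 255 for o in octets) and port <= 65535:
            return True
    return False


print(contains_ip_port("Hello 123 My 222 dear 112 troll 12 8889"))
print(contains_ip_port("192.168 najlepszy serwer SAMP!!1 1 join1!! 8080"))
print(contains_ip_port("300.1.1.1:80"))
```

The regex is reduced to \d+ for tokenizing, and the range checks become ordinary integer comparisons, which are easier to read and adjust than digit-by-digit alternations.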
What language?
As for actually looking for valid range, take a look at this;
http://www.regular-expressions.info/numericranges.html
I would do this simple regex
((\d|\D)+)*