I have an identifier that contains letters, digits and dashes.
I would like to keep the first 3 characters before the first dash and delete the rest of that segment, then keep the first 2 characters after each dash, and the first 3 after the last dash.
For instance, I have the following ID:
9D3236A9-B496-4597-87E4-3A3FB69D07BF
The output ID should be: 9D3B445873A3.
I have tried:
^.{3}\-
but nothing happens. Can you please help with that?
You may use
^([A-Za-z0-9]{3})[A-Za-z0-9]*|-([A-Za-z0-9]{3})[A-Za-z0-9]*$|-([A-Za-z0-9]{2})[A-Za-z0-9]*
Replace with $1$2$3. See the regex demo.
Details
^ - start of string
([A-Za-z0-9]{3}) - Group 1 ($1 in the replacement): 3 alphanumeric chars
[A-Za-z0-9]* - 0+ alphanumerics
| - or
- - a hyphen
([A-Za-z0-9]{3}) - Group 2 ($2 in the replacement): 3 alphanumeric chars
[A-Za-z0-9]* - 0+ alphanumerics
$ - end of string
|
- - a hyphen
([A-Za-z0-9]{2}) - Group 3 ($3 in the replacement): 2 alphanumeric chars
[A-Za-z0-9]* - 0+ alphanumerics.
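As a quick check, here is a minimal sketch of the same replacement in Python. Python's re module is only an assumption here (the question does not name a regex flavor), and shorten_id and PATTERN are illustrative names. The callback joins the three groups, which mirrors replacing with $1$2$3.

```python
import re

# Same three alternatives as in the answer above; each one fills one group.
PATTERN = re.compile(
    r"^([A-Za-z0-9]{3})[A-Za-z0-9]*"      # start of string: keep 3 chars
    r"|-([A-Za-z0-9]{3})[A-Za-z0-9]*$"    # last segment: keep 3 chars
    r"|-([A-Za-z0-9]{2})[A-Za-z0-9]*"     # any other segment: keep 2 chars
)


def shorten_id(identifier: str) -> str:
    """Replace every match with the concatenation of groups 1, 2 and 3."""
    # Groups that did not take part in a match are None, so map them to "".
    return PATTERN.sub(lambda m: "".join(g or "" for g in m.groups()), identifier)


print(shorten_id("9D3236A9-B496-4597-87E4-3A3FB69D07BF"))  # 9D3B445873A3
```

The end-of-string alternative is listed before the generic hyphen alternative on purpose: the engine tries alternatives left to right, so the last segment gets a chance to keep 3 chars before the 2-char rule applies.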
You can use the regex given in this demo
(^.{3})[a-z0-9A-Z]*((?>-).{2})[a-z0-9A-Z]*((?>-).{2})[a-z0-9A-Z]*((?>-).{2})[a-z0-9A-Z]*((?>-).{2})[a-zA-Z0-9]*
Related
I'm trying to match all fractions or 'evs' and the strings (string1, string2) in the following string with regex. The strings may contain any number of white spaces ('String 1', 'The String 1', 'The String Number 1').
10/3 string1 evs string2 8/5 mon 19:45 string1 v string2 1/1 string1 v string2 1/1
The following regex works in Javascript but not in PHP. No errors are returned, just 0 results.
(\d{1,3}\/\d{1,3}|evs).*?(.+).*?(\d{1,3}\/\d{1,3}|evs).*?(.+).*?(\d{1,3}\/\d{1,3}|evs).*?(.+) v (.+).*?(\d{1,3}\/\d{1,3}|evs).*?(.+) v (.+).*?(\d{1,3}\/\d{1,3}|evs)
Here's the expected result, other than group 6 and 7 (ran using Javascript):
If I add a ? to the first (.+) so that it becomes (.+?), I get the desired result but with the first string not captured:
As soon as I remove the ? to capture the whole string, there are no results returned. Can somebody work out what's going on here?
In PCRE/PHP, you may use
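// PHP patterns need delimiters; "~" is used here so that "/" needs no special handling
$regex = '~(\d{1,3}\/\d{1,3}|evs)\s+(\S+)\s+((?1))\s+(\S+)\s+((?1))\s+(.+?)\s+v\s+(\S+)\s+((?1))\s+(\S+)\s+v\s+(\S+)\s+((?1))~';
// $text holds the input string
if (preg_match_all($regex, $text, $matches)) {
    print_r($matches[0]);
}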
See the regex demo
The point is that overusing .*? / .+ in the middle of a pattern leads to catastrophic backtracking.
Use precise patterns for the whitespace and non-whitespace fields, and reserve .*? / .+? for fields that can contain any mix of whitespace and non-whitespace chars.
Details
(\d{1,3}\/\d{1,3}|evs) - Group 1 (its pattern can be later accessed using (?1) subroutine): one to three digits, / and then one to three digits, or evs
\s+(\S+)\s+ - 1+ whitespaces, Group 2 matching 1+ non-whitespace chars, 1+ whitespaces
((?1)) - Group 3 that matches the same way Group 1 pattern does
\s+(\S+)\s+((?1))\s+ - 1+ whitespaces, Group 4 matching 1+ non-whitespaces, 1+ whitespaces, Group 5 with the Group 1 pattern, 1+ whitespaces
(.+?) - Group 6: any 1 or more chars other than line break chars, as few as possible
\s+v\s+ - v enclosed with 1+ whitespaces
(\S+) - Group 7: 1+ non-whitespaces
\s+((?1))\s+ - 1+ whitespaces, Group 8 with Group 1 pattern, 1+ whitespaces
(\S+) - Group 9: 1+ non-whitespaces
\s+v\s+ - v enclosed with 1+ whitespaces
(\S+)\s+((?1)) - Group 10: 1+ non-whitespaces, then 1+ whitespaces and Group 11 with Group 1 pattern.
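For comparison, here is a rough Python sketch of the same idea. Python's re module has no (?1) subroutine call, so this version reuses the fraction-or-evs sub-pattern through string formatting instead; ODDS and PATTERN are illustrative names, and the sample text is the one from the question.

```python
import re

# Sub-pattern for "one to three digits, /, one to three digits" or "evs".
# It is interpolated wherever the PCRE pattern uses Group 1 or (?1).
ODDS = r"(\d{1,3}/\d{1,3}|evs)"

PATTERN = re.compile(
    rf"{ODDS}\s+(\S+)\s+{ODDS}\s+(\S+)\s+{ODDS}\s+(.+?)\s+v\s+(\S+)"
    rf"\s+{ODDS}\s+(\S+)\s+v\s+(\S+)\s+{ODDS}"
)

text = (
    "10/3 string1 evs string2 8/5 mon 19:45 string1 v string2 "
    "1/1 string1 v string2 1/1"
)

match = PATTERN.search(text)
groups = match.groups() if match else ()

# Print each captured group with its number (1 to 11).
for number, value in enumerate(groups, start=1):
    print(number, value)
```

Only Group 6 uses a lazy .+?, because it is the one field that has to span several whitespace-separated words here; every other field is pinned down with \S+ and \s+.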
I have a regex that captures the following expression
XPT 123A
Now I need to add "something" to my regex to capture the remaining string as a group
XPT 123A I AM VERY HAPPY
So XPT would be group 1, 123A group 2, and I AM VERY HAPPY group 3.
Here is my regex (also here http://regexr.com/4mocf):
^([A-Z]{2,4}).((?=\d)[a-zA-Z\d]{0,4})
EDIT:
I don't want to name my groups (editing because some people thought this was a duplicate of another question).
Assuming Group 3 is optional, you may use
^([A-Z]{2,4}) (\d[a-zA-Z\d]{0,3})(?: (.*))?$
or, to allow any whitespace between the fields:
^([A-Z]{2,4})\s+(\d[a-zA-Z\d]{0,3})(?:\s+(.*))?$
Here, \s+ matches 1 or more whitespace chars.
See the regex demo.
Details
^ - start of string
([A-Z]{2,4}) - Group 1: two, three or four uppercase ASCII letters
\s+ - 1+ whitespaces
(\d[a-zA-Z\d]{0,3}) - Group 2: a digit followed with 0 to 3 alphanumeric chars
(?:\s+(.*))? - an optional non-capturing group matching 1 or 0 occurrences of:
\s+ - 1+ whitespaces
(.*) - Group 3: any 0+ chars other than line break chars as many as possible
$ - end of string
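A small sketch of how the three groups come out, using Python's re module as an assumed flavor (the question only links to regexr); parse and PATTERN are illustrative names.

```python
import re

# The \s+ variant from the answer above; Group 3 is optional.
PATTERN = re.compile(r"^([A-Z]{2,4})\s+(\d[a-zA-Z\d]{0,3})(?:\s+(.*))?$")


def parse(line: str):
    """Return (group1, group2, group3) or None when the line does not match."""
    match = PATTERN.match(line)
    return match.groups() if match else None


print(parse("XPT 123A"))                  # ('XPT', '123A', None)
print(parse("XPT 123A I AM VERY HAPPY"))  # ('XPT', '123A', 'I AM VERY HAPPY')
```

When the trailing text is missing, Group 3 is simply unset (None in Python) instead of making the whole match fail.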
Just add the following suffix to your regex to capture the rest of the line:
(?<rest>.+)?$
I need a regex that matches at most 9 digits with any number of space and/or hyphen (leading, trailing or within the digits), what should it look like?
I tried:
^[0-9 \\-].*?$
and
^\\d{9}
but they only serve part of my purpose and need a way to merge them together.
Thanks!
Try this regex:
^(?:[ -]*\d[ -]*){1,9}$
Click for Demo
Explanation:
^ - asserts the start of the string
(?:[ -]*\d[ -]*){1,9}
[ -]* - matches 0+ occurrences of either a space or a -
\d - matches a digit
[ -]* - matches 0+ occurrences of either a space or a -
{1,9} - repeats the group 1 to 9 times, i.e. 1 to 9 digits, each optionally surrounded by spaces and/or hyphens
$ - asserts the end of the string
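To see which inputs pass, here is a minimal check in Python. The question's doubled backslashes suggest another language, so Python is only an assumption, and is_valid is an illustrative name.

```python
import re

# 1 to 9 digits, each optionally surrounded by spaces and/or hyphens.
PATTERN = re.compile(r"^(?:[ -]*\d[ -]*){1,9}$")


def is_valid(value: str) -> bool:
    # fullmatch also rejects a trailing newline, which "$" alone would allow.
    return PATTERN.fullmatch(value) is not None


for sample in ["123-456 789", " 12-3 ", "1234567890", "---", "12a"]:
    print(repr(sample), is_valid(sample))
```

Note that the pattern requires at least one digit, so a string made only of spaces and hyphens is rejected.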
I am trying to separate a String into different parts that match a specific syntax.
The String I am using as example is Username 5/5, Version: 1.0 This is a custom message Sep 25, 2018.
Currently I have this regex: (\w+) ([0-9]\/[0-9]), (\w+): ([0-9][.][0-9][.]?[0-9]?), which gives me the username, the 5/5, the word Version and the version 1.0.
First, how can I ignore the (\w+)? Since it'll always be version and I only need the number after.
Second question, is it possible to get the big message after the version, then get the date after it?
Output needed:
Username
5/5
1.0
This is a custom message
Sep 25, 2018
You may use
/^(\w+)\s+(\d+\/\d+),\s+\w+:\s*(\d+(?:\.\d+){1,2})\s*(.*?)\s*([a-zA-Z]+\s*\d{1,2},\s*\d{4})$/
See the regex demo
Details
^ - start of string
(\w+) - Group 1 (username): one or more letters, digits or _
\s+ - 1+ whitespaces
(\d+\/\d+) - Group 2 (5/5)
,\s+ - a comma and 1+ whitespaces
\w+: - 1+ word chars followed with :
\s* - 0+ whitespaces
(\d+(?:\.\d+){1,2}) - Group 3 (version number):
\d+ - 1+ digits
(?:\.\d+){1,2} - 1 or 2 sequences of a . followed with 1+ digits
\s* - 0+ whitespaces
(.*?) - Group 4 (message): any 0+ chars, as few as possible
\s* - 0+ whitespaces
([a-zA-Z]+\s*\d{1,2},\s*\d{4}) - Group 5 (date):
[a-zA-Z]+ - 1+ ASCII letters
\s* - 0+ whitespaces
\d{1,2} - 1 to 2 digits
,\s* - a comma and 0+ whitespaces
\d{4} - 4 digits
$ - end of string.
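The pattern above can be tried out with a short Python sketch. Python is an assumption here (the /.../ delimiters in the answer suggest another flavor), PATTERN and fields are illustrative names, and the sample line is the one from the question without the sentence-ending period, since the pattern is anchored with $ right after the year.

```python
import re

PATTERN = re.compile(
    r"^(\w+)\s+(\d+/\d+),\s+\w+:\s*(\d+(?:\.\d+){1,2})\s*(.*?)\s*"
    r"([a-zA-Z]+\s*\d{1,2},\s*\d{4})$"
)

line = "Username 5/5, Version: 1.0 This is a custom message Sep 25, 2018"

match = PATTERN.match(line)
fields = match.groups() if match else ()

# Username, score, version, message and date, one per line.
for field in fields:
    print(field)
```

The word before the colon is matched with \w+ outside any capturing group, which is how it gets skipped in the output.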
Try (.*)\s(\d\/\d),\s*Version:\s*(\d+\.\d+)\s*(.+?)\s*(\w{3} \d{1,2}, \d{4})
Capture the groups 1,2,3,4,5 to get the output you needed.
Regex
I need to write a regex for strings in the following format, e.g. 10001.000.01.01-A-AB. The regex below handles everything up to the dot-separated numbers; I still need to add the trailing letter parts.
/^\d{4,6}(\.\d{3})(\.\d{2}(\.\d{2})?(\.\d{2})?)?$/
0001.000-A
0001.000.01-A
0001.000.01.01-A
0001.000.01.01-A-AB
10001.000.01.01-A-AB
Any help greatly appreciated.
It seems you may use
^\d{4,6}\.\d{3}(?:\.\d{2}(?:\.\d{2})?(?:\.\d{2})?)?(?:-[A-Z]+(?:-[A-Z]+)?)?$
See the regex demo
Details
^ - start of string
\d{4,6} - 4 to 6 digits
\.\d{3} - a . and 3 digits
(?:\.\d{2}(?:\.\d{2})?(?:\.\d{2})?)? - an optional group matching
\.\d{2} - a dot and 2 digits
(?:\.\d{2})? - an optional sequence of . and 2 digits
(?:\.\d{2})? - same as above
(?:-[A-Z]+(?:-[A-Z]+)?)? - an optional non-capturing group matching 1 or 0 occurrences of:
- - a hyphen
[A-Z]+ - 1 or more ASCII uppercase letters
(?:-[A-Z]+)? - an optional sequence of:
- - a hyphen
[A-Z]+ - 1 or more ASCII uppercase letters
$ - end of string
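As a sanity check, the sketch below runs the pattern against the five sample values from the question. Python is an assumed flavor, and PATTERN, samples and matches are illustrative names.

```python
import re

PATTERN = re.compile(
    r"^\d{4,6}\.\d{3}"                           # 4 to 6 digits, a dot, 3 digits
    r"(?:\.\d{2}(?:\.\d{2})?(?:\.\d{2})?)?"      # up to three optional .NN parts
    r"(?:-[A-Z]+(?:-[A-Z]+)?)?$"                 # up to two optional -LETTERS parts
)

samples = [
    "0001.000-A",
    "0001.000.01-A",
    "0001.000.01.01-A",
    "0001.000.01.01-A-AB",
    "10001.000.01.01-A-AB",
]


def matches(value: str) -> bool:
    return PATTERN.fullmatch(value) is not None


for sample in samples:
    print(sample, matches(sample))
```

Values such as 001.000-A (only 3 leading digits) or 0001.000-a (lowercase letter) are rejected by this pattern.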