I am trying to match a sequence of four numbers that are separated by pipes in a string. The numbers may be negative, decimal, or multi-digit, for example:
13|5|-1|3 or 5|5|0|3 or 13|4|1.5|1
The string may also contain additional numbers and words; a full example looks like so:
SOME STRING CONTENT 13|5|-1|3 MORE 1.6 CONTENT HERE
How could I identify those numbers between and to the left/right of the pipes using regex?
I have tried [\d\-.\|], which matches all digits, decimal points, pipes, and minus signs, but it also matches the other numbers and decimals in the string. Any help on selecting just that one section would be appreciated!
You can use
-?\b\d+(?:\.\d+)?(?:\|\-?\d+(?:\.\d+)?){3}\b
The pattern matches:
-? Match an optional -
\b A word boundary to prevent a partial match
\d+(?:\.\d+)? Match 1+ digits with an optional decimal part
(?:\|\-?\d+(?:\.\d+)?){3} Repeat 3 times: a pipe, an optional -, and the same number pattern as above
\b A word boundary
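As a quick check of the pattern above, here is a minimal JavaScript sketch. The helper name extractQuad is made up for illustration; it returns the four numbers from the first match, or null when there is none.

```javascript
// Pattern from the answer above: four pipe-separated numbers, each optionally
// negative and optionally with a decimal part.
const pipeQuad = /-?\b\d+(?:\.\d+)?(?:\|\-?\d+(?:\.\d+)?){3}\b/;

// Hypothetical helper (not part of the answer): split the matched run on "|"
// and convert each piece to a number.
function extractQuad(text) {
  const m = text.match(pipeQuad);
  return m ? m[0].split("|").map(Number) : null;
}

console.log(extractQuad("SOME STRING CONTENT 13|5|-1|3 MORE 1.6 CONTENT HERE"));
// → [ 13, 5, -1, 3 ]
console.log(extractQuad("MORE 1.6 CONTENT")); // → null
```

The stray 1.6 is ignored because it is not followed by three more pipe-separated numbers.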
You can also use
(?<!\S)-?\d*\.?\d+(?:\|-?\d*\.?\d+){3}(?!\S)
EXPLANATION
--------------------------------------------------------------------------------
(?<! look behind to see if there is not:
--------------------------------------------------------------------------------
\S non-whitespace (all but \n, \r, \t, \f,
and " ")
--------------------------------------------------------------------------------
) end of look-behind
--------------------------------------------------------------------------------
-? '-' (optional (matching the most amount
possible))
--------------------------------------------------------------------------------
\d* digits (0-9) (0 or more times (matching
the most amount possible))
--------------------------------------------------------------------------------
\.? '.' (optional (matching the most amount
possible))
--------------------------------------------------------------------------------
\d+ digits (0-9) (1 or more times (matching
the most amount possible))
--------------------------------------------------------------------------------
(?: group, but do not capture (3 times):
--------------------------------------------------------------------------------
\| '|'
--------------------------------------------------------------------------------
-? '-' (optional (matching the most amount
possible))
--------------------------------------------------------------------------------
\d* digits (0-9) (0 or more times (matching
the most amount possible))
--------------------------------------------------------------------------------
\.? '.' (optional (matching the most amount
possible))
--------------------------------------------------------------------------------
\d+ digits (0-9) (1 or more times (matching
the most amount possible))
--------------------------------------------------------------------------------
){3} end of grouping
--------------------------------------------------------------------------------
(?! look ahead to see if there is not:
--------------------------------------------------------------------------------
\S non-whitespace (all but \n, \r, \t, \f,
and " ")
--------------------------------------------------------------------------------
) end of look-ahead
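To see what the whitespace lookarounds change in practice, here is a small JavaScript sketch. The sample strings other than the first are invented for illustration. This variant accepts a number with no leading digit (.5) and rejects a run that is glued to other non-whitespace text.

```javascript
// Pattern from the answer above: the run must be delimited by whitespace
// (or the string edges) on both sides.
const strictQuad = /(?<!\S)-?\d*\.?\d+(?:\|-?\d*\.?\d+){3}(?!\S)/;

// Invented sample inputs (only the first comes from the question).
const samples = [
  "SOME STRING CONTENT 13|5|-1|3 MORE 1.6 CONTENT HERE",
  "leading dot .5|5|0|3 is accepted",
  "glued id:13|5|-1|3 is rejected",
];

const results = samples.map((s) => {
  const m = s.match(strictQuad);
  return m ? m[0] : null;
});

console.log(results);
// → [ '13|5|-1|3', '.5|5|0|3', null ]
```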
I would like to match a specific pattern with regex but I am running into catastrophic backtracking. I wonder if there's a way it would be possible to match what I would like and not get an error.
I start with a simple assumption; I want my string to contain only one specific number e.g. 7 and only that specific number:
^\D*7\D*$
Only if I find this pattern do I want to look for another word in the same text such as "Coffee"; I put my condition into a group (^\D*7\D*$) and reference the group in my conditional and the then part will contain "Coffee":
(?(1)Coffee|)
Is there another phrasing that would avoid the catastrophic backtracking?
You can use a positive lookahead to assert that the word Coffee occurs to the right.
^(?=.*\bCoffee\b)\D*7\D*$
The pattern matches:
^ Start of string
(?= Positive lookahead, assert that on the right is
.*\bCoffee\b Match Coffee between word boundaries \b to prevent a partial match
) Close lookahead
\D*7\D* Match number 7 between optional non digit characters.
$ End of string
Note that \D also matches a newline. If you don't want to cross newline boundaries, you can use [^\r\n\d] instead.
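A minimal JavaScript sketch of the lookahead approach, with made-up test strings, could look like this. The lookahead runs once at the start of the string, so the digit check and the Coffee check stay independent.

```javascript
// Pattern from the answer above: exactly one digit in the string, that digit
// is 7, and the whole word "Coffee" appears somewhere.
const sevenWithCoffee = /^(?=.*\bCoffee\b)\D*7\D*$/;

// Invented inputs for illustration.
const checks = {
  "Order 7 Coffee": sevenWithCoffee.test("Order 7 Coffee"),
  "Coffee for table 7": sevenWithCoffee.test("Coffee for table 7"),
  "Order 77 Coffee": sevenWithCoffee.test("Order 77 Coffee"),
  "Order 7 Tea": sevenWithCoffee.test("Order 7 Tea"),
  "Order 7 Coffees": sevenWithCoffee.test("Order 7 Coffees"),
};

console.log(checks);
// true, true, false (two digits), false (no Coffee), false (\b rejects "Coffees")
```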
Left to right checking is more traditional:
^(?=.*Coffee)[^\d7]*7\D*$
EXPLANATION
--------------------------------------------------------------------------------
^ the beginning of the string
--------------------------------------------------------------------------------
(?= look ahead to see if there is:
--------------------------------------------------------------------------------
.* any character except \n (0 or more times
(matching the most amount possible))
--------------------------------------------------------------------------------
Coffee 'Coffee'
--------------------------------------------------------------------------------
) end of look-ahead
--------------------------------------------------------------------------------
[^\d7]* any character except: digits (0-9), '7' (0
or more times (matching the most amount possible))
--------------------------------------------------------------------------------
7 '7'
--------------------------------------------------------------------------------
\D* non-digits (all but 0-9) (0 or more times
(matching the most amount possible))
--------------------------------------------------------------------------------
$ before an optional \n, and the end of the string
Right-to-left checking needs an engine that supports variable-length lookbehind, such as recent JavaScript (ES2018+), .NET, or the PyPI regex module in Python:
^[^\d7]*7\D*$(?<=Coffee.*)
EXPLANATION
--------------------------------------------------------------------------------
^ the beginning of the string
--------------------------------------------------------------------------------
[^\d7]* any character except: digits (0-9), '7' (0
or more times (matching the most amount
possible))
--------------------------------------------------------------------------------
7 '7'
--------------------------------------------------------------------------------
\D* non-digits (all but 0-9) (0 or more times
(matching the most amount possible))
--------------------------------------------------------------------------------
$ before an optional \n, and the end of the
string
--------------------------------------------------------------------------------
(?<= look behind to see if there is:
--------------------------------------------------------------------------------
Coffee 'Coffee'
--------------------------------------------------------------------------------
.* any character except \n (0 or more times
(matching the most amount possible))
--------------------------------------------------------------------------------
) end of look-behind
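The lookbehind variant can be tried in JavaScript as follows. This is a sketch with invented inputs, and it assumes a runtime with lookbehind support (for example a current Node.js or Chromium-based browser).

```javascript
// Pattern from the answer above: match the "only digit is 7" part first,
// then look back from the end of the string for "Coffee".
const sevenThenCoffee = /^[^\d7]*7\D*$(?<=Coffee.*)/;

// Invented inputs for illustration.
const outcome = [
  "Order 7 Coffee",
  "Coffee for table 7",
  "Order 7 Tea",
].map((s) => sevenThenCoffee.test(s));

console.log(outcome); // → [ true, true, false ]
```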
I have a string that I've matched with a regex, but the same string also appears commented out with a # sign in front, and the regex keeps matching that too, which I do not want.
My Regex
BLTY:\w{8}:\w{8}:\w{5}\.\w{7}\.\w{1}\.\w{3}\/\w{3}\/.*\(\w{4}\)
String
BLTY:ENCQ0000:SERVER:TEMP.PPMQ8FE.Y.323/TCP/gtg23.dev.pmt.com(3213)-> only match this
#BLTY:ENCQ0000:SERVER:TEMP.PPMQ8FE.Y.323/TCP/gtg23.dev.pmt.com(3213) -> I dont want to match this
Tried
^[BLTY:\w{8}:\w{8}:\w{5}\.\w{7}\.\w{1}\.\w{3}\/\w{3}\/.*\(\w{4}\)]
^BLTY:\w{8}:\w{8}:\w{5}\.\w{7}\.\w{1}\.\w{3}\/\w{3}\/.*\(\w{4}\)
(?!#)BLTY:\w{8}:\w{8}:\w{5}\.\w{7}\.\w{1}\.\w{3}\/\w{3}\/.*\(\w{4}\)
Also, if there's a less verbose or more optimized way of writing this regex, I'm open to hearing it.
There is no need to use lookaheads; anchor the pattern at the start of the string instead:
^BLTY(?::\w+){3}(?:\.\w+){3}/.*\(\d+\)$
EXPLANATION
--------------------------------------------------------------------------------
^ the beginning of the string
--------------------------------------------------------------------------------
BLTY 'BLTY'
--------------------------------------------------------------------------------
(?: group, but do not capture (3 times):
--------------------------------------------------------------------------------
: ':'
--------------------------------------------------------------------------------
\w+ word characters (a-z, A-Z, 0-9, _) (1 or
more times (matching the most amount
possible))
--------------------------------------------------------------------------------
){3} end of grouping
--------------------------------------------------------------------------------
(?: group, but do not capture (3 times):
--------------------------------------------------------------------------------
\. '.'
--------------------------------------------------------------------------------
\w+ word characters (a-z, A-Z, 0-9, _) (1 or
more times (matching the most amount
possible))
--------------------------------------------------------------------------------
){3} end of grouping
--------------------------------------------------------------------------------
/ '/'
--------------------------------------------------------------------------------
.* any character except \n (0 or more times
(matching the most amount possible))
--------------------------------------------------------------------------------
\( '('
--------------------------------------------------------------------------------
\d+ digits (0-9) (1 or more times (matching
the most amount possible))
--------------------------------------------------------------------------------
\) ')'
--------------------------------------------------------------------------------
$ before an optional \n, and the end of the
string
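Here is a hedged JavaScript sketch of the anchored pattern. It assumes the entries arrive one per line in a single string, which is why the g and m flags are added; those flags are my assumption and not part of the answer.

```javascript
// Pattern from the answer above. The "/" is escaped only because this is a
// JavaScript regex literal. Flags g and m are an assumption: with m, ^ and $
// anchor at each line, so a leading "#" prevents a match on that line.
const activeEntry = /^BLTY(?::\w+){3}(?:\.\w+){3}\/.*\(\d+\)$/gm;

const config = [
  "BLTY:ENCQ0000:SERVER:TEMP.PPMQ8FE.Y.323/TCP/gtg23.dev.pmt.com(3213)",
  "#BLTY:ENCQ0000:SERVER:TEMP.PPMQ8FE.Y.323/TCP/gtg23.dev.pmt.com(3213)",
].join("\n");

const activeLines = config.match(activeEntry) ?? [];

console.log(activeLines.length); // → 1
console.log(activeLines[0].startsWith("BLTY")); // → true
```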
I am using this regex for handling all sort of names:
String Regex_Name="^([A-Za-z]*|\\p{L})+([ ]*|[A-Za-z]*|[']*|\\p{L}*)+([\\s]?[A-Za-z]*)+[A-Za-z]$";
While running the code I am getting this error:
Unknown character property name {​​​​​​​L} near index 44
^[A-Za-z][[A-Za-z]*\p{​​​​​​​L}​​​​​​​*[,]?[ ]?[-]?[A-Za-z]+]+([ ]?[.]?[,]?[(]?[A-Za-z]+[)]?[-]?\p{​​​​​​​L}​​​​​​​*)+([,]?|[.]?)$
How can I solve the issue?
Use
String Regex_Name="^\\p{L}+(?:[’'-]\\p{L}+)*(?:\\s+\\p{L}+(?:[’'-]\\p{L}+)*)*$";
Note that the expression does not support initials or abbreviated names, like John G. Smith.
Explanation
--------------------------------------------------------------------------------
^ the beginning of the string
--------------------------------------------------------------------------------
\p{L}+ any character of: letters (1 or more times
(matching the most amount possible))
--------------------------------------------------------------------------------
(?: group, but do not capture (0 or more times
(matching the most amount possible)):
--------------------------------------------------------------------------------
[’'-] any character of: '’', ''', '-'
--------------------------------------------------------------------------------
\p{L}+ any character of: letters (1 or more
times (matching the most amount
possible))
--------------------------------------------------------------------------------
)* end of grouping
--------------------------------------------------------------------------------
(?: group, but do not capture (0 or more times
(matching the most amount possible)):
--------------------------------------------------------------------------------
\s+ whitespace (\n, \r, \t, \f, and " ") (1 or
more times (matching the most amount
possible))
--------------------------------------------------------------------------------
\p{L}+ any character of: letters (1 or more
times (matching the most amount
possible))
--------------------------------------------------------------------------------
(?: group, but do not capture (0 or more
times (matching the most amount
possible)):
--------------------------------------------------------------------------------
[’'-] any character of: '’', ''', '-'
--------------------------------------------------------------------------------
\p{L}+ any character of: letters (1 or more
times (matching the most amount
possible))
--------------------------------------------------------------------------------
)* end of grouping
--------------------------------------------------------------------------------
)* end of grouping
--------------------------------------------------------------------------------
$ before an optional \n, and the end of the
string
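For a quick experiment outside Java, the same expression can be transcribed to JavaScript, where \p{L} requires the u flag. This is only a sketch: the sample names are invented, and the typographic apostrophe is written as \u2019 so the source stays unambiguous.

```javascript
// JavaScript transcription of the Java pattern above (assumption: the u flag
// stands in for Java's built-in Unicode property support).
const namePattern = /^\p{L}+(?:[\u2019'-]\p{L}+)*(?:\s+\p{L}+(?:[\u2019'-]\p{L}+)*)*$/u;

// Invented sample names for illustration.
const nameChecks = {
  hyphenAndApostrophe: namePattern.test("Mary-Jane O'Neil"),
  accented: namePattern.test("Jos\u00E9 \u00C1lvarez"),
  curlyApostrophe: namePattern.test("D\u2019Arcy"),
  withInitial: namePattern.test("John G. Smith"),
  withDigits: namePattern.test("R2D2"),
};

console.log(nameChecks);
// true, true, true, false (the "." after G), false (digits)
```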
I am scanning a QR code and need a script to replace the commas with a tab (\t).
My results are:
820-20171-002, ,Nov 24, 2020,,,13,283.40,,Mike Shmow
My problem is that I don't want the comma inside the date (after the day) to be replaced. Right now I have the following, which does replace commas with a tab:
decodeResults[0].content.replace(/,/g, "\t");
I am trying to replace the /,/g with an expression to replace all commas except for the 3rd occurrence.
Use
.replace(/(?<!\b[a-zA-Z]{3}\s+\d{1,2}(?=,\s*\d{4})),/g, '\t')
Explanation
--------------------------------------------------------------------------------
(?<! Negative lookbehind start, fail if pattern matches
--------------------------------------------------------------------------------
\b the boundary between a word char (\w)
and something that is not a word char
--------------------------------------------------------------------------------
[a-zA-Z]{3} any character of: 'a' to 'z', 'A' to 'Z'
(3 times)
--------------------------------------------------------------------------------
\s+ whitespace (\n, \r, \t, \f, and " ") (1
or more times (matching the most amount
possible))
--------------------------------------------------------------------------------
\d{1,2} digits (0-9) (between 1 and 2 times
(matching the most amount possible))
--------------------------------------------------------------------------------
(?= look ahead to see if there is:
--------------------------------------------------------------------------------
, ','
--------------------------------------------------------------------------------
\s* whitespace (\n, \r, \t, \f, and " ")
(0 or more times (matching the most
amount possible))
--------------------------------------------------------------------------------
\d{4} digits (0-9) (4 times)
--------------------------------------------------------------------------------
) end of look-ahead
--------------------------------------------------------------------------------
) end of negative lookbehind
--------------------------------------------------------------------------------
, ','
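Applied to the scan result from the question, the replacement can be sketched like this. A plain string stands in for decodeResults[0].content so the snippet runs on its own.

```javascript
// Pattern from the answer above: match every comma except one that follows
// "Mon DD" and is itself followed by a four-digit year.
const commaNotInDate = /(?<!\b[a-zA-Z]{3}\s+\d{1,2}(?=,\s*\d{4})),/g;

// Stand-in for decodeResults[0].content.
const scanned = "820-20171-002, ,Nov 24, 2020,,,13,283.40,,Mike Shmow";

const tabbed = scanned.replace(commaNotInDate, "\t");
const fields = tabbed.split("\t");

console.log(fields.length); // → 9
console.log(fields[2]); // → "Nov 24, 2020"
```

Note that the comma in 13,283.40 is still replaced, because the pattern only protects the comma inside the date.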
What is the best regex to get groups of keys, operator and values from a clause like the image below?
What I have done so far is not accurate and is only able to get the first group: (^.*?(=|!=)+([^.]*))
Use
(\w+(?:\.\w+)*)\s*(!=|=)\s*(\w+)
Explanation
--------------------------------------------------------------------------------
( group and capture to \1:
--------------------------------------------------------------------------------
\w+ word characters (a-z, A-Z, 0-9, _) (1 or
more times (matching the most amount
possible))
--------------------------------------------------------------------------------
(?: group, but do not capture (0 or more
times (matching the most amount
possible)):
--------------------------------------------------------------------------------
\. '.'
--------------------------------------------------------------------------------
\w+ word characters (a-z, A-Z, 0-9, _) (1
or more times (matching the most
amount possible))
--------------------------------------------------------------------------------
)* end of grouping
--------------------------------------------------------------------------------
) end of \1
--------------------------------------------------------------------------------
\s* whitespace (\n, \r, \t, \f, and " ") (0 or
more times (matching the most amount
possible))
--------------------------------------------------------------------------------
( group and capture to \2:
--------------------------------------------------------------------------------
!= '!='
--------------------------------------------------------------------------------
| OR
--------------------------------------------------------------------------------
= '='
--------------------------------------------------------------------------------
) end of \2
--------------------------------------------------------------------------------
\s* whitespace (\n, \r, \t, \f, and " ") (0 or
more times (matching the most amount
possible))
--------------------------------------------------------------------------------
( group and capture to \3:
--------------------------------------------------------------------------------
\w+ word characters (a-z, A-Z, 0-9, _) (1 or
more times (matching the most amount
possible))
--------------------------------------------------------------------------------
) end of \3
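Since the image with the clause is not available here, the sketch below uses an invented clause to show how the three capture groups could be collected in JavaScript with matchAll. The g flag is added so that every condition is found, not just the first.

```javascript
// Pattern from the answer above, with the g flag so matchAll can iterate.
const conditionPattern = /(\w+(?:\.\w+)*)\s*(!=|=)\s*(\w+)/g;

// Invented clause; the original one was only shown in an image.
const clause = "user.name = bob AND status != active";

const conditions = [...clause.matchAll(conditionPattern)].map(
  ([, key, operator, value]) => ({ key, operator, value })
);

console.log(conditions);
// → [ { key: 'user.name', operator: '=', value: 'bob' },
//     { key: 'status', operator: '!=', value: 'active' } ]
```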