regex matching conditional strings - regex

For example, if I have the following strings:
99%89 (should match)
99%? (should match)
?%99 (should match)
?%? (should not match)
?%99%99 (should match)
99%99%99%? (should match)
Essentially, the first or second element can be a ? or a number, but both elements cannot be ?. I tried thinking of something like:
[0-9]*|[?](?!\?)[%][0-9]*|[?]
But this does not yield the correct answer. Any help would be appreciated.

With your shown samples, could you please try following.
^(?:(?:\?(?:(?:%\d+){1,})?)|(?:(?:(?:\d+%){1,})?\?(?:(?:%\d+){1,})?)|(?:\d+%\d+))$
Online demo for above regex
Explanation: Adding detailed explanation for above.
^(?: ##Matching from starting of the value, starting a non-capturing group from here.
(?:\? ##Starting non-capturing group(one for understanding purposes) matching literal ? here.
(?:(?:%\d+){1,})? ##In a non capturing group looking for % with 1 or more occurrences of digits and matching this group match keeping it optional.
)| ##Closing one non-capturing group here, with OR condition here.
(?: ##Starting non-capturing group(two) here.
(?:(?:\d+%){1,})?\? ##Looking for digits with % one or more occurrences in a non-capturing group keeping it optional followed by ?
(?:(?:%\d+){1,})? ##Checking for % digits one or more occurrences in a non-capturing group keeping it optional.
)| ##Closing two non-capturing group here, with OR condition here.
(?:\d+%\d+) ##In a non-capturing group looking for 1 or more digits % one or more digits
)$ ##Closing 1st non-capturing group at the end of value.
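A quick way to check the pattern above against the samples from the question is a small Python script. This is only a sketch: it assumes Python's re module (the question does not name a regex flavor), and the names PATTERN, SAMPLES and is_match are made up for illustration.

```python
import re

# Pattern copied verbatim from the answer above.
PATTERN = re.compile(
    r"^(?:(?:\?(?:(?:%\d+){1,})?)|(?:(?:(?:\d+%){1,})?\?(?:(?:%\d+){1,})?)|(?:\d+%\d+))$"
)

# Sample strings from the question, paired with the outcome the asker expects.
SAMPLES = {
    "99%89": True,
    "99%?": True,
    "?%99": True,
    "?%?": False,
    "?%99%99": True,
    "99%99%99%?": True,
}


def is_match(value: str) -> bool:
    """Return True when the whole value is accepted by PATTERN."""
    return PATTERN.match(value) is not None


for text, expected in SAMPLES.items():
    print(f"{text!r}: matched={is_match(text)}, expected={expected}")
```

Every sample should report the same value for matched and expected.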

Not sure if I am reading the question right, but as you tried using a negative lookahead, you could assert that the string does not consist only of % and/or ?
^(?![%?]+$)[\d?%]+$
Regex demo
Or without a lookahead:
^[%?]*\d[%?\d]*$
Regex demo
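To see that the two patterns above behave the same on the question's samples, here is a small Python sketch. It assumes Python's re module; the variable and function names are hypothetical.

```python
import re

# Both patterns are copied from the answer above.
WITH_LOOKAHEAD = re.compile(r"^(?![%?]+$)[\d?%]+$")
WITHOUT_LOOKAHEAD = re.compile(r"^[%?]*\d[%?\d]*$")

samples = ["99%89", "99%?", "?%99", "?%?", "?%99%99", "99%99%99%?"]


def accepts(pattern: re.Pattern, value: str) -> bool:
    """Return True when the pattern matches the value."""
    return pattern.match(value) is not None


for s in samples:
    print(s, accepts(WITH_LOOKAHEAD, s), accepts(WITHOUT_LOOKAHEAD, s))
```

Note that both patterns only require at least one digit somewhere in the string, so they also accept inputs such as 9%%? that a stricter element-by-element pattern would reject.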


Regexp_Extract - Data Studio extract value after second underscore

I have a simple string separated by underscores from which I need to pull all the values after a specific underscore using a regular expression with the REGEXP_EXTRACT formula in Google Data Studio
The strings look like this:
ABC123_DEF456_GHI789-JKL274
Basically the values after the second underscore can be alphanumeric or symbols as well.
I need to pull the values after the second underscore. In the case of the example I gave, it would be:
GHI789-JKL274
Any ideas would be greatly appreciated.
With your shown samples please try following regex.
^(?:.*?_){2}([^_]*)
OR
REGEXP_EXTRACT(yourField, "^(?:.*?_){2}([^_]*)")
Here is the Online Demo for used regex.
Explanation: Adding a detailed explanation for used regex here.
^ ##Matching from starting of the value here.
(?: ##Opening 1 non-capturing group here.
.*?_ ##Using Lazy match to match till next occurrence of _ here.
){2} ##Closing non-capturing group here and matching its 2 occurrences.
( ##Creating 1 and only capturing group here.
[^_]* ##Matching zero or more characters other than _ here.
) ##Closing capturing group here.
You need to use
REGEXP_EXTRACT(some_field, "^(?:[^_]*_){2}([^_]*)")
See the regex demo.
Details:
^ - start of string
(?:[^_]*_){2} - two occurrences of any zero or more chars other than _ and then a _
([^_]*) - Capturing group #1: zero or more chars other than _.
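Both patterns can be tried outside Data Studio with a rough Python stand-in. This is only a sketch: regexp_extract below is a hypothetical helper that imitates returning capture group 1, and Python's re module is not the RE2 engine that Data Studio uses, although these two patterns use only syntax common to both.

```python
import re
from typing import Optional


def regexp_extract(value: str, pattern: str) -> Optional[str]:
    """Hypothetical stand-in: return capture group 1 of the first match, or None."""
    match = re.search(pattern, value)
    return match.group(1) if match else None


sample = "ABC123_DEF456_GHI789-JKL274"

# Lazy version from the first answer.
print(regexp_extract(sample, r"^(?:.*?_){2}([^_]*)"))    # GHI789-JKL274
# Negated character class version from the second answer.
print(regexp_extract(sample, r"^(?:[^_]*_){2}([^_]*)"))  # GHI789-JKL274
```

Because the capture group is [^_]*, both versions stop at a third underscore if the value contains one.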

Regex pattern for matching float followed by some fixed strings

I want a regex pattern that could match the following cases:
0, 1, 0.1, .1, 1g, 0.1g, .1g, 1(g/100ml), .1(g/ml)
If the regex matches the pattern, I want to capture only the numerical part(0,1,0.1..)
I tried using following regex but it matches many cases:
((?=\.\d|\d)(?:\d+)?(?:\.?\d*))|((?=\.\d|\d)(?:\d+)?(?:\.?\d*))[a-zA-Z]+?|\([^)]*\)
How to achieve above with single regex pattern?
Edit:
To make the question solution more generic
What would be a single regex that would match below
Any numerical ( 0, 1, 0.1, ...)
Any numerical followed by g, mg any characters (0.1g, .1mg, 100kg)
Any numerical followed by anything in parentheses - .1(g/100ml), 100(mg/1kg)
And just capture the numerical part
You could make the pattern a bit more specific and use a capture group for the digits and optionally match what follows, or (updated following the comment by @anubhava) add a word boundary to prevent another partial match.
(\d*\.?\d+)(?:\(g\/\d*ml\)|g?\b)
(\d*\.?\d+) Capture group 1, match optional digits, optional . and 1+ digits
(?: Non capture group for the alternation
\(g\/\d*ml\) Match (g/ optional digits and ml)
| Or
g?\b Match an optional g followed by a word boundary
) Close non capture group
Regex demo
If the values should match in the comma separated string, you can assert either a , or the end of the string to the right.
(\d*\.?\d+)(?:\(g\/\d*ml\)|g)?(?=,|$)
Regex demo
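As a rough check, both patterns above can be run over the comma separated sample from the question. The sketch below assumes Python's re module; the variable names are made up for illustration.

```python
import re

text = "0, 1, 0.1, .1, 1g, 0.1g, .1g, 1(g/100ml), .1(g/ml)"

# Word boundary variant and lookahead variant, copied from the answer above.
word_boundary = re.compile(r"(\d*\.?\d+)(?:\(g\/\d*ml\)|g?\b)")
lookahead = re.compile(r"(\d*\.?\d+)(?:\(g\/\d*ml\)|g)?(?=,|$)")

# With a single capture group, findall returns only the captured numbers.
print(word_boundary.findall(text))
print(lookahead.findall(text))
# Both print: ['0', '1', '0.1', '.1', '1', '0.1', '.1', '1', '.1']
```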
Edit
A broad pattern to match anything between parenthesis or optional chars a-zA-Z after the digits:
(\d*\.?\d+)(?:\([^()]*\)|[a-zA-Z]*\b)
(\d*\.?\d+) Capture group 1, match optional digits, optional . and 1+ digits
(?: Non capture group
\([^()]*\) Match from opening till closing parenthesis
| Or
[a-zA-Z]*\b Optionally match chars in the ranges a-zA-Z followed by a word boundary
) Close non capture group
Regex demo
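The broad pattern can be tried on the more generic examples from the edited question. This is a sketch assuming Python's re module; the sample string is assembled from the units mentioned in the question's edit.

```python
import re

# Broad pattern copied from the answer above.
broad = re.compile(r"(\d*\.?\d+)(?:\([^()]*\)|[a-zA-Z]*\b)")

text = "0.1g, .1mg, 100kg, .1(g/100ml), 100(mg/1kg)"

# The digits inside the parentheses are consumed by the non capture group,
# so they do not show up as separate matches.
print(broad.findall(text))  # ['0.1', '.1', '100', '.1', '100']
```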
EDIT2: With OP's edited samples (to match 0, 1, 0.1 OR 0.1g, .1mg, 100kg OR .1(g/100ml), 100(mg/1kg)), adding the following solution here. The explanation is the same as for the very first solution; the only difference is that instead of matching specific strings, I have changed the regex to match any letters here.
(\d*\.?\d+)(?:[a-zA-Z]+|\([a-zA-Z]+(?:\/\d*(?:[a-zA-Z]+))?\)|(?:,\s+|$))
Online Demo for above regex
EDIT1: As per OP's comments to match .01c and 100(g/1000L) kind of examples adding following regex, which is small edit to 1st solution here.
(\d*\.?\d+)(?:g|cc|\(g(?:\/\d*(?:ml|L))?\)|(?:,\s+|$))
Online demo for above regex
With your shown samples, please try following regex here.
(\d*\.?\d+)(?:g|\(g(?:\/\d*ml)?\)|(?:,\s+|$))
Online demo for above regex
Explanation: Adding detailed explanation for above.
(\d*\.?\d+) ##Matching digits 0 or more occurrences followed by an optional . followed by 1 or more digits occurrences here.
(?: ##Starting a non-capturing group here.
g| ##matching only g here OR.
\(g(?:\/\d*ml)?\)| ##Matching (g) OR (g/digits ml) here OR.
(?:,\s+|$) ##Matching comma followed by 1 or more spaces occurrences OR end of value here.
) ##Closing non-capturing group here.
try this:
[\d]?\.?\d+(?:g|(?<p>\()(?(p)g\/(?:\d+)?ml\)))?
Demo

Regex to split up string of CPU usage without percentage

Is it possible to get just the first two digits, without the %, in the first group? I am using Telegraf with Grafana.
Example:
5 Secs ( 22.3463%) 60 Secs ( 25.677%) 300 Secs ( 21.3522%)
Result:
22
I found this regex in a similar topic, but it returns the wrong format for me:
^\s*\d+\s+Secs\s*\(\s*(\d+(?:\.\d+)?%)\)\s+\d+\s+Secs\s+\(\s+(\d+(?:\.\d+)?%)\)\s+\d+\s+Secs\s+\(\s+(\d+(?:\.\d+)?%)\)$
You can update your pattern to use a single capturing group by relocating the parenthesis around the digits only for the first occurrence.
You can omit the second and third capture groups as you don't need them.
^\s*\d+\s+Secs\s*\(\s*(\d+)(?:\.\d+)?%\)\s+\d+\s+Secs\s+\(\s+\d+(?:\.\d+)?%\)\s+\d+\s+Secs\s+\(\s+\d+(?:\.\d+)?%\)$
                      ^   ^
Regex demo
Or you might use a named capture group, for example digits
^\s*\d+\s+Secs\s*\(\s*(?P<digits>\d+)(?:\.\d+)?%\)\s+\d+\s+Secs\s+\(\s+\d+(?:\.\d+)?%\)\s+\d+\s+Secs\s+\(\s+\d+(?:\.\d+)?%\)$
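The named group version can be checked against the example line with a short script. Telegraf itself is not involved here: this is only a Python sketch, chosen because Python's re module accepts the same (?P<name>...) syntax, and the variable names are hypothetical.

```python
import re

line = "5 Secs ( 22.3463%) 60 Secs ( 25.677%) 300 Secs ( 21.3522%)"

# Pattern copied from the answer above, split across lines for readability.
pattern = re.compile(
    r"^\s*\d+\s+Secs\s*\(\s*(?P<digits>\d+)(?:\.\d+)?%\)"
    r"\s+\d+\s+Secs\s+\(\s+\d+(?:\.\d+)?%\)"
    r"\s+\d+\s+Secs\s+\(\s+\d+(?:\.\d+)?%\)$"
)

match = pattern.match(line)
cpu_5s = match.group("digits") if match else None
print(cpu_5s)  # 22
```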
With your shown samples, please try following regex.
^\d+\s+Secs\s+\(\s+(\d+)(?:\.\d+)?%\)(?:\s+\d+\s+Secs\s+\(\s+\d+(?:\.\d+)?%\))*
Online demo for above regex
Explanation: Adding detailed explanation for above.
^\d+\s+Secs\s+\(\s+ ##From starting of value matching digits(1 or more occurrences) followed by space(s) Secs spaces ( spaces.
(\d+) ##Creating 1st and only capturing group where we have digits in it.
(?:\.\d+)?%\) ##In a non-capturing group matching dot digits keeping it optional, followed by % and ) so that whole-number percentages also match.
(?: ##Creating a non-capturing group here.
\s+\d+\s+Secs\s+\(\s+\d+ ##matching spaces digits spaces Secs spaces ( spaces digits
(?:\.\d+)? ##In a non-capturing group matching dot digits keeping it optional.
%\) ##matching % followed by ) here.
)* ##Closing the non-capturing group opened above, and matching its 0 or more occurrences.
If it's just the 1st occurrence you're after, wouldn't the following work?
/secs\s*\(\s*(\d+)/i
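For comparison, the short pattern can be tried the same way. In this Python sketch the /.../i delimiters and flag from the answer are expressed as re.IGNORECASE; the variable names are made up.

```python
import re

line = "5 Secs ( 22.3463%) 60 Secs ( 25.677%) 300 Secs ( 21.3522%)"

# re.search stops at the first occurrence, which is the 5 second value.
first = re.search(r"secs\s*\(\s*(\d+)", line, flags=re.IGNORECASE)
value = first.group(1) if first else None
print(value)  # 22
```

Unlike the anchored patterns, this one does not validate the rest of the line, so it also returns a value for lines that contain only one Secs entry.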

Google Forms Regular Expressions (REGEX) comma delimited (CSV)

I have a Google Form field that contains 1 or more IDs.
Patterns:
The IDs are always 6 numbers.
If only one ID is entered, a comma and a space is NOT required.
If more than one ID is entered, a comma and a space is required.
If more than one ID is entered, the last ID, should not have a comma or a space at the end.
Allowed Examples:
a single ID: 123456
multiple ID: 123456, 456789, 987654
Here is my current REGEX (does not work correctly)
[0-9]{6}[,\s]?([0-9]{6}[,\s])*[0-9]?
What am I doing wrong?
With your shown samples, could you please try following.
^((?:\d{6})(?:(?:,\s+\d{6}){1,})?)$
Online demo for above regex
Explanation: Adding detailed explanation of above regex.
^( ##Checking from starting of value, creating single capturing group.
(?:\d{6}) ##Checking if there are 6 digits in a non-capturing group here.
(?: ##Creating 1st non-capturing group here
(?:,\s+\d{6}) ##In a non-capturing group checking it has comma space(1 or more occurrences) followed by 6 digits here.
{1,})? ##Repeating the inner group 1 or more times, then closing 1st non-capturing group here and keeping it optional.
)$ ##Closing 1st capturing group here with $ to make sure its end of value.
You can use
^\d{6}(?:,\s\d{6})*$
^ Start of string
\d{6} Match 6 digits
(?: Non capture group to repeat as a whole
,\s\d{6} Match a , a whitespace char and 6 digits
)* Close group and optionally repeat
$ End of string
Regex demo
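The rules from the question can be exercised with a few valid and invalid inputs. This is a sketch in Python rather than Google Forms itself; ID_LIST and is_valid are hypothetical names, and the invalid inputs are invented to cover the stated rules.

```python
import re

# Pattern copied from the answer above.
ID_LIST = re.compile(r"^\d{6}(?:,\s\d{6})*$")


def is_valid(value: str) -> bool:
    """Return True for one 6 digit ID or several IDs separated by a comma and a space."""
    return ID_LIST.match(value) is not None


candidates = [
    "123456",                  # single ID
    "123456, 456789, 987654",  # multiple IDs
    "123456,",                 # trailing comma
    "123456,456789",           # missing space after the comma
    "12345",                   # only 5 digits
]

for candidate in candidates:
    print(repr(candidate), is_valid(candidate))
```

Only the first two candidates should be reported as valid.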

Regex to match specific domain and its subfolder

I want to match a particular domain and its subdomain, no matter how it's entered. In the following example, I want to match test.com if nothing comes after it (only a slash or a query string) OR if a specific folder follows it, in this case named subfolder. Again, the subfolder could have a / or a query string after it.
Domain                                   | Match
test.com                                 | match
https://test.com                         | match
https://test.com?foo=bar                 | match
https://test.com/                        | match
https://test.com/?foo=bar                | match
https://www.test.com                     | match
https://www.test.com/subfolder           | match
https://www.test.com/subfolder/          | match
https://www.test.com/subfolder/?foo=bar  | match
test.com/subfolder                       | match
https://www.test.com/foo                 | no match
test.com/foo                             | no match
https://www.test.com/jason               | no match
https://www.test.com/jason?foo=bar       | no match
Right now I have the following regex:
^(?:\S+://)?[^/]+/?$
The problem though is that it matches ANY domains, which is not what I need. I want to match a specific domain and a specific subfolder.
How is this possible?
You may use this regex:
^(?:https?://)?(?:www\.)?test\.com(?:/subfolder)?/?(?:\?\S*)?$
RegEx Demo
RegEx Details:
^: Start
(?:https?://)?: optionally match http:// or https://
(?:www\.)?: optionally match www.
test\.com: match test.com
(?:/subfolder)?: optionally match /subfolder
/?: optionally match a trailing /
(?:\?\S*)?: optionally match query string starting with ?
$: End
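The pattern can be verified against every row of the question's table. The following is a Python sketch (the question does not name a regex flavor, so Python's re module is an assumption); URL, CASES and matches are hypothetical names.

```python
import re

# Pattern copied from the answer above.
URL = re.compile(r"^(?:https?://)?(?:www\.)?test\.com(?:/subfolder)?/?(?:\?\S*)?$")

# Inputs from the question, paired with the expected result.
CASES = {
    "test.com": True,
    "https://test.com": True,
    "https://test.com?foo=bar": True,
    "https://test.com/": True,
    "https://test.com/?foo=bar": True,
    "https://www.test.com": True,
    "https://www.test.com/subfolder": True,
    "https://www.test.com/subfolder/": True,
    "https://www.test.com/subfolder/?foo=bar": True,
    "test.com/subfolder": True,
    "https://www.test.com/foo": False,
    "test.com/foo": False,
    "https://www.test.com/jason": False,
    "https://www.test.com/jason?foo=bar": False,
}


def matches(value: str) -> bool:
    """Return True when the value is accepted by URL."""
    return URL.match(value) is not None


for url, expected in CASES.items():
    status = "ok" if matches(url) == expected else "MISMATCH"
    print(f"{status}: {url}")
```

Each line should print ok; a MISMATCH would point at a row the pattern handles differently from the table.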
With your shown samples, could you please try following.
^(?:(?:https?:\/\/)(?:www\.)?)?test\.com(?:(?:(?:\/)?(?:\/subfolder\/?)?(?:\/\?\S+\/?)?)?(?:\?\S+)?)?(?:\/)?$
Online demo for above regex
Explanation: Adding detailed explanation for above.
^ ##Starting of match here by caret sign.
(?: ##Starting non-capturing group here.
(?:https?:\/\/) ##In this non-capturing group which has http/https// in it to match.
(?:www\.)? ##In this non-capturing group keeping www. as an optional here.
)? ##Closing very first non-capturing group here.
test\.com ##Matching string test.com here. (1, calling it 1 for explanation purposes)
(?: ##Starting a non-capturing group here.
(?: ##Starting one more non-capturing group here.(2, calling it for explanation purposes only)
(?:\/)? ##Matching / optional in a non-capturing group here.
(?:\/subfolder\/?)? ##Matching /subfolder /(as optional) and whole non-capturing group as optional.
(?:\/\?\S+\/?)? ##Matching /? and all non-space characters followed by /(optional) in non-capturing group, keep this optional.
)? ##Closing (2) non-capturing group here.
(?: ##Starting non-capturing group here.
\?\S+ ##Matching ? non-spaces values here.
)? ##Closing non-capturing group here.
)? ##Closing (1) non-capturing group here.
(?: ##Starting non-capturing group here.
\/ ##Matching single / here.
)? ##Closing non-capturing group here, keeping it optional.
$ ##Mentioning $ to tell the end of value(match).