Regex matching just once inside multiple delimiters

In a pattern X-Y-Z, where the delimiter is "-", I want to check whether Y has size 8 without repetitions.
Y could be a subset like Y = (A-B-C) but Y just has a value 1 if there's no
1 - num-12345678-num -> In this case I want Y to have a value.
2 - num-12345678-234-213-num -> Since Y is a subset (12345678-234-213), Y should have a different value.
The regex I'm using is '-([0-9]*)-'. It works for the first case but returns the same value for the second. Could anyone help me?
Thanks in advance

You may add a hyphen to the character class:
-([0-9-]*)-
      ^
See the regex demo
If you put it at the end of the char class, you do not need to escape it.
Details:
- - a hyphen
([0-9-]*) - Group 1 capturing zero or more (due to the * quantifier) digits and/or hyphens
- - a literal hyphen again.
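As a quick check of the difference, here is a minimal JavaScript sketch that runs both patterns on the two sample strings from the question. The helper names captureBetween and isEightDigits are made up for illustration, and the final check assumes that "size 8" means exactly eight digits.

```javascript
// Hypothetical helper: returns capture group 1 of the given pattern, or null.
function captureBetween(pattern, input) {
  const m = input.match(pattern);
  return m ? m[1] : null;
}

const digitsOnly = /-([0-9]*)-/;        // pattern from the question
const digitsAndHyphens = /-([0-9-]*)-/; // pattern with the hyphen added

// Assumption: "size 8" means exactly eight digits and nothing else.
const isEightDigits = (y) => /^[0-9]{8}$/.test(y);

for (const input of ["num-12345678-num", "num-12345678-234-213-num"]) {
  const y = captureBetween(digitsAndHyphens, input);
  console.log(input, "->", y, isEightDigits(y));
}
```

With the hyphen in the class, the second string yields the whole run 12345678-234-213, so the two cases can be told apart by testing the captured value.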

Regex quantifier more than one group

I need a regex to get a sequence of number 1 followed by number 0 and the total numbers should be equal to a max length. Is there a way to do something like (([1]+)([0]+)){maxLength} ?
Ex.:
maxLength = 7
10 -> should not pass (total length < maxLength)
1111100 -> should match
1000000 -> should match
11110000000 -> should match 1111000.
111111111111 -> should match 1111111.
Plus: The sequence could be 0 followed by 1, and the greater the amount of 1 the better (I don't know if it's possible in only one regex).
000000001111 -> should get 0001111.
I'm focusing on 1 followed by 0.
I started with [1]+[0]+,
after I quantified the 0s ([1]+)([0]{1,7}),
but it still gives more 0s than I want.
Then I thought of ([1]{7,}|[1]{6}[0]{1}|[1]{5}[0]{2}|[1]{4}[0]{3}|[1]{3}[0]{4}|[1]{2}[0]{5}|[1]{1}[0]{6}),
and it works. But if maxLength = 100, the above solution is not viable.
Is there some way to count the length of the first matched group and then the second group to be the difference from the first one?
Or something like (([1]+)([0]+)){7} ?
My attempt using branch reset group:
0*(?|(1[10]{6})|([10]{6}1))
See an online demo. You can use the result from 1st capture group.
0* - zero or more literal zeros (greedy), up to:
(?| - Open branch reset group:
(1[10]{6}) - 1st capture group holding a literal 1 followed by 6 ones or zeros.
| - Or:
([10]{6}1) - 1st capture group holding 6 ones or zeros followed by a literal 1.
) - Close branch reset group.
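Branch reset groups are not available in every engine; JavaScript, for instance, rejects (?|...) as an invalid group. In this particular pattern both branches fill the same group, so a single capture group containing the alternation should behave the same way. A minimal sketch, with the hypothetical helper name firstWindow:

```javascript
// Same idea as 0*(?|(1[10]{6})|([10]{6}1)), written without a branch reset group.
const windowPattern = /0*(1[10]{6}|[10]{6}1)/;

// Hypothetical helper: returns the captured 7-character window, or null.
function firstWindow(input) {
  const m = input.match(windowPattern);
  return m ? m[1] : null;
}

console.log(firstWindow("11110000000"));  // 1111000
console.log(firstWindow("000000001111")); // 0001111
console.log(firstWindow("10"));           // null
```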
It seems you just want:
^(?:(?=1+0*$)|(?=0+1*$))[01]{7}
Here the {7} can be replaced with whatever the max length is.
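To avoid hard-coding the length, the pattern can be built from a variable. The sketch below does this in JavaScript; buildPattern and maxLength are illustrative names, not part of the answer, and the quantifier is set to the full window length (7 in the question's examples).

```javascript
// Hypothetical helper: builds the lookahead pattern for a given window length.
function buildPattern(maxLength) {
  return new RegExp(`^(?:(?=1+0*$)|(?=0+1*$))[01]{${maxLength}}`);
}

// Hypothetical helper: returns the matched window, or null.
function matchWindow(input, maxLength) {
  const m = input.match(buildPattern(maxLength));
  return m ? m[0] : null;
}

const inputs = ["10", "1111100", "11110000000", "1010101", "000000001111"];
for (const s of inputs) {
  console.log(s, "->", matchWindow(s, 7));
}
```

Because the match is anchored at the start, the 0-then-1 input 000000001111 yields its first seven characters (0000000) rather than the 0001111 asked for in the question.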
I think the regex can be as simple as:
/0*([01]{7})/
example:
const result = `
10
1111100
1000000
11110000000
111111111111
000000001111
`.split("\n").reduce((acc, str) => {
  // Skip leading zeros, then capture the next 7 characters of 0s and 1s.
  const m = str.match(/0*([01]{7})/);
  if (m) acc.push(m[1]);
  return acc;
}, []);
console.log(result);

Extracting data from a cell with a formula, but it does not pull some of the data

I have been extracting data from a cell, but my formula pulls only part of the data rather than everything I need.
I have linked a sheet below and would appreciate any help.
My formulas.
=ArrayFormula(TRIM(REGEXREPLACE(A3:A,"\.\.\.(.*)|\*\*\*","")))
=ArrayFormula(IFERROR(TRIM(REGEXEXTRACT(A3:A, "DONE=>\s*.+\b"))))
https://docs.google.com/spreadsheets/d/1MKC1OWIj64v_mmuNM6mLFY9wMgLwl2mUxm6KnsM5arE/edit#gid=0
The regexps you can use are
=ArrayFormula(TRIM(REGEXREPLACE(A3:A,"(\*{3}.*?)(?:\s*\.{3}DONE=>.*)?(\*{3})$","$1 $2")))
=ArrayFormula(IFERROR(TRIM(REGEXEXTRACT(REGEXREPLACE(A3:A, "^([^-]*-)[^-]+-", "$1"), ".*DONE=>.*"))))
See the first regex demo and the second regex demo. The third one - .*DONE=>.* - simply returns all the strings that contain DONE=> in them.
Details:
(\*{3}.*?) - Group 1 ($1): three * chars and then any zero or more chars other than line break chars, as few as possible
(?:\s*\.{3}DONE=>.*)? - an optional string of zero or more whitespaces, ...DONE=> and then the rest of the string
(\*{3}) - Group 2 ($2): *** string
$ - end of string.
The ^([^-]*-)[^-]+- matches
^ - start of string
([^-]*-) - Group 1 ($1): any zero or more chars other than - and then a -
[^-]+- - one or more chars other than - and then a - char.
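The effect of the ^([^-]*-)[^-]+- replacement can be tried outside Sheets. This JavaScript sketch applies the same pattern to a made-up string (the real sheet data is not reproduced here); dropSecondField is a hypothetical name, and the sketch assumes the pattern behaves the same in JavaScript as in Sheets, which seems safe for a pattern this simple.

```javascript
// Hypothetical helper mirroring REGEXREPLACE(text, "^([^-]*-)[^-]+-", "$1").
function dropSecondField(text) {
  return text.replace(/^([^-]*-)[^-]+-/, "$1");
}

// Made-up input with three hyphen-separated fields.
console.log(dropSecondField("AAA-BBB-CCC DONE=>x")); // AAA-CCC DONE=>x
// Without two hyphens the pattern does not match and the text is unchanged.
console.log(dropSecondField("no hyphens here"));     // no hyphens here
```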
You said: "Thank you, but it includes the ? value at the end."
Here is a completely new formula for your needs.
It joins the front part and the last part with &:
=ArrayFormula(IF(REGEXMATCH(A2:A,"MUKHML"),TRIM((REGEXEXTRACT(A2:A,"^[^-]*")&REGEXREPLACE(A2:A,".*\?|.* COMPLEXIES",""))),""))
Or use this new formula, based on the one from Wiktor:
=ArrayFormula(IF(REGEXMATCH(A2:a,"MUKHML"),REGEXREPLACE(A2:a,"^([^-]*-)[^-]+-","$1"),""))

Extract values delimited by characters without characters with regex (LabView)

I want to extract the number sandwiched between two specific letters.
e.g. string: x23y4z90
I specify x and y , I get 23
I specify y and z , I get 4
I specify z and x , I get 90 (the string pattern loops)
x\dy yields x23y, but I don't want the letters included.
*note: This is to read sensor values serially in LabVIEW.
One possibility is to use groups:
x(\d+)y
Now the first capture group (group 1) will contain only the number. The whole match (group 0) still includes the letters.
Another possibility is to use positive lookahead and positive lookbehind:
(?<=x)\d+(?=y)
Please note the + I added. This is necessary to match numbers with multiple digits.
Check it here for x and y and here for y and z.
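Both approaches can be sketched in JavaScript for illustration only (in LabVIEW the same patterns would go into the regular expression input). The helper numberBetween and the sample stream are made up, and the helper assumes the two delimiters are plain letters that need no regex escaping.

```javascript
// Hypothetical helper; assumes `start` and `end` are plain letters
// that need no regex escaping.
function numberBetween(text, start, end) {
  const found = text.match(new RegExp(`(?<=${start})\\d+(?=${end})`));
  return found ? found[0] : null;
}

const reading = "x23y4z90x"; // made-up stream: the pattern loops back to x
console.log(numberBetween(reading, "x", "y")); // 23
console.log(numberBetween(reading, "y", "z")); // 4
console.log(numberBetween(reading, "z", "x")); // 90

// Capture-group variant: index 1 holds the digits, index 0 the whole match.
const grouped = reading.match(/x(\d+)y/);
console.log(grouped[0], grouped[1]); // x23y 23
```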
You need to use lookarounds or groups
(?<=x)\d+(?=y)
------   -----
   |       |->only checks if y is after a digit (lookahead)
   |->only checks if x is before a digit (lookbehind)

Need to capture single character, but ignore digit

I'm parsing out flight info.
Here's the sample data:
E0.777 7 3:09
E0.319 N 1:43
E0.735 8 1:45
E0.735 N 1:48
E0.M80 9 3:21
E0.733 1:48
I need to populate fields like this:
Equipment: 735
On Time: N
Duration: 1:48
Problem I'm having is capturing the Y or N character but ignoring the single digit, then capturing the duration.
This is the expression I have tried:
#"^.{3}(.{3})\s?([N|Y]?)?(?:[0-9]\s+)?(\w{4})"
Edit: I updated the sample data to clarify my question. Equipment is not always three digits, it could be a character and two digits. The data between the equipment and the duration could be a boolean N or Y, a single digit, or white space. Only the boolean should be captured.
Firstly, you mix up the concepts of alternation and character classes. [Y|N] would match 3 different characters: Y or | or N. Either use a group, (Y|N), or leave out the pipe.
Secondly your double ? after the character class does not really do anything. Thirdly, at the end you only match consecutive spaces if a digit was found. But if there is no digit, the last ? will ignore the subpattern, thus not allowing spaces either.
Lastly, \w does not match :.
Try this:
#"^.{3}(\d{3})\s?(?:([NY])|\d)\s+(\d:\d\d)"
You should also think about restricting the repeated . at the beginning to a more precise character class (i.e \w{2}\., but I don't know the possibilities there).
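The points about the character class and \w can be checked with a few one-line tests. This is only a sketch in JavaScript; the variable names are illustrative.

```javascript
// Inside [...] the pipe is a literal character, not alternation.
const classWithPipe = /^[Y|N]$/.test("|");    // true: "|" is accepted
const classWithoutPipe = /^[YN]$/.test("|");  // false
const groupAlternation = /^(?:Y|N)$/.test("N"); // true

// \w covers letters, digits and underscore, so it cannot match the colon.
const wordCharsOnly = /^\w{4}$/.test("1:48");     // false
const explicitDuration = /^\d:\d\d$/.test("1:48"); // true

console.log(classWithPipe, classWithoutPipe, groupAlternation);
console.log(wordCharsOnly, explicitDuration);
```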
#"^..\.(\d{3})\s(?:([YN])|\d)\s*(\S{4})"
Changed .{3} to ..\. which is a bit more specific about there being a literal . for character 3.
(?:([YN])|\d) matches either Y/N or a digit, but only captures a Y or N. Notice that it's [YN] not [Y|N].
Changed \w{4} to \S{4} since \w doesn't match colons :.
This will do it...
^\w\d\.(\d{3})\s(?:([YN])|\d)\s*(\d:\d{2})$
I made some other changes to your regex because it was easier for me to just rewrite it based on your data than to try to modify what you had.
This will capture the Y or N or it won't capture anything in that group. I also tried to be more specific with your duration regex.
Update: This works with your new requirements...
^\w\d\.(\w{3})\s(?:([YN])|\d|\s)\s*(\d:\d{2})$
You can see it working on your data here... http://regexr.com?32j1b
(hover over each line to see the matched groups)
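As a rough illustration of how the three groups map onto the fields from the question, here is a JavaScript sketch using the updated pattern. The helper parseFlight and the field names are made up, and the blank on-time case assumes the empty column is padded with extra whitespace in the real data.

```javascript
const flightPattern = /^\w\d\.(\w{3})\s(?:([YN])|\d|\s)\s*(\d:\d{2})$/;

// Hypothetical helper: maps the capture groups onto the question's fields.
function parseFlight(line) {
  const m = line.match(flightPattern);
  if (!m) return null;
  // Group 2 is undefined when a digit or whitespace was matched instead of Y/N.
  return { equipment: m[1], onTime: m[2] || null, duration: m[3] };
}

console.log(parseFlight("E0.735 N 1:48")); // equipment 735, onTime N
console.log(parseFlight("E0.M80 9 3:21")); // equipment M80, onTime null
// Assumption: a blank on-time column is padded with extra whitespace.
console.log(parseFlight("E0.733   1:48")); // equipment 733, onTime null
```

With only a single space between equipment and duration, the \d alternative consumes the first digit of the duration and the line is not matched, so the padding assumption matters.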
This captures all lines with Y or N and ignores everything else:
^...(\d{3})\s*([YN])\s*(\d+:\d+)

Regex for: 6 digits or 0-6 signs (digits or stars) with at least one star

How to write regex to validate this pattern?
123456 - correct
*1 - correct
1* - correct
124** - correct
*1*2 - correct
* - correct
123456* - incorrect (size 7)
12345 - incorrect (size 5 without stars)
tried:
^[0-9]{6}$|^(([0-9]){1,6}([*]){1,5}){1,6}+$
But it allows more than 6 digits and doesn't allow a star before a digit.
There is no minimum/maximum count of "*" sign (but max count for all signs is 6).
Here you go:
^(?:\d{6}|(?=.*\*)[\d*]{1,6}|)$
Here is what it does:
^ <-- Start of the string (we don't want to capture more than that)
(?: <-- Start a non captured group (it will be used to do the "or" part)
\d{6} <-- 6 digits, nothing more
| <-- OR
(?=.*\*) <-- Look ahead for a '*' (you could replace the first * with {0,5})
[\d*] <-- digits or '*'
{1,6} <-- repeated one to six times (we know from the look ahead that there will be at least one '*')
| <-- OR (nothing)
) <-- End the non capturing group
$ <-- End of the string
I'm not quite sure if you want the empty case (but you said 0 to 6), if you actually want 1 to 6 just remove the last |
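The pattern can be run against the examples from the question. A minimal JavaScript sketch follows; the name starPattern and the added empty-string case are illustrative.

```javascript
const starPattern = /^(?:\d{6}|(?=.*\*)[\d*]{1,6}|)$/;

const cases = ["123456", "*1", "1*", "124**", "*1*2", "*", "123456*", "12345", ""];
for (const s of cases) {
  // Prints each input next to whether the pattern accepts it.
  console.log(JSON.stringify(s), starPattern.test(s));
}
```

The first six inputs are accepted, 123456* and 12345 are rejected, and the empty string is accepted because of the trailing empty alternative.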
/ ([0-9] {6} ) | ( ( [0-9]{0-5} & [*]{1-5} ) {0-6})/
something like this?
[1-6]{6}|([1-6]|\*){1,6}[^123456]
this works for the inputs you gave...
If you want something else then update me...
You can't do this with just a regex. You also need a length check. However, here is a regex that will help.
([\d*]*\*[\d*]*)|(\d{6})
To validate the input, try something like this:
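function validate(input) {
  // Anchored so the whole input must match, as regex.matches(input) implies.
  const regex = /^(?:[\d*]*\*[\d*]*|\d{6})$/;
  const digitRegex = /\d/; // this makes sure they aren't all stars (a lone * is rejected)
  return input.length < 7 && regex.test(input) && digitRegex.test(input);
}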
I am afraid that you will have to try for each position that the * might have, like this:
/([0-9]{6}|\*[0-9][0-9\*]{0,4}|[0-9]\*[0-9\*]{0,4}|[0-9]{2}\*[0-9\*]{0,3}|[0-9]{3}\*[0-9\*]{0,2}|[0-9]{4}\*[0-9\*]?|[0-9]{5}\*)/
Edit:
The above solution will, however, not allow **2.
And I was wrong. You can do it with a lookahead like Colin did. That is the way to go.
Try this : (updated)
([0-6]{6})|([0-6\*]{1,6})
It should work...
If any digits 0..9 are allowed, try this regexp: [0-9*]{2,6}
If only digits 1..6 are allowed, as in your example: [1-6*]{2,6}
It's a bit tricky because 12345 will also be validated as correct.
example here
You'll actually need a solution with look-around as already suggested by @Colin