Using REGEXEXTRACT in an array, searching multiple columns - regex

Can someone please tell me what I am doing wrong in this formula?
=ARRAYFORMULA(REGEXEXTRACT((A2:A&"")+(B2:B&"")+(C2:C&"")), "02(\d{14})37")
I'm trying to extract a 14 digit number that sits between 02 and 37 that may be in columnA, columnB or columnC.
I've tried this also, with the expected result showing on the first row only:
=ARRAYFORMULA(REGEXEXTRACT(textjoin(" ",true,A2:C),"02(\d{6,14})37"))
I'm really confuzzled.

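In Sheets, + adds numbers rather than joining text, and TEXTJOIN merges the whole range into a single string, which is why only the first row gets a result. Run REGEXEXTRACT on each column separately and fall back to the next column with IFERROR: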
=ARRAYFORMULA(IFERROR(IFERROR(IFERROR(IFERROR(
REGEXEXTRACT(A2:A&"", "02(\d{14})37"),
REGEXEXTRACT(B2:B&"", "02(\d{14})37")),
REGEXEXTRACT(C2:C&"", "02(\d{14})37")))))

Related

how to count how many Tuesdays and Wednesdays between two dates?

The input is an array [Tuesday, Wednesday], but it should be stored in one cell only.
Using this input I want to know how many of those days fall between two dates.
I found a reference, but I don't know how to make it dynamic because it only accepts an integer weekday.
https://www.extendoffice.com/excel/formulas/excel-count-day-of-week-between-two-dates.html
SUMPRODUCT(--(WEEKDAY(ROW(INDIRECT(start_date&":"&end_date)))=week_day))
Someone knows how to achieve this?
EDITED: the input can be in any format as long as it is inside one cell only.
Within Sheets you can try:
For count:
=INDEX(LAMBDA(aix,COUNTA(IFNA(FILTER(aix,REGEXMATCH(TO_TEXT(WEEKDAY(aix)),JOIN("|",MATCH(SPLIT(REGEXREPLACE(A5,"\[|\]",""),", "),TEXT(SEQUENCE(7),"DDDD"),0)))))))(SEQUENCE(DATEDIF(A2,B2,"d")+1,1,A2,1)))
For list:
=INDEX(LAMBDA(aix,IFNA(FILTER(aix,REGEXMATCH(TO_TEXT(WEEKDAY(aix)),JOIN("|",MATCH(SPLIT(REGEXREPLACE(A5,"\[|\]",""),", "),TEXT(SEQUENCE(7),"DDDD"),0))))))(SEQUENCE(DATEDIF(A2,B2,"d")+1,1,A2,1)))
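
As a cross-check for the formulas above, here is a rough Python sketch of the same count. It assumes the cell text looks like "[Tuesday, Wednesday]" (brackets optional) with full English day names; the function names are hypothetical.

```python
from datetime import date, timedelta

# Index order matches date.weekday(), where Monday is 0.
DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def parse_days(cell_text):
    """Turn "[Tuesday, Wednesday]" into a set of weekday indexes."""
    names = [name.strip() for name in cell_text.strip("[] ").split(",")]
    return {DAY_NAMES.index(name) for name in names if name}


def count_weekdays(start, end, cell_text):
    """Count the dates from start to end (inclusive) that fall on the listed days."""
    wanted = parse_days(cell_text)
    total_days = (end - start).days + 1
    return sum(
        1
        for offset in range(total_days)
        if (start + timedelta(days=offset)).weekday() in wanted
    )


print(count_weekdays(date(2023, 1, 1), date(2023, 1, 31), "[Tuesday, Wednesday]"))
```

Both end dates are included, which corresponds to the +1 added to the day difference in the Sheets formulas.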
use:
=SUMPRODUCT(REGEXMATCH(TEXT(SEQUENCE(DAYS(B2, B1)+1, 1, B1),
"dddd"), REGEXREPLACE(A4, ", ?", "|")))

Google sheets IF stops working correctly when wrapped in ARRAYFORMULA

I want this formula to calculate a date based on input from two other dates. I first wrote it for a single cell and it gives the expected results but when I try to use ARRAYFORMULA it returns the wrong results.
I first use two IF statements specifying what should happen if either one of the inputs is missing. Then the final IF statement calculates the date, based on two conditions, if both are present. This seems to work perfectly if I write the formula for one cell and drag it down.
=IF( (LEN(G19)=0);(U19+456);(IF((LEN(U19)=0) ;(G19);(IF((AND((G19<(U19+456));(G19>(U19+273)) ));(G19);(U19+456))))))
However, when I want to use arrayformula to apply it to the entire column, it always returns the value_if_false if neither cell is empty, regardless of whether the conditions in the if statement are actually met or not. I am specifically talking about the last part of the formula that calculates the date if both input values are present, it always returns the result of U19:U+456 even when the result should be G19:G. Here is how I tried to write the ARRAYFORMULA:
={"Date deadline";ARRAYFORMULA(IF((LEN(G19:G400)=0);(U19:U400+456);(IF((LEN(U19:U400)=0);
(G19:G400);(IF((AND((G19:G400<(U19:U400+456));(G19:G400>(U19:U400+273)) ));(G19:G400);(U19:U400+456)))))))}
I am a complete beginner who only learned to write formulas two weeks ago, so any help or tips would be greatly appreciated!
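AND and OR are not compatible with ARRAYFORMULA: they collapse the whole range into a single TRUE/FALSE instead of evaluating row by row.
Replace AND with * and OR with +
Try: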
={"Date deadline";ARRAYFORMULA(
IF((LEN(G19:G400)=0),(U19:U400+456),
(IF((LEN(U19:U400)=0), (G19:G400),
(IF((((G19:G400<(U19:U400+456))*(G19:G400>(U19:U400+273)) )),(G19:G400),
(U19:U400+456)))
))
)
)}
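
To illustrate the difference, here is a small Python sketch with made-up day numbers standing in for G19:G21 and U19:U21. It is only an analogy for how Sheets evaluates the two forms, not Sheets code.

```python
# Hypothetical values standing in for G19:G21 and U19:U21 (dates as day numbers).
g = [1300, 1500, 1250]
u = [1000, 1000, 1000]

# AND over ranges: every comparison is folded into one TRUE/FALSE for the whole column.
collapsed = (
    all(gi < ui + 456 for gi, ui in zip(g, u))
    and all(gi > ui + 273 for gi, ui in zip(g, u))
)

# Multiplying the comparisons: one 0/1 flag per row.
per_row = [int(gi < ui + 456) * int(gi > ui + 273) for gi, ui in zip(g, u)]

# Row-wise IF using the per-row flag.
result = [gi if flag else ui + 456 for gi, ui, flag in zip(g, u, per_row)]

print(collapsed, per_row, result)
```

With the single collapsed value, every row takes the same branch, which matches the symptom of always getting U+456.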
Keep in mind you cannot use the AND and OR functions in an ARRAYFORMULA, so you must find an alternative method, such as multiplying the conditions together and checking the result for 0 or 1 (TRUE*TRUE=1).
Based on your formulas and your description, I gather that you want the following:
If G19 is blank show U19 + 456
If U19 is blank show G19
If G19 is less than U19 + 456 but greater than U19 + 273 show G19
Otherwise show U19 + 456
I'm not too sure what you want to happen when both columns G and U are empty. Based on your current formula you are returning an empty cell + 456... but with this formula it returns an empty cell rather than Column U + 456
Formula
={"Date deadline";ARRAYFORMULA(TO_DATE(ARRAYFORMULA(IFS((($G19:$G400="")*($U19:$U400=""))>0,"",$G19:$G400="",$U19:$U400+456,$U19:$U400="",$G19:$G400,(($G19:$G400<$U19:$U400+456)*($G19:$G400>$U19:$U400+273))>0,$G19:$G400,TRUE,$U19:$U400+456))))}
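
The four rules listed above, plus the both-blank case, can be written out as a plain function to make the order of the checks explicit. This is a sketch in Python, with None standing in for an empty cell; the function name deadline is an assumption for illustration.

```python
from datetime import date, timedelta


def deadline(g, u):
    """Apply the rules in order; g and u are dates, or None for a blank cell."""
    if g is None and u is None:
        return None                      # both blank: leave the result blank
    if g is None:
        return u + timedelta(days=456)   # G blank: U + 456
    if u is None:
        return g                         # U blank: G
    lower = u + timedelta(days=273)
    upper = u + timedelta(days=456)
    # G strictly between U + 273 and U + 456: keep G, otherwise U + 456.
    return g if lower < g < upper else upper


u = date(2022, 1, 1)
print(deadline(u + timedelta(days=300), u))
print(deadline(u + timedelta(days=100), u))
```

The checks run top to bottom, the same way IFS returns the value of the first condition that is true.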

COGNOS 11 Concatenate with cast for char length

Probably simple but my head is fried right now with figures. I'm using COGNOS 11 and trying to make a data item display character length of '4' i.e 0014 rather than just 14. I can do this in the edit within the report properties but I'm trying to do a concatenate string and it keeps reverting to 14.
I've been trying CAST([Demand No], varchar(4)) as the expression definition (comes up as 'No error') but it still keeps dropping the leading 00 on the report.
My full concatenated string so far [Unit ID]||to_char(cast([Demand Date],date), 'ddmmyyyy')||cast([Demand No], varchar(4)). This produces XXXXXXDDMMYYYY0000 but only when the last four characters are 0000 but it looks like this XXXXXXDDMMYYYY00 if the leading 0's are dropped.
You could try lpad(cast([Demand No], varchar(4)),4,'0'), which pads to the full width of 4, if your data source supports lpad.
Not elegant, but this uses generic Cognos functions:
substring('0000', 1, 4 - char_length(cast([Demand No], varchar(4)))) || cast([Demand No], varchar(4))
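
The padding logic in the substring expression is easier to see when written out step by step. Below is a Python sketch of the same idea (prepend as many zeros as are missing from a width of 4), not Cognos syntax; the function names and sample values are invented.

```python
from datetime import date


def pad_demand_no(demand_no, width=4):
    """Left-pad with zeros, mirroring substring('0000', 1, 4 - length) || value."""
    text = str(demand_no)
    zeros = "0" * max(width - len(text), 0)  # nothing is added if already wide enough
    return zeros + text


def build_key(unit_id, demand_date, demand_no):
    """Unit ID, then the date as ddmmyyyy, then the padded demand number."""
    return unit_id + demand_date.strftime("%d%m%Y") + pad_demand_no(demand_no)


print(pad_demand_no(14))
print(build_key("ABC123", date(2023, 5, 1), 14))
```

The key point is that the number is converted to text before padding; a numeric value has no leading zeros to keep.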

Select files between specified range with regex

I have a folder with 100 folders, named like:
parent_folder/05/01/
parent_folder/05/02/
parent_folder/05/03/
parent_folder/05/04/
...
parent_folder/05/29/
parent_folder/05/30/
How can I specify a path, with regex, that would select only the contents of folders 01 to 10, then 11 to 20 and, finally, 21 to 30 ?
I am trying
"parent_folder/05/[1-10]*/*"
but it also selects 11, 12, ... all the way to 19.
EDIT: I want to read a large dataset in pyspark by 10-day intervals, and all suggested answers, so far, seem to fail.
If you want the "10" to be grouped with your 01...09 set, you are going to use something like this:
parent_folder\/05\/(0[1-9]|10)\/
then, for your 11...20 set,
parent_folder\/05\/(1[1-9]|20)\/
and so on.
You can try these regexps with the following link : https://regex101.com/r/cXAYbS/2
In python, you are going to need:
regex = r"parent_folder\/05\/(1[1-9]|20)\/"
The link above has a "python" generator, where you can borrow some code:
https://regex101.com/r/cXAYbS/2/codegen?language=python
How about this:
parent_folder/05/(?:0[1-9]|10)/
The '?:' marks a non-capturing group.
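
Here is a short Python sketch that applies the three range patterns to a list of folder paths, so each group can be checked before it is used anywhere else. The path list is generated for illustration. Note that this filters strings with Python's re module; many file readers take glob patterns rather than regexes in a path argument, so passing these patterns straight into a path may not work.

```python
import re

# One pattern per 10-day block, using non-capturing groups.
RANGES = {
    "01-10": r"parent_folder/05/(?:0[1-9]|10)/",
    "11-20": r"parent_folder/05/(?:1[1-9]|20)/",
    "21-30": r"parent_folder/05/(?:2[1-9]|30)/",
}

# Hypothetical folder list: parent_folder/05/01/ ... parent_folder/05/30/
paths = [f"parent_folder/05/{day:02d}/" for day in range(1, 31)]

# Keep, for each block, the paths that match its pattern in full.
groups = {
    label: [p for p in paths if re.fullmatch(pattern, p)]
    for label, pattern in RANGES.items()
}

for label, matched in groups.items():
    print(label, len(matched), matched[0], matched[-1])
```

Each filtered list can then be handed to whatever reads the data, one block at a time.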

Data validation using regular expressions in Google Sheets

I am using the below date/time format in gSheets:
01 Apr at 11:00
I wonder whether it is possible to use Data Validation (or any other function) to report error (add the small red triangle to the corner of the cell) when the format differs in any way.
Possible values in the given format:
01 -> any number between 01-31 (but not "1", there must be the leading zero)
space
Apr -> 3 letters for month (Jan, Feb, Mar... Dec)
space
at
space
11 -> hours in 24h format (00, 01...23)
:
00 -> minutes (00, 01,...59)
Is there any way to validate that the cell contains "text/data" exactly in the above mentioned format?
One way to do this is with a regular expression and the REGEXMATCH function in Google Sheets. For the given example, I made the regular expression below. It is anchored with ^ and $ so the whole cell has to match, and it limits days to 01-31 and hours to 00-23:
^(0[1-9]|[12][0-9]|3[01]) (Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) at ([01][0-9]|2[0-3]):[0-5][0-9]$
Process:
Select the range of cells to be validated
Go to Data > Data validation
Under Criteria select "Custom formula is"
Paste: =regexmatch(to_text(K4); "^(0[1-9]|[12][0-9]|3[01]) (Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) at ([01][0-9]|2[0-3]):[0-5][0-9]$")
Make sure that K4 in "to_text(K4)" is replaced by the upper-left cell of the selected range
Save
Hope it helps someone :)
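
If you want to try a pattern against sample values before putting it into Data validation, here is a small Python sketch. It uses a strict version of the pattern (days 01-31, hours 00-23, whole string must match); the helper name is_valid and the sample strings are assumptions for illustration. It does not check that the day exists in the given month, so "31 Feb at 10:00" still passes.

```python
import re

# Strict pattern: two-digit day 01-31, three-letter month, "at", 24h time.
STRICT = re.compile(
    r"(0[1-9]|[12][0-9]|3[01]) "
    r"(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) at "
    r"([01][0-9]|2[0-3]):[0-5][0-9]"
)


def is_valid(text):
    """True only if the entire string is in the 'DD Mon at HH:MM' format."""
    return STRICT.fullmatch(text) is not None


samples = ["01 Apr at 11:00", "1 Apr at 11:00", "32 Apr at 11:00",
           "01 Apr at 24:00", "01 Apr at 11:60", "01 apr at 11:00"]
for sample in samples:
    print(repr(sample), is_valid(sample))
```

fullmatch plays the role of the ^ and $ anchors; REGEXMATCH in Sheets accepts a partial match unless the pattern is anchored.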
You may try the formula for data validation:
=not(iserror(SUBSTITUTE(A1," at","")*1))*(len(A1)=15)*(right(A1,2)*1<60)
not(iserror(SUBSTITUTE(A1," at","")*1)) checks that the whole entry is a legal date
(len(A1)=15) checks that the day and the time are entered with two digits each
(right(A1,2)*1<60) catches minutes that are too large; for some reason 01 Apr at 11:99 is otherwise accepted as a legal date.
Select the range of cells where the data validation should apply.
Go to Data -> Data validation
For "Criteria" select "Custom formula is"
Enter the following in the text field next to "Custom formula is":
=regexmatch(Tablename!B2; "^[a-z_]*$")
Here "Tablename" should be replaced by the sheet name and "B2" by the first cell of the range.
Your regular expression goes inside the quotes. This example allows only lowercase letters and underscores.
Wrapping the cell in to_text() didn't work for me, so you may want to leave it out.
Press Save