Using Match with Regex and Array formula in Sheets - regex

I have a list of names and want to know whether they all share the same family name.
If every name in the Family column ends with the family name from column B, the result should be a match; otherwise not.
I started by cleaning/splitting the names
=TRANSPOSE(ARRAYFORMULA(TRIM( SPLIT(SUBSTITUTE($A2," and",","),","))))
then doing a T/F match of only the family name for each case
=ISNUMBER(MATCH(REGEXEXTRACT($B$2,"\w+$"),REGEXEXTRACT(D2,"\w+$"),0))
I wanted to do this MATCH as an array, but it's not working. I would then have to count the TRUE values: if all are TRUE, return MATCH, otherwise NO MATCH.
I'd like to do this in a single cell, but I got stuck because I can't make the MATCH work as an array. I hope that makes sense, or am I going about this the wrong way?
Here's the sample sheet

try:
=ARRAYFORMULA(IF(A2:A="",,IF(1+LEN(
REGEXREPLACE(SUBSTITUTE(A2:A, "and", ","), "[^,]", ))=
MMULT(N(IFERROR(IF(SPLIT(SUBSTITUTE(A2:A, "and", ","), ",")="",,
REGEXMATCH(TRIM(SPLIT(SUBSTITUTE(A2:A, "and", ","), ",")),
REGEXEXTRACT(B2:B, "\w+$"))))),
SEQUENCE(COLUMNS(SPLIT(SUBSTITUTE(A2:A, "and", ","), ",")), 1, 1, 0)),
"match", "no match")))

use this
C2=trim(index(split(B2," "),1,COUNTA(split(B2," "))))
D2=SUBSTITUTE(A2,"and",",")
E2=if(COUNTA(split(D2,C2,false))=counta(split(D2,",",false)),"matched","not matched")
1- C2 gets the last word of B2, which is taken as the last name
2- D2 replaces "and" with ","
3- E2 splits D2 by "," and also by C2, then compares the two counts; if they are equal, all names are treated as matched
Result
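For anyone who wants to check the logic outside Sheets, here is a minimal Python sketch of the same idea. It assumes the family name is the last word of each name; the function names `last_word` and `family_match` and the sample names are made up for illustration. It compares each person's last word directly rather than counting split pieces.

```python
import re

def last_word(name: str) -> str:
    """Return the trailing run of word characters, or "" if there is none."""
    m = re.search(r"\w+$", name.strip())
    return m.group(0) if m else ""

def family_match(names: str, reference: str) -> str:
    """Report whether every person in `names` shares the reference family name."""
    family = last_word(reference)
    # Split on commas and on the standalone word "and" (not "and" inside a name).
    people = [p.strip() for p in re.split(r",|\band\b", names) if p.strip()]
    if not people:
        return ""  # mirror the blank result for an empty cell
    return "MATCH" if all(last_word(p) == family for p in people) else "NO MATCH"

print(family_match("John Smith, Jane Smith and Bob Smith", "Alice Smith"))
print(family_match("Sandra Anderson and Tom Jones", "Bill Anderson"))
```

The `\band\b` word boundary keeps a name like Sandra intact, whereas a plain SUBSTITUTE of "and" would also replace the "and" inside it.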

another one for you:
=ARRAYFORMULA(
IFS(
A2:A = "",,
ISNA(MATCH(
ROW(A2:A),
QUERY(
QUERY(
SPLIT(
FLATTEN(
FILTER(
ROW(A2:A) & "♥"
& --NOT(REGEXMATCH(
SPLIT(
REGEXREPLACE(A2:A, ",\s*|\s+and\s+", "♥"),
"♥"
),
"^$|" & REGEXEXTRACT(B2:B, "\s(\w+)$")
)),
A2:A <> ""
)
),
"♥"
),
"SELECT Col1, SUM(Col2)
GROUP BY Col1",
),
"SELECT Col1
WHERE Col2 = 0",
),
)),
"NO MATCH",
True,
"MATCH"
)
)

Related

Having a problem with IF function argument in Spreadsheet

I'm trying to make this description generator, but the first IF argument doesn't behave like the rest. When its condition is true, it returns only its own text and drops the rest of the text body that is joined with &, whereas the other chained IF arguments keep it. This example should make more sense.
try:
=INDEX(REGEXREPLACE(SUBSTITUTE(SUBSTITUTE(TRIM(FLATTEN(QUERY(TRANSPOSE(IF(IFERROR(
SPLIT(B1:B, CHAR(10)))="",,REGEXREPLACE({"", SEQUENCE(1, 100)}&". "&IFNA(VLOOKUP(
TRIM(SPLIT(B1:B, CHAR(10))),
{"ck", "click >";
"box", "select box";
"scd", "scroll down >"}, 2, 0),
SPLIT(B1:B, CHAR(10))), " ", CHAR(13)))),,9^9))),
" ", CHAR(10)), CHAR(13), " "), "^\. !?", ))
demo sheet
The first IF statement encloses the rest of the formula, so when the regex matches "ck" the condition is satisfied and you get "click >" but nothing else. I think you can just move the final bracket so that it sits right after "select ", like this:
=IF(REGEXMATCH(B4, "ck"),"click >",IF(REGEXMATCH(B4, "scd"),"scroll down > ",IF(REGEXMATCH(B4, "!"),"","select "))) & ARRAYFORMULA(REGEXREPLACE(REGEXREPLACE(REGEXREPLACE(REGEXREPLACE(TEXTJOIN(CHAR(10), 1, IF(REGEXMATCH(""&
SPLIT(B4, CHAR(10)), "^>.*"),
SPLIT(B4, CHAR(10)), TRANSPOSE(MMULT(TRANSPOSE(TRANSPOSE((SEQUENCE(1, COLUMNS(
SPLIT(B4, CHAR(10))))<=SEQUENCE(COLUMNS(
SPLIT(B4, CHAR(10))), 1, 0))*NOT(REGEXMATCH(
SPLIT(B4, CHAR(10)), "^>.+")))), TRANSPOSE(SIGN(NOT(REGEXMATCH(
SPLIT(B4, CHAR(10)), "^>.+"))))))&". "&
SPLIT(B4, CHAR(10)))), "^0. ", ),"scd",""),"ck",""),"!",""))

How to create result from multiple IF results neatly in Google Sheets

I want to combine the results of multiple IF statements into a single sentence.
Code:
=CONCAT("Fail column", IF($T3="No", " T", "")& IF($U3="No", ", U", "") & IF($W3<7, ", W", "") & IF($X3>3, ", X", "") & IF($AE3="No", ", AE", "") & IF($AF3="No", ", AF", ""))
Sample data :
If the first statement returns blank, the next statement should not show the comma at the beginning. And if all checks pass, the result should be shown as "Yes".
My expected output can be:
Fail column T, U, W, X, AE, AF
Fail column U, W, X, AE, AF
Fail column T
Fail column W, X
Yes
I'm thinking you could try:
Formula in R3:
=IF(OR(T3="No",U3="No",W3<7,X3>3,AE3="No",AF3="No"),"Fail column: "&TEXTJOIN(", ",TRUE,IF(T3="No","T",""),IF(U3="No","U",""),IF(W3<7,"W",""),IF(X3>3,"X",""),IF(AE3="No","AE",""),IF(AF3="No","AF","")),"Yes")
The key here is TEXTJOIN instead of CONCAT to exclude any empty values from the concatenated string.
Note: Excel and Google Sheets are two different apps, and their functions are not always interchangeable. Your question's title suggests that you are actually using Excel; however, your tags include GS.
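To see why joining with a skip-empty separator avoids the stray leading comma, here is a small Python sketch of the same logic. The `fail_summary` name and the dictionary layout are assumptions for illustration; the thresholds are the ones from the question's formula.

```python
def fail_summary(row: dict) -> str:
    """Build the 'Fail column ...' sentence, or 'Yes' when every check passes."""
    checks = [
        ("T", row["T"] == "No"),
        ("U", row["U"] == "No"),
        ("W", row["W"] < 7),
        ("X", row["X"] > 3),
        ("AE", row["AE"] == "No"),
        ("AF", row["AF"] == "No"),
    ]
    # Keep only the labels whose check failed, like TEXTJOIN with ignore_empty=TRUE.
    failed = [label for label, is_fail in checks if is_fail]
    return ("Fail column " + ", ".join(failed)) if failed else "Yes"

passing = {"T": "Yes", "U": "Yes", "W": 9, "X": 1, "AE": "Yes", "AF": "Yes"}
print(fail_summary(passing))
print(fail_summary({**passing, "W": 5, "X": 4}))
```

Because the separator is only placed between labels that survive the filter, no label ever needs to carry its own comma.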
correct formula would be:
=ARRAYFORMULA(REGEXREPLACE(IF(
(T3:T="yes")*(U3:U="yes")*((W3:W<7)*(W3:W<>""))*(X3:X>3)*(AE3:AE="yes")*(AF3:AF="yes"),
"yes", "Fail column: "&
IF(T3:T="no", "T, ", )&
IF(U3:U="no", "U, ", )&
IF(W3:W>=7, "W, ", )&
IF((X3:X<=3)*(X3:X<>""), "X, ", )&
IF(AE3:AE="no", "AE, ", )&
IF(AF3:AF="no", "AF, ", )), ", $|Fail column: $", ))

Reference text from cell and return remaining text to other cell

So my question is... I have 2 columns, A and B. Column A takes entries from a fixed list of texts (e.g. Normal Car, Lorry, Sports Car, Bike). When I type one of them in A (e.g. Bike), I want B to show whatever is left of that list (so "Normal Car", "Lorry" and "Sports Car" would be shown in B).
How do I do that?
This is the spreadsheet example:
https://docs.google.com/spreadsheets/d/16o75-R-U3zY0vajg1pO0N7-t9uGCtt_B2QiZLW_sXUM/edit#gid=0
Here are all the words that will be used: Lyrics, Visualizer, Bass Boost, 8D, Nightcore, Image Only
What I want to achieve is if I fill one or more of the words in Column A, then all the rest of the words will be filled up in B automatically. I left them filled up already so you can see the example. Thanks.
for one word:
=ARRAYFORMULA(REGEXREPLACE(TRIM(IF(LEN(A1:A),
REGEXREPLACE("Lyrics, Visualizer, Bass Boost, 8D, Nightcore, Image Only,",
A1:A&",", ), )), ",$", ))
for multiple words:
=ARRAYFORMULA(SUBSTITUTE(SUBSTITUTE(TRIM(TRANSPOSE(QUERY(TRANSPOSE(IF(LEN(A1:A), SUBSTITUTE(
REGEXREPLACE({"Lyrics", "Visualizer", "Bass Boost", "8D", "Nightcore", "Image Only"},
REGEXREPLACE(A1:A, ", ", "|"), ), " ", "♦"), )),,999^99))), " ", ", "), "♦", " "))
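If it helps to reason about what the formulas above are doing, here is a minimal Python sketch of the same "everything except what was typed" idea. The `remaining` function name is hypothetical; the word list is the one given in the question, and the typed words are assumed to be separated by commas.

```python
ALL_WORDS = ["Lyrics", "Visualizer", "Bass Boost", "8D", "Nightcore", "Image Only"]

def remaining(typed: str) -> str:
    """Return the words from ALL_WORDS that were not typed, in list order."""
    chosen = {w.strip() for w in typed.split(",") if w.strip()}
    if not chosen:
        return ""  # leave B blank when A is blank, as the formulas do
    return ", ".join(w for w in ALL_WORDS if w not in chosen)

print(remaining("8D"))
print(remaining("Lyrics, Bass Boost"))
```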
I would type some text specified from the list in 'A' (eg. Bike) and in B would be the text that is left from that list from 'A'
try like this:
=ARRAYFORMULA(REGEXREPLACE(TRIM(REGEXEXTRACT(A1:A, "(.*)"&B1:B)), ",$", ))

How to change a text or symbol into line break in Google Sheets?

I'm trying to change all ; into a line break \n in Google Sheets.
Is there a way to automate this, or do I need to do it one by one?
Use SUBSTITUTE or REGEXREPLACE wrapped in ARRAYFORMULA, like:
=ARRAYFORMULA(SUBSTITUTE(your_formula_or_range_here, ";", CHAR(10)))
=ARRAYFORMULA(REGEXREPLACE(your_formula_or_range_here, ";", CHAR(10)))
example:
=ARRAYFORMULA(SUBSTITUTE(QUERY({INDEX(QUERY(A1:B,
"select A,count(A) where A is not null group by A pivot B", 0), , 1),
REGEXREPLACE(TRIM(TRANSPOSE(QUERY(TRANSPOSE(IF(ISNUMBER(QUERY(A1:B,
"select count(A) where A is not null group by A pivot B", 0)), INDEX(QUERY({A1:A,B1:B&";"},
"select count(Col1) where Col1 is not null group by Col1 pivot Col2 offset 1", 0), 1,), ))
, , 999^99))), ";$", )}, "offset 1", 0), "; ", CHAR(10)))

regexp_substr skips over empty positions

With this code to return the nth value in a pipe delimited string...
regexp_substr(int_record.interfaceline, '[^|]+', 1, i)
it works fine when all values are present
Mike|Male|Yes|20000|Yes so the 3rd value is Yes (correct)
but if the string is
Mike|Male||20000|Yes, the 3rd value is 20000 (not what I want)
How can I tell the expression to not skip over the empty values?
TIA
Mike
The regexp_substr works this way:
If occurrence is greater than 1, then the database searches for the
second occurrence beginning with the first character following the
first occurrence of pattern, and so forth. This behavior is different
from the SUBSTR function, which begins its search for the second
occurrence at the second character of the first occurrence.
So the pattern [^|]+ looks for runs of non-pipe characters, meaning it skips over consecutive pipes ("||") until it finds a non-pipe character.
You might try:
select trim(regexp_substr(replace('A|test||string', '|', '| '), '[^|]+', 1, 4)) from dual;
This replaces each "|" with "| ", so an empty field becomes a single space that the pattern [^|]+ can match; TRIM then removes the padding.
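The effect is easy to reproduce outside the database. The following is only a sketch in Python, whose `re` module is not Oracle's regex engine, but for a pattern this simple the matching behaves the same way; the variable names are made up for illustration.

```python
import re

record = "Mike|Male||20000|Yes"

# Runs of non-pipe characters: the empty third field produces no match at all,
# so the third match is really the fourth field.
tokens = re.findall(r"[^|]+", record)

# Workaround from the answer: pad each delimiter with a space, then trim.
padded = re.findall(r"[^|]+", record.replace("|", "| "))
fields = [t.strip() for t in padded]

print(tokens[2])   # third match without padding
print(fields[2])   # third field with padding (empty string)
```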
I had a similar problem with a CSV file, so my separator was the semicolon (;).
So I started with an expression like the following one:
select regexp_substr(';2;;4;', '[^;]+', 1, i) from dual
letting i iterate from 1 to 5.
And of course it didn't work either.
To get the empty parts I just say they could be at the beginning (^;), or in the middle (;;) or at the end (;$). And or-ing all of this together gives:
select regexp_substr(';2;;4;', '[^;]+|^;|;;|;$', 1, i) from dual
And believe it or not: testing for i from 1 to 5, it works!
But let's not forget one last detail: with this approach you get ; for fields that were originally empty.
The next lines show how to get rid of them easily by replacing them with empty strings (nulls):
with stage1 as (
select regexp_substr(';2;;4;', '[^;]+|^;|;;|;$', 1, 2) as F from dual
)
select case when F like '%;' then '' else F end from stage1
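Here is a rough Python emulation of that alternation pattern, in case you want to try it on your own data before running it in Oracle. The `split_fields` name is hypothetical, and Python's `re` is used here only as a stand-in for Oracle's regex functions.

```python
import re

def split_fields(line: str) -> list:
    """Emulate the '[^;]+|^;|;;|;$' approach, mapping delimiter-only hits to ''."""
    parts = re.findall(r"[^;]+|^;|;;|;$", line)
    # A hit that ends in ';' stands for an empty field, as in the CASE expression.
    return ["" if p.endswith(";") else p for p in parts]

print(split_fields(";2;;4;"))
print(split_fields("a;;;b"))
```

One thing worth checking against your own data: in this emulation, two adjacent empty fields (as in `a;;;b`) come back as a single empty entry, so the positions after them shift by one.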
OK. This should be the best solution for you.
SELECT
REGEXP_REPLACE ( 'Mike|Male||20000|Yes',
'^([^|]*\|){2}([^|]*).*$',
'\2' )
TEXT
FROM
DUAL;
So for your problem
SELECT
REGEXP_REPLACE ( INCOMINGSTREAMOFSTRINGS,
'^([^|]*\|){N-1}([^|]*).*$',
'\2' )
TEXT
FROM
DUAL;
--INCOMINGSTREAMOFSTRINGS is your complete string with delimiter
--You should pass n-1 to obtain nth position
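As a sketch of how the N-1 placeholder gets filled in, the same pattern can be built in Python. The `nth_field` helper is an assumption for illustration, and `re.sub` stands in for REGEXP_REPLACE.

```python
import re

def nth_field(record: str, n: int) -> str:
    """Return the n-th (1-based) pipe-delimited field, keeping empty fields."""
    # Skip exactly n-1 "field plus pipe" groups, then capture the next field.
    pattern = r"^([^|]*\|){%d}([^|]*).*$" % (n - 1)
    return re.sub(pattern, r"\2", record, count=1)

record = "Mike|Male||20000|Yes"
for i in range(1, 6):
    print(i, repr(nth_field(record, i)))
```

If n is larger than the number of fields, the pattern does not match and the whole string comes back unchanged, so you may want to guard against that case.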
ALTERNATE 2:
WITH T AS (SELECT 'Mike|Male||20000|Yes' X FROM DUAL)
SELECT
X,
REGEXP_REPLACE ( X,
'^([^|]*).*$',
'\1' )
Y1,
REGEXP_REPLACE ( X,
'^[^|]*\|([^|]*).*$',
'\1' )
Y2,
REGEXP_REPLACE ( X,
'^([^|]*\|){2}([^|]*).*$',
'\2' )
Y3,
REGEXP_REPLACE ( X,
'^([^|]*\|){3}([^|]*).*$',
'\2' )
Y4,
REGEXP_REPLACE ( X,
'^([^|]*\|){4}([^|]*).*$',
'\2' )
Y5
FROM
T;
ALTERNATE 3:
SELECT
REGEXP_SUBSTR ( REGEXP_REPLACE ( 'Mike|Male||20000|Yes',
'\|',
';' ),
'(^|;)([^;]*)',
1,
1,
NULL,
2 )
AS FIRST,
REGEXP_SUBSTR ( REGEXP_REPLACE ( 'Mike|Male||20000|Yes',
'\|',
';' ),
'(^|;)([^;]*)',
1,
2,
NULL,
2 )
AS SECOND,
REGEXP_SUBSTR ( REGEXP_REPLACE ( 'Mike|Male||20000|Yes',
'\|',
';' ),
'(^|;)([^;]*)',
1,
3,
NULL,
2 )
AS THIRD,
REGEXP_SUBSTR ( REGEXP_REPLACE ( 'Mike|Male||20000|Yes',
'\|',
';' ),
'(^|;)([^;]*)',
1,
4,
NULL,
2 )
AS FOURTH,
REGEXP_SUBSTR ( REGEXP_REPLACE ( 'Mike|Male||20000|Yes',
'\|',
';' ),
'(^|;)([^;]*)',
1,
5,
NULL,
2 )
AS FIFTH
FROM
DUAL;
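The capture-group idea in ALTERNATE 3 can be sketched in Python as well. This version matches the pipe directly instead of first replacing it with a semicolon; the `fields_by_group` name is hypothetical, and `re.finditer` plays the role of iterating the occurrence argument.

```python
import re

def fields_by_group(record: str) -> list:
    """Collect group 2 of '(^|\\|)([^|]*)' for every occurrence in the record."""
    # Group 1 anchors each occurrence at the start or at a pipe;
    # group 2 is the field itself, which may be empty.
    return [m.group(2) for m in re.finditer(r"(^|\|)([^|]*)", record)]

print(fields_by_group("Mike|Male||20000|Yes"))
```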
You can use the following :
with l as (select 'Mike|Male||20000|Yes' str from dual)
select regexp_substr(str,'(".*"|[^|]*)(\||$)',1,level,null,1)
from dual,l
where level=3/*use any position*/ connect by level <= regexp_count(str,'([^|]*)(\||$)')
As a complement to @tbone's response...
Oddly, my Oracle didn't recognize the blank space character in this list: [^|]
In these cases it can be confusing and hard to realize what is going wrong.
Try this regex instead: ([^|]| )+. Also, to detect a possible empty first item, it is better to put the blank space before the separator rather than after it:
' |'
trim(regexp_substr(replace('A|test||string', '|', ' |'), '([^|]| )+', 1, 4))