Regexmatch for multiple words in Sheets - regex

I'm trying to write a REGEXMATCH formula for Sheets that will analyze all of the text in a cell and then write a given keyword into another cell.
I've figured out how to do this for a single keyword: for example,
=IF(REGEXMATCH(F3, "czech"),"CZ",IF(REGEXMATCH(F3, "african"),"AF",IF(REGEXMATCH(F3, "mykonos"),"MK")))
What I'm having trouble with though is writing one of these values only if two or more terms are matched in the reference cell.
If I were trying to match one of two words, I realize I could use | as in:
=IF(REGEXMATCH(F3, "czech|coin"),"CZC"
etc
But in this instance I only want to produce CZC if the previous cell contains BOTH czech AND coin.
Can someone help me with this?

Try it like this:
=IF((REGEXMATCH(F3, "czech"))*(REGEXMATCH(F3, "coin")), "CZC", )
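Multiplication stands in for AND here: each REGEXMATCH returns TRUE or FALSE, and the product is nonzero (treated as TRUE by IF) only when both matches succeed.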
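To check the logic outside Sheets, here is a minimal Python sketch of the same idea. The function names are made up for illustration. The second function shows an alternative I'd expect to work with a single pattern (either order, anything in between), which avoids lookaheads since RE2 does not support them. Note that `.` does not cross line breaks, so the single-pattern version assumes both words are on the same line of the cell.

```python
import re

def label(text: str) -> str:
    """Return "CZC" only when both keywords occur somewhere in text."""
    # Two independent searches combined with AND, like multiplying
    # the two REGEXMATCH results in the formula.
    both = bool(re.search("czech", text)) and bool(re.search("coin", text))
    return "CZC" if both else ""

def label_single_pattern(text: str) -> str:
    """Same check as one pattern: either word order, anything in between."""
    return "CZC" if re.search("czech.*coin|coin.*czech", text) else ""

print(label("old czech coin"))            # both words present
print(label("czech glass"))               # only one word present
print(label_single_pattern("coin, czech"))
```

Like REGEXMATCH, both versions are case-sensitive as written.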

Related

Google Sheets multiple search and replace from a list

I already found this solution, but unfortunately I can't comment or ask a question in this thread.
https://stackoverflow.com/a/47685929/19554304
Is there a way to change the script from the solution so that it is possible to check multiple words for a replacement. For example: Check if the text contains the words A or B and replace them with C.
Thx
Check if the text contains the words A or B and replace them with C.
You can solve it via a formula like this:
=reduce(C2,SEQUENCE(COUNTA(A:A)),LAMBDA(a,r,REGEXREPLACE(a,"\b("&INDEX(A:A,r)&")\b",INDEX(B:B,r,))))
Put the words to search for in column A, separated by "|", and the replacement for each row in column B. For example: "car|Moto".
Let me know if it is useful! You can also apply it to a whole column by wrapping it in ARRAYFORMULA and replacing C2 with the full range of column C.
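For anyone who wants to see what the REDUCE/LAMBDA is doing step by step, here is a rough Python equivalent. The rule table and the function name are hypothetical stand-ins for columns A and B; each rule is applied in turn to the result of the previous one.

```python
import re
from functools import reduce

# Hypothetical lookup table: first item is the "|"-separated word list
# (column A), second item is the replacement for that row (column B).
RULES = [("car|Moto", "vehicle"), ("cat|dog", "pet")]

def replace_all(text, rules=RULES):
    def step(acc, rule):
        words, replacement = rule
        # \b(...)\b restricts the match to whole words, as in the formula.
        return re.sub(r"\b(" + words + r")\b", replacement, acc)
    # Feed the output of each replacement into the next one.
    return reduce(step, rules, text)

print(replace_all("my car and my dog"))
```

Because of the word boundaries, "cars" is left alone, and the match is case-sensitive, so "moto" would not be replaced by the "car|Moto" rule.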

How to Keep rows of multi-line cells containing a keyword in google sheets

I'm trying to keep only the lines that contain the word "NOA" in column A, which has many multi-line cells, as can be seen in this Google Spreadsheet.
If "NOA" is present then, I would like to keep the line. The input and output should look like the image which I have "working" with too-many helper cells. Can this be combined into a single formula?
Theoretical Approaches:
I have been thinking about three approaches to solve this:
ARRAYFORMULA(REGEXREPLACE - couldn't get it to work
JOIN(FILTER(REGEXMATCH(TRANSPOSE - showing promise as it works in multiple steps
Using the QUERY Function - unfamiliar w/ function but wondering if this function has a fast solution
Practical attempts:
FIRST APPROACH: First I tried using REGEXREPLACE to strip out every line that did not have NOA in it. The regex worked in a demo but didn't work properly in Sheets. I thought this might be a concise way to get the value, perhaps if my regex skills were better?
ARRAYFORMULA(REGEXREPLACE(A1:A7, "^(?:[^N\n]|N(?:[^O\n]|O(?:[^A\n]|$)|$)|$)+",""))
I think the regex became overly complex. Either it doesn't work in Google Sheets or the formula could be improved, but because Google's RE2 has limitations, certain things are harder to do.
SECOND APPROACH:
Then I came up with an alternative approach that works in two stages (with multiple helper cells), but I would like to do this with one formula.
=TRANSPOSE(split(A2,CHAR(10)))
=TEXTJOIN(CHAR(10),1,FILTER(C2:C7,REGEXMATCH(C2:C7,"NOA")))
Questions:
Can these formulas be combined and applied to the entire Column using an Index or Array?
Or perhaps, the REGEX in my first approach can be modified?
Is there a faster solution using Query?
The shared Google spreadsheet is here.
Thank you in advance for your help.
Here's one way you can do that:
=index(substitute(substitute(transpose(trim(
query(substitute(transpose(if(regexmatch(split(
filter(A2:A,A2:A<>""),char(10)),"NOA"),split(
filter(A2:A,A2:A<>""),char(10)),))," ","❄️")
,,9^9)))," ",char(10)),"❄️"," "))
First, we split the data by the newline (char 10), then we filter out the lines that don't contain NOA and finally we use a "query smush" to join everything back together.
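The split, filter, and join steps are easier to see in isolation. Here is a small Python sketch of what happens to a single cell. The function name is made up, and the sketch uses a plain substring check rather than a regex since the keyword is a literal.

```python
def keep_lines(cell: str, keyword: str = "NOA") -> str:
    # 1. Split the cell on newlines (CHAR(10) in Sheets).
    lines = cell.split("\n")
    # 2. Keep only the lines that contain the keyword.
    kept = [line for line in lines if keyword in line]
    # 3. Join the surviving lines back into one multi-line value.
    return "\n".join(kept)

print(keep_lines("NOA 1\nXYZ 2\nabc NOA"))
```

A cell with no matching line comes back as an empty string.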

How can I extract specific patterns from a string?

I currently have a dataset filled with the following pattern:
My goal is to get each value into a different cell.
I have tried with the following formula, but it's not yielded the results I am looking for.
=SPLIT(D8,"[Stock]",FALSE,FALSE)
I would appreciate any guidance on how I can get to the ideal output, using Google Sheets.
Thank you in advance!
I will assume here from your post that your original data runs D8:D.
If you want to retain [Stock] in each entry, try the following in the Row-8 cell of a column that is otherwise empty from Row 8 downward:
=ArrayFormula(IF(D8:D="",,TRIM(SPLIT(REGEXREPLACE(D8:D&"~","(\[Stock\]).","$1~"),"~",1,1))))
If you don't want to retain [Stock] in each entry, use this version:
=ArrayFormula(IF(D8:D="",,TRIM(SPLIT(REGEXREPLACE(D8:D&"~","\[Stock\].","~"),"~",1,1))))
These formulas don't rely on any punctuation as markers. They also ensure that you don't end up with blank (and therefore unusable) cells interspersed in the output from trailing SPLITs.
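If it helps to see the trick outside Sheets, here is a Python sketch of the same steps: append "~", turn "[Stock]" plus the one character after it into a "~" delimiter, split, then trim and drop empties. The sample string is invented, since the original data isn't shown, and the sketch assumes "~" never appears in the data.

```python
import re

def split_after_marker(cell, marker="[Stock]", keep_marker=True):
    sep = "~"  # assumed not to occur in the real data
    replacement = (marker if keep_marker else "") + sep
    # Appending sep guarantees the last marker also has a character after it.
    marked = re.sub(re.escape(marker) + ".", lambda m: replacement, cell + sep)
    # Trim each piece and drop the empty piece left by the trailing sep.
    return [part.strip() for part in marked.split(sep) if part.strip()]

sample = "Apples [Stock], Pears [Stock], Plums [Stock]"  # hypothetical data
print(split_after_marker(sample))
print(split_after_marker(sample, keep_marker=False))
```

The character after each "[Stock]" is consumed whatever it is, which is why no particular punctuation is required.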
If "," is only used in the separator:
=ARRAYFORMULA(SPLIT(D8:D,", ",FALSE))
If "," is also used within each string ([Stock] will be removed):
=ARRAYFORMULA(SPLIT(D8:D," [Stock], ",FALSE))
If "," is also used within each string ([Stock] will be kept):
=ArrayFormula(SPLIT(REGEXREPLACE(M9:M11,"(\[Stock\]), ","$1♦"),"♦"))
Use:
=INDEX(TRIM(IFNA(SPLIT(D8:D; ","))))

Google Sheets Using RegEX To Reformat & Concatenate

Link To Spreadsheet
Sheet!1Name - Names are in Single Column
Sheet!2Names - Names are in First Name, Last Name columns.
What I'm trying to do is basically remove any suffixes, special characters, and spaces, capitalize that information, and combine it with information from another field.
I was able to figure out how to piece together some regex that seems to effectively get rid of suffixes and removes special characters. It's below. That's where my skill set stops.
={"PlayerKey";ARRAYFORMULA(UPPER(IF(ISBLANK(C2:C8),,PROPER(TRIM(REGEXREPLACE(C2:C8," Jr\.$| J$| Sr\.$| S$|IV$|III$|II$|\.|-|'",""))))))}
I'm having trouble nesting formulas. I believe what I need to do is nest both CONCAT and SUBSTITUTE, but I'm not sure if that's the right way to get the "Desired Output example" that is in the sheet. I'm also having trouble understanding what order to do things in, which is why I'm having trouble with 2Name, I think.
How's this in A1 of the new tab called MK.Help?
=ARRAYFORMULA({"Player Key";UPPER(TRIM(REGEXREPLACE(IF(MID(C2:C8,2,1)=".",INDEX(SPLIT(C2:C8," "),,1),LEFT(C2:C8))&D2:D8," Jr\.$| J$| Sr\.$| S$|IV$|III$|II$|\.|-|'",""))&E2:E8)})
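In case the nesting order is the confusing part, here is how I'd read that formula as plain steps, written as a Python sketch. The argument names are my guesses at what columns C, D, and E hold (first name, last name, and the extra field), and the sample names are invented.

```python
import re

# Same pattern as in the formula: suffixes at the end, plus . - ' anywhere.
CLEANUP = r" Jr\.$| J$| Sr\.$| S$|IV$|III$|II$|\.|-|'"

def player_key(first: str, last: str, extra: str) -> str:
    # Step 1: initials such as "A.J." are kept whole; otherwise take
    # only the first letter of the first name.
    if first[1:2] == ".":
        lead = first.split(" ")[0]
    else:
        lead = first[:1]
    # Step 2: append the last name, then strip suffixes and punctuation.
    cleaned = re.sub(CLEANUP, "", lead + last)
    # Step 3: trim (collapsing repeated spaces like TRIM), upper-case,
    # and append the extra field.
    return " ".join(cleaned.upper().split()) + extra

print(player_key("A.J.", "Brown Jr.", "PHI"))
print(player_key("Patrick", "Mahomes II", "KC"))
```

So the order is: build the name, clean it, tidy the case and spacing, and only then concatenate the other field.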

Google Sheets Pattern Matching/RegEx for COUNTIF

The documentation for pattern matching for Google Sheets has not been helpful. I've been reading and searching for a while now and can't find this particular issue. Maybe I'm having a hard time finding the correct terms to search for but here is the problem:
I have several numbers (part numbers) that follow this format: ##-####
Categories can be defined by the part numbers, i.e. 50-03## would be one product category, and the remaining 2 digits are specific for a model.
I've been trying to run this:
=countif(E9:E13,"50-03[123][012]*")
(E9:E13 contains the part number formatted as text. If I format it any other way, the values show up screwed up because Google Sheets thinks I'm writing a date or trying to do arithmetic.)
This returns 0 every time, unless I were to change to:
=countif(E9:E13,"50-03*")
So it seems like wildcards work, but pattern matching does not?
As you identified, and as Wiktor mentioned, COUNTIF only supports wildcards.
There are many ways to do what you want, though. To name but two:
=ArrayFormula(SUM(--REGEXMATCH(E9:E13, "50-03[123][012]*")))
=COUNTA(FILTER(E9:E13, REGEXMATCH(E9:E13, "50-03[123][012]*")))
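One thing to watch when moving from wildcards to regex: the trailing `*` is no longer "anything", it means "zero or more of the preceding item", and REGEXMATCH succeeds on a partial match. The Python sketch below illustrates the difference with made-up part numbers. `re.search` behaves like a partial match, and anchoring the pattern with `^` and `$` forces the whole value to match.

```python
import re

# Hypothetical part numbers; "50-0335" has a last digit outside [012].
parts = ["50-0310", "50-0322", "50-0335", "50-0410"]

def count_matches(values, pattern):
    # Partial match anywhere in the value.
    return sum(1 for v in values if re.search(pattern, v))

# Trailing * makes the final digit optional, so "50-0335" is counted too.
print(count_matches(parts, "50-03[123][012]*"))

# Anchored and without the *, only exact category members are counted.
print(count_matches(parts, "^50-03[123][012]$"))
```

If the looser count is not what you want, the anchored pattern should drop into either formula unchanged.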
This is a really big hammer for a problem like yours, but you can use QUERY to do something like this:
=QUERY(E9:E13, "select count(E) where E matches '50-03[123][012]' label count(E) ''")
The label bit is to prevent QUERY from adding an automatic header to the count() column.
The nice thing about this approach is that you can pull in other columns, too. Say that over in column H, you have a number of orders for each part. Then, you can take two cells and show both the count of parts and the sum of orders:
=QUERY(E9:H13, "select count(E), sum(H) where E matches '50-03[123][012]' label count(E) '', sum(H) ''")
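As a sanity check of what that query computes, here is the same count-and-sum as a Python sketch over invented rows of (part number, orders). It uses `re.fullmatch` because, as far as I know, `matches` in QUERY requires the whole value to match the pattern.

```python
import re

# Hypothetical rows: (part number from column E, orders from column H).
rows = [("50-0310", 4), ("50-0322", 6), ("50-0335", 1), ("50-0410", 9)]

def count_and_sum(rows, pattern):
    # Collect the order counts for rows whose part number fully matches.
    matched = [orders for part, orders in rows if re.fullmatch(pattern, part)]
    return len(matched), sum(matched)

print(count_and_sum(rows, "50-03[123][012]"))
```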
I routinely find this question on $searchEngine and fail to notice that I linked another question with a similar problem and other relevant answers.