Avoid duplicate code in Excel IF formula code - regex

I want to avoid duplicate code within excel formulas. Is there a method to repeat a certain code segment?
=IF(A1=1,(A1-B2-C3),(A1-B2-C3)+1)
This would be especially useful when it comes to more complex or longer sections. But: everything must be in ONE formula in ONE cell. Thanks! :-)
EDIT: This is my current code.
=IF(ISNUMBER(SEARCH(".amp",A2)),IFERROR(MID(A2,FIND("#",SUBSTITUTE(A2,"-","#",LEN(A2)-LEN(SUBSTITUTE(A2,"-",""))))+1,SEARCH(".html",A2)-FIND("#",SUBSTITUTE(A2,"-","#",LEN(A2)-LEN(SUBSTITUTE(A2,"-",""))))-5),""),IFERROR(MID(A2,FIND("#",SUBSTITUTE(A2,"-","#",LEN(A2)-LEN(SUBSTITUTE(A2,"-",""))))+1,SEARCH(".html",A2)-FIND("#",SUBSTITUTE(A2,"-","#",LEN(A2)-LEN(SUBSTITUTE(A2,"-",""))))-1),""))
It strips the long ID number out of any URL of a specific CMS. So
FIND("#",SUBSTITUTE(A2,"-","#",LEN(A2)-LEN(SUBSTITUTE(A2,"-","")))
is probably the part which occurs more than once and should be replaced for a code which does not be that duplicate-prone.
EXAMPLE: www.domain.com/path1/path2/this-is-an-article-123-dd-123456789.html --> 1234567890
EXAMPLE: www.domain.com/path1/path2/this-is-an-article-123-dd-1234567890.amp.html ->
1234567890
EXAMPLE: www.domain.com/path1/this-is-an-article-1234567890.html ->
1234567890

In google sheets, you could use REGEXEXTRACT to get what you want:
Formula in B1:
=REGEXEXTRACT(A1,"\d{8,}")

Place the complex common sub-expression in its own cell and refer to that cell.
EDIT#1:
As an alternative, you can use a Named Formula for the sub-expression:
Named Formula

So here is another way of finding the code in Excel:
Here is the formula in Cell B1 which needs to be confirmed by pressing Ctrl+Shift+Enter, then drag it down to apply across board:
{=FILTERXML("<data><a>"&SUBSTITUTE(MID(A1,LARGE(IF(MID(A1,ROW($A$1:INDEX($A:$A,LEN(A1))),1)="-",ROW($A$1:INDEX($A:$A,LEN(A1)))),1)+1,LEN(A1)),".","</a><a>")&"</a></data>","/data/a[1]")}
For the logic behind this formula you may give a read to this article: Extract Words with FILTERXML.
Cheers :)
Ps. it seems that GoogleSheet has out performed Excel in some area already.

Related

How to Keep rows of multi-line cells containing a keyword in google sheets

I'm trying to keep lines that contain the word "NOA" in a column A which has many multi-line cells as can be viewed in this Google Spreadsheet.
If "NOA" is present then, I would like to keep the line. The input and output should look like the image which I have "working" with too-many helper cells. Can this be combined into a single formula?
Theoretical Approaches:
I have been thinking about three approaches to solve this:
ARRAYFORMULA(REGEXREPLACE - couldn't get it to work
JOIN(FILTER(REGEXMATCH(TRANSPOSE - showing promise as it works in multiple steps
Using the QUERY Function - unfamiliar w/ function but wondering if this function has a fast solution
Practical attempts:
FIRST APPROACH: first I attempted using REGEXEXTRACT to extract out everything that did not have NOA in it, the Regex worked in demo but didn't work properly in sheets. I thought this might be a concise way to get the value, perhaps if my REGEX skill was better?
ARRAYFORMULA(REGEXREPLACE(A1:A7, "^(?:[^N\n]|N(?:[^O\n]|O(?:[^A\n]|$)|$)|$)+",""))
I think the Regex because overly complex, didn't work in Google or perhaps the formula could be improved, but because Google RE2 has limitations it makes it harder to do certain things.
SECOND APPROACH:
Then I came up with an alternate approach which seems to work 2 stages (with multiple helper cells) but I would like to do this with one equation.
=TRANSPOSE(split(A2,CHAR(10)))
=TEXTJOIN(CHAR(10),1,FILTER(C2:C7,REGEXMATCH(C2:C7,"NOA")))
Questions:
Can these formulas be combined and applied to the entire Column using an Index or Array?
Or perhaps, the REGEX in my first approach can be modified?
Is there a faster solution using Query?
The shared Google spreadhseet is here.
Thank you in advance for your help.
Here's one way you can do that:
=index(substitute(substitute(transpose(trim(
query(substitute(transpose(if(regexmatch(split(
filter(A2:A,A2:A<>""),char(10)),"NOA"),split(
filter(A2:A,A2:A<>""),char(10)),))," ","❄️")
,,9^9)))," ",char(10)),"❄️"," "))
First, we split the data by the newline (char 10), then we filter out the lines that don't contain NOA and finally we use a "query smush" to join everything back together.

Regexmatch for multiple words in Sheets

I'm trying to write a REGEXMATCH formula for Sheets that will analyze all of the text in a cell and then write a given keyword into another cell.
I've figured out how to do this for a single keyword: for example,
=IF(REGEXMATCH(F3, "czech"),"CZ",IF(REGEXMATCH(F3, "african"),"AF",IF(REGEXMATCH(F3, "mykonos"),"MK")))
What I'm having trouble with though is writing one of these values only if two or more terms are matched in the reference cell.
If I were trying to match one of two words, I realize I could use | as in:
=IF(REGEXMATCH(F3, "czech|coin"),"CZC"
etc
But in this instance I only want to produce CZC if the previous cell contains BOTH czech AND coin.
Can someone help me with this?
try like this:
=IF((REGEXMATCH(F3, "czech"))*(REGEXMATCH(F3, "coin")), "CZC", )
multiplication stands for AND

Arrayformula to check if column contains text and pull the number next to it. Google Sheets

In desperate need of some assistance with this!
Wasn't sure how to title this question...
SAMPLE SHEET - CLICK ME! :)
In SupportingSheet!H1 I have the following formula:
=ArrayFormula(if(G1:G<>"", IF(DASHBOARD!N2<>"", G1:G/DASHBOARD!$P$2-filter(DASHBOARD!O1:O100,REGEXMATCH(DASHBOARD!N1:N100,E1:E100)),G1:G/(DASHBOARD!$M$3)),))
The part I struggle with is:
G1:G/DASHBOARD!$P$2-filter(DASHBOARD!O1:O100,REGEXMATCH(DASHBOARD!N1:N100,E1:E100))
It needs to divide two numbers and then subtract another number. I can't seem to get this formula to pull the correct number.
It needs to check if the text in E1:E100 exist in DASHBOARD!N1:N100, if yes, pull the number from DASHBOARD!O1:O100.
For example, text in SupportingSheet!E1 can be found in DASHBOARD!N2, hence it needs to pull the number from DASHBOARD!O2.
Column SupportingSheet!J has the actual end result that a formula needs to produce.
It doesn't look like Regexmatch works as an Arrayformula and I am not sure how to go about it.
Please note, that text in SupportingSheet!E1:E is not always identical. Often it will have a random number of "space" at the end (long story...). That is why Regexmatch was a perfect option until I realised it didn't work.
Please let me know if further clarification is needed.
Below is an image of the random spaces (non-printable characters) at the end.
use:
=ARRAYFORMULA(IF(G1:G="",,IF(DASHBOARD!N2<>"",
IFNA(G1:G/DASHBOARD!$P$2-VLOOKUP(E1:E1000, DASHBOARD!N1:O100, 2, 0),
G1:G/DASHBOARD!$M$3))))

Excel- Extract Number from Cell

I have multiple cells that I am attempting to extract a number from, and need help finding a regex alternative.
The cells range in the following formats:
asdfs. Seat#29 asfddsa
asdfsa. Seat#5d
asdfasN/A . Seat#22 as789fsd
Seat#111 words33
The closest that I came to a solution is:
=IFERROR(TRIM(MID([#DisplayName],FIND("#",[#DisplayName])+1,3)),"")
As you can see this will extract most of the numbers but for some it leaves a character at the end.
The only commonality is the # preceding the seat number. I am trying to extract only the seat number, no other numbers.
I cannot use VBA, this must be done using formulas. I have figured this out once before but stupidly pasted over the formulas with a values only paste.
This can be done utilizing a flash fill, but I was hoping for a more stable formula.
If you want just the numbers then use:
=--MID(A1,FIND("#",A1)+1,AGGREGATE(15,6,ROW(1:5)/(ISERROR(--MID(REPLACE(A1,1,FIND("#",A1),""),ROW(1:5),1))),1)-1)
If you want the letter also then:
=MID(A1,FIND("#",A1)+1,FIND(" ",REPLACE(A1,1,FIND("#",A1),""))-1)
If you do not need the letter following the seat number, you can use
.*#(\d+)
Edit for clarity: Excel does not have regex functions built in. You will either have to use a UDF (I can help with that if you'd like) or use a non-regex solution.
Here is a solution without VBA to extract all numbers inside the strings.
https://drive.google.com/open?id=1Fk6VFznD3i8s6scADy_vXCEj-1zQpBPW
Sheet #3

Google Sheets Pattern Matching/RegEx for COUNTIF

The documentation for pattern matching for Google Sheets has not been helpful. I've been reading and searching for a while now and can't find this particular issue. Maybe I'm having a hard time finding the correct terms to search for but here is the problem:
I have several numbers (part numbers) that follow this format: ##-####
Categories can be defined by the part numbers, i.e. 50-03## would be one product category, and the remaining 2 digits are specific for a model.
I've been trying to run this:
=countif(E9:E13,"50-03[123][012]*")
(E9:E13 contains the part number formatted as text. If I format it any other way, the values show up screwed up because Google Sheets thinks I'm writing a date or trying to do arithmetic.)
This returns 0 every time, unless I were to change to:
=countif(E9:E13,"50-03*")
So it seems like wildcards work, but pattern matching does not?
As you identified and Wiktor mentioned COUNTIF only supports wildcards.
There are many ways to do what you want though, to name but 2
=ArrayFormula(SUM(--REGEXMATCH(E9:E13, "50-03[123][012]*")))
=COUNTA(FILTER(E9:E13, REGEXMATCH(E9:E13, "50-03[123][012]*")))
This is a really big hammer for a problem like yours, but you can use QUERY to do something like this:
=QUERY(E9:E13, "select count(E) where E matches '50-03[123][012]' label count(E) ''")
The label bit is to prevent QUERY from adding an automatic header to the count() column.
The nice thing about this approach is that you can pull in other columns, too. Say that over in column H, you have a number of orders for each part. Then, you can take two cells and show both the count of parts and the sum of orders:
=QUERY(E9:H13, "select count(E), sum(H) where E matches '50-03[123][012]' label count(E) '', sum(H) ''")
I routinely find this question on $searchEngine and fail to notice that I linked another question with a similar problem and other relevant answers.