Excel- Extract Number from Cell - regex

I have multiple cells that I am attempting to extract a number from, and need help finding a regex alternative.
The cells range in the following formats:
asdfs. Seat#29 asfddsa
asdfsa. Seat#5d
asdfasN/A . Seat#22 as789fsd
Seat#111 words33
The closest that I came to a solution is:
=IFERROR(TRIM(MID([#DisplayName],FIND("#",[#DisplayName])+1,3)),"")
As you can see this will extract most of the numbers but for some it leaves a character at the end.
The only commonality is the # preceding the seat number. I am trying to extract only the seat number, no other numbers.
I cannot use VBA; this must be done using formulas. I figured this out once before but stupidly pasted over the formulas with a values-only paste.
This can be done utilizing a flash fill, but I was hoping for a more stable formula.

If you want just the numbers then use:
=--MID(A1,FIND("#",A1)+1,AGGREGATE(15,6,ROW(1:5)/(ISERROR(--MID(REPLACE(A1,1,FIND("#",A1),""),ROW(1:5),1))),1)-1)
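If you want the letter also then (a space is appended so that FIND still succeeds when nothing follows the seat number, as in "Seat#5d"):
=MID(A1,FIND("#",A1)+1,FIND(" ",REPLACE(A1,1,FIND("#",A1),"")&" ")-1)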

If you do not need the letter following the seat number, you can use
.*#(\d+)
Edit for clarity: Excel does not have regex functions built in. You will either have to use a UDF (I can help with that if you'd like) or use a non-regex solution.
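For readers who want to check the pattern outside Excel, here is a minimal Python sketch, assuming the four sample cells from the question. It uses `#(\d+)` with `re.search` instead of the `.*#(\d+)` form above, so it picks the digits after the first "#" that is followed by a digit; the helper name `seat_number` is made up for illustration.

```python
import re

# Sample cell values copied from the question.
CELLS = [
    "asdfs. Seat#29 asfddsa",
    "asdfsa. Seat#5d",
    "asdfasN/A . Seat#22 as789fsd",
    "Seat#111 words33",
]


def seat_number(text):
    """Return the digits directly after the first '#' that has digits, or None."""
    match = re.search(r"#(\d+)", text)
    return int(match.group(1)) if match else None


seats = [seat_number(cell) for cell in CELLS]
print(seats)  # [29, 5, 22, 111]
```

Trailing letters such as the "d" in "Seat#5d" and unrelated numbers such as "789" or "33" are ignored, because the capture group stops at the first non-digit.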

Here is a solution without VBA to extract all numbers inside the strings.
https://drive.google.com/open?id=1Fk6VFznD3i8s6scADy_vXCEj-1zQpBPW
Sheet #3

Related

Regular Expression - Extract Words and number

So I'm using Regexextract in Google Sheets to find a value in a large amount of data. I have 2 problems where I don't know how to extract the value or what I did wrong. Feel free to point out my mistakes and help me with a solution.
Requirement: I need to extract the part number, whose format is ABCD#### or ABCD-####, i.e. uppercase letters followed by digits, with or without "-", for example KTA1763 or SPD-4124.
I use this formula: =Regexextract(A1,"([A-Z]+-?[0-9]*)"). FYI, the value I'm extracting could appear at the beginning, in the middle, or at the end.
1. First problem: I have the value below:
REACH TECH 223/224 list document for KTD2026BEWE-TR
=> Extract result : REACH
[What I need: KTD2026]
2. Second problem: I have the value below:
information for Part number KTA1550EDS-TR
=> Extract result: P
[What I need: KTA1550]
Please let me know which part in the formula should I fix to have the final expected result. Or how should I alter my formula for that matter, big thanks
go for:
=INDEX(IFNA(REGEXEXTRACT(A1:A, "[A-Z]+\d+")))
Try this in one cell.
=ArrayFormula(IF(A2:A="",,REGEXEXTRACT(REGEXEXTRACT(A2:A, ".+"&REGEXEXTRACT(A2:A, "[0-9]+")), ".+\s(.+)")))
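To see why the formula in the question returns "REACH" and "P", here is a small Python sketch that runs the question's pattern next to a stricter variant. The stricter pattern `[A-Z]+-?[0-9]+` is an assumption that combines the question's optional "-" with the requirement of at least one digit; the names `ORIGINAL`, `STRICTER` and `first_match` are hypothetical.

```python
import re

ORIGINAL = re.compile(r"[A-Z]+-?[0-9]*")  # pattern from the question: digits optional
STRICTER = re.compile(r"[A-Z]+-?[0-9]+")  # assumed fix: at least one digit required


def first_match(pattern, text):
    """Return the first substring matched by pattern, or None."""
    match = pattern.search(text)
    return match.group(0) if match else None


row1 = "REACH TECH 223/224 list document for KTD2026BEWE-TR"
row2 = "information for Part number KTA1550EDS-TR"

print(first_match(ORIGINAL, row1), first_match(STRICTER, row1))  # REACH KTD2026
print(first_match(ORIGINAL, row2), first_match(STRICTER, row2))  # P KTA1550
```

Because `[0-9]*` also accepts zero digits, the first run of capital letters wins. Requiring one or more digits forces the match onto the part number.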

How to filter by Regex in LibreOffice?

I've got this string:
{"success":true,"lowest_price":"1,49€","volume":"1,132","median_price":"1,49€"}
Now I want the value for median_price to be displayed in a cell. How can I achieve this with regex?
With regex101.com I came to this solution:
(?<=median_price":")\d{0,4},\d{2}€
But this one does not seem to work in LibreOffice Calc.
I'd advise discarding the Euro symbol first, since you'd probably want to retrieve a numeric value to calculate with. Therefore try:
Formula in B1:
=--REGEX(A1;".*median_price"":""(\d+(?:,\d+)?)€.*";"$1")
The double unary will transform the result from the 1st capture group into a number. I then went ahead and formatted the cell to display currency (Ctrl+Shift+4).
Note: I went with a slightly different regular expression. But go with whatever works for your data, I suppose.
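The same capture-group idea can be tried outside Calc. This is a minimal Python sketch, assuming the string from the question and a comma as the decimal separator; the function name `median_price` is made up for illustration.

```python
import re

# String copied from the question.
RAW = '{"success":true,"lowest_price":"1,49€","volume":"1,132","median_price":"1,49€"}'


def median_price(text):
    """Return the median_price value as a float, or None if it is absent."""
    match = re.search(r'median_price":"(\d+(?:,\d+)?)€', text)
    if match is None:
        return None
    # The source uses a decimal comma, so swap it for a point before converting.
    return float(match.group(1).replace(",", "."))


print(median_price(RAW))  # 1.49
```

As in the formula, the Euro sign is left outside the capture group so the captured text can be converted to a number.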

Arrayformula to check if column contains text and pull the number next to it. Google Sheets

In desperate need of some assistance with this!
Wasn't sure how to title this question...
In SupportingSheet!H1 I have the following formula:
=ArrayFormula(if(G1:G<>"", IF(DASHBOARD!N2<>"", G1:G/DASHBOARD!$P$2-filter(DASHBOARD!O1:O100,REGEXMATCH(DASHBOARD!N1:N100,E1:E100)),G1:G/(DASHBOARD!$M$3)),))
The part I struggle with is:
G1:G/DASHBOARD!$P$2-filter(DASHBOARD!O1:O100,REGEXMATCH(DASHBOARD!N1:N100,E1:E100))
It needs to divide two numbers and then subtract another number. I can't seem to get this formula to pull the correct number.
It needs to check if the text in E1:E100 exist in DASHBOARD!N1:N100, if yes, pull the number from DASHBOARD!O1:O100.
For example, text in SupportingSheet!E1 can be found in DASHBOARD!N2, hence it needs to pull the number from DASHBOARD!O2.
Column SupportingSheet!J has the actual end result that a formula needs to produce.
It doesn't look like Regexmatch works as an Arrayformula and I am not sure how to go about it.
Please note that the text in SupportingSheet!E1:E is not always identical. Often it will have a random number of spaces at the end (long story...). That is why Regexmatch seemed a perfect option until I realised it didn't work.
Please let me know if further clarification is needed.
Below is an image of the random spaces (non-printable characters) at the end.
use:
=ARRAYFORMULA(IF(G1:G="",,IF(DASHBOARD!N2<>"",
IFNA(G1:G/DASHBOARD!$P$2-VLOOKUP(E1:E1000, DASHBOARD!N1:O100, 2, 0),
G1:G/DASHBOARD!$M$3))))
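The lookup logic can be sketched in Python to make the intent explicit. This is only an illustration under assumptions: the data values are invented, the names `adjusted`, `divisor` and `fallback_divisor` are hypothetical stand-ins for DASHBOARD!$P$2 and DASHBOARD!$M$3, and trailing whitespace is stripped from the keys before the lookup, which the formula above does not do by itself.

```python
def adjusted(values, keys, lookup, divisor, fallback_divisor):
    """Compute value / divisor - lookup[key] per row.

    Keys are compared with surrounding whitespace removed. Rows whose key
    is not found fall back to value / fallback_divisor.
    """
    cleaned = {name.strip(): number for name, number in lookup.items()}
    results = []
    for value, key in zip(values, keys):
        number = cleaned.get(key.strip())
        if number is None:
            results.append(value / fallback_divisor)
        else:
            results.append(value / divisor - number)
    return results


# Invented example data: the first key carries trailing spaces.
rows = adjusted(
    values=[10.0, 8.0],
    keys=["alpha  ", "gamma"],
    lookup={"alpha": 1.0, "beta": 2.0},
    divisor=2.0,
    fallback_divisor=4.0,
)
print(rows)  # [4.0, 2.0]
```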

Possible combination (variations) of words in a string variable in stata

I have a string variable containing school names, and I need to find all the possible variations of each word in this string variable in Stata.
For example, variations of the word "Academy" would be:
Academy,
Academy,
acdamey,
aacdemy,
dmcaamy,
aacedmy,
and so on.
I need this to standardize the raw data of school names, which has many typos of each word due to data entry issues, like the ones given above for "academy".
Depending on whether your data is already in Excel sheets or in a file, you can either use regex to try to match all possible combinations (and probably fix them when found) or parse the strings first before bringing them into Excel. In either case you could make a file (or Excel list/table/area/etc.) that includes all the common typos and use each typo as a regex match when comparing against your actual input.
Writing a regex that would actually find all possible cases is next to impossible, especially if there are very similar (but correct) school names. In any case, direct regexes would be very messy and complex, so I would advise you to parse the data by first finding the correct form, excluding it, and then using a (greedy) search/regex to find the misspelled versions. You can then save the typos to use them as a filter/match/pattern.
To get some starting ideas, check these links:
Regex: Search for verb roots
Read text file and extract string into Excel sheet using regex
P.S. You should keep a count of all strings/school names and finally get a list of all names that did not match the correct form or any of your regex filters, so you can manually insert or correct them.
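As an alternative to maintaining typo lists by hand, approximate string matching can map a misspelled word to its closest correct form. The sketch below is not the regex approach described above and is not Stata code; it uses Python's standard `difflib` module, and the word list, the 0.6 cutoff, and the function name `standardize` are all assumptions for illustration.

```python
import difflib

# Hypothetical list of correctly spelled words to standardize against.
CANONICAL = ["academy", "school", "college"]


def standardize(word, canonical=CANONICAL, cutoff=0.6):
    """Return the closest canonical word, or None if nothing is similar enough."""
    matches = difflib.get_close_matches(word.lower(), canonical, n=1, cutoff=cutoff)
    return matches[0] if matches else None


for raw in ["Academy", "acdamey", "aacdemy", "xyz"]:
    print(raw, "->", standardize(raw))
```

Words that return None can be collected for manual review, in line with the advice about keeping a list of unmatched names. Heavily scrambled words may fall below the cutoff, so the threshold would need tuning on real data.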

Check if cell contains numbers in Google Spreadsheet using RegExMatch

I want to check if a specific cell contains only numbers.
I know I should use RegExMatch but I get an error.
This is what I wrote : =if(RegExMatch(H2,[0-9]),"a","b")
I want it to say : write 'a' if H2 contains only numbers, 'b' otherwise.
Thank you
Try this:
=IF(ISNUMBER(H2),"A","B")
or
=if(isna(REGEXEXTRACT(text(H2,"#"),"\d+")),"b","a")
One reason your match isn't working is that it is interpreting your numbers as text. The ISNUMBER function is a bit more consistent, but if you really need to use regex, you can see in the second formula that I'm making sure the source value is matched as a string.
Your formula is right; you simply forgot the double quotes around the regular_expression argument of the REGEXMATCH function.
This is the right formula: =if(RegExMatch(B20,"[0-9]"),"a","b")
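Note that the pattern "[0-9]" matches any text that contains at least one digit, while the question asks for cells that contain only numbers. The difference is shown in this minimal Python sketch, where `re.fullmatch` plays the role of an anchored pattern such as "^[0-9]+$"; the function names are hypothetical.

```python
import re


def contains_digit(text):
    """True if at least one digit appears anywhere in text."""
    return re.search(r"[0-9]", text) is not None


def only_digits(text):
    """True if text is non-empty and consists of digits only."""
    return re.fullmatch(r"[0-9]+", text) is not None


for sample in ["12345", "abc1", "abc"]:
    print(sample, contains_digit(sample), only_digits(sample))
```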
=REGEXREPLACE("text","regex","replacement")
It spits out the entire content, but with the parts matched by the regular expression replaced. For example,
=REGEXREPLACE(A2,"[0-9]","a")
will fill a cell with the same text as A2, but with each digit 0-9 becoming an a!
=REGEXREPLACE(A2,"[^0-9]","b")
does the opposite: negation is written as ^ inside the brackets, not as !, so every character that is not a digit becomes a b.
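Replacement with a character class and with its negation can be tried in Python, where `re.sub` behaves much like REGEXREPLACE. This is a small sketch with made-up function names; the negated class is written `[^0-9]`.

```python
import re


def digits_to(text, replacement):
    """Replace every digit in text with replacement."""
    return re.sub(r"[0-9]", replacement, text)


def non_digits_to(text, replacement):
    """Replace every character that is not a digit with replacement."""
    return re.sub(r"[^0-9]", replacement, text)


print(digits_to("ab12", "a"))      # abaa
print(non_digits_to("ab12", "b"))  # bb12
```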