VLOOKUP missing entries? - vlookup

I have a decently sized spreadsheet of about 6 thousand entries and I am trying to match two columns to make sure they match up using vlookup and that is working fine but I have about 400 or so entries that don't have a match and looking through a few and just using control-f to see if they were somewhere with a different naming convention I found the a matching cell for several of the entries that supposedly don't have a match, any clue why this might be? I checked the length of the values in the cell to see if there was white spacing I was missing but they matched in length and the values in the cell are the exact same. This is the vlookup setup I am using
=VLOOKUP(A1,$B$1:$B$6074,1,0). The CSV is here if anyone wanted to see it. I am not saying they are all wrong but I looked at about 10 or so and they all had matches so I am a bit lost on this.

Related

Regexmatch for multiple words in Sheets

I'm trying to write a REGEXMATCH formula for Sheets that will analyze all of the text in a cell and then write a given keyword into another cell.
I've figured out how to do this for a single keyword: for example,
=IF(REGEXMATCH(F3, "czech"),"CZ",IF(REGEXMATCH(F3, "african"),"AF",IF(REGEXMATCH(F3, "mykonos"),"MK")))
What I'm having trouble with though is writing one of these values only if two or more terms are matched in the reference cell.
If I were trying to match one of two words, I realize I could use | as in:
=IF(REGEXMATCH(F3, "czech|coin"),"CZC"
etc
But in this instance I only want to produce CZC if the previous cell contains BOTH czech AND coin.
Can someone help me with this?
try like this:
=IF((REGEXMATCH(F3, "czech"))*(REGEXMATCH(F3, "coin")), "CZC", )
multiplication stands for AND

REGEXMATCH and MATCH don't work when a cell contains a number

I am trying to use formulas to find a row in my google spreadsheet document, however I have got a weird problem.
I am not able to find values when a cell contains a number (without any other characters).
Consider the following case
I have got two values
A1 - 32323232323
A2 - 323-23232-323
When I use the following formula
=FILTER(A:E,REGEXMATCH(B:B,"323-23232-323"))
It works fine, it successfully finds A2 value, however when I try to use the following formula
=FILTER(A:E,REGEXMATCH(B:B,"32323232323"))
It doesn't match any row, and I also tried the following formula
ADDRESS(MATCH("32323232323",B:B,0),1)
It doesn't work either, it only works when I remove quotes like that
ADDRESS(MATCH(32323232323,B:B,0),1)
But this doesn't work with REGEXMATCH.
Is there any way I can match numbers using a regex expression (exact number, without wildcards) ?
Thanks
=FILTER(A:A,REGEXMATCH(REGEXREPLACE(TO_TEXT(A:A),"-",""), "32323232323"))
to get both 323-23232-323 and 32323232323.
=FILTER(A:E,REGEXMATCH(TO_TEXT(B:B),"32323232323"))
to get number 32323232323.
Notes:
Converting to_text is a key here.
Change columns to yours.

Google Sheets Pattern Matching/RegEx for COUNTIF

The documentation for pattern matching for Google Sheets has not been helpful. I've been reading and searching for a while now and can't find this particular issue. Maybe I'm having a hard time finding the correct terms to search for but here is the problem:
I have several numbers (part numbers) that follow this format: ##-####
Categories can be defined by the part numbers, i.e. 50-03## would be one product category, and the remaining 2 digits are specific for a model.
I've been trying to run this:
=countif(E9:E13,"50-03[123][012]*")
(E9:E13 contains the part number formatted as text. If I format it any other way, the values show up screwed up because Google Sheets thinks I'm writing a date or trying to do arithmetic.)
This returns 0 every time, unless I were to change to:
=countif(E9:E13,"50-03*")
So it seems like wildcards work, but pattern matching does not?
As you identified and Wiktor mentioned COUNTIF only supports wildcards.
There are many ways to do what you want though, to name but 2
=ArrayFormula(SUM(--REGEXMATCH(E9:E13, "50-03[123][012]*")))
=COUNTA(FILTER(E9:E13, REGEXMATCH(E9:E13, "50-03[123][012]*")))
This is a really big hammer for a problem like yours, but you can use QUERY to do something like this:
=QUERY(E9:E13, "select count(E) where E matches '50-03[123][012]' label count(E) ''")
The label bit is to prevent QUERY from adding an automatic header to the count() column.
The nice thing about this approach is that you can pull in other columns, too. Say that over in column H, you have a number of orders for each part. Then, you can take two cells and show both the count of parts and the sum of orders:
=QUERY(E9:H13, "select count(E), sum(H) where E matches '50-03[123][012]' label count(E) '', sum(H) ''")
I routinely find this question on $searchEngine and fail to notice that I linked another question with a similar problem and other relevant answers.

Conditional Vlook up without using VBA

I want to convert an input to desired output. Kindly help.
In the output - the columns value should start from most recent (year)
Please click this to see data
Unfortunately VLOOKUP is not able to fulfill that ask. However the INDEX-function can.
Here is a good read on how to use it:
http://fiveminutelessons.com/learn-microsoft-excel/use-index-lookup-multiple-values-list
This will work for you spreedsheet, if your input table starts at A1 without a header and your output table starts at H3 with the first ID.
You get this by copy&pasting the first column of your input table to column H and then remove duplicates.
{=IF(ISERROR(INDEX($A$1:$C$7,SMALL(IF($A$1:$A$7=$H$3,ROW($A$1:$A$7)),ROW(1:1)),3)),"",
INDEX($A$1:$C$7;SMALL(IF($A$1:$A$7=$H$3,ROW($A$1:$A$7)),ROW(1:1)),3))}
Let's look at the formula step by step:
The curly brackets tell excel that this is an array formula, the interesting part for you is: when you've inserted the formula (without curly brackets) press shift+ctrl+enter, excel will then know that this is an array formula.
'error at formula?, then blank, else formula
=IF(ISERROR(....),"",...)
When you autofill this formula you probably dont know how many instances of your lookup variable are. So when you put this formula in 4 cells, but there are only 3 entries, this bit will keep the cell blank instead of giving an error.
INDEX($A$1:$C$7,SMALL(IF($A$1:$A$7=$H$3,ROW($A$1:$A$7)),ROW(1:1)),3))
$A$1:$C$7 is your data matrix. Your IDs (in your case 125 and 501) are to be found in $A$1:$A$7. ROW(1:1) is the absolute(!) rowID, 3 the absolute(!) column id. So when you move your input table those values have to be changed.
What exactly SMALL and INDEX do are well described in the link above. (Or at least better than I could.)
Hope that clarified some parts,
Tom

Delimiting columns very specifically

I've got a column (with many thousands of rows) which I'd like to delimit into multiple rows. I have some experience using regular expressions in Excel, and I have some experience using delimiters in excel, but this one is just a tad too hard..
Let me give you three example-lines:
- 23-12-05: For sale for 2000. 2010-09-09: Not found
- 25-11-09: For sale for 3400. Last date found: 2010-07-08
- 18-06-08: For sale for 5500. 21-07-09: Changed from 5500 to 4900. 16-09-09: Jumped from 4900 to 4700. 2010-02-04: Not found
Most other lines follow these structures. How can I create a new column based on just the first symbols before [COLON]; A second column based on the symbols between the first [COLON] and the first [DOT]. How can I continue to the last IF the text LAST DATE is not found? Finally: How can I use regex (or another way) to use the text 'NOT FOUND' to paste the last date into a new column?
Trust me, I have been at this for quite some time now (sigh). Any help is much appreciated!
you can actually use formulas for this.
Assuming the text is in A1,
B1: =LEFT(A1,FIND(":",A1)-1)
C1: =MID(A1,FIND(":",A1)+1,FIND(".",A1,FIND(":",A1))-FIND(":",A1))
D1: =MID(A1,FIND(".",A1,FIND(":",A1))+1,LEN(A1))
E1: =MID(A1,FIND("Not found",A1)-12,10)
(I'm assuming the date format does not change for the E1)
By the way, this also works for me, to get the last date in a cell:
=LOOKUP(9999999999999999,FIND("**-**-**",A1,ROW($1:$1024)))
Only problem here is: I haven't the slightest clue what exactly I am doing here.
For example, I'd like to use the same code to find the FIRST occurence of a date.
Can anyone explain this code to me? Why am I searching for a very high number? What is it in this code that makes that I find the last occurence? What does it mean that the 'starting number' is "row(1:1024)"?
Anybody knows?