VLOOKUP dilemma: multiple matches

Here is my dilemma: I have a list of colleagues who are getting bonuses, and I have to match them with their addresses on another tab. The second tab is a list of all employees and their addresses. I did a VLOOKUP, but I just realized that there are a few employees with the same last name! How can I match each person (last name is in one column, first name in another) to the CORRECT address in the range I named on the second tab?

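You could create a new helper column in both tabs that concatenates first name and last name, then run the VLOOKUP on that combined key. The helper column has to be the first column of the lookup range, since VLOOKUP only searches the first column.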
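The same composite-key idea, sketched in Python for illustration. The names, the addresses, and the helpers `make_key` and `match_addresses` are made up for this example and are not from the original sheet.

```python
def make_key(last_name, first_name):
    """Build the combined lookup key, ignoring case and stray spaces."""
    return (last_name.strip().lower(), first_name.strip().lower())


def match_addresses(bonus_list, employees):
    """Return one address per bonus recipient, or None when nothing matches."""
    index = {make_key(last, first): address for last, first, address in employees}
    return [index.get(make_key(last, first)) for last, first in bonus_list]


# Hypothetical data: two employees share the last name "Smith".
employees = [
    ("Smith", "Anna", "1 Elm Street"),
    ("Smith", "Bob", "9 Oak Avenue"),
    ("Jones", "Cara", "4 Pine Road"),
]
bonus_list = [("Smith", "Bob"), ("Jones", "Cara")]
addresses = match_addresses(bonus_list, employees)
```

Because the key is the (last name, first name) pair, the two Smiths no longer collide. Two employees with identical first and last names would still need an extra field such as an employee ID.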

Related

Is there an error-proof way in google sheets to extract House numbers from address cell (street + house number) into another cell

In Sheet1!AK2:AK I have addresses in the following formats:
rotenkamper weg, 323, Kirchstieg 2345, Im Schleedörn 20b
I need the street names to export into Sheet2!C3:C, i.e:
rotenkamper weg, Kirchenstieg, Im Schleedörn
The House numbers have to go into Sheet2!D3:D.
I have researched and tried for hours but couldn't find a solution that fetches house numbers that include a letter (e.g. 20b) or that are a range (e.g. 24-27).
I also have a lot of trouble getting it to work when the street name consists of two or more words.
Does anyone know an elegant solution for this?
Any help would be much appreciated. This will save me weeks of data entry work.
Try this in Sheet2!C3:
=ARRAYFORMULA(
{
REGEXREPLACE(REGEXREPLACE(Sheet1!AK2:AK, "\s+\S*\d\S*\b", ""), ",+", ","),
IFNA(REGEXEXTRACT(Sheet1!AK2:AK, "\S+$"))
}
)
Explanation:
REGEXREPLACE(Sheet1!AK2:AK, "\s+\S*\d\S*\b", "") removes any "word" that has a digit in it. All of these (323, 2345, 20b) will be gone.
REGEXREPLACE(..., ",+", ",") collapses any run of consecutive commas that may be left over after the first step. The result is the value for the first column.
IFNA(REGEXEXTRACT(Sheet1!AK2:AK, "\S+$")) takes whatever is at the end of the address string, from the last space to the end. The result is the value for the second column.
{value_for_the_first_column, value_for_the_second_column} placed in cell C3 will populate C3 with value_for_the_first_column and D3 with value_for_the_second_column.
ARRAYFORMULA will do all of the above for every row.
Regex pattern could be refined if you provide more than one example of the address.
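To try the two patterns outside the sheet, here is a rough Python equivalent. It is only a sketch: `split_address` is a made-up helper, and Python's `re` module stands in for the RE2 engine that Google Sheets uses, which behaves the same for these simple patterns as far as I can tell.

```python
import re


def split_address(address):
    """Return (street, house_number) using the same two patterns as the formula."""
    # Drop every whitespace-separated "word" that contains a digit.
    street = re.sub(r"\s+\S*\d\S*\b", "", address)
    # Collapse runs of commas left behind by the removal.
    street = re.sub(r",+", ",", street)
    # The house number is whatever follows the last space.
    match = re.search(r"\S+$", address)
    number = match.group(0) if match else ""
    return street, number


print(split_address("Im Schleedörn 20b"))
print(split_address("Kirchstieg 2345"))
```

Like the formula, this takes the last token as the house number even when that token has no digit in it, so addresses without a number would need an extra check.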

Google Sheets ArrayFormula to get INITIALS of arbitrary length name

Sample sheet.
As the title says, given a column containing an arbitrary number of words of arbitrary length, I want a single ArrayFormula that returns the first letters of all words in that column.
I have tried two methods, both shown in the sample sheet.
1) Using SPLIT and ARRAYFORMULA, I can get the initials in one cell but cannot extend the formula down the column.
2) Using two REGEXEXTRACT calls, I can get the first two initials and extend down the column.
But is it possible to handle an arbitrary number of words for the whole column with a single ArrayFormula?
Is it possible to use REGEXEXTRACT to return the first letters of many words?
This replaces every word with its captured first letter:
=ARRAYFORMULA(UPPER(REGEXREPLACE(A1:A6,"(\w)\S*\s?","$1")))
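For anyone who wants to check the pattern outside Sheets, the same substitution can be sketched in Python. The helper name `initials` and the sample names are only illustrative; note that Python writes the backreference as `\1` where Sheets uses `$1`.

```python
import re


def initials(name):
    """Replace each word (plus one trailing space) with its first character."""
    return re.sub(r"(\w)\S*\s?", r"\1", name).upper()


print(initials("john ronald reuel tolkien"))
print(initials("ada lovelace"))
```

The pattern consumes at most one whitespace character after each word, so names separated by more than one space would keep the extra spaces in the result.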

PostgreSQL - finding string using regular expression

What I am looking to do is to, within Postgres, search a column for a string (an account number). I have a log table, which has a parameters column that takes in parameters from the application. It is a paragraph of text and one of the parameters stored in the column is the account number.
The position of the account number is not consistent in the text and some rows in this table have nothing in the column (since no parameters are passed on certain screens). The account number has the following format: L1234567899. So for the account number, the first character is a letter and then it is followed by ten digits.
I am looking for a way to extract the account number alone from this column so I can use it in a view for a report.
So far what I have tried is getting it into an array, but since the position changes, I cannot count on it being in the same place.
select foo from regexp_split_to_array(
(select param from log_table where id = 9088), E'\\s+') as foo
You can use regexp_match() to achieve that result.
(regexp_match(foo,'[A-Z][0-9]{10}'))[1]
DBFiddle
Use substring to pull out the match group.
select substring ('column text' from '[A-Z]\d{10}')
Reference: PostgreSQL regular expression capture group in select
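The pattern itself can be checked quickly outside the database. The sketch below uses Python's `re` module as a stand-in for the Postgres regex engine; `extract_account` and the sample parameter strings are invented for the example.

```python
import re

# One capital letter followed by exactly ten digits, as described in the question.
ACCOUNT = re.compile(r"[A-Z][0-9]{10}")


def extract_account(param):
    """Return the first account number in the text, or None if there is none."""
    if not param:
        return None  # mirrors rows where the parameters column is empty
    match = ACCOUNT.search(param)
    return match.group(0) if match else None


print(extract_account("screen=7 acct L1234567899 user=42"))
print(extract_account("no account on this screen"))
```

As with both SQL answers, a letter followed by more than ten digits would still match on its first ten digits, so a stricter pattern may be needed if such values can occur.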

Remove duplicates and Keep related data Calc (Excel)

I have a list of products in Calc (Excel), each with an associated IP address. Many of the names have multiple IP addresses, but each IP address is on its own row. I am trying to remove the duplicate names and pull all of the IP addresses under a single name. I have tried nslookup and INDEX/MATCH, but they do not deal well with multiple outputs. Right now it looks like this:
a| 1
a| 2
a| 3
b| 1
b| 2
b| 3
etc...
I would like it to look like this
a 1,2,3
b 1,2,3
Is there any way to do this without wasting a ton of time? I have a few ways that work, but they would take me forever to set up.
I recommend setting up your formulas in multiple "helper" cells before getting to the final "result cell". This breaks down the problem into smaller steps that are more easily formulated and, if needed in the future, updated. Once the setup is complete you can hide the helper columns by right-clicking on the column letter and choosing "Hide".
The first column to set up is the list of distinct product names. For the formula below to work, the product/IP list must be sorted in ascending order. If it is not already sorted, highlight the entire list, including headers, then choose Data→Sort, select sort by "Product", make sure the "Ascending" radio button is selected, and press OK.
For purposes of this example, I'll assume product names are in column A, starting on row 2 and IPs are in column B starting on row 2 (with row 1 being the header labels). In the column where you want to list the distinct product names (I used column D), enter in the top cell =A2. In the cell below enter
=INDEX($A$2:$A$13;MATCH(D2;$A$2:$A$13;1)+1)
The match formula has a 1 as the third variable, meaning the range is sorted ascending and MATCH will return the position of the last matching cell. We add 1 to the position of the last matching cell, and this will be the position of the first cell with a new product name. That position is fed into the INDEX function to show the next product name.
Copy and paste that cell down as far as you need to show all the product names.
Now we'll set up a series of cells to display each IP address. I used columns F to I to show up to 4 addresses:
=IF(MATCH(D2;$A$2:$A$13;0)<=MATCH($D2;$A$2:$A$13;1);INDEX($B$2:$B$13;MATCH($D2;$A$2:$A$13;0));"")
=IF(MATCH(D2;$A$2:$A$13;0)+1<=MATCH(D2;$A$2:$A$13;1);INDEX($B$2:$B$13;MATCH(D2;$A$2:$A$13;0)+1);"")
=IF(MATCH(D2;$A$2:$A$13;0)+2<=MATCH(D2;$A$2:$A$13;1);INDEX($B$2:$B$13;MATCH(D2;$A$2:$A$13;0)+2);"")
=IF(MATCH(D2;$A$2:$A$13;0)+3<=MATCH(D2;$A$2:$A$13;1);INDEX($B$2:$B$13;MATCH(D2;$A$2:$A$13;0)+3);"")
MATCH with the third variable of 1 returns the position of the last matching cell; MATCH with the third variable of 0 returns the position of the first matching cell.
The IF statement checks if the position of the first matching cell (in the first lookup column) or the cell below that (in the second lookup column) or the cell two below the first match (in the third lookup column), etc. is less than or equal to the position of the last matching cell. If yes, then it looks up the relevant IP address. If no, it displays a blank.
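The first-match/last-match logic can be sketched in Python with the `bisect` module, which, like MATCH with a third argument of 1, relies on the list being sorted. The sample data and the helper names `first_pos`, `last_pos` and `nth_ip` are assumptions made for this illustration only.

```python
from bisect import bisect_left, bisect_right

# Hypothetical sorted product column and the IP column next to it.
products = ["a", "a", "a", "b", "b", "b"]
ips = [1, 2, 3, 1, 2, 3]


def first_pos(name):
    """1-based position of the first match, like MATCH(name; range; 0)."""
    return bisect_left(products, name) + 1


def last_pos(name):
    """1-based position of the last match, like MATCH(name; range; 1) on sorted data."""
    return bisect_right(products, name)


def nth_ip(name, offset):
    """IP at first match + offset, or "" once we run past the last match."""
    pos = first_pos(name) + offset
    return ips[pos - 1] if pos <= last_pos(name) else ""


print([nth_ip("b", k) for k in range(4)])
```

Each value of `offset` corresponds to one helper column: 0 for the first IP column, 1 for the second, and so on.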
In the formulas above you would need to manually enter the formula in the top row of each column. If you have some products with a large number of IP addresses, you may want to set up the formula so you can copy and paste between columns as well as down the rows. This would work if you were starting in column F:
=IF(MATCH($D2;$A$2:$A$13;0)+COLUMN()-6<=MATCH($D2;$A$2:$A$13;1);INDEX($B$2:$B$13;MATCH($D2;$A$2:$A$13;0)+COLUMN()-6);"")
Once you have your top row set up as you want, copy and paste down however many rows you need.
If you want to combine all the IPs into a single cell separated by commas, you can use a formula like this:
=CONCATENATE(F2;IF(G2<>"";","&G2;"");IF(H2<>"";","&H2;"");IF(I2<>"";","&I2;""))
Each IF statement will add a comma separator followed by the cell contents if the checked cell is not empty, otherwise it returns a blank string. You will need to manually adjust to add additional IF statements for however many maximum columns you want to concatenate. Again, once you have the top row set up, copy and paste down however far you need.
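If exporting the list to a script is an option, the whole helper-column setup collapses into a group-and-join. This is only a sketch in Python under that assumption; `combine_ips` and the sample rows are invented for the example.

```python
from itertools import groupby
from operator import itemgetter


def combine_ips(rows):
    """Map each product name to its IPs joined by commas."""
    # groupby only merges adjacent rows, so sort by name first
    # (sorted() is stable, so the IP order within a name is kept).
    ordered = sorted(rows, key=itemgetter(0))
    return {
        name: ",".join(str(ip) for _, ip in group)
        for name, group in groupby(ordered, key=itemgetter(0))
    }


rows = [("a", 1), ("a", 2), ("a", 3), ("b", 1), ("b", 2), ("b", 3)]
print(combine_ips(rows))
```

Unlike the CONCATENATE formula, this does not need a fixed maximum number of IPs per product.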
Assuming you have two columns (A and B), that these are labelled and sorted as shown, then enter in C2:
=IF(A1<>A2;B2;C1&","&B2)
and in D1:
=A1<>A2
Copy both down to suit. Select column C and Copy, then Paste Special... with every option under Selection ticked except Paste all and Formulas, and click OK.
Select columns A:D, choose Data > Filter > AutoFilter, click Yes, select 1 for column D, and then select the whole visible range.
Copy and paste into a new sheet, move B1 to C1 and delete columns B and D.
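The running-concatenation trick can be traced in a few lines of Python. This is an illustrative sketch, not part of the spreadsheet steps; `running_concat` and the sample rows are assumptions, and the input is expected to be sorted by name just like the sheet.

```python
def running_concat(rows):
    """Accumulate IPs per name, then keep only the last row of each name."""
    accumulated = []
    previous_name, text = None, ""
    for name, ip in rows:
        # Same idea as =IF(A1<>A2;B2;C1&","&B2): restart on a new name, else append.
        text = str(ip) if name != previous_name else text + "," + str(ip)
        accumulated.append((name, text))
        previous_name = name
    # Same idea as the =A1<>A2 flag: a row is kept when the next name differs.
    return [
        (name, text)
        for i, (name, text) in enumerate(accumulated)
        if i + 1 == len(accumulated) or accumulated[i + 1][0] != name
    ]


rows = [("a", 1), ("a", 2), ("a", 3), ("b", 1), ("b", 2), ("b", 3)]
print(running_concat(rows))
```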

SQLite: How to split a column

I have a column containing two names, which I'd like to extract into two separate columns surname1 and surname2 (I don't need the name nor the initial letter (e.g. N.)).
The exemplary content of that column is:
AwyeEaef2012 MS101 N.Lopez-O.Lorenzi.txt
Lopez and Lorenzi are the two names we are looking for in this row.
What is good about my situation is that the first name always comes after the first dot (.) and ends just before the dash (-), and the second name comes just after the second dot and ends just before the third dot and txt (.txt).
I know how to write a regex and, using LIKE, check whether that column contains some specific surname, but not the other way round: how to read the surnames and write them into two new columns.
Several rows from that column look like below:
WyeEaef MN2014 MS401 N.Lopez-O.Lorenzi.txt
AwyufEQ WCH2014 OS401 N.Lorenzi-O.Lopez.txt
THAFa5u WCH2014 LS107 N.Larry-O.Lolly.txt
So the pattern is, as I mentioned, *.Name1-[A-Z].Name2.txt
where * is at most 30 characters of upper- and lower-case letters and digits.
In other words, it could be approached like this: split the string into substrings at the dots. The first substring is discarded; the second, without its last two characters (a dash and a capital letter, e.g. -O), is the first name; the third substring is the second name; and the fourth (the former file extension) is discarded as well.
I'd like to have an output of three columns:
initialColumn, firstName, secondName
Here is a workaround that I wrote as a formula in Excel. I don't love it personally, but it might be useful to someone in the future. The formula below extracts the first name:
=MID(A1;FIND(".";A1;1)+1;FIND(".";A1;FIND(".";A1;1)+1)-FIND(".";A1;1)-3)
I was surprised that Excel can process ~0.5 million records in the blink of an eye.
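For the SQLite side of the question, one possible approach is to chain the built-in `instr` and `substr` functions. The sketch below runs the query through Python's `sqlite3` module; the table name `files`, its column `name`, and the helper `split_surnames` are hypothetical, and the query assumes every row follows the pattern described above.

```python
import sqlite3

QUERY = """
WITH a AS (
    -- rest1: everything after the first dot, e.g. 'Lopez-O.Lorenzi.txt'
    SELECT name, substr(name, instr(name, '.') + 1) AS rest1 FROM files
),
b AS (
    -- rest2: everything after the second dot, e.g. 'Lorenzi.txt'
    SELECT name, rest1, substr(rest1, instr(rest1, '.') + 1) AS rest2 FROM a
)
SELECT name AS initialColumn,
       substr(rest1, 1, instr(rest1, '-') - 1) AS firstName,
       substr(rest2, 1, instr(rest2, '.') - 1) AS secondName
FROM b
"""


def split_surnames(names):
    """Load the strings into an in-memory table and return the three columns."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE files (name TEXT)")
    conn.executemany("INSERT INTO files VALUES (?)", [(n,) for n in names])
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


for row in split_surnames(["WyeEaef MN2014 MS401 N.Lopez-O.Lorenzi.txt"]):
    print(row)
```

A surname that itself contains a dash or a dot would break the split, so such rows would need separate handling.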