Remove duplicates from columns in Google Sheets - regex

https://docs.google.com/spreadsheets/d/1KGpWneb5yHWbsr7V20n9oB1aibF6BCB2ii7sMURPLPc/edit?usp=sharing
In this link I am trying to demonstrate what I want Google Sheets to do: remove duplicates from column cells, row by row.
I've tried removing duplicates from cells, and by row, but haven't had any luck with removing them by column.
This is a pretty big database (2k+ entries), so really hoping to find a clean solution by putting this question out to the community.

paste in G2:
={A3:A, ARRAYFORMULA(REGEXREPLACE(B3:E, A3:A, ))}

=ARRAYFORMULA(A3:A) in G3 (or =UNIQUE(A3:A) if what is in the red frame is not ok).
And this in H3:
=ARRAYFORMULA(IF(IFERROR(MATCH(B3:E, A3:A, 0), 0) <> 0, "", B3:E))
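To make the behaviour of the MATCH-based formula easier to reason about, here is a minimal Python sketch of the same idea. It is only a model, not something Sheets runs: the function names and the sample data are made up for illustration, "" stands for an empty cell, and Python compares text case-sensitively, which may differ from how Sheets compares it. The second function is a hypothetical stricter variant (it is not one of the formulas above) that only blanks a cell when it repeats the key of its own row.

```python
def blank_if_in_key_column(rows):
    """Blank every non-key cell whose value appears anywhere in the key column.

    Rough model of IF(IFERROR(MATCH(B3:E, A3:A, 0), 0) <> 0, "", B3:E).
    Each row is [key, cell, cell, ...]; "" stands for an empty cell.
    """
    keys = {row[0] for row in rows if row[0] != ""}
    return [[row[0]] + ["" if cell in keys else cell for cell in row[1:]]
            for row in rows]


def blank_if_equal_to_row_key(rows):
    """Blank only the cells that repeat the key of their own row."""
    return [[row[0]] + ["" if cell == row[0] else cell for cell in row[1:]]
            for row in rows]


# Made-up sample: column A holds the keys, the other columns may repeat them.
sample = [
    ["a", "a", "b", "c"],
    ["b", "x", "b", "a"],
]
```

The difference matters when a value from another row's key shows up in a cell: the first function blanks it, the second keeps it.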

Related

Google Sheets ARRAYFORMULA to skip blank rows

How can I make my ARRAYFORMULA(A1 + something else) stop producing results once there are no more values in column A, i.e. skip blank values? By default it endlessly outputs "something else".
Here is my demo sheet:
https://docs.google.com/spreadsheets/d/1AikL5xRMB94BKwG34Z_tEEiI07aUAmlbNzxGZF2VeYs/edit?usp=sharing
The actual data in column A changes regularly; rows are being added.
I tried the others and they didn't work. This does though:
=ARRAYFORMULA(filter(A1:B;A1:A<>"";B1:B<>""))
use:
=ARRAYFORMULA(IF(A1:A="";;A1:A+1000))
You can try this formula and see if it works for you: =ARRAYFORMULA(IF(ISBLANK(A1:A),"",(A1:A + B1:B)))
Reference:
https://support.google.com/docs/answer/3093290?hl=en
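The answers above take two different routes: FILTER drops incomplete rows, while IF leaves a blank in place so the output stays aligned with the input. A small Python sketch of that difference follows. It is a model under assumptions, not Sheets code: the function names are invented, "" stands for an empty cell, and the offset of 1000 is taken from the IF formula above.

```python
def filter_complete_rows(rows):
    """Keep only rows where both columns are filled (FILTER-style).

    The result is shorter than the input, so row positions shift.
    """
    return [row for row in rows if row[0] != "" and row[1] != ""]


def add_offset_skip_blanks(column, offset=1000):
    """Add an offset to filled cells and leave blanks blank (IF-style).

    The result has the same length as the input, so rows stay aligned.
    """
    return ["" if value == "" else value + offset for value in column]
```

Which one fits depends on whether the result needs to sit next to the source rows or can be a compacted list.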

How to collect data and headers for non blank cells in a row in Sheets

I cannot find a solution to my problem:
I have a sheet with ~290 rows and ~80 columns. The first row and column are fixed/header.
I would like to collect non-blank values and their header into column B.
I've tried to search for solutions, but I'm not that good at Excel, so I cannot wrap my head around most of the advice I've found.
In Google Sheets you could use an Array formula. I got this:
The formula I've used:
=ArrayFormula(CONCATENATE(IF(--(C2:G2<>"")*COLUMN($C$1:$G$1)<>0;$C$1:$G$1&" "&C2:G2;"")))
This is how it works:
1. --(C2:G2<>"") returns an array of 0s and 1s: 0 if the cell is blank, 1 if it is not.
2. COLUMN($C$1:$G$1) returns an array with the column number of each cell.
3. --(C2:G2<>"")*COLUMN($C$1:$G$1) multiplies both arrays, giving the column number for each non-blank cell and 0 for each blank cell.
4. IF(...<>0;$C$1:$G$1&" "&C2:G2;"") checks whether each number in the array from step 3 is 0. If it is 0, it returns an empty string; if not, it returns the header followed by a space and the cell's value.
5. CONCATENATE joins all values from the array in step 4, so the empty strings are joined with the header/value pairs of the non-blank cells.
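The steps above can be sketched for a single row in Python. This is only an illustration of the logic, not a replacement for the formula: the function name, the `sep` parameter, and the sample headers are hypothetical, and "" stands for an empty cell. With the default empty separator the pairs run together, which is how CONCATENATE behaves; passing a separator shows what a delimiter would change.

```python
def collect_headers_and_values(headers, row, sep=""):
    """Join "header value" pairs for the non-blank cells of one row.

    headers and row are parallel lists; blank cells ("") are skipped.
    sep="" mimics CONCATENATE, which adds no delimiter between pairs.
    """
    pairs = [f"{header} {value}"
             for header, value in zip(headers, row)
             if value != ""]
    return sep.join(pairs)


# Made-up example data.
headers = ["Mon", "Tue", "Wed"]
row = [3, "", 5]
print(collect_headers_and_values(headers, row))        # Mon 3Wed 5
print(collect_headers_and_values(headers, row, ", "))  # Mon 3, Wed 5
```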
Not sure if this will make the sheet load slower if you have too many records.
Hope this helps
Excel is not the same as Google Sheets.
=ARRAYFORMULA(TRIM(REGEXREPLACE(
TRANSPOSE(
QUERY(TRANSPOSE(IF(C2:F13<>"",C1:F1 & ", ","")),,99^99)
),
"((\s+)|(,\s*$))",
" "
)))
My sample
use:
=ARRAYFORMULA(REGEXREPLACE(TRIM(TRANSPOSE(QUERY(TRANSPOSE(
IF(C2:G<>"", C1:G1&" "&C2:G&",", )),,99^99))), ",$", ))

Combining two formulas ArrayFormula w/ if and TextJoin not working

I am trying to combine two formulas:
=TEXTJOIN("|", 1, AQ2, AR2)
If I drag this down, each row gets joined independently.
And
=ARRAYFORMULA({"AAA";IF(INDIRECT("Elements!D2:D")="Person","Yes", "No")})
I want to combine them:
=ARRAYFORMULA({"AAA";IF(INDIRECT("Elements!D2:D")="Person",TEXTJOIN("|", 1, AQ2, AR2), "No")})
But this only evaluates the first join at A2 and copies it down.
How do you combine the formulas so that each row gets joined independently, like the manually dragged-down version?
I have tried adding INDIRECT(AQ2:AQ) and INDIRECT(AR2:AR) in the TextJoin formula, but this does not work.
Google sheet
https://docs.google.com/spreadsheets/d/1uOpOi41kjVWIRO__0y7jg0JKrJNy04Kv1O9jxQWmKjo/edit?usp=sharing
try:
=ARRAYFORMULA({"AAA"; IF(B2:B="Person", C2:C&IF(D2:D="",,"|"&D2:D), "No")})
to remove those No's for blank rows use:
=ARRAYFORMULA({"AAA"; IF(B2:B="",,IF(B2:B="Person", C2:C&IF(D2:D="",,"|"&D2:D), "No"))})
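The per-row logic of the second formula can be modelled in Python as a quick sanity check. This is a sketch under assumptions: the function and parameter names are invented, "" stands for an empty cell, and the three arguments correspond to columns B, C and D of the formula above.

```python
def join_row(kind, first, second, sep="|"):
    """Model one row of IF(B="",,IF(B="Person", C&IF(D="",,"|"&D), "No")).

    kind   -> column B (the type being tested)
    first  -> column C
    second -> column D (only appended, with the separator, when not blank)
    """
    if kind == "":
        return ""          # blank source row: leave the output blank
    if kind != "Person":
        return "No"
    return first if second == "" else f"{first}{sep}{second}"


def join_rows(rows):
    """Apply join_row to every (kind, first, second) row independently."""
    return [join_row(*row) for row in rows]
```

Because each row is evaluated on its own, nothing from the first row leaks into the rows below, which is the behaviour the dragged-down TEXTJOIN had.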

How can I apply an Arrayformula for multiple cells on Google Sheets which includes fixed and variable cells?

I want to apply an ARRAYFORMULA to my worksheet in Google Sheets. The formula works well when I drag it down to the cells below, but because I have quite a lot of data I need an ARRAYFORMULA version, and I can't figure out how to apply the variable cells (in this example B11 and C11) to all the rows below.
The screenshot should explain my problem very well.
=COUNTIF((ARRAYFORMULA(IF((ARRAYFORMULA(IF(B11>$B$4:$B$7,IF(C11>$C$4:$C$7,1,0),0)))=1,IF((ARRAYFORMULA(IF($K$4:$K$7>$J$4:$J$7,1,0)))=1,1,0),0))),"1")
Here is a link to my file:
https://docs.google.com/spreadsheets/d/1c17IQCujy3cQwDOcbJUpm3iCgJHCbD8QRbK0aQfVtQA/edit?usp=sharing
The output is in the green field
it would be like this:
=ARRAYFORMULA(MMULT(
IF(IF(INDIRECT("B11:B"&COUNTA(B11:B)+10)>TRANSPOSE(B4:B7),
IF(INDIRECT("C11:C"&COUNTA(C11:C)+10)>TRANSPOSE(C4:C7), 1, 0), 0)=1,
IF(IF(TRANSPOSE(K4:K7)>TRANSPOSE(J4:J7), 1, 0)=1, 1, 0), 0), {1; 1; 1; 1}))
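As a rough reading of what the formula computes: for every data row it builds a row of 0/1 flags, one per threshold row (rows 4 to 7), and MMULT with a column of ones adds those flags up. The Python sketch below models that count. It is an interpretation, not the author's code: the function name and tuple layout are assumptions chosen for illustration.

```python
def count_matching_thresholds(data_rows, thresholds):
    """For each data row, count the threshold rows it satisfies.

    data_rows:  list of (b, c) pairs, standing in for columns B and C from row 11 down.
    thresholds: list of (b_limit, c_limit, j, k) tuples, standing in for
                columns B, C, J and K of rows 4 to 7.
    A threshold row counts when b > b_limit, c > c_limit and k > j.
    """
    return [
        sum(1 for b_limit, c_limit, j, k in thresholds
            if b > b_limit and c > c_limit and k > j)
        for b, c in data_rows
    ]
```

Summing the flags per data row is what the multiplication by {1; 1; 1; 1} does in the formula.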

How to count the number of blank cells in one column based on the first blank row in another column

I have a spreadsheet set up with TV program titles in column B; the next 20 or so columns track different information about each title. I need to count the number of blank cells in column R for the range in column B that contains titles (i.e., up to the first blank row in column B).
I can easily set up a formula to count the number of empty cells in a given range in column R, the problem is as I add more titles to the sheet I would have to keep updating the range in the formula [a simple =COUNTIF(R3:R1108, "")]. I've done a little googling of the problem but haven't quite found anything that fits the situation. I thought I would be able to get the following to work but I didn't fully understand what was going on with them and they weren't giving the expected results.
I've tried these formulas:
=ArrayFormula(sum(MIN("B3:B"&MIN(IF((R3:R)>"",ROW(B3:B)-1)))))
=ArrayFormula(sum(INDIRECT("B3:B"&MIN(IF((R3:R)>"",ROW(B3:B)-1)))))
And
=if(SUM(B3:B)="","",SUM(R3:R))
All of the above formulas give "0" as the result. Based on the COUNTIF formula I have set up it should be 840, which is a number I would expect. Currently, there are 1106 rows containing data and 840 is a reasonable number to expect in this situation.
Is this what you're looking for?
=COUNTBLANK(INDIRECT(CONCATENATE("R",3,":R",(2+COUNTA(B3:B)))))
This counts the non-blank cells in column B (starting at B3) and uses that count to work out the last row for COUNTBLANK in column R (starting at R3). Because the data starts in row 3, the last row is 2 + COUNTA(B3:B); this assumes column B has no gaps. CONCATENATE builds the range reference by joining strings, and INDIRECT turns that string into an actual range.
a proper way would be:
=ARRAYFORMULA(COUNTBLANK(INDIRECT(ADDRESS(3, 18, 4)&":"&
ADDRESS(MAX(IF(B3:B<>"", ROW(B3:B), )), 18, 4))))
or shorter:
=ARRAYFORMULA(COUNTBLANK(INDIRECT("R3:"&
ADDRESS(MAX(IF(B3:B<>"", ROW(B3:B), )), 18, 4))))
or even shorter:
=ARRAYFORMULA(COUNTBLANK(INDIRECT("R3:R"&MAX(IF(B3:B<>"", ROW(B3:B), )))))
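The answers above find the end of the range in two ways: by counting the non-blank titles, or by locating the row of the last non-blank title. A short Python sketch shows where they can disagree. It is a model only: the function names and sample lists are made up, "" stands for an empty cell, and both lists are assumed to start at the same sheet row.

```python
def count_blank_by_title_count(titles, tracked):
    """Use the number of non-blank titles as the range length (COUNTA-style)."""
    filled = sum(1 for title in titles if title != "")
    return sum(1 for value in tracked[:filled] if value == "")


def count_blank_up_to_last_title(titles, tracked):
    """Use the position of the last non-blank title as the range end (MAX/ROW-style)."""
    last = max((i for i, title in enumerate(titles) if title != ""), default=-1)
    return sum(1 for value in tracked[:last + 1] if value == "")


# Made-up data with a gap in the titles column.
titles = ["a", "b", "", "c", "", ""]
tracked = ["x", "", "", "", "", ""]
```

When the titles column has no gaps the two functions agree; with a gap, the count-based range stops short of the last title.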