Virtually blank column in array? - if-statement

I use this formula in Google Sheets:
=SORT({SORT({'Discord-D'!B2:B, 'Discord-D'!A2:A,
ARRAYFORMULA(IF('Discord-D'!G2:G = "", "", IF('Discord-D'!C2:C <> "", "Removed", "Processing"))), 'Discord-D'!H2:H,
ARRAYFORMULA(IF('Discord-D'!C2:C <> "", 'Discord-D'!G2:G, IFERROR(REPLACE('Discord-D'!G2:G, LEN('Discord-D'!G2:G)-3, 4, "****")))),
'Discord-D'!I2:I}, ROW('Discord-D'!A2:A), FALSE); SORT({'Facebook-D'!B2:B, 'Facebook-D'!A2:A,
ARRAYFORMULA(IF('Facebook-D'!E2:E = "", "", IF('Facebook-D'!C2:C <> "", "Removed", "Processing"))), 'Facebook-D'!F2:F,
ARRAYFORMULA(IF('Facebook-D'!C2:C <> "", 'Facebook-D'!E2:E, IFERROR(REPLACE('Facebook-D'!E2:E, LEN('Facebook-D'!E2:E)-3, 4, "****")))),
'Facebook-D'!G2:G}, ROW('Facebook-D'!A2:A), FALSE); SORT({'Instagram-D'!B2:B, 'Instagram-D'!A2:A,
ARRAYFORMULA(IF('Instagram-D'!E2:E = "", "", IF('Instagram-D'!C2:C <> "", "Removed", "Processing"))), 'Instagram-D'!F2:F,
ARRAYFORMULA(IF('Instagram-D'!C2:C <> "", 'Instagram-D'!E2:E, IFERROR(IF(LEN(REGEXEXTRACT('Instagram-D'!E2:E, "com/(.+)")) > 4,
REPLACE('Instagram-D'!E2:E, LEN('Instagram-D'!E2:E)-3,4, "****"),
REPLACE('Instagram-D'!E2:E, LEN('Instagram-D'!E2:E)-1,2, "**"))))), 'Instagram-D'!G2:G}, ROW('Instagram-D'!A2:A), FALSE); SORT({'TikTok-D'!B2:B, 'TikTok-D'!A2:A,
ARRAYFORMULA(IF('TikTok-D'!E2:E = "", "", IF('TikTok-D'!C2:C <> "", "Removed", "Processing"))), 'TikTok-D'!F2:F,
ARRAYFORMULA(IF('TikTok-D'!C2:C <> "", IFERROR(IF(LEN(REGEXEXTRACT('TikTok-D'!E2:E, "https://www.tiktok.com/@(.*?)/")) = 4,
REPLACE('TikTok-D'!E2:E, LEN("https://www.tiktok.com/@" & REGEXEXTRACT('TikTok-D'!E2:E, "https://www.tiktok.com/@(.*?)/"))-1, 2, "**"),
REPLACE('TikTok-D'!E2:E, LEN("https://www.tiktok.com/@" & REGEXEXTRACT('TikTok-D'!E2:E, "https://www.tiktok.com/@(.*?)/"))-3, 4, "****"))),
IFERROR(IF(LEN(REGEXEXTRACT('TikTok-D'!E2:E, "https://www.tiktok.com/@(.*?)/")) = 4,
REPLACE(REPLACE('TikTok-D'!E2:E, LEN("https://www.tiktok.com/@" &
REGEXEXTRACT('TikTok-D'!E2:E, "https://www.tiktok.com/@(.*?)/"))-1, 2, "**"), LEN('TikTok-D'!E2:E)-3, 4, "****"),
REPLACE(REPLACE('TikTok-D'!E2:E, LEN("https://www.tiktok.com/@" & REGEXEXTRACT('TikTok-D'!E2:E, "https://www.tiktok.com/@(.*?)/"))-3, 4, "****"), LEN('TikTok-D'!E2:E)-3, 4, "****"))))), 'TikTok-D'!H2:H}, ROW('TikTok-D'!A2:A), FALSE);
SORT({'YouTube-D'!B2:B, 'YouTube-D'!A2:A, ARRAYFORMULA(IF('YouTube-D'!E2:E = "", "", IF('YouTube-D'!C2:C <> "", "Removed", "Processing"))), 'YouTube-D'!G2:G,
ARRAYFORMULA(IF('YouTube-D'!C2:C <> "", 'YouTube-D'!E2:E, IFERROR(REPLACE('YouTube-D'!E2:E, LEN('YouTube-D'!E2:E)-3, 4, "****")))),
ARRAYFORMULA(IFERROR(REPLACE('YouTube-D'!F2:F, LEN('YouTube-D'!F2:F)-3, 4, "****")))}, ROW('YouTube-D'!A2:A), FALSE)}, 1, FALSE)
The YouTube array contains an extra column compared to the rest. Initially, it would fail because not all the arrays had the same number of columns in them. I solved it by inserting a blank column in each of the sheets and referencing that one (the last column reference in each array except YouTube, so 'Discord-D'!I2:I, 'Facebook-D'!G2:G, 'Instagram-D'!G2:G and 'TikTok-D'!H2:H). Is there a better way I can achieve this (for example by creating a virtually blank column instead of actually needing to have one in reality)?

there are various ways to create a virtual column (or row). for example, to avoid array errors we can create a virtual array manually like:
={A1:A5, {"";"";"";"";""}}
to do it dynamically we can use divide by zero error and turn it into blanks:
=IFERROR(ROW(A2:A)/0, )
another way is within query where we can insert virtual column as:
=QUERY(A:C, "select A,B,C,' ' label ' '''")
but you can do it even with a single if:
=IF(A2:A,,)
and a lot of people use a sequence with substitute:
=SUBSTITUTE(SEQUENCE(ROWS(A2:A), 1, 1, 0), 1, )
of course, the 2nd, 4th and 5th formulas need to be wrapped in ARRAYFORMULA
another popular way is to reference an existing column that is for sure empty like X:
={A:C, X:X}
and in some cases, you can even reference a non-existent column. for example, if your sheet has A-Z columns you can use:
={A:C, XX:XX}
and lambda lovers will appreciate:
=INDEX(LAMBDA(x, IFERROR(x/0))(A2:A))
or:
=INDEX(LAMBDA(x, x)(IF(A2:A,,)))
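to see why the padding is needed at all, here is a rough sketch of the same idea outside Sheets, in Python. the names pad_columns and stack_rows and the sample rows are made up for illustration: stacking blocks vertically (the ; inside {}) only works when every block has the same width, so narrower blocks get a blank cell appended first.

```python
def pad_columns(rows, width, blank=""):
    """Return rows padded on the right with blank cells up to `width` columns."""
    return [list(row) + [blank] * (width - len(row)) for row in rows]


def stack_rows(*blocks):
    """Stack blocks vertically, padding narrower blocks with virtual blank columns."""
    width = max(len(row) for block in blocks for row in block)
    stacked = []
    for block in blocks:
        stacked.extend(pad_columns(block, width))
    return stacked


# Hypothetical sample data: the second block has one extra column.
discord = [["2023-01-02", "alice", "Processing"]]
youtube = [["2023-01-03", "bob", "Removed", "extra"]]

combined = stack_rows(discord, youtube)
for row in combined:
    print(row)
```

the blank cells exist only in the combined result, which is the same effect the virtual column formulas above have inside the {} array.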

Related

Google Sheets Formula help: IF reference cell is BLANK, then cell is BLANK

OK the solution should be simple... like =IF(AG3 = "", "")
but I am unable to add the clause to my current formula as seen below: Any suggestions?
=IF(
IF(AF3 <> "y",
SUM(IFNA(VLOOKUP($AG3, RICS_TimeClocks!Q$3:U, 4, 0), 0),
IFNA(VLOOKUP($AG3, RICS_TimeClocks!V$3:Z, 4, 0), 0))
,"0")
= "0", "", SUM(IFNA(VLOOKUP($AG3, RICS_TimeClocks!Q$3:U, 4, 0), 0),
IFNA(VLOOKUP($AG3, RICS_TimeClocks!V$3:Z, 4, 0), 0)))
Let's say your current formula is "FORMULA". You would then wrap it like this:
=IF(AG3="",,FORMULA)
Now replace FORMULA with your actual formula and you get
=IF(AG3="",,IF(
IF(AF3 <> "y",
SUM(IFNA(VLOOKUP($AG3, RICS_TimeClocks!Q$3:U, 4, 0), 0),
IFNA(VLOOKUP($AG3, RICS_TimeClocks!V$3:Z, 4, 0), 0))
,"0")
= "0", "", SUM(IFNA(VLOOKUP($AG3, RICS_TimeClocks!Q$3:U, 4, 0), 0),
IFNA(VLOOKUP($AG3, RICS_TimeClocks!V$3:Z, 4, 0), 0))))

Return top value ordered by another column

Suppose I have a table as follows:
TableA =
DATATABLE (
"Year", INTEGER,
"Group", STRING,
"Value", DOUBLE,
{
{ 2015, "A", 2 },
{ 2015, "B", 8 },
{ 2016, "A", 9 },
{ 2016, "B", 3 },
{ 2016, "C", 7 },
{ 2017, "B", 5 },
{ 2018, "B", 6 },
{ 2018, "D", 7 }
}
)
I want a measure that returns the top Group based on its Value and that works inside or outside a Year filter context. That is, it can be used in a matrix visual like this (including the Total row):
It's not hard to find the maximal value using DAX:
MaxValue = MAX(TableA[Value])
or
MaxValue = MAXX(TableA, TableA[Value])
But what is the best way to look up the Group that corresponds to that value?
I've tried this:
Top Group = LOOKUPVALUE(TableA[Group],
TableA[Year], MAX(TableA[Year]),
TableA[Value], MAX(TableA[Value]))
However, this doesn't work for the Total row and I'd rather not have to use the Year in the measure if possible (there are likely other columns to worry about in a real scenario).
Note: I am providing a couple solutions in the answers below, but I'd love to see any other approaches as well.
Ideally, it would be nice if there were an extra argument in the MAXX function that would specify which column to return after finding the maximum, much like the MAXIFS Excel function has.
Another way to do this is through the use of the TOPN function.
The TOPN function returns entire row(s) instead of a single value. For example, the code
TOPN(1, TableA, TableA[Value])
returns the top 1 row of TableA ordered by TableA[Value]. The Group value associated with that top Value is in the row, but we need to be able to access it. There are a couple of possibilities.
Use MAXX:
Top Group = MAXX(TOPN(1, TableA, TableA[Value]), TableA[Group])
This finds the maximum Group from the TOPN table in the first argument. (There is only one Group value, but this allows us to convert a table into a single value.)
Use SELECTCOLUMNS:
Top Group = SELECTCOLUMNS(TOPN(1, TableA, TableA[Value]), "Group", TableA[Group])
This function usually returns a table (with the columns that are specified), but in this case, it is a table with a single row and a single column, which means DAX interprets it as just a regular value. (If several rows tie for the top Value, TOPN returns all of them and this conversion fails.)
One way to do this is to store the maximum value and use that as a filter condition.
For example,
Top Group =
VAR MaxValue = MAX(TableA[Value])
RETURN MAXX(FILTER(TableA, TableA[Value] = MaxValue), TableA[Group])
or similarly,
Top Group =
VAR MaxValue = MAX(TableA[Value])
RETURN CALCULATE(MAX(TableA[Group]), TableA[Value] = MaxValue)
If there are multiple groups with the same maximum value, the measures above will pick the last one alphabetically, since MAX and MAXX on a text column return the greatest value in sort order. If there are multiple and you want to show all of them, you could use a concatenate iterator function:
Top Group =
VAR MaxValue = MAX(TableA[Value])
RETURN CONCATENATEX(
CALCULATETABLE(
VALUES(TableA[Group]),
TableA[Value] = MaxValue
),
TableA[Group],
", "
)
If you changed the 9 in TableA to an 8, this last measure would return A, B rather than A.
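To sanity-check the logic outside Power BI, here is a minimal Python sketch of the same steps: restrict to the visible rows, find the maximum Value, then join every Group that reaches it. The function name `top_groups`, the `year=None` stand-in for the Total row, and the explicit alphabetical sort are my own assumptions for illustration; CONCATENATEX without an order-by argument does not promise an order.

```python
# Same rows as TableA in the question: (Year, Group, Value).
table_a = [
    (2015, "A", 2.0), (2015, "B", 8.0),
    (2016, "A", 9.0), (2016, "B", 3.0), (2016, "C", 7.0),
    (2017, "B", 5.0),
    (2018, "B", 6.0), (2018, "D", 7.0),
]


def top_groups(rows, year=None):
    """Return the group(s) holding the maximum value, joined by ", ".

    year=None plays the role of the Total row (no Year filter).
    """
    visible = [r for r in rows if year is None or r[0] == year]
    max_value = max(value for _, _, value in visible)
    # Distinct groups at the maximum, sorted so that ties read "A, B".
    groups = sorted({group for _, group, value in visible if value == max_value})
    return ", ".join(groups)


for y in (2015, 2016, 2017, 2018, None):
    print(y, top_groups(table_a, y))

# Variant from the text: change the 9 to an 8 so that A and B tie overall.
tied = [(y, g, 8.0 if v == 9.0 else v) for y, g, v in table_a]
print("tied total:", top_groups(tied))
```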

How to Return Text with IF Function in an Array

In Google Sheets, I'm trying to query a column and look for a state abbreviation, and if that abbreviation is a match, then "East" if not then "West"
Wanting to return text values in my column based on state abbreviation. We have territory manager split into two domains--East and West. So, trying to easily sort my data by East/West.
Here's what I have:
=IF(M:M={"AL", "CA", "DE","FL","GA","IA","KY","ME","MD","MA","MN","MS","NH","NJ","NY","ND","RI","SD","TN","VT","VA","WV","WI"},"East","West")
But, when I fill down, it just fills down East, and does not seem to actually query M:M
Thoughts?
Not the cleanest code, but this should work:
=ARRAYFORMULA(IF(LEN(A:A), IF((A:A = "foo")+(A:A = "bar") = 1, "WEST", "EAST"), ))
To use IF with an OR in an ARRAYFORMULA, you evaluate the column with 1s and 0s. The A:A = "foo" will evaluate to 1 if foo is in the cell. So if one of your OR criteria is in the cell, the total value in the IF will be 1.
You have a lot of criteria so writing each of them in will take a while ...
E.g. IF( (A:A = "AL") + (A:A = "CA") ... (A:A = "WI") = 1, "East", "West")
Use ISERROR/MATCH():
=IF(ISERROR(MATCH(M:M,{"AL", "CA", "DE","FL","GA","IA","KY","ME","MD","MA","MN","MS","NH","NJ","NY","ND","RI","SD","TN","VT","VA","WV","WI"},0)),"West","East")
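For comparison, here is a small Python sketch of both ideas: the "sum of comparisons" trick from the first answer and the membership lookup that MATCH performs. The function names are hypothetical, and returning an empty string for blank cells is an assumption borrowed from the LEN check in the first answer (the MATCH formula as written would label blanks "West").

```python
# State list copied from the question.
EAST_STATES = {
    "AL", "CA", "DE", "FL", "GA", "IA", "KY", "ME", "MD", "MA", "MN", "MS",
    "NH", "NJ", "NY", "ND", "RI", "SD", "TN", "VT", "VA", "WV", "WI",
}


def or_as_sum(state):
    """OR written as arithmetic: each comparison counts as 1 or 0."""
    return sum(state == candidate for candidate in EAST_STATES) == 1


def territory(state):
    """Return "East" or "West" for a state abbreviation, "" for a blank cell."""
    if not state:
        return ""
    return "East" if state in EAST_STATES else "West"


column_m = ["NY", "TX", "", "WI", "OR"]
print([territory(s) for s in column_m])
```

The membership test scales better than writing out one comparison per state, which is the same reason the MATCH version is shorter than the chained sum.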

Beginner rbind function

I cannot for the life of me understand the rbind function. I've tried using the examples on here, but I can't figure out what I am doing incorrectly. All I would like to do is add the data from my second data frame under the first.
Does rbind require the columns be the same name or...?
ParticipantA=c("A","B","C","D")
Score1A=c("21","20","21","21")
Score2A=c("32","40","32","31")
Score3A=c("47","50","43","46")
BlockA=data.frame(ParticipantA,Score1A,Score2A,Score3A)
BlockA$Major=c("Computer_Science","Computer_Science","Computer_Science","Computer_Science")
BlockA$Gender=c("Female","Female","Male","Male")
ParticipantB=c("E","F","G","H")
Score1B=c("28","28","21","22")
Score2B=c("30","36","37","32")
Score3B=c("41","49","49","46")
BlockB=data.frame(ParticipantB,Score1B,Score2B,Score3B)
BlockB$Major=c("Medical","Medical","Medical","Medical")
BlockB$Gender=c("Female","Female","Male","Male")
rbind requires that all columns be of the same name and class.
The problem is in the column titles. rbind uses column titles to orient how it will bind the rows. The columns can be in different orders; R will use the first data frame to determine column order.
Alternatively, adding another column to your data frames, with the value "A" or "B" in it, would preserve that information without putting "A"s and "B"s in the column names, which is the reason you can't use rbind. The additional column would also allow you to do more analyses in R, e.g. regression and other linear models.
Here is one way to handle your data:
Create a uniform set of column names that can be used for the data frames "BlockA" and "BlockB"
final_colnames <- c("Block", "Participant", "Score1", "Score2", "Score3")
Create a new vector to identify which block the participants belong to.
BlockA = c("A", "A", "A", "A")
Your previous data
ParticipantA = c("A", "B", "C", "D")
Score1A = c("21", "20", "21", "21")
Score2A = c("32", "40", "32", "31")
Score3A = c("47", "50", "43", "46")
The label "BlockA" is reused here to name the new data frame, but not before adding the "BlockA" vector of "A" "A" "A" "A" as its first column.
BlockA = data.frame(BlockA, ParticipantA, Score1A, Score2A, Score3A)
The new column names have to be added at this point, so that the number of names and the number of columns are equal.
colnames(BlockA) <- final_colnames
Now you can add the remaining columns
BlockA$Major = c("Computer_Science", "Computer_Science", "Computer_Science", "Computer_Science")
BlockA$Gender = c("Female", "Female", "Male", "Male")
BlockB is the same process
BlockB = c("B", "B", "B", "B") # the extra column
ParticipantB = c("E", "F", "G", "H")
Score1B = c("28", "28", "21", "22")
Score2B = c("30", "36", "37", "32")
Score3B = c("41", "49", "49", "46")
BlockB = data.frame(BlockB, ParticipantB, Score1B, Score2B, Score3B)
colnames(BlockB) <- final_colnames # renaming the columns
BlockB$Major = c("Medical", "Medical", "Medical", "Medical")
BlockB$Gender = c("Female", "Female", "Male", "Male")
Uniform column names mean that rbind will now work.
rbind(BlockA,BlockB)
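If it helps to see the name-matching rule in isolation, here is a rough Python model of it. This is only a sketch of the behaviour described above, not how R implements rbind; the `rbind` function here and the dict-of-lists "data frame" are stand-ins invented for illustration.

```python
def rbind(first, second):
    """Append the rows of `second` under `first`, matching columns by name.

    Each frame is a dict mapping column name -> list of values.
    The result keeps the column order of `first`.
    """
    if set(first) != set(second):
        raise ValueError("names do not match previous names")
    return {name: first[name] + second[name] for name in first}


block_a = {"Block": ["A", "A"], "Participant": ["A", "B"], "Score1": ["21", "20"]}
# Same names in a different order: still binds, first frame sets the order.
block_b = {"Score1": ["28", "28"], "Block": ["B", "B"], "Participant": ["E", "F"]}

combined = rbind(block_a, block_b)
print(combined)

# Mismatched names (as in the original ParticipantA / ParticipantB) are rejected.
try:
    rbind({"ParticipantA": ["A"]}, {"ParticipantB": ["E"]})
except ValueError as err:
    print("error:", err)
```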

Mathematica - StringMatch Elements Within a List?

I have a function that returns cases from a table that match specific strings.
Once I get all the cases that match those strings, I need to search each case (which is its own list) for specific strings and do a Which command. But all I know how to do is turn the whole big list of lists into one string, and then I only get one result (when I need a result for each case).
UC@EncodeTable;
EncodeTable[id_?PersonnelQ, f___] :=
Cases[#,
x_List /;
MemberQ[x,
s_String /;
StringMatchQ[
s, ("*ah*" | "*bh*" | "*gh*" | "*kf*" |
"*mn*"), IgnoreCase -> True]], {1}] &@
Cases[MemoizeTable["PersonnelTable.txt"], {_, id, __}]
That function is returning cases from the table
Which[(StringMatchQ[
ToString@
EncodeTable[11282], ("*bh*" | "*ah*" |
"*gh*" ), IgnoreCase -> True]) == True, 1,
(StringMatchQ[
ToString@
EncodeTable[11282], ("*bh*" | "*ah*" |
"*gh*" ), IgnoreCase -> True]) == False, 0]
That function is SUPPOSED to return a 1 or 0 for each case returned by the first function, but I don't know how to search within lists without making them all one string and return a result for each list.
Well, you probably want Map, but it's hard to say without seeing what the structure of the data to be operated upon is. Perhaps you can provide an example.
EDIT: In the comment, an example result was given as
dat = {{204424, 11111, SQLDateTime[{1989, 4, 4, 0, 0, 0.}], Null,
"Parthom, Mary, MP", Null, 4147,
"T-00010 AH BH UI", {"T-00010 AH BH UI", "M-14007 LL GG",
"F-Y3710 AH LL UI GG"}, "REMOVED."}, {2040, 11111,
SQLDateTime[{1989, 4, 13, 0, 1, 0.}], Null, "KEVIN, Stevens, STK",
Null, 81238,
"T-00010 ah gh mn", {"T-00010 mn", "M-00100 dd", "P-02320 sd",
"M-14003 ed", "T-Y8800 kf", "kj"}}};
(actually the example had a syntax error so I fixed it in what I hope is the right way).
Now, if I define a function
func = Which[(StringMatchQ[#[[8]], ("*bh*" | "*ah*" | "*gh*"),
IgnoreCase -> True]) == True, 1, True, 0] &;
(note the second condition to be matched may be written as True, see the documentation of Which) which does this
func[dat[[1]]]
(*
-> 1
*)
(note that I've slightly changed func from what you have, in order for it to do what I assume you wanted it to actually do). This can then be applied to dat, of which the elements have the form you gave, as follows:
Map[func, dat]
(*
-> {1, 1}
*)
I'm not sure if this is what you want, I did my best guessing.
EDIT2: In response to the comment about the position of the element to be matched being variable, here is one way:
ClearAll[funcel]
funcel[p_String] :=
Which[StringMatchQ[p, ("*bh*" | "*ah*" | "*gh*"),
IgnoreCase -> True], 1, True, 0];
funcel[___] := 0;
ClearAll[func];
func[lst_List] := Which[MemberQ[Map[funcel, lst], 1], 1, True, 0]
so that
Map[func, dat]
gives {1,1}
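For readers who think more easily in another language, the same "flag each record separately" pattern can be sketched in Python. The names `flag_element` and `flag_record` are made up to mirror funcel and func, the records are trimmed versions of dat with None standing in for SQLDateTime and Null, and the third record is a hypothetical non-matching case added for contrast.

```python
import re

# Case-insensitive search for any of the three codes.
PATTERN = re.compile(r"bh|ah|gh", re.IGNORECASE)


def flag_element(element):
    """Return 1 if element is a string containing bh, ah or gh, else 0."""
    return 1 if isinstance(element, str) and PATTERN.search(element) else 0


def flag_record(record):
    """Return 1 if any top-level element of the record matches, else 0."""
    return 1 if any(flag_element(e) for e in record) else 0


dat = [
    [204424, 11111, None, None, "Parthom, Mary, MP", None, 4147,
     "T-00010 AH BH UI", ["T-00010 AH BH UI", "M-14007 LL GG"], "REMOVED."],
    [2040, 11111, None, None, "KEVIN, Stevens, STK", None, 81238,
     "T-00010 ah gh mn", ["T-00010 mn", "M-00100 dd"]],
    [1, 2, "no codes here"],  # hypothetical record without a match
]

# One result per record, the counterpart of Map[func, dat].
flags = [flag_record(record) for record in dat]
print(flags)
```

As with funcel, nested lists count as non-strings and contribute 0, so only top-level string elements are searched.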