How to combine the code of two array queries in Google Sheets - regex

I need to combine two queries, both wrapped in ARRAYFORMULA, into a single query.
I've tried using a union.
First code:
= ARRAYFORMULA(QUERY({MID(Sheet1!B1:B, 8, 5), Sheet1!A1:AS},
"select count(Col13)
where Col13>=0
group by Col1
label count(Col13)'Winners #'"))
Second code:
= ARRAYFORMULA(QUERY({MID(Sheet1!B1:B, 8, 5), Sheet1!A1:AS},
"select count(Col13)
where Col13<=0
group by Col1
label count(Col13)'Losers #'"))

=ARRAYFORMULA(QUERY(REGEXREPLACE(TO_TEXT(QUERY({
QUERY({MONTH(MID('grouping project'!A2:A, 8, 3)&1)&"♦"&
MID('grouping project'!A2:A, 8, 5), 'grouping project'!A2:AO},
"select Col1,count(Col3),'Winners #'
where Col1 is not null
and Col3 >= 0
group by Col1
label count(Col3)'','Winners #'''", 0);
QUERY({MONTH(MID('grouping project'!A2:A, 8, 3)&1)&"♦"&
MID('grouping project'!A2:A, 8, 5), 'grouping project'!A2:AO},
"select Col1,count(Col3),'Losers #'
where Col3 <= 0
and Col1 is not null
group by Col1
label count(Col3)'','Losers #'''", 0)},
"select Col1,sum(Col2)
group by Col1
pivot Col3
label Col1'Week ending'", 0)), "^.+♦", ),
"where Col1 is not null", 0))

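For readers who find the nested QUERY hard to follow, here is a minimal Python sketch of what the combined formula computes: one row per week, with a count of values >= 0 ("Winners #") and a count of values <= 0 ("Losers #"). The function name and the sample rows are hypothetical and only illustrate the logic; like the formula, it counts a value of exactly 0 in both columns.

```python
from collections import defaultdict


def count_winners_losers(rows):
    """rows: iterable of (week, value) pairs; returns {week: {label: count}}."""
    table = defaultdict(lambda: {"Winners #": 0, "Losers #": 0})
    for week, value in rows:
        # Mirrors "Col1 is not null" and the implicit skip of empty values.
        if week is None or value is None:
            continue
        if value >= 0:
            table[week]["Winners #"] += 1
        if value <= 0:
            table[week]["Losers #"] += 1
    return dict(table)


# Hypothetical sample data, not taken from the asker's sheet.
sample = [("12/07", 5), ("12/07", -2), ("12/14", 0), ("12/14", 3), ("12/14", None)]
result = count_winners_losers(sample)
```
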

Replace nulls with the previous non-null value

I am using Amazon Athena engine version 1, which is based on Presto 0.172.
Consider the example data set:
id  date_column  col1
1   01/03/2021   NULL
1   02/03/2021   1
1   15/03/2021   2
1   16/03/2021   NULL
1   17/03/2021   NULL
1   30/03/2021   NULL
1   30/03/2021   1
1   31/03/2021   NULL
I would like to replace all NULLs in the table with the last non-NULL value i.e. I want to get:
id  date_column  col1
1   01/03/2021   NULL
1   02/03/2021   1
1   15/03/2021   2
1   16/03/2021   2
1   17/03/2021   2
1   30/03/2021   1
1   30/03/2021   1
1   31/03/2021   1
I was thinking of using a lag function with IGNORE NULLS option but unfortunately, IGNORE NULLS is not supported by Athena engine version 1 (it is also not supported by Athena engine version 2, which is based on Presto 0.217).
How can I achieve the desired output without using the IGNORE NULLS option?
Here is a template for generating the example table:
WITH source1 AS (
SELECT
*
FROM (
VALUES
(1, date('2021-03-01'), NULL),
(1, date('2021-03-02'), 1),
(1, date('2021-03-15'), 2),
(1, date('2021-03-16'), NULL),
(1, date('2021-03-17'), NULL),
(1, date('2021-03-30'), NULL),
(1, date('2021-03-30'), 1),
(1, date('2021-03-31'), NULL)
) AS t (id, date_col, col1)
)
SELECT
id
, date_col
, col1
-- This doesn't work as IGNORE NULLS is not supported.
-- CASE
-- WHEN col1 IS NOT NULL THEN col1
-- ELSE lag(col1) IGNORE NULLS OVER (PARTITION BY id ORDER BY date_col)
-- END AS col1_lag_nulls_ignored
FROM
source1
ORDER BY
date_col
After reviewing similar questions on SO (here and here), I found that the solution below works for all column types (including strings and dates):
WITH source1 AS (
SELECT
*
FROM (
VALUES
(1, date('2021-03-01'), NULL),
(1, date('2021-03-02'), 1),
(1, date('2021-03-15'), 2),
(1, date('2021-03-16'), NULL),
(1, date('2021-03-17'), NULL),
(1, date('2021-03-30'), NULL),
(1, date('2021-03-30'), 1),
(1, date('2021-03-31'), NULL)
) AS t (id, date_col, col1)
)
, grouped AS (
SELECT
id
, date_col
, col1
-- If the row has a value in a column, then this row and all subsequent rows
-- with a NULL (before the next non-NULL value) will be in the same group.
, sum(CASE WHEN col1 IS NULL THEN 0 ELSE 1 END) OVER (
PARTITION BY id ORDER BY date_col) AS grp
FROM
source1
)
SELECT
id
, date_col
, col1
-- max is used instead of first_value, since in cases where there will
-- be multiple records with NULL on the same date, the first_value may
-- still return a NULL.
, max(col1) OVER (PARTITION BY id, grp ORDER BY date_col) AS col1_filled
, grp
FROM
grouped
ORDER BY
date_col
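To make the grouping trick easier to check by hand, here is a minimal Python sketch of the same idea, run against the example data from the question. It is only an illustration of the logic, not Athena code: the function name fill_nulls is hypothetical, and rows that share a date are processed together on the assumption that the window uses the default frame, which includes all peer rows with the same ORDER BY value.

```python
from datetime import date
from itertools import groupby


def fill_nulls(rows):
    """Forward-fill None values per id using a running non-NULL count.

    rows: iterable of (id, date_col, col1) tuples.
    Returns a list of (id, date_col, col1, col1_filled) tuples.
    """
    ordered = sorted(rows, key=lambda r: (r[0], r[1]))
    filled = []
    for _, id_rows in groupby(ordered, key=lambda r: r[0]):
        grp = 0         # running count of non-NULL values, like the grp column
        grp_value = {}  # grp -> max non-NULL value in that group
        # Rows sharing a date are handled together, like window-frame peers.
        for _, peers in groupby(id_rows, key=lambda r: r[1]):
            peers = list(peers)
            values = [v for _, _, v in peers if v is not None]
            if values:
                grp += len(values)
                grp_value[grp] = max(values)
            for row_id, row_date, value in peers:
                filled.append((row_id, row_date, value, grp_value.get(grp)))
    return filled


source1 = [
    (1, date(2021, 3, 1), None),
    (1, date(2021, 3, 2), 1),
    (1, date(2021, 3, 15), 2),
    (1, date(2021, 3, 16), None),
    (1, date(2021, 3, 17), None),
    (1, date(2021, 3, 30), None),
    (1, date(2021, 3, 30), 1),
    (1, date(2021, 3, 31), None),
]
result = fill_nulls(source1)
```

The first row stays empty because no earlier non-NULL value exists for that id, which matches the desired output in the question.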

Aggregating values in "virtual" columns created with table constructor within a measure

Within a measure, I am creating a "virtual" table with several rows and columns. Within the same measure, I need to aggregate the values from one of the resulting columns.
My problem is that I can't figure out how to refer to a column of the resulting virtual table with aggregator functions such as MAX() or SUM().
Here is the code for creating the table within a measure (it is not a calculated table in the data model):
VAR virtualtable =
{
( "o1", 1, 2, 3 ),
( "o2", 4, 5, 6 ),
( "o4", 7, 8, 9 ),
( "o5", 10, 11, 12 )
}
Resulting table:
Value1  Value2  Value3  Value4
o1      1       2       3
o2      4       5       6
o4      7       8       9
o5      10      11      12
Trying to sum the values of column "Value2" using SUM( virtualtable[Value2] ) does not work. Any ideas?
The correct way to sum a column of a table variable is to use SUMX, which iterates over the rows of the table. For instance, this works:
Sum Val 2 =
VAR virtualtable =
{
( "o1", 1, 2, 3 ),
( "o2", 4, 5, 6 ),
( "o4", 7, 8, 9 ),
( "o5", 10, 11, 12 )
}
RETURN SUMX( virtualtable, [Value2] )
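As a rough mental model (a Python sketch, not DAX), an iterator such as SUMX walks the rows of the table it is given and evaluates an expression once per row, which is why it can work on a table held in a variable. The helper name sumx and the lambda are hypothetical stand-ins for the iterator and for the [Value2] column reference.

```python
# The same rows as the table constructor in the measure.
virtual_table = [
    ("o1", 1, 2, 3),
    ("o2", 4, 5, 6),
    ("o4", 7, 8, 9),
    ("o5", 10, 11, 12),
]


def sumx(table, expression):
    """Evaluate expression for each row and add up the results."""
    return sum(expression(row) for row in table)


# Value2 is the second column of each row (index 1).
total = sumx(virtual_table, lambda row: row[1])
```
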

Data Cols in Google Sheets Query Formula

I'm trying to write a query formula that returns yearly sales by week number, filtered by store.
Example sheet: Link
It's easy to do with a formula like this one: =Query(query(A:D,"Select A,Sum(D) where A is not null group by A Pivot B",1),"Select * offset 1",0)
But I also need to filter the results by a specific store (Col C).
That's not hard either:
=Query(query(A:D,"Select A,Sum(D) where C = 'First' AND A is not null group by A Pivot B",1),"Select * offset 1",0)
But in this case, any week in which 'First' has no sales (for example, a week with rows only for 'Second') is missing from the result.
I would like to show every week (Col A) present in the data. Is that possible?
try:
=ARRAYFORMULA(QUERY(QUERY({A:B, IF(C:C<>"First", {"First", 0}, C:D)},
"select Col1,sum(Col4)
where Col3 = 'First'
and Col1 is not null
group by Col1
pivot Col2"), "offset 1", 0))
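The idea behind the formula is that rows for other stores are not filtered out. Instead, they are rewritten as rows for 'First' with 0 sales, so every week still shows up after grouping. Here is a minimal Python sketch of that logic. It assumes the columns are week (A), year (B), store (C) and sales (D), which is a guess based on the question, and the function name and sample rows are hypothetical.

```python
from collections import defaultdict


def weekly_sales_for_store(rows, store):
    """rows: iterable of (week, year, store, sales) tuples.

    Returns {week: {year: total}} and keeps weeks where the store sold nothing.
    """
    totals = defaultdict(lambda: defaultdict(int))
    for week, year, row_store, sales in rows:
        if week is None:
            continue
        # Rows for other stores count as 0 instead of being dropped.
        amount = sales if row_store == store else 0
        totals[week][year] += amount
    return {week: dict(years) for week, years in totals.items()}


# Hypothetical sample data, not taken from the linked sheet.
rows = [
    (1, 2020, "First", 10),
    (1, 2020, "Second", 5),
    (2, 2020, "Second", 7),
    (2, 2021, "First", 3),
]
result = weekly_sales_for_store(rows, "First")
```

Week 2 of 2020 has rows only for 'Second', yet it still appears in the result with a total of 0.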

Need to combine 3 formulas into 1

Formula 1: List all items from Estimate tab
=QUERY(Estimate!A2:D50,"SELECT * where C is not null",0)
Formula 2: Locate task group matches within taskItemAssociations
=ARRAYFORMULA(IFERROR(VLOOKUP(A9:A&B9:B&C9:C&D9:D,
TRIM(IFERROR(SPLIT(TRIM(TRANSPOSE(QUERY(TRANSPOSE(
{INDEX(QUERY(IFERROR(SPLIT(SORT(UNIQUE(IF((LEN('task-itemAssociations'!A2:A&'task-itemAssociations'!B2:B&'task-itemAssociations'!C2:C&'task-itemAssociations'!D2:D))*(LEN('task-itemAssociations'!E2:E)),
'task-itemAssociations'!A2:A&'task-itemAssociations'!B2:B&'task-itemAssociations'!C2:C&'task-itemAssociations'!D2:D&"♦"&'task-itemAssociations'!E2:E, )), 1, 1), "♦")),
"select Col1,count(Col1) where Col1 is not null group by Col1 pivot Col2", 0),,1), IF(
ISNUMBER(QUERY(IFERROR(SPLIT(SORT(UNIQUE(IF((LEN('task-itemAssociations'!A2:A&'task-itemAssociations'!B2:B&'task-itemAssociations'!C2:C&'task-itemAssociations'!D2:D))*(LEN('task-itemAssociations'!E2:E)),
'task-itemAssociations'!A2:A&'task-itemAssociations'!B2:B&'task-itemAssociations'!C2:C&'task-itemAssociations'!D2:D&"♦"&'task-itemAssociations'!E2:E, )), 1, 1), "♦")),
"select count(Col1) where Col1 is not null group by Col1 pivot Col2", 0)),
QUERY(IFERROR(SPLIT(SORT(UNIQUE(IF((LEN('task-itemAssociations'!A2:A&'task-itemAssociations'!B2:B&'task-itemAssociations'!C2:C&'task-itemAssociations'!D2:D))*(LEN('task-itemAssociations'!E2:E)),
'task-itemAssociations'!A2:A&'task-itemAssociations'!B2:B&'task-itemAssociations'!C2:C&'task-itemAssociations'!D2:D&"♦♥"&'task-itemAssociations'!E2:E, )), 1, 1), "♦")),
"select count(Col1) where Col1 is not null group by Col1 pivot Col2 limit 0", 0), )})
,,999^99))), "♥"))), {2}, 0)))
Formula 3: List all matches from taskData tab
This result is really all I need. I'm just not sure how else to arrive here without all of the above.
=QUERY(taskData!C2:O,"SELECT * where C = '"&E9&"'",0)
Ideally, this would be a single ARRAYFORMULA in Tasks!A2 (currently occupied by notes).
Here is my sheet
Paste into cell A2:
=FILTER(taskData!C2:O, REGEXMATCH(taskData!C2:C, TEXTJOIN("|", 1,
ARRAYFORMULA(IFERROR(VLOOKUP(
INDEX(QUERY(Estimate!A2:D50,"where C is not null",0),,1)&
INDEX(QUERY(Estimate!A2:D50,"where C is not null",0),,2)&
INDEX(QUERY(Estimate!A2:D50,"where C is not null",0),,3)&
INDEX(QUERY(Estimate!A2:D50,"where C is not null",0),,4),
TRIM(IFERROR(SPLIT(TRIM(TRANSPOSE(QUERY(TRANSPOSE(
{INDEX(QUERY(IFERROR(SPLIT(SORT(UNIQUE(IF((LEN(
'task-itemAssociations'!A2:A&'task-itemAssociations'!B2:B&'task-itemAssociations'!C2:C&
'task-itemAssociations'!D2:D))*(LEN('task-itemAssociations'!E2:E)),
'task-itemAssociations'!A2:A&'task-itemAssociations'!B2:B&'task-itemAssociations'!C2:C&
'task-itemAssociations'!D2:D&"♦"&'task-itemAssociations'!E2:E, )), 1, 1), "♦")),
"select Col1,count(Col1) where Col1 is not null group by Col1 pivot Col2", 0),,1), IF(
ISNUMBER(QUERY(IFERROR(SPLIT(SORT(UNIQUE(IF((LEN(
'task-itemAssociations'!A2:A&'task-itemAssociations'!B2:B&'task-itemAssociations'!C2:C&
'task-itemAssociations'!D2:D))*(LEN('task-itemAssociations'!E2:E)),
'task-itemAssociations'!A2:A&'task-itemAssociations'!B2:B&'task-itemAssociations'!C2:C&
'task-itemAssociations'!D2:D&"♦"&'task-itemAssociations'!E2:E, )), 1, 1), "♦")),
"select count(Col1) where Col1 is not null group by Col1 pivot Col2", 0)),
QUERY(IFERROR(SPLIT(SORT(UNIQUE(IF((LEN(
'task-itemAssociations'!A2:A&'task-itemAssociations'!B2:B&'task-itemAssociations'!C2:C&
'task-itemAssociations'!D2:D))*(LEN('task-itemAssociations'!E2:E)),
'task-itemAssociations'!A2:A&'task-itemAssociations'!B2:B&'task-itemAssociations'!C2:C&
'task-itemAssociations'!D2:D&"♦♥"&'task-itemAssociations'!E2:E, )), 1, 1), "♦")),
"select count(Col1) where Col1 is not null group by Col1 pivot Col2 limit 0", 0), )})
,,999^99))), "♥"))), {2}, 0))))))

How to break down date range by each month in DAX Power BI?

I have two dates:
'2018-01-05' and '2019-01-05'
How can I create a calculated table that breaks this date range down by month?
It should look something like this:
There are probably many ways to do this, but here's one way that combines a few different concepts:
Table =
VAR Starting = DATE(2018, 1, 5)
VAR Ending = DATE(2019, 1, 5)
VAR MonthTable =
SUMMARIZE(
ADDCOLUMNS(
CALENDAR(Starting, Ending),
"StartDate", EOMONTH([Date], 0) + 1),
[StartDate],
"EndDate", EOMONTH([StartDate], 0) + 1)
RETURN UNION(
ROW("StartDate", Starting, "EndDate", EOMONTH(Starting, 0) + 1),
FILTER(MonthTable, [EndDate] < Ending && [StartDate] > Starting),
ROW("StartDate", EOMONTH(Ending, -1) + 1, "EndDate", Ending)
)
Basically, you start with the CALENDAR function to get all the days, tag each date with its corresponding month, and then summarize that table to just return one row for each month.
Since the first and last rows are a bit irregular, I prepended and appended those to a filtered version of the summarized month table to get your desired table.
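If it helps to see the expected rows outside of DAX, here is a minimal Python sketch that produces the same kind of breakdown: a partial first period, whole months in between, and a partial last period. The function names are hypothetical, and each end date is the first day of the following month, as in the DAX table above.

```python
from datetime import date


def first_of_next_month(d):
    """Return the first day of the month after d."""
    return date(d.year + (d.month == 12), d.month % 12 + 1, 1)


def split_by_month(start, end):
    """Split [start, end) into (start_date, end_date) pairs, one per month."""
    periods = []
    cursor = start
    while cursor < end:
        # Stop at the month boundary or at the overall end, whichever is first.
        nxt = min(first_of_next_month(cursor), end)
        periods.append((cursor, nxt))
        cursor = nxt
    return periods


periods = split_by_month(date(2018, 1, 5), date(2019, 1, 5))
```

For the two dates in the question this gives 13 periods, from 2018-01-05 to 2018-02-01 through 2019-01-01 to 2019-01-05.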
Create a new table as
Table = CALENDAR( DATE(2018, 1, 5), DATE(2019, 1, 5) - 1)
Rename the auto-generated column "Date" to "Start Date", then add a new column as
End Date = [Start Date] + 1