Partition BigQuery materialized view on aggregated column - google-cloud-platform

I have a table logs with these columns:
timestamp sessionID elementID value
The logs table is partitioned on the timestamp column.
I create a materialized view out of it:
create materialized view X
partition by date(new_timestamp)
as
select min(timestamp) as new_timestamp, sessionID, elementID, sum(value) as sumvalue
from logs
group by sessionID, elementID
I am getting an error "Partitioning column of the materialized view must either match partitioning column or pseudo-column of the base table, or be a TIMESTAMP_TRUNC over it."
The BigQuery documentation says that the only way to partition a materialized view is to use exactly the same partitioning column as the base table; even applying min() to that column is not accepted. Is there a way to achieve the result I want despite this limitation?

Looking at your question again, there is a simple answer: partition on timestamp directly.
The materialized view stores only min(timestamp) for each group, so the data ends up partitioned by a timestamp value either way. Whether that value is the min or the max is beside the point for partitioning.
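One possible reading of this answer, offered as an assumption rather than a confirmed BigQuery recipe, is to keep the partitioning tied to the base table's timestamp (for example, one pre-aggregated row per day) and to take the final MIN and SUM at query time. The sketch below does not talk to BigQuery; it only checks in plain Python that a per-day pre-aggregate rolled up afterwards gives the same result as the single global aggregation. The sample rows and all function names are hypothetical.

```python
from datetime import datetime

# Hypothetical sample rows: (timestamp, sessionID, elementID, value)
LOGS = [
    (datetime(2023, 1, 1, 23, 50), "s1", "e1", 2),
    (datetime(2023, 1, 2, 0, 10), "s1", "e1", 3),
    (datetime(2023, 1, 2, 9, 0), "s2", "e1", 5),
]


def merge(groups, key, ts, total):
    """Fold one (timestamp, sum) pair into groups[key] using MIN and SUM."""
    if key in groups:
        old_ts, old_total = groups[key]
        groups[key] = (min(old_ts, ts), old_total + total)
    else:
        groups[key] = (ts, total)


def aggregate_globally(rows):
    """The result the question wants: one row per (sessionID, elementID)."""
    groups = {}
    for ts, session, element, value in rows:
        merge(groups, (session, element), ts, value)
    return groups


def aggregate_per_day(rows):
    """Day-level pre-aggregate: one row per (day, sessionID, elementID)."""
    groups = {}
    for ts, session, element, value in rows:
        merge(groups, (ts.date(), session, element), ts, value)
    return groups


def roll_up(per_day):
    """Query-time step: collapse the daily rows to (sessionID, elementID)."""
    groups = {}
    for (_day, session, element), (ts, total) in per_day.items():
        merge(groups, (session, element), ts, total)
    return groups


result = roll_up(aggregate_per_day(LOGS))
```

This works because MIN and SUM can both be re-aggregated: the minimum of daily minimums is the overall minimum, and the sum of daily sums is the overall sum.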

Related

RLS filter with multiple values assigned?

I need to filter my Power BI report by the App IDs associated with the current user (using the USERPRINCIPALNAME function). So I have three tables in my model, DimApp, DimUser, and FactRegisters, where a User_Id may be related to 1 or more App_Ids in my Fact table.
DimApp table
DimUser table
FactRegister table
As you can see in FactRegisters table there are two App_Ids (3 and 1) for User_Id 201. The following is the DAX rule defined in App_Id column from DimApp table to filter the data:
VAR userId =
LOOKUPVALUE (
DimUser[User_Id],
DimUser[Email], USERPRINCIPALNAME()
)
VAR app =
LOOKUPVALUE (
FactRegisters[Application_Id],
FactRegisters[User_Id], userId
)
RETURN DimApplication[Application_Id] IN {app}
Verifying the DAX expression doesn't return an error, however, when I choose to "View as" that role I'm not able to see the data in the visuals. The error states: "Couldn't load the data for this visual. An error was encountered during the evaluation of the row-level security expression defined in table DimApp. A table of multiple values was supplied where a single value was expected."
Cannot display the visual viewing as role
However, when a single App_Id is associated with the User_Id, I'm able to visualize the data on the report visuals using the same DAX rule. Here is what the FactRegisters table looks like when User_Id 201 has a single App_Id (3) associated:
FactRegisters table when User_Id with single App_Id
User_Id with a single App_Id visual
Now I am able to visualize data in the report. This is not a realistic scenario, though, as a User_Id can have many App_Ids.
I also tried the following static DAX rule on the App_Id column of DimApp, just to test passing multiple values to that column, and I succeeded in visualizing data for multiple App_Ids:
DimApplication[Application_Id] IN {1,3}
Static RLS with multiple values by App_Id column
But this is not the goal (it's not dynamic). The goal is to visualize the data from all the Apps associated with the current user. Is it possible? Can't I pass more than one value to a column while filtering in RLS?
[Application_Id] IN
CALCULATETABLE(
VALUES('FactRegister'[Application_Id]),
FILTER ('FactRegister',
'FactRegister'[User_Id] = LOOKUPVALUE (DimUser[User_Id], DimUser[Email],USERPRINCIPALNAME())
)
)
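To illustrate why this rule works where the LOOKUPVALUE version fails, here is a rough plain-Python analogue. It is only a sketch of the logic, not of how Power BI evaluates RLS: the sample rows, email addresses, and function names are hypothetical. A single-value lookup breaks as soon as a user has several apps, while collecting all matching IDs into a set and testing membership handles any number of them.

```python
# Hypothetical sample data mirroring the tables described in the question.
DIM_USER = [
    {"User_Id": 201, "Email": "ana@example.com"},
    {"User_Id": 202, "Email": "bob@example.com"},
]
FACT_REGISTERS = [
    {"User_Id": 201, "Application_Id": 3},
    {"User_Id": 201, "Application_Id": 1},
    {"User_Id": 202, "Application_Id": 2},
]
DIM_APP = [{"Application_Id": 1}, {"Application_Id": 2}, {"Application_Id": 3}]


def lookup_single(rows, result_col, search_col, value):
    """Rough analogue of LOOKUPVALUE: fails when several distinct values match."""
    matches = {row[result_col] for row in rows if row[search_col] == value}
    if len(matches) > 1:
        raise ValueError(
            "A table of multiple values was supplied "
            "where a single value was expected."
        )
    return next(iter(matches), None)


def allowed_apps(email):
    """Analogue of the CALCULATETABLE/VALUES rule: collect every matching app ID."""
    user_id = lookup_single(DIM_USER, "User_Id", "Email", email)
    return {
        row["Application_Id"]
        for row in FACT_REGISTERS
        if row["User_Id"] == user_id
    }


def visible_rows(email):
    """Keep only the DimApp rows whose ID is in the user's set of apps."""
    apps = allowed_apps(email)
    return [row for row in DIM_APP if row["Application_Id"] in apps]
```

Calling `lookup_single(FACT_REGISTERS, "Application_Id", "User_Id", 201)` raises the error, which mirrors the failing rule, whereas `visible_rows("ana@example.com")` returns the rows for apps 1 and 3.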

How to resolve `single value for column cannot be determined` error?

DimUser and DimCustomer filter the FactSales table.
I have created a RLS role with the following DAX on the DimCustomer table:
[DW_CustomerID] IN
SELECTCOLUMNS(FILTER(Dimuser,DimUser[User_Email]=USERPRINCIPALNAME())
, "DW_CustomerID", FactSales[DW_CustomerID])
My intention is to filter DimUser based on the current user's email, then retrieve the matching customer IDs from the FactSales table. Effectively, the logged-in user should see only those customers to whom that user has made sales.
The DAX is giving following error:
A single value for column DW_CustomerID in table FactSales cannot be determined.
How to resolve this error?
You are looking for a column that does not exist in the table you are looking for it in. The first parameter of the SELECTCOLUMNS() function is a table, in your case you have provided a derived table built by the FILTER() function being used on the DimUser table. Therefore, your derived table is one row with all the columns from the DimUser table. The FactSales[DW_CustomerID] column is not in this table.
I would try rewriting it to be something closer to the following:
[DW_CustomerID] IN
CALCULATETABLE(
VALUES(FactSales[DW_CustomerID]),
DimUser[User_Email] = USERPRINCIPALNAME()
)
Without seeing your model it is tough to know for sure though.

How to add a new column with custom values, based on a WHERE clause from another table in PowerBi?

I am stuck trying to dynamically build a new column based on a WHERE clause against another table in Power BI. To give more detail, let's say I have a table with item numbers associated with a Customer Name. In another table, I have to add a new column that dynamically collects the item numbers associated with a particular customer and appends them as a query parameter to a base URL.
So, my first table looks like this:
The second table that I want is this:
The query parameter value in the URL, has to be dynamically based on a SELECT query with a WHERE clause and pick up the ItemNumbers using the Customer field which is common between both. So, how can this be done in PowerBi? Any help would be really appreciated :)
Suppose the model has a single table, "TableRol". To combine the values of a column (here, Date) into one string, you can use CONCATENATEX:
URL = CONCATENATE(CONCATENATE("http:\\mysite.com\parametersHere\getitem?='",CONCATENATEX(VALUES('TableRol'[Date]), 'TableRol'[Date],";")),"'")
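The same idea, applied to the question's Customer and ItemNumber columns, can be sketched in plain Python. This only illustrates the grouping and joining logic that CONCATENATEX over VALUES performs. The sample rows, the base URL, and the function name are hypothetical, since the original tables are not shown.

```python
from collections import defaultdict

# Hypothetical rows standing in for the first table: (Customer, ItemNumber).
ITEMS = [("Acme", "A100"), ("Acme", "A200"), ("Acme", "A100"), ("Globex", "G300")]

# Hypothetical base URL; replace with the real endpoint.
BASE_URL = "http://mysite.com/getitem?items="


def build_urls(rows, base_url=BASE_URL, sep=";"):
    """Return {customer: url} with each customer's distinct items joined by sep."""
    grouped = defaultdict(list)
    for customer, item in rows:
        # Keep distinct values only, as VALUES() does, preserving first-seen order.
        if item not in grouped[customer]:
            grouped[customer].append(item)
    return {customer: base_url + sep.join(items) for customer, items in grouped.items()}


urls = build_urls(ITEMS)
```

Here `urls["Acme"]` is `http://mysite.com/getitem?items=A100;A200`, one URL per customer with that customer's item numbers appended.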

How to query DynamoDB by string between + other keys

I'm trying to design a DynamoDB query that meets the following criteria:
get items by type, category, and date between(date_1, date_2)
I have these attributes already stored in a Global Secondary Index:
type (string)
category (string)
date (string)
I know I could use the between operator to query by a given date string:
gsi_1_pk = 'products' and gsi_1_sk between '2019-01-01T00:00:00.000Z' and '2019-01-01T00:00:00.000Z'
But there are situations where I want to query by the 3 attributes, not only the date.
So, I want a solution that allows me to query by all the possible filtering combinations: type, category, date between, type + category, type + date between, category + date between, and type + category + date between.
How can I combine this between operation with the other attributes from the GSI?
I ended up creating a new Global Secondary Index that stores the date alone as the sort key, which lets me use DynamoDB's between operator without any problem.
The downside is that I had to create a new GSI for such a simple query. But as many said here, DynamoDB seems not to be the "right/best" tool for this job.
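A minimal sketch of what a query against such an index could look like, assuming the new GSI is named gsi_2 with keys gsi_2_pk and gsi_2_sk (all hypothetical names, as is the table name). It also assumes type and category are handled with a filter expression, which the answer does not state. The function only builds the request parameters in the shape of the DynamoDB Query API and does not call AWS.

```python
def build_query_kwargs(partition_value, date_from, date_to,
                       item_type=None, category=None):
    """Build Query parameters for a GSI whose sort key holds the date alone."""
    kwargs = {
        "TableName": "my_table",  # hypothetical table name
        "IndexName": "gsi_2",     # hypothetical index name
        "KeyConditionExpression": "#pk = :pk AND #sk BETWEEN :start AND :end",
        # Name placeholders avoid clashes with DynamoDB reserved words.
        "ExpressionAttributeNames": {"#pk": "gsi_2_pk", "#sk": "gsi_2_sk"},
        "ExpressionAttributeValues": {
            ":pk": {"S": partition_value},
            ":start": {"S": date_from},
            ":end": {"S": date_to},
        },
    }
    # Optional non-key filters (an assumption, not part of the original answer).
    filters = []
    for name, value in (("type", item_type), ("category", category)):
        if value is not None:
            kwargs["ExpressionAttributeNames"]["#" + name] = name
            kwargs["ExpressionAttributeValues"][":" + name] = {"S": value}
            filters.append("#{0} = :{0}".format(name))
    if filters:
        kwargs["FilterExpression"] = " AND ".join(filters)
    return kwargs


query = build_query_kwargs(
    "products",
    "2019-01-01T00:00:00.000Z",
    "2019-01-31T23:59:59.999Z",
    item_type="book",
)
```

With boto3, these parameters could be passed as `client.query(**query)`. Note that a filter expression is applied after the items have been read from the index, so it narrows the results but not the amount of data read.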

Dynamodb2 Table Schema Creation

I'm using the following: dynamodb2, boto, python. I have the following code for creating a table:
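from boto.dynamodb2.fields import HashKey, RangeKey, GlobalAllIndex
from boto.dynamodb2.table import Table
from boto.dynamodb2.types import NUMBER, STRING

# conn is an existing DynamoDB connection
table = Table.create('mySecondTable',
    # Both table keys belong inside the schema list
    schema=[HashKey('ID'), RangeKey('advertiser')],
    throughput={'read':5,'write':2},
    global_indexes=[GlobalAllIndex('otherDataIndex',parts=[
        HashKey('date',data_type=NUMBER),
        RangeKey('publisher',data_type=STRING),
    ],throughput={'read':5,'write':3})],
    connection=conn)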
I would like to be able to have the following data that I can query by:
ID, advertiser, date, publisher, size, and color
That means I need a different schema. When I add additional points it does not query unless the column name is listed in the schema.
The problem however is that right now I am only able to query by Id, advertiser, date, and publisher in this case. How can I add additional columns that I can query by?
I read this which appears to say that it is possible:
http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/GSI.html
However there is no example here:
http://boto.readthedocs.org/en/latest/dynamodb2_tut.html
I tried adding an additional range key however it doesn't work (cannot have duplicates)
I'd like it to be like:
table = Table.create('mySecondTable',
schema=[
RangeKey('advertiser'),
otherKey('date')
fourthKey('publisher') ... etc
throughput={'read':5,'write':2},
connection=conn)
Thanks!
If you want additional range keys, you need to use a local secondary index (LSI).
You can query an LSI the same way you query the base table: provide an exact value for the hash key and a comparison predicate for the range key.
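As a rough illustration, the sketch below builds a table definition with one LSI per extra range key. It uses the request shape of the DynamoDB CreateTable API (as accepted by boto3's `create_table`) rather than the boto dynamodb2 classes from the question, and it does not call AWS. The attribute types, the index naming scheme, and the function name are assumptions.

```python
def build_create_table_kwargs(table_name, hash_key, range_key, lsi_range_keys):
    """Build CreateTable parameters with one local secondary index per extra key.

    Each key is a (name, type) pair, where type is "S" (string) or "N" (number).
    """
    hash_name = hash_key[0]
    all_keys = [hash_key, range_key] + list(lsi_range_keys)
    return {
        "TableName": table_name,
        # Every attribute used in any key schema must be declared here.
        "AttributeDefinitions": [
            {"AttributeName": name, "AttributeType": attr_type}
            for name, attr_type in all_keys
        ],
        "KeySchema": [
            {"AttributeName": hash_name, "KeyType": "HASH"},
            {"AttributeName": range_key[0], "KeyType": "RANGE"},
        ],
        # An LSI shares the table's hash key and swaps in another range key.
        "LocalSecondaryIndexes": [
            {
                "IndexName": name + "-index",  # hypothetical naming scheme
                "KeySchema": [
                    {"AttributeName": hash_name, "KeyType": "HASH"},
                    {"AttributeName": name, "KeyType": "RANGE"},
                ],
                "Projection": {"ProjectionType": "ALL"},
            }
            for name, _ in lsi_range_keys
        ],
        "ProvisionedThroughput": {"ReadCapacityUnits": 5, "WriteCapacityUnits": 2},
    }


table_kwargs = build_create_table_kwargs(
    "mySecondTable",
    hash_key=("ID", "S"),            # assumed to be a string
    range_key=("advertiser", "S"),
    lsi_range_keys=[("date", "N"), ("publisher", "S")],
)
```

Local secondary indexes have to be defined when the table is created, and a table can have at most five of them, so the extra range keys need to be chosen up front.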