Amazon Redshift multi-aggregate rollup - amazon-web-services

I have a Redshift dataset which includes customer purchases at a SKU level.
I'm struggling to produce a view that includes multiple aggregates at a customer level. For example, my table includes columns like: customer_id, order_id, product_id, product_category, product_division, sales, units
From this base I'd like a result which looks like:
customer_id
total_sales (i.e. sum of all SKU sales)
total_units
total_orders
categories_purchased (i.e. a distinct count of categories the customer purchased)
divisions_purchased
primary_category_sales (i.e. category with the highest sales)
primary_division_sales
primary_category_mix (i.e. primary category sales / total sales)
primary_division_mix
While I can aggregate results for the whole dataset, I can't seem to crack how to incorporate sub-aggregates like finding the maximum category and its relative contribution to total sales. Any help you can offer is most appreciated!
I have tried nesting queries and using window functions, but I keep running into errors like "aggregate function calls may not have nested aggregate or window functions".
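One way to sidestep the nested-aggregate error is to aggregate in two steps: first sum sales per customer and category (and per customer and division), then take the maximum of those sums per customer and join it back to the customer totals. Below is a minimal sketch of that idea, run through Python's built-in sqlite3 as a stand-in for Redshift. The table name purchases, the function name customer_rollup and the sample schema types are assumptions for illustration; the SQL sticks to CTEs, GROUP BY, COUNT(DISTINCT ...) and NULLIF, which should carry over, but it has not been run against Redshift here.

```python
import sqlite3

# Two-step aggregation: per-customer totals, per-customer/category sums,
# then MAX over those sums. No aggregate is nested inside another.
ROLLUP_SQL = """
WITH totals AS (
    SELECT customer_id,
           SUM(sales)                       AS total_sales,
           SUM(units)                       AS total_units,
           COUNT(DISTINCT order_id)         AS total_orders,
           COUNT(DISTINCT product_category) AS categories_purchased,
           COUNT(DISTINCT product_division) AS divisions_purchased
    FROM purchases
    GROUP BY customer_id
),
category_sales AS (
    SELECT customer_id, product_category, SUM(sales) AS cat_sales
    FROM purchases
    GROUP BY customer_id, product_category
),
division_sales AS (
    SELECT customer_id, product_division, SUM(sales) AS div_sales
    FROM purchases
    GROUP BY customer_id, product_division
),
top_category AS (
    SELECT customer_id, MAX(cat_sales) AS primary_category_sales
    FROM category_sales
    GROUP BY customer_id
),
top_division AS (
    SELECT customer_id, MAX(div_sales) AS primary_division_sales
    FROM division_sales
    GROUP BY customer_id
)
SELECT t.customer_id, t.total_sales, t.total_units, t.total_orders,
       t.categories_purchased, t.divisions_purchased,
       c.primary_category_sales, d.primary_division_sales,
       1.0 * c.primary_category_sales / NULLIF(t.total_sales, 0) AS primary_category_mix,
       1.0 * d.primary_division_sales / NULLIF(t.total_sales, 0) AS primary_division_mix
FROM totals t
JOIN top_category c ON c.customer_id = t.customer_id
JOIN top_division d ON d.customer_id = t.customer_id
ORDER BY t.customer_id
"""


def customer_rollup(rows):
    """Load SKU-level rows into an in-memory table and return one dict per customer."""
    conn = sqlite3.connect(":memory:")
    conn.row_factory = sqlite3.Row
    conn.execute(
        "CREATE TABLE purchases (customer_id INTEGER, order_id INTEGER, "
        "product_id TEXT, product_category TEXT, product_division TEXT, "
        "sales REAL, units INTEGER)"
    )
    conn.executemany("INSERT INTO purchases VALUES (?, ?, ?, ?, ?, ?, ?)", rows)
    result = [dict(r) for r in conn.execute(ROLLUP_SQL)]
    conn.close()
    return result
```

If the name of the primary category is needed as well as its sales, ranking the per-category sums with ROW_NUMBER() partitioned by customer_id and keeping rank 1 is a common alternative to MAX.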

Related

Merging queries on product number AND on date

For a report on bakery product performance I am trying to match the purchase price with the selling price in order to get to a product margin. The purchase price sits in a separate table and so does the selling price. Both tables contain product numbers so I am able to merge the two queries in order to get both selling price and purchase price on one line in order to compute a product margin.
The difficulty, however, is that the selling prices and purchase prices are very much subject to change. Each line in both tables has a date field, and the dates change often. Therefore I would like to merge the queries in such a way that the selling price of a given product corresponds to the purchase price of the same product, while also making sure that the dates of the selling price and purchase price align.
How would I go about doing this?
Tables would look like this:
DATE - SHOP - PRODUCTNR - QUANTITY SOLD - TOTAL SALES - SELLING PRICE PER PRODUCT
DATE - PRODUCTNR - PURCHASE PRICE PER PRODUCT
You need to add a new column consisting of the [PRODUCTNR] and [DATE] fields. You can use a simple concatenate operation to achieve this:
For Example:
Let's say you have such a table:
Then you need to create a new column called [CompositeKey] in the table:
CompositeKey =
[DATE] & "-" & [PRODUCTNR]
Resulting Table:
Note: I used (-) character to separate columns. You can use any others, like pipe(|), underscore(_) etc...
Now you need to do the same for your 2nd table, and create a relationship between the [CompositeKey] columns on both tables. It will then adapt to changing dates, and you will know which price belongs to which date. I hope it solves your problem.
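To make the composite-key idea concrete outside Power BI, here is a minimal Python sketch of the same join. The field names (date, product_nr, selling_price, purchase_price) and the function names are assumptions for illustration, not taken from the actual tables.

```python
def composite_key(date, product_nr):
    """Build the same kind of key as [DATE] & "-" & [PRODUCTNR]."""
    return f"{date}-{product_nr}"


def join_prices(sales_rows, purchase_rows):
    """Attach the purchase price with the matching date and product to each sales row."""
    purchase_by_key = {
        composite_key(p["date"], p["product_nr"]): p["purchase_price"]
        for p in purchase_rows
    }
    joined = []
    for s in sales_rows:
        key = composite_key(s["date"], s["product_nr"])
        purchase_price = purchase_by_key.get(key)  # None when no exact match exists
        margin = None if purchase_price is None else s["selling_price"] - purchase_price
        joined.append(
            {**s, "composite_key": key, "purchase_price": purchase_price, "margin": margin}
        )
    return joined
```

Note that this only matches rows whose dates are identical in both tables; a sale on a date with no purchase-price row gets no purchase price and no margin.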

DAX Power BI circular dependence problem across multiple tables

After years of benefitting from answers here, and long hours of research to resolve a DAX circular dependency, I'm posting my first question. I am self-trained, and hopefully I'll explain this properly.
For clarity, I'm giving full detail below, but my basic question is how to avoid circular dependencies when calculating columns in one table that reference multiple other tables.
I'm looking to build a report that shows total sales by salesperson from two tables, CRM_Opportunities and BMC_Usage. Identifying the salesperson for each row of BMC_Usage involves referencing data from four other tables, which works fine within BMC_Usage, but results in a circular dependency when I try to link to the CRM_Users.
[Image: relationships]
I have five tables from three systems:
Three tables from our CRM: CRM_Users, CRM_Accounts and CRM_Opportunities.
CRM_Users contains the fullname and crm_user_id of each salesperson.
CRM_Opportunities contains rows of opportunities, each with a createdon date, client_id and salesperson_key.
CRM_Accounts contains client_id and salesperson_key.
BMC_Usage contains sales data for our new product, but no salesperson information. Each row equates to one unique billable event. It does contain the date when the service began (start_date) and the client id field (bmc_client_id).
One table from our billing system, UBS_Accounts, includes the name of the salesperson (ubs_salesperson) along with the client id (ubs_client_id).
In order to produce a consolidated report by salesperson, I first need to identify the correct salesperson for each row of usage in BMC_Usage, using the following series of calculated columns in BMC_Usage:
First, find out who opened the last sales opportunity in CRM_Opportunities prior to start_date in BMC_Usage:
Last Opportunity Date = CALCULATE(MAX(CRM_Opportunities[createdon]),CRM_Opportunities[createdon]<=EARLIER(BMC_Usage[start_date]),CRM_Opportunities[client_id]=EARLIER(BMC_Usage[bmc_client_id]))
Next, in case no opportunities have been created yet, check when the account was last created in CRM_Accounts, also prior to start_date in BMC_Usage:
Last Acct Date = DATEVALUE(CALCULATE(MAX(CRM_Accounts[account_createdon]),CRM_Accounts[account_createdon]<=EARLIER(BMC_Usage[start_date]),CRM_Accounts[crm_client_id]=EARLIER(BMC_Usage[bmc_client_id])))
Third, grab the salesperson_key field based on the Last Opportunity Date, or if blank, from the Last Acct Date:
salesperson id = IF(
    ISBLANK(BMC_Usage[Last Opportunity Date]),
    IF(
        ISBLANK(BMC_Usage[Last Acct Date]),
        BLANK(),
        LOOKUPVALUE(CRM_Accounts[salesperson_key], CRM_Accounts[crm_client_id], BMC_Usage[bmc_client_id], CRM_Accounts[account_createdon], BMC_Usage[Last Acct Date])
    ),
    LOOKUPVALUE(CRM_Opportunities[salesperson_key], CRM_Opportunities[client_id], BMC_Usage[bmc_client_id], CRM_Opportunities[createdon], BMC_Usage[Last Opportunity Date])
)
Now that I have the salesperson id, I can get the fullname from CRM_Users:
CRM salesperson = LOOKUPVALUE('CRM_Users'[fullname],'CRM_Users'[crm_user_id],BMC_Usage[salesperson id])
Since it's possible our CRM has no record for this client, I can also pull the ubs_salesperson name from our billing system, UBS_Accounts, although it is a less reliable source.
UBS salesperson = RELATED(UBS_Accounts[ubs_salesperson])
Finally, I arrive at who the salesperson is for each row of BMC_Usage by preferring the CRM name over the UBS_Accounts name:
calc salesperson = IF(ISBLANK(BMC_Usage[CRM salesperson]),BMC_Usage[UBS salesperson],BMC_Usage[CRM salesperson])
And it all works fine!
[Image: calc salesperson]
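For readers who want the fallback chain in one place, the calculated columns above amount to roughly the following logic, sketched here in Python rather than DAX. The dictionary shapes and function names are assumptions for illustration (for simplicity both record lists use a client_id key), and this only restates the lookup order; it does not address the relationship problem itself.

```python
from datetime import date


def latest_on_or_before(records, client_id, cutoff, date_field):
    """Return the client's most recent record dated on or before cutoff, or None."""
    candidates = [
        r for r in records
        if r["client_id"] == client_id and r[date_field] <= cutoff
    ]
    return max(candidates, key=lambda r: r[date_field], default=None)


def resolve_salesperson(usage_row, opportunities, accounts, crm_users, ubs_accounts):
    """Prefer the CRM name (last opportunity, then last account), else the UBS name.

    crm_users maps crm_user_id -> fullname; ubs_accounts maps client id -> ubs_salesperson.
    """
    client_id = usage_row["bmc_client_id"]
    start = usage_row["start_date"]
    source = latest_on_or_before(opportunities, client_id, start, "createdon")
    if source is None:
        source = latest_on_or_before(accounts, client_id, start, "account_createdon")
    crm_name = crm_users.get(source["salesperson_key"]) if source else None
    return crm_name or ubs_accounts.get(client_id)
```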
The problem is, I can't build a relationship between BMC_Usage and CRM_Users so that I can make a consolidated report.
[Image: circular dependency]
From reading online, I believe the problem is caused by the multiple calculated columns and possibly the presence of blank rows. I've tried filtering options using ALLEXCEPT, DISTINCT and others but I can't seem to get it right. I'm not sure if I should be filtering the local table or the tables I'm pulling information from, and I'm just lost.
If you've read this far....THANK YOU! I know it's a long question. Perhaps you have an idea?

How to switch data from two tables based on filter in Power Bi

I have two tables which have counts and sales by date, and one of them also has a customer ID. The counts are not the same when viewed by customer and in the summary. I also have a customer filter on my dashboard. What I want to achieve is this: if no customer is selected, the count should come from the summary table; otherwise, if one or more customers are selected in the filter, it should come from the customer table.
Customer Table
Summary Table
Any hints? I have tried the LOOKUPVALUE function, but I can't use a date from the date table as the search value.
It's much easier to use measures instead of creating calculated tables to obtain those metrics. Also, summarized tables would not have the same filter context you are looking for.
Measure 1
Total Customers =
DISTINCTCOUNT('Customer Table'[CustomerID])
Measure 2
Total Sales =
SUM ( 'Customer Table'[Sales])
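The two measures above read only from the customer table. The switching behaviour described in the question can be sketched as follows, in Python rather than DAX; the field names (count, customer_id) and the function name are assumptions for illustration. In a DAX measure, a check such as ISFILTERED on the customer column could plausibly play the role of the "is anything selected" test, though that is not shown in the answer.

```python
def customer_count(summary_rows, customer_rows, selected_customers=None):
    """Use the summary table when no customer is selected, else the customer table."""
    if not selected_customers:
        # No slicer selection: fall back to the summary-level counts.
        return sum(r["count"] for r in summary_rows)
    chosen = set(selected_customers)
    # One or more customers selected: add up only their rows.
    return sum(r["count"] for r in customer_rows if r["customer_id"] in chosen)
```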

DAX TopN Behavior

Just wanted to confirm my understanding (or lack thereof) around these two formulas - in an orders table where each row is an order:
TOPN(10,ALL(Orders),[Total Sales]) - looks at the individual Sales amount for each row and returns the whole table with just the top 10 records sorted by the Sales field; using the measure Total Sales(defined as Sum of Sales) in this context doesn't really have an effect as the aggregation is at a single row level which just keeps it the same.
TOPN(10,ALL(Orders[Customer Name]),[Total Sales]) - this actually groups by the customer name, calculates the total sales, and returns the top 10 customer names based on that metric; it's more or less equivalent to this SQL:
select customer_name, sum(sales) as Total_Sales from orders
group by customer_name
order by Total_Sales desc
limit 10
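The difference between the two formulas, as described above, can be mimicked in a few lines of Python. This is only a sketch of the described behaviour under assumed field names (customer_name, sales); it sorts its output for readability and ignores ties, so it should not be read as a statement of exactly what TOPN returns.

```python
from collections import defaultdict


def top_orders(orders, n=10):
    """Row-level ranking: the n individual orders with the highest sales."""
    return sorted(orders, key=lambda o: o["sales"], reverse=True)[:n]


def top_customers(orders, n=10):
    """Grouped ranking: sum sales per customer, then keep the n largest totals."""
    totals = defaultdict(float)
    for o in orders:
        totals[o["customer_name"]] += o["sales"]
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:n]
```

With orders of 5 and 6 for one customer and a single order of 10 for another, the row-level ranking puts the 10 first, while the grouped ranking puts the customer with 11 in total first.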

Create Custom Column with IF and lookup functionality to own table

I have merged two tables (sales and forecast). For all the rows coming from the sales query, the cost price column has a value. The forecast rows do not have one.
In order to calculate future metrics/KPIs, I need a Power Query transformation that populates the cost price on all forecast rows. I would like to reference the ProductName (which exists on both the sales and forecast rows) and pull the cost price from the sales rows. A ProductName can have multiple entries in the table, but the cost price will be the same for all of those rows, so a find-first/max or something similar would be fine.
However, I am not sure how to make this calculated column with some sort of lookup on ProductName.
Well, you can definitely do so.
Here is an excellent article from Microsoft on LOOKUPVALUE.
In addition, check this thread as well. It will give you more ideas.
I would do something like
=LOOKUPVALUE(Product[SafetyStockLevel], [ProductName], " Mountain-400-W Silver, 46")
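The lookup being asked for (take the cost price from any sales row with the same ProductName and copy it onto the forecast rows) can be sketched like this in Python. The field names (product_name, cost_price) and the function name are assumptions for illustration, and the sketch simply takes the first known price per product, which matches the "find first" idea in the question.

```python
def fill_cost_price(rows):
    """Copy the first known cost price per product onto rows that lack one."""
    known = {}
    for r in rows:
        if r.get("cost_price") is not None:
            # Keep the first price seen for each product name.
            known.setdefault(r["product_name"], r["cost_price"])
    filled = []
    for r in rows:
        if r.get("cost_price") is None:
            filled.append({**r, "cost_price": known.get(r["product_name"])})
        else:
            filled.append(dict(r))
    return filled
```

Rows for a product that never appears with a cost price are left without one.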