Replace Aggregate Data with original details for analysis


I have a data set (Data XYZ) that combines several data sources. One of those sources (Data ABC) is aggregated on a few columns before it is added to the final data set, which drops some detail. I now need to analyze the whole data set, and a key column is left blank. What I would like to do is replace the aggregated lines in the data set with the original, disaggregated detail. Example data is included below. Data XYZ shows the rows in the data set that I would like to replace with the original data from Data ABC. As you can see, in Data XYZ the "Vendor" column is dropped and everything within the same department, account, and date is aggregated together.
Any thoughts on how to replace the aggregate data with the original details? I am using MS Excel, but I would also accept a VBA or SQL solution so that I can better analyze the data.
Data XYZ
Date       Account  Vendor   Department  Amount
6/30/2022  66320    (blank)  501         4,000.00
7/31/2022  66320    (blank)  501         4,000.00
8/31/2022  66320    (blank)  501         4,000.00
6/30/2022  66320    (blank)  502             0.00
7/31/2022  66320    (blank)  502         1,000.00
8/31/2022  66320    (blank)  502         1,000.00
Data ABC
Date       Account  Vendor    Department  Amount
6/30/2022  66320    Vendor A  501           552.00
7/31/2022  66320    Vendor A  501           552.00
8/31/2022  66320    Vendor A  501           552.00
6/30/2022  66320    Vendor B  501         2,575.00
7/31/2022  66320    Vendor B  501         2,575.00
8/31/2022  66320    Vendor B  501         2,575.00
6/30/2022  66320    Vendor C  501           873.00
7/31/2022  66320    Vendor C  501           873.00
8/31/2022  66320    Vendor C  501           873.00
7/31/2022  66320    Vendor B  502           256.00
8/31/2022  66320    Vendor B  502           256.00
7/31/2022  66320    Vendor C  502           744.00
8/31/2022  66320    Vendor C  502           744.00
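
One possible way to think about the replacement, sketched in Python purely for illustration (the question asks about Excel, VBA or SQL, so treat this as logic to translate, not a finished solution). It assumes that Date, Account and Department together identify an aggregated row, and it only swaps in the detail when the detail amounts add back up to the aggregate. The function names `key` and `expand` are made up for this sketch, and only the June and July rows are used to keep it short.

```python
from collections import defaultdict
from decimal import Decimal

# Aggregated rows as in Data XYZ: (date, account, vendor, department, amount).
xyz = [
    ("6/30/2022", "66320", "", "501", Decimal("4000.00")),
    ("7/31/2022", "66320", "", "501", Decimal("4000.00")),
    ("6/30/2022", "66320", "", "502", Decimal("0.00")),
    ("7/31/2022", "66320", "", "502", Decimal("1000.00")),
]

# Detail rows as in Data ABC, same column order.
abc = [
    ("6/30/2022", "66320", "Vendor A", "501", Decimal("552.00")),
    ("6/30/2022", "66320", "Vendor B", "501", Decimal("2575.00")),
    ("6/30/2022", "66320", "Vendor C", "501", Decimal("873.00")),
    ("7/31/2022", "66320", "Vendor A", "501", Decimal("552.00")),
    ("7/31/2022", "66320", "Vendor B", "501", Decimal("2575.00")),
    ("7/31/2022", "66320", "Vendor C", "501", Decimal("873.00")),
    ("7/31/2022", "66320", "Vendor B", "502", Decimal("256.00")),
    ("7/31/2022", "66320", "Vendor C", "502", Decimal("744.00")),
]


def key(row):
    """Assumed matching key: date, account and department."""
    return (row[0], row[1], row[3])


def expand(aggregated, detail):
    """Replace each aggregated row with its detail rows when they reconcile."""
    detail_by_key = defaultdict(list)
    for row in detail:
        detail_by_key[key(row)].append(row)

    result = []
    for row in aggregated:
        matches = detail_by_key.get(key(row), [])
        # Swap in the detail only if it sums to the aggregated amount;
        # otherwise keep the aggregated row so no money is lost or invented.
        if matches and sum(r[4] for r in matches) == row[4]:
            result.extend(matches)
        else:
            result.append(row)
    return result


expanded = expand(xyz, abc)
for row in expanded:
    print(row)
```

The 6/30/2022 row for department 502 has no detail in Data ABC, so it is kept as it is. The same idea in SQL would be a join of the two tables on those three columns, and in Excel a lookup or merge on a combined key; both are left as an exercise because the real column layout is not shown.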

Related

Power BI: Conditional Formatting Matrix Visual with data bars

I would like to create a matrix visual like the one below and add data bars as conditional formatting to the "Sales Percentage" column, with different user-defined max and min values for each country.
I have the following dummy data
Salesperson  Country  Product        Sales Percentage  Total Sales
Gina         Canada   City Bike      0.02              232
Gina         Canada   Mountain Bike  0.56              2800
Gina         Italy    City Bike      0.32              213
Gina         Italy    Mountain Bike  0.21              1050
Gina         USA      City Bike      0.11              122
Gina         USA      Mountain Bike  0.43              2150
John         Canada   City Bike      0.32              333
John         Canada   Mountain Bike  0.34              442
John         Italy    City Bike      0.12              2132
John         Italy    Mountain Bike  0.67              1233
John         USA      City Bike      0.22              3300
John         USA      Mountain Bike  0.45              7300
Mary         Canada   City Bike      0.21              121
Mary         Canada   Mountain Bike  0.53              2650
Mary         Italy    City Bike      0.32              213
Mary         Italy    Mountain Bike  0.12              600
Mary         USA      City Bike      0.11              123
Mary         USA      Mountain Bike  0.12              600
The matrix looks like this after showing columns as rows and putting "Sales Percentage" and "Total Sales" as values, Country as columns and Product + Salesperson as rows:
I can add data bars by right-clicking Sales Percentage under Values, but I can only enter one user-defined min and max value for the whole "Sales Percentage" column. Is it possible to have a different maximum value for the data bars based on the Country? For example, a target value of 35% for Canada, 40% for USA and 50% for Italy. In other words, the data bar would be full when the Sales Percentage for Canada reaches 35%, full when the Sales Percentage for USA reaches 40%, and so on.

This isn't possible with your current setup. The closest approximation is as follows.
Create a measure:
% Canada = CALCULATE(SUM('Table'[Total Sales]), 'Table'[Country ] = "Canada")
Do the same for USA and Italy, then add all three measures as values to your matrix.
Because each country is now its own value, you can set an individual data bar target for each one.
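
To make the per-country target idea concrete, here is a small Python sketch of the arithmetic a data bar with its own maximum would be doing. It is not Power BI code and says nothing about how Power BI draws bars internally; the targets come from the question's example, a minimum of 0 is assumed, and `bar_fill` is a made-up helper name.

```python
# Per-country maximums from the question's example (35%, 40%, 50%).
TARGETS = {"Canada": 0.35, "USA": 0.40, "Italy": 0.50}


def bar_fill(country, sales_pct):
    """Return the filled fraction of the bar: 0.0 is empty, 1.0 is full.

    Assumes the bar minimum is 0 and the maximum is the country's target.
    """
    target = TARGETS[country]
    return max(0.0, min(sales_pct / target, 1.0))


# A few rows from the sample data: (salesperson, country, product, sales %).
rows = [
    ("Gina", "Canada", "Mountain Bike", 0.56),
    ("Gina", "USA", "City Bike", 0.11),
    ("John", "Italy", "City Bike", 0.12),
]

for salesperson, country, product, pct in rows:
    fill = bar_fill(country, pct)
    print(f"{salesperson:5} {country:7} {product:14} {pct:.0%} -> bar {fill:.0%} full")
```

Gina's 56% in Canada is above the 35% target, so that bar is capped at full, while the same 56% measured against a single shared maximum would not be.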

Power BI: Conditional Formatting Data bars for Matrix Visual

I need to create a matrix in the following format, with the total sales and percentage sales below each other:
This is why I have created a table with data like this:
Salesperson  Country  Sales  Product        Format
John         USA      0.45   Mountain Bike  Percentage
John         Canada   0.34   Mountain Bike  Percentage
John         Italy    0.67   Mountain Bike  Percentage
Gina         USA      0.43   Mountain Bike  Percentage
Gina         Canada   0.56   Mountain Bike  Percentage
Gina         Italy    0.21   Mountain Bike  Percentage
Mary         USA      0.12   Mountain Bike  Percentage
Mary         Canada   0.53   Mountain Bike  Percentage
Mary         Italy    0.12   Mountain Bike  Percentage
John         USA      0.22   City Bike      Percentage
John         Canada   0.32   City Bike      Percentage
John         Italy    0.12   City Bike      Percentage
Gina         USA      0.11   City Bike      Percentage
Gina         Canada   0.02   City Bike      Percentage
Gina         Italy    0.32   City Bike      Percentage
Mary         USA      0.11   City Bike      Percentage
Mary         Canada   0.21   City Bike      Percentage
Mary         Italy    0.32   City Bike      Percentage
John         USA      2250   Mountain Bike  Total
John         USA      1700   Mountain Bike  Total
John         USA      3350   Mountain Bike  Total
Gina         USA      2150   Mountain Bike  Total
Gina         Canada   2800   Mountain Bike  Total
Gina         Italy    1050   Mountain Bike  Total
Mary         USA      600    Mountain Bike  Total
Mary         Canada   2650   Mountain Bike  Total
Mary         Italy    600    Mountain Bike  Total
John         USA      1100   City Bike      Total
John         USA      1600   City Bike      Total
John         USA      600    City Bike      Total
...          ...      ...    ...            ...
The Sales column holds both the total amount and the percentage of sales, and the matrix is filtered on the Format column. Because the percentages are stored as decimals and I need to display them as percent, I have created a measure for Sales like this:
Sales_all =
VAR variable = SUM ( 'Table'[Sales] )
RETURN
    SWITCH (
        SELECTEDVALUE ( 'Table'[Format] ),
        "Total", FORMAT ( variable, "General Number" ),
        "Percentage", FORMAT ( variable, "Percent" )
    )
I have two questions. I would like to create data bar conditional formatting for Percentage:
1. Is it possible to use different max and min values for the data bar for each country? Currently, when I choose data bars, I can only enter values for the whole Sales column, disregarding the countries (Canada, Italy, USA). For example, I would like to enter a max value of 60% for Canada and a max value of 25% for Italy. If I use the Sales column directly, not as a measure, I can only choose one max value for the whole Sales column. The bar for the percentage should be full at 60% for Canada and full at 25% for Italy.
2. Since I have used a measure to change the format of the values in the Sales column based on the Format column, I can no longer choose data bars under conditional formatting. Why is this the case and how can I change it?

Please keep each post to a single question. Please don't paste data as images and keep the sample data as copiable text.
I don't understand question 1, so you will need to elaborate (ideally in a brand new question with copiable sample data). The answer to question 2 is that FORMAT() returns text, so the measure result is no longer a number and can't produce a data bar. Either keep the measure as a number or change the display formatting using calculation groups.
EDIT
You need to reshape your data. In Power Query (PQ), pivot the Format column using Sales as the values column, as follows:
You end up with this (missing data because your sample wasn't complete)
Create a matrix as follows:
Highlight the column or measure for percentage and in the ribbon select percent for the format. This keeps the underlying value as a number but changes the display only.
On the matrix, ensure you have the following option.
You should now have the following:
You can now add data bars to percentage column.
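
For anyone who finds the reshape hard to picture without the screenshots, here is a rough Python sketch of what pivoting the Format column does. It is not Power Query code; `pivot_format` is a made-up helper, the four rows are taken from the question's sample, and summing duplicate keys is an assumption about the aggregation you would pick.

```python
# Long layout from the question: one row per (person, country, product, format).
long_rows = [
    ("Gina", "USA", "Mountain Bike", "Percentage", 0.43),
    ("Gina", "USA", "Mountain Bike", "Total", 2150),
    ("Gina", "Canada", "Mountain Bike", "Percentage", 0.56),
    ("Gina", "Canada", "Mountain Bike", "Total", 2800),
]


def pivot_format(rows):
    """Turn each distinct Format value into its own numeric column."""
    wide = {}
    for salesperson, country, product, fmt, sales in rows:
        record = wide.setdefault((salesperson, country, product), {})
        # Assumption: rows sharing the same key and format are summed.
        record[fmt] = record.get(fmt, 0) + sales
    return wide


wide = pivot_format(long_rows)

for (salesperson, country, product), record in wide.items():
    # The stored value stays a number; the percent sign is display-only.
    shown = f"{record['Percentage']:.0%}"
    print(salesperson, country, product, shown, record["Total"])
```

The point is the same as in the steps above: Percentage and Total end up as separate numeric columns, and the percent formatting is applied only when the value is displayed, which is why data bars become available again.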

Power BI - Showing Top 5 records in Matrix Table but total should show for all records

I have a table with thousands of records. I want to create a table visual that shows the top 5 records for each category. I created a measure to achieve this and I am getting exactly the result I am looking for, but I am facing one issue.
See the image below, where I am showing the top 5 records for each category, with a total after each category.
I don't want that total to cover only the top 5 records shown in the table; instead I want the total of all the records under each category.
How can I achieve that?
The measure I created is - Top 5 = RankX(AllSelected(table(Category), Table(account), table(name)),amount_measure,,,Dense)
For the Top 5 measure I apply a visual filter for the top 5.
Category  Account  Name  P%   amount  country  owner
Food      A101     AA11  10%  105     India    A
Food      A102     AA12  20%  120     India    A
Food      A103     AA13  80%  100     India    A
Food      A104     AA14  30%  150     India    A
Food      A105     AA15  60%  90      India    A
Stat      B101     AA11  10%  205     India    A
Stat      B102     AA12  20%  220     India    A
Stat      B103     AA13  80%  200     India    A
Stat      B104     AA14  30%  250     India    A
Stat      B105     AA15  60%  190     India    A
Admn      D101     AD11  10%  305     India    A
Admn      D102     AD12  20%  320     India    A
Admn      D103     AD13  80%  300     India    A
Admn      D104     AD14  30%  350     India    A
Admn      D105     AD15  60%  290     India    A
Thanks,
SK

You can try this
Let's suppose you have the following measures
_sumAMT:= SUM('Table 1'[amount])
and this is your ranking measure
_sumAMTRank:= RANKX(ALLEXCEPT('Table 1','Table 1'[Category]),[_sumAMT],,DESC,Dense)
You can revise the subtotal by doing this
_sumAMT by CAT:= CALCULATE(SUM('Table 1'[amount]),ALLEXCEPT('Table 1','Table 1'[Category]))
_revisedTotal:= IF(HASONEVALUE('Table 1'[Name])=true(),[_sumAMT],[_sumAMT by CAT])
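
If it helps to see the intended result outside DAX, this is a minimal Python sketch of "show only the top N rows, but total every row in the category". It is not a translation of the measures above; `top_n_with_full_total` is a made-up name, and N is set to 3 only because the sample has just five rows per category. Ties are not ranked densely here, the sorted list is simply cut off at N.

```python
from collections import defaultdict

# (category, account, amount) taken from the question's sample table.
rows = [
    ("Food", "A101", 105), ("Food", "A102", 120), ("Food", "A103", 100),
    ("Food", "A104", 150), ("Food", "A105", 90),
    ("Stat", "B101", 205), ("Stat", "B102", 220), ("Stat", "B103", 200),
    ("Stat", "B104", 250), ("Stat", "B105", 190),
]


def top_n_with_full_total(rows, n):
    """Per category: the n largest rows plus the total over ALL rows."""
    by_category = defaultdict(list)
    for category, account, amount in rows:
        by_category[category].append((account, amount))

    report = {}
    for category, items in by_category.items():
        ranked = sorted(items, key=lambda item: item[1], reverse=True)
        report[category] = {
            "top": ranked[:n],                                 # rows shown
            "total": sum(amount for _, amount in items),       # all rows
        }
    return report


report = top_n_with_full_total(rows, n=3)
for category, info in report.items():
    print(category, info["top"], "total of all rows:", info["total"])
```

The total line deliberately ignores the top-N cut, which is the same effect the revised total measure is after when the row is a category subtotal instead of a single Name.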

Using different columns in Slicer and PARALLELPERIOD breaks my measure

I have a table that contains data for 13 prior month ends as well as the most recent business day. The majority of my reports only look at a single period at a time so I have a slicer on each report to allow the user to choose the period they want to look at, which is generally the most recent business day. Whenever the data gets refreshed I have to manually go to each slicer that had the most recent business day selected and choose the new most recent business day (unfortunately my organization hasn't updated to the version that allows the slicers to be synced). As we move towards production and scheduled refreshes, this will be a nuisance so I added a second column called REPORTING_DATE, which is equal to the original DATA_DATE field except the most recent date is replaced with 'Most Recent' so any slicers with that selected can maintain their selection after a refresh.
This is a simplified example of my data:
DATA_DATE REPORTING_DATE ACCOUNT_NO
7/10/2018 Most Recent 1001
7/10/2018 Most Recent 1002
7/10/2018 Most Recent 1003
7/10/2018 Most Recent 1004
7/10/2018 Most Recent 1005
7/10/2018 Most Recent 1006
7/10/2018 Most Recent 1007
6/30/2018 6/30/2018 1001
6/30/2018 6/30/2018 1002
6/30/2018 6/30/2018 1003
6/30/2018 6/30/2018 1004
6/30/2018 6/30/2018 1005
6/30/2018 6/30/2018 1006
5/31/2018 5/31/2018 1001
5/31/2018 5/31/2018 1002
5/31/2018 5/31/2018 1003
5/31/2018 5/31/2018 1004
My issue is that when I change my slicer to use REPORTING_DATE instead of DATA_DATE, it breaks the measure that I use to calculate the change in counts for each period.
Change in Count (Month) = DISTINCTCOUNT(MyData[ACCOUNT_NO])-CALCULATE(DISTINCTCOUNT(MyData[ACCOUNT_NO]),PARALLELPERIOD(MyData[DATA_DATE],-1,MONTH))
When my slicer has DATA_DATE = 7/10/2018 the measure correctly returns 1 (a count of 7 for July 10 minus a count of 6 for June 30). When I use a slicer with REPORTING_DATE = Most Recent I get 7, because DISTINCTCOUNT(MyData[ACCOUNT_NO]) returns 7, which is correct, but CALCULATE(DISTINCTCOUNT(MyData[ACCOUNT_NO]),PARALLELPERIOD(MyData[DATA_DATE],-1,MONTH)) returns (Blank). It looks like PARALLELPERIOD(MyData[DATA_DATE],-1,MONTH) returns the same value, 6/30/2018, regardless of the slicer being used, so I'm stumped as to the issue.

The reason this does not work is that the PARALLELPERIOD filter in your CALCULATE is only replacing the context filter for the [DATA_DATE] column but still has the slicer filtering in effect since that is on a different column. If you select Most Recent on the [REPORTING_DATE] slicer, your measure will try to find the distinct count where [REPORTING_DATE] is Most Recent and also [DATA_DATE] is in the previous month. Since no such rows exist, it returns a blank.
To fix this, you can tell the measure to ignore the direct filtering from the [REPORTING_DATE] slicer and only use the filtering on the [DATA_DATE] column (which is filtered indirectly by the slicer).
Change in Count (Month) =
DISTINCTCOUNT(MyData[ACCOUNT_NO]) -
    CALCULATE(
        DISTINCTCOUNT(MyData[ACCOUNT_NO]),
        ALL(MyData[REPORTING_DATE]),
        PARALLELPERIOD(MyData[DATA_DATE], -1, MONTH))
The reason that this works with the slicer on [DATA_DATE] is that the PARALLELPERIOD filter replaces the slicer filtering for that column. When you slice on [REPORTING_DATE], the slicer filtering does not get replaced since you aren't referring to that column inside a CALCULATE filter argument.
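
A toy Python model of the two filters may make this easier to see. It is only an analogy for filter context, not how the DAX engine works; `distinct_accounts` is a made-up helper, and `previous_month` is a hand-written stand-in for the dates PARALLELPERIOD would return for this sample. Note that where this sketch gets 0, DAX shows (Blank).

```python
from datetime import date

# (DATA_DATE, REPORTING_DATE, ACCOUNT_NO) rows from the simplified sample.
rows = [(date(2018, 7, 10), "Most Recent", n) for n in range(1001, 1008)]
rows += [(date(2018, 6, 30), "6/30/2018", n) for n in range(1001, 1007)]
rows += [(date(2018, 5, 31), "5/31/2018", n) for n in range(1001, 1005)]


def distinct_accounts(rows, data_dates=None, reporting_date=None):
    """Count distinct accounts; a None argument means no filter on that column."""
    return len({
        account
        for data_date, reporting, account in rows
        if (data_dates is None or data_date in data_dates)
        and (reporting_date is None or reporting == reporting_date)
    })


# Stand-in for the previous-month dates present in the data.
previous_month = {date(2018, 6, 30)}

# Slicer on REPORTING_DATE = "Most Recent".
current = distinct_accounts(rows, reporting_date="Most Recent")

# Original measure: the slicer filter is still active next to the date filter.
broken_prior = distinct_accounts(
    rows, data_dates=previous_month, reporting_date="Most Recent"
)

# Fixed measure: the REPORTING_DATE filter is removed, as ALL(...) does.
fixed_prior = distinct_accounts(rows, data_dates=previous_month)

print(current - broken_prior, current - fixed_prior)
```

No row is both "Most Recent" and dated in June, so the prior-period count is empty until the REPORTING_DATE filter is lifted.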
If this still doesn't make sense, I recommend some reading on how the CALCULATE function works. There's an entire chapter devoted to it in The Definitive Guide to DAX and there's a handful of websites/blogs that have some decent reading too.

SAS PROC EXPAND procedure

I am trying to calculate a 3-month moving average of the following data by product and by country (I only have two countries here). Is there a way to do so?
Here is the sales table I have:
Date Product Country Sales
201101 Sofa US 100
201102 Sofa US 200
201103 Sofa US 250
201104 Sofa US 300
201101 Sofa CA 250
201102 Sofa CA 300
201103 Sofa CA 250
201104 Sofa CA 300
201101 Chair US 300
201102 Chair US 300
201103 Chair US 300
201104 Chair US 300
201101 Chair CA 300
201102 Chair CA 300
201103 Chair CA 300
201104 Chair CA 300
I tried something like the following, but the moving average is only calculated by country. Is there a way to have it calculated by country and by product? Any ideas will be appreciated. Thanks :)
PROC SORT DATA=Sales;
BY Country Product Date;
RUN;
PROC EXPAND DATA=Sales out =ma;
By Country Product;
CONVERT Value=Value_ma/transformin=(setmiss 0) transformout=(movave 3);
run;

After my comment I tested a bit. I think concatenating product and country gives the result you are looking for (I hope I have not misunderstood something):
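/* Sample data; Date is read as a character period label (YYYYMM). */
data have;
  input Date $ Product $ Country $ Sales;
  datalines;
201101 Sofa US 100
201102 Sofa US 200
201103 Sofa US 250
201104 Sofa US 300
201101 Sofa CA 250
201102 Sofa CA 300
201103 Sofa CA 250
201104 Sofa CA 300
201101 Chair US 300
201102 Chair US 300
201103 Chair US 300
201104 Chair US 300
201101 Chair CA 300
201102 Chair CA 300
201103 Chair CA 300
201104 Chair CA 300
;
run;
/* Build one combined BY key from product and country. */
data have;
  set have;
  copr = catx("_", Product, Country);
run;
/* Sort by the combined key so PROC EXPAND can process each group in turn. */
proc sort data=have;
  by copr Date;
run;
/* 3-period backward moving average within each product/country group. */
proc expand data=have out=ma;
  by copr;
  convert sales=average / transformin=(setmiss 0) transformout=(movave 3);
run;
/* TIME is the observation index PROC EXPAND adds when no ID statement is given. */
proc print data=ma;
  var date product country average;
  where time > 1;
run;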
result:
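
For readers without SAS at hand, the same per-group idea can be sketched in Python. This is an illustration of a trailing 3-period average per product and country, not a reproduction of the PROC EXPAND output: it assumes the first periods of each group are averaged over however many values exist so far, and `moving_average_by_group` is a made-up name. Only the Sofa rows are used to keep it short.

```python
from collections import defaultdict

# (period, product, country, sales) rows from the question's sample.
sales = [
    ("201101", "Sofa", "US", 100), ("201102", "Sofa", "US", 200),
    ("201103", "Sofa", "US", 250), ("201104", "Sofa", "US", 300),
    ("201101", "Sofa", "CA", 250), ("201102", "Sofa", "CA", 300),
    ("201103", "Sofa", "CA", 250), ("201104", "Sofa", "CA", 300),
]


def moving_average_by_group(rows, window=3):
    """Trailing moving average computed separately per (product, country)."""
    history = defaultdict(list)
    result = []
    # Sort by product, country, then period so each group is read in order.
    for period, product, country, value in sorted(
        rows, key=lambda row: (row[1], row[2], row[0])
    ):
        values = history[(product, country)]
        values.append(value)
        recent = values[-window:]  # at most the last `window` values
        result.append((period, product, country, sum(recent) / len(recent)))
    return result


averages = moving_average_by_group(sales)
for row in averages:
    print(row)
```

Keying the running history on the (product, country) pair plays the same role as the concatenated copr variable in the SAS code: values from one group never leak into another group's window.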
