I have a ‘Project’ dimension that has a discount value (in percentage). My problem is that I cannot use a column from a dimension directly in a measure…
I need to compute Discount × Revenue. (Usually my dimension columns contain only text values, but in this case it's numeric.)
Is it OK to do:
Measure1 := AVERAGE(Project[Discount])
Measure2 := [Measure1] * Revenue
Or what is the correct way to achieve this? (My revenue is in my FactRevenue table.)
First, the fact that the discount depends on the project may be relevant for creating the revenue records, but perhaps shouldn't matter for reporting on them. Does it really matter why the discount was applied to the revenue fact? Could the source of the discount or the rules for discounting change in the future?
So you may want to use the project's discount data during ETL to populate a column on the fact table recording the applicable discount.
Second, there's no problem using dimension data in a measure. Something like
DiscountedRevenue = SUMX ( FactRevenue, FactRevenue[Revenue] * RELATED ( Project[Discount] ) )
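If you instead follow the ETL suggestion and persist the discount on the fact table, a calculated column works too. A sketch, assuming an active relationship from FactRevenue to Project (the column name here is illustrative):

```dax
-- Calculated column on FactRevenue: copies the discount from the
-- related Project row, so later measures no longer need RELATED().
Discount = RELATED ( Project[Discount] )

-- Measure over the persisted column:
DiscountedRevenue = SUMX ( FactRevenue, FactRevenue[Revenue] * FactRevenue[Discount] )
```

Either way the dimension's numeric column is usable; persisting it just makes the fact table self-contained if the discounting rules change later.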
I am trying to create a measure which will calculate the Total Project Revenue
and I have 2 different project types. For each type there is a different calculation:
for Hourly project the calculation should be: Income * BillHours
for Retainer project the calculation should be: Income*TotalWorkinghours
I wrote the below DAX:
Total project revenue = IF(max(DimProjects[ProjectType])="Hours",
max(FactWorkingHours[Income])[BillHours],max(FactWorkingHours[Income])*
[Total Working Hours])
The rows are calculated correctly, but the total in the table is wrong.
What should I fix in the DAX so that the total of all rows is correct as well?
The total revenue should be 126,403.33.
Thank you in advance.
You can find the table with the results here:
It's hard to say exactly what your measure is doing because, as written here, it is not a valid measure. I pasted it into DAX Formatter, which I would recommend for formatting code before pasting it here into code blocks, and the measure was invalid. It would also be helpful to post the other measures this measure references, e.g. [BillHours] and [Total Working Hours].
That being said, I think I can tell roughly what's going on. Your total is probably wrong because the filter context at the total level is being based on the condition where:
MAX ( DimProjects[ProjectType] ) = "Retainer" (or some other value not in your shared snippet)
That is because when you take the MAX of a string column, the value that sorts last alphabetically is returned. Therefore, "Retainer" > "Hours". So at the total level, your table is outputting (more than likely; I can't be certain without more information) the false branch of your measure:
MAX ( FactWorkingHours[Income] ) * [Total Working Hours]
There is a better way to handle your intended outcome. IF statements, in the way you are attempting to use them, are rarely used in calculated measures. You may be better off with a calculated column that avoids the MAX functions entirely. Again, I can't give an exact code recommendation without more context. Hope that sends you in the right direction!
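As a sketch of that calculated-column idea (the column names here are assumptions, since the full model wasn't shared), a row-level column needs no MAX because it evaluates per row:

```dax
-- Calculated column on FactWorkingHours; assumes a relationship
-- to DimProjects and these (hypothetical) column names.
Project Revenue =
IF (
    RELATED ( DimProjects[ProjectType] ) = "Hours",
    FactWorkingHours[Income] * FactWorkingHours[BillHours],
    FactWorkingHours[Income] * FactWorkingHours[TotalWorkingHours]
)
```

A plain SUM over this column would then aggregate correctly at the total level as well.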
I have the following Power BI table example for an operating expense report that uses a slicer to filter the first column named "Actual". This is to see the operating expenses for one month compared to the budget figures for the year. It also compares the year-to-date and annual figures. How can I create dynamic columns that change based on the slicer selection? These additional columns are not shown in the pic below but included in the last pic. The Budget column below was just created as an example to show what it should look like.
I set up a star schema with several tables, shown below. There's only one expense fact table, and the slicer only works for the first column as previously stated, but I need all the other columns to use different parameters and adjust based on what's selected in the slicer. The last image is an overview of the info and the parameters for each column. I tried creating new columns with measures for the budget to see if I could get that going, but can't figure out how to make it adjust with the slicer selection.
I'm not sure if I should be using separate queries for each column, or whether this can be done using the one expense table. Hope this isn't too confusing. Please let me know if more info is needed.
If I understood what you wanted correctly, I think I solved your problem.
I was able to create the following:
I did not use all the values, since I did not want to type everything; if you provide some test data it is easier to replicate your dashboard.
This matrix (not a table) allows you to filter by Date (if you so desire; you can always show all dates in the matrix), Book, and AccountTree.
The way this is done is by putting the address column in the ROWS of the matrix, the Date column in the COLUMNS of the matrix, and your values (actual, budget, variance) in the VALUES of the matrix.
For the dates I used days, since that was easier to type. You can always use weeks, months, quarters, or years.
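For the values themselves, the actual/budget/variance trio can be defined as explicit measures. A sketch, assuming the expense fact table is named Expense with Actual and Budget columns (all names here are illustrative, since the exact schema wasn't shared):

```dax
Actual Amount = SUM ( Expense[Actual] )
Budget Amount = SUM ( Expense[Budget] )
Variance = [Actual Amount] - [Budget Amount]
```

Because these are measures, they recompute under whatever filter the slicer applies, which is what makes the columns "dynamic".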
For this to work you have to create the following relationships:
Hope this helps.
If not, please provide test data so it is easier to try and solve your problem.
I'm trying to understand the data required for Amazon Forecast to create a demand forecast using my historical sales data. I've read through the documentation many times. I am still confused as to how the "in_stock" field in a related time series is supposed to function. Here is a link to the description of the "in_stock" field I am referring to:
https://docs.aws.amazon.com/forecast/latest/dg/retail-domain.html#related-time-series-type-retail-domain
It says:
The following fields are optional and might be useful in improving
forecast results:
in_stock (integer; 1=true, 0=false) – A flag that specifies whether the item is in stock.
What exactly is this field meant to indicate? Is it only meant to be set to 0 when the number of sales is 0? In other words, if the number of sales for a given day is 0, and in_stock is set to 0, then the system knows the sales were 0 because the product was not available, not because there was no demand.
What if a product goes out of stock halfway through the day? Would that be a case where you might have in_stock = 0 but also have sales on that day?
I am also confused about how this in_stock field comes into play, given another piece of their documentation:
https://d1.awsstatic.com/whitepapers/time-series-forecasting-principles-amazon-forecast.pdf?did=wp_card&trk=wp_card
On page 10, they say:
In the retail case study, the information that a retailer sold zero
units of an available item differs from the information that zero
units of an unavailable item are sold either in the periods outside
its existence, e.g. before its launch or after its deprecation, or in
periods within its existence, e.g. partially out of stock, or when
there was no sales data recorded for this time range. The default zero
filling is applicable in this former case. In the latter, even though
the corresponding target value is typically zero, there is additional
information conveyed in the value being marked as missing. You must
preserve the information that there was missing data and not discard
this information (see the following example for an illustration why
keeping the information is important). To encode a value that does not
represent zero sales of an available product as truly missing, Amazon
Forecast allows the user to specify the filling type for middle fill
and back fill in the FeaturizationMethodParameters key of the
FeaturizationConfig parameter of the create_predictor API. To mark a
value as truly missing, the fill type for these parameters should be
set to NaN. Unlike for zero filling, the values encoded with NaN are
treated as truly missing, and not used in the metrics evaluation
component.
This seems to indicate that when a product is out of stock and there are no sales, those rows should be marked as NaN, which effectively removes those rows from the dataset.
I suppose my question boils down to:
What is the difference between a day with 0 sales and in_stock = 0 vs a day with sales = NaN, which effectively removes that day from the dataset?
What do you do when a product goes out of stock partway through the day? Can in_stock = 0 and still have sales data for a given day?
The difference is that with in_stock your model includes more information, allowing you to make projections based on in_stock values in the future. It's better to use in_stock than sales = NaN, as long as you have the historical data.
If you need in_stock to represent a partial day, your time series will need to be more granular. If you can't achieve more granularity, you can still specify sales > 0 with in_stock = 0; all this does is train the model, for that item, that a certain amount of sales are made even when it is marked out of stock, which is why a more granular time series is better.
For this use case the recommendation would be to specify sales = 'NaN' in the TARGET_TIME_SERIES when a particular product is unavailable / out-of-stock. This is as per the approach outlined in: https://d1.awsstatic.com/whitepapers/time-series-forecasting-principles-amazon-forecast.pdf. The example in Figure 6 (on page 11) elaborates on this further. For cases where a product goes out of stock part way through the day, the sales value should be specified.
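As an illustration of that encoding (the item id and dates below are made up), the fully out-of-stock day is marked NaN in the TARGET_TIME_SERIES, while the partially stocked day keeps its actual sales:

```
item_id,timestamp,demand
item_1,2021-06-01,12
item_1,2021-06-02,NaN
item_1,2021-06-03,5
```

Here 2021-06-02 was out of stock all day (truly missing, not zero demand), while on 2021-06-03 the product went out of stock mid-day, so its actual sales are recorded.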
It is also worth noting that, at the time of this writing, Amazon Forecast requires that the last timestamp for every item in the RELATED_TIME_SERIES dataset be on or after the last timestamp in the target time series plus the user-designated forecast window (called the forecast horizon). See: https://docs.aws.amazon.com/forecast/latest/dg/related-time-series-datasets.html for additional details.
Hope this helps.
Let's say you had a table.
OrderNumber, OrderDate, City, & Sales
The Sales field is given to you. No need to calculate it.
When you bring in this data into Power BI, say you want to analyze Sales by City (in a table format).
You can just straight away drag the two fields into the table.
No need to create a measure.
So now, suppose you created a measure, though.
Total Sales = Sum(Sales).
Is there any advantage to it, in this scenario?
Is it more efficient to use: City, Total Sales
than it is to use: City, Sales
Both display the same information.
When you drag the field into the table, what Power BI does is create an implicit measure automatically based on its best guess of what aggregation (e.g. sum, max, count) it thinks you want.
So in this case, using an explicitly defined measure or an implicitly generated one should perform the same, since Power BI is doing the same thing in the background, i.e., SUM(TableName[Sales]).
It's generally considered best practice to use explicit measures.
You may be interested in this video discussing the differences.
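For reference, an explicit version of the implicit aggregation above would look like this (assuming the table is simply named Sales):

```dax
-- Explicit measure: defined once, reused everywhere. Renaming the
-- underlying column later only requires updating this definition,
-- not every visual that uses it.
Total Sales = SUM ( Sales[Sales] )
```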
I was told that it is good to always create explicit measures, and that measures are more efficient. Whether that's right or wrong, I don't know, but as a matter of policy it is a good idea, since measures do protect you from column name changes. In general, I think you can make it a rule of thumb to always explicitly define any measure you want to report on. But the answer above could also be correct; Stack Exchange doesn't let you choose multiple answers.
I am working on a report that has data by month. I have created a measure that will calculate a cost per unit which divides the sum of dollars by the sum of production volume for the selected month(s):
Wtd Avg = SUM('GLData - Excel'[Amount])/SUM('GLData - Excel'[Production])
This works well and gives me the weighted average that I need per report category regardless of if I have one or multiple months selected. This actual and budget data is displayed below:
If you take time to total the actual costs you get $3.180. Where I am running into trouble is a measure to sum up to that total for a visual (This visual does not total sadly). Basically I need to sum the aggregated values that we see above. If I use the Wtd Avg measure I get the average for the total data set, or .53. I have attempted another measure, but am not coming up with the correct answer:
Total Per Unit Cost = sumX('GLData - Excel','GLData - Excel'[Wtd Avg])/DISTINCTCOUNT('GLData - Excel'[Date])
We see here that I return $3.186. It is close, but it is not aggregating the right way to get exactly $3.180:
My Total Per Unit Cost formula is off. Really, I am simply interested in a measure that sums the post-aggregated Wtd Avg values we see in the first graph and totals to $3.180 in this example.
Here is my data table:
As you probably know already, this is happening because measures are dynamic - if you are not grouping by a dimension, they will compute based on the overall table. What you want to do is to force a grouping on your categories, and then compute the sum of the measure for each category.
There are 2 ways to do this. One way is to create a new table in Power BI (Modeling tab -> New Table), and then use a SUMMARIZE() calculation similar to this one to define that table:
SUMMARIZE ( 'GLData - Excel', [Category], [Month], [Actual/Budget], "Wtd Avg", [Wtd Avg] )
Unfortunately I do not know your exact column names, so you will need to adjust this calculation to your context. Once your new table is created, you can use the values from that table to create your aggregate visual - in order to get the slicers to work, you may need to join this new table to your original table through the "Manage Relationships" option.
The second way to do this is via the same calculation, but without having to create a new table. This may be less of a hassle. Create a measure like this:
SUMX (
    SUMMARIZE ( 'GLData - Excel', [Category], [Month], [Actual/Budget], "Wtd Avg", [Wtd Avg] ),
    [Wtd Avg]
)
If this does not solve your issue, go ahead and show me a screenshot of your table and I may be able to help further.