Django ORM: Sum value and then create extra field showing rolling average - django

Is it possible, using Django's ORM to create a Sum based on a Date column and then add an extra field with a rolling average.
Let's say I have a table like this, called "Sales":
|---------------------|------------------|--------------|
| Date | Category | Value |
|---------------------|------------------|--------------|
| 2020-04-01 | 1 | 55.0 |
|---------------------|------------------|--------------|
| 2020-04-01 | 2 | 30.0 |
|---------------------|------------------|--------------|
| 2020-04-02 | 1 | 25.0 |
|---------------------|------------------|--------------|
| 2020-04-02 | 2 | 85.0 |
|---------------------|------------------|--------------|
| 2020-04-03 | 1 | 60.0 |
|---------------------|------------------|--------------|
| 2020-04-03 | 2 | 30.0 |
|---------------------|------------------|--------------|
I would like to group it by Date (column "Category" is unimportant) and add the Sum of the values for the date. Then I would like to add a rolling Average for the last 7 days.
I tried this:
days = (
Sales.objects.values('date').annotate(sum_for_date=Sum('value'))
).annotate(
rolling_avg=Window(
expression=Avg('sum_for_date'),
frame=RowRange(start=-7,end=0),
order_by=F('date').asc(),
)
)
.order_by('date')
This throws this error:
django.core.exceptions.FieldError: Cannot compute Avg('sum_for_date'): 'sum_for_date' is an aggregate
Any ideas?

Related

How do I repeat row labels in a matrix?

I have data showing me the dates grouped like this:
For security reasons, I had to remove the Customer Description detail, due to confidentiality.
How do I repeat the date column the same way you repeat the Row Labels in an Excel Pivot?
I've looked, but couldn't find a solution to this - this option should be available.
EDIT
When you have the following source data in Excel:
Date | Customer | Item Description | Qty Out | Unit Price | Sales
--------------------------------------------------------------------------------------------------------------------------------------------
14/08/2020 | Customer 1 | Item 11 | 4.00 | 65.00 | 260.00
14/08/2020 | Customer 2 | Item 12 | 56.00 | 12.00 | 672.00
14/08/2020 | Customer 3 | Item 13 | 64.00 | 35.00 | 2,240.00
14/08/2020 | Customer 4 | Item 14 | 29.00 | 65.00 | 1,885.00
15/08/2020 | Customer 2 | Item 15 | 746.00 | 12.00 | 8,952.00
15/08/2020 | Customer 3 | Item 16 | 14.00 | 75.00 | 1,050.00
15/08/2020 | Customer 4 | Item 17 | 45.00 | 741.00 | 33,345.00
15/08/2020 | Customer 5 | Item 18 | 456.00 | 125.00 | 57,000.00
15/08/2020 | Customer 6 | Item 19 | 925.00 | 17.00 | 15,725.00
16/08/2020 | Customer 4 | Item 20 | 6.00 | 532.00 | 3,192.00
16/08/2020 | Customer 5 | Item 21 | 56.00 | 94.00 | 5,264.00
16/08/2020 | Customer 6 | Item 22 | 546.00 | 37.00 | 20,202.00
You then pivot this data using Microsoft Excel, where you get the following:
You then choose the option to Repeat Item Labels as can be seen below:
After selecting this, you get my expected results I require in Power BI:
Is there not a function available like this in Power BI?
Just adding this for your reference as a work around. Check this below image with a custom column created in the Power Query Editor-
date_customer = Date.ToText([Date]) &" : "& [Customer]
Then added both Date and date_customer in the Matrix row level. The output is as below- (using your sample data)
ANOTHER OPTION Another option is to add Date and Customer in the Matrix row and the output is will be as below- (using your sample data)
This is also a meaningful output as date are showing as a group header. But in case of requirement of having redundant date to show, you can consider the first option.

Sum where version is highest by another variable (no max version in the whole data)

I'm struggling having this measure to work.
I would like to have a measure that will sum the Value only for the max version of each house.
So following this example table:
|---------------------|------------------|------------------|
| House_Id | Version_Id | Value |
|---------------------|------------------|------------------|
| 1 | 1 | 1000 |
|---------------------|------------------|------------------|
| 1 | 2 | 2000 |
|---------------------|------------------|------------------|
| 2 | 1 | 3000 |
|---------------------|------------------|------------------|
| 3 | 1 | 5000 |
|---------------------|------------------|------------------|
The result of this measure should be: 10.000 because the house_id 1 version 1 is ignored as there's another version higher.
By House_id the result should be:
|---------------------|------------------|
| House_Id | Value |
|---------------------|------------------|
| 1 | 2000 |
|---------------------|------------------|
| 1 | 3000 |
|---------------------|------------------|
| 2 | 5000 |
|---------------------|------------------|
Can anyone help me?
EDIT:
Given the correct answer #RADO gave, now I want to further enhance this measure:
Now, my main Data table in reality has more columns.
What if I want to add this measure to a table visual that splits the measure by another column from (or related to) the Data table.
For example (simplified data table):
|---------------------|------------------|------------------|------------------|
| House_Id | Version_Id | Color_Id | Value |
|---------------------|------------------|------------------|------------------|
| 1 | 1 | 1 (Green) | 1000 |
|---------------------|------------------|------------------|------------------|
| 1 | 2 | 2 (Red) | 2000 |
|---------------------|------------------|------------------|------------------|
| 2 | 1 | 1 (Green) | 3000 |
|---------------------|------------------|------------------|------------------|
| 3 | 1 | 1 (Green) | 5000 |
|---------------------|------------------|------------------|------------------|
There's a Color_Id in the main table that is connected to a Color table.
Then I add a visual table with ColorName (from the ColorTable) and the measure (ColorId 1 is Green, 2 is Red).
With the given answer the result is wrong when filtered by ColorName. Although the Total row is indeed correct:
|---------------------|------------------|
| ColorName | Value |
|---------------------|------------------|
| Green | 9000 |
|---------------------|------------------|
| Red | 2000 |
|---------------------|------------------|
| Total | 10000 |
|---------------------|------------------|
This result is wrong per ColorName as 9000 + 2000 is 11000 and not 10000.
The measure should ignore the rows with an old version. In the example before this is the row for House_Id 1 and Color_Id Green because the version is old (there's a newer version for that House_Id).
So:
How can I address this situation?
What If I want to filter by another column from (or related to) the Data table such as Location_Id? It is posible to define the measure in such a way that could work for any given number splits for columns in the main Data table?
I use "Data" as a name of your table.
Sum of Latest Values =
VAR Latest_Versions =
SUMMARIZE ( Data, Data[House_id], "Latest_Version", MAX ( Data[Version_Id] ) )
VAR Latest_Values =
TREATAS ( Latest_Versions, Data[House_id], Data[Version_Id] )
VAR Result =
CALCULATE ( SUM ( Data[Value] ), Latest_Values )
RETURN Result
Measure output:
How it works:
We calculate a virtual table of house_ids and their max versions, and store it in a variable "Latest_Versions"
We use the table from the first step to filter data for the latest versions only, and establish proper data lineage
(https://www.sqlbi.com/articles/understanding-data-lineage-in-dax/)
We calculate the sum of latest values by filtering data for the latest values only.
You can learn more about this pattern here:
https://www.sqlbi.com/articles/propagate-filters-using-treatas-in-dax/

PowerBI Sort Columns in Matrix Visual

I have a Matrix visual in Microsoft PowerBI with Australian 'States' as rows and 'Months Ago' as columns.
By default the Matrix shows my columns from 0 months ago to 12. I would like it to show from 12 months ago on the left to 0 months ago on the right.
+-------------------+-----------------------------+-------+
| | Months Ago | |
+-------------------+-----------------------------+-------+
| State | 0 | 1 | 2 | 3 | 4 | 5 | Total |
+-------------------+----+----+----+----+----+----+-------+
| Queensland | 10 | 10 | 10 | 10 | 10 | 10 | 60 |
+-------------------+----+----+----+----+----+----+-------+
| New South Wales | | | | | | | |
+-------------------+----+----+----+----+----+----+-------+
| Victoria | | | | | | | |
+-------------------+----+----+----+----+----+----+-------+
| South Australia | | | | | | | |
+-------------------+----+----+----+----+----+----+-------+
| Western Australia | | | | | | | |
+-------------------+----+----+----+----+----+----+-------+
Currently I am only given the option to sort by the value type fields (ie revenue etc).
Is there any option to sort/order the Column Headers?
I don't think there is an option for you to sort column headers directly.
However, you can change the default sort order for the Months Ago column so that it will be reflected in general.
You can add a custom column MonthSrt = 12 - [Months Ago] in query editor:
(It won't work in DAX because of a known issue)
Then you can select the Months Ago column and sort it by MonthSrt:
The custom sort will be applied when you use the Months Ago column in visuals:
You can also make groups (1 to 1 items) al give them al logical number:
The order will change automaticly in the matrix
The following solution worked for me to display the dates in descending order in a matrix:
how to sort column dates in descending order of matrix in power bi

Total/Sum working incorrectly in Power Bi

I have created a Report in which I have created some measures like -
X =
CALCULATE (
DISTINCTCOUNT ( ActivityNew[Name] ),
FILTER (
ActivityNew,
ActivityNew[Resource Owner Name] = MAX ( 'Resource Owners'[Manager Name] )
&& ActivityNew[LocationId] = 2
)
)
When I use this measure in table then the column values dont add up. For eg. if the value of this measure is 2,2,2,2,2 then Total in table should be 10. but it is showing 2.
I have noticed that wherever I have used this MAX(), the measure values are not adding up.
Why this is happening and Is their any solution for this?
You are using DISTINCTCOUNT which is in general not aggregatable.
Say you have the following table Sales:
+----------+------+-------+
| Customer | Item | Count |
+----------+------+-------+
| Albert | Coke | 3 |
| Bertram | Beer | 5 |
| Bertram | Coke | 2 |
| Charlie | Beer | 1 |
+----------+------+-------+
If you wanted to count the number of distinct items each customer has bought, you would create a new measure with the formula:
[Distinct Items] := DISTINCTCOUNT(Sales[Item])
If you include the [Customer] column and your [Distinct Items] measure in a report, it would output the following:
+----------+----------------+
| Customer | Distinct Items |
+----------+----------------+
| Albert | 1 |
| Bertram | 2 |
| Charlie | 1 |
+----------+----------------+
| Total | 2 |
+----------+----------------+
As you can see, this does not sum up, as the context of the total row, is the entire table, not filtered by any particular customer. To change this behaviour, you have to explicitly tell your measure that it should sum the values derived at the customer level. To do this, use the SUMX function. In my example, the measure formula should be changed like this:
[Distinct Items] := SUMX(VALUES(Sales[Customer]), DISTINCTCOUNT(Sales[Item]))
As I only want to sum over unique customers I use VALUES(Sales[Customer]). If you want to sum over every row in the table simply do: SUMX(<table name>, <expression>).
With this change, the output in the above example would be:
+----------+----------------+
| Customer | Distinct Items |
+----------+----------------+
| Albert | 1 |
| Bertram | 2 |
| Charlie | 1 |
+----------+----------------+
| Total | 4 |
+----------+----------------+

How to run raw query with a model with dynamic fields in Django 1.9?

I have a complex result that requires writing raw sql queries.
See https://stackoverflow.com/a/38548462/80353
The expected result is a table showing several columns.
The first column header is simply Product and the other column headers are store names.
The values are simply the product names and the aggregated sales values of the product in these stores.
Which stores will be shown is entirely dynamic. Maximum should be 9 stores.
The same in text format:
Store table
------------------------------
| id | code | address |
|-----|------|---------------|
| 1 | S1 | Kings Row |
| 2 | S2 | Queens Street |
| 3 | S3 | Jacks Place |
| 4 | S4 | Diamonds Alley|
| 5 | S5 | Hearts Road |
------------------------------
Product table
------------------------------
| id | code | name |
|-----|------|---------------|
| 1 | P1 | Saucer 12 |
| 2 | P2 | Plate 15 |
| 3 | P3 | Saucer 13 |
| 4 | P4 | Saucer 14 |
| 5 | P5 | Plate 16 |
| and many more .... |
|1000 |P1000 | Bowl 25 |
|----------------------------|
Sales table
----------------------------------------
| id | product_id | store_id | amount |
|-----|------------|----------|--------|
| 1 | 1 | 1 |7.05 |
| 2 | 1 | 2 |9.00 |
| 3 | 2 | 3 |1.00 |
| 4 | 2 | 3 |1.00 |
| 5 | 2 | 5 |1.00 |
| and many more .... |
| 1000| 20 | 4 |1.00 |
|--------------------------------------|
The relationships are:
Sales belongs to Store
Sales belongs to Product
Store has many Sales
Product has many Sales
What I want to achieve
I want to display by pagination in the following manner:
Given the stores S1-S3:
-------------------------
| product | S1 | S2 | S3 |
|---------|----|----|----|
|Saucer 12|7.05|9 | 0 |
|Plate 15 |0 |0 | 2 |
| and many more .... |
|------------------------|
For more details of the schema, check the question in How to get back aggregate values across 2 dimensions using Python Cubes?
My question
The schema is not super important to my question which is:
Since I am going to write a complex raw query, is there a way to map the query result to a model where the fields are dynamic?
I found documentation about how to execute raw queries in Django and how to execute raw queries to existing models with fixed fields and matching table.
My question is is it possible to do that for a model that has no matching table and dynamic fields?
If so, how?
Or if I choose to use materialised view in postgresql, how do I match it with a model class?