Calculate quartiles with filters - Power BI

I am trying to write a DAX expression in Power BI that calculates the quartiles of a table subject to some filters. I wrote a generic formula, but I cannot work out how to apply the filters to it.
The table structure is:
campaign     management    group        id_oper   name_oper   nom_sup      tickets
convergente  bo tecnico    convergente  0000000   operador1   supervisor1  500
convergente  bo tecnico    convergente  11111111  operador2   supervisor1  200
convergente  bo tecnico    convergente  22222222  operador3   supervisor1  80
convergente  bo tecnico    convergente  33333333  operador4   supervisor1  300
despacho     bo despacho   averias      44444444  operador5   supervisor2  1500
despacho     bo despacho   averias      55555555  operador6   supervisor2  500
despacho     bo despacho   averias      66666666  operador7   supervisor2  30
despacho     bo despacho   averias      77777777  operador8   supervisor2  1000
multiskill   bo provision  multiskill   88888888  operador9   supervisor3  20
multiskill   bo provision  multiskill   99999999  operador10  supervisor3  5
multiskill   bo provision  multiskill   12345678  operador11  supervisor3  80
multiskill   bo provision  multiskill   87654321  operador12  supervisor3  3
And the DAX formula I am using is this:
Quantile =
IF(TBL_CUARTIL[tickets] <= PERCENTILE.EXC(TBL_CUARTIL[tickets], 0.25), "Q4",
    IF(AND(TBL_CUARTIL[tickets] > PERCENTILE.EXC(TBL_CUARTIL[tickets], 0.25),
           TBL_CUARTIL[tickets] <= PERCENTILE.EXC(TBL_CUARTIL[tickets], 0.5)), "Q3",
        IF(AND(TBL_CUARTIL[tickets] > PERCENTILE.EXC(TBL_CUARTIL[tickets], 0.5),
               TBL_CUARTIL[tickets] <= PERCENTILE.EXC(TBL_CUARTIL[tickets], 0.75)), "Q2",
            IF(TBL_CUARTIL[tickets] > PERCENTILE.EXC(TBL_CUARTIL[tickets], 0.75), "Q1"))))
The formula assigns a quartile to each operator and works without problems, but it computes the quartiles over all records. How can I calculate the same thing with filters applied, for example a filter on the campaign column?
I would appreciate your help with this problem, and I hope I have explained it well. Thank you very much.

Use a new measure:
Quantile =
VAR cur_val = MAX(TBL_CUARTIL[tickets])  // tickets of the operator in the current row of the visual
// ALLSELECTED keeps the slicer/filter selection, so the cut points use only the filtered rows
VAR p25 = PERCENTILEX.EXC(ALLSELECTED(TBL_CUARTIL), TBL_CUARTIL[tickets], 0.25)
VAR p50 = PERCENTILEX.EXC(ALLSELECTED(TBL_CUARTIL), TBL_CUARTIL[tickets], 0.5)
VAR p75 = PERCENTILEX.EXC(ALLSELECTED(TBL_CUARTIL), TBL_CUARTIL[tickets], 0.75)
RETURN
    SWITCH(TRUE(),
        cur_val <= p25, "Q4",
        cur_val <= p50, "Q3",
        cur_val <= p75, "Q2",
        "Q1")
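The idea behind the measure (compute the cut points only over the rows that survive the filter, then label each operator) can also be checked outside Power BI. The following is a minimal Python sketch, not DAX: the function name quartile_labels and the tuple layout are assumptions made for illustration, and it relies on statistics.quantiles with method="exclusive", which as far as I can tell interpolates at position p * (n + 1) in the same way as PERCENTILE.EXC.

```python
from statistics import quantiles

# Sample rows from the question: (campaign, operator, tickets).
rows = [
    ("convergente", "operador1", 500),
    ("convergente", "operador2", 200),
    ("convergente", "operador3", 80),
    ("convergente", "operador4", 300),
    ("despacho", "operador5", 1500),
    ("despacho", "operador6", 500),
    ("despacho", "operador7", 30),
    ("despacho", "operador8", 1000),
]


def quartile_labels(rows, campaign=None):
    """Label each operator Q1..Q4 using only the rows that pass the filter."""
    selected = [r for r in rows if campaign is None or r[0] == campaign]
    # The three cut points are computed over the filtered rows only.
    p25, p50, p75 = quantiles([r[2] for r in selected], n=4, method="exclusive")
    labels = {}
    for _, operator, tickets in selected:
        if tickets <= p25:
            labels[operator] = "Q4"
        elif tickets <= p50:
            labels[operator] = "Q3"
        elif tickets <= p75:
            labels[operator] = "Q2"
        else:
            labels[operator] = "Q1"
    return labels


print(quartile_labels(rows))                          # quartiles over all rows
print(quartile_labels(rows, campaign="convergente"))  # quartiles within one campaign
```

With these sample rows, operador1 (500 tickets) lands in Q2 when all rows are considered but in Q1 when only the convergente campaign is selected, which is the filter-dependent behaviour the question asks for.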

RegEx for matching Germany or Austria or CH Postcodes

This concerns my site, an ad portal with geodata for three countries installed in the system: Germany, Switzerland and Austria.
When I search for an ad in Germany, everything works correctly. I search for postal code 68259 with a radius of 30 km, and the results show all ads from 68259 Mannheim and within the 30 km radius.
Problem: when I search in Switzerland or Austria, for example for postal code 6000 Lucerne 1 PF with a radius of 30 km, the results are wrong. I also get ads from Munich or Frankfurt, which are 300-500 km away. I think the mistake is somewhere in the regex that checks the postal code. Any advice on what could be wrong?
// Germany Postcode
preg_match('/\b((?:0[1-46-9]\d{3})|(?:[1-357-9]\d{4})|(?:[4][0-24-9]\d{3})|(?:[6][013-9]\d{3}))\b/is', $this->search_code, $output);
if (!empty($output[0])) {
    $this->search_code = $output[0];
} else {
    // Switzerland, Austria Postcode
    preg_match('/\d{4}/', $this->search_code, $at_ch);
    if (!empty($at_ch[0])) {
        $this->search_code = $at_ch[0];
    }
}
The following regex will match postal codes for DE, CH and AT:
'/\b((?:0[1-46-9]\d{3})|(?:[1-357-9]\d{4})|(?:[4][0-24-9]\d{3})|(?:[6][013-9]\d{3})|(?:\d{4}))\b/is'
Examples
68259 Mannheim -> 68259
6000 Lucerne 1 PF -> 6000
1234 Musterstadt -> 1234
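
The three examples can be reproduced with a short script. This is a sketch in Python rather than PHP, using the same pattern (the i and s flags are dropped because the pattern contains no letters or dots); the helper name extract_postcode is an assumption made for illustration.

```python
import re

# The combined pattern from the answer: the German 5-digit alternatives first,
# then a generic 4-digit fallback for Swiss and Austrian codes.
POSTCODE = re.compile(
    r"\b((?:0[1-46-9]\d{3})|(?:[1-357-9]\d{4})|(?:[4][0-24-9]\d{3})"
    r"|(?:[6][013-9]\d{3})|(?:\d{4}))\b"
)


def extract_postcode(search):
    """Return the first postcode-like token in the search string, or None."""
    match = POSTCODE.search(search)
    return match.group(0) if match else None


for query in ["68259 Mannheim", "6000 Lucerne 1 PF", "1234 Musterstadt"]:
    print(query, "->", extract_postcode(query))
```

Because of the word boundaries, the 4-digit fallback only matches a standalone 4-digit token, so it cannot pick up the first four digits of a 5-digit code.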

Is there any query string that I can use with the QUERY function to get a group-wise maximum?

I know how to use the GROUP BY clause in the QUERY function with either a single or multiple fields. This can return the single row per grouping with the maximum value for one of the fields.
This page explains it nicely using these queries and image:
=query({A2:B10},"Select Col1,min(Col2) group by Col1",1)
=query({A14:C22},"Select Col1,Col2,min(Col3) group by Col1,Col2",1)
However, what if I only want a query that returns the corresponding values for the most recent row, grouped by multiple fields? Is there a query that can do this?
Example
Source Table
created_at          first_name  last_name  email           address      city     st  zip    amount
4/12/2022 19:15:00  Ava         Anderson   ava#domain.com  123 Main St  Anytown  IL  12345  1.00
8/30/2022 21:38:00  Brooklyn    Brown      bb#domain.com   234 Lake Rd  Baytown  CA  54321  2.00
2/12/2022 16:58:00  Ava         Anderson   ava#new.com     123 Main St  Anytown  IL  12345  3.00
4/28/2022 01:41:00  Brooklyn    Brown      brook#acme.com  456 Ace Ave  Bigtown  NY  23456  4.00
5/03/2022 17:10:00  Brooklyn    Brown      bb#domain.com   234 Lake Rd  Baytown  CA  54321  5.00
Desired Query Result
Group by first_name, last_name, address, city, st, and zip, but return the created_at, email, and amount for the maximum (most recent) value of created_at:
created_at          first_name  last_name  email           address      city     st  zip    amount
4/12/2022 19:15:00  Ava         Anderson   ava#domain.com  123 Main St  Anytown  IL  12345  1.00
8/30/2022 21:38:00  Brooklyn    Brown      bb#domain.com   234 Lake Rd  Baytown  CA  54321  2.00
4/28/2022 01:41:00  Brooklyn    Brown      brook#acme.com  456 Ace Ave  Bigtown  NY  23456  4.00
Is such a query possible in Google Sheets?
Use this formula
=QUERY({QUERY(A1:I, " Select max(A),min(B),min(C),min(D),min(E),min(F),min(G),min(H),min(I) Group by B,C,E,F,G,H ", 1)},
" Select * Where Col1 is Not null ")
I believe that this is the formula you need:
=ARRAY_CONSTRAIN(SORTN(SORT(
QUERY({A1:I9,INDEX(IFERROR(REGEXEXTRACT(D1:D9,"(\D+)#")))},
"where Col2 is not null"),
10,1,1,0),9^9,2,10,1),9^9,9)
(Do adjust the formula according to your ranges and locale)
For the formula to work, we create the helper column INDEX(IFERROR(REGEXEXTRACT(D1:D9,"(\D+)#"))).
We also use 9^9, which equals 387420489 rows, to make sure that all rows are included in the sorting calculations.
Finally, the ARRAY_CONSTRAIN function returns the first 9 columns, discarding the 10th helper column.
Functions used:
REGEXEXTRACT
IFERROR
INDEX
QUERY
SORT
SORTN
ARRAY_CONSTRAIN
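
To make the target behaviour concrete (one row per group, taken from the most recent created_at), here is a minimal sketch in Python rather than in Sheets formulas. The function name latest_per_group is an assumption made for illustration, and the dates are assumed to be month/day/year, as the 8/30/2022 value suggests.

```python
from datetime import datetime

# Sample rows from the question, in column order:
# created_at, first_name, last_name, email, address, city, st, zip, amount
rows = [
    ("4/12/2022 19:15:00", "Ava", "Anderson", "ava#domain.com", "123 Main St", "Anytown", "IL", "12345", "1.00"),
    ("8/30/2022 21:38:00", "Brooklyn", "Brown", "bb#domain.com", "234 Lake Rd", "Baytown", "CA", "54321", "2.00"),
    ("2/12/2022 16:58:00", "Ava", "Anderson", "ava#new.com", "123 Main St", "Anytown", "IL", "12345", "3.00"),
    ("4/28/2022 01:41:00", "Brooklyn", "Brown", "brook#acme.com", "456 Ace Ave", "Bigtown", "NY", "23456", "4.00"),
    ("5/03/2022 17:10:00", "Brooklyn", "Brown", "bb#domain.com", "234 Lake Rd", "Baytown", "CA", "54321", "5.00"),
]


def latest_per_group(rows):
    """Keep, for each name-and-address group, the row with the newest created_at."""
    latest = {}
    for row in rows:
        created = datetime.strptime(row[0], "%m/%d/%Y %H:%M:%S")
        # Group key: first_name, last_name, address, city, st, zip
        key = (row[1], row[2]) + row[4:8]
        if key not in latest or created > latest[key][0]:
            latest[key] = (created, row)
    return [row for _, row in latest.values()]


for row in latest_per_group(rows):
    print(row)
```

The whole row travels with the maximum timestamp, so email and amount come from the same record as the latest created_at instead of being aggregated independently.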

How do I create a pivot table with weighted averages from a table in PowerBI?

I have data in the following format:
Building        Tenant  Type     Floor  Sq Ft  Rent  Term Length
1 Example Way   Jeff    Renewal  5      100    100   6
47 Fake Street  Tom     New      3      500    200   12
I need to create a visualisation in Power BI that displays a pivot table of attributes by tenant, with a weighted-average (by square foot) column, like this:
Attribute             Jeff           Tom             Weighted Average (by Sq Ft)
Building              1 Example Way  47 Fake Street  -
Type                  Renewal        New             -
Floor                 5              3               -
Sq Ft                 100            500             433.3333333
Rent                  100            200             183.3333333
Term Length (months)  6              12              11
I have unpivoted the original data, like this:
Tenant  Attribute             Value
Jeff    Building              1 Example Way
Jeff    Type                  Renewal
Jeff    Floor                 5
Jeff    Sq Ft                 100
Jeff    Rent                  100
Jeff    Term Length (months)  6
Tom     Building              47 Fake Street
Tom     Type                  New
Tom     Floor                 3
Tom     Sq Ft                 500
Tom     Rent                  200
Tom     Term Length (months)  12
I can almost create what I need from the unpivoted data using a matrix (as below), but I can't calculate the weighted averages column from that matrix.
Attribute             Jeff           Tom
Building              1 Example Way  47 Fake Street
Type                  Renewal        New
Floor                 5              3
Sq Ft                 100            500
Rent                  100            200
Term Length (months)  6              12
I can also create a table with my attributes as headers (instead of in a column). This displays the right values and lets me calculate weighted averages (as below).
Tenant                       Building        Type     Floor  Sq Ft        Rent         Term Length (months)
Jeff                         1 Example Way   Renewal  5      100          100          6
Tom                          47 Fake Street  New      3      500          200          12
Weighted Average (by Sq Ft)  -               -        -      433.3333333  183.3333333  11
However, it's important that these values are displayed vertically instead of horizontally. This is pretty straightforward in Excel, but I can't figure out how to do it in PowerBI. I hope this is clear. Can anyone help?
Thanks!
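
For reference, the weighted-average figures in the tables above follow the usual formula sum(value * Sq Ft) / sum(Sq Ft). The following minimal Python sketch (not DAX or Power BI code; the dictionary layout and the function name weighted_average are assumptions made for illustration) reproduces the three numbers from the sample data:

```python
# Numeric attributes per tenant, taken from the sample table in the question.
tenants = {
    "Jeff": {"Sq Ft": 100, "Rent": 100, "Term Length (months)": 6},
    "Tom": {"Sq Ft": 500, "Rent": 200, "Term Length (months)": 12},
}


def weighted_average(tenants, attribute, weight="Sq Ft"):
    """Average `attribute` across tenants, weighting each tenant by `weight`."""
    total_weight = sum(t[weight] for t in tenants.values())
    return sum(t[attribute] * t[weight] for t in tenants.values()) / total_weight


for attribute in ("Sq Ft", "Rent", "Term Length (months)"):
    print(attribute, round(weighted_average(tenants, attribute), 7))
```

For example, Rent works out as (100 * 100 + 200 * 500) / (100 + 500) = 183.33.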

Power BI - Showing Top 5 records in Matrix Table but total should show for all records

I have a table with thousands of records. I want to create a table visual that shows the top 5 records for each category. I created a measure to achieve this, and I am getting exactly the result I am looking for, except for one issue.
See the image below, where I show the top 5 records for each category, with a total after each category.
I don't want that total to cover only the top 5 records shown in the table. Instead, I want the total of all the records under each category.
How can I achieve that?
The measure I created is:
Top 5 = RANKX(ALLSELECTED('Table'[Category], 'Table'[Account], 'Table'[Name]), [amount_measure], , , Dense)
I then apply a filter on the Top 5 measure to keep only the top 5.
Category  Account  Name  P%   amount  country  owner
Food      A101     AA11  10%  105     India    A
Food      A102     AA12  20%  120     India    A
Food      A103     AA13  80%  100     India    A
Food      A104     AA14  30%  150     India    A
Food      A105     AA15  60%  90      India    A
Stat      B101     AA11  10%  205     India    A
Stat      B102     AA12  20%  220     India    A
Stat      B103     AA13  80%  200     India    A
Stat      B104     AA14  30%  250     India    A
Stat      B105     AA15  60%  190     India    A
Admn      D101     AD11  10%  305     India    A
Admn      D102     AD12  20%  320     India    A
Admn      D103     AD13  80%  300     India    A
Admn      D104     AD14  30%  350     India    A
Admn      D105     AD15  60%  290     India    A
Thanks,
SK
You can try this
Let's suppose you have the following measures
_sumAMT:= SUM('Table 1'[amount])
and this is your ranking measure
_sumAMTRank:= RANKX(ALLEXCEPT('Table 1','Table 1'[Category]),[_sumAMT],,DESC,Dense)
You can revise the subtotal by doing this
_sumAMT by CAT:= CALCULATE(SUM('Table 1'[amount]),ALLEXCEPT('Table 1','Table 1'[Category]))
_revisedTotal:= IF(HASONEVALUE('Table 1'[Name])=true(),[_sumAMT],[_sumAMT by CAT])
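
The intended result (only the top rows are displayed, but the subtotal still covers every row in the category) can be illustrated outside Power BI. This is a minimal Python sketch of that logic, not DAX; the function name top_n_with_full_total is an assumption, and n = 2 is used because the sample data has only five rows per category.

```python
from collections import defaultdict

# (category, name, amount) rows from the question's sample data (Food and Stat only).
rows = [
    ("Food", "AA11", 105), ("Food", "AA12", 120), ("Food", "AA13", 100),
    ("Food", "AA14", 150), ("Food", "AA15", 90),
    ("Stat", "AA11", 205), ("Stat", "AA12", 220), ("Stat", "AA13", 200),
    ("Stat", "AA14", 250), ("Stat", "AA15", 190),
]


def top_n_with_full_total(rows, n):
    """Per category: the n largest rows, plus the total over ALL rows."""
    by_category = defaultdict(list)
    for category, name, amount in rows:
        by_category[category].append((name, amount))
    report = {}
    for category, items in by_category.items():
        ranked = sorted(items, key=lambda item: item[1], reverse=True)
        report[category] = {
            "top": ranked[:n],  # rows that would be displayed
            "total": sum(amount for _, amount in items),  # all rows, not just the top n
        }
    return report


for category, result in top_n_with_full_total(rows, n=2).items():
    print(category, result["top"], "total:", result["total"])
```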

Self Join in Pandas: Merge all rows with the equivalent multi-index

I have one dataframe in the following form:
df = pd.read_csv('data/original.csv', sep = ',', names=["Date", "Gran", "Country", "Region", "Commodity", "Type", "Price"], header=0)
I'm trying to do a self join on the index Date, Gran, Country, Region producing rows in the form of
Date, Gran, Country, Region, Commodity X, Type X, Price X, Commodity Y, Type Y, Price Y, Commodity Z, Type Z, Price Z
Every row should have all the different commodities and prices of a specific region.
Is there a simple way of doing this?
Any help is much appreciated!
Note: I simplified the example by ignoring a few attributes
Input Example:
   Date        Country  Region          Commodity  Price
1  03/01/2014  India    Vishakhapatnam  Rice       25
2  03/01/2014  India    Vishakhapatnam  Tomato     30
3  03/01/2014  India    Vishakhapatnam  Oil        50
4  03/01/2014  India    Delhi           Wheat      10
5  03/01/2014  India    Delhi           Jowar      60
6  03/01/2014  India    Delhi           Bajra      10
Output Example:
   Date        Country  Region          Commodity1  Price1  Commodity2  Price2  Commodity3  Price3
1  03/01/2014  India    Vishakhapatnam  Rice        25      Tomato      30      Oil         50
2  03/01/2014  India    Delhi           Wheat       10      Jowar       60      Bajra       10
What you want to do is called a reshape (specifically, from long to wide). See this answer for more information.
Unfortunately as far as I can tell pandas doesn't have a simple way to do that. I adapted the answer in the other thread to your problem:
df['idx'] = df.groupby(['Date','Country','Region']).cumcount()
df.pivot(index= ['Date','Country','Region'], columns='idx')[['Commodity','Price']]
Does that solve your problem?
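
To see what the reshape does step by step, here is the same long-to-wide transformation written in plain Python without pandas. It is only an illustrative sketch: the function name long_to_wide is an assumption, and it interleaves Commodity and Price as in the desired output, whereas the pivot call groups all Commodity columns before all Price columns.

```python
from collections import defaultdict

# Input rows from the question: (Date, Country, Region, Commodity, Price).
rows = [
    ("03/01/2014", "India", "Vishakhapatnam", "Rice", 25),
    ("03/01/2014", "India", "Vishakhapatnam", "Tomato", 30),
    ("03/01/2014", "India", "Vishakhapatnam", "Oil", 50),
    ("03/01/2014", "India", "Delhi", "Wheat", 10),
    ("03/01/2014", "India", "Delhi", "Jowar", 60),
    ("03/01/2014", "India", "Delhi", "Bajra", 10),
]


def long_to_wide(rows):
    """Collect every (Commodity, Price) pair of a region into one wide row."""
    groups = defaultdict(list)
    for date, country, region, commodity, price in rows:
        # Appending in input order plays the role of cumcount(): the position
        # within the group decides which Commodity/Price slot a value lands in.
        groups[(date, country, region)].extend([commodity, price])
    return [key + tuple(values) for key, values in groups.items()]


for wide_row in long_to_wide(rows):
    print(wide_row)
```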