Standard deviation for categories - powerbi

Hi, I have desperately been trying to work this out and have referred to several posts, but I am still not getting the correct answer!
I have a bunch of providers of different provider types. I calculate an average cost change for each provider (from more granular payment data). I then want to find the standard deviation of these provider-level changes for each provider type.
This is where I've got to with the DAX. It gives the same standard deviation across all provider types rather than the required output.
group_test =
var tab1 = SUMMARIZECOLUMNS(ProvData[Provider Type],ProvData[Provider Code], "prov_avg",AVERAGEX(core_data, sum(PayData[Payment1])-sum(PayData[Payment2]))/SUM(PayData[Payment1]))
var sd_type = SELECTCOLUMNS(SUMMARIZE(tab1,[Provider Type],[Provider Code], "test", STDEVX.S(tab1,[prov_avg])), "sd_type", [test])
var tab2 = ADDCOLUMNS(tab1, "sd_type", sd_type)
return tab2
I want my final table to look like this:
**Provider Code | Provider type | Prov_avg | sd_type**
1 | a | x | sd for a
2 | a | y | sd for a
3 | b | z | sd for b
Thanks in advance for any help

Add a column to your table:
stdColumn =
var prov_Code = ProvData[Provider Code]
var prov_type = ProvData[Provider Type]
var stdValue = CALCULATE (STDEV.S([prov_avg]), FILTER(ProvData, prov_Code = ProvData[Provider Code] && prov_type = ProvData[Provider Type]))
return stdValue
So what we do is calculate the stdev over the rows that pass the filter on Code & Type. Note that FILTER needs the table to iterate (here ProvData) as its first argument.
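To make the intended result concrete, here is a minimal sketch in plain Python (not DAX) of what the question asks for: one sample standard deviation per provider type, repeated on every provider row of that type. The rows and numbers are invented for illustration, and the sketch groups by type only.

```python
from collections import defaultdict
from statistics import stdev

# Hypothetical provider-level rows: (provider_code, provider_type, prov_avg)
providers = [
    (1, "a", 0.10),
    (2, "a", 0.30),
    (3, "b", 0.20),
    (4, "b", 0.60),
]

# Collect the provider-level averages that belong to each provider type.
by_type = defaultdict(list)
for _code, ptype, avg in providers:
    by_type[ptype].append(avg)

# Sample standard deviation per type (needs at least two providers per type).
sd_by_type = {ptype: stdev(vals) for ptype, vals in by_type.items()}

# Attach the type-level value to every provider row.
result = [(code, ptype, avg, sd_by_type[ptype]) for code, ptype, avg in providers]
```

Every row of type a carries the same sd_type value, and rows of type b carry a different one, which is the shape of the table in the question.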

Related

Power BI calculate increase based on next & previous values

I'm new to Power BI and have an assignment to hand in for school. While I wait for our 365 admin to grant me access to the Power BI community, I'm hoping someone can help explain the following:
In this basic table called Population, there are 10 years of population data for each country.
I would like to add a column that uses a DAX formula to calculate the increase relative to the previous year.
So I would think the basic idea is something like
Increase = CALCULATE(SUMX(population, Filter(Population, Population[Year] = Population[year] + 1) - Population[Population))
But I'm not finding the right formula.
Any help would be much appreciated.
calculated_column:
NewColumn_calculate = population[Population]-CALCULATE(SUM(population[Population]), FILTER(population, EARLIER(population[Country]) = population[Country] && EARLIER(population[Year]) = population[Year] + 1))
NewColumn_sumx = population[Population]-SUMX(FILTER(population, EARLIER(population[Country]) = population[Country] && EARLIER(population[Year]) = population[Year] + 1), population[Population])
measure:
Measure = SUM(population[Population]) - SUMX(FILTER(all(population), population[Country] = MAX(population[Country]) && population[Year] = MAX(population[Year]) - 1), population[Population])
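As a cross-check of the logic (same country, previous year), here is a small sketch in plain Python. The data is made up, and unlike the DAX above it leaves the first year of each country undefined instead of subtracting a blank.

```python
# Hypothetical rows: (country, year, population)
population = [
    ("A", 2019, 100),
    ("A", 2020, 110),
    ("A", 2021, 121),
    ("B", 2020, 50),
    ("B", 2021, 45),
]

# Index the data so the previous year's value can be looked up directly.
lookup = {(country, year): pop for country, year, pop in population}

increase = {}
for country, year, pop in population:
    previous = lookup.get((country, year - 1))
    # No row for the previous year: leave the increase undefined.
    increase[(country, year)] = None if previous is None else pop - previous
```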

Calculate the difference between 2 rows in PowerBI using DAX

I'm trying to complete something which should be quite simple but for the life of me, I can't work it out.
I'm trying to calculate the difference between 2 rows that share the same 'Scan type'.
I have attached a photo showing sample data from production. We run a scan and depending on the results of the scan, it's assigned a color.
I want to find the difference in Scan IDs between each Red scan.
Using the attached Photo of Sample data, I would expect a difference of 0 for id 3. A difference of 1 for id 4 and a difference of 10 for id 14.
I have (poorly) written something that works based on the maximum value from the scan id.
I have also tried following a few posts to see if I can get it to work..
var _curid= MAX(table1[scanid])
var _curclueid = MAX(table1[scanid])
var _calc =CALCULATE(SUM(TABLE1[scanid],FILTER(ALLSELECTED(table1[scanid]),table1[scanid]))
return if(_curid-_calc=curid,0,_curid-_calc)
Edit:
I forgot to mention that I have already checked these threads:
57699052
61464745
56703516
57710425
Try the following DAX and if it helps then accept it as the answer.
Create a calculated column that returns the ID where the colour is Red as follows:
Column = IF('Table'[Colour] = "Red", 'Table'[ID])
Create another column as following:
Column 2 =
VAR Colr = 'Table'[Colour]
VAR SCAN = 'Table'[Scan ID]
VAR Prev_ID =
    CALCULATE(MAX('Table'[Column]),
        FILTER('Table', 'Table'[Colour] = Colr && 'Table'[Scan ID] < SCAN))
RETURN
    'Table'[Column] - Prev_ID
Output:
EDIT:
If you want your first value (ID 3) to be 0, then replace the line after RETURN with the following line:
IF(ISBLANK(Prev_ID) && 'Table'[Colour] = "Red", 0, 'Table'[Column] - Prev_ID )
This will give you the following result:
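The same idea, walking the Red scans in ID order and subtracting the previous Red ID, can be sketched in plain Python. The Red IDs 3, 4 and 14 come from the question; the other rows and colours are invented for illustration.

```python
# Hypothetical scans: (scan_id, colour); only the Red IDs come from the question.
scans = [(1, "Green"), (2, "Amber"), (3, "Red"), (4, "Red"), (9, "Green"), (14, "Red")]

differences = {}
previous_red = None
for scan_id, colour in sorted(scans):
    if colour != "Red":
        continue
    # The first Red scan has no predecessor, so its difference is 0.
    differences[scan_id] = 0 if previous_red is None else scan_id - previous_red
    previous_red = scan_id
```

With these rows, differences holds 0 for ID 3, 1 for ID 4 and 10 for ID 14, matching the values expected in the question.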

Filter and compare specific data (Power BI/DAX)

I am trying to write an IF formula in Power BI that filters and compares data. For every Client and each unique Transaction ID, I want to check whether the Legal firm is the same. If it is the same, return Yes; if not, return No.
**Client | Transaction ID | Legal firm**
American Express |2295876 |Orrick Herrington
American Express |2295877 |Orrick Herrington
American Express |2295878 |Orrick Herrington
Swedbank AB |2287074 |Linklaters
Swedbank AB |2287074 |Clifford Chance
Swedbank AB |2287075 |Clifford Chance
I tried Calculate with distinct count, but it wasn't possible to include if statement.
You should be able to do it with COUNT, removing the filter context on Legal firm with ALLEXCEPT. For example:
Measure =
VAR rowCheck = CALCULATE(COUNT(Table1[Legal firm]), ALLEXCEPT(Table1, Table1[Transaction ID]))
VAR textValue = IF(rowCheck = 1, "Yes", "No")
RETURN
textValue
Hope that helps
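For comparison, here is a rough sketch of the per-transaction check in plain Python, using the sample rows from the question. It is an assumption-laden illustration, not a translation of the measure: it counts distinct firms per Transaction ID, whereas the DAX above counts rows.

```python
from collections import defaultdict

# Rows from the question: (client, transaction_id, legal_firm)
rows = [
    ("American Express", 2295876, "Orrick Herrington"),
    ("American Express", 2295877, "Orrick Herrington"),
    ("American Express", 2295878, "Orrick Herrington"),
    ("Swedbank AB", 2287074, "Linklaters"),
    ("Swedbank AB", 2287074, "Clifford Chance"),
    ("Swedbank AB", 2287075, "Clifford Chance"),
]

# Gather the distinct legal firms seen for each transaction.
firms_by_transaction = defaultdict(set)
for _client, transaction_id, firm in rows:
    firms_by_transaction[transaction_id].add(firm)

# "Yes" when a transaction has exactly one firm, otherwise "No".
same_firm = {
    tid: "Yes" if len(firms) == 1 else "No"
    for tid, firms in firms_by_transaction.items()
}
```

Only transaction 2287074, which lists both Linklaters and Clifford Chance, comes out as No.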

How to filter rows based on column value while running a loop in pandas dataframe

I have a dataset with 13 features and 1 label column with only two outcomes: Income <= 50k or > 50k.
I am trying to compare the distribution of values for each feature across the entire dataset with the distribution of the same feature for only the >50k cases, to see whether the distribution changes for that subset.
If I do:
filtertable = table[table[column] == criteria]
that works well to get the subset
However when used inside a function:
def comparacion(tabla, columna, criterio):
    completa = {}
    criteria = {}
    datos = tabla[tabla[columna] == criterio] #<- here is the problem
    datos = tabla.drop(columna, axis=1)
    titulos = datos.columns
    for tit in titulos:
        completa[tit] = (tabla[tit].value_counts().astype(float))/len(tabla[tit])
        criteria[tit] = (datos[tit].value_counts().astype(float))/len(datos[tit])
    return completa, criteria
For some reason the filtering does not work. Any ideas what the problem could be?
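One thing that stands out in the function above: the second assignment to datos starts again from tabla, so the filtered rows from the line before are discarded. Dropping the column from datos instead (datos.drop(columna, axis=1)) would keep the filter. The sketch below shows the same filter-then-compare logic using plain Python rows rather than a DataFrame, so it runs without pandas. The sample rows and the proportions helper are made up for illustration.

```python
from collections import Counter

def proportions(rows, column):
    """Share of each distinct value of `column` across `rows`."""
    counts = Counter(row[column] for row in rows)
    return {value: n / len(rows) for value, n in counts.items()}

def comparacion(tabla, columna, criterio):
    # Filter first, then keep working with the filtered rows.
    datos = [row for row in tabla if row[columna] == criterio]
    # Every column except the one used for filtering.
    titulos = [col for col in tabla[0] if col != columna]
    completa = {tit: proportions(tabla, tit) for tit in titulos}
    criteria = {tit: proportions(datos, tit) for tit in titulos}
    return completa, criteria

# Hypothetical sample data
tabla = [
    {"sex": "M", "income": ">50k"},
    {"sex": "F", "income": ">50k"},
    {"sex": "M", "income": "<=50k"},
    {"sex": "M", "income": "<=50k"},
]
completa, criteria = comparacion(tabla, "income", ">50k")
```

Here completa describes all four rows, while criteria describes only the two >50k rows, so the two distributions differ.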

CAD to Feature Class

import arcpy
fc = r'H:\H-ONUS UTILITY DATA GIS\As_Builts\2014\RandolphPoint_Phase2\789-AS-BUILT 8-7-13.dwg\Polyline'
out_gdb = r'H:\H-ONUS UTILITY DATA GIS\As_Builts\2014\RandolphPoint_Phase2\RandolphPoint.gdb.gdb'
field = 'Layer'
values = [row[0] for row in arcpy.da.SearchCursor(fc, (field))]
uniqueValues = set(Values)
for value in uniqueValues:
    sql = """Layer" = '{0}'""".format(Value)
    name = arcpy.ValidateTableName(value,out_gdb)
    arcpy.FeatureClassToFeatureClass_conversion(fc, out_gdb, name, sql)
I am trying to convert CAD (dwg) to ArcGIS 10.2.2 feature classes using a file geodatabase as the workspace. I was just taught this code at an ESRI conference, and of course it worked beautifully for the instructor.
The error I am getting is "NameError: name 'Values' is not defined"; however, I did define it as values = [row[0] for row in arcpy.da.SearchCursor(fc, (field))]. I have been working on this for hours, and getting it to run would help my job considerably.
Python variables are case-sensitive.
You've declared values with a lower-case v, but you're referring to it on the next line with an upper-case V.
(Same with value/Value further down.)
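A tiny sketch of the case-sensitivity point, without arcpy. The layer names are invented for illustration.

```python
values = ["Water", "Sewer", "Water"]  # hypothetical layer names

try:
    unique_values = set(Values)  # capital V: a different, undefined name
except NameError as exc:
    message = str(exc)  # the text of the NameError that was raised

unique_values = set(values)  # matches the name that was actually assigned
```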
import arcpy
fc = r'H:\H-ONUS UTILITY DATA GIS\As_Builts\2014\RandolphPoint_Phase2\789ASBUILT.dwg\Polyline'
out_gdb = r'H:\H-ONUS UTILITY DATA GIS\As_Builts\2014\RandolphPoint_Phase2\RandolphPoint.gdb'
field = 'Layer'
value = [row[0] for row in arcpy.da.SearchCursor(fc, (field))]
uniquevalues = set(value)
for value in uniquevalues:
    sql = """"Layer" = '{0}'""".format(value)
    name = arcpy.ValidateTableName(value,out_gdb)
    arcpy.FeatureClassToFeatureClass_conversion(fc, out_gdb, name, sql)
Here is the solution. I had an extra .gdb in the geodatabase path,
my variable value was named values, so I had to take the s off,
and in my SQL statement I was missing a " before the word Layer.
If anyone is reading this, just change the individual parameters and it works beautifully!
Thanks, Juffy, for responding and trying to help me out.
Cartogal
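The quoting fix is easy to miss, so here is a small sketch that runs without arcpy and shows what each version of the string produces. The function name and the layer name WATER are made up for illustration.

```python
def layer_where_clause(layer_name):
    # The outer triple quotes delimit the Python string; the extra double
    # quote right after them is part of the SQL text, so the field name
    # ends up wrapped in double quotes.
    return """"Layer" = '{0}'""".format(layer_name)

clause = layer_where_clause("WATER")  # "WATER" is a made-up layer name

# With only three opening quotes, the leading double quote is missing
# from the resulting SQL text.
broken = """Layer" = '{0}'""".format("WATER")
```

clause contains the field name in matched double quotes, while broken starts with a bare Layer followed by a stray closing quote.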