I have a csv dataset like this where we asked favorite colors:
id q1 q2 q3
1 red blue green
2 blue green .
3 green . .
4 blue . .
5 . . .
Is Power BI able to handle this type of reporting? I've seen recommendations to unpivot the data, which I could do, BUT I would like to keep the percentages based on respondents, NOT on mentions. That is, each percentage should be calculated by dividing by 4 (the people who gave at least one favorite color). For example, for GREEN the result should be:
Green = 3/4 = 75% (based on 4 respondents)
Instead of
Green = 3/7 = 43% (based on 7 colors mentioned)
Thanks!
After unpivoting your sample data table looks like this:
ID Attribute Value
1 q1 red
1 q2 blue
1 q3 green
2 q1 blue
2 q2 green
3 q1 green
4 q1 blue
Now you can use this calculated table
% Colors =
VAR numIDs =
    DISTINCTCOUNT('Table'[ID])
RETURN
    SUMMARIZE(
        'Table',
        'Table'[Value],
        "Pct", DIVIDE(COUNT('Table'[Value]), numIDs)
    )
to get this result:
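As a cross-check of the arithmetic (not a replacement for the DAX), the respondent-based percentages can be reproduced with a minimal Python sketch. It assumes the wide survey table is typed in by hand as a dictionary named responses, which is a hypothetical name, with None standing in for the "." placeholders:

```python
from collections import Counter

# Wide survey data from the question; None marks a missing answer (".").
# The variable names here are illustrative, not part of the Power BI model.
responses = {
    1: ["red", "blue", "green"],
    2: ["blue", "green", None],
    3: ["green", None, None],
    4: ["blue", None, None],
    5: [None, None, None],
}

# Unpivot: one (id, colour) pair per non-missing answer.
mentions = [
    (rid, colour)
    for rid, answers in responses.items()
    for colour in answers
    if colour is not None
]

# Denominator: distinct respondents with at least one answer (4, not 7 mentions).
num_respondents = len({rid for rid, _ in mentions})

# Numerator: respondents mentioning each colour (set() guards against duplicates).
counts = Counter(colour for _, colour in set(mentions))
pct = {colour: n / num_respondents for colour, n in counts.items()}

for colour in sorted(pct):
    print(f"{colour}: {pct[colour]:.0%}")
```

Respondent 5 drops out of the denominator because no rows survive the unpivot for that ID, which is the same reason DISTINCTCOUNT over the unpivoted table yields 4.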
I need to create a measure for a card that will count the total number of Question Groups that exist for each person using the tables below.
I've tried the following, but it returns 10 instead of the expected result, which should be 6 (George = 2, Susan = 1, Tom = 1, Bill = 1, Sally = 1, Mark = 0, Jason = 0).
Measure = COUNTROWS(NATURALLEFTOUTERJOIN(NATURALLEFTOUTERJOIN(People,Questions),'Question Groups'))
What am I doing wrong?
Table: People
PeopleID  Name
1         George
2         Susan
3         Tom
4         Bill
5         Sally
6         Mark
7         Jason
Table: relPeopleQuestions
PeopleID  QuestionID
1         1
1         2
1         3
2         4
2         5
3         6
4         7
5         8
Table: Questions
Question ID  Question name              Question Group ID
1            How are you?               1
2            Favorite Color?            2
3            Favorite Movie?            2
4            Sister's Name              3
5            Brother's Name             3
6            What is your birthdate?    1
7            What City do you live in?  1
8            Favorite game?             2
Table: Question Groups
Question Group ID  Question Group Name
1                  Assorted
2                  Favorites
3                  Relatives
A working example file can be obtained here.
A distinct count on the Question Group ID from the Questions table would seem to be sufficient, e.g.
MyMeasure =
VAR MyTable =
    SUMMARIZE (
        People,
        People[Name],
        "Count", DISTINCTCOUNT ( Questions[Question Group ID] )
    )
RETURN
    SUMX ( MyTable, [Count] )
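To see why the expected total is 6, the per-person distinct count can be checked by hand. The following is a minimal Python sketch of that logic only; the dictionary and list names are hypothetical stand-ins for the four tables, and it says nothing about how the relationships must be configured in the Power BI model:

```python
# Tables from the question, typed in by hand (names are illustrative).
people = {1: "George", 2: "Susan", 3: "Tom", 4: "Bill", 5: "Sally", 6: "Mark", 7: "Jason"}
people_questions = [(1, 1), (1, 2), (1, 3), (2, 4), (2, 5), (3, 6), (4, 7), (5, 8)]
question_group = {1: 1, 2: 2, 3: 2, 4: 3, 5: 3, 6: 1, 7: 1, 8: 2}

# For each person, collect the distinct question groups reached via their questions.
groups_per_person = {
    name: len({question_group[q] for p, q in people_questions if p == pid})
    for pid, name in people.items()
}

# Sum the per-person distinct counts, mirroring SUMX over the summarized table.
total = sum(groups_per_person.values())

print(groups_per_person)
print(total)
```

George answers three questions but only two distinct groups, so a plain row count over the joined tables overstates the result.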
I am trying to plot a bar graph for both the Sept and Oct waves. As you can see in the image, id identifies the individuals who are surveyed across time. On a single graph I need to plot Sept in-house, Oct in-house, Sept out-house and Oct out-house, showing just the proportion of people who said yes in each. The other response categories do not have to be taken into account.
I also have to show whiskers for the 95% confidence interval of each of these proportions.
* Example generated by -dataex-. For more info, type help dataex
clear
input float(id sept_outhouse sept_inhouse oct_outhouse oct_inhouse)
1 1 1 1 1
2 2 2 2 2
3 3 3 3 3
4 4 3 3 3
5 4 4 3 3
6 4 4 3 3
7 4 4 4 1
8 1 1 1 1
9 1 1 1 1
10 1 1 1 1
end
label values sept_outhouse codes
label values sept_inhouse codes
label values oct_outhouse codes
label values oct_inhouse codes
label def codes 1 "yes", modify
label def codes 2 "no", modify
label def codes 3 "don't know", modify
label def codes 4 "refused", modify
save tokenexample, replace
rename (*house) (house*)
reshape long house, i(id) j(which) string
replace which = subinstr(proper(which), "_", " ", .)
gen yes = house == 1
label def WHICH 1 "Sept Out" 2 "Sept In" 3 "Oct Out" 4 "Oct In"
encode which, gen(WHICH) label(WHICH)
statsby, by(WHICH) clear: ci proportion yes, jeffreys
set scheme s1color
twoway scatter mean WHICH ///
|| rspike ub lb WHICH, xla(1/4, noticks valuelabel) xsc(r(0.9 4.1)) ///
xtitle("") legend(off) subtitle(Proportion Yes with 95% confidence interval)
This has to be solved backwards.
The means and confidence intervals have to be plotted using twoway, as graph bar is a dead end here: it cannot draw whiskers as well as bars.
The confidence limits have to be put in variables before the graph is drawn. Some graph commands, notably graph bar, will calculate means for you, but as said that is a dead end. So we need to calculate the means too.
To do that you need an indicator variable for Yes.
The best way I know to get the results then is to reshape to a different structure and then apply ci proportion under statsby.
As a detail, the option jeffreys is explicit as a signal that there are different methods for the confidence interval calculation. You should choose one knowingly.
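To illustrate that the choice of interval method matters, here is a minimal Python sketch of the same workflow (indicator for Yes, proportion per wave, interval limits). Note the swap: it uses the Wilson score interval rather than Jeffreys, only because Wilson needs nothing beyond the standard library, so its limits will not match the Stata output exactly. The function name wilson_interval and the dictionary waves are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

# Responses per wave from the dataex listing
# (1 = yes, 2 = no, 3 = don't know, 4 = refused).
waves = {
    "Sept Out": [1, 2, 3, 4, 4, 4, 4, 1, 1, 1],
    "Sept In":  [1, 2, 3, 3, 4, 4, 4, 1, 1, 1],
    "Oct Out":  [1, 2, 3, 3, 3, 3, 4, 1, 1, 1],
    "Oct In":   [1, 2, 3, 3, 3, 3, 1, 1, 1, 1],
}


def wilson_interval(successes, n, level=0.95):
    """Wilson score interval for a binomial proportion (not Jeffreys)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


results = {}
for wave, codes in waves.items():
    # Indicator for Yes; every respondent stays in the denominator.
    yes = [int(code == 1) for code in codes]
    lb, ub = wilson_interval(sum(yes), len(yes))
    results[wave] = (sum(yes) / len(yes), lb, ub)
    print(f"{wave}: {results[wave][0]:.2f} [{lb:.3f}, {ub:.3f}]")
```

As in the Stata code, "don't know" and "refused" count as not-yes rather than as missing; dropping them instead would change both the proportions and the intervals.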
I have the following data
Date Band Colour Amount
02/01/2020 0-50 Red 20
01/01/2020 51-100 Blue 18
03/01/2020 51-100 Red 14
01/01/2020 51-100 Red 18
02/01/2020 51-100 Red 16
02/01/2020 51-100 Blue 14
01/01/2020 0-50 Red 12
03/01/2020 51-100 Blue 20
01/01/2020 51-100 Red 12
02/01/2020 0-50 Blue 11
02/01/2020 0-50 Red 13
01/01/2020 0-50 Red 10
02/01/2020 51-100 Blue 17
01/01/2020 51-100 Blue 17
I want to produce two tables and filter each of them by date.
The first table shows colour by band and sums the amount; the second table does the same. The challenge is to find the difference between table one and table two based on the filtered dates.
The table below shows the full report without filtering
When I filter table 1 to 01/01/2020 and table 2 to 02/01/2020,
my expected output will be
Please check the attached .pbix file for your reference. The whole process is difficult to explain in text, so I have attached the report file here. Check the following things in the report:
Two separate Date tables, one for each slicer
Three measures
The interactions between the slicers and the Matrix visuals
Get the Report File Here
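Since the report file itself is not reproduced here, the underlying calculation can be sketched in Python: total the amount per band and colour for each selected date, then subtract. This is only a sketch of the arithmetic, not of the measures in the .pbix file; the function names are hypothetical, and the direction of the difference (table 1 minus table 2) is an assumption.

```python
from collections import defaultdict

# Sample data from the question: (date, band, colour, amount).
rows = [
    ("02/01/2020", "0-50", "Red", 20),
    ("01/01/2020", "51-100", "Blue", 18),
    ("03/01/2020", "51-100", "Red", 14),
    ("01/01/2020", "51-100", "Red", 18),
    ("02/01/2020", "51-100", "Red", 16),
    ("02/01/2020", "51-100", "Blue", 14),
    ("01/01/2020", "0-50", "Red", 12),
    ("03/01/2020", "51-100", "Blue", 20),
    ("01/01/2020", "51-100", "Red", 12),
    ("02/01/2020", "0-50", "Blue", 11),
    ("02/01/2020", "0-50", "Red", 13),
    ("01/01/2020", "0-50", "Red", 10),
    ("02/01/2020", "51-100", "Blue", 17),
    ("01/01/2020", "51-100", "Blue", 17),
]


def totals_for(date):
    """Sum of Amount per (band, colour) for one selected date."""
    totals = defaultdict(int)
    for row_date, band, colour, amount in rows:
        if row_date == date:
            totals[(band, colour)] += amount
    return totals


def difference(date_1, date_2):
    """Table 1 minus table 2 (assumed direction); missing cells count as 0."""
    t1, t2 = totals_for(date_1), totals_for(date_2)
    keys = sorted(set(t1) | set(t2))
    return {key: t1.get(key, 0) - t2.get(key, 0) for key in keys}


diff = difference("01/01/2020", "02/01/2020")
for (band, colour), value in diff.items():
    print(band, colour, value)
```

The two independent date arguments play the role of the two separate Date tables: each selection filters only its own set of totals.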
I have the following variable indicating whether an observation is working or unemployed, where 0 indicates working and 1 refers to unemployed.
dataex unemp
input float unemp
0
0
0
0
1
.
1
When I tabulate the variable:
Unemploymen |
          t |      Freq.
------------+-----------
   Employed |         80
 Unemployed |         20
------------+-----------
   Total LF |        100
I essentially want to divide 20/100, to obtain a total unemployment variable of 20%. I have done this manually now, but think it is better to automate this as I also want to compute unemployment by different education groups and geographic regions.
gen unemployment_broad = .
replace unemployment_broad = (20/100)*100
The education variable is coded as follows: 1 "Less than basic", 2 "Basic", 3 "Secondary", 4 "Higher education".
Is there a way to compute unemployment rate by each education group?
input float educ
2
4
4
4
2
4
1
3
3
3
Using Cybernike's solution, I tried to create a variable showing unemployment by education as follows, but I got an error:
gen unemp_educ = .
replace unemp_educ = bysort educ: summarize unemp
I essentially want to visualize unemployment by education. With something like this:
graph hbar (mean) Unemployment, over(education)
This is because I also intend to replicate the same equation by demographic group, gender, etc.
Your unemployment variable is coded as 0/1. Therefore, you can obtain the proportion unemployed by taking the mean value. You could do this using the summarize command, or using the collapse command. Both of these can be performed by education group.
clear
input unemp educ
0 2
0 4
0 4
0 4
1 2
0 3
1 3
1 1
1 3
end
bysort educ: summarize unemp
collapse (mean) unemp, by(educ)
list
     +-----------------+
     | educ      unemp |
     |-----------------|
  1. |    1          1 |
  2. |    2         .5 |
  3. |    3   .6666667 |
  4. |    4          0 |
     +-----------------+
In response to your edit, you can also save the mean values to the original dataset using:
bysort educ: egen unemp_mean = mean(unemp)
Your code for plotting the data seems to work fine.
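The point that the mean of a 0/1 variable is the proportion of ones can be checked outside Stata as well. A minimal Python sketch using the same nine observations as the example input above (the names data, by_educ and rate are hypothetical):

```python
from collections import defaultdict

# (unemp, educ) pairs from the example input; unemp is 1 for unemployed, 0 for working.
data = [(0, 2), (0, 4), (0, 4), (0, 4), (1, 2), (0, 3), (1, 3), (1, 1), (1, 3)]

# Group the 0/1 indicator by education level.
by_educ = defaultdict(list)
for unemp, educ in data:
    by_educ[educ].append(unemp)

# The mean of a 0/1 indicator is the share of ones, i.e. the unemployment rate.
rate = {educ: sum(values) / len(values) for educ, values in sorted(by_educ.items())}

for educ, value in rate.items():
    print(educ, round(value, 7))
```

The same pattern carries over to gender or region: only the grouping key changes, which is what by() and bysort do in the Stata commands.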
Here I would like to compute a dynamic measure called "count_percent" (the last field) based on the slicer selection.
The formula for 'count_percent' is Admin/count.
Slicers in Power BI: Diag and Practice
The bar chart has Date on the X axis (1/1/2018 to 1/3/2018) and 'count_percent' on the Y axis.
In this example:
If I select Diag = Head and leave Practice open (or select all), I would like 'count_percent' for 1/1/2018 to be 6 (24/4).
If I select Diag = Head and Practice = Practice1, I would like 'count_percent' for 1/1/2018 to be 5 (10/2).
If I leave Diag open and select Practice = Practice1, I would like 'count_percent' for 1/1/2018 to be about 4.17 (25/6).
Please help. My data sample is below. Thank you so much.
Date Diag Practice Admin count count_percent
01/01/2018 Head Practice1 10 2
01/02/2018 Head Practice1 22 3
01/03/2018 Head Practice1 13 3
01/01/2018 Head Practice2 14 2
01/02/2018 Head Practice2 13 2
01/01/2018 Neck Practice1 15 4
01/02/2018 Neck Practice1 17 2
01/03/2018 Neck Practice1 12 2
01/01/2018 Neck Practice2 18 3
01/02/2018 Neck Practice2 20 4
It should be as simple as this measure, where 'Table' stands for whatever your table is actually called: count_percent = DIVIDE(SUM('Table'[Admin]), SUM('Table'[count]))
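The reason a single ratio-of-sums measure is enough is that the slicers filter the rows before both sums are taken. A minimal Python sketch of that behaviour, with a hypothetical count_percent function whose optional arguments stand in for the slicer selections (None meaning nothing selected):

```python
# Sample data from the question: (date, diag, practice, admin, count).
rows = [
    ("01/01/2018", "Head", "Practice1", 10, 2),
    ("01/02/2018", "Head", "Practice1", 22, 3),
    ("01/03/2018", "Head", "Practice1", 13, 3),
    ("01/01/2018", "Head", "Practice2", 14, 2),
    ("01/02/2018", "Head", "Practice2", 13, 2),
    ("01/01/2018", "Neck", "Practice1", 15, 4),
    ("01/02/2018", "Neck", "Practice1", 17, 2),
    ("01/03/2018", "Neck", "Practice1", 12, 2),
    ("01/01/2018", "Neck", "Practice2", 18, 3),
    ("01/02/2018", "Neck", "Practice2", 20, 4),
]


def count_percent(date, diag=None, practice=None):
    """Sum of Admin divided by sum of count over the rows left after filtering."""
    selected = [
        r for r in rows
        if r[0] == date
        and (diag is None or r[1] == diag)
        and (practice is None or r[2] == practice)
    ]
    admin = sum(r[3] for r in selected)
    count = sum(r[4] for r in selected)
    # DAX DIVIDE returns BLANK on division by zero; None plays that role here.
    return admin / count if count else None


print(count_percent("01/01/2018", diag="Head"))                        # 24 / 4
print(count_percent("01/01/2018", diag="Head", practice="Practice1"))  # 10 / 2
print(count_percent("01/01/2018", practice="Practice1"))               # 25 / 6
```

Summing first and dividing afterwards is what distinguishes this from averaging per-row ratios, which would give different numbers whenever the counts differ between rows.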