I am facing an issue with a PowerBI matrix visualization. I have a school table with the columns Student_ID, Location and AttendanceDate.
I need to find, per location per month, how many students attended classes on at least one day (>=1 days).
I have created a custom measure named Attendance, shown below, to flag the students who attended classes on at least one day:
Attendance = IF(DISTINCTCOUNT(school[Attendance_Date])>=1,1,0)
In my visualization I can get all the flags set to 1 for the students who meet the condition of attending classes on at least one day. However, my requirement is to get the sum of these flags, that is, the number of students who attended classes on at least one day, per location per month. My final visualization should not contain the student ID; it should only have the location, the months, and the sum of the flags set to 1.
Expected output:
Location  January  February  March
Chennai   1        1         1
Delhi     2        2         2
Goa       0        2         0
I tried to implement the fixed LOD concept, as we do in Tableau, to handle this scenario in PowerBI, but with no luck.
I created a calculated measure named CalculateAttendance, shown below, but it is not working:
CalculateAttendance = CALCULATE((school[Attendance]),ALLEXCEPT(school[Student_ID],school[Location],school[Attendance]))
Could you please suggest changes to the calculations above to resolve this issue, or another way to handle it?
Regards
Sameer
My current matrix visualization in PowerBI
Input data source (text or Excel) for PowerBI:
Attendance Student_ID location
01.01.2017 100 Delhi
02.01.2017 100 Delhi
03.01.2017 100 Delhi
04.01.2017 100 Delhi
05.01.2017 100 Delhi
06.01.2017 100 Delhi
01.01.2017 101 Delhi
02.01.2017 101 Delhi
03.01.2017 101 Delhi
04.01.2017 101 Delhi
05.01.2017 101 Delhi
06.01.2017 101 Delhi
08.01.2017 101 Delhi
09.01.2017 102 Chennai
01.01.2017 102 Chennai
02.01.2017 102 Chennai
03.01.2017 102 Chennai
04.01.2017 102 Chennai
05.01.2017 102 Chennai
06.01.2017 102 Chennai
08.01.2017 102 Chennai
11.01.2017 102 Chennai
01.02.2017 101 Delhi
02.02.2017 101 Delhi
03.02.2017 101 Delhi
04.02.2017 101 Delhi
05.02.2017 101 Delhi
06.02.2017 101 Delhi
01.02.2017 100 Delhi
02.02.2017 100 Delhi
03.02.2017 100 Delhi
04.02.2017 100 Delhi
05.02.2017 100 Delhi
06.02.2017 100 Delhi
01.02.2017 102 Chennai
02.02.2017 102 Chennai
03.02.2017 102 Chennai
04.02.2017 102 Chennai
05.02.2017 102 Chennai
06.02.2017 102 Chennai
01.02.2017 103 Goa
02.02.2017 103 Goa
03.02.2017 103 Goa
04.02.2017 103 Goa
05.02.2017 103 Goa
06.02.2017 103 Goa
01.02.2017 104 Goa
02.02.2017 104 Goa
03.02.2017 104 Goa
04.02.2017 104 Goa
01.03.2017 100 Delhi
02.03.2017 100 Delhi
03.03.2017 100 Delhi
04.03.2017 100 Delhi
05.03.2017 100 Delhi
06.03.2017 100 Delhi
01.03.2017 101 Delhi
02.03.2017 101 Delhi
03.03.2017 101 Delhi
04.03.2017 101 Delhi
05.03.2017 101 Delhi
06.03.2017 101 Delhi
08.03.2017 101 Delhi
09.03.2017 102 Chennai
01.03.2017 102 Chennai
02.03.2017 102 Chennai
03.03.2017 102 Chennai
04.03.2017 102 Chennai
05.03.2017 102 Chennai
It looks like you just need the number of distinct students per month and location.
This measure produces the matrix you expect:
# Students = DISTINCTCOUNT( School[Student_ID] )
To verify it at the student level, here is the same matrix with student detail.
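The same distinct-count logic can be checked outside PowerBI. Below is a minimal Python sketch, not part of the original answer, that counts distinct students per location and month on a hand-picked subset of the sample rows. It assumes the dates are in dd.mm.yyyy format, and the function name students_per_location_month is made up for illustration.

```python
from collections import defaultdict
from datetime import datetime

# A subset of the sample rows: (attendance date, student id, location).
rows = [
    ("01.01.2017", 100, "Delhi"),
    ("02.01.2017", 100, "Delhi"),
    ("01.01.2017", 101, "Delhi"),
    ("09.01.2017", 102, "Chennai"),
    ("01.02.2017", 101, "Delhi"),
    ("01.02.2017", 100, "Delhi"),
    ("01.02.2017", 102, "Chennai"),
    ("01.02.2017", 103, "Goa"),
    ("01.02.2017", 104, "Goa"),
    ("01.03.2017", 100, "Delhi"),
    ("01.03.2017", 101, "Delhi"),
    ("09.03.2017", 102, "Chennai"),
]


def students_per_location_month(records):
    """Count distinct students for every (location, month) pair."""
    seen = defaultdict(set)
    for date_text, student_id, location in records:
        # Assumption: dates are written as day.month.year.
        month = datetime.strptime(date_text, "%d.%m.%Y").month
        seen[(location, month)].add(student_id)
    return {key: len(ids) for key, ids in seen.items()}


counts = students_per_location_month(rows)
for location in ("Chennai", "Delhi", "Goa"):
    # A missing (location, month) pair means no student attended: count 0.
    print(location, [counts.get((location, month), 0) for month in (1, 2, 3)])
```

A student who attends several days in the same month is stored once in the set, which is why no per-student flag is needed.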
1. If Subject, category, subcategory, location and severity are all the same for visits 20 and 90 of a subject, I need to flag the rows as "N". 2. If any of the key variables mentioned above differ, the flag is "Y"; if there is only one visit, the flag is also "Y".
data want;
input Subject category $ subcategory $ loc $ visitnum severity $ wanted_flag $;
datalines;
100 UAE Dubai Sharja 90 MILD N
100 UAE Dubai Sharja 20 MILD N
101 UAE Dubai Abudabi 90 MILD Y
101 UAE Dubai Abudabi 20 MODERATE Y
102 UAE Dubai AlAin 20 MODERATE Y
102 UAE Dubai Kuwait 20 MODERATE Y
102 IND MUMBAI Delhi 20 MILD Y
103 IND Chennai Kolkata 90 MODERATE N
103 IND Chennai Kolkata 20 MODERATE N
104 US NY Huston 20 MILD Y
;
run;
Below is the sample information I have; the needed flag is given as wanted_flag.
Subject category subcategory location visitnum severity wanted_flag
100 UAE Dubai Sharja 90 MILD N
100 UAE Dubai Sharja 20 MILD N
101 UAE Dubai Abudabi 90 MILD Y
101 UAE Dubai Abudabi 20 MODERATE Y
102 UAE Dubai Al Ain 20 MODERATE Y
102 UAE Dubai Kuwait 20 MODERATE Y
102 IND MUMBAI Delhi 20 MILD Y
103 IND Chennai Kolkata 90 MODERATE N
103 IND Chennai Kolkata 20 MODERATE N
Based on what you've posted, this creates the desired results:
data have;
input Subject category $ subcategory $ loc $ visitnum severity $ wanted_flag $;
datalines;
100 UAE Dubai Sharja 90 MILD N
100 UAE Dubai Sharja 20 MILD N
101 UAE Dubai Abudabi 90 MILD Y
101 UAE Dubai Abudabi 20 MODERATE Y
102 UAE Dubai AlAin 20 MODERATE Y
102 UAE Dubai Kuwait 20 MODERATE Y
102 IND MUMBAI Delhi 20 MILD Y
103 IND Chennai Kolkata 90 MODERATE N
103 IND Chennai Kolkata 20 MODERATE N
104 US NY Huston 20 MILD Y
;
run;
data want;
set have;
by subject category subcategory loc severity notsorted;
if not (first.severity and last.severity) then flag='N';
else flag='Y';
run;
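To make the first./last. logic easier to follow, here is a rough Python sketch of the same idea, not taken from the answer: consecutive rows that share subject, category, subcategory, loc and severity form a run, and a run longer than one row gets "N". Like the data step above, it ignores visitnum; the helper names by_key and flag_rows are made up for illustration.

```python
from itertools import groupby

# (subject, category, subcategory, loc, visitnum, severity), in input order.
rows = [
    (100, "UAE", "Dubai", "Sharja", 90, "MILD"),
    (100, "UAE", "Dubai", "Sharja", 20, "MILD"),
    (101, "UAE", "Dubai", "Abudabi", 90, "MILD"),
    (101, "UAE", "Dubai", "Abudabi", 20, "MODERATE"),
    (102, "UAE", "Dubai", "AlAin", 20, "MODERATE"),
    (102, "UAE", "Dubai", "Kuwait", 20, "MODERATE"),
    (102, "IND", "MUMBAI", "Delhi", 20, "MILD"),
    (103, "IND", "Chennai", "Kolkata", 90, "MODERATE"),
    (103, "IND", "Chennai", "Kolkata", 20, "MODERATE"),
    (104, "US", "NY", "Huston", 20, "MILD"),
]


def by_key(row):
    """Return the BY-group key; visitnum is deliberately left out."""
    subject, category, subcategory, loc, _visitnum, severity = row
    return (subject, category, subcategory, loc, severity)


def flag_rows(records):
    """Flag each row 'N' if its run of identical keys has more than one row."""
    flags = []
    # groupby only merges adjacent rows, similar to BY ... NOTSORTED.
    for _key, run in groupby(records, key=by_key):
        size = len(list(run))
        flags.extend(["N" if size > 1 else "Y"] * size)
    return flags


flags = flag_rows(rows)
print(flags)
```

A run of length one corresponds to a row that is both first.severity and last.severity, which is the case the data step flags as "Y".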
I have two sheets that look something like this:
Sheet1
id | phone | age
0 123 23
1 456 42
2 789 36
Sheet2
id | city | country
0 madrid spain
1 nyc usa
2 dubai uae
3 london england
4 lisbon portugal
My goal is to have a sheet that looks like this:
Sheet3
id | phone | age | city | country
0 123 23 madrid spain
1 456 42 nyc usa
2 789 36 dubai uae
3 london england
4 lisbon portugal
I've been using this formula:
=ARRAYFORMULA({'Sheet1'!A$1:C$4, VLOOKUP('Sheet1'!A$1:A$4,{'Sheet2'!A$1:A$6, 'Sheet2'!B$1:C$6}, {2,3}, false)})
This is what I get:
Sheet3
id | phone | age | #N/A | #N/A
0 123 23 madrid spain
1 456 42 nyc usa
2 789 36 dubai uae
So as you can see, it is leaving out the column headers from Sheet2 in the combined table and it leaves out any rows where the id doesn't match. How do I tell it to leave those rows in and leave the cells blank and include the column headers from Sheet2?
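Put differently, the desired Sheet3 is a full outer join of the two sheets on id, with the headers of both sheets kept. The following is only a Python sketch of that target behaviour, using the sample rows above; it is not a Sheets formula, and the function name full_outer_join is made up for illustration.

```python
sheet1 = [
    ["id", "phone", "age"],
    [0, 123, 23],
    [1, 456, 42],
    [2, 789, 36],
]
sheet2 = [
    ["id", "city", "country"],
    [0, "madrid", "spain"],
    [1, "nyc", "usa"],
    [2, "dubai", "uae"],
    [3, "london", "england"],
    [4, "lisbon", "portugal"],
]


def full_outer_join(left, right):
    """Join two tables on their first column, keeping unmatched ids."""
    left_header, *left_rows = left
    right_header, *right_rows = right
    left_by_id = {row[0]: row[1:] for row in left_rows}
    right_by_id = {row[0]: row[1:] for row in right_rows}
    # Blank cells stand in for the side that has no row for an id.
    blank_left = [""] * (len(left_header) - 1)
    blank_right = [""] * (len(right_header) - 1)
    merged = [left_header + right_header[1:]]
    for key in sorted(set(left_by_id) | set(right_by_id)):
        merged.append(
            [key]
            + left_by_id.get(key, blank_left)
            + right_by_id.get(key, blank_right)
        )
    return merged


sheet3 = full_outer_join(sheet1, sheet2)
for row in sheet3:
    print(row)
```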
try:
=ARRAYFORMULA(QUERY({A:C, IFNA(VLOOKUP(IF(A:A<>"", A:A, "×"), E:G, {2, 3}, 0));
FILTER({E2:E, IFERROR(E2:F/0), F2:G}, NOT(COUNTIF(E2:E, A2:A)))},
"where Col1 is not null order by Col1", 1))
I've got a dataframe like this:
Name Nationality Tall Age
John USA 190 24
Thomas French 194 25
Anton Malaysia 180 23
Chris Argentina 190 26
Let's say I get an incoming data structure like this, where each element holds the data for one row:
data = [{
'food':{'lunch':'Apple',
'breakfast':'Milk',
'dinner':'Meatball'},
'drink':{'favourite':'coke',
'dislike':'juice'}
},
..//and 3 other records
].
'data' is a variable that holds the food and drink predicted by my machine learning model. There are more records (about 400k rows), but I process them in batches (currently 2k records per iteration). The expected result looks like this:
Name Nationality Tall Age Lunch Breakfast Dinner Favourite Dislike
John USA 190 24 Apple Milk Meatball Coke Juice
Thomas French 194 25 ....
Anton Malaysia 180 23 ....
Chris Argentina 190 26 ....
Is there an efficient way to build that dataframe? So far I have tried iterating over the data variable and reading the value of each predicted label, but that process takes a long time.
You need to flatten the dictionaries first, then create a DataFrame and join it to the original:
data = [{
'a':{'lunch':'Apple',
'breakfast':'Milk',
'dinner':'Meatball'},
'b':{'favourite':'coke',
'dislike':'juice'}
},
{
'a':{'lunch':'Apple1',
'breakfast':'Milk1',
'dinner':'Meatball2'},
'b':{'favourite':'coke2',
'dislike':'juice3'}
},
{
'a':{'lunch':'Apple4',
'breakfast':'Milk5',
'dinner':'Meatball4'},
'b':{'favourite':'coke2',
'dislike':'juice4'}
},
{
'a':{'lunch':'Apple3',
'breakfast':'Milk8',
'dinner':'Meatball7'},
'b':{'favourite':'coke4',
'dislike':'juice1'}
}
]
import pandas as pd
# flatten each record by merging its inner dictionaries into one flat dictionary
L = [{k: v for x in d.values() for k, v in x.items()} for d in data]
df1 = pd.DataFrame(L)
print (df1)
breakfast dinner dislike favourite lunch
0 Milk Meatball juice coke Apple
1 Milk1 Meatball2 juice3 coke2 Apple1
2 Milk5 Meatball4 juice4 coke2 Apple4
3 Milk8 Meatball7 juice1 coke4 Apple3
# df is the original DataFrame from the question; join aligns rows on the index
df2 = df.join(df1)
print (df2)
Name Nationality Tall Age breakfast dinner dislike favourite \
0 John USA 190 24 Milk Meatball juice coke
1 Thomas French 194 25 Milk1 Meatball2 juice3 coke2
2 Anton Malaysia 180 23 Milk5 Meatball4 juice4 coke2
3 Chris Argentina 190 26 Milk8 Meatball7 juice1 coke4
lunch
0 Apple
1 Apple1
2 Apple4
3 Apple3
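One detail of the flattening step is worth spelling out: if two inner dictionaries share a key, the later one overwrites the earlier one. The plain-Python sketch below (no pandas needed) shows the flattening on one record from the question and a prefixed variant that avoids the overwrite. The clash record and the names flatten and flatten_prefixed are hypothetical, added only for illustration.

```python
def flatten(record):
    """Merge all inner dictionaries into one; later keys overwrite earlier ones."""
    flat = {}
    for inner in record.values():
        flat.update(inner)
    return flat


def flatten_prefixed(record):
    """Same idea, but prefix each key with its outer key to avoid clashes."""
    return {
        f"{outer}_{key}": value
        for outer, inner in record.items()
        for key, value in inner.items()
    }


record = {
    "food": {"lunch": "Apple", "breakfast": "Milk", "dinner": "Meatball"},
    "drink": {"favourite": "coke", "dislike": "juice"},
}
# Hypothetical record in which both inner dictionaries use the same key.
clash = {"food": {"favourite": "Apple"}, "drink": {"favourite": "coke"}}

print(flatten(record))
print(flatten(clash))
print(flatten_prefixed(clash))
```

With the keys shown in the question there is no clash, so the simpler version is enough there.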
I am a complete newbie to SAS and only know basic SQL. I am currently taking a regression class and having trouble with SAS code.
I am trying to input two columns of data, where the x variable is State and the y variable is the number of accidents, for a simple regression.
I keep getting this:
ERROR: No valid observations are found.
Number of Observations Read 51
Number of Observations Used 0
Number of Observations with Missing Values 51
Is it because datalines only reads numbers and not characters?
Here is the code as well as the datalines:
Data Firearm_Accidents_1999_to_2014;
ods graphics on;
Input State Sum_OF_Deaths;
Datalines;
Alabama 526
Alaska 0
Arizona 150
Arkansas 246
California 834
Colorado 33
Connecticut 0
Delaware 0
District_of_Columbia 0
Florida 350
Georgia 413
Hawaii 0
Idaho 0
Illinois 287
Indiana 288
Iowa 0
Kansas 44
Kentucky 384
Louisiana 562
Maine 0
Maryland 21
Massachusetts 27
Michigan 168
Minnesota 0
Mississippi 332
Missouri 320
Montana 0
Nebraska 0
Nevada 0
New_Hampshire 0
New_Jersey 85
New_Mexico 49
New_York 218
North_Carolina 437
North_Dakota 0
Ohio 306
Oklahoma 227
Oregon 41
Pennsylvania 465
Rhode_Island 0
South_Carolina 324
South_Dakota 0
Tennessee 603
Texas 876
Utah 0
Vermont 0
Virginia 203
Washington 45
West_Virginia 136
Wisconsin 64
Wyoming 0
;
run; proc print;
proc reg data = Firearm_Accidents_1999_to_2014;
model State = Sum_OF_Deaths;
ods graphics off;
run; quit;
OK, there are some different levels of issues here.
ODS GRAPHICS statements go before and after procs, not inside them.
When reading a character variable, you need to tell SAS by using an informat.
This allows you to read in the data. However, your regression has several issues. For one, State is a character variable, and you can't do regression with a character variable. I think that issue is beyond this forum. Review your regression basics and check what you're trying to do.
Data Firearm_Accidents_1999_to_2014;
informat state $32.;
Input State Sum_OF_Deaths;
Datalines;
Alabama 526
Alaska 0
Arizona 150
Arkansas 246
California 834
Colorado 33
....
;
run;
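As a loose analogy for why every observation ended up with a missing value, the Python sketch below reads the first few data lines twice: once treating both columns as numbers, so the state name cannot be converted and the row becomes unusable, and once keeping the state as text. This only imitates the behaviour described above and is not SAS; the helper name read_as_number is made up.

```python
lines = ["Alabama 526", "Alaska 0", "Arizona 150"]


def read_as_number(token):
    """Return the token as a float, or None (missing) if it is not numeric."""
    try:
        return float(token)
    except ValueError:
        return None


# Both columns read as numeric: the state name turns into a missing value.
numeric_only = [tuple(read_as_number(t) for t in line.split()) for line in lines]
# Rows with any missing value are dropped, so nothing is left to model.
usable = [row for row in numeric_only if None not in row]

# State kept as text, count converted to an integer.
typed = [(state, int(count)) for state, count in (line.split() for line in lines)]

print(numeric_only)
print(len(usable))
print(typed)
```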
I have a table in SAS which looks like this.
year Country Host Code Value
2010 India Pak 220 111
2010 India Aus 220 123
2010 India NZ 220 23
2010 India SA 240 43
2010 India WI 250 124
2010 India SRI 250 325
2010 India ZIM 280 235
I want to transform this table into the following form:
Country Code Pak_2010 Aus_2010 NZ_2010 SA_2010 WI_2010 SRI_2010 ZIM_2010
India 220 111 123 23 0 0 0 0
India 240 0 0 0 43 0 0 0
India 250 0 0 0 0 124 325 0
India 280 0 0 0 0 0 0 235
For each combination of country, code and host there will be one value.
Can anyone please suggest code for doing this transformation?
This is a classical proc transpose, separating your ID variables with a delimiter:
PROC TRANSPOSE
    DATA=yourInput
    OUT=yourOutput(drop=_name_)
    DELIMITER=_;
    BY Country Code;
    ID Host Year;
    VAR Value;
RUN;
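For readers who want to see the reshaping spelled out, here is a small Python sketch of the same long-to-wide pivot on the sample rows. It is an illustration rather than a translation of the procedure: one output row per Country and Code, one column per Host_Year, and absent combinations filled with 0 as in the desired table. The function name pivot is made up.

```python
# (year, country, host, code, value) rows from the question.
rows = [
    (2010, "India", "Pak", 220, 111),
    (2010, "India", "Aus", 220, 123),
    (2010, "India", "NZ", 220, 23),
    (2010, "India", "SA", 240, 43),
    (2010, "India", "WI", 250, 124),
    (2010, "India", "SRI", 250, 325),
    (2010, "India", "ZIM", 280, 235),
]


def pivot(records, fill=0):
    """Turn long rows into one wide row per (country, code)."""
    columns = []  # Host_Year column names in order of first appearance
    wide = {}     # (country, code) -> {column name: value}
    for year, country, host, code, value in records:
        name = f"{host}_{year}"
        if name not in columns:
            columns.append(name)
        wide.setdefault((country, code), {})[name] = value
    table = [
        {"Country": country, "Code": code,
         **{name: cells.get(name, fill) for name in columns}}
        for (country, code), cells in wide.items()
    ]
    return columns, table


columns, table = pivot(rows)
print(columns)
for row in table:
    print(row)
```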