Number of unique students that completed all 10 courses - powerbi

I have a little formula problem that I would really appreciate some help with.
The list has columns with Student names that repeat, Course names that repeat, and course status that can be passed, not passed, or not started.
I would like to count the number of unique students that passed all 10 courses that are available.
I tried different variations of Calculate and COUNTROWS.
This is the formula I have at the moment that doesn't work
PassedAll =CALCULATE(DISTINCTCOUNT(Progress[Student]),Progress[Mark]="Passed",Progress[Course]="Course1"&&Progress[Course]="Course2")
I understand that && doesn't work in this scenario because in a single row it cannot be both courses. And I don't want to replace it with an OR, || operator because I want to count students that have Passed marks on each of these courses.
Can someone please recommend how to somehow replace the course section of the filter with something that will include all 10 courses?

If you only want the number to show in a Card visualization, then:
StudentPassed = countrows(filter(GENERATE(VALUES(Sheet1[Student]), ROW("CoursCompleted", CALCULATE( DISTINCTCOUNT(Sheet1[Course]), Sheet1[Mark] ="Passed"))), [CoursCompleted]= 10))
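In my sample data (a table named Sheet1; substitute your own table name, such as Progress), 1 student passed all 10 courses, 1 student passed 9 courses, and 1 student passed 8 (with no record for the other 2 courses).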
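The logic of that measure (count the distinct passed courses per student, then keep the students whose count equals 10) can be sketched outside DAX. The following is a minimal Python illustration, not the DAX engine's behaviour; the function name, the sample student names, and the row layout are assumptions made up for this example.

```python
from collections import defaultdict

def count_students_passed_all(rows, required_courses=10):
    """Count students whose number of distinct passed courses equals required_courses."""
    passed = defaultdict(set)  # student -> set of distinct courses passed
    for student, course, mark in rows:
        if mark == "Passed":
            passed[student].add(course)
    return sum(1 for courses in passed.values() if len(courses) == required_courses)

# Hypothetical sample data: (student, course, mark)
courses = [f"Course{i}" for i in range(1, 11)]
rows = (
    [("Ann", c, "Passed") for c in courses]            # passed all 10
    + [("Bob", c, "Passed") for c in courses[:9]]      # passed 9 ...
    + [("Bob", courses[9], "Not passed")]              # ... and failed 1
    + [("Cid", c, "Passed") for c in courses[:8]]      # passed 8, no record for 2
)

print(count_students_passed_all(rows))
```

Using a set per student mirrors DISTINCTCOUNT: a repeated "Passed" row for the same course does not inflate the count.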

How to input sales values multiple times using a For loop and sentinel?

To provide some context, I am very new to programming and to C++ so I understand that my code is likely not the most efficient and my error is probably fairly simple. This is just something I am doing for fun and I have a particular interest in music, so that is what has inspired me to create this type of program.
I am trying to practice what I have learned in class in order to improve my coding. I am working on a program that will: 1) get the name of an album from the user 2) get the number of singles released off the album from the user 3) get the sales of the album from the user 4) get the sales of each single from the user and 5) calculate the total album and single sales off the album.
So far, my code is working well but I am having some difficulty with the single sales and I am unsure how to code it so that it works properly. I want to use a For loop that asks for the sales of each single. Inside of the loop, I have a nested while loop to stop collecting the sales for a single when "1" is entered by the user. My code works well for the first single, however after I enter "1" to indicate that there are no more sales to be entered for the first single, the For loop just prints out the prompt for the sales of the next singles but does not allow for any entry. I know that this is because of the nested while loop I have included, but I don't know how to make the loop accept user sales input for each single while still being able to mark the end of input for each single using "1" as a sentinel.
This is what the program looks like when running the For loop and using my own input:
"What are the sales of single #1?
5000000
250000
320000
1
What are the sales of single #2?
What are the sales of single #3?
What are the sales of single #4?
What are the sales of single #5?
The single sales are 5570000."
I indicated there were 5 singles released off the album but I am only able to enter the sales for the first single. As I mentioned, I know this is because when I entered 1, the nested while loop I am using to collect the sales is terminated and then the For loop just prints out the cout statement I have written. However, I am looking for a way to code this so that I can enter the sales for all 5 singles.
Does anyone know how I might be able to fix this? I appreciate any help offered and I am happy to answer any additional questions that anyone might have. I am also including the code I have written for the function that handles the single sales for reference but I can include my entire code as well if it would be helpful. Thank you!
int singleSales(int numOfSingles)
{
    //Holds single sales and accumulated sales
    int singleSales = 0;
    int totalSales = 0;
    //Collects the single sales for each single as determined by user
    for (int singleNum = 1; singleNum <= numOfSingles; singleNum++)
    {
        cout << "\nWhat are the sales of single #" << singleNum << "?\n";
        while (singleSales != 1) //Collects single sales while input is not 1
        {
            cin >> singleSales;
            totalSales += singleSales; //Accumulates single sales
        }
    }
    return totalSales - 1;
}
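The value of your variable singleSales is still 1 when the while loop ends. So, for every singleNum after the first one, the condition while (singleSales != 1) is already false and you never reach the cin statement.
You need to declare int singleSales = 0; inside the for loop, not outside, so that it is reset for each single. Note that the total then includes one sentinel per single, so return totalSales - 1; should become return totalSales - numOfSingles; (or the sentinel should not be added in the first place).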
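To make the effect of the reset easier to see, here is the same loop structure sketched in Python rather than C++. It is an illustration under assumptions: the function name, the `entries` parameter (a list standing in for keyboard input), and the `sentinel` parameter are invented for this example.

```python
def single_sales(num_singles, entries, sentinel=1):
    """Sum the sales of each single; `entries` stands in for the user's input."""
    entries = iter(entries)
    total = 0
    for _ in range(num_singles):
        sales = 0  # reset for every single, so the inner loop runs again
        while sales != sentinel:
            sales = next(entries)
            if sales != sentinel:  # do not add the sentinel to the total
                total += sales
    return total

# Two singles: three sales figures for the first, two for the second.
print(single_sales(2, [5000000, 250000, 320000, 1, 100, 200, 1]))
```

Because the sentinel is skipped when accumulating, no correction is needed on the returned total.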

How to write loop across Hierarchical Data (household-individual) in stata?

I'm now working on a household survey data set and I'd like to give certain members extra IDs according to their relationship to the household head. More specifically, I need to identify the adult children of household head and his/her spouse, if married, and assign them "sub-household IDs".
The variables are: hhid - household ID; pid - individual ID; relhead - relationship with head.
Regarding relhead, a 1 represents the head, a 6 represents a child, and a 7 represents a child-in-law. Below is some example data, including the desired outcome in the last column. I assume that whenever a 6 is followed by a 7, they constitute a couple and belong to the same sub-household.
hhid pid relhead sub_hhid(desired)
50 1 1 1
50 2 3 1
50 3 6 2
50 4 6 3
50 5 7 3
-----------------------------------------------
67 1 1 1
67 3 6 2
67 4 7 2
Here are some thoughts:
There may be married and unmarried adult children within one household, and the family structure is a little complicated, so I want to write a loop across the members of a household.
The basic idea is that in the outer loop we identify the children staying at home and then check whether a spouse is present. If there is one, we give the couple an indicator; if not, we continue and give the single stay-at-home child another indicator. After walking through all the possible members within a household, we get a series of within-household IDs. To facilitate further analysis, I need some kind of external ID variable to separate the sub-families.
* Define N as the total number of households, n as the household size (number of individuals)
* sty_chil is an indicator for an adult child living with the parents (head)
* sty_chil_sp is the adult child's spouse
* "hid" and "ind_id" are local macros
forvalue hid=1/N {
    forvalue ind_id= 1/n {
        if sty_chil[`ind_id']==1 {
            check if sty_chil_sp[`ind_id+1']==1 {
                if yes then assign sub_hhid to this couple   * a 6-7 pair, identified as a couple
            }
            else {   * a single 6, identified as a single child
                assign sub_hhid to this child
            }
        }
        else {   * relationships other than 6: move forward
            ++ind_id   * across the members within a household
        }
    }
    ++hid   * move forward across households
}
The built-in Stata by, sort: prefix is pretty powerful, but here I want to treat only the family members who meet certain criteria and leave the others untouched, so an if-else type loop feels more natural to me (even if by: can achieve my goal, it becomes too tricky once the situation is no longer simple, and we cannot enumerate every possible household pattern).
An immediate problem is that I don't know how to write a loop across household IDs and individual IDs, because I used to obtain the household size (increment of the outer loop) using the by command (I'm not sure whether in this case it is 1 or the number of family members). I'm also not sure whether mixing by and if loops is good programming practice; I would favor writing a "full loop" in this case. Please give me some clues on how to achieve my goal and illustrate them with pseudocode.
An extra question: I cannot find the ado file that contains the by command. Does it exist?
I will abstract from the issue of whether the assumption used to create matches is a sensible one or not. Rather, let this be an example of reaching the desired results without using explicit loops. Some logic and the use of subscripting (see help subscripting) can get you far.
clear
set more off
*----- example data -----
input ///
hhid pid relhead sub_hhid
50 1 1 1
50 3 6 2
50 4 6 3
50 5 7 3
67 1 1 1
67 3 6 2
67 4 7 2
67 5 6 3
end
list, sepby(hhid)
*----- what you want -----
bysort hhid (pid): gen hhid2 = sum( !(relhead == 7 & relhead[_n-1] == 6) )
list, sepby(hhid)
As you can see, one line of code gets you there. The reasoning is the following:
sum() gives the running sum. The arguments to sum(), being conditions, can either be True or False. The ! denotes the logical not (see help operators).
If it is not the case that the relationship is daughter/son-in-law AND the previous relationship is daughter/son, the condition evaluates to True and takes on the value of 1, increasing the running sum by 1. If it evaluates to False, meaning that the relationship is daughter/son-in-law AND the previous relationship is daughter/son, then it takes on the value of 0 and the running sum will not increase. This gives the result you seek.
You do this using the by: prefix, since you want to check each original household independently, so to speak.
For the first observation of each original household, the condition always evaluates to True. This is because there exists no "previous" observation (relationship), so Stata considers relhead[_n-1] to be missing (., a very large number) and therefore not equal to 6. This takes the running sum from 0 to 1 for the first observation of each sub-group, and so on.
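For readers who want to trace the running sum by hand, the reasoning above can be mimicked in a few lines of Python. This is only a sketch of the logic, not of Stata's internals; the function name and the tuple layout (hhid, pid, relhead) are assumptions for this example, and the rows are the example data from the answer.

```python
from itertools import groupby

# Example data from the answer: (hhid, pid, relhead)
rows = [
    (50, 1, 1), (50, 3, 6), (50, 4, 6), (50, 5, 7),
    (67, 1, 1), (67, 3, 6), (67, 4, 7), (67, 5, 6),
]

def sub_household_ids(rows):
    """Running sum, restarted per household, of 'not (7 preceded by 6)'."""
    out = []
    # sorted() orders by hhid then pid, like bysort hhid (pid)
    for hhid, members in groupby(sorted(rows), key=lambda r: r[0]):
        running = 0
        prev_rel = None  # no previous observation at the start of a household
        for _, pid, rel in members:
            joins_previous = (rel == 7 and prev_rel == 6)
            running += not joins_previous  # True adds 1, False adds 0
            out.append((hhid, pid, rel, running))
            prev_rel = rel
    return out

for row in sub_household_ids(rows):
    print(row)
```

The last element of each printed tuple plays the role of hhid2.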
Bottom line: learn how to use by: and take advantage of the features offered by Stata. Do not swim against the current; not here.
Edit
Please note that instead of progressively changing your example data set, you should provide a representative example from the beginning. Not doing so can render answers that were initially OK completely inadequate.
For your modified example, add:
replace hhid2 = 1 if !inlist(relhead,6,7)
That will simply assign anyone not 6 or 7 to the same household as the head. The head is assumed to always have hhid2 == 1. If the head can have hhid2 != 1, then
bysort hhid (relhead): replace hhid2 = hhid2[1] if !inlist(relhead,6,7)
should work.
You can follow with:
bysort hhid (pid): replace hhid2 = hhid2[_n-1] + 1 if hhid2 != hhid2[_n-1] & _n > 1
but because they are IDs, it's not really necessary.
Finally, use:
gen hhid3 = string(hhid) + "_" + string(hhid2)
to create IDs with the form 50_1, 50_2, 50_3, etc.
Like I said before, if your data presents more complications, you should present a relevant example.

Generating rolling z-scores of panel data in Stata

I have an unbalanced panel data set (countries and years). For simplicity let's say I have one variable, x, that I am measuring. The panel data are sorted first by country (a 3-digit numeric country code) and then by year. I would like to write a .do file that generates a new variable, z_x, containing the standardized values of the variable x. The variable should be standardized by subtracting the mean over the preceding (exclusive) m time periods, and then dividing by the standard deviation over those same time periods. If this is not possible, return a missing value.
Currently, the code I am using to accomplish this is the following (edited now for clarity)
xtset weocountrycode year
sort weocountrycode year
local win_len = 5 // Defining rolling window length.
quietly: rolling sd_x=r(sd) mean_x=r(mean), window(`win_len') saving(stats_x, replace): sum x
use stats_x, clear
rename end year
save, replace
use all_data_PROCESSED_FINAL.dta, clear
quietly: merge 1:1 (weocountrycode year) using stats_x
replace sd_x = . if x[_n-`win_len'+1] == . | weocountrycode[_n-`win_len'+1] != weocountrycode[_n] // This and next line are for deleting values that rolling calculates when I actually want missing values.
replace mean_x = . if x[_n-`win_len'+1] == . | weocountrycode[_n-`win_len'+1] != weocountrycode[_n]
gen z_x = (x - mean_x[_n-1])/sd_x[_n-1] // calculate z-score
UPDATE:
My struggle with rolling is that when rolling is set up to use a window length 5 rolling mean, it automatically does window length 1,2,3,4 means for the first, second, third and fourth entries (when there are not 5 preceding entries available to average out). In fact, it does this in general - if the first non-missing value is on entry 5, it will do a length 1 rolling average on entry 5, length 2 rolling average on entry 6, ..... and then finally start doing length 5 moving averages on entry 9. My issue is that I do not want this, so I would like to avoid performing these calculations. Until now, I have only been able to figure out how to delete them after they are done, which is both inefficient and bothersome.
I tried adding an if clause to the -rolling- statement:
quietly: rolling sd_x=r(sd) mean_x=r(mean) if x[_n-`win_len'+1] != . & weocountrycode[_n-`win_len'+1] != weocountrycode[_n], window(`win_len') saving(stats_x, replace): sum x
But it did not fix the problem and the output is "weird" in the sense that
1) If `win_len' is equal to, say, 10, there are 15 missing values in the resulting z_x variable, instead of 9.
2) Even though there are "extra" missing values in z_x, the observations still start out as window length 1 means, then window length 2 means, etc. which makes no sense to me.
Which leads me to believe I fundamentally don't understand 1) what -rolling- is doing and 2) how an if clause works in the context of -rolling-.
Does this help?
Thanks!
I'm not sure I understand completely, but I'll try to answer based on what I think your problem is, and based on a comment by @NickCox.
You say:
... when rolling is set up to use a window length 5 rolling mean... if the first non-missing value is on entry 5, it will do a length 1 rolling average on entry 5, length 2 rolling average on entry 6, ...
This is expected. help rolling states:
The window size refers to calendar periods, not the number of observations. If there are missing data (for example, because of weekends), the actual number of observations used by command may be less than window(#).
It's not actually doing a "length 1 rolling average", but I'll get to that later.
Below are some examples to see what rolling does:
clear all
set more off
*-------------------------- example data -----------------------------
set obs 92
gen dat = _n - 1
format dat %tq
egen seq = fill(1 1 1 1 2 2 2 2)
tsset dat
tempfile main
save "`main'"
list in 1/12, separator(4)
*------------------- Example 1. None missing ------------------------
rolling mean=r(mean), window(4) stepsize(4) clear: summarize seq, detail
list in 1/12, separator(0)
*------- Example 2. All but one value, missing in first window ------
use "`main'", clear
replace seq = . in 1/3
list in 1/8
rolling mean=r(mean), window(4) stepsize(4) clear: summarize seq, detail
list in 1/12, separator(0)
*------------- Example 3. All missing in first window --------------
use "`main'", clear
replace seq = . in 1/4
list in 1/8
rolling mean=r(mean), window(4) stepsize(4) clear: summarize seq, detail
list in 1/12, separator(0)
Note I use the stepsize option to make things much easier to follow. Because the date variable is in quarters, I set window(4) and stepsize(4) so rolling is just computing averages by year. I hope that's easy to see.
Example 1 does as expected. No problem here.
Example 2 on the other hand, should be more interesting for you. We've said that what matters are calendar periods, so the mean is computed for the whole year (four quarters), even though it contains missings. There are three missings and one non-missing. summarize is computing the mean over the whole year, but summarize ignores missings, so it just outputs the mean of non-missings, which in this case is just one value.
Example 3 has missings for all four quarters of the year. Therefore, summarize outputs . (missing).
Your problem, as I understand it, is that when you face a situation like Example 2, you'd like the output to be missing. This is where I think Nick Cox's advice comes in. You could try something like:
rolling mean=r(mean) N=r(N), window(4) stepsize(4) clear: summarize seq, detail
replace mean = . if N != 4
list in 1/12, separator(0)
This says: if the number of non-missings for the window (r(N), also computed by summarize), is not the same as the window size, then replace it with missing.
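The idea behind that replace can also be sketched in Python: compute each window's mean over the non-missing values together with the count of non-missing values, then blank out any mean whose count is smaller than the window. This is a rough illustration of the logic only, not of how rolling is implemented; the function names are hypothetical and None stands in for Stata's missing value.

```python
def window_stats(values, window):
    """Mean of non-missing values and their count, per non-overlapping window."""
    stats = []
    for start in range(0, len(values) - window + 1, window):
        chunk = values[start:start + window]
        present = [v for v in chunk if v is not None]  # summarize ignores missings
        mean = sum(present) / len(present) if present else None
        stats.append((mean, len(present)))
    return stats

def require_full_window(stats, window):
    """Keep a mean only when every period in the window was non-missing."""
    return [mean if n == window else None for mean, n in stats]

# Like Example 2: the first three quarters are missing.
seq = [None, None, None, 1, 2, 2, 2, 2, 3, 3, 3, 3]
stats = window_stats(seq, 4)
print(stats)
print(require_full_window(stats, 4))
```

The first window has a mean based on a single observation, and the second function turns it into a missing value, which is the behaviour the question asks for.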

Python number averages using lists and keys

I'm working on a Python assignment and I'm totally stuck. Any assistance would be greatly appreciated. I know it's probably not as convoluted as it seems in my head... The details are below. Thanks very much.
Implement the following three functions (you should use an appropriate looping construct to compute the averages):
allNumAvg(numList) : takes a list of numbers and returns the average of all the numbers in the list.
posNumAvg(numList) : takes a list of numbers and returns the average of all the numbers in the list that are greater than zero.
nonPosAvg(numList) : takes a list of numbers and returns the average of all the numbers in the list that are less than or equal to zero.
Write a program that asks the user to enter some numbers (positives, negatives and zeros). Your program should NOT ask the user to enter a fixed number of numbers. Also it should NOT ask for the number of numbers the user wants to enter. But rather it should ask the user to enter a few numbers and end with -9999 (a sentinel value). The user can enter the numbers in any order. Your program should NOT ask the user to enter the positive and the negative numbers separately.
Your program then should create a list with the numbers entered (make sure NOT to include the sentinel value (-9999) in this list) and output the list and a dictionary with the following Key-Value pairs (using the input list and the above functions):
Key = 'AvgPositive' : Value = the average of all the positive numbers
Key = 'AvgNonPos' : Value = the average of all the non-positive numbers
Key = 'AvgAllNum' : Value = the average of all the numbers
Sample run:
Enter a number (-9999 to end): 4
Enter a number (-9999 to end): -3
Enter a number (-9999 to end): -15
Enter a number (-9999 to end): 0
Enter a number (-9999 to end): 10
Enter a number (-9999 to end): 22
Enter a number (-9999 to end): -9999
The list of all numbers entered is:
[4, -3, -15, 0, 10, 22]
The dictionary with averages is:
{'AvgPositive': 12.0, 'AvgNonPos': -6.0, 'AvgAllNum': 3.0}
EDIT: This is what I have so far, which I put together quickly just to have something to work with, but I can't figure out how to implement the keys/dictionary as the assignment asks. Thanks again for any help.
print("This program takes user-given numbers and calculates the average")
counter = 0
sum_of_numbers = 0
first_question = int(input('Please enter a number. (Enter -9999 to end):'))
while first_question != -9999 :
    ent_num = int(input('Please enter a number. (Enter -9999 to end):'))
    sum_of_numbers = sum_of_numbers + ent_num
    counter = counter + 1
    first_question = int(input('Please enter a number (Enter -9999 to end):'))
print("Your average is " + str(sum_of_numbers/counter))
Welcome to Python programming, and programming in general!
From your code, I assume you are not entirely familiar with Python lists, dictionaries, and functions and how to use them. I'd suggest you look up tutorials for these; knowing how to use them will make your assignment much easier.
Here are some tutorials I found with some quick searches that might help:
Dictionary Tutorial,
List Tutorial,
Function Tutorial
When your assignment says to make three functions, you should probably make actual functions rather than trying to fit the functionality into your loop. For example, here is a simple function that takes in a number and adds 5 to it, then returns it:
def addFive(number):
    return number + 5
To use it in your code, you would have something like this:
num = 6 # num is now 6
num = addFive(num) # num is now 11
So what you should do is create a list object containing all the numbers the user entered, and then pass that object into three separate functions - posNumAvg, nonPosAvg, allNumAvg.
Creating a dictionary of key-value pairs is pretty easy - first create the dictionary, then fill it with the appropriate values. For example, here is how I would create a dictionary like {'Hello': 'World'}
values = {}
values['Hello'] = 'World'
print(values) # Will print out {'Hello': 'World'}
So all you need to do is for each of the three values you need, assign the result of the function call to the appropriate key.
If this doesn't feel like quite enough for you to figure out this assignment, read the tutorials again and play with lists, dictionaries, and functions to try and get a feel for them. Good luck!
P.S. The append method of lists will be helpful to you. Try to figure out how to use it!
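If you get stuck after trying it yourself, here is one possible shape for the pieces described above. Treat it as a sketch, not the expected solution: the helper names `average` and `collect_numbers` are made up for this example, the input is passed in as a list instead of being read with input() so the code can be run as is, and returning None for an empty list is an assumption the assignment does not specify.

```python
SENTINEL = -9999

def average(numbers):
    """Loop-based average; returns None when there is nothing to average."""
    total = 0
    count = 0
    for n in numbers:
        total += n
        count += 1
    return total / count if count else None

def allNumAvg(numList):
    return average(numList)

def posNumAvg(numList):
    return average([n for n in numList if n > 0])

def nonPosAvg(numList):
    return average([n for n in numList if n <= 0])

def collect_numbers(entries):
    """Append entries to a list until the sentinel appears (sentinel excluded)."""
    numbers = []
    for value in entries:
        if value == SENTINEL:
            break
        numbers.append(value)
    return numbers

numbers = collect_numbers([4, -3, -15, 0, 10, 22, -9999])
averages = {}
averages['AvgPositive'] = posNumAvg(numbers)
averages['AvgNonPos'] = nonPosAvg(numbers)
averages['AvgAllNum'] = allNumAvg(numbers)
print(numbers)
print(averages)
```

In the real program, the list passed to collect_numbers would be replaced by a loop that calls int(input(...)) until the user types -9999.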

XSB prolog: Problems with lists

I'm new to XSB prolog and I'm trying to solve this problem.
I've got prices of products and some orders. It looks like this:
price(cola,3).
price(juice,1).
price(sprite,4).
% a product, e.g. cola, has a price of 3 (it doesn't matter which currency)
order(1, [cola,cola,sprite]).
order(2, [sprite,sprite,juice]).
order(3, [juice,cola]). % the number here is the number of the order
% and the list represents the products that
% belong to that order
Now, my task is to write a new function called bill/2. This function should take the number of the order and then sum up all the prices for the products in the same order(list).
Something like:
|?-bill(1,R).
R= 10 ==> ((cola)3 + (cola)3 + (sprite)4 = 10)
|?-bill(2,R).
R= 9 ==> ((sprite)4 + (sprite)4 + (juice)1 = 9)
and so on... I know how to get to the number of the order, but I don't know how to get each product from the list inside that order to get to its price, so I can sum it up.
Thanks in advance.
In plain Prolog, first collect the prices of all items in a list, then sum the list:
bill(Ord, Tot) :-
    order(Ord, Items),
    findall(Price, (member(I, Items), price(I, Price)), Prices),
    sum_list(Prices, Tot).
but since XSB has tabling available, there could be a better way, using some aggregation function.
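For comparison, the same two-step idea (look up the price of every item in the order, then sum the prices) looks like this in Python. It is only an illustration of the logic; the dictionaries `price` and `order` are an assumed Python rendering of the facts from the question.

```python
# The facts from the question, written as dictionaries.
price = {"cola": 3, "juice": 1, "sprite": 4}
order = {
    1: ["cola", "cola", "sprite"],
    2: ["sprite", "sprite", "juice"],
    3: ["juice", "cola"],
}

def bill(order_number):
    """Collect the price of each item in the order, then sum them."""
    prices = [price[item] for item in order[order_number]]  # like findall/3
    return sum(prices)                                      # like sum_list/2

print(bill(1))
print(bill(2))
```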