T 103 - Negative Marking - ccc

Raju is giving his JEE Main exam. The exam has Q questions and Raju needs S marks to pass. Giving the correct answer to a question awards the student with 4 marks whereas giving the incorrect answer to a question awards the student with negative 3 (-3) marks. If a student chooses to not answer a question at all, he is awarded 0 marks.
Write a program to calculate the minimum accuracy that Raju will need in order to pass the exam.
Input
Input consists of multiple test cases.
Each test case consists of two integers Q and S
Output
Print the minimum accuracy up to 2 decimal places
Print -1 if it is impossible to pass the exam
Sample Input 0
2
10 40
10 33
Sample Output 0
100.00
90.00

Think of this as a simultaneous equation problem.
4x - 3y = S
x + y = Q
For the second scenario, your equations will be:
4x - 3y = 33
x + y = 10
After solving 'x' will be equal to the minimum number of questions he has to solve correctly. Calculate what percentage of 'Q' is 'x'.
That's the concept, think how you would approach it programmatically :)
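A minimal sketch of that idea in Python, assuming Raju attempts every question (so x + y = Q) and that accuracy means x as a percentage of Q. Eliminating y from the two equations gives 7x = S + 3Q. The function name `min_accuracy`, the rounding-up of x when the solution is not a whole number, and the clamp at zero are assumptions for illustration, not part of the original problem statement.

```python
def min_accuracy(q, s):
    """Return the minimum accuracy (percent) needed to score at least s
    out of q questions, or -1.0 if passing is impossible.

    Assumes q > 0 and that every question is attempted:
    4 marks per correct answer, -3 per incorrect answer.
    """
    # Even a perfect paper scores only 4 * q marks.
    if s > 4 * q:
        return -1.0
    # From 4x - 3(q - x) >= s we get x >= (s + 3q) / 7.
    # Integer ceiling division, clamped at 0 for very low targets.
    x = max(0, -(-(s + 3 * q) // 7))
    return 100.0 * x / q
```

For the two sample cases, `print(f"{min_accuracy(10, 40):.2f}")` prints 100.00 and `print(f"{min_accuracy(10, 33):.2f}")` prints 90.00.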


Determine Maximum Profit Algorithm C++

Consider the following problem:
The Searcy Wood Shop has a backlog of orders for its world famous rocking chair (1 chair per order). The total time required to make a chair is 1 week. However, since the chairs are sold in different regions and various markets, the amount of profit for each order may differ. In addition, there is a deadline associated with each order. The company will only earn a profit if they meet the deadline; otherwise, the profit is 0.
Write a program that will determine an optimal schedule for the orders that will maximize profit.
The first line in a test case will contain an integer, n (0 ≤ n ≤ 1000), that represents the number of orders that are pending. A value of 0 for n indicates the end of the input file.
The next n lines contain 3 positive integers each. The first integer, i, is an order number. All order numbers for a given test case are unique. The second integer represents the number of weeks from now until the deadline for order number i. The third integer represents the amount of profit that the company will earn if the deadline is met for order number i.
Example input:
7
1 3 40
2 1 35
3 1 30
4 3 25
5 1 20
6 3 15
7 2 10
4
3054 2 30
4099 1 35
3059 2 25
2098 1 40
0
Output:
100
70
The output is the maximum total profit for each test case.
The problem that I am having is that I am struggling to come up with an algorithm that consistently finds this optimal sum.
My first idea was that I could simply go through each input week by week and choose the chair with the highest profit for said week. This didn't work though in the case that a week has two chairs that both have a higher profit than the week prior.
My next idea was that I could order the list in order from highest to lowest profit. Then I would go through the list from the highest profit and compare the current entry to the next entry and choose the entry with the lower week.
Neither of these works consistently. Can anyone help me?
I would first sort the list by second column (number of weeks before the deadline) in increasing order and then sort the third column (profit) in decreasing order.
For example, in your file:
2098 1 40
2 1 35
4099 1 35
3 1 30
5 1 20
3054 2 30
3059 2 25
7 2 10
1 3 40
4 3 25
6 3 15
Among the orders with the same number of weeks, I will pick the highest profit to execute. If the deadline is 1 week, take the top order; 2 weeks, the 2 top orders; 3 weeks, the 3 top orders; and so on.
First, think about which orders are eligible to be completed in week i: all the orders with a deadline greater than or equal to i. So iterate over the orders in decreasing order of deadline.
Let's say the last deadline week is 'x'. Push all the profit values with deadline 'x' into a priority queue. The max of the pushed values is your optimal profit for week 'x'; remove it from the priority queue and add it to your answer. The remaining values are still eligible for earlier weeks, so add the profit values with deadline 'x-1' to the priority queue, take the max again, and repeat until the week reaches 0.
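The priority-queue approach described above can be sketched as follows. This is an illustration in Python rather than the C++ the question asks about; the function name `max_profit` and the tuple layout `(order_id, deadline, profit)` are assumptions. Python's `heapq` is a min-heap, so profits are pushed negated to get max-heap behaviour.

```python
import heapq
from collections import defaultdict


def max_profit(orders):
    """orders: iterable of (order_id, deadline_week, profit) tuples.

    Walks the weeks from the latest deadline back to week 1, each week
    building the chair with the highest profit among the orders whose
    deadline has not yet passed.
    """
    by_deadline = defaultdict(list)
    for _order_id, deadline, profit in orders:
        by_deadline[deadline].append(profit)

    heap = []   # negated profits of all orders still eligible
    total = 0
    for week in range(max(by_deadline, default=0), 0, -1):
        # Orders due exactly this week become eligible now.
        for profit in by_deadline.get(week, []):
            heapq.heappush(heap, -profit)
        # Build the most profitable eligible chair in this week, if any.
        if heap:
            total += -heapq.heappop(heap)
    return total
```

On the two test cases from the question this returns 100 and 70. The loop runs once per week up to the latest deadline, so very large deadlines would call for skipping empty weeks; that refinement is left out of this sketch.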

Best way to sort two results from string

I've got the results back from a function stored as a string:
TF00, 24 percent complete
TF01, 100 percent complete
TF02, 0 percent complete
TF03, 5 percent complete
but I need to sort it (reverse numerically) by the second item, so it looks like this:
TF01, 100 percent complete
TF00, 24 percent complete
TF03, 5 percent complete
TF02, 0 percent complete
What's the most Pythonic way of doing this?
Assume s is the str, then:
print('\n'.join(sorted(s.split('\n'), key=lambda x: int(x.split()[1]), reverse=True)))

Reshaping Pandas data frame (a complex case!)

I want to reshape the following data frame:
index id numbers
1111 5 58.99
2222 5 75.65
1000 4 66.54
11 4 60.33
143 4 62.31
145 51 30.2
1 7 61.28
The reshaped data frame should be like the following:
id 1 2 3
5 58.99 75.65 nan
4 66.54 60.33 62.31
51 30.2 nan nan
7 61.28 nan nan
I use the following code to do this.
import pandas as pd
dtFrame = pd.read_csv("data.csv")
ids = dtFrame['id'].unique()
temp = dtFrame.groupby(['id'])
temp2 = {}
for i in ids:
    temp2[i] = temp.get_group(i).reset_index()['numbers']
dtFrame = pd.DataFrame.from_dict(temp2)
dtFrame = dtFrame.T
Although the above code solves my problem, is there a simpler way to achieve this? I tried a pivot table, but it does not solve the problem, perhaps because it requires the same number of elements in each group. Or maybe there is another way which I am not aware of; please share your thoughts about it.
In [69]: df.groupby(df['id'])['numbers'].apply(lambda x: pd.Series(x.values)).unstack()
Out[69]:
0 1 2
id
4 66.54 60.33 62.31
5 58.99 75.65 NaN
7 61.28 NaN NaN
51 30.20 NaN NaN
This is really quite similar to what you are doing except that the loop is replaced by apply. The pd.Series(x.values) has an index which by default ranges over integers starting at 0. The index values become the column names (above). It doesn't matter that the various groups may have different lengths. The apply method aligns the various indices for you (and fills missing values with NaN). What a convenience!
I learned this trick here.
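For reference, here is a self-contained sketch of the same groupby/apply/unstack idea, with the question's data typed in by hand instead of read from data.csv. The helper name `widen` is hypothetical and only there to make the snippet easy to reuse.

```python
import pandas as pd

# The sample data from the question, entered directly.
df = pd.DataFrame({
    "id": [5, 5, 4, 4, 4, 51, 7],
    "numbers": [58.99, 75.65, 66.54, 60.33, 62.31, 30.2, 61.28],
})


def widen(frame):
    """Spread each id's numbers across columns 0, 1, 2, ...

    Each group becomes a fresh Series indexed from 0, so unstack()
    can align groups of different lengths and pad with NaN.
    """
    return (
        frame.groupby("id")["numbers"]
        .apply(lambda x: pd.Series(x.values))
        .unstack()
    )


wide = widen(df)
```

The rows of `wide` come out sorted by id (4, 5, 7, 51), as in the output shown above, rather than in the order of first appearance used in the question's desired table.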

Stata: break ties in rank of a variable using a second variable

I would like to rank observations in Stata by score1, while breaking ties using score2, as below:
score1 score2 desired_rank
____________________________
99 5 1
99 4 2
89 8 3
80 9 4
80 9 4
78 6 6
I've tried using egen rank, but can't find an option for specifying another variable for tiebreaking.
I've also read this post, but I haven't been able to adapt its solution to my problem very elegantly.
Any recommendations on how to create desired_rank?
One way could be:
clear
set more off
*----- example data -----
input ///
score1 score2 desired_rank
99 4 2
99 5 1
89 8 3
80 9 4
78 6 6
80 9 4
end
list, sep(0)
*----- what you want -----
egen scoreg = group(score1 score2)
egen myrank = rank(scoreg), field
// check
assert desired_rank == myrank
sort myrank
list, sep(0)
The key here is that egen, group() will assign group numbers according to the sort order of the varlist: score1 score2. Then use egen, rank() but with the field option which will rank the highest value as 1 and will not correct ties.
Let's flag here that the question asks for a twist on Stata's default ranking conventions. By default, Stata ranks the lowest value as 1, as is the more common practice in statistics, but here the question asks for the opposite convention, which Stata calls field ranks. That term is intended to evoke field events in athletics such as throwing and jumping in which the highest or longest score is ranked 1.
@Roberto Ferrer's solution is good, but let's work from first principles as an alternative. If we get the observations into the desired sort order, the rank desired is just the observation number, except that if the values in one observation are the same as those in the preceding observation, that rank is used, an exception we apply in cascade.
Here is some code:
clear
input score1 score2 desired_rank
99 5 1
99 4 2
89 8 3
80 9 4
80 9 4
78 6 6
end
gsort -score1 -score2
gen Desired_Rank = _n
replace Desired_Rank = Desired_Rank[_n-1] if score1 == score1[_n-1] & score2 == score2[_n-1]
assert desired_rank == Desired_Rank
Had we wanted lowest values to rank 1, the sorting command would have been
sort score1 score2
This solution gets messier if we want to rank only some observations using if or in; or if there are missing values; or if there are more scores to be used. In all those cases a solution based on egen is cleaner.
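For readers who do not use Stata, the first-principles logic (sort descending, number the observations, copy the previous rank on an exact tie) can be sketched in Python. The function name `field_ranks` is hypothetical and this is only an illustration of the logic, not a translation of any Stata internals.

```python
def field_ranks(rows):
    """rows: list of (score1, score2) tuples.

    Returns field ranks in the original row order: the highest
    (score1, score2) pair gets rank 1, and identical pairs share
    the rank of the first of them.
    """
    # Indices of rows sorted from highest to lowest pair.
    order = sorted(range(len(rows)), key=lambda i: rows[i], reverse=True)
    ranks = [0] * len(rows)
    for pos, i in enumerate(order, start=1):
        if pos > 1 and rows[order[pos - 2]] == rows[i]:
            # Same values as the preceding observation: reuse its rank.
            ranks[i] = ranks[order[pos - 2]]
        else:
            ranks[i] = pos
    return ranks
```

With the rows in the order used in the first answer, `field_ranks([(99, 4), (99, 5), (89, 8), (80, 9), (78, 6), (80, 9)])` gives `[2, 1, 3, 4, 6, 4]`, which matches desired_rank.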
This is a good point to emphasise a trick obvious when it's explained:
egen rank1 = rank(mpg)
egen rank2 = rank(-mpg)
Negating a variable flips the ranking order round. The ranks of 2.71828, 3.14159 and 42 are 1, 2, 3; the ranks of -2.71828, -3.14159, -42 are 3, 2, 1. People often miss that the rank() function of egen can be fed an expression, which can easily be more complicated than a single variable name.
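The negation trick is not specific to Stata. A small Python sketch, assuming distinct values so that ties need no special handling (the helper name `track_ranks` is made up for this example):

```python
def track_ranks(values):
    """Rank distinct values so that the lowest value gets rank 1."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


values = [2.71828, 3.14159, 42]
lowest_first = track_ranks(values)                 # [1, 2, 3]
highest_first = track_ranks([-v for v in values])  # [3, 2, 1]
```

Feeding the negated values to the same ranking function is all it takes to reverse the order.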
Personal note: When writing some ranking code for Stata in 1999, I was surprised to find no hint in the statistical or computing literature of names for different kinds of ranks, so I introduced the terms field and track to the Stata literature. Some years on, the only other term I have noticed is "schoolmaster's rank" for field rank, but that does not seem a better term, for several quite different reasons.

WEKA: What does the number after the '/' represent in these leaves?

"0(607.0/60.0)"
"1(149.0/14.0)"
I know that 607 and 149 represent the total number of examples covered by each leaf.
I want to know what the numbers "60" and "14" after the '/' represent?
The second number is the number (weight) of those instances that are misclassified.
The first number is the total number of instances (weight of instances) reaching the leaf. The second number is the number (weight) of those instances that are misclassified.
https://weka.wikispaces.com/What+do+those+numbers+mean+in+a+J48+tree%3F
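To make the reading of a leaf concrete, here is a small Python sketch that splits a leaf string such as "0(607.0/60.0)" into its class label, the total weight reaching the leaf, and the misclassified weight. The function name `parse_leaf` and the regular expression are assumptions for illustration; they are not part of WEKA.

```python
import re

# label, then "(total)" or "(total/misclassified)"
_LEAF = re.compile(r"\s*(.+?)\s*\(([\d.]+)(?:/([\d.]+))?\)\s*")


def parse_leaf(text):
    """Return (label, total, misclassified) for a J48-style leaf string.

    When the leaf has no "/second" part, the misclassified weight
    is taken to be 0.0.
    """
    match = _LEAF.fullmatch(text)
    if match is None:
        raise ValueError(f"not a leaf string: {text!r}")
    label, total, wrong = match.groups()
    return label, float(total), float(wrong) if wrong else 0.0


label, total, wrong = parse_leaf("0(607.0/60.0)")
error_rate = wrong / total  # share of the leaf's weight that is misclassified
```

For the first leaf in the question this gives a total of 607.0 and a misclassified weight of 60.0, so roughly one tenth of the weight reaching that leaf is misclassified.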
For sample dataset
Decision tree result:
physician-fee-freeze = n: democrat (253.41/3.75).
The first number indicates the number of correct things that reach that node (in this case democrats), and the second number after the “/” shows the number of incorrect things that reach that node (in this case republicans).
Total number of instances: 435
Total number of “no” (also the integral number of correct things): 253
Probability of having “no”: 253/435 = 0.58
Total number of missing data: 11
Total number of times it comes with “no”: 8
Probability: 8/11 = 0.72
Total probability that missing data could be “no”: 0.58 × 0.72 = 0.42
Total number of correct things: 253 + 0.42 = 253.42 ≈ 253.41
The number after the “/” shows the number of incorrect things that reach that node. If you look at this data, it has five incorrect instances where “republican” is the result while “physician fee freeze” is “n” (or “?”).
Those five can be split as follows:
Total number of incorrect instances with “n”: 2
Total number of incorrect instances with “?”: 3
Similar formula:
2+(253/435)*3=3.75