Apologies all, I'm new to R and am stumped on something that I am sure is simple!
I have this working code: lr5yr$"LR_3Y_>10%_over_5Y" <- ifelse(lr5yr$lr >= 0.65 & lr5yr$lr - lr5yr$lr5 < lr5yr$lr ,"Y","N")
Effectively, all I need is to specify at the end how much less the final calculation must be.
In my formula, the last condition shows a Y if the value of lr-lr5 is less than lr. What I need is for it to show a Y only if the amount by which it is less is greater than or equal to 0.1.
So, if lr-lr5 is less than lr by 0.1 or more, show Y.
I'm sure that what I want could be achieved with a slight modification to my formula above, however I needed a quick fix so managed to come up with a workaround.
I created a new column using the following: lr5yr$"LR_Diff" <- lr5yr$lr - lr5yr$lr5 and then derived my required Y/N flag with the following code: lr5yr$"LR_3Y_>10%_over_5Y" <- ifelse(lr5yr$lr >= 0.65 & lr5yr$LR_Diff > .1 ,"Y","N")
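For comparison, here is a minimal sketch of the same flag in Python rather than R. The function name `flag_row` and the sample values are hypothetical. It folds the difference into the condition directly, so no helper column is needed, and it uses `>=` because the stated requirement is "greater than or equal to 0.1".

```python
def flag_row(lr: float, lr5: float,
             min_lr: float = 0.65, min_diff: float = 0.1) -> str:
    """Return "Y" when lr is at least min_lr and exceeds lr5 by at least min_diff."""
    return "Y" if lr >= min_lr and (lr - lr5) >= min_diff else "N"


# Made-up (lr, lr5) pairs, chosen well away from the 0.1 boundary
rows = [(0.80, 0.60), (0.70, 0.68), (0.50, 0.20)]
flags = [flag_row(lr, lr5) for lr, lr5 in rows]
print(flags)
```

The equivalent change in the original R line would be to replace the final comparison with lr5yr$lr - lr5yr$lr5 >= 0.1.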
I have an array of numbers and for every number I want to check if it's greater than a value in another cell and if it is greater I want to add the difference to the total sum.
I have managed to do this "manually" for a handful of cells, but there must be a better way.
For simplicity I just compared the value to 10 but it will be another cell.
=sum(if(A1>=10,A1-10,0),if(A2>=10,A2-10,0),if(A3>=10,A3-10,0))
The formula above yields the expected result for A1:A3.
What unfortunately doesn't work is:
=SUM(if(A1:A3>=10,A1:A3-10,0))
In the end I changed my approach to arrive at the solution:
=SUMIF(A1:A3,">10") - COUNTIF(A1:A3,">10") * 10
So instead of summing the differences directly, we sum the appropriate values and then subtract the reference as often as we summed up.
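The identity behind this rewrite can be checked with a small sketch. It is written in Python instead of spreadsheet syntax, and the function names and sample cell values are made up for illustration.

```python
def sum_of_excess_direct(values, threshold):
    """Sum (value - threshold) over every value above the threshold."""
    return sum(v - threshold for v in values if v > threshold)


def sum_of_excess_via_counts(values, threshold):
    """Same total, computed the SUMIF - COUNTIF * threshold way."""
    above = [v for v in values if v > threshold]
    return sum(above) - len(above) * threshold


cells = [4, 12, 10, 25]  # hypothetical contents of the range
print(sum_of_excess_direct(cells, 10), sum_of_excess_via_counts(cells, 10))
```

A value exactly equal to the threshold contributes 0 either way, which is why ">10" in the rewritten formula gives the same total as the ">=10" test in the original.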
Try with this:
=SUM(ARRAYFORMULA(IF(A:A="","",IF(A:A>=10,A:A-10,0))))
try:
=BYROW(A1:A20, LAMBDA(a, IF(a>=10, a-10, 0)))
or:
=SUMPRODUCT(BYROW(A1:A20, LAMBDA(a, IF(a>=10, a-10, 0))))
Sorry if this is a lame post, but I often need to return a pre-defined value x if a cell exceeds y, and I would like a more efficient way than the one below (my formulas are often long).
Example below. I want cell to return 10 if
SUMIFS($AI$42:$AI$51;$AH$42:$AH$51;"<"&AH51)+1
is more than 10.
Is there a more efficient/elegant way than repeating this twice? i.e.
If((SUMIFS($AI$42:$AI$51;$AH$42:$AH$51;"<"&AH51)+1)>10;10;SUMIFS($AI$42:$AI$51;$AH$42:$AH$51;"<"&AH51)+1)
Thanks in advance!
Just put the SUMIFS part into another cell, then use that cell in the formula.
B1: =SUMIFS($AI$42:$AI$51;$AH$42:$AH$51;"<"&AH51)+1
RESULT: =If($B$1>10;10;$B$1)
If the result can't be higher than 10 then I'd go with MIN formula:
MIN(10; SUMIFS($AI$42:$AI$51;$AH$42:$AH$51;"<"&AH51)+1)
No extra cells, shorter and cleaner.
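As a quick check that the two forms agree, here is a small Python sketch. The function names are hypothetical, and `x` stands in for the result of the SUMIFS expression.

```python
def capped_with_if(x, cap=10):
    """The original pattern: the expression appears in both the test and the result."""
    return cap if x > cap else x


def capped_with_min(x, cap=10):
    """The MIN pattern: the expression appears only once."""
    return min(cap, x)


# Compare the two over a range of sample inputs
agree = all(capped_with_if(x) == capped_with_min(x) for x in range(-5, 30))
print(agree)
```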
I have the following data and I would like to apply the log() function:
v1
2
3
4
-1
5
Expected output:
v1
2 0.30 ~ log(2)
3 0.48 ~ log(3)
4 0.60 ~ log(4)
-1 .
5 0.70 ~ log(5)
This is just a simplified version of the problem. There are 35000 observations in my dataset and I could not find any simple rules like drop if v1 <= 0 to solve this problem.
Without screening my data first, one method I have in mind is to use a for loop and run the log() function over the observations. However, I couldn't find any websites telling me how to do that.
Stata will return missing if asked to take the logarithm of zero or negative values. But
generate log_x = log(x)
and
generate log_x = log(x) if x > 0
will have precisely the same result, missings in the observations with problematic values.
The bigger question here is statistical. Why do you want to take logarithms of such a variable anyway? If your idea is to transform a variable, then other transformations are available. If the variable is a response or outcome variable, then a generalized linear model with logarithmic link will work even if there are some zero or negative values; the idea is just that the mean function should remain positive.
There have been many, many threads raising these issues on Cross Validated and Statalist.
I can't imagine why you think a loop is either needed or helpful here. With generate statements of the kind above, Stata automatically loops over observations.
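To illustrate the behaviour described above outside Stata, here is a rough Python sketch. Using `None` as the missing marker and the name `log_or_missing` are assumptions made for this example. The whole column is transformed in one expression, and non-positive values simply become missing.

```python
import math


def log_or_missing(values):
    """Natural log of each positive value; None stands in for 'missing' otherwise."""
    return [math.log(v) if v > 0 else None for v in values]


v1 = [2, 3, 4, -1, 5]  # the sample data from the question
logged = log_or_missing(v1)
print(logged)
```

Note that `math.log` is the natural logarithm. The figures in the question's expected output (0.30 for 2, 0.48 for 3) match base-10 logarithms, for which `math.log10` would be the counterpart here.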
=COUNTIFS(Orders!$T:$T,$B4)
is a formula that returns 0 or a positive result.
I use this across 1500 cells, which fills the sheet with 0s.
I'd like to remove the zeros by using the following formula:
if(COUNTIFS(Orders!$T:$T,$B3,Orders!$F:$F,""&P$1&"*")=0,
"",
COUNTIFS(Orders!$T:$T,$B3,Orders!$F:$F,""&P$1&"*"))
This calculates every formula twice and increases the calculation time.
How can we do this in 1 formula where if the value is 0 - keep empty - otherwise display the answer
I suggest this cell-function:
=IFERROR(1/(1/COUNTIFS(Orders!$T:$T,$B4)))
EDIT:
I'm not sure what to add as explanation. Basically to replace the result of a complex calculation with blank cells if it results in 0, you can wrap the complex function in
IFERROR(1/(1/ ComplexFunction() ))
It works by twice taking the inverse (1/X) of the result, thus returning the original result in all cases except 0, where a #DIV/0! error is generated. This error is then caught by IFERROR to result in a blank cell.
The advantage of this method is that it doesn't need to calculate the complex function twice, so can give a significant speed/readability increase, and doesn't fool the output like a custom number format which can be important if this cell is used in further functions.
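The mechanism can be mimicked in Python as a sketch. Here `expensive_count` is a hypothetical stand-in for the long COUNTIFS expression, and catching `ZeroDivisionError` plays the role of IFERROR. The counter shows that the expression is evaluated only once.

```python
calls = 0


def expensive_count() -> int:
    """Hypothetical stand-in for the long COUNTIFS expression."""
    global calls
    calls += 1
    return 0


def blank_if_zero(compute):
    """Evaluate compute() once; return "" if the result is 0, else the value."""
    try:
        return 1 / (1 / compute())
    except ZeroDivisionError:
        return ""


result = blank_if_zero(expensive_count)
print(repr(result), calls)
```

This sketch uses Python floats, so the round trip through two reciprocals can pick up rounding error for some values. That caveat applies to the sketch and is not a claim about how the spreadsheet displays its counts.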
You only need to set the number format for your range of cells.
Go to the menu Format-->Number-->More Formats-->Custom Number Format...
In the entry area at the top, enter the following: #;-#;""
The "format" of the format string is
(positive value format) ; (negative value format) ; (zero value format)
You can apply colors or commas or anything else. See this link for details
instead of your =COUNTIFS(Orders!$T:$T,$B4) use:
=REGEXREPLACE(""&COUNTIFS(Orders!$T:$T,$B4), "^0$", )
also, to speed up things you should avoid "per row formulae" and use ArrayFormulas
The problem itself can be found here. The gist of it is that Bessie is riding a roller coaster, but she gets dizzy. What is the maximum amount of fun she can have without going over her "dizzy limit"?
The input consists of:
"N K L
where N (1 ≤ N ≤ 1,000) is the number of sections in this particular roller coaster; K (1 ≤ K ≤ 500) is the amount that Bessie's dizziness level will go down if she keeps her eyes closed on any section of the ride; and L (1 ≤ L ≤ 300,000) is the limit of dizziness that Bessie can tolerate — if her dizziness ever becomes larger than L, Bessie will get sick, and that's not fun!
Each of the next N lines will have two integers:
F D
where F (1 ≤ F ≤ 20) is the increase to Bessie's total fun that she'll get if she keeps her eyes open on that section, and D (1 ≤ D ≤ 500) is the increase to her dizziness level if she keeps her eyes open on that section. The sections will be listed in order."
My algorithm to solve this looks like this:
cin >> N; // number of sections
cin >> K; // amount dizziness goes down with eyes closed
cin >> L; // dizziness limit
belowL = L; // remaining dizziness capacity
for (int i = 0; i < N; i++) {
    cout << "\n" << i;
    cin >> F >> D; // fun increase and dizziness increase
    if (D < belowL) {
        if (F >= D) {
            funTotal += F;
        }
    }
    else {
        belowL -= K;
    }
}
However, this does not always yield the correct result. What is the problem? It should pick the fun option, unless it would put Bessie over the sickness threshold. Is there a better way to do that?
So rather than give you code here's a sort of explanation of the problem solution.
The basic approach is to solve using dynamic programming as this reduces to what's called a Knapsack problem. Think of it this way, her dizziness is how much she can carry in the sack at once and the fun is what she would like to maximize (compared to say the value of the "items" she carries in a sack). Now what we would like to do is get the most enjoyment out of the roller coaster (most value in the sack) without making her too dizzy (going over the "weight" limit on the sack).
So now you want to pick the sections where she has her eyes open or closed (whether an item is in the sack or not). An easy way to think of this is to take the maximum of the two options: keeping her eyes open, if she can do so without going over the threshold, or keeping them closed. But now the problem changes: if she keeps her eyes open, her remaining dizziness capacity decreases, which leaves a smaller subproblem to solve.
Using this information it becomes easy to adapt the knapsack solution to this problem without having to use backtracking or recursion.
The idea is to save all the previously solved subproblems in a matrix so that you can reuse the results instead of recalculating them each time. Note one trick you can use is that you only need the current row of the matrix and the previously solved one because the recurrence relation for knapsack only requires those entries :)
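One possible shape for such a table-based solution is sketched below in Python (the question's code is C++; Python is used here only for brevity). This is an illustration under stated assumptions and not the code used at the contest: the name `max_fun` is made up, Bessie is assumed to start at dizziness 0, and dizziness is assumed never to drop below 0. Only the previous row (`best`) and the current row (`nxt`) are kept, as described above.

```python
def max_fun(sections, k, limit):
    """Most fun achievable without dizziness ever exceeding limit.

    sections: list of (fun, dizzy) pairs in ride order.
    Assumptions: dizziness starts at 0 and is floored at 0.
    """
    # best[d] = most fun so far among ways of ending at dizziness d
    best = {0: 0}
    for fun, dizzy in sections:
        nxt = {}
        for d, f in best.items():
            # Option 1: eyes closed, dizziness drops by k (assumed floor at 0)
            closed = max(0, d - k)
            if nxt.get(closed, -1) < f:
                nxt[closed] = f
            # Option 2: eyes open, allowed only if the limit is not exceeded
            opened = d + dizzy
            if opened <= limit and nxt.get(opened, -1) < f + fun:
                nxt[opened] = f + fun
        best = nxt  # the previous row is no longer needed
    return max(best.values())


ride = [(1, 400), (5, 450), (10, 25), (18, 75), (20, 400)]  # (F, D) pairs
print(max_fun(ride, k=1, limit=500))
```

A dictionary keyed by dizziness is used in place of a dense array so that only reachable states are stored. For the largest inputs a different state layout may be needed to stay fast enough.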
P.S I was in the regional where this problem was given and solved it, nice to see this problem again
It should pick the fun option, unless it would put Bessie over the
sickness threshold. Is there a better way to do that?
The problem is that you're not maximizing the fun here, you're just preventing Bessie from getting sick. When you get to a section that would put her over the dizzy limit, you may be able to add more fun by backtracking and taking the not-fun option on some previous section. Say you've got input like this, in F D order:
1 400
5 450
10 25
18 75
20 400
Further, let's say that she's already near the dizzy limit when she hits the first section above. You could take the fun option on the first two sections and get an F increase of 6 and a D increase of 850. Maybe that puts her right at the limit. Now you have no choice but to take the not-fun option for subsequent sections. On the other hand, if you take the not-fun option for the first two sections, you can take the fun option for the next three and get an increase in F of 48 for a D increase of only 500.
Your current algorithm takes the fun option if the F:D ratio is at least 1 and you have enough dizziness capacity left. That's reasonable, but not optimal. It's likely that no fixed ratio will give an optimal solution. Instead, you need to consider the benefit and cost of each option for each section.
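To make the gap concrete, here is a small Python sketch that compares a simple "take the fun option whenever it fits" rule with an exhaustive search over the five sections listed above. The greedy rule is a simplified stand-in and not the asker's exact code. The values K = 1 and L = 900, the starting dizziness of 0, and the floor at 0 are all assumptions chosen for illustration.

```python
from itertools import product


def simulate(sections, choices, k, limit):
    """Total fun for one open/closed plan, or None if the limit is exceeded."""
    dizziness = fun = 0
    for (f, d), eyes_open in zip(sections, choices):
        if eyes_open:
            dizziness += d
            if dizziness > limit:
                return None
            fun += f
        else:
            dizziness = max(0, dizziness - k)  # assumed floor at 0
    return fun


def greedy(sections, k, limit):
    """Keep eyes open whenever doing so stays within the limit."""
    dizziness = fun = 0
    for f, d in sections:
        if dizziness + d <= limit:
            dizziness += d
            fun += f
        else:
            dizziness = max(0, dizziness - k)
    return fun


def exhaustive(sections, k, limit):
    """Try every open/closed plan; only practical for a handful of sections."""
    plans = product([False, True], repeat=len(sections))
    results = (simulate(sections, plan, k, limit) for plan in plans)
    return max(r for r in results if r is not None)


ride = [(1, 400), (5, 450), (10, 25), (18, 75), (20, 400)]  # (F, D) pairs
print(greedy(ride, k=1, limit=900), exhaustive(ride, k=1, limit=900))
```

With these assumed values the greedy rule spends its capacity on the first three sections, while the exhaustive search finds a plan that skips the second section and scores higher.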