Is he being greedy? - c++

I am solving a question from LeetCode.com:
Given an array of non-negative integers, you are initially positioned at the first index of the array.
Each element in the array represents your maximum jump length at that position.
Determine if you are able to reach the last index.
For example:
A = [2,3,1,1,4], return true.
A = [3,2,1,0,4], return false.
One of the most voted solutions (here) says that the following is using a greedy approach:
bool canJump(int A[], int n) {
    int last=n-1,i;  // last: leftmost index found so far from which the end is reachable
    for(i=n-2;i>=0;i--){
        if(i+A[i]>=last)last=i;
    }
    return last<=0;
}
I have two questions:
What is the intuition behind using a greedy algorithm for this?
How is the above solution a greedy algorithm?
I thought this would be solved with Dynamic Programming. I understand that some problems solvable by DP can also be solved by a greedy method, but what was the intuition behind this particular one that made the greedy approach the more natural choice?
This SO question highlights this difference to some extent. I understand this might be asking a bit more, but if possible, could someone please answer this question in this context? I would highly appreciate that.
Thank you.
Edit: I think one of the reasons for my confusion is the input [3,3,1,0,4]. As per the greedy paradigm, when i=0 wouldn't we take a jump of size 3 (A[0]) in order to greedily reach the output? But doing this would in fact be incorrect.

According to Wikipedia:
A greedy algorithm is an algorithmic paradigm that follows the problem solving heuristic of making the locally optimal choice at each stage with the hope of finding a global optimum.
Here, I want to draw your attention to the key phrase, "locally optimal choice at each stage", which is what makes an algorithm greedy.
Q1. What is the intuition behind using a greedy algorithm for this?
Since this question only asks whether it is possible to reach the last index of the array, a greedy algorithm is enough. At every index it makes the locally optimal choice (it assumes the maximum jump and records the farthest index reachable so far), and at the end it checks whether that farthest index reaches the last index.
If instead we had to report the jump size taken at each index, or to minimize the number of jumps needed to reach the end, then the direct use of a greedy algorithm like this would not serve our purpose.
Q2. How is the above solution a greedy algorithm?
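The if condition in the above code, if(i+A[i]>=last)last=i;, makes the algorithm greedy: it considers only the maximum jump from index i, and as soon as that jump reaches the current target (i+A[i]>=last), it commits to i as the new target and never revisits that choice.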
The analysis provided here may help you.
Edit
Let's talk about the input you mentioned - [3,3,1,0,4].
When i=0, the algorithm checks the maximum index that we can reach from i=0.
Then we move to the next index and check the maximum index we can reach from i=1. Since we moved to i=1, it is guaranteed that we can get to index 1 from index 0 (the jump size doesn't matter).
Please note, in this problem, we don't care whether we should take a jump of size 3 at i=0 though we know this will not help us to reach the end. What we care about is whether we can reach the end or beyond that end index by taking jumps.
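To make the walk-through above concrete, here is a minimal Python sketch of the same idea as a forward scan. It is not the quoted LeetCode solution (that one scans backward), and the names can_jump and farthest are my own. It only tracks the farthest index reachable so far, which is all the problem asks about.

```python
def can_jump(jumps):
    """Return True if the last index is reachable from index 0."""
    farthest = 0  # farthest index known to be reachable so far
    for i, step in enumerate(jumps):
        if i > farthest:
            # Index i itself cannot be reached, so nothing beyond it can be.
            return False
        # Greedy step: assume the maximum jump from i and extend the reach.
        farthest = max(farthest, i + step)
    return True


print(can_jump([3, 3, 1, 0, 4]))  # True
print(can_jump([3, 2, 1, 0, 4]))  # False
```

For [3,3,1,0,4] the reach after i=0 is 3 and after i=1 it is 4, so the end is reachable even though jumping 3 from index 0 would get stuck.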

Related

Interview Complexity Question: If statement and comparison complexity

I was studying some interview questions and came across this problem (shown in the question image).
I understood everything for the most part except the part that I have boxed in red. If every string within the array has already been sorted, shouldn't sorting the array take only O(a log a)? Why did they multiply that by O(s)? It explains that the string comparison when sorting would take O(s), with s being the largest string size. That makes sense... However, I assumed the comparison would look something like this...
if ( array[x].equals(array[y]) ) {...}
Don't if statements take a complexity of O(1)? So shouldn't that be disregarded? I might be wrong, but I think I had read that executing if statements and any other embedded statement (not a loop) would take a complexity of O(1). Please correct me if I am wrong and enlighten me on how to calculate complexity properly.
It's sorting, so the comparison would be 'a' > 'b'.
Now, imagine you have two strings like these:
'aaaaaaaa'
'aaaaaaab'
It would take S steps to get to the very last character to determine which string is larger. This operation is performed for each string comparison, hence we multiply it by S.
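As a rough illustration of why a single comparison is not O(1), here is a small Python sketch that compares two strings character by character and counts the characters examined. The helper compare_steps is hypothetical (real library comparisons are implemented differently, but they do the same kind of work).

```python
def compare_steps(s, t):
    """Return (result, steps): result is -1, 0 or 1; steps is characters examined."""
    steps = 0
    for a, b in zip(s, t):
        steps += 1
        if a != b:
            # The first differing character decides the order.
            return (-1 if a < b else 1), steps
    # All shared positions match; the shorter string sorts first.
    return (len(s) > len(t)) - (len(s) < len(t)), steps


print(compare_steps('aaaaaaaa', 'aaaaaaab'))  # (-1, 8)
print(compare_steps('baaaaaaa', 'aaaaaaab'))  # (1, 1)
```

The if statement itself is a single step, but evaluating its condition can take up to S character checks.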

How can I remove too close points in a list

I have a list of points with x,y coordinates:
List_coord=[(462, 435), (491, 953), (617, 285),(657, 378)]
The length of this list (4 elements here) can be very large, from a few hundred up to 35000 elements.
I want to remove points from this list that are closer to each other than a threshold.
Note: points are never at the exact same position.
My current code for that:
iteration=0
while iteration<5:
    for pt in List_coord:
        for PT in List_coord:
            if (abs(pt[0]-PT[0])+abs(pt[1]-PT[1]))!=0 and abs(pt[0]-PT[0])<threshold and abs(pt[1]-PT[1])<threshold:
                List_coord.remove(PT)
    iteration=iteration+1
Explanation of my terrible code :) :
I check whether the distance is 0, which means that I am comparing
the same point,
then I check the distance in x and in y.
Iteration:
I need a few iterations to avoid missing a removal, because the list changes inside the loop itself...
This code works, but it is a very slow process!
I am sure there is another, much easier method, but I wasn't able to find one, even though some already answered questions are close to mine.
Note: I would like to avoid using an extra library for this code if possible.
Python will be a bit slow at this ;-)
The solution you will probably want is called quad-trees, but I'll mention a simpler approach first, in case it's preferable.
The usual approach is to group the points so that you can easily reject points that are clearly far away from each other.
One approach might be to sort the list twice, once by x once by y. You can prove that if two points are too-close, they must be close in one dimension or the other. Thus your inner loop can break out early. If it sees a point that is too far away from the outer point in the sorted direction, it can know for a fact that all future points in that list are also too far away. Thus it doesn't have to look any further. Do this in X and Y and you're set!
This approach is going to tend to be dominated by the O(n log n) sort times. However, if all of your points share a single x value, you'll end up doing the same slow O(n^2) iteration that you're doing right now because you never terminate the inner loop early.
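Here is a minimal sketch of the sort-and-break-early idea, using no extra libraries. It assumes the question's definition of "too close" (both the x difference and the y difference are below threshold), so a single sort by x is enough to stop early and the y check happens inside the loop. The function name remove_close_points and the threshold of 100 are mine, and which point of a close pair survives (the one with the smaller x) is a choice made in this sketch, so the result may differ from the original loop.

```python
def remove_close_points(points, threshold):
    """Keep a subset of points in which no two are within threshold in both x and y."""
    kept = []
    for x, y in sorted(points):  # sorted by x, then y
        too_close = False
        # kept is in ascending x order, so scan it from the right-hand end.
        for kx, ky in reversed(kept):
            if x - kx >= threshold:
                break  # every earlier kept point is even farther away in x
            if abs(y - ky) < threshold:
                too_close = True
                break
        if not too_close:
            kept.append((x, y))
    return kept


List_coord = [(462, 435), (491, 953), (617, 285), (657, 378)]
print(remove_close_points(List_coord, 100))
# [(462, 435), (491, 953), (617, 285)]
```

As noted above, if many points share nearly the same x value the inner loop rarely breaks early and this degrades toward O(n^2).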
The more robust solution is to use quadtrees. Quadtrees are designed to solve the kind of problem you are looking at. The idea is to build a tree such that you can rapidly exclude large numbers of points. I'd recommend this.
If your number of points gets too large, I'd recommend getting a clustering library. Efficient clustering is a very difficult task, and often done in C++ or another fast language.

Time Complexity of fibonacci series in Bottom Up approach(DP)

Algorithm in Bottom up approach
a[0]=0,a[1]=1
integer fibo(n)
    if a[n]== null
        a[n] = fibo(n-1) + fibo(n-2)
    return a[n]
How does this algorithm have a time complexity of O(N)?
For n = 5 it makes 8 calls.
(Image: pass of Fibonacci series in Bottom Up approach)
In my view, fibo(5) makes 8 calls going from top to bottom and another 8 returning from bottom to top, so the total is 8+8=16 calls. So it's unclear to me how the time complexity is O(N).
I found many similar questions answered here, but none of them addresses my question.
Some of these are:
Time Complexity of Fibonacci Series
Time Complexity of Fibonacci Algorithm
Any help would be appreciated. Thanks.
There are a couple of quick things to mention before answering your question about the time complexity. The reason for this is that the time complexity at least partially depends on these answers.
First, there seems to be a bug in your program as you have an array 'a' for the base conditions (Fibonacci numbers 0 and 1) and some array 'm' which is set in the fibo function, but never used again. More importantly, when you reach n=1 or n=0, you return the value of m[n] which is entirely unknown. So, I'm going to assume the algorithm is rewritten as follows:
a[0]=0,a[1]=1
integer fibo(n)
    if a[n]== null
        a[n] = fibo(n-1) + fibo(n-2)
    return a[n]
Okay, second problem. Let's assume that a is always defined with room for at least n+1 integers. There needs to be enough room for the incoming data. This is important because C++ will let you overwrite values at the n+1th index. It's out of bounds and wrong, but C++ doesn't give those sorts of protections. It is up to you as the programmer to verify boundary conditions like that. (I'm assuming C++ because this is tagged with c++. The code looks more like Python, which has its wrap-around indices which are problematic on their own.)
Third, let's assume that you don't start with a new array 'a' for each run of the algorithm. This is important because if a stores already-calculated values then you will save time on calculation by not having to re-evaluate those values. That time savings is a great thing even if it won't affect how I calculate time complexity.
Great. Let's get started with your question. Let's use the image below to answer it. When you start the algorithm at n you are going to make two recursive calls for fibo(n-1) and fibo(n-2) BUT they do not happen simultaneously. Instead the first call for fibo(n-1) takes place and must be 100% complete before the second call for fibo(n-2) begins. That call is represented by the green line from n-1 on the nth line to the n-1th line.
Now, those green lines apply to each recursion down the line until you reach the fibo(1) call. That call terminates early because a[n] is NOT null. Finally the second call for fibo(0) is executed and it also terminates early because a[n] is not null. Okay, so much for the first set of recursive calls.
As each recursive call returns, the second call (represented by the orange broken line) is made, but a[n] is no longer null, so that call terminates early and the call returns up to the next layer.
So, let's count the number of calls. From n to 1 is n-1 recursive calls. At the end there is one additional call to fibo(0) so that is n recursive calls. Then on the way up there are n-2 additional calls which terminate early. So, altogether we have 2n-2 calls which is O(n).
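If you want to check that count yourself, here is a small Python sketch of the same memoized function with a call counter added. The wrapper count_fibo_calls and the use of None for "null" are my own choices, and it starts from a fresh table on every run. It counts the top-level call as well, so it reports 2n-1 rather than the 2n-2 recursive calls counted above.

```python
def count_fibo_calls(n):
    """Run the memoized fibo(n) on a fresh table and return (value, total calls)."""
    a = [None] * (max(n, 1) + 1)  # None plays the role of "null"
    a[0], a[1] = 0, 1
    calls = 0

    def fibo(k):
        nonlocal calls
        calls += 1
        if a[k] is None:
            # First visit: fibo(k-1) fills the table, so fibo(k-2) returns at once.
            a[k] = fibo(k - 1) + fibo(k - 2)
        return a[k]

    return fibo(n), calls


for n in (5, 10, 20):
    value, calls = count_fibo_calls(n)
    print(n, value, calls)
# 5 5 9
# 10 55 19
# 20 6765 39
```

For n = 5 that is 9 calls in total: the initial one plus the 8 recursive calls from the question.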
Of course, if you call fibo(k) and then fibo(k+x) you will only need to do the first 2x calls because everything from fibo(k) down is already known. It is a considerable savings after the initial investment. Any questions?
Regarding O(2n)=O(n), that is a good follow-up. Big-O complexity rules say that we are interested in the order of magnitude when comparing efficiency. So, suppose that you were looking at n=1000. Then n=1000 and 2n=2000, but n^2=1,000,000. O(n) is more or less the same as O(2n), but compared with O(n^2) there is a huge difference. Similarly, n+1=1001 isn't much different from n. So, in general we say that the leading term, the most important value in the equation, is what matters. We aren't really interested in extra terms or in specific coefficients, because they don't change the order of magnitude.
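A quick way to see the difference is to print the three quantities for growing n. This is only an illustrative Python sketch, and the helper name growth_row is made up.

```python
def growth_row(n):
    """Return n, 2n and n^2 side by side."""
    return n, 2 * n, n ** 2


for n in (1000, 2000, 4000):
    print(growth_row(n))
# (1000, 2000, 1000000)
# (2000, 4000, 4000000)
# (4000, 8000, 16000000)
```

Doubling n doubles both n and 2n, but it quadruples n^2, which is why the coefficient 2 is dropped while the exponent is not.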
If you still have questions, see this site for some additional information.
https://justin.abrah.ms/computer-science/big-o-notation-explained.html

Comparison of two common comparison algorithms and their Big O help please

Today my professor gave us 2 take-home questions as practice for an upcoming array unit in C, and I am wondering what sorting algorithm these 2 problems resemble and what their Big O is. Now, I am not coming here just expecting answers and I have ALREADY solved them, but I am not confident in my answers, so I will post them under each question, and if I am wrong, please correct me and explain my error in thinking.
Question 1:
If we decide to go through an array's(box) element(folders) one at a time. Starting at the first element and comparing it with the next. Then if they are the same the comparison ends, however if both are not equal then it moves on to comparing the next two ELEMENTS [2] and [3]. This process is repeated and will stop once last two elements are compared and note that the array IS already sorted by last name and we are looking for same first name! Example: [ Harper Steven, Hawking John, Ingleton Steven]
My believed answer:
I believe it is O(n) because it's just going over the elements of an array, comparing array[0] to array[1] and then array[2] to array[3], etc. This process is linear and continues until the last two are compared. Definitely not log n because we aren't multiplying or dividing by 2.
Final Question:
Suppose we have a box of folders each containing info on one person. If we were to want to look for people with same first name, we could first start by placing a sticker on the first folder in the box and then going through the folders after it in an orderly fashion until we find person with same name. If we find a folder with same name, we move that folder next to the folder with a sticker. Once we find ONE case where two people have same name, we stop and go to sleep because we're lazy. If the first search fails however, we simply remove sticker and place it on next folder and then continue as we did earlier. We repeat this process until sticker is on last folder in a scenario where we have no two people with same name.
This array is NOT sorted and compares the first folder with sticker folder[0] with the next i folder[i] elements.
My answer:
I feel like this can't be O(n), but maybe O(n^2), since we have an array and keep repeating the process over it, so the work is proportional to the square of the input size n (folders). I could be wrong here though >.>
You're right on both questions… but it would help to explain things a bit more rigorously. I don't know what the standards of your class are; you probably don't need an actual proof, but showing more detailed reasoning than "we aren't multiplying or dividing by two" never hurts. So…
In the first question, there's clearly nothing happening here but comparisons, so that's what we have to count.
And the worst case is obviously that you have to go through the whole array.
So, in that case, you have to compare a[0] == a[1], then a[1] == a[2], …, a[N-1] == a[N]. For each of N-1 elements, there's 1 comparison. That's N-1 steps, which is obviously O(N).
The fact that the array is sorted turns out to be irrelevant here. (Of course since they're not sorted by your search key—that is, they're sorted by last name, but you're comparing by first name—that was already pretty obvious.)
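Here is a short Python sketch of the first procedure as counted above (each element against its neighbour), with a comparison counter. The function name and the (last name, first name) tuple layout are assumptions made for illustration.

```python
def find_adjacent_same_first_name(people):
    """Return (index, comparisons) for the first adjacent pair sharing a first name."""
    comparisons = 0
    for i in range(len(people) - 1):
        comparisons += 1
        if people[i][1] == people[i + 1][1]:
            return i, comparisons
    return None, comparisons  # worst case: one comparison per adjacent pair


box = [("Harper", "Steven"), ("Hawking", "John"), ("Ingleton", "Steven")]
print(find_adjacent_same_first_name(box))  # (None, 2)
```

The counter can never exceed the number of adjacent pairs, which is where the O(N) bound comes from.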
In the second question, there are two things happening here: comparisons, and then moves.
For the comparisons, the worst case is that you have to do all N searches because there are no matches. As you say, we start with a[0] vs. a[1], …, a[N]; then a[1] vs. a[2], …, a[N], etc. So, N-1 comparisons, then N-2, and so on down to 0. So the total number of comparisons is sum(0…N-1), which is N*(N-1)/2, or N^2/2 - N/2, which is O(N^2).
For the moves, the worst case is that you find a match between a[0] and a[N]. In that case, you have to swap a[N] with a[N-1], then a[N-1] with a[N-2], and so on until you've swapped a[2] with a[1]. So, that's N-1 swaps, which is O(N), which you can ignore because you've already got an O(N^2) term.
As a side note, I'm not sure from your description whether you're talking about an array from a[0…N], or an array of length N, so a[0…N-1], so there could be an off-by-one error in both of the above. But it should be pretty easy to prove to yourself that it doesn't make a difference.
Scenario 2, a method of finding two matching items of arbitrary value, is indeed “quadratic”. Each pass looking for a match of one candidate against all the rest of the elements is O(n). But you repeat that n times. The value of n drops as you go so a detailed number of comparisons would be closer to n+(n-1)+(n-2)+ … 1 which is (n+1)×(n/2) or ½(n²+n) but all we care about is the overall shape of the curve so don't worry about the lower order terms or the coefficients. It's O(n²).
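The sticker procedure can be sketched the same way in Python. This version only counts comparisons and leaves out the step of moving the matching folder; the function name and the made-up test data are hypothetical.

```python
def find_any_same_first_name(people):
    """Return ((i, j), comparisons) for the first matching pair, or (None, comparisons)."""
    comparisons = 0
    for i in range(len(people)):              # folder carrying the sticker
        for j in range(i + 1, len(people)):   # folders after it
            comparisons += 1
            if people[i][1] == people[j][1]:
                return (i, j), comparisons
    return None, comparisons


# Worst case: ten folders, no two people share a first name.
no_match = [("L%d" % k, "F%d" % k) for k in range(10)]
print(find_any_same_first_name(no_match))  # (None, 45)
```

With 10 folders and no match the count is 9+8+...+1 = 45, i.e. 10*9/2, matching the quadratic formula.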

Finding all permutations that match a set of rules

I am given N numbers and M rules about their order. Each rule is a pair of indexes, and a pair (A, B) says that the number with index A (the A-th number) must come AFTER the B-th number; it doesn't have to be directly next to it.
Ex: N = 4
1 2 3 4
M = 2
3 2
3 1
Output: 1234, 4213, 4123, 2134, 2143, 2413, 1423 ...Maybe there are even more:)
The algorithm should give me all the permutations available that don't break the rules, like in the example - 3 must always be after 2 and after 1.
I tried brute-forcing but it didn't work (although brute force should work here, since N is in the range (1,8)).
Any ideas ?
Just as a hint.
You can treat your set of rules as a graph. Each index is a vertex, each rule is a directed edge.
Any proper ordering of the numbers (i.e. a permutation that satisfies the rules) corresponds to a so-called topological ordering of the above graph. In order to generate all valid orderings of your numbers, you need to generate all possible topological orderings of that graph.
P.S. The first algorithm for topological ordering given at the linked Wikipedia page already allows for a fairly straightforward solution that would enumerate all valid permutations. It will take some effort and some care to implement, but it is not rocket science.
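One possible way to do that, sketched in Python: build the ordering one number at a time, and at each step try every number whose prerequisites have all been placed already, backtracking afterwards. This is a simple backtracking variant of the idea, not necessarily the exact algorithm from the Wikipedia page, and the names all_valid_orders and must_follow are mine.

```python
def all_valid_orders(n, rules):
    """Yield every ordering of 1..n in which, for each rule (a, b), a comes after b."""
    must_follow = {v: set() for v in range(1, n + 1)}  # v -> numbers that must precede v
    for a, b in rules:
        must_follow[a].add(b)

    order, placed = [], set()

    def extend():
        if len(order) == n:
            yield tuple(order)
            return
        for v in range(1, n + 1):
            # v may be placed next only if everything it must follow is already placed.
            if v not in placed and must_follow[v] <= placed:
                order.append(v)
                placed.add(v)
                yield from extend()
                placed.remove(v)  # backtrack and try the next candidate
                order.pop()

    yield from extend()


orders = list(all_valid_orders(4, [(3, 2), (3, 1)]))
print(len(orders))  # 8
```

For the example in the question (3 after 2 and 3 after 1) this yields 8 orderings, including 1234 and 4213.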
Brute forcing would be going through every permutation, which is O(N!), and for each permutation simply looping through every rule to confirm that they apply, which is O(M). This ends up O(N!M) which is kind of ridiculous, but it shouldn't "not work" for such a small set.
Honestly, your best bet is to go back and get the brute force solution working. Once that is done (and if you still have time, etc) you can look for a better algorithm.
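For reference, a brute-force version along those lines could look like the Python sketch below. It uses itertools.permutations from the standard library; the function name brute_force_orders is made up, and rules are taken as (A, B) pairs meaning "A must come after B" as in the question.

```python
from itertools import permutations


def brute_force_orders(n, rules):
    """Return every permutation of 1..n that satisfies all (a, b) rules: a after b."""
    valid = []
    for perm in permutations(range(1, n + 1)):
        # Position of each number in this permutation.
        position = {value: idx for idx, value in enumerate(perm)}
        if all(position[a] > position[b] for a, b in rules):
            valid.append(perm)
    return valid


result = brute_force_orders(4, [(3, 2), (3, 1)])
print(len(result))  # 8
```

With N at most 8 this checks at most 8! = 40320 permutations, which is tiny.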
EDIT to the down voter. The student is (should be) trying to get his homework done on time. By the sounds of it, his homework is a programming exercise where a brute-force solution would be adequate. Helping him to figure out an efficient algorithm is not addressing his REAL problem.
In this case he has tried the simple brute-force approach (which everyone agrees ought to work for small N values) and given up on it prematurely to try something that is probably more difficult. Any experienced developer will tell you that this is a bad idea. The student needs and deserves to be told so, and if he is sensible he will pay attention. But obviously, it is his choice ...