Go through the array from left to right and collect as many numbers as possible - c++

CSES problem (https://cses.fi/problemset/task/2216/).
You are given an array that contains each number between 1…n exactly once. Your task is to collect the numbers from 1 to n in increasing order.
On each round, you go through the array from left to right and collect as many numbers as possible. What will be the total number of rounds?
Constraints: 1≤n≤2⋅10^5
This is my code in C++:
int n, res=0;
cin>>n;
int arr[n];
set <int, greater <int>> lastEl;
for(int i=0; i<n; i++) {
cin>>arr[i];
auto it=lastEl.lower_bound(arr[i]);
if(it==lastEl.end()) res++;
else lastEl.erase(*it);
lastEl.insert(arr[i]);
}
cout<<res;
I go through the array once. If the element arr[i] is smaller than all the previous ones, then I "open" a new sequence, and save the element as the last element in this sequence. I store the last elements of already opened sequences in a set. If arr[i] is larger than some of the stored last elements, then I take the existing sequence with the largest last element (but less than arr[i]), and replace the last element of this sequence with arr[i].
Alas, it works on only two of the three given tests, and for the third one the output is much less than it should be. What am I doing wrong?

Let me explain my thought process in detail so that it will be easier for you next time when you face the same type of problem.
First of all, a mistake I often made when faced with this kind of problem is giving in to the urge to simulate the process. What do I mean by "simulating the process" mentioned in the problem statement? The problem says that in each round you collect as many numbers as possible, in increasing order. So, you start with 1, find it, and see that the next number, 2, is not to the right of it, i.e., 2 cannot be collected in the same round as 1. So, we need another round for 2. Now we find that 2 and 3 can both be collected in the same round, as we're moving from left to right and taking numbers in increasing order. But we cannot take 4 because it appears before 2. Finally, for 4 and 5 we need another round. That makes a total of three rounds.
Now, the problem becomes very easy to solve if you simulate the process in this way. In the first round, you look for numbers that form an increasing sequence starting with 1. You remove these numbers before starting the second round. You continue this way until you've exhausted all the numbers.
But simulating this process will result in a time complexity that won't pass the constraints mentioned in the problem statement. So, we need to figure out another way that gives the same output without simulating the whole process.
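For comparison, the naive simulation could be sketched roughly as follows. This is only an illustration and not code from the problem or from this answer: the function name countRoundsBySimulation is made up here, and the array 4 2 1 5 3 in the comment is an assumed example that is consistent with the walkthrough above. Because the numbers are collected in order, there is no need to actually remove them; it is enough to remember which number is wanted next.

```cpp
#include <vector>

// Naive simulation (hypothetical helper, for illustration only).
// Each round scans the whole array from left to right and collects the
// next wanted number whenever it is seen. Assumes arr is a permutation
// of 1..n, as the problem statement guarantees.
// Worst case is O(n^2): for a strictly decreasing array every round
// collects exactly one number.
int countRoundsBySimulation(const std::vector<int>& arr) {
    const int n = static_cast<int>(arr.size());
    int next = 1;    // the number we are waiting to collect
    int rounds = 0;
    while (next <= n) {
        ++rounds;
        for (int x : arr) {
            if (x == next) ++next;  // collected, now wait for the following number
        }
    }
    return rounds;  // for the assumed example {4, 2, 1, 5, 3} this is 3
}
```

With n up to 2⋅10^5, a decreasing input would need about 4⋅10^10 steps, which is why this approach does not fit the constraints.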
Notice that the position of the numbers is crucial here. Why do we need another round for 2? Because it comes before 1. We don't need another round for 3 because it comes after 2. Similarly, we need another round for 4 because it comes before 3.
So, when considering each number, we only need to be concerned with the position of the number that precedes it in value. When considering 2, we look at the position of 1: does 1 come before or after 2 in the array? If 1 comes before 2, we don't need another round. But if 1 comes after 2, we need an extra round. For each number, we check this condition and increment the round count if necessary. This way, we can figure out the total number of rounds without simulating the whole process.
#include <iostream>
#include <vector>
using namespace std;
int main(int argc, char const *argv[])
{
int n;
cin >> n;
vector <int> v(n + 1), pos(n + 1);
for(int i = 1; i <= n; ++i){
cin >> v[i];
pos[v[i]] = i;
}
int total_rounds = 1; // we'll always need at least one round because the input sequence will never be empty
for(int i = 2; i <= n; ++i){
if(pos[i] < pos[i - 1]) total_rounds++;
}
cout << total_rounds << '\n';
return 0;
}
Next time you're faced with this type of problem, pause for a while and try to control your urge to simulate the process in code. Almost certainly, there will be some clever observation that will allow you to achieve an optimal solution.

Related

C++ Finding the last occurrence of an int in a linear search

This week for homework I've been tasked with loading in a text file of 1,000 numbers and to do a linear search of a number entered in by a user. I have the linear search part done, but I have to find and print the last occurrence of that integer. I figured it would be easiest to run the array from the end and print the last occurrence and break the loop. I've started the code, but am having some trouble at finding the last occurrence.
I know my second for loop to run the array backwards is wrong, I'm just not sure what about it is wrong. Any help is appreciated! Thank you!
#include <iostream>
#include <fstream>
#include <conio.h>
using namespace std;
int main()
{
ifstream input("A1.txt");
int find;
cout << "Enter a number to search for: ";
cin >> find;
if (input.is_open())
{
int linSearch[1000];
for (int i = 0; i < 1000; i++)
{
input >> linSearch[i];
for (i = 1000; i > 0; i--)
{
if (find == linSearch[i])
{
cout << find << " is at position " << i << ". " << endl;
}
}
}
}
_getch();
return 0;
}
for (int i = 0; i < 1000; i++)
{
input >> linSearch[i];
This is a good start. You started a loop to read the 1000 numbers into your array.
for (i = 1000; i > 0; i--)
Don't you think this is a bit premature? You haven't finished the loop that reads the 1000 numbers from the file yet, and you're already searching an array that hasn't been fully read. There's a very technical term for this logical mistake: "putting the cart before the horse". You need to finish the loop that reads the 1000 numbers first. Only then can you execute this second loop.
{
if (find == linSearch[i])
Ok, now let's back up a bit. You started the loop with i=1000. Now, right here, what is the very first value of i? It is 1000. Don't you see a problem here? The 1000-element array, "linSearch", as you know, contains values numbered 0 through 999. That's 1000 elements total. With i starting off with a value of 1000, accessing the non-existent linSearch[1000] is undefined behavior, and a bug. The condition i > 0 also means that index 0 is never checked.
You could tweak the logic here, to get it right. But it's not even necessary to do that. You already have a perfectly working loop that reads the 1000 numbers from the file. And you know which number you want to search.
So, each time you read the next number from the file, if it's the number you're looking for, you just store its position. So, when all is said and done, the last position that's stored in that variable will be the position of the last occurrence of the number you're searching for. Simple logic. All you have to do is also set a flag indicating that the number you were searching for has been found.
And once you come to the decision to do that, you will find that it's no longer even needed to have any kind of an array in the first place. All you have to do is read the 1000 numbers from the file, one number at a time, check if each number is the one you're searching for, and if so, save its position. Then, at the end of the loop, compare notes.
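The array-free approach described above might be sketched like this. It is a minimal illustration, not the asker's required solution: the function name lastOccurrence is hypothetical, positions are assumed to be zero-based, and a return value of -1 stands in for the "found" flag mentioned earlier.

```cpp
#include <istream>
#include <sstream>

// Reads whitespace-separated integers from `in` until the stream ends
// and returns the zero-based position of the last one equal to `target`,
// or -1 if `target` never appears. (Hypothetical helper, illustration only.)
int lastOccurrence(std::istream& in, int target) {
    int position = -1;  // -1 doubles as the "not found" flag
    int value;
    for (int index = 0; in >> value; ++index) {
        if (value == target) {
            position = index;  // later matches overwrite earlier ones
        }
    }
    return position;
}
```

Any input stream works here, so the same function can be called with the std::ifstream opened on A1.txt or with a std::istringstream when testing.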
Since it's homework, I should probably be at least a little vague, and I definitely shouldn't use code.
You should not be nesting the 2nd loop within the first loop. It should be at the same indentation level as, and after the closing bracket for, the first loop.
Also, you shouldn't be searching back to 0 in almost all cases, but instead back to where you found the element in your linear search, or where you find it, and no further.
And yes, pay attention to what Beta wrote.
Also, shouldn't you break out of the loop each time you find what you're looking for?

Is this code a bubble sorting program?

I made a simple bubble sorting program. The code works, but I do not know if it's correct.
What I understand about the bubble sorting algorithm is that it checks an element and the other element beside it.
#include <iostream>
#include <array>
using namespace std;
int main()
{
int a, b, c, d, e, smaller = 0,bigger = 0;
cin >> a >> b >> c >> d >> e;
int test1[5] = { a,b,c,d,e };
for (int test2 = 0; test2 != 5; ++test2)
{
for (int cntr1 = 0, cntr2 = 1; cntr2 != 5; ++cntr1,++cntr2)
{
if (test1[cntr1] > test1[cntr2]) /*if first is bigger than second*/{
bigger = test1[cntr1];
smaller = test1[cntr2];
test1[cntr1] = smaller;
test1[cntr2] = bigger;
}
}
}
for (auto test69 : test1)
{
cout << test69 << endl;
}
system("pause");
}
It is a bubblesort implementation. It just is a very basic one.
Two improvements:
Each pass (outer loop iteration) can be one element shorter than the previous one, since the previous pass is guaranteed to have moved its largest element to the end.
When no swap is done during a pass, you're finished (this is part of the definition of bubble sort on Wikipedia).
Some comments:
use better variable names (test2?)
use the size of the container or the range, don't hardcode 5.
using std::swap() to swap variables leads to simpler code.
Here is a more generic example using (random access) iterators with my suggested improvements and comments and here with the improvement proposed by Yves Daoust (iterate up to last swap) with debug-prints
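As a rough sketch of the two improvements and the comments listed above (not the linked example itself), a version using std::vector, std::swap, a shrinking range, and an early exit could look like this. The name bubbleSort and the choice of std::vector<int> are assumptions made for the illustration.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Bubble sort with a shrinking range and an early exit.
// (Illustrative sketch; names are not from the original post.)
void bubbleSort(std::vector<int>& values) {
    if (values.size() < 2) return;  // nothing to sort
    // After each pass the largest remaining element sits at index `end`,
    // so the next pass can stop one element earlier.
    for (std::size_t end = values.size() - 1; end > 0; --end) {
        bool swapped = false;
        for (std::size_t i = 0; i < end; ++i) {
            if (values[i] > values[i + 1]) {
                std::swap(values[i], values[i + 1]);
                swapped = true;
            }
        }
        if (!swapped) break;  // a pass without swaps means the data is sorted
    }
}
```

Using values.size() instead of a hard-coded 5 lets the same function sort input of any length.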
The correctness of your algorithm can be explained as follows.
In the first pass (inner loop), the comparison T[i] > T[i+1] with a possible swap makes sure that the largest of T[i], T[i+1] is on the right. Repeating for all pairs from left to right makes sure that in the end T[N-1] holds the largest element. (The fact that the array is only modified by swaps ensures that no element is lost or duplicated.)
In the second pass, by the same reasoning, the largest of the N-1 first elements goes to T[N-2], and it stays there because T[N-1] is larger.
More generally, in the Kth pass, the largest of the N-K+1 first elements goes to T[N-K], stays there, and the next elements are left unchanged (because they are already increasing).
Thus, after N passes, all elements are in place.
This hints at a simple optimization: all elements following the last swap in a pass are in place (otherwise the swap wouldn't be the last). So you can record the position of the last swap and perform the next pass up to that location only.
Though this change doesn't seem to improve a lot, it can reduce the number of passes. Indeed by this procedure, the number of passes equals the largest displacement, i.e. the number of steps an element has to take to get to its proper place (elements too much on the right only move one position at a time).
In some configurations, this number can be small. For instance, sorting an already sorted array takes a single pass, and sorting an array with all elements swapped in pairs takes two. For such inputs this is an improvement from O(N²) to O(N)!
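The last-swap idea could be sketched as follows. This is a hedged illustration rather than the answerer's code: the function name bubbleSortLastSwap is made up, and returning the number of passes is an addition made only so the pass counts mentioned above can be observed.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Bubble sort that only scans up to the position of the previous pass's
// last swap. Returns the number of passes performed.
// (Illustrative sketch; name and return value are assumptions.)
int bubbleSortLastSwap(std::vector<int>& values) {
    std::size_t limit = values.size();  // elements at index >= limit are in place
    int passes = 0;
    while (limit > 1) {
        ++passes;
        std::size_t lastSwap = 0;  // stays 0 if the pass makes no swap
        for (std::size_t i = 0; i + 1 < limit; ++i) {
            if (values[i] > values[i + 1]) {
                std::swap(values[i], values[i + 1]);
                lastSwap = i + 1;  // everything from i + 1 onward is now in place
            }
        }
        limit = lastSwap;
    }
    return passes;
}
```

An already sorted array makes no swap in the first pass, so limit drops to 0 and the loop ends after one pass.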
Yes. Your code works just like Bubble Sort.
Input: 3 5 1 8 2
Output after each iteration:
3 1 5 2 8
1 3 2 5 8
1 2 3 5 8
1 2 3 5 8
1 2 3 5 8
1 2 3 5 8
Actually, in the inner loop, we don't need to go to the end of the array from the second iteration onwards, because the largest element of the previous iteration is already in its final position at the end. But that doesn't improve the time complexity, which stays O(N²). So, you are good to go.
Small Informal Proof:
The idea behind your sorting algorithm is that you go through the array of values (left to right). Let's call it a pass. During the pass, pairs of values are checked and swapped to be in the correct order (higher on the right).
During the first pass the maximum value will be reached. When reached, the max will be higher than the value next to it, so they will be swapped. This means that the max will become part of the next pair in the pass. This repeats until the pass is completed and the max has moved to the right end of the array.
During the second pass the same is true for the second highest value in the array. The only difference is that it will not be swapped with the max at the end. Now the two rightmost values are correctly set.
In every subsequent pass one more value will be sorted out to the right.
There are N values and N passes. This means that after N passes all N values will be sorted like:
{Nth largest, (N-1)th largest, ..., 2nd largest, largest}
No it isn't. It is worse. There is no point whatsoever in the variable cntr1. You should be using test1 here, and you should be referring to one of the many canonical implementations of bubblesort rather than trying to make it up for yourself.

Time Limit Exceeded in dealing with large arrays

I'm trying to solve this question:
As we all know that Varchas is going on. So FOC wants to organise an event called Finding Occurrence.
The task is simple :
Given an array A[1...N] of positive integers. There will be Q queries. In the queries you will be given an integer. You need to find out the frequency of that integer in the given array.
INPUT:
First line of input comprises of integer N, the number of integers in given array.
The next line will comprise of N space separated integers. The next line will be Q, number of Queries.
The next Q lines will comprise of a single integer whose Occurrence you are supposed to find out.
OUTPUT:
Output single integer for each Query which is the frequency of the given integer.
Constraints:
1<=N<=100000
1<=Q<=100000
0<=A[i]<=1000000
And this is my code:
#include <iostream>
using namespace std;
int main()
{
long long n=0;
cin >> n;
long long a[1000000];
for (int i=1;i<=n;i++)
{
cin >> a[i];
}
long long q=0;
cin >> q;
while (q--)
{
long long temp=0,counter=0;
cin >> temp;
for (int k=1;k<=n;k++)
{
if (a[k]==temp)
counter++;
}
cout << "\n" << counter;
temp=0;
counter=0;
}
return 0;
}
However, I encountered the 'Time Limit Exceeded' error. I suspect this is due to the failure to handle large values in arrays. Could someone tell me how to handle such large size arrays?
The failure is in the algorithm itself: for each query, you traverse the whole array. There are up to 100,000 queries and 100,000 elements, so in the worst case you perform 100,000 * 100,000 = 10,000,000,000 comparisons, which won't finish in time. In Big-O notation, your algorithm is O(nq), which is too slow for this problem, since n*q is large.
What you're supposed to do is calculate the frequencies before any query is made and store them in an array (this is why the range of A[i] is given). You should be able to do this by traversing the input only once. (Hint: you don't need to store the input in an array; you can just count directly.)
By doing this, the counting step is just O(n), and since n is small enough (as a rule of thumb, less than one million is small), it should finish in time.
Then you can answer each query instantly, making your program fast enough to be under the time limit.
Another thing that you can improve is the data type of the array. The value stored in that array won't be larger than 1 million, and so you don't need long long, which uses more memory. You can just use int.
Your algorithm was inefficient. You read all the numbers into an array, then you searched linearly through the array for each query.
What you should have done is make one array of counts. In other words, if you read the number 5, do count[5]++. Then for each query all you have to do is return the count from the array. For example, how many 5's were there in the array? Answer: count[5].
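A minimal sketch of the counting idea, assuming the values lie in the stated range 0..1000000. The helper names buildCounts and frequencyOf are hypothetical; the range check in frequencyOf is an extra precaution for queries outside the table, which the problem statement does not explicitly rule out.

```cpp
#include <cstddef>
#include <vector>

// Builds a frequency table for values in [0, maxValue].
// Assumes every element of `values` is within that range.
std::vector<int> buildCounts(const std::vector<int>& values, int maxValue) {
    std::vector<int> counts(static_cast<std::size_t>(maxValue) + 1, 0);
    for (int v : values) {
        ++counts[static_cast<std::size_t>(v)];
    }
    return counts;
}

// Answers one query in O(1); values outside the table occur zero times.
int frequencyOf(const std::vector<int>& counts, long long query) {
    if (query < 0 || query >= static_cast<long long>(counts.size())) {
        return 0;
    }
    return counts[static_cast<std::size_t>(query)];
}
```

In the actual program the counts can be incremented directly while reading the input, so the input array itself is not needed. A table of 1,000,001 int entries takes roughly 4 MB with a 4-byte int.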
Since your maximum number can be 10^6, I think that a counting array may exceed the memory limit, even if it fits in time. Another solution is to sort the array (you can do it in N*logN using the STL sort function) and, for each query, do two binary searches. The first finds the first position where the element appears and the second finds the last position where it appears, so the answer for each query is lastPosition - firstPosition + 1 (or 0 if the element does not appear at all).
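The sort-and-search variant might look roughly like this, using std::lower_bound and std::upper_bound as the two binary searches. The function name countInSorted is an assumption for the illustration. Since std::upper_bound returns the position one past the last occurrence, the difference of the two iterators already equals lastPosition - firstPosition + 1, and it is 0 when the value is absent.

```cpp
#include <algorithm>
#include <vector>

// Counts how often `query` occurs in `sorted`, which must already be in
// ascending order (for example after std::sort). Each call is O(log N).
// (Hypothetical helper, illustration only.)
int countInSorted(const std::vector<int>& sorted, int query) {
    auto first = std::lower_bound(sorted.begin(), sorted.end(), query);  // first element >= query
    auto last = std::upper_bound(sorted.begin(), sorted.end(), query);   // first element > query
    return static_cast<int>(last - first);
}
```

The array is sorted once with std::sort before the queries start, so the total cost is O(N log N + Q log N).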

Understanding Sum of subsets

I've just started learning backtracking algorithms at college. Somehow I've managed to write a program for the Subset-Sum problem. It works fine, but then I discovered that my program doesn't give out all the possible combinations.
For example : There might be a hundred combinations to a target sum but my program gives only 30.
Here is the code. It would be a great help if anyone could point out what my mistake is.
int tot=0;//tot is the total sum of all the numbers in the set.
int prob[500], d, s[100], top = -1, n; // n = number of elements in the set. prob[i] is the array with the set.
void subset()
{
int i=0,sum=0; //sum - being updated at every iteration and check if it matches 'd'
while(i<n)
{
if((sum+prob[i] <= d)&&(prob[i] <= d))
{
s[++top] = i;
sum+=prob[i];
}
if(sum == d) // d is the target sum
{
show(); // this function just displays the integer array 's'
top = -1; // top points to the recent number added to the int array 's'
i = s[top+1];
sum = 0;
}
i++;
while(i == n && top!=-1)
{
sum-=prob[s[top]];
i = s[top--]+1;
}
}
}
int main()
{
cout<<"Enter number of elements : ";cin>>n;
cout<<"Enter required sum : ";cin>>d;
cout<<"Enter SET :\n";
for(int i=0;i<n;i++)
{
cin>>prob[i];
tot+=prob[i];
}
if(d <= tot)
{
subset();
}
return 0;
}
When I run the program :
Enter number of elements : 7
Enter the required sum : 12
Enter SET :
4 3 2 6 8 12 21
SOLUTION 1 : 4, 2, 6
SOLUTION 2 : 12
Although 4, 8 is also a solution, my program doesn't show it.
It's even worse when the number of inputs is 100 or more. There will be at least 10000 combinations, but my program shows 100.
The Logic which I am trying to follow :
Take in the elements of the main SET into a subset as long as the
sum of the subset remains less than or equal to the target sum.
If the addition of a particular number to the subset sum makes it
larger than the target, it doesnt take it.
Once it reaches the end
of the set, and answer has not been found, it removes the most
recently taken number from the set and starts looking at the numbers
in the position after the position of the recent number removed.
(since what i store in the array 's' is the positions of the
selected numbers from the main SET).
The solutions you are going to find depend on the order of the entries in the set due to your "as long as" clause in step 1.
If you take entries as long as they don't get you over the target, once you've taken e.g. '4' and '2', '8' will take you over the target, so as long as '2' is in your set before '8', you'll never get a subset with '4' and '8'.
You should either add a possibility to skip adding an entry (or add it to one subset but not to another) or change the order of your set and re-examine it.
It may be that a stack-free solution is possible, but the usual (and generally easiest!) way to implement backtracking algorithms is through recursion, e.g.:
int i = 0, n; // i needs to be visible to show()
int s[100];
// Considering only the subset of prob[] values whose indexes are >= start,
// print all subsets that sum to total.
void new_subsets(int start, int total) {
if (total == 0) show(); // total == 0 means we already have a solution
// Look for the next number that could fit
while (start < n && prob[start] > total) {
++start;
}
if (start < n) {
// We found a number, prob[start], that can be added without overflow.
// Try including it by solving the subproblem that results.
s[i++] = start;
new_subsets(start + 1, total - prob[start]);
i--;
// Now try excluding it by solving the subproblem that results.
new_subsets(start + 1, total);
}
}
You would then call this from main() with new_subsets(0, d);. Recursion can be tricky to understand at first, but it's important to get your head around it -- try easier problems (e.g. generating Fibonacci numbers recursively) if the above doesn't make any sense.
Working instead with the solution you have given, one problem I can see is that as soon as you find a solution, you wipe it out and start looking for a new solution from the number to the right of the first number that was included in this solution (top = -1; i = s[top+1]; implies i = s[0], and there is a subsequent i++;). This will miss solutions that begin with the same first number. You should just do if (sum == d) { show(); } instead, to make sure you get them all. (Put that check inside the first if, right after a number is added. Left where it is, it would fire again on later iterations while sum still equals d and print the same solution more than once.)
I initially found your inner while loop pretty confusing, but I think it's actually doing the right thing: once i hits the end of the array, it will delete the last number added to the partial solution, and if this number was the last number in the array, it will loop again to delete the second-to-last number from the partial solution. It can never loop more than twice because numbers included in a partial solution are all at distinct positions.
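Putting these observations together, a corrected stack-based version could be sketched as follows. It is an illustration under assumptions, not the asker's final code: the function name subsetsWithSum is made up, solutions are collected into a vector instead of being printed by show(), and all numbers are assumed to be positive (the "skip anything that would overshoot" pruning relies on that).

```cpp
#include <cstddef>
#include <vector>

// Iterative backtracking for subset sum over positive integers.
// Returns every subset (as values, in input order) that sums to `target`.
// (Illustrative sketch; names are not from the original post.)
std::vector<std::vector<int>> subsetsWithSum(const std::vector<int>& prob, int target) {
    std::vector<std::vector<int>> solutions;
    std::vector<std::size_t> chosen;  // stack of selected indexes, like s[] in the question
    const std::size_t n = prob.size();
    int sum = 0;
    std::size_t i = 0;
    while (i < n) {
        if (sum + prob[i] <= target) {
            chosen.push_back(i);
            sum += prob[i];
            // Check only right after adding, so each solution is recorded once.
            if (sum == target) {
                std::vector<int> values;
                for (std::size_t index : chosen) values.push_back(prob[index]);
                solutions.push_back(values);
            }
        }
        ++i;
        // Backtrack: drop the most recent choice and resume just after it.
        while (i == n && !chosen.empty()) {
            sum -= prob[chosen.back()];
            i = chosen.back() + 1;
            chosen.pop_back();
        }
    }
    return solutions;
}
```

For the set 4 3 2 6 8 12 21 and target 12 from the question, this finds 4, 2, 6 and 4, 8 and 12.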
I haven't analysed the algorithm in detail, but what struck me is that your algorithm doesn't account for the possibility that, after having one solution that starts with number X, there could be multiple solutions starting with that number.
A first improvement would be to avoid resetting your stack s and the running sum after you printed the solution.

I have n spaces, in each space, I can place a number 0 through m. Writing a program to output all possible results. Need help :)

The idea is, given an n number of spaces, empty fields, or what have you, I can place in either a number from 0 to m. So if I have two spaces and just 01 , the outcome would be:
(0 1)
(1 0)
(0 0)
(1 1)
if i had two spaces and three numbers (0 1 2) the outcome would be
(0 1)
(1 1)
(0 2)
(2 0)
(2 2)
(2 1)
and so on until I got all 9 (3^2) possible outcomes.
So i'm trying to write a program that will give me all possible outcomes if I have n spaces and can place in any number from 0 to m in any one of those spaces.
Originally I thought to use for loops, but that was quickly shot down when I realized I'd have to make one for every number up through n, and that it wouldn't work for cases where n is bigger.
I had the idea to use a random number generator and generate a number from 0 to m but that won't guarantee I'll actually get all the possible outcomes.
I am stuck :(
Ideas?
Any help is much appreciated :)
Basically what you will need is a starting point, an ending point, and a way to convert from each state to the next state. For example, a recursive function that adds one to the smallest place value, and when that value would exceed the maximum, increments the next larger place value and sets the current one back to zero.
Take this for example:
#include <iostream>
#include <vector>
using namespace std;
// This is just a function to print out a vector.
template<typename T>
inline ostream &operator<< (ostream &os, const vector<T> &v) {
bool first = true;
os << "(";
for (int i = 0; i < v.size (); i++) {
if (first) first = false;
else os << " ";
os << v[i];
}
return os << ")";
}
bool addOne (vector<int> &nums, int pos, int maxNum) {
// If our position has moved out of bounds, we're done
if (pos < 0)
return false;
// If we have reached the maximum number in one column, we will
// set it back to the base number (0) and increment the column to its left.
if (nums[pos] == maxNum) {
nums[pos] = 0;
return addOne (nums, pos-1, maxNum);
}
// Otherwise we simply increment this number.
else {
nums[pos]++;
return true;
}
}
int main () {
vector<int> nums;
int spaces = 3;
int numbers = 3;
// populate all spaces with 0
nums.resize (spaces, 0);
// Continue looping until the recursive addOne() function returns false (which means we
// have reached the end of all of the numbers)
do {
cout << nums << endl;
} while (addOne (nums, nums.size()-1, numbers));
return 0;
}
Whenever a task requires finding "all of" something, you should first try to do it in these three steps: Can I put them in some kind of order? Can I find the next one given one? Can I find the first?
So if I asked you to give me all the numbers from 1 to 10 inclusive, how would you do it? Well, it's easy because: You know a simple way to put them in order. You can give me the next one given any one of them. You know which is first. So you start with the first, then keep going to the next until you're done.
This same method applies to this problem. You need three algorithms:
An algorithm that orders the outputs such that each output is either greater than or less than every other possible output. (You don't need to code this, just understand it.)
An algorithm to convert any output into the next output and fail if given the last output. (You do need to code this.)
An algorithm to generate the first output, one less (according to the first algorithm) than every other possible output. (You do need to code this.)
Then it's simple:
Generate the first output (using algorithm 3). Output it.
Use the increment algorithm (algorithm 2) to generate the next output. If there is no next output, stop. Otherwise, output it.
Repeat step 2.
Update: Here are some possible algorithms:
Algorithm 1:
Compare the first digits of the two outputs. If one is greater than the other, that output is greater. If they are equal, continue.
Repeat the previous step, moving to successive digits until we find a mismatch.
Algorithm 2:
Start with the rightmost digit.
If this digit is not the maximum it can be, increment it and stop.
Are we at the leftmost digit? If so, stop with error.
Set this digit to zero, move the digit pointer left one digit, and go back to the second step.
Algorithm 3:
Set all digits to zero.
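Algorithms 2 and 3 could be sketched as below. This is one possible reading of the steps, not the answerer's own code: the names firstOutput and nextOutput are hypothetical, and instead of stopping with an error at the leftmost digit, the sketch rolls every digit back to zero and returns false when there is no next output.

```cpp
#include <cstddef>
#include <vector>

// Algorithm 3: the first output has all digits set to zero.
std::vector<int> firstOutput(std::size_t spaces) {
    return std::vector<int>(spaces, 0);
}

// Algorithm 2: turn `digits` into the next output.
// Returns false if `digits` was already the last output
// (in that case it is left as all zeros).
bool nextOutput(std::vector<int>& digits, int maxDigit) {
    // Walk from the rightmost digit toward the leftmost one.
    for (std::size_t pos = digits.size(); pos-- > 0;) {
        if (digits[pos] < maxDigit) {
            ++digits[pos];  // room to increment: done
            return true;
        }
        digits[pos] = 0;    // roll over and carry to the digit on the left
    }
    return false;           // carried past the leftmost digit
}
```

A driver then outputs firstOutput(n) and keeps calling nextOutput until it returns false, which visits all (m + 1)^n outputs in increasing order.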
“i'm trying to write a program that will give me all possible outcomes if I have n spaces and can place in any number from 0 to m in any one of those spaces.”
Assuming an inclusive “to”, let R = m + 1.
Then this is isomorphic to outputting every number in the range 0 through R^n - 1 presented in the base R numeral system.
Which means one outer loop to count (for this you can use the C++ ++ increment operator), and an inner loop to extract and present the digits. For the inner loop you can use C++'s / division operator, and depending on what you find most clear, also the % remainder operator. Unless you restrict yourself to the three choices of R directly supported by the C++ standard library, in which case use the standard formatters.
Note that R^n can get large fast.
So don't redirect the output to your printer, and be prepared to wait for a while for the program to complete.
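The counting approach might be sketched like this. It is an illustration under assumptions: the function name outcomesByCounting is made up, m is assumed to be non-negative, the outcomes are collected into a vector rather than printed, and R^n is assumed to fit in a 64-bit unsigned integer.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Lists every way to fill n spaces with digits 0..m by counting from
// 0 to R^n - 1 (R = m + 1) and writing each number in base R.
// (Illustrative sketch; names are not from the original post.)
std::vector<std::vector<int>> outcomesByCounting(std::size_t n, int m) {
    const std::uint64_t R = static_cast<std::uint64_t>(m) + 1;
    std::uint64_t total = 1;  // R^n; assumed not to overflow 64 bits
    for (std::size_t k = 0; k < n; ++k) total *= R;

    std::vector<std::vector<int>> outcomes;
    for (std::uint64_t value = 0; value < total; ++value) {  // outer loop: count
        std::vector<int> digits(n, 0);
        std::uint64_t rest = value;
        for (std::size_t pos = n; pos-- > 0;) {              // inner loop: extract digits
            digits[pos] = static_cast<int>(rest % R);        // least significant digit first
            rest /= R;
        }
        outcomes.push_back(digits);
    }
    return outcomes;
}
```

For two spaces and the digits 0, 1, 2 this produces the 9 outcomes from (0 0) up to (2 2).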
I think you need to look up recursion. http://www.danzig.us/cpp/recursion.html
Basically it is a function that calls itself. This allows you to perform an N number of nested for loops.
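As a hedged sketch of how recursion replaces N nested for loops: each call handles one space and loops over its possible digits, then recurses for the next space. The names fillSpace and outcomesByRecursion are hypothetical and chosen only for this illustration.

```cpp
#include <cstddef>
#include <vector>

// Fills position `pos` with every digit 0..m in turn and recurses for
// the remaining positions; a complete assignment is stored in `out`.
// Each level of recursion plays the role of one nested for loop.
void fillSpace(std::vector<int>& current, std::size_t pos, int m,
               std::vector<std::vector<int>>& out) {
    if (pos == current.size()) {
        out.push_back(current);  // all spaces filled: one outcome
        return;
    }
    for (int digit = 0; digit <= m; ++digit) {
        current[pos] = digit;
        fillSpace(current, pos + 1, m, out);
    }
}

// Convenience wrapper: all outcomes for n spaces and digits 0..m.
std::vector<std::vector<int>> outcomesByRecursion(std::size_t n, int m) {
    std::vector<std::vector<int>> out;
    std::vector<int> current(n, 0);
    fillSpace(current, 0, m, out);
    return out;
}
```

The recursion depth equals n, so the number of "loops" is decided at run time rather than by how many for statements were written.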