Here is the link to the problem: https://ioi2019.az/source/Tasks/Day1/Shoes/NGA.pdf
Here is a brief explanation about the problem statement:
You are given an integer n in the range 1≤n≤1e5. The array contains n positive integers and n negative integers, so its total size is 2n.
The problem asks for the minimum number of swaps of adjacent elements needed so that each negative number -x ends up directly next to a matching x, with -x immediately to the left of x.
Example:
n = 2;
the input array = {2, 1, -1, -2}
The minimum number of operations will be four:
2,1,-1,-2: 0 swaps
2,-1,1,-2: 1 swap(swapping 1 and -1)
2,-1,-2,1: 2 swaps (swapping 1 and -2)
2,-2,-1,1: 3 swaps (swapping -1 and -2)
-2,2,-1,1: 4 swaps (swapping 2 and -2)
The final answer will be four.
Another example:
the input array = {-2, 2, 2, -2, -2, 2}
The minimum number of swaps is one, because we can just swap the elements at indices 2 and 3.
Final array: {-2,2,-2,2,-2,2}
While working on this problem I got a wrong answer, so I decided to look at someone's source code on GitHub.
Here is the source code:
#include "shoes.h"
#include <bits/stdc++.h>
#define sz(v) ((int)(v).size())
using namespace std;
using lint = long long;
using pi = pair<int, int>;
const int MAXN = 200005;
struct bit{
int tree[MAXN];
void add(int x, int v){
for(int i=x; i<MAXN; i+=i&-i) tree[i] += v;
}
int query(int x){
int ret = 0;
for(int i=x; i; i-=i&-i) ret += tree[i];
return ret;
}
}bit;
lint count_swaps(vector<int> s) {
int n = sz(s) / 2;
lint ret = 0;
vector<pi> v;
vector<pi> ord[MAXN];
for(int i=0; i<sz(s); i++){
ord[abs(s[i])].emplace_back(s[i], i);
}
for(int i=1; i<=n; i++){
sort(ord[i].begin(), ord[i].end());
for(int j=0; j<sz(ord[i])/2; j++){
int l = ord[i][j].second;
int r = ord[i][j + sz(ord[i])/2].second; //confusion starts here all the way to the bottom
if(l > r){
swap(l, r);
ret++;
}
v.emplace_back(l + 1, r + 1);
}
}
for(int i=1; i<=2*n; i++) bit.add(i, 1);
sort(v.begin(), v.end());
for(auto &i : v){
ret += bit.query(i.second - 1) - bit.query(i.first);
bit.add(i.first, -1);
bit.add(i.second, -1);
}
return ret;
}
However, I don't think I understand this code too well.
I understand what the functions add and query in the BIT do; I'm just confused from where I commented in the code all the way to the bottom. I don't understand what that part does or what its purpose is.
Can someone walk through what this code is doing? Or give any suggestions on how I should properly and efficiently approach this problem (maybe even your own solutions)? Thank you.
int r = ord[i][j + sz(ord[i])/2].second;
We've sorted the pairs of one shoe size, stored as <value, idx>, in a vector, which means all the negatives of this size take up the first half of ord[i], and all the positives are in the second half. The j-th negative is matched with the j-th positive.
if (l > r){
swap(l, r);
ret++;
}
After the sort, the index of the negative shoe of a pair may not come before the index of the positive one. Each such pair costs one extra swap.
v.emplace_back(l + 1, r + 1);
Insert into v the (1-based) interval for this pair of shoes of size i.
for(int i=1; i<=2*n; i++) bit.add(i, 1);
sort(v.begin(), v.end());
Add the value 1 to our Fenwick tree (the bit struct, which answers prefix-sum queries) at each index location of a shoe. Sort the shoe intervals.
for(auto &i : v){
ret += bit.query(i.second - 1) - bit.query(i.first);
For each pair of shoes in v, the number of swaps needed is the number of shoes left in between them, expressed in the sum of the segment.
bit.add(i.first, -1);
bit.add(i.second, -1);
Remove the pair of shoes from the tree so a new segment sum won't include them. We can do this since the shoe intervals are processed left to right, which means no "inner" pair of shoes gets processed before an outer pair.
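The walkthrough above can be condensed into a self-contained sketch that compiles without shoes.h. The names Fenwick and count_swaps_sketch are made up for illustration, and the sketch assumes, as the original code does, that the input is valid and that shoe sizes lie in 1..n.

```cpp
#include <algorithm>
#include <cstdlib>
#include <utility>
#include <vector>

// Fenwick tree (binary indexed tree) over positions 1..n; illustrative name.
struct Fenwick {
    std::vector<int> tree;
    explicit Fenwick(int n) : tree(n + 1, 0) {}
    void add(int i, int v) {
        for (; i < static_cast<int>(tree.size()); i += i & -i) tree[i] += v;
    }
    int prefix(int i) const {  // sum of positions 1..i
        int s = 0;
        for (; i > 0; i -= i & -i) s += tree[i];
        return s;
    }
};

// Assumes a valid input: sizes in 1..n, as many -x as x for every size x.
long long count_swaps_sketch(const std::vector<int>& s) {
    const int m = static_cast<int>(s.size());
    const int n = m / 2;
    // Indices of negative and positive shoes per size, in order of appearance.
    std::vector<std::vector<int>> left(n + 1), right(n + 1);
    for (int i = 0; i < m; ++i)
        (s[i] < 0 ? left : right)[std::abs(s[i])].push_back(i);

    long long swaps = 0;
    std::vector<std::pair<int, int>> intervals;
    for (int size = 1; size <= n; ++size) {
        for (std::size_t j = 0; j < left[size].size(); ++j) {
            int l = left[size][j], r = right[size][j];  // j-th negative with j-th positive
            if (l > r) { std::swap(l, r); ++swaps; }    // positive comes first: one extra swap
            intervals.emplace_back(l + 1, r + 1);       // 1-based for the Fenwick tree
        }
    }
    Fenwick alive(m);
    for (int i = 1; i <= m; ++i) alive.add(i, 1);       // every shoe is still in the array
    std::sort(intervals.begin(), intervals.end());      // process pairs by left endpoint
    for (const auto& iv : intervals) {
        swaps += alive.prefix(iv.second - 1) - alive.prefix(iv.first);  // shoes strictly between
        alive.add(iv.first, -1);                        // remove the finished pair
        alive.add(iv.second, -1);
    }
    return swaps;
}
```

Calling count_swaps_sketch({2, 1, -1, -2}) reproduces the first example from the question, and count_swaps_sketch({-2, 2, 2, -2, -2, 2}) the second.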
I created a little program that is able to calculate the determinant of a matrix in C++. I used Laplace expansion, although I know that there are more efficient ways to do it:
double getDeterminantLaplace(const std::vector<std::vector<double>> vect) {
int dimension = vect.size();
if(dimension == 0) {
return 1;
}
if(dimension == 1) {
return vect[0][0];
}
//Formula for 2x2-matrix
if(dimension == 2) {
return vect[0][0] * vect[1][1] - vect[0][1] * vect[1][0];
}
double result = 0;
int sign = 1;
for(int i = 0; i < dimension; i++) {
//Submatrix
std::vector<std::vector<double>> subVect(dimension - 1, std::vector<double> (dimension - 1));
for(int m = 1; m < dimension; m++) {
int z = 0;
for(int n = 0; n < dimension; n++) {
if(n != i) {
subVect[m-1][z] = vect[m][n];
z++;
}
}
}
//recursive call
result = result + sign * vect[0][i] * getDeterminantLaplace(subVect);
sign = -sign;
}
return result;
}
My question now is: How can this algorithm be made more efficient?
One of my ideas is to not create the "submatrices" and just work with the original matrix, but I don't really know how to do it. What do you think about this idea? How can I do this in C++?
Do you have any more ideas?
A first, trivial optimization is not to recurse when the current element is zero. This will give you an instant speed-up on sparse matrices.
The next optimization is what you already suggested: do not create all the submatrices. You can do that by creating an index vector. For example, if your original matrix has 4×4 elements, you recurse with the following index vectors:
0: {1, 2, 3}
1: {0, 2, 3}
2: {0, 1, 3}
3: {0, 1, 2}
You don't need to create the index vector from scratch each time: Start with the subvector that is the current vector without its front, then, after each step, overwrite the i-th place with the i-th entry of the current vector.
When you access the element s[r][c] of the submatrix, access element a[r + top][col[c]] of the original matrix. You can determine the index of the top row from the dimensions of the current column vector and the original matrix.
You never create submatrices, only sub-column vectors. Split your function in two: One public function as front-end, which calls the recursive worker function.
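To make the in-place update of the index vector concrete, here is a small sketch that only lists the sub-column vectors visited for a given column vector. The helper name sub_columns is hypothetical and not part of the answer's code.

```cpp
#include <cstddef>
#include <vector>

// Returns the sub-column vectors visited when expanding along the top row of
// the submatrix described by `col` (illustrative helper).
std::vector<std::vector<int>> sub_columns(const std::vector<int>& col) {
    std::vector<std::vector<int>> out;
    if (col.empty()) return out;
    std::vector<int> ncol(col.begin() + 1, col.end());  // current vector without its front
    for (std::size_t i = 0; i < col.size(); ++i) {
        out.push_back(ncol);                        // columns left after deleting col[i]
        if (i + 1 < col.size()) ncol[i] = col[i];   // put col[i] back, which drops col[i + 1]
    }
    return out;
}
```

For col = {0, 1, 2, 3} this yields {1, 2, 3}, {0, 2, 3}, {0, 1, 3}, {0, 1, 2}, the four index vectors listed above, with one assignment per step instead of a fresh copy.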
This will speed up the calculation somewhat, but unfortunately, this improvement will not buy you much when your matrices grow. Let's look at the 4×4 matrix again. In the first recursion step, you will consider these 3×3 submatrices:
{1, 2, 3}   {0, 2, 3}   {0, 1, 3}   {0, 1, 2}
From there, you will calculate these 2×2 submatrices:
{2, 3}   {2, 3}   {1, 3}   {1, 2}
{1, 3}   {0, 3}   {0, 3}   {0, 2}
{1, 2}   {0, 2}   {0, 1}   {0, 1}
Notice that these 12 index vectors are really just 6 different pairs. You'll calculate each of them twice. This will get worse the bigger your original matrix is. A solution to this is memoizing: Once you have calculated the determinant of a certain submatrix, store the value in an associative array. Before calculating a submatrix, check whether you have already done that and if so, just return the value you calculated earlier.
This will speed up your function, but it comes at a price: It will create many entries in the associative array.
Anyway, here's the code that implements all optimizations I've described:
#include <vector>
#include <map>
#include <iostream>
double subdet(const std::vector<std::vector<double> > &a,
const std::vector<int> &col,
std::map<std::vector<int>, double> &memo)
{
int dim = col.size();
int top = a.size() - dim;
if (memo.find(col) != memo.end()) {
return memo[col];
}
if (dim == 2) return a[top + 0][col[0]] * a[top + 1][col[1]]
- a[top + 0][col[1]] * a[top + 1][col[0]];
double result = 0.0;
int sign = 1;
std::vector<int> ncol(&col[1], &col[dim]);
for (int i = 0; i < dim; i++) {
if (a[top][col[i]]) {
double d = subdet(a, ncol, memo);
result = result + sign * a[top][col[i]] * d;
}
sign = -sign;
if (i + 1 < dim) ncol[i] = col[i];
}
memo[col] = result;
return result;
}
double det(const std::vector<std::vector<double> > a)
{
int dim = a.size();
if (dim == 0) return 1.0;
if (dim == 1) return a[0][0];
std::vector<int> col(dim);
std::map<std::vector<int>, double> memo;
for (unsigned i = 0; i < a.size(); i++) col[i] = i;
return subdet(a, col, memo);
}
Notes: The map (a binary tree with O(log n) lookup) should really be an unordered map (a hash table with O(1) lookup), but I couldn't get it to work, because I'm bad at C++. Sorry about that.
There's probably room for optimization of the lookup key, too: One can enumerate the possible index vectors or use a bit mask, perhaps, thereby saving memory in the hash map. It's no good storing references to the column-index vector, because it's short-lived and we're swapping around in it a lot, so it's not constant.
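As a rough sketch of the bit-mask idea, the set of remaining columns can be stored in an integer, which std::unordered_map can hash without extra work. The names subdet_mask and det_mask are invented here, and the sketch assumes a square matrix with fewer than 32 rows. Since the number of set bits fixes the top row, the mask alone is enough as a key.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Determinant of the submatrix made of rows top..n-1 and the columns whose
// bits are set in `mask`. The number of set bits must equal n - top.
static double subdet_mask(const Matrix& a, std::size_t top, std::uint32_t mask,
                          std::unordered_map<std::uint32_t, double>& memo) {
    const std::size_t n = a.size();
    if (top == n) return 1.0;                      // empty submatrix
    const auto it = memo.find(mask);
    if (it != memo.end()) return it->second;       // already computed
    double result = 0.0;
    int sign = 1;
    for (std::size_t c = 0; c < n; ++c) {
        const std::uint32_t bit = std::uint32_t{1} << c;
        if (!(mask & bit)) continue;               // column already removed
        if (a[top][c] != 0.0)                      // skip zero entries
            result += sign * a[top][c] * subdet_mask(a, top + 1, mask & ~bit, memo);
        sign = -sign;                              // alternates over remaining columns only
    }
    memo[mask] = result;
    return result;
}

double det_mask(const Matrix& a) {
    const std::size_t n = a.size();
    assert(n < 32);  // this sketch keeps the column set in a 32-bit mask
    std::unordered_map<std::uint32_t, double> memo;
    const std::uint32_t all = (std::uint32_t{1} << n) - 1;
    return subdet_mask(a, 0, all, memo);
}
```

The memo can still hold up to 2^n entries, so this only changes the constant factors, not the exponential growth.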
Of course, other algorithms are better suited for finding the determinant of large matrices. My answer focuses on improving the existing method.
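The answer does not name one of those other algorithms. One common choice is Gaussian elimination with partial pivoting, which needs O(n^3) operations instead of exponentially many. The following is a minimal sketch of that technique, not the answerer's code; det_gauss is a made-up name and the matrix is assumed to be square.

```cpp
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Determinant by Gaussian elimination with partial pivoting.
// The matrix is taken by value because it is modified during elimination.
double det_gauss(std::vector<std::vector<double>> a) {
    const std::size_t n = a.size();
    double det = 1.0;
    for (std::size_t col = 0; col < n; ++col) {
        // Pick the row with the largest absolute value in this column.
        std::size_t pivot = col;
        for (std::size_t r = col + 1; r < n; ++r)
            if (std::fabs(a[r][col]) > std::fabs(a[pivot][col])) pivot = r;
        if (a[pivot][col] == 0.0) return 0.0;      // whole column is zero: singular
        if (pivot != col) {
            std::swap(a[pivot], a[col]);           // a row swap flips the sign
            det = -det;
        }
        det *= a[col][col];
        // Eliminate the entries below the pivot.
        for (std::size_t r = col + 1; r < n; ++r) {
            const double f = a[r][col] / a[col][col];
            for (std::size_t c = col; c < n; ++c) a[r][c] -= f * a[col][c];
        }
    }
    return det;
}
```

Because of floating-point rounding, results should be compared with a tolerance rather than with ==.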
How to divide elements in an array into a minimum number of arrays such that the difference between the values of elements of each of the formed arrays does not differ by more than 1?
Let's say that we have an array: [4, 6, 8, 9, 10, 11, 14, 16, 17].
The array elements are sorted.
I want to divide the elements of the array into a minimum number of array(s) such that each of the elements in the resulting arrays do not differ by more than 1.
In this case, the groupings would be: [4], [6], [8, 9, 10, 11], [14], [16, 17]. So there would be a total of 5 groups.
How can I write a program for the same? Or you can suggest algorithms as well.
I tried the naive approach:
Obtain the difference between consecutive elements of the array and if the difference is less than (or equal to) 1, I add those elements to a new vector. However this method is very unoptimized and straight up fails to show any results for a large number of inputs.
Actual code implementation:
#include<cstdio>
#include<iostream>
#include<vector>
using namespace std;
int main() {
int num = 0, buff = 0, min_groups = 1; // min_groups should start from 1 to take into account the grouping of the starting array element(s)
cout << "Enter the number of elements in the array: " << endl;
cin >> num;
vector<int> ungrouped;
cout << "Please enter the elements of the array: " << endl;
for (int i = 0; i < num; i++)
{
cin >> buff;
ungrouped.push_back(buff);
}
for (int i = 1; i < ungrouped.size(); i++)
{
if ((ungrouped[i] - ungrouped[i - 1]) > 1)
{
min_groups++;
}
}
cout << "The elements of entered vector can be split into " << min_groups << " groups." << endl;
return 0;
}
Inspired by Faruk's answer, if the values are constrained to be distinct integers, there is a possibly sublinear method.
Indeed, if the difference between two values equals the difference between their indexes, they are guaranteed to belong to the same group and there is no need to look at the intermediate values.
You have to organize a recursive traversal of the array, in preorder. Before subdividing a subarray, you compare the difference of indexes of the first and last element to the difference of values, and only subdivide in case of a mismatch. As you work in preorder, this will allow you to emit pieces of the groups in consecutive order, as well as detect the gaps. Some care has to be taken to merge the pieces of the groups.
The worst case will remain linear, because the recursive traversal can degenerate to a linear traversal (but not worse than that). The best case can be better. In particular, if the array holds a single group, it will be found in time O(1). If I am right, for every group of length between 2^n and 2^(n+1), you will spare at least 2^(n-1) tests. (In fact, it should be possible to estimate an output-sensitive complexity, equal to the array length minus a fraction of the lengths of all groups, or similar.)
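If only the number of groups is needed, the recursive idea can be sketched as below. This is a simplified variant rather than the exact traversal described above: it counts the gaps (adjacent pairs differing by more than 1) and lets the two halves share their middle element, so no merging of pieces is needed. The names count_gaps and count_groups are invented, and the input is assumed to be strictly increasing.

```cpp
#include <cstddef>
#include <vector>

// Number of adjacent pairs in v[i..j] whose values differ by more than 1.
// Assumes v is strictly increasing (sorted, distinct integers).
static std::size_t count_gaps(const std::vector<int>& v, std::size_t i, std::size_t j) {
    if (i == j) return 0;
    // Value difference equals index difference: no gap inside, skip the interior.
    if (static_cast<long long>(v[j]) - v[i] == static_cast<long long>(j - i)) return 0;
    if (j == i + 1) return 1;                      // two neighbours with a gap between them
    const std::size_t mid = i + (j - i) / 2;
    // Both halves include v[mid], so every adjacent pair is covered exactly once.
    return count_gaps(v, i, mid) + count_gaps(v, mid, j);
}

std::size_t count_groups(const std::vector<int>& v) {
    return v.empty() ? 0 : count_gaps(v, 0, v.size() - 1) + 1;
}
```

For the array from the question, {4, 6, 8, 9, 10, 11, 14, 16, 17}, the subrange 8..10 is accepted after a single comparison.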
Alternatively, you can work in a non-recursive way, by means of exponential search: from the beginning of a group, you start with a unit step and double the step every time, until you detect a gap (difference in values too large); then you restart with a unit step. Here again, for large groups you will skip a significant number of elements. Anyway, the best case can only be O(Log(N)).
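The non-recursive variant might look like the sketch below, again counting groups only and again assuming strictly increasing integers. The function name count_groups_galloping is hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Counts groups by galloping: inside a group the step doubles after every
// successful jump and falls back to 1 as soon as a jump would cross a gap.
// Assumes v is strictly increasing (sorted, distinct integers).
std::size_t count_groups_galloping(const std::vector<int>& v) {
    const std::size_t n = v.size();
    std::size_t groups = 0, pos = 0;
    while (pos < n) {
        ++groups;                                  // v[pos] starts a new group
        std::size_t step = 1;
        for (;;) {
            if (pos + step < n &&
                static_cast<long long>(v[pos + step]) - v[pos] == static_cast<long long>(step)) {
                pos += step;                       // still inside the group: jump and double
                step *= 2;
            } else if (step > 1) {
                step = 1;                          // overshot: restart with a unit step
            } else {
                break;                             // v[pos] is the last element of the group
            }
        }
        ++pos;                                     // first element of the next group
    }
    return groups;
}
```

For a single long group this makes a logarithmic number of jumps per restart, while an input in which every element is its own group still costs one test per element.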
I would suggest encoding subsets into an offset array defined as follows:
Elements for set #i are defined for indices j such that offset[i] <= j < offset[i+1]
The number of subsets is offset.size() - 1
This only requires one memory allocation.
Here is a complete implementation:
#include <cassert>
#include <iostream>
#include <vector>
std::vector<std::size_t> split(const std::vector<int>& to_split, const int max_dist = 1)
{
const std::size_t to_split_size = to_split.size();
std::vector<std::size_t> offset(to_split_size + 1);
offset[0] = 0;
size_t offset_idx = 1;
for (std::size_t i = 1; i < to_split_size; i++)
{
const int dist = to_split[i] - to_split[i - 1];
assert(dist >= 0); // we assumed sorted input
if (dist > max_dist)
{
offset[offset_idx] = i;
++offset_idx;
}
}
offset[offset_idx] = to_split_size;
offset.resize(offset_idx + 1);
return offset;
}
void print_partition(const std::vector<int>& to_split, const std::vector<std::size_t>& offset)
{
const std::size_t offset_size = offset.size();
std::cout << "\nwe found " << offset_size-1 << " sets";
for (std::size_t i = 0; i + 1 < offset_size; i++)
{
std::cout << "\n";
for (std::size_t j = offset[i]; j < offset[i + 1]; j++)
{
std::cout << to_split[j] << " ";
}
}
}
int main()
{
std::vector<int> to_split{4, 6, 8, 9, 10, 11, 14, 16, 17};
std::vector<std::size_t> offset = split(to_split);
print_partition(to_split, offset);
}
which prints:
we found 5 sets
4
6
8 9 10 11
14
16 17
Iterate through the array. Whenever the difference between 2 consecutive elements is greater than 1, add 1 to your answer variable.
int getPartitionNumber(const int arr[], int n) {
    // n is the size of the array
    if (n == 0) return 0;
    int result = 1;
    for(int i=1; i<n; i++) {
        if(arr[i]-arr[i-1] > 1) result++;
    }
    return result;
}
And because it is always nice to see more ideas and select the one that suits you best, here is the straightforward 6-line solution. Yes, it is also O(n). But I am not sure whether the overhead of the other methods lets them be any faster.
Please see:
#include <iostream>
#include <string>
#include <algorithm>
#include <vector>
#include <iterator>
using Data = std::vector<int>;
using Partition = std::vector<Data>;
Data testData{ 4, 6, 8, 9, 10, 11, 14, 16, 17 };
int main(void)
{
// This is the resulting vector of vectors with the partitions
std::vector<std::vector<int>> partition{};
// Iterating over source values
for (Data::iterator i = testData.begin(); i != testData.end(); ++i) {
// Check,if we need to add a new partition
// Either, at the beginning or if diff > 1
// No underflow, because of boolean short-circuit evaluation
if ((i == testData.begin()) || ((*i) - (*(i-1)) > 1)) {
// Create a new partition
partition.emplace_back(Data());
}
// And, store the value in the current partition
partition.back().push_back(*i);
}
// Debug output: Copy all data to std::cout
std::for_each(partition.begin(), partition.end(), [](const Data& d) {std::copy(d.begin(), d.end(), std::ostream_iterator<int>(std::cout, " ")); std::cout << '\n'; });
return 0;
}
Maybe this could be a solution . . .
Why do you say your approach is not optimized? If it is correct, it runs in O(n) time.
But you can use binary search here, which can do better in the average case. In the worst case, though, the binary search can take more than O(n) time.
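Here's a tip:
As the array is sorted, for each group you look for the last position that still belongs to the group starting at st. If the values are distinct integers, position mid belongs to that group exactly when arr[mid] - arr[st] == mid - st.
Binary search can do this in a simple way.
int arr[] = {4, 6, 8, 9, 10, 11, 14, 16, 17};
int n = sizeof(arr) / sizeof(arr[0]); // n = size of the array.
int st = 0, ed = n-1;
int partitions = 0;
while(st <= ed) {
int low = st, high = n-1;
int pos = low;
while(low <= high) {
int mid = (low + high)/2;
if((arr[mid] - arr[st]) == (mid - st)) { // no gap between st and mid (assumes distinct values)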
pos = mid;
low = mid + 1;
} else {
high = mid - 1;
}
}
partitions++;
st = pos + 1;
}
cout<< partitions <<endl;
In the average case, it is better than O(n). But in the worst case (where the answer is equal to n) it takes O(n log(n)) time.
In a recent problem I have to sum the values at each common index over all possible subsets of size k of an array of size n.
For example, if
array ={1,2,3}
Its subsets (k=2) will be (x [i] , x [j]) where i < j
1 2
1 3
2 3
Sum:4,8
First I used recursion (the same as for generating all subsets):
int sum_index[k]={0};
void sub_set(int array[],int n,int k,int temp[],int q=0,int r=0)
{
if(q==k)
{
for(int i=0;i<k;i++)
sum_index[i]+=temp[i];
}
else
{
for(int i=r;i<n;i++)
{
temp[q]=array[i];
sub_set(array,n,k,temp,q+1,i+1);
}
}
}
The problem is that it takes much more time than expected.
Then I modified it to...
void sub_set(int array[],int n,int k,int temp[],int q=0,int r=0)
{
if(q==k)
{
return;
}
else
{
for(int i=r;i<n;i++)
{
temp[q]=array[i];
sum_index[q]+=temp[q]; //or sum_index[q]+=s[i];
sub_set(array,n,k,temp,q+1,i+1);
}
}
}
It is still taking too much time!
Is there any other approach to this problem? Or is there a modification I need that I am unaware of?
Instead of iterating through the possible sub-sets, think of it a combinatorics problem.
To use your example of k=2 and {1,2,3}, let's just look at the first value of the result. It has two 1's and one 2. The two 1's correspond to the number of one-element sets that can be made from {2, 3} and the one 2 corresponds to the number of one-element sets that can be made from {3}. A similar arrangement exists for the one 2 and two 3's in the second element of the result, this time looking at the subsets of the elements that appear before the element being considered.
Things get a bit more complicated when k>2 because then you will have to look for the number of combinations of elements before and after the element being considered, but the basic premise still works. Multiply the number of possible subsets before times the number of subsets afterwards and that will tell you how many times each element contributes to the result.
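Spelled out, the element at index i lands in column c once for every way of choosing c elements from the i elements before it and k-1-c elements from the n-1-i elements after it. The sketch below turns that into code under a few assumptions of mine: the name column_sums is hypothetical, subsets keep the array's index order as in the question, and the sums fit into a long long.

```cpp
#include <cstddef>
#include <vector>

// sums[c] = sum over all size-k subsets (in index order) of the c-th chosen element.
std::vector<long long> column_sums(const std::vector<int>& a, std::size_t k) {
    const std::size_t n = a.size();
    std::vector<long long> sums(k, 0);
    if (k == 0 || k > n) return sums;              // no subsets of that size
    // Pascal's triangle: choose[m][r] = "m choose r" for 0 <= r <= m < n.
    std::vector<std::vector<long long>> choose(n, std::vector<long long>(n, 0));
    for (std::size_t m = 0; m < n; ++m) {
        choose[m][0] = 1;
        for (std::size_t r = 1; r <= m; ++r)
            choose[m][r] = choose[m - 1][r - 1] + choose[m - 1][r];
    }
    for (std::size_t c = 0; c < k; ++c) {
        for (std::size_t i = 0; i < n; ++i) {
            const std::size_t after = n - 1 - i;   // elements to the right of a[i]
            const std::size_t need_after = k - 1 - c;
            if (c > i || need_after > after) continue;  // a[i] cannot sit in column c
            sums[c] += a[i] * choose[i][c] * choose[after][need_after];
        }
    }
    return sums;
}
```

This takes O(n^2) time for the table plus O(n*k) for the sums, instead of enumerating every subset.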
A solution in O(n^2) instead of O(n!):
First a tiny (:)) bit of explanation, then some code:
I'm going to assume here that your array is sorted (if not, use std::sort first). Additionally, I'm going to work with the array values 1,2,3,4... here; if your array consists of arbitrary values (like 2 8 17), you'll have to think of them as the indices (i.e. 1=>2, 2=>8 etc.)
Definition: (x choose y) means the binomial coefficient. If you have an array of size a and some k for the subset size, (a choose k) is the number of combinations, e.g. 3 for your example: (1,2), (1,3) and (2,3).
You want the sum for each column if you write the combinations under each other. This would be easy if you knew for each column how many times each array element occurs, i.e. how many 1's, 2's and 3's for the first, and how many for the second column (with k=2).
Here is a bigger example to explain: (1,2,3,4,5) and all possible k's (each in one block):
1
2
3
4
5
12
13
14
15
23
24
25
34
35
45
123
124
125
134
135
145
234
235
245
345
... (didn't write k=4)
12345
Let's introduce column indices, 0<=c<k, i.e. c=0 means the first column, c=1 the second and so on; and the array size s=5.
So, looking e.g. at the k=3 block, you'll notice that the lines beginning with 1 (column c=0) are followed by all combinations of the values (2,3,4,5) for k=2. More generally, a value x in column c has all combinations of the values x+1 to s after it. The values from x+1 to s are s-x different values, and after column c there are k-c-1 more columns. So, for a value x, you can calculate ((s-x) choose (k-c-1)).
Additionally, the first column has only the values 1,2,3; the last two numbers are not there because after this column there are two more columns.
If you do this for the first column, it works well. E.g. with value 1 in the first column of k=3 above:
count(x) = ((s-x) choose (k-c-1)) = (4 choose 2) = 6
and indeed there are six 1's there. Calculate this count for every array value, multiply x*count(x), and sum it up for every x; that's the result for the first column.
The other columns are a tiny bit harder, because the same value can appear under several different prefixes. The step above therefore needs a multiplier: the number of ways to fill the c columns to the left with smaller values. For a value x that is ((x-1) choose c), so take x*count(x)*multiplier(x) instead of x*count(x). In the first column the multiplier is ((x-1) choose 0) = 1, which is why the step above worked without it.
In the k=3 example, the 3 in the second column can be preceded by 1 or 2, so it is counted (2 choose 1) = 2 times, and the 4 is counted (3 choose 1) = 3 times. In the third column, the 4 can be preceded by any of the pairs 12, 13, 23, so it is counted (3 choose 2) = 3 times, and the 5 is counted (4 choose 2) = 6 times.
...
Some code:
#include<iostream>
#include<vector>
#include<algorithm>
using namespace std;
// factorial (x!); overflows 64 bits for x > 20
unsigned long long fact(unsigned char x)
{
    unsigned long long res = 1;
    while(x)
    {
        res *= x;
        x--;
    }
    return res;
}
//binomial coefficient (n choose k), only valid for k <= n
unsigned long long binom(unsigned char n, unsigned char k)
{
    if(!n || !k) return 1;
    return (fact(n) / fact(k)) / fact(n-k);
}
//just for convenience
template<class T> void printvector(std::vector<T> data)
{
    for(auto l : data) cout << l << " ";
    cout << endl;
}
std::vector<unsigned long long> calculate(std::vector<int> data, int k)
{
    std::vector<unsigned long long> res(k > 0 ? k : 0, 0); //result data
    const int n = data.size();
    if(k < 1 || k > n || n > 20) return res; //invalid stuff, or too large for fact()
    std::sort(data.begin(), data.end()); //as described
    for(int column = 0; column < k; column++) //each column separately
    {
        //for each array element that can appear in this column
        for(int x = column; x <= n + column - k; x++)
        {
            //core calculation (x is a 0-based index here):
            //(x choose column) ways to fill the columns to the left,
            //(n-x-1 choose k-column-1) ways to fill the columns to the right
            res[column] += data[x] * binom(x, column) * binom(n - x - 1, k - column - 1);
        }
    }
    return res;
}
int main() {
    printvector(calculate({1,2,3}, 2)); //output 4 8
    return 0;
}
std::next_permutation may help:
std::vector<int> sub_set(const std::vector<int>& a, int k)
{
std::vector<int> res(k, 0);
std::vector<bool> p(a.size() - k, false);
p.resize(a.size(), true);
do
{
int index = 0;
for (std::size_t i = 0; i != p.size(); ++i) {
if (p[i]) {
res[index++] += a[i];
}
}
} while (std::next_permutation(p.begin(), p.end()));
return res;
}
I took a test in C++ that asked for a function returning one of the indices that split the input vector into 2 parts with the same sum of elements, e.g. for vec = {1, 2, 3, 5, 4, -1, 1, 1, 2, -1} it may return 3, because 1+2+3 = 6 = 4-1+1+1+2-1. So I wrote this function, which returns the correct answer:
int func(const std::vector< int >& vecIn)
{
for (std::size_t p = 0; p < vecIn.size(); p++)
{
if (std::accumulate(vecIn.begin(), vecIn.begin() + p, 0) ==
std::accumulate(vecIn.begin() + p + 1, vecIn.end(), 0))
return p;
}
return -1;
}
My problem was when the input was a very long vector containing just 1 (or -1), the return of the function was slow. So I have thought of starting the search for the wanted index from middle, and then go left and right. But the best approach I suppose is the one where the index is in the merge-sort algorithm order, that means: n/2, n/4, 3n/4, n/8, 3n/8, 5n/8, 7n/8... where n is the size of the vector. Is there a way to write this order in a formula, so I can apply it in my function?
Thanks
EDIT
After some comments I have to mention that I took the test a few days ago, so I forgot to include the no-solution case: the function should return -1 then. I have also updated the question title.
Specifically for this problem, I would use the following algorithm:
Compute the total sum of the vector. This gives two sums (empty vector, and full vector).
For each element in order, move one element from full to empty, which means moving the value of the next element from sum(full) to sum(empty). When the two sums are equal, you have found your index.
This gives an O(n) algorithm instead of O(n^2).
You can solve the problem much faster without calling std::accumulate at each step:
int func(const std::vector< int >& vecIn)
{
    int s1 = 0;                                               // sum of the elements before index p
    int s2 = std::accumulate(vecIn.begin(), vecIn.end(), 0);  // sum of the elements from index p on
    for (std::size_t p = 0; p < vecIn.size(); p++)
    {
        s2 -= vecIn[p];   // s2 is now the sum of the elements after index p
        if (s1 == s2)
            return p;
        s1 += vecIn[p];
    }
    return -1;   // no such index
}
This is O(n). At each comparison, s1 contains the sum of the first p elements, and s2 the sum of the elements after index p; as in the question, the element at index p belongs to neither side. You can update both of them with an addition and a subtraction when moving to the next element.
Since std::accumulate needs to iterate over the range you give it, your algorithm was O(n^2), which is why it was so slow for many elements.
To answer the actual question: Your sequence n/2, n/4, 3n/4, n/8, 3n/8 can be rewritten as
1*n/2
1*n/4 3*n/4
1*n/8 3*n/8 5*n/8 7*n/8
...
that is to say, the denominator i runs through the powers of 2 from 2 up, and the numerator j runs from 1 to i-1 in steps of 2. However, this is not what you need for your actual problem, because the example you give has n=10. Clearly you don't want n/4 there - your indices have to be integers.
The best solution here is to recurse. Given a range [b,e], pick the middle value (b+e)/2 and set the new ranges to [b, (b+e)/2-1] and [(b+e)/2+1, e]. Of course, special-case ranges of length 1 or 2.
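Plain recursion visits the ranges depth-first (n/2, n/4, n/8, ...). To get integer indices level by level, in the spirit of n/2, n/4, 3n/4, ..., one option is to keep the pending ranges in a queue. The sketch below is my own illustration of that variant; the name bisection_order is made up and it uses half-open ranges [b, e).

```cpp
#include <cstddef>
#include <queue>
#include <utility>
#include <vector>

// Indices 0..n-1 in bisection order: the middle first, then the middles of
// the two halves, and so on, level by level.
std::vector<std::size_t> bisection_order(std::size_t n) {
    std::vector<std::size_t> order;
    std::queue<std::pair<std::size_t, std::size_t>> ranges;  // half-open [b, e)
    if (n > 0) ranges.emplace(std::size_t{0}, n);
    while (!ranges.empty()) {
        const std::size_t b = ranges.front().first;
        const std::size_t e = ranges.front().second;
        ranges.pop();
        const std::size_t mid = b + (e - b) / 2;
        order.push_back(mid);
        if (b < mid) ranges.emplace(b, mid);            // left part, if non-empty
        if (mid + 1 < e) ranges.emplace(mid + 1, e);    // right part, if non-empty
    }
    return order;
}
```

For n = 7 this produces 3, 1, 5, 0, 2, 4, 6. Every index is visited exactly once, so it does not reduce the number of candidate positions; it only changes the order in which they are tried.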
Considering MSalters' comments, I'm afraid another solution would be better. If you want to use less memory, maybe the selected answer is good enough, but to find the possibly multiple solutions you could use the following code:
static const int arr[] = {5,-10,10,-10,10,1,1,1,1,1};
std::vector<int> vec (arr, arr + sizeof(arr) / sizeof(arr[0]) );
// compute cumulative sum
std::vector<int> cumulative_sum( vec.size() );
cumulative_sum[0] = vec[0];
for ( size_t i = 1; i < vec.size(); i++ )
{ cumulative_sum[i] = cumulative_sum[i-1] + vec[i]; }
const int complete_sum = cumulative_sum.back();
// find multiple solutions, if there are any
const int complete_sum_half = complete_sum / 2; // suggesting this is valid...
std::vector<int>::iterator it = cumulative_sum.begin();
std::vector<int> mid_indices;
do {
it = std::find( it, cumulative_sum.end(), complete_sum_half );
if ( it != cumulative_sum.end() )
{ mid_indices.push_back( it - cumulative_sum.begin() ); ++it; }
} while( it != cumulative_sum.end() );
for ( size_t i = 0; i < mid_indices.size(); i++ )
{ std::cout << mid_indices[i] << std::endl; }
std::cout << "Split behind these indices to obtain two equal halves." << std::endl;
This way, you get all the possible solutions. If there is no way to split the vector into two equal halves, mid_indices will be left empty.
Again, you have to sum up each value only once.
My proposal is this:
static const int arr[] = {1,2,3,5,4,-1,1,1,2,-1};
std::vector<int> vec (arr, arr + sizeof(arr) / sizeof(arr[0]) );
int idx1(0), idx2(vec.size()-1);
int sum1(0), sum2(0);
int idxMid = -1;
do {
// fast access without using the index each time.
const int& val1 = vec[idx1];
const int& val2 = vec[idx2];
// Precompute the next (possible) sum values.
const int nSum1 = sum1 + val1;
const int nSum2 = sum2 + val2;
// move the index considering the balance between the
// left and right sum.
if ( sum1 - nSum2 < sum2 - nSum1 )
{ sum1 = nSum1; idx1++; }
else
{ sum2 = nSum2; idx2--; }
if ( idx1 >= idx2 ){ idxMid = idx2; }
} while( idxMid < 0 && idx2 >= 0 && idx1 < vec.size() );
std::cout << idxMid << std::endl;
It adds every value only once, no matter how many values there are, so its complexity is only O(n) and not O(n^2).
The code simply runs from the left and the right simultaneously and moves an index further if its side is lower than the other.
You want the nth term of the series you mentioned. It would be:
numerator: (n - 2^((int)(log2 n)) ) *2 + 1
denominator: 2^((int)(log2 n) + 1)
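As a quick check of the formula, here is a small sketch that computes the term with integer arithmetic only. The function name nth_fraction is made up, terms are counted from n = 1, and n is assumed to be below 2^63.

```cpp
#include <utility>

// n-th term (n >= 1) of 1/2, 1/4, 3/4, 1/8, 3/8, 5/8, 7/8, ...
// returned as {numerator, denominator}.
std::pair<unsigned long long, unsigned long long> nth_fraction(unsigned long long n) {
    unsigned level = 0;                            // floor(log2(n))
    while ((n >> (level + 1)) != 0) ++level;
    const unsigned long long pow2 = 1ULL << level; // 2^floor(log2(n))
    return {(n - pow2) * 2 + 1, pow2 * 2};
}
```

For example, n = 3 gives 3/4 and n = 5 gives 3/8. Multiplying the fraction by the vector size and rounding still has to be done separately to obtain an integer index.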
I came across the same question in Codility tests. There is a similar-looking answer above (it didn't pass some of the unit tests), but the code segment below passed the tests.
#include <vector>
#include <numeric>
#include <iostream>
using namespace std;
// Returns -1 if equilibrium point is not found
// use long long to support bigger ranges
int FindEquilibriumPoint(vector<long> &values) {
long long lower = 0;
long long upper = std::accumulate(values.begin(), values.end(), 0LL); // 0LL so the sum is accumulated as long long
for (std::size_t i = 0; i < values.size(); i++) {
upper -= values[i];
if (lower == upper) {
return i;
}
lower += values[i];
}
return -1;
}
int main() {
vector<long> v = {-1, 3, -4, 5, 1, -6, 2, 1};
cout << "Equilibrium Point:" << FindEquilibriumPoint(v) << endl;
return 0;
}
Output
Equilibrium Point:1
Here it is the algorithm in Javascript:
function equi(arr){
var N = arr.length;
if (N == 0){ return -1};
var suma = 0;
for (var i=0; i<N; i++){
suma += arr[i];
}
var suma_iz = 0;
for(i=0; i<N; i++){
var suma_de = suma - suma_iz - arr[i];
if (suma_iz == suma_de){
return i};
suma_iz += arr[i];
}
return -1;
}
As you can see, this code satisfies the O(n) requirement.
I have a sorted std::vector<int> and I would like to find the longest 'streak of consecutive numbers' in this vector and then return both the length of it and the smallest number in the streak.
To visualize it for you :
suppose we have :
1 3 4 5 6 8 9
I would like it to return: maxStreakLength = 4 and streakBase = 3
There might be occasion where there will be 2 streaks and we have to choose which one is longer.
What is the best (fastest) way to do this ? I have tried to implement this but I have problems with coping with more than one streak in the vector. Should I use temporary vectors and then compare their lengths?
No, you can do this in one pass through the vector, storing only the start point and length of the longest streak found so far. You also need far fewer than 'N' comparisons. *
Hint: if you already have, say, a 4-long match ending at the 5th position (=6), which position do you have to check next?
[*] left as exercise to the reader to work out what's the likely O( ) complexity ;-)
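One possible reading of the hint, as a hedged sketch: a streak starting at i can only beat the current record if the element best_len positions further on continues it, so that element is checked first. The sketch assumes the vector is strictly increasing (sorted, no duplicates); longest_streak is a hypothetical name and it returns the start index and the length.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Returns {start index, length} of the longest run of consecutive values.
// Assumes v is strictly increasing (sorted, no duplicates).
std::pair<std::size_t, std::size_t> longest_streak(const std::vector<int>& v) {
    const std::size_t n = v.size();
    if (n == 0) return {0, 0};
    std::size_t best_start = 0, best_len = 1, i = 0;
    while (i + best_len < n) {
        std::size_t j = i + best_len;  // a longer streak starting at i must reach j
        if (static_cast<long long>(v[j]) - v[i] == static_cast<long long>(j - i)) {
            while (j + 1 < n && v[j + 1] == v[j] + 1) ++j;  // extend to the end of the streak
            best_start = i;
            best_len = j - i + 1;
            i = j + 1;                 // the next streak starts right after the gap
        } else {
            // Some gap lies in (i, j]; walk back from j to the last one.
            std::size_t k = j;
            while (v[k] - v[k - 1] == 1) --k;
            i = k;                     // streaks starting in [i, k) are too short to matter
        }
    }
    return {best_start, best_len};
}
```

For 1 3 4 5 6 8 9 the result is start index 1 and length 4, i.e. streakBase = v[1] = 3 and maxStreakLength = 4. Once a long streak has been found, short candidates are rejected with few comparisons, although the worst case remains linear.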
It would be interesting to see if the fact that the array is sorted can be exploited somehow to improve the algorithm. The first thing that comes to mind is this: if you know that all numbers in the input array are unique, then for a range of elements [i, j] in the array, you can immediately tell whether elements in that range are consecutive or not, without actually looking through the range. If this relation holds
array[j] - array[i] == j - i
then you can immediately say that elements in that range are consecutive. This criterion, obviously, uses the fact that the array is sorted and that the numbers don't repeat.
Now, we just need to develop an algorithm which will take advantage of that criterion. Here's one possible recursive approach:
Input of recursive step is the range of elements [i, j]. Initially it is [0, n-1] - the whole array.
Apply the above criterion to range [i, j]. If the range turns out to be consecutive, there's no need to subdivide it further. Send the range to output (see below for further details).
Otherwise (if the range is not consecutive), divide it into two equal parts [i, m] and [m+1, j].
Recursively invoke the algorithm on the lower part ([i, m]) and then on the upper part ([m+1, j]).
The above algorithm will perform binary partition of the array and recursive descent of the partition tree using the left-first approach. This means that this algorithm will find adjacent subranges with consecutive elements in left-to-right order. All you need to do is to join the adjacent subranges together. When you receive a subrange [i, j] that was "sent to output" at step 2, you have to concatenate it with previously received subranges, if they are indeed consecutive. Or you have to start a new range, if they are not consecutive. All the while you have to keep track of the "longest consecutive range" found so far.
That's it.
The benefit of this algorithm is that it detects subranges of consecutive elements "early", without looking inside these subranges. Obviously, its worst-case performance (if there are no consecutive subranges at all) is still O(n). In the best case, when the entire input array is consecutive, this algorithm will detect it instantly. (I'm still working on a meaningful O estimation for this algorithm.)
The usability of this algorithm is, again, undermined by the uniqueness requirement. I don't know whether it is something that is "given" in your case.
Anyway, here's a possible C++ implementation
#include <cassert>
#include <utility>
#include <vector>
typedef std::vector<int> vint;
typedef std::pair<vint::size_type, vint::size_type> range;
class longest_sequence
{
public:
const range& operator ()(const vint &v)
{
current = max = range(0, 0);
process_subrange(v, 0, v.size() - 1);
check_record();
return max;
}
private:
range current, max;
void process_subrange(const vint &v, vint::size_type i, vint::size_type j);
void check_record();
};
void longest_sequence::process_subrange(const vint &v,
vint::size_type i, vint::size_type j)
{
assert(i <= j && v[i] <= v[j]);
assert(i == 0 || i == current.second + 1);
if (v[j] - v[i] == j - i)
{ // Consecutive subrange found
assert(v[current.second] <= v[i]);
if (i == 0 || v[i] == v[current.second] + 1)
// Append to the current range
current.second = j;
else
{ // Range finished
// Check against the record
check_record();
// Start a new range
current = range(i, j);
}
}
else
{ // Subdivision and recursive calls
assert(i < j);
vint::size_type m = (i + j) / 2;
process_subrange(v, i, m);
process_subrange(v, m + 1, j);
}
}
void longest_sequence::check_record()
{
assert(current.second >= current.first);
if (current.second - current.first > max.second - max.first)
// We have a new record
max = current;
}
int main()
{
int a[] = { 1, 3, 4, 5, 6, 8, 9 };
std::vector<int> v(a, a + sizeof a / sizeof *a);
range r = longest_sequence()(v);
return 0;
}
I believe that this should do it?
size_t beginStreak = 0;
size_t streakLen = 1;
size_t longest = 0;
size_t longestStart = 0;
for (size_t i=1; i < vec.size(); i++) {
if (vec[i] == vec[i-1] + 1) {
streakLen++;
}
else {
if (streakLen > longest) {
longest = streakLen;
longestStart = beginStreak;
}
beginStreak = i;
streakLen = 1;
}
}
if (streakLen > longest) {
longest = streakLen;
longestStart = beginStreak;
}
You can't solve this problem in less than O(N) time. Imagine your list is the first N-1 even numbers, plus a single odd number (chosen from among the first N-1 odd numbers). Then there is a single streak of length 3 somewhere in the list, but worst case you need to scan the entire list to find it. Even on average you'll need to examine at least half of the list to find it.
Similar to Rodrigo's solutions but solving your example as well:
#include <vector>
#include <cstdio>
#define len(x) sizeof(x) / sizeof(x[0])
using namespace std;
int nums[] = {1,3,4,5,6,8,9};
int streakBase = nums[0];
int maxStreakLength = 1;
void updateStreak(int currentStreakLength, int currentStreakBase) {
if (currentStreakLength > maxStreakLength) {
maxStreakLength = currentStreakLength;
streakBase = currentStreakBase;
}
}
int main(void) {
vector<int> v;
for(size_t i=0; i < len(nums); ++i)
v.push_back(nums[i]);
int lastBase = v[0], currentStreakBase = v[0], currentStreakLength = 1;
for(size_t i=1; i < v.size(); ++i) {
if (v[i] == lastBase + 1) {
currentStreakLength++;
lastBase = v[i];
} else {
updateStreak(currentStreakLength, currentStreakBase);
currentStreakBase = v[i];
lastBase = v[i];
currentStreakLength = 1;
}
}
updateStreak(currentStreakLength, currentStreakBase);
printf("maxStreakLength = %d and streakBase = %d\n", maxStreakLength, streakBase);
return 0;
}