Equilibrium index of an array of large numbers, how to prevent overflow?

Problem statement:
An equilibrium index of an array is an index into the array such that the sum of elements at lower indices is equal to the sum of elements at higher indices.
For example, in {-7, 1, 5, 2, -4, 3, 0}, 3 is an equilibrium index, because:
-7 + 1 + 5 = -4 + 3 + 0
Write a function that, given a vector of ints, returns its equilibrium index (if any). Assume that the vector may be very long.
All the efficient solutions I have found rely on the same fact: given the sum of all elements and the running sum of one part, the sum of the remaining part can be obtained by subtraction.
I don't think these solutions are correct: if we pass a large vector whose elements are INT_MAX, the sum of the elements will overflow.
How can the overflow problem be solved?
#include <algorithm>
#include <iostream>
#include <iterator>
#include <numeric>
#include <vector>

template <typename T>
std::vector<size_t> equilibrium(T first, T last)
{
    typedef typename std::iterator_traits<T>::value_type value_t;

    value_t left = 0;
    value_t right = std::accumulate(first, last, value_t(0));
    std::vector<size_t> result;

    for (size_t index = 0; first != last; ++first, ++index)
    {
        right -= *first;              // right: sum of the elements after index
        if (left == right)
        {
            result.push_back(index);  // record every equilibrium index
        }
        left += *first;               // left: sum of the elements up to index
    }
    return result;
}

template <typename T>
void print(const T& value)
{
    std::cout << value << "\n";
}

int main()
{
    const int data[] = { -7, 1, 5, 2, -4, 3, 0 };
    std::vector<size_t> indices(equilibrium(data, data + 7));
    std::for_each(indices.begin(), indices.end(), print<size_t>);
}

The short answer is that ultimately it can't be completely cured/solved, unless you limit the number/magnitude of the inputs--not even with something like Java's BigInt (or equivalents for C++ such as gmp, NTL, etc.)
The problem is pretty simple: the memory in any computer is finite. There will always be some finite limit on the numbers we can represent. An arbitrary precision integer type can increase the limit to numbers far larger than most of us work with on a regular basis, but regardless of what the limit might be, there will always be dramatically larger numbers that can't be represented (at least without changing to some other representation--but if we're going to have precision to the units place for arbitrary numbers, there are distinct limits on how clever we can get in representing gargantuan numbers).
For the conditions given in the linked problem, the long long type in C and C++ is adequate. If we want to increase the limit to some ridiculous size with a solution in C++, it's pretty simple. Although they're not a required part of a C++ implementation, there are many arbitrary precision integer libraries available for C++.
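To make the long long suggestion concrete, here is a minimal sketch, assuming the input is a std::vector<int> with 32-bit int. The function name equilibrium_indices is made up for this illustration. Because long long is at least 64 bits wide, the running sums could only overflow with more than about two billion elements.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns every equilibrium index of `data`.
// Both sums are kept in long long instead of int.
std::vector<std::size_t> equilibrium_indices(const std::vector<int>& data)
{
    long long right = 0;
    for (int v : data) right += v;     // total of all elements

    std::vector<std::size_t> result;
    long long left = 0;
    for (std::size_t i = 0; i < data.size(); ++i) {
        right -= data[i];              // right: sum of the elements after i
        if (left == right) result.push_back(i);
        left += data[i];               // left: sum of the elements up to and including i
    }
    return result;
}
```

With the array from the question this reports indices 3 and 6, and a vector such as {INT_MAX, 5, INT_MAX} no longer overflows the accumulators.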
I suppose there could be some way to compute an answer to this problem that doesn't involve actually summing the numbers--but at least at first glance, this idea doesn't seem very promising to me. The statement of the problem is specifically about computing sums. While you could certainly carry out various machinations to keep the summing from looking like summing, the fact is that the basic statement of the problem involves sums, which tends to suggest that solutions that don't involve sums may well be difficult to find.

Yes it is possible. Notice that if data[0] < data[len - 1], then data[1] shall belong to the "left" part; similarly if data[0] > data[len-1] then data[len-2] belongs to the "right" part. This observation allows an inductive proof of correctness of the following algorithm:
left_weight = 0; right_weight = 0
left_index = 0; right_index = len
while left_index < right_index
    if left_weight < right_weight
        left_weight += data[left_index++];
    else
        right_weight += data[--right_index]
Still there is an accumulation, but it is easy to deal with by keeping track of imbalance and a boolean indicator of which side is heavier:
while left_index < right_index
    if heavier_side == right
        weight = data[left_index++]
    else
        weight = data[--right_index]
    if weight < imbalance
        imbalance = imbalance - weight
    else
        heavier_side = !heavier_side
        imbalance = weight - imbalance
At least for unsigned data there is no possibility of overflow. Some tinkering might be required for signed values.
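A possible C++ rendering of the imbalance-tracking loop, as a sketch only. It assumes unsigned data and looks for a split point k such that data[0..k) and data[k..n) have equal sums, which is how the pseudocode is phrased (it does not skip a pivot element). The name balanced_split and the pair return type are my own choices.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Only |left_sum - right_sum| is stored. After each step it is either
// reduced or replaced by (weight - imbalance) <= weight, so it never
// exceeds the largest element and cannot overflow.
std::pair<bool, std::size_t> balanced_split(const std::vector<unsigned>& data)
{
    std::size_t left_index = 0, right_index = data.size();
    unsigned imbalance = 0;
    bool right_is_heavier = true;   // arbitrary choice while imbalance == 0

    while (left_index < right_index) {
        // Feed the next element to the lighter side.
        unsigned weight = right_is_heavier ? data[left_index++]
                                           : data[--right_index];
        if (weight < imbalance) {
            imbalance -= weight;                    // same side stays heavier
        } else {
            right_is_heavier = !right_is_heavier;   // the sides swap roles
            imbalance = weight - imbalance;
        }
    }
    // Both indices have met; the split is balanced iff nothing is left over.
    return std::make_pair(imbalance == 0, left_index);
}
```

For {1, 2, 3} this yields (true, 2), and two elements equal to UINT_MAX balance without any overflow.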


Efficiently find an integer not in a set of size 40, 400, or 4000

Related to the classic problem find an integer not among four billion given ones but not exactly the same.
To clarify, by integers what I really mean is only a subset of the mathematical definition. That is, assume there is only a finite number of integers. Say, in C++, they are the ints in the range [INT_MIN, INT_MAX].
Now given a std::vector<int> (no duplicates) or std::unordered_set<int>, whose size can be 40, 400, 4000 or so, but not too large, how to efficiently generate a number that is guaranteed to be not among the given ones?
If overflow were not a concern, I could multiply all the nonzero ones together and add 1 to the product. But it is. The adversarial test cases could deliberately contain INT_MAX.
I am more in favor of simple, non-random approaches. Is there any?
Thank you!
Update: to clear up ambiguity, let's say an unsorted std::vector<int> which is guaranteed to have no duplicates. So I am asking if there is anything better than O(n log(n)). Also please note that test cases may contain both INT_MIN and INT_MAX.
You could just return the first of N+1 candidate integers not contained in your input. The simplest candidates are the numbers 0 to N. This requires O(N) space and time.
int find_not_contained(container<int> const&data)
{
    const int N=data.size();
    std::vector<char> known(N+1, 0); // one more candidates than data
    for(int i=0; i< N; ++i)
        if(data[i]>=0 && data[i]<=N)
            known[data[i]]=1;        // mark candidates that occur in data
    for(int i=0; i<=N; ++i)
        if(!known[i])
            return i;                // first candidate that never occurred
    assert(false); // should never be reached.
}
Random methods can be more space efficient, but may require more passes over the data in the worst case.
Random methods are indeed very efficient here.
If we want to use a deterministic method and by assuming the size n is not too large, 4000 for example, then we can create a vector x of size m = n + 1 (or a little bit larger, 4096 for example to facilitate calculation), initialised with 0.
For each i in the range, we just set x[array[i] modulo m] = 1.
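Then a simple O(n) search in x will provide a value which is not in array.
Note: the modulo operation meant here is not exactly the "%" operator, which in C++ can return a negative result for a negative operand; the index must lie in [0, m).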
Edit: I mentioned that calculations are made easier by selecting here a size of 4096. To be more concrete, this implies that the modulo operation is performed with a simple & operation
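A rough sketch of the masking idea, assuming a std::vector<int> input. The function name find_missing_by_mask is invented here. Converting to unsigned before masking gives the true non-negative residue modulo a power of two, also for negative values.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns a value in [0, m) that does not occur in `data`, where m is the
// smallest power of two strictly greater than data.size().
int find_missing_by_mask(const std::vector<int>& data)
{
    std::size_t m = 1;
    while (m <= data.size()) m <<= 1;   // m > n, so at least one slot stays empty

    std::vector<char> seen(m, 0);
    for (int v : data)
        seen[static_cast<unsigned>(v) & (m - 1)] = 1;   // v modulo m via a mask

    // If j occurred in data, slot j would have been marked by j itself.
    for (std::size_t j = 0; j < m; ++j)
        if (!seen[j]) return static_cast<int>(j);

    assert(false);   // unreachable: at most n of the m > n slots are marked
    return 0;
}
```

For an input of 4000 values this allocates 4096 flags, matching the size mentioned above.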
You can find the smallest unused integer in O(N) time using O(1) auxiliary space if you are allowed to reorder the input vector, using the following algorithm. [Note 1] (The algorithm also works if the vector contains repeated data.)
size_t smallest_unused(std::vector<unsigned>& data) {
  size_t N = data.size(), scan = 0;
  while (scan < N) {
    auto other = data[scan];
    if (other < scan && data[other] != other) {
      data[scan] = data[other];
      data[other] = other;
    }
    else
      ++scan;
  }
  for (scan = 0; scan < N && data[scan] == scan; ++scan) { }
  return scan;
}
The first pass guarantees that if some k in the range [0, N) was found after position k, then it is now present at position k. This rearrangement is done by swapping in order to avoid losing data. Once that scan is complete, the first entry whose value is not the same as its index is not referenced anywhere in the array.
That assertion may not be 100% obvious, since an entry could be referenced from an earlier index. However, in that case the entry could not be the first entry unequal to its index, since the earlier entry would meet that criterion.
To see that this algorithm is O(N), it should be observed that the swap at lines 6 and 7 can only happen if the target entry is not equal to its index, and that after the swap the target entry is equal to its index. So at most N swaps can be performed, and the if condition at line 5 will be true at most N times. On the other hand, if the if condition is false, scan will be incremented, which can also only happen N times. So the if statement is evaluated at most 2N times (which is O(N)).
I used unsigned integers here because it makes the code clearer. The algorithm can easily be adjusted for signed integers, for example by mapping signed integers from [INT_MIN, 0) onto unsigned integers [INT_MAX, INT_MAX - INT_MIN) (The subtraction is mathematical, not according to C semantics which wouldn't allow the result to be represented.) In 2's-complement, that's the same bit pattern. That changes the order of the numbers, of course, which affects the semantics of "smallest unused integer"; an order-preserving mapping could also be used.
Pick a random x in [INT_MIN, INT_MAX] and test it against all the elements. If it is present, try x + 1, and so on (a very rare case for sizes of 40/400/4000).
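The random probing described above could look roughly like this. It is a sketch with assumed names (find_missing_random, a fixed default seed for reproducibility), and each probe is a linear search, so the cost is O(n) per attempt.

```cpp
#include <algorithm>
#include <cassert>
#include <limits>
#include <random>
#include <vector>

// Draws a random int and walks forward until a value not in `data` is found.
// Assumes `data` does not contain every possible int.
int find_missing_random(const std::vector<int>& data, unsigned seed = 12345u)
{
    const int lo = std::numeric_limits<int>::min();
    const int hi = std::numeric_limits<int>::max();

    std::mt19937 gen(seed);
    std::uniform_int_distribution<int> dist(lo, hi);

    int x = dist(gen);
    while (std::find(data.begin(), data.end(), x) != data.end())
        x = (x == hi) ? lo : x + 1;   // wrap around instead of overflowing
    return x;
}
```

The wrap-around at INT_MAX is my addition; a plain x++ would be undefined behaviour at that point.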
Step 1: Sort the vector.
That can be done in O(n log(n)), you can find a few different algorithms online, use the one you like the most.
Step 2: Find the first int not in the vector.
Simply iterate from INT_MIN to INT_MIN + 40/400/4000, checking whether the vector has the current int:
SIZE = 40|400|4000 // The one you are using
for (int i = 0; i < SIZE; i++) {
    if (array[i] != INT_MIN + i)
        return INT_MIN + i;
}
return INT_MIN + SIZE; // array holds exactly INT_MIN .. INT_MIN + SIZE - 1
The solution would be O(n log(n) + n), which is O(n log(n)).
Edit: just read your edit asking for something better than O(n log(n)), sorry.
For the case in which the integers are provided in an std::unordered_set<int> (as opposed to a std::vector<int>), you could simply traverse the range of integer values until you come up against one integer value that is not present in the unordered_set<int>. Searching for the presence of an integer in an std::unordered_set<int> is quite straightforward, since std::unordered_set provides searching through its find() member function.
The space complexity of this approach would be O(1).
If you start traversing at the lowest possible value for an int (i.e., std::numeric_limits<int>::min()), you will obtain the lowest int not contained in the std::unordered_set<int>:
int find_lowest_not_contained(const std::unordered_set<int>& set) {
    for (auto i = std::numeric_limits<int>::min(); ; ++i) {
        auto it = set.find(i); // search in set
        if (it == set.end()) // integer not in set?
            return i; // it is the end iterator here, so return i itself
    }
}
Analogously, if you start traversing at the greatest possible value for an int (i.e., std::numeric_limits<int>::max()), you will obtain the greatest int not contained in the std::unordered_set<int>:
int find_greatest_not_contained(const std::unordered_set<int>& set) {
    for (auto i = std::numeric_limits<int>::max(); ; --i) {
        auto it = set.find(i); // search in set
        if (it == set.end()) // integer not in set?
            return i; // it is the end iterator here, so return i itself
    }
}
Assuming that the ints are uniformly mapped by the hash function into the unordered_set<int>'s buckets, a search operation on the unordered_set<int> can be achieved in constant time. The run-time complexity would then be O(M), where M is the size of the integer range searched for a non-contained value. M is upper-bounded by the size of the unordered_set<int> plus one (i.e., in your case M <= 4001).
Indeed, with this approach, selecting any integer range whose size is greater than the size of the unordered_set, is guaranteed to come up against an integer value which is not present in the unordered_set<int>.

summing array of doubles with large value span : proper algorithm

I have an algorithm where I need to sum (many times) double numbers ranging from e-40 to e+40.
Array Example (randomly dumped from real application):
It goes without saying that I am aware of the rounding effect this will cause. I am trying to keep it under control: the final result should not be missing any information in the fractional part of the double or, if that is not avoidable, the result should be at least n-digit accurate (with n defined). The end result needs something like 5 digits plus exponent.
After some decent thinking, I ended up with the following algorithm:
Sort the array so that the largest absolute value comes first, closest to zero last.
Add everything in a loop
The idea is that in this case, any cancellation of large values (negative and positive) will not impact later, smaller values.
In short :
(10e40 - 10e40) + 1 = 1 : result is as expected
(1 + 10e40) - 10e40 = 0 : not good
I ended up using std::multiset with a custom sort function using std::fabs (a benchmark on my PC gave 20% higher speed with long double compared to normal doubles; I am fine with double resolution).
It's still quite slow (it takes 5 seconds to do the whole thing) and I still have this feeling of "you missed something in your algo". Any recommendations:
for speed optimization. Is there a better way to sort the intermediate products ? Sorting a set of 40 intermediate results (typically) takes about 70% of the total execution time.
for missed issues. Is there a chance to still lose critical data (one that should have been in the fractional part of the final result) ?
In the bigger picture, I am implementing real-coefficient polynomial classes of a pure imaginary variable (electrical impedances: Z(jw)). Z is a big polynomial representing a user-defined system, with coefficient exponents ranging very widely.
The "big" comes from adding things like Zc1 = 1/jC1w to Zc2 = 1/jC2w :
Zc1 + Zc2 = (C1C2(jw)^2 + 0(jw))/(C1+C2)(jw)
In this case, with C1 and C2 in nanofarad (10e-9), C1C2 is already in 10e-18 (and it only started...)
My sort function uses a Manhattan distance on the complex values (because mine are either pure real or pure imaginary):
struct manhattan_complex_distance {
    bool operator() (std::complex<long double> a, std::complex<long double> b) const {
        return std::fabs(std::real(a) + std::imag(a)) > std::fabs(std::real(b) + std::imag(b));
    }
};
and my multi set in action :
std::complex<long double> get_value(std::vector<std::complex<long double>>& frequency_vector)
{
    //frequency_vector is precalculated once for all to have at index n the value (jw)^n.
    std::multiset<std::complex<long double>, manhattan_complex_distance> temp_list;
    for (int i=0; i<m_coeficients.size(); ++i)
    {
        // element of : ℝ * ℂ
        temp_list.insert(m_coeficients[i] * frequency_vector[i]);
    }
    std::complex<long double> ret=0;
    for (auto i:temp_list)
    {
        // it is VERY important to start adding the big values before adding the small ones.
        // in informatics, 10^60 - 10^60 + 1 = 1; while 1 + 10^60 - 10^60 = 0. Of course you'd expect to get 1, not 0.
        ret += i;
    }
    return ret;
}
The project I have is c++11 enabled (mainly for improvement of the math lib and complex number tools)
ps: I refactored the code to make it easy to read; in reality all the complex and long double names are template parameters: I can change the polynomial type in no time or use the class for regular polynomials over ℝ
As GuyGreer suggested, you can use Kahan summation:
double sum = 0.0;
double c = 0.0;               // running compensation for lost low-order bits
for (double value : values) {
    double y = value - c;
    double t = sum + y;
    c = (t - sum) - y;        // what was lost when forming t
    sum = t;
}
EDIT: You should also consider using Horner's method to evaluate the polynomial.
double value = coeffs[degree];
for (auto i = degree; i-- > 0;) {
    value *= x;
    value += coeffs[i];
}
Sorting the data is on the right track. But you definitely should be summing from smallest magnitude to largest, not from largest to smallest. Summing from largest to smallest, by the time you get to the smallest, aligning the next value with the current sum is liable to cause most or all of the bits of the next value to 'fall off the end'. Summing instead from smallest to largest, the smallest values get a chance to accumulate a decent-sized sum, for which more bits will get into the largest. Combined with Kahan summation, that should yield a fairly accurate sum.
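Putting the two suggestions together (ascending magnitude plus Kahan summation) might look like the sketch below. The function name sum_ascending_kahan is hypothetical, plain double is assumed, and the compensation step relies on the compiler not reassociating floating-point operations (so no -ffast-math or similar).

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Takes the values by copy so the caller's vector is left untouched.
double sum_ascending_kahan(std::vector<double> values)
{
    // Smallest magnitude first, so small terms accumulate before the large ones.
    std::sort(values.begin(), values.end(),
              [](double a, double b) { return std::fabs(a) < std::fabs(b); });

    double sum = 0.0;
    double c = 0.0;               // running compensation
    for (double v : values) {
        double y = v - c;
        double t = sum + y;
        c = (t - sum) - y;        // low-order bits lost when forming t
        sum = t;
    }
    return sum;
}
```

For example, one value of 1e16 together with ten values of 1.0 sums to exactly 10000000000000010 here, because the ten ones are combined before they meet the large term.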
First: have your math keep track of error. Replace your doubles with error-aware types, so that when you add or multiply two doubles together the maximum error is also calculated.
This is about the only way you can guarantee that your code produces accurate results while being reasonably fast.
Second, don't use a multiset. The associative containers are not for sorting, they are for maintaining a sorted collection, while being able to incrementally add or remove elements from it efficiently.
The ability to add/remove elements incrementally means it is node-based, and node-based means it is slow in general.
If you simply want a sorted collection, start with a vector then std::sort it.
Next, to minimize error, keep a list of positive and negative elements. Start with zero as your sum. Now pick the smallest of either the positive or negative elements such that the total of your sum and that element is closest to zero.
Do so with elements that calculate their error bounds.
At the end, determine if you have 5 digits of precision, or not.
These error-propagating doubles should ideally be used as early on in the algorithm as possible.

Simulate random iteration of array

I have an array of given size. I want to traverse it in pseudorandom order, keeping array intact and visiting each element once. It will be best if current state can be stored in a few integers.
I know you can't have full randomness without storing full array, but I don't need the order to be really random. I need it to be perceived as random by user. The solution should use sub-linear space.
One possible suggestion - using a large prime number - is given here. The problem with this solution is that there is an obvious fixed step (taken modulo the array size). I would prefer a solution which is not so obviously non-random. Is there a better solution?
How about this algorithm?
To pseudo-pseudo randomly traverse an array of size n:
1. Create a small array of size k.
2. Use the large prime number method to fill the small array, i = 0.
3. Randomly remove a position using an RNG from the small array, i += 1.
4. If i < n - k, then add a new position using the large prime number method.
5. If i < n, goto 3.
The higher k is, the more randomness you get. This approach will allow you to delay generating numbers from the prime number method.
A similar approach can be done to generate a number earlier than expected in the sequence by creating another array, "skip-list". Randomly pick items later in the sequence, use them to traverse the next position, and then add them to the skip-list. When they naturally arrive they are searched for in the skip-list and suppressed and then removed from the skip-list at which point you can randomly add another item to the skip-list.
The idea of a random generator that simulates a shuffle is good if you can get one whose maximum period you can control.
A Linear Congruential Generator calculates a random number with the formula:
x[i + 1] = (a * x[i] + c) % m;
The maximum period is m and it is achieved when the following properties hold:
The parameters c and m are relatively prime.
For every prime number r dividing m, a - 1 is a multiple of r.
If m is a multiple of 4 then also a - 1 is multiple of 4.
My first draft involved making m the next multiple of 4 after the array length and then finding suitable a and c values. This was (a) a lot of work and (b) yielded very obvious results sometimes.
I've rethought this approach. We can make m the smallest power of two that the array length will fit in. The only prime factor of m is then 2, which will make every odd number relatively prime to it. With the exception of 1 and 2, m will be divisible by 4, which means that we must make a - 1 a multiple of 4.
Having a greater m than the array length means that we must discard all values that are illegal array indices. This will happen at most every other turn and should be negligible.
The following code yields pseudo-random numbers with a period of exactly m. I've avoided trivial values for a and c, and in my (not too numerous) spot checks the results looked okay. At least there was no obvious cycling pattern.
class RandomIndexer
{
public:
    RandomIndexer(size_t length) : len(length)
    {
        m = 8;
        while (m < length) m <<= 1;

        c = m / 6 + uniform(5 * m / 6);
        c |= 1;                     // odd, hence relatively prime to m

        a = m / 12 * uniform(m / 6);
        a = 4*a + 1;                // a - 1 is a multiple of 4
        x = uniform(m);
    }

    size_t next()
    {
        do { x = (a*x + c) % m; } while (x >= len);
        return x;
    }

private:
    static size_t uniform(size_t m)
    {
        double p = std::rand() / (1.0 + RAND_MAX);
        return static_cast<int>(m * p);
    }

    size_t len;
    size_t x;
    size_t a;
    size_t c;
    size_t m;
};
You can then use the generator like this:
std::vector<int> list;
for (size_t i = 0; i < 3; i++) list.push_back(i);
RandomIndexer ix(list.size());
for (size_t i = 0; i < list.size(); i++) {
    std::cout << list[ix.next()]<< std::endl;
}
I am aware that this still isn't a great random number generator, but it is reasonably fast, doesn't require a copy of the array and seems to work okay.
If the approach of picking a and c randomly yields bad results, it might be a good idea to restrict the generator to some powers of two and to hard-code literature values that have proven to be good.
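As an illustration of the hard-coded variant, here is a sketch that uses the multiplier 1664525 and increment 1013904223 (the same pair that appears in the Java answer in this thread) with m a power of two. The class name LcgIndexer and the helper visit_order are made up for the example. These constants satisfy the conditions listed above: the increment is odd and a - 1 is a multiple of 4.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

class LcgIndexer {
public:
    explicit LcgIndexer(std::size_t length, std::uint64_t seed = 0)
        : len_(length), mask_(0), x_(0)
    {
        assert(length > 0);
        std::uint64_t m = 1;
        while (m < length) m <<= 1;    // smallest power of two >= length
        mask_ = m - 1;
        x_ = seed & mask_;
    }

    std::size_t next()
    {
        const std::uint64_t a = 1664525u;      // a - 1 is a multiple of 4
        const std::uint64_t c = 1013904223u;   // odd, so relatively prime to m
        do {
            x_ = (a * x_ + c) & mask_;         // "& mask_" equals "% m" for a power of two
        } while (x_ >= len_);                  // skip states that are not valid indices
        return static_cast<std::size_t>(x_);
    }

private:
    std::uint64_t len_, mask_, x_;
};

// Collects one full pass over the indices 0 .. length-1.
std::vector<std::size_t> visit_order(std::size_t length, std::uint64_t seed)
{
    LcgIndexer ix(length, seed);
    std::vector<std::size_t> order;
    for (std::size_t i = 0; i < length; ++i) order.push_back(ix.next());
    return order;
}
```

Unsigned 64-bit wrap-around is harmless here because m divides 2^64. Whether the resulting order looks random enough for small arrays is something I have not checked.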
As pointed out by others, you can create a sort of "flight plan" upfront by shuffling an array of array indices and then follow it. This violates the "it will be best if current state can be stored in a few integers" constraint but does it really matter? Are there tight performance constraints? After all, I believe that if you don't accept repetitions, then you need to store the items you already visited somewhere or somehow.
Alternatively, you can opt for an intrusive solution and store a bool inside each element of the array, telling you whether the element was already selected or not. This can be done in an almost clean way by employing inheritance (multiple as needed).
Many problems come with this solution, e.g. thread safety, and of course it violates the "keep the array intact" constraint.
Quadratic residues which you have mentioned ("using a large prime") are well-known, will work, and guarantee iterating each and every element exactly once (if that is required, but it seems that's not strictly the case?). Unluckily they are not "very random looking", and there are a few other requirements to the modulo in addition to being prime for it to work.
There is a page on Jeff Preshing's site which describes the technique in detail and suggests to feed the output of the residue generator into the generator again with a fixed offset.
However, since you said that you merely need "perceived as random by user", it seems that you might be able to do with feeding a hash function (say, cityhash or siphash) with consecutive integers. The output will be a "random" integer, and at least so far there will be a strict 1:1 mapping (since there are a lot more possible hash values than there are inputs).
Now the problem is that your array is most likely not that large, so you need to somehow reduce the range of these generated indices without generating duplicates (which is tough).
The obvious solution (taking the modulo) will not work, as it pretty much guarantees that you get a lot of duplicates.
Using a bitmask to limit the range to the next greater power of two should work without introducing bias, and discarding indices that are out of bounds (generating a new index) should work as well. Note that this needs non-deterministic time -- but the combination of these two should work reasonably well (a couple of tries at most) on the average.
Otherwise, the only solution that "really works" is shuffling an array of indices as pointed out by Kamil Kilolajczyk (though you don't want that).
Here is a Java solution, which can easily be converted to C++ and is similar to M Oehm's solution above, albeit with a different way of choosing the LCG parameters.
import java.util.Enumeration;
import java.util.Random;

public class RandomPermuteIterator implements Enumeration<Long> {
    int c = 1013904223, a = 1664525;
    long seed, N, m, next;
    boolean hasNext = true;

    public RandomPermuteIterator(long N) throws Exception {
        if (N <= 0 || N > Math.pow(2, 62)) throw new Exception("Unsupported size: " + N);
        this.N = N;
        m = (long) Math.pow(2, Math.ceil(Math.log(N) / Math.log(2)));
        next = seed = new Random().nextInt((int) Math.min(N, Integer.MAX_VALUE));
    }

    public static void main(String[] args) throws Exception {
        RandomPermuteIterator r = new RandomPermuteIterator(100);
        while (r.hasMoreElements()) System.out.print(r.nextElement() + " ");
        //output:50 52 3 6 45 40 26 49 92 11 80 2 4 19 86 61 65 44 27 62 5 32 82 9 84 35 38 77 72 7 ...
    }

    public boolean hasMoreElements() {
        return hasNext;
    }

    public Long nextElement() {
        next = (a * next + c) % m;
        while (next >= N) next = (a * next + c) % m;
        if (next == seed) hasNext = false;
        return next;
    }
}

Finding an increasing sequence a[] which minimizes sigma(abs(a[i]+c[i]))

Problem statement
c is a given array of n integers; the problem is to find an increasing array of n integers a (a[i] <= a[i+1]) such that this sum is minimized:
abs(a[0]+c[0]) + abs(a[1]+c[1]) + ... + abs(a[n-1]+c[n-1])
// abs(x) = absolute value of x
There exists an optimal a consisting only of (negated) integers that appear in c, so we can solve it using DP in O(n^2):
dp[i][j]: a[i] >= j'th integer
But there should be a faster solution, probably O(n lg n).
Update: I add the solution, which minimizes sum-of-absolute-values. Other solution, which minimizes sum-of-squares, is still here, at the end of this post, in case someone is interested.
Minimize sum-of-absolute-values algorithm
I start with the algorithm, that works only with the array of non-negative integers. Then it will be extended to any integers (or even to non-integer objects).
This is a greedy algorithm. It uses the bitwise representation of the integers. Start with the most significant bit of each array element (and ignore the other bits for a while). Find the largest prefix that maximizes the ones/zeros balance. Now clear all the array values that belong to the prefix and have a zero most significant bit (zero all bits of these values). And for all the array values in the suffix that have a non-zero most significant bit, set all other bits to "one". Apply this algorithm recursively to both the prefix and the suffix, using the next bit as "most significant".
This splits the original array into segments. You can find median of each segment and fill the output array with this median. Alternatively, just set corresponding bits in the output array when processing prefixes and leave them zero when dealing with suffixes.
All this works because minimizing the sum of absolute values requires finding the median of subarrays, and while finding this median you can compare values very approximately, always using only a single most significant bit for the whole array and descending to the other bits later, for subarrays.
Here is C++11 code snippet, which explains the details:
//g++ -std=c++0x
#include <iostream>
#include <vector>
#include <iomanip>
using namespace std;

typedef vector<unsigned> arr_t;
typedef arr_t::iterator arr_it;

void nonincreasing(arr_it array, arr_it arrayEnd, arr_it out, int bits)
{
  if (bits != -1)
  {
    int balance = 0;
    int largestBalance = -1;
    arr_it prefixEnd = array;

    for (arr_it i = array; i != arrayEnd; ++i)
    {
      int d = ((*i >> bits) & 1)? 1: -1;
      balance += d;
      if (balance > largestBalance)
      {
        // restart the count: a later prefix must strictly beat this one
        balance = largestBalance;
        prefixEnd = i + 1;
      }
    }

    for (arr_it i = array; i != prefixEnd; ++i)
    {
      *(out + (i - array)) += (1 << bits);
      if (!((*i >> bits) & 1))
      {
        *i = 0;
      }
    }
    nonincreasing(array, prefixEnd, out, bits - 1);

    for (arr_it i = prefixEnd; i != arrayEnd; ++i)
    {
      if ((*i >> bits) & 1)
      {
        *i = (1 << bits) - 1;
      }
    }
    nonincreasing(prefixEnd, arrayEnd, out + (prefixEnd - array), bits - 1);
  }
}

void printArray(const arr_t& array)
{
  for (auto val: array)
    cout << setw(2) << val << ' ';
  cout << endl;
}

int main()
{
  arr_t array({12,10,10,17,6,3,9});
  arr_t out(array.size());
  nonincreasing(begin(array), end(array), begin(out), 5);
  printArray(out);
  return 0;
}
To work with any integers, not just positive, there are two alternatives:
Find minimum integer in the input array and subtract it from other elements. When done with the main algorithm, add it back (and negate the result). This gives complexity O(N log U), where U is range of the array's values.
Compact values of the input array. Sort it by value, remove duplicates, and instead of the original values, use index of this array. When done with the main algorithm, change indexes back to corresponding values (and negate the result). This gives complexity O(N log H), where H is the number of unique input array's values. Also this allows using not only integers, but any objects which may be ordered (compared to each other).
Minimize sum-of-squares algorithm
Here is a high level description of this algorithm. Complexity is O(N).
Start with searching of a subarray, starting at the beginning of c[] and having largest possible average value. Then fill subarray of the same length in a[] with this average value (rounded to nearest integer and negated). Then remove this subarray from a[] and c[] (in other words, assume the beginning of a[] and c[] is moved forward by subarray's length) and recursively apply this algorithm to the remaining parts of a[] and c[].
Most interesting part of this algorithm is searching of largest subarray. Fill a temporary array b[] with cumulative sum of elements from c[]: b[0] = c[0], b[1] = b[0] + c[1], ... Now you can determine average of any interval in c[] with this: (b[i+m] - b[i]) / m. By coincidence, exactly the same formula (maximization of its value) determines a tangent line from b[i] to the curve, described by b[]. So you can find all maximum values (as well as subarray bounds), needed for this algorithm, at once, using any Convex hull algorithm. Convex hull algorithms usually work with points in two dimensions and have super-linear complexity. But in this case, points are already sorted in one dimension, so Graham scan or Monotone Chain algorithm do the task in O(N) time, which also determines complexity of the whole algorithm.
Pseudocode for this algorithm:
b[] = Integrate(c[])
h[] = ConvexHull(b[])
a[] = - Derivative(h[])
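The three pseudocode steps can be sketched in C++ roughly as follows. This is my own reading of the description, not the answerer's code: the name least_squares_increasing is invented, the averages are left as doubles (the rounding to the nearest integer is omitted), and the slope comparison multiplies prefix sums by index differences, so it assumes inputs small enough for that product to fit in long long.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

std::vector<double> least_squares_increasing(const std::vector<long long>& c)
{
    const std::size_t n = c.size();

    // Integrate: b[k] = c[0] + ... + c[k-1], with b[0] = 0.
    std::vector<long long> b(n + 1, 0);
    for (std::size_t i = 0; i < n; ++i) b[i + 1] = b[i] + c[i];

    // ConvexHull: upper hull of the points (k, b[k]), which are already
    // sorted by k, so a single monotone-chain pass is enough.
    std::vector<std::size_t> hull;
    for (std::size_t k = 0; k <= n; ++k) {
        while (hull.size() >= 2) {
            const std::size_t p = hull[hull.size() - 2];
            const std::size_t q = hull[hull.size() - 1];
            // Drop q when slope(p, q) <= slope(q, k), compared without division.
            const long long lhs = (b[q] - b[p]) * static_cast<long long>(k - q);
            const long long rhs = (b[k] - b[q]) * static_cast<long long>(q - p);
            if (lhs <= rhs) hull.pop_back(); else break;
        }
        hull.push_back(k);
    }

    // Derivative: the slope of each hull edge is the average of c over
    // that run; negate it to obtain a.
    std::vector<double> a(n);
    for (std::size_t h = 0; h + 1 < hull.size(); ++h) {
        const std::size_t p = hull[h], q = hull[h + 1];
        const double avg = static_cast<double>(b[q] - b[p]) / static_cast<double>(q - p);
        for (std::size_t i = p; i < q; ++i) a[i] = -avg;
    }
    return a;
}
```

For c = {12, 10, 10, 17, 6, 3, 9} the hull keeps the points at 0, 4 and 7, giving a = -12.25 for the first four entries and -6 for the last three.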
[Figure: visualization of the example array processing]

Finding repeating signed integers with O(n) in time and O(1) in space

(This is a generalization of: Finding duplicates in O(n) time and O(1) space)
Problem: Write a C++ or C function with time and space complexities of O(n) and O(1) respectively that finds the repeating integers in a given array without altering it.
Example: Given {1, 0, -2, 4, 4, 1, 3, 1, -2} function must print 1, -2, and 4 once (in any order).
EDIT: The following solution requires a duo-bit (to represent 0, 1, and 2) for each integer in the range of the minimum to the maximum of the array. The number of necessary bytes (regardless of array size) never exceeds (INT_MAX – INT_MIN)/4 + 1.
#include <stdio.h>
#include <stdlib.h>

void set_min_max(int a[], long long unsigned size,
                 int* min_addr, int* max_addr)
{
    long long unsigned i;

    if(!size) return;
    *min_addr = *max_addr = a[0];
    for(i = 1; i < size; ++i)
    {
        if(a[i] < *min_addr) *min_addr = a[i];
        if(a[i] > *max_addr) *max_addr = a[i];
    }
}

void print_repeats(int a[], long long unsigned size)
{
    long long unsigned i;
    int min = 0, max = 0;
    long long diff, q, r;
    unsigned char* duos; /* unsigned, so the top duo-bit cannot overflow */

    set_min_max(a, size, &min, &max);
    diff = (long long)max - (long long)min;
    duos = calloc(diff / 4 + 1, 1);
    if(!duos) return;
    for(i = 0; i < size; ++i)
    {
        diff = (long long)a[i] - (long long)min; /* index of duo-bit
                                                    corresponding to a[i]
                                                    in sequence of duo-bits */
        q = diff / 4; /* index of byte containing duo-bit in "duos" */
        r = diff % 4; /* offset of duo-bit */
        switch( (duos[q] >> (6 - 2*r )) & 3 )
        {
            case 0: duos[q] += (1 << (6 - 2*r)); /* first sighting */
                    break;
            case 1: duos[q] += (1 << (6 - 2*r)); /* second sighting: report once */
                    printf("%d ", a[i]);
        }
    }
    free(duos);
}

int main(void)
{
    int a[] = {1, 0, -2, 4, 4, 1, 3, 1, -2};
    print_repeats(a, sizeof(a)/sizeof(int));
    return 0;
}
Big-O notation takes a function f(x) as its argument: as the variable x tends to infinity, there exists a constant K such that the objective cost function is smaller than K·f(x). Typically f is chosen to be the smallest simple function that satisfies the condition. (It's pretty obvious how to lift the above to multiple variables.)
This matters because that K, which you aren't required to specify, allows a whole multitude of complex behavior to be hidden out of sight. For example, if the core of the algorithm is O(n^2), it allows all sorts of other O(1), O(log n), O(n), O(n log n), O(n^(3/2)), etc. supporting bits to be hidden, even if for realistic input data those parts are what actually dominate. That's right, it can be completely misleading! (Some of the fancier bignum algorithms have this property for real. Lying with mathematics is a wonderful thing.)
So where is this going? Well, you can assume that int is a fixed size easily enough (e.g., 32-bit) and use that information to skip a lot of trouble and allocate fixed size arrays of flag bits to hold all the information that you really need. Indeed, by using two bits per potential value (one bit to say whether you've seen the value at all, another to say whether you've printed it) then you can handle the code with fixed chunk of memory of 1GB in size. That will then give you enough flag information to cope with as many 32-bit integers as you might ever wish to handle. (Heck that's even practical on 64-bit machines.) Yes, it's going to take some time to set that memory block up, but it's constant so it's formally O(1) and so drops out of the analysis. Given that, you then have constant (but whopping) memory consumption and linear time (you've got to look at each value to see whether it's new, seen once, etc.) which is exactly what was asked for.
It's a dirty trick though. You could also try scanning the input list first to work out the range of values, allowing less memory to be used in the normal case; again, that adds only linear time, and you can strictly bound the memory required as above, so it's still constant. Yet more trickiness, but formally legal.
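To make the memory figures concrete, here is a minimal sketch of the size calculation for a range-restricted flag table. The helper name flag_bytes is hypothetical, and the code assumes two flag bits per possible value, as described above.

```c
#include <assert.h>

/* Bytes needed to hold two flag bits for every value in [min, max].
   flag_bytes is a hypothetical helper, not part of the code in this answer. */
unsigned long long flag_bytes(int min, int max)
{
    assert(min <= max);
    /* Number of distinct values in the range; computed in long long
       so that max - min cannot overflow an int. */
    unsigned long long span =
        (unsigned long long)((long long)max - (long long)min) + 1;
    return (span * 2 + 7) / 8;   /* 2 bits each, rounded up to whole bytes */
}
```

With a 32-bit int, the full range INT_MIN..INT_MAX needs 1073741824 bytes (the 1GB mentioned above), while an input whose values all lie in -2..4 needs only 2 bytes.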
[EDIT] Sample C code (this is not C++, but I'm not good at C++; the main difference would be in how the flag arrays are allocated and managed):
#include <stdio.h>
#include <stdlib.h>

// Bit fiddling magic
int is(unsigned *ary, unsigned int value) {
    return (ary[value>>5] >> (value&31)) & 1u;
}
void set(unsigned *ary, unsigned int value) {
    ary[value>>5] |= 1u<<(value&31);
}

// Main loop
void print_repeats(int a[], unsigned size) {
    unsigned *seen, *done;
    unsigned i;

    // One bit per possible 32-bit value in each array (512MB apiece)
    seen = calloc(134217728, sizeof(unsigned));
    done = calloc(134217728, sizeof(unsigned));
    if (!seen || !done) { free(seen); free(done); return; }

    for (i=0; i<size; i++) {
        if (is(done, (unsigned) a[i]))
            continue;               // already printed
        if (is(seen, (unsigned) a[i])) {
            set(done, (unsigned) a[i]);
            printf("%d ", a[i]);
        } else
            set(seen, (unsigned) a[i]);
    }
    free(seen);
    free(done);
}

int main(void) {
    int a[] = {1,0,-2,4,4,1,3,1,-2};
    print_repeats(a, sizeof(a)/sizeof(int));
    return 0;
}
Since you have an array of integers, you can use the straightforward solution of sorting the array (you didn't say it can't be modified) and printing the duplicates. Integer arrays can be sorted in O(n) time and O(1) space using radix sort. Although radix sort in general might require O(n) space, the in-place binary MSD radix sort can be trivially implemented using O(1) space (look here for more details).
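A minimal sketch of that approach, assuming 32-bit elements and an array that may be modified. The function names key, msd_sort and repeats_by_sorting are made up for this example. The recursion depth is bounded by the number of bits (32), so the extra space does not grow with n.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Map a signed value to an unsigned key whose order matches signed order. */
static uint32_t key(int32_t v) { return (uint32_t)v ^ UINT32_C(0x80000000); }

/* In-place binary MSD radix sort of a[lo..hi) on bit `bit` and below. */
static void msd_sort(int32_t a[], size_t lo, size_t hi, int bit)
{
    assert(lo <= hi);
    if (hi - lo < 2 || bit < 0) return;
    size_t i = lo, j = hi;
    while (i < j) {
        if ((key(a[i]) >> bit) & 1u) {   /* bit set: swap into the upper part */
            --j;
            int32_t t = a[i]; a[i] = a[j]; a[j] = t;
        } else {
            ++i;                         /* bit clear: stays in the lower part */
        }
    }
    msd_sort(a, lo, i, bit - 1);
    msd_sort(a, i, hi, bit - 1);
}

/* Sorts a[] in place, prints each repeated value once,
   and returns the number of distinct repeated values. */
size_t repeats_by_sorting(int32_t a[], size_t n)
{
    size_t count = 0;
    msd_sort(a, 0, n, 31);
    for (size_t i = 1; i < n; ++i) {
        /* Report a value only at its second occurrence in the sorted run. */
        if (a[i] == a[i - 1] && (i < 2 || a[i - 2] != a[i])) {
            printf("%d ", (int)a[i]);
            ++count;
        }
    }
    return count;
}
```

Called on {1, 0, -2, 4, 4, 1, 3, 1, -2}, this leaves the array sorted and reports the three repeated values -2, 1 and 4.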
The O(1) space constraint is intractable.
The very fact of printing the array itself requires O(N) storage, by definition.
Now, feeling generous, I'll give you that you can have O(1) storage for a buffer within your program and consider that the space taken outside the program is of no concern to you, and thus that the output is not an issue...
Still, the O(1) space constraint feels intractable, because of the immutability constraint on the input array. It might not be, but it feels so.
And your solution overflows because you try to store O(N) of information in a finite datatype.
There is a tricky problem with definitions here. What does O(n) mean?
Konstantin's answer claims that the radix sort time complexity is O(n). In fact it is O(n log M), where the base of the logarithm is the radix chosen, and M is the range of values that the array elements can have. So, for instance, a binary radix sort of 32-bit integers will have log M = 32.
So this is still, in a sense, O(n), because log M is a constant independent of n. But if we allow this, then there is a much simpler solution: for each integer in the range (all 4294967296 of them), go through the array to see if it occurs more than once. This is also, in a sense, O(n), because 4294967296 is also a constant independent of n.
I don't think my simple solution would count as an answer. But if not, then we shouldn't allow the radix sort, either.
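For concreteness, the exhaustive scan described above might look like the sketch below. The function name repeats_by_range_scan and the lo/hi parameters are introduced only for this example; passing INT_MIN and INT_MAX gives the "all 4294967296 values" version, which is correct but impractically slow.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* For every value v in [lo, hi], scan the whole array and report v if it
   occurs more than once. The input is never modified and only a few
   scalar variables are used. Returns the number of values reported. */
size_t repeats_by_range_scan(const int a[], size_t n, int lo, int hi)
{
    assert(lo <= hi);
    size_t repeated = 0;
    /* long long so that ++v cannot overflow when hi == INT_MAX */
    for (long long v = lo; v <= hi; ++v) {
        size_t seen = 0;
        for (size_t i = 0; i < n && seen < 2; ++i)
            if (a[i] == v) ++seen;
        if (seen == 2) {
            printf("%lld ", v);
            ++repeated;
        }
    }
    return repeated;
}
```

The running time is (hi - lo + 1) passes over the array, which is "O(n)" only in the same stretched sense as the radix sort bound.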
I doubt this is possible. Assuming there is a solution, let's see how it works. I'll try to be as general as I can and show that it can't work... So, how does it work?
Without loss of generality, we can say we process the array k times, where k is fixed. The solution should also work when there are m duplicates, with m >> k. Thus, in at least one of the passes, we should be able to output x duplicates, where x grows as m grows. To do so, some useful information must have been computed in a previous pass and stored in the O(1) storage. (The array itself can't be used; that would give O(n) storage.)
The problem: we have O(1) of information, and when we walk over the array we have to identify x numbers (to output them). We need O(1) storage that can tell us in O(1) time whether an element is in it. Or, said differently, we need a data structure to store n booleans (of which x are true) that uses O(1) space and takes O(1) time to query.
Does this data structure exist? If not, then we can't find all duplicates in an array with O(n) time and O(1) space (or there is some fancy algorithm that works in a completely different manner).
I really don't see how you can have only O(1) space and not modify the initial array. My guess is that you need an additional data structure. For example, what is the range of the integers? If it's 0..N like in the other question you linked, you can have an additional count array of size N. Then, in O(N), traverse the original array and increment the counter at the position of the current element. Then traverse the count array and print the numbers with count >= 2. Something like:
int* counts = new int[N]();      // value-initialized to zero
for(int i = 0; i < N; i++) {
    counts[a[i]]++;              // a is the input array; values assumed to lie in 0..N-1
}
for(int i = 0; i < N; i++) {
    if(counts[i] >= 2) cout << i << " ";
}
delete [] counts;
Say you can use the fact that you are not using all the space you have. You only need one more bit per possible value, and you have lots of unused bits in your 32-bit int values.
This has serious limitations, but works in this case. Numbers have to be between -n/2 and n/2 and if they repeat m times, they will be printed m/2 times.
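#include <stdio.h>
#include <stdint.h>

/* Assumes 32-bit elements; bit 31 of a[pos] is borrowed as a "seen" flag.
   The array is modified in the process. */
void print_repeats(int32_t a[], unsigned size) {
    const uint32_t topbit = UINT32_C(1) << 31, mask = ~topbit;
    uint32_t i, low, pos;
    int32_t val;

    for (i = 0; i < size; i++)
        a[i] = (int32_t)((uint32_t)a[i] & mask);    /* clear all flags */
    for (i = 0; i < size; i++) {
        low = (uint32_t)a[i] & mask;
        if (low <= mask/2) {
            val = (int32_t)low;                     /* non-negative value */
            pos = low;
        } else {
            val = (int32_t)(low | topbit);          /* restore negative value */
            pos = size + val;
        }
        if (a[pos] < 0) {                           /* flag set: seen before */
            printf("%d\n", (int)val);
            a[pos] = (int32_t)((uint32_t)a[pos] & mask);
        } else {
            a[pos] = (int32_t)((uint32_t)a[pos] | topbit);
        }
    }
}

int main(void) {
    int32_t a[] = {1, 0, -2, 4, 4, 1, 3, 1, -2};
    print_repeats(a, sizeof (a) / sizeof (a[0]));
    return 0;
}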