Space Complexity of an initialised pointer data structure - c++

I have got a question.
In terms of theoretical computer science, when we analyse an algorithm, if the algorithm initialises a new data structure, then we count that data structure towards its space complexity.
Now I am not too sure about the following case.
Let's say I have an array of int and I would like to map its elements to each other using a map of int pointers, such as:
std::map<int*, int*> mymap;
for (int i = 1; i < arraySize; i++) {
    mymap[&arr[i-1]] = &arr[i];
}
If this algorithm were not using pointers, we could clearly state that it initialises a map of size n, hence the space complexity is O(n). However, in this case, where we are using pointers, what is the space complexity of this algorithm?

The space complexity of a single pointer is the same as that of any other primitive - i.e. O(1).
std::map<K,V> is implemented as a tree of N nodes. Its space complexity is O(N*space-complexity-of-one-node), so the total space complexity in your case is O(N).
Note that the big-O notation factors out the constant multiplier: although the big-O space complexity of an std::map<Ptr1,Ptr2> and std::vector<Ptr1> is the same, the multiplier for the map is higher, because tree construction imposes its overhead for storing tree nodes and connections among them.
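To illustrate the constant per-node cost, here is a rough sketch (not a formal argument; the exact sizes are implementation-defined):
#include <iostream>
#include <map>

int main()
{
    // Each entry of std::map<int*, int*> holds two pointers plus a constant
    // amount of tree bookkeeping, independent of what the pointers point to.
    std::cout << "size of one key/value pair: "
              << sizeof(std::pair<int* const, int*>) << " bytes\n";
    // For N entries the map allocates N nodes of constant size, hence O(N) space.
}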


Multiple insertions and deletion in priority_queue

Problem
In my algorithm I have to implement an "ordered queue" (I chose this name to distinguish the idea in my mind from existing implementations). I have to insert some values into a queue, where the value represents the order in the queue, and then I have to digest the queue following that order. I have the feeling that the best data structure for what I need is std::priority_queue, but I have some concerns about the efficiency of my program, in particular due to:
Interface which does not provide methods for insertions/deletions of multiple elements
(Possibly) Internal design of the class and its algorithms
From the documentation, both priority_queue::push and priority_queue::pop internally call std::push_heap and std::pop_heap, which both have complexity O(log(N)). I think it is very inefficient to insert/delete one element at a time, re-sorting the underlying container at every call.
Why is it implemented in this way? Maybe when you call std::push_heap and std::pop_heap in sequence the underlying heap structure is already in the optimal state and the complexity is reduced with respect to O(log(N))?
Otherwise, is there a better data structure that fits my needs that I have not considered? I also thought std::forward_list could fulfil my needs on deletion (through forward_list::pop_front), but I fear that insertion becomes too expensive, as I would have to find the iterator for the correct insertion point, which should be O(N).
I would prefer not to rely on any external library (Boost included) because the project must be lightweight and dependency-free.
Implementation
The program is equivalent to:
#include <cstdlib>
#include <queue>
#include <vector>

struct MyType {
    double when;
    int who;
    MyType(double t, int i) : when(t), who(i) {}
    bool operator<(const MyType& other) const { return when < other.when; }
};

using OrderedQueue = std::priority_queue<MyType, std::vector<MyType>, std::less<MyType>>;

const double TMax = 1e9; // some BIG stopping condition

double some_time() { /* routine to generate the time */ return TMax * (rand() / (double)RAND_MAX); }
int some_number() { /* routine to generate the number */ return rand() % 100; }

void populate(OrderedQueue& q) {
    unsigned Ni = 10; // number of insertions: it is not fixed in the real program
    for (unsigned i = 0; i < Ni; ++i) {
        q.emplace(some_time(), some_number());
    }
}

void use_MyType(MyType m) { /* routine that uses the top value */ }

void remove(double t, OrderedQueue& q) {
    while (!q.empty() && q.top().when < t) {
        use_MyType(q.top());
        q.pop();
    }
}

int main() {
    double t = 0;
    OrderedQueue q;
    while (t < TMax) {
        populate(q);
        remove(t, q);
        t += 1;
    }
}
I am particularly interested in the efficiency of populate() and remove(), because the loop in which they are called has very many iterations.
std::priority_queue is an adaptor for a heap structure. Given your requirements of consuming elements in order one by one, a heap is the most efficient structure.
Heap insertions are worst case O(log(N)), but are on average O(1). This is faster than e.g. a binary tree (std::map) insertion, which is always O(log(N)). Similarly, removing the top element from a heap is worst case O(log(N)), but on average much faster since a heap is partially sorted.
With that said, the effects of branch prediction and caching in modern computers cannot be neglected. The best way to answer a performance question is to benchmark it with your actual data and a representative number of elements. I would suggest benchmarking these 3 queue structures:
std::priority_queue<MyType, std::vector<MyType>>
std::priority_queue<MyType, std::deque<MyType>>
std::multiset<MyType>
std::deque as the backing store may offer improved pop_front performance, at the expense of slower random access. So it should be benchmarked.
I would disregard std::list (std::forward_list) at this point - inserting into a linked list at the right place is O(N), plus a linked list isn't cache-friendly, so is definitely going to be a much slower solution.
For more details on Heap vs Binary Tree performance, see this related question.
To address your concerns:
Interface which does not provide methods for insertions/deletions of multiple elements
Inserting an element into a heap involves appending the element at the end and "repairing" the heap structure. This is what the std::push_heap algorithm does. It is entirely feasible to implement an algorithm to insert multiple elements this way and/or simply invoke std::make_heap after appending multiple elements to repair the entire heap.
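For illustration, a minimal sketch of that idea (a free function over a std::vector used as heap storage; the function name is mine, not part of any standard API):
#include <algorithm>
#include <vector>

// Append k new elements, then repair the heap property in one pass.
// std::make_heap over the whole vector is O(N + k); k separate
// std::push_heap calls would be O(k log N).
template <typename T>
void bulkInsert(std::vector<T>& heap, const std::vector<T>& newItems)
{
    heap.insert(heap.end(), newItems.begin(), newItems.end());
    std::make_heap(heap.begin(), heap.end());
}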
Removing multiple elements from a heap isn't possible, since a heap is only sorted with respect to the first (top) element. After removing it, the heap structure needs to be adjusted to find the next top element. This is what the std::pop_heap algorithm does.
Internal design of the class and its algorithms
std::priority_queue is just an adapter around the heap algorithms. It's a convenience class that wraps a sequential container and invokes the heap algorithms on it. You don't have to use it: you can use std::vector, std::push_heap and std::pop_heap with the exact same results (though the code might be less readable and more error-prone).
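As a small self-contained sketch of that equivalence (using int elements rather than the MyType from the question):
#include <algorithm>
#include <iostream>
#include <vector>

int main()
{
    std::vector<int> heap;

    for (int x : {5, 1, 9, 3}) {
        heap.push_back(x);
        std::push_heap(heap.begin(), heap.end());   // what priority_queue::push does
    }

    while (!heap.empty()) {
        std::cout << heap.front() << ' ';           // the top element (largest)
        std::pop_heap(heap.begin(), heap.end());    // moves the top to the back
        heap.pop_back();                            // what priority_queue::pop does
    }
    std::cout << '\n';                              // prints: 9 5 3 1
}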

Which container is most efficient for multiple insertions / deletions in C++?

I was set a homework challenge as part of an application process (I was rejected, by the way; I wouldn't be writing this otherwise) in which I was to implement the following functions:
// Store a collection of integers
class IntegerCollection {
public:
    // Insert one entry with value x
    void Insert(int x);

    // Erase one entry with value x, if one exists
    void Erase(int x);

    // Erase all entries, x, from <= x < to
    void Erase(int from, int to);

    // Return the count of all entries, x, from <= x < to
    size_t Count(int from, int to) const;
The functions were then put through a bunch of tests, most of which were trivial. The final test was the real challenge as it performed 500,000 single insertions, 500,000 calls to count and 500,000 single deletions.
The member variables of IntegerCollection were not specified and so I had to choose how to store the integers. Naturally, an STL container seemed like a good idea and keeping it sorted seemed an easy way to keep things efficient.
Here is my code for the four functions using a vector:
// Previous bit of code shown goes here
private:
    std::vector<int> integerCollection;
};

void IntegerCollection::Insert(int x) {
    /* using lower_bound to find the right place for x to be inserted
       keeps the vector sorted and makes life much easier */
    auto it = std::lower_bound(integerCollection.begin(), integerCollection.end(), x);
    integerCollection.insert(it, x);
}

void IntegerCollection::Erase(int x) {
    // find the location of the first element containing x and delete if it exists
    auto it = std::find(integerCollection.begin(), integerCollection.end(), x);
    if (it != integerCollection.end()) {
        integerCollection.erase(it);
    }
}

void IntegerCollection::Erase(int from, int to) {
    if (integerCollection.empty()) return;
    // lower_bound points to the first element of integerCollection >= from/to
    auto fromBound = std::lower_bound(integerCollection.begin(), integerCollection.end(), from);
    auto toBound = std::lower_bound(integerCollection.begin(), integerCollection.end(), to);
    /* std::vector::erase deletes entries between the two iterators
       fromBound (included) and toBound (not included) */
    integerCollection.erase(fromBound, toBound);
}

size_t IntegerCollection::Count(int from, int to) const {
    if (integerCollection.empty()) return 0;
    int count = 0;
    // lower_bound points to the first element of integerCollection >= from/to
    auto fromBound = std::lower_bound(integerCollection.begin(), integerCollection.end(), from);
    auto toBound = std::lower_bound(integerCollection.begin(), integerCollection.end(), to);
    // increment the iterator until fromBound == toBound (we don't count elements of value = to)
    while (fromBound != toBound) {
        ++count;
        ++fromBound;
    }
    return count;
}
The company got back to me saying that they wouldn't be moving forward because my choice of container meant the runtime complexity was too high. I also tried using list and deque and compared the runtimes. As I expected, I found that list was dreadful and that vector had the edge over deque. So as far as I was concerned I had made the best of a bad situation, but apparently not!
I would like to know: what is the correct container to use in this situation? deque only makes sense if I can guarantee insertion or deletion at the ends of the container, and list hogs memory. Is there something else that I'm completely overlooking?
We cannot know what would make the company happy. If they reject std::vector without concrete reasoning, I wouldn't want to work for them anyway. Moreover, we don't really know the precise requirements. Were you asked to provide one reasonably well performing implementation? Did they expect you to squeeze out the last percent of the provided benchmark by profiling a bunch of different implementations?
The latter is probably too much for a homework challenge as part of an application process. If it is the former, you can:
Roll your own. It is unlikely that the interface you were given can be implemented more efficiently than one of the std containers does... unless your requirements are so specific that you can write something that performs well under that specific benchmark.
Use std::vector for data locality. See e.g. here for Bjarne himself advocating std::vector rather than linked lists.
Use std::set for ease of implementation. It seems like you want the container sorted, and the interface you have to implement fits that of std::set quite well (see the sketch at the end of this answer).
Let's compare only insertion and erasure, assuming the container needs to stay sorted:
operation    std::set     std::vector
insert       O(log N)     O(N)
erase        O(log N)     O(N)
Note that the log(N) for the binary_search to find the position to insert/erase in the vector can be neglected compared to the N.
Now you have to consider that the asymptotic complexity listed above completely neglects the non-linearity of memory access. In reality, data can be far away in memory (std::set), leading to many cache misses, or it can be local, as with std::vector. The log(N) only wins for huge N. To get an idea of the difference: 500000/log(500000) is roughly 26410, while 1000/log(1000) is only ~100.
I would expect std::vector to outperform std::set for reasonably small container sizes, but at some point the log(N) wins over the cache. The exact location of this turning point depends on many factors and can only reliably be determined by profiling and measuring.
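For reference, a minimal sketch of the std::set route suggested above (a std::multiset, since duplicates must be allowed; note that Count and the range Erase still visit every element in the range, because multiset iterators are not random access):
#include <cstddef>
#include <iterator>
#include <set>

class IntegerCollection {
public:
    void Insert(int x) { values.insert(x); }                       // O(log N)

    void Erase(int x) {                                            // O(log N)
        auto it = values.find(x);
        if (it != values.end()) values.erase(it);                  // erase one entry only
    }

    void Erase(int from, int to) {                                 // O(log N + number erased)
        values.erase(values.lower_bound(from), values.lower_bound(to));
    }

    std::size_t Count(int from, int to) const {                    // O(log N + number counted)
        return std::distance(values.lower_bound(from), values.lower_bound(to));
    }

private:
    std::multiset<int> values;
};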
Nobody knows which container is MOST efficient for multiple insertions / deletions. That is like asking what the most fuel-efficient car engine design possible is. People are always innovating on car engines; they make more efficient ones all the time. However, I would recommend a splay tree. The time required for an insertion or deletion in a splay tree is not constant: some insertions take a long time and some take only a very short time. However, the average time per insertion/deletion is guaranteed to be O(log n), where n is the number of items being stored in the splay tree. Logarithmic time is extremely efficient and should be good enough for your purposes.
The first thing that comes to mind is to hash the integer value so single lookups can be done in constant time.
The integer value can be hashed to compute an index into an array of bools or bits, used to tell whether the integer value is in the container or not.
Counting and deleting large ranges could be sped up from there by using multiple hash tables for specific integer ranges.
If you had 0x10000 hash tables, each storing ints from 0 to 0xFFFF, and you were using 32-bit integers, you could shift out the upper half of the int value and use it as an index to find the correct hash table to insert / delete values from.
IntHashTable containers[0x10000];

uint32_t hashIndex    = (uint32_t)value / 0x10000;                // upper 16 bits select the table
uint32_t valueInTable = (uint32_t)value - (hashIndex * 0x10000);  // lower 16 bits are stored in it
containers[hashIndex].insert(valueInTable);
Count, for example, could be implemented like so, if each hash table kept a count of the number of elements it contains:
int indexStart = startRange / 0x10000;
int indexEnd   = endRange   / 0x10000;
int countTotal = 0;
// note: the first and last tables only partially overlap the range,
// so an exact count would need a partial count inside those two tables
for (int i = indexStart; i <= indexEnd; ++i) {
    countTotal += containers[i].count();
}
Not sure if sorting really is a requirement for removing the range; it might be based on position. Anyway, here is a link with some hints on which STL container to use.
In which scenario do I use a particular STL container?
Just FYI.
Vector may be a good choice, but it does a lot of reallocation, as you know. I prefer deque instead, as it doesn't require one big chunk of contiguous memory for all the items. For a requirement like yours, list would probably fit better.
A basic solution for this problem might be std::map<int, int>,
where the key is the integer you are storing and the value is the number of occurrences.
The problem with this is that you cannot quickly remove/count ranges; in other words, the complexity is linear.
For a quick count you would need to implement your own complete binary tree in which you can compute the number of nodes between two nodes (the upper- and lower-bound nodes): because you know the size of the tree and how many left and right turns you took to reach the upper- and lower-bound nodes, the count can be derived without visiting every node. Note that we are talking about a complete binary tree; in a general binary tree you cannot make this calculation fast.
For quick range removal I do not know how to make it faster than linear.
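For what it's worth, one concrete way to get fast range counts when the stored values are bounded is a Fenwick (binary indexed) tree of occurrence counts indexed by value. This is a different structure from the complete binary tree sketched above, and the class and method names here are purely illustrative:
#include <cstddef>
#include <vector>

class FenwickCounter {
public:
    explicit FenwickCounter(std::size_t maxValue) : tree(maxValue + 2, 0) {}

    // Record one insertion (delta = +1) or one erasure (delta = -1) of 'value'.
    void add(std::size_t value, long long delta) {
        for (std::size_t i = value + 1; i < tree.size(); i += i & (0 - i))
            tree[i] += delta;
    }

    // Number of stored values x with from <= x < to, in O(log maxValue).
    long long count(std::size_t from, std::size_t to) const {
        return prefix(to) - prefix(from);
    }

private:
    // Number of stored values x with x < bound.
    long long prefix(std::size_t bound) const {
        long long sum = 0;
        for (std::size_t i = bound; i > 0; i -= i & (0 - i))
            sum += tree[i];
        return sum;
    }

    std::vector<long long> tree;
};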

Space complexity of an array of pairs

So I'm wondering what the space complexity of an array of integer pairs is?
std::pair<int,int> arr[n];
I'm thinking that, because a pair is constant and the array is n, the space complexity is O(2) * O(n) = O(2n) = O(n). Or is the space complexity O(n^2) because the array of pairs is still essentially a 2D array?
The correct space complexity is O(n).
The fact that it superficially resembles a 2D array is immaterial: the magnitude of the second dimension is known, and as such, it remains O(n). This would also be true if the pairs were, instead, 100-element arrays. Because the dimensions of the elements (each a 100-element array) are known, the space complexity of the structure is O(100 * n), which is O(n).
Conversely, however, if the elements were instead explicitly always the same size as the container as a whole, i.e. if this were something like the following:
int n = /*...*/;
std::vector<std::vector<int>> arr(n);
for (std::vector<int>& subarr : arr) {
    subarr.resize(n);
}
Then it would indeed be O(n^2) instead, because now both dimensions depend on an unknown quantity.
Conversely, if the second dimension were unknown but known to not be correlated to the first dimension, you'd instead express it as O(nm), i.e. an array constructed like this:
int n = /*...*/;
int m = /*...*/;
std::vector<std::vector<int>> arr(n);
for (std::vector<int>& subarr : arr) {
    subarr.resize(m);
}
Now this might seem contradictory: "But Xirema, you just said that if we knew the dimensions were n X 100 elements, it would be O(n), but if we write m in place of 100, would we not instead have O(nm), or O(100n), space complexity?"
But like I said: we remove known quantities. O(2n) is equivalent to O(5n) because all we care about is the unknowns. Once an unknown becomes known, we no longer include it when evaluating Space Complexity.
Space complexity (and Runtime Complexity, etc.) are intended to function as abstract representations of an algorithm or data structure. We use these concepts to work out, at a high level conception, how well they scale to larger and larger inputs. Two different data structures, one requiring 100 bytes per element, another requiring 4 bytes per element squared, will not have consistent space ranks between each other when scaling from a small environment to a large environment; in a smaller environment, the latter data structure will consume less memory, and in a larger environment, the former data structure will consume less memory. Space/Runtime Order complexity is just a shorthand for expressing that relationship, without needing to get bogged down in the details or semantics. If details or semantics are what you care about, then you're not going to just use the Order of the structure/algorithm, you're going to actually test and measure those different approaches.
The space taken is n * sizeof(std::pair<int, int>) bytes. sizeof(std::pair<int, int>) is a constant, and O(n * (constant)) == O(n).
The space complexity of an array can in general be said to be:
O(<size of array> * <size of each array element>)
Here you have:
std::pair<int,int> arr[n];
So arr is an array with n elements, and each element is a std::pair<int,int>. Let's suppose an int takes 4 bytes, so a pair of two ints should take 8 bytes (these numbers could be slightly different depending on the implementation, but that doesn't matter for the purposes of complexity calculation). So the complexity would be O(n * 8), which is the same as O(n), because constants do not make a difference in complexity.
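As a quick sanity check of that constant factor (the exact numbers are implementation-defined):
#include <cstddef>
#include <iostream>
#include <utility>

int main()
{
    const std::size_t n = 1000;
    std::cout << "sizeof(std::pair<int,int>) = " << sizeof(std::pair<int, int>) << " bytes\n";
    std::cout << "array of " << n << " pairs    = " << n * sizeof(std::pair<int, int>)
              << " bytes\n";   // grows linearly with n, hence O(n)
}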
When would you have something like O(n^2)? Well, you would need a multi-dimensional array. For example, something like this:
std::pair<int,int> arr[n][m];
Now arr is an array with n elements, but each element is in turn an array of m std::pair<int,int> elements. So you have O(n * <size of an array of m pairs>), which is to say O(n * m * 8), that is, O(n * m). If m happens to be the same as n, then you get O(n * n), or O(n^2).
As you can imagine, the same reasoning follows for any number of array dimensions.

Is there any array-like data structure that can grow in size on both sides?

I'm a student working on a small project for a high performance computing course, hence efficiency is a key issue.
Let's say that I have a vector of N floats and I want to remove the smallest n elements and the biggest n elements. There are two simple ways of doing this:
A
sort in ascending order // O(NlogN)
remove the last n elements // O(1)
invert elements order // O(N)
remove the last n elements // O(1)
B
sort in ascending order // O(NlogN)
remove the last n elements // O(1)
remove the first n elements // O(N)
In A inverting the elements order require swapping all the elements, while in B removing the first n elements require moving all the others to occupy the positions left empty. Using std::remove would give the same problem.
If I could remove the first n elements for free, then solution B would be cheaper. That should be easy to achieve if, instead of having a vector, i.e. an array with some empty space after vector::end(), I had a container with some free space also before vector::begin().
So the question is: does there already exist an array-like container (i.e. contiguous memory, no linked lists) in some library (STL, Boost) that allows O(1) insertion/removal at both sides of the array?
If not, do you think that there are better solutions than creating such a data structure?
Have you thought of using std::partition with a custom functor like the example below:
#include <iostream>
#include <vector>
#include <algorithm>

template<typename T>
class greaterLess {
    T low;
    T up;
public:
    greaterLess(T const &l, T const &u) : low(l), up(u) {}
    bool operator()(T const &e) const { return !(e < low || e > up); }
};

int main()
{
    std::vector<double> v{2.0, 1.2, 3.2, 0.3, 5.9, 6.0, 4.3};
    auto it = std::partition(v.begin(), v.end(), greaterLess<double>(2.0, 5.0));
    v.erase(it, v.end());

    for (auto i : v) std::cout << i << " ";
    std::cout << std::endl;

    return 0;
}
This way you would erase elements from your vector in O(N) time.
Try boost::circular_buffer:
It supports random access iterators, constant time insert and erase operations at the beginning or the end of the buffer and interoperability with std algorithms.
Having looked at the source, it seems (and is only logical) that data is kept as a contiguous memory block.
The one caveat is that the buffer has fixed capacity and after exhausting it elements will get overwritten. You can either detect such cases yourself and resize the buffer manually, or use boost::circular_buffer_space_optimized with a humongous declared capacity, since it won't allocate it if not needed.
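A minimal usage sketch, assuming Boost is acceptable despite the no-dependencies preference:
#include <boost/circular_buffer.hpp>
#include <iostream>

int main()
{
    boost::circular_buffer<float> buf(8);          // fixed capacity of 8 elements
    for (float x : {1.0f, 2.0f, 3.0f, 4.0f})
        buf.push_back(x);

    buf.pop_front();                               // O(1) removal at the front
    buf.pop_back();                                // O(1) removal at the back

    for (float x : buf)
        std::cout << x << ' ';                     // prints: 2 3
    std::cout << '\n';
}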
To shrink and grow a vector at both ends, you can use the idea of slices, reserving extra memory ahead of time at the front and back to expand into, if efficient growth is needed.
Simply make a class with not only a length but also indices for the first and last elements, plus a suitably sized vector, to create a window of data onto the underlying block of stored floats. A C++ class can provide inlined functions for things like deleting items, addressing into the array, finding the nth largest value, or shifting the slice values down or up to insert new elements while maintaining sorted order. Should no spare elements be available, dynamic allocation of a new, larger float store permits continued growth at the cost of an array copy.
A circular buffer is designed as a FIFO, with new elements added at the end, removal at the front, and no insertion in the middle; a self-defined class can also (trivially) support array subscript values different from 0..N-1.
Due to memory locality, the avoidance of excessive indirection through pointer chains, and the pipelining of subscript calculations on a modern processor, a solution based on an array (or a vector) is likely to be the most efficient, despite element copying on insertion. Deque would be suitable, but it does not guarantee contiguous storage.
As additional supplementary information, researching classes that provide slices finds some plausible alternatives to evaluate:
A) std::slice, which is used with std::valarray (producing slice_arrays)
B) Boost.Range
Hope this is the kind of specific information you were hoping for; in general a simpler, clearer solution is more maintainable than a tricky one. I would expect slices and ranges on sorted data sets to be quite common, for example when filtering experimental data where "outliers" are excluded as faulty readings.
I think a good solution should actually be O(NlogN) plus two O(1) removals, with any binary searches being O(logN + 1) if you filter on outlying values instead of deleting a fixed number of small or large values; it also matters that the constant behind the "O" is small, since sometimes an O(1) algorithm can in practice be slower than an O(N) one for practical values of N.
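A minimal sketch of the slice-window idea described above (the class name, member names and the amount of front slack are all illustrative choices, not prescribed by this answer):
#include <cstddef>
#include <vector>

class SliceWindow {
public:
    SliceWindow(const std::vector<float>& data, std::size_t frontSlack)
        : store(frontSlack, 0.0f), first(frontSlack)
    {
        store.insert(store.end(), data.begin(), data.end());
        last = store.size();                            // one past the last valid element
    }

    std::size_t size() const { return last - first; }

    float&       operator[](std::size_t i)       { return store[first + i]; }
    const float& operator[](std::size_t i) const { return store[first + i]; }

    void dropFront(std::size_t n) { first += n; }       // O(1): just move the index
    void dropBack(std::size_t n)  { last  -= n; }       // O(1): just move the index

private:
    std::vector<float> store;   // slack at the front, then the data
    std::size_t first;          // index of the first valid element
    std::size_t last;           // one past the last valid element
};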
As a complement to #40two's answer: before partitioning the array, you will need to find the partitioning pivots, which means you will need to find the nth smallest number and the nth greatest number in the unsorted array.
There is a discussion on that on SO: How to find the kth largest number in unsorted array
There are several algorithms to solve this problem. Some are deterministic O(N) - one of them is a variation on finding the median (median of medians). There are also some non-deterministic algorithms with O(N) average-case complexity.
A good source book to find those algorithms is Introduction to Algorithms.
So eventually, your code will run in O(N) time.
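A possible sketch of that combined approach, using std::nth_element (average O(N)) to find the cut values and a partition to drop the outliers (the function name is mine; with duplicate values at the cut points the number of kept elements can differ slightly from N - 2n):
#include <algorithm>
#include <cstddef>
#include <vector>

void trimOutliers(std::vector<float>& v, std::size_t n)
{
    if (v.size() <= 2 * n) { v.clear(); return; }

    std::nth_element(v.begin(), v.begin() + n, v.end());
    float low = v[n];                                   // smallest value to keep

    std::nth_element(v.begin(), v.end() - n - 1, v.end());
    float high = v[v.size() - n - 1];                   // largest value to keep

    auto keep = [low, high](float x) { return x >= low && x <= high; };
    v.erase(std::partition(v.begin(), v.end(), keep), v.end());
}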

How does one remove duplicate elements in place in an array in O(n) in C or C++?

Is there any method to remove the duplicate elements in an array in place in C/C++ in O(n)?
Suppose elements are a[5]={1,2,2,3,4}
then resulting array should contain {1,2,3,4}
The solution can be achieved using two for loops but that would be O(n^2) I believe.
If, and only if, the source array is sorted, this can be done in linear time:
std::unique(a, a + 5); //Returns a pointer to the new logical end of a.
Otherwise you'll have to sort first, which is (99.999% of the time) n lg n.
Best case is O(n log n). Perform a heap sort on the original array: O(n log n) in time, O(1)/in-place in space. Then run through the array sequentially with 2 indices (source & dest) to collapse out repetitions. This has the side effect of not preserving the original order, but since "remove duplicates" doesn't specify which duplicates to remove (first? second? last?), I'm hoping that you don't care that the order is lost.
If you do want to preserve the original order, there's no way to do things in-place. But it's trivial if you make an array of pointers to elements in the original array, do all your work on the pointers, and use them to collapse the original array at the end.
Anyone claiming it can be done in O(n) time and in-place is simply wrong, modulo some arguments about what O(n) and in-place mean. One obvious pseudo-solution, if your elements are 32-bit integers, is to use a 4-gigabit bit-array (512 megabytes in size) initialized to all zeros, flipping a bit on when you see that number and skipping over it if the bit was already on. Of course then you're taking advantage of the fact that n is bounded by a constant, so technically everything is O(1) but with a horrible constant factor. However, I do mention this approach since, if n is bounded by a small constant - for instance if you have 16-bit integers - it's a very practical solution.
Yes. Because access (insertion or lookup) on a hashtable is O(1), you can remove duplicates in O(N).
Pseudocode:
hashtable h = {}
numdups = 0
for (i = 0; i < input.length; i++) {
    if (!h.contains(input[i])) {
        input[i - numdups] = input[i]
        h.add(input[i])
    } else {
        numdups = numdups + 1
    }
}
This is O(N).
Some commenters have pointed out that whether a hashtable is O(1) depends on a number of things. But in the real world, with a good hash, you can expect constant-time performance. And it is possible to engineer a hash that is O(1) to satisfy the theoreticians.
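In C++ terms, the pseudocode above maps onto std::unordered_set as the hash table (a sketch; the function name is mine):
#include <cstddef>
#include <unordered_set>
#include <vector>

// Keeps the first occurrence of each value, compacts the rest in place,
// and returns the new logical size.
std::size_t removeDuplicates(std::vector<int>& input)
{
    std::unordered_set<int> seen;
    std::size_t write = 0;                            // next slot for a unique value
    for (std::size_t read = 0; read < input.size(); ++read) {
        if (seen.insert(input[read]).second) {        // true if the value was not seen before
            input[write++] = input[read];
        }
    }
    input.resize(write);
    return write;
}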
I'm going to suggest a variation on Borealid's answer, but I'll point out up front that it's cheating. Basically, it only works assuming some severe constraints on the values in the array - e.g. that all keys are 32-bit integers.
Instead of a hash table, the idea is to use a bitvector. This is an O(1) memory requirement which should in theory keep Rahul happy (but won't). With the 32-bit integers, the bitvector will require 512MB (ie 2**32 bits) - assuming 8-bit bytes, as some pedant may point out.
As Borealid would point out, this is a hashtable - just using a trivial hash function. This does guarantee that there won't be any collisions. The only way there could be a collision is by having the same value in the input array twice - but since the whole point is to ignore the second and later occurrences, this doesn't matter.
Pseudocode for completeness...
src = dest = input.begin();

while (src != input.end())
{
    if (!bitvector[*src])
    {
        bitvector[*src] = true;
        *dest = *src;
        dest++;
    }
    src++;
}

// at this point, dest gives the new end of the array
Just to be really silly (but theoretically correct), I'll also point out that the space requirement is still O(1) even if the array holds 64-bit integers. The constant term is a bit big, I agree, and you may have issues with 64-bit CPUs that can't actually use the full 64 bits of an address, but...
Take your example. If the array elements are bounded integers, you can create a lookup bit-array.
If you find an integer such as 3, turn the 3rd bit on.
If you find an integer such as 5, turn the 5th bit on.
If the array contains elements other than integers, or the elements are not bounded, using a hashtable would be a good choice, since hashtable lookup cost is constant.
The canonical implementation of the unique() algorithm looks something like the following:
template<typename Fwd>
Fwd unique(Fwd first, Fwd last)
{
    if (first == last) return first;

    Fwd result = first;
    while (++first != last) {
        if (!(*result == *first))
            *(++result) = *first;
    }
    return ++result;
}
This algorithm takes a range of sorted elements. If the range is not sorted, sort it before invoking the algorithm. The algorithm will run in-place, and return an iterator pointing to one-past-the-last-element of the unique'd sequence.
If you can't sort the elements, then you've cornered yourself and have no choice but to use an algorithm with runtime performance worse than O(n) for the task.
This algorithm runs in O(n) runtime. That's big-oh of n, worst case in all cases, not amortized time. It uses O(1) space.
The example you have given is a sorted array. It is possible only in that case (given your constant space constraint)