min n elements with expensive or deleted default constructor - c++

Given an array v (some STL container, e.g. std::vector< double >) of generally unsorted data (say assert(std::is_same< typeof(v), V >::value);), with a comparison operator defined over its elements, say std::less. You need to create an array with the n minimal elements (copies from v), but the elements are not default constructible (or default construction is an expensive operation). How can this be done by means of the STL? A non-modifying sequence algorithm is required.
I originally thought of solving this with std::back_insert_iterator, but there is some confusion, as explained below:
assert(!std::is_default_constructible< typename V::value_type >::value); // assume
template< class V >
V min_n_elements(typename V::const_iterator begin, typename V::const_iterator end, typename V::size_type const n)
{
    assert(!(std::distance(begin, end) < n));
    V result; // V result(n); not allowed
    std::partial_sort_copy(begin, end, std::back_inserter(result), /*What should be here? mb something X(result.capacity())?*/, std::less< typename V::value_type >());
    return result;
}
I want to find a solution that is optimal in terms of time and memory (O(1) additional memory and time no worse than that of std::partial_sort_copy). In total, the algorithm should touch only the following memory: the v.size() elements of the non-modifiable source v as input, and n newly created elements, all of which are copies of the n smallest elements of the source array v, as output. That's all. I think these are realistic limits.

EDIT: reimplemented with heap:
template< class V >
V min_n_elements(typename V::const_iterator b, typename V::const_iterator e, typename V::size_type const n) {
    assert(std::distance(b, e) >= n);
    V res(b, b+n);
    std::make_heap(res.begin(), res.end());   // max-heap: res.front() is the largest kept element
    for (auto i=b+n; i<e; ++i) {
        if (*i < res.front()) {
            std::pop_heap(res.begin(), res.end());
            res.back() = *i;
            std::push_heap(res.begin(), res.end());
        }
    }
    return res;   // a plain return already moves (or elides the copy)
}

Unless you also need those elements sorted, it's probably easiest and fastest to use std::nth_element, then std::copy.
template <class InIter, class OutIter>
void min_n_elements(InIter b, InIter e, OutIter o,
                    typename std::iterator_traits<InIter>::difference_type n) {
    InIter pos = b+n;
    std::nth_element(b, pos, e);   // reorders [b, e); the n smallest end up in [b, pos)
    std::copy(b, pos, o);
}
std::nth_element not only finds the given element, but guarantees that the elements less than it are to its "left", and those greater are to its "right".
This does side-step the real problem a bit though -- instead of actually creating the container for the results, it simply expects the user to create a container of the correct type, and then provide an iterator (e.g., a back_insert_iterator) to put the data in the right place. At the same time, I think this is really the correct thing to do -- the algorithm to find N minimum elements and the choice of container for the destination are separate.
If you really want to put the result in a specific container type anyway, that shouldn't be terribly difficult though:
template <class V>
V n_min_element(typename V::iterator b, typename V::iterator e, typename V::difference_type n) {
    typename V::iterator pos = b+n;
    std::nth_element(b, pos, e);
    V ret(b, pos);
    return ret;
}
As they stand, these do modify the (order of elements in) the input, but given that you've said the input isn't sorted, I'm assuming their order doesn't matter, so that should be permissible. If you can't do that, the next possibility is probably to create a collection of pointers, and use a comparison function that compares based on the pointees, then do your nth_element on that, and finally copy the pointees to the new collection.
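The pointer-based fallback described above could be sketched roughly as follows. The function name min_n_elements_nonmodifying is made up for illustration, and the sketch assumes the source container has stable element addresses while it runs. Note that it spends O(N) extra memory on the pointer array, so it does not meet the O(1) additional-memory wish from the question.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: copies the n smallest elements of v into a new V
// without reordering v and without default-constructing any element.
template <class V>
V min_n_elements_nonmodifying(const V& v, std::size_t n) {
    assert(n <= v.size());
    typedef const typename V::value_type* Ptr;

    // Collect pointers to the source elements (O(N) extra memory).
    std::vector<Ptr> ptrs;
    ptrs.reserve(v.size());
    for (const auto& x : v) ptrs.push_back(&x);

    // Partition the pointers, comparing through the pointees.
    std::nth_element(ptrs.begin(), ptrs.begin() + static_cast<std::ptrdiff_t>(n),
                     ptrs.end(), [](Ptr a, Ptr b) { return *a < *b; });

    // Copy-construct the n smallest elements; their relative order is unspecified.
    V result;
    result.reserve(n);
    for (std::size_t i = 0; i < n; ++i) result.push_back(*ptrs[i]);
    return result;
}
```

The result holds the n smallest elements in no particular order; sort it afterwards if order matters.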


What is the fastest way to get the frequency of numbers in an array in C++?

My method creates an std::map<int, int> and populates it with the number and its frequency by iterating over the array once, but I'm wondering if there's a quicker way without using a map.
std::unordered_map<int,int> can count frequencies as well but its operator[] has complexity (cppreference):
Average case: constant, worst case: linear in size.
Compared to
Logarithmic in the size of the container.
with a std::map.
When the maximum number is small you can use an array, and directly count:
for (const auto& number : array) counter[number]++;
Admittedly, all this has already been said in the comments, so I'll also add this one: you need to measure. Complexity is only about asymptotic runtime, while for a given input size a std::map can actually be faster.
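As a minimal sketch of the direct-counting idea, assuming all values are non-negative and bounded by a known maximum (the function name count_small_values and the max_value parameter are made up for illustration):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: counter[k] is the number of occurrences of k.
// Assumes the interesting values lie in [0, max_value]; anything else is skipped.
std::vector<std::size_t> count_small_values(const std::vector<int>& values, int max_value) {
    assert(max_value >= 0);
    std::vector<std::size_t> counter(static_cast<std::size_t>(max_value) + 1, 0);
    for (int number : values) {
        if (number >= 0 && number <= max_value) {
            ++counter[static_cast<std::size_t>(number)];
        }
    }
    return counter;
}
```

This costs O(max_value) memory regardless of the input size, so it only pays off when the value range is small.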
NOTE: ValueType, DifferenceType are defined to be
template <std::input_iterator I>
using ValueType = typename std::iterator_traits<I>::value_type;
template <std::input_iterator I>
using DifferenceType = typename std::iterator_traits<I>::difference_type;
If the array is sorted, you can use std::equal_range to find the range of elements that is equal to x. With concepts you write:
// I2 is homomorphic to std::pair<I, unsigned>
// [first, last) is partially ordered with respect to I::value_type
// return value is d_first + |{x | x in [first, last)}|
// R is a relation over I, compare element using R
template <std::random_access_iterator I, std::forward_iterator I2,
          std::relation<ValueType<I>, ValueType<I>> R = std::less<ValueType<I>>>
requires(std::regular<ValueType<I>> &&
         std::is_constructible_v<ValueType<I2>, I, DifferenceType<I>>)
I2 frequency_sorted(I first, I last, I2 d_first, R r = R())
{
    while(first != last)
    {
        auto [left, right] = std::equal_range(first, last, *first, r);
        *d_first = {left, std::distance(left, right)};
        ++d_first; // advance the output, otherwise every entry overwrites the first one
        first = right;
    }
    return d_first;
}
If you have limited resources, you can truncate the result and have:
// I2 is homomorphic to std::pair<I, unsigned>
// [first, last) is partially ordered with respect to I::value_type
// return value is a pair, where the first element is
// the starting point of subsequence [first, last) where such
// subsequence is unevaluated
// the second element is
// - d_last if |{x | x in [first, last)}| >= d_last - d_first
// - d_first + |{x | x in [first, last)}| if otherwise
template <std::random_access_iterator I, std::forward_iterator I2,
          std::relation<ValueType<I>, ValueType<I>> R = std::less<ValueType<I>>>
requires(std::regular<ValueType<I>> &&
         std::is_constructible_v<ValueType<I2>, I, DifferenceType<I>>)
std::pair<I, I2>
frequency_sorted_truncate(I first, I last, I2 d_first, I2 d_last, R r = R())
{
    while(first != last && d_first != d_last)
    {
        auto [left, right] = std::equal_range(first, last, *first, r);
        *d_first = {left, std::distance(left, right)};
        ++d_first; // advance the output, otherwise every entry overwrites the first one
        first = right;
    }
    return {first, d_first};
}
These two functions allow you to pass in any relation, and the default comparison uses operator<.
If your array is unsorted and large enough, it might be a good idea to just sort the array and then use the algorithm. Hashing might be tempting, but it causes cache misses and might not be as fast as you would expect. You can try both methods and measure which one is faster; you are welcome to tell me the result.
My compiler version is g++ 11.2.11; I think the code can be compiled with any C++20 compiler. If you don't have one, simply replace the concepts part with typename. I think that by doing so you will only need a C++17 compiler (due to structured bindings).
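For the unsorted case, the sort-then-count idea can be sketched without concepts as below. This is a simplified stand-in rather than the functions above: the name frequencies_by_sorting is hypothetical, it works on a copy of the input, and it is fixed to int and operator<.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: returns (value, count) pairs in ascending value order.
// Takes the input by value so the caller's array is left untouched.
std::vector<std::pair<int, std::size_t>> frequencies_by_sorting(std::vector<int> values) {
    std::sort(values.begin(), values.end());
    std::vector<std::pair<int, std::size_t>> result;
    for (auto it = values.begin(); it != values.end();) {
        // Binary search for the end of the current run of equal values.
        auto run_end = std::upper_bound(it, values.end(), *it);
        result.emplace_back(*it, static_cast<std::size_t>(run_end - it));
        it = run_end;
    }
    return result;
}
```

Whether this beats a std::unordered_map on a given input is something only a measurement can tell.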
Please tell me whether my code can be improved.

Function on multiple vector

I have a sorting algorithm on a vector, and I want to apply it to several vectors, without knowing how many. The only thing I'm sure of is that there will be at least 1 vector (always the same one) on which I will perform my algorithm. The others will just follow.
Here's an example :
void sort(std::vector<int>& sortVector, std::vector<double>& follow1, std::vector<char>& follow2, ... ){
    for (int i = 1; i<sortVector.size(); ++i){
        if ( sortVector[i-1] > sortVector[i] ) { //I know it's not sorting here, it's only for the example
            std::swap(sortVector[i-1], sortVector[i]);
            std::swap(follow1[i-1], follow1[i]);
            std::swap(follow2[i-1], follow2[i]);
        }
    }
}
I was thinking about using a variadic function, but since it's a recursive function, I was wondering whether creating my va_arg list every time would take too much time (I'm working on vectors of 500 million to 1 billion elements). So does something else exist?
As I'm writing this question, I realize that maybe I'm fooling myself: perhaps there is no other way to achieve what I want, and a variadic function is maybe not that slow. (I really don't know, in fact.)
In fact, I'm doing an octree sort of my data so that it is usable in OpenGL.
Since my data is not always the same (e.g. OBJ files give me normals, PTS files give me intensity and colors, ...), I want to be able to reorder all my vectors (which contain my data) so that they have the same order as the position vector (the vector that contains the positions of my points; it will always be there).
All my vectors have the same length, and I want all my follower vectors to be reorganised in the same way as the first one.
If I have 3 vectors and I swap the first and third values in my first vector, I want to swap the first and third values in my 2 other vectors.
But my vectors are not all of the same type. Some will be std::vector<char>, others std::vector<Vec3>, std::vector<unsigned>, and so on.
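For illustration, the example above could be written with a variadic template instead of a C-style va_arg list. The parameter pack is expanded at compile time, so no argument list is built at run time. This is only a sketch: the names swap_in_all and bubble_pass are hypothetical, it needs C++17 for the fold expression, and like the example it performs a single pass rather than a full sort.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: swaps positions i and j in every vector passed in.
template <typename... Vectors>
void swap_in_all(std::size_t i, std::size_t j, Vectors&... vs) {
    using std::swap;
    (swap(vs[i], vs[j]), ...);   // C++17 fold expression over the pack
}

// One pass of adjacent swaps on key; every follower mirrors each swap.
// All vectors are assumed to have the same length.
template <typename T, typename... Followers>
void bubble_pass(std::vector<T>& key, Followers&... followers) {
    for (std::size_t i = 1; i < key.size(); ++i) {
        if (key[i - 1] > key[i]) {
            swap_in_all(i - 1, i, key, followers...);
        }
    }
}
```

The followers may have different element types, since each one is deduced separately.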
With range-v3, you may use zip, something like:
template <typename T, typename ... Ranges>
void sort(std::vector<T>& refVector, Ranges&& ... ranges){
    ranges::sort(ranges::view::zip(refVector, std::forward<Ranges>(ranges)...));
}
Or, if you don't want the follower ranges to take part in the comparison (to break ties in refVector), you can use a projection so that only refVector is compared:
template <typename T, typename ... Ranges>
void sort(std::vector<T>& refVector, Ranges&& ... ranges){
    // the projection is the third argument; the second is the comparator
    ranges::sort(ranges::view::zip(refVector, std::forward<Ranges>(ranges)...),
                 ranges::less{},
                 [](const auto& tup) -> const T& { return std::get<0>(tup); });
}
Although I totally agree with n.m.'s comment, I suggest using a vector of vectors that contains the follow vectors and then looping over all follow vectors.
void sort(std::vector<int>& vector, std::vector<std::vector<double>>& followers){
    for (int i = 1; i<vector.size(); ++i){
        if ( vector[i-1] > vector[i] ) {
            std::swap(vector[i-1], vector[i]);
            for (auto & follow : followers)
                std::swap(follow[i-1], follow[i]);
        }
    }
}
Nevertheless, as n.m. pointed out, perhaps think about putting all the data you would like to sort into a class-like structure. Then you can have a vector of your class and apply std::sort, see here.
struct MyStruct
{
    int key; //content of your int vector named "vector"
    double follow1;
    std::string follow2;
    // all your information of the follow vectors go here.
    MyStruct(int k, double d, const std::string& s) : key(k), follow1(d), follow2(s) {}
};
struct less_than_key
{
    inline bool operator() (const MyStruct& struct1, const MyStruct& struct2) const
    {
        return (struct1.key < struct2.key);
    }
};
std::vector < MyStruct > vec;
vec.push_back(MyStruct(4, 1.2, "test"));
vec.push_back(MyStruct(3, 2.8, "a"));
vec.push_back(MyStruct(2, 0.0, "is"));
vec.push_back(MyStruct(1, -10.5, "this"));
std::sort(vec.begin(), vec.end(), less_than_key());
The main problem here is that the std::sort algorithm cannot operate on multiple vectors at the same time.
For the purpose of demonstration, let's assume you have a std::vector<int> v1 and a std::vector<char> v2 (of the same size of course) and you want to sort both depending on the values in v1. To solve this, I basically see three possible solutions, all of which generalize to an arbitrary number of vectors:
1) Put all your data into a single vector.
Define a struct, say Data, that keeps an entry of every data vector.
struct Data
{
    int d1;
    char d2;
    // extend here for more vectors
};
Now construct a new std::vector<Data> and fill it from your original vectors:
std::vector<Data> d(v1.size());
for(std::size_t i = 0; i < d.size(); ++i)
{
    d[i].d1 = v1[i];
    d[i].d2 = v2[i];
    // extend here for more vectors
}
Since everything is stored inside a single vector now, you can use std::sort to bring it into order. Since we want it to be sorted based on the first entry (d1), which stores the values of the first vector, we use a custom predicate:
std::sort(d.begin(), d.end(),
[](const Data& l, const Data& r) { return l.d1 < r.d1; });
Afterwards, all data is sorted in d based on the first vector's values. You can now either work on with the combined vector d or you split the data into the original vectors:
std::transform(d.begin(), d.end(), v1.begin(),
[](const Data& e) { return e.d1; });
std::transform(d.begin(), d.end(), v2.begin(),
[](const Data& e) { return e.d2; });
// extend here for more vectors
2) Use the first vector to compute the indices of the sorted range and use these indices to bring all vectors into order:
First, you attach to all elements in your first vector their current position. Then you sort it using std::sort and a predicate that only compares for the value (ignoring the position).
template<typename T>
std::vector<std::size_t> computeSortIndices(const std::vector<T>& v)
{
    std::vector<std::pair<T, std::size_t>> d(v.size());
    for(std::size_t i = 0; i < v.size(); ++i)
        d[i] = std::make_pair(v[i], i);
    std::sort(d.begin(), d.end(),
              [](const std::pair<T, std::size_t>& l,
                 const std::pair<T, std::size_t>& r)
              {
                  return l.first < r.first;
              });
    std::vector<std::size_t> indices(v.size());
    std::transform(d.begin(), d.end(), indices.begin(),
                   [](const std::pair<T, std::size_t>& p) { return p.second; });
    return indices;
}
Say in the resulting index vector the entry at position 0 is 8, then this tells you that the vector entries that have to go to the first position in the sorted vectors are those at position 8 in the original ranges.
You then use this information to sort all of your vectors:
template<typename T>
void sortByIndices(std::vector<T>& v,
                   const std::vector<std::size_t>& indices)
{
    assert(v.size() == indices.size());
    std::vector<T> result(v.size());
    for(std::size_t i = 0; i < indices.size(); ++i)
        result[i] = v[indices[i]];
    v = std::move(result);
}
Any number of vectors may then be sorted like this:
const auto indices = computeSortIndices(v1);
sortByIndices(v1, indices);
sortByIndices(v2, indices);
// extend here for more vectors
This can be improved a bit by extracting the sorted v1 out of computeSortIndices directly, so that you do not need to sort it again using sortByIndices.
3) Implement your own sort function that is able to operate on multiple vectors. I have sketched an implementation of an in-place merge sort that is able to sort any number of vectors depending on the values in the first one.
The core of the merge sort algorithm is implemented by the multiMergeSortRec function, which takes an arbitrary number (> 0) of vectors of arbitrary types.
The function splits all vectors into a first and a second half, sorts both halves recursively and merges the results back together. Search the web for a full explanation of merge sort if you need more details.
template<typename T, typename... Ts>
void multiMergeSortRec(
    std::size_t b, std::size_t e,
    std::vector<T>& v, std::vector<Ts>&... vs)
{
    const std::size_t dist = e - b;
    if(dist <= 1)
        return;
    std::size_t m = b + (dist / static_cast<std::size_t>(2));
    // split in half and recursively sort both parts
    multiMergeSortRec(b, m, v, vs...);
    multiMergeSortRec(m, e, v, vs...);
    // merge both sorted parts
    while(b < m)
    {
        if(v[b] <= v[m])
        {
            ++b;
        }
        else
        {
            // move the head of the right part in front of the left part
            ++m;
            rotateAll(b, m, v, vs...);
            if(m == e)
                break;
        }
    }
}
template<typename T, typename... Ts>
void multiMergeSort(std::vector<T>& v, std::vector<Ts>&... vs)
{
    // TODO: check that all vectors have same length
    if(v.size() < 2)
        return ;
    multiMergeSortRec<T, Ts...>(0, v.size(), v, vs...);
}
In order to operate in-place, parts of the vectors have to be rotated. This is done by the rotateAll function, which again works on an arbitrary number of vectors by recursively processing the variadic parameter pack.
// end of recursion: no vectors left to rotate
void rotateAll(std::size_t, std::size_t)
{
}
template<typename T, typename... Ts>
void rotateAll(std::size_t b, std::size_t e,
               std::vector<T>& v, std::vector<Ts>&... vs)
{
    // moves the last element of [b, e) to position b
    std::rotate(v.begin() + b, v.begin() + e - 1, v.begin() + e);
    rotateAll(b, e, vs...);
}
Note that the recursive calls of rotateAll are very likely to be inlined by every optimizing compiler, such that the function merely applies std::rotate to all vectors. You can avoid the need to rotate parts of the vectors if you give up working in-place and merge into an additional vector. I would like to emphasize that this is neither an optimized nor a fully tested implementation of merge sort. It should serve as a sketch, since you really do not want to use bubble sort whenever you work on large vectors.
Let's quickly compare the above alternatives:
1) is easier to implement, since it relies on an existing (highly optimized and tested) std::sort implementation.
1) needs all data to be copied into the new vector and possibly (depending on your use case) all of it to be copied back.
In 1) multiple places have to be extended if you need to attach additional vectors to be sorted.
The implementation effort for 2) is moderate (more than for 1, but less than and easier than for 3), and it relies on the optimized and tested std::sort.
2) cannot sort in-place (using the indices) and thus has to make a copy of every vector. Maybe there is an in-place alternative, but I cannot think of one right now (at least an easy one).
2) is easy to extend for additional vectors.
For 3) you need to implement sorting yourself, which makes it more difficult to get right.
3) does not need to copy all data. The implementation can be further optimized and can be tweaked for improved performance (out-of-place) or reduced memory consumption (in-place).
3) can work on additional vectors without any change. Just invoke multiMergeSort with one or more additional arguments.
All three work for heterogeneous sets of vectors, in contrast to the std::vector<std::vector<>> approach.
Which of the alternatives performs better in your case, is hard to say and should greatly depend on the number of vectors and their size, so if you really need optimal performance (and/or memory usage) you need to measure.
Find an implementation of the above here.
By far the easiest solution is to create a helper vector std::vector<size_t> with one entry per element, initialized with std::iota(helper.begin(), helper.end(), size_t{});.
Next, sort this array, obviously not by the array index (iota already did that), but by sortVector[i]. In other words, the predicate is [&sortVector](size_t i, size_t j) { return sortVector[i] < sortVector[j]; }.
You now have the proper order of array indices. I.e. if helper[0]==17, then it means that the new front of all vectors should be the original 18th element. Usually the easiest way to produce the sorted result is to copy over elements, and then swap the original vector and the copy, repeated for all vectors. But if copying all elements is too expensive, it can be done in-place. (Note that if O(N) element copies are too expensive, a straightforward std::sort tends to perform badly as well, as it needs pivots.)
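A rough sketch of this helper-vector approach, including one possible in-place variant, is shown below. The names sorted_order and apply_in_place are made up for illustration, and the in-place step (following permutation cycles with a visited flag per position) is one common technique, not necessarily what the answer had in mind.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical helper: order[i] is the index of the element that belongs at position i.
template <typename T>
std::vector<std::size_t> sorted_order(const std::vector<T>& key) {
    std::vector<std::size_t> helper(key.size());
    std::iota(helper.begin(), helper.end(), std::size_t{0});
    std::sort(helper.begin(), helper.end(),
              [&key](std::size_t i, std::size_t j) { return key[i] < key[j]; });
    return helper;
}

// Rearranges v in place so that new v[i] == old v[order[i]],
// using swaps along each permutation cycle plus one flag per position.
template <typename T>
void apply_in_place(std::vector<T>& v, const std::vector<std::size_t>& order) {
    std::vector<bool> done(v.size(), false);
    for (std::size_t i = 0; i < v.size(); ++i) {
        if (done[i]) continue;
        std::size_t cur = i;
        while (order[cur] != i) {
            std::swap(v[cur], v[order[cur]]);  // v[cur] now holds its final value
            done[cur] = true;
            cur = order[cur];
        }
        done[cur] = true;                      // last slot of the cycle is already correct
    }
}
```

Compute the order once from the key vector, then call apply_in_place on every follower and finally on the key vector itself.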

Extract subvector in constant time

I have a std::vector<int> and I want to throw away the x first and y last elements. Just copying the elements is not an option, since this is O(n).
Is there something like vector.begin()+=x to let the vector just start later and end earlier?
I also tried
items = std::vector<int> (&items[x+1],&items[0]+items.size()-y);
where items is my vector, but this gave me bad_alloc
C++ standard algorithms work on ranges, not on actual containers, so you don't need to extract anything: you just need to adjust the iterator range you're working with.
template <typename T>
void foo(const std::vector<T>& vec, const size_t start, const size_t end)
{
    assert(start <= end && end <= vec.size());
    auto it1 = vec.begin() + start;
    auto it2 = vec.begin() + end;
    std::whatever(it1, it2); // placeholder for any standard algorithm
}
I don't see why it needs to be any more complicated than that.
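As a concrete instance of the iterator-range idea, here is a small sketch that sums everything except the first x and last y elements without copying anything. The function name sum_without_ends is hypothetical, and std::accumulate merely stands in for whatever algorithm is actually needed.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper: sums v[x .. v.size()-y) by adjusting the iterator range only.
long long sum_without_ends(const std::vector<int>& v, std::size_t x, std::size_t y) {
    assert(x + y <= v.size());
    auto first = v.begin() + static_cast<std::ptrdiff_t>(x);  // O(1)
    auto last = v.end() - static_cast<std::ptrdiff_t>(y);     // O(1)
    return std::accumulate(first, last, 0LL);
}
```

Obtaining the two iterators is constant time; only the algorithm applied to the range costs time proportional to its length.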
If you only need a range of values, you can represent that as a pair of iterators from first to last element of the range. These can be acquired in constant time.
Edit: According to the description in the comments, this seems like the most sensible solution. If your functions expect a vector reference, then you'll need to refactor a bit.
Other solutions:
If you don't need the original vector, and therefore can modify it, and the order of elements is not relevant, you can swap the first x elements with the n-x-y...n-y elements and then remove the last x+y elements. This can be done in O(x+y) time.
If appropriate, you could choose to use std::list for which what you're asking can be done in constant time if you have iterators to the first and last node of the sublist. This also requires that you can modify the original list but the order of elements won't change.
If those are not options, then you need to copy and are stuck with O(n).
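The O(x+y) option for the case where the order of elements is not relevant could look roughly like this. The function name trim_unordered is made up, and instead of swapping it moves elements, since the discarded ones are erased anyway.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: keeps exactly the elements originally at [x, size - y),
// in unspecified order, touching only O(x + y) elements.
template <typename T>
void trim_unordered(std::vector<T>& v, std::size_t x, std::size_t y) {
    assert(x + y <= v.size());
    const std::size_t kept = v.size() - x - y;
    const std::size_t m = std::min(x, kept);  // kept elements that must be relocated
    const auto src_end = v.begin() + static_cast<std::ptrdiff_t>(v.size() - y);
    // Move the last m kept elements over the first m slots; the ranges never overlap.
    std::move(src_end - static_cast<std::ptrdiff_t>(m), src_end, v.begin());
    // Drop everything past the kept count.
    v.erase(v.begin() + static_cast<std::ptrdiff_t>(kept), v.end());
}
```

The vector's capacity is unchanged, so no reallocation takes place.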
The other answers are correct: usually iterators will do.
Nevertheless, you can also write a vector view. Here is a sketch:
template<typename T>
struct vector_view
{
    vector_view(std::vector<T> const& v, size_t ind_begin, size_t ind_end)
        : _v(v)
        , _size(/* size of range */)
        , _ind_begin(ind_begin) {}
    auto size() const { return _size; }
    auto const& operator[](size_t i) const
    {
        //possibly check for input outside range
        return _v[ i + _ind_begin ];
    }
    //conversion of view to std::vector
    operator std::vector<T>() const
    {
        std::vector<T> ret(_size);
        //fill it
        return ret;
    }
    std::vector<T> const& _v;
    size_t _size;
    size_t _ind_begin;
};
Expose further methods as required (some iterator stuff might be appropriate when you want to use that with the standard library algorithms).
Further, take care with the validity of the const reference std::vector<T> const& v; if that could be an issue, it is better to work with shared pointers.
One can also think of more general approaches here, for example, use strides or similar things.

Accessing elements of a list of lists in C++

I have a list of lists like this:
std::list<std::list<double> > list;
I filled it with some lists with doubles in them (actually quite a lot, which is why I am not using a vector. All this copying takes up a lot of time.)
Say I want to access the element that could be accessed like list[3][3] if the list were not a list but a vector or two-dimensional array. How would I do that?
I know that accessing elements in a list is accomplished by using an iterator. I couldn't figure out how to get out the double though.
double item = *std::next(std::begin(*std::next(std::begin(list), 3)), 3);
Using a vector would usually have much better performance, though; accessing element n of a list is O(n).
If you're concerned about performance of splicing the interior of the container, you could use deque, which has operator[], amortized constant insertion and deletion from either end, and linear time insertion and deletion from the interior.
For C++03 compilers, you can implement begin and next yourself:
template<typename Container>
typename Container::iterator begin(Container &container)
{
    return container.begin();
}
template<typename Container>
typename Container::const_iterator begin(const Container &container)
{
    return container.begin();
}
template<typename T, int n>
T *begin(T (&array)[n])
{
    return &array[0];
}
template<typename Iterator>
Iterator next(Iterator it, typename std::iterator_traits<Iterator>::difference_type n = 1)
{
    std::advance(it, n);
    return it;
}
To actually answer your question, you should probably look at std::advance.
To strictly answer your question, Joachim Pileborg's answer is the way to go:
std::list<std::list<double> >::iterator it = list.begin();
std::advance(it, 3);
std::list<double>::iterator it2 = (*it).begin();
std::advance(it2, 3);
double d = *it2;
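If this lookup is needed in several places, the steps above can be wrapped in a small function. The following is only a sketch: the name element_at and the choice to throw std::out_of_range on bad indices are assumptions, not part of the answer.

```cpp
#include <cstddef>
#include <iterator>
#include <list>
#include <stdexcept>

// Hypothetical helper: returns the element that list[row][col] would denote.
// Both advances are linear, so a lookup costs O(row + col).
double element_at(const std::list<std::list<double> >& lists,
                  std::size_t row, std::size_t col) {
    if (row >= lists.size()) throw std::out_of_range("row");
    std::list<std::list<double> >::const_iterator outer = lists.begin();
    std::advance(outer, static_cast<std::ptrdiff_t>(row));
    if (col >= outer->size()) throw std::out_of_range("col");
    std::list<double>::const_iterator inner = outer->begin();
    std::advance(inner, static_cast<std::ptrdiff_t>(col));
    return *inner;
}
```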
Now, from your question and further comments it is not clear whether you always add elements to the end of the lists or they can be added anywhere. If you always add to the end, vector<double> will work better. A vector<T> does not need to be copied every time its size increases; only whenever its capacity increases, which is a very different thing.
In addition to this, using reserve(), as others said before, will help a lot with the reallocations. You don't need to reserve for the combined size of all vectors, but only for each individual vector. So:
std::vector<std::vector<double> > v;
v.reserve(512); // If you are inserting 400 vectors, with a little extra just in case
And you would also reserve for each vector<double> inside v. That's all.
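Put together, reserving for the outer vector and for each inner vector might look like the sketch below. The function name make_table and the parameters rows and cols_hint are hypothetical; use whatever estimates fit your data.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: builds `rows` empty inner vectors, each with room for
// `cols_hint` doubles, so later push_back calls do not reallocate until the
// hint is exceeded.
std::vector<std::vector<double> > make_table(std::size_t rows, std::size_t cols_hint) {
    std::vector<std::vector<double> > v;
    v.reserve(rows);                       // one allocation for the outer vector
    for (std::size_t r = 0; r < rows; ++r) {
        v.push_back(std::vector<double>());
        v.back().reserve(cols_hint);       // one allocation per inner vector
    }
    return v;
}
```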
Take into account that your list of lists will take much more space. For each double in an internal list, it will have to allocate at least two additional pointers, and also two additional pointers for each list inside the outer list. This means that the total memory taken by your container will be roughly three times that of the vector. And all this allocation and management also takes extra runtime.

algorithm to remove elements in the intersection of two sets

I have a Visual Studio 2008 C++03 application where I have two standard containers. I would like to remove from one container all of the items that are present in the other container (the intersection of the sets).
something like this:
std::vector< int > items = /* 1, 2, 3, 4, 5, 6, 7 */;
std::set< int > items_to_remove = /* 2, 4, 5*/;
std::some_algorithm( items.begin(), items.end(), items_to_remove.begin(), items_to_remove.end() );
assert( items == /* 1, 3, 6, 7 */ )
Is there an existing algorithm or pattern that will do this or do I need to roll my own?
Try with:
items.erase(
    std::remove_if(
        items.begin(), items.end()
      , std::bind1st(
            std::mem_fun( &std::set< int >::count )
          , &items_to_remove   // mem_fun expects a pointer to the object
        )
    )
  , items.end()
);
std::remove(_if) doesn't actually remove anything, since it works with iterators and not containers. What it does is shift the elements to be kept to the front of the range and return an iterator to the new logical end; the elements past that point are left in an unspecified state. You then call erase to actually remove from the container all of the elements past the new end.
Update: If I recall correctly, binding to a member function of a component of the standard library is not standard C++, as implementations are allowed to add default parameters to the function. You'd be safer by creating your own function or function-object predicate that checks whether the element is contained in the set of items to remove.
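A hand-written predicate of the kind suggested here could look like the following sketch. The names InSet and remove_items are made up for illustration; the code sticks to C++03 features and uses the set's own count, so each lookup is logarithmic.

```cpp
#include <algorithm>
#include <set>
#include <vector>

// Hypothetical function object: true when the value is in the referenced set.
class InSet {
public:
    explicit InSet(const std::set<int>& s) : set_(&s) {}
    bool operator()(int value) const { return set_->count(value) != 0; }
private:
    const std::set<int>* set_;  // not owned; the set must outlive the predicate
};

// Erase-remove with the hand-written predicate.
void remove_items(std::vector<int>& items, const std::set<int>& items_to_remove) {
    items.erase(std::remove_if(items.begin(), items.end(), InSet(items_to_remove)),
                items.end());
}
```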
Personally, I prefer to create small helpers for this (that I reuse heavily).
template <typename Container>
class InPredicate {
public:
    InPredicate(Container const& c): _c(c) {}
    template <typename U>
    bool operator()(U const& u) const {
        return std::find(_c.begin(), _c.end(), u) != _c.end();
    }
private:
    Container const& _c;
};
// Typical builder for automatic type deduction
template <typename Container>
InPredicate<Container> in(Container const& c) {
    return InPredicate<Container>(c);
}
This also helps to have a true erase_if algorithm
template <typename Container, typename Predicate>
void erase_if(Container& c, Predicate p) {
    c.erase(std::remove_if(c.begin(), c.end(), p), c.end());
}
And then:
erase_if(items, in(items_to_remove));
which is pretty readable :)
One more solution:
The standard library provides the set_difference algorithm, which can be used for this.
But it requires both input ranges to be sorted and needs an extra container to hold the result. I personally prefer to do it in-place.
std::vector< int > items;
//say items = [1,2,3,4,5,6,7,8,9]
//say items_to_remove = <2,4,5>
std::vector<int>result(items.size()); //as this algorithm uses output
                                      //iterator not inserter iterator for result.
std::vector<int>::iterator new_end = std::set_difference(items.begin(),
    items.end(), items_to_remove.begin(), items_to_remove.end(), result.begin());
result.erase(new_end,result.end()); // to erase unwanted elements at the
                                    // end.
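If pre-sizing the result and erasing its tail feels awkward, the same call can write through a std::back_inserter instead. This is a sketch only: the function name difference_sorted is hypothetical, and it assumes items is already sorted in ascending order (a std::set always is).

```cpp
#include <algorithm>
#include <iterator>
#include <set>
#include <vector>

// Hypothetical helper: returns the elements of `items` that are not in
// `items_to_remove`. Precondition: `items` is sorted in ascending order.
std::vector<int> difference_sorted(const std::vector<int>& items,
                                   const std::set<int>& items_to_remove) {
    std::vector<int> result;
    std::set_difference(items.begin(), items.end(),
                        items_to_remove.begin(), items_to_remove.end(),
                        std::back_inserter(result));
    return result;
}
```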
You can use the container's erase member function in combination with std::remove for this. There is a C++ idiom called the Erase-Remove idiom, which is going to help you accomplish this.
Assuming you have two sets, A and B, and you want to remove from B the intersection I of A and B, where I = A ∩ B, your final results will be:
A (left intact)
B' = B-I
Full theory:
This is quite simple.
Create and populate A and B
Create a third intermediate vector, I
Copy the contents of B into I
For each element a_j of A, search I for a_j; if the element is found in I, remove it
Finally, the code to remove an individual element can be found here:
How do I remove an item from a stl vector with a certain value?
And the code to search for an item is here:
How to find if an item is present in a std::vector?
Good luck!
Here's a more "hands-on" in-place method that doesn't require fancy functions nor do the vectors need to be sorted:
#include <vector>
template <class TYPE>
void remove_intersection(std::vector<TYPE> &items, const std::vector<TYPE> &items_to_remove)
{
    for (int i = 0; i < (int)items_to_remove.size(); i++) {
        for (int j = 0; j < (int)items.size(); j++) {
            if (items_to_remove[i] == items[j]) {
                items.erase(items.begin() + j);
                j--;//Roll back the iterator to prevent skipping over
            }
        }
    }
}
If you know that the multiplicity in each set is 1 (not a multiset), then you can actually replace the j--; line with a break; for better performance.