How to avoid de-reference overhead of pointers, using references in containers? - c++

I'm encountering a design challenge. There is a huge std::vector<int> called O, say of size 10000. There are many objects of type Foo, f_1...f_n. Each Foo has an internal std::vector<int> which is a suborder of O. For example:
O = 1, 2, ..., 100000
f_1.order = 1, 2, 3
f_2.order = 1, 4, 16
f_3.order = 100, 101, 102
// ...
The main requirement is to update the corresponding values of O when an f_n changes its values. The length and contents of all Foo objects are known at construction time and are not supposed to change during their lifetime. For example, it's known that f_1 holds the first, second and third elements of O.
The obvious solution is to use pointers, of course. Foo may hold a std::vector<int*> in which each element points to the underlying data of the original order (O).
On the other hand, my program does some heavy calculations using Foo objects, so I'm looking for a method to remove the overhead of pointer dereferencing. It would be nice if the design allowed me to use some sort of std::vector<int&>, but that's not possible (I guess because vector<T> needs T* to exist).
A colleague suggested to use boost::ptr_vector. Another suggested holding indexes in a vector<size_t>...
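The index-based suggestion could look roughly like the sketch below. The names `FooView`, `at` and `sum` are hypothetical and not part of the question. It assumes the master vector outlives every view. Note that an indexed access is still an indirection, so this is not expected to be faster than pointers; its main benefit is that the stored positions stay valid if the master vector reallocates.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical view type: stores positions into the master vector
// instead of pointers, so it stays valid even if the master reallocates.
class FooView {
public:
    FooView(std::vector<int>& master, std::vector<std::size_t> positions)
        : master_(master), positions_(std::move(positions)) {
        for (std::size_t p : positions_) {
            assert(p < master_.size());  // every position must exist in the master
        }
    }

    std::size_t size() const { return positions_.size(); }

    // Read/write access to the i-th element of this sub-order.
    int& at(std::size_t i) { return master_[positions_[i]]; }

    // Stand-in for a "heavy calculation" loop over the sub-order.
    long long sum() const {
        long long total = 0;
        for (std::size_t p : positions_) total += master_[p];
        return total;
    }

private:
    std::vector<int>& master_;
    std::vector<std::size_t> positions_;
};
```

A write through one view, for example `f1.at(0) = 7`, is immediately visible in the master vector and in every other view that lists the same position.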

I would say that optimizing for pointer dereferencing overhead is pointless. Let's look at some example code:
void bar(int i);
void foo(int* p, int i)
{
bar(*p);
bar(i);
}
And now let's look at the assembly of it:
void foo(int* p, int i)
{
push rbx
mov ebx, esi
bar(*p);
mov edi, DWORD PTR [rdi]
call a <foo+0xa>
bar(i);
mov edi, ebx
call 11 <foo+0x11>
}
There's an "overhead" of one memory read.
As for using references, it's not gonna do anything useful. References may have different semantics to pointers, but underneath, they're still pointers:
void foo(int& r)
{
bar(r);
mov edi,DWORD PTR [rbx]
call 20 <_Z3fooPiiRi+0x20>
}
There's the same memory read happening.
I'm not sure if this counts as an "answer", but seriously: don't bother.

This cannot be stressed enough - do not optimize your code before you know you have a problem. Pointer dereferencing is not costly, and is usually not the main bottleneck in a program.
Note that references are implemented using pointer dereferencing, so even if you could do std::vector<int&>, it would not help.
If you really, really feel you must do something - even though I'm really, totally sure it can't possibly help your performance in any meaningful sense - you could try overlaying the memory. That is, you could define it like this (note that I'm not in any way endorsing this - I'm only pointing it out so that you don't do something worse):
std::vector<int> O;
struct Foo {
int *myData;
int &operator[](int offset) { return myData[offset]; }
};
O.resize(1000000, 0);
Foo f_1, f_2, ...;
f_1.myData = &(O[0]);
f_2.myData = &(O[3]);
O[0] = 5;
cout << f_1[0]; // prints 5
Also, BTW - please, please, please, do not use the name O as a variable. Please. It looks like a zero.

This sounds like premature optimization. Dereferencing a pointer is ridiculously cheap.
The obvious solution is to use pointers of course. Foo may hold a std::vector<int*> which each element points to underlying data of original order (O).
Here is a solution, inferred from what you need, that uses std::reference_wrapper and std::ref instead of raw pointers:
struct Foo
{
Foo(std::vector<int>& _data) : dataFull(_data)
{ ; }
void add(int index)
{
assert(index < dataFull.size());
if(index < references.size())
{
// replace
references[index] = std::ref(dataFull[index]);
}
else
{
// add n times, need sync with index
while(index >= references.size())
{
references.push_back(std::ref(dataFull[index]));
}
}
// mark as valid index
indexes.push_back(index);
// keep the indexes sorted so that binary_search can be used
std::sort(indexes.begin(), indexes.end());
}
int* get(int index)
{
if(std::binary_search(indexes.begin(), indexes.end(), index))
{
return &references[index].get();
}
else
{
return NULL;
}
}
protected:
std::vector<int>& dataFull;
std::vector<std::reference_wrapper<int> > references;
std::vector<int> indexes;
};
int main()
{
const int size = 1000000;
std::vector<int> O;
O.resize(1000000, 0);
Foo f_1(O);
f_1.add(1);
f_1.add(2);
f_1.add(3);
Foo f_2(O);
f_2.add(1);
f_2.add(4);
f_2.add(16);
Foo f_3(O);
f_3.add(100);
f_3.add(101);
f_3.add(102);
// index 1 is changed; this must affect every Foo that uses this index (f_1 and f_2)
O[1] = 666;
// checking if it changed
assert( *f_1.get(1) == 666 );
assert( *f_2.get(1) == 666 );
assert( f_3.get(1) == NULL );
return 0;
}
EDIT:
Performance is the same as with a pointer, but std::reference_wrapper integrates better with templated code, because you work with T& and don't need separate code for T* and T&.
Keeping the indexes in another vector is only useful if your struct is ordered by multiple criteria.
Here is an example with a vector<T>, where T is a struct with two fields. I can reorder this vector by two criteria without touching the original.
template <typename T>
struct Index
{
typedef bool(*Comparator)(const T&, const T&);
Index(std::vector<T>& _data, Comparator _comp)
: dataFull(_data)
, comp(_comp)
{
for(unsigned int i = 0; i < dataFull.size(); ++i)
{
add(i);
}
commit();
}
void commit()
{
std::sort(references.begin(), references.end(), comp);
}
std::vector<std::reference_wrapper<T> >& getReference() {return references;}
protected:
void add(int index)
{
assert(index < dataFull.size());
references.push_back(std::ref(dataFull[index]));
}
protected:
std::vector<T>& dataFull;
std::vector<std::reference_wrapper<T> > references;
Comparator comp;
};
int main()
{
struct ComplexData
{
int field1;
int field2;
};
// Generate vector
const int size = 10;
std::vector<ComplexData> data;
data.resize(size);
for(unsigned int i = 0; i < size; ++i)
{
ComplexData& c = data[i];
c.field1 = i;
c.field2 = size - i;
}
// Vector reordered without touching the original
std::cout << "Vector data, ordered by field1" << std::endl;
{
Index<ComplexData> f_1(data,
[](const ComplexData& a, const ComplexData& b){return a.field1 < b.field1;});
auto it = f_1.getReference().begin();
auto ite = f_1.getReference().end();
for(; it != ite; ++it)
{
std::cout << "-> " << it->get().field1 << " - " << it->get().field2 << std::endl;
}
}
// Vector reordered without touching the original
std::cout << "Vector data, ordered by field2" << std::endl;
{
Index<ComplexData> f_2(data,
[](const ComplexData& a, const ComplexData& b){return a.field2 < b.field2;});
auto it = f_2.getReference().begin();
auto ite = f_2.getReference().end();
for(; it != ite; ++it)
{
std::cout << "-> " << it->get().field1 << " - " << it->get().field2 << std::endl;
}
}
return 0;
}

Related

How to copy a value inside an array

I have an integer array and the fifth value should always be the same as the first value.
For example, if i have:
int test[] ={1,2,3,4,1}
and if I say:
test[0]= 5
The array should look like: 5 2 3 4 5.
Is there some way to do this? I tried with pointers, but couldn't get a good result.
Test it on GodBolt
Using actual C++ syntax instead of C:
#include <array>
#include <iostream>
void print_list(std::array<int*, 5> const& list){
for( auto const& item : list ){
std::cout << *item << "\t";
}
std::cout << "\n";
}
int main(){
std::array<int, 4> n = {1, 2, 3, 4};
std::array<int*, 5> list = {&n[0], &n[1], &n[2], &n[3], &n[0]};
print_list(list);
*list[0] = 3;
print_list(list);
}
Not possible with integers but pointers would do the trick, see here: https://www.tutorialspoint.com/cplusplus/cpp_array_of_pointers.htm
Something like:
int *sharedNumber = new int(1);
int* ptr[] = {sharedNumber, new int(2), new int(3), new int(4), sharedNumber};
cout<<"values are "<<*ptr[0]<<" "<<*ptr[4]<<"\n";
*sharedNumber = 5;
cout<<"values are "<<*ptr[0]<<" "<<*ptr[4]<<"\n";
This prints 'values are 1 1' and then 'values are 5 5' for ptr[0] and ptr[4]. Just make sure to delete the allocated memory to avoid memory leaks.
There is currently no way of doing what you want with the syntax you're providing. Your options, which would (minimally) alter your code are:
Manually setting two elements to the same value (see the comments below your question);
Writing a function for mutating elements that automatically synchronizes the first and fifth elements;
Storing an array of pointers (which seems to be your attempt);
Sharing state between elements of the array so that affecting one element would also change other elements, and that way "synchronizing" their values;
Wrapping your array into a class that manages that array.
It should seem obvious which solutions are, most likely, overkill.
Here is a very simple implementation of the second option:
void setElement(int* array, size_t index, int value)
{
array[index] = value;
if (index == 0)
array[4] = value;
if (index == 4)
array[0] = value;
}
Keep in mind that using std::array, std::vector or iterators would probably be a better choice than C-style arrays and raw pointers.
There's a way to achieve this effect for a custom wrapper type, but not for arrays:
Just implement a custom [] operator:
template<class ElementType, size_t arraySize>
class MyArray
{
public:
using value_type = ElementType;
constexpr MyArray() = default;
constexpr MyArray(std::initializer_list<ElementType> arr)
{
std::copy_n(arr.begin(), (std::min)(arraySize, arr.size()), m_data);
}
constexpr size_t size() const
{
return arraySize;
}
constexpr ElementType& operator[](size_t index)
{
return m_data[index % arraySize];
}
constexpr ElementType const& operator[](size_t index) const
{
return m_data[index % arraySize];
}
private:
ElementType m_data[arraySize];
};
int main()
{
MyArray<int, 4> a = { 1, 2, 3, 4 };
std::cout << a[4] << '\n';
a[0] = 5;
std::cout << a[4] << '\n';
}
Note that this just "wraps around": not only can you access the first element using index 4, but for every index passed, the remainder of the division by the array size is used as the index. The second element of MyArray<int, 4> could be accessed using indices 1, 5, 9, 4449, etc.

Is list better than vector when we need to store "the last n items"?

There are a lot of questions which suggest that one should always use a vector, but it seems to me that a list would be better for the scenario, where we need to store "the last n items"
For example, say we need to store the last 5 items seen:
Iteration 0:
3,24,51,62,37,
Then at each iteration, the item at index 0 is removed, and the new item is added at the end:
Iteration 1:
24,51,62,37,8
Iteration 2:
51,62,37,8,12
It seems that for this use case, for a vector the complexity will be O(n), since we would have to copy n items, but in a list, it should be O(1), since we are always just chopping off the head, and adding to the tail each iteration.
Is my understanding correct? Is this the actual behaviour of an std::list ?
Neither. Your collection has a fixed size and std::array is sufficient.
The data structure you implement is called a ring buffer. To implement it you create an array and keep track of the offset of the current first element.
When you add an element that would push an item out of the buffer - i.e. when you remove the first element - you increment the offset.
To fetch elements in the buffer you add the index and the offset and take the modulo of this and the length of the buffer.
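The ring buffer described above can be sketched as follows. This is a minimal illustration, not a production container: the name `LastNRing` and its interface are assumptions, and it requires `T` to be default-constructible because the storage is a plain `std::array`.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical fixed-capacity buffer that keeps the last N values.
template <typename T, std::size_t N>
class LastNRing {
    static_assert(N > 0, "capacity must be positive");
public:
    // Append a value; once full, the oldest value is overwritten.
    void push(const T& value) {
        if (count_ < N) {
            data_[(offset_ + count_) % N] = value;
            ++count_;
        } else {
            data_[offset_] = value;        // overwrite the oldest slot
            offset_ = (offset_ + 1) % N;   // the next slot is now the oldest
        }
    }

    std::size_t size() const { return count_; }

    // Index 0 is the oldest stored value, size() - 1 the newest.
    const T& operator[](std::size_t i) const {
        assert(i < count_);
        return data_[(offset_ + i) % N];
    }

private:
    std::array<T, N> data_{};
    std::size_t offset_ = 0;  // position of the oldest element
    std::size_t count_ = 0;   // number of stored elements
};
```

With a `LastNRing<int, 5>`, pushing 3, 24, 51, 62, 37 and then 8 and 12 leaves the logical contents 51, 62, 37, 8, 12, matching the iterations in the question, and no element is ever moved.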
std::deque is a far better option. Or if you had benchmarked std::deque and found its performance to be inadequate for your specific use, you could implement a circular buffer in a fixed size array, storing the index of the start of the buffer. When replacing an element in the buffer, you would overwrite the element at the start index, and then set the start index to its previous value plus one modulo the size of the buffer.
List traversal is very slow, as list elements can be scattered throughout memory, and vector shifting is actually surprisingly fast, as memory moves on a single block of memory are quite fast even if it is a large block.
The talk Taming The Performance Beast from the Meeting C++ 2015 conference might be of interest to you.
If you can use Boost, try boost::circular_buffer:
It's a kind of sequence similar to std::list or std::deque. It supports random access iterators, constant time insert and erase operations at the beginning or the end of the buffer and interoperability with std algorithms.
It provides fixed capacity storage: when the buffer is filled, new data is written starting at the beginning of the buffer and overwriting the old
// Create a circular buffer with a capacity for 5 integers.
boost::circular_buffer<int> cb(5);
// Insert elements into the buffer.
cb.push_back(3);
cb.push_back(24);
cb.push_back(51);
cb.push_back(62);
cb.push_back(37);
int a = cb[0]; // a == 3
int b = cb[1]; // b == 24
int c = cb[2]; // c == 51
// The buffer is full now, so pushing subsequent
// elements will overwrite the front-most elements.
cb.push_back(8); // overwrite 3 with 8
cb.push_back(12); // overwrite 24 with 12
// The buffer now contains 51, 62, 37, 8, 12.
// Elements can be popped from either the front or the back.
cb.pop_back(); // 12 is removed
cb.pop_front(); // 51 is removed
The circular_buffer stores its elements in a contiguous region of memory, which then enables fast constant-time insertion, removal and random access of elements.
PS ... or implement the circular buffer directly as suggested by Taemyr.
Overload Journal #50 - Aug 2002 has a nice introduction (by Pete Goodliffe) to writing robust STL-like circular buffer.
The problem is that O(n) only talks about the asymptotic behaviour as n tends to infinity. If n is small then the constant factors involved become significant. The result is that for "last 5 integer items" I would be stunned if vector didn't beat list. I would even expect std::vector to beat std::deque.
For "last 500 integer items" I would still expect std::vector to be faster than std::list - but std::deque would now probably win. For "last 5 million slow-to-copy items", std::vector would be slowest of all.
A ring buffer based on std::array or std::vector would probably be faster still though.
As (almost) always with performance issues:
encapsulate with a fixed interface
write the simplest code that can implement that interface
if profiling shows you have a problem, optimize (which will make the code more complicated).
In practise, just using a std::deque, or a pre-built ring-buffer if you have one, will be good enough. (But it's not worth going to the trouble of writing a ring buffer unless profiling says you need to.)
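As an illustration of "encapsulate with a fixed interface, then write the simplest code", here is a small sketch built on std::deque. The class name `RecentItems` and its members are hypothetical; the point is only that callers see `push` and indexing, so the container behind them could later be replaced by a ring buffer if profiling calls for it.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Hypothetical wrapper: callers only see push() and indexing, so the
// underlying container can be swapped later without touching them.
template <typename T>
class RecentItems {
public:
    explicit RecentItems(std::size_t capacity) : capacity_(capacity) {
        assert(capacity_ > 0);
    }

    void push(const T& value) {
        if (items_.size() == capacity_) {
            items_.pop_front();  // drop the oldest item
        }
        items_.push_back(value);
    }

    std::size_t size() const { return items_.size(); }

    // Index 0 is the oldest retained item.
    const T& operator[](std::size_t i) const { return items_[i]; }

private:
    std::size_t capacity_;
    std::deque<T> items_;
};
```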
Here is a minimal circular buffer. I'm primarily posting that here to get a metric ton of comments and ideas of improvement.
Minimal Implementation
#include <iterator>
template<typename Container>
class CircularBuffer
{
public:
using iterator = typename Container::iterator;
using value_type = typename Container::value_type;
private:
Container _container;
iterator _pos;
public:
CircularBuffer() : _pos(std::begin(_container)) {}
public:
value_type& operator*() const { return *_pos; }
CircularBuffer& operator++() { ++_pos ; if (_pos == std::end(_container)) _pos = std::begin(_container); return *this; }
CircularBuffer& operator--() { if (_pos == std::begin(_container)) _pos = std::end(_container); --_pos; return *this; }
};
Usage
#include <iostream>
#include <array>
int main()
{
CircularBuffer<std::array<int,5>> buf;
*buf = 1; ++buf;
*buf = 2; ++buf;
*buf = 3; ++buf;
*buf = 4; ++buf;
*buf = 5; ++buf;
std::cout << *buf << " "; ++buf;
std::cout << *buf << " "; ++buf;
std::cout << *buf << " "; ++buf;
std::cout << *buf << " "; ++buf;
std::cout << *buf << " "; ++buf;
std::cout << *buf << " "; ++buf;
std::cout << *buf << " "; ++buf;
std::cout << *buf << " "; --buf;
std::cout << *buf << " "; --buf;
std::cout << *buf << " "; --buf;
std::cout << *buf << " "; --buf;
std::cout << *buf << " "; --buf;
std::cout << *buf << " "; --buf;
std::cout << std::endl;
}
Compile with
g++ -std=c++17 -O2 -Wall -Wextra -pedantic -Werror
Demo
On Coliru: try it online
If you need to store last N-elements then logically you are doing some kind of queue or a circular buffer, std::stack and std::deque are implementations of LIFO and FIFO queues.
You can use boost::circular_buffer or implement simple circular buffer manually:
template<int Capcity>
class cbuffer
{
public:
cbuffer() : sz(0), p(0){}
void push_back(int n)
{
buf[p++] = n;
if (sz < Capcity)
sz++;
if (p >= Capcity)
p = 0;
}
int size() const
{
return sz;
}
int operator[](int n) const
{
assert(n < sz);
n = p - sz + n;
if (n < 0)
n += Capcity;
return buf[n];
}
int buf[Capcity];
int sz, p;
};
Sample use for circular buffer of 5 int elements:
int main()
{
cbuffer<5> buf;
// insert random 100 numbers
for (int i = 0; i < 100; ++i)
buf.push_back(rand());
// output to cout contents of the circular buffer
for (int i = 0; i < buf.size(); ++i)
cout << buf[i] << ' ';
}
As a note, keep in mind that when you have only 5 elements the best solution is the one that's fast to implement and works correctly.
Yes. The time complexity of std::vector for removing elements from the front is linear. std::deque might be a good choice for what you are doing, as it offers constant-time insertion and removal at the beginning as well as at the end of the sequence, and also better performance than std::list.
Source:
http://www.sgi.com/tech/stl/Vector.html
http://www.sgi.com/tech/stl/Deque.html
Here are the beginnings of a ring buffer based dequeue template class that I wrote a while ago, mostly to experiment with using std::allocator (so it does not require T to be default constructible). Note it currently doesn't have iterators, or insert/remove, copy/move constructors, etc.
#ifndef RING_DEQUEUE_H
#define RING_DEQUEUE_H
#include <cstddef>
#include <initializer_list>
#include <limits>
#include <memory>
#include <stdexcept>
#include <type_traits>
#include <utility>
template <typename T, size_t N>
class ring_dequeue {
private:
static_assert(N <= std::numeric_limits<size_t>::max() / 2 &&
N <= std::numeric_limits<size_t>::max() / sizeof(T),
"size of ring_dequeue is too large");
using alloc_traits = std::allocator_traits<std::allocator<T>>;
public:
using value_type = T;
using reference = T&;
using const_reference = const T&;
using difference_type = std::ptrdiff_t;
using size_type = size_t;
ring_dequeue() = default;
// Disable copy and move constructors for now - if iterators are
// implemented later, then those could be delegated to the InputIterator
// constructor below (using the std::move_iterator adaptor for the move
// constructor case).
ring_dequeue(const ring_dequeue&) = delete;
ring_dequeue(ring_dequeue&&) = delete;
ring_dequeue& operator=(const ring_dequeue&) = delete;
ring_dequeue& operator=(ring_dequeue&&) = delete;
template <typename InputIterator>
ring_dequeue(InputIterator begin, InputIterator end) {
while (m_tailIndex < N && begin != end) {
alloc_traits::construct(m_alloc, reinterpret_cast<T*>(m_buf) + m_tailIndex,
*begin);
++m_tailIndex;
++begin;
}
if (begin != end)
throw std::logic_error("Input range too long");
}
ring_dequeue(std::initializer_list<T> il) :
ring_dequeue(il.begin(), il.end()) { }
~ring_dequeue() noexcept(std::is_nothrow_destructible<T>::value) {
while (m_headIndex < m_tailIndex) {
alloc_traits::destroy(m_alloc, elemPtr(m_headIndex));
m_headIndex++;
}
}
size_t size() const {
return m_tailIndex - m_headIndex;
}
size_t max_size() const {
return N;
}
bool empty() const {
return m_headIndex == m_tailIndex;
}
bool full() const {
return m_headIndex + N == m_tailIndex;
}
template <typename... Args>
void emplace_front(Args&&... args) {
if (full())
throw std::logic_error("ring_dequeue full");
bool wasAtZero = (m_headIndex == 0);
auto newHeadIndex = wasAtZero ? (N - 1) : (m_headIndex - 1);
alloc_traits::construct(m_alloc, elemPtr(newHeadIndex),
std::forward<Args>(args)...);
m_headIndex = newHeadIndex;
if (wasAtZero)
m_tailIndex += N;
}
void push_front(const T& x) {
emplace_front(x);
}
void push_front(T&& x) {
emplace_front(std::move(x));
}
template <typename... Args>
void emplace_back(Args&&... args) {
if (full())
throw std::logic_error("ring_dequeue full");
alloc_traits::construct(m_alloc, elemPtr(m_tailIndex),
std::forward<Args>(args)...);
++m_tailIndex;
}
void push_back(const T& x) {
emplace_back(x);
}
void push_back(T&& x) {
emplace_back(std::move(x));
}
T& front() {
if (empty())
throw std::logic_error("ring_dequeue empty");
return *elemPtr(m_headIndex);
}
const T& front() const {
if (empty())
throw std::logic_error("ring_dequeue empty");
return *elemPtr(m_headIndex);
}
void remove_front() {
if (empty())
throw std::logic_error("ring_dequeue empty");
alloc_traits::destroy(m_alloc, elemPtr(m_headIndex));
++m_headIndex;
if (m_headIndex == N) {
m_headIndex = 0;
m_tailIndex -= N;
}
}
T pop_front() {
T result = std::move(front());
remove_front();
return result;
}
T& back() {
if (empty())
throw std::logic_error("ring_dequeue empty");
return *elemPtr(m_tailIndex - 1);
}
const T& back() const {
if (empty())
throw std::logic_error("ring_dequeue empty");
return *elemPtr(m_tailIndex - 1);
}
void remove_back() {
if (empty())
throw std::logic_error("ring_dequeue empty");
alloc_traits::destroy(m_alloc, elemPtr(m_tailIndex - 1));
--m_tailIndex;
}
T pop_back() {
T result = std::move(back());
remove_back();
return result;
}
private:
alignas(T) char m_buf[N * sizeof(T)];
size_t m_headIndex = 0;
size_t m_tailIndex = 0;
std::allocator<T> m_alloc;
const T* elemPtr(size_t index) const {
if (index >= N)
index -= N;
return reinterpret_cast<const T*>(m_buf) + index;
}
T* elemPtr(size_t index) {
if (index >= N)
index -= N;
return reinterpret_cast<T*>(m_buf) + index;
}
};
#endif
Briefly, std::vector is better when the amount of memory does not change. In your case, moving all the data forward or appending new data in a vector would be wasteful. As David said, std::deque is a good option, since you would pop from the front and push to the back, as with a two-way list.
From the cplusplus.com reference, about list:
Compared to other base standard sequence containers (array, vector and
deque), lists perform generally better in inserting, extracting and
moving elements in any position within the container for which an
iterator has already been obtained, and therefore also in algorithms
that make intensive use of these, like sorting algorithms.
The main drawback of lists and forward_lists compared to these other
sequence containers is that they lack direct access to the elements by
their position; For example, to access the sixth element in a list,
one has to iterate from a known position (like the beginning or the
end) to that position, which takes linear time in the distance between
these. They also consume some extra memory to keep the linking
information associated to each element (which may be an important
factor for large lists of small-sized elements).
About deque:
For operations that involve frequent insertion or removals of elements
at positions other than the beginning or the end, deques perform worse
and have less consistent iterators and references than lists and
forward lists.
About vector:
Therefore, compared to arrays, vectors consume more memory in exchange
for the ability to manage storage and grow dynamically in an efficient
way.
Compared to the other dynamic sequence containers (deques, lists and
forward_lists), vectors are very efficient accessing its elements
(just like arrays) and relatively efficient adding or removing
elements from its end. For operations that involve inserting or
removing elements at positions other than the end, they perform worse
than the others, and have less consistent iterators and references
than lists and forward_lists.
I think even std::deque has the overhead of copying items in certain conditions, because std::deque is essentially a map of arrays, so std::list is a good way to eliminate the copy overhead.
To improve traversal performance for std::list, you can implement a memory pool so that the list allocates its nodes from one chunk of memory, which gives better spatial locality for caching.
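One way to try the memory pool idea without writing an allocator by hand is the C++17 polymorphic allocator facilities. The sketch below is only an assumption about how this could be wired up, and the class name `PooledRecentList` is hypothetical. Whether pooling actually improves locality depends on the standard library implementation, so it should be measured.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <memory_resource>

// Keeps the last `capacity` values in a std::pmr::list whose nodes come
// from a pool resource instead of individual general-purpose allocations.
class PooledRecentList {
public:
    explicit PooledRecentList(std::size_t capacity)
        : capacity_(capacity), items_(&pool_) {
        assert(capacity_ > 0);
    }

    void push(int value) {
        if (items_.size() == capacity_) {
            items_.pop_front();   // the freed node is returned to the pool
        }
        items_.push_back(value);  // the pool can hand out a recycled node
    }

    const std::pmr::list<int>& items() const { return items_; }

private:
    std::size_t capacity_;
    // Declared before items_ so it is constructed first and destroyed last.
    std::pmr::unsynchronized_pool_resource pool_;
    std::pmr::list<int> items_;
};
```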

Sorting one std::vector based on the content of another [duplicate]

This question already has answers here:
How can I sort two vectors in the same way, with criteria that uses only one of the vectors?
I have several std::vector, all of the same length. I want to sort one of these vectors, and apply the same transformation to all of the other vectors. Is there a neat way of doing this? (preferably using the STL or Boost)? Some of the vectors hold ints and some of them std::strings.
Pseudo code:
std::vector<int> Index = { 3, 1, 2 };
std::vector<std::string> Values = { "Third", "First", "Second" };
Transformation = sort(Index);
Index is now { 1, 2, 3};
... magic happens as Transformation is applied to Values ...
Values are now { "First", "Second", "Third" };
friol's approach is good when coupled with yours. First, build a vector consisting of the numbers 0…n-1, along with the elements from the vector dictating the sorting order:
typedef vector<int>::const_iterator myiter;
vector<pair<size_t, myiter> > order(Index.size());
size_t n = 0;
for (myiter it = Index.begin(); it != Index.end(); ++it, ++n)
order[n] = make_pair(n, it);
Now you can sort this array using a custom sorter:
struct ordering {
bool operator ()(pair<size_t, myiter> const& a, pair<size_t, myiter> const& b) {
return *(a.second) < *(b.second);
}
};
sort(order.begin(), order.end(), ordering());
Now you've captured the order of rearrangement inside order (more precisely, in the first component of the items). You can now use this ordering to sort your other vectors. There's probably a very clever in-place variant running in the same time, but until someone else comes up with it, here's one variant that isn't in-place. It uses order as a look-up table for the new index of each element.
template <typename T>
vector<T> sort_from_ref(
vector<T> const& in,
vector<pair<size_t, myiter> > const& reference
) {
vector<T> ret(in.size());
size_t const size = in.size();
for (size_t i = 0; i < size; ++i)
ret[i] = in[reference[i].first];
return ret;
}
typedef std::vector<int> int_vec_t;
typedef std::vector<std::string> str_vec_t;
typedef std::vector<size_t> index_vec_t;
class SequenceGen {
public:
SequenceGen (int start = 0) : current(start) { }
int operator() () { return current++; }
private:
int current;
};
class Comp{
int_vec_t& _v;
public:
Comp(int_vec_t& v) : _v(v) {}
bool operator()(size_t i, size_t j){
return _v[i] < _v[j];
}
};
index_vec_t indices(3);
std::generate(indices.begin(), indices.end(), SequenceGen(0));
//indices are {0, 1, 2}
int_vec_t Index = { 3, 1, 2 };
str_vec_t Values = { "Third", "First", "Second" };
std::sort(indices.begin(), indices.end(), Comp(Index));
//now indices are {1,2,0}
Now you can use the "indices" vector to index into "Values" vector.
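To make that last step concrete, here is a hedged sketch of building the index permutation and applying it to another vector. The helper names `sorted_indices` and `apply_indices` are made up for this illustration; it uses std::iota and a lambda in place of the generator and comparator classes above, and it returns a reordered copy rather than sorting in place.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <string>
#include <vector>

// Returns the permutation that sorts `keys` in ascending order.
std::vector<std::size_t> sorted_indices(const std::vector<int>& keys) {
    std::vector<std::size_t> indices(keys.size());
    std::iota(indices.begin(), indices.end(), std::size_t{0});  // 0, 1, 2, ...
    std::sort(indices.begin(), indices.end(),
              [&keys](std::size_t a, std::size_t b) { return keys[a] < keys[b]; });
    return indices;
}

// Builds a reordered copy: result[i] = values[indices[i]].
template <typename T>
std::vector<T> apply_indices(const std::vector<T>& values,
                             const std::vector<std::size_t>& indices) {
    assert(values.size() == indices.size());
    std::vector<T> result;
    result.reserve(values.size());
    for (std::size_t i : indices) result.push_back(values[i]);
    return result;
}
```

The same `indices` vector can be applied to any number of parallel vectors, whatever their element types.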
Put your values in a Boost Multi-Index container then iterate over to read the values in the order you want. You can even copy them to another vector if you want to.
Only one rough solution comes to my mind: create a vector that is the sum of all other vectors (a vector of structures, like {3,Third,...},{1,First,...}) then sort this vector by the first field, and then split the structures again.
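A rough sketch of that idea for the two vectors in the question might look like this. The function name `sort_together` is hypothetical, and std::pair stands in for the "structure" mentioned above; with more than two vectors a small struct or std::tuple would play the same role.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Sorts `keys` and reorders `values` the same way by zipping them into
// pairs, sorting on the key, and unzipping again.
void sort_together(std::vector<int>& keys, std::vector<std::string>& values) {
    assert(keys.size() == values.size());

    std::vector<std::pair<int, std::string>> zipped;
    zipped.reserve(keys.size());
    for (std::size_t i = 0; i < keys.size(); ++i) {
        zipped.emplace_back(keys[i], std::move(values[i]));
    }

    // stable_sort keeps the original relative order of equal keys.
    std::stable_sort(zipped.begin(), zipped.end(),
                     [](const auto& a, const auto& b) { return a.first < b.first; });

    for (std::size_t i = 0; i < zipped.size(); ++i) {
        keys[i] = zipped[i].first;
        values[i] = std::move(zipped[i].second);
    }
}
```

The cost is a temporary vector holding all the zipped elements, which is why this is only a rough solution.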
Probably there is a better solution inside Boost or using the standard library.
You can probably define a custom "facade" iterator that does what you need here. It would store iterators to all your vectors or alternatively derive the iterators for all but the first vector from the offset of the first. The tricky part is what that iterator dereferences to: think of something like boost::tuple and make clever use of boost::tie. (If you wanna extend on this idea, you can build these iterator types recursively using templates but you probably never want to write down the type of that - so you either need c++0x auto or a wrapper function for sort that takes ranges)
I think what you really need (but correct me if I'm wrong) is a way to access elements of a container in some order.
Rather than rearranging my original collection, I would borrow a concept from Database design: keep an index, ordered by a certain criterion. This index is an extra indirection that offers great flexibility.
This way it is possible to generate multiple indices according to different members of a class.
using namespace std;
template< typename Iterator, typename Comparator >
struct Index {
vector<Iterator> v;
Index( Iterator from, Iterator end, const Comparator& c ){
v.reserve( std::distance(from,end) );
for( ; from != end; ++from ){
v.push_back(from); // no deref!
}
sort( v.begin(), v.end(), c );
}
};
template< typename Iterator, typename Comparator >
Index<Iterator,Comparator> index ( Iterator from, Iterator end, const Comparator& c ){
return Index<Iterator,Comparator>(from,end,c);
}
struct mytype {
string name;
double number;
};
template< typename Iter >
struct NameLess : public binary_function<Iter, Iter, bool> {
bool operator()( const Iter& t1, const Iter& t2 ) const { return t1->name < t2->name; }
};
template< typename Iter >
struct NumLess : public binary_function<Iter, Iter, bool> {
bool operator()( const Iter& t1, const Iter& t2 ) const { return t1->number < t2->number; }
};
void indices() {
mytype v[] = { { "me" , 0.0 }
, { "you" , 1.0 }
, { "them" , -1.0 }
};
mytype* vend = v + sizeof(v) / sizeof(v[0]); // portable replacement for the MSVC-only _countof
Index<mytype*, NameLess<mytype*> > byname( v, vend, NameLess<mytype*>() );
Index<mytype*, NumLess <mytype*> > bynum ( v, vend, NumLess <mytype*>() );
assert( byname.v[0] == v+0 );
assert( byname.v[1] == v+2 );
assert( byname.v[2] == v+1 );
assert( bynum.v[0] == v+2 );
assert( bynum.v[1] == v+0 );
assert( bynum.v[2] == v+1 );
}
A slightly more compact variant of xtofl's answer, for when you are just looking to iterate through all your vectors based on the order of a single keys vector. Create a permutation vector and use it to index into your other vectors.
#include <boost/iterator/counting_iterator.hpp>
#include <vector>
#include <algorithm>
std::vector<double> keys = ...
std::vector<double> values = ...
std::vector<size_t> indices(boost::counting_iterator<size_t>(0u), boost::counting_iterator<size_t>(keys.size()));
std::sort(begin(indices), end(indices), [&](size_t lhs, size_t rhs) {
return keys[lhs] < keys[rhs];
});
// Now to iterate through the values array.
for (size_t i: indices)
{
std::cout << values[i] << std::endl;
}
ltjax's answer is a great approach - which is actually implemented in boost's zip_iterator http://www.boost.org/doc/libs/1_43_0/libs/iterator/doc/zip_iterator.html
It packages together into a tuple whatever iterators you provide it.
You can then create your own comparison function for a sort based on any combination of iterator values in your tuple. For this question, it would just be the first iterator in your tuple.
A nice feature of this approach is that it allows you to keep the memory of each individual vector contiguous (if you're using vectors and that's what you want). You also don't need to store a separate index vector of ints.
This would have been an addendum to Konrad's answer, as it is an approach for an in-place variant of applying the sort order to a vector. Since the edit won't go through, I will put it here.
Here is an in-place variant with a slightly higher time complexity, which is due to the primitive operation of checking a boolean. The additional space is that of a vector<bool>, which can be a space-efficient, compiler-dependent implementation. The vector<bool> can be eliminated if the given order itself can be modified. This is an example of what the algorithm is doing.
If the order is 3 0 4 1 2, the movement of the elements as indicated by the position indices would be 3--->0; 0--->1; 1--->3; 2--->4; 4--->2.
template<typename T>
struct applyOrderinPlace
{
void operator()(const vector<size_t>& order, vector<T>& vectoOrder)
{
vector<bool> indicator(order.size(),0);
size_t start = 0, cur = 0, next = order[cur];
size_t indx = 0;
T tmp;
while(indx < order.size())
{
//find unprocessed index
if(indicator[indx])
{
++indx;
continue;
}
start = indx;
cur = start;
next = order[cur];
tmp = vectoOrder[start];
while(next != start)
{
vectoOrder[cur] = vectoOrder[next];
indicator[cur] = true;
cur = next;
next = order[next];
}
vectoOrder[cur] = tmp;
indicator[cur] = true;
}
}
};
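A quick way to check the functor above is to run it on the example order. The sketch below restates the same cycle-following loop as a free function so it compiles standalone. The name apply_order_in_place is made up for this illustration and is not part of the original answer, and the code assumes that order is a permutation of 0..n-1.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical free-function restatement of the cycle-following idea:
// after the call, values[i] holds the element that was at values[order[i]].
// Assumes order is a permutation of 0..order.size()-1.
template <typename T>
void apply_order_in_place(const std::vector<std::size_t>& order, std::vector<T>& values)
{
    std::vector<bool> done(order.size(), false);
    for (std::size_t start = 0; start < order.size(); ++start)
    {
        if (done[start]) continue;          // already placed by an earlier cycle
        T tmp = std::move(values[start]);   // hold the first element of the cycle
        std::size_t cur = start;
        while (order[cur] != start)
        {
            values[cur] = std::move(values[order[cur]]);
            done[cur] = true;
            cur = order[cur];
        }
        values[cur] = std::move(tmp);       // close the cycle
        done[cur] = true;
    }
}
```

With the order 3 0 4 1 2 and the values a b c d e, the result is d a e b c, which matches the movements listed above.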
Here is a relatively simple implementation using index mapping between the ordered and unordered names that will be used to match the ages to the ordered names:
void ordered_pairs()
{
std::vector<std::string> names;
std::vector<int> ages;
// read input and populate the vectors
populate(names, ages);
// print input
print(names, ages);
// sort pairs
std::vector<std::string> sortedNames(names);
std::sort(sortedNames.begin(), sortedNames.end());
std::vector<int> indexMap;
for(unsigned int i = 0; i < sortedNames.size(); ++i)
{
for (unsigned int j = 0; j < names.size(); ++j)
{
if (sortedNames[i] == names[j])
{
indexMap.push_back(j);
break;
}
}
}
// use the index mapping to match the ages to the names
std::vector<int> sortedAges;
for(size_t i = 0; i < indexMap.size(); ++i)
{
sortedAges.push_back(ages[indexMap[i]]);
}
std::cout << "Ordered pairs:\n";
print(sortedNames, sortedAges);
}
For the sake of completeness, here are the functions populate() and print():
void populate(std::vector<std::string>& n, std::vector<int>& a)
{
std::string prompt("Type name and age, separated by white space; 'q' to exit.\n>>");
std::string sentinel = "q";
while (true)
{
// read input
std::cout << prompt;
std::string input;
getline(std::cin, input);
// exit input loop
if (input == sentinel)
{
break;
}
std::stringstream ss(input);
// extract input
std::string name;
int age;
if (ss >> name >> age)
{
n.push_back(name);
a.push_back(age);
}
else
{
std::cout <<"Wrong input format!\n";
}
}
}
and:
void print(const std::vector<std::string>& n, const std::vector<int>& a)
{
if (n.size() != a.size())
{
std::cerr <<"Different number of names and ages!\n";
return;
}
for (unsigned int i = 0; i < n.size(); ++i)
{
std::cout <<'(' << n[i] << ", " << a[i] << ')' << "\n";
}
}
And finally, main() becomes:
#include <iostream>
#include <sstream>
#include <string>
#include <vector>
#include <algorithm>
void ordered_pairs();
void populate(std::vector<std::string>&, std::vector<int>&);
void print(const std::vector<std::string>&, const std::vector<int>&);
//=======================================================================
int main()
{
std::cout << "\t\tSimple name - age sorting.\n";
ordered_pairs();
}
//=======================================================================
// Function Definitions...
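The nested loops that build indexMap in ordered_pairs() are quadratic, and with duplicate names every duplicate is mapped to the first matching entry, so both copies would receive the same age. A possible alternative, sketched here under the assumption that sorting a vector of indices is acceptable, avoids both issues. The helper names sorted_indices and gather are hypothetical and do not come from the answer.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <string>
#include <vector>

// Hypothetical helper: returns the indices of `keys` in sorted order,
// so that keys[result[0]] <= keys[result[1]] <= ...
std::vector<std::size_t> sorted_indices(const std::vector<std::string>& keys)
{
    std::vector<std::size_t> idx(keys.size());
    std::iota(idx.begin(), idx.end(), std::size_t{0});   // 0, 1, 2, ...
    // stable_sort keeps equal keys in their original input order
    std::stable_sort(idx.begin(), idx.end(),
        [&keys](std::size_t a, std::size_t b) { return keys[a] < keys[b]; });
    return idx;
}

// Hypothetical helper: collects values[idx[0]], values[idx[1]], ... into a new vector.
template <typename T>
std::vector<T> gather(const std::vector<T>& values, const std::vector<std::size_t>& idx)
{
    std::vector<T> out;
    out.reserve(idx.size());
    for (std::size_t i : idx) out.push_back(values[i]);
    return out;
}
```

For the names bob, alice, bob, carl with ages 30, 25, 40, 35, applying gather to both vectors with the same index vector gives alice 25, bob 30, bob 40, carl 35.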
// C++ program to demonstrate sorting in vector
// of pair according to 2nd element of pair
#include <iostream>
#include <cstdio>
#include <string>
#include <vector>
#include <algorithm>
using namespace std;
// Comparison function to sort the vector elements
// by second element of pairs
bool sortbysec(const pair<char,int> &a,
const pair<char,int> &b)
{
return (a.second < b.second);
}
int main()
{
// declaring vector of pairs
vector< pair <char, int> > vect;
// Initialising 1st and 2nd element of pairs
// with array values
//int arr[] = {10, 20, 5, 40 };
//int arr1[] = {30, 60, 20, 50};
char arr[] = { 'a', 'b', 'c' };
int arr1[] = { 4, 7, 1 };
int n = sizeof(arr)/sizeof(arr[0]);
// Entering values in vector of pairs
for (int i=0; i<n; i++)
vect.push_back( make_pair(arr[i],arr1[i]) );
// Printing the original vector(before sort())
cout << "The vector before sort operation is:\n" ;
for (int i=0; i<n; i++)
{
// "first" and "second" are used to access
// 1st and 2nd element of pair respectively
cout << vect[i].first << " "
<< vect[i].second << endl;
}
// Using sort() function to sort by 2nd element
// of pair
sort(vect.begin(), vect.end(), sortbysec);
// Printing the sorted vector(after using sort())
cout << "The vector after sort operation is:\n" ;
for (int i=0; i<n; i++)
{
// "first" and "second" are used to access
// 1st and 2nd element of pair respectively
cout << vect[i].first << " "
<< vect[i].second << endl;
}
getchar(); // wait for a key press before exiting
return 0;
}
With lambdas and the STL algorithms, based on answers from Konrad Rudolph and Gabriele D'Antona (the generic lambdas with auto parameters used here require C++14, not just C++11):
template< typename T, typename U >
std::vector<T> sortVecAByVecB( std::vector<T> & a, std::vector<U> & b ){
// zip the two vectors (A,B)
std::vector<std::pair<T,U>> zipped(a.size());
for( size_t i = 0; i < a.size(); i++ ) zipped[i] = std::make_pair( a[i], b[i] );
// sort according to B
std::sort(zipped.begin(), zipped.end(), []( auto & lop, auto & rop ) { return lop.second < rop.second; });
// extract sorted A
std::vector<T> sorted;
std::transform(zipped.begin(), zipped.end(), std::back_inserter(sorted), []( auto & pair ){ return pair.first; });
return sorted;
}
Many people have asked this question and nobody came up with a satisfactory answer. Here is a std::sort helper that lets you sort two vectors simultaneously, taking into account the values of only one vector. This solution is based on a custom RandomIt (random-access iterator) and operates directly on the original vector data, without temporary copies, structure rearrangement or additional indices:
C++, Sort One Vector Based On Another One

How to access the contents of a vector from a pointer to the vector in C++?

I have a pointer to a vector. Now, how can I read the contents of the vector through pointer?
There are many solutions; here are a few I've come up with:
int main(int nArgs, char ** vArgs)
{
vector<int> *v = new vector<int>(10);
v->at(2); //Retrieve using pointer to member
v->operator[](2); //Retrieve using pointer to operator member
v->size(); //Retrieve size
vector<int> &vr = *v; //Create a reference
vr[2]; //Normal access through reference
delete &vr; //Delete the vector through the reference. You could do the same with
//the pointer (but not both!)
}
Access it like any other pointer value:
std::vector<int>* v = new std::vector<int>();
v->push_back(0);
v->push_back(12);
v->push_back(1);
int twelve = v->at(1);
int one = (*v)[2];
// iterate it
for(std::vector<int>::const_iterator cit = v->begin(), e = v->end();
cit != e; ++cit)
{
int value = *cit;
}
// or, more perversely
for(std::size_t x = 0; x < v->size(); ++x)
{
int value = (*v)[x];
}
// Or -- with C++ 11 support
for(auto i : *v)
{
int value = i;
}
Do you have a pointer to a vector because that's how you've coded it? You may want to reconsider this and use a (possibly const) reference. For example:
#include <iostream>
#include <vector>
using namespace std;
void foo(vector<int>* a)
{
cout << a->at(0) << a->at(1) << a->at(2) << endl;
// expected result is "123"
}
int main()
{
vector<int> a;
a.push_back(1);
a.push_back(2);
a.push_back(3);
foo(&a);
}
While this is a valid program, the general C++ style is to pass a vector by reference rather than by pointer. This will be just as efficient, but then you don't have to deal with possibly null pointers and memory allocation/cleanup, etc. Use a const reference if you aren't going to modify the vector, and a non-const reference if you do need to make modifications.
Here's the references version of the above program:
#include <iostream>
#include <vector>
using namespace std;
void foo(const vector<int>& a)
{
cout << a[0] << a[1] << a[2] << endl;
// expected result is "123"
}
int main()
{
vector<int> a;
a.push_back(1);
a.push_back(2);
a.push_back(3);
foo(a);
}
As you can see, all of the information contained within a will be passed to the function foo, but it will not copy an entirely new value, since it is being passed by reference. It is therefore just as efficient as passing by pointer, and you can use it as a normal value rather than having to figure out how to use it as a pointer or having to dereference it.
vector<int> v;
v.push_back(906);
vector<int> * p = &v;
cout << (*p)[0] << endl;
You can access the iterator methods directly:
std::vector<int> *intVec; // must point to a valid vector before use
std::vector<int>::iterator it;
for( it = intVec->begin(); it != intVec->end(); ++it )
{
}
If you want the array-access operator, you'd have to de-reference the pointer. For example:
std::vector<int> *intVec; // must point to a valid vector before use
int val = (*intVec)[0];
There are a lot of solutions. For example, you can use the at() method.
*I assumed that you are looking for an equivalent to the [] operator.
vector <int> numbers {10,20,30,40};
vector <int> *ptr {nullptr};
ptr = &numbers;
for(auto num: *ptr){
cout << num << endl;
}
cout << (*ptr).at(2) << endl; // 30
cout << "-------" << endl;
cout << ptr -> at(2) << endl; // 30
The easiest way to use it as an array is the vector::data() member.
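As a minimal sketch of that suggestion: data() (available since C++11) returns a pointer to the first element of the vector's contiguous storage, so the elements can be indexed like a plain array. The function name sum_through_pointer is made up for this example.

```cpp
#include <cstddef>
#include <vector>

// Reads a vector through a pointer to it, using data() for array-style access.
// The pointer returned by data() is invalidated if the vector reallocates.
int sum_through_pointer(const std::vector<int>* v)
{
    const int* raw = v->data();   // pointer to the first element
    int total = 0;
    for (std::size_t i = 0; i < v->size(); ++i)
        total += raw[i];          // plain array indexing
    return total;
}
```

Keep in mind that raw[i] is not bounds-checked, unlike v->at(i).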

Comparing arrays of objects with arrays of fields of objects

Is there a good way to compare arr[i].A to A[i] and arr[i].B to B[i]?
int A[10], B[10];
class Foo {
public:
int A, B;
};
Foo arr[10];
I could do the following:
for (int i=0;i<10;i++) {
if (A[i] == arr[i].A) {}
if (B[i] == arr[i].B) {}
}
But this is painful, especially if there are a lot of fields: the if() conditional does the same thing over and over, so there will be a lot of code duplication. What I really want is to parametrize this somehow and call a function like test(A, arr). I guess I can solve this with #define macros, but that seems ugly.
Any suggestions?
Also, I want to avoid creating a new array of Foo objects: I don't want to create new objects that may have many fields I don't care about, and I may want to compare different subsets of fields.
If the ranges are of equal size you can use std::equal with a predicate (or a lambda):
bool CompA( int lhs, const Foo& rhs ){
return lhs == rhs.A;
}
...
// to check for equality
bool result = std::equal( A, A + 10, arr, CompA );
...
// to see where the mismatch is
std::pair< int*, Foo* > result = std::mismatch( A, A + 10, arr, CompA );
size_t index = result.first - A;
if( index < 10 ){
std::cout << "Mismatch at index " << index << " A = " << *result.first << " Foo.A = " << (*result.second).A << std::endl;
}
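The lambda form mentioned above could look roughly like this. It is only a sketch: the struct Foo with public fields is a stand-in for the asker's class, and the wrapper names matches_A and first_mismatch_A are invented for the example.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>

struct Foo { int A; int B; };   // stand-in with public fields (assumption)

// Returns true when every A[i] equals arr[i].A.
bool matches_A(const int (&A)[10], const Foo (&arr)[10])
{
    return std::equal(std::begin(A), std::end(A), std::begin(arr),
                      [](int lhs, const Foo& rhs) { return lhs == rhs.A; });
}

// Returns the index of the first mismatch, or 10 when all elements match.
std::size_t first_mismatch_A(const int (&A)[10], const Foo (&arr)[10])
{
    auto result = std::mismatch(std::begin(A), std::end(A), std::begin(arr),
                                [](int lhs, const Foo& rhs) { return lhs == rhs.A; });
    return static_cast<std::size_t>(result.first - std::begin(A));
}
```

Comparing a different field only requires changing the lambda body, for example to lhs == rhs.B.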
There are standard-library algorithms for doing operations on containers (including arrays, kinda) but using them typically produces code that's harder to read and maintain, and no shorter or more efficient, than straightforward loops.
However, it sounds as if you might want to know about pointers-to-members.
bool all_equal(int Foo::* member, const Foo * obj_array, const int * elem_array, size_t n) {
for (size_t i=0; i<n; ++i) {
if (obj_array[i].*member != elem_array[i]) return false;
}
return true;
}
...
if (all_equal(&Foo::A, arr, A, 10) && all_equal(&Foo::B, arr, B, 10)) ...
although actually you should probably generalize it:
template<typename T, typename E>
bool all_equal(E T::* member, const T* obj_array, const E* elem_array, size_t n) {
for (size_t i=0; i<n; ++i) {
if (obj_array[i].*member != elem_array[i]) return false;
}
return true;
}
(Danger: all code above is untested and may consist entirely of bugs.)
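For reference, here is a compilable restatement of the generalized pointer-to-member version. The struct Foo with public members is an assumption, since the question's class would need its fields to be accessible for any of these approaches to work.

```cpp
#include <cstddef>

struct Foo { int A; int B; };   // stand-in with public fields (assumption)

// Compares one member of each object against the matching element of a plain array.
// `member` selects the field, for example &Foo::A.
template <typename T, typename E>
bool all_equal(E T::* member, const T* obj_array, const E* elem_array, std::size_t n)
{
    for (std::size_t i = 0; i < n; ++i)
    {
        if (obj_array[i].*member != elem_array[i]) return false;
    }
    return true;
}
```

A call such as all_equal(&Foo::A, arr, A, 10) && all_equal(&Foo::B, arr, B, 10) then checks both fields without duplicating the loop, and any subset of fields can be checked by passing the corresponding member pointers.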