Sorting a structure of arrays [duplicate] - c++

I have some data laid out as follows:
size_t num_elements = //...
some_type_t *data = //...
int *scores = //...
Each element data[i] has a corresponding score in scores[i]. I would like to sort both data and scores, using the array of scores to order the data.
For example, for the data:
data = {'d', 'g', 'i', 'a', 'p'}
scores = {3, 5, 1, 2, 4}
the sorted version would be
data = {'i', 'a', 'd', 'p', 'g'}
scores = {1, 2, 3, 4, 5}
Is there a way to do this with the C++ standard library?
I would prefer not to need to include Boost or libraries that have not yet been standardised.
I would also like to avoid unnecessarily copying the data. This includes converting it to an array of structures.

Assuming there isn't a reason to NOT combine the two data arrays, the simplest answer would be to merge them into a struct or a class (likely the former) with an overloaded operator. Then you can define an array of these structures/classes that will bind the data together so the data and score are moved together.
struct ScoredData
{
    some_type_t data;
    int score;
    // const, so that const objects can be compared as well
    bool operator<(const ScoredData& right) const
    {
        return this->score < right.score;
    }
};
(This example could be extended by making some_type_t a template parameter)
If combining these in this way is not acceptable, you may find success defining an iterator that mimics this behavior.

One other way to do this is to create an array of indexes (initialize it with 0,1,2,3...) and sort it using the score[a] < score[b] comparison.
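Then you need to rearrange the scores and the data arrays according to the indexes. For example, if indexes[0] = 3 after sorting, the element originally at position 3 belongs at position 0, so the new score[0] is the old score[3] and the new data[0] is the old data[3]. Take care not to overwrite the old score[0] and data[0] before they have been moved to their own new positions.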
I do not see a way to avoid copying when you rearrange, unfortunately.
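The index approach described above can be sketched as follows. This is a minimal sketch, not a definitive implementation: the function name sort_by_scores and the cycle-following rearrangement are assumptions chosen for illustration. It moves each element at most twice and allocates only the index vector and a vector of flags.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical helper (name not from the original post): sorts `data` and
// `scores`, both of length n, in place by ascending score.
template <typename T>
void sort_by_scores(T* data, int* scores, std::size_t n) {
    // order[i] = original index of the element that belongs at position i
    std::vector<std::size_t> order(n);
    std::iota(order.begin(), order.end(), std::size_t{0});
    std::sort(order.begin(), order.end(),
              [scores](std::size_t a, std::size_t b) { return scores[a] < scores[b]; });

    // Apply the permutation one cycle at a time, so no full copy of the
    // arrays is needed.
    std::vector<bool> done(n, false);
    for (std::size_t start = 0; start < n; ++start) {
        if (done[start]) continue;
        T saved_data = std::move(data[start]);
        int saved_score = scores[start];
        std::size_t dst = start;
        std::size_t src = order[dst];
        while (src != start) {
            data[dst] = std::move(data[src]);
            scores[dst] = scores[src];
            done[dst] = true;
            dst = src;
            src = order[dst];
        }
        // Close the cycle with the element saved at its start
        data[dst] = std::move(saved_data);
        scores[dst] = saved_score;
        done[dst] = true;
    }
}
```

Calling sort_by_scores(data, scores, num_elements) on the question's example arrays should leave data as {'i', 'a', 'd', 'p', 'g'} and scores as {1, 2, 3, 4, 5}.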

Can you group the elements into a new array of pairs, sort it with sort, comparing on the score member of each pair, and then update the initial arrays from the sorted array of pairs?
Better:
Create a vector of pairs, each holding a score value and an index: 0 for the first pair, 1 for the second, 2 for the third, and so on. That way you do not copy the data.
Sort that vector of pairs by the score part. Afterwards, the new order of the indexes in the pairs tells you how to rearrange scores and data.

Related

Sort merged array consisting of sorted arrays

I got two sorted arrays e.g. (3,4,5) and (1,3,7,8) and I got the combined sorted array (3,4,5,1,3,7,8).
Now I would like to sort the already combined array, without splitting it, but by overwriting it, by making use of the fact that it consists of 2 arrays which had already been sorted. Is there any way of doing this efficiently? I know there are a lot of threads about how to do this, by iterating through the sorted arrays and then putting the values into the new array accordingly, but I haven't seen this type of question anywhere yet. I would like to do this in C, but any help / pseudocode would be very kindly appreciated. Thanks!
Edit: The function which would do the sorting would only be given the combined array and (maybe) the lengths of the other two arrays if needed.
If you already have the original sorted arrays, the combined array (note it is not sorted) doesn't really help, except in that your destination storage is already allocated.
There's a well-known and very simple algorithm for merging two sorted ranges, but you can just use std::merge instead of coding it yourself.
Note that this only works for non-overlapping input and output ranges. For your amended question, use std::inplace_merge, with the middle iterator set to the first element of your second sequence:
#include <algorithm>
#include <cstddef>

// Merges the sorted runs [0, first) and [first, total) in place
void sort_combined(int *array, size_t total, size_t first) {
    std::inplace_merge(array, array + first, array + total);
}
// and use it like
int combined[] = {3, 4, 5, 1, 3, 7, 8};
const size_t first = 3;
const size_t second = 4; // length of the second run; not needed by the call
const size_t total = 7; // == sizeof(combined)/sizeof(*combined)
sort_combined(combined, total, first);
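For comparison, when the two original sorted arrays are still available and a separate destination is acceptable, the std::merge route mentioned above might look like the sketch below. The helper name merge_sorted and the use of std::vector are assumptions made for illustration.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Hypothetical helper: merges two already sorted sequences into a new vector.
// The output range must not overlap either input range.
std::vector<int> merge_sorted(const std::vector<int>& a, const std::vector<int>& b) {
    std::vector<int> out;
    out.reserve(a.size() + b.size());
    std::merge(a.begin(), a.end(), b.begin(), b.end(), std::back_inserter(out));
    return out;
}
```

With the inputs {3, 4, 5} and {1, 3, 7, 8} from the question, this should return {1, 3, 3, 4, 5, 7, 8}.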

How to sort a very large vector of user defined type

I want to sort a large collection of pixels.
typedef char HexGetal;
typedef unsigned int NatuurlijkGetal;
struct Pixel{
HexGetal Blue;
HexGetal Green;
HexGetal Red;
};
struct Palet{
Pixel Kleur;
NatuurlijkGetal Aantal;
};
vector <Palet> MyContainer;
NatuurlijkGetal Seeds[10]={1, 25, 55, 7, 3, 149, 6, 7, 1, 55};
Palet LoopPalet;
LoopPalet.Kleur.Blue = 0;
LoopPalet.Kleur.Green = 0;
LoopPalet.Kleur.Red = 0;
for(NatuurlijkGetal Looper = 0; Looper < 10;Looper++)
{
LoopPalet.Aantal = Seeds[Looper];
MyContainer.push_back(LoopPalet);
}
After creating the type "Palet", I create a vector of Palets called "MyContainer" and initialize it.
Now I want to sort it, based on the field "Aantal".
How do I do that? I am probably looking for 2 different ways.
PART 1:
I want to learn the general way to do this when the vector is small. I have never sorted a vector. I have read a lot about it and watched videos, but I'm just not getting it.
PART 2:
This vector is going to have more than 1 million elements later when used, so maybe a smarter approach is needed to limit the number of copy operations.
Thanks in advance.
struct mycomp
{
    // Returns true when p1 should be ordered before p2
    bool operator() (const Palet& p1, const Palet& p2) const
    {
        return (p1.Aantal < p2.Aantal); // Change the operator as required
    }
};
std::sort(MyContainer.begin(), MyContainer.end(), mycomp());
Just use std::sort. I've used it in programs which ran close to the 2GB process limit. "A million" elements perhaps sounds like a lot, but at 8 bytes each that's still only 8 MB. It might even fit in cache.
I'd personally use std::sort or similar, but I wouldn't be sorting huge elements directly in a std::vector. I would sort references to my big elements with the comparator. Each reference would be a simple container with a smart pointer to a big element.
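A minimal sketch of sorting references instead of the elements themselves is shown below. It uses plain non-owning pointers rather than the smart pointers mentioned above, and the helper name sorted_view is an assumption for illustration. The type definitions are repeated from the question so the example stands alone. For a type as small as Palet this is unlikely to save anything, since a pointer is about as large as the element; the idea pays off only when elements are expensive to copy.

```cpp
#include <algorithm>
#include <vector>

// Types repeated from the question
typedef char HexGetal;
typedef unsigned int NatuurlijkGetal;
struct Pixel { HexGetal Blue; HexGetal Green; HexGetal Red; };
struct Palet { Pixel Kleur; NatuurlijkGetal Aantal; };

// Hypothetical helper: returns pointers to the elements of `items`, ordered
// by ascending Aantal. The elements themselves are never moved. The pointers
// become invalid if `items` is resized or destroyed.
std::vector<const Palet*> sorted_view(const std::vector<Palet>& items) {
    std::vector<const Palet*> view;
    view.reserve(items.size());
    for (const Palet& p : items) {
        view.push_back(&p);
    }
    std::sort(view.begin(), view.end(),
              [](const Palet* a, const Palet* b) { return a->Aantal < b->Aantal; });
    return view;
}
```

Iterating over the result of sorted_view(MyContainer) then visits the elements in sorted order while MyContainer keeps its original layout.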
You could also consider a container that keeps its elements sorted automatically, such as std::map (or std::multimap here, since several elements can share the same Aantal).

C++ finding doubles in list

I have to find out whether there are duplicates in my list<SnakePart> and set alive to false if there are.
I tried the unique() function of the list and added an operator==() to my class.
Now when I execute the unique function, it doesn't filter out the duplicates. After some debugging I found out that the == comparator only gets executed as many times as there are objects in my list. I used the following code:
list<SnakePart> uniquelist = m_snakeParts;
uniquelist.unique();
if (m_snakeParts.size() != uniquelist.size()){
alive = false;
}
operator:
bool SnakePart::operator==(const SnakePart& snakePart) const{
return (x == snakePart.x && y == snakePart.y );
}
But that doesn't work. So what am I doing wrong, or is there another way I could do this?
std::list::unique removes only consecutive duplicates. For example, calling unique on {1, 2, 2, 1} gives {1, 2, 1}. You could sort the list first (N * log(N) + N complexity; this needs an ordering such as operator<), or use std::map to count every element in the list (O(N log N), or linear on average with std::unordered_map and a hash function, plus up to N extra memory in the worst case).
Notice that an element is only removed from the list container if it compares equal to the element immediately preceding it. Thus, this function is especially useful for sorted lists.
So you'll have to either sort your list beforehand, or use an std::set (sets by nature can't contain duplicate objects).
If using a std::list is not a requirement then I would suggest using std::set which won't allow you to insert an element that's already in the set. Moreover, the insert method will let you know if the element you are trying to insert is already in the set or not via its return value.
If using a std::list is a requirement, then I would suggest you to use std::unique algorithm to weed out the duplicates. Please have a look at the example in there.
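The std::set suggestion can be sketched as below, relying on the return value of insert to detect a repeated position. The SnakePart struct here is a hypothetical stand-in with public x and y members, and has_duplicates is a name chosen for illustration; the real class from the question may look different.

```cpp
#include <list>
#include <set>
#include <utility>

// Hypothetical stand-in for the question's SnakePart class
struct SnakePart {
    int x;
    int y;
};

// Returns true if any two parts share the same (x, y) position, whether or
// not they are adjacent in the list.
bool has_duplicates(const std::list<SnakePart>& parts) {
    std::set<std::pair<int, int>> seen;
    for (const SnakePart& part : parts) {
        // insert() returns a pair; its .second is false when an equal key
        // was already in the set
        if (!seen.insert(std::make_pair(part.x, part.y)).second) {
            return true;
        }
    }
    return false;
}
```

In the question's code this would replace the copy-and-unique step, for example: if (has_duplicates(m_snakeParts)) alive = false;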

How to uniquely characterize an array in c++?

Is it possible to characterize an integer array in C++? Once characterized, arrays containing same set of elements will have same characteristics.
I was thinking on lines of hashcode, each hashcode will uniquely identify an array!
For example ary[]={4,5,3,2,4} and ary_two[]={4,4,2,3,5} should both have same characteristics/ hashcode!
I am trying to solve this question( asked in an interview ): A number of variable sized arrays are being generated. For each array determine if we have encountered an array before containing the same elements as this array!
Investigate std::hash. You can specialize it to do what you want. For instance, if you want the arrays with values {4, 5, 3, 2, 4} and {4, 4, 2, 3, 5} to hash to the same value, you could specialize it like this:
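#include <array>
#include <functional>
#include <numeric>
// Specializations of std::hash must be declared in namespace std
namespace std {
template<> struct hash<std::array<int, 5>>
{
    size_t operator()(const std::array<int, 5> &ary) const
    {
        // The sum ignores element order, so permutations hash alike
        return std::accumulate(std::begin(ary), std::end(ary), 0U) * 16777619;
    }
};
}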
One possible solution would be to use the elements' hashes themselves (assuming the content of the array is hashable). Then just fold them together with some suitable function (e.g. an xor or, better yet, +). Make sure the folding function is commutative and associative, or the order of the array will make a difference.
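A minimal sketch of this folding idea follows, using + as the commutative and associative combiner. The function names multiset_hash and same_elements and the small bit-mixing step are assumptions for illustration. Since different arrays can still collide on the same hash, the sketch also includes an exact check on sorted copies to confirm a match.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical helper: a hash that does not depend on element order.
// Equal multisets of values always produce equal hashes; unequal ones
// may still collide.
std::size_t multiset_hash(const std::vector<int>& values) {
    std::hash<int> element_hash;
    std::size_t result = 0;
    for (int v : values) {
        std::size_t h = element_hash(v);
        // Illustrative mixing step, so the result is not a plain sum
        h ^= h >> 16;
        h *= 0x45d9f3bu;
        h ^= h >> 16;
        result += h; // + is commutative and associative
    }
    return result;
}

// Exact check to run when two hashes match: compares sorted copies.
bool same_elements(std::vector<int> a, std::vector<int> b) {
    std::sort(a.begin(), a.end());
    std::sort(b.begin(), b.end());
    return a == b;
}
```

For the interview problem, one option is to store each array seen so far under its multiset_hash value and call same_elements only on arrays whose hashes are equal.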

c++ Sorting a vector based on values of other vector, or what's faster?

There are a couple of other posts about sorting a vector A based on values in another vector B. Most of the other answers say to create a struct or a class to combine the values into one object and use std::sort.
Though I'm curious about the performance of such solutions as I need to optimize code which implements bubble sort to sort these two vectors. I'm thinking to use a vector<pair<int,int>> and sort that.
I'm working on a blob-tracking application (image analysis) where I try to match previously tracked blobs against newly detected blobs in video frames where I check each of the frames against a couple of previously tracked frames and of course the blobs I found in previous frames. I'm doing this at 60 times per second (speed of my webcam).
Any advice on optimizing this is appreciated. The code I'm trying to optimize can be found here:
http://code.google.com/p/projectknave/source/browse/trunk/knaveAddons/ofxBlobTracker/ofCvBlobTracker.cpp?spec=svn313&r=313
Important: I forgot to mention that the size of the vectors will never be bigger than 5. They will mostly have only 3 items in them and will be unsorted (maybe I could even hardcode it for 3 items?)
Thanks
C++ provides lots of options for sorting, from the std::sort algorithm to sorted containers like std::map and std::set. You should always try to use these as your first solution, and only try things like "optimised bubble sorts" as a last resort.
I implemented this a while ago. Also, I think you mean ordering a vector B in the same way as the sorted values of A.
Index contains the sorting order of data.
#include <algorithm>
#include <utility>
#include <vector>

/** Sorts a vector and returns index of the sorted values
 * \param Index Contains the index of sorted values in the original vector
 * \param data The vector to be sorted (left unmodified)
 */
template<class T>
void paired_sort(std::vector<unsigned int> & Index, const std::vector<T> & data)
{
    // A vector of a pair which will contain the sorted value and its index in the original array
    std::vector<std::pair<T,unsigned int>> IndexedPair;
    IndexedPair.resize(data.size());
    for(unsigned int i=0;i<IndexedPair.size();++i)
    {
        IndexedPair[i].first = data[i];
        IndexedPair[i].second = i;
    }
    // Pairs compare by value first, then by original index
    std::sort(IndexedPair.begin(),IndexedPair.end());
    Index.resize(data.size());
    for(size_t i = 0; i < Index.size(); ++i) Index[i] = IndexedPair[i].second;
}
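Once paired_sort has filled Index from vector A, the second vector B can be put into the same order. The sketch below shows one way to do that; the helper name apply_index is an assumption, and it returns a reordered copy rather than rearranging in place.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: returns the elements of `values` in the order given
// by `Index`, i.e. result[i] = values[Index[i]]. Assumes every entry of
// Index is a valid position in `values`.
template<class T>
std::vector<T> apply_index(const std::vector<unsigned int>& Index,
                           const std::vector<T>& values)
{
    std::vector<T> reordered;
    reordered.reserve(Index.size());
    for (std::size_t i = 0; i < Index.size(); ++i) {
        reordered.push_back(values[Index[i]]);
    }
    return reordered;
}
```

For example, for A = {30, 10, 20} the sorting order is {1, 2, 0}, so applying that index to B = {'x', 'y', 'z'} should give {'y', 'z', 'x'}.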