Best way to group string members of object in a vector - c++

I am trying to store a vector of objects and sort them by a string member possessed by each object. It doesn't need to be sorted alphabetically, it only needs to group every object with an identical string together in the vector.
IE reading through the vector and outputting the strings from beginning to end should return something like:
string_bulletSprite
string_bulletSprite
string_bulletSprite
string_playerSprite
string_enemySprite
string_enemySprite
But should NEVER return something like:
string_bulletSprite
string_playerSprite
string_bulletSprite
[etc.]
Currently I am using std:sort and a custom comparison function:
std::vector<GameObject*> worldVector;
[...]
std::sort(worldVector.begin(), worldVector.end(), compString);
And the comparison function used in the std::sort looks like this:
bool compString(GameObject* a, GameObject* b)
{
return a->getSpriteNameAndPath() < b->getSpriteNameAndPath();
}
getSpriteNameAndPath() is a simple accessor which returns a normal string.
This seems to work fine. I've stress tested this a fair bit and it seems to always group things together the way I wanted.
My question is, is this the ideal or most logical/efficient way of accomplishing the stated goal? I get the impression Sort isn't quite meant to be used this way and I'm wondering if there's a better way to do this if all I want to do is group but don't care about doing so in alphabetic order.
Or is this fine?

If you have lots of equivalent elements in your range, then std::sort is less efficient than manually sorting the elements.
You can do this by shifting the minimum elements to the beginning of the range, and then repeating this process on the remaining non-minimum elements
// given some range v
auto b = std::begin(v); // keeps track of remaining elements
while (b != std::end(v)) // while there's elements to be arranged
{
auto min = *std::min_element(b, std::end(v)); // find the minimum
// move elements matching that to the front
// and simultaneously update the remaining range
b = std::partition(b, std::end(v),
[=](auto const & i) {
return i == min;
});
}
Of course, a custom comparator can be passed to min_element, and the lambda in partition can be modified if equivalence is defined some other way.
Note that if you have very few equivalent elements, this method is much less efficient than using std::sort.
Here's a demo with a range of ints.

I hope I understood your question correctly, if so, I will give you a little example of std::map which is great for grouping things by keys, which will most probably be a std::string.
Please take a look:
class Sprite
{
public:
Sprite(/* args */)
{
}
~Sprite()
{
}
};
int main(int argc, char ** argv){
std::map <std::string, std::map<std::string, Sprite>> sprites;
std::map <std::string, Sprite> spaceships;
spaceships.insert(std::make_pair("executor", Sprite()));
spaceships.insert(std::make_pair("millennium Falcon", Sprite()));
spaceships.insert(std::make_pair("death star", Sprite()));
sprites.insert(std::make_pair("spaceships",spaceships));
std::cout << sprites["spaceships"]["executor"].~member_variable_or_function~() << std::endl;
return 0;
}

Seems like Functor or Lambda is the way to go for this particular program, but I realized some time after posting that I could just create an ID for the images and sort those instead of strings. Thanks for the help though, everyone!

Related

Constraining remove_if on only part of a C++ list

I have a C++11 list of complex elements that are defined by a structure node_info. A node_info element, in particular, contains a field time and is inserted into the list in an ordered fashion according to its time field value. That is, the list contains various node_info elements that are time ordered. I want to remove from this list all the nodes that verify some specific condition specified by coincidence_detect, which I am currently implementing as a predicate for a remove_if operation.
Since my list can be very large (order of 100k -- 10M elements), and for the way I am building my list this coincidence_detect condition is only verified by few (thousands) elements closer to the "lower" end of the list -- that is the one that contains elements whose time value is less than some t_xv, I thought that to improve speed of my code I don't need to run remove_if through the whole list, but just restrict it to all those elements in the list whose time < t_xv.
remove_if() though does not seem however to allow the user to control up to which point I can iterate through the list.
My current code.
The list elements:
struct node_info {
char *type = "x";
int ID = -1;
double time = 0.0;
bool spk = true;
};
The predicate/condition for remove_if:
// Remove all events occurring at t_event
class coincident_events {
double t_event; // Event time
bool spk; // Spike condition
public:
coincident_events(double time,bool spk_) : t_event(time), spk(spk_){}
bool operator()(node_info node_event){
return ((node_event.time==t_event)&&(node_event.spk==spk)&&(strcmp(node_event.type,"x")!=0));
}
};
The actual removing from the list:
void remove_from_list(double t_event, bool spk_){
// Remove all events occurring at t_event
coincident_events coincidence(t_event,spk_);
event_heap.remove_if(coincidence);
}
Pseudo main:
int main(){
// My list
std::list<node_info> event_heap;
...
// Populate list with elements with random time values, yet ordered in ascending order
...
remove_from_list(0.5, true);
return 1;
}
It seems that remove_if may not be ideal in this context. Should I consider instead instantiating an iterator and run an explicit for cycle as suggested for example in this post?
It seems that remove_if may not be ideal in this context. Should I consider instead instantiating an iterator and run an explicit for loop?
Yes and yes. Don't fight to use code that is preventing you from reaching your goals. Keep it simple. Loops are nothing to be ashamed of in C++.
First thing, comparing double exactly is not a good idea as you are subject to floating point errors.
You could always search the point up to where you want to do a search using lower_bound (I assume you list is properly sorted).
The you could use free function algorithm std::remove_if followed by std::erase to remove items between the iterator returned by remove_if and the one returned by lower_bound.
However, doing that you would do multiple passes in the data and you would move nodes so it would affect performance.
See also: https://en.cppreference.com/w/cpp/algorithm/remove
So in the end, it is probably preferable to do you own loop on the whole container and for each each check if it need to be removed. If not, then check if you should break out of the loop.
for (auto it = event_heap.begin(); it != event_heap.end(); )
{
if (coincidence(*it))
{
auto itErase = it;
++it;
event_heap.erase(itErase)
}
else if (it->time < t_xv)
{
++it;
}
else
{
break;
}
}
As you can see, code can easily become quite long for something that should be simple. Thus, if you need to do that kind of algorithm often, consider writing you own generic algorithm.
Also, in practice you might not need to do a complete search for the end using the first solution if you process you data in increasing time order.
Finally, you might consider using an std::set instead. It could lead to simpler and more optimized code.
Thanks. I used your comments and came up with this solution, which seemingly increases speed by a factor of 5-to-10.
void remove_from_list(double t_event,bool spk_){
coincident_events coincidence(t_event,spk_);
for(auto it=event_heap.begin();it!=event_heap.end();){
if(t_event>=it->time){
if(coincidence(*it)) {
it = event_heap.erase(it);
}
else
++it;
}
else
break;
}
}
The idea to make erase return it (as already ++it) was suggested by this other post. Note that in this implementation I am actually erasing all list elements up to t_event value (meaning, I pass whatever I want for t_xv).

insert doesn't permanently change map?

I have a member variable of this class that is set<pair<string,map<string,int> > > setLabelsWords; which is a little convoluted but bear with me. In a member function of the same class, I have the following code:
pair<map<string,int>::iterator,bool> ret;
for (auto j:setLabelsWords) {
if (j.first == label) {
for (auto k:words) {
ret = j.second.insert(make_pair(k,1));
if (ret.second == false) {
j.second[k]++;
}
}
}
}
"words" is a set of strings and "label" is a string. So basically it's supposed to insert "k" (a string) and if k is already in the map, it increments the int by 1. The problem is it works until the outermost for loop is over. If I print j.second's size right before the last bracket, it will give me a size I expect like 13, but right outside the last bracket the size goes back to 7, which is what the size of the map is initially before this code block runs. I am super confused why this is happening and any help would be much appreciated.
for (auto j:setLabelsWords) {
This iterates over the container by value. This is the same thing as if you did:
class Whatever { /* something in here */ };
void changeWhatever(Whatever w);
// ...
{
Whatever w;
changewhatever(w);
}
Whatever changewhatever does to w, whatever modifications are made, are made to a copy of w, because it gets passed by value, to the function.
In order to correctly update your container, you must iterate by reference:
for (auto &j:setLabelsWords) {
for (auto j:setLabelsWords) {
This creates a copy of each element. All the operations you perform on j affect that copy, not the original element in setLabelsWords.
Normally, you would just use a reference:
for (auto&& j:setLabelsWords) {
However, due to the nature of a std::set, this won't get you far, because a std::set's elements cannot be modified freely via iterators or references to elements, because that would allow you to create a set with duplicates in it. The compiler will not allow that.
Here's a pragmatic solution: Use a std::vector instead of a std::set:
std::vector<std::pair<std::string, std::map<std::string,int>>> setLabelsWords;
You will then be able to use the reference approach explained above.
If you need std::set's uniqueness and sorting capabilities later on, you can either apply std::sort and/or std::unique on the std::vector, or create a new std::set from the std::vector's elements.

Iterating over Unorderd_map using indexed for loop

I am trying to access values stored in an unorderd_map using a for loop, but I am stuck trying to access values using the current index of my loop. Any suggestion, or link to look-on? thanks. [Hint: I don't want to use an iterator].
my sample code:
#include <iostream>
#include <string>
#include <unordered_map>
#include <vector>
using namespace std;
int main()
{
unordered_map<int,string>hash_table;
//filling my hash table
hash_table.insert(make_pair(1,"one"));
hash_table.insert(make_pair(2,"two"));
hash_table.insert(make_pair(3,"three"));
hash_table.insert(make_pair(4,"four"));
//now, i want to access values of my hash_table with for loop, `i` as index.
//
for (int i=0;i<hash_table.size();i++ )
{
cout<<"Value at index "<<i<<" is "<<hash_table[i].second;//I want to do something like this. I don't want to use iterator!
}
return 0;
}
There are two ways to access an element from an std::unordered_map.
An iterator.
Subscript operator, using the key.
I am stuck trying to access values using the current index of my loop
As you can see, accessing an element using the index is not listed in the possible ways to access an element.
I'm sure you realize that since the map is unordered the phrase element at index i is quite meaningless in terms of ordering. It is possible to access the ith element using the begin iterator and std::advance but...
Hint: I don't want to use an iterator].
Hint: You just ran out of options. What you want to do is not possible. Solution: Start wanting to use tools that are appropriate to achieving your objective.
If you want to iterate a std::unordered_map, then you use iterators because that's what they're for. If you don't want to use iterators, then you cannot iterate an std::unordered_map. You can hide the use of iterators with a range based for loop, but they're still used behind the scenes.
If you want to iterate something using a position - index, then what you need is an array such as a std::vector.
First, why would you want to use an index versus an iterator?
Suppose you have a list of widgets you want your UI to draw. Each widget can have its own list of child widgets, stored in a map. Your options are:
Make each widget draw itself. Not ideal since widgets are now coupled to the UI kit you are using.
Return the map and use an iterator in the drawing code. Not ideal because now the drawing code knows your storage mechanism.
An API that can avoid both of these might look like this.
const Widget* Widget::EnumerateChildren(size_t* io_index) const;
You can make this work with maps but it isn't efficient. You also can't guarantee the stability of the map between calls. So this isn't recommended but it is possible.
const Widget* Widget::EnumerateChildren(size_t* io_index) const
{
auto& it = m_children.begin();
std::advance(it, *io_index);
*io_index += 1;
return it->second;
}
You don't have to use std::advance and could use a for loop to advance the iterator yourself. Not efficient or very safe.
A better solution to the scenario I described would be to copy out the values into a vector.
void Widget::GetChildren(std::vector<Widget*>* o_children) const;
You can't do it without an iterator. An unordered map could store the contents in any order and move them around as it likes. The concept of "3rd element" for example means nothing.
If you had a list of the keys from the map then you could index into that list of keys and get what you want. However unless you already have it you would need to iterate over the map to generate the list of keys so you still need an iterator.
An old question.
OK, I'm taking the risk: here may be a workaround (not perfect though: it is just a workaround).
This post is a bit long because I explain why this may be needed. In fact one might want to use the very same code to load and save data from and to a file. This is very useful to synchronize the loading and saving of data, to keep the same data, the same order, and the same types.
For example, if op is load_op or save_op:
load_save_data( var1, op );
load_save_data( var2, op );
load_save_data( var3, op );
...
load_save_data hides the things performed inside. Maintenance is thus much more easy.
The problem is when it comes to containers. For example (back to the question) it may do this for sets (source A) to save data:
int thesize = theset.size();
load_save(thesize, load); // template (member) function with 1st arg being a typename
for( elem: theset) {
load_save_data( thesize, save_op );
}
However, to read (source B):
int thesize;
load_save_data( thesize, save);
for( int i=0; i<thesize, i++) {
Elem elem;
load_save_data( elem, load_op);
theset.insert(elem);
}
So, the whole source code would be something like this, with too loops:
if(op == load_op) { A } else { B }
The problem is there are two different kinds of loop, and it would be nice to merge them as one only. Ideally, it would be nice to be able to do:
int thesize;
load_save_data( thesize, save);
for( int i=0; i<thesize, i++) {
Elem elem;
if( op == save_op ) {
elem=theset[i]; // not possible
}
load_save_data( elem, op);
if( op == load_op ) {
theset.insert(elem);
}
}
(as this code is used in different contexts, care may be taken to provide enough information to the compiler to allow it the strip the unnecessary code (the right "if"), not obvious but possible)
This way, each call to load_save_data is in the same order, the same type. You forget a field for both or none, but everything is kept synchronized between save and load. You may add a variable, change a type, change the order etc in one place only. The code maintenance is thus easier.
A solution to the impossible "theset[i]" is indeed to use a vector or a map instead of a set but you're losing the properties of a set (avoid two identical items).
So a workaround (but it has a heavy price: efficiency and simplicity) is something like:
void ...::load_save( op )
{
...
int thesize;
set<...> tmp;
load_save_data( thesize, save);
for( int i=0; i<thesize, i++) {
Elem elem;
if( op == save_op ) {
elem=*(theset.begin()); \
theset.erase(elem); > <-----
tmp.insert(elem); /
}
load_save_data( elem, op);
if( op == load_op ) {
theset.insert(elem);
}
}
if(op == save_op) {
theset.insert(tmp.begin(), tmp.end()); <-----
}
...
}
Not very beautiful but it does the trick, and (IMHO) itis the closest answer to the question.

std::map rotate method or algorithm using stl::map

iam developing a class that inside holds an std::map, by now the funcionality was optimal, but now i have a requirement to rotate the map, i mean by rotate change order in wich the map elements id besides the values corresponding to those values , by example:
Given:
Map[122]=1
Map[12]=2
Map[3]=45
applyng the rotation algorithm once:
Map[12]=2
Map[3]=45
Map[122]=1
applyng the rotation algorithm again:
Well, my first intention is write a algoritm that perform this operation, but i new in c++
Map[3]=45
Map[122]=1
Map[12]=2
Do i have a proper solution in stl libs that i cannot see by now¡?
thx
No.
The order of map elements is not something you control. It's inherent, based on sort key.
Sure, you can provide your own comparator in order to manipulate the underlying order of the container.
However, you should not be relying on order in a map. It is not a sequence container, and is simply not designed for you to use order as a property.
Instead of this "rotating", why not begin your iteration at a different place in the container each time, and "wrap-around"?
I think you might be confusing "mapping" with "storage". In a mathematical (or algorithmic) sense, if you "map" a key to a value, then that is a one to one mapping and until it has been changed, when you look up that key, you will always get that value. It doesn't matter yet how it actually works or whether whatever object is used to implement the map has been "rotated" or not. Look up a key, get the value. In your case, before or after rotation, if you look up "12" for example, you will always get 2. Do you see what I'm saying? Order here, doesn't matter. Therefore, if you use std::map from the STL, you lose control over guarantees on the order in which the elements are stored.
Now, what you're asking has to do with the implementation and in particular, with how the elements are stored, so what you need is an STL container that guarantees order. One such container is a vector. It seems to me that what you might want is probably a vector of pairs. Something like this would work:
#include <vector>
#include <map> //for std::pair
#include <iostream>
#include <algorithm> //for std::rotate
typedef std::pair<int,int> entry;
typedef std::vector<entry> storage;
void print( const char* msg, const storage& obj )
{
std::cout<<msg<<std::endl;
for(auto i : obj)
{
std::cout << i.first << "," << i.second << std::endl;
}
}
void lookup(int key, const storage& obj)
{
for(auto i : obj)
{
if( i.first == key )
{
std::cout<<"\t"<<key<<"=>"<<i.second<<std::endl;
return;
}
}
std::cout<<key<<"not found"<<std::endl;
}
int main()
{
storage mymap = {entry(122,1),entry(12,2),entry(3,45)};
print("Before rotation", mymap);
lookup(12,mymap);
std::rotate(mymap.begin(),mymap.begin()+1,mymap.end());
print("After one rotation", mymap);
lookup(12,mymap);
std::rotate(mymap.begin(),mymap.begin()+1,mymap.end());
print("After one more rotation", mymap);
lookup(12,mymap);
return 0;
}
Note, however, that because you're using a vector, it will not protect you from adding duplicate pairs or pairs with different keys but the same value and vice versa. If you want to maintain a one to one mapping, you will have to make sure that when you insert elements in, that the "key" and the "value" are not repeated anywhere else in the vector. That should be pretty easy for you to figure out after some reading on how std::vector works.
To extend Lightness's answer, which I believe is the correct one. If you wan't more control over your map you should use a static matrix instead.
Matrices provide many more rotational options using simple math, instead of the cyclical rotation you're trying to implement.

Safe To Modify std::pair<U, V>::first in vector of pairs?

I'm currently working on a DNA database class and I currently associate each row in the database with both a match score (based on edit distance) and the actual DNA sequence itself, is it safe to modify first this way within an iteration loop?
typedef std::pair<int, DnaDatabaseRow> DnaPairT;
typedef std::vector<DnaPairT> DnaDatabaseT;
// ....
for(DnaDatabaseT::iterator it = database.begin();
it != database.end(); it++)
{
int score = it->second.query(query);
it->first = score;
}
The reason I am doing this is so that I can sort them by score later. I have tried maps and received a compilation error about modifying first, but is there perhaps a better way than this to store all the information for sorting later?
To answer your first question, yes. It is perfectly safe to modify the members of your pair, since the actual data in the pair does not affect the vector itself.
edit: I have a feeling that you were getting an error when using a map because you tried to modify the first value of the map's internal pair. That would not be allowed because that value is part of the map's inner workings.
As stated by dribeas:
In maps you cannot change first as it would break the invariant of the map being a sorted balanced tree
edit: To answer your second question, I see nothing at all wrong with the way you are structuring the data, but I would have the database hold pointers to DnaPairT objects, instead of the objects themselves. This would dramatically reduce the amount of memory that gets copied around during the sort procedure.
#include <vector>
#include <utility>
#include <algorithm>
typedef std::pair<int, DnaDatabaseRow> DnaPairT;
typedef std::vector<DnaPairT *> DnaDatabaseT;
// ...
// your scoring code, modified to use pointers
void calculateScoresForQuery(DnaDatabaseT& database, queryT& query)
{
for(DnaDatabaseT::iterator it = database.begin(); it != database.end(); it++)
{
int score = (*it)->second.query(query);
(*it)->first = score;
}
}
// custom sorting function to handle DnaPairT pointers
bool sortByScore(DnaPairT * A, DnaPairT * B) { return (A->first < B->first); }
// function to sort the database
void sortDatabaseByScore(DnaDatabaseT& database)
{
sort(database.begin(), database.end(), sortByScore);
}
// main
int main()
{
DnaDatabaseT database;
// code to load the database with DnaPairT pointers ...
calculateScoresForQuery(database, query);
sortDatabaseByScore(database);
// code that uses the sorted database ...
}
The only reason you might need to look into more efficient methods is if your database is so enormous that the sorting loop takes too long to complete. If that is the case, though, I would imagine that your query function would be the one taking up most of the processing time.
You can't modify since the variable first of std::pair is defined const