How to insert unique items into vector? - c++

I have a type called Neighbors:
typedef vector<pair<data,int>> Neighbors;
and here's data:
struct data
{
int par[PARAMETERS];
int cluster;
bool visited;
bool noise;
};
I'm trying to write a function that inserts values from _NeighborPts to NeighborPts (but only ones that aren't already in NeighborPts):
void insert_unique(Neighbors* NeighborPts, const Neighbors& _NeighborPts)
{
    Neighbors_const_it _it = _NeighborPts.begin();
    while(_it != _NeighborPts.end())
    {
        if(/* _it->first.par isn't in *NeighborPts */)
            NeighborPts->push_back(*_it);
        ++_it;
    }
}
I already have a function equal() which checks if 2 pars are equal.
So do I have to iterate through NeighborPts in a while loop and check if the item is found? Or could I use some built-in find or find_if function to do that for me?

You can maintain a sorted vector. Use the lower_bound function from C++ algorithms to locate the insert position each time. If the element at the insert position is equal to the insert element then you have a duplicate.
The performance of this will be pretty good unless the vector grows too large, because every insertion into the middle of a vector shifts the elements after it. The point at which you're better off using a set or an unordered_set varies and you'd need to benchmark to find it.
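A minimal sketch of the sorted-vector idea, assuming PARAMETERS is 3 and that ordering the par arrays lexicographically is acceptable. The comparator par_less and the parameter names target and source are hypothetical and do not come from the question.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

const int PARAMETERS = 3;  // assumed value; the question does not state it

struct data
{
    int par[PARAMETERS];
    int cluster;
    bool visited;
    bool noise;
};

typedef std::vector<std::pair<data, int>> Neighbors;

// Hypothetical strict weak ordering that looks at the par arrays only.
bool par_less(const std::pair<data, int>& a, const std::pair<data, int>& b)
{
    return std::lexicographical_compare(a.first.par, a.first.par + PARAMETERS,
                                        b.first.par, b.first.par + PARAMETERS);
}

// Precondition: *target is already sorted by par_less (an empty vector is).
void insert_unique(Neighbors* target, const Neighbors& source)
{
    assert(std::is_sorted(target->begin(), target->end(), par_less));
    for (const auto& item : source)
    {
        // First element that is not less than item.
        auto pos = std::lower_bound(target->begin(), target->end(), item, par_less);
        // item is new only if it is also less than *pos (or pos is the end).
        if (pos == target->end() || par_less(item, *pos))
            target->insert(pos, item);
    }
}
```

Each lookup is O(log N), but the insert itself may still move up to N elements, which is why this tends to pay off for small or medium vectors.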

Your current solution with vector will run in O(N^2) time, which is not efficient.
For an efficient solution, an associative container such as std::set works well.
You will also need an "operator less" (instead of "equal()") to pass to the container:
template < class T,                      // set::key_type/value_type
           class Compare = less<T>,      // set::key_compare/value_compare
           class Alloc = allocator<T>    // set::allocator_type
           > class set;
So you need to provide a compare class:
struct data_compare {
    bool operator() (const data& lhs, const data& rhs) const {
        //...
    }
};
std::set<data, data_compare> existing_items;
Alternatively, you can overload "operator <" for struct data.
Insert all "data" from "_NeighborPts" into a set, which takes O(N*log(N)) time:
std::set<data, data_compare> other_items;
// in a loop: iterate _NeighborPts and insert the data elements
other_items.insert(_NeighborPts[i].first);
Do the same for NeighborPts:
std::set<data, data_compare> my_items;
// in a loop: iterate NeighborPts and insert the data elements
my_items.insert((*NeighborPts)[i].first);
Now you need to compare the 2 sets.
You can do it using std::set_difference, or with a simple loop over the set "other_items":
if the current element in other_items isn't in my_items, insert it into "NeighborPts".
This solution will run in O(N*log(N)) time.
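The set-based approach can be condensed into a single "seen" set, sketched below. This is an illustration under assumptions: PARAMETERS is taken to be 3, data_compare orders the par arrays lexicographically, and the parameter names target and source are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <utility>
#include <vector>

const int PARAMETERS = 3;  // assumed value; the question does not state it

struct data
{
    int par[PARAMETERS];
    int cluster;
    bool visited;
    bool noise;
};

typedef std::vector<std::pair<data, int>> Neighbors;

// One possible body for the comparator left open in the answer.
struct data_compare
{
    bool operator()(const data& lhs, const data& rhs) const
    {
        return std::lexicographical_compare(lhs.par, lhs.par + PARAMETERS,
                                            rhs.par, rhs.par + PARAMETERS);
    }
};

void insert_unique(Neighbors* target, const Neighbors& source)
{
    assert(target != nullptr);
    // Remember every par combination that is already in the target.
    std::set<data, data_compare> seen;
    for (const auto& item : *target)
        seen.insert(item.first);
    // set::insert reports through .second whether the key was new.
    for (const auto& item : source)
        if (seen.insert(item.first).second)
            target->push_back(item);
}
```

Unlike the sorted-vector variant, this keeps the original order of the target vector and appends new items in the order they appear in the source.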

There is no getting around iterating over the items in _NeighborPts.
As long as you are using an std::vector, there is no getting around the check to determine whether an item is in NeighborPts before inserting in it.
You can make the code a little bit easier to read by using std::for_each and a functor.
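struct UniqueItemInserter
{
    UniqueItemInserter(Neighbors* neighborsIn) : neighbors(neighborsIn) {}
    void operator()(pair<data,int> const& item)
    {
        // Insert only when the item is NOT found. std::find compares with
        // operator==, so data needs an operator== (e.g. one built on equal()).
        if ( std::find(neighbors->begin(), neighbors->end(), item) == neighbors->end() )
        {
            neighbors->push_back(item);
        }
    }
    Neighbors* neighbors;
};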
void insert_unique(Neighbors* NeighborPts, const Neighbors& _NeighborPts)
{
std::for_each(_NeighborPts.begin(), _NeighborPts.end(), UniqueItemInserter(NeighborPts));
}
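If defining operator== for data is not desirable, std::find_if with a predicate built on the asker's equal() is an alternative. The sketch below is only illustrative: the real signature of equal() is not shown in the question, so a stand-in taking two int pointers is assumed, as is PARAMETERS being 3. The parameter names target and source are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

const int PARAMETERS = 3;  // assumed value; the question does not state it

struct data
{
    int par[PARAMETERS];
    int cluster;
    bool visited;
    bool noise;
};

typedef std::vector<std::pair<data, int>> Neighbors;

// Stand-in for the asker's equal(); the real signature is an assumption.
bool equal(const int* lhs, const int* rhs)
{
    return std::equal(lhs, lhs + PARAMETERS, rhs);
}

void insert_unique(Neighbors* target, const Neighbors& source)
{
    assert(target != nullptr);
    for (const auto& item : source)
    {
        // True for an existing element whose par matches the candidate's.
        auto same_par = [&item](const std::pair<data, int>& existing) {
            return equal(existing.first.par, item.first.par);
        };
        if (std::find_if(target->begin(), target->end(), same_par) == target->end())
            target->push_back(item);
    }
}
```

This is still a linear search per inserted item, so the overall cost stays quadratic; it only removes the hand-written inner loop.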

Related

Is there a C++ container for unique values that supports strict size checking?

I'm looking for a C++ container to store pointers to objects which also meets the following requirements.
A container that keeps the order of elements (sequence container, so std::set is not suitable)
A container that has a member function which return the actual size (As std::array::size() always returns the fixed size, std::array is not suitable)
A container that supports random accesses such as operator [].
This is my code snippet and I'd like to remove the assertions used for checking size and uniqueness of elements.
#include <cassert>
#include <cstdint>
#include <set>
#include <vector>
class Foo {
public:
    void DoSomething() {
    }
};
int main() {
    // a variable used to check whether a container is properly assigned
    const uint8_t size_ = 2;
    Foo foo1;
    Foo foo2;
    // Needs a kind of sequential containers to keep the order
    // used std::vector instead of std::array to use member function size()
    const std::vector<Foo*> vec = {
        &foo1,
        &foo2
    };
    std::set<Foo*> set_(vec.begin(), vec.end());
    assert(vec.size() == size_); // size checking against pre-defined value
    assert(vec.size() == set_.size()); // check for elements uniqueness
    // Needs to access elements using [] operator
    for (auto i = 0; i < size_; i++) {
        vec[i]->DoSomething();
    }
    return 0;
}
Is there a C++ container which doesn't need two assertions used in my code snippet? Or should I need to make my own class which encapsulates one of STL containers?
So a class that acts like a vector except if you insert, it rejects duplicates like a set or a map.
One option might be the Boost.Bimap with indices of T* and sequence_index.
Your vector-like indexing would be via the sequence_index. You might even be willing to live with holes in the sequence after an element is erased.
Sticking with the STL, you could implement a bidirectional map using 2 maps, or the following uses a map and a vector:
Note that by inheriting from vector I get all the vector methods for free, but I also risk the user downcasting to the vector.
One way round that without remodelling with a wrapper (a la queue vs list) is to make it protected inheritance and then explicitly using all the methods back to public. This is actually safer as it ensures you haven't inadvertently left some vector modification method live that would take the two containers out of step.
Note also that you would need to roll your own initializer_list constructor if you wanted one to filter out any duplicates. And you would have to do a bit of work to get this thread-safe.
#include <cstddef>
#include <map>
#include <utility>
#include <vector>
template <class T>
class uniqvec : public std::vector<T*>
{
private:
    typedef typename std::vector<T*> Base;
    // These private names hide the vector members of the same name.
    enum {push_back, pop_back, emplace_back, emplace}; //add anything else you don't like from vector
    std::map <T*, size_t> uniquifier;
public:
    std::pair<typename Base::iterator, bool> insert(T* t)
    {
        auto rv1 = uniquifier.insert(std::make_pair(t, Base::size()));
        if (rv1.second)
        {
            Base::push_back(t);
        }
        // rv1.first is a map iterator, so the stored index is rv1.first->second
        return std::make_pair(Base::begin()+rv1.first->second, rv1.second);
    }
    void erase(T* t)
    {
        auto found = uniquifier.find(t);
        if (found != uniquifier.end())
        {
            auto index = found->second;
            uniquifier.erase(found);
            Base::erase(Base::begin()+index);
            // every element behind the erased one moved down by one position
            for (auto& u : uniquifier)
                if (u.second > index)
                    u.second--;
        }
    }
    // Note that c++11 returns the next safe iterator,
    // but I don't know if that should be in vector order or set order.
    void erase(typename Base::iterator i)
    {
        erase(*i);
    }
};
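The protected-inheritance variant mentioned above might look roughly like the following. It is a reduced sketch, not a full replacement: the class name protected_uniqvec is hypothetical, only insertion and read access are shown, and the accessors are written by hand rather than re-exported with using so that callers cannot overwrite an element through a mutable reference.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

template <class T>
class protected_uniqvec : protected std::vector<T*>
{
    typedef std::vector<T*> Base;
    std::map<T*, std::size_t> index_of;  // pointer -> position in the vector
public:
    // Re-export only members that cannot take the two containers out of step.
    using Base::size;
    using Base::empty;

    // Read-only element access; returns the pointer by value.
    T* operator[](std::size_t i) const { return Base::operator[](i); }
    typename Base::const_iterator begin() const { return Base::begin(); }
    typename Base::const_iterator end() const { return Base::end(); }

    // Appends t unless it is already present; returns true if it was appended.
    bool insert(T* t)
    {
        assert(index_of.size() == Base::size());  // invariant: containers in step
        if (!index_of.insert(std::make_pair(t, Base::size())).second)
            return false;
        Base::push_back(t);
        return true;
    }
};
```

Because the base is protected, code outside the class cannot convert a protected_uniqvec to the underlying std::vector and call push_back on it directly.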
As others have mentioned, your particular question seems like an XY problem (you are down in the weeds about a particular solution instead of focusing on the original problem). There was an extremely useful flowchart provided a number of years ago (credit to @MikaelPersson) that will help you choose a particular STL container to best fit your needs. You can find it under the original question, "In which scenario do I use a particular STL container?".

Sorted data structure that allows duplicates on sort key, but replaces duplicates on another key in sub linear time

This question is language-agnostic, but I'm specifically looking for a solution using C++ STL containers. I have a struct like this.
struct User {
    int query_count;
    std::string user_id;
};
std::multiset<User> users; //currently using
I use a multiset with a comparator that sorts on query_count. This allows me to keep multiple entries with the same query_count in sorted order. Now, if I want to avoid duplicates on user_id, I need to scan the data, remove the entry and create a new one, which takes O(n). I'm trying to think of a way to do this in sub-linear time. I was thinking of a solution based on a map ordered on user_id, but then I would have to scan the whole data when trying to locate the largest query_count.
EDIT: requirements are insert, delete, update(delete/insert), get highest query_count, find user_id in sub-linear time.
I prefer to use the standard stl containers, but simple modifications are fine. Is there any way to achieve my requirements?
Summary:
The summary of the answers is that, for an out-of-the-box solution, I can use a Boost bidirectional map.
If I'm sticking to STL, then it has to be a combination of using two maps together, updating both carefully, for each insertion of a user.
This sounds like a job for boost's multi_index: http://www.boost.org/doc/libs/1_57_0/libs/multi_index/doc/tutorial/
You can set one index based on the user id to easily prevent duplicates (you insert based on this), and then another sorted index on the query count to easily locate the max.
multi_index from boost is the way to go.
But if you want to build your own data structure from basic STL containers, then I suggest you create a class which has two containers internally.
Keep an iterator into the SortedContainer in the map, so that you can access and delete the entry in O(1) (same as the lookup in unordered_map).
#include <set>
#include <string>
#include <unordered_map>
#include <utility>
struct User {
    int query_count;
    std::string user_id;
};
class UserQueryCountSomething
{
    // Keeps the query_count values sorted. A std::multiset is used here in place of
    // the std::list of the first draft, so that a sorted insert is O(log N).
    typedef std::multiset<int> SortedContainer;
    // A pair of the User struct and its iterator in the sorted container.
    typedef std::pair<User, SortedContainer::iterator> UserPosition_T;
    // Keeps the User struct and the iterator, against the user_id.
    typedef std::unordered_map<std::string, UserPosition_T> Map_T;
    SortedContainer sortedQueryCount;
    Map_T map_;
public:
    void Insert(const User& u)
    {
        Delete(u.user_id); // replace an existing entry for this user_id, if any
        SortedContainer::iterator pos = sortedQueryCount.insert(u.query_count);
        map_.emplace(u.user_id, UserPosition_T(u, pos));
    }
    int getHighestQueryCount() const
    {
        // Precondition: at least one user is stored.
        return *sortedQueryCount.rbegin();
    }
    void Delete(const std::string& user_id)
    {
        // Find in the map, then use the stored iterator to delete from sortedQueryCount.
        Map_T::iterator found = map_.find(user_id);
        if (found == map_.end())
            return;
        sortedQueryCount.erase(found->second.second);
        map_.erase(found);
    }
};
This can be a starting point for you. Let me know if you need more details.
If we just need the highest count, and not other ranks of count, then one approach is to track it explicitly. We may do that as
unordered_map<UserId, QueryCount> user_count_map;
int max_query_count;
Unfortunately, in some operations, e.g. when the user with the max query count is removed, the max value needs to be recomputed. Note that removing any other user, whose query count is not the maximum, does not require recomputing max_query_count. The recomputation, when done, is O(N), which does not meet the "sub-linear" requirement. That may be good enough for many use cases, because the user with the maximum query count may not be removed frequently.
However, if we absolutely want to avoid the O(N) recomputation, then we may introduce another container,
multimap<QueryCount, UserId> count_user_map;
to map a specific query count to a collection of users.
In this approach, any mutating operation (add, remove, update) may need to update both containers. That is a bit of a pain, but the gain is that such updates are expected to be logarithmic, i.e. O(lg N), which is sub-linear.
Update with a code sketch. Note that I have used unordered_map and unordered_set, instead of multimap, for the count-to-user mapping. Since we do not really need ordering on count, this might be fine; if it is not, unordered_map can simply be changed to map.
#include <cassert>
#include <string>
#include <unordered_map>
#include <unordered_set>
class UserQueryCountTracker {
public:
    typedef std::string UserId;
    typedef int QueryCount;
    void AddUser(UserId id) {
        int new_count = -1;
        auto it = user_count_map_.find(id);
        if (it == user_count_map_.end()) { // id does not exist
            new_count = 1;
            user_count_map_[id] = new_count;
            count_user_map_[new_count].insert(id);
        }
        else { // id exists
            const int old_count = it->second;
            new_count = old_count + 1;
            it->second = new_count;
            // move 'id' from old count to new count
            count_user_map_[old_count].erase(id);
            count_user_map_[new_count].insert(id);
        }
        assert(new_count != -1);
        if (new_count > max_query_count_) {
            max_query_count_ = new_count;
        }
    }
    const std::unordered_set<UserId>& UsersWithMaxCount() const {
        // operator[] is not usable in a const member, so look the bucket up with find()
        static const std::unordered_set<UserId> no_users;
        auto it = count_user_map_.find(max_query_count_);
        return it == count_user_map_.end() ? no_users : it->second;
    }
private:
    std::unordered_map<UserId, QueryCount> user_count_map_{};
    int max_query_count_{0};
    std::unordered_map<QueryCount, std::unordered_set<UserId>> count_user_map_{};
};
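For the case where removals must also stay sub-linear, the second container has to be ordered so that the maximum can be read off again after the current top user is deleted. One way to sketch that, using a std::set of (count, id) pairs instead of the multimap mentioned earlier, is shown below. The class name UserIndex and its member names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <unordered_map>
#include <utility>

class UserIndex
{
    std::unordered_map<std::string, int> count_by_id_;   // user_id -> query_count
    std::set<std::pair<int, std::string>> by_count_;     // ordered by (count, id)
public:
    // Inserts a new user or replaces the count of an existing one: O(log N).
    void Set(const std::string& id, int count)
    {
        auto it = count_by_id_.find(id);
        if (it != count_by_id_.end())
        {
            by_count_.erase(std::make_pair(it->second, id));  // drop the old entry
            it->second = count;
        }
        else
        {
            count_by_id_.emplace(id, count);
        }
        by_count_.insert(std::make_pair(count, id));
    }

    // Removes a user; returns false if the id was unknown: O(log N).
    bool Remove(const std::string& id)
    {
        auto it = count_by_id_.find(id);
        if (it == count_by_id_.end())
            return false;
        by_count_.erase(std::make_pair(it->second, id));
        count_by_id_.erase(it);
        return true;
    }

    // Entry with the highest query count. Precondition: not empty.
    const std::pair<int, std::string>& Highest() const
    {
        assert(!by_count_.empty());
        return *by_count_.rbegin();
    }

    std::size_t Size() const { return count_by_id_.size(); }
};
```

Users with equal counts are ordered by id inside the set, so among ties Highest() returns the lexicographically largest id; that tie-break is a side effect of this sketch, not a requirement from the question.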
Use a bidirectional map, where the user id is the key and the query count is the value:
#include <map>
#include <utility>
#include <functional>
template
<
    typename K, // key
    typename V, // value
    typename P = std::less<V> // predicate
>
class value_ordered_map
{
private:
    std::map<K, V> key_to_value_;
    std::multimap<V, K, P> value_to_key_;
public:
    typedef typename std::map<K, V>::iterator by_key_iterator;
    typedef typename std::multimap<V, K, P>::iterator by_value_iterator;
    // Throws std::out_of_range for an unknown key; operator[] would silently
    // insert a default value into only one of the two maps.
    const V& value(const K& key) const {
        return key_to_value_.at(key);
    }
    std::pair<by_value_iterator, by_value_iterator> keys(const V& value) {
        return value_to_key_.equal_range(value);
    }
    void set(const K& key, const V& value) {
        by_key_iterator it = key_to_value_.find(key);
        if (key_to_value_.end() != it) {
            // remove the old (value, key) entry before storing the new value
            std::pair<by_value_iterator, by_value_iterator> it_pair = value_to_key_.equal_range(it->second);
            while (it_pair.first != it_pair.second)
                if (it_pair.first->second == key) {
                    value_to_key_.erase(it_pair.first);
                    break;
                } else ++it_pair.first;
        }
        key_to_value_[key] = value;
        value_to_key_.insert(std::make_pair(value, key));
    }
};

C++ Sorting Custom Objects in a list

I am having trouble sorting a list of custom class pointers. The class I need to sort are events. These get assigned a random time and I need to do them in the right order.
#include <list>
class Event {
public:
    float time; // the value which I need to sort them by
    int type; // to indicate which event i'm dealing with
    Event(float tempTime, int tempType)
    {
        time = tempTime;
        type = tempType;
    }
};
int main(){
    std::list<Event*> EventList;
    std::list<Event*>::iterator it;
    .........
If you could help me sort this out it would be much appreciated! I've been stuck on this for hours now.
Thanks!
Since the list contains pointers, rather than objects, you'll have to provide a custom comparator to compare the objects they point to. And since you're using a list, you have to use its own sort method: the generic std::sort algorithm only works on random-access sequences.
EventList.sort([](Event * lhs, Event * rhs) {return lhs->time < rhs->time;});
or, if you're stuck in the past and can't use lambdas:
struct CompareEventTime {
bool operator()(Event * lhs, Event * rhs) {return lhs->time < rhs->time;}
};
EventList.sort(CompareEventTime());
If the list contained objects (as it probably should), then it might make sense to provide a comparison operator instead:
bool operator<(Event const & lhs, Event const & rhs) {return lhs.time < rhs.time;}
std::list<Event> EventList;
//...
EventList.sort();
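You can do that with std::sort if you keep the events in a random-access container such as std::vector. You can either write a custom comparator function that you pass as the third argument to std::sort, or you can overload operator< for your class and std::sort will work naturally. For a std::list, use the list's own sort() member in the same two ways, since std::sort needs random-access iterators.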

Get Element Position within std::vector

How do I get the position of an element inside a vector, where the elements are classes. Is there a way of doing this?
Example code:
class Object
{
public:
    void Destroy()
    {
        // run some code to get remove self from vector
    }
};
In main.cpp:
std::vector<Object> objects;
objects.push_back( <some instances of Object> );
// Some more code pushing back some more stuff
int n = 20;
objects.at(n).Destroy(); // Assuming I pushed back 20 items or more
So I guess I want to be able to write a method or something which is a member of the class which will return the location of itself inside the vector... Is this possible?
EDIT:
Due to confusion, I should explain better.
void Destroy(std::vector<Object>& container){
container.erase( ?...? );
}
The problem is, how can I find the number to do the erasing...? Apparently this isn't possible... I thought it might not be...
You can use std::find to find elements in a vector, provided you implement a comparison operator (==) for Object. However, there are 2 big concerns:
If you need to find elements in a container, then you will get much better performance from an ordered container such as std::map or std::set (find operations are O(log(N)) instead of O(N)).
Object should not be the one responsible for removing itself from the container. Object shouldn't know or be concerned with where it is, as that breaks encapsulation. Instead, the owner of the container should concern itself with such tasks.
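To illustrate the second concern, here is a small sketch in which the code that owns the vector looks up the position and erases the element, while Object itself knows nothing about the container. The id member and the helper names index_of and erase_at are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

struct Object
{
    int id;  // hypothetical identifier used to pick the object
};

// Returns the index of the first object with the given id,
// or objects.size() if there is none.
std::size_t index_of(const std::vector<Object>& objects, int id)
{
    auto it = std::find_if(objects.begin(), objects.end(),
                           [id](const Object& o) { return o.id == id; });
    return static_cast<std::size_t>(std::distance(objects.begin(), it));
}

// Removes the element at position index.
// Precondition: index < objects.size().
void erase_at(std::vector<Object>& objects, std::size_t index)
{
    assert(index < objects.size());
    objects.erase(objects.begin() + static_cast<std::ptrdiff_t>(index));
}
```

The owner would call something like erase_at(objects, index_of(objects, 20)) after checking that the returned index is smaller than objects.size().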
The object can erase itself thusly:
void Destroy(std::vector<Object>& container)
{
    container.erase(container.begin() + (this - &container[0]));
}
This will work as you expect, but it strikes me as exceptionally bad design. Members should not have knowledge of their containers. They should exist (from their own perspective) in an unidentifiable limbo. Creation and destruction should be left to their creator.
Objects in a vector don't automatically know where they are in the vector.
You could supply each object with that information, but much easier: remove the object from the vector. Its destructor is then run automatically.
Then the objects can be used also in other containers.
Example:
#include <algorithm>
#include <iostream>
#include <vector>
class object_t
{
private:
int id_;
public:
int id() const { return id_; }
~object_t() {}
explicit object_t( int const id ): id_( id ) {}
};
int main()
{
using namespace std;
vector<object_t> objects;
for( int i = 0; i <= 33; ++i )
{
objects.emplace_back( i );
}
int const n = 20;
objects.erase( objects.begin() + n );
for( auto const& o : objects )
{
cout << o.id() << ' ';
}
cout << endl;
}
If you need to destroy the n'th item in a vector then the easiest way is to get an iterator from the beginning using std::begin() and call std::next() to move forward however many places you want (std::advance() does the same, but it modifies an existing iterator in place and returns void), so something like:
std::vector<Object> objects;
// ... fill objects with more than n elements
const size_t n = 20;
auto erase_iter = std::next(std::begin(objects), n);
objects.erase(erase_iter);
If you want to find the index of an item in a vector then use std::find to get the iterator and call std::distance from the beginning.
So something like:
Object object_to_find;
std::vector<Object> objects;
auto object_iter = std::find(std::begin(objects), std::end(objects), object_to_find);
const size_t n = std::distance(std::begin(objects), object_iter);
This does mean that you need to implement an equality operator for your object. Or you could try something like:
auto object_iter = std::find_if(std::begin(objects), std::end(objects),
    [&object_to_find](const Object& object) -> bool { return &object_to_find == &object; });
Although for this to work, object_to_find needs to be a reference to the element inside the actual vector, as it is just comparing addresses.