Erase by value in a vector of shared pointers - c++

I want to erase by value from a vector of shared pointers to strings (i.e. vector<shared_ptr<string>>). Is there a more efficient way of doing this than iterating over the complete vector and then erasing at the iterator position?
#include <bits/stdc++.h>
using namespace std;
int main()
{
vector<shared_ptr<string>> v;
v.push_back(make_shared<string>("aaa"));
int j = 0,ind;
for(auto i : v) {
if((*i)=="aaa"){
ind = j;
}
j++;
}
v.erase(v.begin()+ind);
}
Also, I don't want to spend memory on a map (value vs. address).

Try it like this (erase-remove idiom):
string s = "aaa";
auto cmp = [s](const shared_ptr<string> &p) { return s == *p; };
v.erase(std::remove_if(v.begin(), v.end(), cmp), v.end());
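The idiom above can be wrapped in a small function. This is only a sketch: the name eraseByValue, the null-pointer check, and the returned count are assumptions for illustration, not part of the original answer.

```cpp
#include <algorithm>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical helper: removes every element whose pointee equals `value`
// and returns how many elements were removed.
// Null pointers are skipped rather than dereferenced.
std::size_t eraseByValue(std::vector<std::shared_ptr<std::string>>& v,
                         const std::string& value)
{
    const std::size_t oldSize = v.size();
    v.erase(std::remove_if(v.begin(), v.end(),
                           [&value](const std::shared_ptr<std::string>& p) {
                               return p && *p == value;
                           }),
            v.end());
    return oldSize - v.size();
}
```

Calling eraseByValue(v, "aaa") still walks the whole vector once, so it stays O(N); it just removes all matches in a single pass instead of tracking an index by hand.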

There is no better way than O(N): you have to find the object in the vector, and that means iterating over the vector once. It does not really matter whether the elements are pointers or any other objects.
The only way to do better is to use a different data structure, which provides O(1) finding/removal. A set is the first thing that comes to mind, but that would indicate your pointers are unique. A second option would be a map, such that multiple pointers pointing to the same value exist at the same hash key.
If you do not want to use a different structure, then you are out of luck. You could have an additional structure hashing the pointers, if you want to retain the vector but also have O(1) access.
For example, if you do use a set, you need to define a proper hasher and key_equal. The hasher alone is not enough: the default key_equal compares the pointers themselves, so both have to work on *elementInSet. Each pointer in the set must then point to a distinct string. For example:
struct myPtrHash {
size_t operator()(const std::shared_ptr<std::string>& p) const {
//Maybe we want to add checks/throw a more meaningful error if p is invalid?
return std::hash<std::string>()(*p);
}
};
struct myPtrEqual {
bool operator()(const std::shared_ptr<std::string>& a, const std::shared_ptr<std::string>& b) const {
return *a == *b;
}
};
such that your set is:
std::unordered_set<std::shared_ptr<std::string>, myPtrHash, myPtrEqual> pointerSet;
Then erasing would be O(1) on average, simply as:
std::shared_ptr<std::string> toErase = std::make_shared<std::string>("aaa");
pointerSet.erase(toErase);
That said, if you must use a vector, a more idiomatic way to do this is to use remove_if instead of iterating yourself. This will not improve the time complexity, though; it is just better practice.

Don't include bits/stdc++.h, and since you're iterating through the whole vector, you should be using std::for_each with a lambda.

Related

Store selected fields from an unordered set on struct to a vector

I have an unordered_set that stores the following struct
struct match_t{
size_t score;
size_t ci;
};
typedef std::unordered_set<match_t> uniq_t;
Now I want to store the elements of uniq_t myset; to a vector, but in doing so, I want to copy just the score and not the entire struct. I have seen solutions for assigning the elements using assign or back_inserter. I was wondering how to select just the required fields from the struct. I don't see any parameter in assign or back_inserter for this purpose.
Should I try overriding push_back method for the vector or are there other methods for doing this?
EDIT 1
Do I get any performance improvements by using any of these methods instead of looping over the set and assigning the required values?
There is nothing wrong with a simple for loop:
std::unordered_set<match_t> myset;
std::vector<std::size_t> myvec;
myvec.reserve(myset.size()); // allocate memory only once
for (const auto& entry : myset)
myvec.push_back(entry.score);
Alternatively, you could use std::transform with a custom lambda:
#include <algorithm>
#include <iterator>
std::transform(myset.cbegin(), myset.cend(), std::back_inserter(myvec),
[](const auto& entry){ return entry.score; });
Another way is to use a range library, e.g. with range-v3
#include <range/v3/view/transform.hpp>
std::vector<std::size_t> myvec = myset | ranges::view::transform(&match_t::score);
Performance-wise, you can't do anything about the linear pass over all match_t objects. The important tweak instead is to minimize the number of allocations. As the size of the resulting std::vector is known a priori, a call to std::vector::reserve as shown above makes sure that no unnecessary allocations occur.
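For completeness, here is a self-contained sketch of the reserve-plus-transform approach. Note that std::unordered_set<match_t> only compiles when a hash and an equality comparison exist for match_t; the question does not show them, so match_hash, operator== and the helper name extractScores below are assumptions.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <iterator>
#include <unordered_set>
#include <vector>

struct match_t {
    std::size_t score;
    std::size_t ci;
};

// Assumption: two matches are equal when both fields are equal.
bool operator==(const match_t& a, const match_t& b)
{
    return a.score == b.score && a.ci == b.ci;
}

// Assumption: a simple illustrative hash that combines both fields.
struct match_hash {
    std::size_t operator()(const match_t& m) const
    {
        return std::hash<std::size_t>()(m.score) * 31u
             + std::hash<std::size_t>()(m.ci);
    }
};

typedef std::unordered_set<match_t, match_hash> uniq_t;

// Hypothetical helper: copies only the `score` field of every element.
std::vector<std::size_t> extractScores(const uniq_t& matches)
{
    std::vector<std::size_t> scores;
    scores.reserve(matches.size());  // allocate once, up front
    std::transform(matches.cbegin(), matches.cend(),
                   std::back_inserter(scores),
                   [](const match_t& m) { return m.score; });
    return scores;
}
```

The order of the resulting scores follows the iteration order of the unordered_set, which is unspecified; sort the vector afterwards if a particular order is needed.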

Strings in Vectors. and placing them in order

I am placing objects in a vector, and I want to keep them in sorted order as they are added. The basics of the object are:
class myObj {
private:
string firstName;
string lastName;
public:
string getFirst;
string getLast;
}
I also have a vector of these objects
vector< myObj > myVect;
vector< myObj >::iterator myVectit = myVect.begin();
When I add a new object to the vector, I want to find where it should be placed before inserting it. Can I search a vector by an object value, and how? This is my first attempt:
void addanObj (myObj & objtoAdd){
int lowwerB = lower_bound(
myVect.begin().getLast(), myVect.end().getLast(), objtoAdd.getLast()
);
int upperB = upper_bound(
myVect.begin().getLast(), myVect.end().getLast(), objtoAdd.getLast()
);
From there I plan to use lowwerB and upperB to determine where to insert the entry. What do I need to do to get this to work, or what is a better method of tackling this challenge?
----Follow up----
the error I get when I attempt to compile
error C2440: 'initializing' : cannot convert from 'std::string' to 'int'
No user-defined-conversion operator available that can perform this conversion,
or the operator cannot be called
The compiler highlights both lower_bound and upper_bound. I would guess it is referring to where I am putting
objtoAdd.getLast()
-----More Follow up-----------------
This is close to compiling but not quite. What should I expect to get from lower_bound and upper_bound? It doesn't match the iterator I defined, and I'm not sure what I should expect.
void addMyObj(myObj myObjtoadd)
vector< myObj>::iterator tempLB;
vector< myObj>::iterator tempUB;
myVectit= theDex.begin();
tempLB = lower_bound(
myVect.begin()->getLast(), myVect.end()->getLast(), myObjtoadd.getLast()
);
tempUB = upper_bound(
myVect.begin()->getLast(), myVect.end()->getLast(), myObjtoadd.getLast()
);
Your calls to std::lower_bound and std::upper_bound are incorrect. The first two parameters must be iterators that define a range of elements to search and the returned values are also iterators.
Since these algorithms compare the container elements to the third parameter value you'll also need to provide correct operator< functions that compare an object's lastName and a std::string. I've added two different compare functions since std::lower_bound and std::upper_bound pass the parameters in opposite order.
I think I have the machinery correct in this code, it should be close enough for you to get the idea.
class myObj {
private:
std::string firstName;
std::string lastName;
public:
std::string getFirst() const { return firstName; }
std::string getLast() const { return lastName; }
};
bool operator<(const myObj &obj, const std::string &value) // used by lower_bound()
{
return obj.getLast() < value;
}
bool operator<(const std::string &value, const myObj &obj) // used by upper_bound()
{
return value < obj.getLast();
}
int main()
{
std::vector<myObj> myVect;
std::vector<myObj>::iterator tempLB, tempUB;
myObj objtoAdd;
tempLB = std::lower_bound(myVect.begin(), myVect.end(), objtoAdd.getLast());
tempUB = std::upper_bound(myVect.begin(), myVect.end(), objtoAdd.getLast());
}
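The returned iterator can be passed straight to vector::insert, which is the step the question was heading towards. The following is a minimal sketch: the constructor and the helper name insertSorted are assumptions, and it uses a comparison lambda on two myObj values instead of the mixed operator< overloads shown above.

```cpp
#include <algorithm>
#include <string>
#include <vector>

class myObj {
public:
    // Assumption: a constructor like this exists; the original class shows none.
    myObj(const std::string& first, const std::string& last)
        : firstName(first), lastName(last) {}
    const std::string& getFirst() const { return firstName; }
    const std::string& getLast() const { return lastName; }
private:
    std::string firstName;
    std::string lastName;
};

// Hypothetical helper: inserts objToAdd so the vector stays sorted by last name.
// upper_bound places the new object after existing ones with the same last name.
std::vector<myObj>::iterator insertSorted(std::vector<myObj>& vec,
                                          const myObj& objToAdd)
{
    std::vector<myObj>::iterator pos = std::upper_bound(
        vec.begin(), vec.end(), objToAdd,
        [](const myObj& a, const myObj& b) { return a.getLast() < b.getLast(); });
    return vec.insert(pos, objToAdd);
}
```

Finding the position is O(log n), but the insert itself still has to shift the following elements, so each call remains O(n) overall.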
So this is definitely not the best way to go. Here's why:
Vector Size
A default vector starts out with 0 elements, but capacity to hold some number; say 100. After you add the 101st element, it has to completely recreate the vector, copy over all the data, and then delete the old memory. This copying can become expensive, if done enough.
Inserting into a vector
This is going to be even more of a problem. Because a vector is just a contiguous block of memory with objects stored in insert order, say you have the below:
[xxxxxxxzzzzzzzz ]
If you want to add 'y', it belongs between x and z, right? This means you need to move all the z's over by one place. But because you are reusing the same block of memory, you need to do it one at a time.
[xxxxxxxzzzzzzz z ]
[xxxxxxxzzzzzz zz ]
[xxxxxxxzzzzz zzz ]
...
[xxxxxxx zzzzzzzz ]
[xxxxxxxyzzzzzzzz ]
(the spaces are for clarity - previous value isn't explicitly cleared)
As you can see, this is a lot of steps to make room for your 'y', and will be very very slow for large data sets.
A better solution
As others have mentioned, std::set sounds like it's more appropriate for your needs. std::set will automatically order all inserted elements (using a tree data structure for much faster insertion), and allows you to find particular elements by last name, also in log(n) time. It does this by using bool myObj::operator<(const myObj & _myObj) const to know how to sort the different objects. If you define this operator to compare this->lastName < _myObj.lastName, inserting into the set is much quicker.
Alternately, if you really really want to use vector: instead of sorting it as you go, just add all the items to the vector, and then perform std::sort to sort them after all the inserts are done. This will also complete in n log(n) time, but should be considerably faster than the current approach because of the vector insertion problem.
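A sketch of the ordered-container approach follows. One caveat worth stating: when the comparison looks only at the last name, a std::set treats two people with the same last name as duplicates and keeps just one, so this sketch uses std::multiset instead. Person, ByLastName and lastNames are hypothetical names chosen for illustration.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for myObj with public members, to keep the sketch short.
struct Person {
    std::string firstName;
    std::string lastName;
};

// Orders people by last name only.
struct ByLastName {
    bool operator()(const Person& a, const Person& b) const
    {
        return a.lastName < b.lastName;
    }
};

// multiset rather than set: equal last names are kept instead of being dropped.
typedef std::multiset<Person, ByLastName> PersonsByLastName;

// Hypothetical helper: returns the last names in iteration (sorted) order.
std::vector<std::string> lastNames(const PersonsByLastName& people)
{
    std::vector<std::string> out;
    for (PersonsByLastName::const_iterator it = people.begin();
         it != people.end(); ++it)
        out.push_back(it->lastName);
    return out;
}
```

Each insert into the multiset is O(log n), and iterating always yields the elements sorted by last name.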

Replacing std::map with std::set and search by index

Say we have a map with larger objects and an index value. The index value is also part of the larger object.
What I would like to know is whether it is possible to replace the map with a set, extracting the index value.
It is fairly easy to create a set that sorts on a functor comparing two larger objects by extracting the index value.
Which leaves searching by index value, which is not supported by default in a set, I think.
I was thinking of using std::find_if, but I believe that searches linearly, ignoring the fact we have set.
Then I thought of using std::binary_search with a functor comparing the larger object and the value, but I believe that it doesn't work in this case as it wouldn't make use of the structure and would use traversal as it doesn't have a random access iterator. Is this correct? Or are there overloads which correctly handle this call on a set?
And then finally I was thinking of using a boost::container::flat_set, as this has an underlying vector and thus presumably should be able to work well with std::binary_search?
But maybe there is an altogether easier way to do this?
Before you answer "just use a map where a map ought to be used": I am actually using a vector that is manually sorted (well, std::lower_bound) and was thinking of replacing it with boost::container::flat_set, but it doesn't seem to be easily possible to do so, so I might just stick with the vector.
C++14 introduced the ability to look up by a key without requiring the construction of the entire stored object. This can be used as follows:
#include <cstddef>
#include <iostream>
#include <set>
#include <string>
struct StringRef {
StringRef(const std::string& s):x(&s[0]) { }
StringRef(const char *s):x(s) { std::cout << "works: " << s << std::endl; }
const char *x;
};
struct Object {
long long data;
std::size_t index;
};
struct ObjectIndexer {
ObjectIndexer(Object const& o) : index(o.index) {}
ObjectIndexer(std::size_t index) : index(index) {}
std::size_t index;
};
struct ObjComp {
bool operator()(ObjectIndexer a, ObjectIndexer b) const {
return a.index < b.index;
}
typedef void is_transparent; //Allows the comparison with non-Object types.
};
int main() {
std::set<Object, ObjComp> stuff;
stuff.insert(Object{135, 1});
std::cout << stuff.find(ObjectIndexer(1))->data << "\n";
}
More generally, these sorts of problems where there are multiple ways of indexing your data can be solved using Boost.MultiIndex.
Use boost::intrusive::set which can utilize the object's index value directly. It has a find(const KeyType & key, KeyValueCompare comp) function with logarithmic complexity. There are also other set types based on splay trees, AVL trees, scapegoat trees etc. which may perform better depending on your requirements.
If you add the following to your contained object type:
less than operator that only compares the object indices
equality operator that only compares the object indices
a constructor that takes your index type and initializes a dummy object with that value for the index
then you can pass your index type to find, lower_bound, equal_range, etc... and it will act the way you want. When you pass your index to the set's (or flat_set's) find methods it will construct a dummy object of the contained type to use for the comparisons.
Now if your object is really big, or expensive to construct, this might not be the way you want to go.
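For the manually sorted vector mentioned in the question, std::lower_bound already accepts a key of a different type, as long as the comparison takes (element, key). The sketch below assumes the vector is sorted by index; the helper name findByIndex is hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

struct Object {
    long long data;
    std::size_t index;
};

// Hypothetical helper: binary search in a vector that is sorted by `index`.
// Returns a pointer to the matching element, or nullptr if the index is absent.
const Object* findByIndex(const std::vector<Object>& sorted, std::size_t index)
{
    std::vector<Object>::const_iterator it = std::lower_bound(
        sorted.begin(), sorted.end(), index,
        [](const Object& o, std::size_t key) { return o.index < key; });
    if (it != sorted.end() && it->index == index)
        return &*it;
    return nullptr;
}
```

No dummy Object has to be constructed here, which matters when the stored objects are large or expensive to build.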

std::map keys in C++

I have a requirement to create two different maps in C++. The Key is of type CHAR* and the Value is a pointer to a struct. I am filling 2 maps with these pairs, in separate iterations. After creating both maps I need to find all such instances in which the values of the strings referenced by the CHAR* are the same.
For this I am using the following code :
typedef struct _STRUCTTYPE
{
..
} STRUCTTYPE, *PSTRUCTTYPE;
typedef pair <CHAR *,PSTRUCTTYPE> kvpair;
..
CHAR *xyz;
PSTRUCTTYPE abc;
// after filling the information;
Map.insert (kvpair(xyz,abc));
// the above is repeated x times for the first map, and y times for the second map.
// after both are filled out;
std::map<CHAR *, PSTRUCTTYPE>::iterator Iter,findIter;
for (Iter=iteratedMap->begin();Iter!=iteratedMap->end();++Iter)
{
char *key = Iter->first;
printf("%s\n",key);
findIter=otherMap->find(key);
//printf("%u",findIter->second);
if (findIter!=otherMap->end())
{
printf("Match!\n");
}
}
The above code does not show any match, although the list of keys in both maps show obvious matches. My understanding is that the equals operator for CHAR * just equates the memory address of the pointers.
My question is: what should I do to alter the equals operator for this type of key, or could I use a different datatype for the string?
My understanding is that the equals operator for CHAR* just equates the memory address of the pointers.
Your understanding is correct.
The easiest thing to do would be to use std::string as the key. That way you get comparisons for the actual string value working without much effort:
std::map<std::string, PSTRUCTTYPE> m;
PSTRUCTTYPE s = bar();
m.insert(std::make_pair("foo", s));
if(m.find("foo") != m.end()) {
// works now
}
Note that you might leak memory for your structs if you don't always delete them manually. If you can't store by value, consider using smart pointers instead.
Depending on your use case, you don't necessarily have to store pointers to the structs:
std::map<std::string, STRUCTTYPE> m;
m.insert(std::make_pair("foo", STRUCTTYPE(whatever)));
A final note: typedefing structs the way you are doing it is a C-ism, in C++ the following is sufficient:
typedef struct STRUCTTYPE {
// ...
} *PSTRUCTTYPE;
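If the keys have to remain C strings, the other option raised in the question is to change how the map compares them. std::map orders its keys with a comparison functor, so a functor built on strcmp makes lookups work on the string contents. This is a sketch only: CStrLess, CStrMap, countMatches and the placeholder payload field are assumptions, since the real struct members are elided in the question.

```cpp
#include <cstddef>
#include <cstring>
#include <map>

// Orders C strings by their contents instead of by pointer value.
struct CStrLess {
    bool operator()(const char* a, const char* b) const
    {
        return std::strcmp(a, b) < 0;
    }
};

// Placeholder struct; the real fields are not shown in the question.
struct STRUCTTYPE { int payload; };

typedef std::map<const char*, STRUCTTYPE*, CStrLess> CStrMap;

// Hypothetical helper: counts the keys of `a` that also occur in `b`.
std::size_t countMatches(const CStrMap& a, const CStrMap& b)
{
    std::size_t matches = 0;
    for (CStrMap::const_iterator it = a.begin(); it != a.end(); ++it)
        if (b.find(it->first) != b.end())
            ++matches;
    return matches;
}
```

The map does not own the character buffers in this variant, so they must outlive the map and must not be modified while they serve as keys; std::string keys avoid both concerns.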
If you use std::string instead of char * there are more convenient comparison functions you can use. Also, instead of writing your own key matching code, you can use the STL set_intersection algorithm (see here for more details) to find the shared elements in two sorted containers (std::map is of course sorted). Note that the output has to go through std::inserter, and that a key-only comparison such as value_comp() is needed if entries should match on the key alone. Here is an example:
typedef std::map<std::string, STRUCTTYPE *> ExampleMap;
ExampleMap inputMap1, inputMap2, matchedMap;
// Insert elements to input maps
inputMap1.insert(...);
// Put common elements of inputMap1 and inputMap2 into matchedMap
// (entries are compared by key only; the copied values come from inputMap1)
std::set_intersection(inputMap1.begin(), inputMap1.end(), inputMap2.begin(), inputMap2.end(),
std::inserter(matchedMap, matchedMap.end()), inputMap1.value_comp());
for(ExampleMap::iterator iter = matchedMap.begin(); iter != matchedMap.end(); ++iter)
{
// Do things with matched elements
std::cout << iter->first << std::endl;
}

C++ STL map I don't want it to sort!

This is my code
map<string,int> persons;
persons["B"] = 123;
persons["A"] = 321;
for(map<string,int>::iterator i = persons.begin();
i!=persons.end();
++i)
{
cout<< (*i).first << ":"<<(*i).second<<endl;
}
Expected output:
B:123
A:321
But output it gives is:
A:321
B:123
I want it to maintain the order in which keys and values were inserted in the map<string,int>.
Is it possible? Or should I use some other STL data structure? Which one?
There is no standard container that does directly what you want. The obvious container to use if you want to maintain insertion order is a vector. If you also need look up by string, use a vector AND a map. The map would in general be of string to vector index, but as your data is already integers you might just want to duplicate it, depending on your use case.
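A minimal sketch of the vector AND map combination described above. The class name InsertionOrderedMap and its member functions are hypothetical, and erasing is deliberately left out because removing from the middle of the vector would shift the stored indices.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical wrapper: the vector keeps insertion order,
// the map provides lookup from key to position in the vector.
class InsertionOrderedMap {
public:
    // Appends a new key, or overwrites the value of an existing key
    // (an existing key keeps its original position).
    void set(const std::string& key, int value)
    {
        std::map<std::string, std::size_t>::const_iterator it = index_.find(key);
        if (it != index_.end()) {
            entries_[it->second].second = value;
            return;
        }
        index_[key] = entries_.size();
        entries_.push_back(std::make_pair(key, value));
    }

    // Returns a pointer to the stored value, or nullptr if the key is absent.
    const int* find(const std::string& key) const
    {
        std::map<std::string, std::size_t>::const_iterator it = index_.find(key);
        return it == index_.end() ? nullptr : &entries_[it->second].second;
    }

    // All entries in insertion order.
    const std::vector<std::pair<std::string, int> >& entries() const
    {
        return entries_;
    }

private:
    std::vector<std::pair<std::string, int> > entries_;  // insertion order
    std::map<std::string, std::size_t> index_;           // key -> index in entries_
};
```

Iterating over entries() after set("B", 123) and set("A", 321) visits B before A, while find stays O(log n).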
Like Matthieu has said in another answer, the Boost.MultiIndex library seems the right choice for what you want. However, this library can be a little tough to use at the beginning especially if you don't have a lot of experience with C++. Here is how you would use the library to solve the exact problem in the code of your question:
struct person {
std::string name;
int id;
person(std::string const & name, int id)
: name(name), id(id) {
}
};
int main() {
using namespace::boost::multi_index;
using namespace std;
// define a multi_index_container with a list-like index and an ordered index
typedef multi_index_container<
person, // The type of the elements stored
indexed_by< // The indices that our container will support
sequenced<>, // list-like index
ordered_unique<member<person, string,
&person::name> > // map-like index (sorted by name)
>
> person_container;
// Create our container and add some people
person_container persons;
persons.push_back(person("B", 123));
persons.push_back(person("C", 224));
persons.push_back(person("A", 321));
// Typedefs for the sequence index and the ordered index
enum { Seq, Ord };
typedef person_container::nth_index<Seq>::type persons_seq_index;
typedef person_container::nth_index<Ord>::type persons_ord_index;
// Let's test the sequence index
persons_seq_index & seq_index = persons.get<Seq>();
for(persons_seq_index::iterator it = seq_index.begin(),
e = seq_index.end(); it != e; ++it)
cout << it->name << ":"<< it->id << endl;
cout << "\n";
// And now the ordered index
persons_ord_index & ord_index = persons.get<Ord>();
for(persons_ord_index::iterator it = ord_index.begin(),
e = ord_index.end(); it != e; ++it)
cout << it->name << ":"<< it->id << endl;
cout << "\n";
// Thanks to the ordered index we have fast lookup by name:
std::cout << "The id of B is: " << ord_index.find("B")->id << "\n";
}
Which produces the following output:
B:123
C:224
A:321
A:321
B:123
C:224
The id of B is: 123
Map is definitely not right for you:
"Internally, the elements in the map are sorted from lower to higher key value following a specific strict weak ordering criterion set on construction."
Quote taken from here.
Unfortunately no associative container in the STL keeps insertion order (std::unordered_map, added in C++11, does not either), so either you use a nonassociative one like vector, or write your own :-(
I had the same problem every once in a while and here is my solution: https://github.com/nlohmann/fifo_map. It's a header-only C++11 solution and can be used as drop-in replacement for a std::map.
For your example, it can be used as follows:
#include "fifo_map.hpp"
#include <string>
#include <iostream>
using nlohmann::fifo_map;
int main()
{
fifo_map<std::string,int> persons;
persons["B"] = 123;
persons["A"] = 321;
for(fifo_map<std::string,int>::iterator i = persons.begin();
i!=persons.end();
++i)
{
std::cout<< (*i).first << ":"<<(*i).second << std::endl;
}
}
The output is then
B:123
A:321
Besides Neil's recommendation of a combined vector+map if you need both to keep the insertion order and the ability to search by key, you can also consider using boost multi index libraries, that provide for containers addressable in more than one way.
maps and sets are meant to impose a strict weak ordering upon the data. Strict weak ordering maintains that no entries are equivalent (which is different from being equal).
You need to provide a functor that the map/set may use to perform a<b. With this functor the map/set sorts its items (the STL from GCC uses a red-black tree). It determines whether two items are equivalent by checking !(a<b) && !(b<a), the equivalence test.
The functor looks as follows:
template <class T>
struct less : binary_function<T,T,bool> {
bool operator() (const T& a, const T& b) const {
return a < b;
}
};
If you can provide a function that tells the STL how to order things then the map and set can do what you want. For example
template<typename T>
struct ItemHolder
{
int insertCount;
T item;
};
You can then easily write a functor to order by insertCount. If your implementation uses red-black trees your underlying data will remain balanced -- however you will get a lot of re-balancing since your data will be generated based on incremental ordering (vs. Random) -- and in this case a list with push_back would be better. However you cannot access data by key as fast as you would with a map/set.
If you want to sort by string, provide the functor to search by string; using the insertCount you could potentially work backwards. If you want to search by both you can have two maps.
map<insertcount, string> x; // auxiliary key
map<string, item> y; //primary key
I use this strategy often -- however I have never placed it in code that is run often. I'm considering boost::bimap.
Well, there is no STL container which actually does what you wish, but there are possibilities.
1. STL
By default, use a vector. Here it would mean:
struct Entry { std::string name; int it; };
typedef std::vector<Entry> container_type;
If you wish to search by string, you always have the find_if algorithm at your disposal.
class ByName
{
public:
ByName(const std::string& name): m_name(name) {}
bool operator()(const Entry& entry) const { return entry.name == m_name; }
private:
std::string m_name;
};
// Use like this:
container_type myContainer;
container_type::iterator it =
std::find_if(myContainer.begin(), myContainer.end(), ByName("A"));
2. Boost.MultiIndex
This seems way overkill, but you can always check it out here.
It allows you to create ONE storage container, accessible via various indexes of various styles, all maintained for you (almost) magically.
Rather than using one container (std::map) to reference a storage container (std::vector) with all the synchro issues it causes... you're better off using Boost.
To preserve all the time complexity constraints you need map + list:
struct Entry
{
string key;
int val;
};
typedef list<Entry> MyList;
typedef MyList::iterator Iter;
typedef map<string, Iter> MyMap;
MyList l;
MyMap m;
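int find(string key)
{
Iter it = m.at(key); // O(log n); throws std::out_of_range if the key is absent
return it->val;
}
void put(string key, int val)
{
MyMap::iterator pos = m.find(key); // O(log n)
if (pos != m.end())
{
pos->second->val = val; // key already present: update in place, keep its position
return;
}
Entry e;
e.key = key;
e.val = val;
Iter it = l.insert(l.end(), e); // O(1)
m[key] = it; // O(log n)
}
void erase(string key)
{
MyMap::iterator pos = m.find(key); // O(log n)
if (pos == m.end())
return; // nothing to erase
l.erase(pos->second); // O(1)
m.erase(pos); // amortized O(1) given the iterator
}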
void printAll()
{
for (Iter it = l.begin(); it != l.end(); it++)
{
cout<< it->key << ":"<< it->val << endl;
}
}
Enjoy
You could use a vector of pairs, it is almost the same as unsorted map container
std::vector<std::pair<T, U> > unsorted_map;
Use a vector. It gives you complete control over ordering.
I also think Map is not the way to go. The keys in a Map form a Set; a single key can occur only once. During an insert in the map the map must search for the key, to ensure it does not exist or to update the value of that key. For this it is important (performance wise) that the keys, and thus the entries, have some kind of ordering. As such a Map with insert ordering would be highly inefficient on inserts and retrieving entries.
Another problem would be if you use the same key twice; should the first or the last entry be preserved, and should it update the insert order or not?
Therefore I suggest you go with Neil's suggestion: a vector for insert-time ordering and a Map for key-based searching.
Yes, the map container is not for you.
As you asked, you need the following code instead:
struct myClass {
std::string stringValue;
int intValue;
myClass( const std::string& sVal, const int& iVal ):
stringValue( sVal ),
intValue( iVal) {}
};
std::vector<myClass> persons;
persons.push_back( myClass( "B", 123 ));
persons.push_back( myClass( "A", 321 ));
for(std::vector<myClass>::iterator i = persons.begin();
i!=persons.end();
++i)
{
std::cout << (*i).stringValue << ":" << (*i).intValue << std::endl;
}
Here the output is unsorted as expected.
A map is an ordered collection (the ordering functor is one of its template parameters), just like a set. If you want to retrieve the elements in the same sequence in which they were pushed, you should use a deque, list, or vector.
In order to do what they do and be efficient at it, maps keep their elements sorted by key (typically in a balanced binary tree). Therefore, you would use a map if you're willing to give up memory of insertion order to gain the convenience and performance of looking up by key.
If you need the insertion order stored, one way would be to create a new type that pairs the value you're storing with the order you're storing it (you would need to write code to keep track of the order). You would then use a map of string to this new type for storage. When you perform a look up using a key, you can also retrieve the insertion order and then sort your values based on insertion order.
One more thing: If you're using a map, be aware of the fact that testing if persons["C"] exists (after you've only inserted A and B) will actually insert a key value pair into your map.
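To test for a key without that side effect, use find (or count) instead of operator[]. A minimal sketch, with containsKey as a hypothetical helper name:

```cpp
#include <map>
#include <string>

// Hypothetical helper: checks for a key without modifying the map.
// Taking the map by const reference also rules out operator[],
// which is not available on a const std::map.
bool containsKey(const std::map<std::string, int>& m, const std::string& key)
{
    return m.find(key) != m.end();
}
```

With only A and B inserted, containsKey(persons, "C") returns false and leaves the size at 2, whereas evaluating persons["C"] adds a third entry whose value is 0.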
Instead of a map you can use a vector of pairs!
For example:
vector<std::pair<unsigned,string>> myvec;
myvec.push_back(std::pair<unsigned,string>(1,"a"));
myvec.push_back(std::pair<unsigned,string>(5,"b"));
myvec.push_back(std::pair<unsigned,string>(3,"aa"));
Output:
myvec[0]=(1,"a"); myvec[1]=(5,"b"); myvec[2]=(3,"aa");
Or another example:
vector<std::pair<string,unsigned>> myvec2;
myvec2.push_back(std::pair<string,unsigned>("aa",1));
myvec2.push_back(std::pair<string,unsigned>("a",3));
myvec2.push_back(std::pair<string,unsigned>("ab",2));
Output: myvec2[0]=("aa",1); myvec2[1]=("a",3); myvec2[2]=("ab",2);
Hope this can help someone else in the future who was looking for non sorted maps like me!
struct Compare : public binary_function<int,int,bool> {
bool operator() (int a, int b) {return true;}
};
This was meant to return all the elements of a map in the reverse of the order in which you entered them (i.e. the first entered element will be the last and the last entered element will be the first). Be aware, however, that a comparator which always returns true is not a strict weak ordering, so using it with std::map is undefined behaviour; any ordering you observe is an accident of the implementation.
Use a Map along with a vector of iterators as you insert in Map. (Map iterators are guaranteed not to be invalidated)
In the code below I am using Set
set<string> myset;
vector<set<string>::iterator> vec;
void printNonDuplicates(){
vector<set<string>::iterator>::iterator vecIter;
for(vecIter = vec.begin();vecIter!=vec.end();vecIter++){
cout<<(*vecIter)->c_str()<<endl;
}
}
void insertSet(string str){
pair<set<string>::iterator,bool> ret;
ret = myset.insert(str);
if(ret.second)
vec.push_back(ret.first);
}
If you don't want to use boost::multi_index, I have put a proof of concept class template up for review here:
https://codereview.stackexchange.com/questions/233157/wrapper-class-template-for-stdmap-stdlist-to-provide-a-sequencedmap-which
using std::map<KT,VT> and std::list<OT*> which uses pointers to maintain the order.
It will take O(n) linear time for the delete because it needs to search the whole list for the right pointer. To avoid that would need another cross reference in the map.
I'd vote for typedef std::vector< std::pair< std::string, int > > UnsortedMap;
Assignment looks a bit different, but your loop remains exactly as it is now.
There is std::unordered_map that you can check out. Note, however, that it does not guarantee any particular iteration order, so it does not preserve insertion order either.