I am attempting to optimize a std::vector "search": index-based iteration through a vector, returning an element that matches a "search" criterion.
struct myObj {
int id;
char* value;
};
std::vector<myObj> myObjList;
Create a few thousand entries with unique ids and values and push them to the vector myObjList.
What is the most efficient way to retrieve the myObj that matches the id?
Currently I am index iterating like:
for(int i = 0; i < myObjList.size(); i++){
if(myObjList.at(i).id == searchCriteria){
return myObjList.at(i);
}
}
Note: searchCriteria is an int. All the elements have unique ids.
The above does the job, but it is probably not the most efficient way.
The C++ standard library has some abstract algorithms, which give C++ a kind of functional flavour, as I call it, which lets you concentrate more on the criteria of your search than on how you implement the search itself. This applies to a lot of other algorithms.
The algorithm you are looking for is std::find_if, a simple linear search through an iterator range.
In C++11, you can use a lambda to express your criteria:
std::find_if(myObjList.begin(), myObjList.end(), [&](const myObj & o) {
return o.id == searchCriteria;
});
Without C++11, you have to provide a predicate (a function object, also called a functor, or a function pointer) which returns true if the provided instance is the one you are looking for. Functors have the advantage that they can be parameterized; in your case you want to parameterize the functor with the ID you are looking for.
template<class TargetClass>
class HasId {
int _id;
public:
HasId(int id) : _id(id) {}
bool operator()(const TargetClass & o) const {
return o.id == _id;
}
};
std::find_if(myObjList.begin(), myObjList.end(), HasId<myObj>(searchCriteria));
This method returns an iterator pointing to the first element found which matches your criteria. If there is no such element, the end iterator is returned (which points past the end of the vector, not to the last element). So your function could look like this:
vector<myObj>::iterator it = std::find_if(...);
if(it == myObjList.end())
// handle error in any way
else
return *it;
Use std::find_if.
There's an example on the referenced page.
Here's a working example that more precisely fits your question:
#include <iostream>
#include <algorithm>
#include <vector>
using namespace std;
struct myObj
{
int id;
char* value;
myObj(int id_) : id(id_), value(0) {}
};
struct obj_finder
{
obj_finder(int key) : key_(key)
{}
bool operator()(const myObj& o) const
{
return key_ == o.id;
}
const int key_;
};
int main () {
vector<myObj> myvector;
vector<myObj>::iterator it;
myvector.push_back(myObj(30));
myvector.push_back(myObj(50));
myvector.push_back(myObj(100));
myvector.push_back(myObj(32));
it = find_if (myvector.begin(), myvector.end(), obj_finder(100));
if (it != myvector.end()) // find_if returns end() when nothing matches
cout << "I found " << it->id << endl;
return 0;
}
And, if you have C++11 available, you can make this even more concise using a lambda:
#include <iostream>
#include <algorithm>
#include <vector>
using namespace std;
struct myObj
{
int id;
char* value;
myObj(int id_) : id(id_), value(0) {}
};
int main ()
{
vector<myObj> myvector;
vector<myObj>::iterator it;
myvector.push_back(myObj(30));
myvector.push_back(myObj(50));
myvector.push_back(myObj(100));
myvector.push_back(myObj(32));
int key = 100;
it = find_if (myvector.begin(), myvector.end(), [key] (const myObj& o) -> bool {return o.id == key;});
if (it != myvector.end()) // find_if returns end() when nothing matches
cout << "I found " << it->id << endl;
return 0;
}
This isn't really an answer to your question. The other people who answered gave pretty good answers, so I have nothing to add to them.
I would like to say though that your code is not very idiomatic C++. Really idiomatic C++ would, of course, use ::std::find_if. But even if you didn't have ::std::find_if your code is still not idiomatic. I'll provide two re-writes. One a C++11 re-write, and the second a C++03 re-write.
First, C++11:
for (auto &i: myObjList){
if(i.id == searchCriteria){
return i;
}
}
Second, C++03:
for (::std::vector<myObj>::iterator i = myObjList.begin(); i != myObjList.end(); ++i){
if(i->id == searchCriteria){
return *i;
}
}
The standard way of going through any sort of C++ container is to use an iterator. It's nice that vectors can be indexed by integer. But if you rely on that behavior unnecessarily you make it harder on yourself if you should change data structures later.
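If the ids are sorted, you can perform a binary search (the standard library provides std::binary_search, which only reports whether a match exists, and std::lower_bound, which returns an iterator to the position). If they are not sorted, nothing will asymptotically beat a linear scan over the vector, but you can still write your code more concisely using the standard library (use find_if).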
It does not have to be a QHash; just an existing data structure (ideally in Qt) which cleanly accomplishes that task without being considered an esoteric solution, because I need this code to be quite short (to fit on a small playing card) and easily understandable. Vectors, Multi-hashes, Lists, Maps, or anything is welcome, as long as it would be considered good practice.
Basically, I have a class which has an integer value associated with it. For example:
class Flowers {
public:
const int m_Cost;
Flowers(int cost) : m_Cost(cost) {}
};
Flowers roses{5};
Flowers violets{7};
Flowers tulips{9};
Flowers posies{3};
/* Place them in some sort of datastructure. */
flowerDataStructure[4]; // Returns Posies
flowerDataStructure[7]; // Returns Violets, Roses, Posies
flowerDataStructure[roses.m_Cost]; // Returns Roses, Posies
Would they perhaps support a range such as,
flowerDataStructure[5 ... 11]; // Returns Roses, Violets, Tulips
PS: int m_Cost; does not have to be const. I just assumed it would be easier if it was.
Thanks.
How to achieve a data structure like “QHash bar;” where
“bar[10]” returns all “Foo” belonging to keys of 10 or less?
The right data structure for a collection of non-unique items sorted by an integral value would be either std::multimap or std::multiset, depending on how we store the key. In the author's example above the key is stored inside the data type, so I have chosen std::multiset:
#include <set>
#include <string>
#include <QDebug>
struct Flower
{
public:
const int m_cost;
const std::string m_name;
explicit Flower(int cost) : m_cost(cost) {}
Flower(const char* name, int cost) : m_cost(cost), m_name(name) {}
};
int main()
{
auto lessFunc = [](const Flower& l, const Flower& r) -> bool
{return l.m_cost < r.m_cost;};
std::multiset<Flower, decltype(lessFunc)> multiSet(lessFunc);
multiSet.emplace("roses", 5);
multiSet.emplace("violets", 7);
multiSet.emplace("tulips", 9);
multiSet.emplace("posies", 3);
// This is a request for items equal or below 7
const auto& itEnd = multiSet.upper_bound(Flower{7});
for(auto it = multiSet.begin(); it != itEnd; it++)
{
const Flower& flower{*it};
qDebug() << flower.m_name.c_str() << flower.m_cost;
}
return 0;
}
If an operator[] with upper_bound semantics is desired, it can be done, but we would have to wrap std::multiset in our own class (more work).
How can I store elements in a set in insertion order?
For example:
set<string>myset;
myset.insert("stack");
myset.insert("overflow");
If you print, the output is
overflow
stack
needed output :
stack
overflow
One way is to use two containers, a std::deque to store the elements in insertion order, and another std::set to make sure there are no duplicates.
When inserting an element, check if it's in the set first, if yes, throw it out; if it's not there, insert it both in the deque and the set.
One common scenario is to insert all elements first and then process them (no more inserting). If this is the case, the set can be freed after the insertion phase.
A set is the wrong container for keeping insertion order, it will sort its element according to the sorting criterion and forget the insertion order. You have to use a sequenced container like vector, deque or list for that. If you additionally need the associative access set provides you would have to store your elements in multiple containers simultaneously or use a non-STL container like boost::multi_index which can maintain multiple element orders at the same time.
PS: If you sort the elements before inserting them in a set, the set will keep them in insertion order but I think that will not address your problem.
If you don't need any order besides the insertion order, you could also store the insert number in the stored element and make that the sorting criterion. However, why one would use a set in this case at all escapes me. ;)
Here's how I do it:
template <class T>
class VectorSet
{
public:
using iterator = typename vector<T>::iterator;
using const_iterator = typename vector<T>::const_iterator;
iterator begin() { return theVector.begin(); }
iterator end() { return theVector.end(); }
const_iterator begin() const { return theVector.begin(); }
const_iterator end() const { return theVector.end(); }
const T& front() const { return theVector.front(); }
const T& back() const { return theVector.back(); }
void insert(const T& item) { if (theSet.insert(item).second) theVector.push_back(item); }
size_t count(const T& item) const { return theSet.count(item); }
bool empty() const { return theSet.empty(); }
size_t size() const { return theSet.size(); }
private:
vector<T> theVector;
set<T> theSet;
};
Of course, new forwarding functions can be added as needed, and can be forwarded to whichever of the two data structures implements them most efficiently. If you are going to make heavy use of STL algorithms on this (I haven't needed to so far) you may also want to define member types that the STL expects to find, like value_type and so forth.
If you can use Boost, a very straightforward solution is to use the header-only library Boost.Bimap (bidirectional maps).
Consider the following sample program that will display your dummy entries in insertion order:
#include <iostream>
#include <string>
#include <type_traits>
#include <boost/bimap.hpp>
using namespace std::string_literals;
template <typename T>
void insertCallOrdered(boost::bimap<T, size_t>& mymap, const T& element) {
// We use size() as index, therefore indexing with 0, 1, ...
// as we add elements to the bimap.
mymap.insert({ element, mymap.size() });
}
int main() {
boost::bimap<std::string, size_t> mymap;
insertCallOrdered(mymap, "stack"s);
insertCallOrdered(mymap, "overflow"s);
// Iterate over right map view (integers) in sorted order
for (const auto& rit : mymap.right) {
std::cout << rit.first << " -> " << rit.second << std::endl;
}
}
I'm just wondering why nobody has suggested using such a nice library as Boost MultiIndex. Here's an example how to do that:
#include <boost/multi_index_container.hpp>
#include <boost/multi_index/indexed_by.hpp>
#include <boost/multi_index/identity.hpp>
#include <boost/multi_index/sequenced_index.hpp>
#include <boost/multi_index/ordered_index.hpp>
#include <iostream>
template<typename T>
using my_set = boost::multi_index_container<
T,
boost::multi_index::indexed_by<
boost::multi_index::sequenced<>,
boost::multi_index::ordered_unique<boost::multi_index::identity<T>>
>
>;
int main() {
my_set<int> set;
set.push_back(10);
set.push_back(20);
set.push_back(3);
set.push_back(11);
set.push_back(1);
// Prints elements of the set in order of insertion.
const auto &index = set.get<0>();
for (const auto &item : index) {
std::cout << item << " ";
}
// Prints elements of the set in order of value.
std::cout << "\n";
const auto &ordered_index = set.get<1>();
for (const auto &item : ordered_index) {
std::cout << item << " ";
}
}
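Another option from the standard library is std::unordered_set, but note that its iteration order is unspecified and is not guaranteed to match insertion order. Example online compiler link: http://cpp.sh/7hsxo
#include <cstdio>
#include <iostream>
#include <string>
#include <unordered_set>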
static std::unordered_set<std::string> myset;
int main()
{
myset.insert("blah");
myset.insert("blah2");
myset.insert("blah3");
int count = 0;
for ( auto local_it = myset.begin(); local_it!= myset.end(); ++local_it ) {
printf("index: [%d]: %s\n", count, (*local_it).c_str());
count++;
}
printf("\n");
for ( unsigned i = 0; i < myset.bucket_count(); ++i) {
for ( auto local_it = myset.begin(i); local_it!= myset.end(i); ++local_it )
printf("bucket: [%u]: %s\n", i, (*local_it).c_str());
}
}
How can I efficiently tell if an element is at the beginning of an intrusive set or rbtree? I would like to define a simple function prev that returns a pointer to the previous item in a tree, or nullptr if there is no previous item. An analogous next function is easy to write, using iterator_to and comparing to end(). However, there is no equivalent reverse_iterator_to function that would allow me to compare to rend(). Moreover, I specifically do not want to compare to begin(), because that's not constant time in a red-black tree.
One thing that certainly seems to work is decrementing an iterator and comparing it to end(). That works fine with the implementation, but I can find no support for this in the documentation. What's the best way to implement prev in the following minimal working example?
#include <iostream>
#include <string>
#include <boost/intrusive/set.hpp>
using namespace std;
using namespace boost::intrusive;
struct foo : set_base_hook<> {
string name;
foo(const char *n) : name(n) {}
friend bool operator<(const foo &a, const foo &b) { return a.name < b.name; }
};
rbtree<foo> tree;
foo *
prev(foo *fp)
{
auto fi = tree.iterator_to(*fp);
return --fi == tree.end() ? nullptr : &*fi;
}
int
main()
{
tree.insert_equal(*new foo{"a"});
tree.insert_equal(*new foo{"b"});
tree.insert_equal(*new foo{"c"});
for (foo *fp = &*tree.find("c"); fp; fp = prev(fp))
cout << fp->name << endl;
}
Update: Okay, so what I was missing, which is probably what sehe was getting at indirectly, is that in STL begin() is actually guaranteed to be constant-time. So even though a generic red-black tree requires log(n) time to find the minimum element, an STL map doesn't--an STL std::map implementation is required to cache the first element. And I think the point sehe is making is that even though boost is not documented, it is fair to assume that boost::intrusive containers behave sort of like STL containers. Given that assumption, it is perfectly fine to say:
foo *
prev(foo *fp)
{
auto fi = tree.iterator_to(*fp);
return fi == tree.begin() ? nullptr : &*--fi;
}
As the comparison to tree.begin() shouldn't be too costly.
You can get the reverse-iterator from iterator_to.
Also, note that there is rbtree<>::container_from_iterator(iterator it) so you don't have to have a "global" state for your prev function.
You can just create the corresponding reverse_iterator. You'll have to increment the iterator first so that the reverse iterator refers to the expected element.
So my take on this would be (bonus: without memory leaks):
#include <boost/intrusive/set.hpp>
#include <iostream>
#include <string>
#include <vector>
using namespace boost::intrusive;
struct foo : set_base_hook<> {
std::string name;
foo(char const* n) : name(n) {}
bool operator<(const foo &b) const { return name < b.name; }
};
int main()
{
std::vector<foo> v;
v.emplace_back("a");
v.emplace_back("b");
v.emplace_back("c");
using Tree = rbtree<foo>;
Tree tree;
tree.insert_unique(v.begin(), v.end());
for (auto key : { "a", "b", "c", "missing" })
{
std::cout << "\nusing key '" << key << "': ";
auto start = tree.find(key); // find() returns end() for a missing key, so it must not be dereferenced first
if (start != tree.end()) {
for (auto it = Tree::reverse_iterator(++start); it != tree.rend(); ++it)
std::cout << it->name << " ";
}
}
}
Which prints
using key 'a': a
using key 'b': b a
using key 'c': c b a
using key 'missing':
I want to store a floating point value for an unordered pair of integers. I am unable to find any easy-to-understand tutorials for this. E.g. for the unordered pair {i,j} I want to store a floating point value f. How do I insert, store and retrieve values like this?
A simple way to handle unordered int pairs is to use std::minmax(i,j) (from <algorithm>) to generate a std::pair<int,int> with the smaller value first. This way you can implement your storage like this:
std::map<std::pair<int,int>,float> storage;
storage[std::minmax(i,j)] = 0.f;
storage[std::minmax(j,i)] = 1.f; //rewrites storage[(i,j)]
Admittedly proper hashing would give you some extra performance, but there is little harm in postponing this kind of optimization.
Here's some indicative code:
#include <iostream>
#include <unordered_map>
#include <utility>
struct Hasher
{
int operator()(const std::pair<int, int>& p) const
{
return p.first ^ (p.second << 7) ^ (p.second >> 3);
}
};
int main()
{
std::unordered_map<std::pair<int,int>, float, Hasher> m =
{ { {1,3}, 2.3 },
{ {2,3}, 4.234 },
{ {3,5}, -2 },
};
// do a lookup
std::cout << m[std::make_pair(2,3)] << '\n';
// add more data
m[std::make_pair(65,73)] = 1.23;
// output everything (unordered)
for (auto& x : m)
std::cout << x.first.first << ',' << x.first.second
<< ' ' << x.second << '\n';
}
Note that it relies on the convention that you store the unordered pairs with the lower number first (if they're not equal). You might find it convenient to write a support function that takes a pair and returns it in that order, so you can use that function when inserting new values in the map and when using a pair as a key for trying to find a value in the map.
Output:
4.234
3,5 -2
1,3 2.3
65,73 1.23
2,3 4.234
See it on ideone.com. If you want to make a better hash function, just hunt down an implementation of hash_combine (or use boost's) - plenty of questions here on SO explaining how to do that for std::pair<>s.
You implement a type UPair with your requirements and specialize ::std::hash for it (one of the rare occasions where you are allowed to add something to namespace std).
#include <algorithm>
#include <functional>
#include <utility>
#include <unordered_map>
template <typename T>
class UPair {
private:
::std::pair<T,T> p;
public:
UPair(T a, T b) : p(::std::min(a,b),::std::max(a,b)) {
}
UPair(::std::pair<T,T> pair) : p(::std::min(pair.first,pair.second),::std::max(pair.first,pair.second)) {
}
friend bool operator==(UPair const& a, UPair const& b) {
return a.p == b.p;
}
operator ::std::pair<T,T>() const {
return p;
}
};
namespace std {
template <typename T>
struct hash<UPair<T>> {
::std::size_t operator()(UPair<T> const& up) const {
return ::std::hash<::std::size_t>()(
::std::hash<T>()(::std::pair<T,T>(up).first)
) ^
::std::hash<T>()(::std::pair<T,T>(up).second);
// the double hash is there to avoid the likely scenario of having the same value in .first and .second, resulting in always 0
// that would be a problem for the unordered_map's performance
}
};
}
int main() {
::std::unordered_map<UPair<int>,float> um;
um[UPair<int>(3,7)] = 3.14;
um[UPair<int>(8,7)] = 2.71;
return 10*um[::std::make_pair(7,3)]; // correctly returns 31
}
I'm new to C/C++ programming, but I've been programming in C# for 1.5 years now. I like C# and I like the List class, so I thought about making a List class in C++ as an exercise.
List<int> ls;
int whatever = 123;
ls.Add(1);
ls.Add(235445);
ls.Add(whatever);
The implementation is similar to any Array List class out there. I have a T* vector member where I store the items, and when this storage is about to be completely filled, I resize it.
Please notice that this is not to be used in production, this is only an exercise. I'm well aware of vector<T> and friends.
Now I want to loop through the items of my list. I don't like to use for(int i=0;i<n; i++). I typed for in Visual Studio, waited for Intellisense, and it suggested this:
for each (object var in collection_to_loop)
{
}
This obviously won't work with my List implementation. I figured I could do some macro magic, but this feels like a huge hack. Actually, what bothers me the most is passing the type like that:
#define foreach(type, var, list)\
int _i_ = 0;\
##type var;\
for (_i_ = 0, var=list[_i_]; _i_<list.Length();_i_++,var=list[_i_])
foreach(int,i,ls){
doWork(i);
}
My question is: is there a way to make this custom List class work with a foreach-like loop?
Firstly, the syntax of a for-each loop in C++ is different from C# (it's also called a range-based for loop). It has the form:
for(<type> <name> : <collection>) { ... }
So for example, with an std::vector<int> vec, it would be something like:
for(int i : vec) { ... }
Under the covers, this effectively uses the begin() and end() member functions, which return iterators. Hence, to allow your custom class to utilize a for-each loop, you need to provide a begin() and an end() function. These are generally overloaded, returning either an iterator or a const_iterator. Implementing iterators can be tricky, although with a vector-like class it's not too hard.
template <typename T>
struct List
{
T* store;
std::size_t size;
typedef T* iterator;
typedef const T* const_iterator;
....
iterator begin() { return &store[0]; }
const_iterator begin() const { return &store[0]; }
iterator end() { return &store[size]; }
const_iterator end() const { return &store[size]; }
...
};
With these implemented, you can utilize a range based loop as above.
Let iterable be of type Iterable.
Then, in order to make
for (Type x : iterable)
compile, there must be types called Type and IType and there must be functions
IType Iterable::begin()
IType Iterable::end()
IType must provide the functions
Type operator*()
void operator++()
bool operator!=(IType)
The whole construction is really sophisticated syntactic sugar for something like
for (IType it = iterable.begin(); it != iterable.end(); ++it) {
Type x = *it;
...
}
where instead of Type, any compatible type (such as const Type or Type&) can be used, which will have the expected implications (constness, reference-instead-of-copy etc.).
Since the whole expansion happens syntactically, you can also change the declaration of the operators a bit, e.g. having *it return a reference or having != take a const IType& rhs as needed.
Note that you cannot use the for (Type& x : iterable) form if *it does not return a reference (but if it returns a reference, you can also use the copy version).
Note also that operator++() defines the prefix version of the ++ operator, which is the form the ranged-for uses. It is not used for postfix increments: it++ only compiles if you also define the postfix version, which is declared as operator++(int) (dummy int argument). The ranged-for will not compile if you supply only a postfix ++.
Minimal working example:
#include <stdio.h>
typedef int Type;
struct IType {
Type* p;
IType(Type* p) : p(p) {}
bool operator!=(IType rhs) {return p != rhs.p;}
Type& operator*() {return *p;}
void operator++() {++p;}
};
const int SIZE = 10;
struct Iterable {
Type data[SIZE];
IType begin() {return IType(data); }
IType end() {return IType(data + SIZE);}
};
Iterable iterable;
int main() {
int i = 0;
for (Type& x : iterable) {
x = i++;
}
for (Type x : iterable) {
printf("%d", x);
}
}
output
0123456789
You can fake the ranged-for-each (e.g. for older C++ compilers) with the following macro:
#define ln(l, x) x##l // creates unique labels
#define l(x,y) ln(x,y)
#define for_each(T,x,iterable) for (bool _run = true;_run;_run = false) for (auto it = iterable.begin(); it != iterable.end(); ++it)\
if (1) {\
_run = true; goto l(__LINE__,body); l(__LINE__,cont): _run = true; continue; l(__LINE__,finish): break;\
} else\
while (1) \
if (1) {\
if (!_run) goto l(__LINE__,cont);/* we reach here if the block terminated normally/via continue */ \
goto l(__LINE__,finish);/* we reach here if the block terminated by break */\
} \
else\
l(__LINE__,body): for (T x = *it;_run;_run=false) /* block following the expanded macro */
int main() {
int i = 0;
for_each(Type&, x, iterable) {
i++;
if (i > 5) break;
x = i;
}
for_each(Type, x, iterable) {
printf("%d", x);
}
while (1);
}
(use decltype or pass IType if your compiler doesn't even have auto).
Output:
1234500000
As you can see, continue and break will work with this thanks to its complicated construction.
See http://www.chiark.greenend.org.uk/~sgtatham/mp/ for more C-preprocessor hacking to create custom control structures.
That syntax Intellisense suggested is not C++; or it's some MSVC extension.
C++11 has range-based for loops for iterating over the elements of a container. You need to implement begin() and end() member functions for your class that will return iterators to the first element, and one past the last element respectively. That, of course, means you need to implement suitable iterators for your class as well. If you really want to go this route, you may want to look at Boost.IteratorFacade; it reduces a lot of the pain of implementing iterators yourself.
After that you'll be able to write this:
for( auto const& l : ls ) {
// do something with l
}
Also, since you're new to C++, I want to make sure that you know the standard library has several container classes.
Before C++11, C++ had no for-each loop in its syntax. You have to use C++11's range-based for or the template function std::for_each.
#include <vector>
#include <algorithm>
#include <iostream>
struct Sum {
Sum() { sum = 0; }
void operator()(int n) { sum += n; }
int sum;
};
int main()
{
std::vector<int> nums{3, 4, 2, 9, 15, 267};
std::cout << "before: ";
for (auto n : nums) {
std::cout << n << " ";
}
std::cout << '\n';
std::for_each(nums.begin(), nums.end(), [](int &n){ n++; });
Sum s = std::for_each(nums.begin(), nums.end(), Sum());
std::cout << "after: ";
for (auto n : nums) {
std::cout << n << " ";
}
std::cout << '\n';
std::cout << "sum: " << s.sum << '\n';
}
As @yngum suggests, you can get the VC++ for each extension to work with any arbitrary collection type by defining begin() and end() methods on the collection to return a custom iterator. Your iterator in turn has to implement the necessary interface (dereference operator, increment operator, etc). I've done this to wrap all of the MFC collection classes for legacy code. It's a bit of work, but can be done.