operator[] - why should I not use it? - c++

I have been told that calling operator[] explicitly is wrong.
What is the difference between:
vector<double> * IDs;
...
IDs->operator[](j) = 1;
and:
vector<double> * IDs;
...
(*IDs)[j] = 1;

Nothing at all, they are equivalent.
Two things though
1) IDs->operator[](j) is less clear than (*IDs)[j].
2) If you only need to read the elements and can use const vector<double>& IDs instead, then not only do you benefit from the even clearer IDs[j], you also gain safety, because the compiler will not let you modify the vector through IDs.
Remember that one day your code will be maintained by someone less able than you so clarity is important.
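To make the comparison concrete, here is a minimal sketch of the spellings discussed above. The function names are hypothetical and exist only for illustration; the two pointer forms compile to the same call.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helpers, written only to compare the spellings side by side.

void set_via_operator_call(std::vector<double>* ids, std::size_t j) {
    ids->operator[](j) = 1;   // explicit member-function call syntax
}

void set_via_deref(std::vector<double>* ids, std::size_t j) {
    (*ids)[j] = 1;            // dereference, then ordinary subscript
}

void set_via_reference(std::vector<double>& ids, std::size_t j) {
    ids[j] = 1;               // no pointer involved: plain subscript
}

// Read-only access through a const reference.
// Writing "ids[j] = 1;" inside this function would not compile.
double read_only(const std::vector<double>& ids, std::size_t j) {
    return ids[j];
}
```

All three setters have the same effect; the reference versions are simply easier to read.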

The whole idea of using this operator is so that your object can use the [] in some way that hides the real data structure or data organization within itself but on the outside, provides an array-like interface.
For example, vectors and deques are implemented quite differently and perform better for different purposes, but on the outside they both provide [] so that the user feels like they are using a dynamic array.
So, it's not wrong, but it defeats the purpose.

Related

Searching data using different keys

I am no expert in C++ and STL.
I use a structure in a Map as data. Key is some class C1.
I would like to access the same data but using a different key C2 too (where C1 and C2 are two unrelated classes).
Is this possible without duplicating the data?
I tried searching in google, but had a tough time finding an answer that I could understand.
This is for an embedded target where boost libraries are not supported.
Can somebody offer help?
You may store pointers to Data as std::map values, and you can have two maps with different keys pointing to the same data.
I think a smart pointer like std::shared_ptr is a good option in this case of shared ownership of data:
#include <map> // for std::map
#include <memory> // for std::shared_ptr
....
std::map<C1, std::shared_ptr<Data>> map1;
std::map<C2, std::shared_ptr<Data>> map2;
Instances of Data can be allocated using std::make_shared().
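A minimal sketch of the two-map idea, assuming C++11 is available on the target. The class name DualIndex, the Data payload, and the use of int and std::string as stand-ins for C1 and C2 are all assumptions made for illustration.

```cpp
#include <map>
#include <memory>
#include <string>

struct Data { int value = 0; };   // hypothetical payload

// Stand-ins for the unrelated key classes C1 and C2 (assumption).
using C1 = int;
using C2 = std::string;

class DualIndex {
public:
    // Creates one Data object and registers it under both keys.
    void insert(const C1& k1, const C2& k2, int value) {
        std::shared_ptr<Data> p = std::make_shared<Data>();
        p->value = value;
        map1_[k1] = p;
        map2_[k2] = p;
    }

    // Both lookups return an empty pointer when the key is absent.
    std::shared_ptr<Data> by_c1(const C1& k) const {
        auto it = map1_.find(k);
        if (it == map1_.end()) return std::shared_ptr<Data>();
        return it->second;
    }

    std::shared_ptr<Data> by_c2(const C2& k) const {
        auto it = map2_.find(k);
        if (it == map2_.end()) return std::shared_ptr<Data>();
        return it->second;
    }

private:
    std::map<C1, std::shared_ptr<Data>> map1_;
    std::map<C2, std::shared_ptr<Data>> map2_;
};
```

Because both maps hold a pointer to the same object, a change made through one key is visible through the other, and the data itself is stored only once.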
Not in the Standard Library, but Boost offers boost::multi_index
Two keys of different types
I must admit I misread the question a bit and didn't notice that you want two keys of different types, not different values. The solution for that builds on what's below, though. Other answers have pretty much everything needed for that; I'd just add that you could make a universal lookup function (C++14-ish pseudocode):
template<class Key>
auto lookup (Key const& key) { }
And specialize it for your keys (arguably easier than SFINAE)
template<>
auto lookup<KeyA> (KeyA const& key) { return map_of_keys_a[key]; }
And the same for KeyB.
If you wanted to encapsulate it in a class, an obvious choice would be to change lookup to operator[].
Key of the same type, but different value
Idea 1
The simplest solution I can think of in 60 seconds (simplest also meaning that it still needs to be thought through properly). I'd also switch to unordered_map as the default.
map<Key, Data> data;
map<Key2, Key> keys;
Access via data[keys["multikey"]].
This will obviously waste some space (duplicating objects of Key type), but I am assuming they are much smaller than the Data type.
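A sketch of Idea 1 as a lookup helper. The function find_by_key2 and the int/std::string key types are assumptions for illustration. It uses find() rather than data[keys[...]], because operator[] on a map inserts a default-constructed entry when the key is missing.

```cpp
#include <map>
#include <string>

struct Data { int value; };   // hypothetical payload

using Key  = int;             // stand-in for the primary key type (assumption)
using Key2 = std::string;     // stand-in for the secondary key type (assumption)

// Resolves secondary key -> primary key -> data.
// Returns nullptr if either step fails; neither map is modified.
Data* find_by_key2(std::map<Key, Data>& data,
                   const std::map<Key2, Key>& keys,
                   const Key2& k2) {
    auto k = keys.find(k2);
    if (k == keys.end()) return nullptr;
    auto d = data.find(k->second);
    if (d == data.end()) return nullptr;
    return &d->second;
}
```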
Idea 2
Another solution would be to use pointers; then the only cost of duplicate is a (smart) pointer:
map<Key, shared_ptr<Data>> data;
Object of Data will be alive as long as there is at least one key pointing to it.
What I usually do in these cases is use non-owned pointers. I store my data in a vector:
std::vector<Data> myData;
And then I map pointers to each element. Since it is possible that pointers are invalidated because of the future growth of the vector, though, I will choose to use the vector indexes in this case.
std::map<Key1, int> myMap1;
std::map<Key2, int> myMap2;
Don't expose the data containers to your clients. Encapsulate element insertion and removal in specific functions, which insert everywhere and remove everywhere.
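A minimal sketch of the vector-plus-indexes layout with the containers hidden behind one class. IndexedStore and the key stand-ins are hypothetical names. Removal is deliberately left out: erasing from the middle of a vector shifts later indexes, so a real implementation would have to update both maps as well.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

struct Data { int value; };   // hypothetical payload

using Key1 = int;             // stand-ins for the two key types (assumption)
using Key2 = std::string;

class IndexedStore {
public:
    // Appends the data and registers its index under both keys.
    // Returns false and changes nothing if either key is already taken.
    bool insert(const Key1& k1, const Key2& k2, const Data& d) {
        if (myMap1_.count(k1) || myMap2_.count(k2)) return false;
        const std::size_t index = myData_.size();
        myData_.push_back(d);
        myMap1_[k1] = index;
        myMap2_[k2] = index;
        return true;
    }

    // The returned pointer is only valid until the next insert(),
    // because the vector may reallocate. The stored indexes stay valid.
    Data* find1(const Key1& k) {
        auto it = myMap1_.find(k);
        return it == myMap1_.end() ? nullptr : &myData_[it->second];
    }

    Data* find2(const Key2& k) {
        auto it = myMap2_.find(k);
        return it == myMap2_.end() ? nullptr : &myData_[it->second];
    }

private:
    std::vector<Data> myData_;
    std::map<Key1, std::size_t> myMap1_;
    std::map<Key2, std::size_t> myMap2_;
};
```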
Bartek's "Idea 1" is good (though there's no compelling reason to prefer unordered_map to map).
Alternatively, you could have a std::map<C2, Data*>, or std::map<C2, std::map<C1, Data>::iterator> to allow direct access to Data objects after one C2-keyed search, but then you'd need to be more careful not to access invalid (erased) Data (or more precisely, to erase from both containers atomically from the perspective of any other users).
It's also possible for one or both maps to move to shared_ptr<Data> - the other could use weak_ptr<> if that's helpful ownership-wise. (These are in the C++11 Standard, otherwise the obvious source - boost - is apparently out for you, but maybe you've implemented your own or selected another library? Pretty fundamental classes for modern C++).
EDIT - hash tables versus balanced binary trees
This isn't particularly relevant to the question, but has received comments/interest below and I need more space to address it properly. Some points:
1) Bartek's casually advising to change from map to unordered_map without recommending an impact study re iterator/pointer invalidation is dangerous, and unwarranted given there's no reason to think it's needed (the question doesn't mention performance) and no recommendation to profile.
3) Relatively few data structures in a program are important to performance-critical behaviours, and there are plenty of times when the relative performance of one versus another is of insignificant interest. Supporting this claim - masses of code were written with std::map to ensure portability before C++11, and perform just fine.
4) When performance is a serious concern, the advice should be "Care => profile", but saying that a rule of thumb is ok - in line with "Don't pessimise prematurely" (see e.g. Sutter and Alexandrescu's C++ Coding Standards) - and if asked for one here I'd happily recommend unordered_map by default - but that's not particularly reliable. That's a world away from recommending every std::map usage I see be changed.
5) This container performance side-track has started to pull in ad-hoc snippets of useful insight, but is far from being comprehensive or balanced. This question is not a sane venue for such a discussion. If there's another question addressing this where it makes sense to continue this discussion and someone asks me to chip in, I'll do it sometime over the next month or two.
You could consider having a plain std::list holding all your data, and then various std::map objects mapping arbitrary key values to iterators pointing into the list:
std::list<Data> values;
std::map<C1, std::list<Data>::iterator> byC1;
std::map<C2, std::list<Data>::iterator> byC2;
I.e. instead of fiddling with more-or-less-raw pointers, you use plain iterators. And iterators into a std::list have very good invalidation guarantees.
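A sketch of the list-plus-iterators layout, including removal from all containers at once. The class name Registry is hypothetical, and storing both keys next to the data in each list node is an added assumption, made so that erasing by one key can also find the entry in the other map.

```cpp
#include <iterator>
#include <list>
#include <map>
#include <string>

struct Data { int value; };   // hypothetical payload

using C1 = int;               // stand-ins for the two key classes (assumption)
using C2 = std::string;

class Registry {
    struct Node { C1 k1; C2 k2; Data data; };
    typedef std::list<Node>::iterator NodeIt;

public:
    // Returns false and changes nothing if either key is already present.
    bool insert(const C1& k1, const C2& k2, const Data& d) {
        if (byC1_.count(k1) || byC2_.count(k2)) return false;
        values_.push_back(Node{k1, k2, d});
        NodeIt it = std::prev(values_.end());
        byC1_[k1] = it;
        byC2_[k2] = it;
        return true;
    }

    Data* find_by_c1(const C1& k) {
        auto m = byC1_.find(k);
        return m == byC1_.end() ? nullptr : &m->second->data;
    }

    Data* find_by_c2(const C2& k) {
        auto m = byC2_.find(k);
        return m == byC2_.end() ? nullptr : &m->second->data;
    }

    // Removes the element from the list and from both maps.
    bool erase_by_c1(const C1& k) {
        auto m = byC1_.find(k);
        if (m == byC1_.end()) return false;
        NodeIt node = m->second;
        byC2_.erase(node->k2);
        byC1_.erase(m);
        values_.erase(node);
        return true;
    }

private:
    std::list<Node> values_;
    std::map<C1, NodeIt> byC1_;
    std::map<C2, NodeIt> byC2_;
};
```

Inserting into or erasing from a std::list leaves iterators to the other elements valid, so the iterators stored in the maps stay usable as the container grows and shrinks.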
I had the same problem. At first, holding two maps of shared pointers sounded very cool, but you still need to manage both maps (inserting, removing, etc.).
Then I came up with another way of doing this.
My use case was accessing data by x-y or by radius-angle. Think of each point as holding data, where the point can be described either in Cartesian x, y or in radius-angle form.
So I wrote a struct like this:
struct MyPoint
{
    std::pair<int, int> cartesianPoint;
    std::pair<int, int> radianPoint;

    // Two points compare equal if either representation matches.
    bool operator==(const MyPoint& rhs) const
    {
        return cartesianPoint == rhs.cartesianPoint || radianPoint == rhs.radianPoint;
    }
};
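After that I could use it as a key (note that std::unordered_map also needs a hash function for MyPoint, which is not shown here):
std::unordered_map<MyPoint, DataType> myMultIndexMap;
I am not sure whether your case is the same or can be adapted to this scenario, but it may be an option.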

Mapping vectors of arbitrary type

I need to store a list of vectors of different types, each to be referenced by a string identifier. For now, I'm using std::map with std::string as the key and boost::any as its value (example implementation posted here).
I've come unstuck when trying to run a method on all the stored vector, e.g.:
std::map<std::string, boost::any>::iterator it;
for (it = map_.begin(); it != map_.end(); ++it) {
it->second.reserve(100); // FAIL: refers to boost::any not std::vector
}
My questions:
Is it possible to cast boost::any to an arbitrary vector type so I can execute its methods?
Is there a better way to map vectors of arbitrary types and retrieve them later on with the correct type?
At present, I'm toying with an alternative implementation which replaces boost::any with a pointer to a base container class as suggested in this answer. This opens up a whole new can of worms with other issues I need to work out. I'm happy to go down this route if necessary but I'm still interested to know if I can make it work with boost::any, or if there are other better solutions.
P.S. I'm a C++ n00b novice (and have been spoilt silly by Python's dynamic typing for far too long), so I may well be going about this the wrong way. Harsh criticism (ideally followed by suggestions) is very welcome.
The big picture:
As pointed out in comments, this may well be an XY problem so here's an overview of what I'm trying to achieve.
I'm writing a task scheduler for a simulation framework that manages the execution of tasks; each task is an elemental operation on a set of data vectors. For example, if task_A is defined in the model to be an operation on "x"(double), "y"(double), "scale"(int) then what we're effectively trying to emulate is the execution of task_A(double x[i], double y[i], int scale[i]) for all values of i.
Every task (function) operates on a different subset of the data, so these functions share a common function signature and only have access to data via specific APIs, e.g. get_int("scale") and set_double("x", 0.2).
In a previous incarnation of the framework (written in C), tasks were scheduled statically and the framework generated code based on a given model to run the simulation. The ordering of tasks is based on a dependency graph extracted from the model definition.
We're now attempting to create a common runtime for all models with a run-time scheduler that executes tasks as their dependencies are met. The move from generating model-specific code to a generic one has brought about all sorts of pain. Essentially, I need to be able to generically handle heterogenous vectors and access them by "name" (and perhaps type_info), hence the above question.
I'm open to suggestions. Any suggestion.
Looking through the added detail, my immediate reaction would be to separate the data out into a number of separate maps, with the type as a template parameter. For example, you'd replace get_int("scale") with get<int>("scale") and set_double("x", 0.2) with set<double>("x", 0.2);
Alternatively, using std::map, you could pretty easily change that (for one example) to something like doubles["x"] = 0.2; or int scale_factor = ints["scale"]; (though you may need to be a bit wary with the latter -- if you try to retrieve a nonexistent value, it'll create it with default initialization rather than signaling an error).
Either way, you end up with a number of separate collections, each of which is homogeneous, instead of trying to put a number of collections of different types together into one big collection.
If you really do need to put those together into a single overall collection, I'd think hard about just using a struct, so it would become something like vals.doubles["x"] = 0.2; or int scale_factor = vals.ints["scale"];
At least offhand, I don't see this losing much of anything, and by retaining static typing throughout, it certainly seems to fit better with how C++ is intended to work.
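A minimal sketch of the get<T>/set<T> idea with one homogeneous map per type. The class name Store and the overload-by-pointer-tag trick used to select the map are assumptions, not part of the original framework; only int and double are supported here. get uses at(), so a missing name throws std::out_of_range instead of silently creating an entry.

```cpp
#include <map>
#include <stdexcept>
#include <string>

class Store {
public:
    template <class T>
    void set(const std::string& name, const T& value) {
        table(static_cast<T*>(nullptr))[name] = value;
    }

    // Throws std::out_of_range if no value of type T is stored under name.
    template <class T>
    const T& get(const std::string& name) const {
        return table(static_cast<T*>(nullptr)).at(name);
    }

private:
    // One overload per supported type; the pointer argument is only a tag
    // that selects the matching map at compile time.
    std::map<std::string, int>& table(int*) { return ints_; }
    std::map<std::string, double>& table(double*) { return doubles_; }
    const std::map<std::string, int>& table(int*) const { return ints_; }
    const std::map<std::string, double>& table(double*) const { return doubles_; }

    std::map<std::string, int> ints_;
    std::map<std::string, double> doubles_;
};
```

With this, set_double("x", 0.2) becomes store.set<double>("x", 0.2) and get_int("scale") becomes store.get<int>("scale"). Asking for an unsupported type such as get<char> fails at compile time.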

Choosing specific objects satisfying conditions

Let's say I have objects which look very roughly like this:
class object
{
public:
// ctors etc.
bool has_property_X() const { ... }
std::size_t size() const { ... }
private:
// a little something here, but not really much
};
I'm storing these objects inside a vector and the vector is rather small (say, at most around 1000 elements). Then, inside a performance critical algorithm, I would like to choose the object that both has the property X and has the least size (in case there are multiple such objects, choose any of them). I need to do this "choosing" multiple times, and both the holding of the property X and the size may vary in between the choices, so that the objects are in a way dynamic here. Both queries (property, size) can be made in constant time.
How would I best achieve this? Performance is profiled to be important here. My ideas at the moment:
1) Use std::min_element with a suitable predicate. This would probably also need boost::filter_iterator or something similar to iterate over objects satisfying property X?
2) Use some data structure, such as a priority queue. I would store pointers or reference_wrappers to the objects and so forth. This, at least to me, feels slow, and it is probably not even feasible because of the dynamic nature of the objects.
Any other suggestions or comments on these thoughts? Should I just go ahead and try any or both of these schemes and profile?
Your last choice is always a good one. Our intuitions about how code will run are often wrong. So where possible profiling is always useful on critical code.
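As a baseline to profile against, idea 1 from the question can also be written as a single linear pass, without boost::filter_iterator. This is only a sketch: the object class below is a hypothetical stand-in with the same two queries, and smallest_with_X is an invented name.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the question's class.
class object {
public:
    object(bool x, std::size_t n) : x_(x), n_(n) {}
    bool has_property_X() const { return x_; }
    std::size_t size() const { return n_; }
private:
    bool x_;
    std::size_t n_;
};

// Returns a pointer to an object that has property X and the least size,
// or nullptr if no object has property X. One pass, no allocation.
const object* smallest_with_X(const std::vector<object>& v) {
    const object* best = nullptr;
    for (const object& o : v) {
        if (!o.has_property_X()) continue;
        if (best == nullptr || o.size() < best->size()) best = &o;
    }
    return best;
}
```

For around 1000 contiguous elements with constant-time queries, a scan like this is a reasonable first candidate to measure before trying anything more elaborate.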

Is this use of nested vector/multimap/map okay?

I am looking for the perfect data structure for the following scenario:
I have an index i, and for each one I need to support the following operation 1: Quickly look up its Foo objects (see below), each of which is associated with a double value.
So I did this:
struct Foo {
int a, b, c;
};
typedef std::map<Foo, double> VecElem;
std::vector<VecElem> vec;
But it turns out to be inefficient because I also have to provide very fast support for the following operation 2: Remove all Foos that have a certain value for a and b (together with the associated double values).
To perform this operation 2, I have to iterate over the maps in the vector, checking the Foos for their a and b values and erasing them one by one from the map, which seems to be very expensive.
So I am now considering this data structure instead:
struct Foo0 {
int a, b;
};
typedef std::multimap<Foo0, std::map<int, double> > VecElem;
std::vector<VecElem> vec;
This should provide fast support for both operations 1 and 2 above. Is that reasonable? Is there lots of overhead from the nested container structures?
Note: Each of the multimaps will usually only have one or two keys (of type Foo0), each of which will have about 5-20 values (of type std::map<int,double>).
To answer the headline question: yes, nesting STL containers is perfectly fine. Depending on your usage profile, this could result in excessive copying behind the scenes, though. A better option might be to wrap the contents of all but the top-level container in boost::shared_ptr, so that container housekeeping does not require a deep copy of your nested container's entire contents. This would be the case if, say, you plan on spending a lot of time inserting and removing VecElem objects in the top-level vector, which is expensive if VecElem is a direct multimap.
Memory overhead in the data structures is likely to be not significantly worse than anything you could design with equivalent functionality, and more likely better unless you plan to spend more time on this than is healthy.
Well, you have a reasonable start on this idea ... but there are some questions that must be addressed first.
For instance, is the type Foo mutable? If it is, then you need to be careful about creating a type Foo0 (a different name may be a good idea here to avoid confusion), since changes to Foo may invalidate Foo0.
Second, you need to decide whether you also need this structure to work well for inserts/updates. If the population of Foo is static and unchanging - this isn't an issue, but if it isn't, you may end up spending a lot of time maintaining Vec and VecElem.
As far as the question of nesting STL containers goes, this is fine - and is often used to create arbitrarily complex structures.
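A sketch of operation 2 on the proposed nested structure. The function remove_ab is a hypothetical name, and the ordering on Foo0 is an assumption: the question does not show one, but std::multimap needs it. Erasing by key removes every entry for that (a, b) pair in one call per vector element, instead of walking all Foos.

```cpp
#include <cstddef>
#include <map>
#include <tuple>
#include <vector>

struct Foo0 {
    int a, b;
};

// Strict weak ordering required by std::multimap (assumed, not in the question).
inline bool operator<(const Foo0& lhs, const Foo0& rhs) {
    return std::tie(lhs.a, lhs.b) < std::tie(rhs.a, rhs.b);
}

typedef std::multimap<Foo0, std::map<int, double> > VecElem;

// Operation 2: remove everything stored under (a, b) from every element.
// Returns the number of multimap entries that were removed.
std::size_t remove_ab(std::vector<VecElem>& vec, int a, int b) {
    const Foo0 key = {a, b};
    std::size_t removed = 0;
    for (VecElem& elem : vec) {
        removed += elem.erase(key);   // erases all entries equal to key
    }
    return removed;
}
```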

C++ class instances

I am working on intro c++ homework, but I am stuck.
Account *GetAccount(int an);
int main()
{
Account *a1,*a2,*b1;
a1=GetAccount(123);
a2=GetAccount(456);
b1=GetAccount(123);
if(a1==b1)
cout<<"YES"<<endl;
else
cout<<"NO"<<endl;
}
GetAccount method is supposed to check whether the instance already exists with the same account number, if it does, returns that instance.
The only method I can think of is to create an array of Account objects and search it for the account; if it doesn't exist, insert a new Account into the array, and if it does exist, return a pointer to that array element.
This method doesn't really seem efficient to me, and is there any other way?
Yes. Instead of an array, use a map. It will be more efficient in terms of space, and almost as fast.
You can use STL and keep your accounts in a std::map, one of these variants:
map<int, Account> or
map<int, Account*>
In the first case, you keep the Accounts in the map; in the second, you keep pointers to Accounts and are responsible for creation and deletion. Which variant is more appropriate? It depends on how you create and initialize the account.
Short tutorial on STL map usage
I will explain the case where you keep the pointers in the map.
This is how you would declare the map:
map<int, Account*> accounts;
This is how you can add a new account to the map:
int account_id = 123; // or anything else
Account* account = new Account(...parameters for the constructor...);
// any additional code to initialize the account goes here
accounts[account_id] = account; // this adds account to the map
This is how you check if the account with account_id is in the map:
if (accounts.find(account_id) != accounts.end()) {
// It is in the map
} else {
// it is not in the map
}
This is how you get pointer to an account from the map:
Account* ifoundit = accounts[account_id];
Finally, somewhere at the end of your program, you need to clean the map and delete all the account objects. The program will work fine even without the cleanup, but it's important to clean up after yourself. I leave this as an exercise for you :)
Find how to iterate all the elements of the map, and apply delete appropriately.
This method doesn't really seem efficient to me, and is there any other way?
Yes, as others have mentioned, there are more efficient ways using data structures other than arrays. If you've recently been studying arrays and loops in your class, though, the method you describe is probably what your instructor is expecting. I wouldn't try to get too far ahead of your instructor, since the arrays and loop method is probably the sort of thing you'll need to be very familiar with when you take your exams. It's also a good idea to have a strong foundation in the basics before you move forward. (Don't let that deter you from asking more advanced questions on here, though.)
Consider hash tables.
You could use a std::map instead of a simple array.
The method you proposed, an array of ids which you walk through and test, is a very easy one. I would use a std::vector, however, not an array, as you then don't have to worry about size. Otherwise, you just declare a big array, and test that it isn't full when adding.
In terms of efficiency, doing a linear search over a small array (in the hundreds) is quite fast, and may well be faster than other solutions, like maps and sets. However, it does not scale well.
Try to write your code well, but don't worry about optimising it until you know you have a problem. I would much rather my programmers wrote clean, easy-to-maintain code than go for optimal speed. We can always speed things up later, if we need to.
Create a std::map of account # => Account object.
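A minimal sketch of GetAccount built on a map that stores the Account objects by value. The Account class below is a hypothetical stand-in, since the homework's real class is not shown. Because the map owns the objects, no manual delete is needed, and pointers to elements of a std::map remain valid when other elements are inserted.

```cpp
#include <map>
#include <utility>

// Hypothetical stand-in for the homework's Account class.
class Account {
public:
    explicit Account(int number) : number_(number) {}
    int number() const { return number_; }
private:
    int number_;
};

// Returns the existing Account for this number, creating it on first use.
Account* GetAccount(int an) {
    static std::map<int, Account> accounts;   // owns every Account
    std::map<int, Account>::iterator it = accounts.find(an);
    if (it == accounts.end()) {
        it = accounts.insert(std::make_pair(an, Account(an))).first;
    }
    return &it->second;
}
```

With this in place, the main from the question would print YES, since both calls with 123 return the same pointer.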