Please help to figure out the logic of using unordered_set with custom structures.
Suppose I have the following class
struct MyClass {
int id;
// other members
};
used with shared_ptr
using CPtr = std::shared_ptr<MyClass>;
For fast access by key, I planned to use an unordered_set with a custom hash, using the MyClass::id member as the key:
template <class T> struct CHash;
template<> struct CHash<CPtr>
{
std::size_t operator() (const CPtr& c) const
{
return std::hash<decltype(c->id)> {} (c->id);
}
};
using CSet = std::unordered_set<CPtr, CHash<CPtr>>;
Right now, unordered_set still seems to be an appropriate container. However, the standard find() functions for sets only give access to const elements, to ensure that keys cannot be changed. I intend to modify the objects while guaranteeing that the keys stay unchanged. So, the questions are:
1) How can I easily access an element of the set by its int key while retaining the ability to modify the element, something like
auto element = my_set.find(5);
element->b = 3.3;
It is possible to add a converting constructor and use something like
auto element = my_set.find(MyClass (5));
But that doesn't solve the constness problem, and what if the class is huge?
2) Am I actually going the wrong way? Should I use another container? For example unordered_map, which would store one more int key for each entry and so consume more memory.
A pointer doesn't project its constness to the object it points to. That means that if you have a const reference to a std::shared_ptr (as in a set), you can still modify the object via this pointer. Whether or not that is something you should do is a different question, and it doesn't solve your lookup problem.
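To illustrate the point about constness, here is a minimal sketch. The payload member b, the functors IdHash and IdEqual, and the helper find_by_id are hypothetical names introduced only for this example. Note that the set also needs an id-based equality predicate, because the default std::equal_to compares the shared_ptr addresses rather than the ids.

```cpp
#include <cstddef>
#include <functional>
#include <memory>
#include <unordered_set>

struct MyClass {
    int id;
    double b;  // hypothetical payload member
};
using CPtr = std::shared_ptr<MyClass>;

// Hash and equality both look only at id, so id must never change
// while the object is stored in the set.
struct IdHash {
    std::size_t operator()(const CPtr& c) const { return std::hash<int>{}(c->id); }
};
struct IdEqual {
    bool operator()(const CPtr& l, const CPtr& r) const { return l->id == r->id; }
};
using CSet = std::unordered_set<CPtr, IdHash, IdEqual>;

// Returns the stored pointer whose id matches, or nullptr if there is none.
CPtr find_by_id(const CSet& s, int id) {
    CPtr probe = std::make_shared<MyClass>();  // temporary object used only as a key
    probe->id = id;
    auto it = s.find(probe);
    if (it == s.end()) return nullptr;
    return *it;  // copies the const shared_ptr; the pointee stays mutable
}
```

The probe object still has to be allocated for every lookup, which is the cost the question worries about for a large class.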
Of course, if you want to look up a value by a key, then this is what std::unordered_map was designed for, so I'd have a closer look there. The main problem I see with this approach is not so much the memory overhead (unordered_set and unordered_map as well as shared_ptr have noticeable memory overhead anyway), but that you have to maintain redundant information (id in the object and id as a key).
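A minimal sketch of the unordered_map variant, assuming a hypothetical payload member b and hypothetical helpers add and find_by_id. Routing every insertion through one helper is one possible way to keep the redundant key and MyClass::id from drifting apart.

```cpp
#include <unordered_map>

struct MyClass {
    int id;
    double b;  // hypothetical payload member
};
using CMap = std::unordered_map<int, MyClass>;

// Inserting through one helper keeps the map key and MyClass::id in sync.
// Returns false if an entry with the same id already exists.
bool add(CMap& m, const MyClass& c) {
    return m.emplace(c.id, c).second;
}

// Returns a pointer to the stored object, or nullptr if the id is unknown.
MyClass* find_by_id(CMap& m, int id) {
    auto it = m.find(id);
    return it == m.end() ? nullptr : &it->second;
}
```

The mapped value is not const, so the caller can modify it freely; only the int key is protected.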
If you don't have many insertions, don't absolutely need the (on average) constant lookup time, and memory overhead is really important to you, you could consider a third solution (besides using a third-party or self-written data structure, of course): a thin wrapper around a sorted std::vector<std::shared_ptr<MyClass>> or, if appropriate, even better a std::vector<std::unique_ptr<MyClass>>, with the wrapper using std::upper_bound for lookups.
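A rough sketch of such a wrapper, under stated assumptions: the class name SortedById and the payload member b are hypothetical, and the sketch uses std::lower_bound rather than std::upper_bound, because lower_bound returns the first element whose id is not less than the key, which is the candidate match directly.

```cpp
#include <algorithm>
#include <memory>
#include <utility>
#include <vector>

struct MyClass {
    int id;
    double b;  // hypothetical payload member
};

class SortedById {
public:
    // Inserts p at its sorted position; returns false if the id is already present.
    bool insert(std::unique_ptr<MyClass> p) {
        auto it = lower(p->id);
        if (it != items_.end() && (*it)->id == p->id) return false;
        items_.insert(it, std::move(p));
        return true;
    }

    // Binary search, O(log n). Returns nullptr if the id is not stored.
    MyClass* find(int id) {
        auto it = lower(id);
        return (it != items_.end() && (*it)->id == id) ? it->get() : nullptr;
    }

private:
    using Items = std::vector<std::unique_ptr<MyClass>>;

    // First element whose id is not less than the given id.
    Items::iterator lower(int id) {
        return std::lower_bound(items_.begin(), items_.end(), id,
            [](const std::unique_ptr<MyClass>& e, int key) { return e->id < key; });
    }

    Items items_;
};
```

Insertion is O(n) because later elements are shifted, which is why this only pays off when insertions are rare.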
I think you are going the wrong way by using unordered_set, because the reference description of unordered_set is very clear:
Keys are immutable, therefore, the elements in an unordered_set cannot be modified once in the container - they can be inserted and removed, though.
You can see the full description at:
http://www.cplusplus.com/reference/unordered_set/unordered_set/.
I hope this helps.
If I have a find function that can sometimes fail to find the required thing, I tend to make that function return a pointer such that a nullptr indicates that the thing was not found.
E.g.
Student* SomeClass::findStudent(/** some criteria. */)
If the Student exists, it will return a pointer to the found Student object, otherwise it will return nullptr.
I've seen boost::optional advocated for this purpose as well. E.g. When to use boost::optional and when to use std::unique_ptr in cases when you want to implement a function that can return "nothing"?
My question is: isn't returning a pointer the best solution in this case? That is, there is a possibility that the queried item will not be found, in which case returning nullptr is a perfect solution. What is the advantage of using something like boost::optional (or any other similar solution)?
Note that, in my example, findStudent will only ever return a pointer to an object that is owned by SomeClass.
The advantage of an optional<Student&> return type here is that the semantics of usage are readily apparent to all users that are familiar with optional (and will become readily apparent once they familiarize themselves with it). Those semantics are:
The caller does not own the Student and is not responsible for memory management. The caller simply gets a reference to an existing object.
It is clear that this function can fail. You maybe get a value and you maybe get nothing. It is clear that the caller needs to check the result one way or the other.
optional<T> is self-documenting in a way that T* isn't. Moreover, it has other benefits in that it can work in cases where you want to return any kind of object type without the need for allocation. What if you needed to return an int or double or SomePOD?
optional<T&> was removed from the C++ standardization track because its use is questionable: it behaves nearly identically to a non-owning T* with slightly different (and confusingly different from optional<T> and T*) semantics.
optional<T&> is basically a non-owning T* wrapped up pretty, and somewhat strangely.
Now, optional<T> is a different beast.
I have used optional<Iterator> in my container-based find algorithms. Instead of returning end(), I return the empty optional. This lets users determine without a comparison if they have failed to find the item, and lets code like:
if(linear_search_for( vec, item))
work, while the same algorithm also lets you get at both the item and the location of the item in the container if you actually need it.
Pointers to elements don't give you the location information you might want, except with contiguous containers.
So here, I've created a nullable iterator that has the advantages of iterators (generically working with different types of containers) and pointers (can be tested for the null state).
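A minimal sketch of what such a nullable-iterator search could look like. This is not the answer author's actual implementation: it assumes C++17 std::optional in place of boost::optional, and index_of is a hypothetical helper added only to show a call site.

```cpp
#include <algorithm>
#include <iterator>
#include <optional>
#include <vector>

// Returns an engaged optional holding the iterator to the first match,
// or an empty optional instead of end() when nothing matches.
template <class Container, class T>
auto linear_search_for(Container& c, const T& item)
    -> std::optional<decltype(std::begin(c))>
{
    auto it = std::find(std::begin(c), std::end(c), item);
    if (it == std::end(c)) return std::nullopt;
    return it;
}

// Example use: position of item in vec, or -1 if it is absent.
long index_of(std::vector<int>& vec, int item) {
    if (auto hit = linear_search_for(vec, item))      // no comparison against end()
        return static_cast<long>(*hit - vec.begin()); // the location is still available
    return -1;
}
```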
The next use is actually returning a value. Suppose you have a function that calculates a rectangle.
Rect GetRect();
Now, this is great. But what if the question can be meaningless? Well, one approach is to return an empty rect or other "flag" value.
Optional lets you communicate that it can return a rect, or nothing, and not use the empty rect for the "nothing" state. It makes the return value nullable.
int GetValue();
is a better example. An invalid value could use a flag state of the int -- say -1 -- but that forces every user of your function to look up and track the flag state, and not accidentally treat it as a normal state.
Instead, optional<int> GetValue() makes it clear that it can fail, and what the failure state is. If it is populated, you know it is a real value, and not a flag value.
In both of these cases, returning a non-owning pointer is non-viable, because who owns the storage? Returning an owning pointer is expensive, because pointless heap allocations are pointless.
Optionals are nullable value types. When you want to manage resources locally, and you still want an empty state, they make it clear.
Another thing to look into is the expected type being proposed. This is an optional, but when in the empty state contains a reason why it is empty.
optional<T&> may indeed be replaced by T*, but T* has no clear semantics (ownership?).
But optional<T> cannot be replaced by T*.
For example:
optional<Interval> ComputeOverlap(const Interval&, const Interval&);
If there is no overlap, no problem with T* (nullptr) or optional<T>.
But if there is an overlap, we need to create a new interval. We may return a smart pointer in this case, or an optional.
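As a sketch of the optional variant, assuming C++17 std::optional and a hypothetical Interval with closed integer bounds lo and hi (the answer does not define Interval):

```cpp
#include <algorithm>
#include <optional>

struct Interval {  // hypothetical: closed interval [lo, hi]
    int lo;
    int hi;
};

// Returns the intersection of a and b by value, or an empty optional
// when the intervals do not overlap. No heap allocation is involved.
std::optional<Interval> ComputeOverlap(const Interval& a, const Interval& b) {
    const int lo = std::max(a.lo, b.lo);
    const int hi = std::min(a.hi, b.hi);
    if (lo > hi) return std::nullopt;
    return Interval{lo, hi};
}
```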
Let's say you have a std::map<IndexType, ValueType> in which you are trying to find something (note: the same applies to other containers, this is just to have an example). You have these options:
You return a ValueType&: The user can modify your map content and does not need to think about memory allocation/deallocation. But if you don't find anything in your map, you need to throw an exception or something similar.
You return a ValueType*: The user can modify your map content and you can return a nullptr if you don't find anything. But the user can call delete on that pointer, so you must somehow specify whether they have to do so or not.
You return a smart pointer to ValueType: The user does not have to worry about whether to delete, and can modify your map content depending on the type of smart pointer. You can also return a nullptr. But this pretty much requires you to deal with smart pointers in your map, which is overly complicated if ValueType would otherwise be e.g. just an int.
You return a simple ValueType: The user cannot modify your map content and does not need to think about memory allocation/deallocation. But if you don't find anything in your map, you need to return some special ValueType which tells the user you didn't find anything. If your ValueType is e.g. int, which value would you return that makes clear "no int found"?
You return a boost::optional, which is the closest you can get to a simple ValueType return by value with the additional option of "not returning a ValueType"
This is inspired by an Item in Effective C# first edition, warning about overriding GetHashCode() naively.
Sorry, I do not have supporting code. By the way, this is not a homework, I am just not that familiar with C++/STL, and could not find information regarding implementation.
Suppose I create my own class named person which has 3 public mutable string fields:
First Name,
Middle Initial
Last Name
It also provides a less than operator to compare one person to another based on first name first, then middle name, and then the last name - that is all.
I create a map from person to int (say age), and fill it up with some 20 key/value pairs. I also store pointers to my keys in an array. I then change the first name of the object that, say, the fifth pointer points to, and try to look up the corresponding age using this modified key (remember, the object is mutable and wide open). The lookup fails.
Why did this happen?
A) Because the key used by std::map has not changed (was copied), and I changed my own copy and now my key is not found. But how can this be? I have not provided my own copy constructor. Perhaps a default one was created by the compiler?
B) The std::map collection is actually a Red-Black tree, and I happened to have a direct pointer to a key. When I changed the key, I changed it directly in the node of the tree. Now it is likely that my node is not positioned correctly, and will not be found using a proper tree search algorithm. I should have deleted the node, then modified the key, and then re-inserted it. If this is the case, then I suspect that STL collections in general are rather dangerous and cause noobs to make many mistakes.
C) Something else?
I would appreciate your insights.
When you use std containers all data is copied into the container. For maps this is no different.
One restriction that map places on the data is that the key is not mutable. Once it is inserted it is fixed; to change the key you must find and erase the entry, then re-insert it under the new key.
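#include <map>
#include <string>
#include <tuple>

struct Person
{
    std::string first;
    std::string middle;
    std::string last;
    Person(std::string const& f, std::string const& m, std::string const& l) : first(f), middle(m), last(l) {}
    // Must be a const member: std::map compares its keys through const references.
    bool operator<(Person const& rhs) const { return std::tie(first, middle, last) < std::tie(rhs.first, rhs.middle, rhs.last); }
};
std::map<Person,int> ageMap;
ageMap[Person("Tom", "Jones", "Smith")] = 68;
ageMap[Person("Tom", "I", "Smith")] = 46;
ageMap[Person("Tom", "II", "Smith")] = 24;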
When you create your array of pointers to Person, it will fail to compile unless the array holds pointers to const.
Person* pMap[3];
pMap[0] = &ageMap.begin()->first; // Fails: needs a pointer to const.
Person const* pMapConst[3];
pMapConst[0] = &ageMap.begin()->first; // OK. Note the pointer to const.
The standard containers require that the classes stored in them have value semantics, and thus they are copied. But
if you store pointers of any kind, obviously what is pointed to isn't copied;
if you try to modify the stored key (you can get a reference to it), especially by playing tricks with mutable members or const_cast in order to be able to do so, in such a way that the sorting order isn't preserved, you are in UB realm.
The entries always make a copy. If the key type is std::string, then, yes, that's a copy. (Behind the scenes, std::string does some optimizations so the characters aren't necessarily always copied, but that's beside the point.)
(I think there's no way to get a pointer into the map's objects, so you can't change that key, ever again, just get copies upon iteration or other retrieval.)
Now, if your key type is std::string* (a pointer!) then the bits in the pointer are copied, but if the value of the particular string instance is later changed, then the key will effectively be changed.
(And the comparator needs to be appropriate for your key type.)
Yes -- when you insert an item into an std::map, you pass it by value, so what it contains is a copy of what you passed. Yes, the compiler will synthesize a copy constructor for you unless you declare one yourself.
It is possible to create (for example) a map that uses a pointer as its key (along with a comparison function/functor that compares what the pointers refer to). If, however, you attempt to modify the keys pointed to by those pointers, you get UB. If you want to modify a key in a set/map/multiset/multimap, you need to delete the existing item from the collection, modify your copy, then insert the modified version back into the collection.
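The erase-then-reinsert step can be sketched as a small helper. The name rekey is hypothetical, and the policy of refusing to overwrite an existing entry is an assumption made for this example.

```cpp
#include <map>
#include <string>
#include <utility>

// Moves the value stored under old_key to new_key.
// Returns false (and leaves the map untouched) if old_key is missing
// or new_key is already in use.
template <class K, class V>
bool rekey(std::map<K, V>& m, const K& old_key, const K& new_key) {
    auto it = m.find(old_key);
    if (it == m.end() || m.count(new_key) != 0) return false;
    V value = std::move(it->second);  // keep the mapped value
    m.erase(it);                      // remove the node with the stale key
    m.emplace(new_key, std::move(value));  // the tree places the new key correctly
    return true;
}
```

Because the key is never changed in place, the tree ordering is never violated and later lookups keep working.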
I have a big object, say MyApplicationContext, which keeps information about MyApplication such as name, path, loginInformation, description, details and others.
//MyApplicationCtx
class MyApplicationCtx{
// ....
private:
std::string name;
std::string path;
std::string desciption;
struct loginInformation loginInfo;
int appVersion;
std::string appPresident;
//others
}
This is my method cloneApplication(), which actually sets up a new application. There are two ways to do it, as shown in Code 1 and Code 2. Which one should I prefer, and why?
//Code 1
public void cloneApplication(MyApplicationCtx appObj){
setAppName(appObj);
setAppPath(appObj);
setAppAddress(&appObj); // Note this address is passed
setAppDescription(appObj);
setAppLoginInformation(appObj);
setAppVersion(appObj);
setAppPresident(appObj);
}
public void setAppLoginInformation(MyApplicationCtx appObj){
this->loginInfo = appObj.loginInfo; //assume it is correct
}
public void setAppAddress(MyApplicationCtx *appObj){
this->address = appObj->address;
}
.... // same way other setAppXXX(appObj) methods are called.
Q1. Does passing the big object appObj every time have a performance impact?
Q2. If I pass it using reference, what should be the impact on performance?
public void setAppLoginInformation(MyApplicationCtx &appObj){
this->loginInfo = appObj.loginInfo;
}
//Code 2
public void setUpApplication(MyApplicationCtx appObj){
std::string appName;
appName += appObj.getName();
appName += "myname";
setAppName(appName);
std::string appPath;
appPath += appObj.getPath();
appPath += "myname";
setAppPath(appPath);
std::string appaddress;
appaddress += appObj.getAppAddress();
appaddress += "myname";
setAppAddress(appaddress);
... same way setup the string for description and pass it to function
setAppDescription(appdescription);
struct loginInformation loginInfo = appObj.getLoginInfo();
setAppLoginInformation(loginInfo);
... similarly appVersion
setAppVersion(appVersion);
... similarly appPresident
setAppPresident(appPresident);
}
Q3. Comparing Code 1 and Code 2, which one should I use? Personally I like Code 1.
You're better off defining a Copy Constructor and an Assignment Operator:
// Note the use of passing by const reference! This avoids the overhead of copying the object in the function call.
MyApplicationCtx(const MyApplicationCtx& other);
MyApplicationCtx& operator = (const MyApplicationCtx& other);
Better still, also define a private struct in your class that looks like:
struct AppInfo
{
std::string name;
std::string path;
std::string desciption;
struct loginInformation loginInfo;
int appVersion;
std::string appPresident;
};
In your App class' copy constructor and assignment operator you can take advantage of AppInfo's automatically generated assignment operator to do all of the assignment for you. This is assuming you only want a subset of MyApplicationCtx's members copied when you "clone".
This will also automatically be correct if you add or remove members of the AppInfo struct without having to go and change all of your boilerplate.
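A minimal sketch of this idea. The method cloneFrom, the accessors, the openSessions member and the contents of LoginInformation are hypothetical, added only so the example is self-contained.

```cpp
#include <string>

struct LoginInformation {  // hypothetical stand-in for the question's struct
    std::string user;
};

class MyApplicationCtx {
public:
    // "Clone" only the AppInfo subset; other state is left untouched.
    // AppInfo's compiler-generated assignment copies every member.
    void cloneFrom(const MyApplicationCtx& other) { info_ = other.info_; }

    void setName(const std::string& name) { info_.name = name; }
    const std::string& name() const { return info_.name; }

    int openSessions = 0;  // hypothetical state that is not part of a clone

private:
    struct AppInfo {
        std::string name;
        std::string path;
        std::string description;
        LoginInformation loginInfo;
        int appVersion = 0;
        std::string appPresident;
    };
    AppInfo info_;
};
```

Adding a member to AppInfo requires no change to cloneFrom, which is the maintenance benefit described above.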
Short answer:
Q1: Given the size of your MyAppCtx class, yes, a significant performance hit will take place if the data is dealt with very frequently.
Q2: Minimal, you're passing a pointer.
Q3: Neither, for large objects like that you should use reference semantics and access the data through accessors. Don't worry about function call overhead, with optimizations turned on, the compiler can inline them if they meet various criteria (which I leave up to you to find out).
Long answer:
Given functions:
void FuncByValue(MyAppCtx ctx);
void FuncByRef1(MyAppCtx& ctx);
void FuncByRef2(MyAppCtx* ctx);
When passing large objects like your MyApplicationCtx, it's a good idea to use reference semantics (FuncByRef1 and FuncByRef2). Passing by reference is identical in performance to passing a pointer; the difference is only the syntax. If you pass the object by value, it is copy-constructed into the function, so the argument you pass into FuncByValue is a different object from the parameter FuncByValue receives. This is where you have to be careful with any pointers contained in an object that is passed by value: the pointer is copied as well, so more than one object may point to the same memory at a given time, which can lead to memory leaks, corruption, etc.
In general, for objects like your MyAppCtx, I would recommend passing by reference and using accessors as appropriate.
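The difference can be made visible with a small sketch that counts copy-constructor calls. The static counter copies and the function names FuncByConstRef and FuncByPtr are hypothetical, and this MyAppCtx is an empty stand-in for the real class.

```cpp
struct MyAppCtx {
    static int copies;  // counts copy-constructor calls
    MyAppCtx() = default;
    MyAppCtx(const MyAppCtx&) { ++copies; }
};
int MyAppCtx::copies = 0;

void FuncByValue(MyAppCtx) {}             // copy-constructs its parameter
void FuncByConstRef(const MyAppCtx&) {}   // no copy; caller's object is read-only here
void FuncByPtr(const MyAppCtx*) {}        // no copy; same cost as the reference
```

Calling FuncByValue with an existing object increments the counter, while the reference and pointer versions leave it unchanged. How much a copy costs for the real class is something only a measurement can tell.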
Note, the reason I differentiated between argument and parameter above is that there is a difference between a function argument and a function parameter, it is as follows:
Given (template T is used simply to demonstrate that the object type is irrelevant here):
template<typename T>
void MyFunc(T myTobject);
When calling MyFunc, you pass in an argument, eg:
int my_arg = 3;
MyFunc(my_arg);
And MyFunc receives a parameter, eg:
template<typename T>
void MyFunc(T myTobject)
{
T cloned_param = T(myTobject);
}
In other words, my_arg is an argument, myTobject is a parameter.
Another note, in the above examples, there are essentially three versions of my_arg in memory: the original argument, the copy-constructed parameter myTobject, plus cloned_param which was explicitly copied as well.
Luke beat me to telling you about copy constructors. To answer your other questions: passing a large object by value has a performance impact compared to passing by reference. Make the reference const if the function won't change the object, as in this case.
General:
Why do you need to copy application object? Isn't it better to use singleton for this (with completely disabled copying by the way)?
Q1:
Not only performance (yes, the members will be copied) but memory too. As far as I have seen in std::string implementations, each string occupies at least two memory chunks, and the first is in any case significantly smaller than the minimal allocation size, so such objects can cause memory-efficiency problems if cloned extensively.
Q2:
Passing a reference barely differs (from a performance point of view) from passing a pointer, so in general this should have constant cost. It is much better. Don't forget to add the "const" qualifier to block modifications.
Q3:
I actually don't like either, because both break encapsulation. I once read a good article by a Java programmer called something like "Why setters/getters are evil" (I found it easily, and not much of it depends on Java itself). It is a VERY useful article that can change your style forever.
Q1.
Passing an object by value is a heavy operation, since it creates a copy by invoking the copy constructor.
Q2.
Pass a reference to const; it will improve performance.
Q1 - yes, every time you pass an object a copy is done
Q2 - minimal since an object passed by reference is basically just a pointer
Q3 - It is generally not a good idea to have large monolithic objects, instead you should split your objects into smaller objects, this allows for better re-usability and makes the code easier to read.
The best practice for cloning an object where all the members are copyable, like "Code 1" appears to be doing, is to use the default copy constructor - you don't have to write any code at all. Just copy like this:
MyApplicationCtx new_app = old_app;
"Code 2" is doing something different to "Code 1", so choosing one over the other is a matter of what you want the code to do, not a matter of style.
Q1. Passing a large object by value will cause it to be copied, which will have an impact on performance.
Q2. Yes, passing a reference to a large structure is more efficient than passing a copy. The only way to tell how large the impact is is to measure it with a profiler.
There is one point that has not been dealt with in any of the other answers (which focused on your explicit questions more than on the general approach). I agree with @luke that you should use what is idiomatic: copy constructors and assignment operators are there for a reason. But just for the sake of discussion on the first possibility you presented:
In the first block you propose small functions like:
public:
void setAppLoginInformation(MyApplicationCtx appObj){
this->loginInfo = appObj.loginInfo; //assume it is correct
}
Now, besides the fact that the parameter should be passed by const reference, there are some other design issues there. You are offering a public operation that promises to change the login information, but the argument you require from your user is a full blown application context.
If you want to provide a method for just setting one of the attributes, I would change the method signature to match the attribute type:
public:
void setAppLoginInformation( loginInformation const & li ); // struct is optional here
This offers the possibility of changing the loginInformation both with a full application context or just with some specific login information object you can build yourself.
If, on the other hand, you want to disallow changing particular attributes of your class and you want to allow setting the values only from another application context object, then you should use the assignment operator, and if you want to do it in terms of small setter functions (assuming that the compiler-provided assignment operator does not suffice), make them private.
With the proposed design you are offering users the possibility of setting each attribute to any value, but you are doing so in a cumbersome, hard to use way.
I know most people consider this bad practice, but when you are trying to make your class's public interface work only with references, keeping pointers inside and only when necessary, I think there is no way to return something indicating that the value you are looking for doesn't exist in the container.
class list {
public:
value &get(type key);
};
Let's say you don't want dangerous pointers to be seen in the public interface of the class. How do you return "not found" in this case? By throwing an exception?
What is your approach to that? Do you return an empty value and check for the empty state of it? I actually use the throw approach but I introduce a checking method:
class list {
public:
bool exists(type key);
value &get(type key);
};
So when I forget to check that the value exists first I get an exception, that is really an exception.
How would you do it?
The STL deals with this situation by using iterators. For example, the std::map class has a similar function:
iterator find( const key_type& key );
If the key isn't found, it returns 'end()'. You may want to use this iterator approach, or to use some sort of wrapper for your return value.
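A short sketch of the iterator approach with std::map. The helper name age_or and the fallback parameter are hypothetical.

```cpp
#include <map>
#include <string>

// Returns the age stored for name, or fallback if name is not in the map.
int age_or(const std::map<std::string, int>& ages,
           const std::string& name, int fallback) {
    auto it = ages.find(name);              // a single lookup
    if (it == ages.end()) return fallback;  // end() signals "not found"
    return it->second;
}
```

The caller decides what "not found" means, and no exception or second search is needed.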
The correct answer (according to Alexandrescu) is:
Optional and Enforce
First of all, do use the Accessor, but in a safer way, without reinventing the wheel:
boost::optional<X> get_X_if_possible();
Then create an enforce helper:
template <class T, class E = std::runtime_error>
T& enforce(boost::optional<T>& opt, E e = E("enforce failed"))
{
if(!opt)
{
throw e;
}
return *opt;
}
// and an overload for T const &
This way, depending on what might the absence of the value mean, you either check explicitly:
if(boost::optional<X> maybe_x = get_X_if_possible())
{
X& x = *maybe_x;
// use x
}
else
{
oops("Hey, we got no x again!");
}
or implicitly:
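boost::optional<X> maybe_x = get_X_if_possible(); // a named optional: a temporary cannot bind to the non-const reference parameter
X& x = enforce(maybe_x);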
// use x
You use the first way when you’re concerned about efficiency, or when you want to handle the failure right where it occurs. The second way is for all other cases.
The problem with exists() is that you'll end up searching twice for things that do exist (first check if it's in there, then find it again). This is inefficient, particularly if (as its name of "list" suggests) your container is one where searching is O(n).
Sure, you could do some internal caching to avoid the double search, but then your implementation gets messier, your class becomes less general (since you've optimised for a particular case), and it probably won't be exception-safe or thread-safe.
Don't use an exception in such a case. C++ has a nontrivial performance overhead for such exceptions, even if no exception is thrown, and it additionally makes reasoning about the code much harder (cf. exception safety).
Best-practice in C++ is one of the two following ways. Both get used in the STL:
As Martin pointed out, return an iterator. Actually, your iterator can well be a typedef for a simple pointer, there's nothing speaking against it; in fact, since this is consistent with the STL, you could even argue that this way is superior to returning a reference.
Return a std::pair<bool, yourvalue>. This makes it impossible to modify the value, though, since the copy constructor of the pair is called, which doesn't work with reference members.
/EDIT:
This answer has spawned quite some controversy, visible from the comments and not so visible from the many downvotes it got. I've found this rather surprising.
This answer was never meant as the ultimate point of reference. The “correct” answer had already been given by Martin: exceptions reflect the behaviour in this case rather poorly. It's semantically more meaningful to use some other signalling mechanism than exceptions.
Fine. I completely endorse this view. No need to mention it once again. Instead, I wanted to give an additional facet to the answers. While minor speed boosts should never be the first rationale for any decision-making, they can provide further arguments and in some (few) cases, they may even be crucial.
Actually, I've mentioned two facets: performance and exception safety. I believe the latter to be rather uncontroversial. While it's extremely hard to give strong exceptions guarantees (the strongest, of course, being “nothrow”), I believe it's essential: any code that is guaranteed to not throw exceptions makes the whole program easier to reason about. Many C++ experts emphasize this (e.g. Scott Meyers in item 29 of “Effective C++”).
About speed. Martin York has pointed out that this no longer applies in modern compilers. I respectfully disagree. The C++ language makes it necessary for the environment to keep track, at runtime, of code paths that may be unwound in the case of an exception. Now, this overhead isn't really all that big (and it's quite easy to verify this). “nontrivial” in my above text may have been too strong.
However, I find it important to draw the distinction between languages like C++ and many modern, “managed” languages like C#. The latter have no additional overhead as long as no exception is thrown because the information necessary to unwind the stack is kept anyway. By and large, I stand by my choice of words.
STL Iterators?
The "iterator" idea proposed before me is interesting, but the real point of iterators is navigation through a container, not serving as a simple accessor.
If your accessor is one among many, then iterators are the way to go, because you will be able to use them to move in the container. But if your accessor is a simple getter, able to return either the value or the fact there is no value, then your iterator is perhaps only a glorified pointer...
Which leads us to...
Smart pointers?
The point of smart pointers is to simplify pointer ownership. With a shared pointer, you'll get a resource (memory) which will be shared, at the cost of an overhead (shared pointers need to allocate an integer as a reference counter...).
You have to choose: Either your Value is already inside a shared pointer, and then you can return this shared pointer (or a weak pointer). Or your Value is inside a raw pointer. Then you can return the raw pointer. You don't want to return a shared pointer if your resource is not already inside a shared pointer: a world of funny things will happen when your shared pointer goes out of scope and deletes your Value without telling you...
:-p
Pointers?
If your interface is clear about its ownership of its resources, and about the fact that the returned value can be NULL, then you could return a simple, raw pointer. If the user of your code is dumb enough to ignore the interface contract of your object, or to play arithmetics or whatever with your pointer, then he/she will be dumb enough to break any other way you'll choose to return the value, so don't bother with the mentally challenged...
Undefined Value
If your Value type really already has some kind of "undefined" value, and the user knows that and accepts handling it, this is a possible solution, similar to the pointer or iterator solution.
But do not add an "undefined" value to your Value class because of the problem you asked about: you'll end up raising the "references vs. pointers" war to another level of insanity. Code users want the objects you give them to either be OK or not exist. Having to test on every other line of code that the object is still valid is a pain, and it will needlessly complicate the user's code, through your fault.
Exceptions
Exceptions are usually not as costly as some people would like them to be. But for a simple accessor, the cost could be not trivial, if your accessor is used often.
For example, the STL std::vector has two accessors to its value through an index:
T & std::vector::operator[]( /* index */ )
and:
T & std::vector::at( /* index */ )
The difference being that [] is non-throwing. So, if you access outside the range of the vector, you're on your own, probably risking memory corruption, and a crash sooner or later. So, you should really be sure you verified the code using it.
On the other hand, at is throwing. This means that if you access outside the range of the vector, then you'll get a clean exception. This method is better if you want to delegate to another code the processing of an error.
Personally, I use [] when I'm accessing the values inside a loop, or something similar. I use at when I feel an exception is the right way to tell the current code (or the calling code) that something went wrong.
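A small sketch of the throwing accessor in use. The wrapper try_read is a hypothetical helper; std::vector::at and std::out_of_range are standard.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Reads v[i] into out. Returns false instead of propagating
// std::out_of_range when i is outside the vector; out is then unchanged.
bool try_read(const std::vector<int>& v, std::size_t i, int& out) {
    try {
        out = v.at(i);  // at() checks the index and throws on failure
        return true;
    } catch (const std::out_of_range&) {
        return false;
    }
}
```

With operator[] the same out-of-range access would be undefined behaviour rather than a catchable error.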
So what?
In your case, you must choose:
If you really need lightning-fast access, then the throwing accessor could be a problem. But this means you have already used a profiler on your code to determine that this is a bottleneck, doesn't it?
;-)
If you know that not having a value can happen often, and/or you want your client to propagate a possible null/invalid/whatever semantic pointer to the value accessed, then return a pointer (if your value is inside a simple pointer) or a weak/shared pointer (if your value is owned by a shared pointer).
But if you believe the client won't propagate this "null" value, or that they should not propagate a NULL pointer (or smart pointer) in their code, then use the reference protected by the exception. Add a "hasValue" method returning a boolean, and throw should the user try to get the value when there is none.
Last but not least, consider the code that will be used by the user of your object:
// If you want your user to have this kind of code, then choose either
// pointer or smart pointer solution
void doSomething(MyClass & p_oMyClass)
{
    MyValue * pValue = p_oMyClass.getValue() ;
    if(pValue != NULL)
    {
        // Etc.
    }
}
MyValue * doSomethingElseAndReturnValue(MyClass & p_oMyClass)
{
    MyValue * pValue = p_oMyClass.getValue() ;
    if(pValue != NULL)
    {
        // Etc.
    }
    return pValue ;
}
// ==========================================================
// If you want your user to have this kind of code, then choose the
// throwing reference solution
void doSomething(MyClass & p_oMyClass)
{
    if(p_oMyClass.hasValue())
    {
        MyValue & oValue = p_oMyClass.getValue() ;
    }
}
So, if your main problem is choosing between the two kinds of user code above, your problem is not about performance, but about "code ergonomics". Thus, the exception solution should not be put aside because of potential performance issues.
:-)
Accessor?
The "iterator" idea proposed before me is interesting, but the real point of iterators is navigation through a container, not use as a simple accessor.
I agree with paercebal, an iterator is to iterate. I don't like the way STL does. But the idea of an accessor seems more appealing. So what do we need? A container-like class that feels like a boolean for testing but behaves like the original return type. That would be feasible with cast operators.
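#include <cstddef>    // NULL
#include <stdexcept>  // std::runtime_error

template <typename T> class Accessor {
public:
    Accessor(): _value(NULL)
    {}
    Accessor(T &value): _value(&value)
    {}
    // Behaves like the original return type; throws if there is no value.
    operator T &() const
    {
        if (!_value)
            throw std::runtime_error("that is a problem and you made a mistake somewhere.");
        else
            return *_value;
    }
    // Lets the accessor be tested like a boolean.
    operator bool () const
    {
        return _value != NULL;
    }
private:
    T *_value;
};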
Now, any foreseeable problem? An example usage:
Accessor <type> value = list.get(key);
if (value) {
    type &v = value;
    v.doSomething();
}
How about returning a shared_ptr as the result? It can be null if the item wasn't found. It works like a pointer, but it will take care of releasing the object for you.
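A minimal sketch of that idea, assuming the container stores its elements as shared_ptr already. The Registry and Item types and their member names are hypothetical, chosen only for this example.

```cpp
#include <map>
#include <memory>
#include <string>

struct Item
{
    int value;
};

class Registry
{
public:
    void add(const std::string & key, int value)
    {
        m_items[key] = std::make_shared<Item>(Item{value});
    }

    // Returns an empty (null) shared_ptr when the key is absent.
    std::shared_ptr<Item> find(const std::string & key) const
    {
        auto it = m_items.find(key);
        if (it == m_items.end())
            return nullptr;
        return it->second;
    }

private:
    std::map<std::string, std::shared_ptr<Item>> m_items;
};
```

The caller tests the result like a raw pointer, and the found object stays alive for as long as the caller holds the returned shared_ptr.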
(I realize this is not always the right answer, and my tone a bit strong, but you should consider this question before deciding for other more complex alternatives):
So, what's wrong with returning a pointer?
I've seen this one many times in SQL, where people will do their utmost to never deal with NULL columns, as if they had some contagious disease or something. Instead, they cleverly come up with a "blank" or "not-there" artificial value like -1, 9999 or even something like '#X-EMPTY-X#'.
My answer: the language already has a construct for "not there"; go ahead, don't be afraid to use it.
What I prefer doing in situations like this is having a throwing "get" and, for those circumstances where performance matters or failure is common, a "tryGet" function along the lines of "bool tryGet(type key, value **pp)" whose contract is that if true is returned then *pp is a valid pointer to some object, else *pp is null.
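Here is one way that pair of functions might look, as a sketch only. The Table class, its int keys and double values are assumptions made for the example; the tryGet contract follows the description above.

```cpp
#include <map>
#include <stdexcept>

class Table
{
public:
    void put(int key, double value) { m_values[key] = value; }

    // Throwing accessor: a missing key is treated as an error.
    double & get(int key)
    {
        auto it = m_values.find(key);
        if (it == m_values.end())
            throw std::out_of_range("Table::get: key not found");
        return it->second;
    }

    // Non-throwing accessor: returns true and sets *pp to the element,
    // or returns false and sets *pp to null.
    bool tryGet(int key, double ** pp)
    {
        auto it = m_values.find(key);
        *pp = (it != m_values.end()) ? &it->second : nullptr;
        return *pp != nullptr;
    }

private:
    std::map<int, double> m_values;
};
```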
@aradtke, you said:
I agree with paercebal, an iterator is
to iterate. I don't like the way STL
does. But the idea of an accessor
seems more appealing. So what we need?
A container like class that feels like
a boolean for testing but behaves like
the original return type. That would
be feasible with cast operators. [..] Now,
any foreseeable problem?
First, YOU DO NOT WANT OPERATOR bool. See Safe Bool idiom for more info. But about your question...
Here's the problem: users now need to cast explicitly in some cases. Pointer-like proxies (such as iterators, ref-counted pointers, and raw pointers) have a concise 'get' syntax. Providing a conversion operator is not very useful if callers have to invoke it with extra code.
Starting with your reference-like example, the most concise way to write it:
// 'reference' style, check before use
if (Accessor<type> value = list.get(key)) {
    type &v = value;
    v.doSomething();
}
// or
if (Accessor<type> value = list.get(key)) {
    static_cast<type&>(value).doSomething();
}
This is okay, don't get me wrong, but it's more verbose than it has to be. Now consider the case where we know, for some reason, that list.get will succeed. Then:
// 'reference' style, skip check
type &v = list.get(key);
v.doSomething();
// or
static_cast<type&>(list.get(key)).doSomething();
Now let's go back to iterator/pointer behavior:
// 'pointer' style, check before use
if (Accessor<type> value = list.get(key)) {
    value->doSomething();
}
// 'pointer' style, skip check
list.get(key)->doSomething();
Both are pretty good, but pointer/iterator syntax is just a bit shorter. You could give 'reference' style a member function 'get()'... but that's already what operator*() and operator->() are for.
The 'pointer' style Accessor now has operator 'unspecified bool', operator*, and operator->.
And guess what... a raw pointer meets these requirements, so for prototyping, list.get() can return T* instead of Accessor. Then, when the design of list is stable, you can come back and write the Accessor, a pointer-like Proxy type.
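For what it's worth, a 'pointer' style Accessor could be sketched roughly as follows. This assumes C++11, where explicit operator bool takes the place of the Safe Bool idiom; the Widget type and the choice of std::logic_error are invented for the example.

```cpp
#include <stdexcept>

template <typename T>
class Accessor
{
public:
    Accessor() : _value(nullptr) {}
    explicit Accessor(T & value) : _value(&value) {}

    // Usable in if(...) and with !, but never converts silently to int.
    explicit operator bool() const { return _value != nullptr; }

    // Dereference throws rather than touching a null pointer.
    T & operator*() const
    {
        if (!_value)
            throw std::logic_error("Accessor: no value");
        return *_value;
    }

    T * operator->() const { return &operator*(); }

private:
    T * _value;   // not owned; nullptr means "not found"
};

// Hypothetical element type used only to show the call syntax.
struct Widget
{
    int count;
    void doSomething() { ++count; }
};
```

With this in place, both if (Accessor<Widget> value = ...) and value->doSomething() read exactly like the raw-pointer version.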
Interesting question. It's a problem in C++ to exclusively use references I guess - in Java the references are more flexible and can be null. I can't remember if it's legal C++ to force a null reference:
MyType *pObj = nullptr;
return *pObj; // undefined behavior: this dereferences a null pointer
But I consider this dangerous. Again in Java I'd throw an exception as this is common there, but I rarely see exceptions used so freely in C++.
If I were making a public API for a reusable C++ component and had to return a reference, I guess I'd go the exception route.
My real preference is to have the API return a pointer; I consider pointers an integral part of C++.