How to implement an API for a distributed map in C++?

I am implementing a distributed map in c++ and searching for a good API design.
First and straightforward option is to make it exactly like std::map. Problem is with iterator.
IMap<std::string,Person>::iterator it;
it = map.find("sample");
if(it == map.end() ){
//NULL
}
for(it = map.begin(); it != map.end(); it++){
//iterate
}
In a distributed context (at least in the one I am implementing), the map has no begin and end. It is not ordered in any way, so returning an iterator does not look like an option.
Second option is returning the value class by copy like below:
Person emptyPerson;
Person person = map.get("sample");
if(person == emptyPerson){
//NULL
}
The problem is that the NULL check looks strange. You could first ask whether the key is available and then get the object, but the requirement is that these operations must be atomic.
Third option is returning pointer:
Person* person = map.get("sample");
if(person == NULL){
//NULL
}
I don't want to do it this way because it is error prone: the user needs to delete the pointer that I created internally.
I am thinking about returning a class that wraps the user object, like:
value_reference<std::map, Person> person = map.get("sample");
if(person.hasValue() ){
Person p = person;
}
So what do you think the best approach is?
Do you know of any good API with requirements similar to those of my distributed map?

Based on your term "distributed map" I am making the following assumptions:
A subset of the data is available locally, and for the data that is not, some remote fetch will need to be performed.
Writes to the returned object should not be automatically persisted in the data store. An explicit update request should be made instead.
If this is true then iterators are not what you want, nor do you want the STL container model. The C++ Iterator concept requires you to implement the pre-increment (++i) operator, and if your data is unordered and spread across multiple nodes, then the request "give me the next entry" does not make sense.
You could create a terrible kludge if you wanted to simulate STL containers and iterators for interoperability reasons: have the map's end() method return a sentinel iterator instance, and have operator++() for your iterators return this same sentinel. Effectively, every iterator would point to "the last element in the map." I would strongly advise against taking this approach unless it becomes necessary, and I don't think it will be.
It sounds like what you want is a simple CRUD model, where updates must be explicitly requested. In that case, your API would look something like:
template <typename TKey, typename TValue>
class IMap
{
public:
virtual ~IMap() {}
virtual void create(TKey const & key, TValue const & value) = 0;
virtual std::unique_ptr<TValue> retrieve(TKey const & key) = 0;
virtual bool update(TKey const & key, TValue const & value) = 0;
virtual bool remove(TKey const & key) = 0;
};
In the retrieve case, you would simply return a null pointer as you suggested. std::unique_ptr<> will ensure that the caller will either delete the allocated object or explicitly take ownership of it.
An alternative to the "return pointer to newly-allocated object" case would be to let the caller pass in a reference, and the method would return true if the value was found in the map. This will, for example, let the caller retrieve an object directly into an array slot or other local structure without the need for an intermediary heap allocation.
virtual bool retrieve(TKey const & key, TValue & value) = 0;
Use of this method would look something like:
Person person;
if (map.retrieve("sample", person)) {
std::cout << "Found person: " << person << std::endl;
} else {
std::cout << "Did not find person." << std::endl;
}
You could provide both overloads too, and the one returning a pointer can be implemented in terms of the other by default:
template <typename TKey, typename TValue>
std::unique_ptr<TValue> IMap<TKey, TValue>::retrieve(TKey const & key)
{
TValue v;
return std::unique_ptr<TValue>(retrieve(key, v) ? new TValue(v) : nullptr);
}
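As a rough illustration of how such an interface might be implemented and used, here is a minimal single-process sketch. InMemoryMap is a hypothetical stand-in backed by std::unordered_map; a real distributed implementation would replace the local lookups with remote calls. The semantics chosen here (create overwrites an existing key, update fails on a missing key) are assumptions, not something the answer specifies.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <unordered_map>

// CRUD interface along the lines of the answer above.
template <typename TKey, typename TValue>
class IMap
{
public:
    virtual ~IMap() = default;
    virtual void create(TKey const & key, TValue const & value) = 0;
    virtual bool retrieve(TKey const & key, TValue & value) = 0;
    virtual bool update(TKey const & key, TValue const & value) = 0;
    virtual bool remove(TKey const & key) = 0;

    // Pointer-returning overload, implemented in terms of the other one.
    std::unique_ptr<TValue> retrieve(TKey const & key)
    {
        TValue v;
        return std::unique_ptr<TValue>(retrieve(key, v) ? new TValue(v) : nullptr);
    }
};

// Hypothetical local implementation; everything lives in one process.
template <typename TKey, typename TValue>
class InMemoryMap : public IMap<TKey, TValue>
{
public:
    // Overriding one overload hides the other, so bring it back into scope.
    using IMap<TKey, TValue>::retrieve;

    void create(TKey const & key, TValue const & value) override { data_[key] = value; }

    bool retrieve(TKey const & key, TValue & value) override
    {
        auto it = data_.find(key);
        if (it == data_.end()) return false;
        value = it->second;
        return true;
    }

    bool update(TKey const & key, TValue const & value) override
    {
        auto it = data_.find(key);
        if (it == data_.end()) return false;
        it->second = value;
        return true;
    }

    bool remove(TKey const & key) override { return data_.erase(key) > 0; }

private:
    std::unordered_map<TKey, TValue> data_;
};

// Usage: both retrieve overloads report a missing key without a sentinel value.
void crudDemo()
{
    InMemoryMap<std::string, int> map;
    map.create("sample", 42);
    std::unique_ptr<int> found = map.retrieve("sample");
    assert(found && *found == 42);
    assert(!map.retrieve("missing"));
}
```

Note the using-declaration in the derived class: without it, the pointer-returning overload would be hidden as soon as the two-argument retrieve is overridden.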

I'd say something like option 3 is best. You could just emulate it using one of the standard smart pointer types introduced in C++11, so you still create a pointer, but the user doesn't have to free it. So something like:
std::unique_ptr<Person> person = map.get("sample");
if(person) {
person->makeMeASandwitch();
}
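The wrapper idea from the question (a value that may or may not be present) is close to what std::optional provides since C++17. A minimal sketch, assuming a C++17 compiler; LocalMap and its put/get methods are hypothetical stand-ins for the distributed map, and the lookup here is purely local:

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <unordered_map>

struct Person
{
    std::string name;
};

// Hypothetical stand-in for the distributed map.
class LocalMap
{
public:
    void put(const std::string& key, const Person& value) { data_[key] = value; }

    // Returns the value by copy, or an empty optional if the key is absent.
    // The presence check and the fetch happen in one call.
    std::optional<Person> get(const std::string& key) const
    {
        auto it = data_.find(key);
        if (it == data_.end()) return std::nullopt;
        return it->second;
    }

private:
    std::unordered_map<std::string, Person> data_;
};

void optionalDemo()
{
    LocalMap map;
    map.put("sample", Person{"Ada"});

    std::optional<Person> person = map.get("sample");
    if (person) {
        assert(person->name == "Ada");
    }
    assert(!map.get("missing").has_value());
}
```

Compared with returning a smart pointer, this avoids a heap allocation per lookup, at the cost of requiring C++17.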

Related

Writing C++ API - how to keep external references to API internal objects?

So I'm writing an API in C++ to be used in another GUI application I'll be writing. The API will allow the user to create instances of "MyObject" and modify the properties of that object, but the object itself will not be exposed to the client, only an ID to that object. So for instance:
Object_ID identifier = myApiCreateObject();
myApiModifyProperty(identifier, "PROPERTY_NAME", "value");
So the identifier acts as an external handler to a specific MyObject instance.
As of right now the Object_ID is defined as follows:
typedef int Object_ID;
Currently all MyObject instances are stored in a std::vector within my API. The Object_ID is simply the index in the vector at which the desired instance lives.
The problem with this approach is that I don't know how to handle deleting instances of MyObject from the vector. For instance, let's say I have 10 instances of MyObject created and I want to delete the instance at index 5, I would want to do something like the following:
myApiDeleteObject(handlerForIndex5);
By doing this though, internally my API would remove that object from the std::vector and then would have to shift over all the objects at indices > 5. This would cause my external handlers to no longer reference the correct object.
So just using the index of the array by itself is not sufficient, but I don't know of a better alternative without having to expose the MyObject class to the client.
EDIT
Here's an updated example highlighting the issue at hand:
Internally the API performs certain algorithms on the list of objects, some of these algorithms require sorting the vector as a step.
So my GUI would do something like :
myApiBeginCalculations();
and then internally the API would be doing something like this:
myApiBeginCalculations()
{
//Start algorithm
.......
Sort(vector);
//Continue with algorithm
}
Then let's say after that algorithm is complete, the user wants to modify a given MyObject instance and start again:
myApiBeginCalculations();
myApiModifyProperty(myHandler, "PROPERTY", "VALUE");
myApiBeginCalculations();
myApiDeleteObject(myHandler);
myApiBeginCalculations();
Internally myApi will be doing a bunch of things to the MyObject instances and I need a reliable way to keep track of individual instances on the client even as they get shuffled around.
You can use std::map in place of std::vector, so that you can look objects up quickly and remove them whenever you need.
std::map<int, Object> Object_directory;
You need to use an ID based on something that is both unique for each object and which remains constant for each object. Clearly an index into a vector you're continually rearranging does not qualify.
You haven't described the properties of the objects so I can't say whether there's something already suitable for this use, but if not then you can add something. You can assign an ID to each object as you create it, or you could allocate the objects on the heap so that their addresses remain consistent as you, for example, sort a vector<unique_ptr<MyObject>>.
You'll have to consider each operation you need to perform and figure out the necessary performance. For example a linear search through the vector in order to find an object with a matching ID may be too slow for some purpose. In that case you'll have to figure out how to avoid that linear search, perhaps by keeping a map on the side or something, at the cost of having to keep the map updated during other operations.
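A minimal sketch of that idea, assuming C++11: each object gets an ID from a counter when it is created, the objects live on the heap so sorting the vector only moves pointers, and a side map from ID to object avoids the linear search. ObjectStore and its member names are hypothetical, as is the properties member of MyObject.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <unordered_map>
#include <vector>

typedef int Object_ID;

struct MyObject
{
    Object_ID id;
    std::map<std::string, std::string> properties;
};

class ObjectStore
{
public:
    // Creates an object and returns an ID that never changes and is never reused.
    Object_ID create()
    {
        Object_ID id = nextId_++;
        std::unique_ptr<MyObject> obj(new MyObject());
        obj->id = id;
        MyObject* raw = obj.get();
        objects_.push_back(std::move(obj));
        byId_[id] = raw;
        return id;
    }

    // Average O(1) lookup through the side map; nullptr if the ID is unknown.
    MyObject* find(Object_ID id) const
    {
        auto it = byId_.find(id);
        return it == byId_.end() ? nullptr : it->second;
    }

    // Removes the object from both containers; other IDs stay valid.
    bool destroy(Object_ID id)
    {
        auto it = byId_.find(id);
        if (it == byId_.end()) return false;
        MyObject* target = it->second;
        byId_.erase(it);
        for (auto v = objects_.begin(); v != objects_.end(); ++v) {
            if (v->get() == target) { objects_.erase(v); break; }
        }
        return true;
    }

    // Stand-in for an algorithm step that reorders the vector.
    // Only the unique_ptrs move; the objects keep their addresses.
    void sortByIdDescending()
    {
        std::sort(objects_.begin(), objects_.end(),
                  [](const std::unique_ptr<MyObject>& a, const std::unique_ptr<MyObject>& b) {
                      return a->id > b->id;
                  });
    }

private:
    Object_ID nextId_ = 1;
    std::vector<std::unique_ptr<MyObject>> objects_;
    std::unordered_map<Object_ID, MyObject*> byId_;
};

void storeDemo()
{
    ObjectStore store;
    Object_ID first = store.create();
    Object_ID second = store.create();
    store.sortByIdDescending();
    store.destroy(first);
    assert(store.find(first) == nullptr);
    assert(store.find(second) != nullptr);
}
```

The cost is the one mentioned above: the side map has to be kept in sync on every create and delete.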
I would suggest not generating an ID number at all. Simply use a real pointer to the actual Object instance instead. To hide it from the client, you can use void* or uintptr_t, and just have your API functions type-cast that value to an Object* pointer when needed. You can still keep track of the Object instances in a std::vector so you can perform your algorithms on the objects, but the order of the std::vector will not be important to clients, and deleting any given Object will not invalidate other object IDs.
typedef uintptr_t Object_ID;
typedef std::vector<Object*> ObjectVector;
typedef ObjectVector::iterator ObjectVectorIter;
ObjectVector objVec;
Object_ID myApiCreateObject()
{
try
{
std::auto_ptr<Object> obj(new Object);
objVec.push_back(obj.get());
return reinterpret_cast<Object_ID>(obj.release());
}
catch (const std::exception&)
{
return 0;
}
}
ObjectVectorIter myApiFindObject(Object_ID identifier)
{
Object *obj = reinterpret_cast<Object*>(identifier);
return std::find(objVec.begin(), objVec.end(), obj);
}
void myApiModifyProperty(Object_ID identifier, const char* propName, const char* propValue)
{
ObjectVectorIter iter = myApiFindObject(identifier);
if (iter != objVec.end())
(*iter)->property[propName] = propValue;
}
void myApiDeleteObject(Object_ID identifier)
{
ObjectVectorIter iter = myApiFindObject(identifier);
if (iter != objVec.end())
{
Object* obj = *iter;
objVec.erase(iter);
delete obj;
}
}
Or, if you are using C++11:
typedef uintptr_t Object_ID;
typedef std::shared_ptr<Object> ObjectPtr;
typedef std::vector<ObjectPtr> ObjectVector;
typedef ObjectVector::iterator ObjectVectorIter;
ObjectVector objVec;
Object_ID myApiCreateObject()
{
try
{
ObjectPtr obj = std::make_shared<Object>();
objVec.push_back(obj);
return reinterpret_cast<Object_ID>(obj.get());
}
catch (const std::exception&)
{
return 0;
}
}
ObjectVectorIter myApiFindObject(Object_ID identifier)
{
Object *obj = reinterpret_cast<Object*>(identifier);
return std::find_if(objVec.begin(), objVec.end(), [obj](const ObjectPtr &p){ return p.get() == obj; });
}
void myApiModifyProperty(Object_ID identifier, const char* propName, const char* propValue)
{
ObjectVectorIter iter = myApiFindObject(identifier);
if (iter != objVec.end())
(*iter)->property[propName] = propValue;
}
void myApiDeleteObject(Object_ID identifier)
{
ObjectVectorIter iter = myApiFindObject(identifier);
if (iter != objVec.end())
objVec.erase(iter);
}

Multi-index on boost::ptr_vector

I have the following classes in a program.
class Class2 {
public:
std::string name;
unsigned int value;
};
class Class1 {
public:
boost::ptr_vector<Class2> fields;
};
I want to write a member function in Class1 that returns a reference or pointer to an element in fields based on Class2's name variable. I don't have to be concerned with the lifetime of the objects in the container.
Currently, I am returning an iterator to the element I want after the function searches from the start of the vector to the element.
boost::ptr_vector<Class2>::iterator getFieldByName(std::string name) {
boost::ptr_vector<Class2>::iterator field = fields.begin();
while (field != fields.end()) {
if (field->name.compare(name) == 0) {
return field;
}
++field;
}
return fields.end();
}
The problems that I'm facing are:
(1.) I need to have fast random access to the elements or the program sits in getFieldByName() too long (a boost::ptr_vector<> is too slow when starting at the beginning of the container)
(2.) I need to preserve the order of insertion of the fields (so I can't use a boost::ptr_map<> directly)
I have discovered Boost::MultiIndex and it seems like it could provide a solution to the problems, but I need to use a smart container so that destruction of the container will also destruct the objects owned by the container.
Is there any way to achieve a smart container that has multiple methods of access?
You can use two containers. Have a boost::ptr_map<> that stores the actual data, and then have a std::vector<> that stores pointers to the nodes of the map.
boost::ptr_map<std::string, Class2> by_field;
std::vector<Class2 const*> by_order;
void insert(Class2* obj) {
if (by_field.insert(obj->name, obj).second) {
// on insertion success, also add to by_order
by_order.push_back(obj);
}
}
This will give you O(log n) access in your getFieldByName() function (just look it up in by_field) while also preserving the order of insertion (just iterate over by_order).
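For readers without Boost at hand, the same two-container idea can be sketched with the standard library only. This swaps boost::ptr_map for a std::map of std::unique_ptr, which gives the same ownership behaviour (destroying the container destroys the objects). Field and FieldSet are hypothetical names playing the roles of Class2 and Class1.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <vector>

struct Field   // plays the role of Class2
{
    std::string name;
    unsigned int value;
};

class FieldSet   // plays the role of Class1
{
public:
    // Returns false if a field with this name already exists.
    bool insert(const std::string& name, unsigned int value)
    {
        if (byName_.count(name) != 0) return false;
        std::unique_ptr<Field> field(new Field{name, value});
        Field* raw = field.get();
        byName_[name] = std::move(field);   // the map owns the object
        byOrder_.push_back(raw);            // the vector only observes it
        return true;
    }

    // O(log n) lookup by name; nullptr if there is no such field.
    Field* getFieldByName(const std::string& name) const
    {
        auto it = byName_.find(name);
        return it == byName_.end() ? nullptr : it->second.get();
    }

    // Fields in the order they were inserted.
    const std::vector<Field*>& inInsertionOrder() const { return byOrder_; }

private:
    std::map<std::string, std::unique_ptr<Field>> byName_;
    std::vector<Field*> byOrder_;
};

void fieldSetDemo()
{
    FieldSet fields;
    fields.insert("zeta", 1);
    fields.insert("alpha", 2);
    assert(fields.getFieldByName("alpha")->value == 2);
    assert(fields.inInsertionOrder().front()->name == "zeta");
}
```

Erasing a field would have to remove it from both containers, which this sketch leaves out.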

how to return a null iterator in c++?

I am writing a wrapper class which looks like this:
class Wrapper{
private:
std::list<People> men;
std::list<People> women;
/**
some bizarre logic
**/
public:
std::list<People>::iterator getMeTheNextOne();
};
The problem is that sometimes I need to return an empty (or NULL) iterator, meaning that there are no more 'suitable' people in either list. If I simply return men.end() or women.end(), how is the user going to check for this?
Imagine the user has the following code:
Wrapper wo;
std::list<People>::iterator it = wo.getMeTheNextOne();
if(it == /*what should I put here? i cannot access the list members of the Wrapper*/){
// do something here
}
Returning a list iterator, when the user doesn't have access to the list which the iterator is coming from, is weird and ugly. Why not return a pointer to a People instead, which can be NULL?
It doesn't make sense to return an iterator that can be from different lists. There is no way to check whether the iterator is valid. The best way in your approach is to return a pointer to the actual object being stored and that can be null.
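A minimal sketch of the pointer-returning variant. The selection rule (hand out each person once, men first) and the taken flag are placeholders for the question's elided logic, and addMan/addWoman are hypothetical helpers added only so the example is self-contained.

```cpp
#include <cassert>
#include <list>
#include <string>

struct People
{
    std::string name;
    bool taken;   // placeholder state: true once this person has been handed out
};

class Wrapper
{
public:
    void addMan(const People& p) { men.push_back(p); }
    void addWoman(const People& p) { women.push_back(p); }

    // Returns a pointer to the next suitable person, or nullptr when none is left.
    // The caller never needs to know which list the person came from.
    People* getMeTheNextOne()
    {
        for (People& p : men) {
            if (!p.taken) { p.taken = true; return &p; }
        }
        for (People& p : women) {
            if (!p.taken) { p.taken = true; return &p; }
        }
        return nullptr;
    }

private:
    std::list<People> men;
    std::list<People> women;
};

void wrapperDemo()
{
    Wrapper wo;
    wo.addMan(People{"Al", false});
    People* next = wo.getMeTheNextOne();
    if (next != nullptr) {
        assert(next->name == "Al");
    }
    assert(wo.getMeTheNextOne() == nullptr);
}
```

Because std::list never relocates its elements, the returned pointer stays valid until that particular element is erased.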
On the other hand, if you insist on returning an iterator, you could add a method to Wrapper that checks the validity of the iterator.
class Wrapper{
private:
std::list<People> men;
std::list<People> women;
/**
some bizarre logic
**/
public:
std::list<People>::iterator getMeTheNextOne();
// Valid only if the iterator is the end of neither list.
bool isValid(std::list<People>::iterator const & it) const
{
return it != men.end() && it != women.end();
}
};
That you could use like this:
Wrapper wo;
std::list<People>::iterator it = wo.getMeTheNextOne();
if(wo.isValid(it))
{
// do something here
}
Iterators can never be null. If an iterator does not point to anything, its value is end(). I think it's OK for you to return end(). Usually the user uses an iterator to iterate over something, and when they iterate it is their responsibility to check whether they have reached the end or not, and the only way to check is to compare the iterator's value with end().
Return the standard "beyond the range iterator": wo.end();
Typically the solution would be to have an accessor to the beginning and end iterator of the container, so adding a
class Wrapper {
public:
std::list<People>::iterator end();
};
This would allow you to write the following:
Wrapper wo;
std::list<People>::iterator it = wo.getMeTheNextOne();
if(it == wo.end()){
// do something here
}
However in this situation where you have two separate lists you may need to add an endOfMen and an endOfWomen, or combine them in a single list, depending on which best solves your problem.

how to convert iterator of list STL to instance (C++)

This is my first time using the STL list and I'm not sure if what I'm trying to do is possible.
I have class_B which holds a list of class_A. I need a function in class_B that takes an ID, searches the list for an instance with the same ID, and gets a pointer from the list to the instance in that list:
bool class_B::get_pointer(int ID,class_A* pointer2A){
list<class_A>::iterator i;
for(i=class_A.begin();i!=class_A.end();i++){
if((*i).get_id()==ID) {
//pointer2A=(i);<---------------this is what I'm trying to do
return true;
}
}
pointer2A=NULL;
return false;
}
How do I do this? Is it possible to convert from an iterator to an instance?
EDIT:
I'm using this function in a multi-threaded program and I can't return an iterator to the calling function since another thread might delete an element of the list.
Now that I have a pointer to my element (and let's say it's locked so it can't be deleted), and a different thread removed another element and performed a sort on the list, what will happen to the pointer I'm holding? (I don't know how the list rearranges the elements: is it done by copying the elements using a copy c'tor, or by some other means?)
Useless's answer was the most helpful in my case (BIG thanks), and yes, I should use a reference to the pointer since I'm planning to change it.
You should write this:
pointer2A= &*i;
Here *i returns the object whose address you can get by prepending & as : &*i.
Note that i is not same as &*i. See this topic for more general discussion:
Difference between &(*similarObject) and similarObject? Are they not same?
Anyway, I would suggest that you return the pointer itself:
class_A* class_B::get_pointer(int ID)
{
//I assume the name of the list is objA, not class_A
for(list<class_A>::iterator i=objA.begin();i!=objA.end();i++)
{
if( i->get_id()==ID)
{
return &*i;
}
}
return NULL; //or nullptr in C++11
}
Or, in C++11, you can use std::find_if as:
auto it = std::find_if(objA.begin(),
objA.end(),
[&](class_A const &a){ return a.get_id() == ID;});
class_A *ptr = NULL;
if ( it != objA.end())
ptr = &*it; //get the pointer from iterator
Make sure get_id is a const member function.
if(i->get_id()==ID) {
pointer2A=&*i;
return true;
}
iterators are designed to have similar semantics to pointers, so for example you can write i->get_id() just as if you had a pointer to A.
Similarly, *i yields a reference A&, and &*i converts that back into a pointer - it looks a bit clunky (it would be an identity operation if i were really a pointer), but it's idiomatic.
Note that this won't do what you presumably want anyway - the caller's class_A* pointer2A is passed by value, so only get_pointer's copy of the pointer is modified, and the caller won't see that value. Try this:
bool class_B::get_pointer(int ID, class_A *& pointer2A)
{
list<class_A>::iterator i;
for(i=class_A.begin();i!=class_A.end();i++) {
if(i->get_id()==ID) {
pointer2A=&*i;
return true;
}
}
pointer2A=NULL;
return false;
}
Now pointer2A is passed by reference, so the caller's copy gets modified inside your function.
BTW, you can read the parameter declaration class_A * & pointer2A right-to-left, as "pointer2A is a reference to a pointer to class_A".
If you have an iterator, you can get a raw pointer by simply dereferencing the iterator (which gives you a reference), and then taking the address of that (which gives you a pointer). So, in your case:
pointer2A = &*i;
This might seem like an odd, clumsy way to get a pointer, and it is. But you normally don't care about pointers when you are using the collections & iterators from the Std Lib. Iterators are the glue that hold the "STL" together. That's what you should be dealing with, by and large, rather than raw pointers.
The loop you've written above certainly gets the job done that you wish to accomplish, but there are better* ways to accomplish the same goal. (Better is a subjective term.) In particular, the <algorithm> library provides both std::find and std::find_if which do just what they say they do. They find something in a collection. find will find something that is equal to what you're looking for. find_if will find something that matches some criteria that you specify. The latter is the appropriate algorithm to use here, and there are two main ways to use it.
The first, more "traditional" approach is to use a functor:
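// std::unary_function<Arg, Result> is optional here; it was deprecated in C++11 and removed in C++17.
struct match_id : public std::unary_function<class_A, bool>
{
match_id(int ID) : id_(ID) {}
// The list stores class_A objects, so the predicate takes a reference.
bool operator()(const class_A& rhs) const
{
return id_ == rhs.get_id();
}
private:
int id_;
};
/* ... */
list<class_A>::iterator it = std::find_if(objA.begin(), objA.end(), match_id(ID));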
This approach works in C++03 or C++11. Some people don't like it because it is rather verbose. I like it, on the other hand, because the actual business logic (the find_if call) is quite succinct and more expressive than an explicit loop.
In C++11, you can use a lambda in place of the functor:
unsigned ID = 42;
std::find_if( objA.begin(), objA.end(), [&ID](const class_A& rhs) -> bool { return rhs.get_id() == ID; } );
There's a tradeoff here. On the pro side, you don't have to write 10 or so lines of code for the functor, but on the con side, the lambda syntax is funky and takes a bit of getting used to.
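On the question's follow-up about what happens to a held pointer when the list is sorted: std::list::sort relinks the nodes instead of copying elements, and erasing other elements does not affect the remaining ones, so a pointer obtained with &*it stays valid as long as that element itself is not erased. A small sketch that combines this with the find_if approach; find_by_id is a hypothetical helper and class_A is reduced to the one member the example needs. It says nothing about thread safety, which still requires the locking mentioned in the question.

```cpp
#include <algorithm>
#include <cassert>
#include <list>

class class_A
{
public:
    explicit class_A(int id) : id_(id) {}
    int get_id() const { return id_; }

private:
    int id_;
};

// Returns a pointer to the element with the given ID, or nullptr if there is none.
class_A* find_by_id(std::list<class_A>& objA, int ID)
{
    auto it = std::find_if(objA.begin(), objA.end(),
                           [ID](const class_A& a) { return a.get_id() == ID; });
    return it != objA.end() ? &*it : nullptr;
}

void sortStabilityDemo()
{
    std::list<class_A> objA;
    objA.push_back(class_A(3));
    objA.push_back(class_A(1));
    objA.push_back(class_A(2));

    class_A* held = find_by_id(objA, 2);

    // Remove a different element and sort: nodes are relinked, not copied.
    objA.remove_if([](const class_A& a) { return a.get_id() == 3; });
    objA.sort([](const class_A& x, const class_A& y) { return x.get_id() < y.get_id(); });

    assert(held == find_by_id(objA, 2));   // same object, same address
    assert(held->get_id() == 2);
}
```

The same guarantee does not hold for std::vector, where sorting moves the elements themselves.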

How to (deep)copy a map from a const object

I have another problem I can't seem to solve..., or find on this site...
I have an object (called DataObject) with a map, declared as follows:
std::map<size_t, DataElement*> dataElements;
Now I have a copy function (used in the copy constructor):
void DataObject::copy(DataObject const &other) {
//here some code to clean up the old data in this object...
//copy all the elements:
size = other.getSize();
for(size_t i = 0; i < size; ++i) {
DataElement* dat = new DataElement(*other.dataElements[i]);
dataElements[i] = dat;
}
}
This doesn't compile, since dataElements[i] is not possible on a const object. How do I make a deep copy of all the elements in the map that is owned by a const object?
I know that the find() function is possible on a const map, but then how do I get to the actual object that I want to copy?
std::map<size_t, DataElement*>::const_iterator it = other.dataElements.begin();
while(it != other.dataElements.end())
{
dataElements[it->first] = new DataElement(*(it->second));
++it;
}
I'm almost positive this should work.
You need to use std::transform. This does a copy whilst also performing a function on each element. In your case a deep copy of the value.
This will therefore do as a transformer:
class DeepCopyMapPointer
{
typedef std::map<size_t, DataElement*> map_type;
typedef map_type::value_type value_type;
public:
value_type operator()( const value_type & other ) const
{
return value_type(other.first, new DataElement(*other.second) );
}
};
void DataObject::copy(DataObject const &other)
{
std::transform(other.dataElements.begin(), other.dataElements.end(),
std::inserter( dataElements, dataElements.end() ), DeepCopyMapPointer() );
}
It's not quite that simple because if you do duplicate an element and your insert fails as a result you will get a leak. You could get round that by writing your own inserter instead of std::inserter... a bit tricky but that's your next exercise.
Since your map just has integer keys from 0 to n - 1, just change your container type to a vector, and your current code should work nicely (you'll need to resize the destination container to make sure there's enough room available).
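A minimal sketch of the vector variant, assuming the keys really are 0 to n - 1. DataElement is reduced to a single hypothetical payload member, and add/at are helpers added only to make the example self-contained. Indexing works on the const source because std::vector::operator[] has a const overload, unlike std::map.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct DataElement
{
    int payload;
};

class DataObject
{
public:
    DataObject() {}
    DataObject(DataObject const& other) { copy(other); }
    DataObject& operator=(DataObject const& other)
    {
        if (this != &other) copy(other);
        return *this;
    }
    ~DataObject() { clear(); }

    void add(int payload) { dataElements.push_back(new DataElement{payload}); }
    std::size_t getSize() const { return dataElements.size(); }
    DataElement& at(std::size_t i) { return *dataElements[i]; }
    DataElement const& at(std::size_t i) const { return *dataElements[i]; }

    // Deep copy: every element of other is duplicated, not shared.
    void copy(DataObject const& other)
    {
        clear();
        dataElements.reserve(other.dataElements.size());
        for (std::size_t i = 0; i < other.dataElements.size(); ++i) {
            dataElements.push_back(new DataElement(*other.dataElements[i]));
        }
    }

private:
    // Frees the currently owned elements.
    void clear()
    {
        for (DataElement* e : dataElements) delete e;
        dataElements.clear();
    }

    std::vector<DataElement*> dataElements;
};

void vectorCopyDemo()
{
    DataObject original;
    original.add(1);
    DataObject duplicate(original);
    duplicate.at(0).payload = 99;          // changing the copy...
    assert(original.at(0).payload == 1);   // ...leaves the original untouched
}
```

Here push_back replaces the explicit resize: reserve makes room up front and each copied element is appended in index order.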
If you need to use map for some reason (existing API?), as you discovered operator[] has only a non-const version.
Instead use a const_iterator approach (upvoted and taken from @PigBen's answer):
std::map<size_t, DataElement*>::const_iterator it = other.dataElements.begin();
while(it != other.dataElements.end())
{
dataElements[it->first] = new DataElement(*(it->second));
++it;
}
Don't have much time to answer now so this will be brief. There is a copy-constructor for map, but it won't do a deep copy. You want to use iterators (map.begin(), map.end()). *Iter will give you a pair object, so you can do (*iter).first and/or (*iter).second. (Or something like that... It's been a while...)
Ref: http://www.sgi.com/tech/stl/Map.html
for (auto& kv : other.dataElements) {
dataElements[kv.first] = new DataElement(*kv.second);
}
Just one observation: you are giving direct access to dataElements (other.dataElements). Keep dataElements private and then provide a method like GetDataElement.
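A sketch of what that could look like, assuming C++11. The exact signature of GetDataElement is a guess, DataElement is reduced to a hypothetical payload member, and set is a helper added for the example. Storing std::unique_ptr in the map is a substitution for the raw pointers in the question; it also sidesteps the leak concern raised in another answer, since each copy is owned from the moment it is created.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>

struct DataElement
{
    int payload;
};

class DataObject
{
public:
    DataObject() = default;

    // Deep copy: iterate the const map instead of indexing it.
    DataObject(DataObject const& other)
    {
        for (auto const& kv : other.dataElements) {
            std::unique_ptr<DataElement> duplicate(new DataElement(*kv.second));
            dataElements[kv.first] = std::move(duplicate);
        }
    }

    DataObject& operator=(DataObject const& other)
    {
        if (this != &other) {
            DataObject tmp(other);              // copy first,
            dataElements.swap(tmp.dataElements); // then swap in the result
        }
        return *this;
    }

    void set(std::size_t key, int payload)
    {
        dataElements[key].reset(new DataElement{payload});
    }

    // Read-only accessor; find() works on a const map where operator[] does not.
    DataElement const* GetDataElement(std::size_t key) const
    {
        auto it = dataElements.find(key);
        return it == dataElements.end() ? nullptr : it->second.get();
    }

private:
    std::map<std::size_t, std::unique_ptr<DataElement>> dataElements;
};

void mapCopyDemo()
{
    DataObject original;
    original.set(0, 10);
    const DataObject duplicate(original);
    original.set(0, 11);                                   // replace the original's element
    assert(duplicate.GetDataElement(0)->payload == 10);    // the copy keeps its own value
    assert(duplicate.GetDataElement(7) == nullptr);
}
```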