Creating a map for millions of objects in C++

I have an abstract class called Object that has a few virtual functions, one of which is a function that will retrieve the id of an Object.
Currently, I am using a std::vector<Object> to store tens of millions of these objects. Unfortunately, adding, copying, or removing from this is painfully slow.
I wanted to create a hash map that could maybe have the Object->id as the key, and maybe the object itself as a value? Or is there some type of data structure that would allow for easy insertion and removal like a std::vector but would be faster for tens of millions of objects?
I would want the class to end up looking something like this outline:
stl::container<Objects*> obj_container;
DataContainer::DataContainer()
: stl::container(initialized_here)
{}
DataContainer::addObject(Object* object)
{
obj_container.insert(object);
}
DataContainer::removeObject(Object* object)
{
obj_container.remove(object);
}
DataContainer::preSort()
{
obj_container.sort_by_id();
}
DataContainer::getObject(Object* object)
{
if(!obj_container.contains(object)) { return; }
binary_search(object);
}
Is there anything really fast at processing large amounts of these objects, or is there anything really fast that could possibly use an unsigned integer id from an object to process the data?
Also, my class would get pre-sorted, so every object would be sorted by ID before being added to the container. Then I would do a binary search on the data by ID.

You could probably use std::set (if the ids are unique and can be ordered) or std::unordered_set. I would suggest making it a member of your container rather than deriving your container from it. You will also need a way of constructing a local fake Object from only its id:
class Object {
friend class BigContainer;
unsigned long _id;
// other fields;
// your constructors
public:
unsigned long id() const { return _id; };
private:
Object(unsigned long pseudoid); // construct a fake object
};
struct LessById {
bool operator () (const Object &ob1, const Object& ob2) const
{ return ob1.id() < ob2.id(); }
bool operator () (const Object &ob, unsigned long idl) const
{ return ob.id() < idl; }
};
class BigContainer {
std::set<Object,LessById> set;
public:
// add members, constructors, destructors, etc...
bool contains(unsigned long id) const {
Object fakeobj{id};
if (set.find(fakeobj) != set.end()) return true;
return false;
};
const Object* find_by_id(unsigned long id) const {
Object fakeobj{id};
auto p = set.find(fakeobj);
if (p != set.end()) return &(*p);
return nullptr;
};
bool contains(const Object& ob) const {
if (set.find(ob) != set.end()) return true;
return false;
};
void add(const Object&ob) {
// std::set::insert does nothing if an object with the same id is already present
set.insert(ob);
}
void remove(unsigned long id) {
Object fakeobj{id};
auto p = set.find(fakeobj);
if (p != set.end()) set.erase(p);
}
};
If you want a set of pointers, use a set of smart pointers and adapt the scheme above.
If the Object is big and you have trouble defining an efficient way to construct local fake objects for a given id, define a super struct BoxedId { unsigned long id; BoxedId(unsigned long l): id(l) {}; }, declare internally a std::set<std::shared_ptr<BoxedId>,BoxedLessById>, make class Object : public BoxedId, etc.
BTW, since Object has virtual methods, you will probably subclass it, so you need a set of pointers. You need to define a pointer policy (is every actual instance of a subclass of Object in your container?) and use some smart pointer. You also need to define who is in charge of deleting your Objects (who owns the pointer). Is it only the single BigContainer?
Read the C++11 rule of five.
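A minimal sketch of the pointer-owning variant mentioned above, assuming the container is the sole owner of its objects. It swaps the std::set for a std::unordered_map keyed by id, so no fake Object is needed for lookups. The names Object (a simplified stand-in) and OwningContainer are hypothetical:

```cpp
#include <cstddef>
#include <memory>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for the asker's Object; the real class would be
// abstract, with concrete subclasses stored through the base pointer.
class Object {
public:
    explicit Object(unsigned long id) : id_(id) {}
    virtual ~Object() {}
    unsigned long id() const { return id_; }
private:
    unsigned long id_;
};

// Owns its objects and indexes them by id.
class OwningContainer {
public:
    // Takes ownership. Returns false (and the object is destroyed) if the
    // pointer is null or an object with the same id is already stored.
    bool add(std::unique_ptr<Object> ob) {
        if (!ob) return false;
        const unsigned long id = ob->id();
        return objects_.emplace(id, std::move(ob)).second;
    }
    // Non-owning lookup; nullptr when the id is unknown.
    Object* find(unsigned long id) const {
        auto it = objects_.find(id);
        return it == objects_.end() ? nullptr : it->second.get();
    }
    // Destroys the object with this id, if any.
    bool remove(unsigned long id) { return objects_.erase(id) > 0; }
    std::size_t size() const { return objects_.size(); }
private:
    std::unordered_map<unsigned long, std::unique_ptr<Object>> objects_;
};
```

Because std::unique_ptr is move-only, the container itself becomes move-only, which makes the ownership policy explicit.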

Please have a look at this site : http://www.cs.northwestern.edu/~riesbeck/programming/c++/stl-summary.html
It shows the time complexity of each operation on each STL container.
First be clear about your requirements, then choose a container by comparing the time complexities shown at the link above.
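As an illustration of weighing those complexities against the pre-sorted plan in the question, here is a sketch of a std::vector of pointers sorted by id and searched with std::lower_bound. Lookup is O(log n), while inserting into the middle stays O(n). Item, sortById and findById are hypothetical names:

```cpp
#include <algorithm>
#include <vector>

// Hypothetical element type; only the id matters for this sketch.
struct Item {
    unsigned long id;
};

// Sorts the pointers by ascending id.
inline void sortById(std::vector<Item*>& items) {
    std::sort(items.begin(), items.end(),
              [](const Item* a, const Item* b) { return a->id < b->id; });
}

// Binary search on a vector already sorted by id.
// Returns nullptr when no element has the requested id.
inline Item* findById(const std::vector<Item*>& sorted, unsigned long id) {
    auto it = std::lower_bound(
        sorted.begin(), sorted.end(), id,
        [](const Item* item, unsigned long key) { return item->id < key; });
    return (it != sorted.end() && (*it)->id == id) ? *it : nullptr;
}
```

This layout suits data that is loaded once and then mostly read; for frequent insertion and removal a hash map or tree is usually the better fit.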

Related

Save reference to void pointer in a vector during loop iteration

Guys I have a function like this (this is given and should not be modified).
void readData(int &ID, void*&data, bool &mybool) {
if(mybool)
{
std::string a = "bla";
std::string* ptrToString = &a;
data = ptrToString;
}
else
{
int b = 9;
int* ptrToint = &b;
data = ptrToint;
}
}
So I want to use this function in a loop and save the returned function parameters in a vector (for each iteration).
To do so, I wrote the following struct:
template<typename T>
struct dataStruct {
int id;
T** data; //I first has void** data, but would not be better to
// have the type? instead of converting myData back
// to void* ?
bool mybool;
};
My main.cpp then looks like this:
int main()
{
void* myData = nullptr;
std::vector<dataStruct> vec; // this line also doesn't compile. it need the typename
bool bb = false;
for(int id = 1 ; id < 5; id++) {
if (id%2) { bb = true; }
readData(id, myData, bb); //after this line myData point to a string
vec.push_back(id, &myData<?>); //how can I set the template param to be the type myData point to?
}
}
Or is there a better way to do that without template? I used c++11 (I can't use c++14)

The function that you say cannot be modified, i.e. readData() is the one that should alert you!
It leads to undefined behavior: the pointers are set to the addresses of local variables, so when the function returns they are dangling, and any later attempt to dereference them is undefined.

Let us leave aside the shenanigans of the readData function for now under the assumption that it was just for the sake of the example (and does not produce UB in your real use case).
You cannot directly store values with different (static) types in a std::vector. Notably, dataStruct<int> and dataStruct<std::string> are completely unrelated types, so you cannot store them in the same vector as-is.
Your problem boils down to "I have data that is given to me in a type-unsafe manner and want to eventually get type-safe access to it". The solution to this is to create a data structure that your type-unsafe data is parsed into. For example, it seems that you intended your example data to have structure in the sense that there are pairs of int and std::string (note that your id%2 is not doing that because the else is missing and the bool is never set to false again, but I guess you wanted it to alternate).
So let's turn that bunch of void* into structured data:
std::pair<int, std::string> readPair(int pairIndex)
{
void* ptr;
std::pair<int, std::string> ret;
// Copying data here.
readData(2 * pairIndex + 1, ptr, false);
ret.first = *static_cast<int*>(ptr);
readData(2 * pairIndex + 2, ptr, true);
ret.second = *static_cast<std::string*>(ptr);
return ret;
}
int main()
{
std::vector<std::pair<int, std::string>> parsedData;
parsedData.push_back(readPair(0));
parsedData.push_back(readPair(1));
}
(I removed the references from the readData() signature for brevity - you get the same effect by storing the temporary expressions in variables.)
Generally speaking: Whatever relation between id and the expected data type is should just be turned into the data structure - otherwise you can only reason about the type of your data entries when you know both the current ID and this relation, which is exactly something you should encapsulate in a data structure.

Your readData isn't a useful function. Any attempt at using what it produces gives undefined behavior.
Yes, it's possible to do roughly what you're asking for without a template. To do it meaningfully, you have a couple of choices. The "old school" way would be to store the data in a tagged union:
struct tagged_data {
enum { T_INT, T_STR } tag;
union {
int x;
char *y;
} data;
};
This lets you store either a string or an int, and you set the tag to tell you which one a particular tagged_data item contains. Then (crucially) when you store a string into it, you dynamically allocate the data it points at, so it will remain valid until you explicitly free the data.
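Unfortunately, C++03 doesn't allow non-POD types in a union. C++11 lifts that restriction (unrestricted unions), but you then have to construct and destroy the std::string member by hand, so if you went this route the simplest option is still a char * as above rather than an actual std::string.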
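For compilers that implement C++11 unrestricted unions, a std::string can live in the union directly, provided its lifetime is managed by hand. The following is only a sketch of that idea; the class name TaggedData and its accessors are hypothetical, and copy assignment is left out for brevity:

```cpp
#include <new>
#include <string>

class TaggedData {
public:
    enum Tag { T_INT, T_STR };

    explicit TaggedData(int v) : tag_(T_INT) { data_.x = v; }
    explicit TaggedData(const std::string& s) : tag_(T_STR) {
        new (&data_.y) std::string(s);  // placement-new starts y's lifetime
    }
    TaggedData(const TaggedData& other) : tag_(other.tag_) {
        if (tag_ == T_STR) new (&data_.y) std::string(other.data_.y);
        else data_.x = other.data_.x;
    }
    TaggedData& operator=(const TaggedData&) = delete;  // omitted in this sketch
    ~TaggedData() {
        if (tag_ == T_STR) data_.y.~basic_string();  // manual destruction
    }

    Tag tag() const { return tag_; }
    // Callers must check tag() before choosing an accessor.
    int asInt() const { return data_.x; }
    const std::string& asString() const { return data_.y; }

private:
    Tag tag_;
    union Storage {
        int x;
        std::string y;
        Storage() {}   // required: y has a non-trivial constructor
        ~Storage() {}  // required: the owner decides which member to destroy
    } data_;
};
```

The amount of bookkeeping here is the reason the inheritance-based and variant-based approaches are usually preferred.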
One way to remove (most of) those limitations is to use an inheritance-based model:
class Data {
public:
virtual ~Data() { }
};
class StringData : public Data {
std::string content;
public:
StringData(std::string const &init) : content(init) {}
};
class IntData : public Data {
int content;
public:
IntData(int init) : content(init) {}
};
This is somewhat incomplete, but I think probably enough to give the general idea--you'd have an array (or vector) of pointers to the base class. To insert data, you'd create a StringData or IntData object (allocating it dynamically) and then store its address into the collection of Data *. When you need to get one back, you use dynamic_cast (among other things) to figure out which one it started as, and get back to that type safely. All somewhat ugly, but it does work.
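To make the retrieval step concrete, here is a sketch that fills in the outline above with accessors and two dynamic_cast helpers. It assumes ownership through std::unique_ptr rather than raw Data *; the get() members and the tryGetInt/tryGetString helpers are hypothetical additions:

```cpp
#include <memory>
#include <string>
#include <vector>

class Data {
public:
    virtual ~Data() {}
};

class StringData : public Data {
    std::string content;
public:
    explicit StringData(const std::string& init) : content(init) {}
    const std::string& get() const { return content; }
};

class IntData : public Data {
    int content;
public:
    explicit IntData(int init) : content(init) {}
    int get() const { return content; }
};

// The collection owns its elements through the base class.
typedef std::vector<std::unique_ptr<Data>> DataList;

// Each helper returns false (leaving out untouched) if d holds another type.
inline bool tryGetInt(const Data& d, int& out) {
    const IntData* p = dynamic_cast<const IntData*>(&d);
    if (!p) return false;
    out = p->get();
    return true;
}

inline bool tryGetString(const Data& d, std::string& out) {
    const StringData* p = dynamic_cast<const StringData*>(&d);
    if (!p) return false;
    out = p->get();
    return true;
}
```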
Even with C++11, you can use a template-based solution. For example, Boost.Variant can do this job quite nicely. This will provide an overloaded constructor and value semantics, so you could do something like:
boost::variant<int, std::string> some_object("input string");
In other words, it's pretty much what you'd get if you spent the time and effort necessary to finish the inheritance-based code outlined above, except that it's dramatically cleaner, since it gets rid of the requirement to store a pointer to the base class, use dynamic_cast to retrieve an object of the correct type, and so on. In short, it's the right solution to the problem (until/unless you can upgrade to a newer compiler, and use std::variant instead).

Leaving aside the problems with the given code that are described in the comments and other replies, I will try to answer your question:
vec.push_back(id, &myData<?>); //how can I set the template param to be the type myData point to?
First you need to change the definition of vec as follows:
vector<dataStruct<void>> vec;
Now you can simply push an element into the vector:
vec.push_back({id, &mydata, bb});
I have tried to modify your code so that it works:
#include<iostream>
#include<vector>
using namespace std;
template<typename T>
struct dataStruct
{
int id;
T** data;
bool mybool;
};
void readData(int &ID, void*& data, bool& mybool)
{
if (mybool)
{
data = new string("bla");
}
else
{
data = new int(0); // heap-allocated so the pointer stays valid after the function returns
}
}
int main ()
{
void* mydata = nullptr;
vector<dataStruct<void>> vec;
bool bb = false;
for (int id = 0; id < 5; id++)
{
if (id%2) bb = true;
readData(id, mydata, bb);
vec.push_back({id, &mydata, bb}); // caution: &mydata is the same address on every iteration
}
}

How can I take ownership of a C++ std::string char data without copying and keeping std::string object?

How can I take ownership of std::string char data without copying and without keeping the source std::string object? (I want to use move semantics, but between different types.)
I use the C++11 Clang compiler and Boost.
Basically I want to do something equivalent to this:
{
std::string s("Possibly very long user string");
const char* mine = s.c_str();
// 'mine' will be passed along,
pass(mine);
//Made-up call
s.release_data();
// 's' should not release data, but it should properly destroy itself otherwise.
}
To clarify, I do need to get rid of the std::string further down the road. The code deals with both string and binary data and should handle them in the same format. And I do want the data from the std::string, because it comes from another code layer that works with std::string.
To give more perspective on where I run into this: for example, I have an asynchronous socket wrapper that should be able to take both std::string and binary data from the user for writing. Both "API" write versions (taking std::string or raw binary data) internally resolve to the same (binary) write. I need to avoid any copying as the string may be long.
WriteId write( std::unique_ptr< std::string > strToWrite )
{
// Convert std::string data to contiguous byte storage
// that will be further passed along to other
// functions (also with the moving semantics).
// strToWrite.c_str() would be a solution to my problem
// if I could tell strToWrite to simply give up its
// ownership. Is there a way?
unique_ptr<std::vector<char> > dataToWrite= ??
//
scheduledWrite( std::move(dataToWrite) );
}
void scheduledWrite( std::unique_ptr< std::vector<char> > data)
{
…
}
I use std::unique_ptr in this example only to illustrate ownership transfer: any other approach with the same semantics is fine with me.
I am wondering about solutions to this specific case (with the std::string char buffer) and about this sort of problem in general: tips for moving buffers around between strings, streams, std containers and buffer types.
I would also appreciate tips and links on C++ design approaches and specific techniques for passing buffer data between different APIs/types without copying. I mention streams but do not use them because I'm shaky on that subject.

How can I take ownership of std::string char data without copying and without keeping the source std::string object? (I want to use move semantics, but between different types.)
You cannot do this safely.
For a specific implementation, and in some circumstances, you could do something awful like use aliasing to modify private member variables inside the string to trick the string into thinking it no longer owns a buffer. But even if you're willing to try this it won't always work. E.g. consider the small string optimization where a string does not have a pointer to some external buffer holding the data, the data is inside the string object itself.
If you want to avoid copying you could consider changing the interface to scheduledWrite. One possibility is something like:
template<typename Container>
void scheduledWrite(Container data)
{
// requires data[i], data.size(), and &data[n] == &data[0] + n for n in [0, size)
…
}
// move resources from object owned by a unique_ptr
WriteId write( std::unique_ptr< std::vector<char> > vecToWrite)
{
scheduledWrite(std::move(*vecToWrite));
}
WriteId write( std::unique_ptr< std::string > strToWrite)
{
scheduledWrite(std::move(*strToWrite));
}
// move resources from object passed by value (callers also have to take care to avoid copies)
WriteId write(std::string strToWrite)
{
scheduledWrite(std::move(strToWrite));
}
// assume ownership of raw pointer
// requires data to have been allocated with new char[]
WriteId write(char const *data,size_t size) // you could also accept an allocator or deallocation function and make ptr_adapter deal with it
{
struct ptr_adapter {
std::unique_ptr<char const []> ptr;
size_t m_size;
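char const &operator[] (size_t i) const { return ptr[i]; }
size_t size() const { return m_size; }
};
// the unique_ptr constructor is explicit, so the raw pointer must be wrapped by hand
scheduledWrite(ptr_adapter{std::unique_ptr<char const []>(data), size});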
}

This class takes ownership of a string using move semantics and shared_ptr:
struct charbuffer
{
charbuffer()
{}
charbuffer(size_t n, char c)
: _data(std::make_shared<std::string>(n, c))
{}
explicit charbuffer(std::string&& str)
: _data(std::make_shared<std::string>(std::move(str))) // str is an lvalue here, so it must be moved explicitly
{}
charbuffer(const charbuffer& other)
: _data(other._data)
{}
charbuffer(charbuffer&& other)
{
swap(other);
}
charbuffer& operator=(charbuffer other)
{
swap(other);
return *this;
}
void swap(charbuffer& other)
{
using std::swap;
swap(_data, other._data);
}
char& operator[](int i)
{
return (*_data)[i];
}
char operator[](int i) const
{
return (*_data)[i];
}
size_t size() const
{
return _data->size();
}
bool valid() const
{
return static_cast<bool>(_data); // shared_ptr's conversion to bool is explicit
}
private:
std::shared_ptr<std::string> _data;
};
Example usage:
std::string s("possibly very long user string");
charbuffer cb(std::move(s)); // s is now in a moved-from (typically empty) state
// use charbuffer...

You could use polymorphism to resolve this. The base type is the interface to your unified data buffer implementation. Then you would have two derived classes. One for std::string as the source, and the other uses your own data representation.
struct MyData {
virtual void * data () = 0;
virtual const void * data () const = 0;
virtual unsigned len () const = 0;
virtual ~MyData () {}
};
struct MyStringData : public MyData {
std::string data_src_;
//...
};
struct MyBufferData : public MyData {
MyBuffer data_src_;
//...
};
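A fuller sketch of that idea, under the assumption that the write path only needs read access to a contiguous byte range. Each derived class takes its source by value and moves it in, so no character data is copied. The names ByteSource, StringSource, VectorSource and makeSource are hypothetical, and std::vector<char> stands in for the unspecified MyBuffer:

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Common read-only view over an owned, contiguous byte buffer.
struct ByteSource {
    virtual const char* data() const = 0;
    virtual std::size_t size() const = 0;
    virtual ~ByteSource() {}
};

// Owns a std::string that was moved in.
class StringSource : public ByteSource {
    std::string s_;
public:
    explicit StringSource(std::string s) : s_(std::move(s)) {}
    const char* data() const override { return s_.data(); }
    std::size_t size() const override { return s_.size(); }
};

// Owns a std::vector<char> that was moved in.
class VectorSource : public ByteSource {
    std::vector<char> v_;
public:
    explicit VectorSource(std::vector<char> v) : v_(std::move(v)) {}
    const char* data() const override { return v_.data(); }
    std::size_t size() const override { return v_.size(); }
};

// Factory overloads: callers pass std::move(x) to avoid a copy.
inline std::unique_ptr<ByteSource> makeSource(std::string s) {
    return std::unique_ptr<ByteSource>(new StringSource(std::move(s)));
}
inline std::unique_ptr<ByteSource> makeSource(std::vector<char> v) {
    return std::unique_ptr<ByteSource>(new VectorSource(std::move(v)));
}
```

The resulting std::unique_ptr<ByteSource> can then be handed down the write pipeline regardless of where the bytes came from.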

Check if container of shared_ptr contains a pointer?

I am using the Observer pattern. I have a class, called Monitor for example, that is monitoring a collection of objects. The class is an Observer and each object in its collection is a Subject. Currently the collection is implemented as a std::list of shared_ptr. In the Update method of the Monitor class I want to check if the update is coming from one of the objects in its collection.
std::list<SomeSharedPointer> items_;
...
void Monitor::Update(Subject *subject)
{
if(subject == something_)
{
DoSomething();
}
else if
??
// if subject is one of the objects in our collection then do something..
}
Subject here is a raw pointer and my collection is a list of shared_ptr. How can I effectively check if the subject coming in is any one of the objects in my collection?
(Note my compiler, msvc, supports lambdas if there is an algorithmic solution requiring one)
UPDATE
I should add that I realize I can use a for loop over the container, but I'm wondering if there's a snazzier way.
UPDATE 2
SomeSharedPointer is a typedef for std::shared_ptr<SomeType> where SomeType derives from abstract class Subject (standard Observer pattern implementation). SomeType will at some point call Notify() which will call the Update() method for each observer.

auto i = std::find_if(items_.begin(), items_.end(),
[=](const SomeSharedPointer& x) { return x.get() == subject; });
if (i != items_.end())
{
// Object found, and i is an iterator pointing to it
}
A little helper method can make this more readable:
typedef std::list<SomeSharedPointer> ObserverCollection;
// You can also add a const version if needed
ObserverCollection::iterator find_observer(Subject* s)
{
return std::find_if(items_.begin(), items_.end(),
[=](const SomeSharedPointer& x) { return x.get() == s; });
}
Then, you use it like this if you need the iterator
auto i = find_observer(subject);
if (i != items_.end())
{
// Object found
}
or simply like this if you don't:
if (find_observer(subject) != items_.end())
{
...
}
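When only a yes/no answer is needed and the iterator is not, std::any_of (C++11) states the intent a little more directly. A sketch, with Subject and SomeType reduced to empty stand-ins and containsSubject as a hypothetical helper name:

```cpp
#include <algorithm>
#include <list>
#include <memory>

// Simplified stand-ins for the types in the question.
struct Subject {
    virtual ~Subject() {}
};
struct SomeType : Subject {};
typedef std::shared_ptr<SomeType> SomeSharedPointer;

// True if any shared_ptr in items points at the given subject.
inline bool containsSubject(const std::list<SomeSharedPointer>& items,
                            const Subject* subject) {
    return std::any_of(items.begin(), items.end(),
                       [subject](const SomeSharedPointer& x) {
                           return x.get() == subject;
                       });
}
```

Like find_if, this is a linear scan, so it is equivalent in cost to the hand-written loop.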

If you don't have C++11 support for auto, declare the iterator the old-fashioned way:
for (std::list<SomeSharedPointer>::iterator iter = items_.begin(); iter != items_.end(); ++iter)
{
if (subject == iter->get())
{
.. do stuff ..
}
}
Shared pointer has a .get() function that returns the pointer.

Since you said that the observer needs to make a decision based on a state of items it is monitoring, then you should add a method to the base class (Subject in your question) which returns an enum defining the item's state. Then based on the state, add a switch in the update method:
enum State{ STATE_1, STATE_2 };
void Monitor::Update(Subject *subject)
{
switch( subject->getState() )
{
case STATE_1:
// do something 1
break;
case STATE_2:
// do something 2
break;
default:
//error
}
}

If possible, you could consider changing your container to something that gives you better search behavior. For instance, you could use a std::set: insertion costs more, but lookup is faster. Or std::unordered_set: both insertion and lookup are fast, but iteration is likely slower. To make the comparison work, you can create a helper class that converts your raw pointer into a shared pointer with a no-op deleter.
template <typename T>
struct unshared_ptr {
std::shared_ptr<T> p_;
unshared_ptr (T *p) : p_(p, [](...){}) {}
operator const std::shared_ptr<T> & () const { return p_; }
operator T * () const { return p_.get(); }
};
If your container supported the find method, then:
typedef unshared_ptr<SomeType> unshared_some;
if (items_.end() != items_.find(unshared_some(subject))) {
DoSomething();
}
If you are sticking with std::list, you can abuse the remove_if method by passing in a predicate that always returns false, but performs your matching test.
bool matched = false;
auto pred = [subject, &matched](const SomeSharedPointer &v) -> bool {
if (!matched && v.get() == subject) {
matched = true;
}
return false;
};
items_.remove_if(pred);
if (matched) {
DoSomething();
} //...

C++ method that can/cannot return a struct

I have a C++ struct and a method:
struct Account
{
unsigned int id;
string username;
...
};
Account GetAccountById(unsigned int id) const { }
I can return an Account struct if the account exists, but what to do if there's no account?
I thought of having:
An "is valid" flag on the struct (so an empty one can be returned, with that set to false)
An additional "is valid" pointer (const string &id, int *is_ok) that's set if the output is valid
Returning an Account* instead, and returning either a pointer to a struct, or NULL if it doesn't exist?
Is there a best way of doing this?

You forgot the most obvious one, in C++:
bool GetAccountById(unsigned int id, Account& account);
Return true and fill in the provided reference if the account exists, else return false.
It might also be convenient to use the fact that pointers can be null, and having:
bool GetAccountById(unsigned int id, Account* account);
That could be defined to return true if the account id exists, but only (of course) to fill in the provided account if the pointer is non-null. Sometimes it's handy to be able to test for existence, and this saves having to have a dedicated method for only that purpose.
It's a matter of taste what you prefer having.
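A sketch of the pointer variant, assuming the accounts live in a std::map keyed by id. AccountStore and its Add method are hypothetical scaffolding, and Account is reduced to the two fields shown in the question:

```cpp
#include <map>
#include <string>

struct Account {
    unsigned int id;
    std::string username;
};

class AccountStore {
    std::map<unsigned int, Account> accounts_;
public:
    void Add(const Account& a) { accounts_[a.id] = a; }

    // Returns true if the id exists. The account is copied out only when
    // the caller passes a non-null pointer, so a null pointer turns the
    // call into a pure existence test.
    bool GetAccountById(unsigned int id, Account* account) const {
        std::map<unsigned int, Account>::const_iterator it = accounts_.find(id);
        if (it == accounts_.end()) return false;
        if (account) *account = it->second;
        return true;
    }
};
```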

From the options given I would return Account*. But returning a pointer may have some bad side effects on the interface.
Another possibility is to throw an exception when there is no such account. You may also try boost::optional.

You could also try the null object pattern.
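A sketch of what that could look like here, assuming id 0 is never assigned to a real account (that convention is an assumption of this example, not something stated in the question). NullAccount and FindOrNull are hypothetical names:

```cpp
#include <map>
#include <string>

struct Account {
    unsigned int id;
    std::string username;
    // Assumption for this sketch: id 0 is reserved for "no account".
    bool IsNull() const { return id == 0; }
};

// The single shared null object.
inline const Account& NullAccount() {
    static const Account none = {0, ""};
    return none;
}

// Always returns a usable reference: the stored account, or the null object.
inline const Account& FindOrNull(const std::map<unsigned int, Account>& accounts,
                                 unsigned int id) {
    std::map<unsigned int, Account>::const_iterator it = accounts.find(id);
    return it == accounts.end() ? NullAccount() : it->second;
}
```

Callers can use the result without a null check and call IsNull() only where the distinction matters.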

It depends how likely you think the non-existent account is going to be.
If it is truly exceptional - deep in the bowels of the internals of the banking system where the data is supposed to be valid - then maybe throw an exception.
If it is in a user-interface level, validating the data, then probably you don't throw an exception.
Returning a pointer means someone has to deallocate the allocated memory - that's messier.
Can you use a 'marker ID' (such as 0) to indicate 'invalid account'?

I would use Account* and add a documentation comment to the method stating that the return value can be NULL.

There are several methods.
1) Throw an exception. This is useful if you want GetAccountById to return the account by value and the use of exceptions fit your programming model. Some will tell you that exceptions are "meant" to be used only in exceptional circumstances. Things like "out of memory" or "computer on fire." This is highly debatable, and for every programmer you find who says exceptions are not for flow control you'll find another (myself included) who says that exceptions can be used for flow control. You need to think about this and decide for yourself.
Account GetAccountById(unsigned int id) const
{
if( account_not_found )
throw std::runtime_error("account not found");
}
2) Don't return an Account by value. Instead, return by pointer (preferably smart pointer), and return NULL when you didn't find the account:
boost::shared_ptr<Account> GetAccountById(unsigned int id) const
{
if( account_not_found )
return boost::shared_ptr<Account>(); // empty (null) shared_ptr
}
3) Return an object that has a 'presence' flag indicating whether or not the data item is present. Boost.Optional is an example of such a device, but in case you can't use Boost here is a templated object that has a bool member that is true when the data item is present, and is false when it is not. The data item itself is stored in the value_ member. It must be default constructible.
template<class Value>
struct PresenceValue
{
PresenceValue() : present_(false) {};
PresenceValue(const Value& val) : present_(true), value_(val) {};
PresenceValue(const PresenceValue<Value>& that) : present_(that.present_), value_(that.value_) {};
template<class Conv> explicit PresenceValue(const Conv& conv) : present_(true), value_(static_cast<Value>(conv)) {};
PresenceValue<Value>& operator=(const PresenceValue<Value>& that) { present_ = that.present_; value_ = that.value_; return * this; }
template<class Compare> bool operator==(Compare rhs) const
{
if( !present_ )
return false;
return rhs == value_;
}
template<class Compare> bool operator==(const Compare* rhs) const
{
if( !present_ )
return false;
return rhs == value_;
}
template<class Compare> bool operator!=(Compare rhs) const { return !operator==(rhs); }
template<class Compare> bool operator!=(const Compare* rhs) const { return !operator==(rhs); }
bool operator==(const Value& rhs) const { return present_ && value_ == rhs; }
operator bool() const { return present_ && static_cast<bool>(value_); }
operator Value () const;
void Reset() { value_ = Value(); present_ = false; }
bool present_;
Value value_;
};
For simplicity, I would create a typedef for Account:
typedef PresenceValue<Account> p_account;
...and then return this from your function:
p_account GetAccountById(...)
{
if( account_found )
return p_account(the_account); // this will set 'present_' to true and 'value_' to the account
else
return p_account(); // this will set 'present_' to false
}
Using this is straightforward:
p_account acct = GetAccountById(some_id);
if( acct.present_ )
{
// magic happens when you found the account
}

Another way besides returning a reference is to return a pointer. If the account exists, return its pointer. Else, return NULL.

There is yet another way similar to the "is valid" pattern. I am developing an application right now that has a lot of such stuff in it. But my IDs can never be less than 1 (they are all SERIAL fields in a PostgreSQL database), so I just have a default constructor for each structure (or class in my case) that initializes id with -1, and an isValid() method that returns true if id is not equal to -1. Works perfectly for me.

I would do:
class Bank
{
public:
class Account {};
class AccountRef
{
public:
AccountRef(): m_account(NULL) {}
AccountRef(Account const& acc) : m_account(&acc) {}
bool isValid() const { return m_account != NULL; }
Account const& operator*() const { return *m_account; }
operator bool() const { return isValid(); }
private:
Account const* m_account;
};
Account const& GetAccountById(unsigned int id) const
{
if (id < m_accounts.size())
{ return m_accounts[id];
}
throw std::out_of_range("Invalid account ID");
}
AccountRef FindAccountById(unsigned int id) const
{
if (id < m_accounts.size())
{ return AccountRef(m_accounts[id]);
}
return AccountRef();
}
private:
std::vector<Account> m_accounts;
};
A method called get should always return (IMHO) the object asked for. If it does not exist then that is an exception. If there is the possibility that something may not exist then you should also provide a find method that can determine if the object exists so that a user can test it.
int main()
{
Bank Chase;
// Get a reference
// As the bank ultimately owns the account.
// You just want to manipulate it.
Bank::Account const& account = Chase.GetAccountById(1234);
// If there is the possibility the account does not exist then use find()
Bank::AccountRef ref = Chase.FindAccountById(12345);
if ( !ref )
{ // Report error
return 1;
}
Bank::Account const& anotherAccount = *ref;
}
Now I could have used a pointer instead of going to the effort of creating AccountRef. The problem with that is that pointers do not have ownership semantics and thus there is no true indication of who should own (and therefore delete) the pointer.
As a result I like to wrap pointers in some container that allows the user to manipulate the object only as I want them to. In this case the AccountRef does not expose the pointer so there is no opportunity for the user of AccountRef to actually try and delete the account.
Here you can check if AccountRef is valid and extract a reference to an account (assuming it is valid). Because the object contains only a pointer the compiler is liable to optimize this to the point that this is no more expensive than passing the pointer around. The benefit is that the user can not accidentally abuse what I have given them.
Summary: AccountRef has no real run-time cost. Yet provides type safety (as it hides the use of pointer).

I like to do a combination of what you suggest with the Valid flag and what someone else suggested with the null object pattern.
I have a base class called Status that I inherit from on objects that I want to use as return values. I'll leave most of it out of this discussion since it's a little more involved but it looks something like this
class Status
{
public:
Status(bool isOK=true) : mIsOK(isOK) {}
operator bool() const {return mIsOK;}
private:
bool mIsOK;
};
now you'd have
class Account : public Status
{
public:
Account() : Status(false) {}
Account(/*other parameters to initialize an account*/) : ...
...
};
Now if you create an account with no parameters:
Account A;
It's invalid. But if you create an account with data
Account A(id, name, ...);
It's valid.
You test for the validity with the operator bool.
Account A=GetAccountByID(id);
if (!A)
{
//whoa there! that's an invalid account!
}
I do this a lot when I'm working with math-types. For example, I don't want to have to write a function that looks like this
bool Matrix_Multiply(a,b,c);
where a, b, and c are matrices. I'd much rather write
c=a*b;
with operator overloading. But there are cases where a and b can't be multiplied so it's not always valid. So they just return an invalid c if it doesn't work, and I can do
c=a*b;
if (!c) //handle the problem.

boost::optional is probably the best you can do in a language so broken it doesn't have native variants.

How to create map with keys/values inside class body once (not each time functions from class are called)

I would like to create a C++ class that returns the value for a given key from a map, and the key for a given value. I would also like to keep my predefined map inside the class. The methods for getting a value or key would be static. How can I define the map statically so that it is not created each time I call getValue(str)?
class Mapping
{
static map<string, string> x;
Mapping::Mapping()
{
x["a"] = "one";
x["b"] = "two";
x["c"] = "three";
}
string getValue(string key)
{
return x[key];
}
string getKey(string value)
{
map<string, string>::const_iterator it;
for (it = x.begin(); it < x.end(); ++it)
if (it->second == value)
return it->first;
return "";
}
};
string other_func(string str)
{
return Mapping.getValue(str); // I don't want to do: new Mapping().getValue(str);
}
Function other_func is called often, so I would prefer to use a map that is created only once (not each time other_func is called). Do I have to create an instance of Mapping in main() and then use it in other_func (return instance.getValue(str)), or is it possible to define the map in the class body and use it from static functions?

Is this what you want?
#include <map>
#include <string>
class Mapping
{
private:
    // Internally we use a map.
    // Typedef the type so it is easy to refer to.
    // Also, if you change the type you only need to do it in one place.
    typedef std::map<std::string, std::string> MyMap;
    MyMap x; // The data store.
    // The only copy of the map.
    // I don't see any way of modifying it, so declare it const (unless you want to modify it).
    static const Mapping myMap;
    // Make the constructor private.
    // This class is going to hold the only copy.
    Mapping()
    {
        x["a"] = "one";
        x["b"] = "two";
        x["c"] = "three";
    }
public:
    // Public interface.
    // Returns a const reference to the value.
    // The interface uses static methods (so we don't need an instance).
    // Internally we refer to the only instance.
    static std::string const& getValue(std::string const& key)
    {
        // Use find rather than operator[].
        // This way you don't go inserting garbage into your data store.
        // It also allows the data store to be const (operator[] may modify
        // the data store if the key is not found).
        MyMap::const_iterator it = myMap.x.find(key);
        if (it != myMap.x.end())
        {
            // If we find it, return the value.
            return it->second;
        }
        // What happens when we don't find anything?
        // Your original code created a garbage entry and returned that.
        // You could throw an exception instead. Returning "" directly would
        // bind the reference to a temporary, so return a static empty string.
        static const std::string empty;
        return empty;
    }
};
// A static data member must also be defined once, in a single source file.
const Mapping Mapping::myMap;

First of all, you might want to look up Boost.MultiIndex and/or Boost.Bimap. Either will probably help a bit with your situation of wanting to use either one of the paired items to look up the other (Bimap is more directly what you want, but if you might need to add a third, fourth, etc. key, then MultiIndex may work better). Alternatively, you might want to just use a pair of sorted vectors. For situations like this, where the data remains constant after it has been filled in, sorted vectors will typically allow faster searching and consume less memory than a map.
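To make the "pair of sorted vectors" idea concrete, here is a minimal sketch. The class name SortedPairTable and its member names are made up for illustration; it assumes the data is fixed after construction and that keys and values are each unique. One vector is sorted by key, the other holds the swapped pairs sorted by value, and both are searched with std::lower_bound.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: a read-only two-way table backed by two sorted vectors.
class SortedPairTable {
public:
    typedef std::pair<std::string, std::string> Entry;

    explicit SortedPairTable(const std::vector<Entry>& entries)
        : byKey_(entries)
    {
        // The second vector stores (value, key) so it can be sorted by value.
        byValue_.reserve(entries.size());
        for (std::size_t i = 0; i < entries.size(); ++i)
            byValue_.push_back(Entry(entries[i].second, entries[i].first));
        std::sort(byKey_.begin(), byKey_.end());
        std::sort(byValue_.begin(), byValue_.end());
    }

    // Each lookup returns a pointer to the match, or a null pointer if absent.
    const std::string* findValue(const std::string& key) const
    {
        return lookup(byKey_, key);
    }
    const std::string* findKey(const std::string& value) const
    {
        return lookup(byValue_, value);
    }

private:
    // Orders an entry against a bare string by the entry's first member.
    static bool firstLess(const Entry& e, const std::string& s)
    {
        return e.first < s;
    }

    // Binary search: O(log n) comparisons on contiguous memory.
    static const std::string* lookup(const std::vector<Entry>& v,
                                     const std::string& first)
    {
        std::vector<Entry>::const_iterator it =
            std::lower_bound(v.begin(), v.end(), first, firstLess);
        if (it != v.end() && it->first == first)
            return &it->second;
        return nullptr;
    }

    std::vector<Entry> byKey_;
    std::vector<Entry> byValue_;
};
```

Returning a pointer keeps "not found" distinct from an empty string; whether that is nicer than throwing is a matter of taste.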
From there, (even though you don't have to make it explicit) you can handle initialization of the map object itself a bit like a singleton -- put the data in the first time it's needed, and from then on just use it:
class Mapping {
static map<string, string> x;
static bool inited;
public:
Mapping() {
if (!inited) {
x["a"] = "one";
x["b"] = "two";
x["c"] = "three";
inited = true;
}
}
string getValue(string const &key) { return x[key]; }
};
// This initialization is redundant, but being explicit doesn't hurt.
bool Mapping::inited = false;
map<string, string> Mapping::x;
With this your some_func could look something like this:
string some_func(string const &input) {
return Mapping().getValue(input);
}
This still has a little overhead compared to pre-creating and using an object, but it should be a lot less than re-creating and re-initializing the map (or whatever) every time.
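A different way to get the same "fill it in the first time it's needed" behaviour, without constructing a Mapping object on every call, is a function-local static. This is only a sketch of that alternative technique, not the code above: the name LazyMapping and the helpers table() and build() are invented for the example. It relies on the language rule that a function-local static is initialized once, on first use (and, since C++11, in a thread-safe way).

```cpp
#include <map>
#include <string>

// Hypothetical variant of the Mapping class using a function-local static.
class LazyMapping {
public:
    // Returns the value for `key`, or an empty string if the key is unknown.
    static std::string getValue(const std::string& key)
    {
        const std::map<std::string, std::string>& m = table();
        std::map<std::string, std::string>::const_iterator it = m.find(key);
        return it != m.end() ? it->second : std::string();
    }

private:
    // The map is built on the first call only; later calls reuse it.
    static const std::map<std::string, std::string>& table()
    {
        static const std::map<std::string, std::string> m = build();
        return m;
    }

    static std::map<std::string, std::string> build()
    {
        std::map<std::string, std::string> m;
        m["a"] = "one";
        m["b"] = "two";
        m["c"] = "three";
        return m;
    }
};

// The frequently called function from the question, now with no instance.
std::string other_func(const std::string& str)
{
    return LazyMapping::getValue(str);
}
```

Because the lookup uses find rather than operator[], the map can stay const and unknown keys are not inserted.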

If you are looking up the key from the value a lot, you will find it easier and more efficient to maintain a second map (value to key) in parallel with the first, rather than scanning the first map linearly each time.
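A rough sketch of keeping two maps in parallel might look like this. The class name TwoWayMap and its methods are hypothetical, and it assumes both keys and values are unique, so a pair is rejected if either side is already present.

```cpp
#include <map>
#include <string>

// Hypothetical class: a forward map and a reverse map updated together.
class TwoWayMap {
public:
    // Inserts into both maps. Returns false and changes nothing if the key
    // or the value is already present, so the two maps never disagree.
    bool add(const std::string& key, const std::string& value)
    {
        if (forward_.count(key) != 0 || reverse_.count(value) != 0)
            return false;
        forward_[key] = value;
        reverse_[value] = key;
        return true;
    }

    // Both lookups are O(log n); a null pointer means "not found".
    const std::string* valueFor(const std::string& key) const
    {
        return find(forward_, key);
    }
    const std::string* keyFor(const std::string& value) const
    {
        return find(reverse_, value);
    }

private:
    typedef std::map<std::string, std::string> Map;

    static const std::string* find(const Map& m, const std::string& k)
    {
        Map::const_iterator it = m.find(k);
        return it != m.end() ? &it->second : nullptr;
    }

    Map forward_;
    Map reverse_;
};
```

The cost is roughly double the memory, in exchange for replacing the linear scan in getKey with a logarithmic lookup.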

You don't need a static map, especially if you ever want to create multiple Mapping objects. You can create the object in main(), where you need it, and pass it around by reference, as in:
string other_func(Mapping &mymap, string str)
{
return mymap.getValue(str);
}
Of course this raises questions about efficiency, with lots of strings being copied, so you might want to just call getValue directly without the extra overhead of calling other_func.
Also, if you know anything about the Boost libraries, then you might want to read up on Boost.Bimap which is sort of what you are implementing here.
http://www.boost.org/doc/libs/1_42_0/libs/bimap/doc/html/index.html

Static is bad. Don't. Also, throw or return a NULL pointer when the lookup fails, rather than returning an empty string. other_func should be a member function of Mapping, not a static method. This whole thing desperately needs to be an object.
#include <algorithm>
#include <map>
#include <stdexcept>
#include <utility>
template<typename Key, typename Value> class Mapping {
    std::map<Key, Value> primmap;
    std::map<Value, Key> secmap;
public:
    template<typename Functor> Mapping(Functor f) {
        f(primmap);
        // Using a local type as a template argument requires C++11 or later.
        struct helper {
            std::map<Value, Key>* secmapptr;
            // A map's element type is std::pair<const Key, Value>.
            void operator()(const std::pair<const Key, Value>& ref) const {
                (*secmapptr)[ref.second] = ref.first;
            }
        };
        helper helpme;
        helpme.secmapptr = &secmap;
        std::for_each(primmap.begin(), primmap.end(), helpme);
    }
    Key& GetKeyFromValue(const Value& v) {
        // typename is required because the iterator type depends on the template parameters.
        typename std::map<Value, Key>::iterator key = secmap.find(v);
        if (key == secmap.end())
            throw std::runtime_error("Value not found!");
        return key->second;
    }
    Value& GetValueFromKey(const Key& k) {
        typename std::map<Key, Value>::iterator key = primmap.find(k);
        if (key == primmap.end())
            throw std::runtime_error("Key not found!");
        return key->second;
    }
    // Add const appropriately.
};
This code uses a function object to initialize the map, reverses it for you, then provides accessing methods for the contents. As for why you would use such a thing as opposed to a raw pair of std::maps, I don't know.
Looking at some of the code you've written, I'm guessing that you originate from Java. Java has a lot of things that C++ users don't use (unless they don't know the language) like singletons, statics, and such.


Creating a map for millions of objects in C++ - c++

Please have a look at this site: http://www.cs.northwestern.edu/~riesbeck/programming/c++/stl-summary.html It shows the time complexity of each operation on each STL container. First be clear about your requirements, then choose a container by comparing the time complexities listed there.
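As one illustration of choosing by complexity: inserting or erasing in the middle of a std::vector is linear in the number of elements, while std::unordered_map offers average constant-time insert, erase and find by key. The sketch below assumes the ids are unique and that lookup by id is the dominant operation. The Object class here is only a stand-in for the abstract class in the question, Item is a made-up concrete type, and the DataContainer members are hypothetical.

```cpp
#include <cstddef>
#include <memory>
#include <unordered_map>
#include <utility>

// Stand-in for the question's abstract class; only id() is assumed.
class Object {
public:
    virtual ~Object() {}
    virtual unsigned long id() const = 0;
};

// Made-up concrete type, present only so the sketch can be exercised.
class Item : public Object {
public:
    explicit Item(unsigned long id) : id_(id) {}
    unsigned long id() const override { return id_; }
private:
    unsigned long id_;
};

class DataContainer {
public:
    // Takes ownership. Returns false (and discards the object) if it is
    // null or if an object with the same id is already stored.
    bool addObject(std::unique_ptr<Object> object)
    {
        if (!object)
            return false;
        const unsigned long key = object->id();
        return objects_.emplace(key, std::move(object)).second;
    }

    // Average O(1); returns true if something was removed.
    bool removeObject(unsigned long id)
    {
        return objects_.erase(id) != 0;
    }

    // Average O(1); returns a null pointer if the id is unknown.
    Object* getObject(unsigned long id) const
    {
        auto it = objects_.find(id);
        return it != objects_.end() ? it->second.get() : nullptr;
    }

    std::size_t size() const { return objects_.size(); }

private:
    std::unordered_map<unsigned long, std::unique_ptr<Object>> objects_;
};
```

If the objects must also be visited in id order, std::map (logarithmic operations, sorted iteration) or a sorted vector with binary search may be a better fit; with tens of millions of entries, calling reserve on the unordered_map up front can help avoid repeated rehashing.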
