Best Practice : How to get a unique identifier for the object - c++

I've got several objects and need to generate a unique identifier for them which will not be changed/repeated during the lifetime of each object.
Basically I want to get/generate a unique id for my objects, something like this:
int id = reinterpret_cast<int>(&obj);
or
int id = (int)&obj;
I understand the code above is a bad idea, since int might not be large enough to store the address, etc.
So what's the best practice for getting a unique identifier from an object in a portable way?

Depending on your "uniqueness"-requirements, there are several options:
If unique within one address space ("within one program execution") is OK and your objects stay where they are in memory then pointers are fine. There are pitfalls however: If your objects live in containers, every reallocation may change your objects' identity and if you allow copying of your objects, then objects returned from some function may have been created at the same address.
If you need more global uniqueness, for instance because you are dealing with communicating programs or persistent data, use GUIDs/UUIDs, such as those from Boost.Uuid.
You could create unique integers from some static counter, but beware of the pitfalls:
Make sure your increments are atomic.
Protect against copying, or write your own copy constructor and assignment operator.
Personally, my choice has been UUIDs whenever I can afford them, because they give me some peace of mind: I don't have to think about all the pitfalls.
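The two counter pitfalls above (atomic increments, handling of copies) can be sketched as follows. This is only a minimal illustration: the class name Tracked is made up, and it assumes that a copy should count as a new object with its own id.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical class: every instance, including copies, gets a fresh id.
class Tracked {
public:
    Tracked() : id_(next_id()) {}
    // A copy is a distinct object, so it draws a new id instead of copying one.
    Tracked(const Tracked&) : id_(next_id()) {}
    // Assignment would copy state (none here) but keeps the target's identity.
    Tracked& operator=(const Tracked&) { return *this; }

    std::uint64_t id() const { return id_; }

private:
    static std::uint64_t next_id() {
        // fetch_add is atomic, so concurrent constructors never share a value.
        static std::atomic<std::uint64_t> counter{0};
        return counter.fetch_add(1, std::memory_order_relaxed);
    }

    const std::uint64_t id_;
};
```

A 64-bit counter will not realistically wrap around within one program run, which is the other half of the "large enough type" concern.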

If the objects need to be uniquely identified, you can generate the unique id in the constructor:
struct Obj
{
int _id;
Obj() { static int id = 0; _id = id++; }
};
You'll have to decide how you want to handle copies/assignments (same id - the above will work / different id's - you'll need a copy constructor and probably a static class member instead of the static local variable).

When I looked into this issue, I fairly quickly ended up at the Boost UUID library (universally unique identifier, http://www.boost.org/doc/libs/1_52_0/libs/uuid/). However, as my project grew, I switched over to Qt's GUID library (globally unique identifier, https://doc.qt.io/qt-5/quuid.html).
A lesson learned for me though was to start declaring your own UUID class and hide the implementation so that you can switch to whatever you find suitable later on.
I hope that helps.
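A rough sketch of what "declare your own class and hide the implementation" could look like. ObjectId is a made-up name, and the 128 random bits are only a stand-in for whatever generator (Boost, Qt, ...) sits behind it; this is not a standards-compliant UUID.

```cpp
#include <array>
#include <cstdint>
#include <random>

// Hypothetical wrapper: callers only see ObjectId, never the generator behind it.
class ObjectId {
public:
    // Creates a fresh identifier. The random bits are a placeholder; a Boost or
    // Qt UUID could replace them later without touching any caller.
    static ObjectId generate() {
        // Note: std::random_device is not guaranteed to be thread-safe, nor
        // non-deterministic on every platform.
        static std::random_device device;
        std::uniform_int_distribution<std::uint64_t> dist;
        ObjectId id;
        id.words_[0] = dist(device);
        id.words_[1] = dist(device);
        return id;
    }

    friend bool operator==(const ObjectId& a, const ObjectId& b) { return a.words_ == b.words_; }
    friend bool operator!=(const ObjectId& a, const ObjectId& b) { return !(a == b); }
    friend bool operator<(const ObjectId& a, const ObjectId& b) { return a.words_ < b.words_; }

private:
    ObjectId() : words_{} {}
    std::array<std::uint64_t, 2> words_;
};
```

Because only comparison operators and generate() are public, the representation can change without affecting code that stores ids or uses them as map keys.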

If your object is a class then you could have a static member variable which you initialize to 0. Then in the constructor you store this value in the class instance and increment the static variable:
class Indexed
{
public:
Indexed() :
m_myIndex( m_nextIndex++ )
{ }
int getIndex() const
{ return m_myIndex; }
private:
const int m_myIndex;
static int m_nextIndex;
};
// The static member needs exactly one definition, in a single .cpp file.
int Indexed::m_nextIndex = 0;

If you need unique id for distributed environment use boost::uuid

Using the object's address directly as the unique (for this run) identifier does not look like a bad idea. Why cast it to an integer? Just compare pointers with ==:
MyObject *obj1, *obj2;
...
if (obj1 == obj2) ...
This will not work, of course, if you need to write IDs to a database or the like, because the same pointer values can occur again between runs. Also, do not overload the comparison operator (==).
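If an integer is needed after all (for logging, say), the portable target type is std::uintptr_t rather than int. A small sketch, where MyObject is just a placeholder type and address_id is a made-up helper name:

```cpp
#include <cstdint>

struct MyObject { int value = 0; };

// Converts an object's address to an integer wide enough to hold it.
// std::uintptr_t is formally optional, but mainstream platforms provide it.
inline std::uintptr_t address_id(const MyObject& obj) {
    return reinterpret_cast<std::uintptr_t>(&obj);
}
```

The same caveats apply as for raw pointers: the value is only meaningful while the object is alive and stays at the same address.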

Related

Use a map across multiple objects

I use global maps to register one or several objects of the same type. I started with using a global namespace for that purpose.
Take this for an example (code untested, just an example):
//frame.h
#include <map>
class frame
{
public:
frame(int id);
~frame();
private:
int id_; // remembered so the destructor can unregister this frame
};
namespace frame_globals
{
extern std::map<int, frame *> global_frameMap;
}
//frame.cpp
#include "frame.h"
namespace frame_globals
{
std::map<int, frame *> global_frameMap;
}
frame::frame(int id) : id_(id)
{
//[...]
//assuming this class is exclusively used with "new"!
frame_globals::global_frameMap[id] = this;
}
frame::~frame()
{
frame_globals::global_frameMap.erase(id_);
}
This was a rather quick solution I used, but now I stumble again on the use case, where I need my objects registered and I asked myself if there was no better way, than using a global variable (I would like to get rid of that).
[EDIT: Seems to be incorrect]
A static member variable is (to my best knowledge) no option, because a static is inline in every object and I need it across all objects.
What is the best way to do this? Using a common parent comes to mind, but I want easy access to my frame class. How would you solve it?
Or do you know of a better way to register an object, so that I can get a pointer to it from somewhere else? So I might not exclusively need "new" and save "this"?
I would consider reversing the design. For example, what is wrong with using built-in language functionality?
std::map<int, frame> frames_;
frames_.emplace(id, id); // adding, i.e. creating a frame in place from (key, constructor argument)
frames_.erase(id); // removing, i.e. deleting the frame
Now creating/deleting is as easy as before. Coincidentally, if someone needs a frame for a different purpose, there is no issue here. If frame is supposed to be polymorphic you could store std::unique_ptr instead.
Also I would consider making id a member of frame and then store in a std::set instead of std::map.
class frame
{
public:
int id;
frame(int id);
friend bool operator <(const frame& lhs, const frame& rhs) {return lhs.id < rhs.id;}
};
std::set<frame> frames_;
Deleting is just:
frames_.erase(frame);
However, looking up by id is now somewhat more complicated. Fortunately there's a solution at hand: implement a transparent comparator, which among other things involves defining is_transparent.
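A minimal sketch of such a transparent comparator, assuming C++14 and a simplified frame that only carries its id. The names frame_less, frame_set and find_frame are made up for this example.

```cpp
#include <set>

struct frame {
    int id;
    explicit frame(int id_) : id(id_) {}
};

// Comparator with is_transparent: lets std::set::find accept a bare int.
struct frame_less {
    using is_transparent = void;
    bool operator()(const frame& a, const frame& b) const { return a.id < b.id; }
    bool operator()(const frame& a, int id) const { return a.id < id; }
    bool operator()(int id, const frame& b) const { return id < b.id; }
};

using frame_set = std::set<frame, frame_less>;

// Returns a pointer to the stored frame, or nullptr if the id is unknown.
inline const frame* find_frame(const frame_set& frames, int id) {
    auto it = frames.find(id);  // heterogeneous lookup, no temporary frame built
    return it == frames.end() ? nullptr : &*it;
}
```

Without is_transparent, find would only accept a frame, so a lookup by id would first have to construct a temporary frame.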
Overall you need to think about ownership. Ask yourself: where or who would own/store the map/frame pointers/etc. Store it there. In case of shared ownership, use a shared_ptr (but use only this pattern when actually needed - which is more rare than what people use it for).
Some relevant core-guidelines
I.3 Avoid singletons
R.5 and R.6 Avoid non-const global variables
R.11 Avoid calling new and delete explicitly
R.20 and R.21 Prefer unique_ptr over shared_ptr unless you need to share ownership
Ideally you should have a factory pattern to create and delete frame objects. The constructor of frame must not manage the map. It's like citizens managing their own passports (ideally the government does that for citizens). Have a manager class that keeps the frames in a map (std::map<int,frame>). A method provided by the manager class would create the object:
class FrameManager
{
std::map<int,frame> Frames;
public:
frame CreateFrame(int id)
{
frame new_frame(id);
Frames.emplace(id, new_frame); // operator[] would require a default constructor
return new_frame;
}
};
What about removals? Well, depending on your design and requirement, you can have either/both of:
Removal of frames by the destructor of the frame. It may call FrameManager::Remove(id);
Allow removal also by FrameManager only (FrameManager::Remove(id)). I'd choose this (see more below).
Now, note that with this approach many frame objects will get created (the local, the copy in the map, the return value, etc.). You may use shared_ptr<frame> as the return type of CreateFrame, and keep shared_ptr as the mapped type: map<int, shared_ptr<frame>>.
It may seem complicated to use shared/unique_ptr but they become really helpful. You don't need to manage the lifetime. You may pass same shared_ptr<frame> to multiple functions, without creating frame object multiple times.
Modified version:
class FrameManager
{
std::map<int, shared_ptr<frame>> Frames;
public:
shared_ptr<frame> CreateFrame(int id)
{
if(Frames.count(id)>0) return nullptr; // Caller can check shared_ptr state
shared_ptr<frame> new_frame = make_shared<frame>(id);
Frames[id] = new_frame;
return new_frame;
}
};
Using a frame-manager will allow complete control over frames creation, better error handling, safe/secure code, safe multithreaded code.
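For completeness, here is a sketch of the manager with the removal option preferred above (removal through the manager only). The frame struct is a placeholder, and Find and Remove are assumed names that the original code does not spell out.

```cpp
#include <map>
#include <memory>

struct frame {
    explicit frame(int id_) : id(id_) {}
    int id;
};

class FrameManager {
public:
    // Returns nullptr if the id is already taken.
    std::shared_ptr<frame> CreateFrame(int id) {
        if (frames_.count(id) > 0) return nullptr;
        auto created = std::make_shared<frame>(id);
        frames_.emplace(id, created);
        return created;
    }

    // Returns the registered frame, or nullptr if there is none.
    std::shared_ptr<frame> Find(int id) const {
        auto it = frames_.find(id);
        if (it == frames_.end()) return nullptr;
        return it->second;
    }

    // Drops the manager's reference; returns false if the id was unknown.
    bool Remove(int id) { return frames_.erase(id) > 0; }

private:
    std::map<int, std::shared_ptr<frame>> frames_;
};
```

Note that Remove only releases the manager's reference: a caller still holding the shared_ptr keeps the frame alive until that copy goes away too.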
I would like to make a few points based on the code you posted.
How will someone know about frame_globals::global_frameMap and what it does without looking at the source code?
If it is a global object, the frame class logic can be bypassed and the global object can be modified from outside.
What if someone wants to manage frame objects in a different way, or a requirement changes in the future? For example, if you want to allow duplicate frames, you have to change the frame class.
The frame class holds not only frame-related data members and member functions but also management logic. That does not follow the single responsibility principle.
What about the copy/move constructors? std::map/std::set will place requirements on the constructors of the frame class.
I think the frame itself and the logic that manages frames need to be separated. I have the following solution (I am not aware of all use cases; my solution is based on the posted code).
frame class
frame class should not have any logic which is not related to frame.
If frame object need to know about other frame objects then a reference of ContainerWrapper should be provided (not directly std::map or std::set, so that change will not impact frame)
ContainerWrapper class
ContainerWrapper should manage all the frame objects.
It can have whatever interface is required, for example a try_emplace that returns an iterator to the inserted element (similar to std::map::try_emplace). Rather than first creating a frame object and then trying to insert it into the std::map, it can check whether the id (key) exists and only create the frame object if the std::map doesn't yet hold an object with that id.
About std::map/std::set,
std::set can be used if a frame object does not need to be modified once created, because std::set will not let you modify a frame without taking it out and re-inserting it, unless you store smart pointers.
This way a container object can be shared across multiple frame objects and doesn't have to be a global object.
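A possible shape for such a ContainerWrapper, assuming C++17 for std::map::try_emplace. The frame members and the pointer-plus-flag return type are choices made for this sketch, not something the design above prescribes.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>

struct frame {
    frame(int id_, std::string title_) : id(id_), title(std::move(title_)) {}
    int id;
    std::string title;
};

// Hypothetical wrapper: owns all frames and hides the container type.
class ContainerWrapper {
public:
    // Constructs the frame only if the id is free (std::map::try_emplace, C++17).
    // Returns the stored frame and whether it was newly created.
    std::pair<frame*, bool> try_emplace(int id, std::string title) {
        auto result = frames_.try_emplace(id, id, std::move(title));
        return {&result.first->second, result.second};
    }

    std::size_t size() const { return frames_.size(); }

private:
    std::map<int, frame> frames_;
};
```

Since frame never sees std::map, the wrapper could switch to another container later without changing frame or its users.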

Storing a list/map of object types in a class in C++11/C++14

I am writing a template for a C++ class (a registry) that has methods like Create and Delete, which instantiate and store shared pointers to objects, but the Create method returns a reference to the created object rather than the shared pointer itself (the particular paradigm here being that no pointers, even smart pointers, are exposed in the public interface).
The object registry can deal with polymorphic types, in the sense that the registry is specialized for the base class and Create is a template function that can be specialized for any class derived from that base. It then returns a reference to the created object, typed as the derived class. The class also has an ID system, so any object can also be referred to by ID.
I require a Get method of type auto that can return the object (given its ID) as the same type it was created with. Obviously the objects are stored as a list of shared pointers to the base class, so this requires a dynamic_cast.
However, I cannot think of a way of storing the original object type when it is created. I need something akin to a std::map<[object ID], [object type]> stored as a member variable for the registry.
I've considered concatenating std::tuples but adding a new object changes its type, so it can't be stored as a member of the registry. I've also considered tricks of having a typedef within a new class that inherits from a virtual base class, so it can be stored in a list of pointers to the base class, but then using dynamic_cast to access the derived class requires knowing the object type in the first place.
Making a member list of std::functions that call another function (instantiated during Create) also won't work because the return types are different and auto cannot be used within std::function. I've also tried various tricks with variadic templates.
All solutions on SO I've seen are unsuitable because these are two methods (Create and Get) being called wrt the same class, so the information needs to be contained in the particular instance of the class itself.
Is this task impossible?
It's not impossible; but you made it impossible.
The system you're asking for doesn't require a lot of technicalities apart from using templates for the Get function. Let's break it down:
You want to create a system whereby you can instantiate (e.g. Create) classes that are of an appropriate 'base' and then store them in an associative-container, in which case you chose map.
Your map is defined thus:
std::map<[object ID], [object type]> m_map;
Now, given this information, why, might I ask, would you want to return a reference to the object? What's more, your Create function can be simplified a lot, to something like this:
void System::create(int id, Base *b)
{
m_map.emplace(id, b); // Assuming object ID is of type int
}
If you have your create function implemented thus, then the following is permissible:
class Child : public Base
{
public:
Child();
Child(const std::string &name);
virtual ~Child();
};
int main()
{
System s;
s.create(1, new Child("Roger"));
}
You are probably not interested in using the manual approach of creating objects, but something more automated. Without introducing new technical measures to our infant System class:
static Child *create(const std::string &name)
{
return new Child(name);
}
Which allows the following usage:
s.create(2, Child::create("William"));
You want to be able to retrieve classes of a derived type based on such. Sans the pun, there's no need to create a highly specialised auto function. You know the type you want to get ''at compile time'', and auto and decltype are also resolved at compile time, so they cannot recover a type that is only known at run time. Assuming you know what type you want, our function is much easier:
template<typename T>
T Get(int id) // T is a pointer type, e.g. Child*
{
auto i = m_map.find(id);
if (i != m_map.end())
return dynamic_cast<T>(i->second);
else return nullptr;
}
Which now allows the following usage, continuing our int main()..
class Children : public Base
{
public:
Children();
virtual ~Children();
void add(Child *c);
};
int main()
{
System s;
s.create(1, Child::create("Roger"));
s.create(2, Child::create("William"));
s.create(3, new Children());
s.Get<Children*>(3)->add(s.Get<Child*>(2)); // Add william to group
return 0;
}
The advantage is that you now have a system that is able to deal with many objects that derive from ''Base'' without having to know which objects actually derive from it! This makes our System class very versatile and extensible. It also means that any object-creation methods are the responsibility of the classes derived from ''Base'', e.g. Child and Children in our case. For the latter we did not implement an object-factory method because it was not practical at this time.
You want to delete an object from your registry, thus:
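void System::remove(int id) // "delete" is a reserved word and cannot name a function
{
auto i = m_map.find(id);
if (i == m_map.end()) return;
delete i->second; // the map stores raw pointers, so erasing alone would leak the object
m_map.erase(i);
}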
Now we have a pretty functional registry system that can serve any class. It's important that these registries aren't abused to serve ''too'' generic types. It's better to stratify which family of classes warrants their own registry.
Things to take into account:
When you add objects to your map, their pointers are automatically converted to pointers to the Base type, but because of polymorphism each pointer still refers to the complete derived object with its own set of values and functionality. This is why it's possible to dynamically cast it back to the derived type. It's in fact a lot better to refer to objects outside the system through ids (handles) rather than through pointers or references to the objects themselves.
Please note, I'm using raw pointers for this example. A map only manages the memory of the pointers it stores, not of the objects they point to, so with raw pointers you have to delete the objects yourself. If you store std::unique_ptr or std::shared_ptr values instead, the objects are released when their entries are erased. It's a matter of style, but also a highly controversial one. Valid objections.
Also, very important:
Consider using std::unordered_map if your system mostly retrieves objects through the Get function. The reason is simple: std::map keeps its keys sorted and finds an element in O(log n) comparisons, whereas std::unordered_map hashes the key and finds the element in O(1) on average. For this reason: use std::unordered_map when you know you're mainly going to retrieve values/objects by key, and use std::map when you need to iterate over them in key order.
The usual way to do this sort of thing is to make the Get method a template method, something like:
class Registry {
public:
template <class T> T &Get(id_t id) {
// ... fetch the smart pointer (ptr) from the registry
return dynamic_cast<T &>(*ptr); }
};
This requires the caller of Get to know what type of object it is getting (and will throw a std::bad_cast if it gets it wrong):
auto &obj = registry->Get<DerivedType>(id);
However, this approach exposes references in the interface, which are really pointers, which you say you want to avoid.
If you really want to avoid exposing all pointers, you need to provide a way of manipulating objects in the registry using only their ids. One way to do this is to create a DerivedTypeManipulator singleton for every derived type you store in the registry, which exposes all the operations on the derived type, but via an id rather than a pointer or reference.
This doesn't really solve the problem of needing to know the derived type in code that needs to do anything specific to a derived type, however.
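To make the template-Get approach concrete, here is a self-contained sketch. Base, Circle and Square are placeholder types, and Create, Get and Delete follow the naming in the question; the rest is one possible arrangement, not the only one.

```cpp
#include <map>
#include <memory>
#include <typeinfo>
#include <utility>

struct Base { virtual ~Base() = default; };
struct Circle : Base { double radius = 1.0; };
struct Square : Base { double side = 2.0; };

// Hypothetical registry: stores shared_ptr<Base> internally, exposes references.
class Registry {
public:
    using id_type = int;

    // Creates a T under the given id (replacing any previous entry).
    template <class T, class... Args>
    T& Create(id_type id, Args&&... args) {
        auto object = std::make_shared<T>(std::forward<Args>(args)...);
        objects_[id] = object;
        return *object;
    }

    // Throws std::out_of_range for an unknown id and std::bad_cast if the
    // stored object is not a T.
    template <class T>
    T& Get(id_type id) {
        return dynamic_cast<T&>(*objects_.at(id));
    }

    void Delete(id_type id) { objects_.erase(id); }

private:
    std::map<id_type, std::shared_ptr<Base>> objects_;
};
```

The caller still has to name the derived type, as in registry.Get<Circle>(1); the registry only checks at run time that the name was right.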

Which containers to store objects for access via different identifiers?

I have to access my objects (multiple instances from one class) via several different identifiers and don't know which is the best way to store the mapping from identifier to object.
I act as a kind of "connector" between two worlds and each has its own identifiers I have to use / support.
If possible I'd like to prevent using pointers.
The first idea was to put the objects in a list/vector and then create a map for each type of identifier. Soon I had to realize that the standard containers don't support storing references.
The next idea was to keep the objects inside the vector and just put the index in the map. The problem here is that I didn't find an index_of for vector and storing the index inside the object only works as long as nobody uses insert or erase.
The only identifier I have when creating the objects is a string, and for performance reasons I don't want to use this string as the key of a map.
Is this a problem solved best with pointers or does anybody have an idea how to deal with it?
Thanks
Using pointers seems reasonable. Here's a suggested API that you could implement:
class WidgetDatabase {
public:
// Returns true if widget was inserted.
// If there is a Widget in *this with the same name and/or id,
// widget is not inserted.
bool Insert(const std::string& name, int id, const Widget& widget);
// Caller does NOT own returned pointer (do not delete it!).
// null is returned if there is no such Widget.
const Widget* GetByName(const std::string& name) const;
const Widget* GetById(int id) const;
private:
std::set<Widget> widgets_;
// Elements of a std::set are const, so the index maps hold const pointers.
std::map<std::string, const Widget*> widgets_by_name_;
std::map<int, const Widget*> widgets_by_id_;
};
I think this should be pretty straightforward to implement. You just need to make sure to maintain the following invariant:
w is in widgets_ iff a pointer to it is in widgets_by_*
I think the main pitfall that you'll encounter is making sure that name and id are not already in widgets_by_* when Insert is called.
It should be easy to make this thread safe; just throw in a mutex member variable and some local lock_guards. Optionally, use a std::shared_mutex with std::shared_lock in the Get* methods (C++17) to avoid contention; this will be especially helpful if your use case involves more reading than writing.
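One way the API above might be implemented, without the locking. As a simplification this sketch stores the widgets in a std::list (whose elements never move) instead of the std::set in the declaration, and Widget is a placeholder struct.

```cpp
#include <list>
#include <map>
#include <string>

struct Widget { double mass_kg = 0.0; };

class WidgetDatabase {
public:
    // Returns true if the widget was inserted; refuses duplicates of name or id.
    bool Insert(const std::string& name, int id, const Widget& widget) {
        if (by_name_.count(name) > 0 || by_id_.count(id) > 0) return false;
        widgets_.push_back(widget);
        const Widget* stored = &widgets_.back();  // stable: list nodes never move
        by_name_[name] = stored;
        by_id_[id] = stored;
        return true;
    }

    // Caller does NOT own the returned pointer; nullptr means "no such widget".
    const Widget* GetByName(const std::string& name) const {
        auto it = by_name_.find(name);
        return it == by_name_.end() ? nullptr : it->second;
    }
    const Widget* GetById(int id) const {
        auto it = by_id_.find(id);
        return it == by_id_.end() ? nullptr : it->second;
    }

private:
    std::list<Widget> widgets_;
    std::map<std::string, const Widget*> by_name_;
    std::map<int, const Widget*> by_id_;
};
```

Checking both indexes before touching any container keeps the invariant intact: either the widget is in all three containers or in none.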
Have you considered an in-memory SQLite database? SQL gives you many ways of accessing the same data. For example, your schema might look like this:
CREATE TABLE Widgets (
-- Different ways of referring to the same thing.
name TEXT,
id INTEGER,
-- Non-identifying characteristics.
mass_kg REAL,
length_m REAL,
cost_cents INTEGER,
hue INTEGER
);
Then you can query using different identifiers:
SELECT mass_kg from Widgets where name = $name
or
SELECT mass_kg from Widgets where id = $id
Of course, SQL allows you to do much more than this. This will allow you to easily extend your library's functionality in the future.
Another advantage is that SQL is declarative, which usually makes it more concise and readable.
In recent versions, SQLite supports concurrent access to the same database. The concurrency model has gotten stronger over time, so you'll have to make sure you understand the model that is offered by the version that you're using. The latest version of the docs can be found on sqlite's website.

Using the address of a member variable as an ID

I'm trying to avoid declaring enums or using strings. Although the rationale to do so may seem dubious, the full explanation is irrelevant.
My question is fairly simple. Can I use the address of a member variable as a unique ID?
More specifically, the requirements are:
The ID won't have to be serialised.
IDs will be protected members - only to be used internally by the owning object (there is no comparison of IDs even between same class instances).
Subclasses need access to base class IDs and may add their new IDs.
So the first solution is this:
class SomeClass
{
public:
int mBlacks;
void AddBlack( int aAge )
{
// Can &mBlacks be treated as a unique ID?
// Will this always work?
// Is void* the right type?
void *iId = &mBlacks;
// Do something with iId and aAge
// Like push a struct of both to a vector.
}
};
While the second solution is this:
class SomeClass
{
public:
static int const *GetBlacksId()
{
static const int dummy = 0;
return &dummy;
}
void AddBlack( int aAge )
{
// Do something with GetBlacksId and aAge
// Like push a struct of both to a vector.
}
};
No other int data member of this object, and no mBlacks member of a different instance of SomeClass in the same process, has the same address as the mBlacks member of this instance of SomeClass. So you're safe to use it as a unique ID within the process.
An empty base class subobject of SomeClass could have the same address as mBlacks (if SomeClass had any empty base classes, which it doesn't), and the char object that's the first byte of mBlacks has the same address as mBlacks. Aside from that, no other object has the same address.
void* will work as the type. int* will work too, but maybe you want to use data members with different types for different ids.
However, the ID is unique to this instance. A different instance of the same type has a different ID. One of your comments suggests that this isn't actually what you want.
If you want each value of the type to have a unique ID, and for all objects that have the same value to have the same ID, then you'd be better off composing the ID from all of the significant fields of the object. Or just compare objects for equality instead of their IDs, with a suitable operator== and operator!=.
Alternatively if you want the ID to uniquely identify when a value was first constructed other than by copy constructors and copy assignment (so that all objects that are copies of the same "original" share an ID), then the way to do that would be to assign a new unique ID in all the other constructors, store it in a data member, and copy it in the copy constructor and copy assignment operator.
The canonical way to get a new ID is to have a global[*] counter that you increment each time you take a value. This may need to be made thread-safe depending what programs use the class (and how they use it). Values then will be unique within a given run of the program, provided that the counter is of a large enough type.
Another way is to generate a 128 bit random number. It's not theoretically satisfying, but assuming a decent source of randomness the chance of a collision is no larger than the chance of your program failing for some unavoidable reason like cosmic ray-induced data corruption. Random IDs are easier than sequential IDs when the sources of objects are widely distributed (for example if you need IDs that are unique across different processes or different machines). You can if you choose use some combination of the MAC address of the machine, a random number, the time, a per-process global[*] counter, the PID and anything else you think of and lay your hands on (or a standard UUID). But this might be overkill for your needs.
[*] needn't strictly be global - it can be a private static data member of the class, or a static local variable of a function.
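The "copies share the ID of their original" variant described above might look like this. Value and payload are made-up names, and the counter is deliberately the simple, non-thread-safe version.

```cpp
#include <cstdint>

// Hypothetical value type: a fresh id on original construction, shared by copies.
class Value {
public:
    explicit Value(int payload) : payload_(payload), id_(next_id()) {}
    // The implicitly generated copy constructor and copy assignment copy id_,
    // so every copy carries the id of the original it came from.

    std::uint64_t id() const { return id_; }
    int payload() const { return payload_; }

private:
    static std::uint64_t next_id() {
        // Static local counter; not thread-safe as written (use std::atomic
        // if objects are constructed from several threads).
        static std::uint64_t counter = 0;
        return counter++;
    }

    int payload_;
    std::uint64_t id_;
};
```

Two objects built separately with the same payload still get different ids, which is exactly the difference between this scheme and composing the ID from the object's fields.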

Static Pointer to Dynamically allocated array

So the question is relatively straightforward. I have several semi-large lookup tables, ~500 KB apiece. These exact same tables are used by several class instances (maybe lots), so I don't want to store the same tables in each instance. I can either dump the entire tables onto the stack as 'static' members, or I can have 'static' pointers to these tables. In either case the constructor of the class will check whether they are initialized and do so if not. My question is: if I choose the static pointers to the tables (so as not to abuse the stack space), what is a good method for cleaning these up appropriately?
Also note that I have considered using boost::shared_ptr, but opted not to; this is a very small project and I am not looking to add any dependencies.
Thanks
Static members will never be allocated on the stack. When you declare them (which of course, you do explicitly), they're assigned space somewhere (a data segment?).
If it makes sense that the lookup tables are members of the class, then make them static members!
When a class is instanced on the stack, the static member variables don't form part of the stack cost.
If, for instance, you want:
class MyClass {
...
static int LookUpTable[LARGENUM];
};
int MyClass::LookUpTable[LARGENUM];
When you instance MyClass on the stack, MyClass::LookUpTable refers to the object that you've explicitly defined on the last line of the code sample above. Best of all, there's no need to deallocate it, since it's essentially a global variable; it can't leak, since it's not on the heap.
If you don't free the memory for the tables at all, then when your program exits the OS will automatically throw away all memory allocated by your application. This is an appropriate strategy for handling memory that is allocated only once by your application.
Leaving the memory alone can actually improve performance too, because you won't waste time on shutdown trying to explicitly free everything and therefore possibly force a page in for all the memory you allocated. Just let the OS do it when you exit.
If these are lookup tables, the easiest solution is just to use std::vector:
class SomeClass {
/* ... */
static std::vector<element_type> static_data;
};
To initialize, you can do:
static_data.resize(numberOfElements);
// now initialize the contents
With this you can still do array-like access, as in:
SomeClass::static_data[42].foo();
And with any decent compiler, this should be as fast as a pointer to a native array.
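A sketch of how the shared table could be built once and reused by every instance. It wraps the vector in a function-local static rather than a static data member, so no separate initialization check is needed; the element type, table size and contents are invented for the example.

```cpp
#include <cstddef>
#include <vector>

class SomeClass {
public:
    // Shared by all instances and built on first use. Initialization of a
    // function-local static is thread-safe since C++11.
    static const std::vector<int>& table() {
        static const std::vector<int> data = build_table();
        return data;
    }

    int lookup(std::size_t index) const { return table()[index]; }

private:
    static std::vector<int> build_table() {
        std::vector<int> data(1024);              // hypothetical size
        for (std::size_t i = 0; i < data.size(); ++i) {
            data[i] = static_cast<int>(i * i);    // hypothetical contents
        }
        return data;
    }
};
```

The vector's storage lives on the heap and is released automatically at program exit, so no manual cleanup code is required.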
Why don't you create a singleton class that manages the lookup tables? As it seems they need to be accessed by a number of classes; make the singleton the manager of the lookup tables accessible at global scope. Then all the classes can use the singleton getters/setters to manipulate the lookup tables. There are 3 advantages to this approach:-
If the static container for the lookup tables becomes large, the default stack size (1 MB on Windows) may lead to a stack overflow at application start-up. Use a container that allocates dynamically.
If you plan to access the table via multiple-threads, the singleton class can be extended to accompany locked access.
You can also cleanup in the dtor of singleton during application exit.
I can think of several ways to approach this, depending on what you are trying to accomplish.
If the data is static and fixed, a global static array initialized within the code would be a good approach. Everything is contained in the code and loaded when the program starts, so it is available. All of the classes which need access can then access the information.
If the data is not static and needs to be read in, a static STL structure such as a vector, list or map would be good, as it can grow as you add elements. Some of these classes provide lookup methods as well. Depending on the data you are looking up, you may have to provide a structure and some operators for the STL structures to work correctly.
In either case, you might want to make a static global class to read and contain the data. It can take care of managing initialization and access to the data. You can use private members to indicate whether the data has been read in and is available for use. If it has not, the class might be able to do the initialization by itself if it has enough information. The other classes can call static functions of the static global class to access the data. This provides encapsulation of the data, which can then be used by several different classes without those classes needing to incorporate the large lookup table.
There are several possibilities with various advantages and disadvantages. I don't know what the table contains, so I'll call it an Entry.
If you just want to be sure the memory goes away when the program exits, use a global smart pointer. Note that auto_ptr is not suitable for arrays (it releases with plain delete) and was removed in C++17, so prefer std::unique_ptr<Entry[]>:
std::unique_ptr<Entry[]> pTable;
You can initialize it whenever you like, and it will automatically be deleted when the program exits. Unfortunately, it will pollute the global namespace.
It sounds like you are using the same table within multiple instances of the same class. In this case, it is usual to make it a static pointer of that class:
class MyClass {
...
protected:
static std::unique_ptr<Entry[]> pTable;
};
If you want it to be accessible in instances of different classes, then you might make it a static local variable of a function. It will also be deleted when the program exits, but the really nice thing is that it won't be initialized until the function is entered. I.e., the resource won't need to be allocated if the function is never called:
Entry* getTable() {
static std::unique_ptr<Entry[]> pTable(new Entry[ gNumEntries ]);
return pTable.get();
}
You can do any of these with std::vector<Entry> rather than std::unique_ptr<Entry[]>, if you prefer; the main advantage of that is that it can more easily be dynamically resized. That might not be something you value.