Counter and ID attribute of class stored in vector - c++

I have a dilemma regarding the code of my database-removal function.
Whenever I remove a database from the vector (each database has a unique ID), I can't work out how to close the gap left by the removed number. For example, if I remove the database with ID 3, I want the IDs of the later databases to shift down so the sequence stays contiguous, i.e. the database with ID 4 would become ID 3.
I also don't know how to decrement my static int counter.
File:
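void Database::Rem()
{
    int idToRemove;
    std::cin >> idToRemove;
    // Look up the entry whose ID matches the number the user typed (needs <algorithm>).
    auto iter = std::find_if(DbMain.begin(), DbMain.end(),
                             [idToRemove](const Base &b) { return b.ID == idToRemove; });
    if (iter != DbMain.end())
    {
        DbMain.erase(iter);
    }
}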
std::istream &operator>>(std::istream &re, Base &product)
{
}
}
std::ostream &printnames(std::ostream &pr, Base &pro)
{
pr << "\nID:" << pro.ID << "\nName:" << pro.name;
return pr;
}
Header file:

The thing you're doing here is a design anti-pattern: a structural idea that's easy to come up with in a lot of situations, but that causes a lot of trouble. This one is called a singleton: you're assuming there will only ever be one DbMain, so you store its length in a global-like static member. That makes no sense! Simply use the length of DbMain. You should never need a global or a static member to count objects that you store centrally anyway.
You never actually need the ID to be part of the object you store; its whole purpose is to be the index within the DbMain storage. A vector is already ordered! So, instead of printing your .ID when you iterate through the vector, simply print the position in the vector. Instead of "find the base with ID == N and erase" you could do "erase the (DbMain.begin() + N) element" and be done with it.
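A minimal sketch of that idea, assuming a stripped-down Base that only has a name; entryAt, removeAt and printAll are hypothetical helper names, not part of the question's code:

```cpp
#include <cassert>
#include <cstddef>
#include <iostream>
#include <string>
#include <vector>

// Hypothetical stand-in for the question's Base class, with no stored ID.
struct Base {
    std::string name;
};

// The "ID" of an entry is just its position, so lookup is plain indexing.
const Base &entryAt(const std::vector<Base> &db, std::size_t id) {
    assert(id < db.size());  // caller must pass a position that exists
    return db[id];
}

// Erase the entry at position `id`; later entries shift down by one automatically.
bool removeAt(std::vector<Base> &db, std::size_t id) {
    if (id >= db.size()) {
        return false;  // nothing stored at that position
    }
    db.erase(db.begin() + static_cast<std::ptrdiff_t>(id));
    return true;
}

// Print every entry, using the position in the vector as the ID.
void printAll(const std::vector<Base> &db, std::ostream &out) {
    for (std::size_t i = 0; i < db.size(); ++i) {
        out << "\nID:" << i << "\nName:" << db[i].name;
    }
}
```

With this layout there is no counter to decrement: db.size() is the count, and the numbering closes up by itself after every erase.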

The problem in your design is that you seem somehow to associate the unique ID with the index in the vector, as you initialize the IDs with a static counter. This is not a good idea, since:
You would need to renumber a lot of IDs at each delete, and this would make it very difficult to maintain cross-references.
You could have several databases, each with its own set of IDs.
There is a risk that the counter would not end up at a proper value if you add bases without reading them.
Moreover, iterating sequentially to look for an ID is not very efficient. A more efficient approach would be to store your bases in a std::map: this lets you find IDs very efficiently, indexing by ID instead of by sequential position.
Your only concern would then be to ensure uniqueness of IDs. Of course, your counter approach could work, if you make sure that it is updated whenever a new base is created, and that its state is persisted with the database in the text file in which you'll save all this. You just have to make clear that the IDs are not guaranteed to be sequential. A pragmatic way to contribute to this understanding is to issue an error message if there is an attempt to delete a record that was not found.
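A rough sketch of the std::map approach, under the assumption that IDs are plain ints handed out by a per-store counter and never reused. RecordStore and its member names are made up for illustration, and persisting the counter to the text file is left out:

```cpp
#include <cassert>
#include <iostream>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for the question's Base class.
struct Base {
    std::string name;
};

class RecordStore {
public:
    // Insert a record under a fresh ID; IDs are never reused or renumbered.
    int add(Base record) {
        const int id = nextId_++;
        const bool inserted = records_.emplace(id, std::move(record)).second;
        assert(inserted);  // a fresh ID can never collide with an existing key
        (void)inserted;    // avoids an unused-variable warning when asserts are off
        return id;
    }

    // Erase by ID; reports an error instead of silently ignoring unknown IDs.
    bool remove(int id) {
        if (records_.erase(id) == 0) {
            std::cerr << "No record with ID " << id << '\n';
            return false;
        }
        return true;
    }

    // Lookup is O(log n); returns nullptr when the ID is unknown.
    const Base *find(int id) const {
        auto it = records_.find(id);
        return it == records_.end() ? nullptr : &it->second;
    }

private:
    std::map<int, Base> records_;
    int nextId_ = 1;  // would have to be saved and restored together with the data
};
```

Because the counter lives inside the store rather than in a static member, several stores can coexist, each with its own set of IDs.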

Related

C++ custom iterator over container based on items property

I have a container of Employees with fields std::string name and bool is_manager. I want to iterate over container for regular and manager employees. The number of items in a container can be very large so I do not want to do a linear scan with checking is_manager property. Also the number of managers is very small i.e. 10 out of 100000, so doing a full scan over the container is ineffective. So I want to pre-cache the memory addresses of Employees with and without is_manager==true and have RegularEmployeeIterator and ManagerEmployeeIterator, I think this can be pre-cached and organized as a list/vector of pointers?
And I want to able to sort the container of Employees by name field and retain the ability to iterate over regulars and managers.
How to implement that in C++? Specifically, I have no idea how iterators are implemented in C++, how to define several of them for a single collection, based on property value, does my idea with caching the addresses works, etc.
organized as a list/vector of pointers?
Whatever the problem, a list is not the data structure you need.
1. Unrealistic answer for a beginner, but you claim it's the problem you're solving
ok actually I have tens of millions entries
I mean, this is very clearly a learning exercise, and you insisting that your data is tens of millions of entries large is... only moderately helpful, because that's the point where, if access times are important, you stop storing the composite object in one container:
std::vector<Employee> employees; //10⁷ employees
but would group the data according to the properties you're going to work on at the same time:
std::vector<bool> bossiness; //10⁷ bits – std::vector<bool> has an optimization!
std::vector<std::string> names; //10⁷ std::strings
and as a matter of fact, if you know your data doesn't change, you wouldn't even do that, because the names vector is a dereferencing nightmare that wastes a lot of memory on redundant information, if you could as well just go
std::vector<bool> bossiness; //10⁷ bits – std::vector<bool> has an optimization!
std::string all_names; // a **very** long string containing all names, one after the other
std::vector<size_t> name_begins; // 10⁷ name beginnings plus one end offset; all_names.substr(name_begins[i], name_begins[i+1] - name_begins[i]) gives you the i-th name (substr takes a position and a length)
Now, to speed up looking for bosses, you just start by making a run-length encoded list of the 64-bit regions in your bossiness vector where at least one bit is set. You could do elegant k-d trees if your problem becomes multidimensional, but at the sparsity you have, run-length encoding on machine-word sizes will probably still beat the hell out of that.
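To make the packed-names layout concrete, here is a small sketch. PackedNames and its members add, size and name are hypothetical names; the only idea taken from above is one long string plus a vector of start offsets, with one extra offset marking the end of the last name:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct PackedNames {
    std::string all_names;                    // every name, back to back
    std::vector<std::size_t> name_begins{0};  // start offsets plus one end sentinel

    void add(const std::string &name) {
        all_names += name;
        name_begins.push_back(all_names.size());  // where the next name will start
    }

    std::size_t size() const { return name_begins.size() - 1; }

    // substr takes (position, length), so the length is the difference of two offsets.
    std::string name(std::size_t i) const {
        assert(i < size());
        return all_names.substr(name_begins[i], name_begins[i + 1] - name_begins[i]);
    }
};
```

The sentinel offset means the last name needs no special case.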
But that's an optimization level you need when writing a database system or a 3D game with millions of vertices. You're learning C++. You're not writing these kinds of things, so:
2. Realistic answer that you didn't want when offered in the question
i.e. 10 out of 100000
so, let's really go with a problem size of 10⁵. I.e., a small problem.
You need to do your Employee vector, and add a bosses vector:
std::vector<Employee> employees;
std::vector<size_t> boss_indices;
Then you need to do your linear search once:
// If you know a safe and not too outlandish upper bound for the fraction of managers,
// reserve that memory once to avoid resizing the vector while filling it, as that's very expensive:
boss_indices.reserve(size_t(employees.size() * fraction_of_managers));
for(size_t idx = 0; idx < employees.size(); ++idx) {
    if(employees[idx].is_manager) {
        boss_indices.push_back(idx);
    }
}
Congratulations, you now have an easy-to-use vector of indices. Indices into a std::vector are just as good as pointers to elements (it's a simple pointer dereference both ways, and the additional offset is usually merged into the dereference operation on any modern CPU I know), but they survive the target vector being moved or reallocated.
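Put together as a self-contained sketch (the Employee struct follows the question's description; collect_boss_indices and manager_names are hypothetical helper names):

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct Employee {
    std::string name;
    bool is_manager;
};

// One linear pass that records the positions of all managers.
std::vector<std::size_t> collect_boss_indices(const std::vector<Employee> &employees) {
    std::vector<std::size_t> boss_indices;
    for (std::size_t idx = 0; idx < employees.size(); ++idx) {
        if (employees[idx].is_manager) {
            boss_indices.push_back(idx);
        }
    }
    return boss_indices;
}

// Visit managers only, without touching any of the other employees.
std::vector<std::string> manager_names(const std::vector<Employee> &employees,
                                       const std::vector<std::size_t> &boss_indices) {
    std::vector<std::string> names;
    names.reserve(boss_indices.size());
    for (std::size_t idx : boss_indices) {
        assert(idx < employees.size());  // indices go stale if elements are erased or reordered
        names.push_back(employees[idx].name);
    }
    return names;
}
```

The index list stays valid when the employees vector reallocates, but it has to be rebuilt after anything that changes positions, such as sorting or erasing.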
And I want to able to sort the container of Employees by name field and retain the ability to iterate over regulars and managers.
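Have a class
#include <algorithm>
#include <set>
#include <string>
struct SortableEmployee {
    const Employee* empl;
    // Order by name; equivalent to empl->name < other.empl->name.
    bool operator <(const SortableEmployee& other) const {
        return std::lexicographical_compare(
            empl->name.cbegin(), empl->name.cend(),
            other.empl->name.cbegin(), other.empl->name.cend());
    }
    // Takes a pointer-to-const so that &individual from a const auto& loop is accepted.
    SortableEmployee(const Employee* underlying) : empl(underlying) {
    }
};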
and put it in a std::multiset (a plain std::set would silently drop employees who share a name) to get a sorted version that you can iterate through:
std::multiset<SortableEmployee> namebook;
for(const auto& individual : employees) {
namebook.emplace(&individual);
}
You can then iterate through it linearly as well (std::format needs C++20 and <format>; std::cout needs <iostream>):
for(const auto& sorted_empl : namebook) {
std::cout << std::format("{}: is {}a manager\n",
sorted_empl.empl->name,
sorted_empl.empl->is_manager ? "" : "not ");
}

C++11 unordered_map time complexity

I'm trying to figure out the best way to do a cache for resources. I am mainly looking for native C/C++/C++11 solutions (i.e. I don't have boost and the likes as an option).
What I am doing when retrieving from the cache is something like this:
Object *ResourceManager::object_named(const char *name) {
if (_object_cache.find(name) == _object_cache.end()) {
_object_cache[name] = new Object();
}
return _object_cache[name];
}
Where _object_cache is defined something like: std::unordered_map <std::string, Object *> _object_cache;
What I am wondering is about the time complexity of doing this, does find trigger a linear-time search or is it done as some kind of a look-up operation?
I mean if I do _object_cache["something"]; on the given example it will either return the object or if it doesn't exist it will call the default constructor inserting an object which is not what I want. I find this a bit counter-intuitive, I would have expected it to report in some way (returning nullptr for example) that a value for the key couldn't be retrieved, not second-guess what I wanted.
But again, if I do a find on the key, does it trigger a big search which in fact will run in linear time (since the key will not be found it will look at every key)?
Is this a good way to do it, or does anyone have some suggestions, perhaps it's possible to use a look up or something to know if the key is available or not, I may access often and if it is the case that some time is spent searching I would like to eliminate it or at least do it as fast as possible.
Thankful for any input on this.
The value-initialization triggered by _object_cache["something"] is what you want: a value-initialized pointer (e.g. Object *) is a null pointer (8.5p6b1, footnote 103).
So:
auto &ptr = _object_cache[name];
if (!ptr) ptr = new Object;
return ptr;
You use a reference into the unordered map (auto &ptr) as your local variable so that you assign into the map and set your return value in the same operation. In C++03 or if you want to be explicit, write Object *&ptr (a reference to a pointer).
Note that you should probably be using unique_ptr rather than a raw pointer to ensure that your cache manages ownership.
By the way, find has the same performance as operator[]: constant on average, linear in the worst case (which only happens when the keys all end up in the same bucket, e.g. because every key in the unordered map has the same hash).
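Combining the operator[] trick with the unique_ptr suggestion could look roughly like this. Object is an empty placeholder here, and the size() accessor is added only for illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <unordered_map>

struct Object {};  // hypothetical placeholder for the cached resource type

class ResourceManager {
public:
    // Returns the cached object for `name`, creating it on first use.
    Object *object_named(const std::string &name) {
        std::unique_ptr<Object> &slot = _object_cache[name];  // null on first access
        if (!slot) {
            slot.reset(new Object());  // C++11-compatible; std::make_unique needs C++14
        }
        assert(slot);
        return slot.get();
    }

    std::size_t size() const { return _object_cache.size(); }

private:
    // The map owns the objects; callers only get non-owning raw pointers.
    std::unordered_map<std::string, std::unique_ptr<Object>> _object_cache;
};
```

Each call does a single hash lookup, and the objects are destroyed together with the cache.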
Here's how I'd write this:
auto it = _object_cache.find(name);
return it != _object_cache.end()
? it->second
: _object_cache.emplace(name, new Object).first->second;
The average complexity of find on a std::unordered_map is O(1) (constant), especially with std::string keys, whose hash function spreads keys well and leads to a very low collision rate. Even though the method is called find, it doesn't do the linear scan you were worried about.
If you want to do some kind of caching, this container is definitely a good start.
Note that a cache typically is not just a fast O(1) access but also a bounded data structure. The std::unordered_map will dynamically increase its size when more and more elements are added. When resources are limited (e.g. reading huge files from disk into memory), you want a bounded and fast data structure to improve the responsiveness of your system.
In contrast, a cache will use an eviction strategy whenever size() reaches capacity(), by replacing the least valuable element.
You can implement a cache on top of a std::unordered_map. The eviction strategy can then be implemented by redefining the insert() member. If you want to go for an N-way (for small and fixed N) associative cache (i.e. one item can replace at most N other items), you could use the bucket() interface to replace one of the bucket's entries.
For a fully associative cache (i.e. any item can replace any other item), you could use a Least Recently Used eviction strategy by adding a std::list as a secondary data structure:
using key_tracker_type = std::list<K>;
using key_to_value_type = std::unordered_map<
K,std::pair<V,typename key_tracker_type::iterator>
>;
By wrapping these two structures inside your cache class, you can define the insert() to trigger a replace when your capacity is full. When that happens, you pop_front() the Least Recently Used item and push_back() the current item into the list.
On Tim Day's blog there is an extensive example with full source code that implements the above cache data structure. Its implementation can also be done efficiently using Boost.Bimap or Boost.MultiIndex.
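For orientation, a compact sketch of such an LRU cache built from the two type aliases above. This is not the code from the blog post; LruCache, get, put and touch are names chosen for this example, and the front of the list is taken to be the least recently used key:

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>
#include <unordered_map>
#include <utility>

template <typename K, typename V>
class LruCache {
public:
    explicit LruCache(std::size_t capacity) : capacity_(capacity) {
        assert(capacity_ > 0);
    }

    // Returns a pointer to the cached value, or nullptr on a miss.
    // A hit marks the key as most recently used.
    V *get(const K &key) {
        auto it = map_.find(key);
        if (it == map_.end()) return nullptr;
        touch(it);
        return &it->second.first;
    }

    // Inserts or overwrites; evicts the least recently used entry when full.
    void put(const K &key, V value) {
        auto it = map_.find(key);
        if (it != map_.end()) {
            it->second.first = std::move(value);
            touch(it);
            return;
        }
        if (map_.size() == capacity_) {
            map_.erase(order_.front());  // front of the list = least recently used
            order_.pop_front();
        }
        order_.push_back(key);
        map_.emplace(key, std::make_pair(std::move(value), std::prev(order_.end())));
    }

    std::size_t size() const { return map_.size(); }

private:
    using key_tracker_type = std::list<K>;
    using key_to_value_type =
        std::unordered_map<K, std::pair<V, typename key_tracker_type::iterator>>;

    // Move the key to the back of the list (most recently used) in O(1).
    void touch(typename key_to_value_type::iterator it) {
        order_.splice(order_.end(), order_, it->second.second);
    }

    std::size_t capacity_;
    key_tracker_type order_;
    key_to_value_type map_;
};
```

std::list is used because splice moves a node without invalidating the iterator stored in the map, so both get and put stay O(1) on average.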
The insert/emplace interfaces to map/unordered_map are enough to do what you want: find the position, and insert if necessary. Since the mapped values here are pointers, ekatmur's response is ideal. If your values are fully-fledged objects in the map rather than pointers, you could use something like this:
Object& ResourceManager::object_named(const char *name, const Object& initialValue) {
return _object_cache.emplace(name, initialValue).first->second;
}
The values name and initialValue make up the arguments to the key-value pair that needs to be inserted if there is no key with the same value as name. The emplace returns a pair: second indicates whether anything was inserted (i.e. the key in name is a new one), which we don't care about here, and first is the iterator pointing to the (perhaps newly created) key-value entry whose key is equivalent to name. So if the key was already there, dereferencing first gives the original Object for the key, which has not been overwritten with initialValue; otherwise, the key was newly inserted using the value of name, the entry's value portion was copied from initialValue, and first points to that.
ekatmur's response is equivalent to this:
Object *ResourceManager::object_named(const char *name) {
bool res;
auto iter = _object_cache.end();
std::tie(iter, res) = _object_cache.emplace(name, nullptr);
if (res) {
iter->second = new Object(); // we inserted a null pointer - now replace it
}
return iter->second;
}
but profits from the fact that the default-constructed pointer value created by operator[] is null to decide whether a new Object needs to be allocated. It's more succinct and easier to read.

Proper Qt data structure for storing and accessing struct pointers

I have a certain struct:
struct MyClass::MyStruct
{
Statistics stats;
Object *objPtr;
bool isActive;
QDateTime expiration;
};
I need to store pointers to these structs in a private container. Client code will hand me objects, and for each one I need to return the pointers to the matching MyStruct. For example:
QList<MyStruct*> MyClass::structPtr( Statistics stats )
{
// Return all MyStruct* for which myStruct->stats == stats (== is overloaded)
}
or
QList<MyStruct*> MyClass::structPtr( Object *objPtr )
{
// Return all MyStruct* for which myStruct->objPtr == objPtr
}
Right now I'm storing these in a QLinkedList<MyStruct*> so that I can have fast insertions, and lookups roughly equivalent to QList<MyStruct*>. Ideally I would like to be able to perform lookups faster, without losing my insertion speed. This leads me to look at QHash, but I am not sure how I would use a QHash when I'm only storing values without keys, or even if that is a good idea.
What is the proper Qt/C++ way to address a problem such as this? Ideally, lookup times should be <= log(n). Would a QHash be a good idea here? If so, what should I use for a key and/or value?
If you want to use QHash for fast lookups, the hash's key type must be the same as the search token type. For example, if you want to find elements by Statistics value, your hash should be QHash<Statistics, MyStruct*>.
If you can live with only looking up your data in one specific way, a QHash should be fine for you. Though, in your case where you're pulling lists out, you may want to investigate QMultiHash and its .values() member. However, it's important to note, from the documentation:
The key type of a QHash must provide operator==() and a global hash function called qHash()
If you need to be able to pull these lists based on different information at different times you might just be better off iterating over the lists. All of Qt's containers provide std-style iterators, including its hash maps.
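As a standard-library analogue of the multi-hash idea (Qt itself is not used here so that the snippet compiles on its own), std::unordered_multimap can play the role of QMultiHash and equal_range the role of values(). Registry is a hypothetical class name, and only the lookup by Object pointer is shown:

```cpp
#include <cassert>
#include <unordered_map>
#include <vector>

struct Object {};  // hypothetical placeholder

struct MyStruct {
    Object *objPtr;
    bool isActive;
};

class Registry {
public:
    // Average O(1) insertion, keyed by the pointer we later search for.
    void insert(MyStruct *s) {
        assert(s != nullptr);
        byObject_.emplace(s->objPtr, s);
    }

    // All MyStruct* whose objPtr equals the given pointer:
    // average O(1) to find the bucket, plus the number of matches.
    std::vector<MyStruct *> structPtr(Object *objPtr) const {
        std::vector<MyStruct *> result;
        auto range = byObject_.equal_range(objPtr);
        for (auto it = range.first; it != range.second; ++it) {
            result.push_back(it->second);
        }
        return result;
    }

private:
    std::unordered_multimap<Object *, MyStruct *> byObject_;
};
```

A second index keyed by Statistics would follow the same pattern, provided Statistics has an equality operator and a hash function.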

Is std::map a good solution?

All,
I have the following task.
I have a finite number of strings (categories). In each category there will be a set of team/value pairs. The number of teams is finite and depends on the user's selection.
Both sizes are not more than 25.
Now the value will change based on user input, and when it changes the teams should be re-sorted by value.
I was hoping that STL has some kind of auto sorted vector or list container, but the only thing I could find is std::map<>.
So what I think I need is:
struct Foo
{
std::string team;
double value;
bool operator<(const Foo &other) const; // compares by value
};
std::map<std::string,std::vector<Foo>> myContainer;
and just call std::sort() whenever a value changes.
Or is there a more efficient way to do it?
[EDIT]
I guess I need to clarify what I mean.
Think about it this way.
You have a table. The rows of this table are teams. The columns of this table are categories. The cells of this table are divided in half. Top half is the category value for a given team. This value is increasing with every player.
Now when the player is added to a team, the scoring categories of the player will be added to a team and the data in the columns will be sorted. So, for category "A" it may be team1, team2; and for category "B" it may be team2, team1.
Then based on the position of each team the score will be assigned for each team/category.
And that score I will need to display.
I hope this will clarify what I am trying to achieve and it become more clear of what I'm looking for.
[/EDIT]
It really depends on how often you are going to modify the data in the map and how often you're just going to be searching for the std::string and grabbing the vector.
If your access pattern is: add a map entry, fill all entries in its vector, move on to the next entry, fill its vector, and so on, and only afterwards access the map randomly to get a vector, then no, a map is probably not the best container. You'd be better off using a vector containing a std::pair of the string and the vector, then sorting it once everything has been added.
In fact organising it as above is probably the most efficient way of setting it up (I admit this is not always possible however). Furthermore it would be highly advisable to use some sort of hash value in place of the std::string as a hash compare is many times faster than a string compare. You also have the string stored in Foo anyway.
map will, however, work but it really depends on exactly what you are trying to do.
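For the sizes mentioned in the question (at most 25 of each), the map-of-vectors layout with a re-sort after every change can be sketched as below. add_value and ranks_before are hypothetical names, and sorting in descending order of value is an assumption:

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <vector>

struct Foo {
    std::string team;
    double value;
};

using Table = std::map<std::string, std::vector<Foo>>;

// Higher value ranks first (an assumption; flip the comparison for ascending order).
bool ranks_before(const Foo &a, const Foo &b) { return a.value > b.value; }

// Add `delta` to a team's value in one category, then re-sort that category.
void add_value(Table &table, const std::string &category,
               const std::string &team, double delta) {
    std::vector<Foo> &teams = table[category];
    auto it = std::find_if(teams.begin(), teams.end(),
                           [&](const Foo &f) { return f.team == team; });
    if (it == teams.end()) {
        teams.push_back(Foo{team, delta});  // first time this team scores here
    } else {
        it->value += delta;
    }
    std::sort(teams.begin(), teams.end(), ranks_before);
    assert(std::is_sorted(teams.begin(), teams.end(), ranks_before));
}
```

With at most 25 entries per vector, re-sorting after each update is cheap, and the position of a team in the vector gives its rank for that category directly.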

Getting Unique Numbers and Knowing When They're Freed

I have a physics simulation (using Box2D) where bodies with identical integer IDs do not collide, for instance, bodies that belong to the same character. I have a problem though in that I need to be able to get a unique number for each possible entity, so that no two characters accidentally get the same ID. There's a finite number of bodies, but they are created and destroyed as the simulation dictates, so it's necessary to free unique IDs once the body they belonged to is gone.
A class World is responsible for creating and destroying all bodies, and is also the entity that manages the unique number generation, and anything else where physics simulation is concerned.
I thought of two methods so far but I'm not sure which would be better, if either of them at all:
Keep a vector<short>, with the data being the number of references floating around, and the position in the vector being the ID itself. This method has the disadvantage of creating unneeded complexity when coding entities that manipulate group IDs, since they would need to ensure they tell the World how many references they're taking out.
Keep a vector<bool>, with the data being if that ID is free or not, and the position in the vector being the ID itself. The vector would grow with every new call for a unique ID, if there exist no free slots. The disadvantage is that once the vector reaches a certain size, an audit of the entire simulation would need to be done, but has the advantage of entities being able to grab unique numbers without having to help manage reference counting.
What do you folks think, is there a better way?
You could maintain a "free" list of unused IDs as a singly linked list inside your master World object.
When an object is destroyed by World (making its ID unused) you could push that ID onto the head of the free list.
When you are creating a new object you could do the following:
If the free list is non-empty: pop the head item and take that ID.
Else increment a global ID counter and assign its current value.
While you could still run out of IDs (if you simultaneously had more objects than the max value of your counter), this strategy will allow you to recycle IDs, and to do everything with O(1) runtime complexity.
EDIT: As per @Matthieu's comments below, a std::deque container could also be used to maintain the "free" list. This container also supports the push_front and pop_front operations with O(1) complexity.
Hope this helps.
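A small sketch of the free-list scheme, using the std::deque variant. IdPool, acquire and release are hypothetical names, and the counter here lives inside the pool and starts at 0, which is one possible choice:

```cpp
#include <cassert>
#include <deque>

class IdPool {
public:
    // Reuse a freed ID if one exists; otherwise hand out the next unused number.
    int acquire() {
        if (!free_.empty()) {
            const int id = free_.front();
            free_.pop_front();
            return id;
        }
        return next_++;
    }

    // Return an ID to the pool once the body that owned it is destroyed.
    void release(int id) {
        assert(id >= 0 && id < next_);  // only IDs that were handed out can come back
        free_.push_front(id);
    }

private:
    std::deque<int> free_;  // IDs that are currently unused
    int next_ = 0;          // smallest ID that has never been handed out
};
```

Both operations are O(1), and World would call release from wherever it destroys a body. The sketch does not guard against releasing the same ID twice.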
How many bodies are there? Is it realistic that you'd ever run out of integers if you didn't reassign them? The simplest solution is to just have one integer storing the next ID -- you would increment this integer when you assign a new ID to a body.