C++ STL algorithms to add element in list - c++

I want to know if anyone has a quick way for adding an element to a std::list<T*> if the element is not already in it.
It's a generic function and I can not use loops so something like this
template <class T>
bool Class<T>::addElement(const T* element)
{
for (list<T*>::iterator it = list_.begin(); it != list_.end(); it++)
{
if (element == *it)
return false;
}
list_.push_back(element);
return true;
}
Is not ok because of the loop. Does anyone have ideas?

Why is what you have "not ok"? Looks perfectly fine and readable to me (modulo missing typename).
If you really don't want to use a loop, you can accomplish the same thing with the algorithm that does precisely that loop, std::find:
template <class T>
bool Class<T>::addElement(const T* element)
{
if (std::find(list_.begin(), list_.end(), element) != list_.end()) {
return false;
}
list_.push_back(element);
return true;
}

If you can add other members to your class, you could add an index such as a std::unordered_set. That container stores a set of unique values and can be searched for a specific value in O(1) time on average, so the implementation does not need to scan every stored element to check whether the value already exists. It stays fast even when many values are already stored.
With std::list, using library functions such as std::find avoids explicitly writing a loop, but the implementation still performs the loop, which is slow when many values are already stored (O(n) complexity).
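A minimal sketch of the index idea, assuming a hypothetical class named UniquePtrList (not from the question) that mirrors every stored pointer in a std::unordered_set. The list keeps insertion order; the set answers the "already present?" question without a scan.

```cpp
#include <list>
#include <unordered_set>

// Hypothetical container for illustration: keeps insertion order in a
// std::list and mirrors the stored pointers in a std::unordered_set.
template <class T>
class UniquePtrList {
public:
    // Returns false if the pointer is already stored, true if it was added.
    bool addElement(T* element) {
        // insert() reports whether the value was new (average O(1)).
        if (!index_.insert(element).second) {
            return false;
        }
        list_.push_back(element);
        return true;
    }

    const std::list<T*>& elements() const { return list_; }

private:
    std::list<T*> list_;            // preserves insertion order
    std::unordered_set<T*> index_;  // same pointers, hashed for lookup
};
```

The cost is extra memory for the index and the need to keep both containers in sync on every removal.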

You can use intrusive list instead of the std::list. In this case each element in the list keeps its node data, so you can just query that data to find out if the element is already in the list. The disadvantage is that all elements in this list must be able to provide such data, and you can't put in such lists, for example, integer or boolean elements.
If you still need the std::list and/or the elements can be of any type, then the only way to quickly query whether an element already exists in the list is to use an index. The index can be stored in a separate std::unordered_set for fast lookups. As index keys you can use either the list's values "as is" or values calculated by any custom function.

Related

Declaring a std::list with an array index C++

I was following a hash table implementation online (https://www.youtube.com/watch?v=2_3fR-k-LzI) when I observed the video author initialize a std::list with an array index. This was very confusing to me as I was always under the impression that std::list was always meant to operate like a linked list and was not capable of supporting random indexing. However, I thought it was maybe a weird way to declare the size of a list and ignored it and moved on. Specifically, he did the following:
static const int hashGroups = 10;
std::list<std::pair<int, std::string>> table[hashGroups];
Upon trying to implement a function to search to see if a key resided in the hash table, I realized that I could not access the std::list objects as I would expect to be able to. In HashTable.cpp (which includes the header file that defines the two variables above) I was only able to access the table member variable's elements as a pointer with -> instead of with . as I would expect to be able to. It looks like what is directly causing this is using the array index in the list definition. This seems to change the type of the table variable from a std::list to a pointer to a std::list. I do not understand why this is the case. This also appears to break my current implementation of attempting to iterate through the table variable because when I declare an iterator to iterate through table's elements, I am able to see that the table has the correct data in the VS debugger but the iterator seems to have completely invalid data and does not iterate through the loop even once despite seeing table correctly have 10 elements. My attempt at the search function is pasted below:
std::string HashTable::searchTable(int key) {
for (std::list<std::pair<int, std::string>>::const_iterator it = table->begin(); it != table->end(); it++)
{
if (key == it->first) {
return it->second;
}
std::cout << "One iteration performed." << std::endl;
}
return "No value found for that key.";
}
With all of this being said, I have several burning questions:
Why are we even able to declare a list with brackets when a std::list does not support random access?
Why does declaring a list like this change the type of the list from std::list to a pointer?
What would be the correct way to iterate through table in its current implementation with an iterator?
Thank you for any help or insight provided!
After reading the responses from @IgorTandetnik I realized that I was thinking about the list incorrectly. What I didn't fully understand was that we were declaring an array of lists and not attempting to initialize a list like an array. Once I realized this, I was able to access the elements correctly since I was not trying to iterate through an array with an iterator for a list. My revised searchTable function which to my knowledge now works correctly looks like this:
std::string HashTable::searchTable(int key) {
int hashedKey = hashFunction(key);
if (table[hashedKey].size() > 0)
{
for (std::list<std::pair<int, std::string>>::const_iterator it = table[hashedKey].begin(); it != table[hashedKey].end(); it++)
{
if (key == it->first) {
return it->second;
}
}
}
return "No value found for that key.";
}
And to answer my three previous questions...
1. Why are we even able to declare a list with brackets when a std::list does not support random access?
Response: We are declaring an array of std::list that contains a std::pair of int and std::string, not a list with the array index operator.
2. Why does declaring a list like this change the type of the list from std::list to a pointer?
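Response: It doesn't. We are declaring table to be an array whose elements are instances of std::list. An array is not a pointer, but in most expressions it converts (decays) to a pointer to its first element, which is why table->begin() compiles and refers to table[0]. So we are never "changing" the type of the list itself.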
3. What would be the correct way to iterate through table in its current implementation with an iterator?
Response: The current implementation only attempts to iterate over the first element of table. Create an iterator which uses the hashed key value as the array index of table and then tries to iterate through the std::list that holds instances of std::pair at that index.
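To make the "array of lists" point concrete, here is a small self-contained sketch. The Bucket alias, the insertItem helper and the body of hashFunction (key modulo the bucket count) are assumptions for illustration; the video's actual code may differ.

```cpp
#include <list>
#include <string>
#include <type_traits>
#include <utility>

using Bucket = std::list<std::pair<int, std::string>>;

static const int hashGroups = 10;
Bucket table[hashGroups];  // an array of 10 separate lists

// The variable is an array type, not a pointer and not a single list.
static_assert(std::is_same<decltype(table), Bucket[hashGroups]>::value,
              "table is an array of lists");

// Assumed hash function: key modulo bucket count, kept non-negative.
int hashFunction(int key) {
    return ((key % hashGroups) + hashGroups) % hashGroups;
}

// Hypothetical helper that appends a pair to the bucket for its key.
void insertItem(int key, const std::string& value) {
    table[hashFunction(key)].emplace_back(key, value);
}

std::string searchTable(int key) {
    // Pick one list out of the array, then iterate over that list only.
    const Bucket& bucket = table[hashFunction(key)];
    for (const auto& entry : bucket) {
        if (entry.first == key) {
            return entry.second;
        }
    }
    return "No value found for that key.";
}
```

Writing table->begin() is the same as table[0].begin(), which is why the original loop only ever looked at the first bucket.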

Constraining remove_if on only part of a C++ list

I have a C++11 list of complex elements that are defined by a structure node_info. A node_info element, in particular, contains a field time and is inserted into the list in an ordered fashion according to its time field value. That is, the list contains various node_info elements that are time ordered. I want to remove from this list all the nodes that verify some specific condition specified by coincidence_detect, which I am currently implementing as a predicate for a remove_if operation.
Since my list can be very large (order of 100k -- 10M elements), and for the way I am building my list this coincidence_detect condition is only verified by few (thousands) elements closer to the "lower" end of the list -- that is the one that contains elements whose time value is less than some t_xv, I thought that to improve speed of my code I don't need to run remove_if through the whole list, but just restrict it to all those elements in the list whose time < t_xv.
However, remove_if() does not seem to let the user control how far into the list it iterates.
My current code.
The list elements:
struct node_info {
const char *type = "x";
int ID = -1;
double time = 0.0;
bool spk = true;
};
The predicate/condition for remove_if:
// Remove all events occurring at t_event
class coincident_events {
double t_event; // Event time
bool spk; // Spike condition
public:
coincident_events(double time,bool spk_) : t_event(time), spk(spk_){}
bool operator()(node_info node_event){
return ((node_event.time==t_event)&&(node_event.spk==spk)&&(strcmp(node_event.type,"x")!=0));
}
};
The actual removing from the list:
void remove_from_list(double t_event, bool spk_){
// Remove all events occurring at t_event
coincident_events coincidence(t_event,spk_);
event_heap.remove_if(coincidence);
}
Pseudo main:
int main(){
// My list
std::list<node_info> event_heap;
...
// Populate list with elements with random time values, yet ordered in ascending order
...
remove_from_list(0.5, true);
return 0;
}
It seems that remove_if may not be ideal in this context. Should I consider instead instantiating an iterator and run an explicit for cycle as suggested for example in this post?
It seems that remove_if may not be ideal in this context. Should I consider instead instantiating an iterator and run an explicit for loop?
Yes and yes. Don't fight to use code that is preventing you from reaching your goals. Keep it simple. Loops are nothing to be ashamed of in C++.
First thing, comparing double exactly is not a good idea as you are subject to floating point errors.
You could always find the point up to which you want to search using lower_bound (I assume your list is properly sorted).
Then you could use the free-function algorithm std::remove_if on that sub-range, followed by the list's erase member function, to remove the items between the iterator returned by remove_if and the one returned by lower_bound.
However, doing that makes multiple passes over the data and moves element values between nodes, so it would affect performance.
See also: https://en.cppreference.com/w/cpp/algorithm/remove
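As a rough sketch of the lower_bound / remove_if / erase combination described above: the node_info here is a simplified stand-in for the question's struct (the type field is omitted), and the function name remove_before is hypothetical.

```cpp
#include <algorithm>
#include <list>

// Simplified stand-in for the question's node_info.
struct node_info {
    int ID;
    double time;
    bool spk;
};

// Removes nodes matching pred, looking only at nodes with time < t_xv.
// Assumes events is sorted by ascending time.
template <class Pred>
void remove_before(std::list<node_info>& events, double t_xv, Pred pred) {
    // First node whose time is not less than t_xv. On a list the iterator
    // is advanced linearly, but only O(log n) comparisons are made.
    auto bound = std::lower_bound(
        events.begin(), events.end(), t_xv,
        [](const node_info& node, double t) { return node.time < t; });
    // Shift the kept nodes' values to the front of [begin, bound) ...
    auto new_end = std::remove_if(events.begin(), bound, pred);
    // ... and erase the leftover tail of that sub-range.
    events.erase(new_end, bound);
}
```

Nodes at or beyond t_xv are never passed to the predicate, which is the restriction the question asked for.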
So in the end, it is probably preferable to write your own loop over the container and, for each element, check whether it needs to be removed. If not, then check whether you should break out of the loop.
for (auto it = event_heap.begin(); it != event_heap.end(); )
{
if (coincidence(*it))
{
auto itErase = it;
++it;
event_heap.erase(itErase);
}
else if (it->time < t_xv)
{
++it;
}
else
{
break;
}
}
As you can see, code can easily become quite long for something that should be simple. Thus, if you need this kind of algorithm often, consider writing your own generic algorithm.
Also, in practice you might not need to do a complete search for the end using the first solution if you process your data in increasing time order.
Finally, you might consider using an std::set instead. It could lead to simpler and more optimized code.
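One way the ordered-container suggestion could look, as a sketch only: it uses std::multiset rather than std::set because several events may share a time, and it keeps the question's exact comparison of doubles despite the caveat above. The names by_time, event_set and remove_coincident are hypothetical, and node_info is again simplified.

```cpp
#include <cstddef>
#include <set>

// Simplified stand-in for the question's node_info.
struct node_info {
    int ID;
    double time;
    bool spk;
};

// Orders events by time only.
struct by_time {
    bool operator()(const node_info& a, const node_info& b) const {
        return a.time < b.time;
    }
};

using event_set = std::multiset<node_info, by_time>;

// Erases every node at exactly t_event whose spk flag matches.
// Returns the number of erased nodes.
std::size_t remove_coincident(event_set& events, double t_event, bool spk) {
    node_info probe{-1, t_event, spk};       // only the time field is compared
    auto range = events.equal_range(probe);  // O(log n) to locate the time slot
    std::size_t erased = 0;
    for (auto it = range.first; it != range.second;) {
        if (it->spk == spk) {
            it = events.erase(it);  // erase returns the next iterator
            ++erased;
        } else {
            ++it;
        }
    }
    return erased;
}
```

Only the nodes sharing the requested time are visited, so the rest of a very large container is never touched.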
Thanks. I used your comments and came up with this solution, which seemingly increases speed by a factor of 5-to-10.
void remove_from_list(double t_event,bool spk_){
coincident_events coincidence(t_event,spk_);
for(auto it=event_heap.begin();it!=event_heap.end();){
if(t_event>=it->time){
if(coincidence(*it)) {
it = event_heap.erase(it);
}
else
++it;
}
else
break;
}
}
The idea to make erase return it (as already ++it) was suggested by this other post. Note that in this implementation I am actually erasing all list elements up to t_event value (meaning, I pass whatever I want for t_xv).

equivalent LinkedHashmap in C++?

I have a Java program that I want to convert it to C++. So, there is a Linkedhashmap data structure used in the Java code and I want to convert it to C++. Is there an equivalent datatype for LinkedHashmap in C++?
I tried to use std::unordered_map, however, it does not maintain the order of the insertion.
C++ does not offer a collection template with the behavior that would mimic Java's LinkedHashMap<K,V>, so you would need to maintain the order separately from the mapping.
This can be achieved by keeping the data in a std::list<std::pair<K,V>>, and keeping a separate std::unordered_map<K, std::list<std::pair<K,V>>::iterator> for quick look-up of the item by key:
On adding an item, add the corresponding key/value pair to the end of the list, and map the key to the iterator std::prev(list.end()).
On removing an item by key, look up its iterator, remove it from the list, and then remove the mapping.
On replacing an item, look up list iterator from the unordered map first, and then replace its content with a new key-value pair.
On iterating the values, simply iterate std::list<std::pair<K,V>>.
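The four operations above can be sketched as follows. The class name LinkedHashMap and its member names are illustrative, not a standard or Boost API, and replacement here overwrites only the value while keeping the entry's position.

```cpp
#include <iterator>
#include <list>
#include <unordered_map>
#include <utility>

// Hypothetical insertion-ordered map built from a list plus a hash index.
template <class K, class V>
class LinkedHashMap {
    using Entry = std::pair<K, V>;
    using ListIt = typename std::list<Entry>::iterator;

public:
    // Adds a new key at the end, or replaces the value of an existing key
    // without changing its position.
    void put(const K& key, const V& value) {
        auto found = index_.find(key);
        if (found != index_.end()) {
            found->second->second = value;
            return;
        }
        items_.emplace_back(key, value);
        index_[key] = std::prev(items_.end());
    }

    // Returns a pointer to the value, or nullptr if the key is absent.
    const V* get(const K& key) const {
        auto found = index_.find(key);
        return found == index_.end() ? nullptr : &found->second->second;
    }

    // Removes the key if present; returns whether anything was removed.
    bool erase(const K& key) {
        auto found = index_.find(key);
        if (found == index_.end()) return false;
        items_.erase(found->second);  // list erase leaves other iterators valid
        index_.erase(found);
        return true;
    }

    // Iteration follows insertion order.
    typename std::list<Entry>::const_iterator begin() const { return items_.begin(); }
    typename std::list<Entry>::const_iterator end() const { return items_.end(); }

private:
    std::list<Entry> items_;
    std::unordered_map<K, ListIt> index_;
};
```

This works because std::list iterators stay valid when other elements are inserted or erased, so the stored iterators never dangle while their entry exists.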
The insertion order contract on key iteration can be achieved with a balanced tree for log(n) performance. This is better than maintaining keys in a list as item removal requires n lookup time. My mantra is never put something you look up in a list. If it doesn't have to be sorted, use a hash. If it should be sorted, use a balanced tree. If all you're going to do is iterate, then a list is fine.
In c++ this would be std::map where the key is the item reference and the value is the insertion order, the keys are sorted using red-black trees. See: Is there a sorted container in STL
This is how I do it:
map<TKey, set<MyClass<K1,K2>, greater<MyClass<K1, K2>>>> _objects; // set ordered by timestamp. Does not guarantee uniqueness based on K1 and K2.
map<TKey, map<K2, typename set<MyClass<K1, K2>, greater<MyClass<K1, K2>>>::iterator>> _objectsMap; // Used to locate object in _objects
To add object id:
if (_objectsMap[userId].find(id) == _objectsMap[userId].end())
_objectsMap[userId][id] = _objects[userId].emplace(userId, id).first;
To erase an object id:
if (_objectsMap[userId].find(id) != _objectsMap[userId].end()) {
_objects[userId].erase(_objectsMap[userId][id]);
_objectsMap[userId].erase(id);
}
To retrieve, say the most recent size objects from the list starting from a specific object id:
vector<K2> result;
if (_objectsMap[userId].find(id) != _objectsMap[userId].end() && _objectsMap[userId][id] != _objects[userId].begin()) {
typename set<MyClass<K1, K2>, greater<MyClass<K1, K2>>>::iterator start = _objects[userId].begin(), end = _objectsMap[userId][id];
size_t counts = distance(_objects[userId].begin(), _objectsMap[userId][id]);
if (counts > size)
advance(start, counts - size);
transform(start,
end,
back_inserter(result),
[](const MyClass<K1, K2>& obj) { return obj.ID(); });
}
return result;

How to do fast sorting in sorted list when only one element is changed

I need a list of elements that are always sorted. the operation involved is quite simple, for example, if the list is sorted from high to low, i only need three operations in some loop task:
while true do {
list.sort() //sort the list that has hundreds of elements
val = list[0] //get the first/maximum value in the list
list.pop_front() //remove the first/maximum element
...//do some work here
list.push_back(new_elem)//insert a new element
list.sort()
}
However, since I only add one element at a time and I am concerned about speed, I don't want the sort to go through all the elements (as bubble sort would, for example). Is there a function to insert an element in order? Or is list::sort() smart enough to use some kind of quick sort when only one element has been added or modified?
Or maybe should I use deque for better speed performance if above are all the operations needed?
thanks alot!
As mentioned in the comments, if you aren't locked into std::list then you should try std::set or std::multiset.
The std::list::insert method takes an iterator which specifies where to add the new item. You can use std::lower_bound to find the correct insertion point; it's not optimal without random access iterators but it still only does O(log n) comparisons.
P.S. don't use variable names that collide with built-in classes like list.
lst.sort(std::greater<T>()); // sort once, from high to low
while (true) {
val = lst.front(); // get the first/maximum value in the list
lst.pop_front(); // remove the first/maximum element
... // do some work here
// find the first position whose element is not greater than new_elem
std::list<T>::iterator it = std::lower_bound(lst.begin(), lst.end(), new_elem, std::greater<T>());
lst.insert(it, new_elem); // insert the new element at its sorted position
// lst is still sorted
}
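A self-contained version of the same idea, as a sketch: the helper names insert_descending and pop_max are made up for this example, and the list is assumed to be sorted from high to low before the first call.

```cpp
#include <algorithm>
#include <functional>
#include <list>

// Inserts value into a list that is already sorted from high to low,
// keeping it sorted.
template <class T>
void insert_descending(std::list<T>& lst, const T& value) {
    // First position whose element is not greater than value.
    auto pos = std::lower_bound(lst.begin(), lst.end(), value, std::greater<T>());
    lst.insert(pos, value);
}

// Removes and returns the maximum (front) element.
// Assumes lst is not empty.
template <class T>
T pop_max(std::list<T>& lst) {
    T top = lst.front();
    lst.pop_front();
    return top;
}
```

Note that std::lower_bound needs both the value being inserted and the same comparator that the list is sorted with; passing a different ordering gives an undefined insertion point.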

Does the C++ standard library have a set ordered by insertion order?

Does the C++ standard library have an "ordered set" data structure? By ordered set, I mean something that is exactly the same as the ordinary std::set but that remembers the order in which you added the items to it.
If not, what is the best way to simulate one? I know you could do something like have a set of pairs with each pair storing the number it was added in and the actual value, but I don't want to jump through hoops if there is a simpler solution.
No single, homogeneous data structure will have this property, since it is either sequential (i.e. elements are arranged in insertion order) or associative (elements are arranged in some order depending on value).
The best, clean approach would perhaps be something like Boost.MultiIndex, which allows you to add multiple indexes, or "views", on a container, so you can have a sequential and an ordered index.
Instead of making a std::set of whatever type you're using, why not pass it a std::pair of the object and an index that gets incremented at each insertion?
No, it does not.
Such a container presumably would need two different iterators, one to iterate in the order defined by the order of adding, and another to iterate in the usual set order. There's nothing of that kind in the standard libraries.
One option to simulate it is to have a set of some type that contains an intrusive linked list node in addition to the actual data you care about. After adding an element to the set, append it to the linked list. Before removing an element from the set, remove it from the linked list. This is guaranteed to be OK, since pointers to set elements aren't invalidated by any operation other than removing that element.
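A simplified sketch of this idea using only the standard library. Instead of a true intrusive node inside each element, it keeps the links in a separate std::list of pointers to the set's elements, plus a lookup table so an element can be unlinked quickly; the class and member names are hypothetical.

```cpp
#include <list>
#include <set>
#include <unordered_map>

// Hypothetical wrapper: a std::set for sorted access plus a list of
// pointers to its elements that records insertion order.
template <class T>
class InsertionOrderedSet {
public:
    InsertionOrderedSet() = default;
    // Copying would leave the pointers aimed at the source object's set.
    InsertionOrderedSet(const InsertionOrderedSet&) = delete;
    InsertionOrderedSet& operator=(const InsertionOrderedSet&) = delete;

    // Returns false if value was already present.
    bool insert(const T& value) {
        auto result = values_.insert(value);
        if (!result.second) return false;
        const T* p = &*result.first;  // valid until this element is erased
        where_[p] = order_.insert(order_.end(), p);
        return true;
    }

    // Returns false if value was not present.
    bool erase(const T& value) {
        auto it = values_.find(value);
        if (it == values_.end()) return false;
        auto w = where_.find(&*it);
        order_.erase(w->second);  // unlink before removing from the set
        where_.erase(w);
        values_.erase(it);
        return true;
    }

    const std::set<T>& sorted() const { return values_; }
    const std::list<const T*>& in_insertion_order() const { return order_; }

private:
    std::set<T> values_;
    std::list<const T*> order_;
    std::unordered_map<const T*, typename std::list<const T*>::iterator> where_;
};
```

A real intrusive list would avoid the extra lookup table by storing the links inside each set element, at the cost of a custom element type.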
I think the answer is fairly simple: combine the set with another iterable structure (say, a queue). If you want to iterate over the elements in the order they were inserted, push them into the queue first, do your work on the front element, then pop it and put it into the set.
[Disclaimer: I have given a similar answer to this question already]
If you can use Boost, a very straightforward solution is to use the header-only library Boost.Bimap (bidirectional maps).
Consider the following sample program that will display some dummy entries in insertion order:
#include <iostream>
#include <string>
#include <type_traits>
#include <boost/bimap.hpp>
using namespace std::string_literals;
template <typename T>
void insertByOrder(boost::bimap<T, size_t>& mymap, const T& element) {
using pos = typename std::remove_reference<decltype(mymap)>::type::value_type;
// We use size() as index, therefore indexing the elements with 0, 1, ...
mymap.insert(pos(element, mymap.size()));
}
int main() {
boost::bimap<std::string, size_t> mymap;
insertByOrder(mymap, "stack"s);
insertByOrder(mymap, "overflow"s);
// Iterate over right map view (integers) in sorted order
for (const auto& rit : mymap.right) {
std::cout << rit.first << " -> " << rit.second << std::endl;
}
}
The funky type alias in insertByOrder() is needed to insert elements into a boost::bimap in the following line (see referenced documentation).
Yes, it's called a vector or list (or array). Just append to the vector to add an element to the set.