C++ Best container for really simple LRU cache - c++

I need to implement a really simple LRU cache which stores memory addresses.
The count of these addresses is fixed (at runtime).
I'm only interested in the least recently used address (I don't care about the order of the other elements).
Each address has a corresponding index number (simple integer) which isn't unique and can change.
The implementation needs to run with as little overhead as possible. In addition to each address, there is also a related info structure (which contains the index).
My current approach is using a std::list to store the address/info pair and a boost::unordered_multimap which is a mapping between the index and the related iterator of the list.
The following example has nothing to do with my production code; it is only meant to illustrate the approach.
struct address_info
{
address_info() : i(-1) {}
int i;
// more ...
};
int main()
{
int const MAX_ADDR_COUNT = 10,
MAX_ADDR_SIZE = 64;
char** s = new char*[MAX_ADDR_COUNT];
address_info* info = new address_info[MAX_ADDR_COUNT]();
for (int i = 0; i < MAX_ADDR_COUNT; ++i)
s[i] = new char[MAX_ADDR_SIZE]();
typedef boost::unordered_multimap<int, std::list<std::pair<address_info, char*>>::const_iterator> index_address_map;
std::list<std::pair<address_info, char*>> list(MAX_ADDR_COUNT);
index_address_map map;
{
int i = 0;
for (std::list<std::pair<address_info, char*>>::iterator iter = list.begin(); i != MAX_ADDR_COUNT; ++i, ++iter)
*iter = std::make_pair(info[i], s[i]);
}
// usage example:
// try to find address_info 4
index_address_map::const_iterator iter = map.find(4);
if (iter == map.end())
{
std::pair<address_info, char*>& lru = list.back();
if (lru.first.i != -1)
map.erase(lru.first.i);
lru.first.i = 4;
list.splice(list.begin(), list, boost::prior(list.end()));
map.insert(std::make_pair(4, list.begin()));
}
else
list.splice(list.begin(), list, iter->second);
for (int i = 0; i < MAX_ADDR_COUNT; ++i)
delete[] s[i];
delete[] info;
delete[] s;
return 0;
}

The usual recommendation is to dig up Boost.MultiIndex for the task:
index 0: order of insertion
index 1: key of the element (either binary search or hash)
It's even demonstrated on the Boost site, if I recall correctly.
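If pulling in Boost.MultiIndex is not an option, the same two-index idea can be sketched with the standard library alone: a std::list keeps the recency order and a std::unordered_map finds an element by key. This is only a minimal sketch and not the Boost example. The class name LruCache, the int key and value types, and the assumption that keys are unique are all illustrative; the question's indices are not unique, which would call for an unordered_multimap instead.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <unordered_map>
#include <utility>

// Hypothetical LRU cache with unique int keys and int values.
class LruCache {
public:
    explicit LruCache(std::size_t capacity) : capacity_(capacity) {
        assert(capacity > 0);
    }

    // Returns a pointer to the value and marks it most recently used,
    // or nullptr if the key is not cached.
    int* get(int key) {
        auto hit = index_.find(key);
        if (hit == index_.end()) return nullptr;
        // splice relinks the node; no iterator is invalidated
        order_.splice(order_.begin(), order_, hit->second);
        return &hit->second->second;
    }

    // Inserts or updates a value, evicting the least recently used entry
    // when the cache is full.
    void put(int key, int value) {
        auto hit = index_.find(key);
        if (hit != index_.end()) {
            hit->second->second = value;
            order_.splice(order_.begin(), order_, hit->second);
            return;
        }
        if (order_.size() == capacity_) {
            index_.erase(order_.back().first); // back() is the LRU entry
            order_.pop_back();
        }
        order_.emplace_front(key, value);
        index_[key] = order_.begin();
    }

    std::size_t size() const { return order_.size(); }

private:
    typedef std::list<std::pair<int, int> > Order;
    std::size_t capacity_;
    Order order_;                                   // front = most recent
    std::unordered_map<int, Order::iterator> index_; // key -> list node
};
```

Both operations run in constant time on average, and the list iterators stored in the map stay valid because splice never reallocates nodes.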

Related

What's the problem of saving pointer of vector inside std::map

I am trying to find the best sum with memoization, but when I save the vector pointer in a map, values keep getting appended to the stored vectors and I end up with the wrong vector.
If I comment out the map insertion, it works properly.
I use pointers because I could not store nullptr if the map held the vectors themselves.
std::vector<int> *bestSumV(int target, int nums[], int size) {
static std::map<int, std::vector<int> *> memo;
if (memo.find(target) != memo.end())
return memo.at(target);
if (target == 0)
return new std::vector<int>();
if (target < 0)
return NULL;
std::vector<int> *bestCom = nullptr;
for (int i = 0; i < size; i++) {
int reminder = target - nums[i];
std::vector<int> *reminderResult = bestSumV(reminder, nums, size);
if (reminderResult != NULL) {
reminderResult->push_back(nums[i]);
if (bestCom == nullptr || reminderResult->size() < bestCom->size()) {
bestCom = static_cast<std::vector<int> *>(reminderResult);
}
}
}
// if i commented out the map insertion i am getting the correct value
// and getting a vector of 5 items
memo.insert(std::make_pair(target, std::move(bestCom)));
return bestCom;
}
void runHowbestTest() {
int testArray[] = {5, 4, 2};
std::vector<int> *bestSum25 = bestSumV(25, testArray, 3);
for (int i = 0; i < bestSum25->size(); i++) {
std::cout << "the items " << bestSum25->at(i) << std::endl;
}
}
bestCom is a std::vector<int> *; don't std::move it. Moving a raw pointer just copies it, so the call is pointless and makes the code harder to read.
reminderResult is already a std::vector<int> *, so there is no need to static_cast<std::vector<int> *> it.
On this line
memo.insert(std::make_pair(target, std::move(bestCom)));
bestCom might be null. When that happens,
if (memo.find(target) != memo.end())
return memo.at(target);
will return and bypass the function logic. You need:
if (memo.find(target) != memo.end() && memo.at(target))
return memo.at(target);
And, most likely, the real issue:
// returns a memoized value
std::vector<int> *reminderResult = bestSumV(reminder, nums, size);
...
// modifies it.
reminderResult->push_back(nums[i]);
You cannot modify a memoized value and expect it to be valid.
Using objects instead of pointers fixes the issue: https://godbolt.org/z/v35oT4 . But I make no claims about what it does to performance.
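To illustrate the value-based idea, here is a minimal sketch written for this discussion; it is not necessarily what the linked code does. The names SumResult and best_sum, the explicit memo parameter, and the requirement that all numbers are positive are assumptions. The cache hands out copies, so pushing onto a returned vector never alters a memoized entry.

```cpp
#include <cassert>
#include <map>
#include <utility>
#include <vector>

// Hypothetical result type: `found` is false when no combination exists.
struct SumResult {
    bool found;
    std::vector<int> items;
};

SumResult best_sum(int target, const std::vector<int>& nums,
                   std::map<int, SumResult>& memo) {
    if (target == 0) return SumResult{true, {}};
    if (target < 0) return SumResult{false, {}};

    auto hit = memo.find(target);
    if (hit != memo.end()) return hit->second; // a copy; the cached entry stays intact

    SumResult best{false, {}};
    for (int n : nums) {
        assert(n > 0); // positive inputs guarantee that the recursion terminates
        SumResult rest = best_sum(target - n, nums, memo);
        if (!rest.found) continue;
        rest.items.push_back(n); // modifies only the local copy
        if (!best.found || rest.items.size() < best.items.size())
            best = std::move(rest);
    }
    memo.emplace(target, best); // "no solution" is cached as well
    return best;
}
```

The memo is passed in rather than declared static, so a call with a different nums array can start from an empty cache.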

How do I Optimize my C++ key-value program to have a faster runtime?

This is a2.hpp, the file that can be edited. As far as I know the code is correct, just too slow. I am honestly lost here; I know my for loops are probably what's slowing me down so much. Maybe I should use an iterator?
// <algorithm>, <list>, <vector>
// YOU CAN CHANGE/EDIT ANY CODE IN THIS FILE AS LONG AS SEMANTICS IS UNCHANGED
#ifndef A2_HPP
#define A2_HPP
#include <algorithm>
#include <list>
#include <vector>
class key_value_sequences {
private:
std::list<std::vector<int>> seq;
std::vector<std::vector<int>> keyref;
public:
// YOU SHOULD USE C++ CONTAINERS TO AVOID RAW POINTERS
// IF YOU DECIDE TO USE POINTERS, MAKE SURE THAT YOU MANAGE MEMORY PROPERLY
// IMPLEMENT ME: SHOULD RETURN SIZE OF A SEQUENCE FOR GIVEN KEY
// IF NO SEQUENCE EXISTS FOR A GIVEN KEY RETURN 0
int size(int key) const;
// IMPLEMENT ME: SHOULD RETURN POINTER TO A SEQUENCE FOR GIVEN KEY
// IF NO SEQUENCE EXISTS FOR A GIVEN KEY RETURN nullptr
const int* data(int key) const;
// IMPLEMENT ME: INSERT VALUE INTO A SEQUENCE IDENTIFIED BY GIVEN KEY
void insert(int key, int value);
}; // class key_value_sequences
int key_value_sequences::size(int key) const {
//checks if the key is invalid or the count vector is empty.
if(key<0 || keyref[key].empty()) return 0;
// sub tract 1 because the first element is the key to access the count
return keyref[key].size() -1;
}
const int* key_value_sequences::data(int key) const {
//checks if key index or ref vector is invalid
if(key<0 || keyref.size() < static_cast<unsigned int>(key+1)) {
return nullptr;
}
// ->at(1) accesses the count (skipping the key) with a pointer
return &keyref[key].at(1);
}
void key_value_sequences::insert(int key, int value) {
//checks if key is valid and if the count vector needs to be resized
if(key>=0 && keyref.size() < static_cast<unsigned int>(key+1)) {
keyref.resize(key+1);
std::vector<int> val;
seq.push_back(val);
seq.back().push_back(key);
seq.back().push_back(value);
keyref[key] = seq.back();
}
//the index is already valid
else if(key >=0) keyref[key].push_back(value);
}
#endif // A2_HPP
This is a2.cpp, this just tests the functionality of a2.hpp, this code cannot be changed
// DO NOT EDIT THIS FILE !!!
// YOUR CODE MUST BE CONTAINED IN a2.hpp ONLY
#include <iostream>
#include "a2.hpp"
int main(int argc, char* argv[]) {
key_value_sequences A;
{
key_value_sequences T;
// k will be our key
for (int k = 0; k < 10; ++k) { //the actual tests will have way more than 10 sequences.
// v is our value
// here we are creating 10 sequences:
// key = 0, sequence = (0)
// key = 1, sequence = (0 1)
// key = 2, sequence = (0 1 2)
// ...
// key = 9, sequence = (0 1 2 3 4 5 6 7 8 9)
for (int v = 0; v < k + 1; ++v) T.insert(k, v);
}
T = T;
key_value_sequences V = T;
A = V;
}
std::vector<int> ref;
if (A.size(-1) != 0) {
std::cout << "fail" << std::endl;
return -1;
}
for (int k = 0; k < 10; ++k) {
if (A.size(k) != k + 1) {
std::cout << "fail";
return -1;
} else {
ref.clear();
for (int v = 0; v < k + 1; ++v) ref.push_back(v);
if (!std::equal(ref.begin(), ref.end(), A.data(k))) {
std::cout << "fail 3 " << A.data(k) << " " << ref[k];
return -1;
}
}
}
std::cout << "pass" << std::endl;
return 0;
} // main
If anyone could help me improve my code's efficiency I would really appreciate it, thanks.
First, I'm not convinced your code is correct. In insert, if the key is valid you create a new vector and insert it into seq. That sounds wrong, since it should only happen for a new key, but if your tests pass it might be fine.
Performance wise:
Avoid std::list. Linked lists perform poorly on today's hardware because they defeat pipelining, caching, and prefetching. Prefer std::vector by default. If the payload is really big and you are worried about copies, use std::vector<std::unique_ptr<T>>.
Try to avoid copying vectors. In your code you have keyref[key] = seq.back() which copies the vector, but should be fine since it's only one element.
Otherwise there's no obvious performance problems. Try to benchmark and profile your program and see where the slow parts are. Usually there's one or two places that you need to optimize and get great performance. If it's still too slow, ask another question where you post your results so that we can better understand the problem.
I will join Sorin in saying don't use std::list if avoidable.
So you use key as a direct index. Where does it say that it is non-negative? Where does it say that it's less than 100000000?
void key_value_sequences::insert(int key, int value) {
//checks if key is valid and if the count vector needs to be resized
if(key>=0 && keyref.size() < static_cast<unsigned int>(key+1)) {
keyref.resize(key+1); // could be large
std::vector<int> val; // don't need this temporary.
seq.push_back(val); // seq is useless?
seq.back().push_back(key);
seq.back().push_back(value);
keyref[key] = seq.back(); // we now have 100000000-1 empty indexes
}
//the index is already valid
else if(key >=0) keyref[key].push_back(value);
}
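Can it be done faster? Depending on your key range, yes it can. You will need to implement a flat_map or hash_map.
C++11 concept code for a flat_map version. keyref is kept sorted by key, and element 0 of each inner vector is the key.
// effectively a binary search; find_it must also be declared in the class
std::vector<std::vector<int>>::iterator key_value_sequences::find_it(int key) {
    return std::lower_bound(keyref.begin(), keyref.end(), key,
        [](const std::vector<int>& entry, int k) {
            return entry[0] < k; // key is the 0-element
        });
}
void key_value_sequences::insert(int key, int value) {
    auto found = find_it(key);
    // at the end or not found
    if (found == keyref.end() || found->front() != key) {
        // add an entry that holds only the key; note that emplace(found, key)
        // would instead create a vector of `key` zeros
        found = keyref.insert(found, std::vector<int>{key});
    }
    found->push_back(value); // update entry, whether new or old
}
const int* key_value_sequences::data(int key) const {
    // find_it is non-const, so repeat the binary search on the const container
    auto found = std::lower_bound(keyref.begin(), keyref.end(), key,
        [](const std::vector<int>& entry, int k) { return entry[0] < k; });
    // lower_bound returns the first entry with a key >= key, so compare the key too
    if (found == keyref.end() || found->front() != key)
        return nullptr;
    // skip the key stored at index 0
    return found->data() + 1;
}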
(hope I got that right ...)
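For comparison, the hash_map route mentioned above might look like the following sketch. It assumes that <unordered_map> is allowed, which the assignment may not permit, and the class name key_value_sequences_hashed and the helper check_key_value_sequences_hashed are made up here so that the snippet stands alone.

```cpp
#include <cassert>
#include <unordered_map>
#include <vector>

// Hypothetical hash-based variant: one vector of values per key.
class key_value_sequences_hashed {
public:
    // Size of the sequence for `key`, or 0 if the key is unknown.
    int size(int key) const {
        auto it = seqs_.find(key);
        return it == seqs_.end() ? 0 : static_cast<int>(it->second.size());
    }

    // Pointer to the first value for `key`, or nullptr if the key is unknown.
    const int* data(int key) const {
        auto it = seqs_.find(key);
        return it == seqs_.end() ? nullptr : it->second.data();
    }

    // Appends `value` to the sequence for `key`, creating it if needed.
    void insert(int key, int value) { seqs_[key].push_back(value); }

private:
    std::unordered_map<int, std::vector<int>> seqs_;
};

// Small self-check mirroring part of the test driver.
inline void check_key_value_sequences_hashed() {
    key_value_sequences_hashed t;
    for (int v = 0; v < 3; ++v) t.insert(2, v);
    assert(t.size(2) == 3);
    assert(t.size(-1) == 0);
    assert(t.data(7) == nullptr);
}
```

Negative and very large keys need no special handling here, because nothing is indexed by the key directly, and the compiler-generated copy operations cover the T = T, V = T, and A = V steps in the test.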

How to replace std::vector by linked list?

I have used std::vector for making my algorithm. I would like to replace the vectors by linked lists.
In order to do so, I was thinking of using the std::list, but I have no idea how to do this, for example I have tried following example for finding a value within a vector/list:
void find_values_in_vector(const std::vector<int>& input_vector, int value, int &rv1, int &rv2)
{
if (input_vector[0] >= value) { // too small
rv1 = 0; rv2 = 0; return;
}
int index = (int)input_vector.size() - 1;
if (input_vector[index] <= value) { // too big
rv1 = index; rv2 = index; return;
}
// somewhere inside
index = 0;
while (input_vector[index] <= value) {
index++;
}
rv1 = index - 1; rv2 = index; return;
}
void find_values_in_list(const std::list<int>& input_list, int value, int &rv1, int &rv2)
{
if (*input_list.begin() >= value) { // too small
rv1 = 0; rv2 = 0; return;
}
if (*input_list.end() <= value) { // too big
rv1 = (int)input_list.size() - 1; rv2 = (int)input_list.size() - 1; return;
}
// somewhere inside
int index = 0; int temp = *input_list.begin();
while (temp <= value) {
temp = *input_list.next(); index++;
}
rv1 = index - 1; rv2 = index; return;
}
This does not work, because the member function next() does not exist. However, I remember that browsing through a linked list is done by going to the beginning and moving to the next element until a certain point is reached. I have seen that this can be done with an iterator in a for loop, but I wonder what's wrong with my approach. I was under the impression that std::list is the standard implementation of a doubly linked list. Am I wrong, and if so, which std class implements a linked list (it does not need to be doubly linked)?
The standard way to iterate through containers is like this:
for(std::list<int>::iterator it = input_list.begin();
it != input_list.end();
it++)
{
....
}
This also works for vectors, maps, deques, etc. The iterator concept is implemented consistently throughout the STL, so it's best to get used to it.
There are also iterator operations like std::distance and std::advance etc. for the different types of iterators (I suggest you read up on them and their advantages/limitations)
If you have C++ 11 available you can also use this syntax (may not be useful for your problem though.)
for(const auto& value : input_list)
{
...
}
This also works for all STL containers.
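Applied to the function from the question, an iterator-based list version could look like the sketch below. It assumes the list is sorted in ascending order, as the original vector code implies, and it uses back() for the last element because dereferencing end() is undefined behavior. The handling of an empty list is an addition that the question does not cover.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <list>

// Sketch: same results as find_values_in_vector, but for a sorted std::list.
void find_values_in_list(const std::list<int>& input_list, int value,
                         int& rv1, int& rv2) {
    assert(std::is_sorted(input_list.begin(), input_list.end()));
    rv1 = rv2 = 0;
    if (input_list.empty() || input_list.front() >= value) { // too small
        return;
    }
    if (input_list.back() <= value) { // too big
        rv1 = rv2 = static_cast<int>(input_list.size()) - 1;
        return;
    }
    // somewhere inside: the loop must stop because back() > value
    std::list<int>::const_iterator it = input_list.begin();
    while (*it <= value) {
        ++it; // this is the "next" step the question was looking for
    }
    rv2 = static_cast<int>(std::distance(input_list.begin(), it));
    rv1 = rv2 - 1;
}
```

Note that std::distance walks the list again, so the function stays linear in the list length; a list has no constant-time indexing.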
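This should work for vector, list, deque, and set (assuming the contents are sorted).
template <class T>
void find_values_in_container(const T& container, int value, int &rv1, int &rv2)
{
    rv1 = rv2 = 0; // Initialize; this is also the "too small" result
    // *begin() instead of front(), because std::set has no front()
    if (container.empty() || *container.begin() >= value)
    {
        return;
    }
    int index = 0;
    for (const auto& v : container)
    {
        if (v > value)
        {
            rv1 = index - 1; rv2 = index; // value lies between these two positions
            return;
        }
        index++;
    }
    rv1 = rv2 = index - 1; // too big: no element is greater than value
}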

Adding an element to a Vector while iterating over it

As the title says, I want to add an element to a std::vector in certain cases while iterating through the vector. With the following code, I'm getting an error "Debug assertion failed". Is it possible to achieve what I want to do?
This is the code I have tested:
#include <vector>
class MyClass
{
public:
MyClass(char t_name)
{
name = t_name;
}
~MyClass()
{
}
char name;
};
int main()
{
std::vector<MyClass> myVector;
myVector.push_back(MyClass('1'));
myVector.push_back(MyClass('2'));
myVector.push_back(MyClass('3'));
for each (MyClass t_class in myVector)
{
if (t_class.name == '2')
myVector.push_back(MyClass('4'));
}
return 0;
}
EDIT:
Well, I thought for each was standard C++, but it seems that it's a Visual Studio feature:
for each, in
Visual c++ "for each" portability
The act of adding or removing an item from a std::vector invalidates existing iterators. So you cannot use any kind of loop that relies on iterators, such as for each, in, range-based for, std::for_each(), etc. You will have to loop using indexes instead, eg:
int main()
{
std::vector<MyClass> myVector;
myVector.push_back('1');
myVector.push_back('2');
myVector.push_back('3');
std::vector<MyClass>::size_type size = myVector.size();
for (std::vector<MyClass>::size_type i = 0; i < size; ++i)
{
if (myVector[i].name == '2')
{
myVector.push_back('4');
++size; // <-- remove this if you want to stop when you reach the new items
}
}
return 0;
}
As pointed out by pyon, inserting elements into a vector while iterating over it (via iterators) doesn't work, because inserting can invalidate the iterators. However, it seems you only want to push elements at the back of the vector. This can be done without iterators, but you should be careful with the stop condition; the loop below stops at the original size, so newly added elements are not visited:
std::vector<MyClass> myVector;
size_t old_size = myVector.size();
for (size_t i = 0; i < old_size; i++) {
if (myVector[i].name == '2') { myVector.push_back(MyClass('4')); }
}
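A self-contained variant of the index-based loop is sketched below. The Item struct and the function append_after_two are hypothetical stand-ins for MyClass and the loop from the question. Each element's name is copied before any push_back, because a reallocation would leave a reference into the vector dangling.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct Item {
    char name;
};

// Appends a '4' item for every '2' item that was present on entry and
// returns the names of all items afterwards.
std::string append_after_two(std::vector<Item>& items) {
    const std::size_t old_size = items.size(); // newly added items are not visited
    for (std::size_t i = 0; i < old_size; ++i) {
        assert(i < items.size());        // the vector only grows inside this loop
        const char name = items[i].name; // copy first; push_back may reallocate
        if (name == '2') {
            items.push_back(Item{'4'});
        }
    }
    std::string names;
    for (const Item& item : items) {
        names += item.name;
    }
    return names;
}
```

The range-based for at the end is safe because nothing is added to the vector while it runs.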
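Building on the previous answers, you can bind each element to const auto& or auto& to keep the loop body clean; the compiler should optimize the reference away in a release build. Take the reference anew in every iteration and do not use it after a push_back, because a reallocation leaves it dangling. (do_stuff() below stands for whatever you do with the element.)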
std::vector<MyClass> myVector;
std::vector<MyClass>::size_type size = myVector.size();
for (std::vector<MyClass>::size_type i = 0; i < size; ++i)
{
const auto& element = myVector[i];
element.do_stuff();
}

Storing Iterators of std::deque

I am trying to store iterators of a deque in a vector and want to preserve them inside the vector even when I have erased or inserted some elements from or into the deque. Is this possible?
I have the following code:
typedef struct {
int id;
int seedId;
double similarity;
} NODE_SEED_SIM;
typedef std::deque<NODE_SEED_SIM> NodesQueue;
typedef std::deque<NODE_SEED_SIM>::iterator ITRTR;
typedef std::vector<const ITRTR> PointerVec;
void growSegments (CSG_PointCloud *pResult, IntMatrix *WdIndices, NodesQueue *NodesList, IntMatrix *Segments) {
ITRTR nodeslistStart = (*NodesList).begin();
int pointCount = (*WdIndices).size();
int nodeslistSize = (*NodesList).size();
IntVector book(pointCount);
PointerVec pointerList (pointCount); // Vector of ITRTRs
for (int i = 0; i < nodeslistSize; i++) {
book [ (*NodesList)[i].id ] = 1;
pointerList [ (*NodesList)[i].id ] = nodeslistStart + i; // REF: 2
}
while (nodeslistSize > 0) {
int i = 0;
int returnCode = 0;
int nodeId = (*NodesList)[i].id;
int seedId = (*NodesList)[i].seedId;
int n_nbrOfNode = (*WdIndices)[ nodeId ].size();
(*Segments)[ seedId ].push_back ( nodeId );
(*NodesList).erase ( (*NodesList).begin() ); // REF: 3; This erase DOES NOT mess the pointerList
nodeslistSize --;
Point node;
/*
GET ATTRIBUTES OF NODE
*/
for (int j = 0; j < n_nbrOfNode; j++) {
int nborId = (*WdIndices)[nodeId][j];
if (nborId == seedId)
continue;
Point neighbor;
/*
GET ATTRIBUTES OF NEIGHBOUR
*/
double node_nbor_sim = computeSimilarity (neighbor, node);
if (book[nborId] == 1) {
ITRTR curr_node = pointerList[nborId]; // REF: 1
if ( curr_node -> similarity < node_nbor_sim) {
curr_node -> similarity = node_nbor_sim;
NODE_SEED_SIM temp = *curr_node;
(*NodesList).erase (curr_node); // REF: 4; This erase completely messes up the pointerList
returnCode = insertSortNodesList (&temp, NodesList, -1);
}
}
}
}
}
The nodes in the NodesList hold a global ID inside them. However they are stored in NodesList, not according to this global ID but in descending order of their "similarity". So later when I want to get the node from NodesList corresponding to a global ID (nborID in code)[REF: 1] I do it via the "pointerList" where I have previously stored the iterators of the deque but according to the global IDs of the nodes [REF: 2]. My pointerList stays true after the first erase command [REF: 3], but gets messed up in the next erase [REF: 4].
What is wrong here?
Thanks
I am trying to store iterators of a deque in a vector and want to preserve them inside the vector even when I have erased or inserted some elements from or into the deque. Is this possible?
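The documentation for std::deque gives these iterator invalidation rules (summarized here as text):
erase at the front: invalidates only iterators and references to the erased element.
erase at the back: invalidates iterators and references to the erased element, and the past-the-end iterator.
erase in the middle: invalidates all iterators and references to elements of the deque.
insert at either end: invalidates all iterators, but not references to elements.
insert in the middle: invalidates all iterators and references to elements of the deque.
This is why the erase at [REF: 3], which removes the first element, leaves pointerList usable, while the erase at [REF: 4], which removes an element from the middle, does not.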
So the short answer is: NO! I'm afraid you cannot safely store iterators pointing to certain elements stored in a std::deque, while it's changed elsewhere.
Some other relevant Q&A:
Problem with invalidation of STL iterators when calling erase
C++ deque: when iterators are invalidated
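One possible way around this, sketched under the assumption that random access into the queue is not essential, is to keep the nodes in a std::list. List iterators stay valid until the element they point to is erased, and splice can move a node to a new position without invalidating anything. The Node struct and the function raise_similarity are simplified, hypothetical stand-ins for NODE_SEED_SIM and the erase plus insertSortNodesList step.

```cpp
#include <cassert>
#include <list>
#include <vector>

struct Node {
    int id;
    double similarity;
};

typedef std::list<Node> NodeList;

// Updates a node's similarity and moves it so that `nodes` stays sorted in
// descending order of similarity. No iterator is invalidated.
void raise_similarity(NodeList& nodes, NodeList::iterator node, double similarity) {
    assert(node != nodes.end());
    node->similarity = similarity;
    // Find the first other element that must come after `node`.
    NodeList::iterator pos = nodes.begin();
    while (pos != nodes.end() && (pos == node || pos->similarity >= similarity)) {
        ++pos;
    }
    nodes.splice(pos, nodes, node); // relinks the node in place of erase + insert
}
```

A vector of NodeList::iterator indexed by node ID, like pointerList in the question, then remains usable after both the pop at the front and the repositioning. The linear search for the new position is the price of giving up random access.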
If you want to create a vector of iterators then this is what you want to do:
#include<iostream>
#include <deque>
#include <vector>
using namespace std;
int main()
{
deque<int> dq { 1, 2, 3, 4, 5, 6, 7, 8 };
deque<int>::iterator it;
vector<deque<int>::iterator> vit;
for (it = dq.begin(); it != dq.end(); it++)
{
vit.push_back(it);
cout << *it;
}
// Note: each stored iterator refers to an element of the deque, so if you modify that element, the value seen through the iterator changes too.
// If you erase the element, the iterator becomes invalid and dereferencing it is undefined behavior (often a segmentation fault).
//--- Don't Do: while (!dq.empty()) { dq.pop_back(); } ---//
for (size_t i = 0; i < vit.size(); i++)
{
cout << *vit[i];
}
system("pause");
return 0;
}
However, if you want to keep an element's value after it has been changed or erased in the deque, store a copy of the element itself (for example in a vector<int>) rather than an iterator to it; a copied iterator is invalidated exactly like the original.