Boost R-tree : unable to remove values? - c++

First, my code:
// Reads the index
bi::managed_mapped_file file(bi::open_or_create, indexFile.c_str(), bf::file_size(indexFile.c_str()));
allocator_t alloc(file.get_segment_manager());
rtree_t * rtree_ptr = file.find_or_construct<rtree_t>("rtree")(params_t(), indexable_t(), equal_to_t(), alloc);
std::cout << "The index contains " << rtree_ptr->size() << " entries." << std::endl;
std::ifstream inf(transFile.c_str());
std::string line;
while(getline(inf,line))
{
transition t = transition(line);
point A;
A.set<0>(t.getQ1mz()-1);
A.set<1>(t.getQ3mz()-1);
A.set<2>(0.3);
A.set<3>(0.2);
value_t v = std::make_pair(A,t);
rtree_ptr->insert(v);
rtree_ptr->remove(v);
}
std::cout << "Finished. The index now contains " << rtree_ptr->size() << " entries." << std::endl;
It reads the R-tree from a memory-mapped file. Then it reads an input file, transFile, makes ten so-called "transition" objects from its contents, and inserts them into the tree. Immediately afterwards, it removes them. This is a useless case, but it illustrates the problem well: the removal step doesn't work. The output I get is:
The index contains 339569462 entries.
Finished. The index now contains 339569472 entries.
So clearly the size of the tree increases by ten, because the ten insertions worked like a charm; but if the removals were working, the tree should end up the same size as before, which is not the case.
I have followed the syntax for removing values from an R-tree described here, and it all compiles properly, but for some strange reason it just doesn't remove the value. My guess is that, since it deletes by value, it might simply not find the value to delete. But how can that be, when the value is the one inserted just one line earlier?


list<pair<float,float>> iterating through a list that holds pairs?

As a part of runtime analysis I've got a small game that after calculating every Frame puts a new element in this list:
typedef std::list<std::pair<float, float>> PairList;
PairList Frames; //in pair: index 0 = elapsed time, index 1 = frames
I decided to use a list, because while playing I do not need to process data held in the list and I think lists are the fastest containers when it comes to only adding or deleting items. As a next step I want to write the frames to an external txt file.
The txt file is later used to draw a graph.
void WriteStats(PairList &pairList)
{
// open a file in write mode.
std::ofstream outfile;
outfile.open("afile.dat");
PairList::iterator itBegin = pairList.begin();
PairList::iterator itEnd = pairList.end();
for (auto it = itBegin; it != itEnd; ++it)
{
outfile << *it.first << "\t" << *it.second;
}
outfile.close();
}
With normal lists, dereferencing "it" should return the item, right?
Except Visual Studio says pair<float, float>* does not have a member called first.
How should I do it, then, when access via my iterator does not work? Is it because I pass in a reference to the list?
*it.first is parsed as *(it.first).
You need (*it).first or, better yet, it->first.
Or, even better, use a range-based for:
for (auto& elem : pairList)
{
float a = elem.first;
}
I decided to use a list, because [...] I think lists are the fastest containers when it comes to only adding or deleting items.
The first go-to container should be std::vector. In practice, thanks to cache locality, it often outperforms std::list even on algorithms that on paper should be faster with std::list. So I would test your theory with a good old benchmark if performance is a concern.
The issue is one of operator precedence. Specifically, the member access operator '.' has higher precedence than indirection '*' so *it.first is effectively parsed as...
*(it.first)
Hence the error. Instead use...
it->first
Use a range-based for loop instead of messing with iterators:
void WriteStats(const PairList &pairList)
{
// open a file in write mode.
std::ofstream outfile("afile.dat");
for (const auto &elem : pairList) {
outfile << elem.first << "\t" << elem.second << '\n';
}
}

Build different number of vectors/maps during runtime, to insert into BST - C++

The problem: So earlier the requirement for this C++ program was to just deal with one file's input (Each line represents the average of 10 minutes of the weather detail, about 50k lines). The end user wanted to be able to find out the average of the weather attributes for: a) A specified month and year, b)Average for each month of a specified year, c) Total for each month of a specified year, and d) average of each month of a specified year outputted to a .csv file.
Example: (First 4 lines of input csv)
WAST,DP,Dta,Dts,EV,QFE,QFF,QNH,RF,RH,S,SR,ST1,ST2,ST3,ST4,Sx,T
1/01/2010 9:00,8,151,23,0.1,1014.6,1018.1,1018.2,0,43.3,7,813,23.6,27,26.9,25.4,10,20.98
1/01/2010 9:10,8.7,161,28,0.1,1014.6,1018.1,1018.2,0,44.4,6,845,23.7,26.9,26.9,25.4,10,21.37
1/01/2010 9:20,8.9,176,21,0.2,1014.6,1018.1,1018.2,0,43.4,6,877,23.8,26.9,26.9,25.4,9,21.96
Solution: As not all the data from each line was required, upon reading in each line, they're parsed, segregated and built into an object instance of 'Weather', which consists of:
Date m_dateObj;
Time m_timeObj;
float m_windSpeed;
float m_solarRadiation;
float m_airTemperature;
A vector of Weather object was made to host this information.
Now the problem has expanded to multiple files (150K-500K lines of data). Reading in multiple files is fine: all the data is retrieved and converted to Weather objects with no problems. I'm just having trouble with the design (more specifically the syntax aspect of it; I know what I want to do). Additionally, there is a new option where the user enters dd/mm/yy and the instances of highest solarRadiation for that day are output. (This requires me to have access to each specific Weather object, so I can't just store aggregates.)
BST and Maps are mandatory, so what I thought was: data is read in line by line; each line is converted into a Weather object and stored in a vector specifically for that month+year, so for every month of every year there will be a different vector, e.g. jan2007, feb2007, jan2008, etc., and each of these vectors is stored in a map:
map<pair<int, int>, vector<Weather> > monthMap;
So it looks like
<pair<3,2007>, march2007Vec>
and store these maps in the BST (which I would need to randomize, since it's sorted data, to avoid turning my BST into a linked list; any tips on how to do that? I found snippets for self-balancing trees that I might implement). This should work because the keys of all maps are unique, thus making all BST nodes unique.
So it would look like this -
User runs program
Program opens files (there is a txt file with file names in it)
For each file
Open file
For each line
Convert into weather Object
Check month+year,
if map for combination exists,
add to that vector (eg march2007)
else
create new vector store in new map
Close file
add all maps to BST
BST will self sort
Provide user with menu to choose from
The actual computation of what the user needs is pretty simple, I just need help figuring out how to actually make it so there are n numbers of maps and vectors (n = number of maps = number of vectors, I think), as I don't know how many months/years there will be.
Here's a snippet of my code to give a better understanding of what I'm trying to do:
int main()
{
vector<Weather> monthVec;
map<pair<int, int>, vector<Weather> > monthMap;
map<pair<int, int>, vector<Weather> >::iterator itr;
int count = 0;
bool found = false;
Weather weatherObj;
ifstream weatherFileList;
weatherFileList.open("data/met_index.txt");
if(weatherFileList.is_open())
{
cout << "Success";
while (!weatherFileList.eof())
{
string data;
string fileName;
getline(weatherFileList, fileName);
cout << fileName << endl;
fileName = "data/" + fileName;
cout << fileName << endl;
ifstream weatherFile;
weatherFile.open(fileName.c_str());
getline(weatherFile, data);
while (!weatherFile.eof())
{
getline(weatherFile, data);
if (!data.empty())
{
weatherObj = ConvertData(data);
//cout << count << " " << weatherObj.GetTime().ToString() << endl;
//monthVec.push_back(weatherObj);
// for (itr = monthMap.begin(); itr != monthMap.end(); ++itr)
// {
//
// }
int month = weatherObj.GetDate().GetMonth();
int year = weatherObj.GetDate().GetYear();
itr = monthMap.find(make_pair(month,year));
if(itr != monthMap.end())
{
monthVec = itr->second;
monthVec.push_back(weatherObj);
}
else
{
}
count++;
}
//cout << data << endl;
}
weatherFile.close();
}
listOptions();
}
else
{
cout << "Not open";
}
cout << count << endl;
cout << monthVec.size() << "/" << monthVec.capacity();
return 0;
}
Apologies for the untidy code. I was thinking about how to make it so that for every new combination there's a new vector placed in a new map, but because of my inexperience, I don't know how to write it or even how to search for it well.
TLDR: Need to map an unknown number of combinations of <pair<int, int>, VectorOfObject>
Would one make a switch case and have 12 hardcoded vectors, one for each month, and just store all February (2007, 2008, 2009, etc.) details in one of them? That would mean a lot of unnecessary processing.
How would one create different vectors without actually giving them a unique name for reference in the code (<3,2007>, March2007)?
How would one retrieve the contents of a vector inside a map when we don't know its name? Sure, we know the key is 03 2007, aka March 2007, but wouldn't we need an explicit name to open the vector (march2007.find())?
Thanks for the read, and potential help!
Please do Direct Message me if you'd like to see the problem in more detail, I would be grateful!

How to chain delete pairs from a vector in C++?

I have this text file where I am reading each line into a std::vector<std::pair>,
handgun bullets
bullets ore
bombs ore
turret bullets
The first item depends on the second item. And I am writing a delete function where, when the user inputs an item name, it deletes the pair containing the item as second item. Since there is a dependency relationship, the item depending on the deleted item should also be deleted since it is no longer usable. For example, if I delete ore, bullets and bombs can no longer be usable because ore is unavailable. Consequently, handgun and turret should also be removed since those pairs are dependent on bullets which is dependent on ore i.e. indirect dependency on ore. This chain should continue until all dependent pairs are deleted.
I tried to do this for the current example and came with the following pseudo code,
for vector_iterator_1 = vector.begin to vector.end
{
if user_input == vector_iterator_1->second
{
for vector_iterator_2 = vector.begin to vector.end
{
if vector_iterator_1->first == vector_iterator_2->second
{
delete pair_of_vector_iterator_2
}
}
delete pair_of_vector_iterator_1
}
}
Not a very good algorithm, but it explains what I intend to do. In the example, if I delete ore, then bullets and bombs get deleted too. Subsequently, pairs depending on ore and bullets will also be deleted (bombs has no dependents). Since there is only a single chain of length one (ore-->bullets), there is only one nested for loop to check for it. However, a single chain may contain zero or a large number of dependencies, which would require anywhere from no nested for loops to many. So this is not a very practical solution. How would I do this with a chain of dependencies of variable length? Please tell me. Thank you for your patience.
P. S. : If you didn't understand my question, please let me know.
One (naive) solution:
Create a queue of items-to-delete
Add in your first item (user-entered)
While(!empty(items-to-delete)) loop through your vector
Every time you find your current item as the second-item in your list, add the first-item to your queue and then delete that pair
Easy optimizations:
Ensure you never add an item to the queue twice (hash table/etc)
Personally, I would just use the standard library for removal:
vector.erase(remove_if(vector.begin(), vector.end(), [](const pair<string,string>& pair){ return pair.second == "ore"; }), vector.end());
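remove_if() returns an iterator to the new logical end of the range; everything from there to end() is what gets erased, but those elements are left in an unspecified state, so record the .first values before removing. You could have a function that takes in a .second value to erase, erases the matching pairs and returns the .first values of the pairs being erased. From there, you could loop until nothing is removed.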
For your solution, it might be simpler to use find_if inside a loop, but either way, the standard library has some useful things you could use here.
I couldn't resist writing a solution using standard algorithms and data structures from the C++ standard library. I'm using a std::set to remember which objects we delete (I prefer it since it has logarithmic lookup and does not contain duplicates). The algorithm is basically the same as the one proposed by @Beth Crane.
#include <iostream>
#include <vector>
#include <utility>
#include <algorithm>
#include <string>
#include <set>
int main()
{
std::vector<std::pair<std::string, std::string>> v
{ {"handgun", "bullets"},
{"bullets", "ore"},
{"bombs", "ore"},
{"turret", "bullets"}};
std::cout << "Initially: " << std::endl << std::endl;
for (auto && elem : v)
std::cout << elem.first << " " << elem.second << std::endl;
// let's remove "bullets"; this set is our "queue"
std::set<std::string> to_remove{"bullets"}; // unique elements
while (!to_remove.empty()) // loop as long we still have elements to remove
{
// "pop" an element, then remove it via erase-remove idiom
// and a bit of lambdas
std::string obj = *to_remove.begin();
v.erase(
std::remove_if(v.begin(), v.end(),
[&to_remove](const std::pair<const std::string,
const std::string>& elem)->bool
{
// is it on the first position?
if (to_remove.find(elem.first) != to_remove.end())
{
return true;
}
// is it in the queue?
if (to_remove.find(elem.second) != to_remove.end())
{
// add the first element in the queue
to_remove.insert(elem.first);
return true;
}
return false;
}
),
v.end()
);
to_remove.erase(obj); // delete it from the queue once we're done with it
}
std::cout << std::endl << "Finally: " << std::endl << std::endl;
for (auto && elem : v)
std::cout << elem.first << " " << elem.second << std::endl;
}
@vsoftco I looked at Beth's answer and went off to try the solution. I did not see your code until I came back. On closer examination of your code, I see that we have done pretty much the same thing. Here's what I did,
std::string Node;
std::cout << "Enter Node to delete: ";
std::cin >> Node;
std::queue<std::string> Deleted_Nodes;
Deleted_Nodes.push(Node);
while(!Deleted_Nodes.empty())
{
std::vector<std::pair<std::string, std::string>>::iterator Current_Iterator = Pair_Vector.begin(), Temporary_Iterator;
while(Current_Iterator != Pair_Vector.end())
{
Temporary_Iterator = Current_Iterator;
Temporary_Iterator++;
if(Deleted_Nodes.front() == Current_Iterator->second)
{
Deleted_Nodes.push(Current_Iterator->first);
Pair_Vector.erase(Current_Iterator);
}
else if(Deleted_Nodes.front() == Current_Iterator->first)
{
Pair_Vector.erase(Current_Iterator);
}
Current_Iterator = Temporary_Iterator;
}
Deleted_Nodes.pop();
}
To answer your question in the comment of my question, that's what the else if statement is for. It's supposed to be a directed graph so it removes only next level elements in the chain. Higher level elements are not touched.
1 --> 2 --> 3 --> 4 --> 5
Remove 5: 1 --> 2 --> 3 --> 4
Remove 3: 1 --> 2 4 5
Remove 1: 2 3 4 5
Although my code is similar to yours, I am no expert in C++ (yet). Tell me if I made any mistakes or overlooked anything. Thanks. :-)

Finding the intersection of two vectors of strings

I have two vectors of strings and want to find the strings which are present in both, filling a third vector with the common elements. EDIT: I've added the complete code listing with the respective output so that things are clear.
std::cout << "size " << m_HLTMap->size() << std::endl;
/// Vector to store the wanted, present and found triggers
std::vector<std::string> wantedTriggers;
wantedTriggers.push_back("L2_xe25");
wantedTriggers.push_back("L2_vtxbeamspot_FSTracks_L2Star_A");
std::vector<std::string> allTriggers;
// Push all the trigger names to a vector
std::map<std::string, int>::iterator itr = m_HLTMap->begin();
std::map<std::string, int>::iterator itrLast = m_HLTMap->end();
for(;itr!=itrLast;++itr)
{
allTriggers.push_back((*itr).first);
}; // End itr
/// Sort the list of trigger names and find the intersection
/// Build a typdef to make things clearer
std::vector<std::string>::iterator wFirst = wantedTriggers.begin();
std::vector<std::string>::iterator wLast = wantedTriggers.end();
std::vector<std::string>::iterator aFirst = allTriggers.begin();
std::vector<std::string>::iterator aLast = allTriggers.end();
std::vector<std::string> foundTriggers;
for(;aFirst!=aLast;++aFirst)
{
std::cout << "Found:" << (*aFirst) << std::endl;
};
std::vector<std::string>::iterator it;
std::sort(wFirst, wLast);
std::sort(aFirst, aLast);
std::set_intersection(wFirst, wLast, aFirst, aLast, back_inserter(foundTriggers));
std::cout << "Found this many triggers: " << foundTriggers.size() << std::endl;
for(it=foundTriggers.begin();it!=foundTriggers.end();++it)
{
std::cout << "Found in both" << (*it) << std::endl;
}; // End for intersection
Here is the partial output; there are over 1000 elements in the vector, so I didn't include the full output:
Found:L2_te1400
Found:L2_te1600
Found:L2_te600
Found:L2_trk16_Central_Tau_IDCalib
Found:L2_trk16_Fwd_Tau_IDCalib
Found:L2_trk29_Central_Tau_IDCalib
Found:L2_trk29_Fwd_Tau_IDCalib
Found:L2_trk9_Central_Tau_IDCalib
Found:L2_trk9_Fwd_Tau_IDCalib
Found:L2_vtxbeamspot_FSTracks_L2Star_A
Found:L2_vtxbeamspot_FSTracks_L2Star_B
Found:L2_vtxbeamspot_activeTE_L2Star_A_peb
Found:L2_vtxbeamspot_activeTE_L2Star_B_peb
Found:L2_vtxbeamspot_allTE_L2Star_A_peb
Found:L2_vtxbeamspot_allTE_L2Star_B_peb
Found:L2_xe25
Found:L2_xe35
Found:L2_xe40
Found:L2_xe45
Found:L2_xe45T
Found:L2_xe55
Found:L2_xe55T
Found:L2_xe55_LArNoiseBurst
Found:L2_xe65
Found:L2_xe65_tight
Found:L2_xe75
Found:L2_xe90
Found:L2_xe90_tight
Found:L2_xe_NoCut_allL1
Found:L2_xs15
Found:L2_xs30
Found:L2_xs45
Found:L2_xs50
Found:L2_xs60
Found:L2_xs65
Found:L2_zerobias_NoAlg
Found:L2_zerobias_Overlay_NoAlg
Found this many triggers: 0
Possible Reason
I am starting to think that the way in which I compile my code is to blame. I am currently compiling with ROOT (the physics data analysis framework) instead of doing a standalone compile. I get the feeling that it doesn't work all that well with the STL Algorithm library and that's the cause of the issue, especially given how many people seem to have the code working for them. I will try to do a stand-alone compilation and re-running.
Passing foundTriggers.begin(), with foundTriggers empty, as the output argument will not cause the output to be pushed onto foundTriggers. Instead, it will increment the iterator past the end of the vector without resizing it, randomly corrupting memory.
You want to use an insert iterator:
std::set_intersection(wFirst, wLast, aFirst, aLast,
std::back_inserter(foundTriggers));
UPDATE: As pointed out in the comments, the vector is resized to be at least large enough for the result, so your code should work. Note that you should use the iterator returned from set_intersection to indicate the end of the intersection - your code ignores it, so you will also iterate over the empty strings left at the end of the output.
Could you post a complete test case so that we can see whether the intersection is actually empty or not?
Your allTriggers vector is empty, after all. You never reset itr to the beginning of the map when you're filling it.
EDIT:
Actually, you never reset aFirst:
for(;aFirst!=aLast;++aFirst)
{
std::cout << "Found:" << (*aFirst) << std::endl;
};
// here aFirst == aLast
std::vector<std::string>::iterator it;
std::sort(wFirst, wLast);
std::sort(aFirst, aLast); // **** sorting empty range ****
std::set_intersection(wFirst, wLast, aFirst, aLast, back_inserter(foundTriggers));
// ^^^^^^^^^^^^^^
// ***** empty range *****
I hope you can now see why it is good practice to narrow down the scope of your variables.
You never use the return value of set_intersection. In this case you could use it to resize foundTriggers after set_intersection has returned, or as the upper limit of the for loop. Otherwise your code seems to work. Can we see a full compilable program and its actual output please?

Invalid heap error when trying to copy elements from a Map to a compatible priority Queue

My program makes a frequency map of characters (which I store in, surprise surprise, a Map). I am trying to copy each element from this map into a priority queue so that I can have a sorted copy of these values (I plan to make further use of the queue, which is why I am not sorting the map). But whenever I try to copy these values, the program executes fine for the first two or three iterations and fails on the fourth, citing an "Invalid heap" error.
I'm not sure how to proceed from here, so I am posting the code for the classes in question.
#include "srcFile.h"
#include <string>
#include <iostream>
srcFile::srcFile(std::string s_flName)
{
// Storing the file name
s_fileName= s_flName;
}
srcFile::srcFile()
{
// Default constructor (never to be used)
}
srcFile::~srcFile(void)
{
}
void srcFile::dispOverallMap ()
{
std::map<char,int>::iterator dispIterator;
dispIterator = map_charFreqDistribution.begin();
charElement *currentChar;
std::cout<<"\n Frequency distribution map \n";
while(dispIterator != map_charFreqDistribution.end())
{
std::cout<< "Character : " << (int)dispIterator->first << " Frequency : "<< dispIterator->second<<'\n';
currentChar = new charElement(dispIterator->first,dispIterator->second);
Q_freqDistribution.push(*currentChar);
dispIterator++;
// delete currentChar;
}
while(!Q_freqDistribution.empty())
{
std::cout<<'\n'<<"Queue Element : " << (int)Q_freqDistribution.top().ch_elementChar << " Frequency : " << Q_freqDistribution.top().i_frequency;
Q_freqDistribution.pop();
}
}
map_charFreqDistribution has already been populated, if I remove the line
Q_freqDistribution.push(*currentChar);
Then I can verify that the Map is indeed there.
Also, both the Q and its underlying vector use charElement as the template type; it's nothing except the character and its frequency, along with 2 pointers to facilitate tree generation (unused up to this point).
Adding the definition of charElement on request
#pragma once
class charElement
{
public:
// Holds the character for the element in question
char ch_elementChar;
// Holds the number of times the character appeared in the file
int i_frequency;
// Left pointer for tree
charElement* ptr_left;
// Right pointer for tree
charElement* ptr_right;
charElement(char,int);
charElement(void);
~charElement(void);
void operator=(charElement&);
};
class compareCharElt
{
public:
bool operator()(charElement &operand1,charElement &operand2)
{
// If the frequency of op1 < op2 then return true
if(operand1.i_frequency < operand2.i_frequency) return true;
// If the frequency of op1 > op2 then return false
if(operand1.i_frequency > operand2.i_frequency)return false;
// If the frequency of op1 == op2 then return false (equal elements must not be ordered before one another)
if(operand1.i_frequency == operand2.i_frequency)return false;
}
};
Definition of Map and Queue
// The map which holds the frequency distribution of all characters in the file
std::map<char,int> map_charFreqDistribution;
void dispOverallMap();
// Create Q which holds character elements
std::priority_queue<charElement,std::vector<charElement>,compareCharElt> Q_freqDistribution;
P.S.This may be a noob question, but Is there an easier way to post blocks of code , putting 4 spaces in front of huge code chunks doesn't seem all that efficient to me! Are pastebin links acceptable here ?
Your vector is reallocating and invalidating your pointers. You need to use a different data structure, or an index into the vector, instead of a raw pointer. When you insert elements into a vector, then pointers to the contents become invalid.
while(dispIterator != map_charFreqDistribution.end())
{
std::cout<< "Character : " << (int)dispIterator->first << " Frequency : "<< dispIterator->second<<'\n';
currentChar = new charElement(dispIterator->first,dispIterator->second);
Q_freqDistribution.push(*currentChar);
dispIterator++;
delete currentChar;
}
This completely throws people off, because it's very traditional for people to have huge problems when using new and delete directly, but there's actually no need for them whatsoever in this code, and everything is actually done by value.
You have two choices. Pick a structure (e.g. std::list) that does not invalidate pointers, or, allocate all charElements on the heap directly and use something like shared_ptr that cleans up for you.
currentChar = new charElement(dispIterator->first,dispIterator->second);
Q_freqDistribution.push(*currentChar);
dispIterator++;
delete currentChar;
In the above code, you create a new charElement object, push a copy of it into the queue, and then delete the original. After the delete only the queue's copy exists, so the new/delete pair buys you nothing. That's probably not what you want.