Find a specific node in a XML document with TinyXML - c++

I want to parse some data from an xml file with TinyXML.
Here's my text.xml file content:
<?xml version="1.0" encoding="iso-8859-1"?>
<toto>
<tutu>
<tata>
<user name="toto" pass="13" indice="1"/>
<user name="tata" pass="142" indice="2"/>
<user name="titi" pass="azerty" indice="1"/>
</tata>
</tutu>
</toto>
I want to access to the first element 'user'. The way to do this is the following :
TiXmlDocument doc("test.xml");
if (doc.LoadFile())
{
TiXmlNode *elem = doc.FirstChildElement()->FirstChildElement()->FirstChildElement()->FirstChildElement();
std::cout << elem->Value() << std::endl;
}
In output : user.
But the code is pretty ugly and not generic. I tried the code below to simulate the same behaviour than the code above but it doesn't work and an error occured.
TiXmlElement *getElementByName(TiXmlDocument &doc, std::string const &elemt_value)
{
TiXmlElement *elem = doc.FirstChildElement(); //Tree root
while (elem)
{
if (!std::string(elem->Value()).compare(elemt_value))
return (elem);
elem = elem->NextSiblingElement();
}
return (NULL);
}
Maybe I missed a special function in the library which can do this work (a getElementByName function). I just want to get a pointer to the element where the value is the one I'm looking for. Does anyone can help me? Thanks in advance for your help.

Try this
TiXmlElement * getElementByName(TiXmlDocument & doc, std::string const & elemt_value) {
TiXmlElement * elem = doc.RootElement(); //Tree root
while (elem) {
if (!std::string(elem - > Value()).compare(elemt_value)) return (elem);
/*elem = elem->NextSiblingElement();*/
if (elem - > FirstChildElement()) {
elem = elem - > FirstChildElement();
} else if (elem - > NextSiblingElement()) {
elem = elem - > NextSiblingElement();
} else {
while (!elem - > Parent() - > NextSiblingElement()) {
if (elem - > Parent() - > ToElement() == doc.RootElement()) {
return NULL;
}
elem = elem - > Parent() - > NextSiblingElement();
}
}
}
return (NULL);
}

Adi's answer didn't work when i just copy pasted it into my code,
but i modified it and now it works fine for me.
since i made quite a lot of changes i thought i should post my final code here.
void parseXML(tinyxml2::XMLDocument& xXmlDocument, std::string sSearchString, std::function<void(tinyxml2::XMLNode*)> fFoundSomeElement)
{
if ( xXmlDocument.ErrorID() != tinyxml2::XML_SUCCESS )
{
// XML file is not ok ... we throw some exception
throw DataReceiverException( "XML file parsing failed" );
} // if
//ispired by http://stackoverflow.com/questions/11921463/find-a-specific-node-in-a-xml-document-with-tinyxml
tinyxml2::XMLNode * xElem = xXmlDocument.FirstChild();
while(xElem)
{
if (xElem->Value() && !std::string(xElem->Value()).compare(sSearchString))
{
fFoundSomeElement(xElem);
}
/*
* We move through the XML tree following these rules (basically in-order tree walk):
*
* (1) if there is one or more child element(s) visit the first one
* else
* (2) if there is one or more next sibling element(s) visit the first one
* else
* (3) move to the parent until there is one or more next sibling elements
* (4) if we reach the end break the loop
*/
if (xElem->FirstChildElement()) //(1)
xElem = xElem->FirstChildElement();
else if (xElem->NextSiblingElement()) //(2)
xElem = xElem->NextSiblingElement();
else
{
while(xElem->Parent() && !xElem->Parent()->NextSiblingElement()) //(3)
xElem = xElem->Parent();
if(xElem->Parent() && xElem->Parent()->NextSiblingElement())
xElem = xElem->Parent()->NextSiblingElement();
else //(4)
break;
}//else
}//while
}
(for completeness) example how to call the function:
tinyxml2::XMLDocument xXmlDocument;
xXmlDocument.Parse(sXmlDocument.c_str());
parseXML(xXmlDocument, "user",[](tinyxml2::XMLNode* xElem)
{
int iPass;
xElem->QueryIntAttribute( "pass", &iPass );
std::cout << iPass << "\n";
});

You can also iterate through your XML elements one-by-one by using recursive function combined with lamda-function as a handler.
//
// This function will iterate through your XML tree and call the 'parseElement' function for each found element.
//
void RecursiveXMLParse(TiXmlElement* element, std::function<void(TiXmlElement*)>& parseElement)
{
if (element != nullptr)
{
parseElement(element);
auto child = element->FirstChildElement();
if (child != nullptr)
{
RecursiveXMLParse(child, parseElement);
}
for (auto sibling = element->NextSiblingElement(); sibling != nullptr; sibling = sibling->NextSiblingElement())
{
RecursiveXMLParse(sibling, parseElement);
}
}
}
Usage: Just pass the XML root element, and your data handler lambda function to the recursive Parser-function.
int main()
{
//
// Define your data handler labmda
//
std::function<void(TiXmlElement*)>parseElement = [&](TiXmlElement* e) -> void
{
if (std::string(elem->Value()).compare("user"))
{
// Parse your user data
}
};
// Pass the root element along with the above defined lambda to the recursive function
RecursiveXMLParse(doc.RootElement(), parseElement);
return 0;
}

XMLElement *getElementByName(XMLDocument &ele, std::string const &elemt_value)
{
XMLElement *elem = ele.FirstChildElement(); //Tree root
while (elem)
{
if (!std::string(elem->Value()).compare(elemt_value))
return elem;
if (elem->FirstChildElement())
{
elem = elem->FirstChildElement();
}
else if (elem->NextSiblingElement())
{
elem = elem->NextSiblingElement();
}
else
{
if (elem->Parent()->ToElement()->NextSiblingElement())
{
elem = elem->Parent()->ToElement()->NextSiblingElement();
}
else if (elem->Parent()->ToElement()->FirstChildElement()
&& strcmp(elem->Name(), elem->Parent()->ToElement()->FirstChildElement()->Name()))
{
elem = elem->Parent()->ToElement()->FirstChildElement();
}
else {
break;
}
}
}
return NULL;
}
// A small tweak in the above given solution

Related

C++ call and definition mismatch

I am looking at this code block at
https://github.com/pytorch/pytorch/blob/master/torch/csrc/autograd/profiler.cpp#L141
pushCallback(
[config](const RecordFunction& fn) {
auto* msg = (fn.seqNr() >= 0) ? ", seq = " : "";
if (config.report_input_shapes) {
std::vector<std::vector<int64_t>> inputSizes;
inputSizes.reserve(fn.inputs().size());
for (const c10::IValue& input : fn.inputs()) {
if (!input.isTensor()) {
inputSizes.emplace_back();
continue;
}
const at::Tensor& tensor = input.toTensor();
if (tensor.defined()) {
inputSizes.push_back(input.toTensor().sizes().vec());
} else {
inputSizes.emplace_back();
}
}
pushRangeImpl(fn.name(), msg, fn.seqNr(), std::move(inputSizes));
} else {
pushRangeImpl(fn.name(), msg, fn.seqNr(), {});
}
},
[](const RecordFunction& fn) {
if (fn.getThreadId() != 0) {
// If we've overridden the thread_id on the RecordFunction, then find
// the eventList that was created for the original thread_id. Then,
// record the end event on this list so that the block is added to
// the correct list, instead of to a new list. This should only run
// when calling RecordFunction::end() in a different thread.
if (state == ProfilerState::Disabled) {
return;
} else {
std::lock_guard<std::mutex> guard(all_event_lists_map_mutex);
const auto& eventListIter =
all_event_lists_map.find(fn.getThreadId());
TORCH_INTERNAL_ASSERT(
eventListIter != all_event_lists_map.end(),
"Did not find thread_id matching ",
fn.getThreadId());
auto& eventList = eventListIter->second;
eventList->record(
EventKind::PopRange,
StringView(""),
fn.getThreadId(),
state == ProfilerState::CUDA);
}
} else {
popRange();
}
},
config.report_input_shapes);
This only has three arguments. But the definition of pushCallback seems to be at this location
https://github.com/pytorch/pytorch/blob/master/torch/csrc/autograd/record_function.cpp#L35 and takes four parameters.
void pushCallback(
RecordFunctionCallback start,
RecordFunctionCallback end,
bool needs_inputs,
bool sampled) {
start_callbacks.push_back(std::move(start));
end_callbacks.push_back(std::move(end));
if (callback_needs_inputs > 0 || needs_inputs) {
++callback_needs_inputs;
}
is_callback_sampled.push_back(sampled);
if (sampled) {
++num_sampled_callbacks;
}
}
I don't know why that function call could work in that way.
If you look at the header you find that it is declared with 4 parameters, out of which the last three have defaults:
TORCH_API void pushCallback(
RecordFunctionCallback start,
RecordFunctionCallback end = [](const RecordFunction&){},
bool needs_inputs = false,
bool sampled = false);
Default arguments only appear on the declaration not on the definition.

Insert Node recursively in BTree using C++

Creating a template class to insert data recursively in BTree. The code gives segmentation error, I have included the getter and setter functions in header file, not sure where it is going wrong, please help to suggest, if I remove the check on Node->right==NULL, then the code works fine, but in that case I am unable to add recursion.
template <class S>
class STree
{
public:
// Constructors
STree()
{root = NULL; size=0;}
STree(S part)
{root = new BTNode<S>(part); size=0;}
// Destructor
~STree(){};
void insert(S part)
{
if (root == NULL)
{
root = new BTNode<S>(part);
cout<< "\n from root " << root->get_data();
}
else
{add(root,part);}
size++;
}
void add(BTNode<S>* node, S part)
{
S part2 = node->get_data();
cout << "part "<< part;
cout<< "part2 "<< node->get_data();
if( part == part2)
{node->set_data(part);}
else if (part > part2)
{
if (node->get_right()== NULL) //if I remove this check the segmentation error goes away
{node->set_right(new BTNode<S>(part)); cout<< "from right "<< node->right->get_data();}
else
{add(node->get_right(),part);} //If I remove the if statement above the recursion won't work
}
else
{
if (node->get_left()==NULL)
{ node->set_left(new BTNode<S>(part));}
else
{add(node->get_left(),part);}
}
}
private:
BTNode<S>* root;
BTNode<S>* current;
int size;
};
#endif
int main()
{
STree<int> trees;
trees.insert(10);
trees.insert(20);
trees.insert(30);
trees.insert(40);
return 0;
}
BS part2 = node->get_data() will access nullptr every time you insert. When implementing recursion, always take care of the base case first.
void add(BTNode<BS>* node, BS part)
{
if (part < node->get_data() && node->get_left() == nullptr)
node->set_left(new BTNode<BS>(part));
else if (part > node->get_data() && node->get_right() == nullptr)
node->set_right(new BTNode<BS>(part));
if (node->get_data() == part)
return;
else
{
if (part < node->get_data())
add(node->get_left(), part);
else
add(node->get_right(), part);
}
}

A segmentation fault?

I am currently working on this project for school, where I am implementing an algorithm to find a solution for the 'knight game' (it's about finding the shortest way from the top left corner of the board to the bottom right corner), but I've been getting this segmentation fault for three days now, I've checked every pointer I used and everything seems right.
I implemented " searching algorithms, bfs and dfs and ucs, the first two ones work fine, but ucs gives me the segmentation fault, even though they use the same thing except a popBest function.
Here are some pictures of the ucs and popBest function:
Item *popBest( list_t *list ) // and remove the best board from the list.
{
assert(list);
assert(list->numElements);
int min_f;
Item *item = list->first;
Item *best;
min_f = list->first->f;
while (item) {
if (item->f < min_f) {
min_f = item->f;
best = item;
}
item = item->next;
}
//item = onList(list, board);
delList(list, best);
return best;
}
void ucs(void)
{
Item *cur_node, *child_p, *temp;
while ( listCount(&openList_p) ) { /* While items are on the open list
printLt(openList_p );
/* Get the first item on the open list*/
cur_node = popBest(&openList_p);
//printf("%d %f\n", listCount(&openList_p), evaluateBoard( cur_node ));
printBoard(cur_node);
addFirst(&closedList_p, cur_node);
if ( evaluateBoard(cur_node) == 0.0 ) {
showSolution(cur_node);
printf("\nParcours en largeur (bfs)\n" );
return;
}
else {
for (int i = 0; i < MAX_BOARD; i++) {
child_p = getChildBoard( cur_node, i );
if (child_p != NULL) {
child_p->f = cur_node->f+1;
temp = onList(&openList_p, child_p->board);
if (temp ==NULL) addLast( &openList_p, temp);
else if (temp != NULL && child_p->f < temp->f )
{
delList(&openList_p, temp);
addLast( &openList_p, temp);
}
}
}
}
}
return;
}
All the functions work fine for bfs and dfs, the only difference is the popBest function.
You do list->first->f without checking whether list->first is the null pointer.
The cause of your problem is probably that best is potentially uninitialised after the loop, and it will definitely be if the first element in the list is "best".
Here is a safer version.
Item *popBest( list_t *list )
{
assert(list);
assert(list->numElements);
assert(list->first);
// Assume that the first element is best.
Item *best = list->first;
int min_f = best->f;
// Search from the second element (if it exists).
Item* item = best->next;
while (item) {
if (item->f < min_f) {
min_f = item->f;
best = item;
}
item = item->next;
}
delList(list, best);
return best;
}

Logic flaw in trie search

I'm currently working on a trie implementation for practice and have run into a mental roadbloack.
The issue is with my searching function. I am attempting to have my trie tree be able to retrieve a list of strings from a supplied prefix after they are loaded into the programs memory.
I also understand I could be using a queue/shouldnt use C functions in C++ ect.. This is just a 'rough draft' so to speak.
This is what I have so far:
bool SearchForStrings(vector<string> &output, string data)
{
Node *iter = GetLastNode("an");
Node *hold = iter;
stack<char> str;
while (hold->visited == false)
{
int index = GetNextChild(iter);
if (index > -1)
{
str.push(char('a' + index));
//current.push(iter);
iter = iter->next[index];
}
//We've hit a leaf so we want to unwind the stack and print the string
else if (index < 0 && IsLeaf(iter))
{
iter->visited = true;
string temp("");
stringstream ss;
while (str.size() > 0)
{
temp += str.top();
str.pop();
}
int i = 0;
for (std::string::reverse_iterator it = temp.rbegin(); it != temp.rend(); it++)
ss << *it;
//Store the string we have
output.push_back(data + ss.str());
//Move our iterator back to the root node
iter = hold;
}
//We know this isnt a leaf so we dont want to print out the stack
else
{
iter->visited = true;
iter = hold;
}
}
return (output.size() > 0);
}
int GetNextChild(Node *s)
{
for (int i = 0; i < 26; i++)
{
if (s->next[i] != nullptr && s->next[i]->visited == false)
return i;
}
return -1;
}
bool IsLeaf(Node *s)
{
for (int i = 0; i < 26; i++)
{
if (s->next[i] != nullptr)
return false;
}
return true;
}
struct Node{
int value;
Node *next[26];
bool visited;
};
The code is too long or i'd post it all, GetLastNode() retrieves the node at the end of the data passed in, so if the prefix was 'su' and the string was 'substring' the node would be pointing to the 'u' to use as an artificial root node
(might be completely wrong... just typed it here, no testing)
something like:
First of all, we need a way of indicating that a node represents an entry.
So let's have:
struct Node{
int value;
Node *next[26];
bool entry;
};
I've removed your visited flag because I don't have a use for it.
You should modify your insert/update/delete functions to support this flag. If the flag is true it means there's an actual entry up to that node.
Now we can modify the
bool isLeaf(Node *s) {
return s->entry;
}
Meaning that we consider a leaf when there's an entry... perhaps the name is wrong now, as the leaf might have childs ("y" node with "any" and "anywhere" is a leaf, but it has childs)
Now for the search:
First a public function that can be called.
bool searchForStrings(std::vector<string> &output, const std::string &key) {
// start the recursion
// theTrieRoot is the root node for the whole structure
return searchForString(theTrieRoot,output,key);
}
Then the internal function that will use for recursion.
bool searchForStrings(Node *node, std::vector<string> &output, const std::string &key) {
if(isLeaf(node->next[i])) {
// leaf node - add an empty string.
output.push_back(std::string());
}
if(key.empty()) {
// Key is empty, collect all child nodes.
for (int i = 0; i < 26; i++)
{
if (node->next[i] != nullptr) {
std::vector<std::string> partial;
searchForStrings(node->next[i],partial,key);
// so we got a list of the childs,
// add the key of this node to them.
for(auto s:partial) {
output.push_back(std::string('a'+i)+s)
}
}
} // end for
} // end if key.empty
else {
// key is not empty, try to get the node for the
// first character of the key.
int c=key[0]-'a';
if((c<0 || (c>26)) {
// first character was not a letter.
return false;
}
if(node->next[c]==nullptr) {
// no match (no node where we expect it)
return false;
}
// recurse into the node matching the key
std::vector<std::string> partial;
searchForStrings(node->next[c],partial,key.substr(1));
// add the key of this node to the result
for(auto s:partial) {
output.push_back(std::string(key[0])+s)
}
}
// provide a meaningful return value
if(output.empty()) {
return false;
} else {
return true;
}
}
And the execution for "an" search is.
Call searchForStrings(root,[],"an")
root is not leaf, key is not empty. Matched next node keyed by "a"
Call searchForStrings(node(a),[],"n")
node(a) is not leaf, key is not empty. Matched next node keyed by "n"
Call searchForStrings(node(n),[],"")
node(n) is not leaf, key is empty. Need to recurse on all not null childs:
Call searchForStrings(node(s),[],"")
node(s) is not leaf, key is empty, Need to recurse on all not null childs:
... eventually we will reach Node(r) which is a leaf node, so it will return an [""], going back it will get added ["r"] -> ["er"] -> ["wer"] -> ["swer"]
Call searchForStings(node(y),[],"")
node(y) is leaf (add "" to the output), key is empty,
recurse, we will get ["time"]
we will return ["","time"]
At this point we will add the "y" to get ["y","ytime"]
And here we will add the "n" to get ["nswer","ny","nytime"]
Adding the "a" to get ["answer","any","anytime"]
we're done

Algorithm for finding the maximum difference in an array of numbers

I have an array of a few million numbers.
double* const data = new double (3600000);
I need to iterate through the array and find the range (the largest value in the array minus the smallest value). However, there is a catch. I only want to find the range where the smallest and largest values are within 1,000 samples of each other.
So I need to find the maximum of: range(data + 0, data + 1000), range(data + 1, data + 1001), range(data + 2, data + 1002), ...., range(data + 3599000, data + 3600000).
I hope that makes sense. Basically I could do it like above, but I'm looking for a more efficient algorithm if one exists. I think the above algorithm is O(n), but I feel that it's possible to optimize. An idea I'm playing with is to keep track of the most recent maximum and minimum and how far back they are, then only backtrack when necessary.
I'll be coding this in C++, but a nice algorithm in pseudo code would be just fine. Also, if this number I'm trying to find has a name, I'd love to know what it is.
Thanks.
This type of question belongs to a branch of algorithms called streaming algorithms. It is the study of problems which require not only an O(n) solution but also need to work in a single pass over the data. the data is inputted as a stream to the algorithm, the algorithm can't save all of the data and then and then it is lost forever. the algorithm needs to get some answer about the data, such as for instance the minimum or the median.
Specifically you are looking for a maximum (or more commonly in literature - minimum) in a window over a stream.
Here's a presentation on an article that mentions this problem as a sub problem of what they are trying to get at. it might give you some ideas.
I think the outline of the solution is something like that - maintain the window over the stream where in each step one element is inserted to the window and one is removed from the other side (a sliding window). The items you actually keep in memory aren't all of the 1000 items in the window but a selected representatives which are going to be good candidates for being the minimum (or maximum).
read the article. it's abit complex but after 2-3 reads you can get the hang of it.
The algorithm you describe is really O(N), but i think the constant is too high. Another solution which looks reasonable is to use O(N*log(N)) algorithm the following way:
* create sorted container (std::multiset) of first 1000 numbers
* in loop (j=1, j<(3600000-1000); ++j)
- calculate range
- remove from the set number which is now irrelevant (i.e. in index *j - 1* of the array)
- add to set new relevant number (i.e. in index *j+1000-1* of the array)
I believe it should be faster, because the constant is much lower.
This is a good application of a min-queue - a queue (First-In, First-Out = FIFO) which can simultaneously keep track of the minimum element it contains, with amortized constant-time updates. Of course, a max-queue is basically the same thing.
Once you have this data structure in place, you can consider CurrentMax (of the past 1000 elements) minus CurrentMin, store that as the BestSoFar, and then push a new value and pop the old value, and check again. In this way, keep updating BestSoFar until the final value is the solution to your question. Each single step takes amortized constant time, so the whole thing is linear, and the implementation I know of has a good scalar constant (it's fast).
I don't know of any documentation on min-queue's - this is a data structure I came up with in collaboration with a coworker. You can implement it by internally tracking a binary tree of the least elements within each contiguous sub-sequence of your data. It simplifies the problem that you'll only pop data from one end of the structure.
If you're interested in more details, I can try to provide them. I was thinking of writing this data structure up as a paper for arxiv. Also note that Tarjan and others previously arrived at a more powerful min-deque structure that would work here, but the implementation is much more complex. You can google for "mindeque" to read about Tarjan et al.'s work.
std::multiset<double> range;
double currentmax = 0.0;
for (int i = 0; i < 3600000; ++i)
{
if (i >= 1000)
range.erase(range.find(data[i-1000]));
range.insert(data[i]);
if (i >= 999)
currentmax = max(currentmax, *range.rbegin());
}
Note untested code.
Edit: fixed off-by-one error.
read in the first 1000 numbers.
create a 1000 element linked list which tracks the current 1000 number.
create a 1000 element array of pointers to linked list nodes, 1-1 mapping.
sort the pointer array based on linked list node's values. This will rearrange the array but keep the linked list intact.
you can now calculate the range for the first 1000 numbers by examining the first and last element of the pointer array.
remove the first inserted element, which is either the head or the tail depending on how you made your linked list. Using the node's value perform a binary search on the pointer array to find the to-be-removed node's pointer, and shift the array one over to remove it.
add the 1001th element to the linked list, and insert a pointer to it in the correct position in the array, by performing one step of an insertion sort. This will keep the array sorted.
now you have the min. and max. value of the numbers between 1 and 1001, and can calculate the range using the first and last element of the pointer array.
it should now be obvious what you need to do for the rest of the array.
The algorithm should be O(n) since the delete and insertion is bounded by log(1e3) and everything else takes constant time.
I decided to see what the most efficient algorithm I could think of to solve this problem was using actual code and actual timings. I first created a simple solution, one that tracks the min/max for the previous n entries using a circular buffer, and a test harness to measure the speed. In the simple solution, each data value is compared against the set of min/max values, so that's about window_size * count tests (where window size in the original question is 1000 and count is 3600000).
I then thought about how to make it faster. First off, I created a solution that used a fifo queue to store window_size values and a linked list to store the values in ascending order where each node in the linked list was also a node in the queue. To process a data value, the item at the end of the fifo was removed from the linked list and the queue. The new value was added to the start of the queue and a linear search was used to find the position in the linked list. The min and max values could then be read from the start and end of the linked list. This was quick, but wouldn't scale well with increasing window_size (i.e. linearly).
So I decided to add a binary tree to the system to try to speed up the search part of the algorithm. The final timings for window_size = 1000 and count = 3600000 were:
Simple: 106875
Quite Complex: 1218
Complex: 1219
which was both expected and unexpected. Expected in that using a sorted linked list helped, unexpected in that the overhead of having a self balancing tree didn't offset the advantage of a quicker search. I tried the latter two with an increased window size and found that the were always nearly identical up to a window_size of 100000.
Which all goes to show that theorising about algorithms is one thing, implementing them is something else.
Anyway, for those that are interested, here's the code I wrote (there's quite a bit!):
Range.h:
#include <algorithm>
#include <iostream>
#include <ctime>
using namespace std;
// Callback types.
typedef void (*OutputCallback) (int min, int max);
typedef int (*GeneratorCallback) ();
// Declarations of the test functions.
clock_t Simple (int, int, GeneratorCallback, OutputCallback);
clock_t QuiteComplex (int, int, GeneratorCallback, OutputCallback);
clock_t Complex (int, int, GeneratorCallback, OutputCallback);
main.cpp:
#include "Range.h"
int
checksum;
// This callback is used to get data.
int CreateData ()
{
return rand ();
}
// This callback is used to output the results.
void OutputResults (int min, int max)
{
//cout << min << " - " << max << endl;
checksum += max - min;
}
// The program entry point.
void main ()
{
int
count = 3600000,
window = 1000;
srand (0);
checksum = 0;
std::cout << "Simple: Ticks = " << Simple (count, window, CreateData, OutputResults) << ", checksum = " << checksum << std::endl;
srand (0);
checksum = 0;
std::cout << "Quite Complex: Ticks = " << QuiteComplex (count, window, CreateData, OutputResults) << ", checksum = " << checksum << std::endl;
srand (0);
checksum = 0;
std::cout << "Complex: Ticks = " << Complex (count, window, CreateData, OutputResults) << ", checksum = " << checksum << std::endl;
}
Simple.cpp:
#include "Range.h"
// Function to actually process the data.
// A circular buffer of min/max values for the current window is filled
// and once full, the oldest min/max pair is sent to the output callback
// and replaced with the newest input value. Each value inputted is
// compared against all min/max pairs.
void ProcessData
(
int count,
int window,
GeneratorCallback input,
OutputCallback output,
int *min_buffer,
int *max_buffer
)
{
int
i;
for (i = 0 ; i < window ; ++i)
{
int
value = input ();
min_buffer [i] = max_buffer [i] = value;
for (int j = 0 ; j < i ; ++j)
{
min_buffer [j] = min (min_buffer [j], value);
max_buffer [j] = max (max_buffer [j], value);
}
}
for ( ; i < count ; ++i)
{
int
index = i % window;
output (min_buffer [index], max_buffer [index]);
int
value = input ();
min_buffer [index] = max_buffer [index] = value;
for (int k = (i + 1) % window ; k != index ; k = (k + 1) % window)
{
min_buffer [k] = min (min_buffer [k], value);
max_buffer [k] = max (max_buffer [k], value);
}
}
output (min_buffer [count % window], max_buffer [count % window]);
}
// A simple method of calculating the results.
// Memory management is done here outside of the timing portion.
clock_t Simple
(
int count,
int window,
GeneratorCallback input,
OutputCallback output
)
{
int
*min_buffer = new int [window],
*max_buffer = new int [window];
clock_t
start = clock ();
ProcessData (count, window, input, output, min_buffer, max_buffer);
clock_t
end = clock ();
delete [] max_buffer;
delete [] min_buffer;
return end - start;
}
QuiteComplex.cpp:
#include "Range.h"
template <class T>
class Range
{
private:
// Class Types
// Node Data
// Stores a value and its position in various lists.
struct Node
{
Node
*m_queue_next,
*m_list_greater,
*m_list_lower;
T
m_value;
};
public:
// Constructor
// Allocates memory for the node data and adds all the allocated
// nodes to the unused/free list of nodes.
Range
(
int window_size
) :
m_nodes (new Node [window_size]),
m_queue_tail (m_nodes),
m_queue_head (0),
m_list_min (0),
m_list_max (0),
m_free_list (m_nodes)
{
for (int i = 0 ; i < window_size - 1 ; ++i)
{
m_nodes [i].m_list_lower = &m_nodes [i + 1];
}
m_nodes [window_size - 1].m_list_lower = 0;
}
// Destructor
// Tidy up allocated data.
~Range ()
{
delete [] m_nodes;
}
// Function to add a new value into the data structure.
void AddValue
(
T value
)
{
Node
*node = GetNode ();
// clear links
node->m_queue_next = 0;
// set value of node
node->m_value = value;
// find place to add node into linked list
Node
*search;
for (search = m_list_max ; search ; search = search->m_list_lower)
{
if (search->m_value < value)
{
if (search->m_list_greater)
{
node->m_list_greater = search->m_list_greater;
search->m_list_greater->m_list_lower = node;
}
else
{
m_list_max = node;
}
node->m_list_lower = search;
search->m_list_greater = node;
}
}
if (!search)
{
m_list_min->m_list_lower = node;
node->m_list_greater = m_list_min;
m_list_min = node;
}
}
// Accessor to determine if the first output value is ready for use.
bool RangeAvailable ()
{
return !m_free_list;
}
// Accessor to get the minimum value of all values in the current window.
T Min ()
{
return m_list_min->m_value;
}
// Accessor to get the maximum value of all values in the current window.
T Max ()
{
return m_list_max->m_value;
}
private:
// Function to get a node to store a value into.
// This function gets nodes from one of two places:
// 1. From the unused/free list
// 2. From the end of the fifo queue, this requires removing the node from the list and tree
Node *GetNode ()
{
Node
*node;
if (m_free_list)
{
// get new node from unused/free list and place at head
node = m_free_list;
m_free_list = node->m_list_lower;
if (m_queue_head)
{
m_queue_head->m_queue_next = node;
}
m_queue_head = node;
}
else
{
// get node from tail of queue and place at head
node = m_queue_tail;
m_queue_tail = node->m_queue_next;
m_queue_head->m_queue_next = node;
m_queue_head = node;
// remove node from linked list
if (node->m_list_lower)
{
node->m_list_lower->m_list_greater = node->m_list_greater;
}
else
{
m_list_min = node->m_list_greater;
}
if (node->m_list_greater)
{
node->m_list_greater->m_list_lower = node->m_list_lower;
}
else
{
m_list_max = node->m_list_lower;
}
}
return node;
}
// Member Data.
Node
*m_nodes,
*m_queue_tail,
*m_queue_head,
*m_list_min,
*m_list_max,
*m_free_list;
};
// A reasonable complex but more efficent method of calculating the results.
// Memory management is done here outside of the timing portion.
clock_t QuiteComplex
(
int size,
int window,
GeneratorCallback input,
OutputCallback output
)
{
Range <int>
range (window);
clock_t
start = clock ();
for (int i = 0 ; i < size ; ++i)
{
range.AddValue (input ());
if (range.RangeAvailable ())
{
output (range.Min (), range.Max ());
}
}
clock_t
end = clock ();
return end - start;
}
Complex.cpp:
#include "Range.h"
template <class T>
class Range
{
private:
// Class Types
// Red/Black tree node colours.
enum NodeColour
{
Red,
Black
};
// Node Data
// Stores a value and its position in various lists and trees.
struct Node
{
// Function to get the sibling of a node.
// Because leaves are stored as null pointers, it must be possible
// to get the sibling of a null pointer. If the object is a null pointer
// then the parent pointer is used to determine the sibling.
Node *Sibling
(
Node *parent
)
{
Node
*sibling;
if (this)
{
sibling = m_tree_parent->m_tree_less == this ? m_tree_parent->m_tree_more : m_tree_parent->m_tree_less;
}
else
{
sibling = parent->m_tree_less ? parent->m_tree_less : parent->m_tree_more;
}
return sibling;
}
// Node Members
Node
*m_queue_next,
*m_tree_less,
*m_tree_more,
*m_tree_parent,
*m_list_greater,
*m_list_lower;
NodeColour
m_colour;
T
m_value;
};
public:
// Constructor
// Allocates memory for the node data and adds all the allocated
// nodes to the unused/free list of nodes.
Range
(
int window_size
) :
m_nodes (new Node [window_size]),
m_queue_tail (m_nodes),
m_queue_head (0),
m_tree_root (0),
m_list_min (0),
m_list_max (0),
m_free_list (m_nodes)
{
for (int i = 0 ; i < window_size - 1 ; ++i)
{
m_nodes [i].m_list_lower = &m_nodes [i + 1];
}
m_nodes [window_size - 1].m_list_lower = 0;
}
// Destructor
// Tidy up allocated data.
~Range ()
{
delete [] m_nodes;
}
// Function to add a new value into the data structure.
void AddValue
(
T value
)
{
Node
*node = GetNode ();
// clear links
node->m_queue_next = node->m_tree_more = node->m_tree_less = node->m_tree_parent = 0;
// set value of node
node->m_value = value;
// insert node into tree
if (m_tree_root)
{
InsertNodeIntoTree (node);
BalanceTreeAfterInsertion (node);
}
else
{
m_tree_root = m_list_max = m_list_min = node;
node->m_tree_parent = node->m_list_greater = node->m_list_lower = 0;
}
m_tree_root->m_colour = Black;
}
// Accessor to determine if the first output value is ready for use.
bool RangeAvailable ()
{
return !m_free_list;
}
// Accessor to get the minimum value of all values in the current window.
T Min ()
{
return m_list_min->m_value;
}
// Accessor to get the maximum value of all values in the current window.
T Max ()
{
return m_list_max->m_value;
}
private:
// Function to get a node to store a value into.
// This function gets nodes from one of two places:
// 1. From the unused/free list
// 2. From the end of the fifo queue, this requires removing the node from the list and tree
Node *GetNode ()
{
Node
*node;
if (m_free_list)
{
// get new node from unused/free list and place at head
node = m_free_list;
m_free_list = node->m_list_lower;
if (m_queue_head)
{
m_queue_head->m_queue_next = node;
}
m_queue_head = node;
}
else
{
// get node from tail of queue and place at head
node = m_queue_tail;
m_queue_tail = node->m_queue_next;
m_queue_head->m_queue_next = node;
m_queue_head = node;
// remove node from tree
node = RemoveNodeFromTree (node);
RebalanceTreeAfterDeletion (node);
// remove node from linked list
if (node->m_list_lower)
{
node->m_list_lower->m_list_greater = node->m_list_greater;
}
else
{
m_list_min = node->m_list_greater;
}
if (node->m_list_greater)
{
node->m_list_greater->m_list_lower = node->m_list_lower;
}
else
{
m_list_max = node->m_list_lower;
}
}
return node;
}
// Rebalances the tree after insertion
void BalanceTreeAfterInsertion
(
Node *node
)
{
node->m_colour = Red;
while (node != m_tree_root && node->m_tree_parent->m_colour == Red)
{
if (node->m_tree_parent == node->m_tree_parent->m_tree_parent->m_tree_more)
{
Node
*uncle = node->m_tree_parent->m_tree_parent->m_tree_less;
if (uncle && uncle->m_colour == Red)
{
node->m_tree_parent->m_colour = Black;
uncle->m_colour = Black;
node->m_tree_parent->m_tree_parent->m_colour = Red;
node = node->m_tree_parent->m_tree_parent;
}
else
{
if (node == node->m_tree_parent->m_tree_less)
{
node = node->m_tree_parent;
LeftRotate (node);
}
node->m_tree_parent->m_colour = Black;
node->m_tree_parent->m_tree_parent->m_colour = Red;
RightRotate (node->m_tree_parent->m_tree_parent);
}
}
else
{
Node
*uncle = node->m_tree_parent->m_tree_parent->m_tree_more;
if (uncle && uncle->m_colour == Red)
{
node->m_tree_parent->m_colour = Black;
uncle->m_colour = Black;
node->m_tree_parent->m_tree_parent->m_colour = Red;
node = node->m_tree_parent->m_tree_parent;
}
else
{
if (node == node->m_tree_parent->m_tree_more)
{
node = node->m_tree_parent;
RightRotate (node);
}
node->m_tree_parent->m_colour = Black;
node->m_tree_parent->m_tree_parent->m_colour = Red;
LeftRotate (node->m_tree_parent->m_tree_parent);
}
}
}
}
// Adds a node into the tree and sorted linked list
void InsertNodeIntoTree
(
Node *node
)
{
Node
*parent = 0,
*child = m_tree_root;
bool
greater;
while (child)
{
parent = child;
child = (greater = node->m_value > child->m_value) ? child->m_tree_more : child->m_tree_less;
}
node->m_tree_parent = parent;
if (greater)
{
parent->m_tree_more = node;
// insert node into linked list
if (parent->m_list_greater)
{
parent->m_list_greater->m_list_lower = node;
}
else
{
m_list_max = node;
}
node->m_list_greater = parent->m_list_greater;
node->m_list_lower = parent;
parent->m_list_greater = node;
}
else
{
parent->m_tree_less = node;
// insert node into linked list
if (parent->m_list_lower)
{
parent->m_list_lower->m_list_greater = node;
}
else
{
m_list_min = node;
}
node->m_list_lower = parent->m_list_lower;
node->m_list_greater = parent;
parent->m_list_lower = node;
}
}
// Red/Black tree manipulation routine, used for removing a node
Node *RemoveNodeFromTree
(
Node *node
)
{
if (node->m_tree_less && node->m_tree_more)
{
// the complex case, swap node with a child node
Node
*child;
if (node->m_tree_less)
{
// find largest value in lesser half (node with no greater pointer)
for (child = node->m_tree_less ; child->m_tree_more ; child = child->m_tree_more)
{
}
}
else
{
// find smallest value in greater half (node with no lesser pointer)
for (child = node->m_tree_more ; child->m_tree_less ; child = child->m_tree_less)
{
}
}
swap (child->m_colour, node->m_colour);
if (child->m_tree_parent != node)
{
swap (child->m_tree_less, node->m_tree_less);
swap (child->m_tree_more, node->m_tree_more);
swap (child->m_tree_parent, node->m_tree_parent);
if (!child->m_tree_parent)
{
m_tree_root = child;
}
else
{
if (child->m_tree_parent->m_tree_less == node)
{
child->m_tree_parent->m_tree_less = child;
}
else
{
child->m_tree_parent->m_tree_more = child;
}
}
if (node->m_tree_parent->m_tree_less == child)
{
node->m_tree_parent->m_tree_less = node;
}
else
{
node->m_tree_parent->m_tree_more = node;
}
}
else
{
child->m_tree_parent = node->m_tree_parent;
node->m_tree_parent = child;
Node
*child_less = child->m_tree_less,
*child_more = child->m_tree_more;
if (node->m_tree_less == child)
{
child->m_tree_less = node;
child->m_tree_more = node->m_tree_more;
node->m_tree_less = child_less;
node->m_tree_more = child_more;
}
else
{
child->m_tree_less = node->m_tree_less;
child->m_tree_more = node;
node->m_tree_less = child_less;
node->m_tree_more = child_more;
}
if (!child->m_tree_parent)
{
m_tree_root = child;
}
else
{
if (child->m_tree_parent->m_tree_less == node)
{
child->m_tree_parent->m_tree_less = child;
}
else
{
child->m_tree_parent->m_tree_more = child;
}
}
}
if (child->m_tree_less)
{
child->m_tree_less->m_tree_parent = child;
}
if (child->m_tree_more)
{
child->m_tree_more->m_tree_parent = child;
}
if (node->m_tree_less)
{
node->m_tree_less->m_tree_parent = node;
}
if (node->m_tree_more)
{
node->m_tree_more->m_tree_parent = node;
}
}
Node
*child = node->m_tree_less ? node->m_tree_less : node->m_tree_more;
if (node->m_tree_parent->m_tree_less == node)
{
node->m_tree_parent->m_tree_less = child;
}
else
{
node->m_tree_parent->m_tree_more = child;
}
if (child)
{
child->m_tree_parent = node->m_tree_parent;
}
return node;
}
// Red/Black tree manipulation routine, used for rebalancing a tree after a deletion
void RebalanceTreeAfterDeletion
(
Node *node
)
{
Node
*child = node->m_tree_less ? node->m_tree_less : node->m_tree_more;
if (node->m_colour == Black)
{
if (child && child->m_colour == Red)
{
child->m_colour = Black;
}
else
{
Node
*parent = node->m_tree_parent,
*n = child;
while (parent)
{
Node
*sibling = n->Sibling (parent);
if (sibling && sibling->m_colour == Red)
{
parent->m_colour = Red;
sibling->m_colour = Black;
if (n == parent->m_tree_more)
{
LeftRotate (parent);
}
else
{
RightRotate (parent);
}
}
sibling = n->Sibling (parent);
if (parent->m_colour == Black &&
sibling->m_colour == Black &&
(!sibling->m_tree_more || sibling->m_tree_more->m_colour == Black) &&
(!sibling->m_tree_less || sibling->m_tree_less->m_colour == Black))
{
sibling->m_colour = Red;
n = parent;
parent = n->m_tree_parent;
continue;
}
else
{
if (parent->m_colour == Red &&
sibling->m_colour == Black &&
(!sibling->m_tree_more || sibling->m_tree_more->m_colour == Black) &&
(!sibling->m_tree_less || sibling->m_tree_less->m_colour == Black))
{
sibling->m_colour = Red;
parent->m_colour = Black;
break;
}
else
{
if (n == parent->m_tree_more &&
sibling->m_colour == Black &&
(sibling->m_tree_more && sibling->m_tree_more->m_colour == Red) &&
(!sibling->m_tree_less || sibling->m_tree_less->m_colour == Black))
{
sibling->m_colour = Red;
sibling->m_tree_more->m_colour = Black;
RightRotate (sibling);
}
else
{
if (n == parent->m_tree_less &&
sibling->m_colour == Black &&
(!sibling->m_tree_more || sibling->m_tree_more->m_colour == Black) &&
(sibling->m_tree_less && sibling->m_tree_less->m_colour == Red))
{
sibling->m_colour = Red;
sibling->m_tree_less->m_colour = Black;
LeftRotate (sibling);
}
}
sibling = n->Sibling (parent);
sibling->m_colour = parent->m_colour;
parent->m_colour = Black;
if (n == parent->m_tree_more)
{
sibling->m_tree_less->m_colour = Black;
LeftRotate (parent);
}
else
{
sibling->m_tree_more->m_colour = Black;
RightRotate (parent);
}
break;
}
}
}
}
}
}
// Red/Black tree manipulation routine, used for balancing the tree
void LeftRotate
(
Node *node
)
{
Node
*less = node->m_tree_less;
node->m_tree_less = less->m_tree_more;
if (less->m_tree_more)
{
less->m_tree_more->m_tree_parent = node;
}
less->m_tree_parent = node->m_tree_parent;
if (!node->m_tree_parent)
{
m_tree_root = less;
}
else
{
if (node == node->m_tree_parent->m_tree_more)
{
node->m_tree_parent->m_tree_more = less;
}
else
{
node->m_tree_parent->m_tree_less = less;
}
}
less->m_tree_more = node;
node->m_tree_parent = less;
}
// Red/Black tree manipulation routine, used for balancing the tree
void RightRotate
(
Node *node
)
{
Node
*more = node->m_tree_more;
node->m_tree_more = more->m_tree_less;
if (more->m_tree_less)
{
more->m_tree_less->m_tree_parent = node;
}
more->m_tree_parent = node->m_tree_parent;
if (!node->m_tree_parent)
{
m_tree_root = more;
}
else
{
if (node == node->m_tree_parent->m_tree_less)
{
node->m_tree_parent->m_tree_less = more;
}
else
{
node->m_tree_parent->m_tree_more = more;
}
}
more->m_tree_less = node;
node->m_tree_parent = more;
}
// Member Data.
Node
*m_nodes,
*m_queue_tail,
*m_queue_head,
*m_tree_root,
*m_list_min,
*m_list_max,
*m_free_list;
};
// A complex but more efficent method of calculating the results.
// Memory management is done here outside of the timing portion.
clock_t Complex
(
int count,
int window,
GeneratorCallback input,
OutputCallback output
)
{
Range <int>
range (window);
clock_t
start = clock ();
for (int i = 0 ; i < count ; ++i)
{
range.AddValue (input ());
if (range.RangeAvailable ())
{
output (range.Min (), range.Max ());
}
}
clock_t
end = clock ();
return end - start;
}
Idea of algorithm:
Take the first 1000 values of data and sort them
The last in the sort - the first is range(data + 0, data + 999).
Then remove from the sort pile the first element with the value data[0]
and add the element data[1000]
Now, the last in the sort - the first is range(data + 1, data + 1000).
Repeat until done
// This should run in (DATA_LEN - RANGE_WIDTH)log(RANGE_WIDTH)
#include <set>
#include <algorithm>
using namespace std;
const int DATA_LEN = 3600000;
double* const data = new double (DATA_LEN);
....
....
const int RANGE_WIDTH = 1000;
double range = new double(DATA_LEN - RANGE_WIDTH);
multiset<double> data_set;
data_set.insert(data[i], data[RANGE_WIDTH]);
for (int i = 0 ; i < DATA_LEN - RANGE_WIDTH - 1 ; i++)
{
range[i] = *data_set.end() - *data_set.begin();
multiset<double>::iterator iter = data_set.find(data[i]);
data_set.erase(iter);
data_set.insert(data[i+1]);
}
range[i] = *data_set.end() - *data_set.begin();
// range now holds the values you seek
You should probably check this for off by 1 errors, but the idea is there.