Algorithm for finding the maximum difference in an array of numbers - c++

I have an array of a few million numbers.
double* const data = new double[3600000];
I need to iterate through the array and find the range (the largest value in the array minus the smallest value). However, there is a catch. I only want to find the range where the smallest and largest values are within 1,000 samples of each other.
So I need to find the maximum of: range(data + 0, data + 1000), range(data + 1, data + 1001), range(data + 2, data + 1002), ...., range(data + 3599000, data + 3600000).
I hope that makes sense. Basically I could do it like above, but I'm looking for a more efficient algorithm if one exists. I think the above algorithm is O(n), but I feel that it's possible to optimize. An idea I'm playing with is to keep track of the most recent maximum and minimum and how far back they are, then only backtrack when necessary.
I'll be coding this in C++, but a nice algorithm in pseudo code would be just fine. Also, if this number I'm trying to find has a name, I'd love to know what it is.
Thanks.
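For reference, the definition above can be written down directly as a brute-force baseline. This is only a sketch for checking faster versions against; the function name `bruteForceMaxRange` and its parameters are made up for illustration. It does roughly n × window comparisons, so it is O(n) only if the window size is treated as a constant.

```cpp
#include <algorithm>
#include <cstddef>

// Brute-force reference: largest (max - min) over every window of
// `window` consecutive samples. Assumes 1 <= window <= n.
double bruteForceMaxRange(const double* data, std::size_t n, std::size_t window)
{
    double best = 0.0;
    for (std::size_t start = 0; start + window <= n; ++start)
    {
        // std::minmax_element returns iterators to the smallest and largest values.
        auto mm = std::minmax_element(data + start, data + start + window);
        best = std::max(best, *mm.second - *mm.first);
    }
    return best;
}
```

For the data in the question this would be called as `bruteForceMaxRange(data, 3600000, 1000)`.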

This type of question belongs to a branch of algorithms called streaming algorithms. It is the study of problems which require not only an O(n) solution but also need to work in a single pass over the data. The data is fed to the algorithm as a stream; the algorithm can't save all of the data, and once an item has passed it is lost forever. The algorithm needs to compute some answer about the data, such as the minimum or the median.
Specifically you are looking for a maximum (or more commonly in literature - minimum) in a window over a stream.
Here's a presentation on an article that mentions this problem as a subproblem of what they are trying to get at. It might give you some ideas.
I think the outline of the solution is something like this: maintain a window over the stream where in each step one element is inserted into the window and one is removed from the other side (a sliding window). The items you actually keep in memory aren't all of the 1000 items in the window but selected representatives which are good candidates for being the minimum (or maximum).
Read the article. It's a bit complex, but after 2-3 reads you can get the hang of it.
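One common way to keep only "candidate" elements is a pair of monotonic deques of indices. The sketch below is not taken from the linked article; it is a minimal illustration of the sliding-window idea, and the name `slidingMaxRange` is an assumption. Each index is pushed and popped at most once per deque, so the whole pass is linear in n.

```cpp
#include <algorithm>
#include <cstddef>
#include <deque>

// Largest (max - min) over every window of `window` consecutive samples,
// keeping only candidate indices in two monotonic deques.
// Assumes 1 <= window <= n.
double slidingMaxRange(const double* data, std::size_t n, std::size_t window)
{
    std::deque<std::size_t> maxCand; // indices; values decrease from front to back
    std::deque<std::size_t> minCand; // indices; values increase from front to back
    double best = 0.0;
    for (std::size_t i = 0; i < n; ++i)
    {
        // A new sample makes older, smaller-or-equal samples useless as future maxima.
        while (!maxCand.empty() && data[maxCand.back()] <= data[i]) maxCand.pop_back();
        maxCand.push_back(i);
        // Likewise, older larger-or-equal samples are useless as future minima.
        while (!minCand.empty() && data[minCand.back()] >= data[i]) minCand.pop_back();
        minCand.push_back(i);
        // Drop the front candidate once it has slid out of the window [i - window + 1, i].
        if (maxCand.front() + window <= i) maxCand.pop_front();
        if (minCand.front() + window <= i) minCand.pop_front();
        // Only report once a full window has been seen.
        if (i + 1 >= window)
            best = std::max(best, data[maxCand.front()] - data[minCand.front()]);
    }
    return best;
}
```

The front of each deque is always the current window's maximum (or minimum), so no backtracking over the window is needed.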

The algorithm you describe is really O(N), but I think the constant is too high. Another solution which looks reasonable is to use an O(N*log(N)) algorithm in the following way:
* create sorted container (std::multiset) of first 1000 numbers
* in loop (j=1, j<(3600000-1000); ++j)
- calculate range
- remove from the set number which is now irrelevant (i.e. in index *j - 1* of the array)
- add to set new relevant number (i.e. in index *j+1000-1* of the array)
I believe it should be faster, because the constant is much lower.

This is a good application of a min-queue - a queue (First-In, First-Out = FIFO) which can simultaneously keep track of the minimum element it contains, with amortized constant-time updates. Of course, a max-queue is basically the same thing.
Once you have this data structure in place, you can consider CurrentMax (of the past 1000 elements) minus CurrentMin, store that as the BestSoFar, and then push a new value and pop the old value, and check again. In this way, keep updating BestSoFar until the final value is the solution to your question. Each single step takes amortized constant time, so the whole thing is linear, and the implementation I know of has a good scalar constant (it's fast).
I don't know of any documentation on min-queues - this is a data structure I came up with in collaboration with a coworker. You can implement it by internally tracking a binary tree of the least elements within each contiguous sub-sequence of your data. It simplifies the problem that you'll only pop data from one end of the structure.
If you're interested in more details, I can try to provide them. I was thinking of writing this data structure up as a paper for arxiv. Also note that Tarjan and others previously arrived at a more powerful min-deque structure that would work here, but the implementation is much more complex. You can google for "mindeque" to read about Tarjan et al.'s work.
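To make the min-queue idea concrete, here is a minimal sketch. It does not use the binary-tree layout described above; it swaps in a different, well-known technique (a FIFO queue built from two stacks, each entry remembering the minimum beneath it), which also gives amortized constant-time push, pop and min. The names `MinQueue` and `minQueueMaxRange` are assumptions for illustration, and a max-queue is obtained by storing negated values.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// FIFO queue that reports its current minimum in amortized constant time.
// Each stack entry stores (value, minimum of this entry and everything below it).
class MinQueue
{
public:
    void push(double v)
    {
        double m = in_.empty() ? v : std::min(v, in_.back().second);
        in_.push_back(std::make_pair(v, m));
    }
    // Removes the oldest element. Assumes the queue is not empty.
    void pop()
    {
        if (out_.empty())
        {
            // Move everything across, reversing the order so the oldest ends on top.
            while (!in_.empty())
            {
                double v = in_.back().first;
                in_.pop_back();
                double m = out_.empty() ? v : std::min(v, out_.back().second);
                out_.push_back(std::make_pair(v, m));
            }
        }
        out_.pop_back();
    }
    // Smallest element currently stored. Assumes the queue is not empty.
    double min() const
    {
        if (in_.empty()) return out_.back().second;
        if (out_.empty()) return in_.back().second;
        return std::min(in_.back().second, out_.back().second);
    }
    std::size_t size() const { return in_.size() + out_.size(); }
private:
    std::vector<std::pair<double, double> > in_;  // receives pushes
    std::vector<std::pair<double, double> > out_; // serves pops
};

// BestSoFar loop from the text. Assumes 1 <= window <= n.
double minQueueMaxRange(const double* data, std::size_t n, std::size_t window)
{
    MinQueue lo, hi; // hi stores negated values, so -hi.min() is the maximum
    double best = 0.0;
    for (std::size_t i = 0; i < n; ++i)
    {
        lo.push(data[i]);
        hi.push(-data[i]);
        if (lo.size() > window) { lo.pop(); hi.pop(); }
        if (lo.size() == window)
            best = std::max(best, -hi.min() - lo.min());
    }
    return best;
}
```

Each element is moved between the two stacks at most once, which is where the amortized constant cost per step comes from.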

std::multiset<double> range;
double currentmax = 0.0;
for (int i = 0; i < 3600000; ++i)
{
if (i >= 1000)
range.erase(range.find(data[i-1000]));
range.insert(data[i]);
if (i >= 999)
currentmax = std::max(currentmax, *range.rbegin() - *range.begin());
}
Note untested code.
Edit: fixed off-by-one error.

read in the first 1000 numbers.
create a 1000 element linked list which tracks the current 1000 number.
create a 1000 element array of pointers to linked list nodes, 1-1 mapping.
sort the pointer array based on linked list node's values. This will rearrange the array but keep the linked list intact.
you can now calculate the range for the first 1000 numbers by examining the first and last element of the pointer array.
remove the first inserted element, which is either the head or the tail depending on how you made your linked list. Using the node's value perform a binary search on the pointer array to find the to-be-removed node's pointer, and shift the array one over to remove it.
add the 1001th element to the linked list, and insert a pointer to it in the correct position in the array, by performing one step of an insertion sort. This will keep the array sorted.
now you have the min. and max. value of the numbers between 1 and 1001, and can calculate the range using the first and last element of the pointer array.
it should now be obvious what you need to do for the rest of the array.
The algorithm should be O(n) for a fixed window size: each binary search is bounded by log(1e3) comparisons, and shifting the pointer array on deletion and insertion is bounded by 1e3 moves.
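The steps above can be sketched in a simplified form. This version is an assumption-laden shortcut rather than the exact structure described: it keeps the window's values directly in a sorted `std::vector` instead of an array of pointers into a linked list, and it reads the value leaving the window straight from the input array. The name `sortedWindowMaxRange` is made up for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Largest (max - min) over every window, keeping the current window's
// values in a sorted std::vector. Assumes 1 <= window <= n.
double sortedWindowMaxRange(const double* data, std::size_t n, std::size_t window)
{
    std::vector<double> sorted(data, data + window);
    std::sort(sorted.begin(), sorted.end());
    double best = sorted.back() - sorted.front();
    for (std::size_t i = window; i < n; ++i)
    {
        // Binary search for the value leaving the window, then shift to erase it.
        sorted.erase(std::lower_bound(sorted.begin(), sorted.end(), data[i - window]));
        // Binary search for the insertion point of the new value, then shift to insert.
        sorted.insert(std::upper_bound(sorted.begin(), sorted.end(), data[i]), data[i]);
        best = std::max(best, sorted.back() - sorted.front());
    }
    return best;
}
```

The searches are logarithmic in the window size, but `erase` and `insert` still shift up to `window` elements each, so the per-step cost is linear in the window in the worst case.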

I decided to see what the most efficient algorithm I could think of to solve this problem was using actual code and actual timings. I first created a simple solution, one that tracks the min/max for the previous n entries using a circular buffer, and a test harness to measure the speed. In the simple solution, each data value is compared against the set of min/max values, so that's about window_size * count tests (where window size in the original question is 1000 and count is 3600000).
I then thought about how to make it faster. First off, I created a solution that used a fifo queue to store window_size values and a linked list to store the values in ascending order where each node in the linked list was also a node in the queue. To process a data value, the item at the end of the fifo was removed from the linked list and the queue. The new value was added to the start of the queue and a linear search was used to find the position in the linked list. The min and max values could then be read from the start and end of the linked list. This was quick, but wouldn't scale well with increasing window_size (i.e. linearly).
So I decided to add a binary tree to the system to try to speed up the search part of the algorithm. The final timings for window_size = 1000 and count = 3600000 were:
Simple: 106875
Quite Complex: 1218
Complex: 1219
which was both expected and unexpected. Expected in that using a sorted linked list helped, unexpected in that the overhead of having a self-balancing tree didn't offset the advantage of a quicker search. I tried the latter two with an increased window size and found that they were always nearly identical up to a window_size of 100000.
Which all goes to show that theorising about algorithms is one thing, implementing them is something else.
Anyway, for those that are interested, here's the code I wrote (there's quite a bit!):
Range.h:
#include <algorithm>
#include <iostream>
#include <ctime>
#include <cstdlib> // rand, srand
using namespace std;
// Callback types.
typedef void (*OutputCallback) (int min, int max);
typedef int (*GeneratorCallback) ();
// Declarations of the test functions.
clock_t Simple (int, int, GeneratorCallback, OutputCallback);
clock_t QuiteComplex (int, int, GeneratorCallback, OutputCallback);
clock_t Complex (int, int, GeneratorCallback, OutputCallback);
main.cpp:
#include "Range.h"
int
checksum;
// This callback is used to get data.
int CreateData ()
{
return rand ();
}
// This callback is used to output the results.
void OutputResults (int min, int max)
{
//cout << min << " - " << max << endl;
checksum += max - min;
}
// The program entry point.
int main ()
{
int
count = 3600000,
window = 1000;
srand (0);
checksum = 0;
std::cout << "Simple: Ticks = " << Simple (count, window, CreateData, OutputResults) << ", checksum = " << checksum << std::endl;
srand (0);
checksum = 0;
std::cout << "Quite Complex: Ticks = " << QuiteComplex (count, window, CreateData, OutputResults) << ", checksum = " << checksum << std::endl;
srand (0);
checksum = 0;
std::cout << "Complex: Ticks = " << Complex (count, window, CreateData, OutputResults) << ", checksum = " << checksum << std::endl;
}
Simple.cpp:
#include "Range.h"
// Function to actually process the data.
// A circular buffer of min/max values for the current window is filled
// and once full, the oldest min/max pair is sent to the output callback
// and replaced with the newest input value. Each value inputted is
// compared against all min/max pairs.
void ProcessData
(
int count,
int window,
GeneratorCallback input,
OutputCallback output,
int *min_buffer,
int *max_buffer
)
{
int
i;
for (i = 0 ; i < window ; ++i)
{
int
value = input ();
min_buffer [i] = max_buffer [i] = value;
for (int j = 0 ; j < i ; ++j)
{
min_buffer [j] = min (min_buffer [j], value);
max_buffer [j] = max (max_buffer [j], value);
}
}
for ( ; i < count ; ++i)
{
int
index = i % window;
output (min_buffer [index], max_buffer [index]);
int
value = input ();
min_buffer [index] = max_buffer [index] = value;
for (int k = (i + 1) % window ; k != index ; k = (k + 1) % window)
{
min_buffer [k] = min (min_buffer [k], value);
max_buffer [k] = max (max_buffer [k], value);
}
}
output (min_buffer [count % window], max_buffer [count % window]);
}
// A simple method of calculating the results.
// Memory management is done here outside of the timing portion.
clock_t Simple
(
int count,
int window,
GeneratorCallback input,
OutputCallback output
)
{
int
*min_buffer = new int [window],
*max_buffer = new int [window];
clock_t
start = clock ();
ProcessData (count, window, input, output, min_buffer, max_buffer);
clock_t
end = clock ();
delete [] max_buffer;
delete [] min_buffer;
return end - start;
}
QuiteComplex.cpp:
#include "Range.h"
template <class T>
class Range
{
private:
// Class Types
// Node Data
// Stores a value and its position in various lists.
struct Node
{
Node
*m_queue_next,
*m_list_greater,
*m_list_lower;
T
m_value;
};
public:
// Constructor
// Allocates memory for the node data and adds all the allocated
// nodes to the unused/free list of nodes.
Range
(
int window_size
) :
m_nodes (new Node [window_size]),
m_queue_tail (m_nodes),
m_queue_head (0),
m_list_min (0),
m_list_max (0),
m_free_list (m_nodes)
{
for (int i = 0 ; i < window_size - 1 ; ++i)
{
m_nodes [i].m_list_lower = &m_nodes [i + 1];
}
m_nodes [window_size - 1].m_list_lower = 0;
}
// Destructor
// Tidy up allocated data.
~Range ()
{
delete [] m_nodes;
}
// Function to add a new value into the data structure.
void AddValue
(
T value
)
{
Node
*node = GetNode ();
// clear links
node->m_queue_next = 0;
// set value of node
node->m_value = value;
// find place to add node into linked list
Node
*search;
for (search = m_list_max ; search ; search = search->m_list_lower)
{
if (search->m_value < value)
{
// insert node directly above the first smaller value found
node->m_list_greater = search->m_list_greater;
if (search->m_list_greater)
{
search->m_list_greater->m_list_lower = node;
}
else
{
m_list_max = node;
}
node->m_list_lower = search;
search->m_list_greater = node;
break;
}
}
if (!search)
{
// no smaller value found: node is the new minimum (or the list was empty)
node->m_list_lower = 0;
node->m_list_greater = m_list_min;
if (m_list_min)
{
m_list_min->m_list_lower = node;
}
else
{
m_list_max = node;
}
m_list_min = node;
}
}
// Accessor to determine if the first output value is ready for use.
bool RangeAvailable ()
{
return !m_free_list;
}
// Accessor to get the minimum value of all values in the current window.
T Min ()
{
return m_list_min->m_value;
}
// Accessor to get the maximum value of all values in the current window.
T Max ()
{
return m_list_max->m_value;
}
private:
// Function to get a node to store a value into.
// This function gets nodes from one of two places:
// 1. From the unused/free list
// 2. From the end of the fifo queue, this requires removing the node from the list and tree
Node *GetNode ()
{
Node
*node;
if (m_free_list)
{
// get new node from unused/free list and place at head
node = m_free_list;
m_free_list = node->m_list_lower;
if (m_queue_head)
{
m_queue_head->m_queue_next = node;
}
m_queue_head = node;
}
else
{
// get node from tail of queue and place at head
node = m_queue_tail;
m_queue_tail = node->m_queue_next;
m_queue_head->m_queue_next = node;
m_queue_head = node;
// remove node from linked list
if (node->m_list_lower)
{
node->m_list_lower->m_list_greater = node->m_list_greater;
}
else
{
m_list_min = node->m_list_greater;
}
if (node->m_list_greater)
{
node->m_list_greater->m_list_lower = node->m_list_lower;
}
else
{
m_list_max = node->m_list_lower;
}
}
return node;
}
// Member Data.
Node
*m_nodes,
*m_queue_tail,
*m_queue_head,
*m_list_min,
*m_list_max,
*m_free_list;
};
// A reasonably complex but more efficient method of calculating the results.
// Memory management is done here outside of the timing portion.
clock_t QuiteComplex
(
int size,
int window,
GeneratorCallback input,
OutputCallback output
)
{
Range <int>
range (window);
clock_t
start = clock ();
for (int i = 0 ; i < size ; ++i)
{
range.AddValue (input ());
if (range.RangeAvailable ())
{
output (range.Min (), range.Max ());
}
}
clock_t
end = clock ();
return end - start;
}
Complex.cpp:
#include "Range.h"
template <class T>
class Range
{
private:
// Class Types
// Red/Black tree node colours.
enum NodeColour
{
Red,
Black
};
// Node Data
// Stores a value and its position in various lists and trees.
struct Node
{
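// Function to get the sibling of a node.
// Because leaves are stored as null pointers, it must be possible
// to get the sibling of a null pointer. If the object is a null pointer
// then the parent pointer is used to determine the sibling.
// NOTE: calling a member function through a null pointer is undefined
// behaviour in standard C++, so the 'if (this)' test below is not portable.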
Node *Sibling
(
Node *parent
)
{
Node
*sibling;
if (this)
{
sibling = m_tree_parent->m_tree_less == this ? m_tree_parent->m_tree_more : m_tree_parent->m_tree_less;
}
else
{
sibling = parent->m_tree_less ? parent->m_tree_less : parent->m_tree_more;
}
return sibling;
}
// Node Members
Node
*m_queue_next,
*m_tree_less,
*m_tree_more,
*m_tree_parent,
*m_list_greater,
*m_list_lower;
NodeColour
m_colour;
T
m_value;
};
public:
// Constructor
// Allocates memory for the node data and adds all the allocated
// nodes to the unused/free list of nodes.
Range
(
int window_size
) :
m_nodes (new Node [window_size]),
m_queue_tail (m_nodes),
m_queue_head (0),
m_tree_root (0),
m_list_min (0),
m_list_max (0),
m_free_list (m_nodes)
{
for (int i = 0 ; i < window_size - 1 ; ++i)
{
m_nodes [i].m_list_lower = &m_nodes [i + 1];
}
m_nodes [window_size - 1].m_list_lower = 0;
}
// Destructor
// Tidy up allocated data.
~Range ()
{
delete [] m_nodes;
}
// Function to add a new value into the data structure.
void AddValue
(
T value
)
{
Node
*node = GetNode ();
// clear links
node->m_queue_next = node->m_tree_more = node->m_tree_less = node->m_tree_parent = 0;
// set value of node
node->m_value = value;
// insert node into tree
if (m_tree_root)
{
InsertNodeIntoTree (node);
BalanceTreeAfterInsertion (node);
}
else
{
m_tree_root = m_list_max = m_list_min = node;
node->m_tree_parent = node->m_list_greater = node->m_list_lower = 0;
}
m_tree_root->m_colour = Black;
}
// Accessor to determine if the first output value is ready for use.
bool RangeAvailable ()
{
return !m_free_list;
}
// Accessor to get the minimum value of all values in the current window.
T Min ()
{
return m_list_min->m_value;
}
// Accessor to get the maximum value of all values in the current window.
T Max ()
{
return m_list_max->m_value;
}
private:
// Function to get a node to store a value into.
// This function gets nodes from one of two places:
// 1. From the unused/free list
// 2. From the end of the fifo queue, this requires removing the node from the list and tree
Node *GetNode ()
{
Node
*node;
if (m_free_list)
{
// get new node from unused/free list and place at head
node = m_free_list;
m_free_list = node->m_list_lower;
if (m_queue_head)
{
m_queue_head->m_queue_next = node;
}
m_queue_head = node;
}
else
{
// get node from tail of queue and place at head
node = m_queue_tail;
m_queue_tail = node->m_queue_next;
m_queue_head->m_queue_next = node;
m_queue_head = node;
// remove node from tree
node = RemoveNodeFromTree (node);
RebalanceTreeAfterDeletion (node);
// remove node from linked list
if (node->m_list_lower)
{
node->m_list_lower->m_list_greater = node->m_list_greater;
}
else
{
m_list_min = node->m_list_greater;
}
if (node->m_list_greater)
{
node->m_list_greater->m_list_lower = node->m_list_lower;
}
else
{
m_list_max = node->m_list_lower;
}
}
return node;
}
// Rebalances the tree after insertion
void BalanceTreeAfterInsertion
(
Node *node
)
{
node->m_colour = Red;
while (node != m_tree_root && node->m_tree_parent->m_colour == Red)
{
if (node->m_tree_parent == node->m_tree_parent->m_tree_parent->m_tree_more)
{
Node
*uncle = node->m_tree_parent->m_tree_parent->m_tree_less;
if (uncle && uncle->m_colour == Red)
{
node->m_tree_parent->m_colour = Black;
uncle->m_colour = Black;
node->m_tree_parent->m_tree_parent->m_colour = Red;
node = node->m_tree_parent->m_tree_parent;
}
else
{
if (node == node->m_tree_parent->m_tree_less)
{
node = node->m_tree_parent;
LeftRotate (node);
}
node->m_tree_parent->m_colour = Black;
node->m_tree_parent->m_tree_parent->m_colour = Red;
RightRotate (node->m_tree_parent->m_tree_parent);
}
}
else
{
Node
*uncle = node->m_tree_parent->m_tree_parent->m_tree_more;
if (uncle && uncle->m_colour == Red)
{
node->m_tree_parent->m_colour = Black;
uncle->m_colour = Black;
node->m_tree_parent->m_tree_parent->m_colour = Red;
node = node->m_tree_parent->m_tree_parent;
}
else
{
if (node == node->m_tree_parent->m_tree_more)
{
node = node->m_tree_parent;
RightRotate (node);
}
node->m_tree_parent->m_colour = Black;
node->m_tree_parent->m_tree_parent->m_colour = Red;
LeftRotate (node->m_tree_parent->m_tree_parent);
}
}
}
}
// Adds a node into the tree and sorted linked list
void InsertNodeIntoTree
(
Node *node
)
{
Node
*parent = 0,
*child = m_tree_root;
bool
greater;
while (child)
{
parent = child;
child = (greater = node->m_value > child->m_value) ? child->m_tree_more : child->m_tree_less;
}
node->m_tree_parent = parent;
if (greater)
{
parent->m_tree_more = node;
// insert node into linked list
if (parent->m_list_greater)
{
parent->m_list_greater->m_list_lower = node;
}
else
{
m_list_max = node;
}
node->m_list_greater = parent->m_list_greater;
node->m_list_lower = parent;
parent->m_list_greater = node;
}
else
{
parent->m_tree_less = node;
// insert node into linked list
if (parent->m_list_lower)
{
parent->m_list_lower->m_list_greater = node;
}
else
{
m_list_min = node;
}
node->m_list_lower = parent->m_list_lower;
node->m_list_greater = parent;
parent->m_list_lower = node;
}
}
// Red/Black tree manipulation routine, used for removing a node
Node *RemoveNodeFromTree
(
Node *node
)
{
if (node->m_tree_less && node->m_tree_more)
{
// the complex case, swap node with a child node
Node
*child;
if (node->m_tree_less)
{
// find largest value in lesser half (node with no greater pointer)
for (child = node->m_tree_less ; child->m_tree_more ; child = child->m_tree_more)
{
}
}
else
{
// find smallest value in greater half (node with no lesser pointer)
for (child = node->m_tree_more ; child->m_tree_less ; child = child->m_tree_less)
{
}
}
swap (child->m_colour, node->m_colour);
if (child->m_tree_parent != node)
{
swap (child->m_tree_less, node->m_tree_less);
swap (child->m_tree_more, node->m_tree_more);
swap (child->m_tree_parent, node->m_tree_parent);
if (!child->m_tree_parent)
{
m_tree_root = child;
}
else
{
if (child->m_tree_parent->m_tree_less == node)
{
child->m_tree_parent->m_tree_less = child;
}
else
{
child->m_tree_parent->m_tree_more = child;
}
}
if (node->m_tree_parent->m_tree_less == child)
{
node->m_tree_parent->m_tree_less = node;
}
else
{
node->m_tree_parent->m_tree_more = node;
}
}
else
{
child->m_tree_parent = node->m_tree_parent;
node->m_tree_parent = child;
Node
*child_less = child->m_tree_less,
*child_more = child->m_tree_more;
if (node->m_tree_less == child)
{
child->m_tree_less = node;
child->m_tree_more = node->m_tree_more;
node->m_tree_less = child_less;
node->m_tree_more = child_more;
}
else
{
child->m_tree_less = node->m_tree_less;
child->m_tree_more = node;
node->m_tree_less = child_less;
node->m_tree_more = child_more;
}
if (!child->m_tree_parent)
{
m_tree_root = child;
}
else
{
if (child->m_tree_parent->m_tree_less == node)
{
child->m_tree_parent->m_tree_less = child;
}
else
{
child->m_tree_parent->m_tree_more = child;
}
}
}
if (child->m_tree_less)
{
child->m_tree_less->m_tree_parent = child;
}
if (child->m_tree_more)
{
child->m_tree_more->m_tree_parent = child;
}
if (node->m_tree_less)
{
node->m_tree_less->m_tree_parent = node;
}
if (node->m_tree_more)
{
node->m_tree_more->m_tree_parent = node;
}
}
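Node
*child = node->m_tree_less ? node->m_tree_less : node->m_tree_more;
if (!node->m_tree_parent)
{
// node is the root, so its only child (if any) becomes the new root
m_tree_root = child;
}
else if (node->m_tree_parent->m_tree_less == node)
{
node->m_tree_parent->m_tree_less = child;
}
else
{
node->m_tree_parent->m_tree_more = child;
}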
if (child)
{
child->m_tree_parent = node->m_tree_parent;
}
return node;
}
// Red/Black tree manipulation routine, used for rebalancing a tree after a deletion
void RebalanceTreeAfterDeletion
(
Node *node
)
{
Node
*child = node->m_tree_less ? node->m_tree_less : node->m_tree_more;
if (node->m_colour == Black)
{
if (child && child->m_colour == Red)
{
child->m_colour = Black;
}
else
{
Node
*parent = node->m_tree_parent,
*n = child;
while (parent)
{
Node
*sibling = n->Sibling (parent);
if (sibling && sibling->m_colour == Red)
{
parent->m_colour = Red;
sibling->m_colour = Black;
if (n == parent->m_tree_more)
{
LeftRotate (parent);
}
else
{
RightRotate (parent);
}
}
sibling = n->Sibling (parent);
if (parent->m_colour == Black &&
sibling->m_colour == Black &&
(!sibling->m_tree_more || sibling->m_tree_more->m_colour == Black) &&
(!sibling->m_tree_less || sibling->m_tree_less->m_colour == Black))
{
sibling->m_colour = Red;
n = parent;
parent = n->m_tree_parent;
continue;
}
else
{
if (parent->m_colour == Red &&
sibling->m_colour == Black &&
(!sibling->m_tree_more || sibling->m_tree_more->m_colour == Black) &&
(!sibling->m_tree_less || sibling->m_tree_less->m_colour == Black))
{
sibling->m_colour = Red;
parent->m_colour = Black;
break;
}
else
{
if (n == parent->m_tree_more &&
sibling->m_colour == Black &&
(sibling->m_tree_more && sibling->m_tree_more->m_colour == Red) &&
(!sibling->m_tree_less || sibling->m_tree_less->m_colour == Black))
{
sibling->m_colour = Red;
sibling->m_tree_more->m_colour = Black;
RightRotate (sibling);
}
else
{
if (n == parent->m_tree_less &&
sibling->m_colour == Black &&
(!sibling->m_tree_more || sibling->m_tree_more->m_colour == Black) &&
(sibling->m_tree_less && sibling->m_tree_less->m_colour == Red))
{
sibling->m_colour = Red;
sibling->m_tree_less->m_colour = Black;
LeftRotate (sibling);
}
}
sibling = n->Sibling (parent);
sibling->m_colour = parent->m_colour;
parent->m_colour = Black;
if (n == parent->m_tree_more)
{
sibling->m_tree_less->m_colour = Black;
LeftRotate (parent);
}
else
{
sibling->m_tree_more->m_colour = Black;
RightRotate (parent);
}
break;
}
}
}
}
}
}
// Red/Black tree manipulation routine, used for balancing the tree
void LeftRotate
(
Node *node
)
{
Node
*less = node->m_tree_less;
node->m_tree_less = less->m_tree_more;
if (less->m_tree_more)
{
less->m_tree_more->m_tree_parent = node;
}
less->m_tree_parent = node->m_tree_parent;
if (!node->m_tree_parent)
{
m_tree_root = less;
}
else
{
if (node == node->m_tree_parent->m_tree_more)
{
node->m_tree_parent->m_tree_more = less;
}
else
{
node->m_tree_parent->m_tree_less = less;
}
}
less->m_tree_more = node;
node->m_tree_parent = less;
}
// Red/Black tree manipulation routine, used for balancing the tree
void RightRotate
(
Node *node
)
{
Node
*more = node->m_tree_more;
node->m_tree_more = more->m_tree_less;
if (more->m_tree_less)
{
more->m_tree_less->m_tree_parent = node;
}
more->m_tree_parent = node->m_tree_parent;
if (!node->m_tree_parent)
{
m_tree_root = more;
}
else
{
if (node == node->m_tree_parent->m_tree_less)
{
node->m_tree_parent->m_tree_less = more;
}
else
{
node->m_tree_parent->m_tree_more = more;
}
}
more->m_tree_less = node;
node->m_tree_parent = more;
}
// Member Data.
Node
*m_nodes,
*m_queue_tail,
*m_queue_head,
*m_tree_root,
*m_list_min,
*m_list_max,
*m_free_list;
};
// A complex but more efficient method of calculating the results.
// Memory management is done here outside of the timing portion.
clock_t Complex
(
int count,
int window,
GeneratorCallback input,
OutputCallback output
)
{
Range <int>
range (window);
clock_t
start = clock ();
for (int i = 0 ; i < count ; ++i)
{
range.AddValue (input ());
if (range.RangeAvailable ())
{
output (range.Min (), range.Max ());
}
}
clock_t
end = clock ();
return end - start;
}

Idea of algorithm:
Take the first 1000 values of data and sort them.
The last value in the sorted pile minus the first is range(data + 0, data + 999).
Then remove from the sorted pile the first element with the value data[0]
and add the element data[1000].
Now the last value in the sorted pile minus the first is range(data + 1, data + 1000).
Repeat until done.
// This should run in O((DATA_LEN - RANGE_WIDTH) * log(RANGE_WIDTH))
#include <set>
#include <algorithm>
using namespace std;
const int DATA_LEN = 3600000;
double* const data = new double[DATA_LEN];
....
....
const int RANGE_WIDTH = 1000;
const int RANGE_COUNT = DATA_LEN - RANGE_WIDTH + 1;
double* const range = new double[RANGE_COUNT];
// start with the first RANGE_WIDTH values
multiset<double> data_set(data, data + RANGE_WIDTH);
for (int i = 0 ; i < RANGE_COUNT - 1 ; i++)
{
// largest minus smallest value in the current window
range[i] = *data_set.rbegin() - *data_set.begin();
// slide the window: drop data[i], add data[i + RANGE_WIDTH]
data_set.erase(data_set.find(data[i]));
data_set.insert(data[i + RANGE_WIDTH]);
}
range[RANGE_COUNT - 1] = *data_set.rbegin() - *data_set.begin();
// range now holds the values you seek
You should probably check this for off by 1 errors, but the idea is there.

Related

Correctly managing pointers in C++ Quadtree implementation

I'm working on a C++ quadtree implementation for collision detection. I tried to adapt this Java implementation to C++ by using pointers; namely, storing the child nodes of each node as Node pointers (code at the end). However, since my understanding of pointers is still rather lacking, I am struggling to understand why my Quadtree class produces the following two issues:
1. When splitting a Node in 4, the debugger tells me that all my childNodes entries are identical to the first one, i.e., same address and bounds.
2. Even if 1. is ignored, I get an Access violation reading location 0xFFFFFFFFFFFFFFFF, which I found out is a consequence of the childNode pointees being deleted after the first split, resulting in undefined behaviour.
My question is: what improvements should I make to my Quadtree.hpp so that each Node can contain 4 distinct child node pointers and have those references last until the quadtree is cleared?
What I have tried so far:
Modifying getChildNode according to this guide and using temporary variables in split() to avoid all 4 entries of childNodes to point to the same Node:
void split() {
for (int i = 0; i < 4; i++) {
Node temp = getChildNode(level, bounds, i + 1);
childNodes[i] = &(temp);
}
}
but this does not solve the problem.
This one is particularly confusing. My initial idea was to just store childNodes as Nodes themselves, but turns out that cannot be done while we're defining the Node class itself. Hence, it looks like the only way to store Nodes is by first creating them and then storing pointers to them as I tried to do in split(), yet it seems that those will not "last" until we've inserted all the objects since the pointees get deleted (run out of scope) and we get the aforementioned undefined behaviour. I also thought of using smart pointers, but that seems to only overcomplicate things.
The code:
Quadtree.hpp
#pragma once
#include <vector>
#include <algorithm>
#include "Box.hpp"
namespace quadtree {
class Node {
public:
Node(int p_level, quadtree::Box<float> p_bounds)
:level(p_level), bounds(p_bounds)
{
parentWorld = NULL;
}
// NOTE: mandatory upon Quadtree initialization
void setParentWorld(World* p_world_ptr) {
parentWorld = p_world_ptr;
}
/*
Clears the quadtree
*/
void clear() {
objects.clear();
for (int i = 0; i < 4; i++) {
if (childNodes[i] != nullptr) {
(*(childNodes[i])).clear();
childNodes[i] = nullptr;
}
}
}
/*
Splits the node into 4 subnodes
*/
void split() {
for (int i = 0; i < 4; i++) {
childNodes[i] = &getChildNode(level, bounds, i + 1);;
}
}
/*
Determine which node the object belongs to. -1 means
object cannot completely fit within a child node and is part
of the parent node
*/
int getIndex(Entity* p_ptr_entity) {
quadtree::Box<float> nodeBounds;
quadtree::Box<float> entityHitbox;
for (int i = 0; i < 4; i++) {
nodeBounds = childNodes[i]->bounds;
ComponentHandle<Hitbox> hitbox;
parentWorld->unpack(*p_ptr_entity, hitbox);
entityHitbox = hitbox->box;
if (nodeBounds.contains(entityHitbox)) {
return i;
}
}
return -1; // if no childNode completely contains Entity Hitbox
}
/*
Insert the object into the quadtree. If the node
exceeds the capacity, it will split and add all
objects to their corresponding nodes.
*/
void insertObject(Entity* p_ptr_entity) {
if (childNodes[0] != nullptr) {
int index = getIndex(p_ptr_entity);
if (index != -1) {
(*childNodes[index]).insertObject(p_ptr_entity); // insert in child node
return;
}
}
objects.push_back(p_ptr_entity); // add to parent node
if (objects.size() > MAX_OBJECTS && level < MAX_DEPTH) {
if (childNodes[0] == nullptr) {
split();
}
int i = 0;
while (i < objects.size()) {
int index = getIndex(objects[i]);
if (index != -1)
{
Entity* temp_entity = objects[i];
{
// remove i-th element of the vector
using std::swap;
swap(objects[i], objects.back());
objects.pop_back();
}
(*childNodes[index]).insertObject(temp_entity);
}
else
{
i++;
}
}
}
}
/*
Return all objects that could collide with the given object
*/
std::vector<Entity*> retrieve(Entity* p_ptr_entity, std::vector<Entity*> returnObjects) {
int index = getIndex(p_ptr_entity);
if (index != -1 && childNodes[0] == nullptr) {
(*childNodes[index]).retrieve(p_ptr_entity, returnObjects);
}
returnObjects.insert(returnObjects.end(), objects.begin(), objects.end());
return returnObjects;
}
World* getParentWorld() {
return parentWorld;
}
private:
int MAX_OBJECTS = 10;
int MAX_DEPTH = 5;
World* parentWorld; // used to unpack entities
int level; // depth of the node
quadtree::Box<float> bounds; // boundary of nodes in the game's map
std::vector<Entity*> objects; // list of objects contained in the node: pointers to Entitites in the game
Node* childNodes[4];
quadtree::Box<float> getQuadrantBounds(quadtree::Box<float> p_parentBounds, int p_quadrant_id) {
quadtree::Box<float> quadrantBounds;
quadrantBounds.width = p_parentBounds.width / 2;
quadrantBounds.height = p_parentBounds.height / 2;
switch (p_quadrant_id) {
case 1: // NE
quadrantBounds.top = p_parentBounds.top;
quadrantBounds.left = p_parentBounds.width / 2;
break;
case 2: // NW
quadrantBounds.top = p_parentBounds.top;
quadrantBounds.left = p_parentBounds.left;
break;
case 3: // SW
quadrantBounds.top = p_parentBounds.height / 2;
quadrantBounds.left = p_parentBounds.left;
break;
case 4: // SE
quadrantBounds.top = p_parentBounds.height / 2;
quadrantBounds.left = p_parentBounds.width / 2;
break;
}
return quadrantBounds;
}
Node& getChildNode(int parentLevel, Box<float> parentBounds, int quadrant) {
static Node temp = Node(parentLevel + 1, getQuadrantBounds(parentBounds, quadrant));
return temp;
}
};
}
Where Box is just a helper class that contains some helper methods for rectangular shapes and collision detection. Any help would be greatly appreciated!
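Both symptoms can be read off the code above. The `static Node temp` in getChildNode is initialised only once, so every call returns a reference to the same object. The non-static `Node temp` in the attempted split() is destroyed at the end of each loop iteration, so the stored address dangles. One possible way to give each node four distinct children that live until clear() is to let the node own them on the heap. The sketch below assumes `std::unique_ptr` is acceptable here. The `Box` struct and the `child`, `getBounds`, `getLevel` and `quadrantBounds` helpers are hypothetical stand-ins, not the real Quadtree.hpp API. It also offsets each quadrant by the parent's left/top, which is an assumption about the intended geometry.

```cpp
#include <memory>

// Hypothetical stand-in for quadtree::Box<float>; only the fields used here.
struct Box
{
    float left, top, width, height;
};

class Node
{
public:
    Node(int p_level, Box p_bounds) : level(p_level), bounds(p_bounds) {}

    // Each child is allocated on the heap and owned by this node, so it
    // stays alive until clear() is called or the node itself is destroyed.
    void split()
    {
        for (int i = 0; i < 4; i++)
        {
            childNodes[i].reset(new Node(level + 1, quadrantBounds(i)));
        }
    }

    // Destroys the four children (and, recursively, their children).
    void clear()
    {
        for (int i = 0; i < 4; i++)
        {
            childNodes[i].reset();
        }
    }

    // Non-owning access; returns nullptr before split() or after clear().
    const Node* child(int i) const { return childNodes[i].get(); }
    const Box& getBounds() const { return bounds; }
    int getLevel() const { return level; }

private:
    // Quadrant layout is an assumption: 0 = NE, 1 = NW, 2 = SW, 3 = SE.
    Box quadrantBounds(int quadrant) const
    {
        Box b;
        b.width = bounds.width / 2;
        b.height = bounds.height / 2;
        b.left = (quadrant == 0 || quadrant == 3) ? bounds.left + b.width : bounds.left;
        b.top = (quadrant == 2 || quadrant == 3) ? bounds.top + b.height : bounds.top;
        return b;
    }

    int level;
    Box bounds;
    std::unique_ptr<Node> childNodes[4]; // default-constructed to nullptr
};
```

With owning pointers as members, a Node can no longer be copied, which is usually what you want for a tree. Code that only needs to look at a child can keep using a plain `Node*` obtained from `get()`.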

A segmentation fault?

I am currently working on a school project in which I implement an algorithm to solve the 'knight game' (finding the shortest way from the top left corner of the board to the bottom right corner). I have been getting a segmentation fault for three days now. I've checked every pointer I use and everything seems right.
I implemented three search algorithms: BFS, DFS and UCS. The first two work fine, but UCS gives me the segmentation fault, even though it uses the same code apart from the popBest function.
Here are the ucs and popBest functions:
Item *popBest( list_t *list ) // and remove the best board from the list.
{
assert(list);
assert(list->numElements);
int min_f;
Item *item = list->first;
Item *best;
min_f = list->first->f;
while (item) {
if (item->f < min_f) {
min_f = item->f;
best = item;
}
item = item->next;
}
//item = onList(list, board);
delList(list, best);
return best;
}
void ucs(void)
{
Item *cur_node, *child_p, *temp;
while ( listCount(&openList_p) ) { /* While items are on the open list
printLt(openList_p );
/* Get the first item on the open list*/
cur_node = popBest(&openList_p);
//printf("%d %f\n", listCount(&openList_p), evaluateBoard( cur_node ));
printBoard(cur_node);
addFirst(&closedList_p, cur_node);
if ( evaluateBoard(cur_node) == 0.0 ) {
showSolution(cur_node);
printf("\nParcours en largeur (bfs)\n" );
return;
}
else {
for (int i = 0; i < MAX_BOARD; i++) {
child_p = getChildBoard( cur_node, i );
if (child_p != NULL) {
child_p->f = cur_node->f+1;
temp = onList(&openList_p, child_p->board);
if (temp ==NULL) addLast( &openList_p, temp);
else if (temp != NULL && child_p->f < temp->f )
{
delList(&openList_p, temp);
addLast( &openList_p, temp);
}
}
}
}
}
return;
}
All the functions work fine for bfs and dfs; the only difference is the popBest function.
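You dereference list->first (in list->first->f) without checking whether list->first is a null pointer.
The likely cause of your problem is that best may still be uninitialised after the loop. It is only assigned when some item has a strictly smaller f than the first one, so it is never set when the first element in the list is already the "best". delList and the caller then work with an indeterminate pointer.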
Here is a safer version.
Item *popBest( list_t *list )
{
assert(list);
assert(list->numElements);
assert(list->first);
// Assume that the first element is best.
Item *best = list->first;
int min_f = best->f;
// Search from the second element (if it exists).
Item* item = best->next;
while (item) {
if (item->f < min_f) {
min_f = item->f;
best = item;
}
item = item->next;
}
delList(list, best);
return best;
}

Problems implementing A star search C++

I am trying to implement the A star algorithm in C++ for a game that I am creating, and it is not working. I don't know whether I've missed something in the code or in the algorithm. I've used sets because they are sorted, and the return value is a vector with the nodes I have to visit. I've never used this algorithm before, so I have probably made some kind of mistake.
struct node {
Pos pos;
int f; //the sum of the distance from the goal to succcessor
int g; // the sum of the cost of the current plus the one from the successor
int h; //distance from goal to successor
friend bool operator< (node right, node left) {
return (right.f < left.f);
} };
vector<node> search(Pos inicio,Pos desti){
set<node> opennodes;
vector<node> closednodes;
node inici;
node successor;
inici.pos = inicio;
inici.h = heuristic(inicio,desti);
inici.g = getcost(inicio);
inici.f = inici.g + inici.h;
opennodes.insert(inici);
closednodes.push_back(inici);
while(not opennodes.empty()){
node current = *(opennodes.begin());
opennodes.erase(opennodes.begin());
if(current.pos == desti) cerr<<"encontrao";
Dir direccio;
for(int i = 0; i < 4;++i){
if(i==0){
direccio = LEFT;
}
else if(i==1){
direccio = RIGHT;
}
else if(i==2){
direccio = TOP;
}
else {
direccio = BOTTOM;
}
successor.pos = current.pos + direccio;
if(successor.pos == desti) return closednodes;
if(pos_ok(successor.pos)){
successor.g = current.g + getcost(successor.pos);
successor.h = heuristic(successor.pos,desti);
successor.f = successor.g + successor.h;
node n1 = checkposition(successor.pos, opennodes); //I had to create two checkposition just to know if there's the node in the set or in the vector
node n2 = checkposition2(successor.pos, closednodes);
if (n1.f != -1 and n1.f < successor.f);
else if (n2.f != -1 and n2.f < successor.f);
else opennodes.insert(successor);
}
}
closednodes.push_back(current);
}
return closednodes;
}
So, first:
if(current.pos == desti) cerr<<"encontrao";
Shouldn't there be a break (or return) statement here? Writing to cerr doesn't leave your loop; it only prints a message to standard error.
And the for loop inside your while always runs all four iterations; direccio holds BOTTOM once it finishes, so make sure everything that depends on the direction stays inside the loop body.
Other than that, I think the heuristic is fine; the problem is in the code structure. I'd suggest stepping through it in a debugger and posting your results here.
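To illustrate the point about stopping once the goal is reached, here is a minimal sketch of a grid A* loop. It is not the asker's code: the grid of strings, the unit step cost, the Manhattan heuristic and the names OpenEntry, HigherF and shortestPathCost are all assumptions made for the example.

```cpp
#include <cstdlib>
#include <queue>
#include <string>
#include <vector>

// Open-list entry: f = g + h, plus the cell it refers to (hypothetical type).
struct OpenEntry {
    int f, g, row, col;
};

// Orders the priority queue so that the entry with the smallest f is on top.
struct HigherF {
    bool operator()(const OpenEntry& a, const OpenEntry& b) const { return a.f > b.f; }
};

// Returns the cost of the cheapest 4-connected path on a non-empty grid
// ('#' marks a wall), or -1 if the goal cannot be reached.
// Assumptions: every step costs 1, heuristic is the Manhattan distance.
int shortestPathCost(const std::vector<std::string>& grid,
                     int startRow, int startCol, int goalRow, int goalCol) {
    const int rows = static_cast<int>(grid.size());
    const int cols = static_cast<int>(grid[0].size());
    // One offset per direction replaces the if/else chain over i.
    const int dRow[4] = {0, 0, -1, 1};
    const int dCol[4] = {-1, 1, 0, 0};
    auto heuristic = [&](int r, int c) {
        return std::abs(r - goalRow) + std::abs(c - goalCol);
    };

    // Best known g per cell; -1 means the cell has not been reached yet.
    std::vector<std::vector<int>> bestG(rows, std::vector<int>(cols, -1));
    std::priority_queue<OpenEntry, std::vector<OpenEntry>, HigherF> open;
    bestG[startRow][startCol] = 0;
    open.push({heuristic(startRow, startCol), 0, startRow, startCol});

    while (!open.empty()) {
        const OpenEntry current = open.top();
        open.pop();
        // Skip entries that were superseded by a cheaper path to the same cell.
        if (current.g > bestG[current.row][current.col]) continue;
        // Leave the loop as soon as the goal is taken from the open list.
        if (current.row == goalRow && current.col == goalCol) return current.g;

        for (int d = 0; d < 4; ++d) {
            const int r = current.row + dRow[d];
            const int c = current.col + dCol[d];
            if (r < 0 || r >= rows || c < 0 || c >= cols || grid[r][c] == '#') continue;
            const int g = current.g + 1;
            // Keep the neighbour only if this path to it is strictly better.
            if (bestG[r][c] != -1 && bestG[r][c] <= g) continue;
            bestG[r][c] = g;
            open.push({g + heuristic(r, c), g, r, c});
        }
    }
    return -1;  // open list exhausted without reaching the goal
}
```

The sketch uses a std::priority_queue rather than a std::set ordered by f, because a set whose comparison looks only at f treats two different nodes with equal f as duplicates and keeps just one of them.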

A Star Unpredictable Errors

My feeble attempt at an A* Algorithm is generating unpredictable errors.
My FindAdjacent() function is clearly a mess, and it actually doesn't work when I step through it. This is my first time trying a path finding algorithm, so this is all new to me.
When the application actually manages to find the goal nodes and path (or so I think), it can never set the path (called from within main by pressing enter). I do not know why it is unable to do this from looking at the SetPath() function.
Any help would be hugely appreciated, here's my code:
NODE CLASS
enum
{
NODE_TYPE_NONE = 0,
NODE_TYPE_NORMAL,
NODE_TYPE_SOLID,
NODE_TYPE_PATH,
NODE_TYPE_GOAL
};
class Node
{
public:
Node () : mTypeID(0), mNodeCost(0), mX(0), mY(0), mParent(0){};
public:
int mTypeID;
int mNodeCost;
int mX;
int mY;
Node* mParent;
};
PATH FINDING
/**
* finds the path between start and goal
*/
void AStarImpl::FindPath()
{
cout << "Finding Path." << endl;
GetGoals();
while (!mGoalFound)
GetF();
}
/**
* modifies linked list to find adjacent, walkable nodes
*/
void AStarImpl::FindAdjacent(Node* pNode)
{
for (int i = -1; i <= 1; i++)
{
for (int j = -1; j <= 1; j++)
if (i != 0 && j != 0)
if (Map::GetInstance()->mMap[pNode->mX+i][pNode->mY+j].mTypeID != NODE_TYPE_SOLID)
{
for (vector<Node*>::iterator iter = mClosedList.begin(); iter != mClosedList.end(); iter++)
{
if ((*iter)->mX != Map::GetInstance()->mMap[pNode->mX + i][pNode->mY + j].mX && (*iter)->mY != Map::GetInstance()->mMap[pNode->mX + i][pNode->mY + j].mY)
{
Map::GetInstance()->mMap[pNode->mX+i][pNode->mY+j].mParent = pNode;
mOpenList.push_back(&Map::GetInstance()->mMap[pNode->mX+i][pNode->mY+j]);
}
}
}
}
mClosedList.push_back(pNode);
}
/**
* colour the found path
*/
void AStarImpl::SetPath()
{
vector<Node*>::iterator tParent;
mGoalNode->mTypeID = NODE_TYPE_PATH;
Node *tNode = mGoalNode;
while (tNode->mParent)
{
tNode->mTypeID = NODE_TYPE_PATH;
tNode = tNode->mParent;
}
}
/**
* returns a random node
*/
Node* AStarImpl::GetRandomNode()
{
int tX = IO::GetInstance()->GetRand(0, MAP_WIDTH - 1);
int tY = IO::GetInstance()->GetRand(0, MAP_HEIGHT - 1);
Node* tNode = &Map::GetInstance()->mMap[tX][tY];
return tNode;
}
/**
* gets the starting and goal nodes, then checks the starting node's adjacent nodes
*/
void AStarImpl::GetGoals()
{
// get the two nodes
mStartNode = GetRandomNode();
mGoalNode = GetRandomNode();
mStartNode->mTypeID = NODE_TYPE_GOAL;
mGoalNode->mTypeID = NODE_TYPE_GOAL;
// insert start node into the open list
mOpenList.push_back(mStartNode);
// find the starting node's adjacent nodes
FindAdjacent(*mOpenList.begin());
// remove starting node from open list
mOpenList.erase(mOpenList.begin());
}
/**
* finds the best f
*/
void AStarImpl::GetF()
{
int tF = 0;
int tBestF = 1000;
vector<Node*>::const_iterator tIter;
vector<Node*>::const_iterator tBestNode;
for (tIter = mOpenList.begin(); tIter != mOpenList.end(); ++tIter)
{
tF = GetH(*tIter);
tF += (*tIter)->mNodeCost;
if (tF < tBestF)
{
tBestF = tF;
tBestNode = tIter;
}
}
if ((*tBestNode) != mGoalNode)
{
Node tNode = **tBestNode;
mOpenList.erase(tBestNode);
FindAdjacent(&tNode);
}
else
{
mClosedList.push_back(mGoalNode);
mGoalFound = true;
}
}
/**
* returns the heuristic from the given node to goal
*/
int AStarImpl::GetH(Node *pNode)
{
int H = (int) fabs((float)pNode->mX - mGoalNode->mX);
H += (int) fabs((float)pNode->mY - mGoalNode->mY);
H *= 10;
return H;
}
A few suggestions:
ADJACENCY TEST
At the moment the test in FindAdjacent only finds diagonal neighbours:
if (i != 0 && j != 0)
If you also want the left/right/up/down neighbours, use
if (i != 0 || j != 0)
which skips only the node itself (i == 0 and j == 0).
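As a small sketch of what the second test gives you, the function below collects all neighbours of a cell, diagonals included. The grid size constants kWidth and kHeight and the function name neighbours are hypothetical stand-ins for MAP_WIDTH, MAP_HEIGHT and the body of FindAdjacent. The bounds check is an addition that the original loop does not have.

```cpp
#include <utility>
#include <vector>

// Hypothetical grid size; the original code uses MAP_WIDTH and MAP_HEIGHT.
constexpr int kWidth = 5;
constexpr int kHeight = 5;

// Collects the coordinates of all in-bounds neighbours of (x, y),
// including diagonals and excluding the cell itself.
std::vector<std::pair<int, int>> neighbours(int x, int y) {
    std::vector<std::pair<int, int>> result;
    for (int i = -1; i <= 1; ++i) {
        for (int j = -1; j <= 1; ++j) {
            // Equivalent to requiring (i != 0 || j != 0): skip only the centre.
            if (i == 0 && j == 0) continue;
            const int nx = x + i;
            const int ny = y + j;
            // Without this check, cells on the border would index outside the map.
            if (nx < 0 || nx >= kWidth || ny < 0 || ny >= kHeight) continue;
            result.emplace_back(nx, ny);
        }
    }
    return result;
}
```

An interior cell yields eight neighbours, an edge cell five and a corner cell three.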
ADJACENCY LOOP
I think your code looks suspicious in FindAdjacent at the line
for (vector<Node*>::iterator iter = mClosedList.begin(); iter != mClosedList.end(); iter++)
I don't really understand the intention here. I would expect mClosedList to start empty, so on the first call the body of this loop never executes and nothing gets added to mOpenList.
My expectation at this part of the algorithm would be for you to test for each neighbour whether it should be added to the open list.
OPENLIST CHECK
If you look at the A* algorithm on Wikipedia, you will see that you are also missing the section starting
if neighbor not in openset or tentative_g_score < g_score[neighbor]
That is, FindAdjacent should check whether the neighbour is already in the open list before adding it, and if it is, only update it (parent and cost) when the new score is better.
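One way to express that check is sketched below. SketchNode and relax are hypothetical names, not part of the asker's classes, and the function assumes that nodes already on the closed list never need to be reopened (which holds for a consistent heuristic).

```cpp
#include <algorithm>
#include <vector>

// Hypothetical, simplified node: g is the cost from the start node.
struct SketchNode {
    int g = 0;
    SketchNode* parent = nullptr;
};

// Offers `neighbour` to the open list as reached from `current` with the given
// step cost. Returns true if the neighbour was added or its path was improved.
bool relax(std::vector<SketchNode*>& openList,
           const std::vector<SketchNode*>& closedList,
           SketchNode* current, SketchNode* neighbour, int stepCost) {
    // Already fully processed: leave it alone (assumes a consistent heuristic).
    if (std::find(closedList.begin(), closedList.end(), neighbour) != closedList.end()) {
        return false;
    }
    const int tentativeG = current->g + stepCost;
    const bool inOpen =
        std::find(openList.begin(), openList.end(), neighbour) != openList.end();
    // Already queued with an equal or better score: nothing to do.
    if (inOpen && tentativeG >= neighbour->g) {
        return false;
    }
    // New node, or a better path to a queued node: record cost and parent.
    neighbour->g = tentativeG;
    neighbour->parent = current;
    if (!inOpen) {
        openList.push_back(neighbour);
    }
    return true;
}
```

The linear searches keep the sketch close to the vectors used in the question; a per-node flag or a set would avoid rescanning the lists for every neighbour.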

Iterative version of a recursive algorithm is slower

I'm trying to implement an iterative version of Tarjan's strongly connected components (SCCs), reproduced here for your convenience (source: http://en.wikipedia.org/wiki/Tarjan%27s_strongly_connected_components_algorithm).
Input: Graph G = (V, E)
index = 0 // DFS node number counter
S = empty // An empty stack of nodes
forall v in V do
if (v.index is undefined) // Start a DFS at each node
tarjan(v) // we haven't visited yet
procedure tarjan(v)
v.index = index // Set the depth index for v
v.lowlink = index
index = index + 1
S.push(v) // Push v on the stack
forall (v, v') in E do // Consider successors of v
if (v'.index is undefined) // Was successor v' visited?
tarjan(v') // Recurse
v.lowlink = min(v.lowlink, v'.lowlink)
else if (v' is in S) // Was successor v' in stack S?
v.lowlink = min(v.lowlink, v'.lowlink )
if (v.lowlink == v.index) // Is v the root of an SCC?
print "SCC:"
repeat
v' = S.pop
print v'
until (v' == v)
My iterative version uses the following Node struct.
struct Node {
int id; //Signed int up to 2^31 - 1 = 2,147,483,647
int index;
int lowlink;
Node *caller; //If you were looking at the recursive version, this is the node before the recursive call
unsigned int vindex; //Equivalent to the iterator in the for-loop in tarjan
vector<Node *> *nodeVector; //Vector of adjacent Nodes
};
Here's what I did for the iterative version:
void Graph::runTarjan(int out[]) { //You can ignore out. It's a 5-element array that keeps track of the largest 5 SCCs
int index = 0;
tarStack = new stack<Node *>();
onStack = new bool[numNodes];
for (int n = 0; n < numNodes; n++) {
if (nodes[n].index == unvisited) {
tarjan_iter(&nodes[n], index);
}
}
}
void Graph::tarjan_iter(Node *u, int &index) {
u->index = index;
u->lowlink = index;
index++;
u->vindex = 0;
tarStack->push(u);
u->caller = NULL; //Equivalent to the node from which the recursive call would spawn.
onStack[u->id - 1] = true;
Node *last = u;
while(true) {
if(last->vindex < last->nodeVector->size()) { //Equivalent to the check in the for-loop in the recursive version
Node *w = (*(last->nodeVector))[last->vindex];
last->vindex++; //Equivalent to incrementing the iterator in the for-loop in the recursive version
if(w->index == unvisited) {
w->caller = last;
w->vindex = 0;
w->index = index;
w->lowlink = index;
index++;
tarStack->push(w);
onStack[w->id - 1] = true;
last = w;
} else if(onStack[w->id - 1] == true) {
last->lowlink = min(last->lowlink, w->index);
}
} else { //Equivalent to the nodeSet iterator pointing to end()
if(last->lowlink == last->index) {
numScc++;
Node *top = tarStack->top();
tarStack->pop();
onStack[top->id - 1] = false;
int size = 1;
while(top->id != last->id) {
top = tarStack->top();
tarStack->pop();
onStack[top->id - 1] = false;
size++;
}
insertNewSCC(size); //Ranks the size among array of 5 elements
}
Node *newLast = last->caller; //Go up one recursive call
if(newLast != NULL) {
newLast->lowlink = min(newLast->lowlink, last->lowlink);
last = newLast;
} else { //We've seen all the nodes
break;
}
}
}
}
My iterative version runs and gives me the same output as the recursive version. The problem is that the iterative version is slower, and I'm not sure why. Can anyone give me some insight on my implementation? Is there a better way to implement the recursive algorithm iteratively?
A recursive algorithm uses the call stack as its storage area. In the iterative version you use containers such as std::stack and std::vector, which rely on heap allocation. Stack allocation is very fast, since it only moves the stack pointer, whereas heap allocation may be substantially slower. So it is not entirely surprising that the iterative version is slower.
Generally speaking, if the problem at hand fits well within a stack-only recursive model, then, by all means, recurse.
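If you still need an iterative version (for example because very deep graphs overflow the call stack), one option is to keep the explicit stacks in vectors whose capacity is reserved once, so the heap is not touched again while the search runs. The following is only a sketch under assumptions: vertices are numbered from 0, the graph is a vector of adjacency lists, and sccIds is a hypothetical name. Whether it is faster than the recursive version depends on the compiler and the input, so measure it.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Returns, for each vertex of the adjacency-list graph `adj`, the id of its
// strongly connected component. Ids are assigned in the order components finish.
std::vector<int> sccIds(const std::vector<std::vector<int>>& adj) {
    const int n = static_cast<int>(adj.size());
    const int unvisited = -1;
    std::vector<int> index(n, unvisited), lowlink(n, 0), comp(n, -1);
    std::vector<char> onStack(n, 0);
    std::vector<std::size_t> nextEdge(n, 0);  // per-vertex position in its edge list
    std::vector<int> sccStack, callStack;
    sccStack.reserve(n);   // reserve once so pushes never reallocate
    callStack.reserve(n);
    int counter = 0, numScc = 0;

    for (int root = 0; root < n; ++root) {
        if (index[root] != unvisited) continue;
        index[root] = lowlink[root] = counter++;
        sccStack.push_back(root);
        onStack[root] = 1;
        callStack.push_back(root);
        while (!callStack.empty()) {
            const int v = callStack.back();
            if (nextEdge[v] < adj[v].size()) {
                const int w = adj[v][nextEdge[v]++];
                if (index[w] == unvisited) {
                    index[w] = lowlink[w] = counter++;
                    sccStack.push_back(w);
                    onStack[w] = 1;
                    callStack.push_back(w);  // plays the role of the recursive call
                } else if (onStack[w]) {
                    lowlink[v] = std::min(lowlink[v], index[w]);
                }
            } else {
                if (lowlink[v] == index[v]) {  // v is the root of a component
                    int w;
                    do {
                        w = sccStack.back();
                        sccStack.pop_back();
                        onStack[w] = 0;
                        comp[w] = numScc;
                    } while (w != v);
                    ++numScc;
                }
                callStack.pop_back();  // plays the role of returning to the caller
                if (!callStack.empty()) {
                    const int parent = callStack.back();
                    lowlink[parent] = std::min(lowlink[parent], lowlink[v]);
                }
            }
        }
    }
    return comp;
}
```

The per-vertex state lives in flat arrays indexed by vertex number instead of fields reached through pointers, which is a design choice of this sketch and not something the question's Node struct requires.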