C++: mark contiguous sections in a 3D array of objects

Suppose we have a 3x3x3 array of objects, each containing two members: a boolean and an integer. Can anyone suggest an efficient way of marking this array into contiguous chunks, based on the boolean value?
For example, if we picture it as a Rubik's cube and a middle slice is missing (everything on 1,x,x == false), could we mark the two outer slices as separate groups, by way of a unique group identifier on the int member?
The same needs to apply if the "slice" turns through 90 degrees, leaving an L shape and a strip.
Could it be done with very large 3D arrays using recursion? Could it be threaded?
I've hit the ground typing a few times so far but have ended up in a few dead ends and stack overflows.
Very grateful for any help, thanks.

It could be done this way:
struct A {int m_i; bool m_b;};
enum {ELimit = 3};
int neighbour_offsets_positive[3] = {1, ELimit, ELimit*ELimit};
A cube[ELimit][ELimit][ELimit];
A * first = &cube[0][0][0];
A * last = &cube[ELimit-1][ELimit-1][ELimit-1];
// Init 'cube'.
for(A * it = first; it <= last; ++it)
it->m_i = 0, it->m_b = true;
// Slice.
for(int i = 0; i != ELimit; ++i)
for(int j = 0; j != ELimit; ++j)
cube[1][i][j].m_b = false;
// Assign unique ids to coherent parts.
int id = 0;
for(A * it = first; it <= last; ++it)
{
if (it->m_b == false)
continue;
if (it->m_i == 0)
it->m_i = ++id;
for (int k = 0; k != 3; ++k)
{
A * neighbour = it + neighbour_offsets_positive[k];
if (neighbour <= last)
if (neighbour->m_b == true)
neighbour->m_i = it->m_i;
}
}

If I understand the term "contiguous chunk" correctly, i.e. a maximal set of array elements that all share the same boolean value and in which there is a path from each element to every other element, then this is the problem of finding connected components in a graph, which can be done with a simple DFS. Imagine that each array element is a vertex, and two vertices are connected if and only if 1) they share the same boolean value and 2) they differ in exactly one coordinate, and that difference is 1 in absolute value (i.e. they are adjacent).
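The connected-components idea can be sketched as an iterative flood fill. This is a minimal sketch, not a definitive implementation: the `Cell` type, the `labelGroups` name and the flat `(x * n + y) * n + z` layout are assumptions made for illustration. It uses an explicit queue (BFS) instead of recursion, so large grids do not overflow the call stack, and it checks each coordinate separately so a step never wraps from the end of one row to the start of the next.

```cpp
#include <queue>
#include <vector>

// Hypothetical cell type mirroring the question: a flag plus a group id.
struct Cell {
    bool solid = true;
    int group = 0;  // 0 means "not labelled yet"
};

// Labels every connected region of solid cells in an n*n*n grid stored flat
// (index = (x * n + y) * n + z). Returns the number of groups found.
int labelGroups(std::vector<Cell>& grid, int n) {
    static const int kSteps[6][3] = {{1, 0, 0}, {-1, 0, 0}, {0, 1, 0},
                                     {0, -1, 0}, {0, 0, 1}, {0, 0, -1}};
    int groups = 0;
    for (int start = 0; start < n * n * n; ++start) {
        if (!grid[start].solid || grid[start].group != 0) continue;
        grid[start].group = ++groups;  // new region found, give it a fresh id
        std::queue<int> pending;
        pending.push(start);
        while (!pending.empty()) {
            const int idx = pending.front();
            pending.pop();
            const int x = idx / (n * n), y = (idx / n) % n, z = idx % n;
            for (const auto& s : kSteps) {
                const int nx = x + s[0], ny = y + s[1], nz = z + s[2];
                if (nx < 0 || ny < 0 || nz < 0 || nx >= n || ny >= n || nz >= n)
                    continue;  // outside the grid
                const int next = (nx * n + ny) * n + nz;
                if (grid[next].solid && grid[next].group == 0) {
                    grid[next].group = groups;  // mark before queueing
                    pending.push(next);
                }
            }
        }
    }
    return groups;
}
```

Each cell is queued at most once, so the work is linear in the number of cells. For the missing middle slice on a 3x3x3 grid this yields two groups, one per outer slice.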

Related

When loading multiple TypedArrayContents arrays into a v8 array as elements (array of arrays), the last element overwrites all elements

I am writing a C++ addon to node.js. I want to process and bundle a large amount of data into an array of arrays to pass from the C++ process back to js. I've learned that NAN provides a v8 helper class called TypedArrayContents, which lets me directly access a v8 Float32Array through a pointer.
v8::Local<Float32Array> tempHeatmap = v8::Float32Array::New(
v8::ArrayBuffer::New(isolate, 4 * DIMENSIONS), 0, DIMENSIONS);
Nan::TypedArrayContents<float> dest(tempHeatmap);
for (int i = 0; i < heatmapCollection.size(); i++) {
v8::Local<Array> sensorDataArray =
Local<Array>::Cast(parentData->Get(i + 1));
for (int a = 0; a < nodeCount; a++) {
sensorDataVec[a] = sensorDataArray->Get(a)->NumberValue();
}
for (int j = 0; j < DIMENSIONS; j++) {
float num = 0.0, denom = 0.0;
for (int k = 0; k < nodeCount; k++) {
if (heatmapGrid[j][k] == 0.0) {
}
else {
num += float(sensorDataVec[k]) / float(heatmapGrid[j][k]);
denom += 1 / float(heatmapGrid[j][k]);
}
}
(*dest)[j] = num / denom;
}
returnHeatmapCollection->Set(i, tempHeatmap);
}
returnHeatmapCollection is a v8 array that stores Float32Array's as elements. I can access it from js side no problem.
However, every element inside returnHeatmapCollection ends up holding the post-processing result of the last element. In other words, if I generate 10 tempHeatmaps, the 10th one appears in every element of returnHeatmapCollection after ->Set().
Why would this occur? It makes far less sense to me than if it defaulted to the first element. The fact that the last element overwrites all previous ones suggests that every previous element is somehow modified each time I use ->Set. How can this be the case?
You have a single Float32Array (declared outside of your for loop) that you're setting to every index of returnHeatmapCollection. You need to make a new Float32Array for each index.

Tallest tower with stacked boxes in the given order

Given N boxes, how can I find the tallest tower made with them in the given order? (Given order means that the first box must be at the base of the tower, and so on.) All boxes must be used to make a valid tower.
A box may be rotated about any axis so that any of its 6 faces is parallel to the ground; however, the perimeter of that face must lie completely inside the perimeter of the top face of the box below it. For the first box any face can be chosen, because the ground is big enough.
To solve this problem I've tried the following:
- First, the code generates the rotations for each box (just the permutations of its dimensions)
- Second, it builds a dynamic programming table over each box and each possible rotation
- Finally, it searches the dp table for the highest tower
But my algorithm gets a wrong answer on unknown test cases. What is wrong with it? Is dynamic programming the best approach to this problem?
Here is my code:
#include <cstdio>
#include <vector>
#include <algorithm>
#include <cstdlib>
#include <cstring>
struct rectangle{
int coords[3];
rectangle(){ coords[0] = coords[1] = coords[2] = 0; }
rectangle(int a, int b, int c){coords[0] = a; coords[1] = b; coords[2] = c; }
};
bool canStack(rectangle &current_rectangle, rectangle &last_rectangle){
for (int i = 0; i < 2; ++i)
if(current_rectangle.coords[i] > last_rectangle.coords[i])
return false;
return true;
}
//six is the number of rotations for each rectangle
int dp(std::vector< std::vector<rectangle> > &v){
int memoization[6][v.size()];
memset(memoization, -1, sizeof(memoization));
//all rotations of the first rectangle can be used
for (int i = 0; i < 6; ++i) {
memoization[i][0] = v[0][i].coords[2];
}
//for each rectangle
for (int i = 1; i < v.size(); ++i) {
//for each possible permutation of the current rectangle
for (int j = 0; j < 6; ++j) {
//for each permutation of the previous rectangle
for (int k = 0; k < 6; ++k) {
rectangle &prev = v[i - 1][k];
rectangle &curr = v[i][j];
//is possible to put the current rectangle with the previous rectangle ?
if( canStack(curr, prev) ) {
memoization[j][i] = std::max(memoization[j][i], curr.coords[2] + memoization[k][i-1]);
}
}
}
}
//what is the best solution ?
int ret = -1;
for (int i = 0; i < 6; ++i) {
ret = std::max(memoization[i][v.size()-1], ret);
}
return ret;
}
int main ( void ) {
int n;
scanf("%d", &n);
std::vector< std::vector<rectangle> > v(n);
for (int i = 0; i < n; ++i) {
rectangle r;
scanf("%d %d %d", &r.coords[0], &r.coords[1], &r.coords[2]);
//generate all rotations with the given rectangle (all combinations of the coordinates)
for (int j = 0; j < 3; ++j)
for (int k = 0; k < 3; ++k)
if(j != k) //micro optimization disease
for (int l = 0; l < 3; ++l)
if(l != j && l != k)
v[i].push_back( rectangle(r.coords[j], r.coords[k], r.coords[l]) );
}
printf("%d\n", dp(v));
}
Input Description
A test case starts with an integer N, representing the number of boxes (1 ≤ N ≤ 10^5).
Following there will be N rows, each containing three integers, A, B and C, representing the dimensions of the boxes (1 ≤ A, B, C ≤ 10^4).
Output Description
Print one row containing one integer, representing the maximum height of the stack if it’s possible to pile all the N boxes, or -1 otherwise.
Sample Input
2
5 2 2
1 3 4
Sample Output
6
Sample image for the given input and output.
Usually you're given the test case that made you fail. Otherwise, finding the problem is a lot harder.
You can always approach it from a different angle! I'm going to leave out the boring parts that are easily replicated.
struct Box { unsigned int dim[3]; };
Box will store the dimensions of each... box. When it comes time to read the dimensions, it needs to be sorted so that dim[0] >= dim[1] >= dim[2].
The idea is to loop and read the next box each iteration. It then compares the second largest dimension of the new box with the second largest dimension of the last box, and same with the third largest. If in either case the newer box is larger, it adjusts the older box to compare the first largest and third largest dimension. If that fails too, then the first and second largest. This way, it always prefers using a larger dimension as the vertical one.
If it had to rotate a box, it goes to the next box down and checks that the rotation doesn't need to be adjusted there too. It continues until there are no more boxes or it didn't need to rotate the next box. If at any time, all three rotations for a box failed to make it large enough, it stops because there is no solution.
Once all the boxes are in place, it just sums up each one's vertical dimension.
int main()
{
unsigned int size; //num boxes
std::cin >> size;
std::vector<Box> boxes(size); //all boxes
std::vector<unsigned char> pos(size, 0); //index of vertical dimension
//gets the index of dimension that isn't vertical
//largest indicates if it should pick the larger or smaller one
auto get = [](unsigned char x, bool largest) { if (largest) return x == 0 ? 1 : 0; return x == 2 ? 1 : 2; };
//check will compare the dimensions of two boxes and return true if the smaller one is under the larger one
auto check = [&boxes, &pos, &get](unsigned int x, bool largest) { return boxes[x - 1].dim[get(pos[x - 1], largest)] < boxes[x].dim[get(pos[x], largest)]; };
unsigned int x = 0, y; //indexing variables
unsigned char change; //detects box rotation change
bool fail = false; //if it cannot be solved
for (x = 0; x < size && !fail; ++x)
{
//read in the next three dimensions
//make sure dim[0] >= dim[1] >= dim[2]
//simple enough to write
//mine was too ugly and I didn't want to be embarrassed
y = x;
while (y && !fail) //when y == 0, no more boxes to check
{
change = pos[y - 1];
while (check(y, true) || check(y, false)) //while invalid rotation
{
if (++pos[y - 1] == 3) //rotate, when pos == 3, no solution
{
fail = true;
break;
}
}
if (change != pos[y - 1]) //if rotated box
--y;
else
break;
}
}
if (fail)
{
std::cout << -1;
}
else
{
unsigned long long max = 0;
for (x = 0; x < size; ++x)
max += boxes[x].dim[pos[x]];
std::cout << max;
}
return 0;
}
It works for the test cases I've written, but given that I don't know what caused yours to fail, I can't tell you what mine does differently (assuming it also doesn't fail your test conditions).
If you are allowed, this problem might benefit from a tree data structure.
First, define the three possible cases of block:
1) Cube - there is only one possible option for orientation, since every orientation results in the same height (applied toward total height) and the same footprint (applied to the restriction that the footprint of each block is completely contained by the block below it).
2) Square Rectangle - there are three possible orientations for this rectangle with two equal dimensions (for examples, a 4x4x1 or a 4x4x7 would both fit this).
3) All Different Dimensions - there are six possible orientations for this shape, where each side is different from the rest.
For the first box, choose how many orientations its shape allows, and create corresponding nodes at the first level (a root node with zero height will allow using simple binary trees, rather than requiring a more complicated type of tree that allows multiple elements within each node). Then, for each orientation, choose how many orientations the next box allows but only create nodes for those that are valid for the given orientation of the current box. If no orientations are possible given the orientation of the current box, remove that entire unique branch of orientations (the first parent node with multiple valid orientations will have one orientation removed by this pruning, but that parent node and all of its ancestors will be preserved otherwise).
By doing this, you can check for sets of boxes that have no solution by checking whether there are any elements below the root node, since an empty tree indicates that all possible orientations have been pruned away by invalid combinations.
If the tree is not empty, then just walk the tree to find the highest sum of heights within each branch of the tree, recursively up the tree to the root - the sum value is your maximum height, such as the following pseudocode:
std::size_t maximum_height() const{
// a missing child contributes no height, so a node with a single child still counts that child
std::size_t leftheight = leftnode ? leftnode->maximum_height() : 0;
std::size_t rightheight = rightnode ? rightnode->maximum_height() : 0;
// this node's own height plus the taller of the two subtrees
return this_node_box_height + (leftheight >= rightheight ? leftheight : rightheight);
}
The benefits of using a tree data structure are
1) You will greatly reduce the number of possible combinations you have to store and check, because in a tree, the invalid orientations will be eliminated at the earliest possible point - for example, using your 2x2x5 first box, with three possible orientations (as a Square Rectangle), only two orientations are possible because there is no possible way to orient it on its 2x2 end and still fit the 4x3x1 block on it. If on average only two orientations are possible for each block, you will need a much smaller number of nodes than if you compute every possible orientation and then filter them as a second step.
2) Detecting sets of blocks where there is no solution is much easier, because the data structure will only contain valid combinations.
3) Working with the finished tree will be much easier - for example, to find the sequence of orientations of the highest, rather than just the actual height, you could pass an empty std::vector to a modified highest() implementation, and let it append the actual orientation of each highest node as it walks the tree, in addition to returning the height.
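The same pruning can also be expressed without building a tree, by keeping only the best height per orientation of the most recent box. The following is a sketch under assumptions: the `Box` alias and the `tallestTower` name are hypothetical, and a face is treated as fitting when both of its sides are less than or equal to the matching sides of the face below, the same rule as canStack in the question. Note the explicit skip of unreachable states: the code in the question adds memoization[k][i-1] even while it still holds -1, which may be one source of wrong answers.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <vector>

using Box = std::array<int, 3>;  // hypothetical alias: the three dimensions

// Tallest tower using every box in the given order, or -1 if impossible.
long long tallestTower(const std::vector<Box>& boxes) {
    // Each row is (width index, depth index, height index) for one rotation.
    static const int kPerm[6][3] = {{0, 1, 2}, {0, 2, 1}, {1, 0, 2},
                                    {1, 2, 0}, {2, 0, 1}, {2, 1, 0}};
    const long long kUnreachable = -1;
    if (boxes.empty()) return 0;

    std::array<long long, 6> best;
    for (int r = 0; r < 6; ++r) best[r] = boxes[0][kPerm[r][2]];

    for (std::size_t i = 1; i < boxes.size(); ++i) {
        std::array<long long, 6> next;
        next.fill(kUnreachable);
        for (int r = 0; r < 6; ++r) {
            const int w = boxes[i][kPerm[r][0]];
            const int d = boxes[i][kPerm[r][1]];
            const int h = boxes[i][kPerm[r][2]];
            for (int p = 0; p < 6; ++p) {
                if (best[p] == kUnreachable) continue;  // pruned branch
                const int pw = boxes[i - 1][kPerm[p][0]];
                const int pd = boxes[i - 1][kPerm[p][1]];
                if (w <= pw && d <= pd)
                    next[r] = std::max(next[r], best[p] + h);
            }
        }
        best = next;  // only the previous layer is ever needed
    }
    return *std::max_element(best.begin(), best.end());
}
```

This uses constant extra memory and 36 comparisons per box, which stays comfortable for N up to 10^5. For the sample input (5 2 2, then 1 3 4) it returns 6.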

Optimal way to find shared elements between combination pairs

I have a list of ordered items of type A, who each contain a subset from a list of items B. For each pair of items in A, I would like to find the number of items B that they share (intersect).
For example, if I have this data:
A1 : B1
A2 : B1 B2 B3
A3 : B1
Then I would get the following result:
A1, A2 : 1
A1, A3 : 1
A2, A3 : 1
The problem I'm having is making the algorithm efficient. The size of my dataset is about 8.4K items of type A. This means 8.4K choose 2 = 35275800 combinations. The algorithm I'm using is simply going through each combination pair and doing a set intersection.
The gist of what I have so far is below. I am storing the counts as a key in a map, with the value as a vector of A pairs. I'm using a graph data structure to store the data, but the only 'graph' operation I'm using is get_neighbors() which returns the B subset for an item from A. I happen to know that the elements in the graph are ordered from index 0 to 8.4K.
void get_overlap(Graph& g, map<int, vector<A_pair> >& overlap) {
map<int, vector<A_pair> >::iterator it;
EdgeList el_i, el_j;
set<int> intersect;
size_t i, j;
VertexList vl = g.vertices();
for (i = 0; i < vl.size()-1; i++) {
el_i = g.get_neighbors(i);
for (j = i+1; j < vl.size(); j++) {
el_j = g.get_neighbors(j);
set_intersection(el_i.begin(), el_i.end(), el_j.begin(), el_j.end(), inserter(intersect, intersect.begin()));
int num_overlap = intersect.size();
it = overlap.find(num_overlap);
if (it == overlap.end()) {
vector<A_pair> temp;
temp.push_back(A_pair(i, j));
overlap.insert(pair<int, vector<A_pair> >(num_overlap, temp));
}
else {
vector<A_pair> temp = it->second;
temp.push_back(A_pair(i, j));
overlap[num_overlap] = temp;
}
}
}
}
I have been running this program for nearly 24 hours, and the ith element in the for loop has reached iteration 250 (I'm printing each i to a log file). This, of course, is a long way from 8.4K (although I know as iterations go on, the number of comparisons will shorten since j = i +1). Is there a more optimal approach?
Edit: To be clear, the goal here is ultimately to find the top k overlapped pairs.
Edit 2: Thanks to @Beta and others for pointing out optimizations. In particular, updating the map directly (instead of copying its contents and resetting the map value) drastically improved the performance. It now runs in a matter of seconds.
I think you may be able to make things faster by pre-computing a reverse (edge-to-vertex) map. This would allow you to avoid the set_intersection call, which performs a bunch of costly set insertions. I am missing some declarations to make fully functional code, but hopefully you will get the idea. I am assuming that EdgeList is some sort of int vector:
void get_overlap(Graph& g, map<int, vector<A_pair> >& overlap) {
map<int, vector<A_pair> >::iterator it;
EdgeList el_i, el_j;
set<int> intersect;
size_t i, j;
VertexList vl = g.vertices();
// compute reverse map
map<int, set<int>> reverseMap;
for (i = 0; i < vl.size(); i++) { // include the last vertex: it is looked up as j below
el_i = g.get_neighbors(i);
for (auto e : el_i) {
const auto findIt = reverseMap.find(e);
if (end(reverseMap) == findIt) {
reverseMap.emplace(e, set<int>{static_cast<int>(i)});
} else {
findIt->second.insert(i);
}
}
}
for (i = 0; i < vl.size()-1; i++) {
el_i = g.get_neighbors(i);
for (j = i+1; j < vl.size(); j++) {
el_j = g.get_neighbors(j);
int num_overlap = 0;
for (auto e: el_i) {
auto findIt = reverseMap.find(e);
if (end(reverseMap) != findIt) {
if (findIt->second.count(j) > 0) {
++num_overlap;
}
}
}
it = overlap.find(num_overlap);
if (it == overlap.end()) {
overlap.emplace(num_overlap, vector<A_pair>({ A_pair(i, j) }));
}
else {
it->second.push_back(A_pair(i,j));
}
}
}
}
I have not done a precise performance analysis, but inside the double loop this replaces "at most 4N comparisons" plus some costly set insertions (from set_intersection) with roughly N*(log(E) + log(M)) comparisons, where N is the average number of edges per vertex, M is the average number of vertices per edge, and E is the number of edges, so it could be beneficial depending on your data set.
Also, if your edge indexes are compact, you can use a simple vector rather than a map to represent the reverse map, which removes the log(E) cost.
One question, though. Since you're talking about vertices and edges, don't you have the additional constraint that edges always have 2 vertices? This could simplify some computations.
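Taking the reverse map one step further, the pair loop can be driven by the B items instead of by all A pairs. This is a different technique from the code above (an inverted index with pair counting), sketched under assumptions: `countSharedItems` and `PairCounts` are hypothetical names, the input is a plain vector of B ids per A item, and no A item lists the same B id twice.

```cpp
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Hypothetical result type: (smaller A index, larger A index) -> shared count.
using PairCounts = std::map<std::pair<int, int>, int>;

// itemsOfA[a] lists the B ids held by A item a (assumed free of duplicates).
PairCounts countSharedItems(const std::vector<std::vector<int>>& itemsOfA) {
    // Inverted index: B id -> A indices that contain it, in increasing order.
    std::map<int, std::vector<int>> holders;
    for (std::size_t a = 0; a < itemsOfA.size(); ++a)
        for (int b : itemsOfA[a])
            holders[b].push_back(static_cast<int>(a));

    // Every pair of A items listed under the same B shares that B.
    PairCounts shared;
    for (const auto& entry : holders) {
        const std::vector<int>& as = entry.second;
        for (std::size_t i = 0; i < as.size(); ++i)
            for (std::size_t j = i + 1; j < as.size(); ++j)
                ++shared[{as[i], as[j]}];
    }
    return shared;
}
```

Pairs with no shared item never appear in the result, so the cost follows the number of overlapping pairs rather than all 35 million combinations. If one B item is held by nearly every A item the work is still quadratic for that item.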

Create Minimum Spanning Tree from Adjacency Matrix using Prim's Algorithm

I want to implement Prim's algorithm to find the minimum spanning tree of a graph. I have written some code to start with what I think is the way to do it, but I'm kind of stuck on how to complete this.
Right now, I have a matrix accessed as matrix[i][j], which is stored as a vector of vectors. I also have a list of IP addresses stored in the variable ip. (These become the labels of each column/row in the graph.)
int n = 0;
for(int i = 0; i<ip.size();i++) // column
{
for(int j = ip.size()-1; j>n;j--)
{
if(matrix[i][j] > 0)
{
edgef test;
test.ip1 = ip[i];
test.ip2 = ip[j];
test.w = matrix[i][j];
add(test);
}
}
n++;
}
At the moment, this code will look into one column, and add all the weights associated with that column to a binary min heap. What I want to do is, dequeue an item from the heap and store it somewhere if it is the minimum edge weight.
void entry::add(edgef x)
{
int current, temp;
current = heap.size();
heap.push_back(x);
if(heap.size() > 1)
{
while(heap[current].w < heap[current/2].w) // if child is less than parent, min heap style
{
edgef temp = heap[current/2]; // swap
heap[current/2] = heap[current];
heap[current] = temp;
current = current/2;
}
}
}
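One way the remaining steps could look, as a sketch under assumptions: it uses std::priority_queue instead of the hand-written heap, identifies vertices by row index rather than by IP string, and treats a matrix entry greater than 0 as an edge weight, as in the loop above. `MstEdge` and `primMst` are hypothetical names. The key difference from pushing every edge up front is that only edges leaving the tree built so far are added to the heap, and a popped edge is discarded if its far end was already reached.

```cpp
#include <cstddef>
#include <functional>
#include <queue>
#include <tuple>
#include <vector>

// Hypothetical edge record; indices refer to rows/columns of the matrix.
struct MstEdge {
    int from;
    int to;
    int weight;
};

// Prim's algorithm on a symmetric adjacency matrix (0 = no edge).
// Returns the tree edges of the component that contains vertex 0.
std::vector<MstEdge> primMst(const std::vector<std::vector<int>>& matrix) {
    const int n = static_cast<int>(matrix.size());
    std::vector<MstEdge> tree;
    if (n == 0) return tree;

    std::vector<bool> inTree(n, false);
    using Entry = std::tuple<int, int, int>;  // (weight, from, to)
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> frontier;

    // Marks v as reached and queues its edges to vertices not yet reached.
    auto addVertex = [&](int v) {
        inTree[v] = true;
        for (int u = 0; u < n; ++u)
            if (!inTree[u] && matrix[v][u] > 0)
                frontier.emplace(matrix[v][u], v, u);
    };

    addVertex(0);
    while (!frontier.empty() && static_cast<int>(tree.size()) < n - 1) {
        const Entry cheapest = frontier.top();
        frontier.pop();
        const int to = std::get<2>(cheapest);
        if (inTree[to]) continue;  // stale entry, would create a cycle
        tree.push_back({std::get<1>(cheapest), to, std::get<0>(cheapest)});
        addVertex(to);
    }
    return tree;
}
```

The returned indices can be mapped back to labels with ip[edge.from] and ip[edge.to]. If the graph is disconnected, fewer than n - 1 edges come back.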

How to efficiently change a contiguous portion of a matrix?

Given a matrix of M rows and N columns, allocated as a byte array of M*N elements (initially all zero), I would like to modify this matrix according to the following rule: the elements in the neighborhood of a certain element must be set to a given value. In other words, I need to set a rectangular region of the matrix, which means accessing non-contiguous portions of the array.
In order to perform the above operation, I have access to the following information:
the pointer to the element that is located in the center of the neighborhood (this pointer must not be changed during the above operation); the position (row and column) of this element is also provided;
the size L*L of the neighborhood (L is always an odd number).
The code that implements this operation should be executed as fast as possible in C++: for this reason I thought of using the above pointer to access different pieces of the array. Instead, the position (row and column) of the central element of the neighborhood could allow me to check whether the specified region exceeds the dimensions of the matrix (for example, the center of the region may be located on the edge of the matrix): in this case I should set only that part of the region that is located in the matrix.
int M = ... // number of matrix rows
int N = ... // number of matrix columns
char* centerPtr = ... // pointer to the center of the region
int i = ... // position of the central element
int j = ... // of the region to be modified
char* tempPtr = centerPtr - (N+1)*(L/2); // top-left corner: L/2 rows up, L/2 columns left
for(int k=0; k < L; k++)
{
memset(tempPtr,value,L); // each row of the region is L bytes wide
tempPtr += N;
}
How can I improve the code?
How to handle the fact that a region may exceed the dimensions of the matrix?
How to make the code more efficient with respect to the execution time?
Your code is probably optimal for the general case where the region does not overlap the outside of the matrix. The main efficiency problem you can cause with this kind of code is to make the outer loop over columns instead of rows. This destroys cache and paging performance. You haven't done that.
Using pointers has little or no speed advantage with most modern compilers. Optimizers will come up with very good pointer code from normal array indices. In some cases I've seen array index code run substantially faster than hand-tweaked pointer code for the same thing. So don't use pointer arithmetic if index arithmetic is clearer.
There are 8 boundary cases: north, northwest, west, ..., northeast. Each of these will need a custom version of your loop to touch the right elements. I'll show the northwest case and let you work out the rest.
The fastest possible way to handle the cases is a two-level "if" tree:
if (j < L/2) { // northwest, west, or southwest
if (i < L/2) {
// northwest
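// clipped at the top and left, so the region starts at row 0, column 0
char* tempPtr = centerPtr - i * N - j;
for(int k = 0; k < i + L/2 + 1; k++) { // rows 0 .. i + L/2
memset(tempPtr, value, j + L/2 + 1); // columns 0 .. j + L/2
tempPtr += N;
}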
} else if (i >= M - L/2) {
// southwest
} else {
// west
}
} else if (j >= N - L/2) { // symmetrical cases for east.
if (i < L/2) {
// northeast
} else if (i >= M - L/2) {
// southeast
} else {
// east
}
} else {
if (i < L/2) {
// north
} else if (i >= M - L/2) {
// south
} else {
// no overlap
}
}
It's tedious to do it like this, but you'll have no more than 4 comparisons per region.