Generating "unique" matricies - c++

This may be more of a math question than a programming question, but since I am specifically working in C++ I figured maybe there was a library or something I didn't know about.
Anyway, I'm working on a game where I'm generating X by X arrays of booleans and randomly assigning Y of them to be true (think Tetris-block kind of stuff). What I need to know is whether there's a clever way to generate "unique" arrays without having to rotate each array 4 times and compare it each time. To use Tetris as an example again: an "L" piece is an "L" piece no matter how it's rotated, but a "J" piece would be a different, unique piece. As a side question, is there a way to determine the maximum number of unique possible configurations for an X by X array with Y filled-in elements?

You could sum (x-(X-1)/2)^2 + (y-(X-1)/2)^2 over every true grid element (x,y), using 0-based coordinates so that ((X-1)/2, (X-1)/2) is the centre of rotation. This is the sum of squared distances from the centre of your grid to each "true" cell. Two grids that are the same up to rotation have their "true" cells at exactly the same distances from the centre, so this sum will also be the same. If all your grids have unique sums of squares, they are unique under rotation.
Note that although unique sums guarantee no rotational duplicates, the converse isn't true; two non-matching grids can have the same sum of squares.
If your grids are quite small and you are struggling to maximize the number of different patterns, you'll probably want to explicitly test the grids whose sums are equal. Otherwise, if your generator spits out a grid whose sum of squares matches a previously created grid, just reject it.
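As a minimal sketch of that signature (assuming the grid is stored as a std::vector<std::vector<bool>>, which is just an illustrative choice, and scaling coordinates by 2 so everything stays an integer):

    #include <cstdint>
    #include <vector>

    // Rotation-invariant signature: sum of (scaled) squared distances from the
    // grid centre to every true cell. 2*x - (X-1) is twice the offset of row x
    // from the centre, so the sum is 4 times the sum of squared distances.
    std::int64_t signature(const std::vector<std::vector<bool>>& grid)
    {
        const int X = static_cast<int>(grid.size());
        std::int64_t sum = 0;
        for (int x = 0; x < X; ++x)
            for (int y = 0; y < X; ++y)
                if (grid[x][y]) {
                    const std::int64_t dx = 2 * x - (X - 1);
                    const std::int64_t dy = 2 * y - (X - 1);
                    sum += dx * dx + dy * dy;
                }
        return sum;
    }

Keeping the signatures of previously accepted grids in a std::set and rejecting candidates whose signature is already present implements the rejection scheme above; on a collision you can still fall back to an explicit 4-rotation comparison.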

What you can do is compute a basic (canonical) form: somehow uniquely decide which of the 4 possible orientations is the basic one, and then compare grids via their basic forms only.
How do you decide which form is the basic one? It doesn't really matter, as long as it is consistent. Say, pick the one that is highest according to lexicographical comparison.
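A minimal sketch of that idea (again assuming a std::vector<std::vector<bool>> grid, an illustrative representation rather than anything from the question):

    #include <algorithm>
    #include <vector>

    using Grid = std::vector<std::vector<bool>>;

    // Rotate a square grid 90 degrees clockwise: cell (r, c) moves to (c, X-1-r).
    Grid rotate90(const Grid& g)
    {
        const std::size_t X = g.size();
        Grid out(X, std::vector<bool>(X));
        for (std::size_t r = 0; r < X; ++r)
            for (std::size_t c = 0; c < X; ++c)
                out[c][X - 1 - r] = g[r][c];
        return out;
    }

    // Basic (canonical) form: the lexicographically greatest of the 4 rotations.
    // Two grids are equal up to rotation iff their basic forms are equal.
    Grid basicForm(Grid g)
    {
        Grid best = g;
        for (int i = 0; i < 3; ++i) {
            g = rotate90(g);
            best = std::max(best, g);   // std::vector compares lexicographically
        }
        return best;
    }

Storing only the basic forms (for example in a std::set<Grid>) turns the uniqueness check into a single lookup.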
Edit:
About the number of unique shapes: roughly speaking it is the binomial coefficient C(n^2, k) divided by 4. For example, a 4x4 grid with 4 filled cells gives C(16, 4)/4 = 1820/4 = 455 as an estimate. It is only rough because it doesn't take into account symmetrical shapes that are preserved by a 180° rotation (those are counted fewer than 4 times), though there are only a few such shapes in comparison (at least for large n, k).
Side note: you should also consider the case of shapes that differ only by a shift.


Linear Programming - Constraints

I am trying to encode this (a small part of a project) to linear programming:
For each package p we know its length (xDimp) and width (yDimp). Also, we have the length (xTruck) and width (yTruck) of the Truck. All the numbers are integers.
Due to the design of the packages, they cannot be rotated when placed in a truck.
The Truck is represented as a matrix of 2 dimensions, only with x and y coordinates. We ignore the height.
Decision variables:
– pxy[p,x,y] = package p is in the cell with upper-right coordinates (x, y)
– pbl[p,x,y] = the bottom left cell of p has upper-right coordinates (x, y)
How do I write the constraints that set the pbl and pxy variables? I suppose that I should set the pbl variable to ensure that the package fits in the truck, and that the value of the pxy variables depends on the value of pbl.
Thank you,
This is a variant of the bin packing problem: a two-dimensional packing of multiple rectangles of varying widths and heights into an enclosing rectangle (2BP). If they are only allowed to be rotated by 90°, we get the orthogonal-orientations rectangular packing problem, and in your case we have a non-rotatable rectangular packing problem. Its computational complexity is NP-hard, but it's not infeasible.
From your description, the problem is already discretised, restricting the possible placements to the grid, which means that the optimum of the continuous version may not be available anymore.
One approach is to calculate a conflict graph G = (V, E) in advance, which represents your search space and holds the information about the overlap of the rectangles: every node in V represents a possible placement of a package within your truck (a package together with a bottom-left cell), and every edge in E represents a conflict between two placements. Two placed packages p and q, with bottom-left cells (x_p, y_p) and (x_q, y_q), intersect iff
x_p < x_q + xDimq and x_q < x_p + xDimp, and the analogous pair of inequalities holds in y,
i.e. iff their x-ranges and y-ranges overlap pairwise.
Now, the packing problem on the grid is a maximum independent set problem (MIS) on the conflict graph, assuming you want to maximize the number of packages on the truck. The MIS has the following ILP formulation: maximize the sum of x_v over all nodes v in V, subject to x_u + x_v <= 1 for every edge (u, v) in E, with x_v in {0, 1}.
The LP relaxation of this edge formulation is weak, so it is still not well suited to a branch-and-bound solving method. If C is a clique in G, then any independent set can pick at most one node from C, therefore use the stronger clique constraint: the sum of x_v over all v in C is at most 1, for every clique C.
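Putting those pieces together, a compact sketch of the clique-strengthened MIS formulation (in LaTeX notation, with one binary variable x_v per possible placement v) would be:

\begin{align*}
\text{maximize}\quad & \sum_{v \in V} x_v \\
\text{subject to}\quad & \sum_{v \in C} x_v \le 1 && \text{for every clique } C \text{ of } G, \\
 & x_v \in \{0, 1\} && \text{for every } v \in V.
\end{align*}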
The size of the resulting linear program can grow exponentially, since there may be exponentially many clique constraints.
In order to go further, you can try a meta constraint satisfaction approach.
Firstly, use constraints to make sure each package stays within the truck: for every package p placed with bottom-left cell (x, y), require x >= 1, y >= 1, x + xDimp - 1 <= xTruck and y + yDimp - 1 <= yTruck (equivalently, only allow the pbl[p,x,y] variables for which the package fits).
Secondly, use a set of disjunctive constraints to prevent overlap: for every pair of packages p and q with bottom-left cells (x_p, y_p) and (x_q, y_q), require x_p + xDimp <= x_q or x_q + xDimq <= x_p or y_p + yDimp <= y_q or y_q + yDimq <= y_p (p is entirely left of, right of, below, or above q).
From that point on, you can start to formulate a meta program, as described here.
I think this should be enough for a start :-)
You can find more information in the literature about combinatorial optimization.
Sources:
http://www.staff.uni-mainz.de/schoemer/publications/ESA03.pdf
https://kluedo.ub.uni-kl.de/frontdoor/index/index/docId/2046

What kind of algorithm for generating a height-map from contour lines?

I'm looking to interpolate some contour lines in order to generate a 3D view. The contours are not stored in a picture; the coordinates of each point of each contour are simply stored in a std::vector.
For convex contours, it seems (I didn't check it myself) that the height can easily be calculated by linear interpolation, using the distance between the two closest points of the two closest contours.
My contours, however, are not necessarily convex, so it's trickier... actually I don't have any idea what kind of algorithm I can use.
UPDATE (26 Nov. 2013):
I finished writing a discrete Laplace example; you can get the code here.
What you have is basically the classical Dirichlet problem:
Given the values of a function on the boundary of a region of space, assign values to the function in the interior of the region so that it satisfies a specific equation (such as Laplace's equation, which essentially requires the function to have no arbitrary "bumps") everywhere in the interior.
There are many ways to calculate approximate solutions to the Dirichlet problem. A simple approach, which should be well suited to your problem, is to start by discretizing the system; that is, you take a finite grid of height values, assign fixed values to those points that lie on a contour line, and then solve a discretized version of Laplace's equation for the remaining points.
Now, what Laplace's equation actually specifies, in plain terms, is that every point should have a value equal to the average of its neighbors. In the mathematical formulation of the equation, we require this to hold true in the limit as the radius of the neighborhood tends towards zero, but since we're actually working on a finite lattice, we just need to pick a suitable fixed neighborhood. A few reasonable choices of neighborhoods include:
the four orthogonally adjacent points surrounding the center point (a.k.a. the von Neumann neighborhood),
the eight orthogonally and diagonally adjacent grid points (a.k.a. the Moore neighborhood), or
the eight orthogonally and diagonally adjacent grid points, weighted so that the orthogonally adjacent points are counted twice (essentially the sum or average of the above two choices).
(Out of the choices above, the last one generally produces the nicest results, since it most closely approximates a Gaussian kernel, but the first two are often almost as good, and may be faster to calculate.)
Once you've picked a neighborhood and defined the fixed boundary points, it's time to compute the solution. For this, you basically have two choices:
Define a system of linear equations, one per each (unconstrained) grid point, stating that the value at each point is the average of its neighbors, and solve it. This is generally the most efficient approach, if you have access to a good sparse linear system solver, but writing one from scratch may be challenging.
Use an iterative method, where you first assign an arbitrary initial guess to each unconstrained grid point (e.g. using linear interpolation, as you suggest) and then loop over the grid, replacing the value at each point with the average of its neighbors. Then keep repeating this until the values stop changing (much).
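A minimal sketch of the iterative option, assuming the grid is held as vectors of doubles with a parallel mask marking the cells fixed by contour lines (all names here are illustrative), and leaving the outer border untouched for simplicity:

    #include <algorithm>
    #include <cmath>
    #include <vector>

    // Repeatedly replace every free cell by the average of its four orthogonal
    // neighbours (von Neumann neighbourhood); cells on a contour keep their value.
    void relaxHeightmap(std::vector<std::vector<double>>& height,
                        const std::vector<std::vector<bool>>& isFixed,
                        double tolerance = 1e-4, int maxIterations = 10000)
    {
        const std::size_t rows = height.size();
        const std::size_t cols = rows ? height[0].size() : 0;

        for (int it = 0; it < maxIterations; ++it) {
            double maxChange = 0.0;
            for (std::size_t r = 1; r + 1 < rows; ++r)
                for (std::size_t c = 1; c + 1 < cols; ++c) {
                    if (isFixed[r][c])
                        continue;                        // contour cells stay put
                    const double avg = 0.25 * (height[r - 1][c] + height[r + 1][c] +
                                               height[r][c - 1] + height[r][c + 1]);
                    maxChange = std::max(maxChange, std::fabs(avg - height[r][c]));
                    height[r][c] = avg;                  // in-place (Gauss-Seidel style) update
                }
            if (maxChange < tolerance)
                break;                                   // values have (almost) stopped changing
        }
    }

Seeding the free cells with a rough linear interpolation instead of zeros, as suggested above, makes this converge much faster.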
You can generate the Constrained Delaunay Triangulation of the vertices and line segments describing the contours, then use the height defined at each vertex as a Z coordinate.
The resulting triangulation can then be rendered like any other triangle soup.
Despite the name, you can use TetGen to generate the triangulations, though it takes a bit of work to set up.

Partition an n-dimensional "square" space into cubes

Right now I am stuck solving the following "semi"-mathematical problem.
I would like to partition an n-dimensional restricted space (a hypercube, to be precise)
D = {(x_1, ..., x_n) : x_i \in IR and -limits <= x_i <= limits \forall i <= n} into smaller cubes.
Meaning I would like to specify n, limits and m, where m is the number of partitions per side of the cube; 2*limits/m would then be the edge length of the small cubes, and I would get m^n such cubes.
Now I would like to return a vector of vectors containing some distinct coordinates of these small cubes (or perhaps one could represent the cubes as objects, each characterized by a vector pointing to its "left" outer corner?).
Basically I have no idea whether something like that is even doable using C++. Implementing this for a fixed n does not pose a problem, but I would like to let the user freely choose the dimension.
Background: something like that would be priceless in optimization, where one would partition the space into smaller subspaces and use e.g. a genetic algorithm on each of the subspaces, later comparing the results. Huge initial populations could thus be avoided and the search results drastically improved.
Also I am just curious whether something like that is doable :)
My suggestion: use B+ trees?
Let m be the number of partitions per dimension, i.e. per edge, of the hypercube D.
Then there are m^n different subspaces S of D, like you say. Let each subspace S be uniquely represented by integer coordinates S = [y_1, y_2, ..., y_n], where the y_i are integers in the range 1, ..., m. In Cartesian coordinates, S then consists of the points (x_1, x_2, ..., x_n) with Delta*(y_i - 1) - limits <= x_i < Delta*y_i - limits, where Delta = 2*limits/m.
The "left outer corner" or origin of S you were looking for is just the point corresponding to the smallest x_i, i.e. the point (Delta*(y_1-1)-limits, ..., Delta*(y_n-1)-limits). Instead of representing the different S by this point, it makes a lot more sense (and will be faster in a computer) to represent them using the integer coordinates above.

What does it mean to normalize a value?

I'm currently studying lighting in OpenGL, which utilizes a function in GLSL called normalize. According to the OpenGL docs, it "calculates the normalized product of two vectors". However, that still doesn't explain what "normalized" means. I have tried looking for what a normalized product is on Google, but I can't seem to find anything about it. Can anyone explain what normalizing means and provide a few examples of normalized values?
I think the confusion comes from the idea of normalizing "a value" as opposed to "a vector"; if you just think of a single number as a value, normalization doesn't make any sense. Normalization is only useful when applied to a vector.
A vector is a sequence of numbers; in 3D graphics it is usually a coordinate expressed as v = <x,y,z>.
Every vector has a magnitude or length, which can be found using Pythagoras' theorem: |v| = sqrt(x^2 + y^2 + z^2). This is basically the length of a line from the origin <0,0,0> to the point expressed by the vector.
A vector is normal if its length is 1. That's it!
To normalize a vector means to change it so that it points in the same direction (think of that line from the origin) but its length is one.
The main reason we use normal vectors is to represent a direction; for example, if you are modeling a light source that is an infinite distance away, you can't give precise coordinates for it. But you can indicate where to find it from a particular point by using a normal vector.
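A minimal C++ sketch of the operation (using a plain struct here rather than a GLSL type, just for illustration):

    #include <cmath>

    struct Vec3 { float x, y, z; };

    // Normalizing: keep the direction, scale the length to 1.
    Vec3 normalize(Vec3 v)
    {
        const float length = std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
        if (length == 0.0f)
            return v;          // the zero vector has no direction; left unchanged here
        return { v.x / length, v.y / length, v.z / length };
    }

For example, <3, 0, 4> has length 5, so it normalizes to <0.6, 0, 0.8>: same direction, length 1.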
It's a mathematical term and this link explains its meaning in quite simple terms:
Operations in 2D and 3D computer graphics are often performed using copies of vectors that have been normalized, i.e. converted to unit vectors... Normalizing a vector involves two steps:
calculate its length, then,
divide each of its (xy or xyz) components by its length...
It's somewhat complicated to explain if you don't know much about vectors or vector algebra. (You can check this article about general concepts such as vectors, normal vectors and the normalization procedure.) Check it out.
But the procedure or concept of "normalize" refers to the process of making something standard or “normal.”
In the case of vectors, let’s assume for the moment that a standard vector has a length of 1. To normalize a vector, therefore, is to take a vector of any length and, keeping it pointing in the same direction, change its length to 1, turning it into what is called a unit vector.

Is there a data structure with these characteristics?

I'm looking for a data structure that would allow me to store an M-by-N 2D matrix of values contiguously in memory, such that the distance in memory between any two points approximates the Euclidean distance between those points in the matrix. That is, in a typical row-major representation as a one-dimensional array of M * N elements, the memory distance differs between adjacent cells in the same row (1) and adjacent cells in neighbouring rows (N).
I'd like a data structure that reduces or removes this difference. Really, the name of such a structure is sufficient—I can implement it myself. If answers happen to refer to libraries for this sort of thing, that's also acceptable, but they should be usable with C++.
I have an application that needs to perform fast image convolutions without hardware acceleration, and though I'm aware of the usual optimisation techniques for this sort of thing, I feel a specialised data structure or data ordering could improve performance.
Given the requirement that you want to store the values contiguously in memory, I'd strongly suggest you research space-filling curves, especially Hilbert curves.
To give a bit of context, such curves are sometimes used in database indexes to improve the locality of multidimensional range queries (e.g., "find all items with x/y coordinates in this rectangle"), thereby aiming to reduce the number of distinct pages accessed. A bit similar to the R-trees that have been suggested here already.
Either way, it looks like you're bound to an M*N array of values in memory, so the whole question is about how to arrange the values in that array, I figure. (Unless I misunderstood the question.)
So in fact, such orderings would probably only change the characteristics of the distance distribution; the average distance between any two randomly chosen points of the matrix should not change, so I have to agree with Oli there. The potential benefit depends largely on your specific use case, I suppose.
I would guess "no"! And if the answer happens to be "yes", then it's almost certainly so irregular that it'll be way slower for a convolution-type operation.
EDIT
To qualify my guess, take an example. Let's say we store a[0][0] first. We want a[k][0] and a[0][k] to be similar distances, and proportional to k, so we might choose to interleave the storage of first row and first column (i.e. a[0][0], a[1][0], a[0][1], a[2][0], a[0][2], etc.) But how do we now do the same for e.g. a[1][0]? All the locations near it in memory are now taken up by stuff that's near a[0][0].
Whilst there are other possibilities than my example, I'd wager that you always end up with this kind of problem.
EDIT
If your data is sparse, then there may be scope to do something clever (re Cubbi's suggestion of R-trees). However, it'll still require irregular access and pointer chasing, so will be significantly slower than straightforward convolution for any given number of points.
You might look at space-filling curves, in particular the Z-order curve, which (mostly) preserves spatial locality. It might be computationally expensive to look up indices, however.
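A minimal sketch of the Z-order index computation (assuming coordinates fit in 16 bits; the function name is illustrative):

    #include <cstdint>

    // Interleave the bits of x and y to get the Z-order (Morton) index:
    // cells that are close in 2D mostly end up close in the 1D array.
    std::uint32_t mortonIndex(std::uint32_t x, std::uint32_t y)
    {
        auto spread = [](std::uint32_t v) {       // insert a 0 bit between each bit of v
            v &= 0x0000FFFF;
            v = (v | (v << 8)) & 0x00FF00FF;
            v = (v | (v << 4)) & 0x0F0F0F0F;
            v = (v | (v << 2)) & 0x33333333;
            v = (v | (v << 1)) & 0x55555555;
            return v;
        };
        return spread(x) | (spread(y) << 1);
    }

An element at (x, y) would then live at data[mortonIndex(x, y)], with the matrix conceptually padded up to a power-of-two square.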
If you are using this to try and improve cache performance, you might try a technique called "bricking", which is a little bit like one or two levels of the space filling curve. Essentially, you subdivide your matrix into nxn tiles, (where nxn fits neatly in your L1 cache). You can also store another level of tiles to fit into a higher level cache. The advantage this has over a space-filling curve is that indices can be fairly quick to compute. One reference is included in the paper here: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.30.8959
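A minimal sketch of the bricking index calculation (the tile size T and the function name are illustrative; the matrix is assumed padded to whole tiles):

    #include <cstddef>

    // The matrix is split into T x T tiles stored contiguously, so a convolution
    // kernel smaller than a tile touches only a handful of tiles. T is chosen so
    // that one tile fits in L1 cache (e.g. T = 32 for 4-byte elements).
    std::size_t brickedIndex(std::size_t row, std::size_t col,
                             std::size_t cols, std::size_t T)
    {
        const std::size_t tilesPerRow = (cols + T - 1) / T;  // tiles across one matrix row
        const std::size_t tileIndex   = (row / T) * tilesPerRow + (col / T);
        const std::size_t withinTile  = (row % T) * T + (col % T);
        return tileIndex * T * T + withinTile;
    }

As the answer notes, the appeal over a space-filling curve is that this index is cheap to compute (a few divisions, or shifts and masks if T is a power of two).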
This sounds like something that could be helped by an R-tree or one of its variants. There is nothing like that in the C++ Standard Library, but it looks like there is an R-tree in the Boost candidate library Boost.Geometry (not a part of Boost yet). I'd take a look at that before writing my own.
It is not possible to "linearize" a 2D structure into a 1D structure and keep the relation of proximity unchanged in both directions. This is one of the fundamental topological properties of the world.
Having said that, it is true that the standard row-wise or column-wise storage order normally used for 2D array representation is not the best one when you need to preserve proximity (as much as possible). You can get better results by using various discrete approximations of fractal curves (space-filling curves).
Z-order curve is a popular one for this application: http://en.wikipedia.org/wiki/Z-order_(curve)
Keep in mind though that regardless of which approach you use, there will always be elements that violate your distance requirement.
You could think of your 2D matrix as a big spiral, starting at the center and progressing to the outside. Unwind the spiral, and store the data in that order, and distance between addresses at least vaguely approximates Euclidean distance between the points they represent. While it won't be very exact, I'm pretty sure you can't do a whole lot better either. At the same time, I think even at very best, it's going to be of minimal help to your convolution code.
The answer is no. Think about it - memory is 1D. Your matrix is 2D. You want to squash that extra dimension in - with no loss? It's not going to happen.
What's more important is that once you get a certain distance away, it takes the same time to load into cache. If you have a cache miss, it doesn't matter whether it's 100 away or 100000. Fundamentally, you cannot get more contiguous/better performance than a simple array, unless you want to add something like an LRU cache on top of your array.
I think you're forgetting that distance in computer memory is not traversed by a CPU operating on foot :) so the distance is pretty much irrelevant.
It's random access memory, so really you have to figure out what operations you need to do, and optimize the accesses for that.
You would need to convert the addresses back from memory space to the original array space to accomplish this. Also, you've stressed distance only, which may still cause you some problems (distance has no direction).
If I have an array of R x C, and two cells at locations [r,c] and [c,r], their distances from some arbitrary point, say [0,0], are identical. And there's no way you're going to make one memory address hold two things, unless you've got one of those fancy new qubit machines.
However, you can take into account that in a row-major array of R x C, each row is C * sizeof(yourdata) bytes long. Conversely, the original coordinates of any element index within the bounds of the array are:

    // assuming zero-based indices into a row-major R x C array
    // (element indices, not byte addresses)
    int r = address / C;   // row
    int c = address % C;   // column

    // so, for the distance between two element indices:
    int r1 = address1 / C, c1 = address1 % C;
    int r2 = address2 / C, c2 = address2 % C;
    int dx = r1 - r2;
    int dy = c1 - c2;
    // square by multiplying: ^ would be XOR in C++
    double dist = std::sqrt(double(dx * dx + dy * dy));   // needs <cmath>

(Crush all this together to make it run more optimally.)
For a lot more ideas here, go look for any 2D image manipulation code that uses a calculated value called 'stride', which is basically an indicator that they're jumping back and forth between memory addresses and array addresses
This is not exactly related to closeness, but it might help. It certainly helps with minimising disk accesses.
One way to get better "closeness" is to tile the image. If your convolution kernel is smaller than a tile, you typically touch at most 4 tiles in the worst case. You can recursively tile in bigger sections so that locality improves further. A Stokes-like argument (at least I think it's Stokes), or some calculus of variations, can show that for rectangles the best shape (in the sense of examining arbitrary sub-rectangles) is a smaller rectangle of the same aspect ratio.
Quick intuition: think about a square. If you tile a larger square with smaller squares, the fact that a square encloses maximal area for a given perimeter means that square tiles have minimal border length. When you transform the large square, I think you can show that you should transform the tiles the same way. (You might also be able to do a simple multivariate differentiation.)
The classic example is zooming in on spy-satellite images and convolving them for enhancement. The extra computation to tile is really worth it if you keep the data around and go back to it.
It's also really worth it for the various compression schemes such as cosine transforms. (That's why, when you download an image, it frequently comes up in smaller and smaller squares until the final resolution is reached.)
There are a lot of books on this area and they are helpful.