I have a 2D map in my RTS. There are units on the map, and I want to check whether any unit is in range of another. Each unit's range is given in fields (grid squares). See the image:
In the picture, none of the units (red, blue, green) can attack each other. I want to check, for example, whether any unit is in range of the blue one. Here the answer is no. I know blue's range and position, and I also know the positions of the other units and whether any map square (x, y) is occupied. How can I check this?
You want to iterate over all points (x + i, y + j) around your unit at (x, y) such that
|i| + |j| <= R ,
where R is the range of attack. (This is a disk in the L1-metric.) So, like this:
for (int i = -R; i <= +R; ++i)
{
    int jRange = R - abs(i);
    for (int j = -jRange; j <= +jRange; ++j)
    {
        // access (x + i, y + j)
    }
}
Alternatively, you can halve the outer loop by exploiting the symmetry in i:
for (int i = 0; i <= R; ++i)
{
    int jRange = R - i;
    for (int j = -jRange; j <= +jRange; ++j)
    {
        // access (x - i, y + j)
        // if (i > 0) access (x + i, y + j)
    }
}
As @Alink says, you'll have to handle the map boundary in some way or another.
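Here is a hedged sketch of the same diamond loop with the boundary clamped and an occupancy test plugged in; mapWidth, mapHeight, occupied(), and the function name are placeholders of mine for whatever your game already provides, not anything from the answer above.

#include <algorithm>
#include <cstdlib>

bool occupied(int x, int y);   // assumed: returns true if a unit stands on (x, y)

bool anyUnitInRange(int x, int y, int R, int mapWidth, int mapHeight)
{
    // clamp i so that x + i stays inside [0, mapWidth - 1]
    for (int i = std::max(-R, -x); i <= std::min(R, mapWidth - 1 - x); ++i)
    {
        int jRange = R - std::abs(i);
        int jLo = std::max(-jRange, -y);                 // keep y + j >= 0
        int jHi = std::min(jRange, mapHeight - 1 - y);   // keep y + j < mapHeight
        for (int j = jLo; j <= jHi; ++j)
        {
            if (i == 0 && j == 0) continue;              // skip the unit's own square
            if (occupied(x + i, y + j)) return true;
        }
    }
    return false;
}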
On the other answers (too long for a comment):
Pathfinding is really wrong here. First of all, we have a grid with no restrictions and equal costs. Using any kind of pathfinding is neither necessary nor does it make sense at all. I get that you are thinking ahead, in the sense that this exact property might change / usually is different in RTS games, but I really think we should stick to the exact problem, which the author stated precisely and quite well.
Especially, A* is a terrible, terrible choice:
Dijkstra calculates shortest paths to all destinations from a given source node. A* uses the fact that you often have one distinct destination, so a heuristic can be used to "guide" Dijkstra in the right direction. It makes you reach the interesting destination earlier, and for that you pay a small overhead. If you want to check areas "around" some source node (the unit here), this is just counter-productive.
Bitmaps will have the problem of aligning them with the grid. Either way, there surely are ways to optimize and check more than one field at once, but those are just optimizations, imho.
On the problem itself:
I have no experience with games at all, so this is with respect to the abstract problem you outline above. I have added some speculations about your RTS application, but take them with a grain of salt.
Simply checking all fields around a unit, as suggested by Kerrek SB, is pretty good. No unnecessary field is checked and all fields are accessed directly. I think I'd propose the same thing.
If the number of checks from the question greatly dominates the number of unit movements (I doubt it, because of the "real-time" thing), it might be possible to precompute this problem for every unit and update it whenever a unit moves. I'll propose something that is more memory-hungry and most probably inferior to the straightforward approach Kerrek SB proposed:
If a unit U moves to field F, it will:
notify all units registered at F that they can now attack something
register itself at all the fields around F that it can now reach, and
at the same time check whether one of these fields is already occupied, so that it could attack right away
remember all those fields so it can "unregister" once U moves away in the future
Consequently, each unit will know if it has something in range and does not have to recalculate that. Moving a unit will trigger recalculation only for that unit, and fields will simply notify only the relevant other units.
However, there is memory overhead, and the "real-time" aspect, with plenty of units moving all the time, will largely erode the benefits. So I have a strong feeling this isn't the best way to go either. Still, depending on your requirements it might also work very well.
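For concreteness, here is a rough, hedged sketch of that bookkeeping; Field, Unit, moveTo, and the reachable list are illustration names of mine, not anything established above, and clearing the old field's occupant is omitted for brevity.

#include <set>
#include <vector>

struct Unit;

struct Field {
    std::set<Unit*> watchers;            // units that could attack something standing here
    Unit* occupant = nullptr;
};

struct Unit {
    bool hasTargetInRange = false;
    std::vector<Field*> watchedFields;   // remembered so we can unregister later

    // `reachable` is assumed to be the set of fields within this unit's range
    // around the destination, e.g. computed with the diamond loop above.
    void moveTo(Field& destination, const std::vector<Field*>& reachable) {
        // 1. notify everyone already watching the destination field
        for (Unit* u : destination.watchers) u->hasTargetInRange = true;
        destination.occupant = this;

        // 2. unregister from the fields watched from the old position
        for (Field* f : watchedFields) f->watchers.erase(this);
        watchedFields.clear();

        // 3. register at every field now in range, checking for targets on the way
        hasTargetInRange = false;
        for (Field* f : reachable) {
            f->watchers.insert(this);
            watchedFields.push_back(f);
            if (f->occupant != nullptr) hasTargetInRange = true;
        }
    }
};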
Create a bitmap for the ranges of each unit; this will allow you to shape them any way you want.
Simplified example:
char range1[] =
"010"
"111"
"010";
char range2[] =
"00100"
"01110"
"11111"
"01110"
"00100";
And so on...
Then just check whether the point lies on the bitmap (you have to figure that out yourself).
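As a hedged illustration of that last step, a lookup could be as simple as the sketch below; it assumes the bitmap is a square of side size centred on the attacker, exactly like range1 and range2 above, and the function name is mine.

// Returns true if (targetX, targetY) falls on a '1' cell of the attacker's range bitmap.
bool inRange(const char* range, int size,
             int attackerX, int attackerY, int targetX, int targetY)
{
    int half = size / 2;
    int bx = targetX - attackerX + half;   // target position inside the bitmap
    int by = targetY - attackerY + half;
    if (bx < 0 || bx >= size || by < 0 || by >= size)
        return false;                      // outside the bitmap entirely
    return range[by * size + bx] == '1';
}

For example, inRange(range2, 5, ux, uy, tx, ty) would test a target against the 5x5 bitmap.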
I would use a pathfinding algorithm, especially since map squares can be occupied; you'll need a pathfinding algorithm sooner or later anyway. A* would probably be the easiest one for you to implement and would perform well enough for your scenario. It is very well described on Wikipedia, and googling it will return a lot of results as well as sample code.
You would basically calculate a path between each pair of entities. If that path exceeds the given range for the unit, it is out of range. This could of course be optimized so that you stop checking once the range is exhausted.
I have a graph with 2n vertices where every edge has a defined length. It looks like this: [graph figure omitted]
I'm trying to find the length of the shortest path from u to v (smallest sum of edge lengths), with 2 additional restrictions:
The number of blue edges that the path contains is the same as the number of red edges.
The number of black edges that the path contains is not greater than p.
I have come up with an exponential-time algorithm that I think would work. It iterates through all binary combinations of length n - 1 that represent the path starting from u in the following way:
0 is a blue edge
1 is a red edge
There's a black edge whenever
the combination starts with 1. The first edge (from u) is then the first black one on the left.
the combination ends with 0. The last edge (to v) is then the last black one on the right.
adjacent digits are different. That means we went from a blue edge to a red edge (or vice versa), so there's a black one in between.
This algorithm would ignore the paths that don't meet the 2 requirements mentioned earlier, calculate the length for the ones that do, and then find the shortest one. However, doing it this way would probably be awfully slow, and I'm looking for some tips to come up with a faster algorithm. I suspect it's possible to achieve with dynamic programming, but I don't really know where to start. Any help would be very appreciated. Thanks.
Seems like a Dynamic Programming problem to me.
Here, v and u are arbitrary nodes.
Source node: s
Target node: t
For a node v whose outgoing edges are (v,u1) [red/blue] and (v,u2) [black]:
D(v,i,k) = min { ((v,u1) is red ? D(u1, i+1, k) : D(u1, i-1, k)) + w(v,u1),
                 D(u2, i, k-1) + w(v,u2) }
D(t,0,k) = 0          for k <= p
D(v,i,k) = infinity   for k > p    (note: for any v)
D(t,i,k) = infinity   for i != 0
Explanation:
v - the current node
i - #reds_traversed - #blues_traversed
k - #black_edges_left
The stop clauses are at the target node: you end when reaching it, and you allow reaching it only with i = 0 and with k <= p.
The recursive call checks at each point "what is better, going through black or going through red/blue?", and chooses the best solution out of both options.
The idea is, D(v,i,k) is the optimal result to go from v to the target (t), #reds-#blues used is i, and you can use up to k black edges.
From this, we can conclude D(s,0,p) is the optimal result to reach the target from the source.
Since |i| <= n and k <= p <= n, the total run time of the algorithm is O(n^3), assuming it is implemented with dynamic programming.
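A minimal sketch of that memoized recurrence, under the assumptions that the graph is a DAG (as the layered picture suggests) and that edges carry an explicit colour; the node ids, Edge type, and adjacency representation are my own, not taken from the question.

#include <algorithm>
#include <limits>
#include <map>
#include <tuple>
#include <vector>

enum Colour { Red, Blue, Black };
struct Edge { int to; Colour colour; long long w; };

const long long INF = std::numeric_limits<long long>::max() / 4;

std::vector<std::vector<Edge>> adj;                 // adjacency list, one entry per node
std::map<std::tuple<int,int,int>, long long> memo;
int target;                                         // node t

// D(v, i, k): cheapest way to finish at `target`, given that the red-minus-blue
// balance accumulated so far is i and that k black edges may still be used.
long long D(int v, int i, int k)
{
    if (v == target) return (i == 0) ? 0 : INF;     // stop clause: reach t with i == 0
    auto key = std::make_tuple(v, i, k);
    auto it = memo.find(key);
    if (it != memo.end()) return it->second;

    long long best = INF;
    for (const Edge& e : adj[v]) {
        long long sub;
        if (e.colour == Red)       sub = D(e.to, i + 1, k);
        else if (e.colour == Blue) sub = D(e.to, i - 1, k);
        else /* Black */           sub = (k > 0) ? D(e.to, i, k - 1) : INF;
        if (sub < INF) best = std::min(best, sub + e.w);
    }
    memo[key] = best;
    return best;
}

The answer for source s and black-edge budget p is then D(s, 0, p).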
Edit: Somehow I focused on the "finding shortest path" phrase in the question and overlooked the "length of" phrase where the original question clarified its intent. So both my answers below store lots of extra data in order to easily backtrack the correct path once you have computed its length. If you don't need to backtrack after computing the length, my crude version can change its first dimension from N to 2 and just store one odd J and one even J, overwriting anything older. My faster version can drop all the complexity of managing J,R interactions and also just store its outer level as [0..1][0..H]. None of that changes the time much, but it changes the storage a lot.
To understand my answer, first understand a crude N^3 answer: (I can't figure out whether my actual answer has better worst case than crude N^3 but it has much better average case).
Note that N must be odd, represent that as N=2H+1. (P also must be odd. Just decrement P if given an even P. But reject the input if N is even.)
Store costs using 3 real coordinates and one implied coordinate:
J = column 0 to N
R = count of red edges 0 to H
B = count of black edges 0 to P
S = side, odd or even (S is just B % 2)
We will compute/store cost[J][R][B] as the lowest cost way to reach column J using exactly R red edges and exactly B black edges. (We also used J-R blue edges, but that fact is redundant).
For convenience write to cost directly but read it through an accessor c(j,r,b) that returns BIG when r<0 || b<0 and returns cost[j][r][b] otherwise.
Then the innermost step is just:
if (S)
    cost[J+1][R][B] = red[J]  + min( c(J, R-1, B), c(J, R-1, B-1) + black[J] );
else
    cost[J+1][R][B] = blue[J] + min( c(J, R,   B), c(J, R,   B-1) + black[J] );
Initialize cost[0][0][0] to zero and for the super crude version initialize all other cost[0][R][B] to BIG.
You could super crudely just loop through in increasing J sequence and whatever R,B sequence you like computing all of those.
At the end, we can find the answer as:
min( min(cost[N][H][all odd]), black[N]+min(cost[N][H][all even]) )
But half the R values aren't really part of the problem. In the first half any R>J are impossible and in the second half any R<J+H-N are useless. You can easily avoid computing those. With a slightly smarter accessor function, you could avoid using the positions you never computed in the boundary cases of ones you do need to compute.
If any new cost[J][R][B] is not smaller than a cost of the same J, R, and S but lower B, that new cost is useless data. If the last dim of the structure were map instead of array, we could easily compute in a sequence that drops that useless data from both the storage space and the time. But that reduced time is then multiplied by log of the average size (up to P) of those maps. So probably a win on average case, but likely a loss on worst case.
Give a little thought to the data type needed for cost and the value needed for BIG. If some precise value in that data type is both as big as the longest path and as small as half the max value that can be stored in that data type, then that is a trivial choice for BIG. Otherwise you need a more careful choice to avoid any rounding or truncation.
If you followed all that, you probably will understand one of the better ways that I thought was too hard to explain: This will double the element size but cut the element count to less than half. It will get all the benefits of the std::map tweak to the basic design without the log(P) cost. It will cut the average time way down without hurting the time of pathological cases.
Define a struct CB that contains cost and black count. The main storage is a vector<vector<CB>>. The outer vector has one position for every valid J,R combination. Those are in a regular pattern, so we could easily compute the position in the vector of a given J,R, or the J,R of a given position. But it is faster to keep those incrementally, so J and R are implied rather than directly used. The vector should be reserved to its final size, which is approximately N^2/4. It may be best if you precompute the index for (H,0).
Each inner vector has C,B pairs in strictly increasing B sequence and, within each S, strictly decreasing C sequence. Inner vectors are generated one at a time (in a temp vector), then copied to their final location and only read (not modified) after that. Within generation of each inner vector, candidate C,B pairs will be generated in increasing B sequence. So keep the position of bestOdd and bestEven while building the temp vector. Then each candidate is pushed into the vector only if it has a lower C than best (or best doesn't exist yet). We can also treat all B<P+J-N as if B==S, so a lower C in that range replaces rather than pushes.
The implied (never stored) J,R pairs of the outer vector start with (0,0) (1,0) (1,1) (2,0) and end with (N-1,H-1) (N-1,H) (N,H). It is fastest to work with those indexes incrementally, so while we are computing the vector for implied position J,R, we would have V as the actual position of J,R and U as the actual position of J-1,R and minU as the first position of J-1,? and minV as the first position of J,? and minW as the first position of J+1,?
In the outer loop, we trivially copy minV to minU and minW to both minV and V, and pretty easily compute the new minW and decide whether U starts at minU or minU+1.
The loop inside that advances V up to (but not including) minW, while advancing U each time V is advanced, and in typical positions using the vector at position U-1 and the vector at position U together to compute the vector for position V. But you must cover the special case of U==minU in which you don't use the vector at U-1 and the special case of U==minV in which you use only the vector at U-1.
When combining two vectors, you walk through them in sync by B value, using one, or the other to generate a candidate (see above) based on which B values you encounter.
Concept: assuming you understand how a value with implied J,R and explicit C,B is stored, its meaning is that there exists a path to column J at cost C using exactly R red branches and exactly B black branches, and there does not exist a path to column J using exactly R red branches and the same S in which one of the cost or black count is better and the other not worse.
Your exponential algorithm is essentially a depth-first search tree, where you keep track of the cost as you descend.
You could make it branch-and-bound by keeping track of the best solution seen so far, and pruning any branches that would go beyond the best so far.
Or, you could make it a breadth-first search, ordered by cost, so as soon as you find any solution, it is among the best.
The way I've done this in the past is depth-first, but with a budget.
I prune any branches that would go beyond the budget.
Then I run it with budget 0.
If it doesn't find any solutions, I run it with budget 1.
I keep incrementing the budget until I get a solution.
This might seem like a lot of repetition, but since each run visits many more nodes than the previous one, the previous runs are not significant.
This is exponential in the cost of the solution, not in the size of the network.
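For illustration, here is a minimal sketch of that budgeted depth-first search with iterative deepening, assuming a generic graph with positive integer edge weights and at least one path to the target; the colour and black-edge constraints of this particular problem are omitted, but they would just be extra parameters pruned the same way. All names are mine.

#include <vector>

struct Edge { int to; int w; };            // positive weights assumed
std::vector<std::vector<Edge>> adj;        // adjacency list
int target;

// Returns true if some path from v to target costs at most `budget`.
// Branches whose cost would exceed the budget are pruned immediately.
bool withinBudget(int v, int budget)
{
    if (v == target) return true;
    for (const Edge& e : adj[v])
        if (e.w <= budget && withinBudget(e.to, budget - e.w))
            return true;
    return false;
}

// Iterative deepening: raise the budget until a solution appears.
// The first budget that succeeds is the cost of a cheapest path.
int cheapestPathCost(int source)
{
    for (int budget = 0; ; ++budget)
        if (withinBudget(source, budget))
            return budget;
}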
http://ideone.com/QXyVzR
The above link contains a program I wrote to solve mazes using a BFS algorithm. The maze is represented as a 2D array, initially passed in as numbers, (0's represent an empty block which can be visited, any other number represent a "wall" block), and then converted into a record type which I defined, which keeps track of various data:
type mazeBlock = {
walkable : bool;
isFinish : bool;
visited : bool;
prevCoordinate : int * int
}
The output is a list of ordered pairs (coordinates/indices) which trace a shortest path through the maze from the start to the finish, the coordinates of which are both passed in as parameters.
It works fine for smaller mazes with low branching factor, but when I test it on larger mazes (say 16 x 16 or larger), especially on ones with no walls(high branching factor) it takes up a LOT of time and memory. I am wondering if this is inherent to the algorithm or related to the way I implemented it. Can any OCaml hackers out there offer me their expertise?
Also, I have very little experience with OCaml so any advice on how to improve the code stylistically would be greatly appreciated. Thanks!
EDIT:
http://ideone.com/W0leMv
Here is a cleaned-up, edited version of the program. I fixed some stylistic issues but didn't change the semantics. As before, the second test still takes up a huge amount of resources and cannot seem to finish at all. Still seeking help on this issue...
EDIT2:
SOLVED. Thanks so much to both answerers. Here is the final code:
http://ideone.com/3qAWnx
In your critical section, that is mazeSolverLoop, you should only visit elements that have not been visited before. When you take an element from the queue, you should first check whether it has been visited, and in that case do nothing but recurse to get the next element. This is precisely what gives the algorithm its good time complexity (you never visit a place twice).
Otherwise, yes, your OCaml style could be improved. Some remarks:
the convention in OCaml-land is rather to write_like_this instead of writeLikeThis. I recommend that you follow it, but admittedly that is a matter of taste and not an objective criterion.
there is no point in returning a data structure if it is a mutable structure that was updated in place; why do you make a point of always returning a (grid, pair) queue when it is exactly the same as the input? You could just have those functions return unit and have code that is simpler and easier to read.
the abstraction level allowed by pairs is good and you should preserve it; you currently don't. There is no point in writing for example, let (foo, bar) = dimension grid in if in_bounds pos (foo, bar). Just name the dimension dim instead of (foo, bar), it makes no sense to split it in two components if you don't need them separately. Remark that for the neighbor, you do use neighborX and neighborY for array access for now, but that is a style mistake: you should have auxiliary functions to get and set values in an array, taking a pair as input, so that you don't have to destruct the pair in the main function. Try to keep all the code inside a single function at the same level of abstraction: all working on separate coordinates, or all working on pairs (named as such instead of being constructed/deconstructed all the time).
If I understand you right, for an N x N grid with no walls you have a graph with N^2 nodes and roughly 4*N^2 edges. These don't seem like big numbers for N = 16.
I'd say the only trick is to make sure you track visited nodes properly. I skimmed your code and don't see anything obviously wrong in the way you're doing it.
Here is a good OCaml idiom. Your code says:
let isFinish1 = mazeGrid.(currentX).(currentY).isFinish in
let prevCoordinate1 = mazeGrid.(currentX).(currentY).prevCoordinate in
mazeGrid.(currentX).(currentY) <-
{ walkable = true;
isFinish = isFinish1;
visited = true;
prevCoordinate = prevCoordinate1}
You can say this a little more economically as follows:
mazeGrid.(currentX).(currentY) <-
{ mazeGrid.(currentX).(currentY) with visited = true }
I'm working on a game where exactly one object may exist at location (x, y) where x and y are ints. For example, an object may exist at (0, 0) or it may not, but it is not possible for multiple objects to exist there at once.
I am trying to decide which STL container to use for the problem at hand and the best way to solve this problem.
Basically, I start with an object and its (x, y) location. The goal is to determine the tallest, largest possible rectangle based on that object's surrounding objects. The rectangle must be created by using all objects above and below the current object. That is, it must be the tallest that it can possibly be based on the starting object position.
For example, say the following represents my object grid and I am starting with the green object at location (3, 4):
Then, the rectangle I am looking for would be represented by the pink squares below:
So, assuming I start with the object at (3, 4) like the example shows, I will need to check if objects also exist at (2, 4), (4, 4), (3, 3), and (3, 5). If an object exists at any of those locations, I need to repeat the process for the object to find the largest possible rectangle.
These objects are rather rare and the game world is massive. It doesn't seem practical to just new a 2D array for the entire game world, since most of the elements would be empty. However, I need to be able to index into any position to check if an object is there at any time.
Instead, I thought about using a std::map like so:
std::map< std::pair<int, int>, ObjectData> m_objects;
Then, as I am checking the surrounding objects, I could use map::find() in my loop, checking if the surrounding objects exist:
if (m_objects.find(std::make_pair(3, 4)) != m_objects.end())
{
//An object exists at (3, 4).
//Add it to the list of surrounding objects.
}
I could potentially be making a lot of calls to map::find() if I decide to do this, but the map would take up much less memory than newing a 2D array of the entire world.
Does anyone have any advice on a simple algorithm I could use to find what I am looking for? Should I continue using a std::map or is there a better container for a problem like this?
How much data do you need to store at each grid location? If you are simply looking for a flag that indicates neighbors, you have at least two "low tech" solutions:
a) If your grid is sparse, how about each square keeps a neighbor list? So each square knows which neighboring squares are occupied. You'll have some work to do to maintain the lists when a square is occupied or vacated, but neighbor lists mean you don't need a grid map at all.
b) If the grid map locations are truly just points, use 1 bit per grid location. The resulting map will be 8 times smaller than one that uses a byte for each grid point. Bit operations are lightning fast. A 10,000 x 10,000 map will take 100,000,000 bits, or about 12.5 MB.
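A minimal sketch of option (b), the 1-bit-per-cell occupancy map, assuming a fixed width x height world; the struct and method names are mine, not the answerer's.

#include <cstddef>
#include <cstdint>
#include <vector>

struct OccupancyGrid {
    int width, height;
    std::vector<uint64_t> bits;                       // 64 cells packed per word

    OccupancyGrid(int w, int h)
        : width(w), height(h), bits((static_cast<size_t>(w) * h + 63) / 64, 0) {}

    size_t index(int x, int y) const { return static_cast<size_t>(y) * width + x; }

    bool occupied(int x, int y) const {
        size_t i = index(x, y);
        return (bits[i / 64] >> (i % 64)) & 1u;
    }

    void set(int x, int y, bool value) {
        size_t i = index(x, y);
        if (value) bits[i / 64] |=  (uint64_t(1) << (i % 64));
        else       bits[i / 64] &= ~(uint64_t(1) << (i % 64));
    }
};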
An improvement would be to use a hash map, if possible. This would allow you to do even your potentially extensive searches with an expected time complexity of O(1) per lookup.
There's a thread here ( Mapping two integers to one, in a unique and deterministic way) that goes into some detail about how to hash two integers together.
If your compiler supports C++11, you could use std::unordered_map. If not, boost has basically the same thing: http://www.boost.org/doc/libs/1_38_0/doc/html/boost/unordered_map.html
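For illustration, a minimal C++11 sketch of an unordered_map keyed on an (x, y) pair: std::pair has no standard hash, so a small combiner (a boost-style mix, one possible choice) is supplied; ObjectData stands in for whatever you actually store.

#include <cstddef>
#include <functional>
#include <unordered_map>
#include <utility>

struct ObjectData { /* ... */ };

struct PairHash {
    std::size_t operator()(const std::pair<int, int>& p) const {
        std::size_t h1 = std::hash<int>()(p.first);
        std::size_t h2 = std::hash<int>()(p.second);
        return h1 ^ (h2 + 0x9e3779b9 + (h1 << 6) + (h1 >> 2));   // hash_combine-style mix
    }
};

std::unordered_map<std::pair<int, int>, ObjectData, PairHash> m_objects;

bool objectExistsAt(int x, int y) {
    return m_objects.find(std::make_pair(x, y)) != m_objects.end();
}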
You may want to consider a spatial data structure. If the data is 'sparse', as you say, then doing a quadtree neighbourhood search might save you a lot of processing power. I would personally use an R-tree, but that's most likely because I have an R-tree library that I've written and can easily import.
For example, suppose you have a 1000x1000 grid with 10,000 elements. Assuming for the moment a uniformly random distribution, we would (based on the density) expect no more than, say, a chain of three to five objects touching in either dimension (at this density, a chain of three vertically-oriented objects will occur roughly 0.01% of the time). Suppose the object under consideration is located at (x,y). A window search, starting at (x-5,y-5) and going to (x+5,y+5), would give you a list of at most 121 elements to perform a linear search through. If your rect-picking algorithm notices that it would be possible to form a taller rectangle (i.e. if a rect under consideration touches the edges of this 11x11 bounding box), just repeat the window search for another 5x5 region in one direction of the original. Repeat as necessary.
This, of course, only works well when you have extremely sparse data. It might be worth adapting an R-tree such that the leaves are an assoc. data structure (i.e. Int -> Int -> Object), but at that point it's probably best to just find a solution that works on denser data.
I'm likely over-thinking this; there is likely a much simpler solution around somewhere.
Some references on R-trees:
The original paper, for the original algorithms.
The Wikipedia page, which has some decent overview on the topic.
The R-tree portal, for datasets and algorithms relating to R-trees.
I'll edit this with a link to my own R-tree implementation (public domain) if I ever get around to cleaning it up a little.
This sounds suspiciously like a homework problem (because it's got that weird condition "The rectangle must be created by using all objects above and below the current object" that makes the solution trivial). But I'll give it a shot anyway. I'm going to use the word "pixel" instead of "object", for convenience.
If your application really deserves heavyweight solutions, you might try storing the pixels in a quadtree (whose leaves contain plain old 2D arrays of just a few thousand pixels each). Or you might group contiguous pixels together into "shapes" (e.g. your example would consist of only one "shape", even though it contains 24 individual pixels). Given an initial unstructured list of pixel coordinates, it's easy to find these shapes; google "union-find". The specific benefit of storing contiguous shapes is that when you're looking for largest rectangles, you only need to consider those pixels that are in the same shape as the initial pixel.
A specific disadvantage of storing contiguous shapes is that if your pixel-objects are moving around (e.g. if they represent monsters in a roguelike game), I'm not sure that the union-find data structure supports incremental updates. You might have to run union-find on every "frame", which would be pretty bad.
Anyway... let's just say you're using a std::unordered_map<std::pair<int,int>, ObjectData*>, because that sounds pretty reasonable to me. (You should almost certainly store pointers in your map, not actual objects, because copying around all those objects is going to be a lot slower than copying pointers.)
#include <cassert>
#include <functional>
#include <unordered_map>
#include <utility>

typedef std::pair<int, int> Pt;
typedef std::pair<Pt, Pt> Rectangle;

// std::pair has no standard hash, so supply one for the key type.
struct PtHash {
    size_t operator()(const Pt &p) const {
        return std::hash<int>()(p.first) * 31 + std::hash<int>()(p.second);
    }
};

std::unordered_map<Pt, ObjectData *, PtHash> myObjects;
/* This helper function checks a whole vertical stripe of pixels. */
static bool all_pixels_exist(int x, int min_y, int max_y)
{
assert(min_y <= max_y);
for (int y = min_y; y <= max_y; ++y) {
if (myObjects.find(Pt(x, y)) == myObjects.end())
return false;
}
return true;
}
Rectangle find_tallest_rectangle(int x, int y)
{
assert(myObjects.find(Pt(x,y)) != myObjects.end());
int top = y;
int bottom = y;
while (myObjects.find(Pt(x, top-1)) != myObjects.end()) --top;
while (myObjects.find(Pt(x, bottom+1)) != myObjects.end()) ++bottom;
// We've now identified the first vertical stripe of pixels.
// The next step is to "paint-roller" that stripe to the left as far as possible...
int left = x;
while (all_pixels_exist(left-1, top, bottom)) --left;
// ...and to the right.
int right = x;
while (all_pixels_exist(right+1, top, bottom)) ++right;
return Rectangle(Pt(left, top), Pt(right, bottom));
}
I'm looking for a data structure that would allow me to store an M-by-N 2D matrix of values contiguously in memory, such that the distance in memory between any two points approximates the Euclidean distance between those points in the matrix. That is, in a typical row-major representation as a one-dimensional array of M * N elements, the memory distance differs between adjacent cells in the same row (1) and adjacent cells in neighbouring rows (N).
I'd like a data structure that reduces or removes this difference. Really, the name of such a structure is sufficient—I can implement it myself. If answers happen to refer to libraries for this sort of thing, that's also acceptable, but they should be usable with C++.
I have an application that needs to perform fast image convolutions without hardware acceleration, and though I'm aware of the usual optimisation techniques for this sort of thing, I feel a specialised data structure or data ordering could improve performance.
Given the requirement that you want to store the values contiguously in memory, I'd strongly suggest you research space-filling curves, especially Hilbert curves.
To give a bit of context, such curves are sometimes used in database indexes to improve the locality of multidimensional range queries (e.g., "find all items with x/y coordinates in this rectangle"), thereby aiming to reduce the number of distinct pages accessed. A bit similar to the R-trees that have been suggested here already.
Either way, it looks that you're bound to an M*N array of values in memory, so the whole question is about how to arrange the values in that array, I figure. (Unless I misunderstood the question.)
So in fact, such orderings would probably still only change the characteristics of the distance distribution: the average distance between any two randomly chosen points of the matrix should not change, so I have to agree with Oli there. The potential benefit depends largely on your specific use case, I suppose.
I would guess "no"! And if the answer happens to be "yes", then it's almost certainly so irregular that it'll be way slower for a convolution-type operation.
EDIT
To qualify my guess, take an example. Let's say we store a[0][0] first. We want a[k][0] and a[0][k] to be similar distances, and proportional to k, so we might choose to interleave the storage of first row and first column (i.e. a[0][0], a[1][0], a[0][1], a[2][0], a[0][2], etc.) But how do we now do the same for e.g. a[1][0]? All the locations near it in memory are now taken up by stuff that's near a[0][0].
Whilst there are other possibilities than my example, I'd wager that you always end up with this kind of problem.
EDIT
If your data is sparse, then there may be scope to do something clever (re Cubbi's suggestion of R-trees). However, it'll still require irregular access and pointer chasing, so will be significantly slower than straightforward convolution for any given number of points.
You might look at space-filling curves, in particular the Z-order curve, which (mostly) preserves spatial locality. It might be computationally expensive to look up indices, however.
If you are using this to try and improve cache performance, you might try a technique called "bricking", which is a little bit like one or two levels of the space-filling curve. Essentially, you subdivide your matrix into n x n tiles (where an n x n tile fits neatly in your L1 cache). You can also store another level of tiles to fit into a higher-level cache. The advantage this has over a space-filling curve is that the indices can be fairly quick to compute. One reference is included in the paper here: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.30.8959
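As a hedged sketch of how cheap those bricked indices can be, assuming the matrix dimensions are multiples of the tile size and bricks are laid out row-major (both my assumptions):

#include <cstddef>

const std::size_t TILE = 32;               // chosen so one tile fits in L1

// Map (row, col) of an M x N matrix to an index in the bricked 1D array.
std::size_t brickIndex(std::size_t row, std::size_t col, std::size_t N)
{
    std::size_t tileRow = row / TILE, tileCol = col / TILE;
    std::size_t inRow   = row % TILE, inCol   = col % TILE;
    std::size_t tilesPerRow = N / TILE;
    return (tileRow * tilesPerRow + tileCol) * TILE * TILE   // start of the brick
         + inRow * TILE + inCol;                             // offset inside it
}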
This sounds like something that could be helped by an R-tree or one of its variants. There is nothing like that in the C++ Standard Library, but it looks like there is an R-tree in the Boost candidate library Boost.Geometry (not a part of Boost yet). I'd take a look at that before writing my own.
It is not possible to "linearize" a 2D structure into an 1D structure and keep the relation of proximity unchanged in both directions. This is one of the fundamental topological properties of the world.
Having said that, it is true that the standard row-wise or column-wise storage order normally used for 2D array representation is not the best one when you need to preserve proximity (as much as possible). You can get better results by using various discrete approximations of fractal curves (space-filling curves).
Z-order curve is a popular one for this application: http://en.wikipedia.org/wiki/Z-order_(curve)
Keep in mind though that regardless of which approach you use, there will always be elements that violate your distance requirement.
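For reference, a minimal sketch of computing a Z-order (Morton) index for 16-bit coordinates by interleaving their bits; the masks are the standard bit-interleaving constants, and the function names are mine.

#include <cstdint>

// Insert a zero bit between each of the low 16 bits of v.
static uint32_t spreadBits(uint32_t v)
{
    v &= 0x0000FFFF;
    v = (v | (v << 8)) & 0x00FF00FF;
    v = (v | (v << 4)) & 0x0F0F0F0F;
    v = (v | (v << 2)) & 0x33333333;
    v = (v | (v << 1)) & 0x55555555;
    return v;
}

// x occupies the even bit positions, y the odd ones.
uint32_t mortonIndex(uint16_t x, uint16_t y)
{
    return spreadBits(x) | (spreadBits(y) << 1);
}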
You could think of your 2D matrix as a big spiral, starting at the center and progressing to the outside. Unwind the spiral, and store the data in that order, and distance between addresses at least vaguely approximates Euclidean distance between the points they represent. While it won't be very exact, I'm pretty sure you can't do a whole lot better either. At the same time, I think even at very best, it's going to be of minimal help to your convolution code.
The answer is no. Think about it - memory is 1D. Your matrix is 2D. You want to squash that extra dimension in - with no loss? It's not going to happen.
What's more important is that once you get a certain distance away, it takes the same time to load into cache. If you have a cache miss, it doesn't matter if it's 100 away or 100000. Fundamentally, you cannot get more contiguous/better performance than a simple array, unless you want to get an LRU for your array.
I think you're forgetting that distance in computer memory is not accessed by a computer cpu operating on foot :) so the distance is pretty much irrelevant.
It's random access memory, so really you have to figure out what operations you need to do, and optimize the accesses for that.
You need to reconvert the addresses from memory space to the original array space to accomplish this. Also, you've stressed distance only, which may still cause you some problems (there is no direction).
If I have an array of R x C, and two cells at locations [r,c] and [c,r], the distance from some arbitrary point, say [0,0] is identical. And there's no way you're going to make one memory address hold two things, unless you've got one of those fancy new qubit machines.
However, you can take into account that in a row major array of R x C that each row is C * sizeof(yourdata) bytes long. Conversely, you can say that the original coordinates of any memory address within the bounds of the array are
r = (address / C)
c = (address % C)
so
r1 = (address1 / C)
r2 = (address2 / C)
c1 = (address1 % C)
c2 = (address2 % C)
dx = r1 - r2
dy = c1 - c2
dist = sqrt(dx*dx + dy*dy)
(this is assuming you're using zero based arrays)
(crush all this together to make it run more optimally)
For a lot more ideas here, go look for any 2D image manipulation code that uses a calculated value called 'stride', which is basically an indicator that they're jumping back and forth between memory addresses and array addresses
This is not exactly related to closeness, but it might help. It certainly helps with minimization of disk accesses.
One way to get better "closeness" is to tile the image. If your convolution kernel is smaller than a tile, you typically touch at most 4 tiles in the worst case. You can recursively tile in bigger sections so that localization improves. A Stokes-like argument (at least I think it's Stokes; or some calculus of variations) can show that for rectangles the best shape (meaning for examination of arbitrary sub-rectangles) is a smaller rectangle of the same aspect ratio.
Quick intuition: think about a square. If you tile the larger square with smaller squares, the fact that a square encloses maximal area for a given perimeter means that square tiles have minimal border length. When you transform the large square, I think you can show you should transform the tiles the same way. (You might also be able to do a simple multivariate differentiation.)
The classic example is zooming in on spy satellite images and convolving them for enhancement. The extra computation to tile is really worth it if you keep the data around and go back to it.
It's also really worth it for the various compression schemes such as cosine transforms. (That's why, when you download an image, it frequently appears in smaller and smaller squares until the final resolution is reached.)
There are a lot of books on this area and they are helpful.
I am trying to solve a problem with 20000 points, so there is a distance matrix with 20000*20000 elements. How can I store this matrix in C++? I use Visual Studio 2008, on a computer with 4 GB of RAM. Any suggestion will be appreciated.
A sparse matrix may be what you are looking for. Many problems don't have values in every cell of a matrix. SparseLib++ is a library which allows for efficient matrix operations.
Avoid the brute force approach you're contemplating and try to envision a solution that involves populating a single 20000 element list, rather than an array that covers every possible permutation.
For starters, consider the following simplistic approach which you may be able to improve upon, given the specifics of your problem:
int bestResult = -1; // some invalid value
int bestInner;
int bestOuter;
for ( int outer = 0; outer < MAX; outer++ )
{
for ( int inner = 0; inner < MAX; inner++ )
{
int candidateResult = SomeFunction( list[ inner ], list[ outer ] );
if ( candidateResult > bestResult )
{
bestResult = candidateResult;
bestInner = inner;
bestOuter = outer;
}
}
}
You can represent your matrix as a single large array. Whether it's a good idea to do so is for you to determine.
If you need four bytes per cell, your matrix is only 4*20000*20000, that is, 1.6GB. Any platform should give you that much memory for a single process. Windows gives you 2GiB by default for 32-bit processes -- and you can play with the linker options if you need more. All 32-bit unices I tried gave you more than 2.5GiB.
Is there a reason you need the matrix in memory?
Depending on the complexity of the calculations you need to perform, you could simply use a function that calculates your distances on the fly. This could even be faster than precalculating every single distance value if you only use some of them.
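A minimal sketch of that on-the-fly approach; the Point type and the Euclidean metric are my assumptions about the data, not something stated in the question.

#include <cmath>
#include <vector>

struct Point { double x, y; };
std::vector<Point> points;               // the 20000 points

double distance(int i, int j)
{
    double dx = points[i].x - points[j].x;
    double dy = points[i].y - points[j].y;
    return std::sqrt(dx * dx + dy * dy);
}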
Without more references to the problem at hand (and the use of the matrix), you are going to get a lot of answers... so indulge me.
The classic approach here would be to go with a sparse matrix, however the default value would probably be something like 'not computed', which would require special handling.
Perhaps that you could use a caching approach instead.
Apparently you would like to avoid recomputing the distances over and over, so you'd like to keep them in this huge matrix. However, note that you can always recompute them. In general, I would say that trying to store values that can be recomputed, for a speed-up, is really what caching is about.
So I would suggest using a distance class that abstracts the caching for you.
The basic idea is simple:
When you request a distance, either you already computed it, or not
If computed, return it immediately
If not computed, compute it and store it
If the cache is full, delete some elements to make room
The practice is a bit more complicated, of course, especially for efficiency and because of the limited size, which requires a policy for selecting which elements to evict, etc.
So before we delve in the technical implementation, just tell me if that's what you're looking for.
Your computer should be able to handle 1.6 GB of data (assuming a 32-bit distance type):
#include <cstddef>
#include <vector>

size_t n = 20000;
typedef long dist_type; // 32 bit on Windows/VC++
std::vector<dist_type> matrix(n*n);
And then use:
dist_type value = matrix[n * y + x];
You can (by using small datatypes), but you probably don't want to.
You are better off using a quad tree (if you need to find the nearest N matches), or a grid of lists (if you want to find all points within R).
In physics, you can just approximate distant points with a field, or a representative amalgamation of points.
There's always a solution. What's your problem?
Man you should avoid the n² problem...
Put your 20 000 points into a voxel grid.
Finding closest pair of points should then be something like n log n.
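A hedged sketch of that voxel-grid idea: bucket the points by cell so that neighbor queries only look at nearby cells. The cell size, the packing of cell coordinates into a key, and all names are my own choices for illustration.

#include <cmath>
#include <cstdint>
#include <unordered_map>
#include <vector>

struct Point { double x, y; };

struct VoxelGrid {
    double cellSize;
    std::unordered_map<uint64_t, std::vector<int>> cells;   // cell key -> point indices

    explicit VoxelGrid(double cs) : cellSize(cs) {}

    // Pack the two 32-bit cell coordinates into one exact 64-bit key.
    uint64_t key(double x, double y) const {
        int32_t cx = static_cast<int32_t>(std::floor(x / cellSize));
        int32_t cy = static_cast<int32_t>(std::floor(y / cellSize));
        return (static_cast<uint64_t>(static_cast<uint32_t>(cx)) << 32)
             |  static_cast<uint32_t>(cy);
    }

    void insert(int index, const Point& p) { cells[key(p.x, p.y)].push_back(index); }

    // Candidate neighbors of p: every point stored in the 3x3 block of cells around it.
    std::vector<int> candidates(const Point& p) const {
        std::vector<int> result;
        for (int dx = -1; dx <= 1; ++dx)
            for (int dy = -1; dy <= 1; ++dy) {
                auto it = cells.find(key(p.x + dx * cellSize, p.y + dy * cellSize));
                if (it != cells.end())
                    result.insert(result.end(), it->second.begin(), it->second.end());
            }
        return result;
    }
};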
As stated by other answers, you should try hard to either use sparse matrix or come up with a different algorithm that doesn't need to have all the data at once in the matrix.
If you really need it, maybe a library like stxxl might be useful, since it's specially designed for huge datasets. It handles the swapping for you almost transparently.
Thanks a lot for your answers. What I am doing is solving a vehicle routing problem with about 20000 nodes. I need one matrix for distances and one matrix for a neighbor list (for each node, all other nodes ordered by distance). This list will be used very often to find candidates. I guess the distance matrix can sometimes be omitted if we can calculate distances when we need them, but the neighbor list is not convenient to create every time. The list's data type could be int.
To mgb:
How much can a 64-bit Windows system help this situation?