Correct way of handling edges of a matrix - C++

Suppose we have to write a simple program that transforms a matrix. Each element should be the sum of its neighboring elements.
What's the "correct" (i.e. most common, has best readability, most effective) way to do this, considering the edges of a matrix?
Two obvious ways of achieving this that I can think of:
1. Handle the corners first (4 separate lines), use 4 loops to do the remaining edges, then use the standard loop for the rest.
2. Use one loop for the whole matrix, with ifs to check whether we're in the middle or at an edge.
The first one is faster (I guess), but it kinda looks off to me to have 4 lines plus 5 loops for this.
Is there a more elegant way? I tagged this as C++ because I'm coding in C++ currently and I have the feeling that the ternary operator ?: is gonna come in handy to write a cute solution.
Bonus points if your solution can be tweaked for a more complex rule (not just looking at the cell one step up/right/left/down, but something like a certain kind of recursion). Not sure if it would change things much, though.

One elegant way of going about it is to use a larger matrix. If your matrix has NxM elements, make a temporary (N+2)x(M+2) matrix, fill it with zeros, and then copy your values like so:
temp(i+1,j+1) <- original(i,j)
Now you actually have your original matrix with zeroed-out edges around it. You can now safely calculate the sum of all neighbors of all the non-edge cells in the temporary matrix. The result will be the matrix you were originally looking for.
Note: this will be less efficient than the straightforward five-loop solution you proposed.
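A minimal sketch of this padding approach (the container choice and function name are illustrative):

    #include <vector>

    // Neighbor sums via a zero-padded temporary: no edge special-casing.
    std::vector<std::vector<int>>
    neighborSums(const std::vector<std::vector<int>>& a) {
        const int n = static_cast<int>(a.size());
        const int m = n ? static_cast<int>(a[0].size()) : 0;
        // temp(i+1, j+1) <- original(i, j), with a zero border all around.
        std::vector<std::vector<int>> temp(n + 2, std::vector<int>(m + 2, 0));
        for (int i = 0; i < n; ++i)
            for (int j = 0; j < m; ++j)
                temp[i + 1][j + 1] = a[i][j];
        std::vector<std::vector<int>> result(n, std::vector<int>(m));
        for (int i = 1; i <= n; ++i)
            for (int j = 1; j <= m; ++j)
                result[i - 1][j - 1] = temp[i - 1][j] + temp[i + 1][j]
                                     + temp[i][j - 1] + temp[i][j + 1];
        return result;
    }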

Related

Advantages of a bit matrix over a bitmap

I want to create a simple representation of an environment that basically just represents whether or not there is an object at a certain position.
I would thus only need a big matrix filled with 1s and 0s. It is important to work efficiently with this matrix, since I am going to have randomly positioned get and set operations on it, but I will also iterate over the whole matrix.
What would be the best solution for this?
My approach would be to create a vector of vectors containing bit elements. Otherwise, would there be an advantage of using a bitmap?
Note that while std::vector<bool> may consume less memory it is also slower than std::vector<char> (depending on the use case), because of all the bitwise operations. As with any optimization questions, there is only one answer: try different solutions and profile properly.
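For illustration only (the grid size and names are made up), the two candidate layouts might look like this; a flat vector also avoids the pointer-chasing of a vector of vectors:

    #include <cstddef>
    #include <vector>

    const std::size_t N = 1024; // illustrative grid size, N x N

    // (a) Bit-packed: roughly N*N/8 bytes, but each access does bit arithmetic.
    std::vector<bool> bits(N * N, false);

    // (b) One byte per cell: 8x the memory, but plain loads and stores.
    std::vector<char> bytes(N * N, 0);

    // Flat row-major indexing; the same scheme works for either layout.
    inline bool getCell(std::size_t r, std::size_t c) { return bits[r * N + c]; }
    inline void setCell(std::size_t r, std::size_t c, bool v) { bits[r * N + c] = v; }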

Representation of a symmetric diagonal matrix

Let's assume we have a huge symmetric diagonal matrix. What is an efficient way to implement this?
The only way that I could think of is to use the symmetric property, where Xij = Xji, to reduce the size of this matrix by half. But then representing this matrix using a plain 2D array would be inefficient, since we can't shrink a rectangular array to hold only half the entries.
Representing this matrix as an adjacency list would also be inefficient: viewed as a graph it would be a dense graph, and adjacency-list operations such as removal, insertion, and search take a lot of time.
But what about using heaps?
There is no one answer until you decide what you are going to do with this matrix (or maybe matrices?).
If you are just going to store and remember it, then just store it sequentially, leaving out the redundant entries. (Your code knows how to access it, because that is all it does, right?)
More probably, you want to do normal matrix operations on it. In that case, are you trying to make the storage efficient, or the execution? In the latter case, I don't see many opportunities based on it being symmetric: the multiplies are the expensive thing and you probably still need all of those. If it is the storage, are you limiting yourself to operations that take symmetric matrices in and produce symmetric matrices out? That sounds awfully specific. If so, then you only need to do the calculations for the part you are storing, because by definition the other entries are symmetric; just write your code to generate that part of the matrix and you are done.
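As a concrete illustration of "store it sequentially, leaving out the redundant entries": keep only the lower triangle (including the diagonal) in a flat array of N*(N+1)/2 elements. A rough sketch (the class name and element type are my own choices):

    #include <cstddef>
    #include <utility>
    #include <vector>

    class SymmetricMatrix {
    public:
        explicit SymmetricMatrix(std::size_t n)
            : n_(n), data_(n * (n + 1) / 2, 0.0) {}

        // Map (i, j) with i >= j into the packed lower triangle; swap
        // otherwise, since X(i, j) == X(j, i) by symmetry.
        double& at(std::size_t i, std::size_t j) {
            if (i < j) std::swap(i, j);
            return data_[i * (i + 1) / 2 + j];
        }

    private:
        std::size_t n_;
        std::vector<double> data_;
    };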

Eigen: Efficient equivalent to MATLAB's changem()?

I need to perform an operation on an Eigen VectorXi which is equivalent to MATLAB's changem():
http://www.mathworks.com/help/map/ref/changem.html
At the moment, the way I am doing this is looping over the values in the array and performing the remapping with a switch/case block. I am guessing this is not particularly efficient.
Is there a fast way to do this with Eigen? Speed is critical for my application.
Switch / case will be particularly slow and inflexible.
changem takes a matrix and two vectors of values, new and old. If an entry is found in the old list, it is replaced by the corresponding entry in the new list. So it's inherently going to be rather slow: you need to pass over the entire matrix, search the old list for each entry, and, if the entry is found, replace it with the corresponding entry from the new list.
How can you speed it up? First, don't hardcode the mapping as a switch/case. A modern compiler will possibly optimise that to a loop rather than lots of jumps, but I wouldn't guarantee it, and the approach is inflexible.
Secondly, you can sort the "old" vector and use a binary search rather than a linear one. That will only help significantly if the old vector is long.
Thirdly, you can take advantage of what you know about the matrix. Are the old values constrained to lie in certain regions? Is there one value which is overwhelmingly likely and can be tested for first? Can you quickly exclude some values as not allowed in the old list (Too big, too small, not integral).
Are the old values integers and can you use indexing? Or generalise that to hashing. That would be even faster than a binary search, though with more overhead for hashing.
Can you solve the problem another way and keep an index of matrix xy co-ordinates by value?
There are lots of approaches. But simply implement the Matlab function naively in C as the first step. It might well be fast enough.
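As a sketch of the hashing idea (the function name and signature are mine, mirroring MATLAB's changem(Z, newcode, oldcode) argument order): build a std::unordered_map once, then remap with Eigen's unaryExpr. Entries not found in the old list pass through unchanged.

    #include <unordered_map>
    #include <Eigen/Dense>

    Eigen::VectorXi changem(const Eigen::VectorXi& v,
                            const Eigen::VectorXi& newVals,
                            const Eigen::VectorXi& oldVals) {
        std::unordered_map<int, int> remap; // old value -> new value
        for (Eigen::Index i = 0; i < oldVals.size(); ++i)
            remap[oldVals(i)] = newVals(i);
        // One pass over the vector; each lookup is O(1) on average.
        return v.unaryExpr([&remap](int x) {
            auto it = remap.find(x);
            return it != remap.end() ? it->second : x;
        });
    }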

What's the most efficient way to store a subset of column indices of big matrix and in C++?

I am working with a very big matrix X (say, 1,000-by-1,000,000). My algorithm goes as follows:
1. Scan the columns of X one by one and, based on some filtering rules, identify the subset of columns that are needed. Denote the subset of column indices by S. Its size depends on the filter, so it is unknown before the computation and will change if the filtering rules change.
2. Loop over S, doing some computation with column x_i if i is in S. This step needs to be parallelized with OpenMP.
3. Repeat steps 1 and 2 100 times with changed filtering rules, defined by a parameter.
I am wondering what the best way is to implement this procedure in C++. Here are two ways I can think of:
(a) Use a 0-1 array (of length 1,000,000) to mark the needed columns in Step 1; then in Step 2, loop over 1 to 1,000,000, check the indicator with an if, and do the computation only if the indicator is 1 for that column;
(b) Use a std::vector for S and push_back a column index whenever a column is identified as needed; then loop only over S, each time extracting a column index from S and doing the computation. (I considered this approach, but I've read that push_back is expensive even when just storing integers.)
Since my algorithm is very time-consuming, I assume a little time saving in the basic step would mean a lot overall. So my question is: should I try (a), (b), or some even better way, for better performance (and for working with OpenMP)?
Any suggestions/comments for achieving a better speedup are much appreciated. Thank you very much!
To me, it seems that "step #1 really does not matter much." (At the end of the day, you're going to wind up with: "a set of columns, however represented.")
To me, what's really going to matter is: "just what's gonna happen when you unleash ('parallelized ...') step #2."
"An array of 'ones and zeros,'" however large, should be fairly simple for parallelization, while a more-'advanced' data structure might well, in this case, "just get in the way."
"One thousand mega-bits, these days?" Sure. Done. No problem. ("And if not, a simple array of bit-sets.") However-many simultaneously executing entities should be able to navigate such a data structure, in parallel, with a minimum of conflict . . . Therefore, to my gut, "big bit-sets win."
I think you will find std::vector easier to use. Regarding push_back, the cost comes when the vector reallocates (and maybe copies) the data. To avoid that (if it matters), reserve room for 1,000,000 elements up front with vector::reserve. Your vector is then 8 MB, insignificant compared to your problem size. It's only one order of magnitude bigger than a bitmap would be, and a lot simpler to deal with: if we call your vector S, then accessing the i-th interesting column is just x[S[i]].
(Based on my gut feeling) I'd probably go for pushing back into a vector, but the answer is quite simple: Measure both methods (they are both trivial to implement). Most likely you won't see a noticeable difference.
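A rough sketch of option (b) wired up with OpenMP; filter and process_column are hypothetical stand-ins for the real filtering rule and per-column work:

    #include <cstddef>
    #include <vector>

    bool filter(std::size_t col);          // hypothetical step-1 filtering rule
    void process_column(std::size_t col);  // hypothetical step-2 work on a column

    void run_pass(std::size_t num_cols) {
        std::vector<std::size_t> S;
        S.reserve(num_cols); // pre-reserve so push_back never reallocates

        // Step 1: sequential scan, collecting the indices of needed columns.
        for (std::size_t j = 0; j < num_cols; ++j)
            if (filter(j))
                S.push_back(j);

        // Step 2: parallelize over S only (compile with -fopenmp).
        #pragma omp parallel for
        for (long k = 0; k < static_cast<long>(S.size()); ++k)
            process_column(S[k]);
    }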

How do we find the biggest white rectangle in an n x n bitmap?

Any idea on how to solve such problems (in C++), like which is the best algorithm to use?
Say you have an n x n rectangular area of black and white (0 and 1) pixels and you're looking for the biggest white rectangle in this area.
I would write something simple like below:
first pass: create a set of one-row-high segments for each pixel row.
second pass: aggregate segments into rectangles:
for each segment, iterate over the rows to find the largest rectangle containing it;
if you use another segment in the process, mark it as used; no need to try it again;
at any point, keep only the largest rectangle found so far.
That's only a first draft of a possible solution. It should be rewritten using a more formal algorithmic syntax and many details should be provided. Each step hides pitfalls to avoid if you want to be efficient. But it should not be too hard to code.
If I did not miss something, what I described above should basically be O(n^4) in the worst case, with the first pass O(n^2) used to find horizontal segments (it could be quite fast with a very small loop) and the second pass probably much less than O(n^4) in practice (it depends on segment size; it is really nb_total_segment x nb_segment_per_line x nb_overlapping_segment).
That doesn't look bad to me. I can't see any obvious way to do it with better O complexity (but of course there may be one; O(n^4) is not that good).
If you provide some details on the input structure and expected result, it may even be some fun to code.
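For reference, a different, standard technique (not the segment aggregation sketched above) achieves O(n^2): for each row, maintain a histogram counting consecutive white pixels above each column, then find the largest rectangle in that histogram with a stack. A compact sketch:

    #include <algorithm>
    #include <stack>
    #include <vector>

    // Area of the largest all-white (all-1) rectangle, O(n^2) overall.
    int largestWhiteRectangle(const std::vector<std::vector<int>>& img) {
        if (img.empty() || img[0].empty()) return 0;
        const int rows = static_cast<int>(img.size());
        const int cols = static_cast<int>(img[0].size());
        std::vector<int> heights(cols, 0); // white run ending at current row
        int best = 0;
        for (int r = 0; r < rows; ++r) {
            for (int c = 0; c < cols; ++c)
                heights[c] = img[r][c] ? heights[c] + 1 : 0;
            // Largest rectangle in this histogram, via a stack of rising bars.
            std::stack<int> st;
            for (int c = 0; c <= cols; ++c) {
                const int h = (c == cols) ? 0 : heights[c]; // 0 flushes stack
                while (!st.empty() && heights[st.top()] >= h) {
                    const int top = st.top(); st.pop();
                    const int width = st.empty() ? c : c - st.top() - 1;
                    best = std::max(best, heights[top] * width);
                }
                st.push(c);
            }
        }
        return best;
    }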
What you ask for is known as blob filtering in the computer vision world.
http://en.wikipedia.org/wiki/Blob_extraction
http://en.wikipedia.org/wiki/Connected_component_labeling
http://www.aforgenet.com/framework/features/blobs_processing.html