Determine all square sub matrices of a given NxN matrix in C++ - c++

Given an NxN square matrix, I would like to determine all possible square sub matrices obtained by removing an equal number of rows and columns.
In order to determine all possible 2x2 matrices I need 4 nested loops. Similarly, for 3x3 matrices I need 6 nested loops, and so on. Is there a way to generate code in C++ so that the code for the loops is generated dynamically? I have checked some answers related to code generation in C++, but most of them use Python. I have no experience with Python. So, is it possible to write C++ code that generates code?

If I get what you are saying, you mean you require M loops to choose M rows, and M loops for M columns for an M x M sub matrix, 1 <= M <= N
You don't need 2*M loops to do this. No need to dynamically generate code with an ever-increasing number of loops!
Essentially, you need to "combine" all possible combinations of i_{1}, i_{2}, ..., i_{M} and j_{1}, j_{2}, ..., j_{M} such that 1 <= i_{1} < i_{2} < ... < i_{M} <= N (and similarly for j)
If you have all possible combinations of all such i_{1}, ..., i_{M} you are essentially done.
Say for example you are working with a 10 x 10 matrix and you require 4 x 4 sub matrices.
Suppose you selected rows {1, 2, 3, 4} and columns {1, 2, 3, 4} initially. Next select columns {1, 2, 3, 5}, then {1, 2, 3, 6} and so on till {1, 2, 3, 10}. Next select {1, 2, 4, 5}, then {1, 2, 4, 6} and so on till you reach {7, 8, 9, 10}. This is one way you could generate all ("10 choose 4") combinations in a sequence.
Go ahead, write a function that generates this sequence and you are done. It can take as input M, N, current combination (as an array of M values) and return the next combination.
You need to call this function to select the next set of rows and the next set of columns.
I have put this a little loosely. If something is not clear I can edit to update my answer.
Edit:
I will be assuming loop index starts from 0 (the C++ way!). To elaborate the algorithm further, given one combination as input the next combination can be generated by treating the combination as a "counter" of sorts (except that no digit repeats).
Disclaimer : I have not run or tested the below snippet of code. But the idea is there for you to see. Also, I don't use C++ anymore. Bear with me for any mistakes.
// Requires M <= N as input, (N as in N x N matrix)
// Advances currentCombination (M strictly increasing indices in 0 .. N-1)
// to the next combination, in place.
void nextCombination( int *currentCombination, int M, int N ) {
    int *arr = currentCombination;
    for( int i = M - 1; i >= 0; i-- ) {
        // Position i can hold at most N - M + i.
        if( arr[i] < N - M + i ) {
            arr[i]++;
            // Reset every later position to the smallest valid value.
            for( i = i + 1; i < M; i++ ) {
                arr[i] = arr[i - 1] + 1;
            }
            break;
        }
    }
}
// Write code for Initialization: arr = [0, 1, 2, 3]
nextCombination( arr, 4, 10 );
// arr = [0, 1, 2, 4]
// You can check if the last combination has been reached by checking if arr[0] == N - M (the function then leaves arr unchanged). Please incorporate that into the function if you wish.
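As a rough sketch of how the pieces could fit together, the variant below returns false once the last combination has been passed and uses it to walk over every pair of row and column index sets. The names forEachSubmatrix and visit are made up for this illustration, and std::vector replaces the raw array; it has not been tuned in any way.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Advances comb (M strictly increasing indices in 0 .. N-1) to the next
// combination. Returns false when comb was already the last combination
// {N-M, ..., N-1}, in which case comb is left unchanged.
bool nextCombination(std::vector<int>& comb, int N) {
    const int M = static_cast<int>(comb.size());
    assert(M <= N);
    for (int i = M - 1; i >= 0; --i) {
        if (comb[i] < N - M + i) {
            ++comb[i];
            for (int k = i + 1; k < M; ++k) {
                comb[k] = comb[k - 1] + 1;
            }
            return true;
        }
    }
    return false;
}

// Calls visit(rows, cols) once for every M x M sub matrix of an N x N matrix
// and returns how many sub matrices were visited (hypothetical helper).
template <typename Visitor>
long long forEachSubmatrix(int N, int M, Visitor visit) {
    assert(M >= 1 && M <= N);
    long long count = 0;
    std::vector<int> rows(M);
    std::iota(rows.begin(), rows.end(), 0);  // {0, 1, ..., M-1}
    do {
        std::vector<int> cols(M);
        std::iota(cols.begin(), cols.end(), 0);
        do {
            visit(rows, cols);
            ++count;
        } while (nextCombination(cols, N));
    } while (nextCombination(rows, N));
    return count;
}
```

The visitor receives the selected row indices and column indices; the sub matrix entry at (p, q) is then matrix[rows[p]][cols[q]]. For N = 4 and M = 2 this visits (4 choose 2)^2 = 36 sub matrices.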
Edit:
Actually I want to check singularity of all possible sub matrices. My approach is to compute all submatrices and then find their determinants. However, after computing the determinants of the 2x2 matrices, I'll store them and use them while computing determinants of 3x3 matrices, and so on. Can you suggest a better approach? I have no space and time constraints. – vineel
A straightforward approach using what you suggest is to index the determinants based on the rows-columns combination that makes a sub matrix. At first store determinants for 1 x 1 sub matrices in a hash map (basically the entries themselves).
So the hash map would look like this for the 10 x 10 case
{
"0-0" : arr_{0, 0},
"0-1" : arr_{0, 1},
.
.
.
"1-0" : arr_{1, 0},
"1-1" : arr_{1, 1},
.
.
.
"9-9" : arr_{9, 9}
}
When M = 2, you can calculate determinant using the usual formula (the determinants for 1 x 1 sub matrices having been initialized) and then add to the hash map. The hash string for a 2 x 2 sub matrix would look something like 1:3-2:8 where the row indices in the original 10 x 10 matrix are 1,3 and the column indices are 2, 8. In general, for m x m sub matrix, the determinant can be determined by looking up all necessary (already) computed (m - 1) x (m - 1) determinants - this is a simple hash map lookup. Again, add the determinant to hash map once calculated.
Of course, you may need to slightly modify the nextCombination() function - it currently assumes row and column indices run from 0 to N - 1.
On another note, since all sub matrices are to be processed starting from 1 x 1, you don't need something like a nextCombination() function. Given a 2 x 2 matrix, you just need to select one more row and column to form a 3 x 3 matrix. So you need to select one row-index (that's not part of the row indices that make the 2 x 2 sub matrix) and similarly one column-index. But doing this for every 2 x 2 matrix will generate duplicate 3 x 3 matrices - you need to think of some way to eliminate duplicates. One way to avoid duplicates is by choosing only such row/column whose index is greater than the highest row/column index in the sub matrix.
Again I have loosely defined the idea. You can build upon it.
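A minimal sketch of the "store smaller determinants and look them up" idea, under a few assumptions that are not in the answer above: the key is a pair of bit masks (selected rows, selected columns) instead of a string such as 1:3-2:8, the matrix holds integers small enough not to overflow long long, and N is at most 32. The names minorDet, Matrix and Key are invented for the illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

using Matrix = std::vector<std::vector<long long>>;
using Key = std::pair<std::uint32_t, std::uint32_t>;  // (row mask, column mask)

// Determinant of the sub matrix made of the rows set in rowMask and the
// columns set in colMask (both masks must have the same number of bits set).
// Uses cofactor expansion along the lowest selected row and caches every
// (m - 1) x (m - 1) determinant it needs.
long long minorDet(const Matrix& a, std::uint32_t rowMask, std::uint32_t colMask,
                   std::map<Key, long long>& cache) {
    assert(a.size() <= 32);
    if (rowMask == 0) return 1;  // determinant of the empty matrix
    const Key key{rowMask, colMask};
    const auto it = cache.find(key);
    if (it != cache.end()) return it->second;

    int r = 0;
    while (((rowMask >> r) & 1u) == 0) ++r;  // lowest selected row

    long long det = 0;
    int sign = 1;  // alternates over the selected columns, left to right
    for (int c = 0; c < static_cast<int>(a.size()); ++c) {
        if (((colMask >> c) & 1u) == 0) continue;
        det += sign * a[r][c] *
               minorDet(a, rowMask & ~(1u << r), colMask & ~(1u << c), cache);
        sign = -sign;
    }
    cache[key] = det;
    return det;
}
```

A sub matrix is singular exactly when the returned value is 0. Sharing one cache across calls means each smaller determinant is computed only once, which is the point of the hash map described above.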

Related

Find the same numbers between [a,b] intervals

Suppose I have 3 array of consecutive numbers
a = [1, 2, 3]
b = [2, 3, 4]
c = [3, 4]
Then the same number that appears in all 3 arrays is 3.
My algorithm is to use two nested for loops to find the numbers common to two arrays and push them into another array (let's call it d). Then
d = [2, 3] (d = a overlap b)
Then I use it again on arrays d and c => the final result is 1, because only 1 number appears in all 3 arrays.
e = [3] (e = c overlap d) => e.length = 1
Other than that, if there exists only 1 array, then the algo should return the length of the array, as all of its numbers appear in itself. But I think my said algo above would take too long because the numbers of array can go up to 10^5. So, any idea of a better algorithm?
But I think my said algo above would take too long because the numbers of array can go up to 10^5. So, any idea of a better algorithm?
Yes, since these are ranges, you basically want to calculate the intersection of the ranges. This means that you can calculate the maximum m of all the first elements of the lists, and the minimum n of all the last elements of the list. All the numbers between m and n (both inclusive) are then members of all lists. If m>n, then there are no numbers in these lists.
You do not need to calculate the overlap by enumerating over the first list, and check if these are members of the last list. Since these are consecutive numbers, we can easily find out what the overlap is.
In short, the overlap of [a, ..., b] and [c, ..., d] is [ max(a,c), ..., min(b,d) ], there is no need to check the elements in between.
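A small sketch of that idea, assuming each array is given only by its first and last element (the function name countCommon is made up for this example):

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Each range is {first, last}, both inclusive, with first <= last.
// Returns how many integers belong to every range.
long long countCommon(const std::vector<std::pair<long long, long long>>& ranges) {
    assert(!ranges.empty());
    long long lo = ranges.front().first;   // becomes the maximum of all first elements
    long long hi = ranges.front().second;  // becomes the minimum of all last elements
    for (const auto& r : ranges) {
        lo = std::max(lo, r.first);
        hi = std::min(hi, r.second);
    }
    return lo > hi ? 0 : hi - lo + 1;
}
```

This takes time linear in the number of arrays, regardless of how many numbers each array holds. With a single array it returns that array's length, as the question requires.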

Every sum possibilities of elements

From a given array (call it numbers[]), I want another array (results[]) which contains all possible sums of elements of the first array.
For example, if I have numbers[] = {1,3,5}, results[] will be {1,3,5,4,8,6,9,0}.
There are 2^n possibilities.
It doesn't matter if a number appears two times because results[] will be a set.
I did it for sums of pairs or triplets, and it's very easy. But I don't understand how it works when we sum 0, 1, 2 or n numbers.
This is what I did for pairs :
std::unordered_set<int> pairPossibilities(std::vector<int> &numbers) {
    std::unordered_set<int> results;
    // i + 1 < size() avoids the unsigned underflow of size() - 1 on an empty vector
    for(std::size_t i=0;i+1<numbers.size();i++) {
        for(std::size_t j=i+1;j<numbers.size();j++) {
            results.insert(numbers.at(i)+numbers.at(j));
        }
    }
    return results;
}
Also, assuming that the numbers[] is sorted, is there any possibility to sort results[] while we fill it ?
Thanks!
This can be done with Dynamic Programming (DP) in O(n*W) where W = sum{numbers}.
This is basically the same solution of Subset Sum Problem, exploiting the fact that the problem has optimal substructure.
DP[i, 0] = true
DP[-1, w] = false for w != 0
DP[i, w] = DP[i-1, w] OR DP[i-1, w - numbers[i]]
Start by following the above solution to find DP[n, sum{numbers}].
As a result, you will get:
DP[n , w] = true if and only if w can be constructed from numbers
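The recurrence can be sketched roughly as follows. This is only an illustration under the assumption that all numbers are non-negative; it keeps a single row of the DP table (iterating w downwards so that each number is used at most once), and the name reachableSums is invented here.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Returns every sum obtainable from a subset of numbers (the empty subset
// gives 0), in increasing order. Assumes all numbers are non-negative.
std::vector<int> reachableSums(const std::vector<int>& numbers) {
    for (int x : numbers) assert(x >= 0);
    const int W = std::accumulate(numbers.begin(), numbers.end(), 0);
    // reachable[w] plays the role of DP[i, w] for the numbers processed so far.
    std::vector<char> reachable(W + 1, 0);
    reachable[0] = 1;
    for (int x : numbers) {
        // Go downwards so that x is not added twice to the same subset.
        for (int w = W; w >= x; --w) {
            if (reachable[w - x]) reachable[w] = 1;
        }
    }
    std::vector<int> sums;
    for (int w = 0; w <= W; ++w) {
        if (reachable[w]) sums.push_back(w);
    }
    return sums;
}
```

Because the table is scanned in order of w, the result comes out sorted and without duplicates, which also covers the question about keeping results[] sorted.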
Following on from the Dynamic Programming answer, you could go with a recursive solution and then use memoization to cache the results: a top-down approach, in contrast to Amit's bottom-up one.
// Defined before subsetSum so that the call below compiles.
void generateSubsetSum(vector<int>& ans, int sum, vector<int>& nums, int i)
{
    if(i == nums.size() )
    {
        ans.push_back(sum);
        return;
    }
    generateSubsetSum(ans,sum + nums[i],nums,i + 1); // take nums[i]
    generateSubsetSum(ans,sum,nums,i + 1);           // skip nums[i]
}
vector<int> subsetSum(vector<int>& nums)
{
    vector<int> ans;
    generateSubsetSum(ans,0,nums,0);
    return ans;
}
Result is : {9 4 6 1 8 3 5 0} for the set {1,3,5}
This simply picks the number at index i, adds it to the sum and recurses. Once that returns, the second branch follows with sum unchanged, i.e. without nums[i] added. To memoize this you would have a cache to store sum at i.
I would do something like this (seems easier) [I wanted to put this in comment but can't write the shifting and removing an elem at a time - you might need a linked list]
1 3 5
3 5
-----
4 8
1 3 5
5
-----
6
1 3 5
3 5
5
------
9
Add 0 to the list in the end.
Another way to solve this is create a subset arrays of vector of elements then sum up each array's vector's data.
e.g
1 3 5 = {1, 3} + {1,5} + {3,5} + {1,3,5} after removing sets of single element.
Keep in mind that it is always easier said than done. A single tiny mistake along the implemented algorithm would take a lot of time in debug to find it out. =]]
There has to be a binary chop version, as well. This one is a bit heavy-handed and relies on that set of answers you mention to filter repeated results:
Split the list into 2,
and generate the list of sums for each half
by recursion:
the minimum state is either
2 entries, with 1 result,
or 3 entries with 3 results
alternatively, take it down to 1 entry with 0 results, if you insist
Then combine the 2 halves:
All the returned entries from both halves are legitimate results
There are 4 additional result sets to add to the output result by combining:
The first half inputs vs the second half inputs
The first half outputs vs the second half inputs
The first half inputs vs the second half outputs
The first half outputs vs the second half outputs
Note that the outputs of the two halves may have some elements in common, but they should be treated separately for these combines.
The inputs can be scrubbed from the returned outputs of each recursion if the inputs are legitimate final results. If they are they can either be added back in at the top-level stage or returned by the bottom level stage and not considered again in the combining.
You could use a bitfield instead of a set to filter out the duplicates. There are reasonably efficient ways of stepping through a bitfield to find all the set bits. The max size of the bitfield is the sum of all the inputs.
There is no intelligence here, but lots of opportunity for parallel processing within the recursion and combine steps.

Is there any number repeated in the array?

There's array of size n. The values can be between 0 and (n-1) as the indices.
For example: array[4] = {0, 2, 1, 3}
I should say if there's any number that is repeated more than 1 time.
For example: array[5] = {3,4,1,2,4} -> return true because 4 is repeated.
This question has so many different solutions and I would like to know if this specific solution is alright (if yes, please prove, else refute).
My solution (let's look at the next example):
array: indices 0 1 2 3 4
values 3 4 1 2 0
So I suggest:
count the sum of the indices (4x5 / 2 = 10) and check that the values' sum (3+4+1+2+0) is equal to this sum. if not, there's repeated number.
in addition to the first condition, get the multiplication of the indices(except 0. so: 1x2x3x4) and check if it's equal to the values' multiplication (except 0, so: 3x4x1x2x0).
=> if in each condition, it's equal then I say that there is NO repeated number. otherwise, there IS a repeated number.
Is it correct? if yes, please prove it or show me a link. else, please refute it.
Why your algorithm is wrong?
Your solution is wrong, here is a counter example (there may be simpler ones, but I found this one quite quickly):
int arr[13] = {1, 1, 2, 3, 4, 10, 6, 7, 8, 9, 10, 11, 6};
The sum is 78, and the product is 479001600, if you take the normal array of size 13:
int arr[13] = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12};
It also has a sum of 78 and a product of 479001600 so your algorithm does not work.
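For anyone who wants to check such counter examples mechanically, here is a small sketch that computes the two statistics used by the proposed method. The helper name sumAndProduct is made up, and zeros are skipped in the product as in the question.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Returns {sum of all values, product of all non-zero values}.
// Assumes non-negative values whose product fits in 64 bits.
std::pair<std::int64_t, std::uint64_t> sumAndProduct(const std::vector<int>& values) {
    std::int64_t sum = 0;
    std::uint64_t product = 1;
    for (int v : values) {
        assert(v >= 0);
        sum += v;
        if (v != 0) product *= static_cast<std::uint64_t>(v);
    }
    return {sum, product};
}
```

Calling it on both arrays above yields the same pair, even though the first array contains 1, 6 and 10 twice.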
How to find counter examples? [1]
To find a counter example [2] [3]:
Take an array from 0 to N - 1;
Pick two even numbers [3] M1 > 2 and M2 > 2 between 0 and N - 1 and halve them;
Replace P1 = M1/2 - 1 by 2 * P1 and P2 = M2/2 + 1 by 2 * P2.
In the original array you have:
Product = M1 * P1 * M2 * P2
Sum = 0 + M1 + P1 + M2 + P2
= M1 + M1/2 - 1 + M2 + M2/2 + 1
= 3/2 * (M1 + M2)
In the new array you have:
Product = M1/2 * 2 * P1 * M2/2 * 2 * P2
= M1 * P1 * M2 * P2
Sum = M1/2 + 2P1 + M2/2 + 2P2
= M1/2 + 2(M1/2 - 1) + M2/2 + 2(M2/2 + 1)
= 3/2 * M1 - 2 + 3/2 * M2 + 2
= 3/2 * (M1 + M2)
So both array have the same sum and product, but one has repeated values, so your algorithm does not work.
[1] This is one method of finding counter examples, there may be others (there are probably others).
[2] This is not exactly the same method I used to find the first counter example. In the original method, I used only one number M and was using the fact that you can replace 0 by 1 without changing the product, but I propose a more general method here in order to avoid arguments such as "But I can add a check for 0 in my algorithm.".
[3] That method does not work with small arrays because you need to find 2 even numbers M1 > 2 and M2 > 2 such that M1/2 != M2 (and reciprocally) and M1/2 - 1 != M2/2 + 1, which (I think) is not possible for any array with a size lower than 14.
What algorithms do work? [4]
Algorithm 1: O(n) time and space complexity.
If you can allocate a new array of size N, then:
template <std::size_t N>
bool has_repetition (std::array<int, N> const& array) {
std::array<bool, N> rep = {0};
for (auto v: array) {
if (rep[v]) {
return true;
}
rep[v] = true;
}
return false;
}
Algorithm 2: O(nlog(n)) time complexity and O(1) space complexity, with a mutable array.
You can simply sort the array:
template <std::size_t N>
bool has_repetition (std::array<int, N> &array) {
std::sort(std::begin(array), std::end(array));
auto it = std::begin(array);
auto ne = std::next(it);
while (ne != std::end(array)) {
if (*ne == *it) {
return true;
}
++it; ++ne;
}
return false;
}
Algorithm 3: O(n^2) time complexity and O(1) space complexity, with non mutable array.
template <std::size_t N>
bool has_repetition (std::array<int, N> const& array) {
for (auto it = std::begin(array); it != std::end(array); ++it) {
for (auto jt = std::next(it); jt != std::end(array); ++jt) {
if (*it == *jt) {
return true;
}
}
}
return false;
}
[4] These algorithms do work, but there may exist other ones that perform better. These are only the simplest ones I could think of given some "restrictions".
What's wrong with your method?
Your method computes some statistics of the data and compares them with those expected for a permutation (= correct answers). While a violation of any of these comparisons is conclusive (the data cannot satisfy the constraint), the inverse is not necessarily the case. You only look at two statistics, and these are too few for sufficiently large data sets. Owing to the fact that the data are integer, the smallest number of data for which your method may fail is larger than 3.
If you are searching for duplicates in your array, there is a simple way:
const int N = 5; // must be a compile-time constant to be used as an array size
int array[N] = {1,2,3,4,4};
for (int i = 0; i < N; i++){
    for (int j = i+1; j < N; j++){
        if(array[j]==array[i]){
            std::cout<<"DUPLICATE FOUND\n";
            return true;
        }
    }
}
return false;
Another simple way to find duplicates is to use the std::set container, for example:
std::set<int> set_int;
set_int.insert(5);
set_int.insert(5);
set_int.insert(4);
set_int.insert(4);
set_int.insert(5);
std::cout<<"\nsize "<<set_int.size();
The output will be 2, because there are only 2 distinct values; that is fewer than the 5 values inserted, so there must be duplicates.
A more in depth explanation why your algorithm is wrong:
count the sum of the indices (4x5 / 2 = 10) and check that the values' sum (3+4+1+2+0) is equal to this sum. if not, there's repeated number.
Given any array A which has no duplicates, it is easy to create an array that meets your first requirement but now contains duplicates. Just take two values, subtract some value v from one of them and add v to the other. Or take multiple values and make sure their sum stays the same. (As long as the new values are still within the 0 .. N-1 range.) For N = 3 it is already possible to change {0,1,2} to {1,1,1}. For an array of size 3, there are 7 arrays that have the correct sum, of which 1 is a false positive. For an array of size 4, 20 out of 44 have duplicates; for an array of size 5 that's 261 out of 381; for an array of size 6 that's 3612 out of 4332; and so on. It is safe to say that the number of false positives grows much faster than the number of real positives.
in addition to the first condition, get the multiplication of the indices(except 0. so: 1x2x3x4) and check if it's equal to the values' multiplication (except 0, so: 3x4x1x2x0).
The second requirement involves the multiplication of all indices above 0. It is easy to see that this could never be a very strong restriction either. As soon as one of the indices is not prime, the product of all indices is no longer uniquely tied to the multiplicands, and a list of different values with the same product can be constructed. E.g. a pair of 2 and 6 can be replaced with 3 and 4, 2 and 9 can be replaced with 6 and 3, and so on. Obviously the number of false positives increases as the array size gets larger and more non-prime values are used as multiplicands.
Neither of these requirements is really strong, and they cannot compensate for each other. Since 0 is not even considered for the second restriction, a false positive can be created fairly easily for arrays starting at size 5: any pair of 0 and 4 can simply be replaced with two 2's in any unique array, for example {2, 1, 2, 3, 2}.
What you would need is a result that is uniquely tied to the occurring values. You could tweak your second requirement to a more complex approach that skips over the non-prime values and takes 0 into account. For example you could use the first prime (2) as multiplicand for 0, use 3 as multiplicand for 1, 5 as multiplicand for 2, and so on. That would work (you would not need the first requirement), but this approach would be overly complex. A simpler way to get a unique result would be to OR the i-th bit for each value (0 => 1 << 0, 1 => 1 << 1, 2 => 1 << 2, and so on). (Obviously it is faster to check whether a bit was already set by a reoccurring value, rather than wait for the final result. And this is conceptually the same as using a bool array/vector from the other examples!)
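The bit idea could look roughly like this. It is a sketch that assumes at most 64 values so that a single 64-bit word is enough; for larger arrays one would fall back to the bool array shown in the other answer. The function name hasRepetitionBits is invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns true if some value occurs more than once.
// Assumes at most 64 values, each in 0 .. values.size() - 1.
bool hasRepetitionBits(const std::vector<int>& values) {
    assert(values.size() <= 64);
    std::uint64_t seen = 0;
    for (int v : values) {
        assert(v >= 0 && static_cast<std::size_t>(v) < values.size());
        const std::uint64_t bit = std::uint64_t{1} << v;
        if (seen & bit) return true;  // bit already set: v occurred before
        seen |= bit;
    }
    return false;
}
```

Unlike the sum and product checks, each value maps to its own bit, so the false positive {2, 1, 2, 3, 2} mentioned above is correctly reported as containing a repetition.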

Number of swaps in a permutation [duplicate]

This question already has answers here:
Counting the adjacent swaps required to convert one permutation into another
(6 answers)
Closed 8 years ago.
Is there an efficient algorithm (efficient in terms of big O notation) to find number of swaps to convert a permutation P into identity permutation I? The swaps do not need to be on adjacent elements, but on any elements.
So for example:
I = {0, 1, 2, 3, 4, 5}, number of swaps is 0
P = {0, 1, 5, 3, 4, 2}, number of swaps is 1 (2 and 5)
P = {4, 1, 3, 5, 0, 2}, number of swaps is 3 (2 with 5, 3 with 5, 4 with 0)
One idea is to write an algorithm like this:
int count = 0;
for(int i = 0; i < n; ++ i) {
for(; P[i] != i; ++ count) { // could be permuted multiple times
std::swap(P[P[i]], P[i]);
// look where the number at hand should be
}
}
But it is not very clear to me whether that is actually guaranteed to terminate or whether it finds a correct number of swaps. It works on the examples above. I tried generating all permutation on 5 and on 12 numbers and it always terminates on those.
This problem arises in numerical linear algebra. Some matrix decompositions use pivoting, which effectively swaps row with the greatest value for the next row to be manipulated, in order to avoid division by small numbers and improve numerical stability. Some decompositions, such as the LU decomposition can be later used to calculate matrix determinant, but the sign of the determinant of the decomposition is opposite to that of the original matrix, if the number of permutations is odd.
EDIT: I agree that this question is similar to Counting the adjacent swaps required to convert one permutation into another. But I would argue that this question is more fundamental. Converting permutation from one to another can be converted to this problem by inverting the target permutation in O(n), composing the permutations in O(n) and then finding the number of swaps from there to identity. Solving this question by explicitly representing identity as another permutation seems suboptimal. Also, the other question had, until yesterday, four answers where only a single one (by |\/|ad) was seemingly useful, but the description of the method seemed vague. Now user lizusek provided answer to my question there. I don't agree with closing this question as duplicate.
EDIT2: The proposed algorithm actually seems to be rather optimal, as pointed out in a comment by user rcgldr, see my answer to Counting the adjacent swaps required to convert one permutation into another.
I believe the key is to think of the permutation in terms of the cycle decomposition.
This expresses any permutation as a product of disjoint cycles.
Key facts are:
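Swapping elements in two disjoint cycles merges them into one longer cycle (one fewer cycle)
Swapping elements in the same cycle splits it into two cycles (one more cycle)
The identity consists of n cycles of length 1, so the number of swaps needed is n-c where c is the number of cycles in the decomposition (fixed points count as cycles)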
Your algorithm always swaps elements in the same cycle so will correctly count the number of swaps needed.
If desired, you can also do this in O(n) by computing the cycle decomposition and returning n minus the number of cycles found.
Computing the cycle decomposition can be done in O(n) by starting at the first node and following the permutation until you reach the start again. Mark all visited nodes, then start again at the next unvisited node.
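A short sketch of that O(n) procedure, assuming the permutation is stored as a std::vector<int> holding each of 0 .. n-1 exactly once (the name minSwapsToIdentity is made up here):

```cpp
#include <cassert>
#include <vector>

// Minimum number of swaps (of arbitrary positions) that turn the permutation
// p of 0 .. n-1 into the identity: n minus the number of cycles.
int minSwapsToIdentity(const std::vector<int>& p) {
    const int n = static_cast<int>(p.size());
    std::vector<bool> visited(n, false);
    int cycles = 0;
    for (int start = 0; start < n; ++start) {
        if (visited[start]) continue;
        ++cycles;  // start begins a cycle that has not been seen yet
        for (int j = start; !visited[j]; j = p[j]) {
            assert(p[j] >= 0 && p[j] < n);
            visited[j] = true;
        }
    }
    return n - cycles;
}
```

For the examples in the question, {0, 1, 5, 3, 4, 2} has 5 cycles and needs 1 swap, and {4, 1, 3, 5, 0, 2} has 3 cycles and needs 3 swaps. The input is not modified, which may be convenient when only the parity is needed for the sign of a determinant.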
I believe the following are true:
If S(x[0], ..., x[n-1]) is the minimum number of swaps needed to convert x to {0, 1, ..., n - 1}, then:
If x[n - 1] == n - 1, then S(x) == S(x[0],...,x[n-2]) (ie, cut off the last element)
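If x[n - 1] != n - 1, then S(x) == S(x[0], ..., x[i-1], x[n-1], x[i+1], ..., x[n-2]) + 1, where x[i] == n - 1 (ie, swap x[i] with the last element, then cut off the last element).
S({}) = 0.
This suggests a straightforward algorithm for computing S(x). As written, each std::find is a linear scan, so it takes O(n^2) time in the worst case:
#include <algorithm> // std::find
#include <utility>   // std::swap
int num_swaps(int* x, int n) {
    if (n == 0) {
        return 0;
    } else if (x[n - 1] == n - 1) {
        return num_swaps(x, n - 1);
    } else {
        // Move the value n - 1 into its final position, then drop it.
        int* i = std::find(x, x + n, n - 1);
        std::swap(*i, x[n - 1]);
        return num_swaps(x, n - 1) + 1;
    }
}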

find a matrix in a big matrix

I have a very large n*m matrix S. I want to efficiently determine whether there exists a submatrix F inside of S. The large matrix S can have a size as big as 500*500.
To clarify, consider the following:
S = 1 2 3
4 5 6
7 8 9
F1 = 2 3
5 6
F2 = 1 2
4 6
In such a case:
F1 is inside S
F2 is not inside S
Each element in the matrix is a 32-bit integer. I can only think of using a brute-force approach to find whether F is a submatrix of S. I googled to find an effective algorithm, but I can't find anything.
Is there some algorithm or principle to do it faster? (Or possibly some method to optimize the brute force approach?)
PS: some statistics about the data:
A total of 8 S matrices.
On average, each S will be matched against about 44 F.
The probability of a successful match (i.e. F appears in an S) is 19%.
One approach involves preprocessing the matrix. This will be heavy on memory, but it should be better in terms of computation time.
Check that the sub-matrix is no larger than the matrix before you do the check.
While constructing the matrix, build a construct that maps a value in the matrix to an array of (x,y) positions in the matrix. This will allow you to check for the existence of a sub-matrix where candidates could exist. You would use the value at (0,0) in the sub-matrix and get the possible positions of this value in the larger matrix. If the list of positions is empty, you have no candidates, and so, the sub-matrix does not exist. There's a start (More experienced people might consider this a naive approach however).
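The value-to-positions map described above might be sketched like this. It assumes rectangular, non-empty matrices stored as vectors of rows, and the names Grid, PositionIndex, buildIndex and containsSubmatrix are invented for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <utility>
#include <vector>

using Grid = std::vector<std::vector<int>>;
using PositionIndex =
    std::unordered_map<int, std::vector<std::pair<std::size_t, std::size_t>>>;

// Built once per big matrix: value -> list of (row, column) positions.
PositionIndex buildIndex(const Grid& S) {
    PositionIndex index;
    for (std::size_t r = 0; r < S.size(); ++r)
        for (std::size_t c = 0; c < S[r].size(); ++c)
            index[S[r][c]].emplace_back(r, c);
    return index;
}

// Returns true if F occurs in S as a contiguous block.
bool containsSubmatrix(const Grid& S, const PositionIndex& index, const Grid& F) {
    assert(!S.empty() && !F.empty() && !F[0].empty());
    const std::size_t fr = F.size(), fc = F[0].size();
    if (fr > S.size() || fc > S[0].size()) return false;
    const auto it = index.find(F[0][0]);
    if (it == index.end()) return false;  // no candidate positions at all
    for (const auto& pos : it->second) {
        const std::size_t r0 = pos.first, c0 = pos.second;
        if (r0 + fr > S.size() || c0 + fc > S[0].size()) continue;  // F would not fit
        bool match = true;
        for (std::size_t p = 0; p < fr && match; ++p)
            for (std::size_t q = 0; q < fc; ++q)
                if (S[r0 + p][c0 + q] != F[p][q]) { match = false; break; }
        if (match) return true;
    }
    return false;
}
```

With 8 big matrices and about 44 queries each, the index is built 8 times and reused for every F. How much it helps depends on how often the value F(0,0) occurs in S; with many repeated values the candidate list can still be long.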
Modified Code of deepu-benson
int Ma[][5]= {
    {0, 0, 1, 0, 0},
    {0, 0, 1, 0, 0},
    {0, 1, 0, 0, 0},
    {0, 1, 0, 0, 0},
    {1, 1, 1, 1, 0}
};
int Su[][3]= {
    {1, 0, 0},
    {1, 0, 0},
};
int S = 5; // Size of main matrix row
int T = 5; // Size of main matrix column
int M = 2; // Size of desired matrix row
int N = 3; // Size of desired matrix column
int flag = 1, i, j, p, q;
for(i=0; i<=(S-M); i++)
{
    for(j=0; j<=(T-N); j++)
    {
        flag=0;
        // Stop comparing rows as soon as one mismatch is found.
        for(p=0; p<M && flag==0; p++)
        {
            for(q=0; q<N; q++)
            {
                if(Ma[i+p][j+q] != Su[p][q])
                {
                    flag=1;
                    break;
                }
            }
        }
        if(flag==0)
        {
            printf("Match Found in the Main Matrix at starting location %d, %d\n",(i+1) ,(j+1));
            break;
        }
    }
    // The match was already reported in the inner loop; just stop searching.
    if(flag==0)
    {
        break;
    }
}
If you want to query the same big matrix multiple times with submatrices of the same size, there are many ways to preprocess the big matrix.
A similar (or even the same) problem is here:
Fastest way to Find a m x n submatrix in M X N matrix
Since you only want to know whether a given matrix is inside another big matrix, if you know how to use Matlab code from C++, you may directly use ismember from Matlab. Another way may be to figure out how ismember works in Matlab, then implement the same thing in C++.
See Find location of submatrix
Since you have tagged the question as C++ also, I am providing this code. This is a brute force technique and definitely not the ideal solution for this problem. For an S X T Main Matrix and a M X N Sub Matrix, the time complexity of the algorithm is O(STMN).
cout<<"\nEnter the order of the Main Matrix";
cin>>S>>T;
cout<<"\nEnter the order of the Sub Matrix";
cin>>M>>N;
// Read the Main Matrix into MAT[S][T]
// Read the Sub Matrix into SUB[M][N]
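// <= so that the last valid starting row and column are also tested
for(i=0; i<=(S-M); i++)
{
    for(j=0; j<=(T-N); j++)
    {
        flag=0;
        // Compare all M rows; stop at the first mismatch.
        for(p=0; p<M && flag==0; p++)
        {
            for(q=0; q<N; q++)
            {
                if(MAT[i+p][j+q] != SUB[p][q])
                {
                    flag=1;
                    break;
                }
            }
        }
        // Report a match only after every row has been compared.
        if(flag==0)
        {
            cout<<"Match Found in the Main Matrix at starting location "<<(i+1) <<"X"<<(j+1);
            break;
        }
    }
    if(flag==0)
    {
        break;
    }
}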
Much of the answer depends on what you're doing repetitively. Are you testing a bunch of huge matrices for the same submatrix? Are you testing one huge matrix looking for a bunch of different submatrices?
Do any of the matrices have repetitive patterns, or are they nice and random, or can you make no assumptions about the data?
Also, does the submatrix have to be contiguous? Does S contain
F3 = 1 3
7 9
If the data in the matrix isn't randomly distributed, it would be helpful to run some statistical analysis on it. Then you could find the sub matrix by comparing its elements in order of their inverse probability, i.e. rarest values first. It could be faster than plain brute force.
Say, you have the matrix of some normally distributed integers with the Gaussian center in 0. And you want to find submatrix say:
1 3 -12
-3 43 -1
198 2 2
You would start by searching for 198, then check that its upper right neighbour is 43, then that the upper right neighbour of that is -12; after that any 3 or -3 will do, and so on. This would greatly reduce the number of comparisons compared to the most brutal solution.
My original answer is below the break. Thinking about it, there are several optimisations; they refer to the steps of the original answer.
For Step B) do not search the entirety of S: you can discount all columns and rows which would not allow F to fit. (in the below example, only search the upper left 2x2 matrix). In cases where F is a significant proportion of S this would save considerable time.
If the range of values within S is quite low then creating a lookup table would greatly reduce the time required for step B).
Working with these 2 matrices: find [smaller matrix] inside [larger matrix]
A) Select one value from the smaller matrix:
B) locate it within the larger
C) Check the adjacent cells to see if they match
-
It's possible to do in O(N*M*(logN+logM)).
Equality can be expressed as the sum of squared differences being 0:
sum[i,j](square(S(n+i,m+j)-F(i,j))) = 0
sum[i,j](square(S(n+i,m+j))) + sum[i,j](square(F(i,j))) - 2*sum[i,j](S(n+i,m+j)*F(i,j)) = 0
First part can be calculated for all (n,m) in O(N*M) similarly to running average.
Second part is calculated as usual in O(sizeof(F)) which is less than O(N*M).
Third part is the most interesting. It's convolution which can be calculated in O(N*M*(logN+logM)) using Fast Fourier Transform: http://en.wikipedia.org/wiki/Convolution#Fast_convolution_algorithms
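As an illustration of the first part only, the sum of squares over every window can be obtained from a 2D prefix-sum table in O(N*M). This sketch assumes a rectangular matrix of integers whose squared sums fit in long long; the names Grid and windowSquareSums are invented here, and the FFT-based third part is not shown.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Grid = std::vector<std::vector<long long>>;

// result[n][m] = sum of S[n+i][m+j]^2 over 0 <= i < h, 0 <= j < w,
// for every top-left corner (n, m) at which an h x w window fits in S.
Grid windowSquareSums(const Grid& S, std::size_t h, std::size_t w) {
    assert(!S.empty() && h >= 1 && w >= 1 && h <= S.size() && w <= S[0].size());
    const std::size_t N = S.size(), M = S[0].size();
    // prefix[r][c] = sum of squares over all rows < r and columns < c.
    Grid prefix(N + 1, std::vector<long long>(M + 1, 0));
    for (std::size_t r = 0; r < N; ++r)
        for (std::size_t c = 0; c < M; ++c)
            prefix[r + 1][c + 1] = S[r][c] * S[r][c]
                                 + prefix[r][c + 1] + prefix[r + 1][c] - prefix[r][c];
    Grid result(N - h + 1, std::vector<long long>(M - w + 1, 0));
    for (std::size_t n = 0; n + h <= N; ++n)
        for (std::size_t m = 0; m + w <= M; ++m)
            result[n][m] = prefix[n + h][m + w] - prefix[n][m + w]
                         - prefix[n + h][m] + prefix[n][m];
    return result;
}
```

Here h and w are the dimensions of F. A position (n, m) is a match exactly when this value, plus the constant sum of squares of F, minus twice the correlation term, equals 0.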