I have to store matrices where the non-zero elements are arranged in a "chess table"-like pattern (1,1), (1,3), (2,2), (2,4), etc. I can't store the zero elements, but I also need to implement addition and multiplication.
The elements are stored in this vector:
std::vector<std::vector<int>> _v;
And I have
size_t n, m;
for their size, so printing and addition are fairly straightforward.
Where I run into problems is multiplication.
As an example, multiplying [ [1,0,3],[0,2,0] ] and [ [2,0],[0,1], [1,0] ] would result in [ [5,0],[0,2] ]. The way I'm storing the first matrix is [ [1],[2],[3] ] and the second [ [2,1], [1] ]. Is storing these matrices this way fundamentally wrong? If so, what would be a proper way to store these so I can multiply them?
I would strongly recommend:
Storing data in a simple one-dimensional valarray or vector of
integers. The size of the vector would be "(width * height + 1) / 2"
(half the cells, rounded up), since you won't store every second element.
Implementing your own get/set methods for accessing items. In these methods, you'll perform the transformation from x, y to index. And you'll get or set the value
only if x + y is an even number, because you know that every other
item is zero. Example:
class CheckeredMatrix
{
    // ...
public:
    void set(int x, int y, int value)
    {
        // Only cells where x + y is even are stored; writes to the
        // structurally zero cells are ignored.
        if ((x + y) % 2 == 0)
            _data[(y * _width + x) / 2] = value;
    }
};
Implementing operations add(), sub(), mul() using
only these get/set methods. Then you don't need to modify algorithms
in any way. Since you know that every second item is zero, you can
then optimize the algorithms by skipping every second step, but it
will work without it.
Wrapping everything in a class.
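To make the list above concrete, here is a minimal sketch of such a class. The names (CheckeredMatrix, get, set, mul) and the 0-based row/column convention are my own assumptions, not something from the question. It stores only the cells where row + col is even and multiplies with the unmodified textbook loop, going through get/set only.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sketch: stores only the cells where row + col is even.
class CheckeredMatrix {
public:
    CheckeredMatrix(std::size_t rows, std::size_t cols)
        : rows_(rows), cols_(cols), data_((rows * cols + 1) / 2, 0) {}

    std::size_t rows() const { return rows_; }
    std::size_t cols() const { return cols_; }

    // Structurally zero cells always read as 0.
    int get(std::size_t r, std::size_t c) const {
        assert(r < rows_ && c < cols_);
        return (r + c) % 2 == 0 ? data_[(r * cols_ + c) / 2] : 0;
    }

    // Writes to structurally zero cells are ignored.
    void set(std::size_t r, std::size_t c, int value) {
        assert(r < rows_ && c < cols_);
        if ((r + c) % 2 == 0) data_[(r * cols_ + c) / 2] = value;
    }

private:
    std::size_t rows_, cols_;
    std::vector<int> data_;
};

// Textbook triple loop written purely in terms of get/set.
CheckeredMatrix mul(const CheckeredMatrix& a, const CheckeredMatrix& b) {
    assert(a.cols() == b.rows());
    CheckeredMatrix c(a.rows(), b.cols());
    for (std::size_t i = 0; i < a.rows(); ++i)
        for (std::size_t j = 0; j < b.cols(); ++j) {
            int sum = 0;
            for (std::size_t k = 0; k < a.cols(); ++k)
                sum += a.get(i, k) * b.get(k, j);
            c.set(i, j, sum);
        }
    return c;
}
```

With the matrices from the question (a 2x3 and a 3x2), mul returns a 2x2 matrix whose get(0, 0) is 5 and get(1, 1) is 2. The loop still visits the zero cells, which is the "it will work without optimizing" part; skipping every second k is the obvious next step.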
I suggest representing your matrix as 2 matrices, one holding the elements at coordinates (2*i, 2*j), the other those at (2*i+1, 2*j+1).
So
a 0 b 0 c
0 d 0 e 0
f 0 g 0 h
->
a b c
f g h
and
d e
Those 2 matrices support the regular matrix operations (sum, multiplication, transposition): you apply the operation to each of the two matrices separately.
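A rough sketch of the multiplication under this representation follows. SplitChecker, multiplyDense and multiplyChecker are hypothetical names I made up for illustration. It relies on the fact that in a product of two checkered matrices, even-indexed cells only ever combine with even-indexed cells and odd with odd, so each block is an ordinary dense product. It assumes both blocks are non-empty (matrices of at least 2x2).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Dense = std::vector<std::vector<int>>;

// Hypothetical container: `even` holds cells (2i, 2j), `odd` holds cells (2i+1, 2j+1).
struct SplitChecker {
    Dense even;
    Dense odd;
};

// Ordinary dense matrix product.
Dense multiplyDense(const Dense& a, const Dense& b) {
    std::size_t inner = b.size();
    std::size_t cols = inner ? b[0].size() : 0;
    Dense c(a.size(), std::vector<int>(cols, 0));
    for (std::size_t i = 0; i < a.size(); ++i) {
        assert(a[i].size() == inner);  // columns of a must match rows of b
        for (std::size_t k = 0; k < inner; ++k)
            for (std::size_t j = 0; j < cols; ++j)
                c[i][j] += a[i][k] * b[k][j];
    }
    return c;
}

// The checkered product is just the two block products side by side.
SplitChecker multiplyChecker(const SplitChecker& a, const SplitChecker& b) {
    SplitChecker c;
    c.even = multiplyDense(a.even, b.even);
    c.odd = multiplyDense(a.odd, b.odd);
    return c;
}
```

For the example in the question, the first matrix splits into even = [[1, 3]] and odd = [[2]], the second into even = [[2], [1]] and odd = [[1]], and the product comes out as even = [[5]] and odd = [[2]], i.e. [ [5,0],[0,2] ].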
While working with C++ pointers recently, I ran into this question: how can I access an element of a multi-dimensional array using a 1-dimensional array that holds the indices?
Say I have an array arr, a 4-dimensional array with all elements set to 0 except arr[1][2][3][4], which is 1, and an array idx that contains the index for each dimension of arr. I can access this element with arr[idx[0]][idx[1]][idx[2]][idx[3]], or with *(*(*(*(arr + idx[0]) + idx[1]) + idx[2]) + idx[3]).
The problem is that when the number of dimensions n is large, this becomes unwieldy, so I wonder whether there is a better way to handle multi-dimensional access.
#include <bits/stdc++.h>
using namespace std;
#define N 10
int main()
{
int arr[N][N][N][N] = {0};
int idx[4] = {1, 2, 3, 4};
arr[1][2][3][4] = 1;
cout<<"Expected: "<<arr[1][2][3][4]<<" at "<<&arr[1][2][3][4]<<endl;
cout<<"Got with ****: ";
cout<<*(*(*(*(arr + idx[0]) + idx[1]) + idx[2]) + idx[3])<<endl;
return 0;
}
output
Expected: 1 at 0x7fff54c61f28
Got with ****: 1
The way you construct your algorithm for indexing a multi-dimensional array will vary depending on the language of choice; you have tagged this question with both C and C++. I will stick with the latter, since my answer pertains to C++. For a little while now I've been working on something similar but different, so this is an interesting question to me, as I was building a multipurpose multidimensional matrix class template.
What I have discovered about higher levels of multi dimensional vectors and matrices is that the order of 3 repetitiously works miracles in understanding the nature of higher dimensions. Think of this in the geometrical perspective before considering the algorithmic software implementation side of it.
Mathematically speaking Let's consider the lowest dimension of 0 with the first shape that is a 0 Dimensional object. This happens to be any arbitrary point where this point can have an infinite amount of coordinate location properties. Points such as p0(0), p1(1), p2(2,2), p3(3,3,3),... pn(n,n,...n) where each of these objects point to a specific locale with the defined number of dimensional attributes. This means that there is no linear distance such as length, width, or height and conversely a magnitude in any direction or dimension where this shape or object that has no bounds of magnitude does not define any area, volume or higher dimensions of volume. Also with these 0 dimensional points there is no awareness of direction which also implies that there is no angle of rotation that defines magnitude. Another thing to consider is that any arbitrary point is also the zero vector. Another thing to help in understand this is by the use of algebraic polynomials such that f(x) = mx+b which is linear is a One Dimensional equation, shape(in this case a line) or graph, f(x) = x^2 is Two Dimensional, f(x) = x^3 is Three Dimensional, f(x) = x^4 is Four Dimensional and so on up to f(x) = x^n where this would be N Dimensional. Length or Magnitude, Direction or Angle of Rotation, Area, Volume, and others can not be defined until you relate two distinct points to give you at least 1 line segment or vector with a specified direction. Once you have an implied direction you then have slope.
When looking at operations in mathematics the simplest is addition and it is nothing more than a linear translation and once you introduce addition you also introduce all other operations such as subtraction, multiplication, division, powers, and radicals; once you have multiplication and division you define rotation, angles of rotation, area, volume, rates of change, slope (also tangent function), which thus defines geometry and trigonometry which then also leads into integrations and derivatives. Yes, we have all had our math lessons but I think that this is important in to understanding how to construct the relationships of one order of magnitude to another, which then will help us to work through higher dimensional orders with ease once you know how to construct it. Once you can understand that even your higher orders of operations are nothing more than expansions of addition and subtraction you will begin to learn that their continuous operations are still linear in nature it is just that they expand into multiple dimensions.
Earlier I stated that the order of 3 repetitiously works miracles, so let me explain my meaning. Since we perceive things on a daily basis in the perspective of 3D, we can only visualize 3 distinct vectors that are orthogonal to each other, giving us our natural 3 Dimensions of Space: Left & Right and Forward & Backward, giving the Horizontal axes and planes, and Up & Down, giving the Vertical axis and planes. We cannot visualize anything higher, so dimensions of the order of x^4, x^5, x^6, etc. we cannot visualize, yet they do exist. If we begin to look at the graphs of the mathematical polynomials, we can begin to see a pattern between odd and even functions, where x^4, x^6, x^8 are similar in that they are nothing more than expansions of x^2, and the functions x^5, x^7 & x^9 are nothing more than expansions of x^3. So I consider the first few dimensions as normal: Zero - Point, 1st - Linear, 2nd - Area, and 3rd - Volume, and as for the 4th and higher dimensions I call all of them Volumetric.
So if you see me use Volume, it relates directly to the 3rd Dimension, whereas if I refer to Volumetric, it relates to any Dimension higher than the 3rd. Now let's consider a matrix such as you have seen in regular algebra, where the common matrices are defined by MxN. This is a 2D flat matrix that has M * N elements, and this matrix also has an area of M * N. Let's expand to a higher dimensional matrix such as MxNxO: this is a 3D Matrix with M * N * O elements, and it now has a Volume of M * N * O. When you visualize this, think of the MxN 2D part as being a page of a book, with the O component indexing each page of the book or slice of a box. The elements of these matrices can be anything from a simple value, to an applied operation, to an equation, a system of equations, sets, or just an arbitrary object as in a storage container. Now, when we have a matrix of the 4th order such as MxNxOxP, it has a 4th dimensional aspect, but the easiest way to visualize it is as a 1 dimensional array or vector in which each of its P elements is a 3D Matrix with a Volume of MxNxO. When you have a matrix of MxNxOxPxQ, you have a 2D Area Matrix of PxQ where each of those elements is an MxNxO Volume Matrix. Then again, if you have an MxNxOxPxQxR, you have a 6th dimensional matrix, and this time you have a 3D Volume Matrix where each of the PxQxR elements is in fact a 3D Matrix of MxNxO. As you go higher and higher, this pattern repeats and merges again. So the order of how arbitrary matrices behave is that these dimensionalities repeat: 1D are Linear Vectors or Matrices, 2D are Area or Planar Matrices, and 3D are Volume Matrices, and anything higher repeats this process, compressing the previous step of Volumes, thus the terminology of Volumetric Matrices. Take a look at this table:
// Order of magnitude and groupings
-----------------------------------
Linear    Area      Volume
x^1       x^2       x^3
x^4       x^5       x^6
x^7       x^8       x^9
x^10      x^11      x^12
...       ...       ...
-----------------------------------
Now it is just a matter of using a little bit of calculus to know which order of magnitude to index into which higher level of dimensionality. Once you know a specific dimension, it is simple to take multiple derivatives to give you a linear expression, then traverse the space, then integrate to the same order as the multiple derivatives to give the results. This should eliminate a good amount of intermediate work by at first ignoring the least significant lower dimensions in a high dimensional order. If you are working in something that has 12 dimensions, you can assume that the first 3 dimensions, which define the first set of volume, are packed tight as an element of another 3D Volumetric Matrix, and then once again that second-order Volumetric Matrix is itself an element of another 3D Volumetric Matrix. Thus we have a repeating pattern, and now it's just a matter of applying this to construct an algorithm; once you have an algorithm, it should be quite easy to implement the methods in any programming language. So you may have to have a 3-case switch to determine which algorithmic approach to use, knowing the overall dimensionality of your matrix or n-d array, where one case handles orders of linearity, another handles area, and the final one handles volumes, and if they are 4th+ then the overall process becomes recursive in nature.
I figured out a way to solve this myself.
The idea is to use a void * pointer: every element of the array lives at some address, so we can compute the target's offset from the base address directly.
In this case, we use void *p = arr to get the base address of the n-d array, and then loop over the array idx to accumulate the offset.
For arr[10][10][10][10], the offset between arr[0] and arr[1] is 10 * 10 * 10 * sizeof(int) bytes: arr is 4-d, so arr[0] and arr[1] are 3-d sub-arrays, and there are 10 * 10 * 10 = 1000 elements between them. Because GCC treats arithmetic on void * as byte-wise (adjacent void * addresses are 1 byte apart), we have to multiply by sizeof(int) to get the correct byte offset. Repeating this for every dimension yields the exact address of the element we want to access.
Finally, we cast the void * pointer to int * and dereference it to get the int value. That's it!
With void *(not so good)
#include <bits/stdc++.h>
using namespace std;
#define N 10
int main()
{
int arr[N][N][N][N] = {0};
int idx[4] = {1, 2, 3, 4};
arr[1][2][3][4] = 1;
cout<<"Expected: "<<arr[1][2][3][4]<<" at "<<&arr[1][2][3][4]<<endl;
cout<<"Got with ****: ";
cout<<*(*(*(*(arr + idx[0]) + idx[1]) + idx[2]) + idx[3])<<endl;
void *p = arr;
for(int i = 0; i < 4; i++)
p += idx[i] * int(pow(10, 3-i)) * sizeof(int);
cout<<"Got with void *:";
cout<<*((int*)p)<<" at "<<p<<endl;
return 0;
}
Output
Expected: 1 at 0x7fff5e3a3f18
Got with ****: 1
Got with void *:1 at 0x7fff5e3a3f18
Notice:
Compiling this produces a warning, which I chose to ignore. Arithmetic on void * is a GNU extension rather than standard C++, so other compilers may reject it.
test.cpp: In function 'int main()':
test.cpp:23:53: warning: pointer of type 'void *' used in arithmetic [-Wpointer-arith]
p += idx[i] * int(pow(10, 3-i)) * sizeof(int);
Use char * instead of void *(better)
Since we want to move the pointer byte by byte, it is better to use char * instead of void *.
#include <bits/stdc++.h>
using namespace std;
#define N 10
int main()
{
int arr[N][N][N][N] = {0};
int idx[4] = {1, 2, 3, 4};
arr[1][2][3][4] = 1;
cout<<"Expected: "<<arr[1][2][3][4]<<" at "<<&arr[1][2][3][4]<<endl;
char *p = (char *)arr;
for(int i = 0; i < 4; i++)
p += idx[i] * int(pow(10, 3-i)) * sizeof(int);
cout<<"Got with char *:";
cout<<*((int*)p)<<" at "<<(void *)p<<endl;
return 0;
}
Output
Expected: 1 at 0x7fff4ffd7f18
Got with char *:1 at 0x7fff4ffd7f18
With int *(In this specific case)
I have been told that doing arithmetic on void * is not good practice and that int * would be better, so I cast arr to an int * pointer and also replaced pow with an integer stride.
#include <bits/stdc++.h>
using namespace std;
#define N 10
int main()
{
int arr[N][N][N][N] = {0};
int idx[4] = {1, 2, 3, 4};
arr[1][2][3][4] = 1;
cout<<"Expected: "<<arr[1][2][3][4]<<" at "<<&arr[1][2][3][4]<<endl;
cout<<"Got with ****: ";
cout<<*(*(*(*(arr + idx[0]) + idx[1]) + idx[2]) + idx[3])<<endl;
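int *p = (int *)arr;
int offset = N * N * N; // stride of the first dimension, in elements
for(int i = 0; i < 4; i++)
{
p += idx[i] * offset;
offset /= N; // each following dimension has a stride N times smaller
}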
cout<<"Got with int *:";
cout<<*p<<" at "<<p<<endl;
return 0;
}
Output
Expected: 1 at 0x7fff5eaf9f08
Got with ****: 1
Got with int *:1 at 0x7fff5eaf9f08
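The int * version generalises to any number of dimensions if the element offset is accumulated with Horner's scheme instead of a hard-coded stride. The following is only a sketch: flatIndex is a name I made up, and it assumes the data lives in one flat, row-major buffer (for example a std::vector<int>) rather than in a built-in int[N][N][N][N].

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Row-major offset of the element at `idx` in an array with extents `dims`,
// accumulated with Horner's scheme: ((i0 * d1 + i1) * d2 + i2) * d3 + i3 ...
std::size_t flatIndex(const std::vector<std::size_t>& dims,
                      const std::vector<std::size_t>& idx) {
    assert(dims.size() == idx.size());
    std::size_t offset = 0;
    for (std::size_t k = 0; k < dims.size(); ++k) {
        assert(idx[k] < dims[k]);  // each index must be inside its extent
        offset = offset * dims[k] + idx[k];
    }
    return offset;
}
```

For dims = {10, 10, 10, 10} and idx = {1, 2, 3, 4} this gives 1234, the same element offset that the loop above reaches by adding 1*1000 + 2*100 + 3*10 + 4.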
Assume we want to translate a point p(1, 2, 3, w=1) with a vector v(a, b, c, w=0) to a new point p'
Note: w=0 represents a vector and w=1 represents a point in OpenGL, please correct me if I'm wrong.
By the definition of an affine transformation, we have:
p + v = p'
=> p(1, 2, 3, 1) + v(a, b, c, 0) = p(1 + a, 2 + b, 3 + c, 1)
=> point + vector = point (everything works as expected)
In OpenGL, the translation matrix is as following:
1 0 0 a
0 1 0 b
0 0 1 c
0 0 0 1
I assume (a, b, c, 1) is the vector from the affine transformation definition.
Why do we have w=1 rather than w=0, such as
1 0 0 a
0 1 0 b
0 0 1 c
0 0 0 0
Note: w=0 represents a vector and w=1 represents a point in OpenGL, please correct me if I'm wrong.
You are wrong. First of all, this doesn't really have anything to do with OpenGL. This is about homogeneous coordinates, which is a purely mathematical concept. It works by embedding an n-dimensional vector space into an (n+1)-dimensional vector space. In the 3D case, we use 4D homogeneous coordinates, with the definition that the homogeneous vector (x, y, z, w) represents the 3D point (x/w, y/w, z/w) in Cartesian coordinates.
As a result, for any w != 0 you get a certain finite point, and for w = 0 you are describing an infinitely far away point in a specific direction. This means that homogeneous coordinates are more powerful in that they can describe infinitely far away points with finite coordinates (which comes in very handy for perspective transformations, where infinitely far away points are mapped to finite points, and vice versa).
You can, as a shortcut, imagine (x,y,z,0) as some direction vector. But a point is not just w=1; it can have any w value other than 0. Conceptually, this means that any Cartesian 3D point is represented by a line in homogeneous space (we did go up one dimension, so this actually makes sense).
I assume (a, b, c, 1) is the vector from the affine transformation definition. Why do we have w=1 rather than w=0?
Your assumption is wrong. One thing about homogeneous coordinates is that we do not apply a translation in the 4D space. We get the effect of the translation in the 3D space by actually doing a shearing operation in 4D space.
So what we really want to do in homogeneous space is
(x + w*a, y + w*b, z + w*c, w)
since the 3D interpretation of the resulting vector will then be
(x + w*a) / w == x/w + a
(y + w*b) / w == y/w + b
(z + w*c) / w == z/w + c
which will represent the translation that we were after.
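This is easy to check numerically. Below is a small sketch (Vec4, Mat4, translation, apply and toCartesian are names I picked for illustration, with plain arrays standing in for whatever math library you use). It builds the translation matrix from the question and applies it to homogeneous vectors with different w.

```cpp
#include <array>
#include <cassert>

using Vec4 = std::array<double, 4>;
using Mat4 = std::array<std::array<double, 4>, 4>;  // m[row][col]

// Translation by (a, b, c): identity with the offset in the last column.
Mat4 translation(double a, double b, double c) {
    Mat4 m{};  // all zeros
    for (int i = 0; i < 4; ++i) m[i][i] = 1.0;
    m[0][3] = a;
    m[1][3] = b;
    m[2][3] = c;
    return m;
}

// Matrix * column vector.
Vec4 apply(const Mat4& m, const Vec4& v) {
    Vec4 r{};
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            r[i] += m[i][j] * v[j];
    return r;
}

// Homogeneous (x, y, z, w) -> Cartesian (x/w, y/w, z/w); only valid for w != 0.
std::array<double, 3> toCartesian(const Vec4& v) {
    assert(v[3] != 0.0);
    return {v[0] / v[3], v[1] / v[3], v[2] / v[3]};
}
```

With translation(10, 20, 30), both (1, 2, 3, 1) and (2, 4, 6, 2) describe the Cartesian point (1, 2, 3), and both end up at the Cartesian point (11, 22, 33) after the multiplication. The direction (1, 2, 3, 0) comes out unchanged.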
So to try to make this even more clear:
What you wrote in your question:
p(1, 2, 3, 1) + v(a, b, c, 0) = p(1 + a, 2 + b, 3 + c, 1)
is explicitly not what we want to do. What you describe is an affine translation with respect to the 4D vector space.
But what we actually want is a translation in the 3D Cartesian coordinates, so
(1, 2, 3) + (a, b, c) = (1 + a, 2 + b, 3 + c)
Applying your formula would actually mean doing a translation in the homogeneous space, which would have the effect of a 3D translation that depends on the w coordinate (the point moves by (a, b, c) / w), while the formula I gave will always translate the point by (a, b, c), no matter what w we chose for the point.
This is of course not true if we chose w=0. Then we get no change at all, which is also correct, because a translation never changes directions, whereas your formula would change the direction. Your formula is correct only for w=1, which is only a special case. But the key point here is that we are not doing a vector addition after all, but a matrix * vector multiplication. And homogeneous coordinates just allow us (among other, more powerful things) to represent a translation via matrix multiplication. But this does not mean that we can just interpret the last column as a translation vector as if we did vector addition.
Simple Answer
The reason is the way matrix multiplication works. If you multiply a matrix by a vector, the w-component of the result is the inner product of the 4th row of the matrix with the vector. After applying the transformation, a point should still be a point and a direction should still be a direction. If you set that row to the zero vector, the resulting w will always be 0, and thus every transformed vector will have changed from a position (w=1) to a direction (w=0).
More detailed answer
The definition of an affine transformation is:
x' = A * x + t,
where A is a linear map and t a translation. Traditionally, linear maps are written by mathematicians in matrix form. Note that t is here, like x, a 3-dimensional vector. It would be cumbersome (and less general, thinking of projective mappings) if we always had to handle the linear mapping matrix and the translation vector separately. This can be solved by introducing an additional dimension to the mapping, the so-called homogeneous coordinate, which allows us to store the linear mapping as well as the translation vector in a combined 4x4 matrix. This is called the augmented matrix and, by definition,
[ x' ]   [ A | t ]   [ x ]
[    ] = [---+---] * [   ]
[ 1  ]   [ 0 | 1 ]   [ 1 ]
It should also be noted that affine transformations can now be combined very easily by just multiplying their augmented matrices, which would be hard to do in matrix plus vector notation.
One should also note that the bottom-right 1 is not part of the translation vector, which is still 3-dimensional, but of the matrix augmentation.
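As a quick illustration of the "combine by multiplying" point, here is a sketch with made-up helper names (scaleThenTranslate, multiply, apply). For simplicity the linear part A is restricted to a uniform scale. Applying two affine maps one after the other gives the same result as applying the single product matrix.

```cpp
#include <array>
#include <cassert>  // only needed for the sanity checks mentioned after the block

using Vec4 = std::array<double, 4>;
using Mat4 = std::array<std::array<double, 4>, 4>;  // m[row][col]

// Augmented matrix [A | t; 0 | 1] with A = s * I (uniform scale) and translation t.
Mat4 scaleThenTranslate(double s, double tx, double ty, double tz) {
    Mat4 m{};
    m[0][0] = m[1][1] = m[2][2] = s;
    m[0][3] = tx;
    m[1][3] = ty;
    m[2][3] = tz;
    m[3][3] = 1.0;  // augmentation entry, not part of t
    return m;
}

// Matrix * matrix.
Mat4 multiply(const Mat4& a, const Mat4& b) {
    Mat4 c{};
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            for (int k = 0; k < 4; ++k)
                c[i][j] += a[i][k] * b[k][j];
    return c;
}

// Matrix * column vector.
Vec4 apply(const Mat4& m, const Vec4& v) {
    Vec4 r{};
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            r[i] += m[i][j] * v[j];
    return r;
}
```

With first = scaleThenTranslate(2, 1, 0, 0) and second = scaleThenTranslate(3, 0, 5, 0), the point (1, 1, 1, 1) maps to (9, 11, 6, 1) either way: apply(second, apply(first, p)) equals apply(multiply(second, first), p). Note the order, since the matrix applied first sits on the right.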
You might also want to read the section about "Augmented matrix" here: https://en.wikipedia.org/wiki/Affine_transformation#Augmented_matrix
Okay, so I'm implementing an algorithm that calculates the determinant of a 3x3 matrix given by the following placements:
A = [0,0 0,1 0,2
1,0 1,1 1,2
2,0 2,1 2,2]
Currently, the algorithm is like so:
float a1 = A[0][0];
float calcula1 = (A[1][1] * A[2][2]) - (A[2][1] * A[1][2]);
Then we move over to the next column, so it would be:
float a2 = A[0][1];
float calcula2 = (A[1][0] * A[2][2]) - (A[2][0] * A[1][2]);
And so on, moving across one more column. Now, this does not seem very efficient to me, and I've already implemented a function that calculates the determinant of a 2x2 matrix, which is basically what I'm doing for each of these calculations.
My question is therefore, is there an optimal way that I can do this? I've thought about the idea of having a function, that invokes a template (X, Y) which denotes the start and ending positions of the particular block of the 3x3 matrix:
template<typename X, Y>
float det(std::vector<Vector> data)
{
//....
}
But I have no idea whether this is the way to do it, or how I would access the different elements with an approach like this.
You could hardcode the rule of Sarrus like so if you're exclusively dealing with 3 x 3 matrices.
float det_3_x_3(float** A) {
return A[0][0]*A[1][1]*A[2][2] + A[0][1]*A[1][2]*A[2][0]
+ A[0][2]*A[1][0]*A[2][1] - A[2][0]*A[1][1]*A[0][2]
- A[2][1]*A[1][2]*A[0][0] - A[2][2]*A[1][0]*A[0][1];
}
If you want to save 3 multiplications, you can go
float det_3_x_3(float** A) {
return A[0][0] * (A[1][1]*A[2][2] - A[2][1]*A[1][2])
+ A[0][1] * (A[1][2]*A[2][0] - A[2][2]*A[1][0])
+ A[0][2] * (A[1][0]*A[2][1] - A[2][0]*A[1][1]);
}
I expect this second function is pretty close to what you have already.
Since you need all those numbers to calculate the determinant and thus have to access each of them at least once, I doubt there's anything faster than this. Determinants aren't exactly pretty, computationally. Faster algorithms than the brute force approach (which the rule of Sarrus basically is) require you to transform the matrix first, and that'll eat more time for 3 x 3 matrices than just doing the above would. Hardcoding the Leibniz formula - which is all that the rule of Sarrus amounts to - is not pretty, but I expect it's the fastest way to go if you don't have to do any determinants for n > 3.
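If you would rather reuse your existing 2x2 determinant than hardcode everything, a cofactor expansion along the first row does the same 9 multiplications. This is just a sketch: Mat3, det2 and det3 are names I invented, and I use std::array instead of float** so that the size is fixed at compile time.

```cpp
#include <array>
#include <cassert>  // only needed for the sanity checks mentioned after the block

using Mat3 = std::array<std::array<float, 3>, 3>;  // A[row][col]

// Determinant of the 2x2 block [[a, b], [c, d]].
float det2(float a, float b, float c, float d) {
    return a * d - b * c;
}

// Cofactor expansion along the first row, reusing det2 for each minor.
// Note the alternating sign on the middle term.
float det3(const Mat3& A) {
    return A[0][0] * det2(A[1][1], A[1][2], A[2][1], A[2][2])
         - A[0][1] * det2(A[1][0], A[1][2], A[2][0], A[2][2])
         + A[0][2] * det2(A[1][0], A[1][1], A[2][0], A[2][1]);
}
```

As a sanity check, the identity matrix gives 1 and the matrix with rows (1, 2, 3), (4, 5, 6), (7, 8, 10) gives -3.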
The documentation for HLSL's mul(x, y) says of its parameters that
if x is a vector, it is treated as a row vector.
if y is a vector, it is treated as a column vector.
Does it then follow that:
a.
if x is a vector, y is treated as a row-major matrix
if y is a vector, x is treated as a column-major matrix
b.
since ID3DXBaseEffect::SetMatrix() passes in a row-major matrix, I'd use the matrix passed into the shader in the following order:
e.g. Output.mPosition = mul( Input.mPosition, SetMatrix()value ); ?
I'm just starting out with shaders and am currently relearning my matrix math. It would be nice if someone could clarify this.
No. The terms "row-major" and "column-major" refer purely to the order of storage of the matrix components in memory. They have nothing to do with the order of multiplication of matrices and vectors. In fact, the D3D9 HLSL mul call interprets matrix arguments as column-major in all cases. The ID3DXBaseEffect::SetMatrix() call interprets its matrix argument as row-major, and transposes behind the scenes to mul's expected column-major order.
If you have a matrix that abstractly looks like this:
[ a b c d ]
[ e f g h ]
[ i j k l ]
[ m n o p ]
then when stored in row-major order, its memory looks like this:
a b c d e f g h i j k l m n o p
i.e. the elements of a row are all contiguous in memory. If stored in column-major order, its memory would look like this:
a e i m b f j n c g k o d h l p
with the elements of a column all contiguous. However, this has precisely zero effect on which element is which. Element b is still in the first row and second column, either way. The labeling of the elements has not changed, only the way they're mapped to memory.
If you declare an array like float matrix[rows][cols] in C, then you are using row-major storage. However, some other languages, like FORTRAN, use column-major storage for their multidimensional arrays by default; and OpenGL also uses column-major storage.
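In code, the two storage orders differ only in how (row, column) is turned into an offset into the flat buffer. A tiny sketch, with function names of my own choosing:

```cpp
#include <cassert>
#include <cstddef>

// Offset of element (row, col) when rows are contiguous in memory.
std::size_t rowMajorIndex(std::size_t row, std::size_t col, std::size_t cols) {
    assert(col < cols);
    return row * cols + col;
}

// Offset of element (row, col) when columns are contiguous in memory.
std::size_t colMajorIndex(std::size_t row, std::size_t col, std::size_t rows) {
    assert(row < rows);
    return col * rows + row;
}
```

For the 4x4 example above, element b is at (row 0, column 1): rowMajorIndex(0, 1, 4) is 1 and colMajorIndex(0, 1, 4) is 4, which are exactly the positions of b in the two memory listings.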
Now, entirely separately, there is another choice of convention, which is whether to use row-vector or column-vector math. This has nothing at all to do with the memory layout of matrices, but it affects how you build your matrices, and the order of multiplication. If you use row vectors, you'll do vector-matrix multiplication:
            [ a b c d ]
[x y z w] * [ e f g h ] = [x*a + y*e + z*i + w*m, ... ]
            [ i j k l ]
            [ m n o p ]
and if you use column vectors, then you'll do matrix-vector multiplication:
[ a b c d ]   [ x ]
[ e f g h ] * [ y ] = [x*a + y*b + z*c + w*d, ... ]
[ i j k l ]   [ z ]
[ m n o p ]   [ w ]
This is because in row-vector math, a vector is really a 1×n matrix (a single row), and in column-vector math it's an n×1 matrix (a single column), and the rule about what sizes of matrices are allowed to be multiplied together determines the order. (You can't multiply a 4×4 matrix by a 1×4 matrix, but you can multiply a 4×4 matrix with a 4×1 one.)
Note that the matrix didn't change between the two equations above; only the interpretation of the vector changed.
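The two conventions can be written out as plain loops. This is an illustrative C++ sketch, not HLSL, and the names are mine: mulRowVector plays the role of mul(v, M) and mulColumnVector the role of mul(M, v). It also shows the usual way to switch conventions, which is to transpose the matrix.

```cpp
#include <array>
#include <cassert>  // only needed for the sanity checks mentioned after the block

using Vec4 = std::array<float, 4>;
using Mat4 = std::array<std::array<float, 4>, 4>;  // m[row][col]

// Row-vector convention (v * M): result[j] = sum over i of v[i] * m[i][j].
Vec4 mulRowVector(const Vec4& v, const Mat4& m) {
    Vec4 r{};
    for (int j = 0; j < 4; ++j)
        for (int i = 0; i < 4; ++i)
            r[j] += v[i] * m[i][j];
    return r;
}

// Column-vector convention (M * v): result[i] = sum over j of m[i][j] * v[j].
Vec4 mulColumnVector(const Mat4& m, const Vec4& v) {
    Vec4 r{};
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            r[i] += m[i][j] * v[j];
    return r;
}

// Swaps rows and columns.
Mat4 transpose(const Mat4& m) {
    Mat4 t{};
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            t[j][i] = m[i][j];
    return t;
}
```

With the matrix entries a..p set to 1..16 and v = (1, 2, 3, 4), the first component is 90 for the row-vector product and 30 for the column-vector product, matching the two formulas above. mulRowVector(v, m) always equals mulColumnVector(transpose(m), v).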
So, to get back to your original question:
When you pass a vector to HLSL's mul, it automatically interprets it "correctly" according to which argument it is. If the vector is on the left, it's a row vector, and if it's on the right, it's a column vector.
However, the matrix gets interpreted the same way always. A matrix is a matrix, regardless of whether it's being multiplied with a row vector on the left or a column vector on the right. You can freely decide whether to use row-vector or column-vector math in your code, as long as you're consistent about it. HLSL is agnostic on this point, although the D3DX math library uses row vectors.
And it turns out that for some reason, in D3D9 HLSL, mul always expects matrices to be stored in column-major order. However, the D3DX math library stores matrices in row-major order, and as the documentation says, ID3DXBaseEffect::SetMatrix() expects its input in row-major order. It does a transpose behind the scenes to prepare the matrix for use with mul.
BTW, D3D11 HLSL defaults to column-major order, but allows you to use a compiler directive to tell it to use row-major order instead. It is still agnostic as to row-vector versus column-vector math. And OpenGL GLSL also uses column-major order, but does not (as far as I know) provide a way to change it.
Further reading on these issues:
A word on Matrices by Catalin Zima
Row major vs. column major, row vectors vs. column vectors by Fabian Giesen
Yes. If x is a vector, then x is treated as a row vector and y as a row-major matrix, and vice versa for column-major. So for a row-major matrix system:
float4 transformed = mul(position, world);
and for column-major:
float4 transformed = mul(world, position);
Because of the way that matrix multiplication works, if the matrix is column-major then you must post multiply by a column vector to get the correct result. If the matrix is row-major you must pre multiply by a row vector.
So really, HLSL doesn't care whether your matrix is row- or column-major; it is up to you to apply the vector multiplication in the correct order to get the correct result.
Is there a function in LAPACK that will give me the elements of a particular submatrix? If so, what is the syntax in C++?
Or do I need to code it up?
There is no function for accessing a submatrix. However, because of the way matrix data is stored in LAPACK routines, you don't need one. This saves a lot of copying, and the data layout was (partially) chosen for this reason:
Recall that a dense (i.e., not banded, triangular, hermitian, etc) matrix in LAPACK is defined by four values:
a pointer to the top left corner of the matrix
the number of rows in the matrix
the number of columns in the matrix
the "leading dimension" of the matrix; this is the distance in memory between adjacent elements of a row (LAPACK stores matrices in column-major order).
Most of the time, most people only ever use a leading dimension that is equal to the number of rows; a 3x3 matrix is typically stored like so:
a[0] a[3] a[6]
a[1] a[4] a[7]
a[2] a[5] a[8]
Suppose instead that we wanted a 3x3 submatrix of a huge matrix with leading dimension lda. Suppose we specifically want the 3x3 submatrix whose top-left corner is located at a(15,42):
          .            .            .
          .            .            .
... a[15+42*lda] a[15+43*lda] a[15+44*lda] ...
... a[16+42*lda] a[16+43*lda] a[16+44*lda] ...
... a[17+42*lda] a[17+43*lda] a[17+44*lda] ...
          .            .            .
          .            .            .
We could copy this 3x3 matrix into contiguous storage, but if we want to pass it as an input (or output) matrix to an LAPACK routine, we don't need to; we only need to define the parameters appropriately. Let's call this submatrix b; we then define:
// pointer to the top-left corner of b:
float *b = &a[15 + 42*lda];
// number of rows in b:
const int nb = 3;
// number of columns in b:
const int mb = 3;
// leading dimension of b:
const int ldb = lda;
The only thing that might be surprising is the value of ldb; by using the value lda of the "big matrix", we can address the submatrix without copying, and operate on it in-place.
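The same idea can be tried out without linking LAPACK at all. The sketch below uses a hypothetical MatrixView struct of my own (not a LAPACK type) that carries the same four values, and shows that taking a submatrix only changes the pointer and the extents while the leading dimension stays that of the big matrix.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical non-owning view of a column-major matrix, described the way
// LAPACK describes one: pointer to top-left element, rows, cols, leading dimension.
struct MatrixView {
    float* data;
    std::size_t rows;
    std::size_t cols;
    std::size_t ld;

    // Element (i, j): columns are ld elements apart in memory.
    float& at(std::size_t i, std::size_t j) const {
        assert(i < rows && j < cols);
        return data[i + j * ld];
    }

    // Sub-view with top-left corner (i, j) and size r x c; nothing is copied.
    MatrixView sub(std::size_t i, std::size_t j, std::size_t r, std::size_t c) const {
        assert(i + r <= rows && j + c <= cols);
        return {data + i + j * ld, r, c, ld};
    }
};
```

For a big matrix with lda rows, big.sub(15, 42, 3, 3) yields a view whose data pointer is &a[15 + 42*lda] and whose ld is still lda. Writing through the view's at(1, 2) changes a[16 + 44*lda] in the original storage.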
However
I lied (sort of). Sometimes you really can't operate on a submatrix in place, and genuinely need to copy it. I didn't want to talk about that, because it's rare, and you should use in-place operations whenever possible, but I would feel bad not telling you that it is possible. The routine:
SLACPY(UPLO,M,N,A,LDA,B,LDB)
copies the MxN matrix whose top-left corner is A and is stored with leading dimension LDA to the MxN matrix whose top-left corner is B and has leading dimension LDB. The UPLO parameter indicates whether to copy the upper triangle, lower triangle, or the whole matrix.
In the example I gave above, you would use it like this (assuming the clapack bindings, where the routine is exposed as slacpy_ and every argument is passed by pointer; the exact integer type depends on the binding, and lda is assumed to be an integer variable):
...
char uplo = 'A'; // anything except 'U' or 'L' means "copy everything"
int m = 3;
int n = 3;
float b[9];
int ldb = 3;
slacpy_(&uplo, // which part of the matrix to copy
&m, // number of rows to copy
&n, // number of columns to copy
&a[15 + 42*lda], // pointer to top-left element to copy
&lda, // leading dimension of a (something huge)
b, // pointer to top-left element of destination
&ldb); // leading dimension of b (== m, so storage is dense)
...