Linear Systems and Matrix Analysis

Systems of linear algebraic equations arise in many areas of scientific computing, such as signal processing and computational fluid dynamics. Such systems occur naturally, or they result from approximating differential equations by systems of algebraic equations.

Types of Matrices

Whatever the application, it is necessary to find an accurate solution to the system of equations efficiently. In matrix-vector notation, such a system of linear algebraic equations has the following form:

Ax = b  (A)

where A is an n × n matrix, b is a given vector consisting of n elements, and x is the unknown solution vector to be determined.
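
For instance, the following is a minimal NumPy sketch of solving such a system; the particular 3 × 3 system is an arbitrary illustrative example, not one taken from this section.

    import numpy as np

    # A hypothetical 3 x 3 system Ax = b
    A = np.array([[4.0, 1.0, 2.0],
                  [1.0, 3.0, 0.0],
                  [2.0, 0.0, 5.0]])
    b = np.array([7.0, 5.0, 9.0])

    x = np.linalg.solve(A, b)     # solves Ax = b via LU factorization
    print(x)
    print(np.allclose(A @ x, b))  # True: the residual is negligible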

A matrix is a 2D array of elements with m rows and n columns. The elements in the 2D array might be real numbers, complex numbers, functions, or operators. The matrix A shown below is an array of m rows and n columns with m × n elements.

$$A = \begin{bmatrix} a_{1,1} & a_{1,2} & \cdots & a_{1,n} \\ a_{2,1} & a_{2,2} & \cdots & a_{2,n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m,1} & a_{m,2} & \cdots & a_{m,n} \end{bmatrix} \qquad (B)$$

Here, $a_{i,j}$ denotes the (i, j)th element, located in the ith row and the jth column. In general, such a matrix is a rectangular matrix. When m = n, so that the number of rows equals the number of columns, the matrix is a square matrix. An m × 1 matrix (m rows and one column) is a column vector. A row vector is a 1 × n matrix (one row and n columns). If all the elements other than the diagonal elements are zero, that is, $a_{i,j} = 0$ for i ≠ j, the matrix is a diagonal matrix. For example,

$$\begin{bmatrix} d_{1} & 0 & 0 \\ 0 & d_{2} & 0 \\ 0 & 0 & d_{3} \end{bmatrix} \qquad (C)$$

is a diagonal matrix, for any values $d_1, d_2, d_3$. A diagonal matrix with all the diagonal elements equal to one is an identity matrix, also known as the unit matrix. If all the elements below the main diagonal are zero, the matrix is an upper triangular matrix. Conversely, if all the elements above the main diagonal are zero, the matrix is a lower triangular matrix. When all the elements are real numbers, the matrix is a real matrix. When at least one element is a complex number, the matrix is a complex matrix.
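
The NumPy sketch below constructs several of these special matrices; the specific element values are illustrative only.

    import numpy as np

    M = np.array([[1.0, 2.0, 3.0],
                  [4.0, 5.0, 6.0],
                  [7.0, 8.0, 9.0]])

    I3 = np.eye(3)                 # 3 x 3 identity (unit) matrix
    D = np.diag([1.0, 2.0, 3.0])   # diagonal matrix built from its diagonal elements
    U = np.triu(M)                 # upper triangular: zeros below the main diagonal
    L = np.tril(M)                 # lower triangular: zeros above the main diagonal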

Determinant of a Matrix

One of the most important attributes of a matrix is its determinant. In the simplest case, the determinant of a 2 × 2 matrix

$$A = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \qquad (D)$$

is given by ad − bc. The determinant of a larger square matrix can be computed by a cofactor expansion along any row or column, which reduces it to determinants of smaller submatrices. For example, if

$$A = \begin{bmatrix} 2 & 5 & 3 \\ 6 & 1 & 7 \\ 1 & 6 & 9 \end{bmatrix} \qquad (E)$$

then the determinant of A, denoted by |A|, is

$$|A| = 2\begin{vmatrix} 1 & 7 \\ 6 & 9 \end{vmatrix} - 5\begin{vmatrix} 6 & 7 \\ 1 & 9 \end{vmatrix} + 3\begin{vmatrix} 6 & 1 \\ 1 & 6 \end{vmatrix} = 2(-33) - 5(47) + 3(35) = -196$$

The determinant of a diagonal matrix, an upper triangular matrix, or a lower triangular matrix is the product of its diagonal elements.

The determinant reveals many important properties of the matrix. For example, if the determinant of a matrix is zero, the matrix is singular. The matrix A above, whose determinant is nonzero, is therefore nonsingular.
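
The following NumPy sketch checks these facts numerically, using an arbitrary triangular matrix as an illustration.

    import numpy as np

    T = np.array([[2.0, 5.0, 3.0],
                  [0.0, 4.0, 7.0],
                  [0.0, 0.0, 6.0]])

    # For a triangular matrix, the determinant is the product of the diagonal
    print(np.linalg.det(T))    # 48.0, up to floating-point roundoff
    print(2.0 * 4.0 * 6.0)     # 48.0

    # A zero determinant signals a singular matrix
    S = np.array([[1.0, 2.0],
                  [2.0, 4.0]])  # second row is twice the first
    print(np.linalg.det(S))     # 0.0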

Transpose of a Matrix

The transpose of a real matrix is formed by interchanging its rows and columns. If the matrix B represents the transpose of A, denoted by $A^{T}$, then $b_{j,i} = a_{i,j}$. For the matrix A defined above,

$$A^{T} = \begin{bmatrix} 2 & 6 & 1 \\ 5 & 1 & 6 \\ 3 & 7 & 9 \end{bmatrix} \qquad (F)$$

In the case of complex matrices, we define the complex conjugate transpose (if a = x + iy, then the complex conjugate is a* = x − iy). If the matrix D represents the complex conjugate transpose of a complex matrix C, then

$$D = C^{H}, \qquad d_{j,i} = c_{i,j}^{*} \qquad (G)$$

That is, the matrix D is obtained by replacing every element in C by its complex conjugate and then interchanging the rows and columns of the resulting matrix.

A real matrix is a symmetric matrix if its transpose equals the matrix itself, that is, $A = A^{T}$. The example matrix A is not a symmetric matrix. If a complex matrix C satisfies the relation $C = C^{H}$, C is a Hermitian matrix.
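
A brief NumPy sketch of both operations, using arbitrary example matrices:

    import numpy as np

    R = np.array([[1.0, 2.0],
                  [3.0, 4.0]])
    print(R.T)                  # transpose: rows and columns interchanged

    C = np.array([[1 + 0j, 2 - 1j],
                  [2 + 1j, 3 + 0j]])
    CH = C.conj().T             # complex conjugate (Hermitian) transpose

    print(np.allclose(R, R.T))  # False: R is not symmetric
    print(np.allclose(C, CH))   # True: this C is Hermitian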

Linear Independence

A set of vectors $x_1, x_2, \ldots, x_n$ is linearly dependent if there exist scalars $\alpha_1, \alpha_2, \ldots, \alpha_n$, not all zero, such that

$$\alpha_1 x_1 + \alpha_2 x_2 + \cdots + \alpha_n x_n = 0 \qquad (H)$$

In simpler terms, if one of the vectors can be written as a linear combination of the others, the vectors are linearly dependent.

If the only set of $\alpha_i$ for which Equation H holds is $\alpha_1 = \alpha_2 = \cdots = \alpha_n = 0$, the set of vectors $x_1, x_2, \ldots, x_n$ is linearly independent. In this case, none of the vectors can be written as a linear combination of the others. Given any set of vectors, Equation H always holds for $\alpha_1 = \alpha_2 = \cdots = \alpha_n = 0$. Therefore, to show the linear independence of the set, you must show that this trivial choice is the only set of $\alpha_i$ for which Equation H holds.

For example, first consider the vectors

$$x = \begin{bmatrix} 1 \\ 2 \end{bmatrix}, \qquad y = \begin{bmatrix} 3 \\ 4 \end{bmatrix} \qquad (I)$$

$\alpha_1 = 0$ and $\alpha_2 = 0$ are the only values for which the relation $\alpha_1 x + \alpha_2 y = 0$ holds. Therefore, these two vectors are linearly independent of each other. Now consider the vectors

$$x = \begin{bmatrix} 1 \\ 2 \end{bmatrix}, \qquad y = \begin{bmatrix} 2 \\ 4 \end{bmatrix} \qquad (J)$$

If $\alpha_1 = -2$ and $\alpha_2 = 1$, then $\alpha_1 x + \alpha_2 y = 0$. Therefore, these two vectors are linearly dependent on each other. You must understand this definition of linear independence to fully appreciate the concept of the rank of a matrix.
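
A practical way to test such pairs numerically is to check the rank of the matrix whose columns are the vectors, as in this NumPy sketch using the example pairs above.

    import numpy as np

    # Columns are the vectors under test
    independent = np.array([[1.0, 3.0],
                            [2.0, 4.0]])
    dependent = np.array([[1.0, 2.0],
                          [2.0, 4.0]])   # second column is twice the first

    print(np.linalg.matrix_rank(independent))  # 2: columns are linearly independent
    print(np.linalg.matrix_rank(dependent))    # 1: columns are linearly dependent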

Matrix Rank

The rank of a matrix A, denoted by ρ(A), is the maximum number of linearly independent columns in A. If you look at the example matrix A, you find that all the columns of A are linearly independent of each other. That is, none of the columns can be obtained by forming a linear combination of the other columns. Hence, the rank of the matrix is 3. Consider one more example matrix, B, where

$$B = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 9 \\ 7 & 8 & 15 \end{bmatrix} \qquad (K)$$

This matrix has only two linearly independent columns, because the third column of B is the sum of the first two and is therefore linearly dependent on them. Hence, the rank of this matrix is 2. It can be shown that the number of linearly independent columns of a matrix equals the number of linearly independent rows. So the rank can never be greater than the smaller dimension of the matrix. Consequently, if A is an n × m matrix, then

ρ(A) ≤ min(n, m) (L)

where min denotes the minimum of the two numbers. In matrix theory, the rank of a square matrix is the order of the largest nonsingular submatrix that can be formed from it. A matrix is singular if its determinant is zero, so the rank is the order of the largest submatrix you can obtain whose determinant is nonzero. For example, consider the 4 × 4 matrix

$$B = \begin{bmatrix} 1 & 2 & 3 & 4 \\ 0 & 1 & 2 & 3 \\ 0 & 0 & 1 & 2 \\ 1 & 3 & 6 & 9 \end{bmatrix} \qquad (M)$$

For this matrix, det(B) = 0, but

$$\begin{vmatrix} 1 & 2 & 3 \\ 0 & 1 & 2 \\ 0 & 0 & 1 \end{vmatrix} = 1 \neq 0 \qquad (N)$$

Hence, the rank of B is 3. A square matrix has full rank if and only if its determinant is nonzero. Matrix B is not a full-rank matrix.
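
The NumPy sketch below verifies the rank computations for the two example matrices above.

    import numpy as np

    K = np.array([[1.0, 2.0, 3.0],
                  [4.0, 5.0, 9.0],
                  [7.0, 8.0, 15.0]])   # third column = first + second
    print(np.linalg.matrix_rank(K))    # 2

    B = np.array([[1.0, 2.0, 3.0, 4.0],
                  [0.0, 1.0, 2.0, 3.0],
                  [0.0, 0.0, 1.0, 2.0],
                  [1.0, 3.0, 6.0, 9.0]])  # fourth row = sum of the first three
    print(np.linalg.det(B))            # ~0: B is singular
    print(np.linalg.matrix_rank(B))    # 3: B is not full rank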

Magnitude (Norms) of Matrices

You must develop a notion of the magnitude of vectors and matrices to measure errors and sensitivity in solving linear systems of equations. Such systems arise, for example, in control systems and computational fluid dynamics. In two dimensions, you cannot directly compare two vectors $x = [x_1 \; x_2]$ and $y = [y_1 \; y_2]$, because you might have $x_1 > y_1$ but $x_2 < y_2$. A vector norm is a way to assign a scalar magnitude to these vectors so that they can be compared with each other. It is similar to the concept of magnitude, modulus, or absolute value for scalar numbers.

There are several ways to compute the norm of a matrix. These include the 2-norm (Euclidean norm), the 1-norm, the Frobenius norm (F-norm), and the infinity norm (inf-norm). Each norm has its own physical interpretation. Consider a unit ball containing the origin. The Euclidean norm of a vector is simply the factor by which the ball must be expanded or shrunk to encompass the given vector exactly, as shown in the following figure:

Part a of the previous figure shows a unit ball of radius 1. Part b shows a vector x, whose length is $\|x\|_2 = \sqrt{x_1^2 + x_2^2}$. As shown in part c, the unit ball must be expanded by a factor of $\|x\|_2$ before it can exactly encompass the given vector. Hence, the Euclidean norm of the vector is $\|x\|_2$.
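
As a quick NumPy check of this interpretation, for an arbitrary example vector:

    import numpy as np

    x = np.array([1.0, 2.0])      # an arbitrary example vector
    print(np.linalg.norm(x, 2))   # 2.236... = sqrt(1^2 + 2^2) = sqrt(5)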

The norm of a matrix is defined in terms of an underlying vector norm. It is the maximum relative stretching that the matrix applies to any vector. With the vector 2-norm, the unit ball expands by a factor equal to the norm. With the matrix 2-norm, however, the unit ball might be mapped to an ellipse (an ellipsoid in three dimensions), with some axes longer than others. The longest axis determines the norm of the matrix.

Some matrix norms are much easier to compute than others. The 1-norm is obtained by finding the sum of the absolute values of the elements in each column of the matrix. The largest of these sums is the 1-norm. In mathematical terms, the 1-norm is the maximum absolute column sum of the matrix:

$$\|A\|_1 = \max_{1 \le j \le n} \sum_{i=1}^{m} |a_{i,j}| \qquad (O)$$

For example, if

$$A = \begin{bmatrix} 1 & 3 \\ 2 & 4 \end{bmatrix} \qquad (P)$$

then

$\|A\|_1 = \max(3, 7) = 7$  (Q)

The inf-norm of a matrix is the maximum absolute row sum of the matrix.

$$\|A\|_{\infty} = \max_{1 \le i \le m} \sum_{j=1}^{n} |a_{i,j}| \qquad (R)$$

In this case, you add the magnitudes of the elements in each row of the matrix. The maximum of these sums is the inf-norm. For the Equation P example matrix,

$\|A\|_{\infty} = \max(4, 6) = 6$  (S)

The 2-norm is the most difficult to compute because it is given by the largest singular value of the matrix.
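
The NumPy sketch below computes all four matrix norms for the Equation P example matrix and confirms that the 2-norm equals the largest singular value.

    import numpy as np

    A = np.array([[1.0, 3.0],
                  [2.0, 4.0]])

    print(np.linalg.norm(A, 1))       # 7.0: maximum absolute column sum
    print(np.linalg.norm(A, np.inf))  # 6.0: maximum absolute row sum
    print(np.linalg.norm(A, 'fro'))   # Frobenius norm: sqrt of the sum of squared elements
    print(np.linalg.norm(A, 2))       # 2-norm: largest singular value

    s = np.linalg.svd(A, compute_uv=False)  # singular values, largest first
    print(s[0])                       # matches the 2-norm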

Determining Singularity (Condition Number)

Whereas the norm of the matrix provides a way to measure the magnitude of the matrix, the condition number of a matrix is a measure of how close the matrix is to being singular. The condition number of a square nonsingular matrix is defined as

$$\mathrm{cond}(A) = \|A\|_p \cdot \|A^{-1}\|_p \qquad (T)$$

where p can be one of the four norm types described above. For example, to find the condition number of a matrix A, you can find the 2-norm of A and the 2-norm of the inverse of A, denoted by $A^{-1}$, and then multiply them together. The inverse of a square matrix A is a square matrix B such that AB = I, where I is the identity matrix. The 2-norm is difficult to calculate on paper, so you can use MatrixNorm to compute it.
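
If you are working in NumPy rather than with the MatrixNorm function mentioned above, an equivalent sketch (reusing the Equation P example matrix) is:

    import numpy as np

    A = np.array([[1.0, 3.0],
                  [2.0, 4.0]])

    # Condition number as a product of norms, per Equation T
    print(np.linalg.norm(A, 2) * np.linalg.norm(np.linalg.inv(A), 2))

    # np.linalg.cond computes the same quantity directly (2-norm by default)
    print(np.linalg.cond(A))      # ~14.93
    print(np.linalg.cond(A, 1))   # condition number in the 1-norm: 21.0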

For example, for the Equation P example matrix,

$$\mathrm{cond}_2(A) = \|A\|_2 \cdot \|A^{-1}\|_2 \approx 5.465 \times 2.732 \approx 14.93 \qquad (U)$$

The condition number can vary between 1 and infinity. A matrix with a large condition number is nearly singular, while a matrix with a condition number close to 1 is far from being singular. The matrix A above is nonsingular. However, consider the matrix

(V)

The condition number of this matrix is 47,168, and hence the matrix is close to being singular. A matrix is singular if its determinant is exactly zero, but the determinant is not a good indicator of how close a matrix is to being singular. For the matrix B above, the determinant (0.0299) is nonzero, yet the large condition number indicates that the matrix is nearly singular. Remember that the condition number of a matrix is always greater than or equal to one, with equality holding for identity and permutation matrices. A permutation matrix is an identity matrix with some of its rows and columns interchanged. The condition number is a very useful quantity in assessing the accuracy of solutions to linear systems.
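
As a NumPy illustration of this point, Hilbert matrices are a classic family of nearly singular matrices with tiny determinants and huge condition numbers; this example is a stand-in for the matrix B above, whose entries are not reproduced here.

    import numpy as np

    n = 6
    i, j = np.indices((n, n))
    H = 1.0 / (i + j + 1)      # the 6 x 6 Hilbert matrix

    print(np.linalg.det(H))    # ~5.4e-18: tiny, but nonzero
    print(np.linalg.cond(H))   # ~1.5e7: H is very nearly singular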