
7.5: Upper Triangular Matrices - Mathematics


As before, let \(V\) be a complex vector space.

Let \(T\in\mathcal{L}(V,V)\) and let \((v_1,\ldots,v_n)\) be a basis for \(V\). Recall that we can associate a matrix \(M(T) \in \mathbb{C}^{n \times n}\) to the operator \(T\). By Theorem 7.4.1, we know that \(T\) has at least one eigenvalue, say \(\lambda\in \mathbb{C}\). Let \(v_1 \neq 0\) be an eigenvector corresponding to \(\lambda\). By the Basis Extension Theorem, we can extend the list \((v_1)\) to a basis of \(V\). Since \(Tv_1 = \lambda v_1\), the first column of \(M(T)\) with respect to this basis is

\[ \begin{bmatrix} \lambda \\ 0 \\ \vdots \\ 0 \end{bmatrix}. \]

What we will show next is that we can find a basis of \(V\) such that the matrix \(M(T)\) is upper triangular.

Definition 7.5.1: Upper Triangular Matrix

A matrix \(A=(a_{ij})\in \mathbb{F}^{n \times n}\) is called upper triangular if \(a_{ij}=0\) for \(i>j\).

Schematically, an upper triangular matrix has the form

\[ \begin{bmatrix} * & & * \\ & \ddots & \\ 0 & & * \end{bmatrix}, \]

where the entries \(*\) can be anything and every entry below the main diagonal is zero.

Here are two reasons why having an operator \(T\) represented by an upper triangular matrix can be quite convenient:

  1. the eigenvalues are on the diagonal (as we will see later);
  2. it is easy to solve the corresponding system of linear equations by back substitution (as discussed in Section A.3).
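As a brief illustration of back substitution (the system below is an illustrative choice, not taken from the text), consider an upper triangular system in the unknowns \(x_1, x_2, x_3\):

\[ \begin{bmatrix} 2 & 1 & -1 \\ 0 & 3 & 2 \\ 0 & 0 & 4 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 1 \\ 8 \\ 4 \end{bmatrix}. \]

The last equation involves only \(x_3\), and each earlier equation introduces exactly one new unknown, so we can solve from the bottom up:

\[ 4x_3 = 4 \Rightarrow x_3 = 1, \qquad 3x_2 + 2x_3 = 8 \Rightarrow x_2 = 2, \qquad 2x_1 + x_2 - x_3 = 1 \Rightarrow x_1 = 0. \]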

The next proposition tells us what upper triangularity means in terms of linear operators and invariant subspaces.

Proposition 7.5.2

Suppose \(T\in \mathcal{L}(V,V)\) and that \((v_1,\ldots,v_n)\) is a basis of \(V\). Then the following statements are equivalent:

  1. the matrix \(M(T)\) with respect to the basis \((v_1,\ldots,v_n)\) is upper triangular;
  2. \(Tv_k \in \operatorname{span}(v_1,\ldots,v_k)\) for each \(k=1,2,\ldots,n\);
  3. \(\operatorname{span}(v_1,\ldots,v_k)\) is invariant under \(T\) for each \(k=1,2,\ldots,n\).

Proof

The equivalence of Condition 1 and Condition 2 follows easily from the definition since Condition 2 implies that the matrix elements below the diagonal are zero.

Obviously, Condition 3 implies Condition 2. To show that Condition 2 implies Condition 3, note that any vector \(v \in \operatorname{span}(v_1,\ldots,v_k)\) can be written as \(v=a_1v_1+\cdots+a_kv_k\). Applying \(T\), we obtain

\[ Tv = a_1 Tv_1 + \cdots + a_k Tv_k \in \operatorname{span}(v_1,\ldots,v_k) \]

since, by Condition 2, each \(Tv_j \in \operatorname{span}(v_1,\ldots,v_j)\subset \operatorname{span}(v_1,\ldots,v_k)\) for \(j=1,2,\ldots,k\) and since the span is a subspace of \(V\).

\(\square\)

The next theorem shows that complex vector spaces indeed have some basis for which the matrix of a given operator is upper triangular.

Theorem 7.5.3

Let \(V\) be a finite-dimensional vector space over \(\mathbb{C}\) and \(T\in\mathcal{L}(V,V)\). Then there exists a basis \(B\) for \(V\) such that \(M(T)\) is upper triangular with respect to \(B\).

Proof

We proceed by induction on \(\dim(V)\). If \(\dim(V)=1\), then there is nothing to prove.

Hence, assume that \(\dim(V)=n>1\) and that we have proven the result of the theorem for all \(T\in \mathcal{L}(W,W)\), where \(W\) is a complex vector space with \(\dim(W)\le n-1\). By Theorem 7.4.1, \(T\) has at least one eigenvalue \(\lambda\).

Define

\[ U = \operatorname{range}(T-\lambda I), \]

and note that

  1. \(\dim(U) < \dim(V) = n\) since \(\lambda\) is an eigenvalue of \(T\) and hence \(T-\lambda I\) is not surjective;
  2. \(U\) is an invariant subspace of \(T\) since, for all \(u\in U\), we have

\[ Tu = (T-\lambda I) u + \lambda u, \]

which implies that \(Tu\in U\) since \((T-\lambda I) u \in \operatorname{range}(T-\lambda I)=U\) and \(\lambda u\in U\).

Therefore, we may consider the operator \(S=T|_U\), which is the operator obtained by restricting \(T\) to the subspace \(U\). By the induction hypothesis, there exists a basis \((u_1,\ldots,u_m)\) of \(U\) with \(m\le n-1\) such that \(M(S)\) is upper triangular with respect to \((u_1,\ldots,u_m)\). This means that

\[ Tu_j = Su_j\in \operatorname{span}(u_1,\ldots,u_j), \quad \text{for all \(j=1,2,\ldots,m\).} \]

Extend this to a basis \((u_1,\ldots,u_m,v_1,\ldots,v_k)\) of \(V\). Then

\[ Tv_j=(T-\lambda I)v_j + \lambda v_j, \quad \text{for all \(j=1,2,\ldots,k\).} \]

Since \((T-\lambda I) v_j\in \operatorname{range}(T-\lambda I)=U=\operatorname{span}(u_1,\ldots,u_m)\), we have that

\[ Tv_j \in \operatorname{span}(u_1,\ldots,u_m,v_1,\ldots,v_j), \quad \text{for all \(j=1,2,\ldots,k\).} \]

Hence, by Proposition 7.5.2, \(T\) is upper triangular with respect to the basis \((u_1,\ldots,u_m,v_1,\ldots,v_k)\).

\(\square\)
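To see the construction from the proof at work, here is a small worked example (the operator is an illustrative choice, not taken from the text). Let \(T\in\mathcal{L}(\mathbb{C}^2,\mathbb{C}^2)\) have the following matrix with respect to the standard basis:

\[ A = \begin{bmatrix} 1 & 1 \\ -1 & 3 \end{bmatrix}, \qquad \det(A - \lambda I) = \lambda^2 - 4\lambda + 4 = (\lambda - 2)^2 . \]

So \(\lambda = 2\) is an eigenvalue, and

\[ U = \operatorname{range}(T - 2I) = \operatorname{range}\begin{bmatrix} -1 & 1 \\ -1 & 1 \end{bmatrix} = \operatorname{span}(u_1), \qquad u_1 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}. \]

Here \(Tu_1 = 2u_1\). Extending \((u_1)\) to a basis of \(\mathbb{C}^2\) with \(v_1 = (0,1)^T\), we find \(Tv_1 = (1,3)^T = 1\cdot u_1 + 2\cdot v_1\), so with respect to the basis \((u_1, v_1)\) the matrix of \(T\) is upper triangular:

\[ M(T) = \begin{bmatrix} 2 & 1 \\ 0 & 2 \end{bmatrix}. \]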

The following are two very important facts about upper triangular matrices and their associated operators.

Proposition 7.5.4

Suppose \(T\in\mathcal{L}(V,V)\) is a linear operator and that \(M(T)\) is upper triangular with respect to some basis of \(V\).

  1. \(T\) is invertible if and only if all entries on the diagonal of \(M(T)\) are nonzero.
  2. The eigenvalues of \(T\) are precisely the diagonal elements of \(M(T)\).

Proof of Proposition 7.5.4, Part 1

Let \((v_1,\ldots,v_n)\) be a basis of \(V\) such that

\[ M(T) = \begin{bmatrix} \lambda_1 & & * \\ & \ddots & \\ 0 & & \lambda_n \end{bmatrix} \]

is upper triangular. The claim is that \(T\) is invertible if and only if \(\lambda_k \neq 0\) for all \(k=1,2,\ldots,n\). Equivalently, this can be reformulated as follows: \(T\) is not invertible if and only if \(\lambda_k=0\) for at least one \(k\in\{1,2,\ldots,n\}\).

Suppose \(\lambda_k=0\). We will show that this implies the non-invertibility of \(T\). If \(k=1\), this is obvious since then \(Tv_1=0\), which implies that \(v_1\in\ker(T)\) so that \(T\) is not injective and hence not invertible. So assume that \(k>1\). Then

\[ Tv_j \in \operatorname{span}(v_1,\ldots,v_{k-1}), \quad \text{for all \(j \le k\),} \]

since \(T\) is upper triangular and \(\lambda_k=0\). Hence, we may define \(S=T|_{\operatorname{span}(v_1,\ldots,v_k)}\) to be the restriction of \(T\) to the subspace \(\operatorname{span}(v_1,\ldots,v_k)\) so that

\[ S: \operatorname{span}(v_1,\ldots,v_k) \to \operatorname{span}(v_1,\ldots,v_{k-1}). \]

The linear map \(S\) is not injective since the dimension of the domain is larger than the dimension of its codomain, i.e.,

\[ \dim(\operatorname{span}(v_1,\ldots,v_k)) = k > k-1 = \dim(\operatorname{span}(v_1,\ldots,v_{k-1})). \]

Hence, there exists a vector \(0 \neq v\in \operatorname{span}(v_1,\ldots,v_k)\) such that \(Sv=Tv=0\). This implies that \(T\) is also not injective and therefore also not invertible.

Now suppose that \(T\) is not invertible. We need to show that at least one \(\lambda_k=0\). The linear map \(T\) not being invertible implies that \(T\) is not injective. Hence, there exists a vector \(0 \neq v\in V\) such that \(Tv=0\), and we can write

\[ v = a_1 v_1 + \cdots + a_k v_k \]

for some \(k\), where \(a_k \neq 0\). Then

\[ 0 = Tv = (a_1 Tv_1 + \cdots + a_{k-1} Tv_{k-1}) + a_k Tv_k. \tag{7.5.1} \]

Since \(T\) is upper triangular with respect to the basis \((v_1,\ldots,v_n)\), we know that \(a_1 Tv_1 + \cdots + a_{k-1} Tv_{k-1}\in \operatorname{span}(v_1,\ldots,v_{k-1})\). Hence, since \(a_k \neq 0\), Equation (7.5.1) shows that \(Tv_k \in \operatorname{span}(v_1,\ldots,v_{k-1})\), which implies that \(\lambda_k=0\).

\(\square\)

Proof of Proposition 7.5.4, Part 2.

Recall that \(\lambda\in\mathbb{F}\) is an eigenvalue of \(T\) if and only if the operator \(T-\lambda I\) is not invertible. Let \((v_1,\ldots,v_n)\) be a basis such that \(M(T)\) is upper triangular. Then

\[ M(T-\lambda I) = \begin{bmatrix} \lambda_1-\lambda & & * \\ & \ddots & \\ 0 & & \lambda_n-\lambda \end{bmatrix}. \]

Hence, by Proposition 7.5.4(1), \(T-\lambda I\) is not invertible if and only if \(\lambda=\lambda_k\) for some \(k\).

\(\square\)
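Both parts of Proposition 7.5.4 can be checked on a small example. The following is a minimal sketch in Python; the matrix `M` and all helper names are illustrative assumptions rather than anything defined in the text. It applies the diagonal test of Part 1 to the shifted matrices \(M - \lambda I\) and confirms that the test fails exactly at the diagonal entries of `M`, as Part 2 predicts.

```python
def diagonal(m):
    """Return the diagonal entries of a square matrix given as a list of rows."""
    return [m[i][i] for i in range(len(m))]


def is_invertible_upper_triangular(m):
    """Part 1: an upper triangular matrix is invertible iff no diagonal entry is zero."""
    return all(d != 0 for d in diagonal(m))


def shifted(m, lam):
    """Matrix of T - lam*I with respect to the same basis."""
    n = len(m)
    return [[m[i][j] - (lam if i == j else 0) for j in range(n)] for i in range(n)]


# Hypothetical upper triangular matrix used only for illustration.
M = [[2, 1, 4],
     [0, 3, 5],
     [0, 0, 2]]

# Part 2: the eigenvalues are the (distinct) diagonal entries.
eigenvalues = sorted(set(diagonal(M)))

# Scan a few integer shifts and keep those for which M - lam*I is not invertible.
singular_shifts = [lam for lam in range(-5, 6)
                   if not is_invertible_upper_triangular(shifted(M, lam))]
```

In this sketch `singular_shifts` and `eigenvalues` agree, since subtracting \(\lambda\) from the diagonal produces a zero diagonal entry precisely when \(\lambda\) equals one of the diagonal entries.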



LU decomposition of a matrix is the factorization of a given square matrix into two triangular matrices, one lower triangular and one upper triangular, such that the product of these two matrices gives the original matrix. It was introduced by Alan Turing in 1948, who also created the Turing machine.

Factorizing a matrix as a product of two triangular matrices has various applications: solving a system of equations (itself an integral part of many applications, such as finding the currents in a circuit and solving discrete dynamical system problems), finding the inverse of a matrix, and finding the determinant of a matrix.
Basically, the LU decomposition method comes in handy whenever the problem to be solved can be modeled in matrix form. Converting to matrix form and solving with triangular matrices makes the calculations easy.

A square matrix A can be decomposed into two square matrices L and U such that A = L U, where U is an upper triangular matrix formed by applying the Gauss Elimination Method to A, and L is a lower triangular matrix with all diagonal elements equal to 1. This form of the decomposition assumes that the elimination can be carried out without row exchanges.

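For \(A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}\), we have \(L = \begin{bmatrix} 1 & 0 & 0 \\ l_{21} & 1 & 0 \\ l_{31} & l_{32} & 1 \end{bmatrix}\) and \(U = \begin{bmatrix} u_{11} & u_{12} & u_{13} \\ 0 & u_{22} & u_{23} \\ 0 & 0 & u_{33} \end{bmatrix}\) such that A = L U.

Here the values of \(l_{21}\), \(u_{11}\), etc. can be found by multiplying out L U and comparing the entries with those of A.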

Gauss Elimination Method
The Gauss Elimination Method applies row operations until the matrix satisfies the following conditions:
1. Any zero row is at the bottom of the matrix.
2. The first nonzero entry of each row is to the right of the first nonzero entry of the preceding row.
A matrix satisfying these conditions is in row echelon form.

Steps for L U Decomposition
Given a set of linear equations, first convert them into matrix form A X = C, where A is the coefficient matrix, X is the variable matrix, and C is the matrix of numbers on the right-hand side of the equations.

Now, reduce the coefficient matrix A (for n variables, the n × n matrix formed from the coefficients of the variables in the given equations) to row echelon form using the Gauss Elimination Method. The matrix so obtained is U.

To find L, we have two methods. The first is to treat the unknown entries of L below the diagonal as artificial variables, write out the equations given by A = L U, and solve them for those variables.
The other method is to use the multiplier coefficients: the entry of L in a given position is the multiplier that was used to make that same position zero in U. (This method is a little tricky to describe in words but becomes clearer in the example below.)

Now, we have A (the n × n coefficient matrix), L (the n × n lower triangular matrix), U (the n × n upper triangular matrix), X (the n × 1 matrix of variables) and C (the n × 1 matrix of numbers on the right-hand side of the equations).

The given system of equations is A X = C. We substitute A = L U. Thus, we have L U X = C.
We put Z = U X, where Z is a matrix of artificial variables, and solve L Z = C first and then solve U X = Z to find X, the values of the variables, as required.
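The steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: it performs no row exchanges (so it fails on a zero pivot), it uses exact fractions to avoid rounding, and the function names and the sample system `A`, `C` are hypothetical choices made for this sketch.

```python
from fractions import Fraction


def lu_decompose(a):
    """Factor a square matrix as A = L U (unit diagonal in L, no row exchanges)."""
    n = len(a)
    u = [[Fraction(x) for x in row] for row in a]
    l = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for col in range(n):
        if u[col][col] == 0:
            raise ZeroDivisionError("zero pivot; this sketch does no row exchanges")
        for row in range(col + 1, n):
            m = u[row][col] / u[col][col]   # multiplier that zeroes u[row][col]
            l[row][col] = m                 # the same multiplier is stored in L
            for j in range(col, n):
                u[row][j] -= m * u[col][j]
    return l, u


def forward_substitute(l, c):
    """Solve L Z = C from the top row down."""
    z = []
    for i in range(len(c)):
        s = sum(l[i][j] * z[j] for j in range(i))
        z.append((Fraction(c[i]) - s) / l[i][i])
    return z


def back_substitute(u, z):
    """Solve U X = Z from the bottom row up."""
    n = len(z)
    x = [Fraction(0)] * n
    for i in reversed(range(n)):
        s = sum(u[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (z[i] - s) / u[i][i]
    return x


# Hypothetical system: x1 + x2 + x3 = 1, 4x1 + 3x2 - x3 = 6, 3x1 + 5x2 + 3x3 = 4.
A = [[1, 1, 1], [4, 3, -1], [3, 5, 3]]
C = [1, 6, 4]

L, U = lu_decompose(A)
Z = forward_substitute(L, C)
X = back_substitute(U, Z)
```

Once L and U are known, solving for a different right-hand side C requires only the two substitution passes, which is the main practical reason for computing the factorization once.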

Example:
Solve the following system of equations using LU Decomposition method:

A = and such that A X = C.

Now, we first consider and convert it to row echelon form using Gauss Elimination Method.

(1)

(2)

(3)

(Remember to always keep ‘ – ‘ sign in between, replace ‘ + ‘ sign by two ‘ – ‘ signs)

Hence, we get L = and U =

(notice that in L matrix, is from (1), is from (2) and is from (3))

Now, we assume Z and solve L Z = C.

So, we have

Solving, we get , and .

Therefore, we get ,

Thus, the solution to the given system of linear equations is , , and hence the matrix X =

Exercise:
In the LU decomposition of the matrix

, if the diagonal elements of U are both 1, then the lower diagonal entry l22 of L is (GATE CS 2015)
(A) 4
(B) 5
(C) 6
(D) 7
For Solution, see https://www.geeksforgeeks.org/gate-gate-cs-2015-set-1-question-28/

This article is compiled by Nishant Arora.


Upper Hessenberg matrix

An upper Hessenberg matrix is called unreduced if all subdiagonal entries are nonzero, i.e. if \(a_{i+1,i} \neq 0\) for all \(i \in \{1, \ldots, n-1\}\). [3]

Lower Hessenberg matrix

A lower Hessenberg matrix is called unreduced if all superdiagonal entries are nonzero, i.e. if \(a_{i,i+1} \neq 0\) for all \(i \in \{1, \ldots, n-1\}\).

Consider the following matrices.

Many linear algebra algorithms require significantly less computational effort when applied to triangular matrices, and this improvement often carries over to Hessenberg matrices as well. If the constraints of a linear algebra problem do not allow a general matrix to be conveniently reduced to a triangular one, reduction to Hessenberg form is often the next best thing. In fact, reduction of any matrix to a Hessenberg form can be achieved in a finite number of steps (for example, through Householder's transformation of unitary similarity transforms). Subsequent reduction of Hessenberg matrix to a triangular matrix can be achieved through iterative procedures, such as shifted QR-factorization. In eigenvalue algorithms, the Hessenberg matrix can be further reduced to a triangular matrix through Shifted QR-factorization combined with deflation steps. Reducing a general matrix to a Hessenberg matrix and then reducing further to a triangular matrix, instead of directly reducing a general matrix to a triangular matrix, often economizes the arithmetic involved in the QR algorithm for eigenvalue problems.

A matrix that is both upper Hessenberg and lower Hessenberg is a tridiagonal matrix, of which symmetric or Hermitian Hessenberg matrices are important examples. A Hermitian matrix can be reduced to tri-diagonal real symmetric matrices. [5]
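The structural conditions in this section are easy to test directly. The sketch below assumes the standard definition, which the text above does not spell out: an upper Hessenberg matrix has zero entries below the first subdiagonal, and a lower Hessenberg matrix has zero entries above the first superdiagonal. The function names and the sample matrices `H` and `T` are illustrative assumptions.

```python
def is_upper_hessenberg(a):
    """True if every entry below the first subdiagonal is zero (a[i][j] == 0 for i > j + 1)."""
    n = len(a)
    return all(a[i][j] == 0 for i in range(n) for j in range(n) if i > j + 1)


def is_lower_hessenberg(a):
    """True if every entry above the first superdiagonal is zero (a[i][j] == 0 for j > i + 1)."""
    n = len(a)
    return all(a[i][j] == 0 for i in range(n) for j in range(n) if j > i + 1)


def is_unreduced_upper_hessenberg(a):
    """Upper Hessenberg with all subdiagonal entries a[i+1][i] nonzero."""
    return is_upper_hessenberg(a) and all(a[i + 1][i] != 0 for i in range(len(a) - 1))


def is_tridiagonal(a):
    """A matrix that is both upper and lower Hessenberg is tridiagonal."""
    return is_upper_hessenberg(a) and is_lower_hessenberg(a)


# Hypothetical sample matrices (0-based indices in code, 1-based in the text).
H = [[1, 4, 2, 3],
     [3, 4, 1, 7],
     [0, 2, 3, 4],
     [0, 0, 1, 3]]

T = [[2, 1, 0],
     [1, 2, 1],
     [0, 1, 2]]
```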

The Hessenberg operator is an infinite dimensional Hessenberg matrix. It commonly occurs as the generalization of the Jacobi operator to a system of orthogonal polynomials for the space of square-integrable holomorphic functions over some domain, that is, a Bergman space. In this case, the Hessenberg operator is the right-shift operator S, given by \([Sf](z) = z f(z)\).

The eigenvalues of each principal submatrix of the Hessenberg operator are given by the characteristic polynomial for that submatrix. These polynomials are called the Bergman polynomials, and provide an orthogonal polynomial basis for Bergman space.


STABILITY, INERTIA, AND ROBUST STABILITY

Computing the Inertia of a Symmetric Matrix

If A is symmetric, then Sylvester's law of inertia provides an inexpensive and numerically effective method for computing its inertia.

A symmetric matrix A admits a triangular factorization:

\[ A = U D U^T, \]

where U is a product of elementary unit upper triangular and permutation matrices, and D is a symmetric block diagonal matrix with blocks of order 1 or 2. This is known as the diagonal pivoting factorization. Thus, by Sylvester's law of inertia, In(A) = In(D). Once this diagonal pivoting factorization is obtained, the inertia of the symmetric matrix A can be obtained from the entries of D as follows:

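Let D have p blocks of order 1 and q blocks of order 2, with p + 2q = n. Assume that none of the 2 × 2 blocks of D is singular. Suppose that out of the p blocks of order 1, p′ are positive, p″ are negative, and p‴ are zero (i.e., p′ + p″ + p‴ = p). Then, since each nonsingular 2 × 2 block produced by diagonal pivoting has one positive and one negative eigenvalue, A has p′ + q positive eigenvalues, p″ + q negative eigenvalues, and p‴ zero eigenvalues.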

The diagonal pivoting factorization can be achieved in a numerically stable way. It requires only \(n^3/3\) flops. For details of the diagonal pivoting factorization, see Bunch (1971), Bunch and Parlett (1971), and Bunch and Kaufman (1977).
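Reading the inertia off the blocks of D can be sketched as follows. This is a minimal illustration that assumes D is already available as a list of blocks; it does not compute the factorization itself. The function name and the sample blocks are hypothetical. For a symmetric 2 × 2 block the signs of the two eigenvalues are determined from the determinant and the trace, so the sketch also handles a definite 2 × 2 block should one be supplied.

```python
def inertia_of_block_diagonal(blocks):
    """Return (positive, negative, zero) eigenvalue counts of a symmetric
    block diagonal matrix. Each block is a number (order 1) or a symmetric
    2x2 list [[a, b], [b, c]] (order 2)."""
    pos = neg = zero = 0
    for blk in blocks:
        if isinstance(blk, (int, float)):
            if blk > 0:
                pos += 1
            elif blk < 0:
                neg += 1
            else:
                zero += 1
        else:
            (a, b), (_, c) = blk
            det = a * c - b * b          # product of the two eigenvalues
            tr = a + c                   # sum of the two eigenvalues
            if det < 0:                  # opposite signs
                pos += 1
                neg += 1
            elif det > 0:                # same sign, given by the trace
                if tr > 0:
                    pos += 2
                else:
                    neg += 2
            else:
                raise ValueError("singular 2x2 block; assumed not to occur")
    return pos, neg, zero


# Hypothetical D with three blocks of order 1 and one block of order 2 (n = 5).
D_blocks = [3, -1, 0, [[0, 2], [2, 1]]]
inertia = inertia_of_block_diagonal(D_blocks)
```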

LAPACK implementation: The diagonal pivoting method has been implemented in the LAPACK routine SSYTRF.


Choose the correct or the most suitable answer from the given four alternatives.

Question 1.
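If \(a_{ij} = \frac{1}{2}(3i - 2j)\) and \(A = [a_{ij}]_{2 \times 2}\), then A is

Solution:
Evaluating the formula entry by entry gives \(A = \begin{bmatrix} \frac{1}{2} & -\frac{1}{2} \\ 2 & 1 \end{bmatrix}\).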

Question 2.
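What must be the matrix X, if \(2X + \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} = \begin{bmatrix} 3 & 8 \\ 7 & 2 \end{bmatrix}\)?

Solution:
\(2X = \begin{bmatrix} 3 & 8 \\ 7 & 2 \end{bmatrix} - \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} = \begin{bmatrix} 2 & 6 \\ 4 & -2 \end{bmatrix}\), so \(X = \begin{bmatrix} 1 & 3 \\ 2 & -1 \end{bmatrix}\).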

Question 3.
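Which one of the following is not true about the matrix \(\begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 5 \end{bmatrix}\)?
(a) a scalar matrix
(b) a diagonal matrix
(c) an upper triangular matrix
(d) a lower triangular matrix
Solution:
(a) a scalar matrix. Hint: the matrix is diagonal, and hence both upper and lower triangular, but its diagonal entries 1, 0, 5 are not all equal.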

Question 4.
If A and B are two matrices such that A + B and AB are both defined, then …………
(a) A and B are two matrices not necessarily of same order.
(b) A and B are square matrices of same order.
(c) Number of columns of A is equal to the number of rows of B.
(d) A = B.
Solution:
(b) A and B are square matrices of same order.

Question 5.
If \(A = \begin{bmatrix} \lambda & 1 \\ -1 & -\lambda \end{bmatrix}\), then for what value of λ is \(A^2 = 0\)?
(a) 0
(b) ±1
(c) -1
(d) 1
Solution:

Question 6.
If and \((A + B)^2 = A^2 + B^2\), then the values of a and b are ……………….
(a) a = 4, b = 1
(b) a = 1, b = 4
(c) a = 0, b = 4
(d) a = 2, b = 4
Solution:

Question 7.
If is a matrix satisfying the equation \(AA^T = 9I\), where I is the 3 × 3 identity matrix, then the ordered pair (a, b) is equal to ………….
(a) (2, -1)
(b) (-2, 1)
(c) (2, 1)
(d) (-2, -1)
Solution:

Question 8.
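If A is a square matrix, then which of the following is not symmetric?
(a) \(A + A^T\)
(b) \(AA^T\)
(c) \(A^T A\)
(d) \(A - A^T\)
Solution:
(d) Hint: \((A - A^T)^T = A^T - A = -(A - A^T)\), so \(A - A^T\) is skew-symmetric, while each of the other three matrices equals its own transpose.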

Question 9.
If A and B are symmetric matrices of order n, where (A ≠ B), then …………….
(a) A + B is skew-symmetric
(b) A + B is symmetric
(c) A + B is a diagonal matrix
(d) A + B is a zero matrix
Solution:
(b)

Question 10.
If and if xy = 1, then \(\det(AA^T)\) is equal to …………..
(a) \((a - 1)^2\)
(b) \((a^2 + 1)^2\)
(c) \(a^2 - 1\)
(d) \((a^2 - 1)^2\)
Solution:

Question 11.
The value of x, for which the matrix is singular is ………….
(a) 9
(b) 8
(c) 7
(d) 6
Solution:
(b) Hint: Given A is a singular matrix ⇒ |A| = 0

⇒ \(e^{x-2} \cdot e^{2x+3} - e^{2+x} \cdot e^{7+x} = 0\)
⇒ \(e^{3x+1} - e^{9+2x} = 0\) ⇒ \(e^{3x+1} = e^{9+2x}\)
⇒ 3x + 1 = 9 + 2x
⇒ 3x – 2x = 9 – 1 ⇒ x = 8

Question 12.
If the points (x, -2), (5, 2), (8, 8) are collinear, then x is equal to …………
(a) -3
(b) \(\frac{1}{3}\)
(c) 1
(d) 3
Solution:
(d) Hint: Given that the points are collinear
So, area of the triangle formed by the points = 0

Question 13.

Solution:

Question 14.
If the square of the matrix is the unit matrix of order 2, then α, β and γ should satisfy the relation
(a) \(1 + \alpha^2 + \beta\gamma = 0\)
(b) \(1 - \alpha^2 - \beta\gamma = 0\)
(c) \(1 - \alpha^2 + \beta\gamma = 0\)
(d) \(1 + \alpha^2 - \beta\gamma = 0\)
Solution:

Question 15.

(a) Δ
(b) kΔ
(c) 3kΔ
(d) \(k^3 \Delta\)
Solution:

Question 16.
A root of the equation is …………….
(a) 6
(b) 3
(c) 0
(d) -6
Solution:

Question 17.
The value of the determinant of is ……………
(a) -2abc
(b) abc
(c) 0
(d) \(a^2 + b^2 + c^2\)
Solution:

Question 18.
If x1, x2, x3 as well as y1, y2, y3 are in geometric progression with the same common ratio, then the points (x1, y1), (x2, y2), (x3, y3) are
(a) vertices of an equilateral triangle
(b) vertices of a right angled triangle
(c) vertices of a right angled isosceles triangle
(d) collinear
Solution:
(d)

Question 19.
If \(\lfloor \cdot \rfloor\) denotes the greatest integer less than or equal to the real number under consideration and -1 ≤ x < 0, 0 ≤ y < 1, 1 ≤ z ≤ 2, then the value of the determinant is …………..
(a) \(\lfloor z \rfloor\)
(b) \(\lfloor y \rfloor\)
(c) \(\lfloor x \rfloor\)
(d) \(\lfloor x \rfloor + 1\)
Solution:
(a) Hint: From the given values

Question 20.
If a ≠ b, b, c satisfy then abc = ……………..
(a) a + b + c
(b) 0
(c) \(b^3\)
(d) ab + bc
Solution:
(c) Hint: Expanding along R1,
\(a(b^2 - ac) - 2b(3b - 4c) + 2c(3a - 4b) = 0\)
\((b^2 - ac)(a - b) = 0\)
\(b^2 = ac\) (or) a = b
⇒ \(abc = b(b^2) = b^3\)

Question 21.
If then B is given by ………………..
(a) B = 4A
(b) B = -4A
(c) B = -A
(d) B = 6A
Solution:

Question 22.
If A is skew-symmetric of order n and C is a column matrix of order n × 1, then \(C^T A C\) is ……………..
(a) an identity matrix of order n
(b) an identity matrix of order 1
(c) a zero matrix of order 1
(d) an identity matrix of order 2
Solution:
(c) Hint: Given A is of order n × n
C is of order n × 1
so, \(C^T\) is of order 1 × n
Hence \(C^T A C\) is of order 1 × 1.
Let it be equal to (x), say.
Taking the transpose on both sides,
\((C^T A C)^T = (x)^T\)
(i.e.) \(C^T A^T C = x\)
\(C^T (-A) C = x\)
⇒ \(C^T A C = -x\)
⇒ x = -x ⇒ 2x = 0 ⇒ x = 0

Question 23.
The matrix A satisfying the equation is ……………

Solution:

Question 24.
If A + I = , then (A + I) (A – I) is equal to …………….

Solution:

Question 25.
Let A and B be two symmetric matrices of the same order. Then which one of the following statements is not true?
(a) A + B is a symmetric matrix
(b) AB is a symmetric matrix
(c) \(AB = (BA)^T\)
(d) \(A^T B = AB^T\)
Solution:
(b)


Matrices

A rectangular array of symbols (which could be real or complex numbers) along rows and columns is called a matrix.
Thus a system of m × n symbols arranged in a rectangular formation along m rows and n columns and bounded by the brackets [.] is called an m by n matrix (which is written as m × n matrix).

In a compact form the above matrix is represented by $A = [a_{ij}]$, 1 ≤ i ≤ m, 1 ≤ j ≤ n, or simply $[a_{ij}]_{m \times n}$.
The numbers $a_{11}$, $a_{12}$, … etc. of this rectangular array are called the elements of the matrix. The element $a_{ij}$ belongs to the ith row and jth column and is called the (i, j)th element of the matrix.

Equal Matrices:

Two matrices are said to be equal if they have the same order and each element of one is equal to the corresponding element of the other.

CLASSIFICATION OF MATRICES

Row Matrix:
A matrix having a single row is called a row matrix, e.g. [1 3 5 7]

Column Matrix:

A matrix having a single column is called a column matrix, e.g.

$ \large \left[ \begin{array}{c} 2 \\ 3 \\ 5 \end{array} \right] $

Square Matrix

An m × n matrix A is said to be a square matrix if m = n, i.e. number of rows = number of columns.
For example: $ \large A = \left[ \begin{array}{ccc} 1 & 2 & 3 \\ 2 & 3 & 4 \\ 3 & 4 & 5 \end{array} \right] $ is a square matrix of order 3 × 3

Note:
⋄ The diagonal from the upper left-hand corner to the lower right-hand corner is known as the leading diagonal or principal diagonal. In the above example, the diagonal containing the elements 1, 3, 5 is the leading or principal diagonal.

Trace of a Matrix

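The sum of the elements of a square matrix A lying along the principal diagonal is called the trace of A, i.e. tr(A). Thus if $A = [a_{ij}]_{n \times n}$, then $ \large \operatorname{tr}(A) = \sum_{i=1}^{n} a_{ii} = a_{11} + a_{22} + \cdots + a_{nn} $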

Properties of Trace of a Matrix:

Diagonal Matrix:

A square matrix all of whose elements, except those in the leading diagonal, are zero is called a diagonal matrix. For a square matrix $A = [a_{ij}]_{n \times n}$ to be a diagonal matrix, $a_{ij} = 0$ whenever i ≠ j.

$ \large A = \left[ \begin{array}{ccc} 3 & 0 & 0 \\ 0 & 5 & 0 \\ 0 & 0 & -1 \end{array} \right] $ is a diagonal matrix of order 3 × 3

Note:
⋄ Here A can also be represented as diag(3, 5, -1)

Scalar Matrix:

A diagonal matrix all of whose leading diagonal elements are equal is called a scalar matrix. For a square matrix $A = [a_{ij}]_{n \times n}$ to be a scalar matrix,

$ \large a_{ij} = \left\{ \begin{array}{ll} 0, & i \neq j \\ m, & i = j \end{array} \right. $

For example:

$ \large A = \left[ \begin{array}{ccc} 5 & 0 & 0 \\ 0 & 5 & 0 \\ 0 & 0 & 5 \end{array} \right] $ is a scalar matrix.

Unit Matrix or Identity Matrix:

A diagonal matrix of order n which has unity for all its diagonal elements is called a unit matrix of order n and is denoted by $I_n$.

Thus a square matrix $A = [a_{ij}]_{n \times n}$ is a unit matrix if

$ \large a_{ij} = \left\{ \begin{array}{ll} 1, & i = j \\ 0, & i \neq j \end{array} \right. $

For example:

$ \large A = \left[ \begin{array}{ccc} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{array} \right] $

Triangular Matrix

A square matrix in which all the elements below the diagonal are zero is called an upper triangular matrix, and a square matrix in which all the elements above the diagonal are zero is called a lower triangular matrix.

Given a square matrix $A = [a_{ij}]_{n \times n}$:

For an upper triangular matrix, $a_{ij} = 0$ for i > j

and for a lower triangular matrix, $a_{ij} = 0$ for i < j

Note:
⋄ A diagonal matrix is both upper and lower triangular.

⋄ A triangular matrix $A = [a_{ij}]_{n \times n}$ is called strictly triangular if $a_{ii} = 0$ for 1 ≤ i ≤ n.

$ \large \left[ \begin{array}{ccc} a & h & g \\ 0 & b & f \\ 0 & 0 & c \end{array} \right] $ and $ \large \left[ \begin{array}{ccc} 1 & 0 & 0 \\ 2 & 3 & 0 \\ 1 & -5 & 4 \end{array} \right] $ are respectively upper and lower triangular matrices.
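The classification rules above translate directly into index conditions. The following is a minimal sketch in Python, with matrices stored as lists of rows; the function names are illustrative assumptions, and the sample matrices are the diagonal, scalar, and lower triangular examples given in this section.

```python
def is_square(a):
    """True if the number of rows equals the number of columns."""
    return all(len(row) == len(a) for row in a)


def trace(a):
    """Sum of the principal diagonal elements of a square matrix."""
    return sum(a[i][i] for i in range(len(a)))


def is_upper_triangular(a):
    """a[i][j] == 0 whenever i > j (all elements below the diagonal are zero)."""
    n = len(a)
    return is_square(a) and all(a[i][j] == 0 for i in range(n) for j in range(n) if i > j)


def is_lower_triangular(a):
    """a[i][j] == 0 whenever i < j (all elements above the diagonal are zero)."""
    n = len(a)
    return is_square(a) and all(a[i][j] == 0 for i in range(n) for j in range(n) if i < j)


def is_diagonal(a):
    """A diagonal matrix is both upper and lower triangular."""
    return is_upper_triangular(a) and is_lower_triangular(a)


def is_scalar(a):
    """Diagonal with all diagonal elements equal."""
    return is_diagonal(a) and all(a[i][i] == a[0][0] for i in range(len(a)))


def is_identity(a):
    """Scalar matrix whose common diagonal value is 1."""
    return is_scalar(a) and a[0][0] == 1


D = [[3, 0, 0], [0, 5, 0], [0, 0, -1]]    # diag(3, 5, -1)
S = [[5, 0, 0], [0, 5, 0], [0, 0, 5]]     # scalar matrix
LT = [[1, 0, 0], [2, 3, 0], [1, -5, 4]]   # lower triangular example
```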

Null Matrix:

If all the elements of a matrix (square or rectangular) are zero, it is called a null or zero matrix.

For $A = [a_{ij}]$ to be a null matrix, $a_{ij} = 0$ ∀ i, j

For example: $ \large \left[ \begin{array}{ccc} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{array} \right] $ is a zero matrix

Transpose of a Matrix:

The matrix obtained from any given matrix A by interchanging rows and columns is called the transpose of A and is denoted by A’.
If $A = [a_{ij}]_{m \times n}$ and $A’ = [b_{ij}]_{n \times m}$, then $b_{ij} = a_{ji}$, ∀ i, j

Properties of Transposes:

(i) (A’)’ = A

(ii) (A + B)’ = A’ + B’, A and B being conformable matrices

(iii) (αA)’ = αA’, α being a scalar

(iv) (AB)’ = B’A’, A and B being conformable for multiplication
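The reversal of order in the product rule for transposes can be checked numerically. The sketch below is a minimal Python illustration; the helper names and the sample matrices `A` (2 × 3) and `B` (3 × 2) are assumptions chosen for this example.

```python
def transpose(a):
    """Interchange rows and columns."""
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    """Product of conformable matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


A = [[1, 2, 3],
     [4, 5, 6]]        # 2 x 3
B = [[1, 0],
     [2, 1],
     [0, 3]]           # 3 x 2

lhs = transpose(matmul(A, B))             # (AB)'
rhs = matmul(transpose(B), transpose(A))  # B'A'
```

Note that A’B’ is also defined for these shapes but is a 3 × 3 matrix, so it cannot equal the 2 × 2 matrix (AB)’.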


Upper Triangular Matrix

A square matrix in which all elements below the main diagonal are zero is called an upper triangular matrix.

The name of the upper triangular matrix describes its internal structure.

  1. The meaning of upper is above.
  2. The meaning of triangular is a triangle shape.
  3. The meaning of matrix is a rectangular array, in which elements are arranged in rows and columns.

Combining the meanings of the three words, an upper triangular matrix is a special square matrix in which every element below the main diagonal is zero, so that the elements on and above the main diagonal, which may take any value, form a triangle shape.

An upper triangular matrix can be expressed in the following general form.

$\begin{bmatrix} e_{11} & e_{12} & e_{13} & \cdots & e_{1n} \\ 0 & e_{22} & e_{23} & \cdots & e_{2n} \\ 0 & 0 & e_{33} & \cdots & e_{3n} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & e_{nn} \end{bmatrix}$

The elements $e_{21}$, $e_{31}$, $e_{32}$, $e_{41}$, $e_{42}$, $e_{43}$, etc. are zero, while the remaining elements may be nonzero.

The general form matrix can also be expressed in compact form.

There is a condition in the case of an upper triangular matrix. The elements are zero if $i > j$; these are the elements under the main diagonal of the matrix. The remaining elements are unrestricted and form a triangle shape.

In other words, $e_{ij} = 0$ if $i > j$.

Example

The following examples illustrate upper triangular matrices.

$A$ is a square matrix and also an upper triangular matrix of order $2 \times 2$. In this case the element $e_{21}$ is zero and the other elements are nonzero and form a triangle shape.

$B$ is an upper triangular matrix of order $3 \times 3$; the elements $e_{21}$, $e_{31}$ and $e_{32}$ are zero and the shape of the nonzero elements is a triangle.

$C$ is an example of an upper triangular matrix. In this square matrix, the elements $e_{21}$, $e_{31}$, $e_{32}$, $e_{41}$, $e_{42}$ and $e_{43}$ are zero, the remaining elements are nonzero, and the arrangement of the nonzero elements is a triangle shape.




A matrix is a rectangular array of numbers (or other mathematical objects) for which operations such as addition and multiplication are defined. [8] Most commonly, a matrix over a field F is a rectangular array of scalars, each of which is a member of F. [9] [10] Most of this article focuses on real and complex matrices, that is, matrices whose elements are respectively real numbers or complex numbers. More general types of entries are discussed below. For instance, this is a real matrix:

The numbers, symbols, or expressions in the matrix are called its entries or its elements. The horizontal and vertical lines of entries in a matrix are called rows and columns, respectively.

Size

The size of a matrix is defined by the number of rows and columns it contains. There is no limit to the numbers of rows and columns a matrix (in the usual sense) can have as long as they are positive integers. A matrix with m rows and n columns is called an m × n matrix, or m-by-n matrix, while m and n are called its dimensions. For example, the matrix A above is a 3 × 2 matrix.

Matrices with a single row are called row vectors, and those with a single column are called column vectors. A matrix with the same number of rows and columns is called a square matrix. [11] A matrix with an infinite number of rows or columns (or both) is called an infinite matrix. In some contexts, such as computer algebra programs, it is useful to consider a matrix with no rows or no columns, called an empty matrix.

Overview of a matrix size
Name Size Example Description
Row vector 1 × n \( \begin{bmatrix} 3 & 7 & 2 \end{bmatrix} \) A matrix with one row, sometimes used to represent a vector
Column vector n × 1 \( \begin{bmatrix} 4 \\ 1 \\ 8 \end{bmatrix} \) A matrix with one column, sometimes used to represent a vector
Square matrix n × n \( \begin{bmatrix} 9 & 13 & 5 \\ 1 & 11 & 7 \\ 2 & 6 & 3 \end{bmatrix} \) A matrix with the same number of rows and columns, sometimes used to represent a linear transformation from a vector space to itself, such as reflection, rotation, or shearing.

Matrices are commonly written in box brackets or parentheses:

\[ \mathbf{A} = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix} = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix}. \]

The specifics of symbolic matrix notation vary widely, with some prevailing trends. Matrices are usually symbolized using upper-case letters (such as A in the examples above), [3] while the corresponding lower-case letters, with two subscript indices (e.g., \(a_{11}\), or \(a_{1,1}\)), represent the entries. In addition to using upper-case letters to symbolize matrices, many authors use a special typographical style, commonly boldface upright (non-italic), to further distinguish matrices from other mathematical objects. An alternative notation involves the use of a double-underline with the variable name, with or without boldface style (as in the case of \( \underline{\underline{A}} \)).

The entry in the i-th row and j-th column of a matrix A is sometimes referred to as the i,j, (i,j), or (i,j)th entry of the matrix, and most commonly denoted as \(a_{i,j}\) or \(a_{ij}\). Alternative notations for that entry are A[i,j] or \(A_{i,j}\). For example, the (1,3) entry of the following matrix A is 5 (also denoted \(a_{13}\), \(a_{1,3}\), A[1,3] or \(A_{1,3}\)):

\[ \mathbf{A} = \begin{bmatrix} 4 & -7 & 5 & 0 \\ -2 & 0 & 11 & 8 \\ 19 & 1 & -3 & 12 \end{bmatrix} \]

Sometimes, the entries of a matrix can be defined by a formula such as \(a_{i,j} = f(i, j)\). For example, each of the entries of the following matrix A is determined by the formula \(a_{ij} = i - j\).

\[ \mathbf{A} = \begin{bmatrix} 0 & -1 & -2 & -3 \\ 1 & 0 & -1 & -2 \\ 2 & 1 & 0 & -1 \end{bmatrix} \]

In this case, the matrix itself is sometimes defined by that formula, within square brackets or double parentheses. For example, the matrix above is defined as A = [i − j], or A = ((i − j)). If matrix size is m × n, the above-mentioned formula f(i, j) is valid for any i = 1, ..., m and any j = 1, ..., n. This can be either specified separately, or indicated using m × n as a subscript. For instance, the matrix A above is 3 × 4, and can be defined as A = [i − j] (i = 1, 2, 3; j = 1, ..., 4), or \( A = [i - j]_{3 \times 4} \).

Some programming languages utilize doubly subscripted arrays (or arrays of arrays) to represent an m × n matrix. Some programming languages start the numbering of array indexes at zero, in which case the entries of an m-by-n matrix are indexed by 0 ≤ i ≤ m − 1 and 0 ≤ j ≤ n − 1. [12] This article follows the more common convention in mathematical writing where enumeration starts from 1.
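
The difference between the two indexing conventions can be sketched in Python, where a matrix is stored as a zero-indexed list of rows. The helper names `build_matrix` and `entry`, and the sample formula f(i, j) = i − j, are assumptions made for illustration.

```python
def build_matrix(m, n, f):
    """Return the m-by-n matrix whose (i, j) entry is f(i, j), with 1-based i and j."""
    return [[f(i, j) for j in range(1, n + 1)] for i in range(1, m + 1)]

def entry(A, i, j):
    """Return the mathematical (i, j) entry (1-based) from 0-based storage."""
    return A[i - 1][j - 1]

A = build_matrix(3, 4, lambda i, j: i - j)
print(A)               # [[0, -1, -2, -3], [1, 0, -1, -2], [2, 1, 0, -1]]
print(entry(A, 1, 3))  # -2, stored at A[0][2]
```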

An asterisk is occasionally used to refer to whole rows or columns in a matrix. For example, \(a_{i,*}\) refers to the i-th row of A, and \(a_{*,j}\) refers to the j-th column of A. The set of all m-by-n matrices is denoted M(m, n), or \( \mathbb{R}^{m \times n} \) for real matrices.

There are a number of basic operations that can be applied to modify matrices, called matrix addition, scalar multiplication, transposition, matrix multiplication, row operations, and submatrix. [14]

Addition, scalar multiplication, and transposition

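The sum A + B of two m-by-n matrices A and B is calculated entrywise: \( (A + B)_{i,j} = a_{i,j} + b_{i,j} \). The product cA of a number c (also called a scalar) and a matrix A is computed by multiplying every entry of A by c: \( (cA)_{i,j} = c \cdot a_{i,j} \). The transpose of an m-by-n matrix A is the n-by-m matrix \(A^{T}\) formed by turning rows into columns and vice versa: \( (A^{T})_{i,j} = a_{j,i} \).

The second operation is called scalar multiplication, but its result is not named "scalar product" to avoid confusion, since "scalar product" is sometimes used as a synonym for "inner product".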

Familiar properties of numbers extend to these operations of matrices: for example, addition is commutative, that is, the matrix sum does not depend on the order of the summands: A + B = B + A. [15] The transpose is compatible with addition and scalar multiplication, as expressed by \( (cA)^{T} = c(A^{T}) \) and \( (A + B)^{T} = A^{T} + B^{T} \). Finally, \( (A^{T})^{T} = A \).
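
These identities are easy to check on a small example. The following is a minimal sketch with matrices stored as lists of rows; the function names and the sample matrices are illustrative.

```python
def add(A, B):
    """Entrywise sum of two matrices of the same size."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scale(c, A):
    """Scalar multiple cA."""
    return [[c * a for a in row] for row in A]

def transpose(A):
    """Transpose: rows become columns."""
    return [list(col) for col in zip(*A)]

A = [[1, 3, 1], [1, 0, 0]]
B = [[0, 0, 5], [7, 5, 0]]

print(add(A, B))                                                     # [[1, 3, 6], [8, 5, 0]]
print(add(A, B) == add(B, A))                                        # True
print(transpose(scale(2, A)) == scale(2, transpose(A)))              # True
print(transpose(add(A, B)) == add(transpose(A), transpose(B)))       # True
print(transpose(transpose(A)) == A)                                  # True
```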

Matrix multiplication

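Multiplication of two matrices is defined if and only if the number of columns of the left matrix is the same as the number of rows of the right matrix. If A is an m-by-n matrix and B is an n-by-p matrix, then their matrix product AB is the m-by-p matrix whose entries are given by the dot product of the corresponding row of A and the corresponding column of B: [16]

\[ [AB]_{i,j} = a_{i,1} b_{1,j} + a_{i,2} b_{2,j} + \cdots + a_{i,n} b_{n,j} = \sum_{r=1}^{n} a_{i,r} b_{r,j}, \]

where 1 ≤ i ≤ m and 1 ≤ j ≤ p. [17] For example, the underlined entry 2340 in the product is calculated as (2 × 1000) + (3 × 100) + (4 × 10) = 2340:

\[ \begin{bmatrix} \underline{2} & \underline{3} & \underline{4} \\ 1 & 0 & 0 \end{bmatrix} \begin{bmatrix} 0 & \underline{1000} \\ 1 & \underline{100} \\ 0 & \underline{10} \end{bmatrix} = \begin{bmatrix} 3 & \underline{2340} \\ 0 & 1000 \end{bmatrix}. \]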

Matrix multiplication satisfies the rules (AB)C = A(BC) (associativity), and (A + B)C = AC + BC as well as C(A + B) = CA + CB (left and right distributivity), whenever the size of the matrices is such that the various products are defined. [18] The product AB may be defined without BA being defined, namely if A and B are m-by-n and n-by-k matrices, respectively, and m ≠ k. Even if both products are defined, they generally need not be equal, that is:

\[ AB \neq BA. \]

In other words, matrix multiplication is not commutative, in marked contrast to (rational, real, or complex) numbers, whose product is independent of the order of the factors. [16] An example of two matrices not commuting with each other is:

\[ \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ 0 & 3 \end{bmatrix}, \quad \text{whereas} \quad \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} = \begin{bmatrix} 3 & 4 \\ 0 & 0 \end{bmatrix}. \]
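
The row-by-column definition and the failure of commutativity can both be seen in a short sketch. The function name `matmul` and the sample matrices are illustrative; the code follows the definition directly and makes no attempt at efficiency.

```python
def matmul(A, B):
    """Product of an m-by-n matrix A and an n-by-p matrix B (lists of rows)."""
    n = len(B)
    if any(len(row) != n for row in A):
        raise ValueError("columns of A must equal rows of B")
    p = len(B[0])
    # Entry (i, j) is the dot product of row i of A with column j of B.
    return [[sum(A[i][r] * B[r][j] for r in range(n)) for j in range(p)]
            for i in range(len(A))]

A = [[1, 2], [3, 4]]
B = [[0, 1], [0, 0]]
print(matmul(A, B))  # [[0, 1], [0, 3]]
print(matmul(B, A))  # [[3, 4], [0, 0]]
```

Here both products are defined, yet they differ.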

Besides the ordinary matrix multiplication just described, other less frequently used operations on matrices that can be considered forms of multiplication also exist, such as the Hadamard product and the Kronecker product. [19] They arise in solving matrix equations such as the Sylvester equation.

Row operations

There are three types of row operations:

  1. row addition, that is, adding a row to another;
  2. row multiplication, that is, multiplying all entries of a row by a non-zero constant;
  3. row switching, that is, interchanging two rows of a matrix.

These operations are used in several ways, including solving linear equations and finding matrix inverses.
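
The three row operations listed above can be sketched as small functions. The names are illustrative, rows are indexed from 0 as is usual in Python, and `add_row` takes an optional `factor` so that a multiple of a row can be added (with `factor=1` it is plain row addition).

```python
from fractions import Fraction

def add_row(A, src, dst, factor=1):
    """Row addition: add factor * (row src) to row dst; returns a new matrix."""
    out = [row[:] for row in A]
    out[dst] = [d + factor * s for d, s in zip(A[dst], A[src])]
    return out

def scale_row(A, i, c):
    """Row multiplication: multiply every entry of row i by a non-zero constant c."""
    if c == 0:
        raise ValueError("the constant must be non-zero")
    out = [row[:] for row in A]
    out[i] = [c * a for a in A[i]]
    return out

def swap_rows(A, i, j):
    """Row switching: interchange rows i and j."""
    out = [row[:] for row in A]
    out[i], out[j] = out[j], out[i]
    return out

A = [[2, 1], [4, 5]]
step1 = add_row(A, 0, 1, factor=-2)          # [[2, 1], [0, 3]]
step2 = scale_row(step1, 1, Fraction(1, 3))  # second row becomes [0, 1]
step3 = swap_rows(step2, 0, 1)
print(step1, step3 == [[0, 1], [2, 1]])
```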

Submatrix

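A submatrix of a matrix is obtained by deleting any collection of rows and/or columns. [20] [21] [22] For example, from the following 3-by-4 matrix, we can construct a 2-by-3 submatrix by removing row 3 and column 2:

\[ \mathbf{A} = \begin{bmatrix} 1 & 2 & 3 & 4 \\ 5 & 6 & 7 & 8 \\ 9 & 10 & 11 & 12 \end{bmatrix} \rightarrow \begin{bmatrix} 1 & 3 & 4 \\ 5 & 7 & 8 \end{bmatrix}. \]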

The minors and cofactors of a matrix are found by computing the determinant of certain submatrices. [22] [23]

A principal submatrix is a square submatrix obtained by removing certain rows and columns. The definition varies from author to author. According to some authors, a principal submatrix is a submatrix in which the set of row indices that remain is the same as the set of column indices that remain. [24] [25] Other authors define a principal submatrix as one in which the first k rows and columns, for some number k, are the ones that remain; [26] this type of submatrix has also been called a leading principal submatrix. [27]
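
Both constructions can be sketched in a few lines. The function names and the sample 3-by-4 matrix are illustrative; row and column indices are 1-based to match the text.

```python
def submatrix(A, drop_rows=(), drop_cols=()):
    """Delete the given rows and columns (1-based indices) from A."""
    return [[a for j, a in enumerate(row, start=1) if j not in drop_cols]
            for i, row in enumerate(A, start=1) if i not in drop_rows]

def leading_principal(A, k):
    """Keep only the first k rows and the first k columns of A."""
    return [row[:k] for row in A[:k]]

A = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 10, 11, 12]]
print(submatrix(A, drop_rows={3}, drop_cols={2}))  # [[1, 3, 4], [5, 7, 8]]
print(leading_principal(A, 2))                     # [[1, 2], [5, 6]]
```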

Matrices can be used to compactly write and work with multiple linear equations, that is, systems of linear equations. For example, if A is an m-by-n matrix, x designates a column vector (that is, n×1-matrix) of n variables \(x_1, x_2, \ldots, x_n\), and b is an m×1-column vector, then the matrix equation

\[ A\mathbf{x} = \mathbf{b} \]

is equivalent to the system of linear equations [28]

\[ \begin{aligned} a_{1,1} x_1 + a_{1,2} x_2 + \cdots + a_{1,n} x_n &= b_1 \\ &\ \ \vdots \\ a_{m,1} x_1 + a_{m,2} x_2 + \cdots + a_{m,n} x_n &= b_m \end{aligned} \]

Using matrices, this can be solved more compactly than would be possible by writing out all the equations separately. If n = m and the equations are independent, then this can be done by writing

\[ \mathbf{x} = A^{-1} \mathbf{b}, \]

where \(A^{-1}\) is the inverse matrix of A. If A has no inverse, solutions, if any, can be found using its generalized inverse.
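
In practice a square system is usually solved by elimination rather than by forming the inverse explicitly. The sketch below uses Gauss-Jordan elimination with exact fractions; this is one possible method chosen for illustration, and the function name `solve` and the sample system are assumptions.

```python
from fractions import Fraction

def solve(A, b):
    """Solve Ax = b for a square invertible A by Gauss-Jordan elimination."""
    n = len(A)
    # Augmented matrix [A | b] in exact rational arithmetic.
    M = [[Fraction(v) for v in row] + [Fraction(bi)] for row, bi in zip(A, b)]
    for col in range(n):
        # Find a row at or below the diagonal with a non-zero entry in this column.
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        M[col], M[pivot] = M[pivot], M[col]
        # Scale the pivot row so that the pivot equals 1.
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        # Clear this column in every other row.
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[col])]
    return [row[n] for row in M]

# 2x + y = 3 and x + 3y = 5
x = solve([[2, 1], [1, 3]], [3, 5])
print(x)  # [Fraction(4, 5), Fraction(7, 5)]
```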

Matrices and matrix multiplication reveal their essential features when related to linear transformations, also known as linear maps. A real m-by-n matrix A gives rise to a linear transformation \( \mathbb{R}^{n} \to \mathbb{R}^{m} \) mapping each vector x in \( \mathbb{R}^{n} \) to the (matrix) product Ax, which is a vector in \( \mathbb{R}^{m} \). Conversely, each linear transformation \( f: \mathbb{R}^{n} \to \mathbb{R}^{m} \) arises from a unique m-by-n matrix A: explicitly, the (i, j)-entry of A is the i-th coordinate of \( f(e_j) \), where \( e_j = (0, \ldots, 0, 1, 0, \ldots, 0) \) is the unit vector with 1 in the j-th position and 0 elsewhere. The matrix A is said to represent the linear map f, and A is called the transformation matrix of f.

For example, the 2×2 matrix

\[ \mathbf{A} = \begin{bmatrix} a & c \\ b & d \end{bmatrix} \]

can be viewed as the transform of the unit square into a parallelogram with vertices at (0, 0), (a, b), (a + c, b + d), and (c, d). The parallelogram pictured at the right is obtained by multiplying A with each of the column vectors \( \begin{bmatrix} 0 \\ 0 \end{bmatrix} \), \( \begin{bmatrix} 1 \\ 0 \end{bmatrix} \), \( \begin{bmatrix} 1 \\ 1 \end{bmatrix} \), and \( \begin{bmatrix} 0 \\ 1 \end{bmatrix} \) in turn. These vectors define the vertices of the unit square.
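
The vertex computation can be reproduced directly. In this sketch the values of a, b, c and d are arbitrary illustrative numbers, and the matrix is taken to have columns (a, b) and (c, d), which is what the listed vertices require.

```python
def apply(A, v):
    """Multiply the matrix A by the column vector v (a flat list)."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

a, b, c, d = 2, 1, 1, 3          # illustrative values
A = [[a, c],
     [b, d]]                     # columns are (a, b) and (c, d)

unit_square = [[0, 0], [1, 0], [1, 1], [0, 1]]
parallelogram = [apply(A, v) for v in unit_square]
print(parallelogram)  # [[0, 0], [2, 1], [3, 4], [1, 3]]
```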

The following table shows several 2×2 real matrices with the associated linear maps of R 2 . The blue original is mapped to the green grid and shapes. The origin (0,0) is marked with a black point.

Horizontal shear with m = 1.25: \( \begin{bmatrix} 1 & 1.25 \\ 0 & 1 \end{bmatrix} \)
Reflection through the vertical axis: \( \begin{bmatrix} -1 & 0 \\ 0 & 1 \end{bmatrix} \)
Squeeze mapping with r = 3/2: \( \begin{bmatrix} \frac{3}{2} & 0 \\ 0 & \frac{2}{3} \end{bmatrix} \)
Scaling by a factor of 3/2: \( \begin{bmatrix} \frac{3}{2} & 0 \\ 0 & \frac{3}{2} \end{bmatrix} \)
Rotation by π/6 = 30°: \( \begin{bmatrix} \cos\left(\frac{\pi}{6}\right) & -\sin\left(\frac{\pi}{6}\right) \\ \sin\left(\frac{\pi}{6}\right) & \cos\left(\frac{\pi}{6}\right) \end{bmatrix} \)

Under the 1-to-1 correspondence between matrices and linear maps, matrix multiplication corresponds to composition of maps: [29] if a k-by-m matrix B represents another linear map \( g: \mathbb{R}^{m} \to \mathbb{R}^{k} \), then the composition g ∘ f is represented by BA since

(g ∘ f)(x) = g(f(x)) = g(Ax) = B(Ax) = (BA)x.

The last equality follows from the above-mentioned associativity of matrix multiplication.
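
A quick numerical check of this identity, as a sketch: the matrices `A` (3-by-2) and `B` (2-by-3) and the vector `x` are illustrative choices.

```python
def apply(M, v):
    """Matrix-vector product Mv."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def matmul(A, B):
    """Matrix product AB."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

A = [[1, 2], [0, 1], [3, 0]]   # represents f: R^2 -> R^3
B = [[1, 0, 2], [0, 1, 1]]     # represents g: R^3 -> R^2
x = [5, -1]

print(apply(B, apply(A, x)))   # g(f(x)) = [33, 14]
print(apply(matmul(B, A), x))  # (BA)x   = [33, 14]
```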

The rank of a matrix A is the maximum number of linearly independent row vectors of the matrix, which is the same as the maximum number of linearly independent column vectors. [30] Equivalently it is the dimension of the image of the linear map represented by A. [31] The rank–nullity theorem states that the dimension of the kernel of a matrix plus the rank equals the number of columns of the matrix. [32]
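
One way to compute the rank is to row-reduce the matrix and count the pivots. The sketch below does this with exact fractions; the function name `rank` and the sample matrix are illustrative, and the method is one choice among several.

```python
from fractions import Fraction

def rank(A):
    """Rank of A (a list of rows), found by row reduction with exact fractions."""
    M = [[Fraction(v) for v in row] for row in A]
    rows = len(M)
    cols = len(M[0]) if M else 0
    r = 0  # number of pivots found so far
    for c in range(cols):
        if r == rows:
            break
        pivot = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, rows):
            f = M[i][c] / M[r][c]
            M[i] = [vi - f * vr for vi, vr in zip(M[i], M[r])]
        r += 1
    return r

A = [[1, 2, 3], [2, 4, 6], [1, 0, 1]]   # second row is twice the first
At = [list(col) for col in zip(*A)]
print(rank(A), rank(At))    # 2 2  (row rank equals column rank)
print(len(A[0]) - rank(A))  # 1    (dimension of the kernel, by rank-nullity)
```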

A square matrix is a matrix with the same number of rows and columns. [11] An n-by-n matrix is known as a square matrix of order n. Any two square matrices of the same order can be added and multiplied. The entries aii form the main diagonal of a square matrix. They lie on the imaginary line that runs from the top left corner to the bottom right corner of the matrix.

Main types

Name Example with n = 3
Diagonal matrix \( \begin{bmatrix} a_{11} & 0 & 0 \\ 0 & a_{22} & 0 \\ 0 & 0 & a_{33} \end{bmatrix} \)
Lower triangular matrix \( \begin{bmatrix} a_{11} & 0 & 0 \\ a_{21} & a_{22} & 0 \\ a_{31} & a_{32} & a_{33} \end{bmatrix} \)
Upper triangular matrix \( \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ 0 & a_{22} & a_{23} \\ 0 & 0 & a_{33} \end{bmatrix} \)

Diagonal and triangular matrix

If all entries of A below the main diagonal are zero, A is called an upper triangular matrix. Similarly if all entries of A above the main diagonal are zero, A is called a lower triangular matrix. If all entries outside the main diagonal are zero, A is called a diagonal matrix.

Identity matrix

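The identity matrix \(I_n\) of size n is the n-by-n matrix in which all the elements on the main diagonal are equal to 1 and all other elements are equal to 0, for example,

\[ I_1 = \begin{bmatrix} 1 \end{bmatrix}, \quad I_2 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \quad \ldots, \quad I_n = \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix}. \]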

It is a square matrix of order n, and also a special kind of diagonal matrix. It is called an identity matrix because multiplication with it leaves a matrix unchanged:

\( A I_n = I_m A = A \) for any m-by-n matrix A.

A nonzero scalar multiple of an identity matrix is called a scalar matrix. If the matrix entries come from a field, the scalar matrices form a group, under matrix multiplication, that is isomorphic to the multiplicative group of nonzero elements of the field.

Symmetric or skew-symmetric matrix

A square matrix A that is equal to its transpose, that is, \( A = A^{T} \), is a symmetric matrix. If instead, A is equal to the negative of its transpose, that is, \( A = -A^{T} \), then A is a skew-symmetric matrix. In complex matrices, symmetry is often replaced by the concept of Hermitian matrices, which satisfy \( A^{*} = A \), where the star or asterisk denotes the conjugate transpose of the matrix, that is, the transpose of the complex conjugate of A.

By the spectral theorem, real symmetric matrices and complex Hermitian matrices have an eigenbasis; that is, every vector is expressible as a linear combination of eigenvectors. In both cases, all eigenvalues are real. [33] This theorem can be generalized to infinite-dimensional situations related to matrices with infinitely many rows and columns, see below.

Invertible matrix and its inverse

A square matrix A is called invertible or non-singular if there exists a matrix B such that

\( AB = BA = I_n \), [34] [35]

where \(I_n\) is the n×n identity matrix with 1s on the main diagonal and 0s elsewhere. If B exists, it is unique and is called the inverse matrix of A, denoted \(A^{-1}\).

Definite matrix

A symmetric n×n-matrix A is called positive-definite if the associated quadratic form

\( f(x) = x^{T} A x \)

has a positive value for every nonzero vector x in \( \mathbb{R}^{n} \). If f(x) only yields negative values then A is negative-definite; if f does produce both negative and positive values then A is indefinite. [36] If the quadratic form f yields only non-negative values (positive or zero), the symmetric matrix is called positive-semidefinite (or if only non-positive values, then negative-semidefinite); hence the matrix is indefinite precisely when it is neither positive-semidefinite nor negative-semidefinite.

A symmetric matrix is positive-definite if and only if all its eigenvalues are positive, that is, the matrix is positive-semidefinite and it is invertible. [37] The table at the right shows two possibilities for 2-by-2 matrices.

Allowing as input two different vectors instead yields the bilinear form associated to A:

\( B_A(x, y) = x^{T} A y \). [38]
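
The quadratic and bilinear forms are straightforward to evaluate. The sketch below uses two illustrative symmetric matrices; note that evaluating a few sample vectors can expose indefiniteness but cannot by itself prove positive-definiteness (for `P` that follows from the identity in the comment).

```python
def bilinear_form(A, x, y):
    """Return x^T A y for a square matrix A and vectors x, y (flat lists)."""
    n = len(A)
    return sum(x[i] * A[i][j] * y[j] for i in range(n) for j in range(n))

def quadratic_form(A, x):
    """Return x^T A x."""
    return bilinear_form(A, x, x)

P = [[2, -1], [-1, 2]]   # x^T P x = x1^2 + x2^2 + (x1 - x2)^2 > 0 for x != 0
N = [[1, 0], [0, -1]]    # x^T N x = x1^2 - x2^2 takes both signs: indefinite
samples = [[1, 0], [0, 1], [1, 1], [1, -1], [-3, 2]]

print([quadratic_form(P, x) for x in samples])  # [2, 2, 2, 6, 38]
print([quadratic_form(N, x) for x in samples])  # [1, -1, 0, 0, 5]
```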

Orthogonal matrix

An orthogonal matrix is a square matrix with real entries whose columns and rows are orthogonal unit vectors (that is, orthonormal vectors). Equivalently, a matrix A is orthogonal if its transpose is equal to its inverse:

\[ A^{T} = A^{-1}, \]

which entails

\[ A^{T} A = A A^{T} = I_n, \]

where \(I_n\) is the identity matrix of size n.

An orthogonal matrix A is necessarily invertible (with inverse \( A^{-1} = A^{T} \)), unitary (\( A^{-1} = A^{*} \)), and normal (\( A^{*}A = AA^{*} \)). The determinant of any orthogonal matrix is either +1 or −1. A special orthogonal matrix is an orthogonal matrix with determinant +1. As a linear transformation, every orthogonal matrix with determinant +1 is a pure rotation without reflection, i.e., the transformation preserves the orientation of the transformed structure, while every orthogonal matrix with determinant −1 reverses the orientation, i.e., is a composition of a pure reflection and a (possibly null) rotation. The identity matrices have determinant 1, and are pure rotations by an angle zero.
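
As an illustration, the sketch below checks orthogonality for a rotation and a reflection in the plane and prints their determinants. The function names and the tolerance `tol` are assumptions; a tolerance is needed because the rotation entries are floating-point numbers.

```python
import math

def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def is_orthogonal(A, tol=1e-12):
    """Check that A^T A equals the identity up to a floating-point tolerance."""
    P = matmul(transpose(A), A)
    n = len(A)
    return all(abs(P[i][j] - (1 if i == j else 0)) <= tol
               for i in range(n) for j in range(n))

def det2(A):
    """Determinant of a 2-by-2 matrix."""
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

t = math.pi / 6
rotation = [[math.cos(t), -math.sin(t)], [math.sin(t), math.cos(t)]]
reflection = [[-1, 0], [0, 1]]

print(is_orthogonal(rotation), round(det2(rotation), 12))  # True 1.0
print(is_orthogonal(reflection), det2(reflection))         # True -1
```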

The complex analogue of an orthogonal matrix is a unitary matrix.

Main operations

Trace

The trace, tr(A), of a square matrix A is the sum of its diagonal entries. While matrix multiplication is not commutative as mentioned above, the trace of the product of two matrices is independent of the order of the factors:

\[ \operatorname{tr}(AB) = \operatorname{tr}(BA). \]

This is immediate from the definition of matrix multiplication:

\[ \operatorname{tr}(AB) = \sum_{i=1}^{m} \sum_{j=1}^{n} a_{ij} b_{ji} = \operatorname{tr}(BA). \]

It follows that the trace of the product of more than two matrices is independent of cyclic permutations of the matrices; however, this does not in general apply for arbitrary permutations (for example, tr(ABC) ≠ tr(BAC), in general). Also, the trace of a matrix is equal to that of its transpose, that is,

\[ \operatorname{tr}(A) = \operatorname{tr}(A^{T}). \]
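
A small example shows the cyclic property holding while a non-cyclic reordering changes the trace. The three matrices below are illustrative choices.

```python
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def trace(A):
    """Sum of the diagonal entries of a square matrix."""
    return sum(A[i][i] for i in range(len(A)))

A = [[1, 2], [3, 4]]
B = [[0, 1], [0, 0]]
C = [[1, 0], [0, 2]]

print(trace(matmul(A, B)), trace(matmul(B, A)))  # 3 3
print(trace(matmul(matmul(A, B), C)))            # 6  tr(ABC)
print(trace(matmul(matmul(B, C), A)))            # 6  tr(BCA), a cyclic permutation
print(trace(matmul(matmul(B, A), C)))            # 3  tr(BAC), not a cyclic permutation
```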

Determinant

The determinant of a square matrix A (denoted det(A) or |A| [3] ) is a number encoding certain properties of the matrix. A matrix is invertible if and only if its determinant is nonzero. Its absolute value equals the area (in R 2 ) or volume (in R 3 ) of the image of the unit square (or cube), while its sign corresponds to the orientation of the corresponding linear map: the determinant is positive if and only if the orientation is preserved.

The determinant of 2-by-2 matrices is given by

\[ \det \begin{bmatrix} a & b \\ c & d \end{bmatrix} = ad - bc. \]

The determinant of 3-by-3 matrices involves 6 terms (rule of Sarrus). The more lengthy Leibniz formula generalises these two formulae to all dimensions. [39]

The determinant of a product of square matrices equals the product of their determinants:

det(AB) = det(A) · det(B). [40]

Adding a multiple of any row to another row, or a multiple of any column to another column, does not change the determinant. Interchanging two rows or two columns affects the determinant by multiplying it by −1. [41] Using these operations, any matrix can be transformed to a lower (or upper) triangular matrix, and for such matrices, the determinant equals the product of the entries on the main diagonal; this provides a method to calculate the determinant of any matrix. Finally, the Laplace expansion expresses the determinant in terms of minors, that is, determinants of smaller matrices. [42] This expansion can be used for a recursive definition of determinants (taking as starting case the determinant of a 1-by-1 matrix, which is its unique entry, or even the determinant of a 0-by-0 matrix, which is 1), that can be seen to be equivalent to the Leibniz formula. Determinants can be used to solve linear systems using Cramer's rule, where the division of the determinants of two related square matrices equates to the value of each of the system's variables. [43]
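
Both approaches described above can be sketched side by side: reduction to upper triangular form (tracking the sign changes from row interchanges) and recursive Laplace expansion along the first row, starting from the 0-by-0 case. The function names and the sample matrix are illustrative.

```python
from fractions import Fraction

def det_elimination(A):
    """Determinant via reduction to upper triangular form."""
    M = [[Fraction(v) for v in row] for row in A]
    n = len(M)
    sign = 1
    for c in range(n):
        pivot = next((r for r in range(c, n) if M[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)   # a zero column below the diagonal: determinant is 0
        if pivot != c:
            M[c], M[pivot] = M[pivot], M[c]
            sign = -sign         # each row interchange multiplies the determinant by -1
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            M[r] = [vr - f * vc for vr, vc in zip(M[r], M[c])]  # leaves the determinant unchanged
    result = Fraction(sign)
    for i in range(n):
        result *= M[i][i]        # product of the diagonal entries
    return result

def det_laplace(A):
    """Determinant via Laplace expansion along the first row."""
    n = len(A)
    if n == 0:
        return 1                 # determinant of the 0-by-0 matrix
    return sum((-1) ** j * A[0][j]
               * det_laplace([row[:j] + row[j + 1:] for row in A[1:]])
               for j in range(n))

A = [[0, 2, 1], [1, 1, 0], [2, 0, 3]]
print(det_elimination(A), det_laplace(A))  # -8 -8
```

The elimination version needs on the order of n³ operations, whereas the naive recursive expansion grows factorially with n.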

Eigenvalues and eigenvectors

A number λ and a non-zero vector v satisfying

\[ A v = \lambda v \]

are called an eigenvalue and an eigenvector of A, respectively. [44] [45] The number λ is an eigenvalue of an n×n-matrix A if and only if \( A - \lambda I_n \) is not invertible, which is equivalent to

\[ \det(A - \lambda I_n) = 0. \]

The polynomial \(p_A\) in an indeterminate X given by evaluation of the determinant \( \det(X I_n - A) \) is called the characteristic polynomial of A. It is a monic polynomial of degree n. Therefore the polynomial equation \( p_A(\lambda) = 0 \) has at most n different solutions, that is, eigenvalues of the matrix. [47] They may be complex even if the entries of A are real. According to the Cayley–Hamilton theorem, \( p_A(A) = 0 \), that is, the result of substituting the matrix itself into its own characteristic polynomial yields the zero matrix.
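
For a 2-by-2 matrix, expanding det(X I − A) gives the characteristic polynomial X² − tr(A) X + det(A). The sketch below uses that special case to compute eigenvalues and to check the Cayley–Hamilton theorem; the function names are illustrative, and the sample matrix is chosen to have real entries but complex eigenvalues.

```python
import cmath

def char_poly_2x2(A):
    """Coefficients (1, -tr A, det A) of p_A(X) = X^2 - tr(A) X + det(A)."""
    (a, b), (c, d) = A
    return 1, -(a + d), a * d - b * c

def eigenvalues_2x2(A):
    """Roots of the characteristic polynomial, by the quadratic formula."""
    _, p, q = char_poly_2x2(A)
    root = cmath.sqrt(p * p - 4 * q)
    return (-p + root) / 2, (-p - root) / 2

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

A = [[0, -1], [1, 0]]            # rotation by 90 degrees
_, p, q = char_poly_2x2(A)
print(char_poly_2x2(A))          # (1, 0, 1), i.e. X^2 + 1
print(eigenvalues_2x2(A))        # the two roots are i and -i

# Cayley-Hamilton: A^2 + p*A + q*I is the zero matrix.
A2 = matmul(A, A)
p_of_A = [[A2[i][j] + p * A[i][j] + (q if i == j else 0) for j in range(2)]
          for i in range(2)]
print(p_of_A)                    # [[0, 0], [0, 0]]
```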

Matrix calculations can be often performed with different techniques. Many problems can be solved by both direct algorithms or iterative approaches. For example, the eigenvectors of a square matrix can be obtained by finding a sequence of vectors xn converging to an eigenvector when n tends to infinity. [48]
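
One iterative scheme of this kind is power iteration, which repeatedly multiplies a vector by the matrix and rescales it. The sketch below is a minimal version under the assumptions that the matrix has a single eigenvalue of largest absolute value and that the starting vector is not orthogonal to its eigenvector; the function name, step count and sample matrix are illustrative.

```python
def power_iteration(A, steps=100):
    """Approximate the dominant eigenvalue and an eigenvector of A."""
    n = len(A)
    x = [1.0] * n
    for _ in range(steps):
        y = [sum(a * xi for a, xi in zip(row, x)) for row in A]
        norm = max(abs(v) for v in y)
        x = [v / norm for v in y]   # rescale to avoid overflow
    # Rayleigh quotient (x^T A x) / (x^T x) as the eigenvalue estimate.
    Ax = [sum(a * xi for a, xi in zip(row, x)) for row in A]
    lam = sum(u * v for u, v in zip(Ax, x)) / sum(v * v for v in x)
    return lam, x

A = [[2, 1], [1, 3]]   # eigenvalues are (5 + sqrt(5))/2 and (5 - sqrt(5))/2
lam, v = power_iteration(A)
print(round(lam, 6))   # 3.618034
```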

To choose the most appropriate algorithm for each specific problem, it is important to determine both the effectiveness and precision of all the available algorithms. The domain studying these matters is called numerical linear algebra. [49] As with other numerical situations, two main aspects are the complexity of algorithms and their numerical stability.

Determining the complexity of an algorithm means finding upper bounds or estimates of how many elementary operations such as additions and multiplications of scalars are necessary to perform some algorithm, for example, multiplication of matrices. Calculating the matrix product of two n-by-n matrices using the definition given above needs \(n^{3}\) multiplications, since for any of the \(n^{2}\) entries of the product, n multiplications are necessary. The Strassen algorithm outperforms this "naive" algorithm; it needs only \(n^{2.807}\) multiplications. [50] A refined approach also incorporates specific features of the computing devices.

In many practical situations additional information about the matrices involved is known. An important case are sparse matrices, that is, matrices most of whose entries are zero. There are specifically adapted algorithms for, say, solving linear systems Ax = b for sparse matrices A, such as the conjugate gradient method. [51]

An algorithm is, roughly speaking, numerically stable, if little deviations in the input values do not lead to big deviations in the result. For example, calculating the inverse of a matrix via Laplace expansion (adj(A) denotes the adjugate matrix of A)

A −1 = adj(A) / det(A)

may lead to significant rounding errors if the determinant of the matrix is very small. The norm of a matrix can be used to capture the conditioning of linear algebraic problems, such as computing a matrix's inverse. [52]

Most computer programming languages support arrays but are not designed with built-in commands for matrices. Instead, available external libraries provide matrix operations on arrays, in nearly all currently used programming languages. Matrix manipulation was among the earliest numerical applications of computers. [53] The original Dartmouth BASIC had built-in commands for matrix arithmetic on arrays from its second edition implementation in 1964. As early as the 1970s, some engineering desktop computers such as the HP 9830 had ROM cartridges to add BASIC commands for matrices. Some computer languages such as APL were designed to manipulate matrices, and various mathematical programs can be used to aid computing with matrices. [54]

There are several methods to render matrices into a more easily accessible form. They are generally referred to as matrix decomposition or matrix factorization techniques. The interest of all these techniques is that they preserve certain properties of the matrices in question, such as determinant, rank, or inverse, so that these quantities can be calculated after applying the transformation, or that certain matrix operations are algorithmically easier to carry out for some types of matrices.

The LU decomposition factors a matrix as a product of a lower (L) and an upper (U) triangular matrix. [55] Once this decomposition is calculated, linear systems can be solved more efficiently, by a simple technique called forward and back substitution. Likewise, inverses of triangular matrices are algorithmically easier to calculate. Gaussian elimination is a similar algorithm; it transforms any matrix to row echelon form. [56] Both methods proceed by multiplying the matrix by suitable elementary matrices, which correspond to permuting rows or columns and adding multiples of one row to another row. Singular value decomposition expresses any matrix A as a product \( UDV^{*} \), where U and V are unitary matrices and D is a diagonal matrix.
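
The following is a minimal sketch of an LU decomposition followed by forward and back substitution. It assumes that no row exchanges are needed (every pivot is non-zero) and uses exact fractions; practical implementations add pivoting. The function names and the sample system are illustrative.

```python
from fractions import Fraction

def lu(A):
    """Factor A = LU with L unit lower triangular and U upper triangular (no pivoting)."""
    n = len(A)
    L = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    U = [[Fraction(v) for v in row] for row in A]
    for c in range(n):
        if U[c][c] == 0:
            raise ValueError("zero pivot: this sketch does no row exchanges")
        for r in range(c + 1, n):
            L[r][c] = U[r][c] / U[c][c]   # multiplier used to clear entry (r, c)
            U[r] = [ur - L[r][c] * uc for ur, uc in zip(U[r], U[c])]
    return L, U

def solve_lu(L, U, b):
    """Solve LUx = b: forward substitution for Ly = b, then back substitution for Ux = y."""
    n = len(b)
    y = []
    for i in range(n):
        y.append(Fraction(b[i]) - sum(L[i][k] * y[k] for k in range(i)))
    x = [Fraction(0)] * n
    for i in reversed(range(n)):
        s = sum(U[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (y[i] - s) / U[i][i]
    return x

A = [[2, 1, 1], [4, 3, 3], [8, 7, 9]]
L, U = lu(A)
print(L == [[1, 0, 0], [2, 1, 0], [4, 3, 1]])  # True
print(U == [[2, 1, 1], [0, 1, 1], [0, 0, 2]])  # True
print(solve_lu(L, U, [4, 10, 24]) == [1, 1, 1])  # True
```

Once L and U are known, each additional right-hand side b costs only the two substitution passes.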

The eigendecomposition or diagonalization expresses A as a product \( VDV^{-1} \), where D is a diagonal matrix and V is a suitable invertible matrix. [57] If A can be written in this form, it is called diagonalizable. More generally, and applicable to all matrices, the Jordan decomposition transforms a matrix into Jordan normal form, that is to say matrices whose only nonzero entries are the eigenvalues \( \lambda_1 \) to \( \lambda_n \) of A, placed on the main diagonal and possibly entries equal to one directly above the main diagonal, as shown at the right. [58] Given the eigendecomposition, the n-th power of A (that is, n-fold iterated matrix multiplication) can be calculated via

\[ A^{n} = (VDV^{-1})^{n} = VDV^{-1} \, VDV^{-1} \cdots VDV^{-1} = V D^{n} V^{-1} \]

and the power of a diagonal matrix can be calculated by taking the corresponding powers of the diagonal entries, which is much easier than doing the exponentiation for A instead. This can be used to compute the matrix exponential \( e^{A} \), a need frequently arising in solving linear differential equations, matrix logarithms and square roots of matrices. [59] To avoid numerically ill-conditioned situations, further algorithms such as the Schur decomposition can be employed. [60]
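
The power formula can be checked on a small diagonalizable matrix. In the sketch below the matrix `A`, its eigenvector matrix `V`, the inverse `V_inv` and the eigenvalues are an illustrative hand-computed example, not output of a general eigenvalue routine.

```python
from fractions import Fraction

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def power_via_eigen(V, eigenvalues, V_inv, n):
    """Compute V D^n V^{-1}, where D is the diagonal matrix of eigenvalues."""
    k = len(eigenvalues)
    Dn = [[eigenvalues[i] ** n if i == j else 0 for j in range(k)] for i in range(k)]
    return matmul(matmul(V, Dn), V_inv)

A = [[2, 1], [1, 2]]
V = [[1, 1], [1, -1]]                 # columns are eigenvectors of A
half = Fraction(1, 2)
V_inv = [[half, half], [half, -half]]
eigenvalues = [3, 1]

# Reference value: multiply A by itself five times.
direct = [[1, 0], [0, 1]]
for _ in range(5):
    direct = matmul(direct, A)

print(direct)                                               # [[122, 121], [121, 122]]
print(power_via_eigen(V, eigenvalues, V_inv, 5) == direct)  # True
```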

Matrices can be generalized in different ways. Abstract algebra uses matrices with entries in more general fields or even rings, while linear algebra codifies properties of matrices in the notion of linear maps. It is possible to consider matrices with infinitely many columns and rows. Another extension is tensors, which can be seen as higher-dimensional arrays of numbers, as opposed to vectors, which can often be realized as sequences of numbers, while matrices are rectangular or two-dimensional arrays of numbers. [61] Matrices, subject to certain requirements, tend to form groups known as matrix groups. Similarly, under certain conditions matrices form rings known as matrix rings. Though the product of matrices is not in general commutative, certain matrices form fields known as matrix fields.

Matrices with more general entries

This article focuses on matrices whose entries are real or complex numbers. However, matrices can be considered with much more general types of entries than real or complex numbers. As a first step of generalization, any field, that is, a set where addition, subtraction, multiplication, and division operations are defined and well-behaved, may be used instead of R or C, for example rational numbers or finite fields. For example, coding theory makes use of matrices over finite fields. Wherever eigenvalues are considered, as these are roots of a polynomial they may exist only in a larger field than that of the entries of the matrix; for instance, they may be complex in the case of a matrix with real entries. The possibility to reinterpret the entries of a matrix as elements of a larger field (for example, to view a real matrix as a complex matrix whose entries happen to be all real) then allows considering each square matrix to possess a full set of eigenvalues. Alternatively one can consider only matrices with entries in an algebraically closed field, such as C, from the outset.

More generally, matrices with entries in a ring R are widely used in mathematics. [62] Rings are a more general notion than fields in that a division operation need not exist. The very same addition and multiplication operations of matrices extend to this setting, too. The set M(n, R) of all square n-by-n matrices over R is a ring called matrix ring, isomorphic to the endomorphism ring of the left R-module \(R^{n}\). [63] If the ring R is commutative, that is, its multiplication is commutative, then M(n, R) is a unitary noncommutative (unless n = 1) associative algebra over R. The determinant of square matrices over a commutative ring R can still be defined using the Leibniz formula; such a matrix is invertible if and only if its determinant is invertible in R, generalising the situation over a field F, where every nonzero element is invertible. [64] Matrices over superrings are called supermatrices. [65]

Matrices do not always have all their entries in the same ring, or even in any ring at all. One special but common case is block matrices, which may be considered as matrices whose entries themselves are matrices. The entries need not be square matrices, and thus need not be members of any ring; but their sizes must fulfill certain compatibility conditions.

Relationship to linear maps

Linear maps \( \mathbb{R}^{n} \to \mathbb{R}^{m} \) are equivalent to m-by-n matrices, as described above. More generally, any linear map f: V → W between finite-dimensional vector spaces can be described by a matrix \( A = (a_{ij}) \), after choosing bases \( v_1, \ldots, v_n \) of V, and \( w_1, \ldots, w_m \) of W (so n is the dimension of V and m is the dimension of W), which is such that

\[ f(v_j) = \sum_{i=1}^{m} a_{i,j} w_i \quad \text{for } j = 1, \ldots, n. \]

In other words, column j of A expresses the image of \(v_j\) in terms of the basis vectors \(w_i\) of W; thus this relation uniquely determines the entries of the matrix A. The matrix depends on the choice of the bases: different choices of bases give rise to different, but equivalent matrices. [66] Many of the above concrete notions can be reinterpreted in this light, for example, the transpose matrix \(A^{T}\) describes the transpose of the linear map given by A, with respect to the dual bases. [67]

These properties can be restated more naturally: the category of all matrices with entries in a field k with multiplication as composition is equivalent to the category of finite-dimensional vector spaces and linear maps over this field.

More generally, the set of m×n matrices can be used to represent the R-linear maps between the free modules \(R^{m}\) and \(R^{n}\) for an arbitrary ring R with unity. When n = m composition of these maps is possible, and this gives rise to the matrix ring of n×n matrices representing the endomorphism ring of \(R^{n}\).

Matrix groups

A group is a mathematical structure consisting of a set of objects together with a binary operation, that is, an operation combining any two objects to a third, subject to certain requirements. [68] A group in which the objects are matrices and the group operation is matrix multiplication is called a matrix group. [69] [70] Since in a group every element must be invertible, the most general matrix groups are the groups of all invertible matrices of a given size, called the general linear groups.

Any property of matrices that is preserved under matrix products and inverses can be used to define further matrix groups. For example, matrices with a given size and with a determinant of 1 form a subgroup of (that is, a smaller group contained in) their general linear group, called a special linear group. [71] Orthogonal matrices, determined by the condition

\[ M^{T} M = I, \]

form the orthogonal group. [72] Every orthogonal matrix has determinant 1 or −1. Orthogonal matrices with determinant 1 form a subgroup called special orthogonal group.

Every finite group is isomorphic to a matrix group, as one can see by considering the regular representation of the symmetric group. [73] General groups can be studied using matrix groups, which are comparatively well understood, by means of representation theory. [74]

Infinite matrices

It is also possible to consider matrices with infinitely many rows and/or columns [75] even if, being infinite objects, one cannot write down such matrices explicitly. All that matters is that for every element in the set indexing rows, and every element in the set indexing columns, there is a well-defined entry (these index sets need not even be subsets of the natural numbers). The basic operations of addition, subtraction, scalar multiplication, and transposition can still be defined without problem; however, matrix multiplication may involve infinite summations to define the resulting entries, and these are not defined in general.

If infinite matrices are used to describe linear maps, then only those matrices can be used all of whose columns have but a finite number of nonzero entries, for the following reason. For a matrix A to describe a linear map f: V → W, bases for both spaces must have been chosen; recall that by definition this means that every vector in the space can be written uniquely as a (finite) linear combination of basis vectors, so that written as a (column) vector v of coefficients, only finitely many entries \(v_i\) are nonzero. Now the columns of A describe the images by f of individual basis vectors of V in the basis of W, which is only meaningful if these columns have only finitely many nonzero entries. There is no restriction on the rows of A however: in the product A·v there are only finitely many nonzero coefficients of v involved, so every one of its entries, even if it is given as an infinite sum of products, involves only finitely many nonzero terms and is therefore well defined. Moreover, this amounts to forming a linear combination of the columns of A that effectively involves only finitely many of them, whence the result has only finitely many nonzero entries because each of those columns does. Products of two matrices of the given type are well defined (provided that the column-index and row-index sets match), are of the same type, and correspond to the composition of linear maps.

If R is a normed ring, then the condition of row or column finiteness can be relaxed. With the norm in place, absolutely convergent series can be used instead of finite sums. For example, the matrices whose column sums are absolutely convergent sequences form a ring. Analogously, the matrices whose row sums are absolutely convergent series also form a ring.

Infinite matrices can also be used to describe operators on Hilbert spaces, where convergence and continuity questions arise, which again results in certain constraints that must be imposed. However, the explicit point of view of matrices tends to obfuscate the matter, [76] and the abstract and more powerful tools of functional analysis can be used instead.

Empty matrices

An empty matrix is a matrix in which the number of rows or columns (or both) is zero. [77] [78] Empty matrices help dealing with maps involving the zero vector space. For example, if A is a 3-by-0 matrix and B is a 0-by-3 matrix, then AB is the 3-by-3 zero matrix corresponding to the null map from a 3-dimensional space V to itself, while BA is a 0-by-0 matrix. There is no common notation for empty matrices, but most computer algebra systems allow creating and computing with them. The determinant of the 0-by-0 matrix is 1, as follows from regarding the empty product occurring in the Leibniz formula for the determinant as 1. This value is also consistent with the fact that the identity map from any finite-dimensional space to itself has determinant 1, a fact that is often used as a part of the characterization of determinants.

There are numerous applications of matrices, both in mathematics and other sciences. Some of them merely take advantage of the compact representation of a set of numbers in a matrix. For example, in game theory and economics, the payoff matrix encodes the payoff for two players, depending on which out of a given (finite) set of alternatives the players choose. [79] Text mining and automated thesaurus compilation make use of document-term matrices such as tf-idf to track frequencies of certain words in several documents. [80]

Complex numbers can be represented by particular real 2-by-2 matrices via

\[ a + ib \leftrightarrow \begin{bmatrix} a & -b \\ b & a \end{bmatrix}, \]

under which addition and multiplication of complex numbers and matrices correspond to each other. For example, 2-by-2 rotation matrices represent the multiplication with some complex number of absolute value 1, as above. A similar interpretation is possible for quaternions [81] and Clifford algebras in general.
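The correspondence between complex numbers and real 2-by-2 matrices can be checked numerically. The following is a minimal sketch, assuming the usual representation of a + ib by the matrix with rows (a, −b) and (b, a); the helper names `as_matrix`, `matmul2` and `matadd2` are illustrative and not part of the original text.

```python
def as_matrix(z):
    """Represent z = a + ib as the real 2-by-2 matrix [[a, -b], [b, a]]."""
    return [[z.real, -z.imag], [z.imag, z.real]]

def matmul2(m, n):
    """Product of two 2-by-2 matrices given as nested lists."""
    return [[sum(m[i][k] * n[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def matadd2(m, n):
    """Entrywise sum of two 2-by-2 matrices."""
    return [[m[i][j] + n[i][j] for j in range(2)] for i in range(2)]

z, w = complex(1, 2), complex(3, -1)

# Multiplying (or adding) the matrices agrees with multiplying (or adding)
# the complex numbers first and converting afterwards.
product_ok = matmul2(as_matrix(z), as_matrix(w)) == as_matrix(z * w)
sum_ok = matadd2(as_matrix(z), as_matrix(w)) == as_matrix(z + w)
print(product_ok, sum_ok)
```

The sample values are small integers, so the floating-point comparison is exact here; for general inputs a tolerance would be more appropriate.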

Early encryption techniques such as the Hill cipher also used matrices. However, due to the linear nature of matrices, these codes are comparatively easy to break. [82] Computer graphics uses matrices both to represent objects and to calculate transformations of objects using affine rotation matrices to accomplish tasks such as projecting a three-dimensional object onto a two-dimensional screen, corresponding to a theoretical camera observation. [83] Matrices over a polynomial ring are important in the study of control theory.

Chemistry makes use of matrices in various ways, particularly since the use of quantum theory to discuss molecular bonding and spectroscopy. Examples are the overlap matrix and the Fock matrix used in solving the Roothaan equations to obtain the molecular orbitals of the Hartree–Fock method.

Graph theory

The adjacency matrix of a finite graph is a basic notion of graph theory. [84] It records which vertices of the graph are connected by an edge. Matrices containing just two different values (1 and 0 meaning for example "yes" and "no", respectively) are called logical matrices. The distance (or cost) matrix contains information about distances of the edges. [85] These concepts can be applied to websites connected by hyperlinks or cities connected by roads etc., in which case (unless the connection network is extremely dense) the matrices tend to be sparse, that is, contain few nonzero entries. Therefore, specifically tailored matrix algorithms can be used in network theory.

Analysis and geometry

The Hessian matrix of a differentiable function ƒ: R^n → R consists of the second derivatives of ƒ with respect to the several coordinate directions, that is, [86]

\[ H(f) = \left[ \frac{\partial^2 f}{\partial x_i \, \partial x_j} \right]. \]

It encodes information about the local growth behaviour of the function: given a critical point x = (x_1, ..., x_n), that is, a point where the first partial derivatives ∂ƒ/∂x_i of ƒ vanish, the function has a local minimum if the Hessian matrix is positive definite. Quadratic programming can be used to find global minima or maxima of quadratic functions closely related to the ones attached to matrices (see above). [87]

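Another matrix frequently used in geometrical situations is the Jacobi matrix of a differentiable map f: R^n → R^m. If f_1, ..., f_m denote the components of f, then the Jacobi matrix is defined as [88]

\[ J(f) = \left[ \frac{\partial f_i}{\partial x_j} \right]_{1 \le i \le m,\ 1 \le j \le n}. \]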

If n > m, and if the rank of the Jacobi matrix attains its maximal value m, f is locally invertible at that point, by the implicit function theorem. [89]

Partial differential equations can be classified by considering the matrix of coefficients of the highest-order differential operators of the equation. For elliptic partial differential equations this matrix is positive definite, which has a decisive influence on the set of possible solutions of the equation in question. [90]

The finite element method is an important numerical method to solve partial differential equations, widely applied in simulating complex physical systems. It attempts to approximate the solution to some equation by piecewise linear functions, where the pieces are chosen concerning a sufficiently fine grid, which in turn can be recast as a matrix equation. [91]

Probability theory and statistics

Stochastic matrices are square matrices whose rows are probability vectors, that is, whose entries are non-negative and sum up to one. Stochastic matrices are used to define Markov chains with finitely many states. [92] A row of the stochastic matrix gives the probability distribution for the next position of some particle currently in the state that corresponds to the row. Properties of the Markov chain, like absorbing states, that is, states that any particle attains eventually, can be read off the eigenvectors of the transition matrices. [93]
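How a stochastic matrix drives a Markov chain can be illustrated with a small example. This is only a sketch: the 3-state matrix `P` and the function name `step` are made up for illustration, with state 2 chosen to be absorbing.

```python
# Hypothetical 3-state transition matrix; each row is a probability vector.
P = [
    [0.5, 0.5, 0.0],
    [0.25, 0.5, 0.25],
    [0.0, 0.0, 1.0],   # state 2 is absorbing: once entered, it is never left
]

def step(dist, P):
    """One step of the chain: multiply the row vector dist by P."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]

rows_ok = all(abs(sum(row) - 1.0) < 1e-12 for row in P)

# Start in state 0 and iterate; the probability mass drifts to state 2.
dist = [1.0, 0.0, 0.0]
for _ in range(200):
    dist = step(dist, P)
print(dist)
```

After many steps almost all of the probability sits in the absorbing state, which matches the description of absorbing states as states that a particle attains eventually.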

Statistics also makes use of matrices in many different forms. [94] Descriptive statistics is concerned with describing data sets, which can often be represented as data matrices, which may then be subjected to dimensionality reduction techniques. The covariance matrix encodes the mutual variance of several random variables. [95] Another technique using matrices is linear least squares, a method that approximates a finite set of pairs (x_1, y_1), (x_2, y_2), ..., (x_N, y_N) by a linear function

y_i ≈ a x_i + b, i = 1, ..., N

which can be formulated in terms of matrices, related to the singular value decomposition of matrices. [96]
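As a rough illustration of the matrix formulation, the sketch below fits a line by solving the 2-by-2 normal equations (AᵀA)(a, b)ᵀ = Aᵀy, where A has rows (x_i, 1). This uses the normal equations rather than the singular value decomposition mentioned above, and the function name `fit_line` and the sample data are assumptions made for the example.

```python
def fit_line(xs, ys):
    """Least-squares fit of y ≈ a*x + b via the 2-by-2 normal equations."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    # A^T A = [[sxx, sx], [sx, n]] and A^T y = [sxy, sy]; solve by Cramer's rule.
    det = n * sxx - sx * sx
    if det == 0:
        raise ValueError("all x values coincide; the line is not determined")
    a = (n * sxy - sx * sy) / det
    b = (sxx * sy - sx * sxy) / det
    return a, b

# Points lying exactly on y = 2x + 1, so the fit should recover a = 2, b = 1.
a, b = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
print(a, b)
```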

Random matrices are matrices whose entries are random numbers, subject to suitable probability distributions, such as matrix normal distribution. Beyond probability theory, they are applied in domains ranging from number theory to physics. [97] [98]

Symmetries and transformations in physics

Linear transformations and the associated symmetries play a key role in modern physics. For example, elementary particles in quantum field theory are classified as representations of the Lorentz group of special relativity and, more specifically, by their behavior under the spin group. Concrete representations involving the Pauli matrices and more general gamma matrices are an integral part of the physical description of fermions, which behave as spinors. [99] For the three lightest quarks, there is a group-theoretical representation involving the special unitary group SU(3); for their calculations, physicists use a convenient matrix representation known as the Gell-Mann matrices, which are also used for the SU(3) gauge group that forms the basis of the modern description of strong nuclear interactions, quantum chromodynamics. The Cabibbo–Kobayashi–Maskawa matrix, in turn, expresses the fact that the basic quark states that are important for weak interactions are not the same as, but linearly related to, the basic quark states that define particles with specific and distinct masses. [100]

Linear combinations of quantum states

The first model of quantum mechanics (Heisenberg, 1925) represented the theory's operators by infinite-dimensional matrices acting on quantum states. [101] This is also referred to as matrix mechanics. One particular example is the density matrix that characterizes the "mixed" state of a quantum system as a linear combination of elementary, "pure" eigenstates. [102]

Another matrix serves as a key tool for describing the scattering experiments that form the cornerstone of experimental particle physics: Collision reactions such as occur in particle accelerators, where non-interacting particles head towards each other and collide in a small interaction zone, with a new set of non-interacting particles as the result, can be described as the scalar product of outgoing particle states and a linear combination of ingoing particle states. The linear combination is given by a matrix known as the S-matrix, which encodes all information about the possible interactions between particles. [103]

Normal modes

A general application of matrices in physics is the description of linearly coupled harmonic systems. The equations of motion of such systems can be described in matrix form, with a mass matrix multiplying a generalized velocity to give the kinetic term, and a force matrix multiplying a displacement vector to characterize the interactions. The best way to obtain solutions is to determine the system's eigenvectors, its normal modes, by diagonalizing the matrix equation. Techniques like this are crucial when it comes to the internal dynamics of molecules: the internal vibrations of systems consisting of mutually bound component atoms. [104] They are also needed for describing mechanical vibrations, and oscillations in electrical circuits. [105]

Geometrical optics

Geometrical optics provides further matrix applications. In this approximative theory, the wave nature of light is neglected. The result is a model in which light rays are indeed geometrical rays. If the deflection of light rays by optical elements is small, the action of a lens or reflective element on a given light ray can be expressed as multiplication of a two-component vector with a two-by-two matrix, an approach called ray transfer matrix analysis: the vector's components are the light ray's slope and its distance from the optical axis, while the matrix encodes the properties of the optical element. There are two kinds of matrices, namely a refraction matrix describing the refraction at a lens surface, and a translation matrix describing the translation of the plane of reference to the next refracting surface, where another refraction matrix applies. The optical system, consisting of a combination of lenses and/or reflective elements, is simply described by the matrix resulting from the product of the components' matrices. [106]

Electronics

Traditional mesh analysis and nodal analysis in electronics lead to a system of linear equations that can be described with a matrix.

The behaviour of many electronic components can be described using matrices. Let A be a 2-dimensional vector with the component's input voltage v1 and input current i1 as its elements, and let B be a 2-dimensional vector with the component's output voltage v2 and output current i2 as its elements. Then the behaviour of the electronic component can be described by B = H · A, where H is a 2 x 2 matrix containing one impedance element (h12), one admittance element (h21), and two dimensionless elements (h11 and h22). Calculating a circuit now reduces to multiplying matrices.

Matrices have a long history of application in solving linear equations, but they were known as arrays until the 1800s. The Chinese text The Nine Chapters on the Mathematical Art, written between the 10th and 2nd centuries BCE, is the first example of the use of array methods to solve simultaneous equations, [107] including the concept of determinants. In 1545 the Italian mathematician Gerolamo Cardano brought the method to Europe when he published Ars Magna. [108] The Japanese mathematician Seki used the same array methods to solve simultaneous equations in 1683. [109] The Dutch mathematician Jan de Witt represented transformations using arrays in his 1659 book Elements of Curves. [110] Between 1700 and 1710 Gottfried Wilhelm Leibniz publicized the use of arrays for recording information or solutions and experimented with over 50 different systems of arrays. [108] Cramer presented his rule in 1750.

The term "matrix" (Latin for "womb", derived from mater—mother [111] ) was coined by James Joseph Sylvester in 1850, [112] who understood a matrix as an object giving rise to several determinants today called minors, that is to say, determinants of smaller matrices that derive from the original one by removing columns and rows. In an 1851 paper, Sylvester explains:

I have in previous papers defined a "Matrix" as a rectangular array of terms, out of which different systems of determinants may be engendered as from the womb of a common parent. [113]

Arthur Cayley published a treatise on geometric transformations using matrices that were not rotated versions of the coefficients being investigated as had previously been done. Instead, he defined operations such as addition, subtraction, multiplication, and division as transformations of those matrices and showed the associative and distributive properties held true. Cayley investigated and demonstrated the non-commutative property of matrix multiplication as well as the commutative property of matrix addition. [108] Early matrix theory had limited the use of arrays almost exclusively to determinants and Arthur Cayley's abstract matrix operations were revolutionary. He was instrumental in proposing a matrix concept independent of equation systems. In 1858 Cayley published his A memoir on the theory of matrices [114] [115] in which he proposed and demonstrated the Cayley–Hamilton theorem. [108]

An English mathematician named Cullis was the first to use modern bracket notation for matrices in 1913 and he simultaneously demonstrated the first significant use of the notation A = [ai,j] to represent a matrix where ai,j refers to the ith row and the jth column. [108]

The modern study of determinants sprang from several sources. [116] Number-theoretical problems led Gauss to relate coefficients of quadratic forms, that is, expressions such as x² + xy − 2y², and linear maps in three dimensions to matrices. Eisenstein further developed these notions, including the remark that, in modern parlance, matrix products are non-commutative. Cauchy was the first to prove general statements about determinants, using as definition of the determinant of a matrix A = [a_{i,j}] the following: replace the powers a_j^k by a_{jk} in the polynomial

\[ a_1 a_2 \cdots a_n \prod_{i<j} (a_j - a_i), \]

where Π denotes the product of the indicated terms. He also showed, in 1829, that the eigenvalues of symmetric matrices are real. [117] Jacobi studied "functional determinants" (later called Jacobi determinants by Sylvester), which can be used to describe geometric transformations at a local (or infinitesimal) level, see above. Kronecker's Vorlesungen über die Theorie der Determinanten [118] and Weierstrass' Zur Determinantentheorie, [119] both published in 1903, first treated determinants axiomatically, as opposed to previous more concrete approaches such as the mentioned formula of Cauchy. At that point, determinants were firmly established.

Many theorems were first established for small matrices only, for example, the Cayley–Hamilton theorem was proved for 2×2 matrices by Cayley in the aforementioned memoir, and by Hamilton for 4×4 matrices. Frobenius, working on bilinear forms, generalized the theorem to all dimensions (1898). Also at the end of the 19th century, the Gauss–Jordan elimination (generalizing a special case now known as Gauss elimination) was established by Jordan. In the early 20th century, matrices attained a central role in linear algebra, [120] partially due to their use in classification of the hypercomplex number systems of the previous century.

The inception of matrix mechanics by Heisenberg, Born and Jordan led to studying matrices with infinitely many rows and columns. [121] Later, von Neumann carried out the mathematical formulation of quantum mechanics, by further developing functional analytic notions such as linear operators on Hilbert spaces, which, very roughly speaking, correspond to Euclidean space, but with an infinity of independent directions.

Other historical usages of the word "matrix" in mathematics

The word has been used in unusual ways by at least two authors of historical importance.

Bertrand Russell and Alfred North Whitehead in their Principia Mathematica (1910–1913) use the word "matrix" in the context of their axiom of reducibility. They proposed this axiom as a means to reduce any function to one of lower type, successively, so that at the "bottom" (0 order) the function is identical to its extension:

"Let us give the name of matrix to any function, of however many variables, that does not involve any apparent variables. Then, any possible function other than a matrix derives from a matrix by means of generalization, that is, by considering the proposition that the function in question is true with all possible values or with some value of one of the arguments, the other argument or arguments remaining undetermined". [122]

For example, a function Φ(x, y) of two variables x and y can be reduced to a collection of functions of a single variable, for example, y, by "considering" the function for all possible values of "individuals" a_i substituted in place of variable x. And then the resulting collection of functions of the single variable y, that is, ∀a_i: Φ(a_i, y), can be reduced to a "matrix" of values by "considering" the function for all possible values of "individuals" b_j substituted in place of variable y:

∀b_j ∀a_i: Φ(a_i, b_j).

Alfred Tarski in his 1946 Introduction to Logic used the word "matrix" synonymously with the notion of truth table as used in mathematical logic. [123]



In this section we need to take a look at the third method for solving systems of equations. For systems of two equations it is probably a little more complicated than the methods we looked at in the first section. However, for systems with more equations it is probably easier than using the method we saw in the previous section.

Before we get into the method we first need to get some definitions out of the way.

An augmented matrix for a system of equations is a matrix of numbers in which each row represents the constants from one equation (both the coefficients and the constant on the other side of the equal sign) and each column represents all the coefficients for a single variable.

Let’s take a look at an example. Here is the system of equations that we looked at in the previous section.

\[\begin{align*} x - 2y + 3z & = 7 \\ 2x + y + z & = 4 \\ -3x + 2y - 2z & = -10 \end{align*}\]

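Here is the augmented matrix for this system.

\[ \left[ \begin{array}{rrr|r} 1 & -2 & 3 & 7 \\ 2 & 1 & 1 & 4 \\ -3 & 2 & -2 & -10 \end{array} \right] \]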

The first row consists of all the constants from the first equation with the coefficient of the (x) in the first column, the coefficient of the (y) in the second column, the coefficient of the (z) in the third column and the constant in the final column. The second row is the constants from the second equation with the same placement and likewise for the third row. The dashed line represents where the equal sign was in the original system of equations and is not always included. This is mostly dependent on the instructor and/or textbook being used.

Next, we need to discuss elementary row operations. There are three of them and we will give both the notation used for each one as well as an example using the augmented matrix given above.

The first operation interchanges two rows. In the example, the first and third rows are interchanged, and we do exactly what the operation says. Every entry in the third row moves up to the first row and every entry in the first row moves down to the third row. Make sure that you move all the entries. One of the more common mistakes is to forget to move one or more entries.

The second operation multiplies a row by a constant. When we say we will multiply a row by a constant, this really means that we will multiply every entry in that row by the constant. Watch out for signs in this operation and make sure that you multiply every entry.

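The third operation adds a multiple of one row onto another row. In the example, -4 times row 1 is added onto row 3. Let’s go through the individual computations to make sure you followed this.

\[\begin{align*} -3 - 4\left( 1 \right) & = -7 & \hspace{0.25in} 2 - 4\left( -2 \right) & = 10 \\ -2 - 4\left( 3 \right) & = -14 & \hspace{0.25in} -10 - 4\left( 7 \right) & = -38 \end{align*}\]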

Be very careful with signs here. We will be doing these computations in our head for the most part and it is very easy to get signs mixed up and add one in that doesn’t belong or lose one that should be there.

It is very important that you can do this operation as this operation is the one that we will be using more than the other two combined.
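The three row operations can be sketched in a few lines of Python, applied to the augmented matrix of the example system. The function names are illustrative, rows are numbered from 0 in the code, and the constant -4 used for the second operation is an arbitrary choice for this sketch rather than a value taken from the text.

```python
def interchange(m, i, j):
    """First row operation: swap rows i and j, returning a new matrix."""
    out = [row[:] for row in m]
    out[i], out[j] = out[j], out[i]
    return out

def multiply_row(m, i, c):
    """Second row operation: multiply every entry of row i by the constant c."""
    out = [row[:] for row in m]
    out[i] = [c * e for e in out[i]]
    return out

def add_multiple(m, i, j, c):
    """Third row operation: replace row i by (row i) + c * (row j)."""
    out = [row[:] for row in m]
    out[i] = [a + c * b for a, b in zip(m[i], m[j])]
    return out

# Augmented matrix of x - 2y + 3z = 7, 2x + y + z = 4, -3x + 2y - 2z = -10.
aug = [[1, -2, 3, 7], [2, 1, 1, 4], [-3, 2, -2, -10]]

swapped = interchange(aug, 0, 2)        # rows 1 and 3 trade places
scaled = multiply_row(aug, 2, -4)       # every entry of row 3 times -4 (arbitrary constant)
combined = add_multiple(aug, 2, 0, -4)  # row 3 minus 4 times row 1
print(combined[2])
```

The last row of `combined` reproduces the four individual computations shown above.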

Okay, so how do we use augmented matrices and row operations to solve systems? Let’s start with a system of two equations and two unknowns.

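\[\begin{align*} ax + by & = p \\ cx + dy & = q \end{align*}\]

We first write down the augmented matrix for this system,

\[ \left[ \begin{array}{rr|r} a & b & p \\ c & d & q \end{array} \right] \]

and use elementary row operations to convert it into the following augmented matrix.

\[ \left[ \begin{array}{rr|r} 1 & 0 & h \\ 0 & 1 & k \end{array} \right] \]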

Once we have the augmented matrix in this form we are done. The solution to the system will be (x = h) and (y = k).

This method is called Gauss-Jordan Elimination.

  1. \(\begin{align*} 3x - 2y & = 14 \\ x + 3y & = 1 \end{align*}\)
  2. \(\begin{align*} -2x + y & = -3 \\ x - 4y & = -2 \end{align*}\)
  3. \(\begin{align*} 3x - 6y & = -9 \\ -2x - 2y & = 12 \end{align*}\)

The first step here is to write down the augmented matrix for this system.

\[ \left[ \begin{array}{rr|r} 3 & -2 & 14 \\ 1 & 3 & 1 \end{array} \right] \]

To convert it into the final form we will start in the upper left corner and work in a counter-clockwise direction until the first two columns appear as they should be.

So, the first step is to make the red three in the augmented matrix above into a 1. We can use any of the row operations that we’d like to. We should always try to minimize the work as much as possible however.

So, since there is already a one in the first column, just not in the correct row, let’s use the first row operation and interchange the two rows.

The next step is to get a zero below the 1 that we just got in the upper left hand corner. This means that we need to change the red three into a zero. This will almost always require us to use the third row operation. If we add -3 times row 1 onto row 2 we can convert that 3 into a 0. Here is that operation.

Next, we need to get a 1 into the lower right corner of the first two columns. This means changing the red -11 into a 1. This is usually accomplished with the second row operation. If we divide the second row by -11 we will get the 1 in that spot that we need.

Okay, we’re almost done. The final step is to turn the red three into a zero. Again, this almost always requires the third row operation. Here is the operation for this final step.

We have the augmented matrix in the required form and so we’re done. The solution to this system is (x = 4) and (y = - 1).
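The four steps just described can be replayed in code. This is a minimal sketch using exact fractions; the last operation, adding -3 times row 2 onto row 1, is inferred from the description of turning the remaining 3 into a zero.

```python
from fractions import Fraction

# Augmented matrix of 3x - 2y = 14, x + 3y = 1, with exact rational entries.
m = [[Fraction(3), Fraction(-2), Fraction(14)],
     [Fraction(1), Fraction(3), Fraction(1)]]

# Step 1: interchange the two rows to get a 1 in the upper left corner.
m[0], m[1] = m[1], m[0]

# Step 2: add -3 times row 1 onto row 2 to get a 0 below that 1.
m[1] = [b - 3 * a for a, b in zip(m[0], m[1])]

# Step 3: divide row 2 by -11 to turn the -11 into a 1.
m[1] = [e / -11 for e in m[1]]

# Step 4: add -3 times row 2 onto row 1 to clear the 3 above the new 1.
m[0] = [a - 3 * b for a, b in zip(m[0], m[1])]

x, y = m[0][2], m[1][2]
print(x, y)
```

The final column holds the solution, and substituting it back into both equations confirms it.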

In this part we won’t put in as much explanation for each step. We will mark the next number that we need to change in red as we did in the previous part.

We’ll first write down the augmented matrix and then get started with the row operations.

Before proceeding with the next step, notice that in the second matrix we had ones in both spots where we needed them. However, the only way to change the -2 into the zero that we also needed was to change the 1 in the lower right corner as well. This is okay. Sometimes it will happen, and trying to keep both ones will only cause problems.

The solution to this system is then (x = 2) and (y = 1).

Let’s first write down the augmented matrix for this system.

Now, in this case there isn’t a 1 in the first column and so we can’t just interchange two rows as the first step. However, notice that since all the entries in the first row have 3 as a factor we can divide the first row by 3 which will get a 1 in that spot and we won’t put any fractions into the problem.

Here is the work for this system.

The solution to this system is (x = - 5) and (y = - 1).

It is important to note that the path we took to get the augmented matrices in this example into the final form is not the only path that we could have used. There are many different paths that we could have gone down. All the paths would have arrived at the same final augmented matrix however so we should always choose the path that we feel is the easiest path. Note as well that different people may well feel that different paths are easier and so may well solve the systems differently. They will get the same solution however.

For two equations and two unknowns this process is probably a little more complicated than the straightforward solution process we used in the first section of this chapter. This process does start becoming useful when we start looking at larger systems. So, let’s take a look at a couple of systems with three equations in them.

In this case the process is basically identical except that there’s going to be more to do. As with two equations we will first set up the augmented matrix and then use row operations to put it into the form,

\[ \left[ \begin{array}{rrr|r} 1 & 0 & 0 & p \\ 0 & 1 & 0 & q \\ 0 & 0 & 1 & r \end{array} \right] \]

Once the augmented matrix is in this form the solution is (x = p), (y = q) and (z = r). As with the two equations case there really isn’t any set path to take in getting the augmented matrix into this form. The usual path is to get the 1’s in the correct places and 0’s below them. Once this is done we then try to get zeroes above the 1’s.
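For checking hand computations, the procedure can be automated. The sketch below is one possible mechanical version and not the path taken in the worked examples: it always picks the first nonzero entry in a column as the pivot and clears the entries above and below the pivot in the same pass, instead of first getting zeroes below the 1’s and then above them. The function name `gauss_jordan` is an assumption made for this example.

```python
from fractions import Fraction

def gauss_jordan(aug):
    """Solve a square system given as an n x (n + 1) augmented matrix.

    Returns the solution as a list of Fractions, or raises ValueError
    when no pivot can be found (the system has no unique solution).
    """
    m = [[Fraction(e) for e in row] for row in aug]
    n = len(m)
    for col in range(n):
        # First row operation: bring a row with a nonzero entry into place.
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            raise ValueError("system has no unique solution")
        m[col], m[pivot] = m[pivot], m[col]
        # Second row operation: scale the pivot row so the pivot becomes 1.
        p = m[col][col]
        m[col] = [e / p for e in m[col]]
        # Third row operation: clear every other entry in this column.
        for r in range(n):
            if r != col and m[r][col] != 0:
                f = m[r][col]
                m[r] = [a - f * b for a, b in zip(m[r], m[col])]
    return [row[-1] for row in m]

# The three two-equation systems worked earlier in this section.
print(gauss_jordan([[3, -2, 14], [1, 3, 1]]))
print(gauss_jordan([[-2, 1, -3], [1, -4, -2]]))
print(gauss_jordan([[3, -6, -9], [-2, -2, 12]]))
```

Exact fractions are used so that the intermediate results match what one would get by hand, without rounding.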

Let’s work a couple of examples to see how this works.

  1. \(\begin{align*} 3x + y - 2z & = 2 \\ x - 2y + z & = 3 \\ 2x - y - 3z & = 3 \end{align*}\)
  2. \(\begin{align*} 3x + y - 2z & = -7 \\ 2x + 2y + z & = 9 \\ -x - y + 3z & = 6 \end{align*}\)

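Let’s first write down the augmented matrix for this system.

\[ \left[ \begin{array}{rrr|r} 3 & 1 & -2 & 2 \\ 1 & -2 & 1 & 3 \\ 2 & -1 & -3 & 3 \end{array} \right] \]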

As with the previous examples we will mark the number(s) that we want to change in a given step in red. The first step here is to get a 1 in the upper left hand corner and again, we have many ways to do this. In this case we’ll notice that if we interchange the first and second row we can get a 1 in that spot with relatively little work.

The next step is to get the two numbers below this 1 to be 0’s. Note as well that this will almost always require the third row operation to do. Also, we can do both of these in one step as follows.

Next, we want to turn the 7 into a 1. We can do this by dividing the second row by 7.

So, we got a fraction showing up here. That will happen on occasion so don’t get all that excited about it. The next step is to change the 3 below this new 1 into a 0. Note that we aren’t going to bother with the -2 above it quite yet. Sometimes it is just as easy to turn this into a 0 in the same step. In this case however, it’s probably just as easy to do it later as we’ll see.

So, using the third row operation we get,

Next, we need to get the number in the bottom right corner into a 1. We can do that with the second row operation.

Now, we need zeroes above this new 1. So, using the third row operation twice as follows will do what we need done.

Notice that in this case the final column didn’t change in this step. That was only because the final entry in that column was zero. In general, this won’t happen.

The final step is then to make the -2 above the 1 in the second column into a zero. This can easily be done with the third row operation.

So, we have the augmented matrix in the final form and the solution will be \(x = 1\), \(y = -1\) and \(z = 0\).

This can be verified by plugging these into all three equations and making sure that they are all satisfied.
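The steps of this example, together with the final check by substitution, can be replayed as follows. This is a sketch under the assumption that the row operations are the ones the narrative describes; the specific multiples are inferred from the numbers mentioned in the text, and the helper `add_multiple` is an illustrative name.

```python
from fractions import Fraction

def add_multiple(m, target, source, c):
    """Third row operation, in place: row[target] += c * row[source]."""
    m[target] = [t + c * s for t, s in zip(m[target], m[source])]

system = [[3, 1, -2, 2], [1, -2, 1, 3], [2, -1, -3, 3]]
m = [[Fraction(e) for e in row] for row in system]

m[0], m[1] = m[1], m[0]              # interchange rows 1 and 2
add_multiple(m, 1, 0, -3)            # zero below the leading 1 (row 2)
add_multiple(m, 2, 0, -2)            # zero below the leading 1 (row 3)
m[1] = [e / 7 for e in m[1]]         # turn the 7 into a 1
add_multiple(m, 2, 1, -3)            # turn the 3 below it into a 0
pivot = m[2][2]
m[2] = [e / pivot for e in m[2]]     # 1 in the bottom right corner
add_multiple(m, 1, 2, Fraction(5, 7))  # zeroes above that 1
add_multiple(m, 0, 2, -1)
add_multiple(m, 0, 1, 2)             # turn the -2 in the second column into a 0

solution = [row[-1] for row in m]

# Plug the solution into all three original equations.
residuals = [sum(c * v for c, v in zip(row[:3], solution)) - row[3]
             for row in system]
print(solution, residuals)
```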

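Again, the first step is to write down the augmented matrix.

\[ \left[ \begin{array}{rrr|r} 3 & 1 & -2 & -7 \\ 2 & 2 & 1 & 9 \\ -1 & -1 & 3 & 6 \end{array} \right] \]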

We can’t get a 1 in the upper left corner simply by interchanging rows this time. We could interchange the first and last row, but that would also require another operation to turn the -1 into a 1. While this isn’t difficult it’s two operations. Note that we could use the third row operation to get a 1 in that spot as follows.

Now, we can use the third row operation to turn the two red numbers into zeroes.

The next step is to get a 1 in the spot occupied by the red 4. We could do that by dividing the whole row by 4, but that would put in a couple of somewhat unpleasant fractions. So, instead of doing that we are going to interchange the second and third row. The reason for this will be apparent soon enough.

Now, if we divide the second row by -2 we get the 1 in that spot that we want.

Before moving onto the next step let’s notice a couple of things here. First, we managed to avoid fractions, which is always a good thing, and second, this row is now done. We would have eventually needed a zero in that third spot and we’ve got it there for free. Not only that, but it won’t change in any of the later operations. This doesn’t always happen, but if it does that will make our life easier.

Now, let’s use the third row operation to change the red 4 into a zero.

We now can divide the third row by 7 to get the number in the lower right corner into a one.

Next, we can use the third row operation to get the -3 changed into a zero.

The final step is to then make the -1 into a 0 using the third row operation again.

The solution to this system is then \(x = -2\), \(y = 5\) and \(z = 3\).
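The path taken in this second example can also be replayed. The first operation below (subtracting row 2 from row 1) is an inference: it is the choice that produces the 4, the fraction-free second row and the later -3 and -1 mentioned in the narrative. The helper `add_multiple` is an illustrative name.

```python
from fractions import Fraction

def add_multiple(m, target, source, c):
    """Third row operation, in place: row[target] += c * row[source]."""
    m[target] = [t + c * s for t, s in zip(m[target], m[source])]

system = [[3, 1, -2, -7], [2, 2, 1, 9], [-1, -1, 3, 6]]
m = [[Fraction(e) for e in row] for row in system]

add_multiple(m, 0, 1, -1)         # row 1 minus row 2 gives a leading 1 (inferred)
add_multiple(m, 1, 0, -2)         # zero below the 1 in row 2
add_multiple(m, 2, 0, 1)          # zero below the 1 in row 3
m[1], m[2] = m[2], m[1]           # interchange rows 2 and 3 to avoid fractions
m[1] = [e / -2 for e in m[1]]     # divide row 2 by -2
add_multiple(m, 2, 1, -4)         # turn the 4 into a zero
m[2] = [e / 7 for e in m[2]]      # divide row 3 by 7
add_multiple(m, 0, 2, 3)          # turn the -3 into a zero
add_multiple(m, 0, 1, 1)          # turn the -1 into a zero

solution = [row[-1] for row in m]
residuals = [sum(c * v for c, v in zip(row[:3], solution)) - row[3]
             for row in system]
print(solution, residuals)
```

Every intermediate entry stays an integer along this path, which is the point made above about avoiding fractions.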

Using Gauss-Jordan elimination to solve a system of three equations can be a lot of work, but it is often no more work than solving directly and is in many cases less work. If we were to do a system of four equations (which we aren’t going to do), at that point Gauss-Jordan elimination would in all likelihood be less work than if we solved directly.

Also, as we saw in the final example worked in this section, there really is no one set path to take through these problems. Each system is different and may require a different path and set of operations. Also, the path that one person finds to be the easiest may not be the path that another person finds to be the easiest. Regardless of the path, however, the final answer will be the same.