# 11.1: Appendix A: Jacobians, Inverses of Matrices, and Eigenvalues

In this appendix we collect together some results on Jacobians, inverses, and eigenvalues of \(2 \times 2\) matrices that are used repeatedly in the material.

First, we consider the Taylor expansion of a vector valued function of two variables, denoted as follows:

[H(x,y) = egin{pmatrix} {f(x, y)} {g(x, y)} end{pmatrix}, (x, y) in mathbb{R}^2 , label{A.1}]

More precisely, we will need to Taylor expand such functions through second order:

\[H(x_{0}+h, y_{0}+k) = H(x_{0}, y_{0}) + DH(x_{0}, y_{0}) \begin{pmatrix} h \\ k \end{pmatrix} + \mathcal{O}(2). \label{A.2}\]

The Taylor expansion of a scalar valued function of one variable should be familiar to most students at this level. Possibly there is less familiarity with the Taylor expansion of a vector valued function of a vector variable. However, to compute this we just Taylor expand each component of the function (which is a scalar valued function of a vector variable) in each variable, holding the other variable fixed for the expansion in that particular variable, and then we gather the results for each component into matrix form.

Carrying this procedure out for the \(f(x, y)\) component of Equation \ref{A.1} gives:

\[\begin{align} f(x_{0}+h, y_{0}+k) &= f(x_{0}, y_{0}+k) + \frac{\partial f}{\partial x}(x_{0}, y_{0}+k)h + \mathcal{O}(h^2) \\[4pt] &= f(x_{0}, y_{0}) + \frac{\partial f}{\partial y}(x_{0}, y_{0})k + \mathcal{O}(k^2) + \frac{\partial f}{\partial x}(x_{0}, y_{0})h + \mathcal{O}(hk) + \mathcal{O}(h^2). \label{A.3} \end{align}\]

The same procedure can be applied to \(g(x, y)\). Recombining the terms back into the vector expression for Equation \ref{A.1} gives:

\[H(x_{0}+h, y_{0}+k) = \begin{pmatrix} f(x_{0}, y_{0}) \\ g(x_{0}, y_{0}) \end{pmatrix} + \begin{pmatrix} \frac{\partial f}{\partial x}(x_{0}, y_{0}) & \frac{\partial f}{\partial y}(x_{0}, y_{0}) \\ \frac{\partial g}{\partial x}(x_{0}, y_{0}) & \frac{\partial g}{\partial y}(x_{0}, y_{0}) \end{pmatrix} \begin{pmatrix} h \\ k \end{pmatrix} + \mathcal{O}(2). \label{A.4}\]

Hence, the Jacobian of Equation \ref{A.1} at \((x_{0}, y_{0})\) is:

\[\begin{pmatrix} \frac{\partial f}{\partial x}(x_{0}, y_{0}) & \frac{\partial f}{\partial y}(x_{0}, y_{0}) \\ \frac{\partial g}{\partial x}(x_{0}, y_{0}) & \frac{\partial g}{\partial y}(x_{0}, y_{0}) \end{pmatrix}, \label{A.5}\]

which is a \(2 \times 2\) matrix of real numbers.
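One quick way to check such an expansion numerically is to compare the analytic Jacobian with central finite differences. The sketch below (Python with NumPy; the function \(H\) and the point \((x_{0}, y_{0})\) are made-up examples, not from the text) confirms that the matrix of partial derivatives is exactly the linear term of the expansion:

```python
import numpy as np

# Hypothetical example: H(x, y) = (f, g) = (x^2 * y, sin(x) + y)
def H(x, y):
    return np.array([x**2 * y, np.sin(x) + y])

def jacobian_fd(H, x, y, eps=1e-6):
    """Approximate the 2x2 Jacobian of H at (x, y) by central differences."""
    dHdx = (H(x + eps, y) - H(x - eps, y)) / (2 * eps)
    dHdy = (H(x, y + eps) - H(x, y - eps)) / (2 * eps)
    return np.column_stack([dHdx, dHdy])

x0, y0 = 1.0, 2.0
J_fd = jacobian_fd(H, x0, y0)
J_exact = np.array([[2 * x0 * y0, x0**2],   # df/dx, df/dy
                    [np.cos(x0),  1.0]])    # dg/dx, dg/dy
print(np.allclose(J_fd, J_exact, atol=1e-5))   # True
```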

We will need to compute the inverses of such matrices, as well as their eigenvalues.

We denote a general \(2 \times 2\) matrix of real numbers by:

[A= egin{pmatrix} {a}&{b} {c}&{d} end{pmatrix}, a, b, c, d in mathbb{R}. label{A.6}]

It is easy to verify that, provided \(ad - bc \neq 0\), the inverse of \(A\) is given by:

\[A^{-1} = \frac{1}{ad-bc} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix}. \label{A.7}\]

Let \(\mathbb{I}\) denote the \(2 \times 2\) identity matrix. Then the eigenvalues of \(A\) are the solutions of the characteristic equation:

\[\det(A - \lambda \mathbb{I}) = 0, \label{A.8}\]

where "det" denotes the determinant of the matrix. This is a quadratic equation in \(\lambda\), which has two solutions:

\[\lambda_{1,2} = \frac{\mathrm{tr}\, A}{2} \pm \frac{1}{2}\sqrt{(\mathrm{tr}\, A)^2 - 4\det A}, \label{A.9}\]

where we have used the notation:

\(\mathrm{tr}\, A \equiv \text{trace of } A = a + d\), \(\det A \equiv \text{determinant of } A = ad - bc\).
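Both formulas are easy to sanity-check numerically. The sketch below (Python with NumPy; the matrix entries are an arbitrary made-up example with \(ad - bc \neq 0\)) verifies Equation \ref{A.7} against the identity and Equation \ref{A.9} against a library eigenvalue solver:

```python
import numpy as np

# Arbitrary 2x2 example with ad - bc != 0
a, b, c, d = 2.0, 1.0, 1.0, 3.0
A = np.array([[a, b], [c, d]])

# Inverse via the cofactor formula (A.7)
A_inv = np.array([[d, -b], [-c, a]]) / (a * d - b * c)
print(np.allclose(A @ A_inv, np.eye(2)))   # True

# Eigenvalues via the quadratic formula (A.9)
tr, det = a + d, a * d - b * c
lam = np.array([tr / 2 + np.sqrt(tr**2 - 4 * det) / 2,
                tr / 2 - np.sqrt(tr**2 - 4 * det) / 2])
print(np.allclose(np.sort(lam), np.sort(np.linalg.eigvals(A).real)))   # True
```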

## Functions of Matrices

A logarithm of A ∈ ℂ^{n×n} is any matrix X such that e^X = A. As we saw in Theorem 1.27, any nonsingular A has infinitely many logarithms. In this chapter A ∈ ℂ^{n×n} is assumed to have no eigenvalues on ℝ⁻, and "log" always denotes the principal logarithm, which we recall from Theorem 1.31 is the unique logarithm whose spectrum lies in the strip {z : −π < Im(z) < π}.

The importance of the matrix logarithm can be ascribed to its being the inverse function of the matrix exponential, and this intimate relationship leads to close connections between the theory and computational methods for the two functions.

This chapter is organized as follows. We begin by developing some basic properties of the logarithm, including conditions under which the product formula log(BC) = log(B) + log(C) holds. Then we consider the Fréchet derivative and conditioning. The Mercator and Gregory series expansions are derived and various properties of the diagonal Padé approximants to the logarithm are explained. Two versions of the inverse scaling and squaring method are developed in some detail, one using the Schur form and the other working with full matrices. A Schur–Parlett algorithm employing inverse scaling and squaring on the diagonal blocks together with a special formula for 2 × 2 blocks is then derived. A numerical experiment comparing four different methods is then presented. Finally, an algorithm for evaluating the Fréchet derivative is described.

We begin with an integral expression for the logarithm.

Theorem 11.1 (Richter). For A ∈ ℂ^{n×n} with no eigenvalues on ℝ⁻,

log(A) = ∫₀¹ (A − I)[t(A − I) + I]⁻¹ dt.  (11.1)

Proof. It suffices to prove the result for diagonalizable A, by Theorem 1.20, and hence it suffices to show that log x = ∫₀¹ (x − 1)[t(x − 1) + 1]⁻¹ dt for x ∈ ℂ lying off ℝ⁻; this latter equality is immediate.
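The integral in (11.1) is easy to test numerically. The sketch below (Python with NumPy; the matrix is a made-up example with positive real eigenvalues, so the principal logarithm is obtained from a diagonalization) compares a trapezoidal quadrature of the integrand with the diagonalization-based value:

```python
import numpy as np

# Made-up example: upper-triangular A with eigenvalues 2 and 3 (none on R^-)
A = np.array([[2.0, 1.0],
              [0.0, 3.0]])
I = np.eye(2)

# Trapezoidal quadrature of (A - I)[t(A - I) + I]^{-1} over [0, 1]
ts = np.linspace(0.0, 1.0, 2001)
vals = np.array([(A - I) @ np.linalg.inv(t * (A - I) + I) for t in ts])
h = ts[1] - ts[0]
logA = (vals[:-1] + vals[1:]).sum(axis=0) * h / 2.0

# Reference: A = V diag(w) V^{-1}  =>  log A = V diag(log w) V^{-1}
w, V = np.linalg.eig(A)
logA_ref = (V @ np.diag(np.log(w)) @ np.linalg.inv(V)).real
print(np.allclose(logA, logA_ref, atol=1e-5))   # True
```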

## SLATER DETERMINANTS

Mitchel Weissbluth, in Atoms and Molecules, 1978

### 11.1 Matrix Elements—General

In Section 8.4 we saw that multielectron wave functions ψ(λ1, λ2, …, λN) must be antisymmetric with respect to an interchange of the (space and spin) coordinates of any two electrons. Antisymmetry can be ensured by expressing the wave function in terms of Slater determinants as in (8.4-13). To facilitate the calculation of various physical quantities, we shall need expressions for matrix elements of operators when the wave functions are written in determinantal form.

Consider a two-electron system and let

In ψk(λi), k is a label that identifies a particular spin orbital, i.e., a one-electron function that depends on both space and spin coordinates; the index i is an electron label. The notation may be shortened by writing

It will also be assumed that for any two spin orbitals such as ψk and ψl

This has the immediate consequence that

where ψi and ψj are any of the determinantal functions (11.1-1)–(11.1-3).

Let us now suppose that we have a sum of one-electron operators

where f1 and f2 have the same functional dependence, but f1 operates only on the spin orbital occupied by electron 1, namely ψ(λ1), and f2 operates only on ψ(λ2). Since variables of integration are dummy variables, we may write

Therefore, in view of the orthonormality relation (11.1-5),

with analogous expressions for 〈ψ2|F2〉 and 〈ψ3|F3〉. For the off-diagonal elements

A two-electron operator g12 operates on both ψ(λ1) and ψ(λ2), as, for example, in the case of the electronic Coulomb repulsion operator e 2 /r12. For a typical diagonal element

and for off-diagonal elements

These results for the special case of the Slater determinants (11.1-1)–(11.1-3) may be generalized to determinants of arbitrary dimension. Thus, let

We must also take note of the order in which the orbitals appear in (11.1-12) and (11.1-13) because an interchange of two columns (or rows) will change the sign of the determinantal wave function. As previously written the order is

For the diagonal matrix element of F,

in which the argument of ak and the subscript on f have been omitted, as they will be henceforth, since they are arbitrary (see, for example, (11.1-7) ). The matrix element 〈B|F|B〉 has the same form with respect to the b orbitals. For an off-diagonal matrix element

if A and B differ by more than one pair of orbitals, and

if ak ≠ bl, but the rest of the orbitals in B are the same as those in A. The plus sign occurs when an even number of interchanges is required to move the bl orbital into the kth position or, in other words, when the parity of the permutation is even; the minus sign appears as a result of an odd-parity permutation. Examples of (11.1-18) are provided by (11.1-9a) and (11.1-9c). It may also be remarked that for one-electron operators such as (11.1-14), simple product functions and determinantal functions give the same matrix elements.
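The one-electron rules above can be verified numerically in a small model space. The sketch below (Python with NumPy; the operator and orbitals are random made-up examples, not from the text) builds an explicit two-electron Slater determinant from two orthonormal spin orbitals and checks the diagonal rule, ⟨A|F|A⟩ = Σk ⟨ak|f|ak⟩, by brute force:

```python
import numpy as np

rng = np.random.default_rng(0)
M = 4                        # dimension of a toy one-particle (spin-orbital) space
f = rng.standard_normal((M, M))
f = f + f.T                  # a Hermitian one-electron operator (made up)

# Two orthonormal spin orbitals: columns of a random orthogonal matrix
Q, _ = np.linalg.qr(rng.standard_normal((M, M)))
a1, a2 = Q[:, 0], Q[:, 1]

# Normalized two-electron Slater determinant: (a1 (x) a2 - a2 (x) a1) / sqrt(2)
psi = (np.outer(a1, a2) - np.outer(a2, a1)) / np.sqrt(2)

# F = f(1) + f(2): apply f to the first index, then to the second index
F_psi = f @ psi + psi @ f.T

lhs = np.sum(psi * F_psi)            # <A|F|A> computed by brute force
rhs = a1 @ f @ a1 + a2 @ f @ a2      # Slater-Condon diagonal rule
print(np.isclose(lhs, rhs))          # True
```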

The diagonal matrix elements of G are

and for off-diagonal elements we have the cases:

If A and B differ by more than two pairs of spin orbitals,

If A and B differ by one pair of orbitals, e.g., ak ≠ bl,

The same rule as in (11.1-18) applies to the ± signs in (11.1-21) and (11.1-22) . Examples of diagonal and off-diagonal elements are given by (11.1-10) and (11.1-11) .

It will now be assumed that the general spin orbital a(λi) consists of a product of a spatial function φa(ri) and a spin function ζa(ms). The latter is always either an α or a β spin function depending on whether ms is +1/2 or −1/2. Thus

in which the orthonormality of the spin functions has been inserted.

If a, b, c, and d are spin orbitals of the form (11.1-23), the general matrix element of a two-electron operator becomes

### Band Matrix

A band matrix is a sparse matrix whose nonzero elements occur only on the main diagonal and on zero or more diagonals on either side of the main diagonal.

#### Illustration

• A diagonal matrix is a band matrix
• A Hessenberg matrix is a band matrix
• A shear matrix is a band matrix
• A Jordan block is a band matrix

(Figure: two-dimensional display of a band matrix.)
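The definition can be made precise in terms of lower and upper bandwidths: the farthest nonzero diagonals below and above the main diagonal. A minimal sketch in Python with NumPy (the matrices are made-up examples):

```python
import numpy as np

def bandwidths(A):
    """Lower and upper bandwidths: how far the nonzero diagonals extend
    below and above the main diagonal."""
    rows, cols = np.nonzero(A)
    lower = max(int((rows - cols).max(initial=0)), 0)
    upper = max(int((cols - rows).max(initial=0)), 0)
    return lower, upper

# A tridiagonal matrix: a band matrix with one nonzero diagonal on each side
T = (np.diag([4.0, 4.0, 4.0, 4.0])
     + np.diag([1.0, 1.0, 1.0], k=1)
     + np.diag([2.0, 2.0, 2.0], k=-1))
print(bandwidths(T))                          # (1, 1)
print(bandwidths(np.diag([1.0, 2.0, 3.0])))   # (0, 0): a diagonal matrix
```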

## Math 130 Linear Algebra

Course description. Math 130 is a requirement for mathematics and physics majors, and it's highly recommended for majors in other sciences, especially including computer-science majors. Topics include systems of linear equations and their solutions; matrices and matrix algebra; inverse matrices; determinants and permutations; real n-dimensional vector spaces; abstract vector spaces and their axioms; linear transformations; inner products (dot products), orthogonality, cross products, and their geometric applications; subspaces, linear independence, bases for vector spaces, dimension, matrix rank; eigenvectors, eigenvalues, matrix diagonalisation. Some applications of linear algebra will be discussed, such as computer graphics, Kirchhoff's laws, linear regression (least squares), Fourier series, or differential equations.

Prof. R. Broker. Thursday 4:00–5:00 and by appointment. Room BP 345
Prof. E. Joyce. MWF 10:00–10:50, MWF 1:00–2:00. Room BP 322
K. Schultz. Tutoring Monday 8:00–10:00. Room BP 316
Regular class meetings, 14 weeks, 42 hours
Two evening midterms and final exam, 6 hours
Reading the text and preparing for class, 4 hours per week, 56 hours
Doing weekly homework assignments, 56 hours
Meeting with tutors or in study groups, variable 4 to 12 hours
Reviewing for midterms and finals, 12 hours
• To provide students with a good understanding of the concepts and methods of linear algebra, described in detail in the syllabus.
• To help the students develop the ability to solve problems using linear algebra.
• To connect linear algebra to other fields both within and without mathematics.
• To develop abstract and critical reasoning by studying logical proofs and the axiomatic method as applied to linear algebra.
• Knowledge of the Natural World and Human Cultures and Societies, including foundational disciplinary knowledge and the ability to employ different ways of knowing the world in its many dimensions. Students will
• develop an understanding of linear algebra, a fundamental knowledge area of mathematics
• develop an understanding of applications of linear algebra in mathematics, natural, and social science
• develop an appreciation of the interaction of linear algebra with other fields
• be able to employ the concepts and methods described in the syllabus
• acquire communication and organizational skills, including effective written communication in their weekly assignments
• be able to follow complex logical arguments and develop modest logical arguments
• begin a commitment to life-long learning, recognizing that the fields of mathematics, mathematical modeling and applications advance at a rapid pace
• learn to manage their own learning and development, including managing time, priorities, and progress
• recognize recurring themes and general principles that have broad applications in mathematics beyond the domains in which they are introduced
• understand the fundamental interplay between theory and application in linear algebra
• be able to solve problems by means of linear algebra
• apply their knowledge toward solving real problems

The text and class discussion will introduce the concepts, methods, applications, and logical arguments; students will practice them and solve problems on daily assignments, and they will be tested on quizzes, midterms, and the final.

We won't cover all of the topics listed below at the same depth. Some topics are fundamental and we'll cover them in detail; others indicate further directions of study in linear algebra and we'll treat them as surveys. Besides the topics listed below, we will discuss some applications of linear algebra to other parts of mathematics and statistics and to the physical and social sciences.

Matrices. Matrix addition and scalar multiplication. Matrix multiplication. Matrix algebra. Matrix inverses. Powers of a matrix. The transpose and symmetric matrices. Vectors: their addition, subtraction, and multiplication by scalars (i.e., real numbers). Graphical interpretation of these vector operations; developing geometric insight. Inner products and norms in R n: inner products of vectors (also called dot products), norm of a vector (also called length), unit vectors. Applications of inner products in R n: lines and planes in R 3, and lines and hyperplanes in R n.
Matrix inverses. Elementary matrices. Introduction to determinants: 2x2 and 3x3 determinants, areas of triangles and parallelograms in the plane, volumes of parallelepipeds, Jacobians. Characterizing properties and constructions of determinants, cofactors, diagonal and triangular matrices. More properties of determinants, an algorithm for evaluating determinants, determinants of products, inverses, and transposes, Cramer's rule. Permutations and determinants. Cross products.
The rank of a matrix. Rank and systems of linear equations. Range.
Fields. Vector Spaces, their axiomatic definition. Properties of vector spaces that follow from the axioms. Subspaces of vector spaces. Linear span.
Linear independence. Linear combinations and basis. Span, and independence. Bases. Coordinates. Dimension. Basis and dimension in R n .
Linear transformations. Linear transformations and matrices. Some linear transformations of the plane R 2. Range and null space. Coordinates. Composition and categories. Change of basis and similarity.
Eigenvalues, eigenvectors, and eigenspaces. Rotations and complex eigenvalues. Diagonalizable square matrices.
Powers of matrices. Systems of difference equations. Linear differential equations.
Inner products. Norms and inner products in C n and abstract inner product spaces. Cauchy's inequality. Orthogonality. Orthogonal matrices. The Gram-Schmidt orthonormalisation process.
Orthogonal diagonalisation of symmetric matrices. Quadratic forms.
The direct sum of two subspaces. Orthogonal complements. Projections. Characterizing projections and orthogonal projections. Orthogonal projection onto the range of a matrix. Minimizing the distance to a subspace. Fitting functions to data: least squares approximation.
Complex numbers. Dave's Short Course on Complex Numbers. Complex vector spaces. Complex matrices. Complex inner product spaces. Hermitian conjugates. Unitary diagonalisation and normal matrices. Spectral decomposition.
• Linear transformations. The definition of a linear transformation L: V → W from the domain space V to the codomain space W. When V = W, L is also called a linear operator on V.
• Examples L: R n → R m. Linear operators on R 2 including rotations and reflections, dilations and contractions, shear transformations, projections, the identity and zero transformations
• The null space (kernel) and the range (image) of a transformation, and their dimensions, the nullity and rank of the transformation
• The dimension theorem: the rank plus the nullity equals the dimension of the domain
• Matrix representation of a linear transformation between finite dimensional vector spaces with specified bases
• Operations on linear transformations V → W. The vector space of all linear transformations V → W. Composition of linear transformations
• Corresponding matrix operations, in particular, matrix multiplication corresponds to composition of linear transformations. Powers of square matrices. Matrix operations in Matlab
• Invertibility and isomorphisms. Invariance of dimension under isomorphism. Inverse matrices
• The change of coordinate matrix between two different bases of a vector space. Similar matrices.
• Dual Spaces.
• [A matrix representation for complex numbers, and another for quaternions. Historical note on quaternions.]
• Elementary row operations and elementary matrices.
• The rank of a matrix (row rank) and of its dual (its column rank).
• An algorithm for inverting a matrix. Matrix inversion in Matlab
• Systems of linear equations in terms of matrices. Coefficient matrix and augmented matrix. Homogeneous and nonhomogeneous equations. Solution space, consistency and inconsistency of systems.
• Reduced row-echelon form, the method of elimination (sometimes called Gaussian elimination or Gauss-Jordan reduction)
• Determinants of order 2 (2x2 determinants). Multilinearity. Inverse of a 2x2 matrix. Signed area of a plane parallelogram, area of a triangle.
• n×n determinants. Cofactor expansion
• Computing determinants in Matlab
• Properties of determinants. Transposition, effect of elementary row operations, multilinearity. Determinants of products, inverses, and transposes. Cramer&rsquos rule for solving n equations in n unknowns.
• Signed volume of a parallelepiped in 3-space
• [Optional topic: permutations and inversions of permutations; even and odd permutations]
• [Optional topic: cross products in R 3 ]
• An eigenspace of a linear operator is a subspace in which the operator acts as multiplication by a constant, called the eigenvalue (also called the characteristic value). The vectors in the eigenspace are called eigenvectors for that eigenvalue.
• Geometric interpretation of eigenvectors and eigenvalues. Fixed points and the 1-eigenspace. Projections and their 0-eigenspace. Reflections have a −1-eigenspace.
• Diagonalization question.
• Characteristic polynomial.
• Complex eigenvalues and rotations.
• An algorithm for computing eigenvalues and eigenvectors
• Inner products for real and complex vector spaces (for real vector spaces, inner products are also called dot products or scalar products) and norms (also called lengths or absolute values). Inner product spaces. Vectors in Matlab.
• The triangle inequality and the Cauchy-Schwarz inequality, other properties of inner products
• The angle between two vectors
• Orthogonality of vectors ("orthogonal" and "normal" are other words for "perpendicular")
• Unit vectors and standard unit vectors in R n
• Orthonormal basis
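Several of the items above (orthogonality, orthonormal bases, the Gram-Schmidt process) can be illustrated concretely. The sketch below (Python with NumPy; it is an illustration, not part of the course materials, and the input vectors are made-up examples) orthonormalises two independent vectors:

```python
import numpy as np

def gram_schmidt(vectors):
    """Classical Gram-Schmidt: turn linearly independent vectors into an
    orthonormal list spanning the same subspace."""
    basis = []
    for v in vectors:
        # Subtract the projections onto the vectors already in the basis
        w = v - sum(np.dot(v, b) * b for b in basis)
        basis.append(w / np.linalg.norm(w))   # normalize to a unit vector
    return basis

vecs = [np.array([1.0, 1.0, 0.0]), np.array([1.0, 0.0, 1.0])]
q1, q2 = gram_schmidt(vecs)
print(round(float(np.dot(q1, q2)), 10))   # 0.0  (orthogonal)
print(round(float(np.linalg.norm(q1)), 10), round(float(np.linalg.norm(q2)), 10))
```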

Class notes, quizzes, tests, homework assignments

• Some Assignments
1. Exercises 1.1 through 1.7 page 53, and problems 1.1 through 1.6 page 55.
2. Problems 1.8 through 1.14 page 57.
3. Problems 2.1 through 2.8 page 86.
4. Exercises 3.1 through 3.8, 3.11 page 125. (Note these are the exercises, not the problems.)
5. Exercises 4.1 through 4.6, page 144.
6. Problems 5.1 through 5.7, page 170.
7. Various problems from chapters 6 and 7. (different assignments in the different class sections)

• Structures of theorems and proofs, synthetic and analytic proofs, logical symbols, and well-written proofs
• A little bit about sets

## Why Do It This Way?

This may seem an odd and complicated way of multiplying, but it is necessary!

I can give you a real-life example to illustrate why we multiply matrices in this way.

### Example: The local shop sells 3 types of pies.

• Apple pies cost $3 each
• Cherry pies cost $4 each
• Blueberry pies cost $2 each

And this is how many they sold in 4 days:

|           | Mon | Tue | Wed | Thu |
|-----------|-----|-----|-----|-----|
| Apple     | 13  | 9   | 7   | 15  |
| Cherry    | 8   | 7   | 4   | 6   |
| Blueberry | 6   | 4   | 0   | 3   |

Now think about this: the value of sales for Monday is calculated this way:

($3, $4, $2) • (13, 8, 6) = $3×13 + $4×8 + $2×6 = $83

So it is, in fact, the "dot product" of prices and how many were sold.

We match the price to how many sold, multiply each, then sum the result.

• The sales for Monday were: Apple pies: $3×13 = $39, Cherry pies: $4×8 = $32, and Blueberry pies: $2×6 = $12. Together that is $39 + $32 + $12 = $83
• And for Tuesday: $3×9 + $4×7 + $2×4 = $63
• And for Wednesday: $3×7 + $4×4 + $2×0 = $37
• And for Thursday: $3×15 + $4×6 + $2×3 = $75

So it is important to match each price to each quantity.

Now you know why we use the "dot product".

And here is the full result in Matrix form: the row of prices ($3, $4, $2) times the 3×4 table of quantities gives the row of daily sales ($83, $63, $37, $75).

They sold $83 worth of pies on Monday, $63 on Tuesday, etc.

(You can put those values into the Matrix Calculator to see if they work.)
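The four daily totals are exactly a row-times-matrix product, so the whole example fits in a few lines of NumPy (a sketch using the prices and quantities from the example above):

```python
import numpy as np

prices = np.array([3, 4, 2])          # apple, cherry, blueberry ($ each)
sales = np.array([[13, 9, 7, 15],     # apple pies sold Mon-Thu
                  [8,  7, 4,  6],     # cherry pies sold Mon-Thu
                  [6,  4, 0,  3]])    # blueberry pies sold Mon-Thu

# Each daily total is the dot product of prices with that day's column
print(prices @ sales)                 # [83 63 37 75]
```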

## Subsection 5.6.2 Stochastic Matrices and the Steady State

In this subsection, we discuss difference equations representing probabilities, like the Red Box example. Such systems are called Markov chains. The most important result in this section is the Perron–Frobenius theorem, which describes the long-term behavior of a Markov chain.

##### Definition

A square matrix is stochastic if all of its entries are nonnegative and the entries of each column sum to 1.

A matrix is positive if all of its entries are positive numbers.

A positive stochastic matrix is a stochastic matrix whose entries are all positive numbers. In particular, no entry is equal to zero. For instance, the first matrix below is a positive stochastic matrix, and the second is not:

##### Remark

More generally, a regular stochastic matrix is a stochastic matrix some power of which has all positive entries.

The Perron–Frobenius theorem below also applies to regular stochastic matrices.

##### Example

Continuing with the Red Box example, the matrix A of transition probabilities is a positive stochastic matrix. The fact that its columns sum to 1 says that all of the movies rented from a particular kiosk must be returned to some other kiosk (remember that every customer returns their movie the next day). For instance, the first column says:

Of the movies rented from kiosk 1:

• 30% will be returned to kiosk 1
• 30% will be returned to kiosk 2
• 40% will be returned to kiosk 3.

These fractions sum to 1, as all of the movies are returned to one of the three kiosks.

Multiplication by the matrix A represents the change of state from one day to the next: if vt is the vector whose entries are the numbers of copies of Prognosis Negative at the three kiosks on day t, then vt+1 = Avt. The sum of the entries of Avt equals the sum of the entries of vt; this says that the total number of copies of Prognosis Negative in the three kiosks does not change from day to day, as we expect.

The fact that the entries of these vectors all sum to the same number is a consequence of the following fact, which holds because the columns of a stochastic matrix sum to 1:

##### Fact

Let A be a stochastic matrix and let v be a vector. Then the sum of the entries of Av equals the sum of the entries of v.

Computing the long-term behavior of a difference equation turns out to be an eigenvalue problem. The eigenvalues of stochastic matrices have very special properties.

##### Fact

Let A be a stochastic matrix. Then:

1. 1 is a (real or complex) eigenvalue of A.
2. If λ is a (real or complex) eigenvalue of A, then |λ| ≤ 1.

##### Proof

If A is stochastic, then the rows of Aᵀ sum to 1. But multiplying a matrix by the vector (1, 1, …, 1) sums its rows, so Aᵀ(1, 1, …, 1)ᵀ = (1, 1, …, 1)ᵀ, and 1 is an eigenvalue of Aᵀ. A matrix and its transpose have the same characteristic polynomial:

det(Aᵀ − λI) = det((A − λI)ᵀ) = det(A − λI),

so 1 is also an eigenvalue of A.

For the second claim, let λ be an eigenvalue of Aᵀ with eigenvector x, so that Aᵀx = λx. The jth entry of this vector equation is

Σᵢ aᵢⱼ xᵢ = λxⱼ.

Choose j so that xⱼ is the entry of x with the largest absolute value, so that |xᵢ| ≤ |xⱼ| for all i. Then

|λ| · |xⱼ| = |Σᵢ aᵢⱼ xᵢ| ≤ Σᵢ aᵢⱼ |xᵢ| ≤ Σᵢ aᵢⱼ |xⱼ| = |xⱼ|,

where the last equality holds because the columns of A sum to 1. Hence |λ| ≤ 1; since A and Aᵀ have the same eigenvalues, the same bound holds for the eigenvalues of A.

In fact, for a positive stochastic matrix A, one can say more: if λ ≠ 1 is a (real or complex) eigenvalue of A, then |λ| < 1.

The 1-eigenspace of a stochastic matrix is very important.

##### Definition

A steady state of a stochastic matrix A is an eigenvector w with eigenvalue 1 such that its entries are positive and sum to 1.

The Perron–Frobenius theorem describes the long-term behavior of a difference equation represented by a stochastic matrix. Its proof is beyond the scope of this text.

##### Perron–Frobenius Theorem

Let A be a positive stochastic matrix. Then A admits a unique steady state vector w, which spans the 1-eigenspace. Moreover, if v0 is any vector whose entries sum to some number c, then the iterates Aᵗv0 approach cw as t becomes large.

Translation: The Perron–Frobenius theorem makes the following assertions:

• The 1-eigenspace of A is a line, and it contains a unique vector w with positive entries summing to 1.
• All vectors approach the 1-eigenspace upon repeated multiplication by A, without changing the sum of their entries.

One should think of a steady state vector w as a vector of percentages. For example, if the movies are distributed according to these percentages today, then they will have the same distribution tomorrow, since Aw = w. And no matter the starting distribution of movies, the long-term distribution will always be the steady state vector.

The sum c of the entries of v0 is the total number of things in the system being modeled. The total number does not change, so the long-term state of the system must approach cw: it is contained in the 1-eigenspace, and the entries of cw sum to c.

##### Recipe 1: Compute the steady state vector

Let A be a positive stochastic matrix. Here is how to compute the steady-state vector of A:

1. Find any eigenvector v of A with eigenvalue 1, e.g., by solving (A − I)v = 0.
2. Divide v by the sum of the entries of v; the resulting vector is the steady state vector w.

The above recipe is suitable for calculations by hand, but it does not take advantage of the fact that A is a stochastic matrix. In practice, it is generally faster to compute a steady state vector by computer as follows:

##### Recipe 2: Approximate the steady state vector by computer

Let A be a positive stochastic matrix. Here is how to approximate the steady-state vector of A: choose any vector v0 whose entries sum to 1, and compute v1 = Av0, v2 = Av1, and so on; the iterates vt approach the steady state vector w.
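This iteration can be sketched in a few lines of Python with NumPy. The matrix below is an assumption: a positive stochastic matrix chosen to be consistent with the figures quoted in the Red Box discussion (first column 30%/30%/40%, steady state near 38.9%, 33.3%, 27.8%), not a matrix displayed in the text:

```python
import numpy as np

# Assumed positive stochastic matrix (each column sums to 1), consistent
# with the first-column and steady-state percentages quoted in the text.
A = np.array([[0.3, 0.4, 0.5],
              [0.3, 0.4, 0.3],
              [0.4, 0.2, 0.2]])

v = np.array([1.0, 0.0, 0.0])   # any starting vector with entries summing to 1
for _ in range(50):
    v = A @ v                   # iterate v <- Av; entries keep summing to 1

print(np.round(v, 6))           # ~ [0.388889 0.333333 0.277778] = (7, 6, 5)/18
```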

##### Example

Continuing with the Red Box example, we can illustrate the Perron–Frobenius theorem explicitly. Computing the characteristic polynomial of the matrix A, one finds that the eigenvalue 1 is strictly greater in absolute value than the other eigenvalues, and that it has algebraic (hence, geometric) multiplicity 1.

We compute eigenvectors for the eigenvalues. The eigenvector with eigenvalue 1, normalized so that its entries sum to 1, necessarily has positive entries; this normalized eigenvector is the steady-state vector.

Iterating multiplication by A, the iterates approach a limit which is an eigenvector with eigenvalue 1, as guaranteed by the Perron–Frobenius theorem.

What do the above calculations say about the number of copies of Prognosis Negative in the Atlanta Red Box kiosks? Suppose that the kiosks start with 100 copies of the movie, and let v0 be the vector describing this state. Then there will be Av0 movies in the kiosks the next day, A²v0 the day after that, and so on. We compute the successive iterates. (Of course it does not make sense to have a fractional number of movies; the decimals are included here to illustrate the convergence.) The steady-state vector says that eventually, the movies will be distributed in the kiosks according to the percentages

38.888888% 33.333333% 27.777778%

which agrees with the above table. Moreover, this distribution is independent of the beginning distribution of movies in the kiosks.

Now we turn to visualizing the dynamics of (i.e., repeated multiplication by) the matrix A. This matrix is diagonalizable. In the coordinate system defined by its eigenvectors, multiplication by A leaves the coordinate along the 1-eigenspace unchanged and scales the remaining coordinates by the other eigenvalues, which are smaller than 1 in absolute value. Repeated multiplication therefore makes those coordinates very small, so it "sucks all vectors into the 1-eigenspace", without changing the sum of the entries of the vectors.


The picture of a positive stochastic matrix is always the same, whether or not it is diagonalizable: all vectors are "sucked into the 1-eigenspace," which is a line, without changing the sum of the entries of the vectors. This is the geometric content of the Perron–Frobenius theorem.

## Matrix Calculator

A matrix, in a mathematical context, is a rectangular array of numbers, symbols, or expressions that are arranged in rows and columns. Matrices are often used in scientific fields such as physics, computer graphics, probability theory, statistics, calculus, numerical analysis, and more.

The dimensions of a matrix, A, are typically denoted as m × n. This means that A has m rows and n columns. When referring to a specific value in a matrix, called an element, a variable with two subscripts is often used to denote each element based on their position in the matrix. For example, given ai,j, where i = 1 and j = 3, a1,3 is the value of the element in the first row and the third column of the given matrix.

Matrix operations such as addition, multiplication, subtraction, etc., are similar to what most people are likely accustomed to seeing in basic arithmetic and algebra, but do differ in some ways, and are subject to certain constraints. Below are descriptions of the matrix operations that this calculator can perform.

Matrix addition can only be performed on matrices of the same size. This means that you can only add matrices if both matrices are m × n. For example, you can add two or more 3 × 3, 1 × 2, or 5 × 4 matrices. You cannot add a 2 × 3 and a 3 × 2 matrix, a 4 × 4 and a 3 × 3, etc. The number of rows and columns of all the matrices being added must exactly match.

If the matrices are the same size, matrix addition is performed by adding the corresponding elements in the matrices. For example, given two matrices, A and B, with elements ai,j, and bi,j, the matrices are added by adding each element, then placing the result in a new matrix, C, in the corresponding position in the matrix:

In the above matrices, a1,1 = 1, a1,2 = 2, b1,1 = 5, b1,2 = 6, etc. We add the corresponding elements to obtain ci,j. Adding the values in the corresponding rows and columns:

a1,1 + b1,1 = 1 + 5 = 6 = c1,1
a1,2 + b1,2 = 2 + 6 = 8 = c1,2
a2,1 + b2,1 = 3 + 7 = 10 = c2,1
a2,2 + b2,2 = 4 + 8 = 12 = c2,2

Thus, matrix C is:
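Assuming the 2 × 2 matrices implied by the element values above (A = [[1, 2], [3, 4]] and B = [[5, 6], [7, 8]], an inference from the worked sums), the addition can be reproduced in NumPy, where `+` on same-size arrays is exactly elementwise addition:

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[5, 6],
              [7, 8]])

C = A + B       # elementwise addition of two same-size matrices
print(C)        # [[ 6  8]
                #  [10 12]]
```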

### Matrix subtraction

Matrix subtraction is performed in much the same way as matrix addition, described above, with the exception that the values are subtracted rather than added. If necessary, refer to the information and examples above for description of notation used in the example below. Like matrix addition, the matrices being subtracted must be the same size. If the matrices are the same size, then matrix subtraction is performed by subtracting the elements in the corresponding rows and columns:

Thus, matrix C is:

### Matrix multiplication

Scalar multiplication:

Matrices can be multiplied by a scalar value by multiplying each element in the matrix by the scalar. For example, given a matrix A and a scalar c:

The product of c and A is:

Matrix-matrix multiplication:

Multiplying two (or more) matrices is more involved than multiplying by a scalar. In order to multiply two matrices, the number of columns in the first matrix must match the number of rows in the second matrix. For example, you can multiply a 2 × 3 matrix by a 3 × 4 matrix, but not a 2 × 3 matrix by a 4 × 3.

Note that when multiplying matrices, A × B does not necessarily equal B × A. In fact, just because A can be multiplied by B doesn't mean that B can be multiplied by A.

If the matrices are the correct sizes, and can be multiplied, matrices are multiplied by performing what is known as the dot product. The dot product involves multiplying the corresponding elements in the row of the first matrix, by that of the columns of the second matrix, and summing up the result, resulting in a single value. The dot product can only be performed on sequences of equal lengths. This is why the number of columns in the first matrix must match the number of rows of the second.

The dot product then becomes the value in the corresponding row and column of the new matrix, C. For example, from the section above of matrices that can be multiplied, the blue row in A is multiplied by the blue column in B to determine the value in the first column of the first row of matrix C. This is referred to as the dot product of row 1 of A and column 1 of B:

The dot product is performed for each row of A and each column of B until all combinations of the two are complete in order to find the value of the corresponding elements in matrix C. For example, when you perform the dot product of row 1 of A and column 1 of B, the result will be c1,1 of matrix C. The dot product of row 1 of A and column 2 of B will be c1,2 of matrix C, and so on, as shown in the example below:

When multiplying two matrices, the resulting matrix will have the same number of rows as the first matrix, in this case A, and the same number of columns as the second matrix, B. Since A is 2 × 3 and B is 3 × 4, C will be a 2 × 4 matrix. The colors here can help determine first, whether two matrices can be multiplied, and second, the dimensions of the resulting matrix. Next, we can determine the element values of C by performing the dot products of each row and column, as shown below:

Below, the calculation of the dot product for each row and column of C is shown:

c1,1 = 1×4 + 2×8 + 1×0 = 20
c1,2 = 1×5 + 2×9 + 1×0 = 23
c1,3 = 1×1 + 2×1 + 1×1 = 4
c1,4 = 1×1 + 2×1 + 1×1 = 4
c2,1 = 3×4 + 4×8 + 1×0 = 44
c2,2 = 3×5 + 4×9 + 1×0 = 51
c2,3 = 3×1 + 4×1 + 1×1 = 8
c2,4 = 3×1 + 4×1 + 1×1 = 8
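The row-by-column rule can be sketched in a few lines of Python. The original A and B appeared as figures, so the entries of B below are assumptions chosen to be consistent with the products shown (20, 23, 4, 4, 44, 51, 8, 8):

```python
def matmul(A, B):
    """Multiply matrices given as lists of rows; requires cols(A) == rows(B)."""
    if len(A[0]) != len(B):
        raise ValueError("columns of A must match rows of B")
    # each entry c[i][j] is the dot product of row i of A with column j of B
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

A = [[1, 2, 1],
     [3, 4, 1]]
B = [[4, 5, 1, 1],   # assumed entries, consistent with the products above
     [8, 9, 1, 1],
     [0, 0, 1, 1]]
print(matmul(A, B))  # [[20, 23, 4, 4], [44, 51, 8, 8]]
```

Since A is 2 × 3 and B is 3 × 4, the result is 2 × 4, as described above.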

### Power of a matrix

For the purposes of this calculator, "power of a matrix" means raising a given matrix to a given power. For example, when using the calculator, "Power of 2" for a given matrix A means A^2. Exponents for matrices function in the same way as they normally do in math, except that matrix multiplication rules also apply, so only square matrices (matrices with an equal number of rows and columns) can be raised to a power. This is because a non-square matrix A cannot be multiplied by itself: A × A in this case is not possible to compute. Refer to the matrix multiplication section, if necessary, for a refresher on how to multiply matrices. Given:

A raised to the power of 2 is:

As with exponents in other mathematical contexts, A^3 equals A × A × A, A^4 equals A × A × A × A, and so on.
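Repeated multiplication is all that is needed to raise a square matrix to a positive integer power. A minimal sketch, using an assumed example matrix:

```python
def matmul(A, B):
    """Row-by-column matrix multiplication (see the previous section)."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def matrix_power(A, n):
    """Raise square matrix A to positive integer power n by repeated multiplication."""
    assert len(A) == len(A[0]), "only square matrices can be raised to a power"
    result = A
    for _ in range(n - 1):
        result = matmul(result, A)
    return result

A = [[1, 2], [3, 4]]          # assumed example matrix
print(matrix_power(A, 2))     # [[7, 10], [15, 22]]
```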

### Transpose of a matrix

The transpose of a matrix, typically indicated with a "T" as an exponent, is an operation that flips a matrix over its diagonal. This switches the row and column indices of the matrix, meaning that a_ij in matrix A becomes a_ji in A^T. If necessary, refer above for a description of the notation used.

An m × n matrix, transposed, would therefore become an n × m matrix, as shown in the examples below:
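The index swap can be written directly as a comprehension; the 2 × 3 example values below are assumed:

```python
def transpose(M):
    """Swap row and column indices: entry (i, j) becomes entry (j, i)."""
    return [[M[i][j] for i in range(len(M))] for j in range(len(M[0]))]

M = [[1, 2, 3],
     [4, 5, 6]]         # a 2 x 3 matrix (assumed example values)
print(transpose(M))     # [[1, 4], [2, 5], [3, 6]] -- a 3 x 2 matrix
```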

### Determinant of a matrix

The determinant of a matrix is a value that can be computed from the elements of a square matrix. It is used in linear algebra, calculus, and other mathematical contexts. For example, the determinant can be used to compute the inverse of a matrix or to solve a system of linear equations.

There are a number of methods and formulas for calculating the determinant of a matrix. The Leibniz formula and the Laplace formula are two commonly used formulas.

Determinant of a 2 × 2 matrix:

The determinant of a 2 × 2 matrix can be calculated using the Leibniz formula, which involves some basic arithmetic. Given matrix A:

The determinant of A using the Leibniz formula is:

Note that taking the determinant is typically indicated with "| |" surrounding the given matrix. Given:
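The 2 × 2 Leibniz rule, |A| = ad - bc, can be sketched as follows (the example entries are assumed):

```python
def det2(A):
    """Leibniz formula for a 2 x 2 determinant: |A| = ad - bc."""
    (a, b), (c, d) = A
    return a * d - b * c

print(det2([[3, 8], [4, 6]]))  # 3*6 - 8*4 = -14
```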

Determinant of a 3 × 3 matrix:

One way to calculate the determinant of a 3 × 3 matrix is through the use of the Laplace formula. Both the Laplace formula and the Leibniz formula can be represented mathematically, but involve the use of notations and concepts that won't be discussed here. Below is an example of how to use the Laplace formula to compute the determinant of a 3 × 3 matrix:

From this point, we can use the Leibniz formula for a 2 × 2 matrix to calculate the determinant of the 2 × 2 matrices, and since scalar multiplication of a matrix just involves multiplying all values of the matrix by the scalar, we can multiply the determinant of the 2 × 2 by the scalar as follows:

This can further be simplified to:

|A| = aei + bfg + cdh - ceg - bdi - afh

This is the Leibniz formula for a 3 × 3 matrix.

Determinant of a 4 × 4 matrix and higher:

The determinant of a 4 × 4 matrix and higher can be computed in much the same way as that of a 3 × 3, using the Laplace formula or the Leibniz formula. As with the 3 × 3 example above, you may notice a pattern that essentially allows you to "reduce" the given matrix to a series of scalars multiplied by determinants of matrices of reduced dimensions, i.e. a 4 × 4 reduces to a series of scalars multiplied by 3 × 3 matrices, where successive scalar × reduced-matrix terms have alternating positive and negative signs (i.e. they are added or subtracted).

The process involves cycling through each element in the first row of the matrix. Eventually, we end up with an expression in which each element in the first row is multiplied by the determinant of a lower-dimension (than the original) matrix. The elements of the lower-dimension matrix are determined by blocking out the row and column that the chosen element is part of; the remaining elements comprise the lower-dimension matrix. Refer to the example below for clarification.

Here, we first choose element a. Blocking out a's row and column, the remaining elements form the 3 × 3 matrix we need to find the determinant of:

Next, we choose element b:

Continuing in the same manner for elements c and d, and alternating the sign (+ - + - ...) of each term:

We continue the process as we would a 3 × 3 matrix (shown above), until we have reduced the 4 × 4 matrix to a scalar multiplied by a 2 × 2 matrix, which we can calculate the determinant of using Leibniz's formula. As can be seen, this gets tedious very quickly, but is a method that can be used for n × n matrices once you have an understanding of the pattern. There are other ways to compute the determinant of a matrix which can be more efficient, but require an understanding of other mathematical concepts and notations.
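The reduction just described can be written as a short recursive function: expand along the first row, alternate signs, and recurse until the 2 × 2 base case. A sketch, with assumed example entries:

```python
def det(M):
    """Determinant by Laplace (cofactor) expansion along the first row."""
    n = len(M)
    if n == 1:
        return M[0][0]
    if n == 2:                       # Leibniz formula for the 2 x 2 base case
        return M[0][0] * M[1][1] - M[0][1] * M[1][0]
    total = 0
    for j in range(n):
        # minor: block out row 0 and column j
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * M[0][j] * det(minor)   # alternating + - + - signs
    return total

A = [[2, -3, 1],
     [2,  0, -1],
     [1,  4,  5]]       # assumed example entries
print(det(A))           # 49
```

This gets slow for large n (it does n! products, mirroring the tedium noted above); elimination-based methods are far more efficient in practice.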

### Inverse of a matrix

The inverse of a matrix A is denoted A^-1, where A^-1 is the inverse of A if the following is true:

A × A^-1 = A^-1 × A = I, where I is the identity matrix

The identity matrix is a square matrix with "1" across its diagonal, and "0" everywhere else. The identity matrix is the matrix equivalent of the number "1." For example, the number 1 multiplied by any number n equals n. The same is true of an identity matrix multiplied by a matrix of the same size: A × I = A. Note that an identity matrix can have any square dimensions. For example, all of the matrices below are identity matrices. From left to right respectively, the matrices below are a 2 × 2, 3 × 3, and 4 × 4 identity matrix:

The n × n identity matrix is thus:
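Generating an n × n identity matrix is a one-liner: 1 where the row and column indices agree, 0 everywhere else:

```python
def identity(n):
    """n x n identity matrix: 1 on the diagonal, 0 elsewhere."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

print(identity(3))  # [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
```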

Inverse of a 2 × 2 matrix:

To invert a 2 × 2 matrix, the following equation can be used:

If you were to test that this is in fact the inverse of A you would find that both:

are equal to the identity matrix:
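A sketch of the standard 2 × 2 inverse formula, 1/(ad - bc) times the matrix with a and d swapped and b and c negated; the example entries are assumed:

```python
def inverse_2x2(A):
    """Invert a 2 x 2 matrix: A^-1 = 1/(ad - bc) * [[d, -b], [-c, a]]."""
    (a, b), (c, d) = A
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular, no inverse exists")
    return [[ d / det, -b / det],
            [-c / det,  a / det]]

A = [[4, 7], [2, 6]]    # assumed example values; det = 4*6 - 7*2 = 10
print(inverse_2x2(A))   # [[0.6, -0.7], [-0.2, 0.4]]
```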

Inverse of a 3 × 3 matrix:

The inverse of a 3 × 3 matrix is more tedious to compute. An equation for doing so is provided below, but will not be computed. Given:

A = ei - fh
B = -(di - fg)
C = dh - eg
D = -(bi - ch)
E = ai - cg
F = -(ah - bg)
G = bf - ce
H = -(af - cd)
I = ae - bd
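The nine cofactors listed above assemble into the adjugate matrix, and dividing by the determinant gives the inverse. A sketch with assumed example entries:

```python
def inverse_3x3(M):
    """3 x 3 inverse via cofactors: A^-1 = 1/|M| * [[A, D, G], [B, E, H], [C, F, I]]."""
    (a, b, c), (d, e, f), (g, h, i) = M
    # the nine cofactors
    A =  e*i - f*h;  B = -(d*i - f*g);  C =  d*h - e*g
    D = -(b*i - c*h); E =  a*i - c*g;   F = -(a*h - b*g)
    G =  b*f - c*e;  H = -(a*f - c*d);  I =  a*e - b*d
    det = a*A + b*B + c*C               # expansion along the first row
    if det == 0:
        raise ValueError("matrix is singular, no inverse exists")
    adjugate = [[A, D, G], [B, E, H], [C, F, I]]
    return [[x / det for x in row] for row in adjugate]

M = [[1, 2, 3],
     [0, 1, 4],
     [5, 6, 0]]         # assumed example values; det = 1
print(inverse_3x3(M))   # [[-24.0, 18.0, 5.0], [20.0, -15.0, -4.0], [-5.0, 4.0, 1.0]]
```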

Inverses of 4 × 4 matrices and larger are increasingly complicated to compute, and there are other methods for computing them.

## STAT 542: MULTIVARIATE STATISTICAL ANALYSIS: CLASSICAL THEORY AND RECENT DEVELOPMENTS

Prereq: STAT 581-582 plus linear algebra and matrix theory. In particular, familiarity with hypothesis testing, decision theory, and invariance. BIOSTAT/STAT 533 (univariate linear models) is also helpful.

The first 3/4 of the course will concentrate on "classical" multivariate analysis, i.e., distribution theory and statistical inference based on the multivariate normal distribution. The last 1/4 will cover special topics of interest to the instructor and/or requested by the class. There will be several homework assignments. Time permitting, each registered student will report on a topic of interest to her/him.

Topics include (as time permits):

0. Brief review of matrix algebra and the multivariate normal distribution: pdf, marginal and conditional distributions, covariance matrix, correlations and partial correlations.

1. The Wishart distribution: definition and properties, distribution of the sample covariance matrix, marginal and conditional distributions.

2. Estimation and testing: likelihood inference and invariance. Hotelling's T^2 test, multivariate linear models and MANOVA, testing independence, Bartlett's tests for equality of covariance matrices. The James-Stein estimator for the mean vector, the Stein estimator for the covariance matrix.

3. Distributions derived from the Wishart distribution and their role in hypothesis testing: eigenvalues, principal components, canonical correlations. Jacobians of multivariate distributions. Stein's integral representation of the density of a maximal invariant statistic.

4. Group symmetry in estimation and testing (the Copenhagen theory).

5. Multivariate probability inequalities and their applications to the power of multivariate tests and multiparameter confidence intervals.

6. Lattice conditional independence models and their applications to missing data problems and "seemingly unrelated regression" models.

Anderson, T. W. (2003). An Introduction to Multivariate Statistical Analysis (3rd ed). Wiley, New York.

Andersson, S. A. (1999). An Introduction to Multivariate Statistical Analysis, Lecture Notes, Indiana University.

Bilodeau, M. and Brenner, D. (1999). Theory of Multivariate Statistics. Springer, New York.

Eaton, M. L. (1983). Multivariate Statistics. Wiley, New York.

Eaton, M. L. (1989). Group Invariance Applications in Statistics. IMS-ASA.

Lehmann, E. L. and Romano, J. P. (2005). Testing Statistical Hypotheses, 3rd ed. Wiley, New York.

Muirhead, R. J. (1982). Aspects of Multivariate Statistical Theory. Wiley, New York.

Seber, G. A. F. (1984). Multivariate Observations. Wiley, New York.

Anderson, T. W. and Perlman, M. D. (1993). Parameter consistency of invariant tests for MANOVA and related multivariate hypotheses. Statistics and Probability: A Raghu Raj Bahadur Festschrift (J.K. Ghosh, S.K. Mitra, K.R. Parthasarathy, B.L.S. Prakasa Rao, eds.), 37-62. Wiley Eastern Ltd.

Andersson, S. A. (1990). The lattice structure of orthogonal linear models and orthogonal variance component models. Scand. J. Statist. 17 287-319.

Andersson, S. A., Brons, H. K., and Tolver Jensen, S. (1983). Distribution of eigenvalues in multivariate statistical analysis. Ann. Statist. 11 392-415.

Andersson, S. A. and Klein, T. (2010) On Riesz and Wishart distributions associated with decomposable undirected graphs. J. Multivariate Analysis 101 789-810.

Andersson, S. A. and Madsen, J. (1998). Symmetry and lattice conditional independence in a multivariate normal distribution. Ann. Statist. 26 525-572.

Andersson, S. A. and Perlman, M. D. (1991). Lattice-ordered conditional independence models for missing data. Statistics and Probability Letters 12 465-486.

Andersson, S. A. and Perlman, M. D. (1993). Lattice models for conditional independence in a multivariate normal distribution. Ann. Statist. 21 1318-1358.

Andersson, S. A. and Perlman, M. D. (1994). Normal linear models with lattice conditional independence restrictions. In Multivariate Analysis and its Applications (T.W. Anderson, K.T. Fang, I. Olkin,eds.), IMS Lecture Notes-Monograph Series Vol. 24 97-110.

Andersson, S. A. and Perlman, M. D. (1998). Normal linear regression models with recursive graphical Markov structure. J. Multivariate Analysis 66 133-187.

Andersson, S. A. and Wojnar, G. G. (2004). Wishart distributions on homogeneous cones. J. Theoret. Prob. 17 781-818.

Daniels, M. J. and Kass, R. E. (2001). Shrinkage estimators for covariance matrices. Biometrics 57 1173-1184.

Das Gupta, S., Anderson, T. W., and Mudholkar, G. S. (1964). Monotonicity of the power functions of some tests of the multivariate linear hypothesis. Ann. Math. Statist. 35 200-205.

Drton, M., Andersson, S. A., and Perlman, M. D. (2005). Lattice conditional independence models for seemingly unrelated regression models with missing data. J. Multivariate Analysis 97 385-411.

Joe, H. (2006). Generating random correlation matrices based on partial correlations. J. Multivariate Analysis 97 2177-2189.

Also see http://ms.mcmaster.ca/canty/seminars/Joe_vinecorr_print.pdf

Kiefer, J. and Schwartz, R. (1965). Admissible Bayes character of T2-, R2-, and other fully invariant tests for classical multivariate normal problems. Ann. Math. Statist. 36 747-770.

Ledet-Jensen, J. (1991). A large deviation-type approximation for the “Box class” of likelihood ratio criteria. J. Amer. Statist. Assoc. 86 437-440.

Madsen, J. (2000). Invariant normal models with recursive graphical Markov structure. Ann. Statist. 28

Marden, J. I. and Perlman, M. D. (1980). Invariant tests for means with covariates. Ann. Statist. 8 825-63.

Okamoto, M. (1973). Distinctness of the eigenvalues of a quadratic form in a multivariate sample. Ann. Statist. 1 763-765.

Perlman, M. D. (1980a). Unbiasedness of the likelihood ratio tests for equality of several covariance matrices and equality of several multivariate normal populations. Ann. Statist. 8 247-263.

Perlman, M. D. (1980b). Unbiasedness of multivariate tests: recent results. Proceedings of the Fifth International Symposium on Multivariate Analysis (P.R. Krishnaiah, ed.), 413-432. [Also see Anderson (2003) Section 8.10.2.]

Perlman, M. D. and Olkin, I. (1980). Unbiasedness of invariant tests for MANOVA and other multivariate problems. Annals of Statistics 8 1326-1341.

Perlman, M. D. (1987). Group symmetry covariance models. (Discussion of "A Review of Multivariate Analysis" by Mark Schervish.) Statistical Science 2, 421-425.

Perlman, M. D. (1990). T.W. Anderson's theorem on the integral of a symmetric unimodal function over a symmetric convex set and its applications in probability and statistics. The Collected Papers of T.W. Anderson: 1943-1985 (G. Styan, ed.), Vol. 2 1627-1641. J. Wiley & Sons, New York.

Schwartz, R. (1967). Admissible tests in multivariate analysis of variance. Ann. Math. Statist. 38, 698-710. [Also see Anderson (2003) Section 8.10.1.]

Stein, C. (1956). The admissibility of Hotelling’s T2-test. Ann. Math. Statist. 27 616-623.

Tolver Jensen, S. (1988). Covariance hypotheses which are linear in both the covariance and the inverse covariance. Ann. Statist. 16 302-322.

Summarize the whole data set. In this example, summarize_all() generates a long-term mean of the data.

Filter out just the January values, and get a long-term mean of those:

Summarize the data by groups, in this case by months. First rearrange the data, and then summarize:

Note that the grouped data set looks almost exactly like the ungrouped one, except that, when listed, it includes a mention of the grouping variable (i.e., Groups: mon).

Calculate annual averages of each variable, using the aggregate() function from the stats package.


##### Course Timetable

The full timetable of all activities for this course can be accessed from Course Planner.

##### Course Learning Outcomes
1. Demonstrate understanding of basic concepts in linear algebra, relating to matrices, vector spaces and eigenvectors.
2. Demonstrate understanding of basic concepts in calculus, relating to functions, differentiation and integration.
3. Employ methods related to these concepts in a variety of applications.
4. Apply logical thinking to problem-solving in context.
5. Use appropriate technology to aid problem-solving.
6. Demonstrate skills in writing mathematics.

This course will provide students with an opportunity to develop the Graduate Attribute(s) specified below:

| University Graduate Attribute | Course Learning Outcome(s) |
| --- | --- |
| Knowledge and understanding of the content and techniques of a chosen discipline at advanced levels that are internationally recognised. | all |
| The ability to locate, analyse, evaluate and synthesise information from a wide variety of sources in a planned and timely manner. | 3,4 |
| An ability to apply effective, creative and innovative solutions, both independently and cooperatively, to current and future problems. | 1,2,3,4,5 |
| A proficiency in the appropriate use of contemporary technologies. | 5 |
| A commitment to continuous learning and the capacity to maintain intellectual curiosity throughout life. | all |

##### Recommended Resources
1. Lay: Linear Algebra and its Applications 4th ed. (Addison Wesley Longman)
2. Stewart: Calculus 7th ed. (international ed.) (Brooks/Cole)
##### Online Learning

This course also makes use of online assessment software for mathematics called Maple TA, which we use to provide students with instantaneous formative feedback.

##### Learning & Teaching Modes

The information below is provided as a guide to assist students in engaging appropriately with the course requirements.

| Activity | Quantity | Workload hours |
| --- | --- | --- |
| Lectures | 48 | 72 |
| Tutorials | 11 | 22 |
| Assignments | 11 | 55 |
| Mid Semester Test | 1 | 6 |
| Total | | 156 |
##### Learning Activities Summary

In Mathematics IA the two topics of algebra and calculus detailed below are taught in parallel, with two lectures a week on each. The tutorials are a combination of algebra and calculus topics, pertaining to the previous week's lectures.

Lecture Outline

Algebra

• Matrices and Linear Equations (8 lectures)
• Algebraic properties of matrices.
• Systems of linear equations, coefficient and augmented matrices. Row operations.
• Gauss-Jordan reduction. Solution set.
• Linear combinations of vectors. Inverse matrix, elementary matrices, application to linear systems.
• Definition and properties. Computation. Adjoint.
• Convex sets, systems of linear inequalities.
• Optimization of a linear functional on a convex set: geometric and algebraic methods.
• Applications.
• Definition. Linear independence, subspaces, basis.
• Definitions and calculation: characteristic equation, trace, determinant, multiplicity.
• Similar matrices, diagonalization. Applications.
• Functions (6 lectures)
• Rational and irrational numbers. Decimal expansions, intervals.
• Domain, range, graph of a function. Polynomial, rational, modulus, step, trig functions, odd and even functions.
• Combining functions, 1-1 and monotonic functions, inverse functions including inverse trig functions.
• Areas, summation notation. Upper and lower sums, area under a curve.
• Properties of the definite integral. Fundamental Theorem of Calculus.
• Revision of differentiation, derivatives of inverse functions.
• Logarithm as area under a curve. Properties.
• Exponential function as inverse of logarithm, properties. Other exponential and log functions. Hyperbolic functions.
• Substitution, integration by parts, partial fractions.
• Trig integrals, reduction formulae. Use of Matlab in evaluation of integrals.
• Riemann sums, trapezoidal and Simpson's rules.

Tutorial 1: Matrices and linear equations. Real numbers, domain and range of functions.

Tutorial 2: Gauss-Jordan elimination. Linear combinations of vectors. Composition of functions, 1-1 functions.
Tutorial 3: Systems of equations. Inverse functions. Exponential functions.
Tutorial 4: Inverse matrices. Summation, upper and lower sums.
Tutorial 5: Determinants. Definite integrals, average value.
Tutorial 6: Convex sets, optimization. Antiderivatives, Fundamental Theorem of Calculus.
Tutorial 7: Optimization. Linear dependence and independence. Differentiation of inverse functions.
Tutorial 8: Linear dependence, span, subspace. Log, exponential and hyperbolic functions.
Tutorial 9: Basis and dimension. Integration.
Tutorial 10: Eigenvalues and eigenvectors. Integration by parts, reduction formulae.
Tutorial 11: Eigenvalues and eigenvectors. Trigonometric integrals.
Tutorial 12: Diagonalization, Markov processes. Numerical integration.
(Note: This tutorial is not an actual class, but is a set of typical problems with solutions provided.)

Note: Precise tutorial content may vary due to the vagaries of public holidays.