Matrix multiplication can be motivated in several ways. The motivation presented here is fairly elementary and gives us a more concise way to express the ideas that follow.

Recall

$$
S:\ \left\{
\begin{aligned}
x_1 + 3x_2 + 3x_3 + 2x_4 + x_5 &= 7 \\
3x_1 + 9x_2 - 6x_3 + 4x_4 + 3x_5 &= -7 \\
2x_1 + 6x_2 - 4x_3 + 2x_4 + 2x_5 &= -4
\end{aligned}
\right.
$$

If we add negative three times the first equation to the second equation we obtain:

$$(-3\rho_1 + \rho_2)\,S:\quad -15x_3 - 2x_4 = -28$$

To be explicit, we also record that zero times the third equation was added:

$$(-3\rho_1 + 1\rho_2 + 0\rho_3)\,S:\quad -15x_3 - 2x_4 = -28$$

Here are a few more such combinations.

Negative three times equation one plus equation three

$$(-3\rho_1 + 0\rho_2 + 1\rho_3)\,S:\quad -x_1 - 3x_2 - 13x_3 - 4x_4 - x_5 = -25$$

Negative three times equation two plus equation three

$$(0\rho_1 - 3\rho_2 + 1\rho_3)\,S:\quad -7x_1 - 21x_2 + 14x_3 - 10x_4 - 7x_5 = 17$$

Equation two only

$$(0\rho_1 + 1\rho_2 + 0\rho_3)\,S:\quad 3x_1 + 9x_2 - 6x_3 + 4x_4 + 3x_5 = -7$$

Sum of all three equations

$$(1\rho_1 + 1\rho_2 + 1\rho_3)\,S:\quad 6x_1 + 18x_2 - 7x_3 + 8x_4 + 6x_5 = -4$$

Collecting all the resulting equations, we obtain a new system, which we denote $S'$:

$$
S':\ \left\{
\begin{aligned}
-15x_3 - 2x_4 &= -28 \\
-x_1 - 3x_2 - 13x_3 - 4x_4 - x_5 &= -25 \\
-7x_1 - 21x_2 + 14x_3 - 10x_4 - 7x_5 &= 17 \\
3x_1 + 9x_2 - 6x_3 + 4x_4 + 3x_5 &= -7 \\
6x_1 + 18x_2 - 7x_3 + 8x_4 + 6x_5 &= -4
\end{aligned}
\right.
$$

We want to encode such transformations so that we can work conveniently with just the (augmented) matrices of the systems of linear equations. The same constants, $-3$ and $1$, appear when:

  1. adding negative three times Equation 1 to Equation 2,
  2. adding negative three times Equation 1 to Equation 3 and
  3. adding negative three times Equation 2 to Equation 3.

However, by being explicit we distinguish between $(-3, 1, 0)$, $(-3, 0, 1)$ and $(0, -3, 1)$, which allows us to encode the above row combinations in a matrix:

$$
\text{rowcomb} =
\begin{pmatrix}
-3 & 1 & 0 \\
-3 & 0 & 1 \\
0 & -3 & 1 \\
0 & 1 & 0 \\
1 & 1 & 1
\end{pmatrix}
$$

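When rowcomb is applied to $S$, the result is the new system of five equations collected above, which we denote $S'$; that is, $\text{rowcomb}\,S = S'$. Written with (augmented) matrices, this is

$$
\begin{pmatrix}
-3 & 1 & 0 \\
-3 & 0 & 1 \\
0 & -3 & 1 \\
0 & 1 & 0 \\
1 & 1 & 1
\end{pmatrix}
\begin{pmatrix}
1 & 3 & 3 & 2 & 1 & 7 \\
3 & 9 & -6 & 4 & 3 & -7 \\
2 & 6 & -4 & 2 & 2 & -4
\end{pmatrix}
=
\begin{pmatrix}
0 & 0 & -15 & -2 & 0 & -28 \\
-1 & -3 & -13 & -4 & -1 & -25 \\
-7 & -21 & 14 & -10 & -7 & 17 \\
3 & 9 & -6 & 4 & 3 & -7 \\
6 & 18 & -7 & 8 & 6 & -4
\end{pmatrix}
$$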
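The row-combination view of this product can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper name `combine_rows` and the list-of-rows representation are choices made here, not notation from the text.

```python
# Augmented matrix of S: one row per equation, last entry is the right-hand side.
S = [
    [1, 3, 3, 2, 1, 7],
    [3, 9, -6, 4, 3, -7],
    [2, 6, -4, 2, 2, -4],
]

# Each row holds the coefficients applied to equations 1, 2 and 3.
rowcomb = [
    [-3, 1, 0],
    [-3, 0, 1],
    [0, -3, 1],
    [0, 1, 0],
    [1, 1, 1],
]


def combine_rows(coeffs, rows):
    """Return coeffs[0]*rows[0] + coeffs[1]*rows[1] + ... as a new row.

    Hypothetical helper, introduced only for this illustration.
    """
    width = len(rows[0])
    return [sum(c * row[j] for c, row in zip(coeffs, rows)) for j in range(width)]


# Row r of the product is the r-th combination of the rows of S.
result = [combine_rows(coeffs, S) for coeffs in rowcomb]

for row in result:
    print(row)
```

Each printed row is one combined equation in augmented form; the first printed row is `[0, 0, -15, -2, 0, -28]`, the result of $-3\rho_1 + 1\rho_2 + 0\rho_3$.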

Above we performed matrix-matrix multiplication, or matrix multiplication for short. The $r$th row of the resulting matrix is the $r$th combination of equations that was taken. For example, the second equation obtained was

$$(-3\rho_1 + 0\rho_2 + 1\rho_3)\,S:\quad -x_1 - 3x_2 - 13x_3 - 4x_4 - x_5 = -25$$

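The same operation, shown in bold in the matrix multiplication below,

$$
\begin{pmatrix}
-3 & 1 & 0 \\
\mathbf{-3} & \mathbf{0} & \mathbf{1} \\
0 & -3 & 1 \\
0 & 1 & 0 \\
1 & 1 & 1
\end{pmatrix}
\begin{pmatrix}
1 & 3 & 3 & 2 & 1 & 7 \\
3 & 9 & -6 & 4 & 3 & -7 \\
2 & 6 & -4 & 2 & 2 & -4
\end{pmatrix}
=
\begin{pmatrix}
0 & 0 & -15 & -2 & 0 & -28 \\
\mathbf{-1} & \mathbf{-3} & \mathbf{-13} & \mathbf{-4} & \mathbf{-1} & \mathbf{-25} \\
-7 & -21 & 14 & -10 & -7 & 17 \\
3 & 9 & -6 & 4 & 3 & -7 \\
6 & 18 & -7 & 8 & 6 & -4
\end{pmatrix}
$$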

reflects that for the result

$$-x_1 - 3x_2 - 13x_3 - 4x_4 - x_5 = -25$$

the right side is

$$-3 \times 7 + 0 \times (-7) + 1 \times (-4) = -25$$

and the left side is

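$$
\begin{aligned}
&-3\,(x_1 + 3x_2 + 3x_3 + 2x_4 + 1x_5) \\
&\quad + 0\,(3x_1 + 9x_2 - 6x_3 + 4x_4 + 3x_5) \\
&\quad + 1\,(2x_1 + 6x_2 - 4x_3 + 2x_4 + 2x_5) \\
&= \underbrace{(-3 \times 1 + 0 \times 3 + 1 \times 2)}_{=-1}\, x_1
 + \underbrace{(-3 \times 3 + 0 \times 9 + 1 \times 6)}_{=-3}\, x_2
 + \underbrace{(-3 \times 3 + 0 \times (-6) + 1 \times (-4))}_{=-13}\, x_3 \\
&\quad + \underbrace{(-3 \times 2 + 0 \times 4 + 1 \times 2)}_{=-4}\, x_4
 + \underbrace{(-3 \times 1 + 0 \times 3 + 1 \times 2)}_{=-1}\, x_5 \\
&= -x_1 - 3x_2 - 13x_3 - 4x_4 - x_5
\end{aligned}
$$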

In the matrix multiplication

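$$
\begin{pmatrix}
-3 & 1 & 0 \\
-3 & 0 & 1 \\
0 & -3 & 1 \\
0 & 1 & 0 \\
1 & 1 & 1
\end{pmatrix}
\begin{pmatrix}
1 & 3 & 3 & 2 & 1 & 7 \\
3 & 9 & -6 & 4 & 3 & -7 \\
2 & 6 & -4 & 2 & 2 & -4
\end{pmatrix}
=
\begin{pmatrix}
0 & 0 & -15 & -2 & 0 & -28 \\
-1 & -3 & -13 & -4 & -1 & -25 \\
-7 & -21 & 14 & -10 & -7 & 17 \\
3 & 9 & -6 & 4 & 3 & -7 \\
6 & 18 & -7 & 8 & 6 & -4
\end{pmatrix}
$$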

to obtain the entry in the first row sixth (last) column of the matrix on the right hand side of the equation we compute:

$$-28 = -3 \times 7 + 1 \times (-7) + 0 \times (-4)$$

to obtain the entry in the first row third column we compute:

$$-15 = -3 \times 3 + 1 \times (-6) + 0 \times (-4)$$

to obtain the entry in the third row second column we compute:

$$-21 = 0 \times 3 - 3 \times 9 + 1 \times 6$$

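To summarize: when multiplying matrices $AB = C$, the entry in row $r$ and column $c$ of $C$ is the sum of the products of the entries in row $r$ of $A$ with the corresponding entries in column $c$ of $B$.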

In the above matrix multiplication we multiplied a 5×3 matrix by a 3×6 to obtain a 5×6.

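We are ready to define matrix multiplication.

Let $A = \{\alpha_{ij}\}$ be an $m \times k$ matrix and $B = \{\beta_{ij}\}$ be a $k \times n$ matrix. The matrix multiplication of matrix $A$ with matrix $B$, denoted by $AB$, is an $m \times n$ matrix $C$

$$C = \{\zeta_{rc}\}, \quad 1 \le r \le m,\ 1 \le c \le n$$

where

$$\zeta_{rc} = \sum_{i=1}^{k} \alpha_{ri}\,\beta_{ic}$$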
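The definition translates almost literally into code. The following is a minimal sketch, assuming matrices are stored as lists of rows; the function name `matmul` and the sample matrices are chosen here for illustration and do not come from the text.

```python
def matmul(A, B):
    """Multiply an m-by-k matrix A with a k-by-n matrix B (lists of rows)."""
    m, k, n = len(A), len(B), len(B[0])
    if any(len(row) != k for row in A):
        raise ValueError("number of columns of A must equal number of rows of B")
    # Entry (r, c) of the product is the sum over i of A[r][i] * B[i][c].
    return [[sum(A[r][i] * B[i][c] for i in range(k)) for c in range(n)]
            for r in range(m)]


# Illustrative matrices (not from the text): a 2-by-3 times a 3-by-2.
A = [[1, 2, 3],
     [4, 5, 6]]
B = [[1, 0],
     [0, 1],
     [2, 2]]

print(matmul(A, B))  # [[7, 8], [16, 17]]
```

The explicit dimension check mirrors the requirement in the definition that $A$ has as many columns as $B$ has rows; without it, mismatched inputs would either raise an unrelated `IndexError` or silently use only part of a matrix.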

Here is another example, including the corresponding system of linear equations:

Let

$$
\left\{
\begin{aligned}
x_3 + 2x_4 &= 3 \\
x_1 + 3x_2 + 3x_3 + 2x_4 &= 1 \\
2x_1 + 6x_2 + 5x_3 + 2x_4 &= 0
\end{aligned}
\right.
$$

then

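$$
\begin{array}{l}
1\rho_1 + 0\rho_2 + 0\rho_3 \\
1\rho_1 - 2\rho_2 + 1\rho_3
\end{array}
\; S:\ \left\{
\begin{aligned}
0x_1 + 0x_2 + 1x_3 + 2x_4 &= 3 \\
0x_1 + 0x_2 + 0x_3 + 0x_4 &= 1
\end{aligned}
\right.
$$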

or as matrix multiplication

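$$
\begin{pmatrix}
1 & 0 & 0 \\
1 & -2 & 1
\end{pmatrix}
\begin{pmatrix}
0 & 0 & 1 & 2 & 3 \\
1 & 3 & 3 & 2 & 1 \\
2 & 6 & 5 & 2 & 0
\end{pmatrix}
=
\begin{pmatrix}
0 & 0 & 1 & 2 & 3 \\
0 & 0 & 0 & 0 & 1
\end{pmatrix}
$$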

Matrix multiplication is independent of the underlying system of linear equations, but some of the earlier notation we introduced is motivated by matrix multiplication.

Verify

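$$
\begin{pmatrix}
1 & 3 & 3 & 2 & 1 \\
3 & 9 & -6 & 4 & 3 \\
2 & 6 & -4 & 2 & 2
\end{pmatrix}
\begin{pmatrix} 3 \\ 0 \\ 2 \\ -1 \\ 0 \end{pmatrix}
=
\begin{pmatrix} 7 \\ -7 \\ -4 \end{pmatrix}
$$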

Compare the above result with the matrix form

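$$
\begin{pmatrix}
1 & 3 & 3 & 2 & 1 \\
3 & 9 & -6 & 4 & 3 \\
2 & 6 & -4 & 2 & 2
\end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \end{pmatrix}
=
\begin{pmatrix} 7 \\ -7 \\ -4 \end{pmatrix}
$$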

of the system of linear equations

$$
S:\ \left\{
\begin{aligned}
x_1 + 3x_2 + 3x_3 + 2x_4 + x_5 &= 7 \\
3x_1 + 9x_2 - 6x_3 + 4x_4 + 3x_5 &= -7 \\
2x_1 + 6x_2 - 4x_3 + 2x_4 + 2x_5 &= -4
\end{aligned}
\right.
$$
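The comparison can also be checked numerically: multiplying the coefficient matrix by a column of values amounts to substituting those values into the left side of each equation. The sketch below assumes a plain list representation; the variable names `coefficients`, `x` and `b` are chosen here for illustration.

```python
# Coefficient matrix of S, one row per equation.
coefficients = [
    [1, 3, 3, 2, 1],
    [3, 9, -6, 4, 3],
    [2, 6, -4, 2, 2],
]

# Candidate values for x1, ..., x5, taken from the "Verify" product.
x = [3, 0, 2, -1, 0]

# Each entry of the product is one equation's left side evaluated at x.
b = [sum(a * xi for a, xi in zip(row, x)) for row in coefficients]

print(b)  # [7, -7, -4]
```

Since the output matches the right sides of $S$, the column $(3, 0, 2, -1, 0)$ is one solution of the system.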

A word of caution: not all pairs of matrices can be multiplied!

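While the multiplications

$$
\begin{pmatrix} 3 & 5 & -7 & 8 \end{pmatrix}
\begin{pmatrix} 1 & -1 & 2 \end{pmatrix}
$$

and

$$
\begin{pmatrix} 3 & 5 & -7 & 8 \end{pmatrix}
\begin{pmatrix} 1 \\ -1 \\ 2 \end{pmatrix}
$$

cannot be performed, one can verify

$$
\begin{pmatrix} 1 \\ -1 \\ 2 \end{pmatrix}
\begin{pmatrix} 3 & 5 & -7 & 8 \end{pmatrix}
=
\begin{pmatrix}
3 & 5 & -7 & 8 \\
-3 & -5 & 7 & -8 \\
6 & 10 & -14 & 16
\end{pmatrix}
$$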

The above example illustrates that matrix multiplication is not commutative, that is, in general

$$AB \neq BA$$

In fact the existence of AB does not imply the existence of BA. Here is an example where both multiplications exist but are not equal to each other.

Verify

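$$
\begin{pmatrix} 1 \\ -1 \\ 2 \end{pmatrix}
\begin{pmatrix} 3 & 5 & -7 \end{pmatrix}
=
\begin{pmatrix}
3 & 5 & -7 \\
-3 & -5 & 7 \\
6 & 10 & -14
\end{pmatrix}
$$

and

$$
\begin{pmatrix} 3 & 5 & -7 \end{pmatrix}
\begin{pmatrix} 1 \\ -1 \\ 2 \end{pmatrix}
=
\begin{pmatrix} -16 \end{pmatrix}
$$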

Bear in mind that the last multiplication results in a one-by-one matrix, not the number $-16$.
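Both points, that a product may fail to exist and that $AB$ and $BA$ can differ even when both exist, can be illustrated with a short sketch. It assumes matrices stored as lists of rows; `matmul` and `can_multiply` are hypothetical helper names introduced here.

```python
def can_multiply(A, B):
    """True when A has as many columns as B has rows."""
    return len(A[0]) == len(B)


def matmul(A, B):
    """Product of two compatible matrices given as lists of rows."""
    if not can_multiply(A, B):
        raise ValueError("incompatible dimensions")
    return [[sum(A[r][i] * B[i][c] for i in range(len(B)))
             for c in range(len(B[0]))]
            for r in range(len(A))]


col = [[1], [-1], [2]]    # 3-by-1 column
row = [[3, 5, -7]]        # 1-by-3 row
wide = [[3, 5, -7, 8]]    # 1-by-4 row

print(can_multiply(col, wide))  # True: (3-by-1)(1-by-4) exists
print(can_multiply(wide, col))  # False: (1-by-4)(3-by-1) does not

print(matmul(col, row))  # a 3-by-3 matrix
print(matmul(row, col))  # [[-16]], a 1-by-1 matrix, not a bare number
```

Note that the one-by-one result is represented as a nested list `[[-16]]`, which keeps it distinct from the plain number in the same way the text distinguishes the two.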

Let us consider the multiplication of three matrices.

Let

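$$
A = \begin{pmatrix}
1 & -1 & 2 \\
4 & 1 & 5 \\
-3 & -4 & 0 \\
9 & -6 & 7
\end{pmatrix}
\qquad
B = \begin{pmatrix}
3 & 5 & -7 & 4 & 1 \\
0 & 3 & 4 & 0 & -1 \\
5 & 0 & 1 & -1 & 2
\end{pmatrix}
\qquad
C = \begin{pmatrix}
0 & 1 \\
2 & -1 \\
3 & 0 \\
-1 & 1 \\
1 & -2
\end{pmatrix}
$$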

Then

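$$
AB =
\begin{pmatrix}
1 & -1 & 2 \\
4 & 1 & 5 \\
-3 & -4 & 0 \\
9 & -6 & 7
\end{pmatrix}
\begin{pmatrix}
3 & 5 & -7 & 4 & 1 \\
0 & 3 & 4 & 0 & -1 \\
5 & 0 & 1 & -1 & 2
\end{pmatrix}
=
\begin{pmatrix}
13 & 2 & -9 & 2 & 6 \\
37 & 23 & -19 & 11 & 13 \\
-9 & -27 & 5 & -12 & 1 \\
62 & 27 & -80 & 29 & 29
\end{pmatrix}
$$

$$
BC =
\begin{pmatrix}
3 & 5 & -7 & 4 & 1 \\
0 & 3 & 4 & 0 & -1 \\
5 & 0 & 1 & -1 & 2
\end{pmatrix}
\begin{pmatrix}
0 & 1 \\
2 & -1 \\
3 & 0 \\
-1 & 1 \\
1 & -2
\end{pmatrix}
=
\begin{pmatrix}
-14 & 0 \\
17 & -1 \\
6 & 0
\end{pmatrix}
$$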

And furthermore

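$$
(AB)C =
\begin{pmatrix}
13 & 2 & -9 & 2 & 6 \\
37 & 23 & -19 & 11 & 13 \\
-9 & -27 & 5 & -12 & 1 \\
62 & 27 & -80 & 29 & 29
\end{pmatrix}
\begin{pmatrix}
0 & 1 \\
2 & -1 \\
3 & 0 \\
-1 & 1 \\
1 & -2
\end{pmatrix}
=
\begin{pmatrix}
-19 & 1 \\
-9 & -1 \\
-26 & 4 \\
-186 & 6
\end{pmatrix}
$$

$$
A(BC) =
\begin{pmatrix}
1 & -1 & 2 \\
4 & 1 & 5 \\
-3 & -4 & 0 \\
9 & -6 & 7
\end{pmatrix}
\begin{pmatrix}
-14 & 0 \\
17 & -1 \\
6 & 0
\end{pmatrix}
=
\begin{pmatrix}
-19 & 1 \\
-9 & -1 \\
-26 & 4 \\
-186 & 6
\end{pmatrix}
$$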

Matrix multiplication is not a commutative operation, but the above example suggests that it is an associative operation!
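The three-matrix example can be recomputed mechanically. The sketch below is a numerical check of this one example, not a proof; it assumes matrices stored as lists of rows, and `matmul` is a helper name introduced here.

```python
def matmul(A, B):
    """Product of an m-by-k and a k-by-n matrix given as lists of rows."""
    if any(len(row) != len(B) for row in A):
        raise ValueError("incompatible dimensions")
    return [[sum(A[r][i] * B[i][c] for i in range(len(B)))
             for c in range(len(B[0]))]
            for r in range(len(A))]


A = [[1, -1, 2], [4, 1, 5], [-3, -4, 0], [9, -6, 7]]      # 4-by-3
B = [[3, 5, -7, 4, 1], [0, 3, 4, 0, -1], [5, 0, 1, -1, 2]]  # 3-by-5
C = [[0, 1], [2, -1], [3, 0], [-1, 1], [1, -2]]             # 5-by-2

left = matmul(matmul(A, B), C)   # (AB)C
right = matmul(A, matmul(B, C))  # A(BC)

print(left)
print(left == right)  # True for this example
```

The two groupings do different amounts of arithmetic (here $AB$ is $4 \times 5$ while $BC$ is only $3 \times 2$) yet arrive at the same $4 \times 2$ matrix.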

Matrix multiplication is associative:

$$A(BC) = (AB)C$$

Proof:

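Let $A = \{\alpha_{ij}\}$ be an $m \times k$ matrix, $B = \{\beta_{ij}\}$ be a $k \times p$ matrix and $C = \{\zeta_{ij}\}$ be a $p \times n$ matrix. Then

$$D = BC = \{d_{ic}\}$$

is a $k \times n$ matrix where

$$d_{ic} = \sum_{j=1}^{p} \beta_{ij}\,\zeta_{jc}$$

and

$$E = AB = \{e_{rj}\}$$

is an $m \times p$ matrix where

$$e_{rj} = \sum_{i=1}^{k} \alpha_{ri}\,\beta_{ij}$$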

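Consider $X = AD = \{\chi_{rc}\}$. It is an $m \times n$ matrix where

$$
\chi_{rc} = \sum_{i=1}^{k} \alpha_{ri}\, d_{ic}
= \sum_{i=1}^{k} \alpha_{ri} \left( \sum_{j=1}^{p} \beta_{ij}\,\zeta_{jc} \right)
= \sum_{i=1}^{k} \sum_{j=1}^{p} \alpha_{ri}\,\beta_{ij}\,\zeta_{jc}
$$

Consider $Y = EC = \{\gamma_{rc}\}$. It is an $m \times n$ matrix where

$$
\gamma_{rc} = \sum_{j=1}^{p} e_{rj}\,\zeta_{jc}
= \sum_{j=1}^{p} \left( \sum_{i=1}^{k} \alpha_{ri}\,\beta_{ij} \right) \zeta_{jc}
= \sum_{j=1}^{p} \sum_{i=1}^{k} \alpha_{ri}\,\beta_{ij}\,\zeta_{jc}
$$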

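Matrices $X$ and $Y$ have the same number of rows and columns and, since the order of summation in a finite double sum can be exchanged,

$$
\chi_{rc} = \sum_{i=1}^{k} \sum_{j=1}^{p} \alpha_{ri}\,\beta_{ij}\,\zeta_{jc}
= \sum_{j=1}^{p} \sum_{i=1}^{k} \alpha_{ri}\,\beta_{ij}\,\zeta_{jc}
= \gamma_{rc}
$$

hence $X = Y$, which concludes the argument.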