# Matrix multiplication as composition | Essence of linear algebra, chapter 4

It is my experience that proofs involving matrices can be shortened by 50% if one throws matrices out.
(Emil Artin)

Hey everyone! Where we last left off, I showed what linear transformations look like and how to represent them using matrices. This is worth a quick recap, because it’s just really important. But of course, if this feels like more than just a recap, go back and watch the full video.

Technically speaking, linear transformations are functions, with vectors as inputs and vectors as outputs. But I showed last time how we can think about them visually as smooshing around space in such a way that the gridlines stay parallel and evenly spaced, and so that the origin remains fixed.

The key takeaway was that a linear transformation is completely determined by where it takes the basis vectors of the space, which, for two dimensions, means i-hat and j-hat. This is because any other vector can be described as a linear combination of those basis vectors. A vector with coordinates (x, y) is x times i-hat + y times j-hat.

After going through the transformation, this property, that the grid lines remain parallel and evenly spaced, has a wonderful consequence. The place where your vector lands will be x times the transformed version of i-hat + y times the transformed version of j-hat. This means that if you keep a record of the coordinates where i-hat lands and the coordinates where j-hat lands, you can compute that a vector which starts at (x, y) must land on x times the new coordinates of i-hat + y times the new coordinates of j-hat.

The convention is to record the coordinates of where i-hat and j-hat land as the columns of a matrix, and to define this sum of those columns, scaled by x and y, to be matrix-vector multiplication. In this way, a matrix represents a specific linear transformation, and multiplying a matrix by a vector is what it means, computationally, to apply that transformation to that vector.

Alright, recap over. Onto the new stuff.

Oftentimes you find yourself wanting to describe
the effect of applying one transformation and then another. For example, maybe you want to describe what happens when you first rotate the plane 90° counterclockwise, then apply a shear. The overall effect here, from start to finish, is another linear transformation, distinct from the rotation and the shear. This new linear transformation is commonly called the “composition” of the two separate transformations we applied. And like any linear transformation, it can be described with a matrix all of its own, by following i-hat and j-hat.

In this example, the ultimate landing spot for i-hat after both transformations is (1, 1). So let’s make that the first column of the matrix. Likewise, j-hat ultimately ends up at the location (-1, 0), so we make that the second column of the matrix. This new matrix captures the overall effect of applying a rotation then a shear, but as one single action, rather than two successive ones.

Here’s one way to think about that new matrix: if you were to take some vector and pump it through the rotation then the shear, the long way to compute where it ends up is to first multiply it on the left by the rotation matrix, then take whatever you get and multiply that on the left by the shear matrix. This is, numerically speaking, what it means to apply a rotation then a shear to a given vector. But whatever you get should be the same as just applying this new composition matrix that we just found to that same vector, no matter what vector you chose, since this new matrix is supposed to capture the same overall effect as the rotation-then-shear action.

Based on how things are written down here, I think it’s reasonable to call this new matrix
the “product” of the original two matrices. Don’t you? We can think about how to compute that product more generally in just a moment, but it’s way too easy to get lost in the forest of numbers. Always remember that multiplying two matrices like this has the geometric meaning of applying one transformation then another.

One thing that’s kinda weird here is that this reads from right to left: you first apply the transformation represented by the matrix on the right, then you apply the transformation represented by the matrix on the left. This stems from function notation, since we write functions on the left of variables, so every time you compose two functions, you always have to read it right to left. Good news for the Hebrew readers, bad news for the rest of us.

Let’s look at another example. Take the matrix with columns (1, 1) and (-2, 0), whose transformation looks like this, and let’s call it M1. Next, take the matrix with columns (0, 1)
and (2, 0), whose transformation looks like this, and let’s call that guy M2. The total effect of applying M1 then M2 gives us a new transformation, so let’s find its matrix. But this time, let’s see if we can do it without watching the animations and instead just using the numerical entries in each matrix.

First, we need to figure out where i-hat goes. After applying M1, the new coordinates of i-hat, by definition, are given by that first column of M1, namely (1, 1). To see what happens after applying M2, multiply the matrix for M2 by that vector (1, 1). Working it out the way that I described last video, you’ll get the vector (2, 1). This will be the first column of the composition matrix.

Likewise, to follow j-hat, the second column of M1 tells us that it first lands on (-2, 0). Then, when we apply M2 to that vector, you can work out the matrix-vector product to get (0, -2), which becomes the second column of our composition matrix.

Let me talk through that same process again, but
this time, I’ll show variable entries in each matrix, just to show that the same line of reasoning works for any matrices. This is more symbol-heavy and will require some more room, but it should be pretty satisfying for anyone who has previously been taught matrix multiplication the more rote way.

To follow where i-hat goes, start by looking at the first column of the matrix on the right, since this is where i-hat initially lands. Multiplying that column by the matrix on the left is how you can tell where the intermediate version of i-hat ends up after applying the second transformation. So the first column of the composition matrix will always equal the left matrix times the first column of the right matrix.

Likewise, j-hat will always initially land on the second column of the right matrix. So multiplying the left matrix by this second column will give its final location, and hence that’s the second column of the composition matrix.

Notice there are a lot of symbols here, and it’s common to be taught this formula as something to memorize, along with a certain algorithmic process to kind of help remember it. But I really do think that before memorizing that process, you should get in the habit of thinking about what matrix multiplication really represents: applying one transformation after another. Trust me, this will give you a much better conceptual framework that makes the properties of matrix multiplication much easier to understand.

For example, here’s a question: Does it matter what order we put the two matrices
in when we multiply them? Well, let’s think through a simple example like the one from earlier: take a shear which fixes i-hat and smooshes j-hat over to the right, and a 90° rotation. If you first do the shear then rotate, we can see that i-hat ends up at (0, 1) and j-hat ends up at (-1, 1); both are generally pointing close together. If you first rotate then do the shear, i-hat ends up over at (1, 1) and j-hat is off in a different direction at (-1, 0), and they’re pointing, you know, farther apart. The overall effect here is clearly different, so, evidently, order totally does matter.

Notice, by thinking in terms of transformations, that’s the kind of thing that you can do in your head, by visualizing. No matrix multiplication necessary.

I remember when I first took linear algebra, there’s this one homework problem that asked
us to prove that matrix multiplication is associative. This means that if you have three matrices A, B and C, and you multiply them all together, it shouldn’t matter if you first compute A times B then multiply the result by C, or if you first multiply B times C then multiply that result by A on the left. In other words, it doesn’t matter where you put the parentheses.

Now, if you try to work through this numerically, like I did back then, it’s horrible, just horrible, and unenlightening for that matter. But when you think about matrix multiplication as applying one transformation after another, this property is just trivial. Can you see why? What it’s saying is that if you first apply C then B, then A, it’s the same as applying C, then B, then A. Reading right to left, (AB)C means apply C, then the combined transformation “B then A”, while A(BC) means apply the combined transformation “C then B”, then A. I mean, there’s nothing to prove: you’re just applying the same three things one after the other, all in the same order.

This might feel like cheating. But it’s not! This is an honest-to-goodness proof that matrix multiplication is associative, and even better than that, it’s a good explanation for why that property should be true.

I really do encourage you to play around more with this idea: imagining two different transformations, thinking about what happens when you apply one after the other, and then working out the matrix product numerically. Trust me, this is the kind of playtime that really makes the idea sink in.

In the next video I’ll start talking about extending these ideas beyond just two dimensions. See you then!
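
If you’d like to check the numbers from this chapter yourself, here is a minimal Python sketch. It is not from the video: the helper names `apply` and `compose`, and the choice to store a 2×2 matrix as a list of its two columns, are assumptions made purely for illustration. The rotation and shear matrices are inferred from the landing spots of i-hat and j-hat quoted above.

```python
# A minimal sketch of the chapter's examples, using plain Python tuples.
# Assumption (not from the video): a 2x2 matrix is stored as a list of its
# two columns, so M[0] is where i-hat lands and M[1] is where j-hat lands.


def apply(matrix, vector):
    """Matrix-vector product: x * (first column) + y * (second column)."""
    (a, c), (b, d) = matrix  # columns are (a, c) and (b, d)
    x, y = vector
    return (x * a + y * b, x * c + y * d)


def compose(left, right):
    """Matrix for 'apply right first, then left' (read right to left).

    Each column of the result is the left matrix applied to the
    corresponding column of the right matrix.
    """
    return [apply(left, right[0]), apply(left, right[1])]


rotation = [(0, 1), (-1, 0)]  # 90 degrees counterclockwise
shear = [(1, 0), (1, 1)]      # fixes i-hat, sends j-hat to (1, 1)

# Order matters: the two compositions give different matrices.
rotate_then_shear = compose(shear, rotation)
shear_then_rotate = compose(rotation, shear)
print(rotate_then_shear)  # [(1, 1), (-1, 0)]
print(shear_then_rotate)  # [(0, 1), (-1, 1)]

# The second example: apply M1 first, then M2.
M1 = [(1, 1), (-2, 0)]
M2 = [(0, 1), (2, 0)]
M1_then_M2 = compose(M2, M1)
print(M1_then_M2)  # [(2, 1), (0, -2)]

# Applying the composition to a vector matches doing the two steps in turn.
v = (3, 2)
print(apply(rotate_then_shear, v) == apply(shear, apply(rotation, v)))  # True

# Associativity, checked numerically for one choice of three matrices.
left_grouping = compose(compose(shear, rotation), M1)   # (AB)C
right_grouping = compose(shear, compose(rotation, M1))  # A(BC)
print(left_grouping == right_grouping)  # True
```

Storing columns rather than rows keeps the code close to the “follow i-hat and j-hat” picture: `compose` never mentions individual entries, only where each basis vector ends up. The associativity check at the end covers just one example, so it illustrates the property rather than proving it.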

### 100 thoughts on “Matrix multiplication as composition | Essence of linear algebra, chapter 4”

• August 11, 2019 at 2:14 am

I clicked on one video and now I'm not able to stop!! Now if you ask me what is Mathematics and what is Music, the answer would be the same!!!

• August 11, 2019 at 9:19 am

Amazing!!!! You are simply a genius!! Oh my god!!

• August 11, 2019 at 6:35 pm

Why does the shear affect j-hat the first time but i-hat the second time?? 7:36
If the shear affects i-hat or j-hat both times, then there is no difference between M1*M2 and M2*M1!!

• August 13, 2019 at 4:20 am

How do you do these mind-blowing explanations, sir?

• August 14, 2019 at 2:28 pm

I don't understand how you get the shear matrix (3:40), could somebody please help? 😀

• August 14, 2019 at 6:25 pm

Doesn’t (AB)C imply that he’s transforming B then A, then that is transformed by C? Does the order of operations not apply? Because otherwise you would have to transform B then A prior to C.

• August 17, 2019 at 6:48 am

I can fkin jerk off on matrices

• August 17, 2019 at 8:00 pm

This is a fantastic complement to more math proof oriented books, it's important to grasp both approaches! Thanks a lot.

• August 19, 2019 at 4:27 am

Ok! Thanks!

• August 19, 2019 at 7:06 am

This is the only channel where I disable my adblocker.

• August 19, 2019 at 5:07 pm

Is there an affine matrix video (linear + translation vector column)?

• August 20, 2019 at 10:28 pm

Best ever

• August 23, 2019 at 9:11 am

Maaaan you're awesome

• August 26, 2019 at 5:50 pm

The explanation at Khan Academy is great but the graphics here are waaaaaaaaaaayyy better.

• August 29, 2019 at 12:12 pm

catholic christians when the hebrew version of the bible was published: 4:26

• August 29, 2019 at 1:51 pm

This was superb! Thank you

• August 29, 2019 at 3:54 pm

I wish he would take the time to go over the last proof again. I think he totally missed the boat on a fundamental point – that because matrices are not just transformations, but LINEAR transformations, they have associativity. In other words, because multiplication is associative, and because matrices transform through multiplication, they are thus associative.

Instead, this video makes it seem like the whole concept of order of operations is totally trivial and you can just think of operations willy nilly.

If he wanted to stick to visual interpretations, he could have repeated his earlier demonstration showing how the bases i and j are transformed successively- but it would not prove anything formally.

• August 30, 2019 at 1:24 am

I'm thinking so many issues I once had with matrices are from them being organized top to bottom rather than left to right

• August 30, 2019 at 11:13 am

Truly inspiring … an absolutely amazing series … keep going! I have two masters degrees and I feel like I'm actually beginning to understand & really learn when I watch these videos.

• August 31, 2019 at 7:15 am

(Letting out sounds of amazement)

• September 3, 2019 at 5:46 am

Matrix Multiplication will never be same again!! These visualisations will keep flashing. Kudos!!

• September 5, 2019 at 2:29 am

You are. A gddamn hero.

• September 5, 2019 at 1:02 pm

Reminds me of Gilbert Strang's lectures on the same topic

• September 5, 2019 at 2:02 pm

Savvateev's lecture course on group theory: associativity is plain to see there.

• September 5, 2019 at 3:58 pm

5:21 5:45 would have been better if rotated one axis at time

• September 5, 2019 at 4:30 pm

I think after I finish this and go to understand what Calculus is really about in your other videos, I will be dangerous.

• September 6, 2019 at 12:17 am

Why are you not the king of mathland yet?

• September 6, 2019 at 7:35 am

Towards the end of this video you demonstrated the associativity of linear transformations (aka matrix multiplication). I was a little confused until I figured out that what you were saying was that the semantics of the expression resulted in the same sequence of transformations regardless of how the parentheses were placed. That's all well and good, but I was curious whether the associativity held if the actual sequence of operations was altered in accordance with the parentheses. So I did as you advised and played around with it a little bit. I just used the matrix multiplication algorithm mechanically, without regard to the transformation concept, and did it in both possible sequences to see if I got the same result. It was, as you alluded to from your earlier school experience, messy. But I am pleased to report that I did. Associativity holds regardless of the sequence of operations. Relying on that semantic work-around is not necessary!

If you're curious what it looks like, I uploaded a rendering of the worksheet showing this result to this link:
https://www.dropbox.com/s/1zt118iz7kk670s/Matrix%20Multiplication%20Demo.PNG?dl=0

• September 9, 2019 at 6:06 pm

After finishing 4 years of engineering and now i wish i had studied on youtube rather than go to the shitty coll

• September 10, 2019 at 5:27 pm

WOW! I always wondered why the heck we multiply the matrices as we do, just what is all this! Yet here you are, clearing another query of mine. Thank you so much! 😃

• September 10, 2019 at 9:29 pm

you always make me see things in math so much easier, thanks for that!!!!!!!!

• September 11, 2019 at 1:02 pm

I think there is a mistake in explanation what the associative property is: 1) for A(BC) you said: "first apply C, then B, then A"; 2) for (AB)C you said: "apply C, then B, then A", but probably you should say something like this: "Apply C, then multiply B times A as T and then apply T".

• September 11, 2019 at 7:06 pm

Great video. Difficult topic, especially the end. One thing I found helpful is remembering that unlike regular multiplication, the order in which matrices are multiplied matters. This is why the rotation after the shear and the shear after the rotation have different results.

Part of the reason for the confusion was that the associative property was discussed right after and seemed to contradict what was said in terms of the order of transformation mattering. I think it might have been easier to explain the property if instead you said the property holds because the result of taking your initial vector, rotating it and then shearing it has the same effect as determining the matrix that results from rotating and shearing the base vectors and then multiplying your initial vector by that matrix.

• September 14, 2019 at 8:03 pm

I'm French…. Too, another. Your video is really: incredible, extraordinary, great, wonderful. Really, thank you

• September 15, 2019 at 3:38 am

awesome！

• September 16, 2019 at 2:13 pm

BEAUTIFUL!!!!!!

• September 16, 2019 at 7:51 pm

I started learning linear algebra using this series. And I wonder what is going to happen when I see traditional ways in college. Thanks for this great series and videos!

• September 17, 2019 at 1:22 am

The two transformations can be any, right? So once the first transformation has moved the basis vectors to new positions, when we apply another transformation, is it according to the previous basis vectors or the new ones so formed? Help me, as I am trying to plot the multiplication in my notebook rather than multiplying it out. I want to reach the answer by a graphical approach.

• September 22, 2019 at 5:20 am

I used to wonder why mathematicians like to play number games with matrices

• September 22, 2019 at 8:41 pm

I wish YouTube had been like it is nowadays 10 years ago, when I scratched my head over and over again but couldn't figure out the intuition of linear algebra

• September 24, 2019 at 5:46 pm

Matrix multiplication associativity proof [09:13] by Seddiqin argument!

• September 25, 2019 at 10:35 am

At 5:08 why is i hat the same as first column of m1?

• September 25, 2019 at 5:58 pm

This series should be on Netflix , after watching all those dumb teenager series people can get a little smart too.

• September 25, 2019 at 7:13 pm

I decided to do my basics on linear algebra with your wonderful videos, so where can I get worksheets??

• September 27, 2019 at 4:05 pm

Thanks a lot for making matrices so fun !!!!

• September 28, 2019 at 6:57 pm

You are the best! felt like i slept through secondary school and college

• September 30, 2019 at 5:06 am

I tried to prove whether it's associative or not and my conclusion was that it wasn't based on this finding:
I decided to frame this in terms of directions: North, South, and West.
If you go 1 unit N then S, then W it is the same as if you went S, N, W.
HOWEVER, that is only because I switched the grouping of two directions.
If you go W, N, S; switching all three, you actually get a different result. And for this reason I thought that matrix multiplication was non associative. I don't know where I went wrong.

• September 30, 2019 at 10:19 am

For the first time I understood matrix multiplication…n idk why Im crying. Thank you very much!!!

• September 30, 2019 at 9:49 pm

Triple like

• October 2, 2019 at 2:33 am

These are fantastic videos! My only criticism is the axis colours. x=red, y=green, z=blue is the standard everywhere I've ever seen except here, where x=green and y=red. It means I constantly have to remember to invert the standard while processing the video

• October 3, 2019 at 7:50 pm

You are awesome!!!!!!!!!!….you saved me

• October 4, 2019 at 9:24 am

8:10 was a complete revelation!

• October 6, 2019 at 5:47 pm

Every exact-science student needs to see this series. I can honestly say that my linear algebra teacher was brilliant, with extensive understanding of the topic, but was limited in transferring that knowledge by an inability to show what lin-alg looks like millisecond after millisecond of transformations.
Well-done!

• October 7, 2019 at 9:33 am

I have tried this method
it makes me feel marvelous
now I don't need to recite the old way of dealing with 2×2

• October 7, 2019 at 1:56 pm

I finished my MSc in 2015…3blue1brown answers the questions that kept me up at night..

• October 7, 2019 at 7:24 pm

well , this applies to Arabic readers too

• October 9, 2019 at 9:51 pm

Could someone explain to me why the pink "shear matrix" is
1 1
0 1
I tried to reason it out in my head but I either got
1 -1
1 0
Or
1 0
-1 1

• October 11, 2019 at 5:53 am

Yes, YouTube serves me a lot of ads; after all, it's for the great tutor.

• October 13, 2019 at 4:22 am

Why is matrix multiplication associative but you have to read it right to left

• October 13, 2019 at 11:43 am

Those 10 mins felt like just 1 min.

• October 14, 2019 at 2:40 am

Why don't they explain it this way at university? Hahaha

• October 14, 2019 at 7:02 pm

Oh. My. God.
I always had problems remembering the matrices for transformations, but your explanations with i and j just blew my mind with its simplicity. Thanks man!

• October 15, 2019 at 4:23 pm

What a great intuition for matrix operation that sadly disappears seconds after you watch then you have to rememorise everything again because matrix notations make no intuitive sense.

• October 16, 2019 at 11:08 am

I really wonder why on Earth would any human being dislike any video of this playlist?

• October 16, 2019 at 2:28 pm

It cured my depression

• October 18, 2019 at 3:16 pm

Literally the GOD among teachers of linear algebra!!! AWESOME

• October 19, 2019 at 5:31 pm

This is such a good video whenever I watch it, how many ever times I watch it!

• October 20, 2019 at 2:40 pm

Gotta love the way you visually demonstrated how multiplying matrices is not commutative.

• October 22, 2019 at 9:43 am

INCREDIBLE

• October 22, 2019 at 11:32 pm

Hi @3Blue1Brown, which software do you use for your animations?

• October 26, 2019 at 2:09 pm

I want to like this video so many times

• October 26, 2019 at 6:54 pm

Question: why is matrix multiplication called multiplication? Why not matrix composition?

• October 29, 2019 at 5:02 pm

Learning matrix in high school was like learning how to construct a sentence but never know it was for communication

• October 29, 2019 at 5:20 pm

I really love u !

• November 1, 2019 at 8:27 pm

me: Already getting somewhat confuzzled with 2D.
3Blue1Brown: "Next video 3D!"
me: flips table
also me: Hm, I wonder if I could mathematically map how the table flipped.
brain: "You're in a 3D space so you'll need to watch the nex.."
me again: FUUUUUUUUUUUUUUUUUUUUUUUUUUUUUU

• November 1, 2019 at 10:26 pm

In 4:53, how does it make sense the way the vectors rotated and where they landed after being transformed by M2?

• November 6, 2019 at 7:02 am

thank you, this makes way more sense than what I was taught. The way they show it in school doesn't even represent what you're actually doing

• November 6, 2019 at 10:07 pm

The coolest way to explain matrices and linear algebra. These videos gave a different way to look at matrices.

• November 7, 2019 at 7:57 am

This video August 2016
I watch this November 2019
Okay I feel guilty now

• November 8, 2019 at 4:26 pm

9:05 I get what you are saying here, but it seems more like you have to prove that it should be "Prove that you apply C to B, then apply whatever the output matrix of that is to A, is the same as applying B to A, and then apply C to that matrix." The way you're phrasing it seems to ignore the parentheses completely.
P.S. This is a great series! So great, that I am able to make the above comment! I think this is my 3rd time watching it…

• November 10, 2019 at 6:16 pm

I keep coming back every time I feel like I started thinking about matrices too much numerically and lost the intuitive connection. You're a godsend.

• November 11, 2019 at 2:38 am

I hope one day every teacher in the world teaches like you

• November 12, 2019 at 2:06 am

That took me a long time to learn, but I got there

• November 12, 2019 at 3:18 pm

What do non-square matrices mean, then?

• November 12, 2019 at 10:30 pm

he was so enthusiastic in the explanation in the end.. but I still feel like I'm ages from visualizing all this "intuitively"

• November 14, 2019 at 9:52 pm

"I mean, there's nothing to prove"! Where were you when I was in college struggling with the impenetrable way they taught linear algebra!! This is fantastic. I finally understand it!!!

• November 15, 2019 at 7:10 am

Sir, what is meant by taking transpose of a matrix?? An intuitive explanation would be highly appreciable!! Thank you Sir!!

• November 15, 2019 at 1:56 pm

I actually think that this entire playlist is a remarkable piece of art.
You are clearing my foggy understanding of Linear Algebra! Thanks a lot <3

• November 18, 2019 at 12:58 pm

What a shame that on searching for matrix multiplication this video is not among the first to show up

• November 19, 2019 at 6:25 pm

In (AB)C shouldn’t we first do AB and then multiply C by the result of A*B. This got me confused.

• November 22, 2019 at 5:32 pm

This… Is absolutely outstanding. It's the best approach of linear algebra I have ever seen.

• November 23, 2019 at 10:36 pm

I see a lot of comments about understanding the basis of matrices. So Is this the basis of matrices or is it one way to visualize and have an image for those things we have memorized?

• November 26, 2019 at 6:28 am

oh khan you are awesome. THis is khan

• November 28, 2019 at 7:34 am

Damn! That visual proof! I hate schools more now.

• November 28, 2019 at 10:16 am

please have a video on singular value decomposition and eigen value decomposition!!!!!!!!!!!!!!!!!!!! thnqq

• November 30, 2019 at 6:55 am

I was a curious student back then, but the way they taught me made me sleep in class. I always wondered why the hell I needed to just memorize this shit. If this was the way they had taught us, everything would have been different now. 😑 Thanks a lot ❤