/ factorpad.com / fin / quant-101 / array-math.html
An ad-free and cookie-free website.
Beginner
Videos are available at one organized Quant 101 Playlist on YouTube.
The operations of array math and matrix multiplication (18:51)
Welcome. Today's goal is to learn the primary operations of Linear Algebra so we can manipulate huge data sets in all fields involving data analysis, or to simply pass a test.
I'm Paul, and if you are confused by Linear Algebra then you are not alone. Just finding a quality resource for the basics is a challenge.
So here we will solve the problem and use scalars, vectors and matrices to cover the basic mathematical operations. Our playground here is Excel and we will see tips so you can confidently manipulate arrays in a spreadsheet.
If you'd like to read the video transcript and see the tables the first link in the video Description goes straight to a web page if you want to slow it down.
Here's our plan for the day.
First, we cover the conventions with Linear Algebra beginning with why we even use it in the first place.
Second, we cover the types of arrays.
Third, we walk through five exercises covering the must-know operations.
Fourth, we cover tips and tricks for simplifying array math in Excel.
And in our next episode we will return to our financial data set and get practice with these operations.
For Step 1, let's start with some conventions applied to Linear Algebra.
Before we start with that, what is the purpose of using Linear Algebra anyway? And how is it practical in the real world?
Linear algebra is used widely in computer systems to make calculations on large data sets. This allows you to write simpler formulas with fewer intermediate steps which first makes them faster and second saves computer disk space.
The third benefit is improved accuracy because we keep formulas intact and operate on rows and columns of data instead of each individual calculation, like individual cells in a spreadsheet.
Here in Excel we will be able to see this. Many people when creating spreadsheets write a unique formula in each cell. When using Linear Algebra we can treat columns and rows of data as units.
Also, on the point about speed, we will see how to make multiple calculations in one cell instead of taking intermediate steps.
Let's start with the term 'array' which refers to an arrangement of data that can take many shapes.
Keep in mind, depending on context the term 'array' may be used differently. Two areas that come to mind are in computer programming and in the Linear Algebra branch of Mathematics.
Here we will use 'array' as a catch-all for all types of data structures, scalars, vectors and matrices, which we will define in a moment. So all of these structures are arrays.
These arrays are often simply referred to by their dimensions because as we will see shortly the dimensions play an important part in the operations.
What is consistent is the row-by-column convention when specifying the dimensions of an array. So this is consistent across both computer programming and in Mathematics.
While in spreadsheets like Excel, our tool here, people often refer to structures of data in cells or ranges of cells, in computer programming and Linear Algebra, the dimensions are always named consistently using this row-by-column convention.
A structure is always given with the row first, then column. Yes, row-by-column.
I memorized it by thinking that the order isn't alphabetical, meaning r comes before c. So row first, colunn second.
For Step 2 let's define the names of arrays and describe their dimensions.
First we have a scalar. A scalar is used to scale, which you can use to multiply a whole row or column of data.
| 1 |
It always has dimensions of 1-by-1, one row and one column.
Next, a vector is said to be one-dimensional and it can be a row vector or column vector.
A row vector is a one-dimensional row of data, the example here has 1-by-3 dimensions, or 1 row by 3 columns.
| 1 2 3 |
Next, we have an example of a column vector with 3 rows and 1 column.
| 4 |
| 5 |
| 6 |
A matrix is considered a multi-dimensional array, so it will have at least two rows and columns. Here we have an example of a 2-by-3 matrix.
| 1 3 5 |
| 2 4 6 |
Also a 3-by-2 matrix.
| 1 2 |
| 3 4 |
| 5 6 |
There you have it, the three basic structures.
Okay, let's move on to Step 3. As mentioned our playground here is Excel and these examples we use can mostly be done in your head. That said, many people learn Linear Algebra on a pad and paper so if it helps to write it down please do so.
When we ask Excel to perform an array calculation we must submit it
using the Ctrl+Shift+Enter
keystroke
combination. Oftentimes many calculations are performed made at once,
so we need to tell Excel to employ array math using this keystroke
combination that is at times abbreviated as a CSE formula.
Ok, now we're ready to walk through the five operations with exercises using the required rules from Linear Algebra.
At times we need to reshape our data so it follows the rules. When we reshape it we call it transposing. You can think of it as turning the data on its side.
For example, you can change a matrix named A, with the format of 2-by-3, two rows and three columns, to what is called the Transpose of A with the dimensions of 3-by-2.
To do this manually we would put the first row down the first column, so it would read 1, 3 and 5 in the first column and 2, 4 and 6 in the second.
E | F | G | H | I | J | |
---|---|---|---|---|---|---|
5 | A: 2x3 matrix | Transpose A: 3x2 matrix | ||||
6 | 1 | 3 | 5 | 1 | 2 | |
7 | 2 | 4 | 6 | = | 3 | 4 |
8 | 5 | 6 |
Here to do the same thing in Excel, we highlight the whole output range
and use =TRANSPOSE(E6:G7)
followed by
the Ctrl+Shift+Enter
keystroke
combination and we get the same answer.
Transposing simply takes the rows and makes a columns out of them.
Visualizing this with real data may help. In our Quant 101 Playlist we use data located on a tab called Returns structured with dates down the rows with a corresponding return for the individual stocks stored across columns.
Let's use a 3-by-3 matrix from the top three rows and columns from this table and transpose them.
B | C | D | |
---|---|---|---|
6 | Date | MSFT | EBAY |
7 | 4/30/03 | 0.056175 | 0.080987 |
8 | 5/30/03 | -0.037544 | 0.094070 |
We select the whole output block where we want the answer and type the
formula =TRANSPOSE(Returns!B6:D8)
followed by the CSE keystroke combination and now we have stocks down
the rows and dates across the columns.
M | N | O | |
---|---|---|---|
5 | Transpose Returns: 3x3 matrix | ||
6 | Date | 4/30/03 | 5/30/03 |
7 | MSFT | 0.056175 | -0.03754 |
8 | EBAY | 0.089087 | 0.09407 |
One last note, formally in a text book, you may see an array name
which looks like it is raised to the power of T, meaning it has been
transposed. So we may use shorthand
A^T
for matrix A transposed.
The main purpose of transposing is that it helps to align data so the mathematical operations work. On that, let's jump to the operations.
For addition and subtraction the rules of Linear Algebra require that dimensions of the two arrays must be the same. The resulting array will have the same dimensions.
Let's walk through an example, let's take a 3-by-1 vector B and add it to a 3-by-1 vector C.
Here to do it manually we simply add the first row of vector B to the first row of vector C and place it in the answer vector D. So 1 plus 4 is 5, 2 plus 5 is 7 and 3 plus 6 is 9.
E | F | G | H | I | |
---|---|---|---|---|---|
11 | B: 3x1 vector | C: 3x1 vector | D: 3x1 vector | ||
12 | 1 | 4 | 5 | ||
13 | 2 | + | 5 | = | 7 |
14 | 3 | 6 | 9 |
Let's do it in Excel now with an array formula. First, we select the
whole answer block and enter
=E12:E14+G12:G14
followed by CSE and
our answer matches. You could put parentheses around the ranges to
visualize them better, but that isn't required.
The rules for subtraction are the same and to verify this in Excel
let's subtract C from B with
=E12:E14-G12:G14
in a
3-by-1 vector E.
E | F | G | M | N | |
---|---|---|---|---|---|
11 | B: 3x1 vector | C: 3x1 vector | E: 3x1 vector | ||
12 | 1 | 4 | -3 | ||
13 | 2 | - | 5 | = | -3 |
14 | 3 | 6 | -3 |
That's it for addition and subtraction, pretty straightforward. As mentioned, under the strict rules of Linear Algebra the arrays must have the same dimensions and the resulting array will have the same dimensions.
Just as with addition and subtraction, multiplication and division share a set of rules, but the rules are different.
Multiplication is the operation mentioned that helps you to perform many calculations at once, often to produce a 1-by-1 array result as we have here in cell K19.
In our first exercise let's multiply a 1-by-3 vector F times a 3-by-1 vector G and place the result in a single-cell output range.
E | F | G | H | I | J | K | |
---|---|---|---|---|---|---|---|
17 | F: 1x3 vector | G: 3x1 vector | H: 1x1 array | ||||
18 | 1 | 3 | 5 | * | 4 | ||
19 | 5 | = | 49 | ||||
20 | 6 |
Here we have three requirements. First, you need to line up the arrays properly. For example, here, a 1-by-3 vector multiplied by a 3-by-1 vector is aligned properly. This means that the inner dimensions must be the same, so 3 matches 3.
You couldn't multiply a 3-by-1 vector by a 3-by-1 vector, like vectors B and C for example because their inner dimensions don't match.
Second, when the arrays are aligned properly, the outer numbers give the dimensions of the resulting array, here 1-by-1.
And third, multiplying arrays takes the sum of the products. To visualize this think about the row, so 1, 3 and 5, times the whole column 4, 5 and 6. Take each of the multiplications and then sum them up.
To do this manually you would take 1 times 4, which is 4, then 3 times 5 which is 15 and finally 5 times 6 which is 30 and add these products together. So cell K19 is the sum of 4 plus 15 plus 30, or 49.
Now to do the same thing in Excel we use a specialized
=MMULT()
function.
So we should be able to get the same answer with
=MMULT(E18:G18,I18:I20)
multiplying
the first array F times the second G, with a comma in between the two
arrays.
Let's get more practice lining up the dimensions, this time multiplying a 1-by-1 scalar times a 1-by-3 vector. The inner dimensions match at 1 and the resulting answer is a 1-by-3 vector, named K.
E | F | G | H | I | J | K | L | M | |
---|---|---|---|---|---|---|---|---|---|
22 | I: 1x1 scalar | J: 1x3 vector | K: 1x3 vector | ||||||
23 | 4 | * | 1 | 3 | 5 | = | 4 | 12 | 20 |
Here there is no summation involved but we still need to highlight the
whole output array, enter
=MMULT(E23,G23:I23)
followed by the
CSE keystroke combination and we get a vector with 4, 12 and 20.
Her we can see how the scalar of 4 scaled each number in the J vector to produce the answers in vector K.
The rules for matrix division are the same as for multiplication however there is no function for matrix division so we will return to division after finding the inverse in Exercise 5.
Next let's tackle what is called the identity matrix.
Of course you remember from Algebra that we frequently multiply both sides of an equation by one. The identity matrix basically is a square matrix that equals one.
It has ones down the diagonal and zeros in the other cells and must be structured like this to produce the same matrix after multiplication.
E | F | G | H | I | J | K | L | |
---|---|---|---|---|---|---|---|---|
26 | L: 2x2 matrix | M: 2x2 matrix | N: 2x2 matrix | |||||
27 | 1 | 2 | 1 | 0 | 1 | 3 | ||
28 | 2 | 4 | * | 0 | 1 | = | 2 | 4 |
To produce the resulting 2-by-2 matrix N let's highlight the output
block and use =MMULT(E27:F28,H27:I28)
.
Notice how when you multiple matrix L with the identity matrix the result N looks just like L. That's what we wanted, we just multiplied a matrix by 1.
It might be good to pause here and walk through the math manually to solidify your understanding of matrix multiplication and the four different summation exercises going on in this one calculation.
Maybe take out a piece of paper, multiply the matrices together and see if you get the same answer. Let me describe the first cell, it is 1 times 1 equals 1, plus, 3 times 0 equals 0, and the total is 1, in the top-left.
As mentioned earlier with division, at times we need to invert an array.
If you recall from basic Algebra, the inverse is the same as 1-over a number, right? So the inverse of 2 is 1 over 2, or one-half.
In Linear Algebra, is a little trickier and we don't have time for it here. It involves taking 1 over the determinant times the adjugate. Look up those terms if you want to walk through the formula.
Here instead we will ask Excel to calculate it for us with the
specialized =MINVERSE()
function.
In Exercise 5 we will calculate the inverse of matrix O and name it P and when we multiply the two togeter we should get 1, or the identity matrix in Q, right?
E | F | G | H | I | J | K | L | |
---|---|---|---|---|---|---|---|---|
31 | O: 2x2 matrix | P: 2x2 matrix | Q: 2x2 matrix | |||||
32 | 1 | 3 | -2.0 | 1.5 | 1 | 0 | ||
33 | 2 | 4 | * | 1.0 | -0.5 | = | 0 | 1 |
Let's start with matrix P by highlighting the output range and then
typing =MINVERSE(E32:F33)
followed by
the CSE keystrokes and this produces the Inverse.
To test it let's create the Q matrix with
=MMULT(E32:F33,H32:I33)
and CSE to
get the identity matrix, or 1.
This is akin to our basic Algebra example of multiplying 2 times one-half which gives you 1.
Now, just like for the basic matrix multiplication example earlier, if you multiplied 1 times -2, you'd get -2, right, and add that to 3 times 1, or 3, and you get 1. That's the top left cell and I suggest finishing the calculation with a pad and paper on your own time.
This will solidify the concept of matrix multiplication showing you how it aggregates many calculations together saving us from having to create intermediate ranges to hold our data. This is one of the main reasons we noted for using arrays in the first place.
Don't forget to use =MMULT()
here
instead of the multiplication symbol. With addition and subtraction we
don't need a special function but for multiplication and division we do.
Speaking of remembering, now that we understand the inverse, let's head back and finish off the division from earlier in Excercise 3b.
What we will do here is take the inverse of the scalar in I and multiply that by the vector K and that should produce the exact same vector as we have in J, right?
M | N | O | |
---|---|---|---|
17 | R: 1x3 vector | ||
18 | 1 | 3 | 5 |
Let's select the output block of M18:O18 and then
input =MMULT(MINVERSE(E23),K23:M23)
followed by that CSE keystroke combination and voila, vector R matches
matches vector J from earlier.
This is a little tricky but just think of the inverse operation as the
first argument of the matrix multiplication. So the inverse needs to be
wrapped inside the =MMULT()
function
and separated by a comma.
Okay we went over a lot there and Excel helped us out tremendously. Now for Step 4 we need to back up a minute and cover tips and tricks when working with arrays in Excel.
The first trick is using named ranges. We went over this in the last tutorial and basically they allow you to select ranges once and give them a name in the Name Box. That way you can refer to them by name instead of using the cell ranges. This helps to minimize errors and saves time when you use the same ranges frequently.
Naming ranges not only helps when working with arrays, but will also generally advance of your spreadsheet skills.
Head back to the Linear Algebra Introduction tutorial for the details.
There we also touched on how array formulas we build on our own, like
with addition and subtraction here differ from built-in array functions
like =MINVERSE()
,
=MMULT()
and
=TRANSPOSE()
.
Technically, the rules of Linear Algebra Algebra for addition and subtraction require that the starting and ending arrays be the same size.
This rule can be broken using array formulas we build in Excel as we saw in that tutorial. As an example on your end try to subtract the array I from G and see if you get a 3-by-1 column vector containing 0, 1 and 2.
Also, I wanted to return to the point made earlier about how Excel identifies array calculations for you so you can keep your data intact and treat your arrays as units.
For this, Excel offer three clues. First, Excel puts curly braces around all array formulas.
Second, if you try to delete or edit part of an array, Excel sends a warning.
Third, every cell in the whole array is given the same formula.
All of these help to keep the integrity of your data and arrays intact so you know there are no cells in the middle that have been edited.
There you have it.
By way of summary, we learned the required procedures of Linear Algebra including conventions, array types and operations such as finding the transpose, identity and inverse arrays. We also saw distinctions between addition and subtraction versus multiplication and division.
Walking through it in Excel helps us nail these down so we don't have to stare blankly at formula notation in text books. We also used array math practically in a spreadsheet with tips and tricks to speed things up.
Many people fear Linear Algebra and array math, but I think that is a mistake. This wasn't a complete review like you might find in a college course, but with what you saw here, you can make make most calculations involving big data and this offers a nice bridge to statistical programming languages. The sky is the limit after that.
In the next episode we will put our learnings here to good use as we close out Chapter 5 by calculating portfolio return and portfolio variance. Matrix multiplication is all we need to simplify a calculation that looks very confusing in formula notation but is simply a matrix times two vectors. Piece of cake for us now.
If this is your first time here, we have hundreds of free tutorials on Finance and Technology made available for you without signing-up.
Please feel free to join us any time and have a nice day.
This tutorial in particular is helpful to watch in video as the Excel examples there are more clear.
Please join our growing YouTube Channel audience.
/ factorpad.com / fin / quant-101 / array-math.html
A newly-updated free resource. Connect and refer a friend today.