Build a Better Process

Install Excel Data Analysis ToolPak and Excel Solver for Quant 101

A step-by-step guide to prepare Excel to analyze stock portfolios.
2. Versions of Excel - Review Excel setup and discuss appropriate versions.
5. Why not Programming? - Touch on what happens beyond the spreadsheet.
6. Next: Return calculations - Learn the three methods for calculating return.
Updated: February 19, 2021
System setup is often a stumbling block, so press on for answers.

/ factorpad.com / fin / quant-101 / system-setup.html

Learn to get your Excel system set up for data analysis

Beginner

Video

Videos are available at one organized Quant 101 Playlist on YouTube.

Install Excel Data Analysis ToolPak and Solver Add-In (17:27)

Video Script

Welcome. Today's goal is to set up our Excel system and cover other helpful topics related to using a spreadsheet for data analysis.

I'm Paul, and I get frustrated by how long it takes to get systems ready before you can get to the fun part, analyzing data.

So here we will walk through the considerations for different operating systems and Excel versions, going step-by-step, to save you time.

If this is your first time with us, this tutorial sits within a series of 30 financial modeling tutorials, called Quant 101. The way it works is I ask you to duplicate all of these models on your system as a way to learn fairly advanced Excel skills and to learn about modeling stocks. So if you are sticking around for that, I will also cover how to download and import a free data set used throughout the series.

For those watching the video on YouTube, if you want to slow all of this down and see the instructions in print, everything I say here is duplicated on a web page. The first link in the video's Description section goes straight to it.

Overview

Second, we review my setup and cover appropriate versions of our spreadsheet of choice, Excel, for Windows and Mac users.

Third, we will install the Data Analysis ToolPak for statistical analysis and the Solver Add-in for optimization.

Fourth, we cover the one dataset we use throughout the playlist and how to freely download it so you can get started right away.

And in our next episode we will review the three methods for calculating returns on stocks.

Step 1 - Why a Spreadsheet?

Before we get started with the technical part, let's take a moment to cover the advantages and disadvantages of spreadsheets which will help us later in our discussion of comparisons with the programming world.

1. Pervasive - They are cheap or free and many people in Finance have already memorized many of the functions.
2. Easy to learn - The grid and cell format, with columns named with letters, and rows named with numbers, make it easy to draw relationships and make quick calculations, even on modestly large data sets, without having to learn programming.
3. Easy to visualize - The ability to visualize and customize tables, charts, and fonts make spreadsheets a viable tool for creating presentations. After all, there is a comfort level with being able to see, touch and edit the data.
4. Built-in functionality - Most beginner and intermediate-level financial and statistical operations can be performed in a spreadsheet. In later tutorials you will see how conditional logic can be written in cells. Also Visual Basic for Applications, or VBA, can be programmed to automate common tasks. It is also easy to update cells in a spreadsheet live, like security prices with links to financial data providers.

1. Error prone - By many estimates, as many as 80% of all spreadsheets contain errors. Adding rows and columns can invalidate cell references, requiring significant modifications. On top of this, many firms do not train users of spreadsheets even when mission-critical functions reside in their work.
2. Rarely Audited - Auditing spreadsheets is difficult. The cell-by-cell structure requires a review of each cell to be 100% certain of accuracy. Many large financial organizations build elaborate spreadsheets because the users do not have traditional programming expertise. The programming concept of version control is difficult to maintain with spreadsheets, especially in those used by several people.
3. Difficult for complex models - While many tables can be stored in a spreadsheet, the relationships and enforcing referential integrity as is done in a database is not as robust here. So complex models can break down in a spreadsheet. There have been several high-profile examples of public firms who incurred large trading losses as a result. In addition, faulty reports from academia have led to similarly embarrassing situations.
4. Formatting is slow - As you will see, the added flexibility associated with making data look pretty in a spreadsheet can get in the way of being able to quickly generate a chart to visualize statisticial relationships, for example.
c. Why Excel?

So why are we using Excel?

First, a spreadsheet is helpful for organizing and visualizing our analysis, and for learning the concepts before moving on to programming, should that be the path you choose.

Microsoft Excel still leads in market share over Google Sheets and Apple Numbers, however the lead is narrowing.

Any search for technical jobs in the field of Finance will generally require some Excel experience, so this is our spreadsheet of choice.

Step 2 - Versions of Excel

Let's move on to the versions of Excel and operating systems.

a. My setup

On my end, I have a Windows 7 operating system with a local installation of Excel 2010. This obviously isn't the latest version, but any later version should suffice.

b. Versions of Excel for Windows and Mac

For the Mac, common Excel versions include 2011 and 2016. For Windows, currently the versions are named 2010, 2013 and 2016.

Some of the functionality across versions may not be exactly the same, but will be close. Other popular spreadsheet options, like Apple Numbers and Google Sheets are available, but here we won't take the time to translate functions from one to the other.

Okay, let's get these Add-Ins installed.

Everyone should run through steps one to four, and if you don't have the necessary Add-Ins installed already, then we will move to Step 3b.

1. Run Excel locally - First off, run a locally-installed version of Excel, meaning one installed on your hard drive. This will allow you to work while away from the Internet. Also, some advanced functionality, like Solver, is only available on the locally-installed version, as of the time of this recording. So the version of Excel included with online Office 365 may not work for all of your data analysis needs.
2. Create a working directory - Next, create a working directory for your files. The name doesn't matter, but short names are preferred. Personally, I stay away from spaces and use the underscore character instead.
3. Create a spreadsheet file for Chapter 2 - Third, a quick comment for those looking to complete our series, I use the convention Quant_101_01_Getting_Started for the name of this Excel spreadsheet file for Chapter 1, but yours doesn't have to be as descriptive. In future tutorials name the tabs the same as mine as it will save time and will help to keep cell references to other tabs straight. It will be much easier to mimic what I'm doing. Also, it is a good idea to use the same columns and rows as me, if you want to make this go a whole lot quicker. You don't need to style your pages with colors and different backgrounds like mine, but it is a good idea to spend some time making it look nice as some day you may want to show off your project to a potential employer.
4. See which Add-Ins are already running - Fourth, see which Add-Ins are already running. To do this, click on the `Data` menu and if you see `Data Analysis` and `Solver` you are good to go.

Now, let's move on to step 3b, and a quick note, the Add-In name for the `Data Analysis` menu item is called Analysis Toolpak. It provides a variety of statistical functions; including, analysis of variance, correlation, random number generation, regression, t-tests and a z-test.

The menu item `Solver` is installed using the Add-In named Solver Add-in. We will use it to find a spot on a curve, with x-and-y coordinates, that maximizes the ratio of return over risk, for a portfolio through a technique called optimization.

Now if you don't have these pre-installed on your version of Excel, follow these steps.

1. See which Add-Ins are running - Click `File`, `Options` and `Add-Ins`.
2. Manage Add-Ins - Click `Manage:` with the dropdown `Excel Add-Ins`, then the `Go...` button.
3. Select both Add-Ins - Check boxes for `Analysis ToolPak` and `Solver Add-In`, then hit the `OK` button.
4. If Solver doesn't show up - Locate and navigate to a directory with the file `solver.xlam` and click `OK`.
5. Validate that Add-Ins are running - Click on the `Data` menu item and you should see them.

Step 4 - Data set

Let's move on to Step 4 to get that 61-row and 7-column data set.

a. We will use the Returns tab for every chapter

Here, we need to download the data file and create a data tab called Returns. The goal will be to include this data tab with each Chapter file. So basically, you will carry it along with you, attaching the same tab to each spreadsheet, which will prevent us from having long references to other files. Also, creating links and troubleshooting will be much easier.

I should mention this is an available data set with four stocks already selected. Later, in Chapter 2, I will show you how to generate returns should you want to customize your own data set.

For now, to get the sample data set, follow these steps:

2. Review instructions - Where it says Download Data Here review the instructions. You can examine the returns.txt text file with a left-click, and you can see it is a tab-delimited text file, which can be easily imported to Excel.
3. Save the file - Right-click on the file and all browsers give you the option to `Save File As...`, which may be worded differently depending on your browser.
4. Create a Returns tab - Click on the last tab and double-left-click on the name and give it the name Returns and now you have a blank tab.
5. Open returns.txt - Click `File`, then `Open` and then select the file `returns.txt` to open the Text Import Wizard.
6. Text Import Wizard - The Text Import Wizard shows the first rows of data and if it is jumbled we need to let Excel know it is delimited so select `Delimited` and then the `Next>` button.
7. Tab-delimited - Excel identified that the file was tab-delimited, so you can now see how it properly aligned the columns for the import, seven in total, the date and six columns of data for each date. If it doesn't say so already, make sure to select `Tab-delimited` and then click `Next>`.
8. Select date format - For the final step, we can instruct Excel to import the first column as a date by changing the toggle to `Date` and the format MDY identifies the month, day and year. Then click `Finish`. This creates a new file that you can save using `File`, then `Save As` in case there is a problem later.
9. Copy data to Returns tab - Select the whole range of data and hit the shortcut `Ctrl-c`, then navigate to the newly created tab, going to cell B6, then use the shortcut `Ctrl-v` to paste the data. Starting at B6 is important because as you follow along with what I'm doing it will save you time to keep the same cell numbers. Okay, we're all set.
10. Create a template - As mentioned earlier, we will create a new spreadsheet for each Chapter and carry over that Returns tab for each spreadsheet file. One way to do this is to make a copy of this whole file, calling it `template.xlsx`. So each time we start a new Chapter, we can just make a copy and the Returns tab is already there for us. This saves us from having links across multiple spreadsheets, which I have found to be more difficult to manage, plus we will run some unique calculations on the tab depending on which Chapter file we are working on.
c. Use decimal place convention

One more note about this data. If you noticed, all returns are in decimal notation. We have to be consistent, or down the road our math will get messy. So we will always keep the data in decimal format, and use cell formatting to present it. So a return that looks like 0.01, is the same as 1%, but we'll always keep it as 0.01. In Excel, you can always visualize it how you like.

Step 5 - Why not programming?

Next, I want answer a question that comes up. Why not teach the series with programming instead of a spreadsheet?

Earlier we covered advantages and disadvantages of spreadsheets. This series intentionally uses a small data set so we can benefit from several of those advantages from earlier. This means we can visualize and work with the data to learn the concepts, without having to write code.

In the real world this is done with a statistical analysis package like MatLab, R and Stata, or straight in traditional programming languages like Python, Java or C.

Our sample data set covers returns for four stocks, an index and Treasury Bills over monthly periods for five years. Imagine how large the data set would be if you covered 5,000 stocks and daily returns, and this leads us to a discussion of the advantages and disadvantages of programming, because if you want to take your knowledge further, to the Institutional level, then understanding how this is done at scale is important.

1. Higher pay - Learning these same concepts in a programming language might double your salary versus what you could do in a spreadsheet. Institutional-level firms are willing to pay for this.
2. Version control - Programming allows you to hold everything constant, roll-back updates and maintain all of the code in one working state called a release. Similar to when those Apps on your phone update. Rather than code being in a constant state of change, version control allows for a clean version to be published, then another round of work can be started on another version, which helps to isolate bugs.
3. User input validation - Programming also allows you to validate the data input by the user. Think about how Excel was written to protect users from coding. Everything you do in Excel – a drop down menu, a dialog box, a formula bar – sits on code. Almost every communication device you use – a phone, a tablet, a computer – is built in code that prevents you from making mistakes. Your brokerage accounts, bank accounts and social media Apps are all written to ensure that everyone follows the rules.
4. Speed - Speed is a factor as well as statistical programming languages can help you analyze big data quickly.
5. Repetition - Through looping, like in a monte carlo simulation, you can run hundreds, or thousands, of cases through automation.

Of course programming also has disadvantages.

1. Steep learning curve - It can take years to learn and implement these concepts in a programming language.
2. Less standardization - There are a variety of programs, versions and interfaces people use in programming so there are fewer experts to focus on the improvement of one program. This is especially the case with open-source software.
3. Cost - The cost associated with an implementation using programming is much higher when you consider the cost of retaining talent, planning and rolling out releases, versus ad-hoc analysis performed in a spreadsheet.
4. No GUI - Giving up the graphical user interface is a major hurdle for most people and focusing instead on learning the ins-and-outs of a programming language by editing text files is a reason Data Scientists and Financial Engineers are in such high demand.
Summary

So to summarize, we discussed spreadsheets and why they're well-suited for our first exploration of quantitative equity portfolio management.

We discussed Excel, set up the required Add-Ins then imported a sample data set and created a template for future chapters.

Then we closed with what our next steps could be, with programming, should you have a passion for portfolio management and want to take your career to the next level.

Step 6 - Next: System Setup

In the next episode we will start Chapter 2, with our Returns tab in place and start walking through 3 methods for calculating return: arithmetic, geometric and logarithmic.

Feel free to join us at any time, and have a nice day.

What's Next?

Don't lose sight of this valuable resource, subscribe to the email list below.

• To access all Quant 101 tutorials, click Outline.
• For the introduction to the series, click Back.
• I'll trade you, removing invasive ads for a cup of coffee, click Tip.
• For return calculation methods, click Next.

/ factorpad.com / fin / quant-101 / system-setup.html

data analysis excel
quantitative equity
investment quant