
10: Dictionaries and Sets

Learning Outcomes

  • How to use dictionary / set objects and how they differ from list / tuple
  • How to access dictionary objects in iteration
  • Utility of dictionaries as hashmaps
  • Utility of sets in dimensionality reduction


Sets

Sets only contain unique items

>>> trade_ids = [12342, 324562, 12342, 36452, 54767]
>>> set(trade_ids)
{12342, 36452, 54767, 324562}

We can also iterate a set just like a list or tuple (note that sets are unordered, so the iteration order is not guaranteed):

>>> for i in set(trade_ids):
...     print(i, end=',')
12342,36452,54767,324562,

Set differences

Sets are also denoted by braces {}. Sets are a mathematical construct and Python supports some set logic, such as set differences

>>> trade_ids_expected = {12342, 36452, 54767, 324569}  # shorter way of defining sets
>>> unexpected_trade_ids = set(trade_ids) - trade_ids_expected
>>> unexpected_trade_ids
{324562}

we can also do it the other way round to look for missing trades

>>> missing_trade_ids = trade_ids_expected - set(trade_ids)
>>> missing_trade_ids
{324569}

These two operations can be particularly useful when validating the inputs to functions.
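
Python sets also support a few more pieces of set logic that are handy for the same kind of reconciliation work - intersection, union and symmetric difference (the element ordering of the printed sets may differ on your machine):

>>> set(trade_ids) & trade_ids_expected    # intersection: ids in both
{12342, 36452, 54767}
>>> set(trade_ids) | trade_ids_expected    # union: ids in either
{12342, 36452, 54767, 324562, 324569}
>>> set(trade_ids) ^ trade_ids_expected    # symmetric difference: ids in exactly one
{324562, 324569}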

Set items must be immutable

We can also iterate sets in the same way that we iterate lists and tuples. Objects can also be part of sets as long as they are immutable - i.e. unchanging. Recall that lists are mutable and tuples are immutable.

This means that we can have a set of tuples

>>> {(1, 2), (3, 4)}
{(1, 2), (3, 4)}

but not a set of lists

>>> {[1, 2], [3, 4]}
---------------------------------------------------------------------------

TypeError                                 Traceback (most recent call last)

<ipython-input-2-a0ff115cb325> in <module>()
----> 1 {[1, 2], [3, 4]}


TypeError: unhashable type: 'list'
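
If you ever do need a set whose members are themselves collections, the built-in frozenset is the immutable (and therefore hashable) counterpart of set (again, element ordering may differ):

>>> {frozenset([1, 2]), frozenset([3, 4])}
{frozenset({1, 2}), frozenset({3, 4})}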

Dictionaries

Dictionaries are Python’s version of what is known as a hash map or hash table in other languages. If anyone in an interview asks you for a hash map in Python, you’ll know they just mean a dict (also they probably don’t really know Python that well!)

All this jargon means is a key-value lookup where the key is unambiguously unique. Think VLOOKUP in Excel, but where no two keys can be identical.

We set up a dictionary with key-value pairs like follows

>>> d = {
...     'akey': 'avalue',
...     'anotherkey': 'avalue'
... }

values can be anything.

I actually wrote a load of stuff about this but I deleted it because I think you should get used to looking at the Python documentation now you are more familiar with the language.

See the official Python guide on dict - don’t bother with dict comprehensions yet as we will come onto those, but have a read of the dictionaries and looping techniques sections.
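
As a small taster of what those sections cover, here is how you look values up and loop over the dictionary d defined above:

>>> d['akey']                        # lookup by key - raises KeyError if the key is missing
'avalue'
>>> d.get('otherkey', 'a default')   # .get returns a default instead of raising
'a default'
>>> for key, value in d.items():
...     print(key, value)
akey avalue
anotherkey avalue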

Example: Trades by Asset Class

Let’s assume we have the following data which we read in from a csv into a pandas DataFrame. Let’s also assume that your credit and commodity desks for some reason give you the trade_id as a str - this is very annoying for you but a typical problem.

Finally, for any more advanced readers: this example is focused on dict and not pandas, so we shall avoid using pandas for now.

>>> import pandas as pd
>>> data = [['rates', 346455, 568789.345],
...         ['rates', 3467457, 4568679.345],
...         ['rates', 56858, -6578965789.45],
...         ['fx', 93875, 67896789.34],
...         ['fx', 34896, -3464754.456],
...         ['fx', 30986, 0.3456457],
...         ['credit', '234537', 45765.456],
...         ['credit', '457568', -3455436.213],
...         ['credit', '3467457', 456546.034],
...         ['commodities', '93875', -34563456.23235],
...         ['commodities', '34457', 4560456.4567],
...         ['commodities', '457478', 4575678.345346],
...         ['equities', 3466, -457567.345],
...         ['equities', 564756, -12.93045],
...         ['equities', 457568, 546636.438996]]
>>> df = pd.DataFrame(data, columns=['risk', 'trade_id', 'dv01'])

How many trades are there per asset class with delta risk?

>>> trade_by_asset_class = dict()
>>> for asset_class, trade_id in df.values:
...     trade_by_asset_class[asset_class] = trade_id
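
Running this loop fails; the traceback ends with something like:

ValueError: too many values to unpack (expected 2)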

Let’s now figure out what went wrong here… Remembering the stack method, we see that there are too many values to unpack and that the arrow is on the for line (if you are using PyCharm - you know who you are - then you may have no arrow!)

With iteration errors it is often easiest to index the first element to see why we couldn’t unpack it:

>>> df.values[0]
array(['rates', 346455, 568789.345], dtype=object)

Here we can see there are three items but we are trying to unpack them into only two names, asset_class and trade_id, therefore we need a third name even if we don’t currently care about the delta! A standard way of creating throwaway variables is to use _ like

>>> trade_by_asset_class = dict()
>>> for asset_class, trade_id, _ in df.values:
...     trade_by_asset_class[asset_class] = trade_id

but this doesn’t really help because on each iteration we have overwritten the value!

>>> trade_by_asset_class
{'rates': 56858,
 'fx': 30986,
 'credit': '3467457',
 'commodities': '457478',
 'equities': 457568}

we therefore need to create a list as a value item and then append to the list - this is one of the most common dictionary structures.

>>> trade_by_asset_class = dict()
>>> for asset_class, trade_id, _ in df.values:
...     if asset_class not in trade_by_asset_class:
...         trade_by_asset_class[asset_class] = []
...     trade_by_asset_class[asset_class].append(trade_id)

Think about these operations if you have a large number of rows. The following should be quicker - have a think about why this might be the case…

>>> trade_by_asset_class = dict()
>>> for ac in set(df['risk']):
...     trade_by_asset_class[ac] = []
>>> for asset_class, trade_id, _ in df.values:
...     trade_by_asset_class[asset_class].append(trade_id)

which gives

>>> trade_by_asset_class
{'rates': [346455, 3467457, 56858],
 'fx': [93875, 34896, 30986],
 'commodities': ['93875', '34457', '457478'],
 'equities': [3466, 564756, 457568],
 'credit': ['234537', '457568', '3467457']}

we now have a structure for answering the question:

>>> for a, t in trade_by_asset_class.items():
...     print('risk: {:12s} trades: {:2d}'.format(a, len(t)))
risk: rates        trades:  3
risk: fx           trades:  3
risk: commodities  trades:  3
risk: equities     trades:  3
risk: credit       trades:  3

For more information on string padding see: https://pyformat.info/#string_pad_align
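
As an aside, the standard library also offers collections.defaultdict, which handles the missing-key initialisation from the example above for you; a minimal sketch using the same df:

>>> from collections import defaultdict
>>> trade_by_asset_class = defaultdict(list)   # missing keys start life as an empty list
>>> for asset_class, trade_id, _ in df.values:
...     trade_by_asset_class[asset_class].append(trade_id)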

Simplifying iterations with dictionaries

Let’s imagine that the credit trading PnL system for some reason zero-pads all the database ids to a length of 8 because some lunatic decided it looked nice in the 90s.

To link your PnL you will have to also prepend zeros to every trade_id, whilst cussing out Diana Bloggs, who retired last year after a distinguished trading career but who also royally screwed you with one decision she made as a grad on a drizzly Friday morning in 1999.

We can zero pad integers to a length of 8 like

>>> str(346).zfill(8)
'00000346'

A naive way of doing this would be just to call the following

>>> df['trade_id_pad'] = df['trade_id'].astype(str).str.zfill(8)

This example shows us two new operations: Firstly that we can call .astype on a pandas.Series (a series is a single column of a DataFrame). Secondly, if a pandas series is a str type then we can call .str to access operations that are normally found within str object types.

Imagine now that this dataframe is $10^6$x larger

WARNING: if your laptop is awful you may not want to run this next section

>>> df = pd.DataFrame(data * 10**6, columns=['risk', 'trade_id', 'dv01'])

Timing this for me took about 10 seconds!

In [100]: %timeit df['trade_id'].astype(str).str.zfill(8)
10.5 s ± 464 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

Let’s now simplify the process by using a hash map

In [101]: %%timeit
     ...: trade_ids = df['trade_id'].unique()  # pandas way to get unique items is fast
     ...: lookup = {}
     ...: for trade_id in trade_ids:
     ...:     lookup[trade_id] = str(trade_id).zfill(8)
     ...: df['trade_id_pad'] = df['trade_id'].apply(lookup.get)
     ...: 
2.38 s ± 28.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

Here we see the .apply method in action. This is pandas’ version of a map. A map applies a single function across an array of items. map actually exists in Python as a built-in function and we can call it like:

>>> list(map(str, [3456, 4576, 7343]))
['3456', '4576', '7343']

and

>>> list(map(len, ['36', '45sd76', '7343']))
[2, 6, 4]

pandas.Series.apply works in the same way and in this example iterates the .get method of lookup across every item in the trade_id column.

Here the dictionary lookup is exceedingly fast and building it only requires us to run the far slower expression str(trade_id).zfill(8) 15 times instead of 15 million times!
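
For reference, pandas also provides Series.map, which accepts a dictionary directly, so the same trick can be written without .apply (a sketch under the same assumptions as above):

>>> lookup = {}
>>> for trade_id in df['trade_id'].unique():
...     lookup[trade_id] = str(trade_id).zfill(8)
>>> df['trade_id_pad'] = df['trade_id'].map(lookup)   # map looks each value up in the dict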

Exercises

Exercise 10.1: Dimensionality reduction

This example aims to build on previous examples to reinforce the idea of hash maps for reducing complexity.

You are working on an end-of-day regulatory risk model that requires the revaluation of all trades (e.g. Basel III: FRTB Sensitivities Based Approach). You have been instructed to calculate the present value (PV) as a new column in an Excel sheet. Someone else has done this and complained it was impossibly long and took over 40 hours to calculate. They have requested access to a compute grid, worth $10k per year, to speed up their Excel sheet.

Assume your pricing function to get the PV of the trade is this:

>>> import time
>>> import numpy as np
>>> def dodgy_pricer(trade_id):
...     """Takes a trade_id and returns a random PV"""
...     time.sleep(.1)
...     return 2e9 * np.random.random() - 1e9

and it is called in Excel something like =DODGY_PRICER($B3)

Assume we have already read the Excel sheet with Python (use our previous example data and make 1.5m rows as the example sheet data):

>>> df = pd.DataFrame(data * 100000, columns=['risk', 'trade_id', 'dv01'])

Given that

>>> .1 * 15e5 / 60 / 60  # 0.1 seconds per trade x 1.5m rows, in hours
41.666666666666664

Use your knowledge of dictionaries to reduce the problem set and claim a portion of the cost savings for your bonus.

# Solve me

Exercise 10.2: Extracting corporate financials via an API

This example aims to help you understand how to navigate a simple dictionary object and also that there is a library for almost every problem. You should note that in a real Investment Banking use case there will be an API for internal / paid datasets, and I would recommend you talk to your IT / Data desk / Strat, emphasising that every dataset available in Excel will also be available in Python.

I googled "python get balance sheet" and found the following decent library: https://github.com/JECSand/yahoofinancials

We can tell this is a decent library because:

  • It has good documentation
  • There are issues being worked upon recently
  • The last commit was recent

These are all extremely important, particularly when the core premise of the library is webscraping a site that can (and does regularly!) change.

We can follow the example on the github homepage and adapt for our use case.

I found for various reasons I needed to do a few improvements to this API to get exactly what we needed. We can pip install libraries directly from github. Please use the improved-fork I created of this API until it is accepted by the library author as follows:

$ pip install git+https://github.com/flipdazed/yahoofinancials

Your exercise is to use this API to extract the Equity for Apple. Let’s assume that Apple is a 100% public company (I have no idea if it is or not) so the equity is just the Market Cap.

Hint: The ticker for Apple is "AAPL"… How did I know market capitalisation was definitely there? Check the API documentation!

# Solve me

Exercise 10.3: Comparable (Comp) Analysis: Calculating Levered Beta

This example aims to demonstrate some use cases of IBD-style analysis that can be done in Python. NB: I work in trading and not IBD so apologies for any mistakes!

The $\beta$ is a measure used in CAPM modelling of the systematic risk of an individual stock relative to the risk of the market as a whole (c.f. https://www.investopedia.com/terms/b/beta.asp)

In order to calculate the $\beta$ from comps we must first “deleverage” the $\beta$ for each comp, which can be done via

$$\beta_{unlev} = \frac{\beta_{lev}}{1 + (1 - r_T)\frac{D}{E} + \frac{E_p}{E}}$$

where $D$ is debt, $E$ is equity, $r_T$ is the effective tax rate and $E_p$ is preferred stock.

As a toy example let’s calculate the $\beta$ of Microsoft (MSFT) from some comparables and compare it to the Yahoo value.

Let’s take the main competitors and use them as comps in the examples that follow

>>> comps = ['AAPL', 'SAP', 'IBM', 'ORCL', 'GOOG']

Exercise 10.3.1: Obtain the Debt/Equity Ratios for comps

This example shows you a real result of an API which typically contain nested dictionary objects and how they can be parsed into a pandas.DataFrame object for ease of use.

You can create a pandas.DataFrame object like follows:

>>> import pandas as pd
>>> data = {'AAPL': {'grossMargins': 0.346, 'profitMargins': 45.23},
...         'SAP': {'grossMargins': 346, 'profitMargins': 23}}
>>> df_comps = pd.DataFrame(data)
>>> df_comps
                 AAPL  SAP
grossMargins    0.346  346
profitMargins  45.230   23

I can assure you that it will be a bit simpler if we transpose this dataframe as follows and use the tickers as the row indices. This will let us use columns for our calculations. As beginners with pandas objects I think this is easier.

>>> df_comps = pd.DataFrame(data).T
>>> df_comps
      grossMargins  profitMargins
AAPL         0.346          45.23
SAP        346.000          23.00

To start you off, this will get the data - you can find this all on the README.md of the github repository

>>> from yahoofinancials import YahooFinancials
>>> comps = ['AAPL', 'SAP', 'IBM', 'ORCL', 'GOOG']
>>> parser = YahooFinancials(comps)
>>> bals = parser.get_financial_data()  # This is why we use my fork: We can't get D/E from the original

Convert this to a pandas DataFrame as shown above, where the tickers are the row indices. Then filter the columns to leave only the column ['debtToEquity']; this can be done by

>>> df_comps = df_comps[['debtToEquity']]
>>> df_comps['debtToEquity'] /= 100.  # we need to convert the values from percentages to decimal
# Solve me

Exercise 10.3.2: Calculate the Effective Tax Rate for Comps

This example will contain a non-trivial data structure containing lists and dictionaries nested within each other. It will also challenge you to interact with the pandas.DataFrame object you have just created, using .loc to create new entries

Calculate the effective tax rate for all these tech companies: This is given in the API as incomeTaxExpense / incomeBeforeTax as in https://github.com/JECSand/yahoofinancials#examples-of-returned-json-data

I will start you off with the following

>>> inc = parser.get_financial_stmts('annual', 'income')  # annual is more stable estimate to avoid rebates

Hint that you can create a new column/row combination in a dataframe by doing:

>>> df_comps.loc['AAPL', 'incomeBeforeTax'] = 45

You will want to iterate through each ticker and column combination to fill the values

# Solve me

Exercise 10.3.3: Calculate the beta for each comp

This example requires you to read the technical documentation for a repository and utilise it for a solution.

Create a new column for the $\beta$ for each comp.

Explore the documentation at https://github.com/JECSand/yahoofinancials to find a sensible method (There is more than one way to do this!)

# Solve me

Exercise 10.3.4: Calculate unlevered beta as a new column

This example uses some basic vectorised mathematics with pandas DataFrames

If you have forgotten it, the formula to calculate $\beta_{unlev}$ is

$$\beta_{unlev} = \frac{\beta_{lev}}{1 + (1 - r_T)\frac{D}{E}}$$

(ignoring the preferred stock term here).

Calculate the $\beta_{unlev}$ for each comp in a new column called 'beta_unlev'. You should google how to do basic maths in pandas if you are unsure.

# Solve me

Exercise 10.3.5: Calculate the beta for Microsoft

We now need to relever the $\beta$ with the debt-equity ratio and effective tax rate of Microsoft.

Exercise 10.3.5.1: Obtain $D/E$ and Effective Tax Rate for Microsoft

This exercise revises the previous dictionary and list circus skills

Repeat just the vital steps above to obtain 'debtToEquity' and 'effectiveTax' for "MSFT"

# Solve me

Exercise 10.3.5.2: Calculate the estimate of unlevered beta as the median of the comps

The exercise demonstrates how we can use numpy functionality through pandas

This is a maths problem and so naturally we use numpy which has the function np.median. We can use this on the column of the pd.DataFrame containing the 'beta_unlev' to obtain the median, remembering that Pandas DataFrames are really just wrappers around numpy arrays

Note Most simple math functions can be found as a .method() on the end of the pandas DataFrame. For example try typing

>>> df_comps['beta_unlev'].

Then hit TAB after the . and see all the methods available. You should find one to calculate the median too.

Assign the median unlevered beta to a variable named beta_unlev_msft

# Solve me

Exercise 10.3.5.3: Calculate the estimate of beta

Then calculate the $\beta$ for Microsoft as

$$\beta = \beta_{unlev}\left(1 + (1 - r_T)\frac{D}{E}\right)$$

where $r_T$ is the effective tax rate for Microsoft, $D/E$ is the Debt-to-Equity ratio for Microsoft and we already calculated $\beta_{unlev}$ above.

# Solve me

Appendix: Calculate WACC for Microsoft from comparables

This exercise is really aimed at those in Investment Banking roles because it’s nice to complete the picture

Calculate beta based upon comparables D/E

As a guess of an “optimal” capital structure from comparables, take the median debt and median equity ratios from the comparables. Calculate the D/E ratio from these medians and use it to create a $\beta$, using the same effectiveTax. Add a new row containing ['debtToEquity', 'effectiveTax', 'beta'] for this new $\beta$ and D/E using the index name: 'MSFT_beta_optcap'

This won’t affect Bankers working in the industry as you’ll have your own systems / APIs… However, sometimes the 'totalDebt' can be a bit rogue in Yahoo Finance. I would recommend pulling (check the API documentation!) the 'totalStockholderEquity' as the total Equity and calculating 'totalDebt' from the trivial relation

total_debt = debt_to_equity * total_shareholder_equity 

So we don’t give away the solutions to previous sections, I will hardcode the expected results required (as of November 13th!) below and use that in this demonstration

>>> df_msft = pd.DataFrame(
...     [[0.8068, 0.101813, 1.229776, 0.7130553],
...      [0.8068, 0.101813, 1.073342, 0.6223507]],
...     columns=['debtToEquity', 'effectiveTax', 'beta', 'beta_unlev'],
...     index=['MSFT_beta_hist', 'MSFT_beta_comp']
... )
>>> df_msft
                debtToEquity  effectiveTax      beta  beta_unlev
MSFT_beta_hist        0.8068      0.101813  1.229776    0.713055
MSFT_beta_comp        0.8068      0.101813  1.073342    0.622351

and the comparables

>>> df_comps = pd.DataFrame(
...     [[1.194, 65.737e9, 10.481e9, 0.1594, 1.247, 0.6224, 90.488e9, 108.047e9, 0.4558, 0.5442],
...      [0.5439, 5.6e9, 1.511e9, 0.2698, 1.0666, 0.7634, 30.159e9, 16.403e9, 0.6477, 0.3523],
...      [3.9512, 11.342e9, 2.619e9, 0.2309, 1.3442, 0.3328, 17.956e9, 70.947e9, 0.202, 0.798],
...      [2.9852, 12.268e9, 1.185e9, 0.0966, 1.1506, 0.3112, 18.433e9, 55.027e9, 0.2509, 0.7491],
...      [0.0763, 34.913e9, 4.177e9, 0.1196, 1.0146, 0.9508, 194.969e9, 14.872e9, 0.9291, 0.0709]],
...      index=[ 'AAPL', 'SAP', 'IBM', 'ORCL', 'GOOG'],
...      columns=['debtToEquity', 'incomeBeforeTax', 'incomeTaxExpense',
...               'effectiveTax', 'beta', 'beta_unlev', 'totalStockholderEquity',
...               'totalDebt', 'ratioEquity', 'ratioDebt']
... )
>>> df_comps
debtToEquity incomeBeforeTax incomeTaxExpense effectiveTax beta beta_unlev totalStockholderEquity totalDebt ratioEquity ratioDebt
AAPL 1.1940 6.573700e+10 1.048100e+10 0.1594 1.2470 0.6224 9.048800e+10 1.080470e+11 0.4558 0.5442
SAP 0.5439 5.600000e+09 1.511000e+09 0.2698 1.0666 0.7634 3.015900e+10 1.640300e+10 0.6477 0.3523
IBM 3.9512 1.134200e+10 2.619000e+09 0.2309 1.3442 0.3328 1.795600e+10 7.094700e+10 0.2020 0.7980
ORCL 2.9852 1.226800e+10 1.185000e+09 0.0966 1.1506 0.3112 1.843300e+10 5.502700e+10 0.2509 0.7491
GOOG 0.0763 3.491300e+10 4.177000e+09 0.1196 1.0146 0.9508 1.949690e+11 1.487200e+10 0.9291 0.0709

Obtain Total Debt and Total Equity for comps

I use the latest 'quarterly' data for unlevering $\beta$

>>> from yahoofinancials import YahooFinancials
>>> comps = ['AAPL', 'SAP', 'IBM', 'ORCL', 'GOOG']
>>> parser = YahooFinancials(comps)
>>> bal_comps = parser.get_financial_stmts('quarterly', 'balance')

Then extract the relevant elements from the balance sheet API return values and calculate 'totalDebt' and the debt / equity ratios

>>> for p in comps:
...     bal_last = list(bal_comps['balanceSheetHistoryQuarterly'][p][0].values())[0]
...     df_comps.loc[p, 'totalStockholderEquity'] = bal_last['totalStockholderEquity']
>>> df_comps['totalDebt'] = df_comps['debtToEquity'] * df_comps['totalStockholderEquity']
>>> df_comps['ratioEquity'] = df_comps['totalStockholderEquity'] / (df_comps['totalDebt'] + df_comps['totalStockholderEquity'])
>>> df_comps['ratioDebt'] = 1 - df_comps['ratioEquity'] # A lot of assumptions here!
>>> df_comps
debtToEquity incomeBeforeTax incomeTaxExpense effectiveTax beta beta_unlev totalStockholderEquity totalDebt ratioEquity ratioDebt
AAPL 1.1940 6.573700e+10 1.048100e+10 0.1594 1.2470 0.6224 9.048800e+10 1.080427e+11 0.455789 0.544211
SAP 0.5439 5.600000e+09 1.511000e+09 0.2698 1.0666 0.7634 3.015900e+10 1.640348e+10 0.647710 0.352290
IBM 3.9512 1.134200e+10 2.619000e+09 0.2309 1.3442 0.3328 1.795600e+10 7.094775e+10 0.201971 0.798029
ORCL 2.9852 1.226800e+10 1.185000e+09 0.0966 1.1506 0.3112 1.843300e+10 5.502619e+10 0.250928 0.749072
GOOG 0.0763 3.491300e+10 4.177000e+09 0.1196 1.0146 0.9508 1.949690e+11 1.487613e+10 0.929109 0.070891

Obtain Total Debt and Total Equity for Microsoft

We can extract the 'totalStockholderEquity' for Microsoft as well

>>> ticker = "MSFT"
>>> parser = YahooFinancials(ticker)
>>> bal_msft = parser.get_financial_stmts('quarterly', 'balance')
>>> bal_msft = list(bal_msft['balanceSheetHistoryQuarterly'][ticker][0].values())[0]
>>> df_msft.loc[:, 'totalStockholderEquity'] = bal_msft['totalStockholderEquity']
>>> df_msft['totalDebt'] = df_msft['debtToEquity'] * df_msft['totalStockholderEquity']
>>> df_msft['ratioEquity'] = df_msft['totalStockholderEquity'] / (df_msft['totalDebt'] + df_msft['totalStockholderEquity'])
>>> df_msft['ratioDebt'] = 1 - df_msft['ratioEquity']
>>> df_msft
debtToEquity effectiveTax beta beta_unlev totalStockholderEquity totalDebt ratioEquity ratioDebt
MSFT_beta_hist 0.8068 0.101813 1.229776 0.713055 106061000000 8.557001e+10 0.553465 0.446535
MSFT_beta_comp 0.8068 0.101813 1.073342 0.622351 106061000000 8.557001e+10 0.553465 0.446535

Include an ‘Optimal’ capital structure based on comps

Create a new ‘optimal’ capital structure by taking the medians of the comps’ Debt and Equity ratios

>>> i = f'{ticker}_beta_optcap'
>>> df_msft.loc[i, :] = df_msft.loc[f'{ticker}_beta_hist', :].copy()
>>> df_msft.loc[i, 'ratioDebt'] = df_comps['ratioDebt'].median()
>>> df_msft.loc[i, 'ratioEquity'] = df_comps['ratioEquity'].median()
>>> df_msft
debtToEquity effectiveTax beta beta_unlev totalStockholderEquity totalDebt ratioEquity ratioDebt
MSFT_beta_hist 0.8068 0.101813 1.229776 0.713055 1.060610e+11 8.557001e+10 0.553465 0.446535
MSFT_beta_comp 0.8068 0.101813 1.073342 0.622351 1.060610e+11 8.557001e+10 0.553465 0.446535
MSFT_beta_optcap 0.8068 0.101813 1.229776 0.713055 1.060610e+11 8.557001e+10 0.455789 0.544211

Create beta values for the various capital structures

>>> df_msft.loc[i, 'totalDebt'] = df_msft.loc[i, 'ratioDebt'] * (df_msft.loc[f'{ticker}_beta_hist', 'totalDebt'] + df_msft.loc[f'{ticker}_beta_hist', 'totalStockholderEquity'])
>>> df_msft.loc[i, 'totalStockholderEquity'] = df_msft.loc[i, 'ratioEquity'] * (df_msft.loc[f'{ticker}_beta_hist', 'totalDebt'] + df_msft.loc[f'{ticker}_beta_hist', 'totalStockholderEquity'])
>>> df_msft.loc[i, 'debtToEquity'] = df_msft.loc[i, 'totalDebt'] / df_msft.loc[i, 'totalStockholderEquity']
>>> df_msft.loc[i, 'beta'] = df_msft.loc[i, 'beta_unlev'] * (1 + (1 - df_msft.loc[i, 'effectiveTax']) *  df_msft.loc[i, 'debtToEquity'])
>>> df_msft
debtToEquity effectiveTax beta beta_unlev totalStockholderEquity totalDebt ratioEquity ratioDebt
MSFT_beta_hist 0.8068 0.101813 1.229776 0.713055 1.060610e+11 8.557001e+10 0.553465 0.446535
MSFT_beta_comp 0.8068 0.101813 1.073342 0.622351 1.060610e+11 8.557001e+10 0.553465 0.446535
MSFT_beta_optcap 1.1940 0.101813 1.477761 0.713055 8.734322e+10 1.042878e+11 0.455789 0.544211

Calculate Cost of Equity, ${r}_{E}$

Let’s assume the following:

  • implied equity risk premium is as of 1st Nov: http://pages.stern.nyu.edu/~adamodar/
  • risk free rate taken from 10 year US Treasury bonds when writing on 13th Nov
  • Current cost of debt, ${r}_{D}$, for Microsoft

>>> erp = 0.0509
>>> rf = 0.0189
>>> debt_cost_msft = 0.033539

Then cost of equity can be calculated as

>>> df_msft['costDebt'] = debt_cost_msft
>>> df_msft['costEquity'] = rf + erp * df_msft['beta']  # cost_equity = rf + erp * beta_levered 
>>> df_msft
debtToEquity effectiveTax beta beta_unlev totalStockholderEquity totalDebt ratioEquity ratioDebt costDebt costEquity
MSFT_beta_hist 0.8068 0.101813 1.229776 0.713055 1.060610e+11 8.557001e+10 0.553465 0.446535 0.033539 0.081496
MSFT_beta_comp 0.8068 0.101813 1.073342 0.622351 1.060610e+11 8.557001e+10 0.553465 0.446535 0.033539 0.073533
MSFT_beta_optcap 1.1940 0.101813 1.477761 0.713055 8.734322e+10 1.042878e+11 0.455789 0.544211 0.033539 0.094118

Calculate the Weighted Average Cost of Capital (WACC)

The WACC formula is (ignoring tweaks and preferred stock)

$$WACC = \frac{E}{D+E}\,r_E + \frac{D}{D+E}\,r_D\,(1 - r_T)$$

so we can calculate it in Python as

>>> df_msft['WACC'] = df_msft['costEquity'] * df_msft['ratioEquity'] + df_msft['costDebt'] * df_msft['ratioDebt'] * (1 - df_msft['effectiveTax'])
>>> df_msft
debtToEquity effectiveTax beta beta_unlev totalStockholderEquity totalDebt ratioEquity ratioDebt costDebt costEquity WACC
MSFT_beta_hist 0.8068 0.101813 1.229776 0.713055 1.060610e+11 8.557001e+10 0.553465 0.446535 0.033539 0.081496 0.058556
MSFT_beta_comp 0.8068 0.101813 1.073342 0.622351 1.060610e+11 8.557001e+10 0.553465 0.446535 0.033539 0.073533 0.054150
MSFT_beta_optcap 1.1940 0.101813 1.477761 0.713055 8.734322e+10 1.042878e+11 0.455789 0.544211 0.033539 0.094118 0.059292

It turns out that this is slightly off the mark compared with what I found on the internet, but I suspect that is due to the $\beta$ on Yahoo Finance being particularly high compared to other sources

Next Topic

11: Classes and Modules