CAPM - Capital Asset Pricing Model Code Along

Watch the video for the full overview.

Formula

Portfolio Returns

$r_p(t) = \sum\limits_{i=1}^{n} w_i r_i(t)$

Market Weights

$ w_i = \frac{MarketCap_i}{\sum_{j=1}^{n}{MarketCap_j}} $
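
As a rough illustration of the two formulas above, the sketch below computes market-cap weights and a one-period portfolio return. The market caps and stock returns are made-up numbers, and the variable names are assumptions chosen for this example only.

```python
import numpy as np

# Hypothetical market caps (in billions) for three made-up stocks.
market_caps = np.array([500.0, 300.0, 200.0])

# w_i = MarketCap_i / sum_j MarketCap_j
weights = market_caps / market_caps.sum()

# Hypothetical returns of the three stocks for a single period t.
stock_returns = np.array([0.01, -0.02, 0.03])

# r_p(t) = sum_i w_i * r_i(t)
portfolio_return = float(weights @ stock_returns)

print(weights)
print(portfolio_return)
```

The weights always sum to 1, so the portfolio return is a weighted average of the individual returns.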

CAPM of a Portfolio

$ r_p(t) = \beta_p r_m(t) + \sum\limits_{i=1}^{n} w_i \alpha_i(t)$
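
One way to see where the portfolio form comes from is to apply CAPM to each stock and then take the weighted sum. The sketch below assumes a per-stock beta $\beta_i$, a symbol that does not appear in the formulas above.

```latex
\begin{aligned}
r_i(t) &= \beta_i r_m(t) + \alpha_i(t) \\
r_p(t) &= \sum_{i=1}^{n} w_i r_i(t)
        = \Big(\sum_{i=1}^{n} w_i \beta_i\Big) r_m(t) + \sum_{i=1}^{n} w_i \alpha_i(t) \\
\beta_p &= \sum_{i=1}^{n} w_i \beta_i
\end{aligned}
```

Under that assumption, the portfolio beta is simply the weighted average of the individual betas.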

In [1]:
# Treat the CAPM model as a simple linear regression

Imports

In [2]:
from scipy import stats

import pandas as pd
import pandas_datareader as web

Get Help on Function - linregress

In [3]:
help(stats.linregress)
Help on function linregress in module scipy.stats._stats_mstats_common:

linregress(x, y=None)
    Calculate a linear least-squares regression for two sets of measurements.
    
    Parameters
    ----------
    x, y : array_like
        Two sets of measurements.  Both arrays should have the same length.
        If only x is given (and y=None), then it must be a two-dimensional
        array where one dimension has length 2.  The two sets of measurements
        are then found by splitting the array along the length-2 dimension.
    
    Returns
    -------
    slope : float
        slope of the regression line
    intercept : float
        intercept of the regression line
    rvalue : float
        correlation coefficient
    pvalue : float
        two-sided p-value for a hypothesis test whose null hypothesis is
        that the slope is zero, using Wald Test with t-distribution of
        the test statistic.
    stderr : float
        Standard error of the estimated gradient.
    
    See also
    --------
    :func:`scipy.optimize.curve_fit` : Use non-linear
     least squares to fit a function to data.
    :func:`scipy.optimize.leastsq` : Minimize the sum of
     squares of a set of equations.
    
    Examples
    --------
    >>> import matplotlib.pyplot as plt
    >>> from scipy import stats
    >>> np.random.seed(12345678)
    >>> x = np.random.random(10)
    >>> y = np.random.random(10)
    >>> slope, intercept, r_value, p_value, std_err = stats.linregress(x, y)
    
    To get coefficient of determination (r_squared)
    
    >>> print("r-squared:", r_value**2)
    r-squared: 0.080402268539
    
    Plot the data along with the fitted line
    
    >>> plt.plot(x, y, 'o', label='original data')
    >>> plt.plot(x, intercept + slope*x, 'r', label='fitted line')
    >>> plt.legend()
    >>> plt.show()
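
To get a feel for the return values described in the help text, here is a minimal sketch that fits synthetic data with a known slope and intercept. The data, the seed, and the variable names are assumptions for illustration, not part of the notebook.

```python
import numpy as np
from scipy import stats

# Synthetic data: y = 2.0 * x + 0.5 plus a little Gaussian noise.
rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, 500)
y = 2.0 * x + 0.5 + rng.normal(0.0, 0.1, 500)

# linregress(x, y) fits y = slope * x + intercept.
result = stats.linregress(x, y)

# The fields can be read by name instead of unpacking the tuple.
print(result.slope, result.intercept, result.rvalue ** 2)
```

The estimated slope and intercept should land close to 2.0 and 0.5, and r-squared should be close to 1 because the noise is small.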

Get Data for Portfolio

In [4]:
spy_etf = web.DataReader('SPY','google')
spy_etf.info()
C:\Users\KL\Anaconda3\lib\site-packages\pandas_datareader\google\daily.py:40: UnstableAPIWarning: 
The Google Finance API has not been stable since late 2017. Requests seem
to fail at random. Failure is especially common when bulk downloading.

  warnings.warn(UNSTABLE_WARNING, UnstableAPIWarning)
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 2054 entries, 2010-01-04 to 2018-03-02
Data columns (total 5 columns):
Open      2035 non-null float64
High      2035 non-null float64
Low       2035 non-null float64
Close     2054 non-null float64
Volume    2054 non-null int64
dtypes: float64(4), int64(1)
memory usage: 96.3 KB
In [5]:
spy_etf.head()
Out[5]:
              Open    High     Low   Close     Volume
Date
2010-01-04  112.37  113.39  111.51  113.33  118944541
2010-01-05  113.26  113.68  112.85  113.63  111579866
2010-01-06  113.52  113.99  113.43  113.71  116074402
2010-01-07  113.50  114.33  113.18  114.19  131091048
2010-01-08  113.89  114.62  113.66  114.57  126402764
In [6]:
# Get dates from .head() and .tail()
start = pd.to_datetime('2010-01-04')
end = pd.to_datetime('2017-07-18')
In [7]:
aapl = web.DataReader('AAPL','google',start,end)
In [8]:
aapl.head()
Out[8]:
             Open   High    Low  Close     Volume
Date
2010-01-04  30.49  30.64  30.34  30.57  123432050
2010-01-05  30.66  30.80  30.46  30.63  150476004
2010-01-06  30.63  30.75  30.11  30.14  138039594
2010-01-07  30.25  30.29  29.86  30.08  119282324
2010-01-08  30.04  30.29  29.87  30.28  111969081

Re-assign spy_etf using the same 'start' and 'end' dates so that it covers the same number of periods as aapl. Otherwise the two series have different lengths and the scatter graph cannot be plotted.

In [9]:
spy_etf = web.DataReader('SPY','google', start, end)

Plot Performance

In [10]:
import matplotlib.pyplot as plt
%matplotlib inline
In [11]:
aapl['Close'].plot(label='AAPL',figsize=(10,8))
spy_etf['Close'].plot(label='SPY Index')
plt.legend()
Out[11]:
<matplotlib.legend.Legend at 0xc71b2f0>

Compare Cumulative Return

In [12]:
aapl['Cumulative'] = aapl['Close']/aapl['Close'].iloc[0]
spy_etf['Cumulative'] = spy_etf['Close']/spy_etf['Close'].iloc[0]
In [13]:
aapl['Cumulative'].plot(label='AAPL',figsize=(10,8))
spy_etf['Cumulative'].plot(label='SPY Index')
plt.legend()
plt.title('Cumulative Return')
Out[13]:
Text(0.5,1,'Cumulative Return')

Get Daily Return

In [14]:
aapl['Daily Return'] = aapl['Close'].pct_change(1)
spy_etf['Daily Return'] = spy_etf['Close'].pct_change(1)
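
As a small check on what pct_change(1) computes, the sketch below reproduces it by hand on made-up prices and shows how it relates to the cumulative return. The prices and variable names are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical closing prices, not real market data.
close = pd.Series([100.0, 102.0, 99.0, 105.0])

# pct_change(1) is the same as price today / price yesterday - 1.
daily_return = close.pct_change(1)
by_hand = close / close.shift(1) - 1

# The first value is NaN because day one has no previous close.
first_is_nan = pd.isna(daily_return.iloc[0])

# Compounding the daily returns recovers Close / Close.iloc[0].
compounded = (1 + daily_return.fillna(0)).cumprod()
cumulative = close / close.iloc[0]

print(daily_return.tolist())
print(compounded.tolist())
```

That leading NaN is why the first row is skipped with .iloc[1:] before the returns are passed to a regression.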

Plot Scatter Graph

Plot a scatter graph of the two daily return series to check visually for correlation.

In [15]:
plt.scatter(aapl['Daily Return'],spy_etf['Daily Return'],alpha=0.3)
Out[15]:
<matplotlib.collections.PathCollection at 0xea902f0>
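
A scatter plot pairs the two series point by point, so both must cover the same dates. As a numeric complement to the visual check, the sketch below aligns two made-up return series on their shared dates and computes their Pearson correlation. The data and the names stock and market are hypothetical.

```python
import pandas as pd

# Hypothetical daily returns whose date ranges only partly overlap.
stock = pd.Series(
    [0.010, -0.020, 0.015, 0.005],
    index=pd.to_datetime(["2010-01-05", "2010-01-06", "2010-01-07", "2010-01-08"]),
    name="stock",
)
market = pd.Series(
    [0.004, -0.010, 0.006],
    index=pd.to_datetime(["2010-01-06", "2010-01-07", "2010-01-08"]),
    name="market",
)

# An inner join keeps only the dates present in both series.
aligned = pd.concat([stock, market], axis=1, join="inner")

# Pearson correlation of the aligned pairs.
corr = aligned["stock"].corr(aligned["market"])
print(len(aligned), corr)
```

Aligning on the index this way is an alternative to downloading both tickers again with identical start and end dates.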

Plot Histogram

In [16]:
aapl['Daily Return'].hist(bins=100)
Out[16]:
<matplotlib.axes._subplots.AxesSubplot at 0x42b46b0>
In [17]:
spy_etf['Daily Return'].hist(bins=100)
Out[17]:
<matplotlib.axes._subplots.AxesSubplot at 0xeb0ac50>

Get Beta & Alpha

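Get the beta and alpha values by unpacking the result of stats.linregress(). The first row is skipped with .iloc[1:] because its daily return is NaN. Note that linregress(x, y) fits y = slope * x + intercept: the call below passes AAPL as x and SPY as y, so the slope it returns describes SPY regressed on AAPL. To estimate AAPL's beta against the market, the SPY returns would be passed first.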

In [18]:
beta,alpha,r_value,p_value,std_err = stats.linregress(aapl['Daily Return'].iloc[1:],spy_etf['Daily Return'].iloc[1:])
In [19]:
beta
Out[19]:
0.19423150396392763
In [20]:
alpha
Out[20]:
0.00026461336993233316
In [21]:
r_value
Out[21]:
0.33143080741409325
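
Because linregress(x, y) fits y = slope * x + intercept, the order of the arguments decides which series is treated as the dependent variable. The sketch below uses synthetic returns (not the AAPL or SPY data) to show that the two orderings give different slopes but the same correlation, and that the stock-on-market slope matches cov(stock, market) / var(market). All names and numbers are assumptions for illustration.

```python
import numpy as np
from scipy import stats

# Synthetic returns: the stock moves 1.5x the market plus its own noise.
rng = np.random.default_rng(1)
market = rng.normal(0.0, 0.01, 1000)
stock = 1.5 * market + rng.normal(0.0, 0.02, 1000)

# Stock regressed on market: the slope is the CAPM-style beta.
stock_on_market = stats.linregress(market, stock)
# Market regressed on stock: a different slope, same correlation.
market_on_stock = stats.linregress(stock, market)

# Beta also equals cov(stock, market) / var(market).
cov = np.cov(market, stock, ddof=1)
beta_from_cov = cov[0, 1] / cov[0, 0]

print(stock_on_market.slope, market_on_stock.slope)
print(stock_on_market.rvalue, market_on_stock.rvalue)
```

The product of the two slopes equals r squared, which is why swapping the arguments changes beta whenever the correlation is well below 1.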

To check this understanding, simulate a fake stock whose daily return is the SPY daily return plus Gaussian noise with a mean of 0 and a standard deviation of 0.001.

In [22]:
spy_etf['Daily Return'].head()
Out[22]:
Date
2010-01-04         NaN
2010-01-05    0.002647
2010-01-06    0.000704
2010-01-07    0.004221
2010-01-08    0.003328
Name: Daily Return, dtype: float64
In [23]:
import numpy as np
In [24]:
noise = np.random.normal(0, 0.001, len(spy_etf['Daily Return'].iloc[1:]))
In [25]:
noise
Out[25]:
array([ 0.00130414,  0.00215892, -0.00102251, ..., -0.00115488,
       -0.00028525, -0.00090769])
In [26]:
fake_stock = spy_etf['Daily Return'].iloc[1:] + noise
In [27]:
plt.scatter(fake_stock, spy_etf['Daily Return'].iloc[1:], alpha=0.25)
Out[27]:
<matplotlib.collections.PathCollection at 0xec768b0>
In [28]:
beta,alpha,r_value,p_value,std_err = stats.linregress(fake_stock,spy_etf['Daily Return'].iloc[1:])

The beta value is almost 1, and the fake stock tracks SPY very closely.

In [29]:
beta
Out[29]:
0.9906233442249177
In [30]:
alpha
Out[30]:
-2.4392338140190198e-05

Looks like our understanding is correct! If a stock lines up with the index or the market itself, its beta will be close to 1 and its alpha close to 0.
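
The experiment can be made reproducible without downloading any data. The sketch below is a minimal version under stated assumptions: the market returns are synthetic stand-ins for SPY, the seed and sample size are arbitrary, and the market is passed as x so that the slope is the fake stock's beta.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Stand-in for the SPY daily returns (synthetic, not real data).
market = rng.normal(0.0005, 0.01, 1500)

# Fake stock = market return plus zero-mean noise with std dev 0.001.
fake_stock = market + rng.normal(0.0, 0.001, 1500)

# Market as x, stock as y, so the slope is the stock's beta.
fit = stats.linregress(market, fake_stock)
print(fit.slope, fit.intercept, fit.rvalue)
```

With noise this small relative to the market's own variation, the slope should sit near 1, the intercept near 0, and the correlation near 1.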