First Trading Algorithm

Pairs Trading

Pairs trading is a strategy that uses two highly correlated stocks. When one stock moves out of line with the other, the difference in price between the two can serve as a trading signal. It is an older strategy, classically used as an introduction to algorithmic trading. There is a fantastic full guide and write-up on Investopedia. I highly recommend reading the article in full before continuing; it is entertaining and informative!

Let's create our first basic trading algorithm! This is an exercise in using Quantopian, NOT a realistic representation of what a good algorithm is! Never use something as simple as this in the real world! This is an extremely simplified version of pairs trading; we won't be considering factors such as cointegration.

Part 1: Research

You can carry out the research process in your local environment or on the Quantopian platform. The code included below is for running in a local environment ONLY and is NOT meant to be used on the Quantopian platform! For that, please refer to the online notebook "Pairs Trading Research" on the Quantopian platform.

Imports

In [103]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline

import quandl

Get Data from Quandl

On the Quantopian platform, use get_pricing() instead.

United Airlines and American Airlines

In [104]:
# Note: these strings are read day-first; the output below starts on 2015-01-07
start = '07-01-2015'
end = '07-01-2017'
In [105]:
united = quandl.get('WIKI/UAL',start_date=start,end_date=end)
american = quandl.get('WIKI/AAL',start_date=start,end_date=end)
In [106]:
united.head()
Out[106]:
             Open   High    Low  Close     Volume  Ex-Dividend  Split Ratio  Adj. Open  Adj. High  Adj. Low  Adj. Close  Adj. Volume
Date
2015-01-07  64.96  66.11  64.00  65.53  5133939.0          0.0          1.0      64.96      66.11     64.00       65.53    5133939.0
2015-01-08  65.70  67.52  65.41  66.64  6889597.0          0.0          1.0      65.70      67.52     65.41       66.64    6889597.0
2015-01-09  66.76  66.97  64.90  65.34  3488027.0          0.0          1.0      66.76      66.97     64.90       65.34    3488027.0
2015-01-12  66.16  66.85  63.84  65.92  5246008.0          0.0          1.0      66.16      66.85     63.84       65.92    5246008.0
2015-01-13  66.84  68.26  65.45  66.41  6265791.0          0.0          1.0      66.84      68.26     65.45       66.41    6265791.0
In [107]:
american.head()
Out[107]:
             Open   High    Low  Close      Volume  Ex-Dividend  Split Ratio  Adj. Open  Adj. High   Adj. Low  Adj. Close  Adj. Volume
Date
2015-01-07  53.38  53.65  52.12  53.01  10069816.0          0.0          1.0  52.103884  52.367430  50.874006   51.742730   10069816.0
2015-01-08  53.48  54.28  53.25  53.66   9672064.0          0.0          1.0  52.201494  52.982369  51.976992   52.377191    9672064.0
2015-01-09  53.67  53.91  51.82  52.02  12290046.0          0.0          1.0  52.386951  52.621214  50.581178   50.776397   12290046.0
2015-01-12  51.06  51.45  49.20  49.58  18261336.0          0.0          1.0  49.839347  50.220023  48.023812   48.394728   18261336.0
2015-01-13  50.12  51.43  49.46  50.40  12259271.0          0.0          1.0  48.921819  50.200501  48.277597   49.195125   12259271.0
In [108]:
american['Adj. Close'].plot(label='American Airlines',figsize=(12,8))
united['Adj. Close'].plot(label='United Airlines')
plt.legend()
Out[108]:
<matplotlib.legend.Legend at 0x20a50c84400>

It looks like the two stocks move closely together!

Spread and Correlation

In [126]:
np.corrcoef(american['Adj. Close'],united['Adj. Close'])
Out[126]:
array([[ 1.        ,  0.92145101],
       [ 0.92145101,  1.        ]])

We're going to assume that, because the two stocks are highly correlated, any significant deviation of their price spread from its mean may be a trading opportunity.
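
One practical caveat is that np.corrcoef needs two arrays of equal length, so both price series must cover exactly the same dates. The sketch below shows one way to handle a missing trading day. It uses made-up numbers rather than market data, and the names stock_a, stock_b, aligned and spread_demo are hypothetical, not part of the notebook.

```python
import numpy as np
import pandas as pd

# Synthetic stand-ins for two adjusted-close series (made-up values)
dates = pd.date_range('2015-01-07', periods=6, freq='B')
stock_a = pd.Series([50.0, 51.0, 52.0, 51.5, 53.0, 54.0], index=dates)
# stock_b is missing one trading day, as can happen with real data
stock_b = pd.Series([65.0, 66.5, 66.0, 68.0, 69.0], index=dates.delete(2))

# Series.corr aligns both series on their shared dates first
corr = stock_a.corr(stock_b)

# Manual alignment, so np.corrcoef receives equal-length inputs
aligned = pd.concat([stock_a, stock_b], axis=1, keys=['a', 'b']).dropna()
corr_np = np.corrcoef(aligned['a'], aligned['b'])[0, 1]

# The spread is only defined on dates present in both series
spread_demo = aligned['a'] - aligned['b']
```

Both approaches should give the same correlation, because each drops the date that is missing from one of the series.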

Use plt.axhline() to draw a horizontal line at the mean of the spread.

In [117]:
spread = american['Adj. Close'] - united['Adj. Close']
spread.plot(label='Spread',figsize=(12,8))
plt.axhline(spread.mean(),c='r')
plt.legend()
Out[117]:
<matplotlib.legend.Legend at 0x20a51bc1710>

Normalizing with a z-score

In [119]:
def zscore(stocks):
    return (stocks - stocks.mean()) / np.std(stocks)
In [127]:
zscore(spread).plot(figsize=(14,8))
plt.axhline(zscore(spread).mean(), color='black')
plt.axhline(1.0, c='r', ls='--')
plt.axhline(-1.0, c='g', ls='--')
plt.legend(['Spread z-score', 'Mean', '+1', '-1']);

If the z-score ever dips below -1 (or some other arbitrary value below the mean) or peaks above +1 (or some other arbitrary value, e.g. 1.5), we expect a reversion to the mean: eventually the spread should move back toward its average. The reason is that these two stocks have been highly correlated in the past, so the normalized spread between them should not stay far from its mean for long.
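
As a rough illustration of the thresholds described above, the following sketch maps z-scores to a simple signal. The function name threshold_signal, the entry parameter and the sample values are assumptions made for illustration only; they are not part of the notebook.

```python
import pandas as pd

def threshold_signal(z, entry=1.0):
    """Map z-scores to -1 (short the spread), +1 (long the spread) or 0 (no signal)."""
    z = pd.Series(z, dtype=float)
    signal = pd.Series(0, index=z.index)
    signal[z > entry] = -1   # spread unusually high: expect it to fall back
    signal[z < -entry] = 1   # spread unusually low: expect it to rise back
    return signal

# Made-up z-scores, not market data
demo_z = [0.2, 1.3, -0.4, -1.7, 0.9]
demo_signal = threshold_signal(demo_z)
```

Raising entry (for example to 1.5) would give fewer but more extreme signals. The choice of threshold is arbitrary, as noted above.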

Rolling Z-Score

Our spread is currently American - United. Let's decide how to calculate the z-score on a rolling basis for use in Quantopian.

In [133]:
# 1 day moving average of the price spread (with a window of 1, this is just the spread itself)
spread_mavg1 = spread.rolling(1).mean()

# 30 day moving average of the price spread
spread_mavg30 = spread.rolling(30).mean()

# Take a rolling 30 day standard deviation
std_30 = spread.rolling(30).std()

# Compute the z score for each day
zscore_30_1 = (spread_mavg1 - spread_mavg30) / std_30

zscore_30_1.plot(figsize=(12, 8), label='Rolling 30 day Z score')
plt.axhline(0, color='black')
plt.axhline(1.0, color='red', linestyle='--');
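
The rolling calculation above can be wrapped in a small helper, sketched here under some assumptions. The function name rolling_zscore and its ddof parameter are hypothetical. The ddof parameter is exposed because pandas' rolling .std() defaults to the sample standard deviation (ddof=1), whereas np.std, which the strategy code in Part 2 uses, defaults to the population standard deviation (ddof=0). The two z-scores will therefore differ slightly.

```python
import numpy as np
import pandas as pd

def rolling_zscore(series, window=30, ddof=1):
    """Z-score of each value against a trailing window that includes it."""
    mean = series.rolling(window).mean()
    std = series.rolling(window).std(ddof=ddof)
    # Replace a zero standard deviation with NaN to avoid dividing by zero
    return (series - mean) / std.replace(0, np.nan)

# Made-up spread values with a short window, to keep the example small
demo_spread = pd.Series([1.0, 2.0, 3.0, 4.0, 10.0])
demo_z = rolling_zscore(demo_spread, window=3)
demo_z_pop = rolling_zscore(demo_spread, window=3, ddof=0)
```

The first window - 1 entries are NaN because a full window is not yet available, which is also why the plotted rolling z-score starts later than the price data.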

Part 2: Implementation of Strategy

WARNING: YOU SHOULD NOT ACTUALLY TRADE WITH THIS!

This part applies to the Quantopian platform ONLY!

There are two ways to get to the development environment:

  • Click on Q at the upper left corner and then click Start Coding
  • Click on Research and then click Algorithms
In [ ]:
import numpy as np
 
def initialize(context):
    """
    Called once at the start of the algorithm.
    """   
    
    # Every day we check the pair status
    schedule_function(check_pairs, date_rules.every_day(), time_rules.market_close(minutes=60))
    
    # Our Two Airlines
    context.aa = sid(45971)   # AAL (American Airlines)
    context.ual = sid(28051)  # UAL (United Airlines)
    
    # Flags to tell us if we're currently in a trade
    context.long_on_spread = False
    context.shorting_spread = False


def check_pairs(context, data):
    
    # For convenience
    aa = context.aa
    ual = context.ual
    
    # Get pricing history
    prices = data.history([aa, ual], "price", 30, '1d')
    
 
    # Need to use .iloc[-1:] to get dataframe instead of series
    short_prices = prices.iloc[-1:]
    
    # Get the long 30 day mavg
    mavg_30 = np.mean(prices[aa] - prices[ual])
    
    # Get the std of the 30 day long window
    std_30 = np.std(prices[aa] - prices[ual])
    
    # Get the shorter span 1 day mavg
    mavg_1 = np.mean(short_prices[aa] - short_prices[ual])
    
    # Compute z-score
    if std_30 > 0:
        zscore = (mavg_1 - mavg_30)/std_30
    
        # Our two entry cases
        if zscore > 0.5 and not context.shorting_spread:
            # spread = aa - ual
            order_target_percent(aa, -0.5) # short top
            order_target_percent(ual, 0.5) # long bottom
            context.shorting_spread = True
            context.long_on_spread = False
            
        elif zscore < -0.5 and not context.long_on_spread:
            # spread = aa - ual
            order_target_percent(aa, 0.5) # long top
            order_target_percent(ual, -0.5) # short bottom
            context.shorting_spread = False
            context.long_on_spread = True
            
        # Our exit case
        elif abs(zscore) < 0.1:
            order_target_percent(aa, 0)
            order_target_percent(ual, 0)
            context.shorting_spread = False
            context.long_on_spread = False
        
        record('zscore', zscore)
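
The entry and exit rules in check_pairs can be tried out away from Quantopian with a small pure-Python sketch. This is a simplified stand-in, not the platform code: the function name next_position, the single position variable (replacing the two boolean flags) and the sample z-scores are all assumptions for illustration, and no orders are placed.

```python
def next_position(zscore, position, entry=0.5, exit_band=0.1):
    """Return the new position given the current z-score.

    position: -1 = short the spread, +1 = long the spread, 0 = flat.
    """
    if zscore > entry and position != -1:
        return -1          # spread too high: short AAL, long UAL
    elif zscore < -entry and position != 1:
        return 1           # spread too low: long AAL, short UAL
    elif abs(zscore) < exit_band:
        return 0           # spread back near its mean: close both legs
    return position        # otherwise keep the current position

# Walk through a made-up sequence of z-scores, starting flat
demo_zscores = [0.0, 0.6, 0.3, 0.05, -0.7, -0.2, 0.8]
position = 0
history = []
for z in demo_zscores:
    position = next_position(z, position)
    history.append(position)
```

Note that a z-score between the exit band and the entry threshold (0.3 or -0.2 in the sample) leaves the position unchanged, which mirrors how the flags in check_pairs persist between days.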