A Robust Transferable Deep Learning Framework for Cross-sectional Investment Strategy

The views expressed here are our own and do not necessarily reflect the views of Nomura Asset management.
Any errors and inadequacies are our own.
The AAAI-20 Workshop on Knowledge Discovery from Unstructured Data in Financial Services
February 7th, 2020
Masaya Abe1
and Junpei Komiyama2
A Robust Transferable Deep Learning Framework
for Cross-sectional Investment Strategy
1. Nomura Asset Management Co., Ltd.
2. New York University
Kei Nakagawa1,
https://arxiv.org/pdf/1910.01491.pdfArXiv version (Full paper):

1. Introduction and Motivation
2. Data and Methodology
3. Experimental Results
4. Conclusion
Agenda
1

Agenda
4. Conclusion
2

ROE
1 Month Return
●
●
●
Score
Value
Growth
Quality
Momentum
・Linear Regression
Factor
Candidates
Factor Classification
by human
Calculate
Relative Goodness
Cross-sectional Investment Strategy
Relative Stock Returns
・Rank IC (Spearman correlation):
3
・ How average returns change with different stock attributes : Factor

RIC-NN: Our methodology
Score
Input Output
(3) Deep Transfer Learning
Loss
rank IC
Stop
Epoch
Time step t-1 Time step t
𝒗𝒊 𝒗 𝒇
(2) Weight Initialization and Stopping
Stop
𝒗 𝒇
Loss
(1) Multi-factor Deep Learning Approach
Score
Stock
Input Output
Score
Stock
Input Output
Source Domain
Target Domain
Transfer
Stock Factor
Stock
Factor
Stock
Stock
Factor
Initialization
𝒗𝒊
NorthAmericaAsiaPacific
Initialization
4

Score
Factor
Candidates Deep Learning
・
・
・
・
・
・
・
・
・
・
・
・
・
・
・
ROE
1 Month Return
●
●
●
RIC-NN: Multi-factor Deep Learning Approach
Calculate
Relative Goodness
5
・ Deep learning for cross sectional investment strategy

・ DL for stock return prediction easily overfits to training data.
✓ Use early stopping to control the fitness to the past data.
RIC-NN: Weight Initialization and Stopping
・Use RankIC (Spearman
correlation) in terms of the fitness.
-> intuitive and controllable.
Loss
rank IC
Epoch
Time step T-1 Time step T
Stop
Loss
Initialization
・ Epoch-based stopping,
Risk of overfitting or underfitting
because the training speed varies.0.20.16
Stopping: the rank IC reaches 0.20.
Initialization: Use the model of timestep t-1 when rank IC is 0.16.
c.f.: fitness of a good portfolio to
future return is around 0.10
Initialization
Our Proposed (RIC-NN)
✓ Training at time step t:
6

ScoreRIC-NN
Factor
Stock
Input Output
Score
Factor
Stock
Input Output
North America Stock Market Asia Pacific Stock Market
(Source Domain) (Target Domain)
Transfer
RIC-NN
・Augment the model using the knowledge of a larger market.
✓ Use transfer learning
RIC-NN: Deep Transfer Learning
We want to capture the asymmetric structure between the two markets.
7

Agenda
4. Conclusion
8

・ We use the 20 factors that are often used in practice.
✓ Calculated for each regional index constituents
- MSCI North America Index (NA)
- MSCI Pacific Index (PA)
Features (Various Factors)
No. Feature (Factor) No. Feature (Factor) No. Feature (Factor)
1 Book-to-market Ratio 8 Return on Invested Capital 15 EPS Revision(1 month)
2 Earnings-to-price Ratio 9 Accruals 16 EPS Revision(3 months)
3 Dividend Yield 10 Total Asset Growth Rate 17 Past Stock Return(1 month)
4 Sales-to-price Ratio 11 Current Ratio 18 Past Stock Return(12 months)
5 Cash flow-to-price Ratio 12 Equity Ratio 19 Volatility
6 Return on Equity 13 Total Asset Turnover Rate 20 Skewness
7 Return on Asset 14 CAPEX Growth Rate
※ Monthly data
Data sources: Namely, Compustat, WorldScope, Thomson Reuters, I/B/E/S and EXSHARE.
9

Features (Various Factors)
・ Cumulative returns in NA(left-side) and PA(right-side) on 20 factors
・ Return of each factor varies largely over time.
10

Problem Formulation
MSE 𝑡 =
1
𝐾
෍
𝑡′=𝑡−𝑁
𝑡−1
෍
𝑖∈𝑈 𝑡′
𝑟𝑖,𝑡′+1 − 𝑓 𝒗𝑖,𝑇; 𝜽 𝑇+1
2
𝑁 = 120 (10 years)
𝐾 = ෍
𝑡′=𝑡−𝑁
𝑡−1
𝑈 𝑡′
・ We define the problem as a regression problem to minimize MSE.
✓ Approximate function 𝑓 ∙ with the parameter 𝜽 𝑇+1 that maps 𝒗𝑖,𝑇 to 𝑟𝑖,𝑇+1
𝑓 𝒗𝑖,𝑇; 𝜽 𝑇+1 → 𝑟𝑖,𝑇+1
✓ Train the models using the data of the latest 120 time steps from the
past 10 years.
11
𝒗𝑖,𝑇 ：Augmented factors
𝜽 𝑇+1：NN Weights
𝑟𝑖,𝑇+1：Relative Stock return

✓ 20 factors:
Problem Formulation
・ Given a stock 𝑖 at month 𝑇 (𝑖 ∈ 𝑈 𝑇: a regional index constituents at 𝑇)
𝒙𝑖,𝑇 ∈ 𝑅20
✓ Features: 20 factors and preprocessed factors
𝑥/ 𝑅
𝑦 ≔ 2(𝑥 − 𝑦)/( 𝑥 + 𝑦 )
✓ Output variable: scaled one-month-ahead stock return
Scale to the range [0,1]
Pre-processing ＆ Feature augmentation
𝒗𝑖,𝑇 = (𝒙𝑖,𝑇, 𝒙𝑖,𝑇−3, … , 𝒙𝑖,𝑇−12, 𝒙𝑖,𝑇/ 𝑅
𝒙𝑖,𝑇−3, … , 𝒙𝑖,𝑇/ 𝑅
𝒙𝑖,𝑇−12) ∈ [0,1] 𝟏𝟖𝟎
𝑟𝑖,𝑇+1 ∈ [0,1]
12
・Most of factors are updated quarterly ・Time difference between the present and each quarter ago

・ Architecture of RIC-NN is quite standard
✓ Fully-connected feedforward neural networks
✓ 6 Hidden layers: { 150 – 150 – 100 – 100 – 50 – 50 }
Dropout rates: (50% – 50% – 30% – 30% – 10% – 10%)
✓ Activation function: ReLU function
✓ RIC-NN(Transfer Learning:TF)
Compared Models
・ Other off-the-shelf machine learning models
✓ Epoch-based Neural Network (NN(Epoch))
✓ Random Forest (RF)
✓ Ridge Regression (RR)
We use the weights of the first four layers that are trained
in the source region as the initial weight of the target region.
Our Proposed (RIC-NN)
13

Prediction Period
November 2004
:
December 1994
Scores
January 2005
Training
120 set
𝑓 𝒗𝑖,𝑡; 𝜽 𝒕+𝟏
∗
argmin
𝜽
MSE 𝑡
Features: 𝒗𝑖,𝑡 December 2004
𝒗𝑖,𝑡 𝜽 𝑡+1
∗
December 2004
:
January 1995
February 2005
January 2005
𝒗𝑖,𝑡 𝜽 𝑡+1
∗
・・・
October 2018
:
November 2008
December 2018
November 2018
𝜽 𝑡+1
∗𝒗𝑖,𝑡
14 years (168 months)
・ 14 years (from January 2005 to December 2018)
・ Updated by sliding one-month-ahead and carrying out a monthly forecast.
14

Agenda
4. Conclusion
15

・ Simple portfolio strategies
Performance Measure
✓ Long Portfolio Strategy
✓ We make quintile portfolios.
- Buy (Long) the top 1/5 score stocks with equal weighting
- Benchmark: the average return of all stocks
→ Relative performance evaluation
１２３４５
Relative goodness
Investment Universe
16

Performance Measure
・ We use the following (standard) measures.
✓ Alpha Return ≔ ς 𝑡=1
𝑇
1 + 𝛼 𝑡
12/𝑇 − 1
✓ 𝑇𝐸 Risk ≔
12
𝑇−1
𝛼 𝑡 − 𝜇 𝛼
2
✓ 𝐼𝑅 Return/Risk ≔ Alpha/𝑇𝐸
portfolio return – benchmark return
𝜇 𝛼: Average of Alpha
𝛼 𝑡:
✓ MaxDD Worst − case Loss ≔ min
𝑘∈[1,𝑇]
(0,
𝑊𝑘
𝑃𝑜𝑟𝑡
max
𝑗∈ 1,𝑘
𝑊𝑗
𝑃𝑜𝑟𝑡 − 1)
𝑊𝑘
𝑃𝑜𝑟𝑡
: Cumulative return of the portfolio
17

Experimental Results (1/2)
・ NA: RIC-NN without transfer learning performed best.
・ PA: RIC-NN with transfer learning performed best.
・ NA as a source domain enhances the performance of PA,
not vice versa.
MSCI North America
Linear
LR RF DL(Epoch) RIC-NN RIC-NN(TF from PF)
Alpha 0.62% 0.79% 0.82% 1.23% 1.20%
TE 5.40% 5.14% 4.48% 4.14% 4.43%
IR 0.11 0.15 0.18 0.30 0.27
MaxDD -21.84% -24.57% -17.41% -14.37% -20.57%
Long
Nonlinear
MSCI Pacific
Linear
LR RF DL(Epoch) RIC-NN RIC-NN(TF from NA)
Alpha 5.35% 3.79% 4.34% 5.25% 5.78%
TE 5.17% 5.75% 4.18% 4.20% 3.95%
IR 1.04 0.66 1.04 1.25 1.46
MaxDD -11.53% -11.43% -9.37% -7.51% -3.37%
Long
Nonlinear
18

Experimental Results (2/2)
・ While NN at epoch 50 performs better in NA, NN at epoch 60 performs better in PA.
・ NN(Epoch) is very sensitive to the choice of the epoch.
・ RIC-NN outperforms epoch-based stopping: rank IC controls the fitness of
the stock prediction models consistently.
MSCI North America
40 50 56 60 80
Alpha 1.23% 0.18% 1.48% 0.82% 1.25% 0.70%
TE 4.14% 4.52% 4.35% 4.48% 4.49% 4.14%
IR 0.30 0.04 0.34 0.18 0.28 0.17
MaxDD -14.37% -22.67% -13.48% -17.41% -20.98% -15.94%
Long RIC-NN
NN(Epoch)
MSCI Pacific
40 46 50 60 80
Alpha 5.25% 4.13% 4.34% 4.28% 4.52% 2.99%
TE 4.20% 4.36% 4.18% 4.73% 4.34% 4.06%
IR 1.25 0.95 1.04 0.90 1.04 0.74
MaxDD -7.51% -8.08% -9.37% -7.16% -7.45% -7.52%
NN(Epoch)
Long RIC-NN
*These epochs are chosen so that the rank IC reaches 0.20 during the training of the first time step.
*
*
19

・ We have proposed a new stock price prediction framework called RIC-NN
by introducing three practical ideas:
(1) A nonlinear multi-factor approach is better than a linear approach.
(2) Rank IC-based stopping outperforms epoch-based stopping.
(3) Multi-region transfer learning works well.
・ Better return of the portfolio, better control of the fitness of the model to
the past dataset.
Conclusion
20

A Robust Transferable Deep Learning Framework for Cross-sectional Investment Strategy

More Related Content

Similar to A Robust Transferable Deep Learning Framework for Cross-sectional Investment Strategy (20)

More from Kei Nakagawa (20)

Recently uploaded (20)

A Robust Transferable Deep Learning Framework for Cross-sectional Investment Strategy