Steven E. Pav
Opendoor
June 2, 2022
The Sharpe ratio (SR) is a sample statistic used to measure investment performance, defined as the sample mean of returns divided by the sample volatility of returns:
\frac{\hat{\mu}}{\sqrt{\hat{\sigma}^2}}.
The Sharpe ratio and Signal-Noise ratio are connected to three questions:
Example: “Sharpe is useless because returns are not normal.” Arguable for Las Vegas, work-arounds in Ivory Tower.
The Sharpe ratio has escaped academia and has a kind of currency among allocators and investors.
SharpeR::as.sr:
Mkt SMB HML UMD RF
Jan 1927 0.19 -0.56 4.83 0.44 0.25
Feb 1927 4.44 -0.10 3.17 -1.32 0.26
SR/sqrt(yr) Std. Error t value Pr(>t)
Mkt 0.61 0.10 6.0 1.8e-09 ***
UMD 0.48 0.10 4.6 2.1e-06 ***
SMB 0.23 0.10 2.2 0.014 *
HML 0.32 0.10 3.1 0.001 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Mkt SMB HML UMD RF
1926-11-03 0.213 -0.24 -0.28 0.57 0.013
1926-11-04 0.603 -0.15 0.69 -0.52 0.013
If returns were normal, the Sharpe ratio would follow a (non-central) t distribution, up to scaling:
\mathrm{Sharpe\,\,ratio} = \frac{\hat{\mu}}{\hat{\sigma}},\,\,\,\quad \mathrm{t\,\,statistic} = \sqrt{n}\frac{\hat{\mu}}{\hat{\sigma}}.
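The scaling relation above can be checked numerically. The following is a minimal base R sketch on simulated returns; the sample size, mean and volatility are arbitrary assumptions, not values from the Fama French data.

```r
# Simulated monthly returns; the mean and volatility are arbitrary assumptions.
set.seed(1234)
n <- 120
x <- rnorm(n, mean = 0.01, sd = 0.04)

# Sharpe ratio: sample mean over sample standard deviation, per period.
sr <- mean(x) / sd(x)

# The one sample t statistic for the null of zero mean is the
# Sharpe ratio rescaled by sqrt(n).
tstat <- sqrt(n) * sr
tref <- unname(t.test(x, mu = 0)$statistic)

# Annualize assuming 12 observations per year.
sr_annual <- sqrt(12) * sr

print(c(sharpe = sr, t = tstat, t.test = tref, annualized = sr_annual))
```

The two t values agree, which is why inference on the Sharpe ratio under normality can borrow the machinery of the t distribution.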
# using the simple asymptotic standard error:
zeta <- as.sr(mff4['2011-01-01::2020-12-31','Mkt'])
confint(zeta,type='t')
2.5 % 97.5 %
Mkt 0.3729 1.639
as.sr(...,higher_order=TRUE) computes and stores the moments needed by se, confint and predint.
One and two sample hypothesis testing via SharpeR::sr_test.
# higher order approximate standard error
print(sr_test(mff4[,'Mkt'],alternative='greater',zeta=0.3,ope=12,conf.level=0.95,type='Mertens'))
One Sample sr test, Mertens method
data: mff4[, "Mkt"]
t = 6, df = 1127, p-value = 0.001
alternative hypothesis: true signal-noise ratio is greater than 0.3
sample estimates:
[,1]
Mkt 0.6138
attr(,"names")
[1] "Sharpe ratio of mff4[, \"Mkt\"]"
SR/sqrt(yr) Std. Error t value Pr(>t)
Mkt 0.61 0.10 6.0 1.8e-09 ***
HML 0.32 0.10 3.1 0.001 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
print(sr_test(x=mff4[,'Mkt'],y=mff4[,'HML'],ope=12,alternative='two.sided',paired=TRUE,conf.level=0.95,type='Mertens'))
Paired sr-test
data: mff4[, "Mkt"] and mff4[, "HML"]
t = 2.3, df = 1127, p-value = 0.02
alternative hypothesis: true difference in signal-noise ratios is not equal to 0
sample estimates:
difference in Sharpe ratios
0.2959
We don’t always face binary buy/no-buy decisions; instead we can hold a portfolio of investments.
Just as \sqrt{n}\frac{\hat{\mu}}{\hat{\sigma}} = t in the univariate case, n \hat{\mu}^{\top}\hat{\Sigma}^{-1}\hat{\mu} = T^2, the Hotelling statistic, in the multivariate case.
response | classical statistics | quantitative investing |
---|---|---|
univariate | t statistic | Sharpe Ratio |
multivariate | T^2 statistic | Squared Optimal Sharpe Ratio |
(There’s another dimension to this table if you consider conditioning information!)
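The correspondence in the table can be illustrated in base R. This is a sketch on simulated returns (the dimensions, means and volatilities are assumptions): it computes the squared optimal Sharpe ratio, rescales it to Hotelling's T^2, and confirms that the in-sample Sharpe ratio of the sample Markowitz portfolio equals the optimal Sharpe ratio.

```r
# Hypothetical returns: n periods of k independent normal assets.
set.seed(2022)
n <- 500
k <- 4
X <- matrix(rnorm(n * k, mean = 0.02, sd = 0.1), nrow = n, ncol = k)

mu_hat <- colMeans(X)
Sigma_hat <- cov(X)

# Squared optimal Sharpe ratio: mu' Sigma^{-1} mu on sample quantities.
zeta2_hat <- drop(t(mu_hat) %*% solve(Sigma_hat, mu_hat))

# Hotelling's T^2 is the squared optimal Sharpe ratio scaled by n.
T2 <- n * zeta2_hat

# Under normality with zero mean, T^2 is a rescaled F(k, n - k) variate.
Fstat <- (n - k) / (k * (n - 1)) * T2
pval <- pf(Fstat, df1 = k, df2 = n - k, lower.tail = FALSE)

# The sample Markowitz portfolio attains sqrt(zeta2_hat) in sample.
w_hat <- solve(Sigma_hat, mu_hat)
port <- drop(X %*% w_hat)
sr_port <- mean(port) / sd(port)

print(c(zeta_hat = sqrt(zeta2_hat), sr_port = sr_port, T2 = T2, p = pval))
```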
Major divergence between Ivory Tower and Wall Street in this case:
Also: distribution of T^2 is less robust to assumptions than that of t.
That aside, we can compute \hat{\zeta}_*^2 and perform inference via SharpeR::as.sropt:
SR/sqrt(yr) SRIC/sqrt(yr) 2.5 % 97.5 % T^2 value Pr(>T^2)
Sharpe 1.07 1.04 0.84 1.26 107 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Again, inference is on \zeta^2_*, not the SNR of the sample Markowitz portfolio.
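The distinction can be made concrete with a small simulation. In this sketch the population mean and covariance are assumed known (hypothetical values), so we can compute the true signal-noise ratio of the sample Markowitz portfolio and compare it to the population optimum \zeta_*, which it can never exceed.

```r
# Assumed population parameters (hypothetical, per period).
set.seed(42)
k <- 4
n <- 240
mu <- rep(0.01, k)
Sigma <- diag(0.05^2, k)

# Population optimal signal-noise ratio.
zeta_star <- sqrt(drop(t(mu) %*% solve(Sigma, mu)))

# Draw a sample of normal returns with these parameters.
Z <- matrix(rnorm(n * k), nrow = n, ncol = k)
X <- Z %*% chol(Sigma) + matrix(mu, nrow = n, ncol = k, byrow = TRUE)

mu_hat <- colMeans(X)
Sigma_hat <- cov(X)

# Sample optimal Sharpe ratio and sample Markowitz portfolio.
zeta_hat <- sqrt(drop(t(mu_hat) %*% solve(Sigma_hat, mu_hat)))
w_hat <- solve(Sigma_hat, mu_hat)

# True SNR of the sample portfolio, using the population mu and Sigma.
snr_w_hat <- drop(t(w_hat) %*% mu) / sqrt(drop(t(w_hat) %*% Sigma %*% w_hat))

print(c(zeta_star = zeta_star, zeta_hat = zeta_hat, snr_w_hat = snr_w_hat))
```

By the Cauchy-Schwarz inequality the last quantity is bounded above by \zeta_*, whatever the sample, while \hat{\zeta}_* can land on either side of it.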
MarkowitzR::mp_vcov:
library(MarkowitzR)
mp <- MarkowitzR::mp_vcov(rets)
knitr::kable(rbind(round(t(mp$W),4),paste0('(',round(sqrt(diag(mp$What)),4),')')),
caption="Sample Markowitz portfolio weights and std.errs.")
 | Mkt | HML | SMB | UMD |
---|---|---|---|---|
Intercept | 0.0435 | 0.043 | 0.0049 | 0.06 |
 | (0.0075) | (0.0109) | (0.0098) | (0.0098) |
Freshman Quant Question: (Why) Would an investor pay for the sample Markowitz portfolio?
Markowitz is a Las Vegas result: it does not specify how \vec{\mu}_t, \Sigma_t are estimated. Use features \vec{f}_{t-1} to predict them?
Flattening: Explicitly look for a portfolio linear in \vec{f}_{t-1}. (Brandt and Santa-Clara (2006))
Conditional Expectation Model: yields another connection to classical statistics, via the Multivariate General Linear Hypothesis (MGLH). (Pav (2021))
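Mkt.
# the four Fama French factor returns
rets <- mff4[,c('Mkt','HML','SMB','UMD')]
library(fromo)
# features: lagged 6 period running mean and lagged log 12 period running volatility of Mkt
momentum <- 0.1*dplyr::lag(fromo::running_mean(rets[,'Mkt'],window=6),1) # don't forget the lag!!
vola <- log(dplyr::lag(fromo::running_sd(rets[,'Mkt'],window=12),1))
# rescale the volatility feature by its median
vola <- vola / median(vola,na.rm=TRUE)
# flatten: multiply each feature by each asset's returns, then drop the first two rows
flattened <- cbind(setNames(rets,paste0('intercept_',colnames(rets))),
setNames(momentum*rets,paste0('momentum_',colnames(rets))),
setNames(as.numeric(vola)*rets,paste0('vola_',colnames(rets))))[-c(1,2),]
print(SharpeR::as.sropt(flattened))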
SR/sqrt(yr) SRIC/sqrt(yr) 2.5 % 97.5 % T^2 value Pr(>T^2)
Sharpe 1.3 1.2 1.0 1.4 153 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Use case: Suppose you think an asset’s returns cannot be forecast, or you want no exposure to some assets.
Signal-noise optimization of portfolio with hedge constraint: \max_{\vec{w} : \mathrm{G}\Sigma\vec{w} = \vec{0}} \frac{\vec{w}^{\top}\vec{\mu}}{\sqrt{\vec{w}^{\top}\Sigma\vec{w}}}.
Rows of \mathrm{G} are “hedged out”. This problem is solved by c\left(\vec{w}_{*,\mathrm{I}} - \vec{w}_{*,\mathrm{G}}\right), where we define \vec{w}_{*,\mathrm{A}} = \mathrm{A}^{\top}\left(\mathrm{A}\Sigma\mathrm{A}^{\top}\right)^{-1}\mathrm{A}\vec{\mu} for matrix \mathrm{A}. “Projected Markowitz”.
This portfolio has squared signal-noise ratio \Delta = \zeta^2_{*,\mathrm{I}} - \zeta^2_{*,\mathrm{G}}, where \zeta^2_{*,\mathrm{A}} = \vec{\mu}^{\top}\mathrm{A}^{\top}\left(\mathrm{A}\Sigma\mathrm{A}^{\top}\right)^{-1}\mathrm{A}\vec{\mu}. “Projected squared SNR”.
If this quantity were zero, then all of the SNR of the assets would be captured in the rows of \mathrm{G}. (Kan and Zhou (2012))
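The projected Markowitz formulas above can be verified numerically. The sketch below uses base R with an assumed mean vector and covariance matrix (hypothetical values) and two illustrative helper functions, proj_markowitz and proj_snr2, which are not part of SharpeR. It checks that the difference portfolio satisfies the hedge constraint and that its squared signal-noise ratio equals \Delta.

```r
# w_{*,A} = A' (A Sigma A')^{-1} A mu  (illustrative helper, not from SharpeR)
proj_markowitz <- function(A, mu, Sigma) {
  drop(t(A) %*% solve(A %*% Sigma %*% t(A), A %*% mu))
}

# zeta^2_{*,A} = mu' A' (A Sigma A')^{-1} A mu  (illustrative helper)
proj_snr2 <- function(A, mu, Sigma) {
  Amu <- A %*% mu
  drop(t(Amu) %*% solve(A %*% Sigma %*% t(A), Amu))
}

# Assumed population parameters for four assets (hypothetical values).
k <- 4
mu <- c(0.006, 0.003, 0.002, 0.006)
vols <- c(0.05, 0.03, 0.03, 0.04)
Rho <- matrix(0.2, nrow = k, ncol = k)
diag(Rho) <- 1
Sigma <- diag(vols) %*% Rho %*% diag(vols)

Id <- diag(k)
# Hedge out the fourth asset; drop = FALSE keeps G a 1 x k matrix.
G <- Id[4, , drop = FALSE]

# Hedged portfolio, up to the positive constant c.
w <- proj_markowitz(Id, mu, Sigma) - proj_markowitz(G, mu, Sigma)

# Hedge constraint G Sigma w = 0.
hedge <- drop(G %*% Sigma %*% w)

# Squared SNR of w against Delta = zeta^2_{*,I} - zeta^2_{*,G}.
snr2 <- sum(w * mu)^2 / drop(t(w) %*% Sigma %*% w)
Delta <- proj_snr2(Id, mu, Sigma) - proj_snr2(G, mu, Sigma)

print(c(hedge = hedge, snr2 = snr2, Delta = Delta))
```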
UMD) from Fama French 4 factor returns using SharpeR::as.del_sropt:
SR/sqrt(yr) SRIC/sqrt(yr) 2.5 % 97.5 % T^2 value Pr(>T^2)
Sharpe 1.07 1.04 0.84 1.26 107 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# now hedge against UMD:
G <- diag(ncol(rets))[colnames(rets)=='UMD',]
print(SharpeR::as.del_sropt(rets,G=G))
SR/sqrt(yr) 2.5 % 97.5 % F value Pr(>F)
Sharpe 0.95 0.73 1.2 28 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
SR/sqrt(yr) SRIC/sqrt(yr) 2.5 % 97.5 % T^2 value Pr(>T^2)
Sharpe 1.3 1.2 1.0 1.4 153 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# now hedge against intercept features:
G <- diag(ncol(flattened))[grepl('^intercept_',colnames(flattened)),]
print(dim(G))
[1] 4 12
print(SharpeR::as.del_sropt(flattened,G=G))
SR/sqrt(yr) 2.5 % 97.5 % F value Pr(>F)
Sharpe 0.7 0.41 0.86 5.3 1.7e-06 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Recall that inference on \zeta^2_{*} may not be sufficient for success because of mis-estimation of \vec{w}_*.
Two tools:
Sharpe Ratio Information Criterion (SRIC) is an approximately unbiased estimator of the SNR of \hat{w}_*, defined as
SRIC = \hat{\zeta}_* - \frac{k-1}{n\hat{\zeta}_*}.
(Paulsen and Söhl (2020))
Caution: estimating this via cross-validation can be biased!
Similarly defined approximate confidence bounds for the SNR of \hat{w}_* of the form
b_{\alpha} = \hat{\zeta}_* - \frac{f\left(k,\alpha;...\right)}{n\hat{\zeta}_*}.
The function f\left(k,\alpha;\cdot\right) depends on the unknown {\zeta}_* but can be estimated from \hat{\zeta}_*. (Pav (2020))
The probability that the SNR of \hat{w}_* is below b_{\alpha} is approximately \alpha.
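The SRIC formula above is simple arithmetic once units are handled. The sketch below is a hand-rolled helper named sric_sketch (a hypothetical name, not the SharpeR::sric implementation) that converts an annualized optimal Sharpe ratio to per-period units, applies the correction, and annualizes again. The inputs are taken from the printed output for the four factors: \hat{\zeta}_* = 1.07 per square root year and k = 4, with n = 1128 monthly observations assumed from the df = 1127 reported by the one sample test.

```r
# Illustrative SRIC computation; zeta_hat is annualized, ope is observations per year.
sric_sketch <- function(zeta_hat, k, n, ope = 1) {
  # convert to per-period units, where the formula is stated
  zeta_pp <- zeta_hat / sqrt(ope)
  # SRIC = zeta - (k - 1) / (n * zeta)
  sric_pp <- zeta_pp - (k - 1) / (n * zeta_pp)
  # convert back to annualized units
  sqrt(ope) * sric_pp
}

sric_val <- sric_sketch(1.07, k = 4, n = 1128, ope = 12)
print(round(sric_val, 2))
```

With these rounded inputs the result is about 1.04, in line with the SRIC column of the printed output. The penalty grows with the number of assets k and shrinks with the sample size n.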
SharpeR::sric (also in the print method):
SR/sqrt(yr) SRIC/sqrt(yr) 2.5 % 97.5 % T^2 value Pr(>T^2)
Sharpe 1.07 1.04 0.84 1.26 107 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
[,1]
[1,] 1.037
SharpeR::asnr_confint:
rets <- mff4[,c('Mkt','HML','SMB','UMD')]
zs <- (SharpeR::as.sropt(rets))
print(asnr_confint(zs,level.lo=0.025,level.hi=0.975)) # currently in dev branch!
2.5 % 97.5 %
[1,] 0.7954 1.203
SharpeR and MarkowitzR.
Buy my book, or download “Short Sharpe Course”.
Performance Estimation with the Sharpe Ratio