Technische Universität München
Department of Mathematics
Master’s Thesis
Bayesian Analysis of the One-Factor Copula
Model with Applications to Finance
Benedikt Schamberger
Supervisor: Prof. Claudia Czado, Ph.D.
Advisors: Lutz Gruber, Prof. Claudia Czado, Ph.D.
Submission Date: 09.04.2015
I hereby declare that this thesis is my own work and that no other sources have been used
except those clearly indicated and referenced.
Munich, 09.04.2015
Abstract
Vine copulas allow high-dimensional dependence structures to be modeled from
two-dimensional building blocks. The related factor copula models use the same
principle, with the difference that the variables are assumed to be driven by factors
that cannot be observed. These two models, together with further fundamentals needed
for their application, are presented and then compared using European stock returns.
In addition, for the special case of one latent factor, a Bayesian analysis of the factor
copula model is developed using a Markov chain Monte Carlo method. Several variants
of the method are compared in a simulation study and, alongside vine copulas, are used
in an application study to forecast the risk measures Value-at-Risk and expected shortfall
for a portfolio.
Contents
1 Introduction
2 Probability theory and copulas
2.1 Basics and distributions
2.2 Copulas
2.2.1 Basic properties
2.2.2 Elliptical copulas
2.2.3 Archimedean copulas
2.2.4 Extreme value copulas
2.3 Measures of association
2.4 Tail dependence
2.4.1 Bivariate tail dependence
2.4.2 Multivariate tail dependence
2.4.3 Conditional tail dependence
2.5 Vines and vine copulas
2.5.1 Graph theory notation
2.5.2 Regular vines
2.5.3 Regular vine distributions and copulas
3 Statistical methods and backtesting
3.1 Assessment of model fit
3.2 Model backtests
3.2.1 Value-at-Risk
3.2.2 Expected shortfall
4 Sampling Methods
4.1 Markov chain Monte Carlo methods
4.1.1 Gibbs sampler
4.1.2 Metropolis-Hastings algorithm
4.2 Metropolis-Hastings within Gibbs
4.2.1 Mode and curvature matching
4.2.2 Expectation and variance matching
4.2.3 Independence and random walk samplers
4.3 Adaptive Rejection Metropolis Sampling
4.3.1 Acceptance-Rejection Sampling
4.3.2 Adaptive Rejection Sampling
4.3.3 Adaptive Rejection Metropolis Sampling
5 Marginal models
5.1 ARMA-GARCH processes
5.1.1 Definition
5.1.2 Quasi-maximum likelihood estimation
5.1.3 Implementation
5.2 Dynamic linear models
6 Factor copula models
6.1 Model formulation and examples
6.1.1 One-factor copula model
6.1.2 Two-factor copula model
6.1.3 Factor copula models with more than two factors
6.2 Properties of the one- and two-factor copula model
6.2.1 Dependence properties
6.2.2 Tail properties
7 Bayesian analysis of the one-factor copula model
7.1 Preliminaries
7.2 Prior for copula parameters
7.3 Prior for latent variables
7.4 Likelihood for observations
7.5 Posteriors and full conditional densities
8 Simulation study
8.1 Simulated data
8.1.1 Scenarios
8.1.2 Simulation
8.2 Marginal maximum likelihood estimation
8.3 Gibbs sampler
8.3.1 Starting values
8.3.2 ARMS within Gibbs
8.3.3 Metropolis-Hastings within Gibbs
8.3.4 Code validation
8.4 Results
8.4.1 Marginal maximum likelihood estimation
8.4.2 Gibbs sampling with Gumbel linking copulas
8.4.3 Thinned individual ARMS
8.4.4 Posterior mode estimation
8.4.5 Credible intervals for the latent variable
8.4.6 Gibbs sampling with Gaussian and survival Gumbel linking copulas
8.5 Conclusions
9 Empirical study
9.1 Historical data
9.2 Marginal models
9.2.1 In-sample ARMA-GARCH
9.2.2 Out-of-sample ARMA-GARCH
9.2.3 Time-varying ARMA DLM
9.3 Dependence models
9.3.1 Setup of vine copula models
9.3.2 Vine copula models with index data
9.3.3 Vine copula models without index data
9.3.4 Setup of factor copula models
9.3.5 One-factor copula model
9.3.6 Two-factor copula model
9.3.7 Comparison of vine and factor copula models
9.4 Bayesian inference for the one-factor copula model
9.4.1 Behavior of the latent variable over time
9.4.2 Joint distribution of the latent variable and bank stocks
9.5 Value-at-Risk and expected shortfall forecasts
9.5.1 Portfolio composition
9.5.2 Value-at-Risk and expected shortfall backtests
10 Conclusions and outlook
A Simulation study: Figures and tables
A.1 Traceplots and histograms
A.2 Boxplots of latent variable
A.3 Acceptance rate and effective sample size
A.4 Posterior mode estimation
B Empirical study: Figures and tables
B.1 In-sample ARMA-GARCH
B.2 Time-varying ARMA DLM
B.3 Vine copula models
B.3.1 ARMA-GARCH vine copula fit tables
B.3.2 DLM vine copula fit tables
B.4 Estimated parameters of the one- and two-factor copula models
B.5 Bayesian inference for the one-factor copula model
B.5.1 Trace and density plots of years 2005 to 2013
B.5.2 Trace and density plots of individual years
B.5.3 Posterior mode estimates
B.5.4 Behavior of the latent variable over time
B.6 Value-at-Risk and expected shortfall forecasts
B.6.1 Vine copula fits of individual years
List of Figures
List of Tables
List of Algorithms
1 Introduction
After the initial work of Sklar (1959), it took several decades for copula-based
dependence models to rise to prominence. Although the copula approach to multivariate
dependence offers a plethora of possibilities, copulas also require a high degree of skill
in implementation. In the majority of financial applications, Gaussian and Student’s t
copulas are applied to model the dependence in portfolios, even though these models
often fail to account for asymmetries or tail dependence in the data (see Embrechts,
2009). Consequently, some of the blame for the financial crisis was attributed to copula
models, since they failed to accurately depict the risk of joint failures (see Salmon, 2012).
More general high-dimensional copulas suffer from increased theoretical and computational
complexity, which motivated the more recent introduction of vine copulas in a series of
works by Joe (1997), Bedford and Cooke (2002), Kurowicka and Cooke (2006) and Aas
et al. (2009). Vine copulas are a flexible class of copula models that build higher-dimensional
dependence structures out of well-studied bivariate building blocks in a pair-copula
construction, circumventing the use of a single high-dimensional copula.
In many applications, the dependence of observed events can also be explained by un-
observed latent variables, such as economic factors. Generalizing a result of Joe (2011),
Krupskii and Joe (2013) suggest the use of factor copula models that are related to vine
copulas, when the assumption of an underlying factor structure is justified. Factor copula
models generalize multivariate normal models that explain the dependence in random
variables by a linear relation of a few normally distributed latent factors. Similarly to
vine copulas, bivariate copula building blocks are used to achieve an adaptable overarching
structure.
The aim of this Master’s thesis is to compare the performance of vine and factor copula
models based on a data set consisting of a time series of financial stock returns. As part
of modeling multivariate data with copulas, marginal time series models are examined
in a first step, before the multivariate dependence structure can be analyzed. In order
to facilitate a full Bayesian inference of the stock returns, a Markov chain Monte Carlo
method is derived for the factor copula model, in the case when only one underlying
latent factor is assumed. In conjunction with marginal dynamic linear models (DLM),
this enables forecasting of key risk measures such as Value-at-Risk and expected short-
fall. Furthermore, Bayesian analysis of the factor copula model allows the study of the
unobserved variable, providing additional information about historical random events.
The remainder of this thesis is structured in the following way. Section 2 states some fun-
damental results from probability theory, provides definitions of distributions and copula
families, and gives an introduction into copula and vine copula theory. Additionally,
measures to quantify association and tail behavior are discussed. In Section 3 statistical
methods, tests and backtests are presented, which are used to assess the goodness-of-fit
of models and to evaluate forecasting performance. General sampling methods with fo-
cus on Metropolis-Hastings sampling in the context of a Gibbs sampler are introduced
in Section 4. Section 5 defines classical autoregressive moving average (ARMA) models
with generalized autoregressive conditional heteroskedasticity (GARCH) errors and Bayesian
DLMs, which are used as marginal time series models for the financial data set. Factor
copula models and some of their most important properties constitute the content of Sec-
tion 6. As a special case of the general factor copula models, a Bayesian analysis of the
one-factor copula model is derived in Section 7 and validated in a simulation study in
Section 8. Finally, in Section 9, all of the previous considerations are combined in a study
of an empirical data set of daily log returns. The performance of vine copula and factor
copula models for multivariate dependence is compared and the Bayesian forecast of risk
measures analyzed.
2 Probability theory and copulas
This section provides a brief overview of results from probability theory that are used
in proofs in the later parts of the thesis, as well as a short introduction to copulas and
important measures of dependence. Finally, some basics of the more involved vine copula
theory are provided.
Concerning notation, a probability space $(\Omega, \mathcal{F}, P)$ is assumed for all random variables
and will not be mentioned explicitly in the following. E[X] and Var(X) denote the expectation
and variance of a random variable X ∼ F, if they exist, where F is the
(cumulative) distribution function or density of X. ran X denotes the range of the random
variable X, i.e., the set of values that X can attain.
Section 2.1 contains some results from probability theory and the definitions of named
distributions that occur in this thesis. In Section 2.2 fundamentals of copula theory,
including the seminal theorem by Sklar, and parametric families that are common in
practice are presented. Sections 2.3 and 2.4 focus on measuring the differences in the
dependence structure of vectors of random variables and their relation to copulas. Finally,
Section 2.5 demonstrates how bivariate copulas can be used in pair-copula constructions
to build dependence structures in high dimensions.
2.1 Basics and distributions
Some results from probability theory, and distributions that are used in the later parts of
the thesis are given in the following. Thorough introductions to probability theory can
be found in, e.g., Gut (2009) and Durrett (2010).
Definition 2.1 (Mode and curvature). Let $f$ be a twice continuously differentiable density
with support $S$. Then, the mode $x_m$ of $f$ is defined as
$$x_m := \operatorname*{arg\,max}_{x \in S} f(x),$$
and the curvature $c$ at point $x \in S$ as
$$c(x) := f''(x).$$
Note that $x_m$ need not be unique.
Theorem 2.2 (Continuous law of total probability). Let $X \sim F_X$, $E[X] < \infty$ and
$Y \sim F_Y$ be continuous random variables. Then, for all $x \in \operatorname{ran} X$ and $y \in \operatorname{ran} Y$
$$P(X \le x) = \int_{-\infty}^{\infty} E\left[\mathbf{1}_{\{X \le x\}} \mid Y = y\right] dF_Y(y) = \int_{-\infty}^{\infty} F_{X|Y}(x \mid y)\, dF_Y(y).$$
Proof. See (Gut, 2009, Theorem 2.1).
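As a numerical illustration of Theorem 2.2, the following Python sketch checks the identity for a toy model that is not taken from the thesis: Y ∼ U(0,1) and X | Y = y ∼ U(0,y), so that F_{X|Y}(x | y) = min(x/y, 1) and therefore P(X ≤ x) = x(1 − ln x) for x ∈ (0,1). The function names, the midpoint rule and the sample sizes are assumptions made for illustration only.

```python
import math
import random

def cond_cdf(x, y):
    """F_{X|Y}(x | y) for X | Y = y ~ U(0, y) and x >= 0 (toy model, an assumption)."""
    return min(x / y, 1.0)

def total_probability(x, n=100_000):
    """Midpoint-rule approximation of the integral of F_{X|Y}(x | y) dF_Y(y) for Y ~ U(0, 1)."""
    h = 1.0 / n
    return sum(cond_cdf(x, (k + 0.5) * h) for k in range(n)) * h

def monte_carlo(x, n=100_000, seed=1):
    """Empirical P(X <= x), simulating Y first and then X given Y."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        y = rng.random()
        hits += rng.uniform(0.0, y) <= x
    return hits / n

x = 0.3
exact = x * (1.0 - math.log(x))   # closed form for this toy model
integral = total_probability(x)   # right-hand side of Theorem 2.2
simulated = monte_carlo(x)        # left-hand side, estimated by simulation
print(exact, integral, simulated)
```

The quadrature value and the simulated frequency should both lie close to the closed form, up to discretization and Monte Carlo error.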
Important distributions and a few of their properties that are used later in the thesis are
given in the following.
Definition 2.3 (Bernoulli distribution). A random variable $X \sim \mathrm{Ber}(p)$ with probability
mass function
$$P(X = i) = p^i (1-p)^{1-i}, \quad i = 0,1,\ p \in [0,1],$$
has a Bernoulli distribution with success probability $p$.
Definition 2.4 (Uniform distribution). A random variable $U \sim U(a,b)$ with density
$$f(x;a,b) = \frac{1}{b-a}\,\mathbf{1}_{\{a \le x \le b\}}, \quad x \in \mathbb{R},\ a < b,\ a,b \in \mathbb{R},$$
where $\mathbf{1}$ is the indicator function, has a (continuous) uniform distribution on the interval
$[a,b]$.
If $a = 0$ and $b = 1$, $U(0,1)$ is called the standard uniform distribution. $U(0,1)$ is frequently
used for sampling random variates, and it is the marginal distribution of copulas.
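Both roles of U(0,1) mentioned above can be illustrated with a short Python sketch. It assumes an exponential distribution with rate `rate`, which is chosen only for illustration and is not part of this section. Inverse transform sampling maps uniform variates to exponential ones, and the probability integral transform maps them back to the uniform scale on which copulas are defined.

```python
import math
import random

def exp_cdf(x, rate):
    """Distribution function of the exponential distribution (illustrative choice)."""
    return 1.0 - math.exp(-rate * x)

def exp_quantile(u, rate):
    """Inverse of exp_cdf for u in [0, 1)."""
    return -math.log(1.0 - u) / rate

rng = random.Random(42)
rate = 2.0
uniforms = [rng.random() for _ in range(50_000)]      # U ~ U(0, 1)
samples = [exp_quantile(u, rate) for u in uniforms]   # X = F^{-1}(U)
pit = [exp_cdf(x, rate) for x in samples]             # F(X) is again U(0, 1)

sample_mean = sum(samples) / len(samples)             # should be close to 1 / rate
pit_mean = sum(pit) / len(pit)                        # should be close to 1 / 2
print(sample_mean, pit_mean)
```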
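Definition 2.5 (Beta distribution). A random variable $X \sim \mathrm{Beta}(\alpha,\beta)$ with density
$$f(x;\alpha,\beta) = \frac{1}{B(\alpha,\beta)}\, x^{\alpha-1}(1-x)^{\beta-1}, \quad x \in [0,1],\ \alpha,\beta > 0,$$
has a Beta distribution with shape parameters $\alpha$ and $\beta$. $B$ is the beta function defined
as
$$B(x,y) = \int_0^1 t^{x-1}(1-t)^{y-1}\,dt, \quad x,y > 0.$$
Expectation and variance of $X \sim \mathrm{Beta}(\alpha,\beta)$ are given by
$$E[X] = \frac{\alpha}{\alpha+\beta}, \quad \mathrm{Var}(X) = \frac{\alpha\beta}{(\alpha+\beta)^2(\alpha+\beta+1)},$$
respectively (see Gut, 2009, p. 283), and mode and curvature by
$$x_m(\alpha,\beta) = \frac{\alpha-1}{\alpha+\beta-2}, \quad \alpha,\beta > 1,$$
and
$$c(x,\alpha,\beta) = \frac{1}{B(\alpha,\beta)}\, x^{\alpha-3}(1-x)^{\beta-3} \Big( \alpha^2 + x^2(\alpha+\beta-3)(\alpha+\beta-2) - 2(\alpha-1)x(\alpha+\beta-3) - 3\alpha + 2 \Big),$$
respectively. For $\alpha = \beta = 1$, the Beta distribution coincides with the uniform distribution $U(0,1)$.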
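The closed-form mode and curvature of the Beta density can be checked numerically. The following Python sketch compares the curvature formula with a central finite-difference approximation of the second derivative at the mode. The parameter values, the step size `h` and all function names are assumptions made for illustration.

```python
import math

def beta_fn(a, b):
    """Beta function B(a, b), computed via the gamma function."""
    return math.gamma(a) * math.gamma(b) / math.gamma(a + b)

def beta_pdf(x, a, b):
    """Density of Beta(a, b) at x in (0, 1)."""
    return x ** (a - 1) * (1 - x) ** (b - 1) / beta_fn(a, b)

def beta_mode(a, b):
    """Mode of Beta(a, b); valid for a, b > 1."""
    return (a - 1) / (a + b - 2)

def beta_curvature(x, a, b):
    """Closed-form second derivative of the Beta(a, b) density at x."""
    poly = (a ** 2
            + x ** 2 * (a + b - 3) * (a + b - 2)
            - 2 * (a - 1) * x * (a + b - 3)
            - 3 * a + 2)
    return x ** (a - 3) * (1 - x) ** (b - 3) * poly / beta_fn(a, b)

def second_difference(f, x, h=1e-4):
    """Central finite-difference approximation of f''(x)."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / h ** 2

a, b = 3.0, 5.0                                   # illustrative shape parameters
x_m = beta_mode(a, b)
closed = beta_curvature(x_m, a, b)                # negative at a maximum
numeric = second_difference(lambda x: beta_pdf(x, a, b), x_m)
print(x_m, closed, numeric)
```

The curvature is negative at the mode, as expected for a maximum.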
Definition 2.6 (Gamma distribution). A random variable $X \sim G(s,r)$ with density
$$f(x;s,r) = \frac{r^s}{\Gamma(s)}\, x^{s-1} e^{-rx}, \quad x > 0,\ r,s > 0,$$
has a Gamma distribution with shape $s$ and rate $r$. $\Gamma$ is the gamma function defined as
$$\Gamma(t) = \int_0^\infty x^{t-1} e^{-x}\,dx, \quad t > 0.$$
Expectation and variance of $X \sim G(s,r)$ are given by
$$E[X] = \frac{s}{r}, \quad \mathrm{Var}(X) = \frac{s}{r^2},$$
respectively (see Gut, 2009, p. 283).
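In the shape-rate parametrization of Definition 2.6, the density implies mean s/r and variance s/r². The following Python sketch compares these values with Monte Carlo estimates. The parameter values, seed and sample size are assumptions made for illustration. Note that `random.gammavariate` from the standard library takes a shape and a scale parameter, so the rate has to be inverted.

```python
import random

def gamma_moments(shape, rate):
    """Mean and variance of a Gamma distribution in the shape-rate parametrization."""
    return shape / rate, shape / rate ** 2

shape, rate = 2.0, 4.0                       # illustrative values
rng = random.Random(7)
# gammavariate expects (shape, scale); the scale is the reciprocal of the rate.
draws = [rng.gammavariate(shape, 1.0 / rate) for _ in range(100_000)]

mean_mc = sum(draws) / len(draws)
var_mc = sum((d - mean_mc) ** 2 for d in draws) / len(draws)
mean_exact, var_exact = gamma_moments(shape, rate)
print(mean_exact, mean_mc, var_exact, var_mc)
```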