Humboldt-Universität zu Berlin - High Dimensional Nonstationary Time Series

Abstracts

IRTG 1792 DP Abstracts

IRTG1792DP2018 042

On Complete Convergence in Marcinkiewicz-Zygmund Type SLLN for END Random Variables and its Applications

Ji Gao YAN


Abstract
In this paper, the complete convergence for maximal weighted sums of extended negatively
dependent (END, for short) random variables is investigated. Some sufficient conditions
for the complete convergence and some applications to a nonparametric model are provided. The
results obtained in the paper generalise and improve the corresponding ones of Wang et al. (2014b)
and Shen, Xue, and Wang (2017).

Keywords:
Complete convergence; Maximal weighted sums; Extended negatively dependent.

JEL Classification:
C00

MSC(2010) Subject Classification:
60F15

IRTG1792DP2018 043

Textual Sentiment and Sector specific reaction

Elisabeth Bommes
Cathy Yi-Hsuan Chen
Wolfgang Karl Härdle


Abstract
News moves markets and contains incremental information about stock
reactions. Future trading volumes, volatility and returns are affected
by sentiments of texts and opinions expressed in articles. Earlier
work on sentiment distillation of stock news suggests that risk profile reactions
might differ across sectors.
Conventional asset pricing theory recognizes the role of a sector and its
risk uniqueness that differs from market or firm-specific risk.
Our research assesses whether incorporating the sentiment distilled from
sector-specific news carries information about risk profiles. Textual analytics applied to about 600K
articles leads us, via lexical projection and machine learning, to a classification of sentiment polarities. The
texts are scraped from official NASDAQ web pages and, with Natural Language Processing (NLP)
techniques such as tokenization and lemmatization, a sector-specific sentiment is extracted using a lexical
approach and a financial phrase bank. Predicted sentence-level polarities are aggregated into a bullishness
measure on a daily basis and fed into a panel regression analysis with sector indicators. Supervised
learning with hinge or logistic loss and regularization yields good prediction results of polarity. Compared with
standard lexical projections, the supervised learning approach yields superior predictions of sentiment,
leading to highly sector-specific sentiment reactions. The Consumer Staples, Health Care and Materials
sectors show strong risk profile reactions to negative polarity.

Keywords:
Investor Sentiment, Attention Analysis, Sector-specific Reactions, Volatility, Text Mining, Polarity

JEL Classification:
C81, G14, G17
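
As a rough illustration of the aggregation step described in the abstract above, the sketch below turns sentence-level polarity labels into one bullishness score per day. The log-ratio form ln((1 + n_pos) / (1 + n_neg)) is a common convention in the sentiment literature and is an assumption here, since the abstract does not state the exact formula used in the paper. The function name and the input layout are hypothetical.

```python
import math
from collections import defaultdict


def daily_bullishness(labelled_sentences):
    """Aggregate sentence-level polarities into one score per day.

    labelled_sentences: iterable of (date_string, polarity) pairs, where
    polarity is +1 (positive), -1 (negative) or 0 (neutral).
    The log-ratio formula is an assumed convention, not the paper's
    confirmed definition.
    """
    counts = defaultdict(lambda: [0, 0])  # date -> [n_pos, n_neg]
    for date, polarity in labelled_sentences:
        pos_neg = counts[date]  # registers the date even for neutral sentences
        if polarity > 0:
            pos_neg[0] += 1
        elif polarity < 0:
            pos_neg[1] += 1
    return {
        date: math.log((1 + n_pos) / (1 + n_neg))
        for date, (n_pos, n_neg) in counts.items()
    }
```

A positive score indicates more positive than negative sentences on that day, and a day with only neutral sentences scores zero. Daily scores of this kind could then enter a panel regression as a regressor.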

IRTG1792DP2018 044

Understanding Cryptocurrencies

Wolfgang Karl Härdle
Campbell R. Harvey
Raphael C. G. Reule


Abstract
Cryptocurrencies are a type of digital cash that uses distributed ledger, or
blockchain, technology to provide secure transactions. These currencies are generally
misunderstood. Although they were initially dismissed as fads or bubbles, many large central
banks are considering launching their own versions of national cryptocurrencies. In
contrast to most data in financial economics, there is a plethora of detailed (free)
data on the history of every transaction for the cryptocurrency complex. Further,
there is little empirically-oriented research on this new asset class. This is an extraordinary
research opportunity for academia. We provide a starting point by
giving an insight into cryptocurrency mechanisms, detailing summary statistics
and focusing on potential future research avenues in financial economics.

Keywords:
Cryptocurrency, Blockchain, Bitcoin, Economic bubbles, Peer-to-Peer, Cryptographic
hashing, Consensus, Proof-of-Work, Proof-of-stake, Volatility

JEL Classification:
C01, C58, E42, E51, G10, K24, K42, L86, O31

IRTG1792DP2018 045

Predictive Ability of Similarity-based Futures Trading Strategies

Hsin-Yu Chiu
Mi-Hsiu Chiang
Wei-Yu Kuo


Abstract
A trading rule that draws on the empirical similarity concept is proposed to simulate the
technical trading mentality, one that selectively perceives structural resemblances between
market scenarios of the present and the past. In more than half of the nineteen futures
markets that we test against for profitability of this similarity-based trading rule, we find
evidence of predictive ability that is robust to data-snooping and transaction-cost adjustments.
When aided by an exit strategy that liquidates the trader's positions across some
evenly-spaced time points, this rule generates the most robust returns.

Keywords:
empirical similarity; technical trading; futures markets; analogical reasoning

JEL Classification:
G11, G12

IRTG1792DP2018 046

Forecasting the Term Structure of Option Implied Volatility: The Power of an Adaptive Method

Ying Chen
Qian Han
Linlin Niu

Abstract
We model the term structure of implied volatility (TSIV) with an adaptive approach
to improve predictability, which treats dynamic time series models of globally
time-varying but locally constant parameters and uses a data-driven procedure to find the
local optimal interval. We choose two specifications of the adaptive models: a simple
local AR (LAR) model for a univariate implied volatility series and an adaptive dynamic
Nelson-Siegel (ADNS) model of three factors, each based on an LAR, to model the
cross-section of the TSIV simultaneously with parsimony. Both LAR and ADNS models
uniformly outperform more than a dozen alternative models with significance across
maturities for 1-20 day forecast horizons. Measured by RMSE and MAE, the forecast
errors of the random walk model can be reduced by between 20% and 60% for the 5 to
20 days ahead forecast. In terms of prediction accuracy of future directional changes,
the adaptive models achieve an accuracy range of 60%-90%, which strictly dominates
the range of 30%-59% of the alternative models.

Keywords:
Term structure of implied volatility, local parametric models, forecasting

JEL Classification:
C32, C53
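
For readers unfamiliar with the three-factor structure mentioned in the abstract above, the following sketch evaluates the standard Nelson-Siegel loadings in the dynamic (Diebold-Li style) parameterisation. It is only an illustration of the factor structure, applied here to a term structure in general; the adaptive interval selection and the LAR dynamics of the paper are not shown, and the function names and the decay parameter `lam` are assumptions.

```python
import math


def nelson_siegel_loadings(tau, lam):
    """Return (level, slope, curvature) loadings at maturity tau > 0.

    lam > 0 is the decay parameter; its value is an illustrative
    assumption, not taken from the paper.
    """
    x = lam * tau
    slope = (1.0 - math.exp(-x)) / x
    curvature = slope - math.exp(-x)
    return 1.0, slope, curvature


def nelson_siegel_curve(tau, factors, lam):
    """Fitted value at maturity tau for factors (level, slope, curvature)."""
    level, slope, curvature = nelson_siegel_loadings(tau, lam)
    b1, b2, b3 = factors
    return b1 * level + b2 * slope + b3 * curvature
```

The level loading is constant across maturities, the slope loading decays from one towards zero, and the curvature loading is hump-shaped, which is why three factors can describe a whole cross-section parsimoniously.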

IRTG1792DP2018 053

The impact of temperature on gaming productivity: evidence from online games

Xiaojia Bao
Qingliang Fan

Abstract
This paper studies the short-run impacts of temperature on human performance in the
computer-mediated environment using server logs of a popular online game in China.
Taking advantage of the quasi-experiment of winter central heating policy in China, we
distinguish the impacts of outdoor and indoor temperature and find that low temperatures
below 5 °C decrease game performance significantly. Non-experienced players
suffer a larger performance drop than experienced ones. Access to central heating
attenuates negative impacts of low outdoor temperatures on gamers’ performance.
High temperatures above 21 °C also lead to drops in game performance. We conclude
that expanding the current central heating zone will bring an increase in human performance
by approximately 4% in Shanghai and surrounding provinces in the winter.
While often perceived as a leisure activity, online gaming requires intense engagement
and the deployment of cognitive, social, and motor skills, which are also key skills
for productive activities. Our results draw attention to potential damages of extreme
temperature on human performance in the modern computer-mediated environment.

Keywords:
Temperature, Human performance, Online game, Heating

JEL Classification:
Q54, J22, J24, D03

IRTG1792DP2018 047

Inferences for a Partially Varying Coefficient Model With Endogenous Regressors

Zongwu Cai
Ying Fang
Ming Lin
Jia Su

Abstract
In this article, we propose a new class of semiparametric instrumental variable models with partially varying
coefficients, in which the structural function has a partially linear form and the impact of endogenous
structural variables can vary over different levels of some exogenous variables. We propose a three-step
estimation procedure to estimate both functional and constant coefficients. The consistency and asymptotic
normality of these proposed estimators are established. Moreover, a generalized F-test is developed to
test whether the functional coefficients are of particular parametric forms with some underlying economic
intuitions, and furthermore, the limiting distribution of the proposed generalized F-test statistic under the
null hypothesis is established. Finally, we illustrate the finite sample performance of our approach with
simulations and two real data examples in economics.

Keywords:
Endogeneity; Functional coefficients; Generalized F-test; Instrumental variables models;
Nonparametric test; Profile least squares

JEL Classification:
C00

IRTG1792DP2018 048

A Regime Shift Model with Nonparametric Switching Mechanism

Haiqiang Chen
Yingxing Li
Ming Lin
Yanli Zhu

Abstract
In this paper, we propose a new class of regime shift models with a flexible switching
mechanism that relies on a nonparametric probability function of the observed threshold
variables. The proposed models generally embrace traditional threshold models
with contaminated threshold variables or heterogeneous threshold values, thus gaining
more power in handling complicated data structures. We solve the identification issue by
imposing either a global shape restriction or a boundary condition on the nonparametric
probability function. We utilize the natural connection between penalized splines and
hierarchical Bayes to conduct smoothing. By adopting different priors, our procedure
works well for the estimation of smooth curves as well as discontinuous curves with
occasional structural breaks. Bayesian tests for the existence of threshold effects are
also conducted based on the posterior samples from Markov chain Monte Carlo (MCMC)
methods. Both simulation studies and an empirical application in predicting
the U.S. stock market returns demonstrate the validity of our methods.

Keywords:
Threshold Model, Nonparametric, Markov Chain Monte Carlo, Bayesian
Inference, Spline.

JEL Classification:
C00

IRTG1792DP2018 049

Strict Stationarity Testing and GLAD Estimation of Double Autoregressive Models

Shaojun Guo
Dong Li
Muyi Li

Abstract
In this article we develop a tractable procedure for testing strict stationarity in a
double autoregressive model and formulate the problem as testing if the top Lyapunov
exponent is negative. Without strict stationarity assumption, we construct a consistent
estimator of the associated top Lyapunov exponent and employ a random weighting
approach for its variance estimation, which in turn are used in a t-type test. We also
propose a GLAD estimation for parameters of interest, relaxing key assumptions on
the commonly used QMLE. All estimators, except for the intercept, are shown to be
consistent and asymptotically normal in both stationary and explosive situations. The
finite-sample performance of the proposed procedures is evaluated via Monte Carlo
simulation studies and a real dataset of interest rates is analyzed.

Keywords:
DAR model, GLAD estimation, Nonstationarity, Random weighting, Strict
stationarity testing.

JEL Classification:
C15, C22

IRTG1792DP2018 050

Variable selection and direction estimation for single-index models via DC-TGDR method

Wei Zhong
Xi Liu
Shuangge Ma

Abstract
This paper is concerned with selecting important covariates
and estimating the index direction simultaneously for
high dimensional single-index models. We develop an efficient
Threshold Gradient Directed Regularization method
via maximizing Distance Covariance (DC-TGDR) between
the single index and response variable. Due to the appealing
property of distance covariance which can measure nonlinear
dependence between random variables, the proposed
method avoids estimating the unknown link function of the
single index and dramatically reduces computational complexity
compared to other methods that use smoothing techniques.
It keeps the model-free advantage from the view of
sufficient dimension reduction and requires neither predictors
nor response variable to be continuous. In addition, the
DC-TGDR method encourages a grouping effect. That is,
it is capable of choosing highly correlated covariates in or
out of the model together. We examine finite-sample performance
of the proposed method by Monte Carlo simulations.
In a real data analysis, we identify important copy number
alterations (CNAs) for gene expression.

Keywords:
Distance covariance, High-dimensional data, Threshold gradient directed regularization,
Single-index models, Variable selection.

JEL Classification:
C00

IRTG1792DP2018 051

Variable selection and direction estimation for single-index models via DC-TGDR method

Honglin Wang
Fan Yu
Yinggang Zhou

Abstract
The conventional wisdom that housing prices are the present value of future
rents ignores the fact that unlike dividends on stocks, rent is not discretionary.
Housing price uncertainty can affect household property investments, which
in turn affect rent. By extending the theory of investment under uncertainty, we
model the renter’s decision to buy a house and the landlord’s decision to sell
as the exercising of real options of waiting and examine real options effects on
rent. Using data from Hong Kong and mainland China, we find a significant
effect of housing price on rent and draw important policy implications.

Keywords:


JEL Classification:
C00

IRTG1792DP2018 052

Nonparametric Additive Instrumental Variable Estimator: A Group Shrinkage Estimation Perspective

Qingliang Fan
Wei Zhong

Abstract
In this article, we study a nonparametric approach regarding a general nonlinear reduced form equation
to achieve a better approximation of the optimal instrument. Accordingly, we propose the nonparametric
additive instrumental variable estimator (NAIVE) with the adaptive group Lasso. We theoretically demonstrate
that the proposed estimator is root-n consistent and asymptotically normal. The adaptive group
Lasso helps us select the valid instruments while the dimensionality of potential instrumental variables is
allowed to be greater than the sample size. In practice, the degree and knots of B-spline series are selected
by minimizing the BIC or EBIC criteria for each nonparametric additive component in the reduced form
equation. In Monte Carlo simulations, we show that the NAIVE has the same performance as the linear
instrumental variable (IV) estimator for the truly linear reduced form equation. On the other hand, the
NAIVE performs much better in terms of bias and mean squared errors compared to other alternative
estimators under the high-dimensional nonlinear reduced form equation. We further illustrate our method
in an empirical study of international trade and growth. Our findings provide

Keywords:
Adaptive group Lasso; Instrumental variables; Nonparametric additive model; Optimal
estimator; Variable selection.

JEL Classification:
C00

IRTG1792DP2018 054

Topic Modeling for Analyzing Open-Ended Survey Responses

Andra-Selina Pietsch
Stefan Lessmann

Abstract
Open-ended responses are widely used in market research studies. Processing of such
responses requires labor-intensive human coding. This paper focuses on unsupervised topic
models and tests their ability to automate the analysis of open-ended responses. Since
state-of-the-art topic models struggle with the shortness of open-ended responses, the paper considers
three novel short text topic models: Latent Feature Latent Dirichlet Allocation, Biterm Topic
Model and Word Network Topic Model. The models are fitted and evaluated on a set of
real-world open-ended responses provided by a market research company. Multiple components
such as topic coherence and document classification are quantitatively and qualitatively
evaluated to appraise whether topic models can replace human coding. The results suggest that
topic models are a viable alternative for open-ended response coding. However, their
usefulness is limited when a correct one-to-one mapping of responses and topics or the exact
topic distribution is needed.

Keywords:
Market research, open-ended responses, text analytics, short text topic models

JEL Classification:

IRTG1792DP2018 055

Estimation of the discontinuous leverage effect: Evidence from the NASDAQ order book

Markus Bibinger
Christopher Neely
Lars Winkelmann

Abstract
An extensive empirical literature documents a generally negative relation, named the “leverage
effect,” between asset returns and changes of volatility. It is more challenging to establish
such a return-volatility relationship for jumps in high-frequency data. We propose new nonparametric
methods to assess and test for a discontinuous leverage effect — i.e. a covariation
between contemporaneous jumps in prices and volatility. The methods are robust to market
microstructure noise and build on a newly developed price-jump localization and estimation
procedure. Our empirical investigation of six years of transaction data from 320 NASDAQ
firms displays no unconditional negative covariation between price and volatility cojumps.
We show, however, that there is a strong and significant discontinuous leverage effect if
one conditions on the sign of price jumps and whether the price jumps are market-wide or
idiosyncratic.

Keywords:
High-frequency data, market microstructure, news impact, market-wide jumps, price jump, volatility jump

JEL Classification:
C13, C58

IRTG1792DP2018 056

Estimation of the discontinuous leverage effect: Evidence from the NASDAQ order book

Daniel Traian Pele
Miruna Mazurencu-Marinescu-Pele

Abstract
In this paper we investigate the statistical properties of cryptocurrencies by using alpha-stable distributions. We also study the benefits of Metcalfe's law (the value of a network is proportional to the square of the number of connected users of the system) for the evaluation of cryptocurrencies. As the results showed a potential for herding behaviour, we used LPPL models to capture the behaviour of cryptocurrency exchange rates during an endogenous bubble and to predict the most probable time of the regime switching.

Keywords:
cryptocurrency, Bitcoin, CRIX, Log-Periodic Power Law, Metcalfe’s law, stable distribution

JEL Classification:
C22, C32, C51, C53, C58, E41, E42, E47, E51, G1, G17
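
Metcalfe's law as stated in the abstract above (value proportional to the square of the number of connected users) can be written as V = k * n^2. The sketch below evaluates that relation and estimates the proportionality constant k by least squares through the origin. The function names and the sample numbers in the usage note are hypothetical, and the paper's own estimation approach is not given in the abstract.

```python
def metcalfe_value(users, k=1.0):
    """Network value under Metcalfe's law: V = k * users**2."""
    return k * users ** 2


def fit_metcalfe_constant(user_counts, values):
    """Least-squares estimate of k in V = k * n**2 (no intercept).

    Minimising sum((V_i - k * n_i**2)**2) over k gives
    k = sum(V_i * n_i**2) / sum(n_i**4).
    """
    numerator = sum(v * n ** 2 for n, v in zip(user_counts, values))
    denominator = sum(n ** 4 for n in user_counts)
    if denominator == 0:
        raise ValueError("need at least one non-zero user count")
    return numerator / denominator
```

For example, `fit_metcalfe_constant([10, 20, 40], [50.0, 200.0, 800.0])` recovers a constant of 0.5, and doubling the user count in `metcalfe_value` quadruples the implied value.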

IRTG1792DP2018 057

Trending Mixture Copula Models with Copula Selection

Bingduo Yang
Zongwu Cai
Christian M. Hafner
Guannan Liu

Abstract
Modeling the joint tails of multiple financial time series has important
implications for risk management. Classical models for dependence often encounter a lack
of fit in the joint tails, calling for additional flexibility. In this paper we introduce a new
nonparametric time-varying mixture copula model, in which both weights and
dependence parameters are deterministic functions of time. We propose penalized trending
mixture copula models with group smoothly clipped absolute deviation (SCAD)
penalty functions to do the estimation and copula selection simultaneously. Monte Carlo
simulation results suggest that the shrinkage estimation procedure performs well in
selecting and estimating both constant and trending mixture copula models. Using the
proposed model and method, we analyze the evolution of the dependence among four
international stock markets, and find substantial changes in the levels and patterns of
the dependence, in particular around crisis periods.

Keywords:
Copula, Time-Varying Copula, Mixture Copula, Copula Selection

JEL Classification:

IRTG1792DP2018 058

Investing with cryptocurrencies - evaluating the potential of portfolio allocation strategies

Alla Petukhina
Simon Trimborn
Wolfgang Karl Härdle
Hermann Elendner

Abstract
The market capitalization of cryptocurrencies has risen rapidly during the
last few years. Despite their high volatility, this fact has spurred growing
interest in cryptocurrencies as an alternative investment asset for portfolio and
risk management. We characterise the effects of adding cryptocurrencies in addition
to traditional assets to the set of eligible assets in portfolio management.
Out-of-sample performance and diversification benefits are studied for the most
popular portfolio-construction rules, including mean-variance optimization,
risk-parity, and maximum-diversification strategies, as well as combined strategies.
To account for the frequently low liquidity of cryptocurrency markets
we incorporate the LIBRO method, which gives suitable liquidity constraints.
Our results show that cryptocurrencies can improve the risk-return profile of
portfolios. In particular, cryptocurrencies are more useful for portfolio strategies
with higher target returns; they do not play a role in minimum-variance
portfolios. However, a maximum-diversification strategy (maximising the Portfolio
Diversification Index, PDI) draws appreciably on cryptocurrencies, and
spanning tests clearly indicate that cryptocurrency returns are non-redundant
additions to the investment universe.

Keywords:
cryptocurrency, CRIX, investments, portfolio management, asset
classes, blockchain, Bitcoin, altcoins, DLT

JEL Classification:
C01, C58, G11
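
To make one of the portfolio-construction rules named in the abstract above concrete, the sketch below computes minimum-variance weights for the two-asset case, where a closed form exists: w1 = (var2 - cov) / (var1 + var2 - 2 * cov). This is a textbook simplification under the assumptions of two assets, full investment and no short-selling or liquidity constraints. It is not the paper's setup, which covers many assets and LIBRO liquidity constraints, and the function name is hypothetical.

```python
def min_variance_weights_two_assets(var1, var2, cov):
    """Weights (w1, w2) minimising portfolio variance with w1 + w2 = 1.

    Portfolio variance is w1**2 * var1 + w2**2 * var2 + 2 * w1 * w2 * cov.
    Setting the derivative with respect to w1 to zero gives the closed form
    below. Weights are unconstrained, so a negative weight (a short
    position) is possible.
    """
    denominator = var1 + var2 - 2.0 * cov
    if denominator <= 0:
        raise ValueError("assets are perfectly correlated with equal variance")
    w1 = (var2 - cov) / denominator
    return w1, 1.0 - w1
```

With an uncorrelated pair in which the second asset is far more volatile (say variances 0.01 and 0.81), almost all weight goes to the first asset, which gives some intuition for why a highly volatile asset class may play little role in a minimum-variance portfolio.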

IRTG1792DP2018 059

Towards the interpretation of time-varying regularization parameters in streaming penalized regression models

Lenka Zbonakova
Ricardo Pio Monti
Wolfgang Karl Härdle

Abstract
High-dimensional, streaming datasets are ubiquitous in modern applications.
Examples range from finance and e-commerce to the study of biomedical and
neuroimaging data. As a result, many novel algorithms have been proposed to
address challenges posed by such datasets. In this work, we focus on the use of L1-
regularized linear models in the context of (possibly non-stationary) streaming
data. Recently, it has been noted that the choice of the regularization parameter
is fundamental in such models and several methods have been proposed which
iteratively tune such a parameter in a time-varying manner, thereby allowing
the underlying sparsity of estimated models to vary. Moreover, in many applications,
inference on the regularization parameter may itself be of interest, as
such a parameter is related to the underlying sparsity of the model. However, in
this work, we highlight and provide extensive empirical evidence regarding how
various (often unrelated) statistical properties in the data can lead to changes
in the regularization parameter. In particular, through various synthetic experiments,
we demonstrate that changes in the regularization parameter may be
driven by changes in the true underlying sparsity, signal-to-noise ratio or even
model misspecification. The purpose of this letter is, therefore, to highlight and
catalog various statistical properties which induce changes in the associated regularization
parameter. We conclude by presenting two applications: one relating
to financial data and another to neuroimaging data, where the aforementioned
discussion is relevant.

Keywords:
Lasso, penalty parameter, stock prices, neuroimaging

JEL Classification:
C13, C15, C63
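
The link between the regularization parameter and sparsity discussed in the abstract above is easiest to see in a special case. Assuming an orthonormal design and the objective (1/2) * ||y - Xb||^2 + lam * ||b||_1, the L1-regularized (Lasso) estimate is the soft-thresholded least-squares coefficient. The sketch below shows that special case only; it is not the streaming, time-varying procedure studied in the paper, and the function names are hypothetical.

```python
def soft_threshold(z, lam):
    """Soft-thresholding operator: shrink z towards zero by lam."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_orthonormal(ols_coefs, lam):
    """Lasso solution when the design matrix has orthonormal columns.

    ols_coefs: least-squares coefficients (X^T y under orthonormality).
    lam: non-negative regularization parameter.
    """
    return [soft_threshold(b, lam) for b in ols_coefs]


def n_nonzero(coefs):
    """Number of coefficients the penalty has not set to zero."""
    return sum(1 for b in coefs if b != 0.0)
```

Raising `lam` can only shrink or zero out coefficients, so the number of non-zero entries is non-increasing in `lam`. In this simple setting a change in the selected parameter maps directly to a change in sparsity, whereas the abstract argues that in general other properties of the data can also drive such changes.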

IRTG1792DP2018 061

PLUG-IN $L_2$-UPPER ERROR BOUNDS IN DECONVOLUTION, FOR A MIXING DENSITY ESTIMATE IN $R^d$ AND FOR ITS DERIVATIVES

Yannis G. Yatracos

Abstract
In deconvolution in $R^d$, $d \ge 1$, with mixing density $p\ (\in P)$ and kernel $h$, the mixture
density $f_p\ (\in F_p)$ can always be estimated with $f_{\hat{p}_n}$, $\hat{p}_n \in P$, via Minimum Distance
Estimation approaches proposed herein, with calculation of $f_{\hat{p}_n}$'s upper $L_1$-error rate, $a_n$,
in probability or in risk; $h$ is either known or unknown, and $a_n$ decreases to zero with $n$. In
applications, $a_n$ is obtained when $P$ consists either of products of $d$ densities defined on
a compact, or $L_1$ separable densities in $R$ with their differences changing sign at most $J$
times; $J$ is either known or unknown. When $h$ is known and $p$ is $\tilde{q}$-smooth, vanishing
outside a compact in $R^d$, plug-in upper bounds are then also provided for the $L_2$-error
rate of $\hat{p}_n$ and its derivatives, respectively, in probability or in risk; $\tilde{q} \in R^+$, $d \ge 1$. These
$L_2$-upper bounds depend on $h$'s Fourier transform, $\tilde{h}\ (\neq 0)$, and have rates
$(\log a_n^{-1})^{-N_1}$ and $a_n^{N_2}$, respectively, for $h$ super-smooth and smooth; $N_1 > 0$, $N_2 > 0$. For the typical
$a_n \sim (\log n)^{\zeta} \cdot n^{-\delta}$, the former (logarithmic) rate bound is optimal for any $\delta > 0$ and
the latter misses the optimal rate by the factor $(\log n)^{\xi}$ when $\delta = .5$; $\zeta > 0$, $\xi > 0$. The
exponents $N_1$ and $N_2$ appear also in optimal rates and lower error and risk bounds in the
deconvolution literature.

Keywords:


JEL Classification:

IRTG1792DP2018 060

RESIDUAL'S INFLUENCE INDEX (RINFIN), BAD LEVERAGE AND UNMASKING IN HIGH DIMENSIONAL L2-REGRESSION

Yannis G. Yatracos

Abstract
In linear regression of $Y$ on $X\ (\in R^p)$ with parameters $\beta\ (\in R^{p+1})$,
statistical inference is unreliable when observations are obtained
from the gross-error model, $F_{\epsilon,G} = (1-\epsilon)F + \epsilon G$, instead of the assumed
probability $F$; $G$ is the gross-error probability, $0 < \epsilon < 1$. When $G$ is unit
mass at $(x, y)$, Residual's Influence Index, RINFIN$(x, y; \epsilon, \beta)$, measures
the difference in small $x$-perturbations of the $L_2$-residual, $r(x, y)$,
for model $F$ and for $F_{\epsilon,G}$ via $r$'s $x$-partial derivatives. Asymptotic
properties are presented for sample RINFIN that is successful in
extracting indications for influential and bad leverage cases in microarray
data and simulated, high dimensional data. Its performance
improves as $p$ increases and it can also be used in multiple response
linear regression. RINFIN's advantage is that, whereas in influence
functions of $L_2$-regression coefficients each $x$-coordinate and $r(x, y)$
appear in a sum as product with moderate size when $(x, y)$ is a bad
leverage case and masking makes $r(x, y)$ nearly vanish, RINFIN's
$x$-partial derivatives convert the product in sum allowing for unmasking.

Keywords:
Big Data, Data Science, Influence Function, Leverage, Masking, Residual's Influence Index (RINFIN)

JEL Classification:

AMS 2010 subject classifications:
62-07, 62-09, 62J05, 62F35, 62G35