Abstracts
IRTG1792DP2018 062
Conversion uplift in e-commerce: A systematic benchmark of modeling strategies
Robin Gubela
Artem Bequé
Fabian Gebert
Stefan Lessmann
Abstract
Uplift modeling combines machine learning and experimental strategies to estimate the differential
effect of a treatment on individuals' behavior. The paper considers uplift models in the scope of
marketing campaign targeting. Literature on uplift modeling strategies is fragmented across academic
disciplines and lacks an overarching empirical comparison. Using data from online retailers,
we fill this gap and contribute to the literature by consolidating prior work on uplift modeling
and systematically comparing the predictive performance and utility of available uplift modeling
strategies. Our empirical study includes three experiments in which we examine the interaction
between an uplift modeling strategy and the underlying machine learning algorithm to implement
the strategy, quantify model performance in terms of business value and demonstrate the advantages
of uplift models over response models, which are widely used in marketing. The results
support specific recommendations on how to deploy uplift models in e-commerce applications.
Keywords:
e-commerce analytics, machine learning, uplift modeling, real-time targeting
JEL Classification:
IRTG1792DP2018 063
Causal Inference using Machine Learning. An Evaluation of recent Methods through Simulations
Daniel Jacob
Stefan Lessmann
Wolfgang Karl Härdle
Abstract
The estimation of a causal parameter in a high-dimensional setting where the
functions are potentially complex is a challenging task. Parametric and linear
modelling is not sufficient to generate unbiased and consistent estimators. Modern
approaches, therefore, use machine learning (ML) algorithms to learn these nuisance
functions. However, this leads to new problems like the regularization bias or
overfitting that are common when using ML models.
This paper considers different novel methods that overcome these problems or at
least address them. These methods differ in terms of the target parameter, namely
the average treatment effect of the population, group heterogeneity or the conditional
average treatment effect for each individual. Each method is first investigated and
tested separately; the methods are then compared with one another. To do this in a
disciplined manner, simulations with synthetic data are used. This ensures that all
distributions of the generated treatment effect parameters are known. The findings
are that each method has its limits in terms of unbiased estimation, the detection
of heterogeneity and also the determination of which covariates are responsible for
different causal effects.
Keywords:
causal inference, machine learning, simulation study, sample-splitting,
double machine learning, sorted group ATE (GATES), causal tree
JEL Classification:
C01, C14, C31, C63
IRTG1792DP2018 064
Semiparametric Estimation and Variable Selection for Single-index Copula Models
Bingduo Yang
Christian M. Hafner
Guannan Liu
Wei Long
Abstract:
A copula model with a flexibly specified dependence structure can be useful for capturing the complexity and heterogeneity in economic and financial time series. However, there is little methodological guidance for the specification process using copulas. This paper helps fill this gap by considering the recently proposed single-index copulas, for which we propose a simultaneous estimation and variable selection procedure. The proposed method selects the most relevant state variables from a comprehensive set using penalized estimation, and we derive its large-sample properties. Simulation results demonstrate the good performance of the proposed method in selecting the appropriate state variables and estimating the unknown index coefficients and dependence parameters. An application of the new procedure identifies six macroeconomic driving factors for the dependence among U.S. housing markets.
Keywords:
Semiparametric Copula, Single-Index Copula, Variable Selection, SCAD
JEL Classification:
C14, C22
IRTG1792DP2018 065
Price Management in the Used-Car Market: An Evaluation of Survival Analysis
Alexander Born
Nikoleta Kovachka
Stefan Lessmann
Hsin-Vonn Seow
Abstract:
Second-hand car markets generate billions of euros in turnover each year but
hardly any profit for used car dealers. The paper examines the potential of
sophisticated data-driven pricing systems to enhance supplier-side decision-
making and escape the zero-profit-trap. Profit maximization requires an accurate
understanding of demand. The paper identifies factors that characterize consumer
demand and proposes a framework to estimate demand functions using survival
analysis. Empirical analysis of a large data set of daily used car sales between
2008 and 2012 confirms the merit of the new factors. Observed results also show
the value of survival analysis to explain and predict demand. Random survival
forest emerges as the most suitable vehicle to develop price response functions
as input for a dynamic pricing system.
Keywords:
Automotive Industry, Price Optimization, Survival Analysis, Dynamic Pricing
JEL Classification:
IRTG1792DP2018 066
Deep learning-based cryptocurrency sentiment construction
Sergey Nasekin
Cathy Yi-Hsuan Chen
Abstract:
We study investor sentiment on a non-classical asset, cryptocurrencies, using a
“crypto-specific lexicon” recently proposed in Chen et al. (2018) and statistical
learning methods. We account for context-specific information and word similarity
by learning word embeddings via a neural network-based Word2Vec model. On top of
pre-trained word vectors, we apply popular machine learning methods such as
recursive neural networks for sentence-level classification and sentiment index
construction. We perform this analysis on a novel dataset of 1220K messages
related to 425 cryptocurrencies posted on the microblogging platform StockTwits
during the period between March 2013 and May 2018. The constructed sentiment
indices are value-relevant in terms of their return and volatility predictability
for the cryptocurrency market index.
Keywords:
sentiment analysis, lexicon, social media, word embedding, deep learning
JEL Classification:
G41, G4, G12
IRTG1792DP2018 067
COOLING MEASURES AND HOUSING WEALTH: EVIDENCE FROM SINGAPORE
Wolfgang Karl Härdle
Rainer Schulz
Taojun Xie
Abstract:
Excessive house price growth was at the heart of the financial crisis in
2007/08. Since then, many countries have added cooling measures to their
regulatory frameworks. It has been found that these measures can indeed control
price growth, but no one has examined whether this has adverse consequences for
the housing wealth distribution. We examine this for Singapore, which started in
2009 to target price growth over ten rounds in total. We find that welfare from
housing wealth in the last round might not be higher than before 2009. This
depends on the deflator used to convert nominal into real prices. Irrespective
of the deflator, we can reject that welfare increased monotonically over the
different rounds.
Keywords:
house price distribution, stochastic dominance tests
JEL Classification:
R31, C31, C55
IRTG1792DP2019 001
Cooling Measures and Housing Wealth: Evidence from Singapore
Wolfgang Karl Härdle
Rainer Schulz
Taojun Xie
Abstract:
Excessive house price growth was at the heart of the financial crisis in
2007/08. Since then, many countries have added cooling measures to their
regulatory frameworks. It has been found that these measures can indeed control
price growth, but no one has examined whether this has adverse consequences for
the housing wealth distribution. We examine this for Singapore, which started in
2009 to target price growth over ten rounds in total. We find that welfare from
housing wealth in the last round might not be higher than before 2009. This
depends on the deflator used to convert nominal into real prices. Irrespective
of the deflator, we can reject that welfare increased monotonically over the
different rounds.
Keywords:
house price distribution, stochastic dominance tests
JEL Classification:
R31, C31, C55
IRTG1792DP2019 002
Information Arrival, News Sentiment, Volatilities and Jumps of Intraday Returns
Ya Qian
Jun Tu
Wolfgang Karl Härdle
Abstract:
This work aims to investigate the (inter)relations of information arrival, news
sentiment, volatilities and jump dynamics of intraday returns. Two parametric
GARCH-type jump models which explicitly incorporate both news arrival and news
sentiment variables are proposed: one assumes that news affects
financial markets through the jump component, while the other postulates the
GARCH component as the channel. To identify the most likely form of the
interactions between news arrival and stock market behavior, these two models
are compared with several simpler GARCH-type models based on
calibration results for the DJIA 30 stocks. The necessity of including news
processes in intraday stock volatility modeling is justified in our specific
calibration samples (2008 and 2013, respectively). However, modeling the jump
process separately is not as profitable as using a simpler GARCH process with an
error distribution capable of capturing the fat-tailed behavior of financial time
series. In conclusion, our calibration results suggest the GARCH-news model with
skew-t innovation distribution as the best candidate for intraday returns of large
stocks in the US market, which means one can probably avoid the complexity of
modelling jump behavior by using a simpler skew-t error distribution assumption
instead, but it is necessary to incorporate news variables.
Keywords:
information arrival, volatility modeling, jump, sentiment, GARCH
JEL Classification:
C52, C55, C58, G14
IRTG1792DP2019 003
Estimating low sampling frequency risk measure by high-frequency data
Niels Wesselhöfft
Wolfgang K. Härdle
Abstract:
Weekly, quarterly and yearly risk measures are crucial for risk reporting
according to Basel III and Solvency II. For the respective data frequencies, the
authors show in a simulation and backtest study that available data series are
too short to estimate Value at Risk and Expected Shortfall
adequately at confidence levels of 99.9% and 99.99%. Accordingly, this
paper presents a semi-parametric estimation method that rescales data from high to
low frequency, yielding significantly more data points for the
estimation of the respective risk measures. The presented methodology in the
α-stable framework, which is able to mimic multifractal behavior in asset
returns, provides tail events which never occurred in the original low-frequency
dataset.
Keywords:
high-frequency, multifractal, stable distribution, rescaling, risk management,
Value at Risk, quantile distribution
JEL Classification:
C14, C22, C46, C53, G32
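The data shortage described in the abstract above can be illustrated with a minimal sketch of the plain empirical estimators. This is not the authors' semi-parametric rescaling method; the function name `empirical_var_es` and the toy loss series are assumptions made purely for illustration. With only 52 weekly observations, the 99.9% empirical quantile collapses onto the single largest loss, so Value at Risk and Expected Shortfall become indistinguishable.

```python
import math


def empirical_var_es(losses, level):
    """Empirical Value at Risk and Expected Shortfall of a loss sample.

    VaR is the smallest observed loss whose empirical CDF reaches `level`;
    ES is the mean of all observed losses at or beyond that VaR.
    """
    if not losses:
        raise ValueError("need at least one observation")
    if not 0.0 < level < 1.0:
        raise ValueError("level must lie strictly between 0 and 1")
    ordered = sorted(losses)
    n = len(ordered)
    # Rank of the smallest order statistic k with k / n >= level.
    k = max(math.ceil(level * n), 1)
    var = ordered[k - 1]
    tail = ordered[k - 1:]
    es = sum(tail) / len(tail)
    return var, es


# Hypothetical toy data: one year of weekly losses, 1.0, 2.0, ..., 52.0.
weekly_losses = [float(i) for i in range(1, 53)]

var_95, es_95 = empirical_var_es(weekly_losses, 0.95)
var_999, es_999 = empirical_var_es(weekly_losses, 0.999)
```

At the 95% level the tail still contains three observations, whereas at 99.9% it contains one, which is the motivation for obtaining more data points from higher sampling frequencies.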
IRTG1792DP2019 004
Constrained Kelly portfolios under alpha-stable laws
Niels Wesselhöfft
Wolfgang K. Härdle
Abstract:
This paper provides a detailed framework for modeling portfolios, achieving the
highest growth rate under subjective risk constraints such as Value at Risk
(VaR) in the presence of stable laws. Although maximizing the expected
logarithm of wealth outperforms any other significantly different
strategy, the Kelly Criterion implies larger bets than a risk-averse investor
would accept. Restricting the Kelly optimization by spectral risk measures, the
authors provide a generalized mapping for different measures of growth and
security. Analyzing over 30 years of S&P 500 returns for different sampling
frequencies, the authors find evidence for leptokurtic behavior for all
respective sampling frequencies. Given that lower sampling frequencies imply a
smaller number of data points, this paper argues in favor of α-stable laws and
their scaling behavior to model financial market returns for a given horizon in an
i.i.d. world. Instead of simulating from the class of elliptically stable
distributions, a nonparametric scaling approximation, based on the data-set
itself, is proposed. Our paper also finds that including long put options
in the portfolio optimization improves the growth criterion for a given
security level, leading to a new Kelly portfolio providing the highest geometric
mean.
Keywords:
growth-optimal, Kelly criterion, protective put, portfolio optimization, stable
distribution, Value at Risk
JEL Classification:
C13, C46, C61, C73, G11
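The trade-off between growth and security mentioned in the abstract above can be sketched for the textbook case of a repeated binary bet. This is a deliberate simplification and not the constrained α-stable framework of the paper; the function names and the numbers (win probability 0.6, even odds) are assumptions for illustration. For win probability p and net odds b, the fraction maximizing the expected logarithm of wealth is f* = p - (1 - p) / b.

```python
import math


def kelly_fraction(p, b):
    """Growth-optimal stake for a binary bet (win probability p, net odds b).

    A negative result means the bet has no edge and should not be taken.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if b <= 0.0:
        raise ValueError("net odds b must be positive")
    return p - (1.0 - p) / b


def expected_log_growth(f, p, b):
    """Expected log growth per bet when staking the fraction f of wealth."""
    if not 0.0 <= f < 1.0:
        raise ValueError("f must lie in [0, 1)")
    return p * math.log(1.0 + b * f) + (1.0 - p) * math.log(1.0 - f)


# Hypothetical example: 60% win probability at even odds.
p, b = 0.6, 1.0
f_star = kelly_fraction(p, b)

growth_full = expected_log_growth(f_star, p, b)
growth_half = expected_log_growth(0.5 * f_star, p, b)   # more cautious stake
growth_double = expected_log_growth(2.0 * f_star, p, b)  # over-betting
```

In this toy setting, staking half the Kelly fraction gives up some growth in exchange for smaller bets, while doubling it turns expected log growth negative.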
IRTG1792DP2019 005
Usage Continuance in Software-as-a-Service
Elias Baumann
Jana Kern
Stefan Lessmann
Abstract:
Software-as-a-service applications are experiencing immense growth as their
comparatively low cost makes them an important alternative to traditional
software. Following the initial adoption phase, vendors are now concerned with
the continued usage of their software. To analyze the influence of different
measures to improve continued usage over time, a longitudinal study design using
data from a SaaS vendor was implemented. By employing a linear mixed model, the
study finds several measures to have a positive effect on a software’s usage
penetration. In addition to these activation measures performed by the SaaS
vendor, software as well as client characteristics were likewise examined but
did not display significant estimates. In summary, the study contributes novel
insights into the scarcely researched field of influencing factors on SaaS usage
continuance.
Keywords:
Linear Mixed Models, Software-as-a-Service, Usage Continuance
JEL Classification:
C00
IRTG1792DP2019 006
Adaptive Nonparametric Community Detection
Larisa Adamyan
Kirill Efimov
Vladimir Spokoiny
Abstract:
Understanding the topological structure of real-world networks is of huge
interest in a variety of fields. One way to investigate this structure is
to find groups of densely connected nodes called communities. This paper
presents a new non-parametric method of community detection in networks called
Adaptive Weights Community Detection. The idea of the algorithm is to associate
a local community with each node. On every iteration the algorithm tests a
hypothesis that two nodes are in the same community by comparing their local
communities. The test rejects the hypothesis if the density of edges between
these two local communities is lower than the density inside each one. A
detailed performance analysis of the method shows its dominance over state-of-
the-art methods on well-known artificial and real-world benchmarks.
Keywords:
Adaptive weights, Gap coefficient, Graph clustering, Nonparametric, Overlapping
communities
JEL Classification:
C00
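The density comparison described in the abstract above can be illustrated with a minimal sketch. It is not the statistical test used by Adaptive Weights Community Detection: it only compares raw edge densities for two disjoint node sets, and all function names and the toy graph are assumptions for illustration.

```python
from itertools import combinations


def internal_density(edges, nodes):
    """Share of possible node pairs inside `nodes` that are connected."""
    nodes = set(nodes)
    pairs = len(nodes) * (len(nodes) - 1) // 2
    if pairs == 0:
        return 0.0
    linked = sum(1 for u, v in combinations(nodes, 2)
                 if frozenset((u, v)) in edges)
    return linked / pairs


def between_density(edges, a, b):
    """Share of possible pairs (u in a, v in b) that are connected."""
    a, b = set(a), set(b)
    if a & b:
        raise ValueError("this sketch expects disjoint node sets")
    pairs = len(a) * len(b)
    if pairs == 0:
        return 0.0
    linked = sum(1 for u in a for v in b if frozenset((u, v)) in edges)
    return linked / pairs


def looks_like_two_communities(edges, a, b):
    """True if the sets are more densely linked inside than between them."""
    between = between_density(edges, a, b)
    return (between < internal_density(edges, a)
            and between < internal_density(edges, b))


# Hypothetical toy graph: two triangles joined by the single edge 3-4.
# Undirected edges are stored as frozensets so that (u, v) equals (v, u).
edges = {frozenset(e) for e in
         [(1, 2), (1, 3), (2, 3), (4, 5), (4, 6), (5, 6), (3, 4)]}

separate = looks_like_two_communities(edges, {1, 2, 3}, {4, 5, 6})
same = looks_like_two_communities(edges, {1, 2}, {3})
```

Here the two triangles are flagged as separate groups, while splitting a single triangle is not, because the density between its parts is as high as the density inside them.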
IRTG1792DP2019 008
Forex Exchange Rate Forecasting Using Deep Recurrent Neural Networks
Alexander J. Dautel
Wolfgang K. Härdle
Stefan Lessmann
Hsin-Vonn Seow
Abstract:
Deep learning has substantially advanced the state-of-the-art in computer
vision, natural language processing and other fields. The paper examines the
potential of contemporary recurrent deep learning architectures for financial
time series forecasting. Considering the foreign exchange market as testbed, we
systematically compare long short-term memory networks and gated recurrent units
to traditional recurrent architectures as well as feedforward networks in terms
of their directional forecasting accuracy and the profitability of trading model
predictions. Empirical results indicate the suitability of deep networks for
exchange rate forecasting in general but also evidence the difficulty of
implementing and tuning corresponding architectures. Especially with regard to
trading profit, a simpler neural network may perform as well as if not better
than a more complex deep neural network.
Keywords:
Deep learning, Financial time series forecasting, Recurrent neural networks,
Foreign exchange rates
JEL Classification:
C00
IRTG1792DP2019 009
Dynamic Network Perspective of Cryptocurrencies
Li Guo
Yubo Tao
Wolfgang K. Härdle
Abstract:
Cryptocurrencies are becoming an attractive asset class and are the focus of
recent quantitative research. The joint dynamics of the cryptocurrency market
yields information on network risk. Utilizing the adaptive LASSO approach, we
build a dynamic network of cryptocurrencies and model the latent communities
with a dynamic stochastic blockmodel. We develop a dynamic covariate-assisted
spectral clustering method that yields uniformly consistent estimates of the latent
group membership of cryptocurrencies. We show that return inter-predictability and
crypto characteristics, including hashing algorithms and proof types, jointly
determine the crypto market segmentation. Based on this classification result,
it is natural to employ eigenvector centrality to identify a cryptocurrency’s
idiosyncratic risk. An asset pricing analysis finds that a cross-sectional
portfolio with a higher centrality earns a higher risk premium. Further tests
confirm that centrality serves well as a risk factor and delivers valuable
information content on cryptocurrency markets.
Keywords:
Community Detection, Dynamic Stochastic Blockmodel, Spectral Clustering, Node
Covariate, Return Predictability, Portfolio Management
JEL Classification:
C00
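The eigenvector centrality mentioned in the abstract above can be sketched with a plain power iteration. This is a generic illustration for a small, connected, undirected network and not the authors' estimation pipeline; the function name, the shift by the identity matrix and the toy star graph are assumptions.

```python
import math


def eigenvector_centrality(adjacency, iterations=1000, tol=1e-12):
    """Leading eigenvector of a symmetric, non-negative adjacency matrix.

    Iterates on (A + I) rather than A: the shift leaves the eigenvectors
    unchanged but avoids oscillation on bipartite graphs such as stars.
    Assumes a connected network; the result is scaled to unit length.
    """
    n = len(adjacency)
    if n == 0:
        raise ValueError("empty network")
    x = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        y = [x[i] + sum(adjacency[i][j] * x[j] for j in range(n))
             for i in range(n)]
        norm = math.sqrt(sum(v * v for v in y))
        y = [v / norm for v in y]
        if max(abs(a - b) for a, b in zip(x, y)) < tol:
            return y
        x = y
    return x


# Hypothetical toy network: node 0 is linked to nodes 1, 2 and 3.
star = [
    [0, 1, 1, 1],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
]
centrality = eigenvector_centrality(star)
```

The hub receives the highest score and the three leaves share an equal, lower score, matching the intuition that a node is central when its neighbors are.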
IRTG1792DP2019 010
Understanding the Role of Housing in Inequality and Social Mobility
Yang Tang
Xinwen Ni
Abstract:
Housing typically takes up a major proportion of households' expenditure, and
thus it certainly plays a critical role in shaping the pattern of income
inequality and social mobility. Whether a high housing price-to-rent ratio
amplifies inequality and inhibits social class upgrading is still a controversial
issue in the existing literature. In this paper, we develop a partial
equilibrium life-cycle framework to address these issues. Agents in our economy
are divided into two social classes according to the initial human capital level
inherited from their parents. Those who belong to the upper class draw their
innate abilities from a distribution that first-order stochastically dominates
that of the lower class. Throughout the entire life cycle, agents make endogenous
human capital investment and housing tenure decisions. We calibrate the model to
mimic some stylized facts of its real-world counterpart. Our simulation
results indicate an inverse-U pattern between the housing price-to-rent ratio and
measures of income inequality, as well as a U-shaped pattern between the price-
to-rent ratio and social mobility measured by the Shorrocks Index. The implication
is that housing tends to amplify the inequality and slow down the social
mobility when houses can only be purchased by a small group of agents in the
economy. Moreover, our results also suggest that better quality of education as
a result of a higher return to human capital investment tends to dampen the role
of housing.
Keywords:
Income Inequality, Social Mobility, Price-to-rent ratio
JEL Classification:
C00
IRTG1792DP2019 011
The role of medical expenses in the saving decision of elderly: a life cycle
model
Xinwen Ni
Abstract:
In this paper, we develop a multi-period overlapping generations framework to
investigate agents' consumption and saving decisions, inequality and welfare
among the elderly. We assume that agents are heterogeneous in non-asset income
and medical expenditure. In order to explicitly analyze the effects of
medical expenditure, we conduct three counterfactual exercises. We successively
shut down the heterogeneity in labor income, in the level and in the dispersion
of medical expenses, respectively. By comparing the benchmark with the
counterfactual results, we find that in general wealth inequality decreases with
age, and income uncertainty contributes the most to wealth inequality. Both
average consumption and consumption inequality increase with age. Consumption
inequality largely tracks income inequality. Though uncertainty in medical
expenditures has little effect on consumption inequality, a higher level of
medical expenditures may exacerbate consumption inequality. Meanwhile, the
average saving of the elderly exhibits an inverse-U shape with age. The impacts on
average saving are similar both in benchmark and in counterfactual exercises.
Welfare increases with age.
Keywords:
Income Inequality, Social Mobility, Price-to-rent ratio
JEL Classification:
C00
IRTG1792DP2019 012
Voting for Health Insurance Policy: the U.S. versus Europe
Xinwen Ni
Abstract:
In this paper, we build an overlapping generations model to examine why
developed countries with similar backgrounds have implemented different
social health insurance systems. We propose two hypotheses to explain this
phenomenon: (i) the different participation rates of the poor in voting;
(ii) the distinct attitudes towards the size of the government and the existence
of a compulsory social health insurance system. Agents need to vote for one of
two policies: Policy I without Social Health Insurance (SHI) but with the
subsidy for the poor, and Policy II with fully covered SHI. By comparing either
their current utility or the expected lifetime utility, households will choose
one policy. We find that under Policy I, the derivative of the changes of
expected utility with respect to income is not monotonic. This means that both
the poorest and the richest dislike the social health insurance system. With the
calibrated parameters, we solve the benchmark and find that the public’s
attitude towards the size of the government and the lower representation of the
poor affect the election result. The changes in the minimum consumption level
under Policy I affect the voting results most, followed by the attitude. The voting
participation rate plays the least significant role in the voting outcome. The
sensitivity analysis shows that our main findings are robust to the input
parameters.
Keywords:
Social Health Insurance, Voting
JEL Classification:
C00
IRTG1792DP2019 013
Inference of Break-Points in High-Dimensional Time Series
Likai Chen
Weining Wang
Wei Biao Wu
Abstract:
We consider a new procedure for detecting structural breaks in mean for high-
dimensional time series. We target breaks happening at unknown time points and
locations. In particular, at a fixed time point our method is concerned with
either the biggest break in one location or aggregating simultaneous breaks over
multiple locations. We allow for both large and small breaks, so that we can
(1) stamp the dates and the locations of the breaks, (2) estimate the break
sizes and (3) make inference on the break sizes as well as the break dates. Our
theoretical setup incorporates both temporal and cross-sectional dependence, and
is suitable for heavy-tailed innovations. We derive the asymptotic distribution
for the sizes of the breaks by extending the existing powerful theory on local
linear kernel estimation and high dimensional Gaussian approximation to allow
for trend stationary time series with jumps. A robust long-run covariance matrix
estimation is proposed, which can be of independent interest. An application to
detecting structural changes of the US unemployment rate is considered to
illustrate the usefulness of our method.
Keywords:
high-dimensional time series, multiple change-points, Gaussian approximation,
nonparametric estimation, heavy tailed, long-run covariance matrix
JEL Classification:
C00
IRTG1792DP2019 014
Forecasting in Blockchain-based Local Energy Markets
Michael Kostmann
Wolfgang K. Härdle
Abstract:
Increasingly volatile and distributed energy production challenges traditional
mechanisms to manage grid loads and price energy. Local energy markets (LEMs)
may be a response to those challenges as they can balance energy production and
consumption locally and may lower energy costs for consumers. Blockchain-based
LEMs provide a decentralized market to local energy consumers and prosumers. They
implement a market mechanism in the form of a smart contract without the need
for a central authority coordinating the market. Recently proposed blockchain-
based LEMs use auction designs to match future demand and supply. Thus, such
blockchain-based LEMs rely on accurate short-term forecasts of individual
households’ energy consumption and production. Often, such accurate forecasts
are simply assumed to be given. The present research tests this assumption,
first by evaluating the forecast accuracy achievable with state-of-the-art
energy forecasting techniques for individual households and, second, by
assessing the effect of prediction errors on market outcomes in three different
supply scenarios. The evaluation shows that, although a LASSO regression model
is capable of achieving reasonably low forecasting errors, the costly settlement
of prediction errors can offset and even surpass the savings brought to
consumers by a blockchain-based LEM. This shows that, due to prediction errors,
participation in LEMs may be uneconomical for consumers; prediction errors
therefore have to be taken into consideration for pricing mechanisms in
blockchain-based LEMs.
Keywords:
Blockchain; Local Energy Market; Smart Contract; Machine Learning; Household;
Energy Prediction; Prediction Errors; Market Mechanism
JEL Classification:
Q47; D44; D47; C53
IRTG1792DP2019 015
Media-expressed tone, Option Characteristics, and Stock Return Predictability
Cathy Yi-Hsuan Chen
Matthias R. Fengler
Wolfgang K. Härdle
Yanchu Liu
Abstract:
We distill tone from a huge assortment of NASDAQ articles to examine the
predictive power of media-expressed tone in single-stock option markets and
equity markets. We find that (1) option markets are impacted by media tone; (2)
option variables predict stock returns along with tone; (3) option variables
orthogonalized to public information and tone are more effective predictors of
stock returns; (4) overnight tone appears to be more informative than trading-
time tone, possibly due to a different thematic coverage of the trading versus
the overnight archive; (5) tone disagreement commands a strong positive risk
premium above and beyond market volatility.
Keywords:
option markets, equity markets, stock return predictability, media tone, topic
model
JEL Classification:
G12, G14, G41