Paper "Challenges and Opportunities for Statistics in the Era of Data Science" by Kirch, Lahiri, Binder, Brannath, Cribben, Dette, Doebler, Feng, Gandy, Greven et al. accepted by Harvard Data Science Review
The paper "Challenges and Opportunities for Statistics in the Era of Data Science" has been accepted by the Harvard Data Science Review.
Authors: Claudia Kirch1, Soumendra Lahiri2, Harald Binder3, Werner Brannath4, Ivor Cribben5, Holger Dette6, Philipp Doebler7, Oliver Feng8, Axel Gandy9, Sonja Greven10, Barbara Hammer11, Stefan Harmeling7, Thomas Hotz12, Göran Kauermann13, Joscha Krause14, Georg Krempl15, Alicia Nieto-Reyes16, Ostap Okhrin17, Hernando Ombao18, Florian Pein19, Michal Pešta20, Dimitris Politis21, Li-Xuan Qin22, Tom Rainforth23, Holger Rauhut13, Henry Reeve24, David Salinas25, Johannes Schmidt-Hieber26, Clayton Scott27, Johan Segers28, Myra Spiliopoulou1, Adalbert Wilhelm29, Ines Wilms30, Yi Yu31, Johannes Lederer32
1Otto-von-Guericke University Magdeburg, Germany, 2Washington University in St. Louis, USA, 3Medical Faculty and Medical Center, University of Freiburg, Germany, 4University of Bremen, Germany, 5University of Alberta, Canada, 6Ruhr-University Bochum, Germany, 7TU Dortmund, Germany, 8University of Bath, UK, 9Imperial College London, UK, 10Humboldt-University Berlin, Germany, 11University of Bielefeld, Germany, 12Technische Universität Ilmenau, Germany, 13Ludwig-Maximilians-University Munich, Germany, 14Trier University, Germany, 15Utrecht University, Netherlands, 16University of Cantabria, Spain, 17TUD Dresden University of Technology, Germany, 18KAUST University, Saudi Arabia, 19Lancaster University, UK, 20Charles University, Prague, Czech Republic, 21University of California, San Diego, USA, 22Memorial Sloan Kettering Cancer Center, USA, 23University of Oxford, UK, 24University of Bristol, UK, 25ELLIS Institute Tübingen, Germany, 26University of Twente, Netherlands, 27University of Michigan, USA, 28KU Leuven, Belgium, 29Constructor University Bremen, Germany, 30Maastricht University, Netherlands, 31University of Warwick, UK, 32University of Hamburg, Germany
CRediT taxonomy:
• Claudia Kirch and Johannes Lederer: conceptualization, funding acquisition, project administration, writing original draft, writing review and editing
• Holger Dette and Barbara Hammer: conceptualization, funding acquisition, writing original draft
• All authors: writing original draft
Abstract:
Statistics as a scientific discipline is currently facing the great challenge of finding its place in data science once more. While at the beginning of the last century the development of the discipline of statistics was initiated by data-related research questions, nowadays it is often viewed as not having kept up with current developments in data science, which are largely focused on algorithmic, exploratory and computational aspects and are often driven by other disciplines, such as computer science. However, statistics can, and should, contribute to the advances of data science. Of most interest are the strengths of statistics, such as the mathematical focus that leads to theoretical guarantees. This includes methods for formal modeling, hypothesis tests, uncertainty quantification and statistical inference. Of particular interest are also established statistical frameworks to handle causality or data deficiencies such as dependence, missingness, biases or confounding.
This paper summarizes the findings of a discussion workshop on the topic that was held in June 2023 in Hannover, Germany. The discussion centered around the following questions: How must statistics be set up so that it can contribute (more) to modern data science? In which direction should it develop further? Which strengths can already be used now? What conditions must be created so that this can succeed? What can be done to arrive at a common language? What is the added value of formal modeling, inference, and the mathematical perspective taken in statistics?