Opportunities and Limitations of Digital Traces and Machine Learning Methods in Sociology

Authors

DOI:

https://doi.org/10.14515/monitoring.2021.1.1760

Keywords:

digital footprints, big data, machine learning, forecasting modeling, computational social sciences, computational sociology, data analysis, text analysis

Abstract

The article discusses the opportunities and limitations of using new data sources and new methods of data collection, processing, and analysis in sociology, namely digital traces and machine learning. First, we examine the disadvantages of traditional data sources (surveys) and then, drawing on recent empirical studies, discuss how digital traces can overcome them. The main drawbacks of survey data are reactivity, small sample sizes, and the low frequency of surveys. Based on these drawbacks, we identify types of research questions that can only be answered with digital traces. Finally, we explore the disadvantages of digital traces: lack of representativeness, problems of construct validity, external and internal interfering factors, and non-stationarity. Drawing on recent methodological developments, the paper explains how to take these limitations into account and how to adjust for them wherever possible.

Acknowledgments. The study was funded by the Russian Foundation for Basic Research (RFBR), project no. 20-311-90056.

Author Biographies

Mikhail B. Bogdanov, National Research University Higher School of Economics

  • National Research University Higher School of Economics, Moscow, Russia
    • Junior Research Fellow at the Centre for Cultural Sociology, Institute of Education
    • PhD student in Sociology

Ivan B. Smirnov, National Research University Higher School of Economics

  • National Research University Higher School of Economics, Moscow, Russia
    • Leading Research Fellow, Head of the Computational Social Science Lab, Institute of Education

Published

2021-03-04

Issue

Section

METHODS AND METHODOLOGY