On the experience of applying Big Data in political science
https://doi.org/10.31249/poln/2023.04.09
Abstract
This review article analyzes the successes and challenges of applying Big Data in political science. The first part discusses the ontological and epistemological foundations of applying Big Data and machine learning in political science. In the second part, the author reviews representative results of political science research that uses Big Data. The third part deals with criticism and limitations of Big Data in political research. The author shows that, besides purely technical problems such as the incompleteness of available data and distortions caused by bots, there are substantial limitations to applying Big Data to the analysis of dispositional political actions.
References
1. Anderson C. The end of theory: The data deluge makes the scientific method obsolete. Wired magazine. 2008, July 16, P. 1–2. Mode of access: http://statlit.org/pdf/2008EndOfTheory-DataDelugeMakesScientificMethodObsolete-WiredMagazine.pdf (accessed: 10.07.2023).
2. Akhremenko A., Petrov A. Anger, identity or efficacy belief? Dynamics of motivation and participation in 2020 Belarusian protests. Polis. Political Studies, 2023, N 2, P. 138–153. (In Russ.) DOI: https://doi.org/10.17976/jpps/2023.02.10
3. Ash E., Gauthier G., Widmer P. Relatio: Text semantics capture political and economic narratives. Political Analysis. 2023, First View. P. 1–18. DOI: https://doi.org/10.1017/pan.2023.8
4. Athey S. Beyond prediction: Using big data for policy problems. Science. 2017, Vol. 355, P. 483–485. DOI: https://doi.org/10.1126/science.aal4321
5. Barclay F., Pichandy C., Venkat A., Sudhakaran S. India 2014: Facebook ‘like’ as a predictor of election outcomes. Asian Journal of Political Science. 2015, Vol 23, N 2, P. 134–160. DOI: https://doi.org/10.1080/02185377.2015.1020319
6. Benoit K., Munger K., Spirling A. Measuring and explaining political sophistication through textual complexity. American Journal of Political Science. 2019, Vol. 63, N 2, P. 491–508. DOI: https://doi.org/10.1111/ajps.12423
7. Bond R., Messing S. Quantifying social media’s political space: Estimating ideology from publicly revealed preferences on Facebook. American Political Science Review. 2015, Vol. 109, N 1, P. 62–78. DOI: https://doi.org/10.1017/S0003055414000525
8. Bonica A. A data-driven voter guide for US elections: Adapting quantitative measures of the preferences and priorities of political elites to help voters learn about candidates. RSF: The Russell Sage Foundation Journal of the Social Sciences. 2016, Vol. 2, N 7, P. 11–32. DOI: https://doi.org/10.7758/rsf.2016.2.7.02
9. Boullier D. The social sciences and the traces of big data. Revue francaise de science politique. 2015, Vol. 65, N 5, P. 805–828. DOI: https://doi.org/10.48550/arXiv.1607.05034
10. Boyd D., Crawford K. Critical questions for big data: Provocations for a cultural, technological, and scholarly phenomenon. Information, communication & society. 2012, Vol. 15, N 5, P. 662–679. DOI: https://doi.org/10.1080/1369118X.2012.678878
11. Ceron A., Curini L., Iacus S. Using sentiment analysis to monitor electoral campaigns: Method matters – evidence from the United States and Italy. Social Science Computer Review. 2015, Vol. 33, N 1, P. 3–20. DOI: https://doi.org/10.1177/0894439314521983
12. Cheeseman N., Klaas B. How to rig an election. Moscow: Bombora, 2021, 320 p. (In Russ.).
13. Ceron A., Curini L., Iacus S. Social media and elections: A meta-analysis of online-based electoral forecasts. In: Arzheimer K., Evans J., Lewis-Beck M.S. (eds.) Sage Handbook of electoral behaviour. London: SAGE Publications Ltd, 2017, P. 883–903. DOI: https://doi.org/10.4135/9781473957978
14. Chatsiou K., Mikhaylov S.J. Deep learning for political science. arXiv preprint arXiv:2005.06540. 2020. Mode of access: https://arxiv.org/abs/2005.06540 (accessed: 10.07.2023).
15. Colaresi M., Mahmood Z. Do the robot: Lessons from machine learning to improve conflict forecasting. Journal of Peace Research. 2017, Vol. 54, N 2, P. 193–214. DOI: https://doi.org/10.1177/0022343316682065
16. Conover M.D., Davis C., Ferrara E., McKelvey K., Menczer F., Flammini A. The geospatial characteristics of a social movement communication network. PloS one. 2013, Vol. 8, N 3. DOI: https://doi.org/10.1371/journal.pone.0055957
17. Gandomi A., Haider M. Beyond the hype: big data concepts, methods, and analytics. International journal of information management. 2015, Vol. 35, N 2, P. 137–144. DOI: https://doi.org/10.1016/j.ijinfomgt.2014.10.007
18. Grimmer J., Stewart B.M. Text as data: The promise and pitfalls of automatic content analysis methods for political texts. Political analysis. 2013, Vol. 21, N 3, P. 267–297. DOI: https://doi.org/10.1093/pan/mps028
19. Grossman J., Pedahzur A. Political science and big data: Structured data, unstructured data, and how to use them. Political science quarterly. 2020, Vol. 135, N 2, P. 225–257. DOI: https://doi.org/10.1002/polq.13032
20. Karstens M., Soules M.J., Dietrich N. On the Replicability of Data Collection Using Online News Databases. PS: Political Science & Politics. 2023, Vol. 56, N 2, P. 265–272. DOI: https://doi.org/10.1017/S1049096522001317
21. Katagiri A., Min E. The credibility of public and private signals: A document-based approach. American Political Science Review. 2019, Vol. 113, N 1, P. 156–172. DOI: https://doi.org/10.1017/S0003055418000643
22. Sang E.T., Bos J. Predicting the 2011 Dutch senate election results with Twitter. Proceedings of the workshop on semantic analysis in social media. 2012. Mode of access: https://www.let.rug.nl/bos/pubs/TjongBos2012EACL.pdf (accessed: 11.06.2023).
23. King G., Pan J., Roberts M.E. How censorship in China allows government criticism but silences collective expression. American Political Science Review. 2013, Vol. 107, N 2, P. 326–343. DOI: https://doi.org/10.1017/S0003055413000014
24. Kitchin R., McArdle G. What makes Big Data, Big Data? Exploring the ontological characteristics of 26 datasets. Big Data & Society. 2016, Vol. 3, N 1. DOI: https://doi.org/10.1177/2053951716631130
25. Lazer D., Radford J. Data ex machina: introduction to big data. Annual Review of Sociology. 2017, Vol. 43, P. 19–39. DOI: https://doi.org/10.1146/annurev-soc-060116-053457
26. Lizardo O. Habit and the Explanation of Action. Journal for the Theory of Social Behaviour. 2021, Vol. 51, N 3, P. 391–411. DOI: https://doi.org/10.1111/jtsb.12273
27. Meng X-L. Statistical paradises and paradoxes in big data (i) law of large populations, big data paradox, and the 2016 US presidential election. The Annals of Applied Statistics. 2018, Vol. 12, N 2, P. 685–726. DOI: https://doi.org/10.1214/18-AOAS1161SF
28. Montgomery J.M., Olivella S., Potter J.D., Crisp B.F. An informed forensics approach to detecting vote irregularities. Political Analysis. 2015, Vol. 23, N 4, P. 488–505. DOI: https://doi.org/10.1093/pan/mpv023
29. Mueller H., Rauh C. Reading between the lines: Prediction of political violence using newspaper text. American Political Science Review. 2018, Vol. 112, N 2, P. 358–375. DOI: https://doi.org/10.1017/S0003055417000570
30. Ng A., Soo K. Numsense! Data science for the layman: no math added. Piter, 2022, 208 p. (In Russ.).
31. Park B., Greene K., Colaresi M. Human rights are (increasingly) plural: Learning the changing taxonomy of human rights from large-scale text reveals information effects. American Political Science Review. 2020, Vol. 114, N 3, P. 888–910. DOI: https://doi.org/10.1017/S0003055420000258
32. Pavan E., Mainardi A. Striking, Marching, Tweeting. Studying how online networks change together with movements. Partecipazione e conflitto. 2018, Vol. 11, N 2, P. 394–422. DOI: https://doi.org/10.1285/i20356609v11i2p394
33. Pietsch W. Aspects of theory-ladenness in data-intensive science. Philosophy of Science. 2015, Vol. 82, N 5, P. 905–916. DOI: https://doi.org/10.1086/683328
34. Pietsch W. On the epistemology of data science: Conceptual tools for a new inductivism. Berlin; Heidelberg: Springer-Verlag. 2021, Vol. 148, 295 p. DOI: https://doi.org/10.1007/978-3-030-86442-2
35. Reis J., Correia A., Murai F., Veloso A., Benevenuto F. Supervised learning for fake news detection. IEEE Intelligent Systems. 2019, Vol. 34, N 2, P. 76–81. DOI: https://doi.org/10.1109/MIS.2019.2899143
36. Rettberg J.W. Algorithmic failure as a humanities methodology: Machine learning's mispredictions identify rich cases for qualitative analysis. Big Data & Society. 2022, Vol. 9, N 2. DOI: https://doi.org/10.1177/20539517221131290
37. Rotman A., Shalev M. Using location data from mobile phones to study participation in mass protests. Sociological Methods & Research. 2022, Vol. 51, N 3, P. 1357–1412. DOI: https://doi.org/10.1177/0049124120914926
38. Salganik M.J. Bit by bit: Social research in the digital age. Princeton: Princeton University Press, 2019, 448 p.
39. Vakhshtayn V. Technics, or the Charm of Progress. St. Petersburg: European University Press, 2021, 156 p. (In Russ.).
40. Wagschal U., Ettensperger F. Big Data in Social Sciences. In: Berg-Schlosser D., Badie B., Morlino L. (eds.) The SAGE Handbook of Political Science. London: SAGE Publications Ltd, 2020, P. 272–287. DOI: https://doi.org/10.4135/9781529714333.n19