Big Data, Demography, and Causality

DOI: 10.4236/jss.2024.121012, pp. 181-206

Keywords: Big Data, Demography, Causality, Abduction, Deduction, Induction

Abstract:

This paper examines the extent to which Big Data are currently used in population research and considers their potential for causal inference. It first reviews the characteristics and challenges of Big Data. The next section addresses the use of Big Data in the study of the key demographic phenomena, based on a literature review of 63 scientific journals concerned with population issues over the period 2015-2022. The final section examines to what extent the use of Big Data could improve causal inference. Our results show that demographers continue to favor numerical data sources and are less inclined to use digital media data or other sources such as images. Big Data can help improve explanations in demography thanks to the large number of observations and variables in the data sets, especially when the data sets can be linked at the individual level. Causal knowledge requires, however, that one can propose and test a suitable mechanism explaining why a variation in one variable produces a variation in another.

