Overcoverage occurs when individuals are registered as living in a country but in fact live elsewhere or have died without their death being recorded. This discrepancy leads to serious biases in demographic rates, which in turn adversely affect policymaking and research. Despite the importance of addressing overcoverage, previous approaches have struggled when applied to large populations. Many traditional methods rely on deterministic rules or assumptions that fail to capture the complexity of migration patterns, temporary absences, and interactions between population registers. These methods are particularly limited in accounting for under-recorded life events, such as unregistered emigration or death.
We propose a novel approach to estimating the true population size and overcoverage using Swedish Population Registers. Our approach builds on capture-recapture (CR) models, formulated as a hidden Markov model (HMM), providing an efficient and scalable framework for model fitting. Unlike deterministic approaches, our model accounts for temporary emigration and integrates data from an arbitrary number of possibly interacting observation registers. By leveraging the flexibility of the HMM formulation, the model can handle large, complex datasets while maintaining robust estimates of population dynamics and individual trajectories.
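To make the HMM formulation of a capture-recapture model concrete, the following is a minimal illustrative sketch, not the model fitted in this work. It assumes three hypothetical latent states (resident, temporarily abroad, dead), two hypothetical registers that record activity independently given the state (a simplification: the proposed model allows interacting registers), and made-up parameter values. The scaled forward algorithm returns the log-likelihood of one individual's yearly register history and the filtered probability of each latent state in the final year.

```python
import numpy as np

# Hypothetical latent states (an assumption for illustration only):
# 0 = resident, 1 = temporarily abroad, 2 = dead (absorbing)

def emission_probs(obs, detect):
    """P(obs | state) for one year.

    obs    : 0/1 vector of length K (1 = register k shows activity)
    detect : (S, K) array, detect[s, k] = P(activity in register k | state s)
    Registers are treated as independent given the state (a simplification).
    """
    return np.prod(np.where(obs == 1, detect, 1.0 - detect), axis=1)

def forward_filter(history, init, trans, detect):
    """Scaled forward algorithm for a single individual's history.

    Returns the log-likelihood and the filtered state probabilities
    at the last time point.
    """
    alpha = init * emission_probs(history[0], detect)
    loglik = 0.0
    for t in range(len(history)):
        if t > 0:
            # Propagate one year ahead, then weight by the new observation.
            alpha = (alpha @ trans) * emission_probs(history[t], detect)
        scale = alpha.sum()
        loglik += np.log(scale)
        alpha = alpha / scale  # rescale to avoid numerical underflow
    return loglik, alpha

# Made-up parameter values, chosen only to make the example run.
init = np.array([1.0, 0.0, 0.0])          # everyone enters as resident
trans = np.array([[0.95, 0.04, 0.01],     # resident -> resident/abroad/dead
                  [0.30, 0.70, 0.00],     # abroad   -> may return
                  [0.00, 0.00, 1.00]])    # dead is absorbing
detect = np.array([[0.80, 0.60],          # residents usually leave traces
                   [0.05, 0.02],          # rare traces while abroad
                   [0.00, 0.00]])         # no traces after death

# Two active years followed by three years with no register activity.
history = np.array([[1, 1], [1, 0], [0, 0], [0, 0], [0, 0]])
loglik, post = forward_filter(history, init, trans, detect)
print("log-likelihood:", loglik)
print("P(resident), P(abroad), P(dead) in final year:", post)
```

In this sketch, a registered individual whose filtered probability of being resident is low would count toward overcoverage. Because the forward recursion is linear in the number of years and independent across individuals, it parallelizes naturally over a large register population.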
We apply the model to data from Sweden, gaining new insights into population mobility, the scope of overcoverage, and the demographic processes underlying it. This framework represents a significant advance in the study of overcoverage, offering a more accurate and generalizable method for understanding population dynamics at scale.