Linked Open Data technologies for publication of census microdata
Journal of the American Society for Information Science and Technology
Published online on June 20, 2013
Abstract
Censuses are among the most important sources of statistical data, allowing analyses of a population in terms of demography, economy, sociology, and culture. For fine‐grained analysis, census agencies publish census microdata: samples of individual census records containing detailed, anonymized information about individuals. Working with microdata from different censuses and conducting comparative studies are currently difficult because of the diversity of formats and granularities. In this article, we show that novel data processing techniques can be applied to make census microdata interoperable and easy to access and combine. Specifically, we demonstrate how Linked Open Data principles, a set of techniques for publishing and interlinking (semi‐)structured data on the web, can be fruitfully applied to census microdata. We present a step‐by‐step process to achieve this goal and examine, in theory and practice, two real case studies: the 2001 Spanish census and a general framework for Integrated Public Use Microdata Series (IPUMS‐I).