A Regression-constrained Optimization Approach to Estimating Suppressed Information using Time-series Data: Application to County Business Patterns 1999-2006
International Regional Science Review
Published online on September 18, 2013
Abstract
Most regional economic databases (e.g., the US Economic Census and County Business Patterns [CBP]) suppress some employment records and report them only as ranges in order to protect data confidentiality. This article incorporates the implicit temporal relationships among annual employment figures over several years into an optimization model designed to estimate the suppressed records. The model minimizes (1) the sum of the deviations between the estimates and target values within the corresponding ranges and (2) the sum of the deviations between the estimates and an employment trend curve determined endogenously through absolute-value regression. The 1999–2006 CBP data for Arizona are used to test the model. Two decision-theoretic criteria (Pareto frontier and concordance–discordance analysis) are used to analyze the results, which point to a specific set of parameters yielding the best estimates.
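As a rough illustration of the two-part objective described in the abstract, the model for a single suppressed employment series might be sketched as follows. The notation is assumed here and is not taken from the article: $x_t$ is the estimate for year $t$, $[l_t, u_t]$ is the published range, $m_t$ is a target value inside that range, $a + bt$ is a linear trend (the article's trend curve may take a different form), and $w_1, w_2 \ge 0$ are hypothetical weights on the two deviation sums.

```latex
\begin{aligned}
\min_{x,\,a,\,b}\quad
  & w_1 \sum_{t=1}^{T} \lvert x_t - m_t \rvert
    \;+\; w_2 \sum_{t=1}^{T} \lvert x_t - (a + b\,t) \rvert \\
\text{s.t.}\quad
  & l_t \le x_t \le u_t, \qquad t = 1, \dots, T .
\end{aligned}
```

Because the trend coefficients $a$ and $b$ are decision variables, the trend is fitted inside the optimization rather than beforehand, which is one reading of "endogenously determined". Each absolute value can be replaced by the sum of two nonnegative deviation variables (one for the positive part and one for the negative part of the difference), so a formulation of this kind reduces to a linear program.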