Reducing the Need for Heuristic Rules – An Iterative Algorithm for Imputing the Education Variable in SIAB
JOURNAL ARTICLE
Hutter, Christian | Möller, Joachim | Penninger, Marion
Journal of Contextual Economics – Schmollers Jahrbuch, Vol. 135 (2015), Iss. 3 : pp. 355–388
Author Details
Christian Hutter, Institut für Arbeitsmarkt- und Berufsforschung, FB A2 – Prognosen und Strukturanalysen, Weddigenstraße 20–22, 90478 Nürnberg
Joachim Möller, Institut für Arbeitsmarkt- und Berufsforschung, FB A2 – Prognosen und Strukturanalysen, Weddigenstraße 20–22, 90478 Nürnberg
Marion Penninger, Institut für Arbeitsmarkt- und Berufsforschung, FB A2 – Prognosen und Strukturanalysen, Weddigenstraße 20–22, 90478 Nürnberg
Abstract
The article proposes an iterative imputation algorithm based on the EM algorithm and employs it to improve the education variable in the Sample of Integrated Labour Market Biographies (SIAB), an administrative panel data set provided by the Institute for Employment Research (IAB). Since the education variable in SIAB is reported for statistical reasons only, it suffers from frequent inconsistent reports and a high and increasing share of missing values. Existing imputation procedures are mainly based on heuristic rules, and there is no guidance on which procedure outperforms the others. Our iterative imputation algorithm reduces the role of heuristic decision rules and estimates the most likely educational or vocational status using information from the employee's whole employment biography. The resulting imputed education variable does not contain inconsistent reports. Furthermore, the share of missing spells is reduced by 87 percent. After imputation, the education variable shows better congruence with independent survey data (ALWA). The article focuses on the results for a (large) subgroup of SIAB (West German employees born after 1960 with a single main job). However, robustness checks reveal that the final education variable is stable with respect to different samples, termination criteria and control variables. Hence, we conclude that our imputation algorithm can serve as a blueprint for further expansions.