
The Impact of Cleansing Procedures and Coding Decisions for Overlaps on Estimation Results – Evidence from German Administrative Data

Scioch, Patrycja (2010): The Impact of Cleansing Procedures and Coding Decisions for Overlaps on Estimation Results – Evidence from German Administrative Data, in: Journal of Contextual Economics – Schmollers Jahrbuch, vol. 130, iss. 4, 485-512, [online] https://doi.org/10.3790/schm.130.4.485


Scioch, Patrycja

Journal of Contextual Economics – Schmollers Jahrbuch, Vol. 130 (2010), Iss. 4 : pp. 485–512


Author Details

Patrycja Scioch, Institut für Arbeitsmarkt- und Berufsforschung der Bundesagentur für Arbeit, Weddigenstraße 20 – 22, 90478 Nürnberg.

Cited By

  1. Greenberg, David H.; Knight, Genevieve; Speckesser, Stephen; Hevenstone, Debra (2011): Improving DWP Assessment of the Relative Costs and Benefits of Employment Programmes. https://doi.org/10.2139/ssrn.1966064

Abstract

Process-generated and administrative datasets have become increasingly important for labor market research over the past ten years. Their major advantages are large sample sizes and the absence of retrospective gaps and unit non-response. Nevertheless, the quality and validity of these types of data remain unclear, and a great deal of preparation and data cleansing is necessary before the data can be analyzed. Unfortunately, few researchers explicitly describe the cleansing procedures or coding decisions they use, so the impact of these choices on the results is unknown. The present paper examines how research results vary with different cleansing and coding procedures. It uses the data preparation framework proposed by Wunsch / Lechner (2008) as a benchmark and induces variation by developing different cleansing procedures and coding decisions for overlapping and parallel observations. The descriptive results show that the data sets produced by the different procedures differ to varying degrees in some attributes related to time and personal characteristics. Similar results emerge from the subsequent analysis of treatment effects, which vary not in overall shape but in magnitude, especially during the lock-in period. In sum, the results indicate that the empirical findings of evaluation studies based on matching algorithms are fairly robust to variations in the underlying method of data preparation.
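
To make the idea of a coding decision for overlapping observations concrete, the following is a minimal Python sketch. It does not reproduce the procedures used in the paper or in Wunsch / Lechner (2008): the spells, the state names, the `PRIORITY` ranking, and the two resolution rules are all hypothetical and chosen only for illustration. It shows how the same raw spells can yield different durations per state depending on which rule resolves an overlap.

```python
from collections import Counter

# Hypothetical spells for one person: (start_day, end_day inclusive, state).
spells = [
    (1, 10, "employment"),
    (8, 15, "benefit"),      # overlaps employment on days 8-10
    (12, 20, "programme"),   # overlaps benefit on days 12-15
]

# Hypothetical ranking of states, assumed only for this illustration.
PRIORITY = {"programme": 3, "employment": 2, "benefit": 1}


def resolve(spells, key):
    """Assign exactly one state to each day.

    Where spells overlap, the spell with the largest key(spell) wins.
    Returns the number of days spent in each state.
    """
    by_day = {}
    for spell in spells:
        start, end, _state = spell
        for day in range(start, end + 1):
            current = by_day.get(day)
            if current is None or key(spell) > key(current):
                by_day[day] = spell
    return Counter(spell[2] for spell in by_day.values())


# Rule A: the state with the higher (assumed) priority wins the overlap.
by_priority = resolve(spells, key=lambda s: PRIORITY[s[2]])

# Rule B: the spell that started earlier wins the overlap.
by_earliest = resolve(spells, key=lambda s: -s[0])

print("priority rule:", dict(by_priority))
print("earliest rule:", dict(by_earliest))
```

In this toy example both rules assign the same ten days to employment, but they split the remaining ten days between benefit receipt and programme participation differently. That is the kind of variation in time-related attributes that different data preparation choices can introduce.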

Received: September 30, 2010

Accepted: January 21, 2011