Does the missing-indicator method have advantages over complete case analysis when handling missing data in categorical regressors?
When a categorical regressor has missing values, which approach is preferable: complete case analysis or the missing-indicator method? Complete case analysis restricts the analysis (linear regression in our study) to cases with no missing values on any of the analyzed variables. It is the default in many statistical packages, and although its applicability is often considered rather limited, recent studies provide evidence that it is widely applicable, even when data are missing not at random. The missing-indicator method replaces missing values with a single valid value and adds a new missing-indicator variable; it is presented as an alternative that keeps the full sample available for analysis and, hypothetically, does not degrade the parameter estimates. Using simulated data in a statistical experiment that controls the missingness mechanism, the proportion of missing data, and the specification of the regression model, we compare the bias and inefficiency of the parameter estimates produced by each approach. According to the results, neither approach leads to substantially biased estimates, but the missing-indicator method produces inefficient estimates.
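
To make the two data-preparation strategies concrete, the following minimal sketch shows how each one might transform a small dataset before a regression is fitted. It is an illustration only, not the procedure used in the study: the toy rows, the column names `y`, `group`, `group_b`, and `group_missing`, and the choice of level `"a"` as the fill value are all assumptions introduced here.

```python
# Hypothetical toy data: y is the outcome, group is a categorical regressor
# with levels "a" and "b"; None marks a missing value.
rows = [
    {"y": 2.1, "group": "a"},
    {"y": 3.4, "group": "b"},
    {"y": 2.8, "group": None},
    {"y": 3.9, "group": "b"},
    {"y": 1.7, "group": None},
    {"y": 2.3, "group": "a"},
]


def complete_cases(data):
    """Complete case analysis: drop every row whose regressor is missing."""
    return [
        {"y": r["y"], "group_b": int(r["group"] == "b")}
        for r in data
        if r["group"] is not None
    ]


def missing_indicator(data, fill_value="a"):
    """Missing-indicator method: fill gaps with one valid level and flag them.

    The fill value "a" is an assumption made for this sketch.
    """
    prepared = []
    for r in data:
        is_missing = r["group"] is None
        level = fill_value if is_missing else r["group"]
        prepared.append({
            "y": r["y"],
            "group_b": int(level == "b"),         # dummy for level "b"
            "group_missing": int(is_missing),     # 1 where the value was filled
        })
    return prepared


cc = complete_cases(rows)
mi = missing_indicator(rows)
print(len(cc), len(mi))  # complete cases keep 4 rows; the indicator method keeps all 6
```

In this sketch the complete-case dataset loses the two incomplete rows, while the missing-indicator dataset keeps the full sample at the cost of one extra regressor (`group_missing`) in the model.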