A Comparison of the Missing-Indicator Method and Complete Case Analysis in Case of Categorical Data

2019.

This research offers a comprehensive analysis of how the missing-indicator method performs for a categorical independent variable in regression, compared with complete case analysis. While the latter appears to be the most popular way to handle missing data, the former is a simple and effective alternative that keeps the full sample available for analysis. Using a statistical experiment on simulated data, we examined how these methods perform under conditions that differ in the missingness mechanism, the proportion of missing data, and the model specification. The results show that, overall, both methods produce unbiased estimates of the regression coefficients but severely biased estimates of their standard errors and of additional statistics such as R², adjusted R², and the F-statistic, especially in the case of the missing-indicator method. We attribute these results to the contribution of the missing-indicator variable, whose coefficient always turns out to be significant and far from zero.
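To make the two approaches concrete, the following is a minimal sketch in Python with NumPy, not the authors' code. The data-generating process, sample size, missing rate, MCAR mechanism, and all function names (`simulate`, `complete_case`, `missing_indicator`) are assumptions chosen for illustration only.

```python
import numpy as np

def dummy_code(codes, levels):
    """Build one 0/1 column for each value listed in `levels`."""
    return np.column_stack([(codes == lv).astype(float) for lv in levels])

def fit_ols(X, y):
    """Return OLS coefficients, with the intercept as the first element."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return beta

def simulate(n=5000, miss_rate=0.3, seed=0):
    """Hypothetical data: a 3-category regressor with values missing completely at random."""
    rng = np.random.default_rng(seed)
    x = rng.integers(0, 3, size=n)          # categories 0, 1, 2 (0 is the reference)
    y = 1.0 + 0.5 * (x == 1) + 1.5 * (x == 2) + rng.normal(0.0, 1.0, size=n)
    missing = rng.random(n) < miss_rate     # MCAR: independent of x and y
    x_obs = np.where(missing, -1, x)        # -1 marks a missing value
    return x_obs, y

def complete_case(x_obs, y):
    """Drop every case with a missing regressor, then fit OLS."""
    keep = x_obs != -1
    return fit_ols(dummy_code(x_obs[keep], [1, 2]), y[keep])

def missing_indicator(x_obs, y):
    """Keep the full sample and add a dummy that flags the missing cases."""
    return fit_ols(dummy_code(x_obs, [1, 2, -1]), y)

x_obs, y = simulate()
cc = complete_case(x_obs, y)       # intercept, category 1, category 2
mi = missing_indicator(x_obs, y)   # intercept, category 1, category 2, missing flag
print("complete case:    ", np.round(cc, 3))
print("missing indicator:", np.round(mi, 3))
```

In this simplified setting, with a single categorical regressor, both models only estimate group means, so the coefficients for the observed categories coincide. The coefficient of the missing flag is non-zero because the missing cases are a mixture of all categories. The sketch does not reproduce the standard-error and goodness-of-fit comparisons reported in the abstract.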

Priority areas: sociology

Language: English

Publication based on the results of:

Justification of the advantages of searching for interaction effects and accounting for them in sociological regression models (2018)

How to choose an approach to handling missing categorical data: (un)expected findings from a simulated statistical experiment

Zhuchkova S., Rotmistrov A., Quality and Quantity 2022 Vol. 56 No. 1 P. 1–22

The study is devoted to a comparison of three approaches to handling missing data of categorical variables: complete case analysis, multiple imputation (based on random forest), and the missing-indicator method. Focusing on OLS regression, we describe how the choice of the approach depends on the missingness mechanism, its proportion, and model specification. The results of ...

Added: February 20, 2021

Does the missing-indicator method have advantages over complete case analysis when handling missing values in categorical regressors?

Zhuchkova S., Rotmistrov A., Shabanova E., Мониторинг общественного мнения: Экономические и социальные перемены 2021 No. 4 P. 23–52

If missingness is encountered in a categorical regressor, which approach is preferable: complete case analysis or the missing-indicator method? The former approach includes in the analysis (linear regression in our research) only the cases with no missing values across the analyzed variables. This approach is embedded in many statistical applications by default, and despite the opinion that its ...

Added: December 12, 2020

Social media: what do users write about and to whom? Some approaches to data analysis

Kotyrlo E., Прикладная эконометрика 2017 No. 3 P. 74–99

The study of users and their segmentation, based on users’ preferred topics of discussion and their networking, is a unique opportunity offered by social networks. The paper summarizes a variety of approaches to social media analysis based on social network analysis and text mining. It is extended by the application of a concentration index and visualization of the ...

Added: October 20, 2017

Applying multilevel regression modelling to cross-country data (the case of generalized trust)

Vólchenko O., Shirokanova A., Социология: методология, методы, математическое моделирование 2016 No. 43 P. 7–62

The paper deals with multilevel regression modelling (MLM) as a method preferred to the ordinary least-squares regression in the analysis of comparative data with hierarchical data structure. We present substantive reasons (contextual sources of heterogeneity, causal heterogeneity, and generalisability of results) and statistical reasons (obtaining more precise and reliable estimates) for multilevel modelling. We also ...

Added: August 9, 2017

The possibility of working with missing data when using CHAID: results of a statistical experiment

Zhuchkova S., Rotmistrov A., Социология: методология, методы, математическое моделирование 2018 No. 46 P. 85–122

The paper addresses an approach to working with missing data "as is", i.e., missing data are treated as one more category of the variable under study. This approach differs radically from the alternatives, which either delete the observations that contain missing values or replace missing values ...

Added: September 17, 2018

The role of the family as a channel for the intergenerational transmission of volunteering traditions in contemporary Russia

Mersiyanova I. V., Malakhov D., Ivanova N., Экономическая социология 2019 Vol. 20 No. 3 P. 66–89

The paper focuses on the role of the family in sustaining volunteering traditions in contemporary Russia. It investigates the correlation between parental volunteering and the current volunteering of their children. International studies indicate that the family's impact on children’s attitudes towards volunteering is a significant channel of intergenerational transmission of prosocial behavioral patterns. ...

Added: June 5, 2019

Regression models in assessing the factors of international migration

Inglehart R. F., Ponarin E., Равлик М. В., Социологические исследования 2014 No. 11 P. 23–33

Country-level variables that may significantly influence inflows of migrants in the world are analyzed by means of regression analysis. In particular, we find significant influence of non-economic factors (especially, education) on migration flows. The database in the form of bilateral migration flows for each country was constructed on the basis of the UN data on ...

Added: October 8, 2014

Innovative behavior at work: an attempt at constructing a sociological index

Климова С. Г., Galitskaya E., Galitsky E., Вестник Института социологии 2010 No. 1 P. 328–351

The article explores the procedural aspect of constructing structural and logical typologies with the aim of creating an innovation index that captures workers' attitudes guiding innovation and innovation-related behavior in the workplace. ...

Added: November 12, 2012

Approaches to aggregating the results of multiple imputation: a comparative analysis

Zangieva I., Suleimanova A., Социология: методология, методы, математическое моделирование 2016 No. 42 P. 7–60

Multiple imputation is an approach to handling missing data developed by Donald Rubin. The purpose of multiple imputation is to reconstruct the initial structure of the data, i.e., to generate answers as close as possible to the hypothetical complete dataset. However, the original algorithm of multiple imputation is complicated and demands a major amount of effort ...

Added: March 2, 2017

The influence of the party composition of parliament on socio-economic policy in a region: an empirical analysis

Ozhegov E. M., Митрохина Е. М., Вестник Пермского университета. Серия: Политология 2016 No. 2 P. 61–81

The research analyzes the impact of the structure of parliament on budget expenditures in Russian regions. According to partisan theory, political parties represented in parliaments pursue a political course in accordance with their ideology. Thus, politics influences the economy. Using regression analysis, we test the hypothesis that left-wing political parties increase social expenditures. We use a dataset containing ...

Added: August 17, 2016

A model-oriented approach to missing values: multiple imputation in multilevel regression using R (the case of survey data on pride in one's country)

Fabrykant M., Социология: методология, методы, математическое моделирование 2015 No. 41 P. 7–29

The article substantiates and describes the multiple imputation technique and its procedure for dealing with missing values in a dataset. It presents the model-oriented approach, as opposed to the design-oriented approach, to analyzing survey data. The theory section lists the benefits of multiple imputation and states why it should be preferred to other ways of dealing with ...

Added: June 3, 2016

A statistical approach to assessing the quality of life of the population in the regions of Russia

Mkhitarian V., Бакуменко Л. П., Вестник КазЭУ 2013 No. 3 (93) P. 9–20

The authors analyzed the quality of life of the population in some regions of the Russian Federation using multivariate statistical analysis. They found that improving the population's quality of life, in particular increasing life expectancy, can be achieved by adjusting demographic indicators, cash income, and the development of health, social, and environmental security in the Volga Federal District. While ...

Added: March 17, 2014

Econometrics

Radionova M. V., Фролова Н. В., Чичагов В. В., Пермь: Пермский государственный университет, 2011.

The textbook was prepared by the authors on the basis of their experience teaching econometrics to students of the economics faculties of ПермГНИУ and НИУ ВШЭ – Пермь. It presents the essentials of the sections of the course "Econometrics". In addition to the necessary theoretical material, it gives many examples of the practical application of theoretical results. A large number of practical examples, appendices, and statistical tables, as well as assignments for students' independent work, are intended to ...

Added: February 5, 2013

Approximation of Functions Defined in Tabular Form: Multicriteria Approach

A. P. Nelyubin, V. V. Podinovski, Computational Mathematics and Mathematical Physics 2023 Vol. 63 No. 5 P. 730–742

A new approach to estimating approximation parameters is developed. In this approach, the distance of the approximating function from a given finite set of points is estimated by a vector criterion the components of which are the absolute values of residuals at all points. Using this criterion, the remoteness preference relation is defined, and the nondominated function with ...

Added: June 9, 2023

Political factors in the distribution of budget funds to the regions of the Russian Federation (2003–2012)

Kamolikova V., Актуальные вопросы права экономики и управления 2017 P. 184–187

The paper examines the political, social, and economic factors according to which federal authorities redistributed budget transfers to support the regions of the Russian Federation in 2003–2012. The regression analysis confirmed the strategy of bilateral financing, which relies not only on political factors but also on socioeconomic ones. Economic growth allowed the federal center to maintain ...

Added: October 21, 2017

The influence of ESG factors in mergers and acquisitions

Nazarova V., Айтюкова Ю. М., Токушева Л. Р., Финансы и бизнес 2022 Vol. 18 No. 1 P. 42–60

The article analyzes the impact of the ESG rating on the creation of value for shareholders of companies in developing countries (Russia and China are taken for the study). Findings based on a sample from 2007 to 2021 indicate that the ESG rating has a positive impact on the performance of acquiring stocks in both ...

Added: January 12, 2023

A method for reclassifying cause of death in cases categorized as “event of undetermined intent”

Andreev E. M., Shkolnikov V., Pridemore W. A. et al., Population Health Metrics 2015 Vol. 13 No. 23 P. 1–25

Background: We present a method for reclassifying external causes of death categorized as “event of undetermined intent” (EUIs) into non-transport accidents, suicides, or homicides. In nations like Russia and the UK the absolute number of EUIs is large, the EUI death rate is high, or EUIs comprise a non-trivial proportion of all deaths due to ...

Added: September 8, 2015

Political factors in the distribution of budget funds to the regions of the Russian Federation (2003–2012)

Kamolikova V., In: Актуальные вопросы права, экономики и управления: сборник статей VI Международной научно-практической конференции. Пенза: МЦНС "Наука и просвещение", 2017. P. 184–187.

Added: October 21, 2017

Topical issues of law, economics, and management: proceedings of the VI International Scientific and Practical Conference

Пенза: МЦНС "Наука и просвещение", 2017.

The paper examines the political, social, and economic factors according to which federal authorities redistributed budget transfers to support the regions of the Russian Federation in 2003–2012. The regression analysis confirmed the strategy of bilateral financing, which relies not only on political factors but also on socio-economic ones. Economic growth allowed the federal center to maintain ...

Added: October 21, 2017

Training of IT specialists in Russian universities: a statistical analysis

Melikyan A., Вопросы статистики 2022 Vol. 29 No. 6 P. 74–83

The article presents the results of a statistical analysis of trends in the change in the number of students studying in IT specialties in Russian higher educational institutions, and also, based on the econometric methods of data analysis, characterizes universities with a high number and a significant proportion of students studying in information technology programs. ...

Added: December 12, 2022

Introduction to econometrics. Part 2. Econometrics and regression analysis.

Polyakov K. L., Московский государственный институт электроники и математики, 2012.

The book considers the use of regression analysis methods for solving quantitative problems in economics. It gives a number of theoretical results necessary for mastering other sections of econometrics. It is intended for senior students with a good mathematical and economic background, graduate students, and engineers who apply regression analysis methods in practice. ...

Added: March 16, 2013

Differences in enterprises' personnel policy in the context of the degree of workers' unionization: an empirical analysis of construction industry companies in the Sverdlovsk region

Aleksandrova E., Калабина Е. Г., Арсланова Л. Ш., Вестник Омского университета. Серия: Экономика 2014 No. 1 P. 34–42

Drawing on an analysis and synthesis of the relevant scientific literature, empirical studies, and the authors' own views on the unionization of workers, the paper identifies and substantiates the theoretical propositions that form the methodological basis of this research. The study is based on an interdisciplinary approach to the study of the complex relationships ...

Added: October 10, 2015

Mortality by level of education in Russia

Pyankova A., Fattakhov T., Экономический журнал Высшей школы экономики 2017 Vol. 21 No. 4 P. 623–647

Earlier papers revealed that educational differences in mortality in Russia in the 1970s–1980s were at least as significant as in Western countries and were largely similar to those observed in the Eastern Bloc. Since 1998, little has been known about the socio-economic characteristics of mortality, because data collection was discontinued and resumed only in 2011. Contemporary ...

Added: December 6, 2017

Demand for adult education and factors associated with participation in it: Russia against the background of OECD countries

Popov D., Экономическая социология 2019 Vol. 20 No. 2 P. 122–153

The article examines the education that adults receive over the course of their lives. The effective accumulation and transmission of knowledge is a basic social process that ensures the functioning of all social institutions, including the economy. The literature has shown that adult participation in education has quite measurable consequences for life, including improved economic standing, social well-being, and health, and the preservation of ...

Added: July 30, 2019