A Comparison of the Missing-Indicator Method and Complete Case Analysis in Case of Categorical Data
This study provides a comprehensive analysis of how the missing-indicator method performs, relative to complete case analysis, when a categorical independent variable in a regression has missing values. While the latter seems to be the most popular way to handle missing data, the former appears to be a simple and effective alternative that keeps the full sample available for analysis. Using a statistical experiment on simulated data, we examined how the two methods perform under conditions that differ in the missingness mechanism, the proportion of missing data, and the model specification. The results show that, overall, both methods produce unbiased estimates of the regression coefficients but substantially biased estimates of their standard errors and of additional statistics such as R², adjusted R², and the F-statistic, especially in the case of the missing-indicator method. We attribute these results to the contribution of the missing-indicator variable, whose coefficient always turns out to be significant and far from zero.
Keywords: missing data, regression analysis, categorical data, missing-indicator method
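The contrast between the two approaches can be sketched in code. The following is a minimal illustration, not the authors' simulation design: it assumes a three-level categorical regressor `x`, one fully observed covariate `z`, values missing completely at random, and made-up true coefficients. All function names are hypothetical. It uses only the Python standard library, so OLS is solved through the normal equations rather than a statistics package.

```python
import random


def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def ols(rows, y):
    """Return OLS coefficients and classical standard errors."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    rss = sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2
              for r, yi in zip(rows, y))
    sigma2 = rss / (len(y) - k)
    se = []
    for i in range(k):
        # i-th diagonal element of (X'X)^-1, obtained from a unit vector
        unit = [1.0 if j == i else 0.0 for j in range(k)]
        se.append((sigma2 * solve(xtx, unit)[i]) ** 0.5)
    return beta, se


def simulate(n=4000, p_missing=0.3, seed=0):
    """Assumed data-generating process; the coefficients are illustrative."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.randrange(3)              # categorical regressor: 0, 1, 2
        z = rng.gauss(0, 1)               # fully observed covariate
        y = 1.0 + 0.5 * (x == 1) + 1.5 * (x == 2) + 0.8 * z + rng.gauss(0, 1)
        miss = rng.random() < p_missing   # MCAR: independent of x, z and y
        data.append((x, z, y, miss))
    return data


def complete_case(data):
    """Drop every row in which the categorical regressor is missing."""
    kept = [(x, z, y) for x, z, y, miss in data if not miss]
    rows = [[1.0, float(x == 1), float(x == 2), z] for x, z, _ in kept]
    return ols(rows, [y for _, _, y in kept])


def missing_indicator(data):
    """Keep all rows; zero the category dummies and flag missing values."""
    rows = [[1.0, float(x == 1 and not miss), float(x == 2 and not miss),
             z, float(miss)] for x, z, _, miss in data]
    return ols(rows, [y for _, _, y, _ in data])


if __name__ == "__main__":
    data = simulate()
    for name, fit in [("complete case", complete_case),
                      ("missing indicator", missing_indicator)]:
        beta, se = fit(data)
        print(name, [round(b, 3) for b in beta], [round(s, 3) for s in se])
```

Under complete case analysis, rows with a missing `x` are dropped. Under the missing-indicator method they are retained: their category dummies are set to zero and an extra dummy flags the missing value. Printing both fits lets a reader compare coefficients and standard errors for one simulated sample; a single run is not a substitute for the repeated-sample experiment described in the abstract.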
Publication based on the results of:
Zhuchkova S., Rotmistrov A., Quality and Quantity 2022 Vol. 56 No. 1 P. 1-22
The study is devoted to a comparison of three approaches to handling missing data of categorical variables: complete case analysis, multiple imputation (based on random forest), and the missing-indicator method. Focusing on OLS regression, we describe how the choice of the approach depends on the missingness mechanism, its proportion, and model specification. The results of ...
Added: February 20, 2021
Zhuchkova S., Rotmistrov A., Shabanova E., Мониторинг общественного мнения: Экономические и социальные перемены 2021 No. 4 P. 23-52
If missingness is encountered in a categorical regressor, which approach is preferable: complete case analysis or the missing-indicator method? The former approach implies including in analysis (linear regression in our research) only the cases without missingness across analyzed variables. This approach is embedded in many statistical applications by default, and despite the opinion that its ...
Added: December 12, 2020
Kotyrlo E., Прикладная эконометрика 2017 No. 3 P. 74-99
Studying users and segmenting them by their preferred topics of discussion and their networking is a unique opportunity offered by social networks. The paper summarizes a variety of approaches to social media analysis based on social network analysis and text mining. It is extended by concentration index application and visualizing of the ...
Added: October 20, 2017
Vólchenko O., Shirokanova A., Социология: методология, методы, математическое моделирование 2016 No. 43 P. 7-62
The paper deals with multilevel regression modelling (MLM) as a method preferred to the ordinary least-squares regression in the analysis of comparative data with hierarchical data structure. We present substantive reasons (contextual sources of heterogeneity, causal heterogeneity, and generalisability of results) and statistical reasons (obtaining more precise and reliable estimates) for multilevel modelling. We also ...
Added: August 9, 2017
Zhuchkova S., Rotmistrov A., Социология: методология, методы, математическое моделирование 2018 No. 46 P. 85-122
The paper addresses an approach to working with missing data "as is": missing values are treated as one more category of the variable under study. This approach is radically different from the alternatives, which either delete the observations that contain missing values or replace missings ...
Added: September 17, 2018
Mersiyanova I. V., Malakhov D., Ivanova N., Экономическая социология 2019 Vol. 20 No. 3 P. 66-89
The paper focuses on the role of family in forming the consistency of volunteering traditions in contemporary Russia. The paper investigated the correlation between parental volunteering and the current volunteering of their children. International studies indicate that family impact on children’s attitude towards volunteering is a significant channel of intergenerational transmission of prosocial behavioral patterns. ...
Added: June 5, 2019
Inglehart R. F., Ponarin E., Равлик М. В., Социологические исследования 2014 No. 11 P. 23-33
Country-level variables that may significantly influence inflows of migrants in the world are analyzed by means of regression analysis. In particular, we find significant influence of non-economic factors (especially, education) on migration flows. The database in the form of bilateral migration flows for each country was constructed on the basis of the UN data on ...
Added: October 8, 2014
Климова С. Г., Galitskaya E., Galitsky E., Вестник Института социологии 2010 No. 1 P. 328-351
The article explores the procedural aspect of constructing structural and logical typologies with the aim of creating the innovation index - workers' attitudes guiding innovation and innovation-related behavior at the workplace. ...
Added: November 12, 2012
Zangieva I., Suleimanova A., Социология: методология, методы, математическое моделирование 2016 No. 42 P. 7-60
Multiple imputation is an approach to missing data elimination created by Donald Rubin. The purpose of multiple imputation is to reconstruct the initial structure of data, i.e. to generate the answers as close as possible to hypothetical complete dataset. However, the original algorithm of multiple imputation is complicated and demands a major amount of effort ...
Added: March 2, 2017
Ozhegov E. M., Митрохина Е. М., Вестник Пермского университета. Серия: Политология 2016 No. 2 P. 61-81
The research analyzes the impact of the parliament structure on budget expenditures in Russian regions. According to the partisan theory, political parties represented in parliaments pursue a political course in accordance with their ideology. Thus, politics influences the economy. Using regression analysis, we test the hypothesis that left-wing political parties increase social expenditures. We use a dataset containing ...
Added: August 17, 2016
Fabrykant M., Социология: методология, методы, математическое моделирование 2015 No. 41 P. 7-29
The article substantiates and describes the multiple imputation technique and its procedure of dealing with missing values in dataset. It presents the model-oriented approach as opposed to design-oriented approach to analyzing survey data. The theory section lists the benefits of multiple imputation and states why it should be preferred to other ways of dealing with ...
Added: June 3, 2016
Mkhitarian V., Бакуменко Л. П., Вестник КазЭУ 2013 No. 3 (93) P. 9-20
The authors analyzed the quality of life of the population in some regions of the Russian Federation using multivariate statistical analysis. The authors found that a higher quality of life, in particular a longer life expectancy, can be achieved by adjusting the demographic indicators, cash income, development of health, social and environmental security in the Volga Federal District. While ...
Added: March 17, 2014
Radionova M. V., Фролова Н. В., Чичагов В. В., Пермь : Пермский государственный университет, 2011
The textbook was prepared by the authors on the basis of their experience teaching econometrics to students of the economics faculties of Perm State University (PSNRU) and HSE University in Perm. It sets out the basic material for the sections of the "Econometrics" course. In addition to the necessary theoretical material, it gives many examples of the practical application of theoretical results. A large number of practical examples, appendices, and statistical tables, as well as assignments for students' independent work, are intended to ...
Added: February 5, 2013
A. P. Nelyubin, V. V. Podinovski, Computational Mathematics and Mathematical Physics 2023 Vol. 63 No. 5 P. 730-742
A new approach to estimating approximation parameters is developed. In this approach, the distance of the approximating function from a given finite set of points is estimated by a vector criterion the components of which are the absolute values of residuals at all points. Using this criterion, the remoteness preference relation is defined, and the nondominated function with ...
Added: June 9, 2023
Kamolikova V., Актуальные вопросы права экономики и управления 2017 P. 184-187
The paper examines the political, social, and economic factors according to which federal authorities redistributed budget transfers to support the regions of the Russian Federation in 2003-2012. The regression analysis confirmed a strategy of bilateral financing that relies not only on political factors but also on socioeconomic ones. Economic growth allowed the federal center to maintain ...
Added: October 21, 2017
Nazarova V., Айтюкова Ю. М., Токушева Л. Р., Финансы и бизнес 2022 Vol. 18 No. 1 P. 42-60
The article analyzes the impact of the ESG rating on the creation of value for shareholders of companies in developing countries (Russia and China are taken for the study). Findings based on a sample from 2007 to 2021 indicate that the ESG rating has a positive impact on the performance of acquiring stocks in both ...
Added: January 12, 2023
Andreev E. M., Shkolnikov V., Pridemore W. A. et al., Population Health Metrics 2015 Vol. 13 No. 23 P. 1-25
Background: We present a method for reclassifying external causes of death categorized as “event of undetermined intent” (EUIs) into non-transport accidents, suicides, or homicides. In nations like Russia and the UK the absolute number of EUIs is large, the EUI death rate is high, or EUIs comprise a non-trivial proportion of all deaths due to ...
Added: September 8, 2015
Kamolikova V., In: Актуальные вопросы права, экономики и управления: сборник статей VI Международной научно-практической конференции. Пенза : МЦНС "Наука и просвещение", 2017. P. 184-187.
The paper examines the political, social, and economic factors according to which federal authorities redistributed budget transfers to support the regions of the Russian Federation in 2003-2012. The regression analysis confirmed a strategy of bilateral financing that relies not only on political factors but also on socioeconomic ones. Economic growth allowed the federal center to maintain ...
Added: October 21, 2017
Пенза : МЦНС "Наука и просвещение", 2017
The paper examines the political, social, and economic factors according to which federal authorities redistributed budget transfers to support the regions of the Russian Federation in 2003-2012. The regression analysis confirmed a strategy of bilateral financing that relies not only on political factors but also on socio-economic ones. Economic growth allowed the federal center to maintain with ...
Added: October 21, 2017
Melikyan A., Вопросы статистики 2022 Vol. 29 No. 6 P. 74-83
The article presents the results of a statistical analysis of trends in the change in the number of students studying in IT specialties in Russian higher educational institutions, and also, based on the econometric methods of data analysis, characterizes universities with a high number and a significant proportion of students studying in information technology programs. ...
Added: December 12, 2022
Polyakov K. L., Московский государственный институт электроники и математики, 2012
The use of regression analysis methods for solving quantitative problems in economics is considered. A number of theoretical results needed to master other branches of econometrics are presented.
For senior students with a strong background in mathematics and economics, graduate students, and engineers who apply regression analysis methods in practice. ...
Added: March 16, 2013
Aleksandrova E., Калабина Е. Г., Арсланова Л. Ш., Вестник Омского университета. Серия: Экономика 2014 No. 1 P. 34-42
Drawing on an analysis and synthesis of the relevant scientific literature, empirical research, and the authors' own views on the unionization of workers, the paper identifies and substantiates the theoretical principles that form the methodological basis of this research. The study is based on an interdisciplinary approach to the study of the complex relationships ...
Added: October 10, 2015
Pyankova A., Fattakhov T., Экономический журнал Высшей школы экономики 2017 Vol. 21 No. 4 P. 623-647
Earlier papers revealed that educational differences in mortality in Russia in the 1970s–1980s were at least as significant as in Western countries and were largely similar to those observed in the Eastern Bloc. Since 1998, little has been known about the socio-economic characteristics of mortality, because data collection was discontinued and resumed only in 2011. Contemporary ...
Added: December 6, 2017
Popov D., Экономическая социология 2019 Vol. 20 No. 2 P. 122-153
The article examines the education that adults receive over the course of their lives. The effective accumulation and transmission of knowledge is a basic social process that ensures the functioning of all social institutions, including the economy. The literature has shown that adult participation in education has quite measurable consequences for life, including improvements in economic standing, social well-being, and health, and the preservation ...
Added: July 30, 2019