Defining Variable Construction

Many researchers worry a great deal about incomplete or missing data, while other researchers seem not to worry at all. Such responses may indicate more about the people than about the problems. With this in mind I want to begin by stating: I like incomplete data and think there should be more of it. I do not intend to be facetious, but I do wish to make a key point: structural equation models do not require all variables to be measured on all individuals under all conditions. Such a notion means a significant change in the cost/effort estimation philosophy of a software project.

In this article I outline some available structural equation models for dealing with incomplete data. I emphasize the term "incomplete," rather than the more common term "missing," because the latter term usually is associated with negative consequences or problems. Of course, this subtle terminology makes little difference and what really matters are the reasons why the data are incomplete, missing, or otherwise unobserved. I try to show how incomplete data can be a useful part of experimental analyses, how incomplete data can and probably should be dealt with, and how some experiments can actually benefit from having more incomplete data. I also weigh some potential costs and benefits of having incomplete data.

I present examples of four different kinds of incomplete data: (a) latent variables, (b) omitted variables, (c) randomly missing data, and (d) nonrandomly missing data. To illustrate these distinctions I use two structural factor path models (presented in Figures 1a and 1b). I will show how, in the example presented here, the two-factor path model (Figure 1b) is clearly preferable to the one-factor model (Figure 1a). We can then examine some basic questions, such as: "What are structural factor models with incomplete data?", "Do incomplete data models actually work?", "When will these incomplete data models fail?", and "How can these models be useful in planning experiments?" To answer these questions I examine the results obtained for these same models with and without different subjects and variables. These sensitivity analyses display critical aspects of the available data, and also illustrate some useful information about fitting models with incomplete data.
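The difference between kinds (c) and (d) can be made concrete with a small simulation. The following is only a sketch on synthetic scores, not the example data discussed in this article: the function name, the sample size, the 40% drop rate, and the median cutoff are all assumptions chosen for illustration.

```python
import numpy as np

def simulate_missingness(n=20000, seed=0):
    """Compare the observed mean of y under two hypothetical missingness rules."""
    rng = np.random.default_rng(seed)

    # Two correlated synthetic scores; y has a true mean of 0 and variance of 1.
    x = rng.normal(size=n)
    y = 0.6 * x + rng.normal(scale=0.8, size=n)

    # (c) Randomly missing: each y is dropped with probability 0.4,
    # independently of every score.
    random_observed = rng.random(n) >= 0.4

    # (d) Nonrandomly missing: y is dropped whenever it falls below its own
    # median, so missingness depends on the unobserved value itself.
    nonrandom_observed = y >= np.median(y)

    return {
        "complete": float(y.mean()),
        "random": float(y[random_observed].mean()),
        "nonrandom": float(y[nonrandom_observed].mean()),
    }

if __name__ == "__main__":
    for label, mean in simulate_missingness().items():
        print(f"{label:>10}: mean of observed y = {mean:.3f}")
```

Under the random rule the mean of the remaining scores stays close to the complete-data mean, whereas under the nonrandom rule it is pushed well above it. This is one simple way to see why the reasons for incompleteness matter more than the label attached to it.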

The techniques I use here are not novel, but they are under-utilized. These demonstrations closely follow my previous work on longitudinal incomplete data models. Many of these ideas use the multivariate statistical models described. The identical incomplete data principles have also been used in a wide variety of other contexts, including survey methods (e.g., Madow, Olkin & Rubin, 1983), item-response theory (Mislevy, 1991), and inter-battery measurement (e.g., Werts, Rock & Grandy, 1979). I try to highlight similarities among these seemingly different methodologies.

The example here is based on a structural path analytic model with equality constraints on correlations, variances, and means. Model information will be accumulated over many independent groups, including the possibility of separate specification equations for each individual score vector. Some model assumptions I make here ...
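One common way to accumulate information over individual score vectors is to let each individual contribute a likelihood based only on the variables actually observed for that individual. The sketch below assumes multivariate normal scores and marks unobserved entries with NaN; the function name `casewise_loglik` and the inputs `mu` and `sigma` (model-implied means and covariances) are hypothetical names for illustration, and this is not necessarily the exact procedure used in the example.

```python
import numpy as np

def casewise_loglik(data, mu, sigma):
    """Sum Gaussian log-likelihoods over rows, using only each row's observed entries.

    data  : (n, p) array with np.nan marking unobserved scores
    mu    : (p,) model-implied mean vector (assumed given)
    sigma : (p, p) model-implied covariance matrix (assumed given)
    """
    data = np.asarray(data, dtype=float)
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)

    total = 0.0
    for row in data:
        observed = ~np.isnan(row)
        k = int(observed.sum())
        if k == 0:
            continue  # a fully unobserved vector contributes no information

        # Select the rows and columns of mu and sigma that match the
        # variables this individual actually has.
        resid = row[observed] - mu[observed]
        sub_sigma = sigma[np.ix_(observed, observed)]

        _, logdet = np.linalg.slogdet(sub_sigma)
        quad = resid @ np.linalg.solve(sub_sigma, resid)
        total += -0.5 * (k * np.log(2 * np.pi) + logdet + quad)
    return total

if __name__ == "__main__":
    scores = np.array([[0.0, 0.0],       # both variables measured
                       [0.0, np.nan]])   # second variable not measured
    print(casewise_loglik(scores, mu=np.zeros(2), sigma=np.eye(2)))
```

Individuals who share the same pattern of observed variables could equally be collected into one group and handled together, which corresponds to accumulating the fit over independent groups rather than over single score vectors.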