Statistical Methods for Detecting and Correcting Sample Selection Bias
Gary Cuddeback
College of Social Work
The University of Tennessee
128 Henson Hall
Knoxville TN 37996-3332
(865) 974-1707
FAX: (865) 974-1662
gcuddeba@utk.edu

Beth Wilson
College of Social Work
The University of Tennessee
128 Henson Hall
Knoxville TN 37996-3332
(865) 974-1707
FAX: (865) 974-1662
bethwilson@aol.com

John G. Orme
College of Social Work
The University of Tennessee
128 Henson Hall
Knoxville TN 37996-3332
(865) 974-1707
FAX: (865) 974-1662
jorme@utk.edu

Terri Combs-Orme
College of Social Work
The University of Tennessee
128 Henson Hall
Knoxville TN 37996-3332
(865) 974-1707
FAX: (865) 974-1662
tcombs-orme@utk.edu

Purpose: Researchers seldom realize 100% participation for any research
study. If participants and non-participants are systematically different,
substantive results may be biased in unknown ways, and external or internal
validity may be compromised. Typically, social work researchers use bivariate
tests to detect selection bias (e.g., χ² to compare the race of participants
and non-participants). Occasionally, multiple regression methods are used
(e.g., logistic regression with participation/non-participation as the
dependent variable). Neither of these methods can be used to correct substantive
results for selection bias. Rather, subjective judgments are made about
the possible effects of selection bias on substantive results.
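As an illustration of the bivariate approach described above, the following minimal sketch computes a Pearson χ² test on a 2x2 table of participation status by a demographic group. The counts are hypothetical and are not drawn from the foster family study; the function name `chi_square_2x2` is likewise an assumption made for this example. The test can only flag a difference between participants and non-participants; it does not adjust substantive results.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square (no continuity correction) for a 2x2 table of counts.

    Returns the test statistic and its p-value on 1 degree of freedom.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    stat = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count under independence of row and column.
            expected = row_totals[i] * col_totals[j] / n
            stat += (table[i][j] - expected) ** 2 / expected

    # With 1 degree of freedom, P(X > stat) = erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical counts (illustration only):
# rows = participants, non-participants; columns = two demographic groups.
table = [[30, 20],
         [20, 30]]
stat, p_value = chi_square_2x2(table)
print(f"chi-square = {stat:.2f}, p = {p_value:.4f}")
```

A small p-value here would suggest that the two groups participate at different rates, which is the point at which a subjective judgment about bias is usually made.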
Methods: Sample selection models are a well-developed class
of econometric models that can be used to detect and correct for selection
bias, but these are rarely used in social work research. Data available
for participants and non-participants (e.g., demographic variables) are
used to model participation/non-participation (typically using binary probit
multiple regression). Simultaneously, a substantive model (no different
from what would otherwise be tested) is estimated and corrected for selection
bias. A wide variety of statistical methods are available to estimate these
substantive models (e.g., linear regression, binary and multinomial logistic
regression), and so sample selection models can be used to analyze almost
all types of dependent variables.
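The structure described above can be written out for the simplest case. The sketch below assumes a probit participation equation and a linear substantive equation with jointly normal errors, a formulation commonly associated with Heckman's selection model; the abstract does not specify this particular form, and all symbols are introduced here for illustration only.

```latex
% Participation (selection) equation, estimated for all sampled cases:
s_i^{*} = z_i'\gamma + u_i, \qquad s_i = \mathbf{1}[\, s_i^{*} > 0 \,]

% Substantive equation, with y_i observed only when s_i = 1:
y_i = x_i'\beta + \varepsilon_i

% Assumed joint distribution of the errors:
(u_i, \varepsilon_i) \sim N\!\left( \mathbf{0},
  \begin{pmatrix} 1 & \rho\sigma \\ \rho\sigma & \sigma^{2} \end{pmatrix} \right)

% Expected outcome among participants, with the selection correction term:
E[\, y_i \mid s_i = 1 \,] = x_i'\beta + \rho\sigma\,\lambda(z_i'\gamma),
\qquad \lambda(c) = \frac{\phi(c)}{\Phi(c)}
```

Under these assumptions, a test of \rho = 0 serves as a test for selection bias, and including the term \lambda(z_i'\gamma) corrects the estimate of \beta when \rho differs from zero.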
Results: This presentation will: (1) give an overview of sample
selection models; (2) illustrate selected models using data from a study
of 230 foster families in which there was 70% participation; (3) compare
substantive results with and without the use of sample selection models;
(4) discuss computer software for estimating sample selection models; and
(5) direct conference participants to additional literature in this area.
Implications: Sample selection models can help further social
work research by providing researchers with methods of detecting and correcting
sample selection bias.