What are covariance, temporal precedence, and internal validity?

Corresponding Author: Eli Tsukayama, University of Pennsylvania, 3701 Market St., Suite 219, Philadelphia, PA 19104, elit@psych.upenn.edu

The publisher's final edited version of this article is available free at Soc Psychol Personal Sci

Abstract

The predictive validity of personality for important life outcomes is well established, but conventional longitudinal analyses cannot rule out the possibility that unmeasured third-variable confounds fully account for the observed relationships. Longitudinal hierarchical linear models (HLM) with time-varying covariates allow each subject to serve as his or her own control, thus eliminating between-individual confounds. HLM also allows the directionality of the causal relationship to be tested by reversing time-lagged predictor and outcome variables. We illustrate these techniques through a series of models that demonstrate that within-individual changes in self-control over time predict subsequent changes in GPA but not vice-versa. The evidence supporting a causal role for self-control was not moderated by IQ, gender, ethnicity, or income. Further analyses rule out one time-varying confound: self-esteem. The analytic approach taken in this study provides the strongest evidence to date for the causal role of self-control in determining achievement.

Keywords: causal modeling, self-control, self-regulation, academic achievement

Beyond Prediction: Establishing the Causal Role of Self-Control in Academic Achievement

As every graduate student in psychology knows, correlation does not necessarily imply causation. What every graduate student may not know or remember is that even prediction does not necessarily imply causation. Thus, whereas the predictive validity of personality in general and of self-control in particular for important life outcomes is well-established (Duckworth & Seligman, 2005; Mischel, Shoda, & Rodriguez, 1989; Ozer & Benet-Martinez, 2006), no study has incontrovertibly ruled out the possibility that some unmeasured confound (i.e., a lurking third variable) fully accounts for the observed associations.

According to John Stuart Mill’s classical formulation (Shadish, Cook, & Campbell, 2002), establishing a causal relationship requires three criteria: (a) temporal precedence (i.e., the cause precedes the effect), (b) covariance (i.e., the cause and effect are related), and (c) disqualification of alternative explanations (i.e., no third variable accounts for the observed relationship). Random-assignment, double-blind, placebo-controlled experimental designs meet all three criteria when their assumptions are met (e.g., no attrition and perfect compliance). Temporal precedence is established by manipulating the hypothesized cause and measuring its subsequent effect on outcomes. Covariance between the hypothesized cause and outcome is established through statistical tests. Finally, potential third-variable confounds are controlled by randomly assigning participants to condition.

Manipulating personality in a random-assignment experiment could, in theory, establish its causal role for later outcomes but, alas, personality is not easily manipulated. To our knowledge, no empirical investigation to date has successfully manipulated trait-level self-control and measured subsequent effects on life outcomes. (However, see Bandura & Mischel, 1965; Diamond, Barnett, Thomas, & Munro, 2007; Rueda, Rothbart, McCandliss, Saccomanno, & Posner, 2005, for studies that make important strides in this direction.) Consequently, researchers interested in the relationship between self-control and life outcomes have resorted to correlational designs. Prospective longitudinal designs can satisfy temporal precedence, and a significant correlation, by definition, fulfills the covariance criterion. However, ruling out third variables remains a challenge. Although potential third variables can (and should) be anticipated, measured, and statistically controlled, it is theoretically impossible to be sure that one has measured all potential confounds.

Traditional longitudinal analyses that control for prior levels of the outcome can demonstrate that the relationship between x (e.g., self-control) at Time 1 and y (e.g., grade point average [GPA]) at Time 2 is not due to the cross-sectional correlation between x and y at Time 1 and the autocorrelation between y at Times 1 and 2 (i.e., y at Time 1 is treated as a third variable that is controlled by partialling out its correlation with y at Time 2). However, such designs do not rule out other potential third-variable confounds. For example, if self-control at Time 1 predicts GPA at Time 2 controlling for GPA at Time 1, then the interpretation is that individuals with higher levels of self-control at Time 1 have larger changes in GPA. Although the DV is a within-individual change, the IV is still a between-individual variable. Consequently, between-individual third variables can still confound the relationship. Most panel models (e.g., cross-lagged panel models with autoregressive paths) through path analysis or structural equation modeling are basically variants of this type of analysis.

Longitudinal growth-curve modeling using hierarchical linear models (HLM) offers a partial solution to the third-variable problem.1 In particular, longitudinal HLM offers an opportunity to have each subject serve as his or her own control, thus eliminating between-individual third-variable confounds. Using repeated measures of both predictors and outcomes, we can examine within-individual covariance of these variables over time. To illustrate, in this study we used HLM to model student achievement trajectories across 4 years and treated self-control as a time-varying covariate. Our findings provide the strongest evidence to date that self-control indeed causes, and does not merely predict, academic performance.

Controlling for Time-Invariant Confounds

The lurking third-variable problem is rarely mentioned explicitly as a limitation in longitudinal studies, probably because one can never rule out all potential unmeasured confounds. But, we argue, one can and should at least admit the possibility of unmeasured third variables. Moreover, one should seek out innovations in statistical analysis that improve our ability, as Cronbach (1957) put it, to “observe and organize data from Nature’s experiments” (p. 672).

Growth curve analysis using HLM is one such innovation. HLM can, for example, model change in an outcome over time by estimating a growth curve for each individual. Each growth curve conveys information about an individual’s baseline (i.e., the intercept), and the change from one year to the next (i.e., the slope). When predictors are treated as time-varying covariates, they can be used to explain short-term deviations from the overall growth trajectory. By treating a predictor as a time-varying covariate in the prediction of trajectories, one can rule out the possibility of all time-invariant confounds (e.g., relatively stable variables such as socioeconomic status). Specifically, if short-term changes in a predictor predict subsequent short-term changes in achievement, a confounding variable z would have to predict these changes and also be tightly yoked to changes in the predictor over time (i.e., the confound and predictor would have to go up and down together in synchrony over time). HLM also allows the directionality of the causal relationship to be confirmed by reversing the time-lagged predictor and outcome variables. HLM with reversed time-lagged, time-varying covariates has been used in at least one short-term diary study (Almeida, Wethington, & Chandler, 1999), but to our knowledge this is the first long-term study that applies this technique.
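The logic of letting each person serve as his or her own control can be illustrated with a small simulation. The sketch below is not the HLM analysis reported in this article. It is a simplified within-person (demeaned) regression on hypothetical data, in which a stable trait z raises both the predictor x and the outcome y. All names, values, and the true effect of 2.0 are assumptions made for illustration.

```python
def ols_slope(xs, ys):
    """Slope from a simple least-squares regression of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def demean(values):
    """Subtract a person's own mean from each of that person's scores."""
    m = sum(values) / len(values)
    return [v - m for v in values]


# Hypothetical data: three people, four occasions each.
TRUE_EFFECT = 2.0
stable_trait = [0.0, 2.0, 4.0]      # z: differs between people, constant over time
wobble = [-0.5, 0.5, -0.5, 0.5]     # within-person fluctuation in x

pooled_x, pooled_y, within_x, within_y = [], [], [], []
for z in stable_trait:
    x = [z + w for w in wobble]
    y = [TRUE_EFFECT * xi + 5.0 * z for xi in x]   # z also raises y directly
    pooled_x += x
    pooled_y += y
    within_x += demean(x)
    within_y += demean(y)

pooled_slope = ols_slope(pooled_x, pooled_y)   # contaminated by z
within_slope = ols_slope(within_x, within_y)   # z has been subtracted out
print(round(pooled_slope, 2), round(within_slope, 2))
```

With these made-up numbers the pooled slope is inflated well above 2.0, whereas the within-person slope recovers the true effect, because anything that is constant within a person drops out when scores are expressed as deviations from that person's own mean. A confound that rises and falls over time together with x would not drop out in this way.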

This Study

In this study, we used HLM growth curve analysis with time-varying covariates to test the hypothesis that self-control plays a causal role in academic achievement. To increase reliability and validity, we employed a multisource approach to measurement. Parent-, teacher-, and self-report ratings of self-control were collected in the fall of 4 consecutive academic years. At the conclusion of each academic year, final GPA was recorded from school records. In addition, to demonstrate how to control for possible time-varying confounds, we replicated these analyses with self-esteem as a covariate. Measured IQ, gender, ethnicity, and socioeconomic status were tested as possible moderators.

Method

Participants

The participants were students from a socioeconomically and ethnically diverse public magnet school in a city in the northeastern United States. Students were admitted to this middle school based on prior grades and standardized test scores.

In Fall 2003, about 86% of the school’s 164 fifth-grade students (n = 142) elected to participate in a longitudinal study of character strengths and academic achievement in children. Signed child assent and parent consent forms that assured participants of the confidentiality of their data were received for all participants. In Fall 2004, 3 students left and 38 new students joined the cohort (now composed of sixth graders). In this revised cohort, 49 new students joined the study and 17 students did not reconsent. The final sample was composed of 189 participants who contributed data at least once in 2003 or 2004.

In October 2003, when the first set of self-control measures was administered, the mean age of participants was 10.52 years (SD = 0.37). Approximately 48.9% of participants were Caucasian, 29.6% were Black, 13.2% were Asian, 5.8% were Latino, and 1.6% were of other ethnic backgrounds. Of the participants, 54% were female; 24% were from low-income families, as indicated by participation in the federal lunch program.

Procedure

We collected data in the 4 school years between 2003 and 2007 (i.e., from fifth to eighth grades). Each fall, we collected self-report, parent, and teacher self-control questionnaires and self-report self-esteem questionnaires. Each spring, we recorded report card grades from school records. Approximately 57% of our sample had GPA and self-control data for all 4 years of the study, 26% had data for 3 years, 5% had data for 2 years, and 13% had data for 1 year. In Spring 2003, we collected IQ scores.

Measures

Self-control

The Brief Self-Control Scale (Tangney, Baumeister, & Boone, 2004) includes 13 items endorsed on a 5-point scale where 1 = not like me at all and 5 = very much like me (e.g., “I have a hard time breaking bad habits” and “I do certain things that are bad for me, if they are fun”). Parents and teachers completed a version of the Brief Self-Control Scale written in the third-person, with the student as the target (e.g., “This child/student has a hard time breaking bad habits”). To avoid confounding teacher ratings and teacher-determined grades, students’ homeroom advisors rather than course teachers completed the questionnaires. Observed internal reliabilities for self, parent, and teacher report self-control across the 4 years ranged from α = .83 to .96.

Self-esteem

The Rosenberg Self-Esteem Scale (Rosenberg, 1965) includes 10 items (e.g., “On the whole, I am satisfied with myself”) endorsed on a 4-point scale ranging from 1 = strongly disagree to 4 = strongly agree. Observed reliabilities across the four years ranged from α = .87 to .89.

Report card grades

GPA was measured on a 100-point scale. Overall GPA was used for statistical analyses.

Gender, ethnicity, and family income

We obtained gender, ethnicity, and home address information from school records. We used home address information in conjunction with data from the U.S. Bureau of the Census (2000) to estimate household income. To normalize the distribution for statistical analyses, we performed a natural log transformation on the income data.

IQ

As a measure of intelligence, we used the Otis-Lennon School Ability Test–Seventh Edition (Otis & Lennon, 1997) Level F. This 40-min, group-administered, paper-and-pencil test measures verbal, quantitative, and figural reasoning skills. The school ability index for this test is a standard score normalized according to the student’s age in months, with a mean of 100 and a standard deviation of 16. Normal curve equivalent scores were derived from percentile ranks for use in statistical analyses.

Data Analysis Strategy

We used HLM growth curve models to estimate baseline and annual change in observed outcomes for each student as they moved from fifth grade through eighth grade. Because the main goal of our analyses was to examine whether within-individual changes in self-control predicted changes in GPA, we included baseline and time-varying predictor variables to explore the potential causal relationship between self-control and achievement. To assess the possibility of bidirectional causality, we analyzed two sets of models: (a) GPA as the outcome with self-control as a time-varying covariate and (b) self-control as the outcome with GPA as a time-varying covariate. In both sets of models, the time-varying covariates were measured 6 months prior to the outcome (see Figure 1). Time-varying covariates were centered around the individuals’ means (see Raudenbush & Bryk, 2002, p. 31, for a discussion of centering). The individual mean was used as the baseline measure, which shows the long-term relationship between self-control and achievement, whereas the time varying measure shows the short-term (i.e., 6-month periodic) relationship between self-control and achievement. To improve interpretation of the models, we centered time at fifth grade (i.e., fifth grade = 0, sixth grade = 1, seventh grade = 2, etc.) so that the growth curve intercept for each student represented their baseline achievement (or self-control for the second model) for fifth grade.2
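The centering and time coding described above can be sketched in a few lines. The function below is a hypothetical helper, not the authors' code: it takes one student's fall self-control scores and end-of-year GPAs (values invented for illustration), codes time so that fifth grade is 0, and splits self-control into an individual mean (a between-individual predictor) and deviations from that mean (the time-varying covariate). The squared time term is included only because a quadratic growth model is used later in the Results.

```python
def build_rows(fall_self_control, spring_gpa, first_grade=5):
    """Turn one student's yearly scores into Level 1 rows plus a Level 2 mean.

    Both lists are ordered by grade; each fall self-control score is paired
    with the GPA recorded at the end of that same school year.
    """
    person_mean = sum(fall_self_control) / len(fall_self_control)
    rows = []
    for offset, (sc, gpa) in enumerate(zip(fall_self_control, spring_gpa)):
        rows.append({
            "grade": first_grade + offset,
            "time": offset,                    # fifth grade = 0, sixth = 1, ...
            "time_sq": offset ** 2,            # quadratic growth term
            "sc_centered": sc - person_mean,   # Level 1: within-person deviation
            "gpa": gpa,
        })
    return person_mean, rows


# Hypothetical scores for one student in Grades 5 through 8.
mean_sc, rows = build_rows([4.0, 4.5, 3.5, 4.0], [88.0, 90.0, 86.0, 88.0])
for row in rows:
    print(row)
```

By construction the centered scores sum to zero within each student, so they carry no information about whether that student is generally high or low in self-control. That information is carried separately by the individual mean.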

Figure 1

Illustration of self-control and grade point average (GPA) as time-varying covariates

Results

Self-Control

Cross-sectional correlations between parent, teacher, and self-report ratings of self-control ranged from r = .18 to .51. We created a composite measure of self-control by averaging parent, teacher, and self-report scores on the Brief Self-Control Scale. In the 4 years of the study, approximately 15% of assessments were missing either parent, teacher, or self-report ratings; in these cases, we averaged the two nonmissing scores. About 4% were missing two of these three scores; in these cases, we used the single nonmissing scores. Following Nunnally (1967), we calculated the reliability of these composite scores: .94, .95, .94, and .93, in Grades 5 through 8, respectively.
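The rule for handling missing raters amounts to averaging whichever ratings are available. A minimal sketch follows, assuming each rating is already a scale mean on the 1-to-5 metric and that a missing rating is represented by None. The function name and example values are hypothetical.

```python
def composite_self_control(parent=None, teacher=None, self_report=None):
    """Average whichever of the three ratings are available (None = missing)."""
    available = [r for r in (parent, teacher, self_report) if r is not None]
    if not available:
        return None   # no rating at all for this assessment
    return sum(available) / len(available)


print(composite_self_control(4.0, 3.5, 4.5))                 # all three sources
print(composite_self_control(parent=4.0, self_report=3.0))   # teacher missing
print(composite_self_control(teacher=3.8))                   # single source
```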

Descriptive Statistics

Mean differences in GPA across grades suggest that GPA increased slightly from fifth to sixth grade and then declined through eighth grade (see Table 1). Mean self-control and self-esteem both declined from fifth to eighth grade.

Table 1

Means, Standard Deviations, and Number of Participants Contributing Grade Point Average (GPA), Self-Control, and Self-Esteem Data at Each Time Point

Grade   GPA                     Self-Control            Self-Esteem
        M       SD      n       M       SD      n       M       SD      n
5       87.83   5.92    142     4.07    0.51    142     3.34    0.53    138
6       89.84   5.79    165     4.05    0.53    168     3.32    0.54    168
7       88.30   6.35    156     3.87    0.53    158     3.28    0.55    149
8       87.77   5.90    157     3.90    0.53    157     3.30    0.58    144

Changes in Self-Control Prospectively Predict GPA

Preliminary analyses revealed that a quadratic growth model provided a better fit to the GPA data than a linear growth model did. Consequently, we added individual mean centered self-control as a time-varying covariate at Level 1 to a quadratic growth model of GPA. Centering self-control around an individual’s mean essentially removes between-individual information, so we added the individuals’ mean self-control back into the model as a grand mean centered predictor of the intercept at Level 2. We also added individuals’ mean self-control as a predictor of the slope and quadratic term. This modeling strategy allowed us to estimate the relationship between short-term changes in self-control and short-term changes in GPA along with the relationships between a student’s average self-control and baseline GPA and the rates of change in GPA.

Level 1—Within individual:

GPAti = π0i + π1i(Timeti) + π2i(Timeti²) + π3i(Individual mean centered SCti) + εti

(1a)

Level 2—Between individual:

π0i = β00 + β01(Individual mean SCi) + ς0i

(1b)

π1i = β10 + β11(Individual mean SCi) + ς1i

(1c)

π2i = β20 + β21(Individual mean SCi) + ς2i

(1d)

π3i = β30

(1e)
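Substituting the Level 2 equations (1b) through (1e) into the Level 1 equation (1a) gives a single combined equation, which can make the role of each coefficient easier to see. This is a standard algebraic rearrangement, not an additional model. The shorthand symbols \overline{SC}_i (individual mean self-control) and \widetilde{SC}_{ti} (individual-mean-centered self-control) are introduced here only for compactness.

```latex
\begin{aligned}
\mathrm{GPA}_{ti} ={}& \beta_{00} + \beta_{01}\,\overline{SC}_i
  + \left(\beta_{10} + \beta_{11}\,\overline{SC}_i\right)\mathrm{Time}_{ti}
  + \left(\beta_{20} + \beta_{21}\,\overline{SC}_i\right)\mathrm{Time}_{ti}^{2} \\
 &+ \beta_{30}\,\widetilde{SC}_{ti}
  + \varsigma_{0i} + \varsigma_{1i}\,\mathrm{Time}_{ti}
  + \varsigma_{2i}\,\mathrm{Time}_{ti}^{2} + \varepsilon_{ti}
\end{aligned}
```

In this form β30 is the within-individual coefficient of interest. Because (1e) contains no random term, it is estimated as a single fixed effect common to all students.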

Controlling for time, changes in self-control during middle school predicted changes in GPA, β30 = 1.81, t(610) = 4.47, p < .001.2 The associated effect-size correlation (see Rosenthal & Rosnow, 1991, p. 441) was reffect = .18, indicating a small-to-medium sized effect (Cohen, 1992) of short-term changes in self-control on changes in GPA 6 months later. At Level 2, individual differences in mean self-control were associated with individual differences in GPA, β01 = 5.35, t(187) = 6.12, p < .001, reffect = .47, suggesting a stronger relationship between a student’s average self-control and baseline achievement in fifth grade. Individual differences in mean self-control also predicted the slope, β11 = 1.83, t(187) = 2.25, p < .026, and the quadratic term, β21 = −0.52, t(187) = −2.17, p < .031, indicating that individuals with higher self-control tend to have a steeper increase in GPA between fifth and sixth grades but also a steeper decline from seventh to eighth grades relative to their more impulsive peers.
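The effect-size correlation can be recovered from a reported t statistic and its degrees of freedom. The sketch below assumes the widely used conversion r = sqrt(t² / (t² + df)). That this is exactly the formula on the cited page is an assumption, although it does reproduce the value of .18 reported for the within-individual effect.

```python
import math


def effect_size_r(t, df):
    """Effect-size correlation from a t statistic: r = sqrt(t^2 / (t^2 + df))."""
    return math.sqrt(t ** 2 / (t ** 2 + df))


# Within-individual effect of self-control on GPA: t(610) = 4.47
print(round(effect_size_r(4.47, 610), 2))
```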

Changes in GPA Do Not Prospectively Predict Self-Control

Using a linear growth model with self-control as the outcome, we added individual-mean centered GPA as a time-varying covariate at Level 1 and added the individual’s mean GPA into the model as a predictor of the intercept at Level 2. A quadratic term does not appear because this model used only the three time points where GPA was measured 6 months prior to self-control (see Figure 1).3

Level 1—Within individual:

Self − controlti = π0i + π1i(Timeti) + π2i(Individual mean centered GPAti) + εti

(2a)

Level 2—Between individual:

π0i = β00 + β01(Individual mean GPAi) + ς0i

(2b)

π1i = β10 + β11(Individual mean GPAi) + ς1i

(2c)

π2i = β20

(2d)

Short-term changes in GPA did not predict subsequent changes in self-control, β20 < 0.001, t(424) = 0.05, p = .96, reffect < .01. Although individual differences in mean GPA were associated with individual differences in baseline self-control, β01 = 0.05, t(162) = 4.61, p < .001, reffect = .34, mean GPA did not predict annual change in self-control, β11 = −0.001, t(162) = −0.30, p = .77, reffect = .02.

Mean Self-Control, IQ, Gender, Family Income, and Ethnicity Do Not Moderate the Effect of Self-Control on GPA

Next, we examined whether individual-level variables moderated the effect of within-individual changes in self-control on GPA by adding potential moderators at Level 2 in separate models.4

Level 1—Within individual:

GPAti = π0i + π1i(Timeti) + π2i(Timeti²) + π3i(Individual mean centered SCti) + εti

(3a)

Level 2—Between individual:

π0i = β00 + β01(Individual mean SCi) + ς0i

(3b)

π1i = β10 + β11(Individual mean SCi) + ς1i

(3c)

π2i = β20 + β21(Individual mean SCi) + ς2i

(3d)

π3i = β30 + β31(Potential moderator)

(3e)

Although HLM can handle unbalanced or missing data at Level 1, higher levels cannot have missing predictors. Consequently, we excluded subjects who did not have Level 2 predictors from the respective analysis. As shown in Table 2, mean self-control, IQ, gender, and family income did not moderate the effect of within-individual changes in self-control on GPA. Because ethnicity was coded as a set of dummy variables, we conducted an omnibus General Linear Hypothesis test. Ethnicity did not moderate the effects of self-control on GPA as indicated by a nonsignificant General Linear Hypothesis test, χ2(4) = 3.94, p = .41. In all moderation analyses, the relationship between within-individual changes in self-control and within-individual changes in GPA (β30) remained significant.
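The p value for the omnibus test can be checked by referring the reported statistic to a chi-square distribution with 4 degrees of freedom (one per ethnicity dummy variable). The sketch below uses the closed form of the chi-square upper tail that holds when the degrees of freedom are even. How the HLM software computed the test internally is not stated in the article, so this is only a check on the reported numbers, and the function name is hypothetical.

```python
import math


def chi2_upper_tail_even_df(x, df):
    """Upper-tail p-value of a chi-square statistic when df is even.

    For even df the survival function has the closed form
    exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only applies to positive even df")
    half = x / 2.0
    total = sum(half ** k / math.factorial(k) for k in range(df // 2))
    return math.exp(-half) * total


# Omnibus test of ethnicity as a moderator: chi-square(4) = 3.94
print(round(chi2_upper_tail_even_df(3.94, 4), 2))
```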

Table 2

Moderation Analyses of the Within-Individual Effect of Self-Control on Grade Point Average (GPA) in Quadratic Growth Models With Self-Control as a Time-Varying Covariate

Level-2 Predictor                    Coefficient   SE      Effect Size r   p      n
Individual mean self-control, β31    −0.59         0.93    .03             .53    189
IQ, β31                              0.04          0.04    .05             .28    141
Gender, β31                          0.64          0.79    .03             .39    189
Income, β31                          0.53          0.83    .03             .53    187
Ethnicity                                                                         186
  Black, β31                         1.04          0.89    .05             .24    56
  Latino, β32                        −0.25         1.65    .01             .88    11
  Asian, β33                         −0.87         1.22    .03             .48    25
  Other, β34                         −2.25         3.44    .03             .51    3

Changes in Self-Control Prospectively Predict GPA Controlling for Changes in Self-Esteem

The relationship between changes in self-esteem and changes in GPA was slight in magnitude and only marginally significant, β30 = 0.71, t(584) = 1.95, p = .052, reffect = .08, in a quadratic growth model with self-esteem as a time-varying covariate and individual mean self-esteem as a Level 2 predictor of the intercept, slope, and quadratic term. When both self-control and self-esteem were added simultaneously as time-varying covariates and Level 2 predictors of the intercept, slope, and quadratic term, the relationship between short-term changes in self-control and changes in GPA remained significant, β30 = 1.89, t(580) = 4.27, p < .001, reffect = .17, whereas the relationship between short-term changes in self-esteem and changes in GPA was not significant, β31 = 0.39, t(580) = 1.08, p = .28, reffect = .04. Interestingly, this effect of self-esteem is marginally significant if self-control is not included in the model, suggesting that the effect of self-esteem may actually be explained by the effect of self-control. Considering that the multisource approach to measuring self-control might have provided an unfair advantage in terms of increased reliability, we conducted the analysis again with just self-reported self-control (as opposed to the composite measure). This analysis produced similar results, β30 = 1.52, t(579) = 4.76, p < .001, reffect = .19, and β31 = 0.07, t(579) = 0.18, p = .86, reffect = .01.

Discussion

This study provides the most rigorous evidence to date that self-control causally influences academic achievement. A series of HLM growth curve analyses demonstrate that changes in self-control over time predicted subsequent changes in GPA, but changes in GPA over time did not predict subsequent changes in self-control. The evidence supporting a causal role for self-control was not moderated by IQ, gender, ethnicity, or income. Third-variable confounds that were stable over time did not account for the observed relations between self-control and GPA. Further analyses ruled out at least one time-varying confound, self-esteem, which was associated with both self-control and academic achievement but did not account for the effect of self-control on achievement. Rather, results suggested that the effect of self-control may have accounted for the association between self-esteem and achievement.

Notwithstanding the adage that correlation does not imply causation, the social science literature is littered with studies that confuse the two. The urge to attribute causal status to a psychological variable x for determining later outcome y is particularly strong when design features rule out many threats to internal validity. For instance, well-designed, prospective, longitudinal studies often include the following features: (a) x is measured at Time 1 and y is measured at Time 2, (b) the effect of x on y is theoretically predicted, (c) outcome y is measured objectively, (d) the effect of x at Time 1 on y at Time 2 holds when controlling for y at Time 1, and (e) theoretically predicted third-variable confounds are controlled for. Whereas such studies (e.g., Duckworth & Seligman, 2005) provide robust evidence for predictive validity, they nevertheless are vulnerable to the possibility that an unmeasured third-variable confound fully accounts for the observed predictive relationship between x and y.

This longitudinal HLM study illustrated an innovative analytic strategy that effectively controlled for all time-invariant third-variable confounds. Furthermore, we tested and ruled out the possibility that self-esteem, which varies over time and is highly correlated with both self-report and composite measures of self-control, accounted for the observed findings. What our analyses did not rule out, however, is the possibility of an unmeasured time-varying third variable that changes in sync with self-control and causally determines subsequent academic performance. What might such time-varying third variables be? We can only speculate. Intrinsic interest in academic work? Self-efficacy expectations? Why modulations in these variables would be synchronized with changes in self-control is not obvious, but this investigation does not conclusively rule out these possibilities.

As mentioned earlier, a tightly controlled experimental study could go one (final) step further to establish the causal role of self-control for life outcomes. We hope that by providing the strongest evidence to date for the causal importance of self-control to objectively measured important outcomes, our findings underscore the theoretical and practical import of such intervention research.

Footnotes

1The term hierarchical linear models (HLM) refers to a general class of models that is known by several other designations, including mixed linear models, multilevel linear models, mixed-effects models, random-effects models, random-coefficient regression models, covariance-components models, and variance-components models. Many of these models are special cases of each other; however, we chose to use the term HLM because we believe it is the most descriptive and specific term for the models we use here and also the most common label used for this class of models in psychology. Although some may consider growth-curve models to be synonymous with HLM, growth-curve modeling is actually a special application of HLM that can also be estimated using structural equation models.

2Because first-quarter grade point average (GPA) was partially determined before self-control was measured each year, we also analyzed a model with the average of the second, third, and fourth quarter GPA as the outcome. The pattern of results was essentially identical to the model with cumulative GPA as the outcome: changes in self-control during middle school predicted changes in GPA, β30 = 1.28, t(607) = 2.93, p = .004; individuals with higher mean self-control had higher GPAs at fifth grade, β01 = 5.23, t(187) = 6.02, p < .001; individuals with higher self-control tended to have a steeper increase in GPA between fifth and sixth grades, β11 = 2.18, t(187) = 2.50, p < .014; but they also had a steeper decline from seventh to eighth grades relative to their more impulsive peers, β21 = −0.60, t(187) = −2.23, p < .027.

3Because our model included only three predictions of self-control from prior GPA, we also examined two linear growth models using only three predictions of GPA from prior self-control. Either using data from Grades 5, 6, and 7 or using data from Grades 6, 7, and 8, changes in self-control predicted changes in GPA: β20 = 2.43, t(455) = 4.23, p < .001, reffect = .19; and β20 = 1.35, t(470) = 3.16, p = .002, reffect = .14, respectively.

4These potential moderators vary between but not within individuals. Consequently, these between-individual differences can potentially moderate but not confound the within-individual effects. That is, the within-individual effects of self-control could be stronger (or weaker) for individuals with certain characteristics, but those characteristics cannot explain the within-individual effects (i.e., those characteristics are not third-variable confounds).

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interests with respect to the authorship and/or publication of this article.

Financial Disclosure/Funding

The research reported here was supported by the Institute of Education Sciences, U.S. Department of Education, through Grant R305C050041-05 to the University of Pennsylvania. The opinions expressed are those of the authors and do not represent views of the U.S. Department of Education.

Is temporal precedence internal validity?

Temporal precedence is the process of establishing that the cause did indeed happen before the effect, which resolves the chicken-and-egg problem. To support internal validity through temporal precedence, a researcher must establish which variable came first.

What is a temporal precedence?

Establishing a causal relationship involves (1) temporal precedence, which is establishing that the cause (i.e., independent variable) occurs before the effect (i.e., outcome); (2) establishing that the cause and effect are related and/or covary; and (3) establishing that there are no plausible alternative explanations.

What are the 3 criteria of establishing cause and effect relationship in research design?

The three criteria for establishing cause and effect – association, time ordering (or temporal precedence), and non-spuriousness – are familiar to most researchers from courses in research methods or statistics.

What does it mean that variables must have temporal precedence for causal inference?

Temporal precedence (causal inference condition 2): This condition extends the idea of causality further by stating that if A causes B, then A should come before B in time. That is, the cause should precede the effect.