linktest

A link test can be run after any single-equation estimation command (e.g., regress). The test is based on the idea that if a regression(-like) equation is properly specified, no additional independent variables should be significant except by chance. The link test looks for a specific type of specification error called a link error, wherein the dependent variable needs to be transformed (linked) to accurately relate to the independent variables. The link test refits the model using the prediction from the original fit (_hat) and the square of that prediction (_hatsq) as the only predictors, then tests whether the squared term is significant. For a model without a link error, the t-test on the squared term will be nonsignificant.
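To make the mechanics concrete, here is a rough hand-built sketch of the same idea. It assumes the auto dataset that ships with Stata, and the variable names myhat and myhatsq are illustrative only (they are not the names linktest itself creates).

```stata
* Hand-built sketch of the link test idea (illustrative only)
sysuse auto, clear
regress mpg weight displacement foreign

* Linear prediction from the fitted model, and its square
predict double myhat, xb
generate double myhatsq = myhat^2

* Refit using only the prediction and its square as predictors
regress mpg myhat myhatsq
```

The t statistic on myhatsq in the final regression is the quantity of interest. For a linear regression this should correspond to what linktest reports for its squared term.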

[R] linktest offers a relatively straightforward example of the basic idea. Consider following along.
First, we sysuse auto to load the sample dataset of interest. Next, we run regress mpg weight displ foreign, which uses car weight, engine displacement, and country of manufacture (foreign or domestic) to predict miles per gallon. While the model looks all right, we next run linktest to check whether the squared term has additional explanatory power.
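The steps just described can be typed as follows. This is a minimal sketch using only the commands named in the text; displacement is written out in full here, although Stata also accepts the abbreviation displ under its default settings.

```stata
* Load the example dataset shipped with Stata
sysuse auto, clear

* Predict miles per gallon from weight, displacement, and origin
regress mpg weight displacement foreign

* Run the link test on the model just estimated
linktest
```

In the linktest output, look at the row for _hatsq: its t statistic and p-value are what the test turns on.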

We see from the t-test for hatsq that the squared term is a significant predictor, so the model fails the link test. To fix this, we rely on a common (and useful) misinterpretation of the result: we read the problem as a misspecification of the independent variables, conditional on the specification of the model. While we could try generating a new variable (generate weightsq=weight*weight), it might make more sense in this case to address the dependent variable, creating a new variable of gallons per mile (gen galpermi=1/mpg) and running the regression on the same predictors (regress galpermi weight displ foreign). We then run linktest again.

The t-test for hatsq is now nonsignificant, indicating that the respecified model passes the link test.
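The respecified model can be reproduced with the commands below. This is a sketch that follows the variable name used in the text (galpermi) and again assumes the auto dataset bundled with Stata.

```stata
* Reload the data so the example is self-contained
sysuse auto, clear

* Transform the dependent variable: gallons per mile instead of miles per gallon
generate galpermi = 1/mpg

* Refit with the transformed outcome and the same predictors
regress galpermi weight displacement foreign

* Rerun the link test on the new model
linktest
```

Because linktest works on the most recent estimation results, it must be run immediately after the regress command it is meant to check.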

To learn more about the link test, you may want to read Tukey's original paper on the subject (from JSTOR: http://www.jstor.org/sici?sici=0006-341X%281949%295%3A3%3C232%3AODOFFN%3E2.0.CO%3B2-H&origin=serialsolutions) and/or Daryl Pregibon's Ph.D. dissertation (1979), Data Analytic Methods for Generalized Linear Models, which expanded on Tukey's idea and presents the test Stata uses (his later 1980 publication is easier to find but presents a different version of the test, one not used by Stata). As usual, you can also try help linktest, though the file is sparse.
