Thursday, February 16, 2012

Degrees of freedom

In my seminar on Growth & Development today we discussed a paper where the sample size was fairly small, around 75 observations. The authors said that, because of the small sample size, they couldn't estimate models with a lot of regressors in them: they would run into degrees of freedom issues.

Then they proceeded to investigate upwards of 30 variables, by using them one at a time! To "save" degrees of freedom!

Yikes!

First off, excluding relevant variables from a regression biases the coefficients on the variables you keep, unless the omitted variables are somehow orthogonal to the included ones, which is EXTREMELY unlikely.
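To put a number on that first point, here is a minimal simulation sketch. Everything in it is an assumption made for illustration, not anything from the paper: the data-generating process (y depends on x1 and x2, each with a true coefficient of 1), the function names, and the sample size, which is deliberately huge so that sampling noise doesn't cloud the bias.

```python
import random


def simple_ols_slope(x, y):
    """Slope from regressing y on x alone (with an intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    return cov_xy / var_x


def short_regression_slope(rho, n=20000, seed=0):
    """Regress y on x1 only, leaving out the relevant regressor x2.

    Hypothetical setup: y = 1*x1 + 1*x2 + noise, and x2 = rho*x1 + noise,
    so rho controls how far x2 is from being orthogonal to x1.
    """
    rng = random.Random(seed)
    x1 = [rng.gauss(0, 1) for _ in range(n)]
    x2 = [rho * a + rng.gauss(0, 1) for a in x1]
    y = [a + b + rng.gauss(0, 1) for a, b in zip(x1, x2)]
    return simple_ols_slope(x1, y)


# True coefficient on x1 is 1 in both cases.
slope_correlated = short_regression_slope(rho=0.8)  # x2 correlated with x1
slope_orthogonal = short_regression_slope(rho=0.0)  # x2 orthogonal to x1

print(f"x2 omitted, correlated with x1: slope on x1 = {slope_correlated:.2f}")
print(f"x2 omitted, orthogonal to x1:   slope on x1 = {slope_orthogonal:.2f}")
```

In this setup the one-variable regression converges to 1 + rho rather than 1, so the estimate only lands near the truth in the orthogonal case.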

Second, estimating 30 small regressions on the same sample does not actually save ANY degrees of freedom over estimating one big regression on the sample.

Sure, you can say it does and use the nominal critical values in each case, but you are kidding yourself and misleading your readers.

Degrees of freedom are like cigarettes. Once you use them, they are gone. They can't be re-used over and over again.
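Here is a rough sketch of what the one-at-a-time strategy does to those nominal critical values. The setup is an assumption for illustration only: 75 observations, a dependent variable that is pure noise, and 30 candidate regressors that are also pure noise and independent of each other, each tried in its own regression. The cutoff of 1.993 is the approximate two-sided 5% critical value for a t-stat with 73 degrees of freedom, and all names are made up.

```python
import math
import random

N_OBS = 75          # sample size, as in the post
N_REGRESSORS = 30   # candidate variables tried one at a time
T_CRIT = 1.993      # approx. two-sided 5% critical value, t with 73 df


def t_stat(x, y):
    """t-statistic on the slope from regressing y on x with an intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    rss = syy - slope * sxy                   # residual sum of squares
    std_err = math.sqrt(rss / (n - 2) / sxx)  # standard error of the slope
    return slope / std_err


def share_with_false_positive(n_reps=1000, seed=0):
    """Share of simulated 'papers' that find at least one significant
    regressor even though no regressor has any true effect."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_reps):
        y = [rng.gauss(0, 1) for _ in range(N_OBS)]
        found = False
        for _ in range(N_REGRESSORS):
            x = [rng.gauss(0, 1) for _ in range(N_OBS)]
            if abs(t_stat(x, y)) > T_CRIT:
                found = True
        if found:
            hits += 1
    return hits / n_reps


rate = share_with_false_positive()
independent_benchmark = 1 - 0.95 ** N_REGRESSORS  # 30 independent 5% tests

print(f"simulated share with a spurious 'finding': {rate:.2f}")
print(f"benchmark for 30 independent tests:        {independent_benchmark:.2f}")
```

Under these assumptions roughly four out of five such papers would report at least one "significant" variable when nothing is going on. Real regressors are correlated with each other, so the exact share would differ, but the nominal 5% is nowhere near the right number.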

Overall the paper reported well over 100 estimated coefficients. On 75 data points. In a ton of different regressions all with the same dependent variable. And it used the nominal critical values in every case.

What is the critical value for a "t-stat" with negative 34 degrees of freedom?

Anyone?

Bueller?

7 comments:

Andrew said...

It's painful to read about this, even second-hand. This was a published paper?! Which journal allowed it?

Mungowitz said...

I'm with Andrew. Citation or it didn't happen.

Angus said...

forthcoming in a quite "respectable" journal. I'm trying to be the good blogger and not name names since my co-blogger is going all "lefty bedwetter" on everyone's ass!

1 out of 20, not bad said...

So I'm assuming they found that 5 of their coefficients were stat sig at the 0.05 level and 10 at the 0.1 level?

Norman said...

OK, you win. I thought the worst was the public econ paper that had a grand total of 8 observations and then provided the astounding main result that all three of its coefficients were statistically insignificant.

On the other hand, this makes every rejection letter from now on a little bit more insulting.

Natalie said...

I heard a speaker at a conference once say that "the little plus or minus sign you see in a survey, that's the confidence interval!"

Dr. Tufte said...

I agree that the paper is junk, or worse because it apparently has a seal of approval.

But, c'mon ... the author was worried about 75 observations and 30 estimates? That leaves 45 degrees of freedom. That's considered quite a lot in fields where data gathering is destructive (petroleum geology or some weapons testing).

We have too many people in social sciences teaching statistics based on their personal experience with very large survey data sets, where you can often find statistically significant results that are meaningless because the scale of the effect is so small.