Your life is made up of constraints on your time. When you go to work, how much time you spend studying, and the amount of sleep you need are all examples of constraints placed on you. You can think about how free you are in terms of how many constraints are placed upon you.
In statistics, there are constraints as well. Chi-Squared tests use degrees of freedom to describe how free a test is based on the constraints placed on it. Read on to figure out how free the Chi-Squared test really is!
Many tests use degrees of freedom, but here you will see degrees of freedom as it relates to Chi-Squared tests. In general, the degrees of freedom measures how many values are still free to vary once you account for the quantities you have calculated from the data. The more quantities you have calculated using your sample, the less freedom you have to make choices with your data. Of course, there is a more formal way to describe these constraints as well.
A constraint, also called a restriction, is a requirement placed on the data by the model for the data.
Let's look at an example to see what that means in practice.
Suppose you are doing an experiment where you roll a four-sided die \(200\) times. Then the sample size is \(n=200\). One constraint is that your experiment needs the sample size to be \(200\), so the frequencies for the four sides must add up to \(200\).
The number of constraints will also depend on the number of parameters you need to describe a distribution, and whether or not you know what these parameters are.
Next, let's look at how the constraints relate to degrees of freedom.
For most cases, the formula
degrees of freedom = number of observed frequencies - number of constraints
can be used. If you go back to the example with the four-sided die above, there was one constraint. The number of observed frequencies is \(4\) (the number of sides on the die). So the degrees of freedom would be \(4-1 = 3\).
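One way to see why the single constraint removes one degree of freedom is to notice that once three of the four frequencies are known, the fourth is forced by the total. The sketch below illustrates this for the die example. The three observed counts and the helper name `forced_last_frequency` are made up for illustration and are not data from the experiment.

```python
def forced_last_frequency(total, free_frequencies):
    """Return the one remaining frequency that the fixed total forces."""
    remaining = total - sum(free_frequencies)
    if remaining < 0:
        raise ValueError("the free frequencies already exceed the total")
    return remaining


n = 200                        # sample size from the die example
free_choices = [48, 55, 51]    # hypothetical counts for three of the sides
forced = forced_last_frequency(n, free_choices)

number_of_observed_frequencies = len(free_choices) + 1
number_of_constraints = 1      # the frequencies must add up to n
degrees_of_freedom = number_of_observed_frequencies - number_of_constraints

print(forced)                  # 46
print(degrees_of_freedom)      # 3
```

Only three of the four counts could be chosen freely, which matches \(4-1 = 3\).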
There is a more general formula for the degrees of freedom:
degrees of freedom = number of cells (after combining) - number of constraints.
You might be wondering what a cell is and why you might combine it. Let's look at an example.
You send out a survey to \(200\) people asking how many pets people have. You get back the following table of responses.
Table 1. Responses from pet ownership survey.
Pets | \(0\) | \(1\) | \(2\) | \(3\) | \(4\) | \(>4\) |
Expected | \(60\) | \(72\) | \(31\) | \(20\) | \(7\) | \(10\) |
However, the model you are using is only a good approximation if none of the expected values falls below \(15\). So you could combine the last two columns of data (known as cells) into the table below.
Table 2. Responses from pet ownership survey with combined cells.
Pets | \(0\) | \(1\) | \(2\) | \(3\) | \(>3\) |
Expected | \(60\) | \(72\) | \(31\) | \(20\) | \(17\) |
Then there are \(5\) cells, and one constraint (that the total of the expected values is \(200\)). So the degrees of freedom is \(5 - 1= 4\).
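The combining step can be sketched in code. This is a minimal illustration under some assumptions: it merges adjoining cells from left to right until each group reaches the minimum, and the function name `combine_cells` is invented here. It is one simple strategy, not the only way to combine cells.

```python
def combine_cells(labels, expected, minimum):
    """Merge adjoining cells, left to right, until each group's expected
    value is at least `minimum`. Returns (groups_of_labels, totals)."""
    groups, totals = [], []
    pending_labels, pending_total = [], 0
    for label, value in zip(labels, expected):
        pending_labels.append(label)
        pending_total += value
        if pending_total >= minimum:
            groups.append(pending_labels)
            totals.append(pending_total)
            pending_labels, pending_total = [], 0
    if pending_labels:
        # Leftover cells at the end join the previous group if there is one.
        if groups:
            groups[-1].extend(pending_labels)
            totals[-1] += pending_total
        else:
            groups.append(pending_labels)
            totals.append(pending_total)
    return groups, totals


labels = ["0", "1", "2", "3", "4", ">4"]
expected = [60, 72, 31, 20, 7, 10]     # Table 1
groups, totals = combine_cells(labels, expected, minimum=15)

constraints = 1                        # expected values must total 200
degrees_of_freedom = len(totals) - constraints

print(groups)               # [['0'], ['1'], ['2'], ['3'], ['4', '>4']]
print(totals)               # [60, 72, 31, 20, 17]
print(degrees_of_freedom)   # 4
```

The last group holds the cells for \(4\) and \(>4\) pets, which is the \(>3\) column of Table 2.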
You will usually only combine adjoining cells in your tables of data. Next, let's look at the official definition of degrees of freedom with the Chi-Squared distribution.
If you have a random variable \(X\) and want to do an approximation for the statistic \(X^2\), you would use the \(\chi^2\) family of distributions. This is written as
\[\begin{align} X^2 &= \sum \frac{(O_t - E_t)^2}{E_t} \\ &= \sum \frac{O_t ^2}{E_t} -N \\ & \sim \chi^2, \end{align}\]
where \(O_t\) is the observed frequency, \(E_t\) is the expected frequency, and \(N\) is the total number of observations. Remember that the Chi-Squared tests are only a good approximation if none of the expected frequencies is below \(5\).
For a reminder of this test and how to use it, see Chi Squared Tests.
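To check that the two forms of the statistic above agree, here is a small sketch using the four-sided die. The observed counts are hypothetical, and the expected counts assume a fair die (so \(50\) per side). The shortcut form relies on the observed and expected frequencies both adding up to \(N\).

```python
import math


def chi_squared_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all cells."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def chi_squared_shortcut(observed, expected):
    """Sum of O^2 / E minus N; valid when sum(O) == sum(E) == N."""
    n = sum(observed)
    return sum(o ** 2 / e for o, e in zip(observed, expected)) - n


observed = [48, 55, 51, 46]   # hypothetical counts from 200 rolls
expected = [50, 50, 50, 50]   # assumes a fair four-sided die

direct = chi_squared_statistic(observed, expected)
shortcut = chi_squared_shortcut(observed, expected)

# Both forms give approximately 0.92 for these counts.
print(direct, shortcut, math.isclose(direct, shortcut))
```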
The \(\chi^2\) distributions are actually a family of distributions that depend on the degrees of freedom. The degrees of freedom for this kind of distribution are written using the variable \(\nu\). Since you may need to combine cells when using \(\chi^2\) distributions, you would use the definition below.
For the \(\chi^2\) distribution, the number of degrees of freedom, \(\nu\) is given by
\[ \nu = \text{number of cells after combining}-1.\]
There will be cases where cells won't be combined, and in that case, you can simplify things a bit. If you go back to the four-sided die example, there are \(4\) possibilities that could come up on the die, and each one is a cell with its own expected value. So for this example \(\nu = 4 - 1 = 3\) even if you are using a Chi-Squared distribution to model it.
To be sure you know how many degrees of freedom you have when using the Chi-Squared distribution, it is written as a subscript: \(\chi^2_\nu \).
Once you know that you are using a Chi-Squared distribution with \(\nu\) degrees of freedom, you will need to use a degrees of freedom table so that you can do hypothesis tests. Here is a section out of a Chi-Squared table.
Table 3. Chi-Squared table.
Degrees of freedom | \(0.99\) | \(0.95\) | \(0.9\) | \(0.1\) | \(0.05\) | \(0.01\) |
\(2\) | \(0.020\) | \(0.103\) | \(0.211\) | \(4.605\) | \(5.991\) | \(9.210\) |
\(3\) | \(0.115\) | \(0.352\) | \(0.584\) | \(6.251\) | \(7.815\) | \(11.345\) |
\(4\) | \(0.297\) | \(0.711\) | \(1.064\) | \(7.779\) | \(9.488\) | \(13.277\) |
The first column of the table contains the degrees of freedom, and the first row of the table contains the areas to the right of the critical value.
The notation for a critical value of \(\chi^2_\nu\) which is exceeded with probability \(a\%\) is \(\chi^2_\nu(a\%)\) or \(\chi^2_\nu(a/100)\).
Let's take an example using the Chi-Squared table.
Find the critical value for \(\chi^2_3(0.01)\).
Solution:
The notation for \(\chi^2_3(0.01)\) tells you that there are \(3\) degrees of freedom and you are interested in the \(0.01\) column of the table. Looking at the intersection of the row and column in the table above, you get \(11.345\). So
\[\chi^2_3(0.01) = 11.345 . \]
There is a second use for the table, as demonstrated in the next example.
Find the smallest value of \(y\) such that \(P(\chi^2_3 > y) = 0.95\).
Solution:
Remember that the significance level is the probability that the distribution exceeds the critical value. So asking for the smallest value \(y\) where \(P(\chi^2_3 > y) = 0.95\) is the same as asking what \(\chi^2_3(0.95)\) is. Using the Chi-Squared table you can see that \(\chi^2_3(0.95) =0.352 \), so \(y=0.352\).
Of course, a table can't list all of the possible values. If you need a value which is not in the table, there are many different statistics packages or calculators that can give you Chi-Squared table values.
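As an illustration of what such a tool does, here is a minimal sketch that uses only Python's standard library. It assumes \(\nu\) is a positive whole number, uses the known closed-form expressions for the right-tail area in that case, and finds the critical value by bisection. The function names are invented for this example. In practice a statistics package would normally be used instead.

```python
import math


def chi2_right_tail(x, nu):
    """P(chi^2_nu > x) for a positive integer nu (closed-form finite sums)."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if nu % 2 == 0:
        # Even nu: exp(-x/2) * sum_{k=0}^{nu/2 - 1} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, nu // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd nu: erfc(sqrt(x/2))
    #         + sqrt(2x/pi) * exp(-x/2) * sum_k x^k / (1*3*...*(2k+1))
    term, total = 1.0, 0.0
    for k in range((nu - 1) // 2):
        if k > 0:
            term *= x / (2 * k + 1)
        total += term
    return (math.erfc(math.sqrt(half))
            + math.sqrt(2 * x / math.pi) * math.exp(-half) * total)


def chi2_critical_value(nu, right_tail_area, tol=1e-9):
    """Value exceeded with probability `right_tail_area`, found by bisection."""
    if not 0 < right_tail_area < 1:
        raise ValueError("right_tail_area must be strictly between 0 and 1")
    lo, hi = 0.0, 1.0
    while chi2_right_tail(hi, nu) > right_tail_area:
        hi *= 2.0                      # expand until the answer is bracketed
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if chi2_right_tail(mid, nu) > right_tail_area:
            lo = mid                   # tail still too large: move right
        else:
            hi = mid
    return (lo + hi) / 2.0


print(round(chi2_critical_value(3, 0.01), 3))   # 11.345
print(round(chi2_critical_value(3, 0.95), 3))   # 0.352
```

The two printed values match the table entries used in the examples above, and the same function can be called with degrees of freedom or areas that the table does not list.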
The degrees of freedom in a \(t\)-test is calculated depending on whether you are using paired samples or not. For more information on these topics, see the articles T-distribution and Paired t-test.
The degrees of freedom depends on the test you are doing. Sometimes it is the sample size minus 1, and sometimes it is the sample size minus 2. For example, in a paired t-test the degrees of freedom is the sample size minus 1.
In statistics, the degrees of freedom tells you how many independent values can vary without breaking any constraints in the problem.