Chi Square (Functioning and hypothesis testing)

  • Jul 26, 2021

The chi-square test is one of the best known and most widely used tests for analyzing qualitative variables. Its name comes from the probability distribution on which it is based. It is used to evaluate the independence between two nominal or ordinal variables, providing a method to verify whether the frequencies observed in each category are compatible with the independence of the two variables.

To carry out the evaluation, the values that would be observed under absolute independence are calculated. These are called the expected frequencies, and they are compared with the frequencies observed in the sample.
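As a minimal sketch of how the expected frequencies can be obtained for a two-way table, the usual rule is that each expected count equals its row total times its column total, divided by the grand total. The function name `expected_frequencies` and the counts below are made up for illustration.

```python
def expected_frequencies(observed):
    """Expected counts under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# Illustrative 2x2 table of observed counts (invented data).
observed = [[30, 10],
            [20, 40]]
expected = expected_frequencies(observed)
for row in expected:
    print(row)
```

The expected table keeps the same row and column totals as the observed one; only the way the counts are spread across the cells changes.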


This test can only be applied in studies based on independent samples, and when most of the expected values are greater than 5. The expected values are those that would be obtained under absolute independence between the two variables.
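The condition on the expected values can be checked before running the test. The sketch below only reports the share of cells whose expected count is above 5. The helper name `share_of_cells_above` and both tables are assumptions for illustration, and what counts as "most" is left to the analyst.

```python
def share_of_cells_above(expected, threshold=5.0):
    """Fraction of cells whose expected count is greater than the threshold."""
    cells = [value for row in expected for value in row]
    return sum(value > threshold for value in cells) / len(cells)


# Two invented tables of expected counts.
comfortable = [[20.0, 20.0], [30.0, 30.0]]
sparse = [[2.4, 9.6], [3.6, 14.4]]

comfortable_share = share_of_cells_above(comfortable)
sparse_share = share_of_cells_above(sparse)
print(comfortable_share, sparse_share)
```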

The test uses an approximation to the chi-square distribution to evaluate the probability of a difference equal to or greater than the one found between the observed data and the frequencies expected under the null hypothesis.
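That probability is the upper tail of the chi-square distribution. The following is a hand-written sketch for integer degrees of freedom that uses only the standard library. The function name `chi2_sf` is an assumption, and in practice a statistics library would normally be used instead. It starts from the closed forms for 1 and 2 degrees of freedom and applies the standard recurrence for the regularized upper incomplete gamma function.

```python
import math


def chi2_sf(x, df):
    """P(X >= x) for a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        tail, k = math.exp(-half), 2                 # closed form for df = 2
    else:
        tail, k = math.erfc(math.sqrt(half)), 1      # closed form for df = 1
    # Recurrence: sf(x; k + 2) = sf(x; k) + (x/2)^(k/2) * e^(-x/2) / Gamma(k/2 + 1)
    while k < df:
        tail += half ** (k / 2.0) * math.exp(-half) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return tail


# 3.841 is the familiar 5% critical value for 1 degree of freedom,
# so the tail probability should come out close to 0.05.
p_value = chi2_sf(3.841, 1)
print(round(p_value, 3))
```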


The accuracy of this evaluation depends on the expected values not being too small and, to a lesser extent, on the contrast between them not being too large.




What is Chi Square for

This statistic is used to test hypotheses about frequency distributions. In general, the test contrasts the observed frequencies with the frequencies expected under the null hypothesis.

This statistic can be used to test the association between two variables, which can be illustrated with a hypothetical scenario and simulated data. It is also used to evaluate how well a theoretical distribution represents the real distribution of the data in a given sample.


This is called evaluating the goodness of fit. Testing it requires measuring how well the observed data fit a theoretical or expected distribution. This case can also be illustrated with a second hypothetical scenario and simulated data.

Types of Chi square tests

The chi-square test is a hypothesis test that compares the observed distribution of the data with an expected distribution. There are several types of chi-square tests, described below:


Chi-square goodness-of-fit test

This analysis is used to check how well a sample of categorical data fits a theoretical distribution.

For example, you can check whether a die is fair by rolling it many times and using a chi-square goodness-of-fit test to determine whether the results follow a uniform distribution. The test statistic quantifies how much the observed distribution of counts deviates from the hypothesized distribution.
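The die example can be sketched as follows, assuming the usual statistic (the sum over categories of the squared difference between observed and expected counts, divided by the expected count). The roll counts are invented, and 11.070 is the standard 5% critical value for 5 degrees of freedom (six faces minus one).

```python
def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Invented counts for faces 1..6 after 60 rolls.
rolls = [8, 12, 9, 11, 10, 10]
n_rolls = sum(rolls)

# A fair die implies a uniform distribution: the same expected count per face.
expected = [n_rolls / 6] * 6

statistic = chi_square_statistic(rolls, expected)

CRITICAL_5_DF = 11.070  # 5% critical value, 5 degrees of freedom
looks_unfair = statistic > CRITICAL_5_DF
print(statistic, looks_unfair)
```

With these counts the statistic is far below the critical value, so the sketch gives no reason to doubt that the die is fair.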

Chi square test of association and independence

The calculations for these two tests are the same, but the question they answer can be different.

  • The association test is used to determine if a variable is linked to another variable.
  • The independence test is used to determine whether the observed value of one variable depends on the observed value of another variable.
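Since both tests share the same calculation, one sketch covers them. The function name `chi_square_independence` and the 2x3 table are assumptions for illustration. The degrees of freedom follow the usual rule of (rows - 1) times (columns - 1).

```python
def chi_square_independence(observed):
    """Return the chi-square statistic and degrees of freedom for a two-way table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            # Expected count if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (count - expected) ** 2 / expected

    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df


# Invented table: 2 groups (rows) by 3 response categories (columns).
table = [[20, 30, 50],
         [30, 30, 40]]
statistic, df = chi_square_independence(table)
print(round(statistic, 3), df)
```

The table does not need to be square: here it has two rows and three columns, which gives 2 degrees of freedom.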

Chi square considerations

Unlike other tests, this one places no restriction on the number of categories per variable, and the number of rows and columns in the table does not have to match.

Even so, the study must be based on independent samples, and all the expected values must be greater than 5. The expected values are those that would be obtained under absolute independence between the two variables.

Also, to use this test, the level of measurement must be nominal or higher. The statistic has no upper limit: it can take values between zero and infinity, which makes it hard to judge the strength of the association from its value alone. In addition, for a given pattern of proportions, the value of the statistic increases as the sample size increases.
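The dependence on sample size can be seen with a small sketch: doubling every cell of a table keeps the proportions identical but doubles the statistic. The helper name `chi_square_for_table` and the counts are assumptions for illustration.

```python
def chi_square_for_table(observed):
    """Chi-square statistic of a two-way table against independence."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    return sum(
        (count - row_totals[i] * col_totals[j] / total) ** 2
        / (row_totals[i] * col_totals[j] / total)
        for i, row in enumerate(observed)
        for j, count in enumerate(row)
    )


small = [[30, 10], [20, 40]]                      # invented counts, n = 100
large = [[2 * c for c in row] for row in small]   # same proportions, n = 200

small_stat = chi_square_for_table(small)
large_stat = chi_square_for_table(large)
print(round(small_stat, 3), round(large_stat, 3))
```

This is why the raw value of the statistic says little about how strong the relationship is: the same proportions produce a larger value simply because more cases were collected.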

Chi square operation

As already mentioned, this test is used with data measured on a nominal or higher scale. With the chi-square test, a null hypothesis can be established that proposes a specific probability distribution as the mathematical model of the population from which the sample was drawn.

Once the hypothesis has been stated, the contrast must be carried out. To do this, the data must be arranged in a frequency table, indicating the absolute frequency observed for each value or interval of values.

Then, assuming the null hypothesis is true, the absolute frequency that would be expected for each value or interval of values is calculated. This is the expected frequency.
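When the null hypothesis specifies a probability for each value, the expected frequency is simply the sample size multiplied by that probability. The sketch below assumes a hypothetical three-category distribution and invented observed counts.

```python
# Probabilities stated by a hypothetical null hypothesis (they must sum to 1).
null_probabilities = {"A": 0.5, "B": 0.3, "C": 0.2}

# Invented frequency table of observed absolute frequencies.
observed = {"A": 45, "B": 35, "C": 20}

sample_size = sum(observed.values())

# Expected absolute frequency per category if the null hypothesis were true.
expected = {
    category: sample_size * probability
    for category, probability in null_probabilities.items()
}

for category in observed:
    print(category, observed[category], round(expected[category], 1))
```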

Chi square hypothesis test

The chi-square test belongs to the goodness-of-fit tests (or contrasts), whose purpose is to decide whether to accept the hypothesis that a given sample comes from a population with the specific probability distribution stated in the null hypothesis.

The contrast consists of comparing the frequencies observed in the sample with the theoretical or expected frequencies that would be obtained if the null hypothesis were true. The null hypothesis is rejected if there is a significant difference between the observed and expected frequencies.
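The decision rule can be sketched by comparing the statistic with a critical value. The sketch assumes a 5% significance level and a small hard-coded table of standard critical values. The function name `reject_null` and both samples are invented for illustration.

```python
# Standard 5% critical values of the chi-square distribution, by degrees of freedom.
CRITICAL_VALUES_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070}


def reject_null(observed, expected, df):
    """Return the statistic and whether it exceeds the 5% critical value."""
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return statistic, statistic > CRITICAL_VALUES_05[df]


# Expected frequencies under a hypothetical null hypothesis (n = 100).
expected = [50.0, 30.0, 20.0]

# Three categories give 3 - 1 = 2 degrees of freedom.
close_stat, close_reject = reject_null([45, 35, 20], expected, df=2)
far_stat, far_reject = reject_null([30, 40, 30], expected, df=2)

print(round(close_stat, 3), close_reject)
print(round(far_stat, 3), far_reject)
```

The first sample stays close to the expected frequencies and the null hypothesis is kept. The second departs from them enough for the null hypothesis to be rejected.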
