# Hypothesis Testing (Contingency Table Test) (G Dataflow)

Last Modified: June 25, 2019

Tests whether the row and column categorical variables of a contingency table are independent.

## table

Contingency table of counts or frequencies.

## significance level

Probability that this node incorrectly rejects a true null hypothesis.

Default: 0.05

## error in

Error conditions that occur before this node runs.

The node responds to this input according to standard error behavior.

Standard Error Behavior

Many nodes provide an error in input and an error out output so that the node can respond to and communicate errors that occur while code is running. The value of error in specifies whether an error occurred before the node runs. Most nodes respond to values of error in in a standard, predictable way.

| error in does not contain an error | error in contains an error |
| --- | --- |
| If no error occurred before the node runs, the node begins execution normally. If no error occurs while the node runs, it returns no error. If an error does occur while the node runs, it returns that error information as error out. | If an error occurred before the node runs, the node does not execute. Instead, it returns the error in value as error out. |

Default: No error

## null hypothesis rejected?

A Boolean that indicates whether this node rejects the null hypothesis.

| Value | Meaning |
| --- | --- |
| True | p value is less than or equal to significance level. This node rejects the null hypothesis in favor of the alternative hypothesis. |
| False | p value is greater than significance level. This node does not reject the null hypothesis. |

## p value

Smallest significance level that leads to rejection of the null hypothesis based on the sample sets.

## contingency table test information

Sample statistics of the contingency table test.

### degree of freedom

Degrees of freedom of the chi-squared distribution that the test statistic follows.

### sample chi-squared value

Sample test statistic used in the contingency table test.

### chi-squared critical value

Chi-squared value that corresponds to significance level.

Algorithm for Calculating chi-squared critical value

chi-squared critical value satisfies the following equation:

Prob{$X_n$ > chi-squared critical value} = significance level

where $X_n$ represents a chi-squared distributed variate with n degrees of freedom.

## error out

Error information.

The node produces this output according to standard error behavior.


## Types of Tests with Contingency Tables

This node uses Pearson's chi-squared test of homogeneity and test of independence to test your hypothesis.

For the chi-squared test of homogeneity, you take a random sample of a fixed size from each category in one categorization scheme. For each sample, categorize the objects of experimentation according to the second scheme and tally them. This node tests the hypothesis that the populations from which the samples are taken are identically distributed with respect to the second categorization scheme.

For the chi-squared test of independence, you take only one sample from the total population, categorize each object, and tally each object in two categorization schemes. This node tests the hypothesis that the categorization schemes are independent.
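As a rough illustration of the tallying step for the test of independence, the following Python sketch builds a contingency table from paired categorical observations. The function name `build_contingency_table`, the category labels, and the sample data are all hypothetical and are not part of this node; the node itself only receives the finished table.

```python
from collections import Counter


def build_contingency_table(pairs, row_labels, col_labels):
    """Tally (row_category, column_category) observations into a table of counts."""
    counts = Counter(pairs)
    # Counter returns 0 for combinations that never occur in the sample.
    return [[counts[(r, c)] for c in col_labels] for r in row_labels]


# Hypothetical sample: each object is classified under two schemes.
observations = [
    ("smoker", "case"), ("smoker", "control"), ("smoker", "case"),
    ("non-smoker", "control"), ("non-smoker", "control"), ("non-smoker", "case"),
]
table = build_contingency_table(
    observations, ["smoker", "non-smoker"], ["case", "control"]
)
print(table)  # [[2, 1], [1, 2]]
```

For the test of homogeneity the table has the same shape; the only difference is that each row (or column) comes from a separate sample of fixed size.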

## Algorithm for Testing the Hypothesis of Contingency Tables

For both tests, this node tests the hypothesis with the same algorithm. Let $y_{(p,q)}$ be the number of occurrences in the (p, q)th cell of the contingency table for p = 0, 1, ..., (s - 1) and q = 0, 1, ..., (k - 1).

And let

$y_{p}=\sum_{q=0}^{k-1} y_{(p,q)}$

$y_{q}=\sum_{p=0}^{s-1} y_{(p,q)}$

$y=\sum_{p=0}^{s-1}\sum_{q=0}^{k-1} y_{(p,q)}$

$e_{(p,q)}=\frac{y_{p}\,y_{q}}{y}$

$x=\sum_{p=0}^{s-1}\sum_{q=0}^{k-1}\frac{\left(y_{(p,q)}-e_{(p,q)}\right)^{2}}{e_{(p,q)}}$

where

• s is the number of rows in the contingency table
• k is the number of columns in the contingency table
• $e_{(p,q)}$ is the expected count or frequency in the (p, q)th cell
• x is sample chi-squared value
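The definitions above can be sketched in a few lines of Python. This is a minimal illustration, not the node's implementation: the function name `chi_squared_statistic` and the sample table are hypothetical, and the sketch assumes every row and column total is positive so that no expected count is zero.

```python
def chi_squared_statistic(table):
    """Return (x, expected) for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]        # y_p for each row p
    col_totals = [sum(col) for col in zip(*table)]  # y_q for each column q
    total = sum(row_totals)                         # y
    # e_(p,q) = y_p * y_q / y
    expected = [[rp * cq / total for cq in col_totals] for rp in row_totals]
    # x = sum over all cells of (y_(p,q) - e_(p,q))^2 / e_(p,q)
    x = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    return x, expected


# Hypothetical 2 x 2 table of counts.
table = [[10, 20], [20, 10]]
x, expected = chi_squared_statistic(table)
print(expected)  # [[15.0, 15.0], [15.0, 15.0]]
print(x)         # approximately 6.667 (exactly 20/3)
```

Every cell in this example differs from its expected count by 5, so each contributes 25/15 to the statistic.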

This node uses sample chi-squared value to calculate p value according to the following equation:

p value = Prob{X > x}

where X is a random variable from the chi-squared distribution. If the hypothesis is true, x comes from a chi-squared distribution with (s - 1)(k - 1) degrees of freedom.
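The following Python sketch shows one way to turn a sample chi-squared value into a p value, a critical value, and a decision. It is an illustration under stated assumptions, not the node's implementation: the helper names `chi2_sf` and `chi2_critical_value` are hypothetical, the degrees of freedom are taken as the product (s - 1)(k - 1), which is the standard value for Pearson's test, and the tail probability uses the closed-form recurrence for the regularized upper incomplete gamma function, which is valid for integer degrees of freedom and moderate values of x.

```python
import math


def chi2_sf(x, df):
    """Prob{X > x} for a chi-squared variate X with integer df >= 1."""
    if x <= 0:
        return 1.0
    z = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-z)             # Q(1, z)
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))  # Q(1/2, z)
    # Recurrence: Q(a + 1, z) = Q(a, z) + z^a * exp(-z) / Gamma(a + 1)
    while a < df / 2.0:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q


def chi2_critical_value(alpha, df):
    """Solve Prob{X > c} = alpha for c by bisection, with 0 < alpha < 1."""
    lo, hi = 0.0, 1.0
    while chi2_sf(hi, df) > alpha:  # expand until the root is bracketed
        hi *= 2.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if chi2_sf(mid, df) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# Hypothetical inputs: a 2 x 2 table whose sample chi-squared value is 20/3.
s, k = 2, 2
x = 20.0 / 3.0
significance_level = 0.05

df = (s - 1) * (k - 1)
p_value = chi2_sf(x, df)
critical = chi2_critical_value(significance_level, df)
rejected = p_value <= significance_level
print(df, p_value, critical, rejected)
```

Comparing p value with significance level and comparing x with the critical value lead to the same decision, because the tail probability decreases as x grows.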

Where This Node Can Run:

Desktop OS: Windows

FPGA: Not supported

Web Server: Not supported in VIs that run in a web application