# ANOVA (Two-Way ANOVA) (G Dataflow)

Performs a two-way analysis of variance (ANOVA) and returns the effect of the levels of two factors and the interactions between the factors on the experimental outcome.

## level b

Number of levels in factor b. You must specify at least two levels. Otherwise, this node returns an error.

Use the sign of level b to indicate the effect type: specify a positive value if factor b is a fixed effect and a negative value if factor b is a random effect.

Default: 2

## level a

Number of levels in factor a. You must specify at least two levels. Otherwise, this node returns an error.

Use the sign of level a to indicate the effect type: specify a positive value if factor a is a fixed effect and a negative value if factor a is a random effect.
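
The sign convention for level a and level b can be sketched in a few lines of Python. This is only an illustration, not the node's implementation: the helper name `decode_level` is hypothetical, and the sketch assumes that the magnitude of the input gives the number of levels.

```python
def decode_level(level):
    """Split a signed level input into (number of levels, effect type).

    Assumption: the absolute value is the number of levels, and the sign
    only marks the factor as a fixed (+) or random (-) effect.
    """
    count = abs(level)
    if count < 2:
        # The node requires at least two levels per factor.
        raise ValueError("a factor needs at least two levels")
    return count, "fixed" if level > 0 else "random"


print(decode_level(3))   # (3, 'fixed')
print(decode_level(-4))  # (4, 'random')
```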

Default: 2

## x

All the observational data. You must specify an equal number of observations in each cell.

The total number of data points in x must equal the result of multiplying the number of levels in each factor and the number of observations per cell. Otherwise, this node returns an error. For example, if level a is 2, level b is 3, and observations per cell is 2, x must contain 12 data points.
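
The size requirement on x can be checked before the data is wired to the node. The following Python sketch is illustrative only; the function names are hypothetical, and it assumes that a negative level (random effect) contributes its absolute value to the count.

```python
def expected_points(level_a, level_b, observations_per_cell):
    """Number of data points x must contain for a balanced design."""
    # Assumption: the sign of a level only encodes fixed vs. random.
    return abs(level_a) * abs(level_b) * observations_per_cell


def check_x_length(x, level_a, level_b, observations_per_cell):
    """Raise ValueError if x does not hold exactly the expected count."""
    expected = expected_points(level_a, level_b, observations_per_cell)
    if len(x) != expected:
        raise ValueError(f"x has {len(x)} points, expected {expected}")
    return expected


print(expected_points(2, 3, 2))  # 12
```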

## index a

The level of factor a to which the corresponding observation belongs.

This input converts input levels that do not begin with zero or input levels that have nonconsecutive values. For example, if you enter an index that contains the levels 3, 5, and 7, this input converts the levels to an index array with level values of 0, 1, and 2.
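
The conversion described above, which applies equally to index b, can be sketched as follows. The function name `normalize_index` is hypothetical, and the sketch assumes that levels are ranked in ascending order of their original values, which matches the 3, 5, 7 example but is not stated explicitly for other inputs.

```python
def normalize_index(index):
    """Map arbitrary level labels to consecutive integers starting at 0."""
    # Assumption: the smallest original label becomes level 0, and so on.
    rank = {value: i for i, value in enumerate(sorted(set(index)))}
    return [rank[value] for value in index]


print(normalize_index([3, 5, 7, 5, 3]))  # [0, 1, 2, 1, 0]
```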

## index b

The level of factor b to which the corresponding observation belongs.

This input converts input levels that do not begin with zero or input levels that have nonconsecutive values. For example, if you enter an index that contains the levels 3, 5, and 7, this input converts the levels to an index array with level values of 0, 1, and 2.

## error in

Error conditions that occur before this node runs.

The node responds to this input according to standard error behavior.

Standard Error Behavior

Many nodes provide an error in input and an error out output so that the node can respond to and communicate errors that occur while code is running. The value of error in specifies whether an error occurred before the node runs. Most nodes respond to values of error in in a standard, predictable way.

| error in does not contain an error | error in contains an error |
| --- | --- |
| If no error occurred before the node runs, the node begins execution normally. If no error occurs while the node runs, it returns no error. If an error does occur while the node runs, it returns that error information as error out. | If an error occurred before the node runs, the node does not execute. Instead, it returns the error in value as error out. |

Default: No error

## observations per cell

Number of observations in each cell. Each cell must contain at least one observation. Otherwise, this node returns an error.

Default: 1

If observations per cell is 1, this node assumes that the interaction of factor a and factor b has no effect on the experimental outcome. In that case, both level a and level b must be positive.

## significance

Values corresponding to the significance levels.

Compare each significance output with your chosen level of significance to determine whether the corresponding factor, or the interaction between the factors, has an effect on the experimental outcome. A common choice for the level of significance is 0.05. If a significance output is less than the chosen level, at least one level of the factor, or the interaction between the factors, has some effect on the experimental outcome.

For example, if factor a is a random effect, your chosen level of significance is 0.05, and significance a is 0.03, then you can conclude that factor a has an effect on the experimental outcome.

Algorithm for Calculating significance

This node calculates significance using the following equations:

$\mathrm{significance}\ a=\begin{cases}\mathrm{Prob}\left\{F_{\mathrm{dofa},\,\mathrm{dofe}}>\mathrm{fa}\right\}& \text{if } b \text{ is fixed}\\ \mathrm{Prob}\left\{F_{\mathrm{dofa},\,\mathrm{dofab}}>\mathrm{fa}\right\}& \text{if } b \text{ is random}\end{cases}$
$\mathrm{significance}\ b=\begin{cases}\mathrm{Prob}\left\{F_{\mathrm{dofb},\,\mathrm{dofe}}>\mathrm{fb}\right\}& \text{if } a \text{ is fixed}\\ \mathrm{Prob}\left\{F_{\mathrm{dofb},\,\mathrm{dofab}}>\mathrm{fb}\right\}& \text{if } a \text{ is random}\end{cases}$
$\mathrm{significance}\ ab=\mathrm{Prob}\left\{F_{\mathrm{dofab},\,\mathrm{dofe}}>\mathrm{fab}\right\}$

where

• $F_{\mathrm{dofa},\,\mathrm{dofe}}$ is the F distribution with dofa and dofe degrees of freedom
• $F_{\mathrm{dofa},\,\mathrm{dofab}}$ is the F distribution with dofa and dofab degrees of freedom
• $F_{\mathrm{dofb},\,\mathrm{dofe}}$ is the F distribution with dofb and dofe degrees of freedom
• $F_{\mathrm{dofb},\,\mathrm{dofab}}$ is the F distribution with dofb and dofab degrees of freedom
• $F_{\mathrm{dofab},\,\mathrm{dofe}}$ is the F distribution with dofab and dofe degrees of freedom
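
Each probability above is an upper-tail probability of an F distribution. The following Python sketch shows one way to evaluate such a tail with only the standard library, using the regularized incomplete beta function and a continued-fraction expansion. This is a common textbook technique and an assumption on our part, not a description of how the node computes significance; all function names are hypothetical.

```python
import math


def _guard(v, tiny=1e-300):
    """Keep a value away from zero so it can be used as a divisor."""
    return v if abs(v) > tiny else tiny


def _betacf(a, b, x, max_iter=200, eps=3e-14):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 / _guard(1.0 - qab * x / qap)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # Even step of the recurrence.
        aa = m * (b - m) * x / ((qam + m2) * (a + m2))
        d = 1.0 / _guard(1.0 + aa * d)
        c = _guard(1.0 + aa / c)
        h *= d * c
        # Odd step of the recurrence.
        aa = -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))
        d = 1.0 / _guard(1.0 + aa * d)
        c = _guard(1.0 + aa / c)
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def regularized_beta(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = math.exp(math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                     + a * math.log(x) + b * math.log1p(-x))
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _betacf(a, b, x) / a
    # Use the symmetry relation where the fraction converges faster.
    return 1.0 - front * _betacf(b, a, 1.0 - x) / b


def f_survival(f, dof_num, dof_den):
    """Prob{F(dof_num, dof_den) > f} for positive degrees of freedom."""
    if f <= 0.0:
        return 1.0
    x = dof_den / (dof_den + dof_num * f)
    return regularized_beta(dof_den / 2.0, dof_num / 2.0, x)


# Values taken from the sit-ups example in this topic (dofe = 2).
print(round(f_survival(8.6701, 1, 2), 4))   # 0.0986
print(round(f_survival(3.16495, 2, 2), 4))  # 0.2401
```

The sketch requires both degrees of freedom to be positive, so it does not cover the case where dofab is 0.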

### significance a

Level of significance associated with factor a.

### significance b

Level of significance associated with factor b.

### significance ab

Level of significance associated with the interaction of factors a and b.

## summary

A 4-by-4 matrix that displays the obtained values for analysis.

$\mathrm{summary}=\begin{bmatrix}\mathrm{ssa}& \mathrm{dofa}& \mathrm{msa}& \mathrm{fa}\\ \mathrm{ssb}& \mathrm{dofb}& \mathrm{msb}& \mathrm{fb}\\ \mathrm{ssab}& \mathrm{dofab}& \mathrm{msab}& \mathrm{fab}\\ \mathrm{sse}& \mathrm{dofe}& \mathrm{mse}& 0.0\end{bmatrix}$

where

• The first column corresponds to the sum of squares associated with factor a, factor b, ab interaction, and residual error
• The second column corresponds to the respective degrees of freedom
• The third column corresponds to the respective mean squares
• The fourth column corresponds to the respective F values

Algorithm for Calculating Sums of Squares

This node calculates the sums of squares using the following equations:

$\mathrm{ssa}=bL\sum_{p=0}^{a-1}\left(\bar{x}_{p\cdot\cdot}-\bar{x}_{\cdot\cdot\cdot}\right)^{2}$
$\mathrm{ssb}=aL\sum_{q=0}^{b-1}\left(\bar{x}_{\cdot q\cdot}-\bar{x}_{\cdot\cdot\cdot}\right)^{2}$
$\mathrm{ssab}=\begin{cases}L\sum_{p=0}^{a-1}\sum_{q=0}^{b-1}\left(\bar{x}_{pq\cdot}-\bar{x}_{p\cdot\cdot}-\bar{x}_{\cdot q\cdot}+\bar{x}_{\cdot\cdot\cdot}\right)^{2}& \text{if } L>1\\ 0& \text{if } L=1\end{cases}$
$\mathrm{sse}=\begin{cases}\sum_{p=0}^{a-1}\sum_{q=0}^{b-1}\sum_{r=0}^{L-1}\left(x_{pqr}-\bar{x}_{pq\cdot}\right)^{2}& \text{if } L>1\\ \sum_{p=0}^{a-1}\sum_{q=0}^{b-1}\left(x_{pq1}-\bar{x}_{p\cdot 1}-\bar{x}_{\cdot q1}+\bar{x}_{\cdot\cdot 1}\right)^{2}& \text{if } L=1\end{cases}$

where

• b is the number of levels in factor b
• L is the number of observations per cell
• a is the number of levels in factor a
• p is the index of each level in factor a, starting from 0
• $\bar{x}_{p\cdot\cdot}$ is the mean of all the observations at the pth level of factor a
• $\bar{x}_{\cdot\cdot\cdot}$ is the mean of all the observations
• q is the index of each level in factor b, starting from 0
• $\bar{x}_{\cdot q\cdot}$ is the mean of all the observations at the qth level of factor b
• $\bar{x}_{pq\cdot}$ is the mean of all the observations in the cell defined by the pth level of factor a and the qth level of factor b
• r is the index of each observation within the cell defined by the pth level of factor a and the qth level of factor b
• $x_{pqr}$ is the rth observation in the cell defined by the pth level of factor a and the qth level of factor b
• $x_{pq1}$ is the only observation in the cell defined by the pth level of factor a and the qth level of factor b, when L = 1
• $\bar{x}_{p\cdot 1}$ is the mean of all the observations at the pth level of factor a, when L = 1
• $\bar{x}_{\cdot q1}$ is the mean of all the observations at the qth level of factor b, when L = 1
• $\bar{x}_{\cdot\cdot 1}$ is the mean of all the observations, when L = 1
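
The equations above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the node's implementation: the function name is hypothetical, the design is assumed to be balanced, and the indexes are assumed to be zero-based and consecutive already.

```python
def sums_of_squares(x, index_a, index_b, a, b, L):
    """Return (ssa, ssb, ssab, sse) for a balanced two-way layout."""
    grand = sum(x) / (a * b * L)

    # Accumulate totals per level of a, per level of b, and per cell.
    total_a = [0.0] * a
    total_b = [0.0] * b
    total_cell = [[0.0] * b for _ in range(a)]
    for value, p, q in zip(x, index_a, index_b):
        total_a[p] += value
        total_b[q] += value
        total_cell[p][q] += value

    mean_a = [t / (b * L) for t in total_a]
    mean_b = [t / (a * L) for t in total_b]
    mean_cell = [[t / L for t in row] for row in total_cell]

    ssa = b * L * sum((m - grand) ** 2 for m in mean_a)
    ssb = a * L * sum((m - grand) ** 2 for m in mean_b)

    # Sum over cells of (cell mean - row mean - column mean + grand mean)^2.
    interaction = sum(
        (mean_cell[p][q] - mean_a[p] - mean_b[q] + grand) ** 2
        for p in range(a) for q in range(b)
    )
    if L > 1:
        ssab = L * interaction
        sse = sum((value - mean_cell[p][q]) ** 2
                  for value, p, q in zip(x, index_a, index_b))
    else:
        # With one observation per cell, the cell mean is the observation
        # itself, so the interaction term becomes the residual error.
        ssab = 0.0
        sse = interaction
    return ssa, ssb, ssab, sse


# Data from the sit-ups example in this topic.
ssa, ssb, ssab, sse = sums_of_squares(
    [10, 15, 20, 25, 17, 4], [0, 1, 1, 1, 0, 0], [0, 0, 2, 1, 1, 2], 2, 3, 1)
print(round(ssa, 3), round(ssb, 3), ssab, round(sse, 4))
# 140.167 102.333 0.0 32.3333
```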

Algorithm for Calculating Degrees of Freedom

This node calculates the degrees of freedom using the following equations:

$\mathrm{dofa}=a-1$
$\mathrm{dofb}=b-1$
$\mathrm{dofab}=\begin{cases}\left(a-1\right)\left(b-1\right)& \text{if } L>1\\ 0& \text{if } L=1\end{cases}$
$\mathrm{dofe}=\begin{cases}ab\left(L-1\right)& \text{if } L>1\\ \left(a-1\right)\left(b-1\right)& \text{if } L=1\end{cases}$

where

• a is the number of levels in factor a
• b is the number of levels in factor b
• L is the number of observations per cell
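
A direct transcription of these equations into Python looks like the following sketch. The function name is hypothetical; a, b, and L are taken to be the positive counts defined above.

```python
def degrees_of_freedom(a, b, L):
    """Return (dofa, dofb, dofab, dofe) for a levels, b levels, L per cell."""
    dofa = a - 1
    dofb = b - 1
    if L > 1:
        dofab = (a - 1) * (b - 1)
        dofe = a * b * (L - 1)
    else:
        # One observation per cell: no interaction term is estimated.
        dofab = 0
        dofe = (a - 1) * (b - 1)
    return dofa, dofb, dofab, dofe


print(degrees_of_freedom(2, 3, 1))  # (1, 2, 0, 2)
print(degrees_of_freedom(2, 3, 2))  # (1, 2, 2, 6)
```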

Algorithm for Calculating Mean Squares

This node calculates the mean squares using the following equations:

$\mathrm{msa}=\frac{\mathrm{ssa}}{\mathrm{dofa}}$
$\mathrm{msb}=\frac{\mathrm{ssb}}{\mathrm{dofb}}$
$\mathrm{msab}=\frac{\mathrm{ssab}}{\mathrm{dofab}}$
$\mathrm{mse}=\frac{\mathrm{sse}}{\mathrm{dofe}}$

where

• ssa is a measure of variation attributed to factor a
• dofa is the number of degrees of freedom of ssa
• ssb is a measure of variation attributed to factor b
• dofb is the number of degrees of freedom of ssb
• ssab is a measure of variation attributed to the interaction of factors a and b
• dofab is the number of degrees of freedom of ssab
• sse is a measure of variation attributed to random fluctuation
• dofe is the number of degrees of freedom of sse
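
As a worked illustration, the values reported in the sit-ups example in this topic (level a = 2, level b = 3, observations per cell = 1) give the following mean squares:

$\mathrm{msa}=\frac{140.167}{1}=140.167$
$\mathrm{msb}=\frac{102.333}{2}\approx 51.1667$
$\mathrm{mse}=\frac{32.3333}{2}\approx 16.1667$

In that example dofab is 0, and the node reports msab as 0.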

Algorithm for Calculating F Values

This node calculates the F values using the following equations:

$\mathrm{fa}=\begin{cases}\frac{\mathrm{msa}}{\mathrm{mse}}& \text{if } b \text{ is fixed}\\ \frac{\mathrm{msa}}{\mathrm{msab}}& \text{if } b \text{ is random}\end{cases}$
$\mathrm{fb}=\begin{cases}\frac{\mathrm{msb}}{\mathrm{mse}}& \text{if } a \text{ is fixed}\\ \frac{\mathrm{msb}}{\mathrm{msab}}& \text{if } a \text{ is random}\end{cases}$
$\mathrm{fab}=\frac{\mathrm{msab}}{\mathrm{mse}}$

where

• msa is the mean square quantity of ssa
• mse is the mean square quantity of sse
• msab is the mean square quantity of ssab
• msb is the mean square quantity of ssb

The greater the F value, the more significant the effect of the corresponding factor, or of the interaction of the factors, on the experimental outcome.
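
The choice of denominator in the F value equations can be sketched in Python as follows. The function and parameter names are hypothetical, and the sketch does not handle the case where a random effect is combined with msab = 0, which the node excludes by requiring positive levels when observations per cell is 1.

```python
def f_values(msa, msb, msab, mse, a_is_fixed, b_is_fixed):
    """Return (fa, fb, fab) using the denominators given in the equations."""
    # Factor a is tested against mse if b is fixed, against msab if b is random.
    fa = msa / mse if b_is_fixed else msa / msab
    # Factor b is tested against mse if a is fixed, against msab if a is random.
    fb = msb / mse if a_is_fixed else msb / msab
    fab = msab / mse
    return fa, fb, fab


# Mean squares from the sit-ups example in this topic (both factors fixed).
fa, fb, fab = f_values(140.167, 51.1667, 0.0, 16.1667, True, True)
print(round(fa, 3), round(fb, 3), fab)  # 8.67 3.165 0.0
```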

## error out

Error information.

The node produces this output according to standard error behavior.

## Random and Fixed Effects

A factor is a basis for categorizing data. A factor is a random effect if it has a large population of levels about which you want to draw conclusions, but you cannot sample from all of those levels. In that case, you pick levels at random and generalize the results to all levels.

A factor is a fixed effect if you can sample from all levels about which you want to draw conclusions.

## ANOVA Cells

In ANOVA, a cell is a combination of levels, one from each factor. For example, if you specify the inputs for this node as shown in the following table, the second table below illustrates the cell distribution.

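| Input | Value |
| --- | --- |
| level a | 2 |
| level b | 3 |
| x | [10, 15, 20, 25, 17, 4] |
| index a | [0, 1, 1, 1, 0, 0] |
| index b | [0, 0, 2, 1, 1, 2] |
| observations per cell | 1 |

| | factor b (Level 0) | factor b (Level 1) | factor b (Level 2) |
| --- | --- | --- | --- |
| factor a (Level 0) | 10 | 17 | 4 |
| factor a (Level 1) | 15 | 25 | 20 |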
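
The way x, index a, and index b combine into cells can be sketched in Python as follows. The function name `build_cells` is hypothetical, and the sketch assumes zero-based, consecutive indexes.

```python
def build_cells(x, index_a, index_b, a, b):
    """Group observations into an a-by-b grid of cells."""
    cells = [[[] for _ in range(b)] for _ in range(a)]
    for value, p, q in zip(x, index_a, index_b):
        # Observation belongs to level p of factor a and level q of factor b.
        cells[p][q].append(value)
    return cells


cells = build_cells([10, 15, 20, 25, 17, 4],
                    [0, 1, 1, 1, 0, 0],
                    [0, 0, 2, 1, 1, 2], 2, 3)
for row in cells:
    print(row)
# [[10], [17], [4]]
# [[15], [25], [20]]
```

Each inner list holds the observations of one cell, so a balanced design has inner lists of equal length.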

Using age and weight as factors, this example demonstrates how to test whether age or weight has an effect on the number of sit-ups a person can do.

The following table defines the levels of age and weight.

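| Factor | Level | Range |
| --- | --- | --- |
| factor a (age) | Level 0 | 6 years old to 10 years old |
| factor a (age) | Level 1 | 11 years old to 15 years old |
| factor b (weight) | Level 0 | less than 50 kg |
| factor b (weight) | Level 1 | between 50 and 75 kg |
| factor b (weight) | Level 2 | more than 75 kg |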

The following table lists the results of a random sampling of six people. The results are based on a series of observations of how many sit-ups people from different age and weight groups can do.

Note

To perform a two-way analysis of variance, you must make at least one observation per level, and make the same number of observations per cell.

| Person | Age | Weight | Result |
| --- | --- | --- | --- |
| Person 1 | 8 years old (Level 0) | 30 kg (Level 0) | 10 sit-ups |
| Person 2 | 12 years old (Level 1) | 40 kg (Level 0) | 15 sit-ups |
| Person 3 | 15 years old (Level 1) | 76 kg (Level 2) | 20 sit-ups |
| Person 4 | 14 years old (Level 1) | 60 kg (Level 1) | 25 sit-ups |
| Person 5 | 9 years old (Level 0) | 51 kg (Level 1) | 17 sit-ups |
| Person 6 | 10 years old (Level 0) | 80 kg (Level 2) | 4 sit-ups |

The following table lists the inputs and outputs of this node.

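| Input | Value |
| --- | --- |
| level a | 2 |
| level b | 3 |
| x | [10, 15, 20, 25, 17, 4] |
| index a | [0, 1, 1, 1, 0, 0] |
| index b | [0, 0, 2, 1, 1, 2] |
| observations per cell | 1 |

| summary | Sum of squares | Degrees of freedom | Mean square | F value |
| --- | --- | --- | --- | --- |
| factor a | ssa = 140.167 | dofa = 1 | msa = 140.167 | fa = 8.6701 |
| factor b | ssb = 102.333 | dofb = 2 | msb = 51.1667 | fb = 3.16495 |
| ab interaction | ssab = 0 | dofab = 0 | msab = 0 | fab = 0 |
| residual error | sse = 32.3333 | dofe = 2 | mse = 16.1667 | 0 |

| significance | Value |
| --- | --- |
| significance a | 0.0985787 |
| significance b | 0.240099 |
| significance ab | 0 |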

Where This Node Can Run:

Desktop OS: Windows

FPGA: This product does not support FPGA devices