Goodness of Fit (G Dataflow)

Calculates three statistical parameters that describe how well a fitted model matches the original data set.

y

Array of dependent values of the original data set. The number of elements in y must be greater than degree of freedom.

best fit

Array of dependent values of the fitted model. best fit must be the same size as y.

weight

Array of weights for the observations.

weight must be the same size as y. If you do not wire an input to weight, this node sets all elements of weight to 1. If an element in weight is less than 0, this node uses the absolute value of the element.
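The weight handling described above can be sketched in Python as follows. This is only an illustration of the documented behavior, not the node's implementation: the helper name `normalize_weights` is hypothetical, an unwired input is modeled as `None`, and raising an exception on a size mismatch is an assumption, because the exact error the node reports is not stated here.

```python
def normalize_weights(weight, n):
    """Return the weights as the description above says they are used.

    Hypothetical helper: `weight=None` stands in for an unwired input.
    """
    if weight is None:
        # No input wired: every observation gets a weight of 1.
        return [1.0] * n
    if len(weight) != n:
        # Assumption: a size mismatch is treated as an error.
        raise ValueError("weight must be the same size as y")
    # Negative weights are replaced by their absolute values.
    return [abs(w) for w in weight]
```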

degree of freedom

Length of the array of dependent values of the original data set minus the number of coefficients in the fitted model. If degree of freedom is less than or equal to 0, this node sets degree of freedom to the length of y minus 2.

Default: -1
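
As a minimal sketch of the fallback rule above, the following Python function returns the value that would be used for degree of freedom. The function name `resolve_degree_of_freedom` is hypothetical and is not part of the node.

```python
def resolve_degree_of_freedom(degree_of_freedom, n):
    """Return the degree of freedom to use for a data set of length n.

    Hypothetical helper: values less than or equal to 0 (including the
    default of -1) fall back to n - 2, as described above.
    """
    if degree_of_freedom <= 0:
        return n - 2
    return degree_of_freedom
```

With the default of -1 and 10 data points, this sketch returns 8, which matches a model with two coefficients such as a straight line.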

error in

Error conditions that occur before this node runs. The node responds to this input according to standard error behavior.

Default: No error

SSE

Sum of squared errors, weighted by weight. The smaller the SSE, the better the fit.

R-square

A normalized parameter that measures the goodness of fit. The closer R-square is to 1, the better the fit.

RMSE

Root mean square error. The smaller the RMSE, the better the fit.

error out

Error information. The node produces this output according to standard error behavior.

Algorithm for Calculating the Statistical Parameters

The statistical parameters SSE, R-square, and RMSE are defined by the following equations:

$\mathrm{SSE}=\sum_{i=0}^{n-1} w_i \left(y_i-f_i\right)^2$
$R\text{-}\mathrm{square}=1-\frac{\mathrm{SSE}}{\mathrm{SST}}$
$\mathrm{RMSE}=\sqrt{\frac{\mathrm{SSE}}{\mathrm{DOF}}}$

where

• $n$ is the number of elements in y
• $w_i$ is the ith element of weight
• $y_i$ is the ith element of y
• $f_i$ is the ith element of best fit
• $\mathrm{SST}=\sum_{i=0}^{n-1} w_i \left(y_i-\bar{y}\right)^2$
• $\bar{y}$ is the mean value of y
• $\mathrm{DOF}$ is the degree of freedom
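
The equations above can be sketched in Python as follows. This is an illustrative reading of the definitions, not the node's implementation. The function name `goodness_of_fit` is hypothetical, the mean of y is taken as the plain arithmetic mean, and raising exceptions for mismatched sizes or a non-positive degree of freedom is an assumption about error handling.

```python
import math


def goodness_of_fit(y, best_fit, weight=None, degree_of_freedom=-1):
    """Return (SSE, R-square, RMSE) following the equations above."""
    n = len(y)
    if len(best_fit) != n:
        raise ValueError("best fit must be the same size as y")

    # Unwired weight (None here) means all weights are 1;
    # negative weights are replaced by their absolute values.
    if weight is None:
        w = [1.0] * n
    else:
        if len(weight) != n:
            raise ValueError("weight must be the same size as y")
        w = [abs(wi) for wi in weight]

    # A degree of freedom <= 0 falls back to n - 2.
    dof = degree_of_freedom if degree_of_freedom > 0 else n - 2
    if dof <= 0 or n <= dof:
        # Assumption: invalid sizes are reported as an error.
        raise ValueError("y must have more elements than degree of freedom")

    # Assumption: "mean value of y" is the unweighted arithmetic mean.
    y_mean = sum(y) / n

    sse = sum(wi * (yi - fi) ** 2 for wi, yi, fi in zip(w, y, best_fit))
    sst = sum(wi * (yi - y_mean) ** 2 for wi, yi in zip(w, y))

    # Note: sst is 0 when all weighted y values equal the mean;
    # this sketch does not handle that case.
    r_square = 1.0 - sse / sst
    rmse = math.sqrt(sse / dof)
    return sse, r_square, rmse


y = [1.0, 2.0, 3.0, 4.0, 5.0]
best_fit = [1.1, 1.9, 3.2, 3.8, 5.0]
sse, r_square, rmse = goodness_of_fit(y, best_fit)
print(sse, r_square, rmse)
```

For this sample data, with all weights equal to 1 and the default degree of freedom of 5 - 2 = 3, SSE is approximately 0.1, R-square is approximately 0.99, and RMSE is approximately 0.1826.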

