Correlation Coefficient
correcoef(A: any[] | Mat, B: any[] | Mat): Number
param A - the first of the two data sets to correlate. Must be a JS array or Mat.
param B - the second data set. Must be a JS array or Mat holding the same number of elements as A.
returns - a number between -1 and 1 representing the correlation coefficient of A and B.
If A and B are not 1-d JS arrays, they are flattened to 1-d before the coefficient is computed.
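As an illustration of the flattening step, here is a minimal sketch for nested JS arrays only. The helper name flatten1d is hypothetical and is not part of the library. The row-by-row element order is an assumption, and Mat inputs are left out because their internals are not described in this section.

```javascript
// Hypothetical helper (not the library's code): flattens arbitrarily
// nested JS arrays into a single 1-d array, row by row.
// Mat inputs are not handled here; their layout is not documented above.
function flatten1d(data) {
  return Array.isArray(data) ? data.flat(Infinity) : [data];
}

const flat = flatten1d([[0, 1, 1], [2, 2, 3]]);
console.log(flat); // [ 0, 1, 1, 2, 2, 3 ]
```

With inputs reduced to this shape, A and B can be compared index by index.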
The correlation coefficient can be thought of as a number that represents how strongly two variables are linearly related. The function takes two data sets of equal length, A and B, where A contains data, say [0,1,1,2,2,3], and B contains data, say [0,0,1,1,2,3]. For each index, it effectively measures how far the value in A and the value in B lie from their respective means, multiplies the two deviations, and accumulates the products as a sum. That sum is then normalized by the spread of each data set.
Overall, it outputs a number between -1 and 1. The closer it is to 1, the stronger the positive correlation. The closer it is to -1, the stronger the inverse correlation. A value near 0 indicates little or no linear relationship.
A practical use: let A be an array of housing prices over a number of years and B an array of crime counts over the same years. Is crime correlated with housing prices? If so, how strongly?
To find the correlation coefficient of the two inputs, use the following formula, where x and y denote the elements of A and B and a bar denotes the mean:
$$r = \frac{\sum(x_i - \bar{x}) (y_i - \bar{y})}{\sqrt{\sum(x_i - \bar{x})^2 \sum(y_i - \bar{y})^2}}$$
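The formula can be sketched directly in JavaScript. This is a minimal illustration for plain 1-d arrays of numbers, not the library's actual implementation of correcoef. The function name pearson and its error handling are assumptions made for the example.

```javascript
// Hypothetical helper (not the library's correcoef): computes r from the
// formula above for two plain 1-d arrays of numbers.
function pearson(xs, ys) {
  if (xs.length !== ys.length || xs.length === 0) {
    throw new Error("inputs must be non-empty and of equal length");
  }
  const mean = (v) => v.reduce((sum, x) => sum + x, 0) / v.length;
  const xBar = mean(xs);
  const yBar = mean(ys);

  let sxy = 0; // sum of (x_i - xBar) * (y_i - yBar)
  let sxx = 0; // sum of (x_i - xBar)^2
  let syy = 0; // sum of (y_i - yBar)^2
  for (let i = 0; i < xs.length; i++) {
    const dx = xs[i] - xBar;
    const dy = ys[i] - yBar;
    sxy += dx * dy;
    sxx += dx * dx;
    syy += dy * dy;
  }
  // If either input is constant, the denominator is 0 and the result is NaN.
  return sxy / Math.sqrt(sxx * syy);
}

const r = pearson([0, 1, 1, 2, 2, 3], [0, 0, 1, 1, 2, 3]);
console.log(r.toFixed(3)); // 0.897
```

For the sample data, the numerator is 5.5 and the two sums of squares are 5.5 and about 6.833, which gives r ≈ 0.897, a strong positive correlation. Perfectly proportional inputs such as [1,2,3] and [2,4,6] give 1, and reversing one of them gives -1.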