# Fisher transformation

In statistics, hypotheses about the value of the population correlation coefficient ρ between variables *X* and *Y* can be tested using the **Fisher transformation**^{[1]}^{[2]} (also known as the **Fisher z-transformation**) applied to the sample correlation coefficient.

## Definition

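Given a set of *N* bivariate sample pairs (*X*_{i}, *Y*_{i}), *i* = 1, ..., *N*, the sample correlation coefficient *r* is given by

$$r = \frac{\operatorname{cov}(X, Y)}{\sigma_X \sigma_Y} = \frac{\sum_{i=1}^{N} (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{N} (X_i - \bar{X})^2}\,\sqrt{\sum_{i=1}^{N} (Y_i - \bar{Y})^2}}.$$

Here cov(*X*, *Y*) stands for the covariance between the variables *X* and *Y*, and σ stands for the standard deviation of the respective variable. Fisher's z-transformation of *r* is defined as

$$z = \frac{1}{2} \ln\left(\frac{1 + r}{1 - r}\right) = \operatorname{arctanh}(r),$$

where "ln" is the natural logarithm function and "arctanh" is the inverse hyperbolic tangent function.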
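The definition can be checked numerically. The following is a minimal sketch using only the Python standard library; the function names `sample_correlation` and `fisher_z` and the five data pairs are illustrative assumptions, not part of any particular library.

```python
import math

def sample_correlation(xs, ys):
    """Pearson sample correlation coefficient r of paired observations.

    Hypothetical helper written for this example.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def fisher_z(r):
    """Fisher z-transformation: 0.5 * ln((1 + r) / (1 - r)), i.e. arctanh(r)."""
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    return 0.5 * math.log((1.0 + r) / (1.0 - r))

# Made-up data chosen so that r comes out to 0.8.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 1.0, 4.0, 3.0, 5.0]
r = sample_correlation(xs, ys)
z = fisher_z(r)
print(r, z)
```

For these pairs *r* = 0.8, so *z* = ½ ln 9 = ln 3 ≈ 1.0986, which agrees with `math.atanh(0.8)`.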

If (*X*, *Y*) has a bivariate normal distribution, and if the pairs (*X*_{i}, *Y*_{i}) are independent, then *z* is approximately normally distributed with mean

$$\frac{1}{2} \ln\left(\frac{1 + \rho}{1 - \rho}\right)$$

and standard error

$$\frac{1}{\sqrt{N - 3}},$$

where *N* is the sample size, and ρ is the true correlation coefficient.
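This normal approximation is what makes a hypothesis test about ρ straightforward. The sketch below tests the null hypothesis ρ = ρ₀ by standardizing *z* with the mean arctanh(ρ₀) and the standard error 1/√(*N* − 3). The function name `fisher_z_test` and the example values (*r* = 0.8, *N* = 28) are assumptions made for illustration.

```python
import math

def fisher_z_test(r, n, rho0=0.0):
    """Two-sided test of H0: rho == rho0 based on the Fisher transformation.

    Hypothetical helper; returns (test statistic, approximate p-value).
    Relies on the large-sample normal approximation for z.
    """
    if n <= 3:
        raise ValueError("the standard error 1/sqrt(n - 3) requires n > 3")
    z = math.atanh(r)            # transformed sample correlation
    mean0 = math.atanh(rho0)     # approximate mean of z under H0
    se = 1.0 / math.sqrt(n - 3)  # approximate standard error of z
    stat = (z - mean0) / se
    # Two-sided standard normal tail: 2 * (1 - Phi(|stat|)) = erfc(|stat| / sqrt(2)).
    p_value = math.erfc(abs(stat) / math.sqrt(2.0))
    return stat, p_value

stat, p_value = fisher_z_test(r=0.8, n=28, rho0=0.0)
print(f"statistic = {stat:.3f}, p-value = {p_value:.2e}")
```

With *N* = 28 the standard error is 1/√25 = 0.2, so the statistic is simply 5 · arctanh(0.8). The p-value is only as reliable as the normal approximation, which assumes bivariate normal data and a sample that is not too small.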

This transformation, and its inverse

$$r = \frac{\exp(2z) - 1}{\exp(2z) + 1} = \tanh(z),$$

can be used to construct a large-sample confidence interval for ρ using standard normal theory and derivations.
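One way to carry out this construction is to build a symmetric normal interval on the *z* scale and map its endpoints back through the inverse transformation. The following is a sketch under the assumptions stated above (bivariate normal data, independent pairs); the function name `fisher_confidence_interval` and the example values are illustrative.

```python
import math
from statistics import NormalDist

def fisher_confidence_interval(r, n, level=0.95):
    """Approximate confidence interval for the population correlation.

    Hypothetical helper: transforms r to z, forms z +/- crit * se,
    then maps both endpoints back with tanh.
    """
    if n <= 3:
        raise ValueError("the standard error 1/sqrt(n - 3) requires n > 3")
    if not 0.0 < level < 1.0:
        raise ValueError("level must lie strictly between 0 and 1")
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    # Standard normal quantile for a two-sided interval (about 1.96 for 95%).
    crit = NormalDist().inv_cdf(0.5 + level / 2.0)
    return math.tanh(z - crit * se), math.tanh(z + crit * se)

lower, upper = fisher_confidence_interval(r=0.8, n=28)
print(f"95% interval: ({lower:.3f}, {upper:.3f})")
```

The interval is symmetric around *z* but not around *r*: because tanh compresses values near ±1, the upper endpoint sits closer to *r* = 0.8 than the lower one, and both endpoints always stay inside (−1, 1).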

## Discussion

The Fisher transformation is an approximate variance-stabilizing transformation for *r* when *X* and *Y* follow a bivariate normal distribution. This means that the variance of *z* is approximately constant for all values of the population correlation coefficient *ρ*. Without the Fisher transformation, the variance of *r* grows smaller as |*ρ*| gets closer to 1. Since the Fisher transformation is approximately the identity function when |*r*| < 1/2, it is sometimes useful to remember that the variance of *r* is well approximated by 1/*N* as long as |*ρ*| is not too large and *N* is not too small. This is related to the fact that, for bivariate normal data, the asymptotic variance of √*N*·*r* is (1 − *ρ*²)², which equals 1 when *ρ* = 0.
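The variance-stabilizing behavior can be illustrated with a small Monte Carlo experiment. The sketch below draws bivariate normal samples for several values of ρ and compares the empirical variance of *r* with that of *z*. The sample size, replication count, seed, and function names are arbitrary choices made for this example.

```python
import math
import random
import statistics

def pearson_r(xs, ys):
    """Pearson sample correlation coefficient (hypothetical helper)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def simulate_variances(rho, n, reps, rng):
    """Empirical variances of r and z = arctanh(r) over repeated samples."""
    rs = []
    for _ in range(reps):
        xs, ys = [], []
        for _ in range(n):
            u, v = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
            xs.append(u)
            # Mixing u and v gives a standard normal Y with corr(X, Y) = rho.
            ys.append(rho * u + math.sqrt(1.0 - rho * rho) * v)
        rs.append(pearson_r(xs, ys))
    zs = [math.atanh(r) for r in rs]
    return statistics.variance(rs), statistics.variance(zs)

rng = random.Random(0)  # fixed seed so the run is reproducible
n, reps = 30, 2000
results = {rho: simulate_variances(rho, n, reps, rng) for rho in (0.0, 0.5, 0.9)}
for rho, (var_r, var_z) in results.items():
    print(f"rho={rho:.1f}  var(r)={var_r:.4f}  var(z)={var_z:.4f}  "
          f"1/(N-3)={1 / (n - 3):.4f}")
```

In runs of this kind the variance of *r* shrinks sharply as ρ moves from 0 toward 0.9, while the variance of *z* stays close to 1/(*N* − 3) throughout.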

The behavior of this transform has been extensively studied since Fisher introduced it in 1915. Fisher himself found the exact distribution of *z* for data from a bivariate normal distribution in 1921; Gayen in 1951^{[3]} determined the exact distribution of *z* for data from a bivariate Type A Edgeworth distribution. Hotelling in 1953 calculated the Taylor series expressions for the moments of *z* and several related statistics,^{[4]} and Hawkins in 1989 discovered the asymptotic distribution of *z* for data from a distribution with bounded fourth moments.^{[5]}

## Other uses

While the Fisher transformation is mainly associated with the Pearson product-moment correlation coefficient for bivariate normal observations, it can also be applied to Spearman's rank correlation coefficient in more general cases. A similar result for the asymptotic distribution applies, but with a minor adjustment factor: see the article on Spearman's rank correlation coefficient for details.

## See also

- Data transformation (statistics)
- Meta-analysis (this transformation is used in meta-analysis for stabilizing the variance)
- R implementation

## References

1. Fisher, R. A. (1915). "Frequency distribution of the values of the correlation coefficient in samples of an indefinitely large population". *Biometrika*. Biometrika Trust. **10** (4): 507–521. doi:10.2307/2331838. JSTOR 2331838.
2. Fisher, R. A. (1921). "On the 'probable error' of a coefficient of correlation deduced from a small sample". *Metron*. **1**: 3–32.
3. Gayen, A. K. (1951). "The Frequency Distribution of the Product-Moment Correlation Coefficient in Random Samples of Any Size Drawn from Non-Normal Universes". *Biometrika*. Biometrika Trust. **38** (1/2): 219–247. doi:10.1093/biomet/38.1-2.219. JSTOR 2332329.
4. Hotelling, H. (1953). "New light on the correlation coefficient and its transforms". *Journal of the Royal Statistical Society, Series B*. Blackwell Publishing. **15** (2): 193–225. JSTOR 2983768.
5. Hawkins, D. L. (1989). "Using U statistics to derive the asymptotic distribution of Fisher's Z statistic". *The American Statistician*. American Statistical Association. **43** (4): 235–237. doi:10.2307/2685369. JSTOR 2685369.