In statistics, the **Mahalanobis distance** is a distance measure introduced by P. C. Mahalanobis in 1936. It is based on the correlations between variables, through which different patterns can be identified and analysed. It is a useful way of determining the *similarity* of an unknown sample set to a known one. It differs from the Euclidean distance in that it takes the correlations of the data set into account.

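Formally, the Mahalanobis distance of a multivariate vector $x = (x_1, x_2, \ldots, x_N)^T$ from a group of values with mean $\mu = (\mu_1, \mu_2, \ldots, \mu_N)^T$ and covariance matrix $\Sigma$ is defined as:

$$D_M(x) = \sqrt{(x - \mu)^T \, \Sigma^{-1} \, (x - \mu)}$$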

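As a minimal sketch of this definition, the following Python snippet estimates the mean and covariance matrix from a sample with NumPy and evaluates the distance of a vector from that sample. The function name `mahalanobis_from_sample` and the data are hypothetical and chosen only for illustration; the covariance is assumed to be the unbiased sample estimate and to be invertible.

```python
import numpy as np

def mahalanobis_from_sample(x, sample):
    """Mahalanobis distance of vector x from a sample (rows = observations)."""
    sample = np.asarray(sample, dtype=float)
    mu = sample.mean(axis=0)                # sample mean vector
    sigma = np.cov(sample, rowvar=False)    # sample covariance matrix (N x N)
    diff = np.asarray(x, dtype=float) - mu
    # Solve sigma @ z = diff rather than forming the inverse explicitly.
    z = np.linalg.solve(sigma, diff)
    return float(np.sqrt(diff @ z))

# Hypothetical 2-D sample whose two variables are strongly correlated.
sample = np.array([[1.0, 2.0], [2.0, 4.1], [3.0, 5.9], [4.0, 8.2], [5.0, 9.8]])

# A point that follows the trend of the sample, and one that does not.
d_on_trend = mahalanobis_from_sample([6.0, 12.0], sample)
d_off_trend = mahalanobis_from_sample([3.0, 9.0], sample)

# The off-trend point is nearer to the mean in Euclidean terms,
# yet farther away in Mahalanobis terms.
print(d_on_trend, d_off_trend)
```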
The Mahalanobis distance can also be defined as a dissimilarity measure between two random vectors $x$ and $y$ of the same distribution with covariance matrix $\Sigma$:

$$d(x, y) = \sqrt{(x - y)^T \, \Sigma^{-1} \, (x - y)}$$

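The two-vector form can be sketched in the same way. The helper name `mahalanobis_between` and the covariance matrix below are assumptions made for illustration, not part of any particular library. SciPy provides a comparable function, `scipy.spatial.distance.mahalanobis(u, v, VI)`, which takes the inverse covariance matrix as its third argument.

```python
import numpy as np

def mahalanobis_between(x, y, sigma):
    """Mahalanobis dissimilarity between x and y for covariance matrix sigma."""
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    # diff^T sigma^{-1} diff, computed via a linear solve.
    return float(np.sqrt(diff @ np.linalg.solve(sigma, diff)))

# Hypothetical covariance matrix with positive correlation between the variables.
sigma = [[2.0, 1.0],
         [1.0, 2.0]]
origin = [0.0, 0.0]

# Both points lie at Euclidean distance sqrt(2) from the origin.
d_along = mahalanobis_between([1.0, 1.0], origin, sigma)    # along the correlation
d_across = mahalanobis_between([1.0, -1.0], origin, sigma)  # against the correlation

print(d_along, d_across)
```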
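If the covariance matrix is the identity matrix, the Mahalanobis distance reduces to the Euclidean distance. If the covariance matrix is diagonal, the resulting distance measure is called the *normalized Euclidean distance*:

$$d(x, y) = \sqrt{\sum_{i=1}^{N} \frac{(x_i - y_i)^2}{\sigma_i^2}}$$

where $x_i$ and $y_i$ are the $i$-th components of the two vectors and $\sigma_i$ is the standard deviation of the $i$-th component over the sample set.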
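The two special cases can be illustrated with a short sketch that uses only the Python standard library. The function name `normalized_euclidean` and the sample values are hypothetical, and the standard deviations are assumed to be sample standard deviations (`statistics.stdev`).

```python
import math
import statistics

def normalized_euclidean(x, y, sigmas):
    """Distance for a diagonal covariance matrix with entries sigmas[i] ** 2."""
    return math.sqrt(sum(((a - b) / s) ** 2 for a, b, s in zip(x, y, sigmas)))

# Hypothetical sample: each tuple is one observation of two variables
# measured on very different scales.
sample = [(1.0, 10.0), (2.0, 30.0), (3.0, 20.0), (4.0, 40.0)]

# Per-variable standard deviation over the sample set.
sigmas = [statistics.stdev(column) for column in zip(*sample)]

x, y = sample[0], sample[3]
d_norm = normalized_euclidean(x, y, sigmas)

# With unit standard deviations (identity covariance matrix) the
# expression is the ordinary Euclidean distance.
d_unit = normalized_euclidean(x, y, [1.0, 1.0])
d_euclid = math.dist(x, y)

print(d_norm, d_unit, d_euclid)
```

Dividing each component by its standard deviation puts both variables on a common scale, so the second variable no longer dominates the result merely because its numeric range is larger.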