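The following table lists the functions that calculate the measures of central tendency.

Function    Description
geomean     Geometric mean
harmmean    Harmonic mean
mean        Arithmetic average
median      50th percentile
mode        Most frequent value
trimmean    Trimmed mean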
The average is a simple and popular estimate of location. If the data sample comes from a normal distribution, then the sample mean is also optimal: it is the minimum variance unbiased estimator (MVUE) of µ.
Unfortunately, outliers, data entry errors, or glitches exist in almost all real data. The sample mean is sensitive to these problems. One bad data value can move the average away from the center of the rest of the data by an arbitrarily large distance.
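As a rough illustration of this sensitivity, the sketch below appends progressively worse bad values to a small sample and recomputes the mean each time. The sample `clean` and the values in `bad` are made up for this illustration. Only the base `mean` function is used.

```matlab
clean = [2 3 3 4 4 5];          % hypothetical sample; its mean is 3.5
bad   = [10 100 1000];          % hypothetical bad entries, one added at a time

shifted = zeros(size(bad));
for k = 1:numel(bad)
    % Mean of the clean data plus a single bad value
    shifted(k) = mean([clean bad(k)]);
end

disp(mean(clean))               % mean without any bad value
disp(shifted)                   % grows without bound as the bad value grows
```

Because the bad value enters the sum directly, the mean of the seven values is (21 + bad)/7, so it can be pushed as far from 3.5 as you like by a single entry.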
The median and trimmed mean are two measures that are resistant (robust) to outliers. The median is the 50th percentile of the sample, which changes only slightly if you add a large perturbation to any single value. The idea behind the trimmed mean is to ignore a small percentage of the highest and lowest values of a sample when determining the center of the sample.
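The trimming idea can be sketched by hand with base functions: sort the sample, drop the same number of values from each end, and average what remains. This is a simplified illustration, not necessarily how `trimmean` is implemented. In particular, the rounding rule used for `k` below is an assumption, and the sample `x` is hypothetical.

```matlab
x = [3 1 4 1 5 9 2 6 500 5];    % hypothetical sample with one outlier (500)
percent = 20;                   % total percentage to discard, split across both tails

n  = numel(x);
k  = floor(n * percent / 200);  % values dropped from each end (assumed rounding rule)
xs = sort(x);

trimmed = mean(xs(k+1:n-k));    % average of the values that remain
med     = median(x);            % 50th percentile
avg     = mean(x);              % ordinary mean, for comparison

fprintf('mean = %g, median = %g, trimmed mean = %g\n', avg, med, trimmed);
```

With this sample, one value is removed from each tail, so the outlier never reaches the average. The ordinary mean is still dominated by it.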
The geometric mean and harmonic mean, like the average, are not robust to outliers. They are useful when the sample is lognormally distributed or heavily skewed.
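For reference, both measures can be written directly from their standard definitions using base functions. The sketch below assumes a sample of strictly positive values. The sample `x` is hypothetical, and the code shows the definitions rather than the toolbox's own implementation of `geomean` and `harmmean`.

```matlab
x = [1 2 4 8];                  % hypothetical sample; all values must be positive

g = exp(mean(log(x)));          % geometric mean: exponential of the mean of the logs
h = 1 / mean(1 ./ x);           % harmonic mean: reciprocal of the mean of the reciprocals
a = mean(x);                    % arithmetic mean, for comparison

fprintf('harmonic = %g, geometric = %g, arithmetic = %g\n', h, g, a);
```

For positive data the three values always satisfy harmonic ≤ geometric ≤ arithmetic, with equality only when all values are the same.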
The following example shows the behavior of the measures of location for a sample with one outlier.
x = [ones(1,6) 100]

x =
     1     1     1     1     1     1   100

locate = [geomean(x) harmmean(x) mean(x) median(x)...
          trimmean(x,25)]

locate =
    1.9307    1.1647   15.1429    1.0000    1.0000
You can see that the mean is far from any data value because of the influence of the outlier. The median and trimmed mean ignore the outlying value and describe the location of the rest of the data values.