## Central Tendency

The term central tendency refers to the “middle” value or perhaps a typical value of the data, and is measured using the mean, median, or mode. Each of these measures is calculated differently, and the one that is best to use depends upon the situation.

### Mean

The mean is the most commonly used measure of central tendency. When we talk about an “average”, we usually are referring to the mean. The mean is simply the sum of the values divided by the total number of items in the set. The result is referred to as the arithmetic mean. Sometimes it is useful to give more weighting to certain data points, in which case the result is called the weighted arithmetic mean.

The notation used to express the mean depends on whether we are talking about the population mean or the sample mean:

$\mu$ = population mean

$\bar{x}$ = sample mean

The population mean then is defined as:

$$\mu = \frac{1}{N} \sum_{i=1}^{N} x_i$$

where $N$ = number of data points in the population and $x_i$ = value of each data point $i$.

The mean is valid only for interval data or ratio data. Since it uses the values of all of the data…
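
The arithmetic mean and the weighted arithmetic mean described in this section can be sketched in a few lines of Python. This is only an illustration: the function names, the sample home values, and the weights are assumptions made for the example and do not come from the text.

```python
def arithmetic_mean(values):
    """Sum of the values divided by the number of values."""
    return sum(values) / len(values)


def weighted_mean(values, weights):
    """Each value contributes in proportion to its weight."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)


# Hypothetical data, chosen only to keep the arithmetic easy to follow.
home_values = [150_000, 200_000, 250_000]

print(arithmetic_mean(home_values))           # 200000.0
print(weighted_mean(home_values, [1, 1, 2]))  # 212500.0
```

Giving the last home twice the weight of the others pulls the weighted mean above the plain arithmetic mean, which is the effect the weighting is meant to produce.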

## Dispersion

Without knowing something about how data is dispersed, measures of central tendency may be misleading. For example, a residential street with 20 homes on it having a mean value of \$200,000 with little variation from the mean would be very different from a street with the same mean home value but with 3 homes having a value of \$1 million and the other 17 clustered around \$60,000. Measures of dispersion provide a more complete picture. Dispersion measures include the range, average deviation, variance, and standard deviation.

### Range

The simplest measure of dispersion is the range. The range is calculated by taking the difference between the maximum and minimum values in the data set. However, the range only provides information about the maximum and minimum values and does not say anything about the values in between.

### Average Deviation

Another method is to sum the differences between each data point and the mean value and divide by the number of points to calculate the average deviation (mean deviation). However, performing this calculation on the signed differences will always result in an average deviation of zero, since the values above the mean cancel the values below the mean. If this method is used, the absolute…
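
The two-street example and the cancellation problem with the average deviation can be checked numerically. The sketch below is a rough illustration under stated assumptions: the function names are hypothetical, the second street uses exactly \$60,000 for each of the 17 clustered homes (so its mean comes out at \$201,000 rather than exactly \$200,000), and the absolute-value variant is the usual mean absolute deviation.

```python
def value_range(values):
    """Difference between the maximum and minimum values."""
    return max(values) - min(values)


def mean_signed_deviation(values):
    """Average of (value - mean); positive and negative terms cancel."""
    m = sum(values) / len(values)
    return sum(x - m for x in values) / len(values)


def mean_absolute_deviation(values):
    """Average of |value - mean|; no cancellation occurs."""
    m = sum(values) / len(values)
    return sum(abs(x - m) for x in values) / len(values)


# Assumed values for the two streets described in the text.
street_a = [200_000] * 20
street_b = [1_000_000] * 3 + [60_000] * 17

print(value_range(street_a), value_range(street_b))  # 0 940000
print(mean_signed_deviation(street_b))               # 0.0
print(mean_absolute_deviation(street_b))             # 239700.0
```

The signed version reports zero for the widely spread street, while the range and the absolute version both show how different the two streets are despite their nearly identical means.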

## Standard Deviation and Variance

A commonly used measure of dispersion is the standard deviation, which is simply the square root of the variance. The variance of a data set is calculated by taking the arithmetic mean of the squared differences between each value and the mean value. Squaring the difference has at least three advantages:

- Squaring makes each term positive so that values above the mean do not cancel values below the mean.
- Squaring adds more weighting to the larger differences, and in many cases this extra weighting is appropriate since points further from the mean may be more significant.
- The mathematics are relatively manageable when using this measure in subsequent statistical calculations.

Because the differences are squared, the units of variance are not the same as the units of the data. Therefore, the standard deviation is reported as the square root of the variance and the units then correspond to those of the data set.

The calculation and notation of the variance and standard deviation depend on whether we are considering the entire population or a sample set. Following the general convention of using Greek characters to express population parameters and Roman characters to express sample statistics, the notation for standard deviation and…
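
As a minimal sketch of the description above, the population variance can be computed as the arithmetic mean of the squared differences from the mean, and the standard deviation as its square root. The function names and the data set are assumptions chosen for illustration. The comparison at the end uses Python's standard `statistics` module, where `pvariance` and `pstdev` divide by the number of points (population) and `variance` divides by one less than the number of points (sample).

```python
import math
import statistics


def population_variance(values):
    """Arithmetic mean of the squared differences from the mean."""
    mu = sum(values) / len(values)
    return sum((x - mu) ** 2 for x in values) / len(values)


def population_std(values):
    """Square root of the variance, in the same units as the data."""
    return math.sqrt(population_variance(values))


# Hypothetical data set with a mean of 5.
data = [2, 4, 4, 4, 5, 5, 7, 9]

print(population_variance(data))  # 4.0
print(population_std(data))       # 2.0

# Cross-check against the standard library's population functions.
assert math.isclose(population_variance(data), statistics.pvariance(data))
assert math.isclose(population_std(data), statistics.pstdev(data))

# The sample variance is larger because it divides by n - 1 instead of n.
assert statistics.variance(data) > statistics.pvariance(data)
```

If the data were in dollars, the variance would be in squared dollars, while the standard deviation would be back in dollars, which is why the standard deviation is the figure usually reported.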