Other (Non-Resistant) Measures of Spread

 This one is not in the text

 MAD = mean absolute deviation

  = average of absolute deviations from the median

$$\mathrm{MAD} = \frac{1}{n} \sum_{i=1}^{n} |x_i - Q_2|$$

where $Q_2$ is the median. The bigger the MAD, the further your typical $x_i$ is from the median → more spread out.

 

Example:

 

data 1 = {1, 3, 5} Q2 = 3 MAD = (2 + 0 + 2)/3 = 4/3 ≈ 1.33

data 2 = {2, 3, 4} Q2 = 3 MAD = (1 + 0 + 1)/3 = 2/3 ≈ 0.67

 

 

Certainly data 1 is more spread out than data 2.
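The comparison can be checked with a few lines of Python. This is a minimal sketch: the function name `mean_abs_dev` is made up for illustration, and it uses the definition given in these notes (deviations measured from the median, then averaged).

```python
from statistics import median

def mean_abs_dev(data):
    """Average absolute deviation of the observations from their median."""
    center = median(data)
    return sum(abs(x - center) for x in data) / len(data)

data1 = [1, 3, 5]
data2 = [2, 3, 4]

mad1 = mean_abs_dev(data1)
mad2 = mean_abs_dev(data2)
print(mad1, mad2)  # the first value is twice the second
```

The larger MAD for data 1 matches the impression that it is the more spread-out list.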

 

Note: MAD is not resistant because data in the 'tails' will affect the average value. (I suppose you could look at median absolute deviation: that would be a resistant measure of spread.)
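To see the difference in resistance, here is a small sketch comparing the mean absolute deviation with the median absolute deviation. Both lists are hypothetical, chosen only so that one value sits far out in the tail, and the function names are illustrative.

```python
from statistics import median

def mean_abs_dev(data):
    """Average of the absolute deviations from the median."""
    center = median(data)
    return sum(abs(x - center) for x in data) / len(data)

def median_abs_dev(data):
    """Median of the absolute deviations from the median."""
    center = median(data)
    return median([abs(x - center) for x in data])

# Hypothetical lists: identical except for the last value.
clean = [1, 3, 5, 7, 9]
with_outlier = [1, 3, 5, 7, 90]

print(mean_abs_dev(clean), mean_abs_dev(with_outlier))      # changes a lot
print(median_abs_dev(clean), median_abs_dev(with_outlier))  # does not change
```

Moving one value far into the tail inflates the average of the deviations but leaves their median where it was.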

 

Another (Non-Resistant) Measure

 

Standard Deviation (and Variance)

This is a very popular measure of spread. You have to be careful using it if your data are skewed because it is sensitive to outliers.

 

Start with a list $\{x_1, \ldots, x_n\}$.

 

Construct the mean

$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$$

 

 

The variance is defined as

$$s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$$

This is just about equal to the average squared deviation from the mean (if it were really the average, you would divide by n rather than n-1).

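The standard deviation is defined as

$$s = \sqrt{s^2} = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2}$$

If n is 'big', n is close to n-1, so dividing by one or the other makes very little difference. (The choice of n-1 is called a degrees of freedom issue.)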
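The remark about n versus n-1 can be checked numerically. The sketch below uses Python's `statistics` module, where `variance` divides by n-1 and `pvariance` divides by n. The data (the integers 1 to n) are hypothetical and serve only to show the effect of the list length; `ratio_for` is an illustrative name.

```python
from statistics import pvariance, variance

def ratio_for(n):
    """Ratio of the divide-by-n variance to the divide-by-(n-1) variance."""
    data = list(range(1, n + 1))  # hypothetical data: 1, 2, ..., n
    return pvariance(data) / variance(data)

for n in (3, 30, 3000):
    # The ratio equals (n - 1) / n, which approaches 1 as n grows.
    print(n, ratio_for(n))
```

For a list of three values the two versions differ by a third; for a few thousand values the difference is negligible.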

 

Note: both $s$ and $s^2$ will be large if the observations are widely spread out from the mean.

 

Example:

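data 1 = {1, 3, 5}: $\bar{x} = 3$, $s^2 = \frac{(-2)^2 + 0^2 + 2^2}{2} = 4$, $s = 2$

data 2 = {2, 3, 4}: $\bar{x} = 3$, $s^2 = \frac{(-1)^2 + 0^2 + 1^2}{2} = 1$, $s = 1$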
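The two data sets can be run through the definitions step by step. This is a sketch that follows the recipe in these notes (mean, squared deviations, divide by n-1, square root); the function names are illustrative.

```python
from math import sqrt

def sample_variance(data):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** 2 for x in data) / (n - 1)

def sample_sd(data):
    """Standard deviation: the square root of the variance."""
    return sqrt(sample_variance(data))

data1 = [1, 3, 5]
data2 = [2, 3, 4]

for name, data in (("data 1", data1), ("data 2", data2)):
    print(name, sample_variance(data), sample_sd(data))
```

Both lists have the same mean, but data 1 has the larger variance and standard deviation, in line with it being more spread out.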

 

 

Round Off Error

For $s^2$ or other numerical values in statistics, we will make small mistakes because the calculator (say) can only remember a fixed number of digits after the decimal point. This is called round-off error.
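A computer has the same limitation as a calculator. The sketch below shows round-off error in ordinary Python floating-point arithmetic; the specific numbers are just a convenient illustration, not taken from these notes.

```python
import math

# 0.1, 0.2 and 0.3 cannot be stored exactly in binary floating point,
# so the stored sum differs slightly from the stored 0.3.
print(0.1 + 0.2 == 0.3)  # False

# Adding 0.1 ten times accumulates a small error at each step.
total = 0.0
for _ in range(10):
    total += 0.1
print(total == 1.0)  # False

# math.fsum keeps track of the lost digits and returns the rounded exact sum.
print(math.fsum([0.1] * 10) == 1.0)  # True
```

The errors are tiny, but they can add up when a calculation involves many steps.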

 

 

 

 

 

 Computing Formula For Variance

 

 

$$s^2 = \frac{1}{n-1} \left( \sum_{i=1}^{n} x_i^2 - n\bar{x}^2 \right)$$

(This takes fewer additions/multiplications to compute.)
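Here is a sketch comparing the definition with a computing formula. The form assumed here is the sum of the squared observations minus n times the squared mean, all divided by n-1; the function names and the shifted data set are hypothetical. The second comparison shows that the shortcut, while cheaper, can be more exposed to round-off error when the values are large relative to their spread.

```python
def variance_definition(data):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** 2 for x in data) / (n - 1)

def variance_shortcut(data):
    """Computing formula: (sum of x^2 - n * mean^2) / (n - 1)."""
    n = len(data)
    mean = sum(data) / n
    return (sum(x * x for x in data) - n * mean ** 2) / (n - 1)

small = [1.0, 3.0, 5.0]
print(variance_definition(small), variance_shortcut(small))  # both give 4.0

# Hypothetical data: the same spread, shifted up by one billion.
shifted = [1e9 + 1, 1e9 + 3, 1e9 + 5]
print(variance_definition(shifted))  # still 4.0
print(variance_shortcut(shifted))    # no longer 4.0: the squares are too big to store exactly
```

On small numbers the two formulas agree. On the shifted list the shortcut subtracts two huge, nearly equal quantities, and the digits that carried the answer have already been rounded away.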

  

Sensitivity of $s^2$

 

Maris = {8, 13, 14, 16, 23, 26, 28, 33, 39, 61}
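One way to see the sensitivity is to compute the variance with and without the largest value, 61. This is a sketch using Python's `statistics` module; dropping 61 is an assumption about which comparison is intended here.

```python
from statistics import stdev, variance

maris = [8, 13, 14, 16, 23, 26, 28, 33, 39, 61]
without_61 = [x for x in maris if x != 61]

var_all = variance(maris)           # all ten values
var_trimmed = variance(without_61)  # largest value removed

print(round(var_all, 2), round(stdev(maris), 2))
print(round(var_trimmed, 2), round(stdev(without_61), 2))
```

Removing the single value 61 cuts the variance from roughly 244 to roughly 105, and the standard deviation from about 15.6 to about 10.2.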

 

 

 Units of Measurement

 Note that the mean $\bar{x}$ is in the same units as all the $\{x_1, \ldots, x_n\}$, but $s^2$ is not (it is in units squared). Also, $s$ is in the same units as the $x_i$.

 

What happens if you make a linear transformation of a list of data?

Start with $\{x_1, x_2, \ldots, x_n\}$. Transform each $x_i$ to $y_i = a + b x_i$, where a and b are constant numbers.

 

$$\{x_1, x_2, \ldots, x_n\} \;\to\; \{a + b x_1, a + b x_2, \ldots, a + b x_n\} = \{y_1, y_2, \ldots, y_n\}$$

 

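so $\bar{y} = a + b\bar{x}$ and $s_y^2 = b^2 s_x^2$ → the variance is scaled by a factor $b^2$

also $s_y = |b| s_x$ → the standard deviation is scaled by a factor of proportionality $|b|$ (which is just b when b is positive)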

 

Try it out with a simple data set $\{x_1, x_2, x_3\} = \{-1, 0, 1\}$.
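Here is one way to try it, as a sketch. The constants a = 2 and b = -3 are hypothetical; b is taken to be negative on purpose, to show that the standard deviation scales by the absolute value of b.

```python
from statistics import mean, stdev, variance

a, b = 2, -3  # hypothetical constants for the transformation y = a + b*x
x = [-1, 0, 1]
y = [a + b * xi for xi in x]  # gives [5, 2, -1]

print(mean(y), a + b * mean(x))            # the mean is shifted and scaled
print(variance(y), b ** 2 * variance(x))   # the variance is scaled by b squared
print(stdev(y), abs(b) * stdev(x))         # the standard deviation is scaled by |b|
```

Each line prints a matching pair. Note that the constant a shifts the mean but has no effect on the variance or the standard deviation.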