On the Vary Useful Statistics

Introduction

The measure of spread of a dataset is a vary useful statistic! When summarising a dataset, at least two measures are needed: one to locate the dataset and another to indicate the spread of the data. The mean and the median are the common measures of location, while the standard deviation and the interquartile range are commonly used to summarise the spread of the data.

Range

One measure of the usefulness of a statistic is its robustness. A robust measure is one that is little affected by outliers. On this basis the range, which is simply calculated as

Range = largest data value - smallest data value

is obviously not very robust, since it depends only on the two most extreme values, and hence is not particularly useful.

Mean Deviation

Until recently I was never able to satisfactorily answer the question, "The mean deviation is simple. Why is the standard deviation used rather than the mean deviation?" An email by Paul Gardner from Monash University gives a clear explanation, and is the basis of the article, I'm Not Mad about MAD.

Standard Deviation

The Measures of Spread worksheet contains some lovely questions on standard deviation, courtesy of Pat Ballew. In fact I would rate these questions as being at least 1.5 standard deviations above the average question.

Interquartile Range

The interquartile range, while simple in concept, has caused much grief to introductory statistics teachers since different respectable sources define it in different respectable ways! The article Ticky-Tacky Boxes discusses the different methods of finding Q1 and Q3, in the context of constructing a boxplot, and makes a recommendation as to which is 'best'.

The STEPS modules are a collection of hypertext-based tutorials covering a wide range of statistics topics, including summary statistics. Visit the STEPS page for further information and a list of the modules available.
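The measures of spread discussed above can be compared on a small example. The following is only a minimal sketch using Python's standard statistics module: the data values and the helper names (data_range, mean_deviation, iqr) are invented for illustration and do not come from the articles or worksheets mentioned. The two quartile conventions shown are simply the two that the statistics module offers; neither is claimed to be the method that Ticky-Tacky Boxes recommends.

```python
import statistics


def data_range(values):
    """Range = largest data value - smallest data value."""
    return max(values) - min(values)


def mean_deviation(values):
    """Average absolute distance of the values from their mean."""
    centre = statistics.mean(values)
    return sum(abs(x - centre) for x in values) / len(values)


def iqr(values, method="exclusive"):
    """Interquartile range Q3 - Q1 under one of the module's two conventions."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method=method)
    return q3 - q1


# Hypothetical data: the second list adds a single outlier to the first.
clean = [2, 4, 4, 5, 6, 7, 8, 9]
with_outlier = clean + [90]

for label, data in [("clean", clean), ("with outlier", with_outlier)]:
    print(label)
    print("  range             :", data_range(data))
    print("  mean deviation    :", round(mean_deviation(data), 3))
    print("  standard deviation:", round(statistics.stdev(data), 3))
    print("  IQR (exclusive)   :", iqr(data, "exclusive"))
    print("  IQR (inclusive)   :", iqr(data, "inclusive"))
```

On this made-up data the single outlier moves the range from 7 to 88, while the interquartile range changes far less, which is what robustness means in practice. The two IQR lines also differ from each other on the same data, a small instance of the disagreement between quartile definitions.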