From the Exploring Data website - http://curriculum.qed.qld.gov.au/kla/eda/
© Education Queensland, 1997
Introduction to Stemplots
The purpose of displaying data graphically is to reveal the interesting and important features of a dataset. Which display is best cannot always be decided before the data has been viewed, so a statistician will usually look at the data in several different ways.
A stemplot shows the shape of the distribution and indicates whether there are potential outliers. Constructing a stemplot is often the first step in analysing a dataset, and helps to determine what analysis is appropriate.
Stemplots are useful for displaying small datasets with only positive values, or when it is important to retain the original data values in the display. They are quicker and easier to construct by hand than histograms.
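As a rough illustration of how a stemplot is constructed, the following Python sketch splits each value into a stem (all digits but the last) and a leaf (the last digit). The function name `stemplot`, the `leaf_unit` parameter and the test scores are all made up for this example, and the sketch assumes non-negative integers.

```python
from collections import defaultdict

def stemplot(values, leaf_unit=1):
    """Return the lines of a basic stemplot for non-negative integers.

    Each value is expressed in units of `leaf_unit`; the last digit
    becomes the leaf and the remaining digits become the stem.
    """
    rows = defaultdict(list)
    for v in sorted(values):
        scaled = v // leaf_unit              # drop digits below the leaf unit
        rows[scaled // 10].append(scaled % 10)
    lines = []
    # Include empty stems so that gaps in the data stay visible.
    for stem in range(min(rows), max(rows) + 1):
        leaves = "".join(str(leaf) for leaf in rows.get(stem, []))
        lines.append(f"{stem:>3} | {leaves}")
    return lines

# Hypothetical test scores, invented for illustration only.
scores = [52, 67, 71, 48, 75, 63, 79, 84, 66, 58, 73, 91, 70, 65]
print("\n".join(stemplot(scores)))
# Prints:
#   4 | 8
#   5 | 28
#   6 | 3567
#   7 | 01359
#   8 | 4
#   9 | 1
```

Every original value can be read back from the plot (stem 6, leaf 3 is 63), which is what is meant by retaining the original data.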
The choices of bin width are limited, so for some datasets that otherwise meet the criteria, a stemplot may not be very useful. The bins may be too large or too small to properly display the distribution of the data. For these datasets, a histogram is preferable.
Displaying a particular dataset with a stemplot often requires judgement. How to split the stems, how to represent outliers and whether to truncate the data are decisions that often have to be made. The guiding principle is that the plot should quickly inform us about the salient features of the dataset.
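Two of these decisions, splitting stems and truncating the data, can be sketched in Python as below. This is only one possible approach under stated assumptions: the names `split_stemplot`, `leaf_unit` and `parts` are hypothetical, the data is invented, values are non-negative integers, and digits below the leaf unit are cut off rather than rounded.

```python
def split_stemplot(values, leaf_unit=1, parts=2):
    """Stemplot lines with truncated leaves and each stem split into `parts` rows.

    `parts` must be 1, 2 or 5 so that the ten possible leaf digits
    divide evenly among the rows of a stem.
    """
    if parts not in (1, 2, 5):
        raise ValueError("parts must be 1, 2 or 5")
    per_row = 10 // parts                    # leaf digits covered by each row
    rows = {}
    for v in sorted(values):
        scaled = v // leaf_unit              # truncate digits below the leaf unit
        stem, leaf = divmod(scaled, 10)
        key = stem * parts + leaf // per_row # position of the row within the plot
        rows.setdefault(key, []).append(leaf)
    lines = []
    for key in range(min(rows), max(rows) + 1):
        leaves = "".join(str(leaf) for leaf in rows.get(key, []))
        lines.append(f"{key // parts:>3} | {leaves}")
    return lines

# Hypothetical ages, invented for illustration only.
ages = [21, 23, 24, 26, 27, 27, 28, 30, 31, 31, 32, 33, 35, 36, 38, 39]
print("\n".join(split_stemplot(ages, parts=2)))
```

With unsplit stems this dataset would occupy only two rows (stems 2 and 3), which hides its shape; splitting each stem in two gives four rows. Calling `split_stemplot([2134, 2389, 2671], leaf_unit=100, parts=1)` shows truncation: the values are displayed as 21, 23 and 26 hundreds on a single stem.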
Once a stemplot is constructed, students should consider these questions:
What is the location of the data?
How spread out is the data?
What is its overall shape: is it symmetric, skewed or bimodal?
Are there any unusual data values such as outliers?
Is there evidence of clustering?
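Some of these questions can be backed up with simple numerical summaries once the plot has been inspected. The sketch below is an illustration rather than a prescribed method: it uses the median for location, the range and interquartile range for spread, and flags potential outliers with the common 1.5 x IQR rule, which is an assumption of this example and not something the text above specifies. The function name `summarise` and the data are hypothetical. Shape and clustering are best judged from the plot itself.

```python
import statistics

def summarise(values):
    """Return simple measures of location and spread, plus potential outliers."""
    data = sorted(values)
    q1, _, q3 = statistics.quantiles(data, n=4)  # quartile cut points
    iqr = q3 - q1
    # Assumed rule of thumb: values more than 1.5 * IQR beyond the quartiles.
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return {
        "median": statistics.median(data),       # location
        "range": data[-1] - data[0],             # spread
        "iqr": iqr,                              # spread of the middle half
        "outliers": [v for v in data if v < low_fence or v > high_fence],
    }

# Hypothetical scores with one unusually large value, invented for illustration.
scores = [48, 52, 58, 63, 65, 66, 67, 70, 71, 73, 75, 79, 84, 120]
summary = summarise(scores)
print(summary["median"], summary["range"], summary["outliers"])
```

A value flagged in this way is only a candidate outlier; whether it is an error or a genuine feature of the data still needs to be judged in context.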
Students should also note any unusual features of the dataset not highlighted by the above questions. For example, one row may have many more elements in it than the other rows. The student should ask: is this a random occurrence, or is it a relevant feature of this dataset? Answering such questions often isn't easy.