CH 1 Continued - Categorical Variables in Statistics
Continuing from the previous chapter, we will now look at how to represent categorical data in statistics with an example.
1000 ball bearings classified as:
- conforming (910)
- too thick (53)
- too thin (37)
Here, the data is classified into three categories that are mutually exclusive and exhaustive: every bearing falls into exactly one of them. The data is categorical because each observation is a category label rather than a numerical measurement.
Let's make a table of counts for the data.
Table of Counts
Classification | Frequency (f) | Relative frequency (f/n) |
---|---|---|
Conforming | 910 | 0.910 |
Too thick | 53 | 0.053 |
Too thin | 37 | 0.037 |
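The relative frequency column is just each count divided by the total. A minimal sketch of that arithmetic, using only the counts from the table (the variable names are illustrative, not from the original notebook):

```python
# Counts from the ball bearing example above.
counts = {"conforming": 910, "too thick": 53, "too thin": 37}

n = sum(counts.values())  # total number of bearings

# Relative frequency of each category: f / n.
relative_frequency = {label: f / n for label, f in counts.items()}

# Print a small table of counts and relative frequencies.
print(f"{'Classification':<15}{'f':>6}{'f/n':>8}")
for label, f in counts.items():
    print(f"{label:<15}{f:>6}{relative_frequency[label]:>8.3f}")
```

Because the categories are mutually exclusive and exhaustive, the relative frequencies add up to 1.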
Bar Chart with Matplotlib
A bar chart of this data has one bar per category, and the height of each bar represents that category's proportion.
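One possible way to draw such a chart is sketched below. It assumes Matplotlib is installed; the helper names `proportions` and `plot_proportions` are hypothetical, chosen only for this illustration.

```python
counts = {"conforming": 910, "too thick": 53, "too thin": 37}


def proportions(counts):
    """Return each category's share of the total count (hypothetical helper)."""
    total = sum(counts.values())
    return {label: f / total for label, f in counts.items()}


def plot_proportions(counts):
    """Draw one bar per category; bar height is the category's proportion."""
    # Imported here so the proportion helper works without Matplotlib.
    import matplotlib.pyplot as plt

    props = proportions(counts)
    fig, ax = plt.subplots()
    ax.bar(list(props.keys()), list(props.values()))
    ax.set_xlabel("Classification")
    ax.set_ylabel("Relative frequency (f/n)")
    ax.set_title("Ball bearing classification")
    plt.show()
```

Calling `plot_proportions(counts)` opens the figure. Plotting proportions rather than raw counts keeps the vertical axis between 0 and 1, which makes charts from samples of different sizes comparable.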
1-Variable Statistics and Boxplots
Now, let's look at another example.
Chromium VS Nickel Simple Dataset
Chromium:
31, 1, 511, 2, 574, 496, 322, 424, 269, 140, 244, 252, 76, 108, 24, 38, 18, 43, 30, 191
Nickel:
23, 22, 55, 39, 283, 34, 159, 37, 61, 34, 163, 140, 32, 23, 54, 837, 64, 354, 376, 471
We are going to find:
- Mean
- Median
- Standard Deviation
- Quartiles
Let's find these with Python.
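A minimal sketch using only the standard-library `statistics` module. Two assumptions are worth flagging: the standard deviation is the sample version (dividing by n - 1), and the quartiles are taken as the medians of the lower and upper halves of the sorted data, which is the convention TI-84-style calculators are generally understood to use. The helper names `quartiles` and `one_var_stats` are hypothetical.

```python
import statistics

chromium = [31, 1, 511, 2, 574, 496, 322, 424, 269, 140,
            244, 252, 76, 108, 24, 38, 18, 43, 30, 191]
nickel = [23, 22, 55, 39, 283, 34, 159, 37, 61, 34,
          163, 140, 32, 23, 54, 837, 64, 354, 376, 471]


def quartiles(data):
    """Return (Q1, Q3) as medians of the lower and upper halves.

    Assumption: median-of-halves convention; tools that interpolate
    can report slightly different quartiles for the same data.
    """
    ordered = sorted(data)
    half = len(ordered) // 2
    lower = ordered[:half]
    upper = ordered[-half:]  # skips the middle value when n is odd
    return statistics.median(lower), statistics.median(upper)


def one_var_stats(data):
    """Collect the summary statistics listed above for one variable."""
    q1, q3 = quartiles(data)
    return {
        "n": len(data),
        "mean": statistics.mean(data),
        "median": statistics.median(data),
        "sample std dev": statistics.stdev(data),  # divides by n - 1
        "Q1": q1,
        "Q3": q3,
    }


for name, data in (("Chromium", chromium), ("Nickel", nickel)):
    print(name)
    for label, value in one_var_stats(data).items():
        print(f"  {label}: {value:g}")
```

With the data above, chromium has mean 189.7 and median 124, and nickel has mean 163.05 and median 58. In both cases the mean sits well above the median, which reflects a few large values pulling the mean upward.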
This will yield the 1-Variable Statistics we would normally get from a TI-84 graphing calculator. Let's also include a graph to go along with them.
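Side-by-side boxplots are one natural graph for comparing the two samples. The sketch below assumes Matplotlib is installed; `draw_boxplots` is a hypothetical helper name, and the axis label is a placeholder because the source does not state the units.

```python
chromium = [31, 1, 511, 2, 574, 496, 322, 424, 269, 140,
            244, 252, 76, 108, 24, 38, 18, 43, 30, 191]
nickel = [23, 22, 55, 39, 283, 34, 159, 37, 61, 34,
          163, 140, 32, 23, 54, 837, 64, 354, 376, 471]

datasets = {"Chromium": chromium, "Nickel": nickel}


def draw_boxplots(datasets):
    """Draw one boxplot per dataset on a shared vertical axis."""
    # Imported here so the data definitions load without Matplotlib.
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.boxplot(list(datasets.values()))
    ax.set_xticklabels(list(datasets.keys()))
    ax.set_ylabel("Measured value")  # placeholder: units not given in the source
    ax.set_title("Chromium vs. nickel")
    plt.show()
```

Calling `draw_boxplots(datasets)` opens the figure. Matplotlib computes its quartiles by interpolation, so the box edges may differ slightly from calculator output.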