Graphical Summary - Categorical Data

This page will only include very basic visualizations. More complex and customizable graphics can be made using the ggplot2 package along with others.

Bar Charts

The barplot() function produces a bar chart of categorical data. However, it cannot accept the raw list of category values as its input. It needs a numerical summary: the number of observations in each category. To get these counts, we will use the summary() function in conjunction with barplot().
Example: Using the iris dataset from R

    barplot(summary(iris$Species))
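To see why summary() is needed, it can help to look at what it returns before plotting. The sketch below is illustrative only: the names counts and counts_tbl are made up for this example, and table() is an assumption that the original text does not use (it produces the same counts).

```r
# Counts per category: this named vector is what barplot() actually plots.
counts <- summary(iris$Species)
print(counts)

# Assumption: table() is not part of the original text; it gives the same counts.
counts_tbl <- table(iris$Species)
print(counts_tbl)

# Bar heights come from the counts, bar labels from their names.
barplot(counts)
```

Each of the three species has 50 observations, so all three bars have the same height.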
With each graphical summary, there are other customizations and options you can add. The examples below use a few of the additional arguments for barplot(): main (chart title), xlab and ylab (axis labels), names.arg (bar labels), ylim (y-axis limits), and col (bar colors). These named arguments can be placed in the parentheses in any order, but they must be separated by commas. You can put them all on one line or put each argument on its own line, as shown below:
Example: Using the iris dataset from R

    barplot(summary(iris$Species),
            main = "Bar Chart of Iris Species",
            xlab = "Species",
            ylab = "Frequency",
            names.arg = c("Setosa", "Versicolor", "Virginica"),
            ylim = c(0, 60),
            col = c("lightseagreen", "mediumseagreen", "darkseagreen"))
Example: Using the mtcars dataset from R (first used on Numerical Data Summary page)

    # Basic Bar Chart
    barplot(summary(as.factor(mtcars$am)))

    # Customized Bar Chart
    barplot(summary(as.factor(mtcars$am)),
            main = "Bar Chart of Car Transmissions",
            xlab = "Transmission Type",
            ylab = "Frequency",
            names.arg = c("Automatic", "Manual"),
            ylim = c(0, 25),
            col = "#003056")

Changing the Order of the Categories

There are a few additional steps that need to be taken to change the order of the categories in a bar chart.
Example: Using the mtcars dataset from R (first used on Numerical Data Summary page)

Remember that the am variable was recorded as "0" for automatic and "1" for manual transmission. To make it a categorical variable, we need to use the as.factor() function.

    as.factor(mtcars$am)
Notice in the output that R specifies the levels of the factor, Levels: 0 1. The order of these levels matters: it is the order in which R will draw the bars in the bar chart.
To change the order of the levels, we will use the factor() function and specify the levels in the desired order, in quotation marks and spelled exactly as they are listed in the previous output.
Like as.factor(), this function will also make the am variable a categorical variable.

    transmiss_order <- factor(mtcars$am, levels = c("1", "0"))
    transmiss_order
None of the data values have changed; only the order of the categories is different, Levels: 1 0.
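A quick way to confirm this is to compare the reordered factor with the original one. This is a minimal sketch; the names original and same_values are made up for illustration and do not appear elsewhere on this page.

```r
original        <- as.factor(mtcars$am)
transmiss_order <- factor(mtcars$am, levels = c("1", "0"))

# The level order differs between the two factors...
print(levels(original))
print(levels(transmiss_order))

# ...but every observation still holds the same value.
same_values <- all(as.character(original) == as.character(transmiss_order))
print(same_values)

# The counts are identical; only their order in the summary changes.
print(summary(original))
print(summary(transmiss_order))
```

The mtcars data contain 19 automatic ("0") and 13 manual ("1") cars, so the second summary lists the 13 manual cars first.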
Here is the same customized barplot from above but with the new order of categories.

The only parts of the code that have changed are the data given to the function (transmiss_order rather than as.factor(mtcars$am)) and the order of the labels in the names.arg argument. Don't forget to change the labels!

    transmiss_order <- factor(mtcars$am, levels = c("1", "0"))

    # Customized Bar Chart with new order of categories.
    barplot(summary(transmiss_order),
            main = "Bar Chart of Car Transmissions",
            xlab = "Transmission Type",
            ylab = "Frequency",
            names.arg = c("Manual", "Automatic"),
            ylim = c(0, 25),
            col = "#003056")
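One possible way to avoid relabeling by hand is to attach the labels to the factor itself with the labels argument of factor(). This is an alternative sketch, not the approach used above; the names transmiss_named and counts_named are hypothetical.

```r
# Assumption: labels are attached when the factor is created,
# so names.arg is not needed in barplot().
transmiss_named <- factor(mtcars$am,
                          levels = c(1, 0),
                          labels = c("Manual", "Automatic"))

counts_named <- summary(transmiss_named)
print(counts_named)

barplot(counts_named,
        main = "Bar Chart of Car Transmissions",
        xlab = "Transmission Type",
        ylab = "Frequency",
        ylim = c(0, 25),
        col = "#003056")
```

Because each label is tied to its level, the bar labels cannot fall out of step with the level order.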

Colors in R

There are two main ways to specify which color(s) you want to use in R. You can refer to a color by its name or by its hexadecimal code (#RRGGBB).
A list of available colors by name is given here: R Colors
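The two ways of naming a color can also be explored from within R. The sketch below uses the base functions colors(), col2rgb(), and rgb(), none of which appear in the original text; the variable names are made up for illustration.

```r
# colors() returns the names of R's built-in colors.
print(head(colors()))

# Check that the names used in the iris examples are valid color names.
names_ok <- all(c("lightseagreen", "mediumseagreen", "darkseagreen") %in% colors())
print(names_ok)

# col2rgb() splits a color into red, green, and blue values (0 to 255).
rgb_values <- col2rgb("#003056")
print(rgb_values)

# rgb() goes the other way and builds the hexadecimal code.
hex_code <- rgb(0, 48, 86, maxColorValue = 255)
print(hex_code)
```

Here "#003056" breaks down into red 0, green 48, and blue 86, and rgb() rebuilds the same code from those values.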

Pie Charts

The pie() function will produce a pie chart of the provided categorical data. Like barplot(), it cannot accept a list of the category names as the input. We will use the summary() function again in conjunction with pie().

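The examples below use a few of the additional arguments for pie(): main (chart title), labels (slice labels), radius (size of the pie), clockwise (direction in which the slices are drawn), and col (slice colors).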
Example: Using the iris dataset from R

    # Basic Pie Chart
    pie(summary(iris$Species))

    # Customized Pie Chart
    pie(summary(iris$Species),
        main = "Pie Chart of Iris Species",
        labels = c("Setosa", "Versicolor", "Virginica"),
        radius = 0.5,
        clockwise = TRUE,
        col = c("lightseagreen", "mediumseagreen", "darkseagreen"))
Example: Using the mtcars dataset from R (first used on Numerical Data Summary page)

    # Basic Pie Chart
    pie(summary(as.factor(mtcars$am)))

    # Customized Pie Chart
    pie(summary(as.factor(mtcars$am)),
        main = "Pie Chart of Car Transmissions",
        labels = c("Automatic", "Manual"),
        radius = -0.75,
        clockwise = FALSE,
        col = c("#003056", "#8A8D8F"))
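Pie charts are often easier to read when each slice shows its share of the total. As a sketch of one way to do this (not something the original examples do), the counts can be converted to percentages and pasted into the labels; the names counts, pct, and slice_labels are made up for illustration.

```r
counts <- summary(as.factor(mtcars$am))

# Percentage of cars in each category, rounded to whole numbers.
pct <- round(100 * counts / sum(counts))

# Build labels such as "Automatic (59%)".
slice_labels <- paste0(c("Automatic", "Manual"), " (", pct, "%)")
print(slice_labels)

pie(counts,
    main = "Pie Chart of Car Transmissions",
    labels = slice_labels,
    col = c("#003056", "#8A8D8F"))
```

With 19 automatic and 13 manual cars out of 32, the labels read "Automatic (59%)" and "Manual (41%)".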