Visualizing Proportions Using Stacked Bars
- sam33frodon
- Jan 12, 2021
Updated: Jan 13, 2021
There are many ways to present proportions.
1. Pie charts:
- work well when the goal is to emphasize simple fractions, such as one-half, one-third, or one-quarter
- work well when we have very small datasets
- are not effective for comparing proportions
2. Stacked bars: work well for side-by-side comparisons of multiple conditions or in a time series
3. Side-by-side bar charts: work well for directly comparing the individual fractions to each other
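To make the contrast between options 2 and 3 concrete, here is a minimal sketch that draws the same numbers both ways. The data frame `toy` and the plot names `p_stacked` and `p_dodged` are made up for illustration; the percentages are rounded values for two income brackets from the aggregated table later in this post.

```r
library(ggplot2)

# Toy data: two income brackets, rounded shares by gender
toy <- data.frame(
  GENDER = rep(c("Female", "Male"), times = 2),
  INCOME = rep(c("<$20K", "$75K+"), each = 2),
  Percentage = c(61.1, 38.9, 46.6, 53.4)
)

# Stacked: one bar per bracket, the two segments add up to 100%
p_stacked <- ggplot(toy, aes(x = INCOME, y = Percentage, fill = GENDER)) +
  geom_col(position = "stack")

# Side by side: one bar per gender, so bar heights can be compared directly
p_dodged <- ggplot(toy, aes(x = INCOME, y = Percentage, fill = GENDER)) +
  geom_col(position = "dodge")
```

Printing `p_stacked` or `p_dodged` renders the corresponding chart. `geom_col()` is shorthand for `geom_bar(stat = "identity")`.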
This post uses stacked bars for the case where each stack has only two segments.
Example: examining how income differs between men and women (data source: BRFSS 2018).
Here is what the aggregated data look like:
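library(plyr)   # load plyr before dplyr so that dplyr's verbs are not masked
library(dplyr)

df_GenderIncome <- BRFSS %>%
  group_by(GENDER, INCOME) %>%
  dplyr::summarise(Count = n()) %>%   # respondents per gender and income bracket
  # share of each gender within an income bracket, in percent
  ddply(.(INCOME), transform, Percentage = Count * 100 / sum(Count)) %>%
  # running midpoint of each segment (accumulates over the whole table, not per bracket)
  mutate(pos = cumsum(Percentage) - (0.5 * Percentage))
df_GenderIncome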
##    GENDER     INCOME Count Percentage       pos
## 1  Female $20K-<$35K 29608   57.43996  28.71998
## 2    Male $20K-<$35K 21938   42.56004  78.71998
## 3  Female $35K-<$50K 20421   52.79882 126.39941
## 4    Male $35K-<$50K 18256   47.20118 176.39941
## 5  Female $50K-<$75K 23800   50.50184 225.25092
## 6    Male $50K-<$75K 23327   49.49816 275.25092
## 7  Female      $75K+ 47630   46.60561 323.30280
## 8    Male      $75K+ 54568   53.39439 373.30280
## 9  Female      <$20K 24445   61.07585 430.53793
## 10   Male      <$20K 15579   38.92415 480.53793
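In the table, `pos` keeps growing past 100 because the cumulative sum runs over all rows rather than restarting in each income bracket. As a hedged alternative (not the code used to produce the table), the sketch below does the same aggregation with dplyr only and computes the midpoint within each bracket. The names `counts` and `shares` are invented here, and the counts are copied from the table rather than read from the BRFSS file.

```r
library(dplyr)

# Counts copied from the aggregated table above
counts <- data.frame(
  GENDER = rep(c("Female", "Male"), times = 5),
  INCOME = rep(c("$20K-<$35K", "$35K-<$50K", "$50K-<$75K", "$75K+", "<$20K"),
               each = 2),
  Count  = c(29608, 21938, 20421, 18256, 23800, 23327, 47630, 54568, 24445, 15579)
)

shares <- counts %>%
  group_by(INCOME) %>%                  # everything below is computed per bracket
  mutate(
    Percentage = Count * 100 / sum(Count),
    # midpoint of each segment, measured within its own bracket (0 to 100)
    pos = cumsum(Percentage) - 0.5 * Percentage
  ) %>%
  ungroup()

shares
```

The `Percentage` column matches the table, while `pos` now stays between 0 and 100 in every bracket.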
To create the stacked bar chart:
library(ggplot2)

ggplot(data = df_GenderIncome) +
  # one bar per income bracket, split by gender; heights are the precomputed percentages
  geom_bar(aes(x = INCOME,
               y = Percentage,
               fill = GENDER),
           stat = "identity") +
  labs(y = "Relative proportion (%)",
       x = "Salary Bracket") +
  scale_fill_manual(values = c("#A9A9A9", "#FF8C00")) +   # dark gray and dark orange
  scale_y_continuous(breaks = seq(0, 100, 25)) +
  # dashed reference line at 50% marks an even split
  geom_hline(yintercept = 50, linetype = "dashed", color = "blue") +
  theme(axis.text = element_text(size = 12, color = "black"),
        axis.title = element_text(size = 12, face = "bold"),
        legend.text = element_text(colour = "black", size = 12),
        legend.position = "right")
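Two optional refinements, sketched under stated assumptions rather than taken from the original analysis: printing the percentage inside each segment, and ordering the brackets from lowest to highest income (in the aggregated table, `<$20K` sorts last). The data frame `df_plot` and the plot name `p_labeled` are made up for this example, and the percentages are rounded from the table.

```r
library(ggplot2)

# Rounded percentages taken from the aggregated table in this post
df_plot <- data.frame(
  GENDER = rep(c("Female", "Male"), times = 5),
  INCOME = rep(c("<$20K", "$20K-<$35K", "$35K-<$50K", "$50K-<$75K", "$75K+"),
               each = 2),
  Percentage = c(61.1, 38.9, 57.4, 42.6, 52.8, 47.2, 50.5, 49.5, 46.6, 53.4)
)

# Fix the bracket order explicitly (lowest to highest) instead of relying on sorting
df_plot$INCOME <- factor(df_plot$INCOME, levels = unique(df_plot$INCOME))

p_labeled <- ggplot(df_plot, aes(x = INCOME, y = Percentage, fill = GENDER)) +
  geom_col() +
  # position_stack(vjust = 0.5) centers each label inside its own segment
  geom_text(aes(label = sprintf("%.1f%%", Percentage)),
            position = position_stack(vjust = 0.5)) +
  geom_hline(yintercept = 50, linetype = "dashed", color = "blue")
```

Using `position_stack(vjust = 0.5)` lets ggplot2 work out the label positions itself, so no hand-computed `pos` column is needed for the labels.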
