A colleague showed me the results of his study project as beeswarm plots made with GraphPad. I wondered whether the same could be done in R, and more specifically with ggplot2.

There is an R package for drawing such graphs, the beeswarm package (beeswarm, on CRAN). An implementation was shown on the R-statistics blog, but not with ggplot2.

First, here’s the example from the beeswarm package:

    library(beeswarm)
    data(breast)
    breast2 <- breast[order(breast$event_survival, breast$ER),]
    beeswarm(time_survival ~ event_survival, data = breast2, pch = 16,
             pwcol = as.numeric(ER), xlab = '',
             ylab = 'Follow-up time (months)',
             labels = c('Censored', 'Metastasis'))
    legend('topright', legend = levels(breast$ER), title = 'ER',
           pch = 16, col = 1:2)

Or even, as in Tal Galili’s blog, with a boxplot:

    beeswarm(time_survival ~ event_survival, data = breast2, pch = 16,
             pwcol = as.numeric(ER), xlab = '',
             ylab = 'Follow-up time (months)',
             labels = c('Censored', 'Metastasis'))
    boxplot(time_survival ~ event_survival, data = breast2, add = TRUE,
            names = c("", ""), col = "#0000ff22")
    legend('topright', legend = levels(breast$ER), title = 'ER',
           pch = 16, col = 1:2)

The trick is to use the beeswarm() call to get the x and y positions. beeswarm() returns a data frame from which we can take the positions we need.

    library(ggplot2)
    library(plyr)  # for round_any()

    # Keep columns 1, 2, 4 and 6 of the data frame returned by beeswarm().
    # The result is stored as 'swarm' so that it does not mask the function.
    swarm <- beeswarm(time_survival ~ event_survival, data = breast,
                      method = 'swarm', pwcol = ER)[, c(1, 2, 4, 6)]
    colnames(swarm) <- c("x", "y", "ER", "event_survival")

    beeswarm.plot <- ggplot(swarm, aes(x, y)) +
      xlab("") +
      scale_y_continuous(expression("Follow-up time (months)"))

    # One box per group: rounding x recovers the group each point belongs to
    beeswarm.plot2 <- beeswarm.plot +
      geom_boxplot(aes(x, y, group = round_any(x, 1, round)),
                   outlier.shape = NA)

    beeswarm.plot3 <- beeswarm.plot2 +
      geom_point(aes(colour = ER)) +
      scale_colour_manual(values = c("black", "red")) +
      scale_x_continuous(breaks = c(1:2),
                         labels = c("Censored", "Metastasis"),
                         expand = c(0, 0.5))

    beeswarm.plot3  # display the final plot
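The grouping inside geom_boxplot may look odd at first, so here is a small base R sketch of what it does. It assumes the swarm offsets stay within 0.5 of the group centre, and the positions below are made up for illustration (they do not come from the breast data). With an accuracy of 1, round_any(x, 1, round) amounts to round(x), so plain round() is used here to avoid needing plyr.

```r
# Hypothetical swarm positions: x is the group centre (1 or 2) plus a small
# horizontal offset, y is the observed value.
pos <- data.frame(
  x = c(0.88, 1.00, 1.12, 1.94, 2.00, 2.06),
  y = c(20, 35, 50, 15, 30, 45)
)

# Rounding x to the nearest integer recovers the group of each point.
pos$group <- round(pos$x)

# One median per recovered group, as a boxplot would compute it.
group_medians <- tapply(pos$y, pos$group, median)
print(group_medians)
```

Without this grouping, the shifted x values would not map back to the two categories, and the boxes would not be drawn per group.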

Do not forget to hide the outliers in your boxplot (outlier.shape = NA), or they will be drawn on top of the points created by geom_point.

I wonder whether these plots are more useful in certain fields. If anybody has references for beeswarm plots, I would be very grateful.
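To see which points would otherwise be drawn twice, here is a rough base R sketch with made-up values. It uses boxplot.stats(), which applies the same 1.5 × IQR idea as the default whiskers in geom_boxplot, although the two compute the quartiles slightly differently, so treat it as an approximation.

```r
# Made-up follow-up times with one extreme value
y <- c(10, 12, 13, 14, 15, 16, 60)

# boxplot.stats() reports the points lying beyond the whiskers
stats <- boxplot.stats(y)
outliers <- stats$out
print(outliers)

# These are the points a boxplot layer would plot itself, on top of the
# same points already drawn by the point layer.
is_outlier <- y %in% outliers
```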