Beeswarm Plot with ggplot2

A colleague showed me the results of his study project as beeswarm plots made with GraphPad. I wondered whether they could be reproduced in R, and more specifically with ggplot2.

There is an R package for drawing such graphs, the beeswarm package (beeswarm, CRAN). An implementation was shown on the R-statistics blog, but not with ggplot2.

First, here’s the example from the beeswarm package:

```r
library(beeswarm)
data(breast)
breast2 <- breast[order(breast$event_survival, breast$ER), ]

beeswarm(time_survival ~ event_survival, data = breast2, pch = 16,
         pwcol = as.numeric(ER), xlab = '',
         ylab = 'Follow-up time (months)',
         labels = c('Censored', 'Metastasis'))
legend('topright', legend = levels(breast$ER), title = 'ER',
       pch = 16, col = 1:2)
```

Or, as on Tal Galili’s blog, with a boxplot overlaid:

```r
beeswarm(time_survival ~ event_survival, data = breast2, pch = 16,
         pwcol = as.numeric(ER), xlab = '',
         ylab = 'Follow-up time (months)',
         labels = c('Censored', 'Metastasis'))
boxplot(time_survival ~ event_survival, data = breast2, add = TRUE,
        names = c("", ""), col = "#0000ff22")
legend('topright', legend = levels(breast$ER), title = 'ER',
       pch = 16, col = 1:2)
```

The trick is to use the return value of the beeswarm() call to get the x and y positions. beeswarm() draws the plot and also returns a data frame, from which we can take the computed coordinates of every point.
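Before selecting columns by position, it helps to look at what the call actually returns. The snippet below is only a minimal sketch for inspection; the name `swarm.raw` is my own and not part of the package.

```r
library(beeswarm)
data(breast)

# beeswarm() draws the plot and invisibly returns a data frame
# with one row per plotted point.
swarm.raw <- beeswarm(time_survival ~ event_survival,
                      data = breast, method = 'swarm',
                      pwcol = ER)

# Inspect the structure and the first rows before picking columns
str(swarm.raw)
head(swarm.raw)
```

The column order may differ between package versions, so it is worth checking `names(swarm.raw)` before relying on numeric indices.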

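```r
library(beeswarm)
library(ggplot2)
library(plyr)   # for round_any()
data(breast)

# Keep the swarm coordinates (x, y), the per-point colour (here ER)
# and the original group; the result is named 'swarm' so that it
# does not mask the beeswarm() function.
swarm <- beeswarm(time_survival ~ event_survival,
                  data = breast, method = 'swarm',
                  pwcol = ER)[, c(1, 2, 4, 6)]
colnames(swarm) <- c("x", "y", "ER", "event_survival")

beeswarm.plot <- ggplot(swarm, aes(x, y)) +
  xlab("") +
  scale_y_continuous("Follow-up time (months)")

# One box per group: rounding the jittered x gives back the group position.
# outlier.shape = NA hides the boxplot outliers.
beeswarm.plot2 <- beeswarm.plot +
  geom_boxplot(aes(x, y, group = round_any(x, 1, round)),
               outlier.shape = NA)

beeswarm.plot3 <- beeswarm.plot2 +
  geom_point(aes(colour = ER)) +
  scale_colour_manual(values = c("black", "red")) +
  scale_x_continuous(breaks = 1:2,
                     labels = c("Censored", "Metastasis"),
                     expand = c(0, 0.5))

# Print the final plot
beeswarm.plot3
```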

Do not forget to remove the outliers from your boxplot (outlier.shape = NA), or they will be drawn on top of the points created by geom_point.

I wonder whether these plots are more useful in certain fields. If anybody has references for beeswarm plots, I would be very grateful.