Tips from Plotcon

The plotly package in R enables users to create interactive graphics via the plotly.js library. This past weekend I was lucky to attend PLOTCON, a hands-on workshop taught by the package developer, Carson Sievert, at plotly headquarters in Montreal. The workshop not only offered an excellent introduction to plotly and Shiny, but also provided many tips on R tools for data cleaning and visualization. Here, I'd like to share a couple of them:

crosstalk

crosstalk is an add-on to the htmlwidgets package in R. It implements communication between widgets, and this functionality is available with and without Shiny. I will illustrate this with a simple example using the Titanic dataset I downloaded from Kaggle. The graph on the left is a bar plot showing the distribution of passenger class on the Titanic; the plot on the right is a density plot showing the age distribution of passengers, grouped by passenger class. Clicking a passenger class in the bar chart highlights the corresponding age distribution in the density plot. This is what I mean by interaction between widgets. The two plots are linked by wrapping the data in a SharedData object (an R6 object provided by the crosstalk package) and building both plots from it.

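library(plotly)     # ggplotly(), subplot(), highlight(); also attaches ggplot2
library(crosstalk)  # SharedData

# `titanic` is assumed to be the Kaggle Titanic data, already read into a data frame.
# Keying on Pclass links all rows that belong to the same passenger class.
link <- SharedData$new(titanic, ~Pclass)
# Left: bar plot of passenger class.
p1 <- ggplotly(ggplot(link, aes(x = factor(Pclass))) + geom_bar() + coord_flip() + ylab('') + xlab('passenger class'))
# Right: age density by passenger class.
p2 <- ggplotly(ggplot(link, aes(x = Age, color = factor(Pclass), fill = factor(Pclass))) + geom_density() + ylab('') + xlab('Age') + theme(legend.position = "none"))

subplot(p1, p2, titleX = TRUE, widths = c(0.6, 0.4)) %>%
  highlight(selectize = TRUE, persistent = FALSE, selected = attrs_selected(showlegend = FALSE))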
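To make the linking mechanism a little more concrete, here is a minimal sketch that does not depend on the Kaggle file. The `toy` data frame is a made-up stand-in (its values are random, not real passenger data), and the two scatter plots are just illustrative choices. It assumes the plotly and crosstalk packages are installed.

```r
library(plotly)     # plot_ly(), add_markers(), subplot(), highlight()
library(crosstalk)  # SharedData

# Hypothetical stand-in for the Titanic data: random values, same column names.
set.seed(1)
toy <- data.frame(
  Pclass = sample(1:3, 60, replace = TRUE),
  Age    = round(runif(60, 1, 70)),
  Fare   = round(runif(60, 5, 100), 1)
)

# Rows that share a key value are selected together, so keying on Pclass
# means that clicking one point selects every passenger in that class.
shared <- SharedData$new(toy, key = ~Pclass)

# The R6 object keeps the original data frame and one key value per row.
str(shared$origData())
head(shared$key())

# Widgets built from the same SharedData object are linked to each other.
left  <- plot_ly(shared, x = ~Age, y = ~Fare) %>% add_markers()
right <- plot_ly(shared, x = ~Pclass, y = ~Age) %>% add_markers()

linked <- subplot(left, right) %>%
  highlight(on = "plotly_click", off = "plotly_doubleclick")
linked
```

Clicking a point in either panel should highlight all points with the same passenger class in both panels, and double-clicking clears the selection. No Shiny server is involved; the communication happens in the browser.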

patchwork

patchwork is an add-on package for ggplot2. I like it because it makes arranging multiple plots extremely simple. The package is not yet available on CRAN, but you can install it from GitHub:

devtools::install_github("thomasp85/patchwork")

The syntax is simple: you literally just add the plots together!

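library(ggplot2)
library(patchwork)

# Static plots do not need the SharedData wrapper, so use the data frame directly.
p1 <- ggplot(titanic, aes(x = factor(Pclass))) + geom_bar() + coord_flip() + xlab('Passenger Class') + ylab('')
p2 <- ggplot(titanic, aes(x = Age, color = factor(Pclass), fill = factor(Pclass))) + geom_density() + ylab('') + xlab('Age')

p1 + p2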

And voilà! Now you have two plots arranged side by side. Compared with the gridExtra library, the syntax is shorter. With only two plots, you probably won't notice much of a difference. The difference becomes obvious when arranging multiple plots with different dimensions. Here is an example:

p3 <- ggplot(data=titanic,aes(x=factor(Survived),y=Fare))+geom_boxplot()+xlab('Survived or Not?')

(p1 | p3) / p2

The '|' operator places plots side by side and '/' stacks them vertically, so here p1 and p3 share the top row and p2 spans the full width below. One line of code, that's all you need! More complex examples can be found in the patchwork documentation.
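For comparison, here is a rough sketch of the same two-over-one arrangement written both ways. The `toy` data frame is again a made-up stand-in with random values, and the 1:2 row heights are an arbitrary choice for illustration. It assumes ggplot2, patchwork and gridExtra are installed.

```r
library(ggplot2)
library(patchwork)

# Hypothetical stand-in for the Titanic data: random values, same column names.
set.seed(1)
toy <- data.frame(
  Pclass   = sample(1:3, 90, replace = TRUE),
  Age      = round(runif(90, 1, 70)),
  Survived = sample(0:1, 90, replace = TRUE),
  Fare     = round(runif(90, 5, 100), 1)
)

p1 <- ggplot(toy, aes(x = factor(Pclass))) + geom_bar() + coord_flip()
p2 <- ggplot(toy, aes(x = Age, fill = factor(Pclass))) + geom_density(alpha = 0.5)
p3 <- ggplot(toy, aes(x = factor(Survived), y = Fare)) + geom_boxplot()

# patchwork: p1 and p3 on top, p2 full width below,
# with the bottom row twice as tall as the top row.
combined <- ((p1 | p3) / p2) + plot_layout(heights = c(1, 2))
combined

# gridExtra: the same arrangement needs an explicit layout matrix,
# where each number is the position of a plot in the argument list.
grob <- gridExtra::arrangeGrob(
  p1, p3, p2,
  layout_matrix = rbind(c(1, 2), c(3, 3)),
  heights = c(1, 2)
)
grid::grid.newpage()
grid::grid.draw(grob)
```

The patchwork version reads like the layout it produces, while the gridExtra version asks you to translate the layout into a matrix of plot indices, which gets harder to keep track of as the number of plots grows.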

As you can see, the workshop was not just about teaching you how to use plotly or Shiny. It covered the whole toolkit for making data visualization easier and more accessible to scientists without tons of programming experience. I would highly recommend that anyone interested in data visualization attend one of Carson's plotly workshops.

Last but not least, R is an open source language, so there are numerous resources available at your fingertips. To begin with, I would recommend:

https://plotly-book.cpsievert.me 
http://vissoc.co

for a thorough introduction to plotly and ggplot2.