R Part 2: Installing RStudio and using Shiny with R


Following on from my post on installing and using R, I wanted to look at using RStudio for running R scripts and developing R Shiny applications. RStudio is a very nice development environment for R. It includes a lot of useful development aids such as integrated help and documentation, syntax highlighting and code completion. RStudio also keeps a history of the commands you have run, one of my favourite features, and it can be used to easily install R packages.

Installing RStudio

To start, go to the site

https://www.rstudio.com/products/rstudio/

And download the desktop version. Once installed you will get this nice environment.

1 RStudio

 

A few components on this screen to describe.

  • The console – The exact same as the command prompt R console we loaded up in Part 1
  • Environment – Any objects created in a session will be seen here
  • History – A history of all scripts run in the console, very useful for tracking progress
  • Files – Shows files in the current project, assuming you have created one
  • Plots – A container for any plots produced when you execute code
  • Packages – Allows you to search installed packages and update them.
  • Help – An output of the help documentation we used in Part 1.

Using RStudio

In RStudio I can now run some scripts. To show some nice functionality I am going to create some vectors and a dataframe.

numbers <- c(10, 20, 30) # Create a vector with 3 numbers
letters <- c("a", "b", "c") # Create a vector with 3 letters (this masks R's built-in letters constant)
length(letters) # Count the length of vector letters
# create a dataframe with the two vectors as columns
# (as.data.frame(numbers, letters) would treat letters as row names instead)
dataframe.lettersandnumbers <- data.frame(numbers, letters)
dataframe.lettersandnumbers # print the dataframe

When I run this script a couple of things happen. Three objects are created: a character vector called letters, a numeric vector called numbers and a dataframe called dataframe.lettersandnumbers.

The function length will return the length of the vector called letters. The final line prints the dataframe on screen.

2 RStudio Console

You can see that the objects created now appear in the Environment tab on the right, along with some details about each one. For dataframes you can click the little table symbol to the right of the name. This will show the contents of the dataframe in a new tab.
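The details shown in the Environment tab can also be pulled up from the console. This is a minimal sketch that recreates the three objects so it runs on its own; it uses data.frame() to build a two-column dataframe, which is my assumption about the intended result.

```r
# Recreate the objects from the script so this block is self-contained
numbers <- c(10, 20, 30)
letters <- c("a", "b", "c")
dataframe.lettersandnumbers <- data.frame(numbers, letters)

ls()                               # names of the objects in the current environment
class(numbers)                     # type of the vector: "numeric"
class(letters)                     # type of the vector: "character"
str(dataframe.lettersandnumbers)   # compact summary: rows, columns and column types
dim(dataframe.lettersandnumbers)   # number of rows and columns
# View(dataframe.lettersandnumbers) # in RStudio this opens the same tab as the table symbol
```

str() is probably the closest match to what the Environment tab shows when an object is expanded.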

I can run some code to generate plots in the Plots window. To do it I will be using the following data set from TfL (Transport for London):

TFLStationSummaryStatistics (Licence)

I can set the working directory for R to the following directory using the setwd() function. Note the use of ‘/’ rather than ‘\’ in the path. This is easily missed and can be difficult to spot. You can use getwd() to get the current working folder.

setwd("C:/RDatasets/")
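The reason a single ‘\’ causes trouble is that R treats it as the start of an escape sequence inside a string. Here is a small sketch of two ways around it, plus a way to switch folders and switch back. The use of tempdir() is only there so the example runs on any machine; it is not part of the original setup.

```r
# file.path() joins the pieces with "/" so no backslashes are needed
csv.path <- file.path("C:", "RDatasets", "TFLStationSummaryStatistics.csv")
print(csv.path)

# If a Windows path is pasted in, each backslash has to be doubled in the string;
# gsub() with fixed = TRUE then swaps every backslash for a forward slash
pasted.path <- "C:\\RDatasets\\TFLStationSummaryStatistics.csv"
converted.path <- gsub("\\", "/", pasted.path, fixed = TRUE)
print(identical(csv.path, converted.path))

# setwd() returns the previous working directory invisibly, so it can be restored later
old.wd <- setwd(tempdir())
print(getwd())
setwd(old.wd)
```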

Now let’s create some data sets. To start, let’s read in the CSV, which I have put in the folder C:/RDatasets/

setwd("C:/RDatasets/")
dataframe.TFLStationNumber <- read.csv("TFLStationSummaryStatistics.csv", strip.white=TRUE) # Load dataframe

Filter the data, returning recording type Annual Entry + Exit and year 2014

dataframe.Yearly <- subset(dataframe.TFLStationNumber, trimws(Recording) == "Annual Entry + Exit" & year == 2014) # filter yearly data trimming white space from values using trimws
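To see why trimws is worth having in the filter, here is a small sketch on made-up rows. The column names match the ones used above, but the station names and counts are invented for illustration. As far as I understand the read.csv documentation, strip.white only strips unquoted fields, so padded values can still get through.

```r
# Made-up sample rows; values are invented, only the column names come from the post
sample.stations <- data.frame(
  Station     = c("Station A", "Station B", "Station A"),
  Recording   = c("Annual Entry + Exit ", " Annual Entry + Exit", "Entry Weekday"),
  year        = c(2014, 2014, 2014),
  PeopleCount = c(500, 300, 40),
  stringsAsFactors = FALSE
)

# Without trimws() the padded values do not match the filter text
no.trim <- subset(sample.stations, Recording == "Annual Entry + Exit" & year == 2014)
nrow(no.trim)    # 0 rows

# With trimws() leading and trailing spaces are ignored
with.trim <- subset(sample.stations, trimws(Recording) == "Annual Entry + Exit" & year == 2014)
nrow(with.trim)  # 2 rows
```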

Sort the data set by the count of people

dataframe.Sorted <- dataframe.Yearly[order(-dataframe.Yearly$PeopleCount),] # Order by count
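The sort works because order() returns row positions rather than sorted values, and the minus sign flips the direction. A small sketch on invented numbers, which also shows the decreasing = TRUE form that gives the same result:

```r
# Made-up counts for illustration
sample.counts <- data.frame(
  Station     = c("Station A", "Station B", "Station C"),
  PeopleCount = c(300, 500, 40)
)

# Row positions from the largest count to the smallest
order(-sample.counts$PeopleCount)  # 2 1 3

# Use the positions as the row index; the trailing comma keeps all columns
sorted.minus      <- sample.counts[order(-sample.counts$PeopleCount), ]
sorted.decreasing <- sample.counts[order(sample.counts$PeopleCount, decreasing = TRUE), ]

identical(sorted.minus, sorted.decreasing)  # TRUE
```

The decreasing = TRUE form also works for text columns, where the minus sign would not.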

Plot the data in a bar chart, selecting only the top 5 stations

options("scipen" = 10) # remove scientific notation
barplot(head(dataframe.Sorted$PeopleCount/1000000, 5), names.arg=head(dataframe.Sorted$Station, 5), las=2, ylab="People Count(Millions)", main="Top Station Movement") # create graph, las=2 flips labels 90 degrees; names.arg needs one label per bar, so both head() calls use 5
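barplot needs exactly one label per bar, so it can help to keep the number of stations in a single variable and pass it to both head() calls. A minimal sketch on invented data; the variable name top.n is my own.

```r
# Made-up, already sorted sample data; the counts are invented
sample.sorted <- data.frame(
  Station     = paste("Station", LETTERS[1:7]),
  PeopleCount = c(90, 80, 70, 60, 50, 40, 30) * 1000000
)

top.n <- 5  # how many stations to show

bar.heights <- head(sample.sorted$PeopleCount / 1000000, top.n)  # counts in millions
bar.labels  <- head(sample.sorted$Station, top.n)                # one label per bar

# barplot() returns the x position of each bar, which is handy for checking the bar count
midpoints <- barplot(bar.heights, names.arg = bar.labels, las = 2,
                     ylab = "People Count (Millions)", main = "Top Station Movement")
length(midpoints)  # 5
```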

RStudio Top 5 Stations

As you can see all the dataframes appear in the environment tab. The plot we have created appears in the plot window.

Next let’s install the shiny package.

Installing packages

One of the most useful things about R is the range of packages available. Packages are precompiled R code which can be referenced and called from your R solutions. When you install a package it is stored in your local R package library.

To install the Shiny package in RStudio I can select Tools>Install Packages. The dialog below will open and I can type the name of the package I want to install. In the example below I am going to install the shiny package.

5 RStudio Install Library

When I hit install, R will go out and retrieve the package from the CRAN repository on the web.
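The same install can be done from the console with install.packages(). Below is a small sketch of a helper that only installs when the package is missing. The helper name ensure.package is hypothetical, not something that ships with R or RStudio.

```r
# Hypothetical helper: install a package from CRAN only if it is not already available
ensure.package <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)  # downloads from the configured CRAN mirror
  }
  requireNamespace(pkg, quietly = TRUE)  # TRUE if the package can now be loaded
}

# ensure.package("shiny")  # uncomment to install shiny if it is missing
# library(shiny)           # then attach it for the current session
```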

After it is installed we can create some shiny apps.

Shiny Applications

In order to create the first application, we need to create a project and then add two script files, ui.R and server.R.

In RStudio I can create projects and save scripts. To create a project I select File>New Project. When prompted, select New Directory. For the first application I will create an empty project from the screen below. You can choose Shiny Web Application, which will put everything in place along with some sample code, but for now let’s select an Empty Project.

NewProject

I will call it TFL Stations. Now I want to open two new scripts using File>New File>R Script. I save one as ui.R and the other as server.R.

ShinyApp

When done the screen should look like this, with the two files visible in the project folder.

ShinyApp2

The minimum required code for a shiny application is, in ui.R

shinyUI(fluidPage(
)
)

And in server.R we need

shinyServer(function(input, output) {
}
)
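For quick experiments, the same two pieces can also live in one script and be combined with shinyApp(). A minimal sketch, assuming the shiny package is already installed; the object names ui, server and app are my own choice, and the single file is conventionally saved as app.R.

```r
library(shiny)

# Same empty page as in ui.R
ui <- fluidPage()

# Same empty server function as in server.R
server <- function(input, output) {
}

# Combine the two into an app object; nothing is launched until it is printed or run
app <- shinyApp(ui = ui, server = server)
# runApp(app)  # uncomment to launch the (empty) application
```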

Note when you have entered the required scripts we get the option to Run App

8 Rstudio Run Shiny App

When you run the application with this code it loads an empty screen as we have not specified any inputs or outputs.

So now let’s display the Top Station Movement plot from above

To do this I will add the code directly into server.R

I have added one component to the original code: output$mybarplot <- renderPlot({ }). This creates an output variable called mybarplot which can be used in ui.R to display the plot. renderPlot is the type of output we are creating; other options include renderTable if I want to output a table, or renderUI if I want to output some HTML.

shinyServer(function(input, output) {
 
 setwd("C:/RDatasets/")
 dataframe.TFLStationNumber <- read.csv("TFLStationSummaryStatistics.csv", strip.white=TRUE) # Load dataframe
 
 dataframe.Yearly <- subset(dataframe.TFLStationNumber, trimws(Recording) == "Annual Entry + Exit" & year == 2014) # filter yearly data trimming white space from values
 
 dataframe.Sorted <- dataframe.Yearly[order(-dataframe.Yearly$PeopleCount),] # sort by people count,
 
 options("scipen" = 10) # remove scientific notation
 output$mybarplot <- renderPlot({
 barplot(head(dataframe.Sorted$PeopleCount/1000000), names.arg=head(dataframe.Sorted$Station), las=2, ylab="People Count(Millions)", main="Top Station Movement") # create graph
 })
}
)

Back to ui.R I have created a couple of new sections, here is a brief look at each

Within fluidPage

  • titlePanel – A title bar for the page
  • sidebarLayout – I want a sidebar(for parameters) and a main panel as my output

Within sidebarLayout

  • sidebarPanel – This contains anything in the sidebar, for now this is just a title called Parameters
  • mainPanel – Should contain the final output graphs etc

Within mainPanel

  • plotOutput – plotOutput tells the UI we are going to display a plot. The plot we will display has been configured in our server.R and is called mybarplot

shinyUI(fluidPage(
titlePanel("Station Analysis"),

 sidebarLayout(
  sidebarPanel( "Parameters"),
   mainPanel(
    plotOutput("mybarplot")
         )    
)
))

When we now run the application the plot is displayed as below

topstations11

 

The last thing I will do in this section is to create a parameter.

My data is split by time period (weekday, Saturday, Sunday or annual) and by whether the recording was an entry or an exit. It is also split by year. The parameters will be used to filter the data by these criteria.

To set up the parameter in ui.R I can use the selectInput function seen below. This contains a parameter name of recording_type, which can be used as a filter in our server script. It also has a label and the list of choices to be shown. This can be made dynamic, but for now let’s keep them static.

shinyUI(fluidPage(
 titlePanel("Station Analysis"),
  sidebarLayout(
   sidebarPanel(
    selectInput("recording_type", label = h3("Select Record Type"),
     choices = list(
     "Entry Weekday",
     "Exit Weekday",
     "Entry Saturday",
     "Exit Saturday",
     "Entry Sunday",
     "Exit Sunday",
     "Annual Entry + Exit"
    )),
    selectInput("year", label = h3("Select year"),
     choices = list(
     "2014",
     "2013",
     "2012",
     "2011",
     "2010",
     "2009",
     "2008"
     ))
   ),
 mainPanel(
  plotOutput("mybarplot")
  )
 )
))
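The static lists above could instead be built from the data itself. Here is a small sketch of how the choices might be derived, using made-up rows with the same column names. The variable names recording.choices and year.choices are my own, and the data would need to be loaded somewhere ui.R can see it (a global.R file is one option I believe Shiny supports).

```r
# Made-up sample rows; only the column names come from the post
sample.stations <- data.frame(
  Recording = c("Entry Weekday ", "Exit Weekday", "Entry Weekday", "Annual Entry + Exit"),
  year      = c(2013, 2014, 2014, 2012),
  stringsAsFactors = FALSE
)

# Distinct recording types with stray white space removed, in alphabetical order
recording.choices <- sort(unique(trimws(sample.stations$Recording)))

# Distinct years, most recent first
year.choices <- sort(unique(sample.stations$year), decreasing = TRUE)

recording.choices
year.choices

# In ui.R these could then replace the static lists, for example:
# selectInput("recording_type", label = h3("Select Record Type"), choices = recording.choices)
```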

In server.R I need to move the filters being applied into the renderPlot section. This means the data will refresh when the user changes a selection. I can then change the hardcoded filter values used earlier (“Annual Entry + Exit”, 2014) to use the parameter values selected in the UI by referring to input$recording_type and input$year, the input names declared in ui.R.

shinyServer(function(input, output) {
 
 setwd("C:/RDatasets/")
 dataframe.TFLStationNumber <- read.csv("TFLStationSummaryStatistics.csv", strip.white=TRUE) # Load dataframe
 options("scipen" = 10) # remove scientific notation
 output$mybarplot <- renderPlot({
 dataframe.Yearly <- subset(dataframe.TFLStationNumber, trimws(Recording) == input$recording_type & year == input$year) # filter yearly data trimming white space from values
 dataframe.Sorted <- dataframe.Yearly[order(-dataframe.Yearly$PeopleCount),] # sort by people count,
 barplot(head(dataframe.Sorted$PeopleCount/1000000), names.arg=head(dataframe.Sorted$Station), las=2, ylab="People Count(Millions)", main="Top Station Movement") # create graph
 })
})
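One way to keep the renderPlot section short, and to check the filter logic without starting the app, is to move the filter and sort into a plain function. This is only a sketch: the helper name top.stations and the sample rows are hypothetical. Note that selectInput passes the year as text, so the helper converts it to a number before comparing.

```r
# Hypothetical helper: filter by recording type and year, then return the top n rows
top.stations <- function(data, recording_type, year, n = 6) {
  year.number <- as.numeric(year)  # selectInput returns text such as "2014"
  filtered <- data[trimws(data$Recording) == recording_type & data$year == year.number, ]
  sorted <- filtered[order(-filtered$PeopleCount), ]  # largest count first
  head(sorted, n)
}

# Made-up sample rows; only the column names come from the post
sample.stations <- data.frame(
  Station     = c("Station A", "Station B", "Station C", "Station A"),
  Recording   = c("Entry Weekday", "Entry Weekday ", "Entry Weekday", "Exit Weekday"),
  year        = c(2014, 2014, 2013, 2014),
  PeopleCount = c(100, 250, 900, 80),
  stringsAsFactors = FALSE
)

result <- top.stations(sample.stations, "Entry Weekday", "2014")
result$Station  # "Station B" "Station A"

# Inside renderPlot this could be called as:
# top <- top.stations(dataframe.TFLStationNumber, input$recording_type, input$year)
```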

The output should be similar to below, and the plot values should change as you change the parameter selections.

topstations21

See the application in action here

That is the very basic setup of a shiny application. I have kept the guide as simple as possible, but there are plenty of resources available online to expand this further with more advanced graphs and more R libraries. We can also enhance the output using HTML tags and allow the user to upload or download data, or export the output to PDF. The components I have looked at also have many more options, for example the type of input parameter and the style of plot. See the references below for more details or have a look at Part 3.

In Part 3 I will take this further using more libraries, a clustering algorithm, and some Google graphs. Finally, in Part 4 I will look at configuring and deploying to a Shiny Server.

References

http://shiny.rstudio.com/tutorial/

Data Resources

https://tfl.gov.uk/info-for/open-data-users/

Licence

https://tfl.gov.uk/corporate/terms-and-conditions/transport-data-service
