Troubleshooting R inside SQL Server 2016

#

The purpose of this article is to provide solutions for some of the common problems that I run into when installing R Services inside SQL Server 2016 and when executing R scripts with regular T-SQL code.

For this blog I will be using SQL Server 2016 Developer Edition, which is free to download if you have an account on MSDN. You can install SQL Server 2016 with an executable if you wish, but the installation is pretty standard with little room for customisation. To use R with T-SQL you need to install SQL Server 2016 with the R Services feature.

Installing SQL Server 2016 with R

NB: Some virtual machines (like the Data Science VM or the Windows SQL Server 2016 Enterprise VM on Azure) come with SQL Server 2016 already installed with R. If you are using such a VM, skip this step.

  1. Go to SQL Server 2016 Installation Center-> Installation -> … or add features to an existing installation.
  2. You might be asked to specify your where your installation media is. I actually found mine in a folder called C:/SQLServer2016Media/Developer. But whichever directory contains the SQL Server specific SETUP.EXE file is where you need to point it.
  3. Under Installation Type, select Add new features to an existing instance of SQL Server 2016.
  4. Under Feature Selection, tick the Database Engine Services (which includes R Services) and why not R Server too.

Configuring R in SQL Server

  1. You will need to configure SQL Server so that external scripts (like R scripts) are able to run in T-SQL queries. Open up a new SQL query window in Management Studio and execute the following:
    EXECUTE sp_configure 'external scripts enabled', 1;
    RECONFIGURE; 
  2. In order for this to take effect, you will need to restart SQL Server. This is convenient, because you need to make sure that a couple of services are running before R works. Go to View local services and restart the SQL Server service.
  3. Ensure that the following services are also running the SQL Server Launchpad and the SQL Server Agent
  4. Test that the R engine is working inside SQL Server by running the following query. You should know it has worked when you see a table containing data about cats’ body weights (Bwt) relative to their heart weights (Hwt).
DECLARE @Rscript AS NVARCHAR (255);
SET @Rscript = N'
# access a library with sample data.
library(MASS)
# return a table containing cat data.
myCats <- cats;';
EXECUTE sp_execute_external_script @language = N'R', @script = @rscript, @output_data_1_name = N'myCats';  

 

Installing packages

There are a number of ways you might want to install packages in R. If you are already comfortable with R, you can spin up an R terminal session, then the simplest way is to find the directory in which your R packages (specific to SQL Server 2016) are housed and install new packages into them.

Navigate to the bin directory from your command prompt (usually something like ~/MSSQL13.MSSQLSERVER/R_SERVICES/bin), execute R.exe and then the following R code:

# Point your R terminal session to the directory containing your SQL Server native libraries.
path <- "~/Microsoft SQL Server/MSSQL13.MSSQLSERVER/R_SERVICES/library";
# Set your .libPaths() variable, 
# so that install.packages() installs new packages into your SQL Server libraries directory.
.libPaths(path);
# Now any library or package you install will be available to SQL Server.
install.packages(
     c("dplyr","tidyr","data.table","psych","twitteR")
     );
Or similarly:
install.packages("dplyr", lib = path);

 

Firewall Rules Blocking SQL Server

After you install the necessary packages and configure SQL Server to execute external scripts, if you are pulling data from an external API or something similar then you may find that your R scripts will fail. This is because of a firewall rule that is created by default when R services is installed.

This example uses a handful of libraries to authenticate and connect to the Twitter Streaming API, downloading tweets with a certain search term in real time. When I run the same script in a normal R session, it will connect and stream tweets with no problems. The same script run in SSMS will immediately cut out at 0 seconds and cause the script to fail, since no tweets are pulled from Twitter’s servers.

DECLARE @Script AS NVARCHAR (MAX) = N'
library(streamR)
library(ROAuth)
library(RJSONIO)
# Get tweets using functions from the streamR.
load("/path/to/my_oauth.Rdata")
tweets <- filterStream(file.name = "",
                       track = "corbyn", 
                       language = "en",
                       timeout = 10,
                       oauth = my_oauth); 
tweets.df <- parseTweets(tweets)';
EXECUTE sp_execute_external_script @language = N'R', @script = @script, @output_data_1_name = N'tweets.df';  

 

One method is to simply turn off the firewall for the public network. If you are a sensible and security-minded person you might opt for the following instead, which allows you to keep both the public and private network firewalls on:

  1. Go to Windows Firewall with Advanced Security.
  2. Under Outbound Rules, disable “Block network for R local user accounts in SQL Server Instance [instance name]”.
  3. Test to see if running the same script inside SSMS. It should now take a little while it streams tweets and outputs them into a table.

 

Go and play!

This covers most of the common problems I have run into when installing R services in SQL Server 2016 and when running certain R scripts.

If you would like to become more familiar with the sp_execute_external_script procedure and how to use the input and output parameters, there are some really good resources on MSDN, including a data science walkthrough with data and example scripts to download and tinker with. 

You can start to learn the basics of R on the new Consolidata website, and start working through some interesting data applications like this one.

Latest from this author