BT

Facilitating the Spread of Knowledge and Innovation in Professional Software Development

Write for InfoQ

Topics

Choose your language

InfoQ Homepage News R for SQL Server 2016

R for SQL Server 2016

The R programming language was started in 1993 and saw its first production quality release in 2000. Since then, R has become an industry standard for statistical analysis with nearly a dozen graphical interfaces as well as packages for various IDEs and text editors.

Commercial support for R is offered by Revolution Analytics, a nine-year-old company that was recently acquired by Microsoft as part of their big data and machine learning initiatives. The result of this acquisition is support for R in the upcoming SQL Server 2016.

The primary advantage of SQL Server R Services is data locality. With R running in the database, you eliminate the performance hit normally associated with moving the data from your DB server to your R server. It also has the advantage of being able to encapsulate the whole operation into a stored procedure, eliminating the need for an external application server that governs the process.

The gateway for this is sp_execute_external_script. This stored procedure allows you to pipe data from SQL Server to R using standard queries. An R variable, usually a data frame, can likewise be returned back to SQL Server upon completion of the R script.

An interesting feature of sp_execute_external_script is the @language parameter. While it is currently limited to just R, the documentation implies that a plugin model for other scripting languages will be offered.

Can be any supported language that has been registered by using xxx. language is sysname.

R offers many advantages of normal SQL when it comes to complex data manipulation. Casimir Saternos article, Data Manipulation in R: Beyond SQL, offers some examples such as how to simplify pivoting and unpivoting tables, Casimir summaries the article with,

SQL is – by definition – a query language. It excels at retrieving data from a database and is in fact essential in many situations where it is the only way to get data out of a database.

However, SQL can be cumbersome when it is used to transform data. SQL is very flexible and so does support the ability to transform data in significant ways, but often with the cost of requiring verbose, obscure, and difficult to support SQL statements. R includes a number of packages that can perform such operations in a concise, clear, simple manner. It is well worth the time to learn these packages so that the best aspects of both SQL and R packages can be used to analyze data using a series of steps that is easy to understand and comprehend.

SQL Server 2016 is currently in its third Community Tech Preview.

Rate this Article

Adoption
Style

BT