BT

R for SQL Server 2016

by Jonathan Allen on Feb 11, 2016 |

The R programming language was started in 1993 and saw its first production quality release in 2000. Since then, R has become an industry standard for statistical analysis with nearly a dozen graphical interfaces as well as packages for various IDEs and text editors.

Commercial support for R is offered by Revolution Analytics, a nine-year-old company that was recently acquired by Microsoft as part of their big data and machine learning initiatives. The result of this acquisition is support for R in the upcoming SQL Server 2016.

The primary advantage of SQL Server R Services is data locality. With R running in the database, you eliminate the performance hit normally associated with moving the data from your DB server to your R server. It also has the advantage of being able to encapsulate the whole operation into a stored procedure, eliminating the need for an external application server that governs the process.

The gateway for this is sp_execute_external_script. This stored procedure allows you to pipe data from SQL Server to R using standard queries. An R variable, usually a data frame, can likewise be returned back to SQL Server upon completion of the R script.

An interesting feature of sp_execute_external_script is the @language parameter. While it is currently limited to just R, the documentation implies that a plugin model for other scripting languages will be offered.

Can be any supported language that has been registered by using xxx. language is sysname.

R offers many advantages of normal SQL when it comes to complex data manipulation. Casimir Saternos article, Data Manipulation in R: Beyond SQL, offers some examples such as how to simplify pivoting and unpivoting tables, Casimir summaries the article with,

SQL is – by definition – a query language. It excels at retrieving data from a database and is in fact essential in many situations where it is the only way to get data out of a database.

However, SQL can be cumbersome when it is used to transform data. SQL is very flexible and so does support the ability to transform data in significant ways, but often with the cost of requiring verbose, obscure, and difficult to support SQL statements. R includes a number of packages that can perform such operations in a concise, clear, simple manner. It is well worth the time to learn these packages so that the best aspects of both SQL and R packages can be used to analyze data using a series of steps that is easy to understand and comprehend.

SQL Server 2016 is currently in its third Community Tech Preview.

Rate this Article

Relevance
Style

Hello stranger!

You need to Register an InfoQ account or or login to post comments. But there's so much more behind being registered.

Get the most out of the InfoQ experience.

Tell us what you think

Allowed html: a,b,br,blockquote,i,li,pre,u,ul,p

Email me replies to any of my messages in this thread
Community comments

Allowed html: a,b,br,blockquote,i,li,pre,u,ul,p

Email me replies to any of my messages in this thread

Allowed html: a,b,br,blockquote,i,li,pre,u,ul,p

Email me replies to any of my messages in this thread

Discuss
General Feedback
Bugs
Advertising
Editorial
Marketing
InfoQ.com and all content copyright © 2006-2016 C4Media Inc. InfoQ.com hosted at Contegix, the best ISP we've ever worked with.
Privacy policy
BT