Posted by Mirko Stocker on May 18, 2011
Together with last week's announcement of the new Scala company Typesafe, the latest major version 2.9.0 of Scala was released. Compared to last year's Scala 2.8 release, 2.9 contains far fewer new features and concentrates on improvements and bug fixes to existing ones.
The primary new feature of Scala 2.9 is parallel collections, which are built on the same abstractions and provide the same interfaces as the existing collection implementations. From the release notes:
Every collection may be converted into a corresponding parallel collection with the new `par` method. Parallel collections utilize multicore processors by implementing bulk operations such as `foreach`, `map`, `filter` etc. in parallel.
For example, given a large array of strings which need to be filtered and then processed, the sequential code looks like this:
val result = data.filter(line => line.contains("keyword")).map(line => process(line))
With a large enough data set and a computationally intensive process function, in Scala 2.9 the code can easily be adjusted to run on multiple CPUs:
val result = data.par.filter(line => line.contains("keyword")).map(line => process(line))
Aleksandar Prokopec's presentation from Scala Days 2010 introduces the parallel collections and explains how they work; a technical report is also available.
Other new features in Scala 2.9 include some useful additions to the Scala REPL (the interactive Scala interpreter):
More robust cursor handling, bash-style ctrl-R history search, new commands like :imports, :implicits, :keybindings.
The scala.sys and scala.sys.process packages were imported from SBT, Scala's Simple Build Tool, to make interacting with native processes easier. Another very useful improvement, even though it is not mentioned in the release notes, is the revamped ScalaDoc. For an example, compare the documentation of Scala's List in 2.9.0 to 2.8.1.
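As a rough illustration of the scala.sys.process package mentioned above, the following minimal sketch runs a native command and captures its output. The object and method names (`ProcessSketch`, `capture`, `exitCode`) are invented for this example, and it assumes a Unix-like system where `echo` is available as an executable.

```scala
import scala.sys.process._

// Hypothetical helper object, not part of the Scala library.
object ProcessSketch {
  // Runs a command and returns its standard output, trimmed.
  // `!!` throws an exception if the command exits with a non-zero code.
  def capture(command: Seq[String]): String = command.!!.trim

  // Runs a command, lets its output go to the console,
  // and returns only the exit code.
  def exitCode(command: Seq[String]): Int = command.!

  def main(args: Array[String]): Unit = {
    // Assumes an `echo` executable is on the PATH (Unix-like systems).
    println(capture(Seq("echo", "hello from a native process")))
    println("exit code: " + exitCode(Seq("echo", "done")))

    // scala.sys also exposes JVM system properties as a map.
    println(sys.props.get("java.version"))
  }
}
```

Passing the command as a `Seq[String]` rather than a single string avoids having to worry about how arguments containing spaces are split.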
When Scala 2.8 was released last year, many users were concerned about the binary and source incompatibilities it introduced. Martin Odersky recently addressed the topic on the mailing list:
[..] 2.8.1 was binary compatible with 2.8. Now that 2.9 is around the corner, the question is what can we do for binary compatibility now and in the future?
[..] we are not there yet, but are making progress. In particular, we have developed the basic technology that lets us maintain binary compatibility for future releases. [..]
Some binary incompatibilities are not accidental, but the result of conscious generalizations and enrichments of the libraries. We do not want to sacrifice these possible improvements for binary compatibility. What we do instead is search for technological solutions.
One such technological solution is compiler-generated forwarders, so-called bridge methods, which delegate calls from an old method to a new one. This technique is already used in Scala 2.9.0. Another solution is the migration manager being developed at Typesafe, which is not yet available. Martin concludes:
For 2.9, bridge methods ensure that quite a lot of code compiled against 2.8 will continue to operate, but by no means all code. So it is very much advised to recompile your projects for 2.9. Recompilation should in most cases be painless because 2.9 is by-and-large source compatible with 2.8. There's one exception: If your application uses features that were already deprecated in 2.8, it might find these features removed in 2.9. So it's a good idea to get rid of deprecation warnings before upgrading.
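The forwarder idea behind bridge methods can be pictured with a hand-written, source-level analogue. This is only a sketch of the concept, not what the Scala compiler actually emits: the class `Stats` and its `total` methods are hypothetical, and here the forwarder is an ordinary overload rather than a generated method.

```scala
// Hypothetical example; none of these names come from the Scala library.
object ForwarderSketch {
  class Stats {
    // New, generalized signature: accepts any Iterable[Int].
    def total(xs: Iterable[Int]): Int = xs.sum

    // Old, narrower signature kept as a forwarder, so that callers
    // compiled against it still find a method with the signature they
    // expect. The type ascription selects the generalized overload.
    def total(xs: List[Int]): Int = total(xs: Iterable[Int])
  }

  def main(args: Array[String]): Unit = {
    val stats = new Stats
    // Resolves to the old List signature, which forwards to the new one.
    println(stats.total(List(1, 2, 3)))
    // Resolves directly to the generalized signature.
    println(stats.total(Vector(1, 2, 3)))
  }
}
```

The point of the pattern is that the old signature remains present in the compiled class, while the behaviour lives in one place, the generalized method.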
In contrast to the release of Scala 2.8.0, 2.9.0 is a much smaller, incremental release, and many libraries are already available for Scala 2.9.