Posted by Jonathan Allen on Jul 25, 2011
As part of the Python Tools for Visual Studio project, the well-known NumPy and SciPy libraries have been ported to .NET. The port, which layers C# and C interfaces over a native C core, was done in such a way that all .NET languages can take advantage of it.
The IronPython ports of NumPy and SciPy are full .NET ports and include custom C#/C interfaces to a common native C core. This means that the full functionality is available not only to IronPython but to all .NET languages such as C# or F# by directly accessing the C# interface objects or sometimes by evaluating IronPython expressions from other .NET languages. This means that a multi-dimensional array object (ndarray) can be passed seamlessly between IronPython and C# or F# code. Further, the ndarray object implements the standard IEnumerable interface, allowing the array object to often be used with existing code that is not specific to NumPy.
NumPy is a fairly low-level API for performing mathematical operations on large, multi-dimensional arrays and matrices. This library, originally known as Numeric, dates back to 1995, just one year after Python 1.0 was released. The current version of the library, under the name NumPy, was created in 2005 by combining the earlier versions with a competing library known as numarray.
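To give a sense of what "mathematical operations on multi-dimensional arrays" looks like in practice, here is a minimal sketch using the standard NumPy API on regular Python. The variable names are purely illustrative, and it is an assumption that the .NET port exposes the same Python-level calls under IronPython.

```python
import numpy as np

# A 3x4 array of floats: [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]]
grid = np.arange(12, dtype=float).reshape(3, 4)

# Arithmetic is applied element by element, with no explicit loops.
scaled = grid * 2.0 + 1.0

# Reductions can run along a chosen axis; axis=0 sums each column.
column_sums = grid.sum(axis=0)

# Matrix multiplication of the 3x4 array with its 4x3 transpose.
product = grid @ grid.T

print(scaled.shape, column_sums, product.shape)
```

The loops implied by each of these lines run inside the native core rather than in interpreted code, which is where much of the library's speed comes from.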
Built on top of this is SciPy. According to Wikipedia, “SciPy contains modules for optimization, linear algebra, integration, interpolation, special functions, FFT, signal and image processing, ODE solvers and other tasks common in science and engineering.” It is often considered an alternative to MATLAB, though SciPy often has to be combined with other libraries to fully replace it.
The combination of NumPy and SciPy offers some notable advantages over normal .NET code. While .NET’s garbage collector can offer better performance than manual memory management, there is something to be said for the raw computational speed one can get from highly optimized C code.
On top of this is the concept of views. Instead of copying arrays, NumPy allows one to create arrays that are live subsets of other arrays. Changing the subset, known formally as a view, also changes the original array. This allows for cleaner code without sacrificing performance.
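The behavior of views can be sketched in a few lines. This example uses the standard NumPy API on regular Python; the variable names are illustrative, and it is an assumption that the .NET port behaves the same way when driven from IronPython.

```python
import numpy as np

original = np.arange(10)      # [0 1 2 3 4 5 6 7 8 9]

# Basic slicing returns a view: no data is copied.
subset = original[2:6]

# Writing through the view changes the original array as well.
subset[:] = 0

# An explicit copy is independent of the original.
independent = original[2:6].copy()
independent[:] = 99

print(original)                                   # elements 2..5 are now 0
print(np.shares_memory(original, subset))         # view shares the buffer
print(np.shares_memory(original, independent))    # copy does not
```

Because a view shares memory with its source, code can operate on a slice of a large array without paying for a copy; calling `copy()` is the way to opt out when an independent array is needed.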