IBM’s Software Architecture for Astronomically Big Data
IBM has recently prototyped a software architecture that can handle very large data flows. The software is built for the Square Kilometre Array (SKA) telescope and allows astronomical objects to be classified automatically. Radio astronomer Melanie Johnston-Hollitt of Victoria University of Wellington, New Zealand, collaborated with IBM on developing the system.
The main goal of the SKA project is to perform unprecedented observations of radio sources using a network of dishes and aerials spread across either Australia and New Zealand or southern Africa. A main design challenge is how to process one exabyte of raw data per day. This is the data volume anticipated once the SKA is operating as the world’s largest and most sensitive radio telescope; its construction is scheduled to start in 2016. IBM claims that this volume exceeds the entire daily Internet traffic. The amount would be enough to fill over 15 million 64 GB iPods.
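The iPod comparison can be checked with a few lines of arithmetic. The sketch below assumes decimal SI units (1 exabyte = 10^18 bytes, 1 GB = 10^9 bytes), which the article does not state explicitly, and also derives the sustained data rate implied by one exabyte per day.

```python
# Back-of-the-envelope check of the quoted figures.
# Assumption: decimal SI units (1 EB = 10**18 bytes, 1 GB = 10**9 bytes).
EXABYTE = 10**18
GIGABYTE = 10**9
TERABYTE = 10**12
SECONDS_PER_DAY = 24 * 60 * 60

daily_bytes = 1 * EXABYTE
ipod_capacity = 64 * GIGABYTE

# How many 64 GB devices one day of raw data would fill.
ipods_per_day = daily_bytes / ipod_capacity

# Average rate needed to move one exabyte within 24 hours.
sustained_rate_tb_s = daily_bytes / SECONDS_PER_DAY / TERABYTE

print(f"{ipods_per_day / 1e6:.3f} million iPods per day")  # 15.625 million
print(f"{sustained_rate_tb_s:.2f} TB/s sustained")         # 11.57 TB/s
```

Under these assumptions the result is about 15.6 million devices per day, consistent with the "over 15 million" figure, and an average rate of roughly 11.6 terabytes per second.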
IBM announced on 30 November that it has prototyped a new software architecture for automating data management, potentially making it easier for researchers to collect usable information from mega-scale data collection projects like the Square Kilometre Array (SKA) global telescope, which aims to address unanswered questions about our universe.
With the support of Dr. Melanie Johnston-Hollitt, the company created the Information Intensive Framework (IIF). According to IBM, the software uses the International Virtual Observatory Alliance (IVOA) ontology to classify collected data into concepts understood by astronomers and then provides intelligent 'guided search' functionality. The ontology is technically based on the Web Ontology Language (OWL). Astronomers hope that automating classification will increase their productivity and creativity.
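To illustrate the general idea of ontology-based classification and guided search, here is a minimal sketch. It is not the IIF and does not use the IVOA ontology or OWL tooling: the concept hierarchy, the classification rules, the record fields (`period_s`, `extended`), and the function names are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical toy concept hierarchy (child -> parent); NOT the IVOA ontology.
PARENT = {
    "RadioGalaxy": "Galaxy",
    "SpiralGalaxy": "Galaxy",
    "Pulsar": "Star",
    "Galaxy": "AstronomicalObject",
    "Star": "AstronomicalObject",
}

def ancestors(concept):
    """Yield the concept itself and every broader concept above it."""
    while concept is not None:
        yield concept
        concept = PARENT.get(concept)

def classify(record):
    """Assign a concept using made-up rules, for illustration only."""
    if record["period_s"] is not None:
        return "Pulsar"
    if record["extended"]:
        return "RadioGalaxy"
    return "AstronomicalObject"

def build_index(records):
    """Index each record under its concept and all broader concepts."""
    index = defaultdict(list)
    for record in records:
        for concept in ancestors(classify(record)):
            index[concept].append(record["name"])
    return index

# Invented sample records standing in for detected radio sources.
records = [
    {"name": "src-001", "period_s": 0.033, "extended": False},
    {"name": "src-002", "period_s": None, "extended": True},
    {"name": "src-003", "period_s": None, "extended": False},
]
index = build_index(records)

print(index["Galaxy"])              # ['src-002']
print(index["Star"])                # ['src-001']
print(index["AstronomicalObject"])  # ['src-001', 'src-002', 'src-003']
```

Because each record is indexed under its own concept and every broader one, a search for a general term such as "Galaxy" also returns sources classified under a narrower term such as "RadioGalaxy". That is one plausible reading of what a guided search over an ontology offers.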
While originally developed for the SKA, the IIF could also be leveraged in other domains. As Douglas Watt, Chief Technology Officer of IBM New Zealand, explains:
“While developed with SKA in mind, the results are also applicable to other organisations faced with a ‘data deluge’. We have identified several local scenarios which would benefit from automated analysis of performance data to uncover trends, identify anomalies and improve decisions. These range from individual manufacturing plants and telecommunications companies to whole transport networks and healthcare systems.”
Further work on the IIF will include, among other things, improving performance through parallel processing.
Readers interested in the SKA project can view an image on Flickr that illustrates some of the impressive details of the SKA.