Oracle Big Data Appliance and Connectors Support Integration with Hadoop and Cloudera Manager
Oracle Big Data Appliance and Big Data Connectors support integration with Hadoop, Cloudera Manager, and Oracle NoSQL Database. Last month, Oracle announced the availability of the Big Data Appliance and Big Data Connectors products, along with its partnership with Cloudera.
Big Data Appliance incorporates Cloudera's Distribution including Apache Hadoop (CDH) with Cloudera Manager, as well as an open source distribution of the R statistical programming language. It runs on the Oracle Enterprise Linux 5.6 operating system, with the HotSpot Java Virtual Machine included in the package. The Big Data Appliance can run both the Community and Enterprise Editions of Oracle NoSQL Database. It also integrates with other Oracle products, such as Exadata and Oracle Database, through Oracle Big Data Connectors for the analysis of structured and unstructured data in the enterprise.
Big Data Connectors:
The Oracle Big Data Connectors product can be used to integrate data stored in Hadoop and Oracle NoSQL Database with Oracle Database 11g. It also enables analysis directly on Hadoop data using Oracle’s distribution of open source R. The Big Data Connectors software bundle includes the following components:
- Oracle Loader for Hadoop: This is a MapReduce utility that optimizes data loading from Hadoop into Oracle Database. It can sort, partition, and convert data into Oracle Database formats in Hadoop. The preprocessing runs as a Hadoop job on a Hadoop cluster, after which the converted data is loaded into the database. The loader also supports online and offline load options, load balancing, and multiple input formats (such as delimited text files, Hive tables, and custom formats).
- Oracle Direct Connector for Hadoop Distributed File System (HDFS): This connector gives Oracle Database access to data on HDFS. Users create an external table in Oracle Database that points to the HDFS data, which can then be queried via SQL, joined with data stored in Oracle Database, or loaded into Oracle Database. Data on HDFS can be in delimited files or in Oracle Data Pump files created by Oracle Loader for Hadoop.
- Oracle Data Integrator (ODI) Application Adapter for Hadoop: This adapter provides native Hadoop integration within ODI. The ODI modules can be used to build Hadoop metadata within ODI, load data into Hadoop, transform data within Hadoop, and load data directly into Oracle Database utilizing Oracle Loader for Hadoop.
- Oracle R Connector for Hadoop: This component is an R package that provides access to Hadoop and to the data stored in HDFS. It's used to create R models against large volumes of data leveraging MapReduce processing.
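The sort-and-partition preprocessing described for Oracle Loader for Hadoop can be illustrated with a small conceptual sketch. This is not the loader's API or its implementation: the function name `partition_and_sort`, the hash-based partitioning scheme, and the comma-delimited input are all assumptions made purely for illustration, and the sketch runs locally in plain Python rather than as a MapReduce job.

```python
import zlib
from collections import defaultdict


def partition_and_sort(lines, num_partitions, key_index=0, delimiter=","):
    """Group delimited records into partitions, then sort each partition by key.

    Hypothetical helper for illustration only; it mimics the idea of
    preparing data before a load, not any Oracle product behavior.
    """
    partitions = defaultdict(list)
    for line in lines:
        fields = line.rstrip("\n").split(delimiter)
        key = fields[key_index]
        # Assumed scheme: a deterministic hash, so a given key always
        # lands in the same partition across runs.
        bucket = zlib.crc32(key.encode("utf-8")) % num_partitions
        partitions[bucket].append(fields)
    # Sort the records inside each partition by the same key column.
    return {
        bucket: sorted(rows, key=lambda row: row[key_index])
        for bucket, rows in partitions.items()
    }
```

Calling `partition_and_sort(["b,2", "a,1", "c,3", "a,4"], 2)` returns a dictionary that maps each partition number to its sorted records, with both `a` records in the same partition. In a real deployment the equivalent work would be distributed across the Hadoop cluster, and the partitioning would follow the target table's definition rather than a generic hash.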
Cloudera Manager, which is included in the Big Data Appliance bundle, provides a cluster-wide, real-time view of the running nodes and services and can be used to make configuration changes across the cluster. It also includes reporting and diagnostic tools for viewing cluster performance and utilization.
Oracle Advanced Analytics:
Oracle also recently announced Oracle Advanced Analytics for Big Data, which integrates the R statistical programming language into Oracle Database 11g. Oracle Data Mining, which is now part of Oracle Advanced Analytics, helps customers build and deploy predictive analytics applications to gain more insight into application performance.
Srini Penchikala Aug 21, 2014