AI, ML & Data Engineering Content on InfoQ
-
Hadoop Futures at Structure Big Data: DataStax Brisk, EMC, and MapR
DataStax described Brisk, their new Hadoop distribution that stores data in Cassandra; EMC published an ad that promised big news about Hadoop and Greenplum on May 9th; and GigaOm claimed that MapR Technologies is building a proprietary version of Hadoop. DataStax told InfoQ there are production Cassandra clusters of 700 nodes, storing hundreds of terabytes and doing 200,000 writes per second.
-
MongoDB 1.8 Improves Reliability with Journaling
MongoDB's new journaling feature improves reliability with write-ahead redo logs. Log entries are written before permanent storage is updated. When a server restarts after a crash, outstanding journal files are replayed before the server goes online. Other changes include sharding performance boosts, shell tab completion, and the addition of covering and sparse indexes.
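The write-ahead idea described above can be illustrated with a minimal sketch. This is not MongoDB's implementation; it is a toy key-value store, and every name in it (`JournaledStore`, `journal.log`, `data.json`, `demo`) is hypothetical. It only assumes the general pattern from the summary: append a redo entry to the journal first, update the store second, and replay outstanding journal entries on restart.

```python
import json
import os
import tempfile


class JournaledStore:
    """Toy key-value store with a write-ahead redo journal (illustrative only)."""

    def __init__(self, directory):
        # File names are assumptions made for this sketch.
        self.journal_path = os.path.join(directory, "journal.log")
        self.data_path = os.path.join(directory, "data.json")
        self.data = {}
        self._recover()

    def _recover(self):
        # Load the last durable snapshot, if one exists.
        if os.path.exists(self.data_path):
            with open(self.data_path) as f:
                self.data = json.load(f)
        # Replay outstanding journal entries before serving requests.
        if os.path.exists(self.journal_path):
            with open(self.journal_path) as f:
                for line in f:
                    try:
                        entry = json.loads(line)
                    except json.JSONDecodeError:
                        break  # torn final entry from a crash mid-write
                    self.data[entry["key"]] = entry["value"]
            self.checkpoint()

    def put(self, key, value):
        # 1. Append the redo entry and force it to disk.
        with open(self.journal_path, "a") as f:
            f.write(json.dumps({"key": key, "value": value}) + "\n")
            f.flush()
            os.fsync(f.fileno())
        # 2. Only then apply the change to the store itself.
        self.data[key] = value

    def checkpoint(self):
        # Write the snapshot to a temp file, swap it in atomically,
        # then truncate the journal because its entries are now durable.
        tmp_path = self.data_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump(self.data, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, self.data_path)
        open(self.journal_path, "w").close()


def demo():
    """Simulate a crash between a write and the next checkpoint."""
    with tempfile.TemporaryDirectory() as directory:
        store = JournaledStore(directory)
        store.put("user", "alice")
        del store  # "crash": no checkpoint was taken
        recovered = JournaledStore(directory)  # restart replays the journal
        return recovered.data


if __name__ == "__main__":
    print(demo())
```

In this sketch the write survives the simulated crash only because the journal entry reached disk before the in-memory update; a real database would add batching, checksums, and group commits on top of the same ordering.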
-
Couchbase Announces Couchbase Server and an Advisory Board
Couchbase, the company recently formed by merging Membase and CouchOne, has announced the availability of Couchbase Server, in addition to Membase Server and Mobile Couchbase, and has named the members of its Advisory Board.
-
Scale-up or Scale-out? Or both?
Over the last twenty years, the prevalent trend in IT has been scaling out rather than scaling up. Recent technological advances offer a new option: scaling out scaled-up servers based on GPUs.
-
Hadoop Redesign for Upgrades and Other Programming Paradigms
Yahoo recently announced and presented a redesign of Hadoop's core MapReduce architecture to allow for easier upgrades, larger clusters, and fast recovery, and to support programming paradigms in addition to MapReduce. The new design is quite similar to the open source Mesos cluster management project; both Yahoo and Mesos commented on the differences and opportunities.
-
NASA’s OODT selected as an Apache Top Level Project
The Apache Software Foundation has selected the Object Oriented Data Technology (OODT) architecture to become one of its Top-Level Projects (TLPs). Originally created by NASA’s Jet Propulsion Laboratory in Pasadena, OODT allows transparent integration of geographically distributed and disparate computing and data resources via metadata middleware.
-
NoSQL Shake-Up. Membase and CouchOne merge into Couchbase
The shape of the NoSQL landscape is changing. The first big market consolidation took place with the merger of Membase Inc. and CouchOne into Couchbase. InfoQ spoke with James Phillips and Damien Katz about the benefits of the merger and future products.
-
Revolution Analytics - Commercializing R for Statistics
InfoQ interviewed David Smith, VP of Community for Revolution Analytics, at the Strata big data conference. Revolution provides commercial extensions for the open source R statistics package and announced the R Enterprise v4.2 Suite along with tools to help SAS users migrate to R.
-
JasperSoft 4 Released with Big Data Support
JasperSoft announces reporting support for Hadoop and leading NoSQL databases.
-
Making the Case for RAMClouds
Since early 2008, researchers and technologists alike have been tantalized by the possibility of using DRAM to scale high-performance storage using In-Memory Data Grids (IMDGs). How has the discussion progressed since that time?
-
Strata Big Data Conference Program Published
The program for the new O'Reilly Strata conference for big data was announced today and registration has opened. We interviewed conference organizer Edd Dumbill about the conference.
-
MySQL/HandlerSocket and VoltDB: Contenders to NoSQL
Some consider NoSQL systems to perform better than traditional SQL ones. Two SQL solutions, one based on MySQL plus a NoSQL layer used as a plug-in (HandlerSocket) and the other being VoltDB, claim that SQL is still a viable solution for large applications with high scalability needs.
-
Foursquare's MongoDB Outage
Foursquare recently suffered a total site outage for eleven hours. The outage was caused by unexpected uneven growth in their MongoDB database that their monitoring didn't detect. The system outage was prolonged when an attempt to add a partition didn't work due to fragmentation, and required taking the database offline to compact it. Learn what happened and what responses are planned.
-
Membase and Cloudera Announce Integration
Membase and Cloudera announced integration of the Membase NoSQL database and Cloudera's Distribution for Hadoop, the distributed map-reduce and storage system, allowing for bidirectional data replication between the systems.
-
Designing a Web Application with Scalability in Mind
Max Indelicato, a Software Development Director and former Chief Software Architect, has written a post on how to design a web application for scalability. He suggests choosing the right deployment and storage solutions, using scalable data storage and a scalable schema, and using abstraction layers.