Facilitating the Spread of Knowledge and Innovation in Professional Software Development

AI, ML & Data Engineering Content on InfoQ

  • A Roundup of Cloudera Distribution Containing Apache Hadoop 5

    Cloudera recently released the latest version of its software distribution, CDH5. It arrives almost 20 months after the last major version, CDH4, which seems like ages in the Big Data world. We take a look at the new features this release brings and at the future direction of Cloudera after the latest round of investment from Intel and Google Ventures.

  • Hydra Takes On Hadoop

    The social-networking company AddThis recently announced that it has open-sourced Hydra under the Apache License, Version 2.0. Hydra grew from an in-house platform created to process semi-structured social data as live streams and to run efficient queries on those data sets.

  • MongoDB 2.6 Release - An Interview With Kelly Stirman

    MongoDB needs no introduction for NoSQL users. Kelly Stirman, Director of Product Marketing at MongoDB, answers questions about the latest stable release, 2.6. Storage fragmentation, index intersection, full-text search and MongoDB in the enterprise are discussed. Finally, we have more information about one of the most watched and voted-for feature requests on the MongoDB JIRA tracker: collection-level locking.

  • Spark Gets a Dedicated Big Data Platform

    Spark users can now use a new Big Data platform provided by intelligence company Atigeo, which bundles most of the UC Berkeley stack into a unified framework optimized for low-latency data processing that can provide significant improvements over more traditional Hadoop-based platforms.

  • Rebecca Parsons on the ThoughtWorks Technology Radar

    In January, ThoughtWorks released the latest version of their Technology Radar, in which they track what's interesting in the software development ecosystem. The big themes this year are (1) early warning systems and recovery in production, (2) the tension between privacy and big data, (3) the JavaScript ecosystem and (4) the blurring of the line between the physical and virtual worlds.

  • CouchDB Progresses As IBM Acquires Cloudant

    IBM recently announced a definitive agreement to acquire the major contributor to the CouchDB project, cloud database startup Cloudant. Adding CouchDB to IBM’s arsenal of technologies, together with the SoftLayer acquisition and the MongoDB partnership, creates an ecosystem of technologies that brings IBM into direct comparison with Amazon. A comparison of CouchDB, DynamoDB and ObjectRocket shows the strong points

  • HBase 0.98 Introduces Cell-based Security

    Apache released HBase 0.98, which primarily addresses convergence with Apache Accumulo by adding cell-based security modeled after Accumulo's, while also resolving over 230 JIRA issues.

  • Graph Processing Using Big Data Technologies

    Processing extremely large graphs has been and remains a challenge, but recent advances in Big Data technologies have made this task more practical. Tapad, a startup based in NYC focused on cross-device content delivery, has made graph processing the heart of their business model, using Big Data to scale to terabytes of data.

  • Domino: Datascience-as-a-Service

    Domino, a Platform-as-a-Service for data science, enables people to do analytical work using languages such as Python or R in the cloud (EC2).

  • Cassandra Gains Momentum On Enterprise Adoption Around 2.1 Release

    Cassandra is rapidly heading towards the 2.1 release, with 2.1.0-beta1 already available for evaluation. We take a look at the major features introduced in the latest major release and at what's coming up. Supported by DataStax, Cassandra is expanding its reach towards the enterprise world. DataStax recently announced a partner network program, Patrick McFadin called out MongoDB's scaling issues and other

  • Big Data Hadoop Solutions, State of Affairs in Q1/2014

    According to a new Forrester report, Hadoop’s momentum is unstoppable. Its usage in the enterprise is continuously growing due to its ability to offer companies new ways to store, process, analyze, and share big data. The report takes a look at Hadoop vendors and ranks them.

  • IBM Launches Contest for Cognitive Mobile Apps using Watson

    At the Mobile World Congress, IBM announced a contest for developers to create mobile consumer and business apps powered by the IBM Watson cognitive computing platform. The winners of the IBM Watson Mobile Developer Challenge will receive design consulting and support from IBM to gain access to the market.

  • Spark Officially Graduates From Apache Incubator

    Recently, Spark graduated from the Apache Incubator. Spark claims speed improvements of up to 100x over Apache Hadoop on in-memory datasets, gracefully falling back to a 10x improvement for on-disk processing. Based on Scala, it can run SQL queries and be used directly from R. It provides machine learning and graph capabilities, among other features discussed further in the article.

  • Elasticsearch 1.0.0 released

    Elasticsearch released version 1.0.0 of its self-titled, open-source analytics tool. Elasticsearch is a distributed search engine which allows for real-time data analysis in big-data environments. The new version comes with various functional enhancements and changes to the API to make Elasticsearch more intuitive and powerful to use.

  • Running Spark on R with SparkR

    UC Berkeley’s AMPLab announced a developer preview of their new project, SparkR, for using Apache Spark natively from R.
