Big Data Content on InfoQ

  • Gobblin, LinkedIn's Unified Data Ingestion Platform

    At the 2014 QCon San Francisco conference, LinkedIn's Lin Qiao gave a talk on Gobblin (also summarized in a blog post), the company's unified data ingestion system for internal and external data sources.

  • MapR-DB NoSQL Database Integrated into MapR Community Edition for Unlimited Production Use

    MapR Technologies, a provider of an Apache Hadoop distribution, has open sourced its MapR-DB NoSQL database for unlimited production use. MapR-DB is a wide-column NoSQL database with native integration with Hadoop and support for strong consistency and ACID transactions.

  • GridGain Becomes Apache Ignite

    GridGain's In-Memory Data Fabric entered Apache Incubator last October under the name of Apache Ignite. The company donated its flagship in-memory computing platform to the Apache Software Foundation with the intention of attracting external developers and growing a viable community around its core technology.

  • IBM, Databricks, GraphLab Present Notebooks as Unified Interfaces for Building Prediction Apps

    At the StrataHadoop conference in Barcelona last week, Rod Smith, Vice President of the IBM Emerging Internet Technologies organization, presented an internal product that integrates data sources and data analysis, which his team has been developing through its consulting work with clients.

  • Mahout to Get Self-Optimizing Matrix Algebra Interface with Pluggable Backends for Spark and Flink

    At the recent GOTO conference in Berlin, Mahout committer Sebastian Schelter outlined recent advances in Mahout's ongoing effort to create a scalable foundation for data analysis that is as easy to use as R or Python.

  • Web Summit 2014 Day Two Review

    The second day of the Web Summit in Dublin, Ireland, concluded yesterday. We look at what happened at the event and what was new since day one.

  • Web Summit 2014 Day One Review

    Web Summit, one of the largest technology conferences in Europe, opened today. Well-known figures from the technology and business world, such as Peter Thiel, Drew Houston and Anna Patterson, are expected to speak.

  • Forrester Wave: Evaluating NoSQL Key-Value Databases

    In their first Forrester Wave: NoSQL Key-Value Databases, released in Q3 2014, Forrester has evaluated the most popular NoSQL database offerings.

  • Reactive Extensions, Async, and Splunk

    The 2.0 version of the Splunk C# SDK is heavily invested in modern C# features. Every major operation from login onward is available via asynchronous methods. And for most advanced uses such as sampling, Reactive Extensions come into play.

  • Splunk Conference Recap: The Key to Big Data is Machine Learning

    Splunk’s user conference has drawn to a close. After three days with over 160 sessions ranging from security and operations to business intelligence to even the Internet of Things, the same central theme kept appearing over and over again: the key to Big Data is machine learning.

  • Using Logs to Detect User-Based Threats

    A common theme at the Splunk user conference is the idea that the users are the greatest threat. Even in a well-regulated enterprise where no one has more privileges than what’s needed to do their job, a typical user has more than enough ability to steal massive amounts of data or cause widespread problems. Fortscale seeks to address this issue by using the data that you are already collecting.

  • Proactively Monitor Configuration Changes with Tripwire

    Most companies still manually track configuration changes using a wiki or spreadsheet. Only the most basic information, such as IP addresses, is included, as recording everything is just too tedious. Even finding out basic information such as who made the change is difficult and time-consuming. Tripwire seeks to eliminate this problem by proactively monitoring configuration changes.

  • Big Data Analytics: Using Hunk with Hadoop and Elastic MapReduce

    Hunk is a relatively new product from Splunk for exploring and visualizing Hadoop and other NoSQL data stores. New in this release is support for Amazon’s Elastic MapReduce.

  • Splunk .conf2014 Keynote 1

    At the opening keynote for Splunk .conf2014 we heard about GE Capital’s developer culture, Red Hat’s internal IT focus, and Coca-Cola’s “Data Lake” theory of information management.

  • Apache Drill Included in MapR Latest Distribution Release

    MapR recently announced that Apache Drill is included in the latest release of its distribution. Apache Drill is the open source version of Google’s Dremel, the infrastructure on which BigQuery is based. Drill offers a low-latency SQL-on-Hadoop interface. While this puts it in the same space as several other technologies around Hadoop, Drill has some unique characteristics setting it
