Facilitating the Spread of Knowledge and Innovation in Professional Software Development

Data Warehousing Content on InfoQ

  • Data Workflow Management Using Airbnb's Airflow

    Airbnb recently open-sourced Airflow, its own data workflow management framework. Airflow is used internally at Airbnb to build, monitor and adjust data pipelines. Airflow’s creator, Maxime Beauchemin, and Siddharth Anand, Agari’s Data Architect and one of the framework’s early adopters, discuss Airflow, where it can be of use, and future plans.

  • Software Defined Data Mart In The Enterprise Using Metanautix Quest

    Metanautix recently announced the newest version of its product, Quest. Quest allows enterprises to build software-defined data marts that can run on virtualized servers. Designed from the ground up with security and auditability in mind, Quest can handle Big Data workloads and encapsulate them in different logical views, ready for consumption by different departments in the enterprise.

  • Thoughtworks Technology Radar March 2012

    ThoughtWorks recently published the latest update to its Technology Radar, a report produced to help technology decision makers understand emerging trends in software development techniques, tools, languages and platforms. It includes some observations of interest to Agile software development teams.

  • What’s New in SQL Server 2012 RC0

    Microsoft has released SQL Server 2012 Release Candidate 0. There are many new features, including: AlwaysOn, better performance management, more reporting and visualization tools, Columnstore index, and FileTables. The product will come in 3 main editions: Standard, Business Intelligence and Enterprise.

  • Olap4j 1.0: a Java API for OLAP Servers

    Business Intelligence vendor Pentaho has announced the release of olap4j 1.0, a new, common Java API for any online analytical processing (OLAP) server.

  • Column-based Storage in SQL Server 2011

    Imagine ad hoc data mining queries against a single table with 1 TB of data and 1.44 billion rows coming back in roughly a second. This is the scenario Microsoft intends to support using 32-core machines and its new column-based storage engine.

  • Better Developer Experience in Version 1.5 of the Data Access Framework MetaModel

    Eobjects.org's open-source Java framework MetaModel implements a unified API for the access, exploration, and query of different datastores. Eobjects.org, both a website and an open source software organization dedicated to "the development of Open Source software related to Business Intelligence and Data Warehousing", has recently published version 1.5 of MetaModel.

  • A Case for Graph Databases

    We talk with Daniel Kirstenpfad, founder and CTO of sones GmbH, about graph databases and how they can better model some types of data, such as relations in a social networking application. Graph databases can offer performance benefits over other types of databases because they explicitly represent a graph and are organized to have index-free adjacency.

  • LinkedIn's Data Infrastructure

    Jay Kreps of LinkedIn presented some informative details of how they process data at the recent Hadoop Summit. Kreps described how LinkedIn crunches 120 billion relationships per day and blends large scale data computation with high volume, low latency site serving.

  • Facebook on Hadoop, Hive, HBase, and A/B Testing

    The Hadoop Summit of 2010 included presentations from a number of large-scale users of Hadoop and related technologies. Notably, Facebook presented a keynote and detailed information about its use of Hive for analytics. In the keynote, Mike Schroepfer, Facebook's VP of Engineering, described the scale of the company's data processing with Hadoop.

  • Emergent Data Architectures Highlights From GigaOm Structure Conference

    The GigaOM Structure conference a couple of weeks ago addressed many areas of cloud computing. One of the key themes of the event was the emergence of new data architectures. Throughout the panels, interviews, and presentations, many speakers identified significant upcoming changes in how data gets handled.

  • Mahout 0.3: Open Source Machine Learning

    The need for machine-learning techniques like clustering, collaborative filtering, and categorization has steadily increased over the last decade, along with the number of solutions needing quick and efficient algorithms to transform vast amounts of raw data into relevant information. Apache Mahout 0.3 was announced in March, adding more functionality, stability and performance.

  • Information Can Be Sold and Bought in “Dallas”

    Microsoft’s service codenamed “Dallas” is an information marketplace that brings together data, imagery and service providers and their consumers, facilitating information exchange through a single point of access.

  • MapPoint Add-In For SQL Server Released

    Microsoft released a free MapPoint 2009 Add-In for SQL Server 2008 spatial data. The add-in can be used with MapPoint to build map graphics against queries on SQL Server 2008 spatial geography columns.

  • Is the NoSQL Meeting Announcing the End of the RDBMS Era?

    The NoSQL meeting tried to raise awareness of the opportunity to use non-relational databases, which promise to be cheaper, simpler to administer and maintain, and more scalable. Michael Stonebraker, co-creator of Ingres and Postgres, thinks that the end of the RDBMS era is close, while others think that we are not there yet.
