Confluent Platform 3.0, the messaging system from Confluent, the company behind the Apache Kafka messaging framework, supports Kafka Streams for real-time data processing. Last week the company announced the general availability of the latest version of the open source Confluent Platform.
While the last two versions of SQL Server focused on improving performance by offering new features, SQL Server 2016 looks inwards towards improving existing functionality.
Cloudera announced its partnership with the Broad Institute of MIT and Harvard and detailed some of its experience with the Genome Analysis Toolkit pipeline.
In conjunction with the release of SQL Server 2016, Microsoft has announced that the Developer Edition of SQL Server will be free.
Two years after the first release of Apache Spark, Databricks announced the technical preview of Apache Spark 2.0, based on upstream branch 2.0.0-preview. The preview is not ready for production in terms of either stability or API; it is intended to gather feedback from the community ahead of general availability.
Realm, the open-source, object-oriented database, has launched version 1.0 for iOS and Android. Realm's technical team told InfoQ that notable changes in the mobile database's latest release include an improved query language with support for partial string matches, relationship traversal, multi-field sorting, and distinct matches.
The NOLOCK directive was broken in Cumulative Update #6 for SQL Server 2014 SP1. As a result, databases that relied on that directive may experience unexpected blocking and/or deadlocks.
Amazon recently announced an update to its Amazon Kinesis service. The update adds three new features to Amazon Kinesis Streams and Amazon Kinesis Firehose: Elasticsearch Service integration, shard-level metrics, and time-based iterators.
AWS engineers Christopher Crosbie and Ujjwal Ratan detail using Spark on EMR for precision medicine data analysis on the ADAM platform with data from the 1000 Genomes Project.
Genomic data sequencing and subsequent analysis face large data volume challenges that several organizations are solving with cloud services. Last month the Broad Institute described its experience with petabyte-scale sequencing pipelines on the Google Research Blog; InfoQ covers the details here.
Adopting NoSQL databases in a large organization takes significant time and effort as teams transition from relational database models. Mike Bowers, Enterprise Data Architect at the LDS Church, spoke at the recent Enterprise Data World Conference about lessons learned from eight years of using NoSQL databases.
After months of waiting for details about the NHS and Google DeepMind partnership, InfoQ gains insights into recent claims of widespread patient data access.
SQL Server 2005 has now officially reached its end of life. This means that it will no longer receive security updates, and newly discovered vulnerabilities will go unfixed. Yet a recent survey commissioned by Microsoft showed that 46% of companies using SQL Server had at least one production machine running SQL Server 2005.
Hadoop and other big data technologies revolutionized the way organizations run data analytics, but organizations still face challenges with the operating costs of using these technologies for on-premises data processing. Ashish Thusoo recently spoke at the Enterprise Data World Conference about a Hadoop-as-a-service offering that helps organizations bridge these gaps.
Airflow recently joined the Apache Incubator program. Airflow is a workflow and scheduling system designed to manage data pipelines. Developed by Airbnb for internal use, it was open sourced last September, as previously reported by InfoQ.