AI, ML & Data Engineering Content on InfoQ
-
Vert.x 3.3.0 Features Enhanced Networking, Microservices, Testing and More
Vert.x core developer Clement Escoffier of Red Hat explores key features of the just-released Vert.x 3.3.0 reactive toolkit.
-
Apache TinkerPop Graduates to Top-Level Project
TinkerPop, a graph compute framework for OLTP and OLAP graph database and analytics processing, has graduated to a top-level project at the Apache Software Foundation.
-
Test Well and Prosper: The Great Java Unit-Testing Frameworks Debate
A recent post on Reddit sparked a debate over the traditional testing framework JUnit versus the upstart Spock, with the central theme, “What’s wrong with JUnit?”
-
Neha Narkhede: Large-Scale Stream Processing with Apache Kafka
In her presentation "Large-Scale Stream Processing with Apache Kafka" at QCon New York 2016, Neha Narkhede introduces Kafka Streams, a new feature of Kafka for processing streaming data. According to Narkhede, stream processing has become popular because unbounded datasets can be found in many places. It is no longer a niche problem like, for example, machine learning.
-
LinkedIn Details Production Kafka Debugging and Best Practices
LinkedIn’s Joel Koshy details the company's Kafka usage and walks through debugging and monitoring two production incidents, using core Kafka infrastructure concepts, semantics and behavioral patterns to plan for and detect similar problems in the future.
-
Data Streaming Architecture with Apache Flink
Jamie Grier recently spoke at the OSCON 2016 conference about data streaming architecture using Apache Flink. He talked about the building blocks of data streaming applications and stateful stream processing, with code examples of Flink applications and monitoring.
-
LinkedIn Details Open-Sourced Kafka Monitor
LinkedIn recently detailed its open-sourced Kafka Monitor service, which it uses to monitor production Kafka clusters and to run extensive testing automation. This work has led the team to identify bugs in the main Kafka trunk and contribute solutions to the open-source community.
-
Spring Releases Version 1.1 Statemachine Framework
Spring releases version 1.1 of its state machine framework, dubbed Statemachine, featuring support for Spring Security, built-in support for Redis, and support for UI modeling.
-
Confluent Platform 3.0 Supports Kafka Streams for Real-Time Data Processing
The Confluent Platform 3.0 messaging system from Confluent, the company behind the Apache Kafka messaging framework, supports Kafka Streams for real-time data processing. Last week the company announced the general availability of the latest version of the open source Confluent Platform.
-
Combine SQL Server with Hadoop Using PolyBase
With the recently released SQL Server 2016, you can now run SQL queries against Hadoop and Azure Blob storage. Not only do you no longer need to write map/reduce operations, you can also join relational and non-relational data in a single query.
-
Cloudera Announces Partnership with the Broad Institute
Cloudera announced its partnership with the Broad Institute of MIT and Harvard and detailed some of its experience with the Genome Analysis Toolkit pipeline.
-
Apache Spark 2.0 Technical Preview
Two years after the first release of Apache Spark, Databricks announced the technical preview of Apache Spark 2.0, based on upstream branch 2.0.0-preview. The preview is not ready for production, in terms of either stability or API, but is a release intended to gather feedback from the community ahead of general availability.
-
AWS Launches Massive X1 Instances Targeting High Memory Workloads
AWS recently added a new instance type with nearly 2 terabytes of memory and 128 virtual CPUs. This is the largest virtual server available today in the public cloud, and is a target for memory-intensive workloads such as SAP HANA.
-
Google Details New TensorFlow Optimized ASIC
The machine learning and engineering communities weigh in on news of Google's new TensorFlow-optimized processor, the TPU, and on how it might influence industry leaders in the hardware space such as Intel and Nvidia.
-
Precision Medicine Modeling Demonstration with Spark on EMR, ADAM, and the 1000 Genomes Project
AWS engineers Christopher Crosbie and Ujjwal Ratan detail using Spark on EMR for precision medicine data analysis on the ADAM platform with data from the 1000 Genomes Project.