Yahoo Open Sources TensorFlowOnSpark

by Dylan Raithel on Mar 20, 2017

Yahoo has open sourced TensorFlowOnSpark, which runs TensorFlow natively on Spark and integrates with it for distributed training and serving on Spark or Hadoop clusters.


Google Cloud Machine Learning and Tensor Flow Alpha Release

by Dylan Raithel on Apr 18, 2016

Late last month Google released an alpha version of its TensorFlow (TF) integrated cloud machine learning service, responding to a growing need to run the TensorFlow library at scale on the Google Cloud Platform (GCP). Google describes several new feature sets aimed at scaling TF usage by integrating it with other pieces of GCP such as Dataproc, a managed Hadoop and Spark service.


IBM to Open Source 50 Projects

by Abel Avram on Jul 23, 2015

IBM has announced a new web portal called developerWorks Open, bringing together various projects it is open sourcing. The projects cover many domains including Analytics, Cloud, IoT, Mobile, Security, Social, Watson and others. So far, IBM has open sourced about 30 projects and plans to raise that number to 50 by the end of the year, with more possibly to follow.


MemSQL 4 Database Supports Community Edition, Geospatial Intelligence and Spark Integration

by Srini Penchikala on May 30, 2015

The latest version of MemSQL, an in-memory database with support for transactions and analytics, includes a new Community Edition that organizations can use for free. MemSQL 4, released last week, also supports integration with Apache Spark, Hadoop Distributed File System (HDFS), and Amazon S3.


LinkedIn Open Sources Cubert With an Eye To Big Data Analytics

by Alex Giamas on Dec 17, 2014

LinkedIn recently open sourced Cubert, its High Performance Computation Engine for Complex Big Data Analytics. Cubert is a framework written with analysts and data scientists in mind. Developed completely in Java and expressed as a scripting language, Cubert is designed for the complex joins and aggregations that frequently arise in the reporting world.


Mahout to Get Self-Optimizing Matrix Algebra Interface with Pluggable Backends for Spark and Flink

by Mikio Braun on Nov 21, 2014

At the recent GOTO conference in Berlin, Mahout committer Sebastian Schelter outlined recent advances in Mahout's ongoing effort to create a scalable foundation for data analysis that is as easy to use as R or Python.


Apache Drill Included in MapR Latest Distribution Release

by Alex Giamas on Sep 30, 2014

MapR recently announced that it is including Apache Drill in the latest release of the MapR distribution. Apache Drill is the open source version of Google’s Dremel, the infrastructure on which BigQuery is based. Drill offers a low-latency SQL-on-Hadoop interface. While this puts it in the same space as several other technologies around Hadoop, Drill has some unique characteristics setting it


DataBricks Announces Spark SQL for Manipulating Structured Data Using Spark

by Matt Kapilevich on Apr 19, 2014

DataBricks, the company behind Apache Spark, has announced a new addition to the Spark ecosystem called Spark SQL. Spark SQL is separate from Shark, and does not use Hive under the hood. InfoQ reached out to Reynold Xin and Michael Armbrust, software engineers at DataBricks, to learn more about Spark SQL.


A Roundup of Cloudera Distribution Containing Apache Hadoop 5

by Alex Giamas on Apr 18, 2014

Cloudera recently released the latest version of its software distribution, CDH5. It arrives almost 20 months after the last major version, CDH4, which seems like ages in the Big Data world. We take a look at the new features this release brings and the future direction of Cloudera after the latest round of investment from Intel and Google Ventures.


Spark Gets a Dedicated Big Data Platform

by Charles Menguy on Apr 03, 2014

Spark users can now use a new Big Data platform provided by intelligence company Atigeo, which bundles most of the UC Berkeley stack into a unified framework optimized for low-latency data processing that can provide significant improvements over more traditional Hadoop-based platforms.


Spark Officially Graduates From Apache Incubator

by Alex Giamas on Feb 28, 2014

Recently, Spark graduated from the Apache incubator. Spark claims up to 100x speed improvements over Apache Hadoop on in-memory datasets, gracefully falling back to a 10x speed improvement for on-disk performance. Based on Scala, it can run SQL queries and be used directly in R. It provides Machine Learning and Graph database capabilities, along with other features discussed further in the article.


Spark, Storm and Real Time Analytics

by Alex Giamas on Jan 31, 2014

Hadoop is definitely the platform of choice for Big Data analysis and computation. As data Volume, Variety and Velocity increase, however, Hadoop as a batch processing framework cannot cope with the requirement for real time analytics. Spark, Storm and the Lambda Architecture can help bridge the gap between batch and event based processing.
