Riley Newman, head of data science at Airbnb, recently published an article describing how the Californian startup defines and uses data science. He explains that data can be seen as the voice of the customers, and data science as an act of interpretation. He also details several initiatives that have been particularly important for scaling data science.
A year behind schedule, JetBrains has made its DBA tool DataGrip 1.0 generally available. Formerly known as 0xDBE, DataGrip is a tool for SQL database administrators and developers.
Yahoo! has benchmarked three of the main stream processing frameworks: Apache Flink, Spark and Storm.
FsLab, a collection of F# open source libraries for doing data science, was released earlier this year. InfoQ reached out to Tomas Petricek, creator of the project, to get more details.
IBM has inaugurated the IoT Global Headquarters and will use the Watson technology to analyze and interpret IoT data.
At #Pragma Conference 2015, Marcus Zarra, author of the Pragmatic Bookshelf book Core Data, described three approaches to using Core Data in a multithreaded environment and tried to clear up how Core Data should be used in 2015.
MongoDB recently announced the newest version of its namesake NoSQL database. Building on the features introduced in the 3.0 release, version 3.2 expands and solidifies MongoDB’s focus on the corporate world.
Early last month in Las Vegas, at IBM Insight 2015, IBM announced a major commitment to the Apache Spark project. Its description of Spark as “potentially the most significant open source project of the next decade” says a lot about how important IBM believes Apache Spark is. With IDC reporting that 80% of cloud applications in the future will be data intensive, Apache Spark can unlock previously...
JetBrains has released IntelliJ IDEA 15, with improved Java 8 lambda debugger support, a better user interface for running tests, enhanced JVM frameworks support (Spring 4.2, Hibernate 5.0, Grails 3.x, and Arquillian), TypeScript 1.6 and TSLint integration, and initial support for Angular 2.
At about the same time that Google announced it was open sourcing TensorFlow, Microsoft published DMTK, a Distributed Machine Learning Toolkit, on GitHub. While Google has released a single-machine version of TensorFlow, DMTK runs on a cluster of machines.
TensorFlow is a machine learning library created by the Brain Team researchers at Google and now open sourced under the Apache License 2.0. TensorFlow is detailed in the whitepaper TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems. The source code can be found on Google Git.
At its 2015 Partners User Group Conference, Teradata announced two new software capabilities for real-time ingestion and analysis of massive streams of IoT data. While the Teradata Listener software enables "listening" to multiple, diverse IoT data streams in real time, the new Teradata Aster Analytics on Hadoop software provides scalable analysis of massive IoT data streams.
The team behind the Neo4j graph NoSQL database has launched an open source graph query language called openCypher. Neo Technology, the company behind the graph database, announced last week at the GraphConnect conference the launch of the open source project, which will be available to technology providers as a common language for querying graph data.
The latest version of the graph NoSQL database Neo4j supports an in-memory page cache, Docker tools, an enhanced query planner and IBM POWER8 integration. The Neo4j team announced last week the release of version 2.3, which also supports query development with graph and text string search.
Basho Technologies has released Riak TS, a distributed NoSQL database for storing and analyzing time series data. The Basho team recently announced at the AWS re:Invent event the availability of Riak TS, which is optimized for reads and writes of time series data.