In this solutions track talk, sponsored by Cloudera, Eva Andreasson discusses how search and Hadoop can help with some of the industry's biggest challenges. She introduces the data hub concept.
Christopher Simons suggests using search-based software engineering (SBSE) to iterate through multiple possible solutions and select the one that performs best, offering insight into some of the available tools and techniques.
Graham Tackley dives into the details of ophan: the obstacles faced by the newsroom that prompted them to build the system, how it works for alerting, and how the tool has made the lives of the Guardian's readers and staffers better. Shay Banon covers the technical underpinnings of ophan with a deep dive into the Elasticsearch features and functionality that power the system.
Mark Harwood shows how anomaly detection algorithms can spot card fraud, incorrectly tagged movies and the UK's most unexpected hotspot for weapon possession.
Baruch Sadogursky surveys and compares the search and testing tools available to Grails developers.
Michael Brunton-Spall shares his experience re-architecting The Guardian's Content API from a system based on Solr to a message queue cloud service based on Elasticsearch, without any downtime.
Kumar Palaniapan and Scott Fleming present how NetApp deals with big data using Hadoop, HBase, Flume, and Solr, collecting and analyzing TBs of log data with Think Big Analytics.
Phil Wills discusses why The Guardian has introduced the Content Web API, how it has influenced the architecture of the site and how they develop software and collaborate with partners.
Shay Banon demos ElasticSearch, an open source, distributed, RESTful search engine, detailing some of its features: distributed operation, cloud readiness, facets, and the percolator.
John Wang discusses the architecture and implementation details of LinkedIn's real-time distributed search engine, covering People Search, Signal, Stream Indexing, Zoie, and Bobo.
This presentation discusses Hypertable, an open source, high performance, distributed database modeled after Google's Bigtable. Doug discusses the differences between Hypertable and traditional database technology, its support for massive sparse tables, scaling to petabyte size, and how Hypertable is designed to run on top of an existing distributed file system, such as the Hadoop DFS.