
Precog: Big Data Analytics as a Service

by Abel Avram on Oct 03, 2012

Precog has recently announced a Big Data warehousing and analysis service that handles data capture, storage, transformation, analysis and visualization, along with the infrastructure on which it runs. RESTful APIs leave access points open throughout the service, so developers and data scientists can control the entire process.

Precog captures input data from a variety of sources including SQL databases, Amazon S3, Hadoop, MongoDB, client-side web applications, and back-end servers. A RESTful API enables developers to capture data from external sources such as Twitter or Facebook, or from CSV files or mobile devices. The data is then stored in a custom database called PrecogDB, and can be enriched with various attributes – demographics, sentiment, location and others.

Data is then analyzed through an API, or by using client libraries (JavaScript, PHP), or with Labcoat, an IDE for data analysis using a declarative query language named Quirrel. Developers can create their own data capture, enrichment and analysis modules and even sell them on a marketplace.

Precog runs the entire process on a combination of cloud providers, Amazon EC2 and SoftLayer, to increase resilience and uptime.

In an interview for InfoQ, John A. De Goes, CEO and Founder of Precog, explained that the “architecture [of the system] has some similarities to the architecture of analytical databases, including column-oriented storage, but differs in supporting fully heterogeneous and denormalized data, and in supporting Quirrel, the "R for big data" language that lets you easily perform much more advanced calculations than you can with an analytical RDBMS.”

At the heart of the platform is PrecogDB, a columnar database written in Scala and running on the JVM, optimized for data capture and analysis. PrecogDB stores “measured data, such as clicks, purchases, measurements, tweets, and other kinds of activities, which collectively form a journal of historical activity,” according to De Goes, who added: “Precog cannot yet store huge blobs of unstructured data, as is required for applications in bioinformatics and other areas, but this capability is in the roadmap.”
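
To make the column-oriented idea concrete, here is a minimal Python sketch of how heterogeneous event records could be pivoted into per-field columns. It is only an illustration under our own assumptions, not PrecogDB's actual storage format; the `to_columns` and `column_sum` names and the sample events are hypothetical.

```python
from collections import defaultdict


def to_columns(records):
    """Pivot a list of heterogeneous dict records into per-field columns.

    Each column maps row index -> value, so a field missing from a
    record simply has no entry for that row (no NULL padding needed).
    """
    columns = defaultdict(dict)
    for row_id, record in enumerate(records):
        for field, value in record.items():
            columns[field][row_id] = value
    return dict(columns)


def column_sum(columns, field):
    """Sum one numeric column, touching only that column's values."""
    return sum(columns.get(field, {}).values())


# Hypothetical sample events with differing shapes.
events = [
    {"type": "click", "page": "/home"},
    {"type": "purchase", "amount": 30, "page": "/checkout"},
    {"type": "purchase", "amount": 12},
]

columns = to_columns(events)
total_amount = column_sum(columns, "amount")
```

An aggregate over `amount` reads only that column and skips the click event entirely, which is the usual argument for columnar layouts in analytical workloads.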

Regarding Quirrel, the statistical query language implemented by Precog, De Goes said: “In many respects, Quirrel is similar to the R programming language. Like R, Quirrel is designed for advanced analytics and statistics. Unlike R, Quirrel is not a Turing complete language, and it is purely declarative, which makes it possible to efficiently distribute Quirrel queries across massive clusters of machines (this also makes Quirrel much easier to learn than R).”
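
The link between declarative queries and distribution can be illustrated with a small Python sketch. A query that only states what to compute (here, a mean) leaves the engine free to compute partial results per partition and merge them in any order. This is a generic illustration, not Quirrel syntax or Precog's execution engine; `partial_mean`, `merge`, `distributed_mean` and the sample partitions are hypothetical.

```python
from functools import reduce


def partial_mean(values):
    """Per-partition step: reduce values to a (sum, count) pair."""
    return (sum(values), len(values))


def merge(a, b):
    """Combine two partial results; associative, so merge order is free."""
    return (a[0] + b[0], a[1] + b[1])


def distributed_mean(partitions):
    """Mean over all partitions, computed from merged partial results."""
    total, count = reduce(merge, map(partial_mean, partitions), (0, 0))
    return total / count if count else None


# Hypothetical data split across three machines.
partitions = [[10, 20], [30], [40, 50, 60]]
result = distributed_mean(partitions)
```

Because each partition is summarized independently, the per-partition step could run on separate machines and only the small `(sum, count)` pairs would need to travel over the network.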

PrecogDB has “built-in routines for performing common analytical and statistical computations,” and a “granular, capability-based security model, which allows PrecogDB to be accessed by REST API directly from mobile devices and web applications.”
