
Trifacta Seeks to Simplify Data Wrangling-as-a-Service

by Alex Giamas on Dec 30, 2013

Trifacta, a data analysis services platform, recently received venture capital investment to advance its efforts to make data wrangling easier for data analysts. The goal is to collect, cleanse and munge data in a fraction of the time and effort it currently takes.

Data wrangling has traditionally been the most time-consuming and painful part of every Big Data project. Today data arrives continuously, is heterogeneous, and changes its attributes constantly as data sources evolve. NoSQL databases have long tried to address this on the storage side by being column-based or document-based, but the problem of collecting the data and applying semantics to it remains.

Trifacta is approaching the problem from a user-centric perspective rather than a developer-centric one. Business analysts and data scientists will be able to cleanse datasets in a visually oriented way. Based on research at Berkeley and Stanford, the platform aims to have people and machines collaborate in extracting insights from datasets.

Automated smart sampling from big data sets, together with visualization, allows the analyst to discover interesting patterns in a fraction of the time. Trifacta can then apply machine learning algorithms to suggest ways to reorganize the information and get it into shape. The analyst can group the dataset into logical parts, normalizing it one step at a time and viewing the outcome in a user-friendly way along the way. Generalizing the transformations to the whole dataset is the last step, which brings the semi-structured dataset into shape. The platform is designed from the ground up with user experience in mind, allowing data analysts to sift in depth through data without having to develop complex pipelines to cleanse it and load it into the data warehouse.
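
The workflow described above (work on a sample, normalize one step at a time, then generalize to the whole dataset) can be illustrated with a minimal Python sketch. This is not Trifacta's API or its algorithm: the records, field names, and helper functions below are all hypothetical, and plain random sampling stands in for whatever "smart sampling" the platform actually performs.

```python
import random

# Hypothetical raw records; field names and values are invented for illustration.
RAW_ROWS = [
    {"name": "  Alice ", "state": "ca", "amount": "1,200"},
    {"name": "BOB", "state": "CA ", "amount": "300"},
    {"name": "carol", "state": "ny", "amount": ""},
    {"name": " Dave", "state": "NY", "amount": "45"},
]


def sample_rows(rows, k, seed=0):
    """Draw a reproducible random sample of at most k rows."""
    rng = random.Random(seed)
    return rng.sample(rows, min(k, len(rows)))


def trim_name(row):
    """Strip whitespace and normalize the capitalization of the name."""
    return {**row, "name": row["name"].strip().title()}


def normalize_state(row):
    """Trim and upper-case the state code."""
    return {**row, "state": row["state"].strip().upper()}


def parse_amount(row):
    """Turn '1,200' into 1200; an empty string becomes None."""
    text = row["amount"].replace(",", "").strip()
    return {**row, "amount": int(text) if text else None}


def apply_steps(rows, steps):
    """Apply each cleaning step, in order, to every row (inputs are not mutated)."""
    cleaned = []
    for row in rows:
        for step in steps:
            row = step(row)
        cleaned.append(row)
    return cleaned


# Build the recipe one step at a time while inspecting a small sample...
steps = [trim_name, normalize_state, parse_amount]
preview = apply_steps(sample_rows(RAW_ROWS, k=2), steps)

# ...then generalize the same recipe to the whole dataset.
cleaned_rows = apply_steps(RAW_ROWS, steps)
```

Keeping each normalization as a small function that returns a new row means the same ordered list of steps can be previewed on a sample and later replayed unchanged on the full dataset.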

Trifacta’s predecessor research project, DataWrangler, and its research paper are available online and give a preview of where Trifacta is heading. The product itself is still in closed beta, with demos scheduled by invitation only.

InfoQ.com and all content copyright © 2006-2014 C4Media Inc.