InfoQ

MongoGraph Brings Semantic Web Features to MongoDB Developers

Posted by Srini Penchikala on Dec 03, 2011


MongoGraph, from the AllegroGraph team, brings semantic web features to MongoDB developers. The team implemented a MongoDB interface to the AllegroGraph database to give JavaScript programmers both joins and semantic web capabilities. With this approach, JSON objects are automatically translated into triples, and both the MongoDB query language and SPARQL work against these objects. Another goal of MongoGraph is to make the free-text engine of the graph database as easy to search as Solr/Lucene.

AllegroGraph CEO Jans Aasman gave a presentation about working at the level of objects instead of individual triples. InfoQ spoke with Jans about this new approach and how it helps NoSQL developers.

InfoQ: What are the advantages of representing JSON objects as RDF triples in a graph database?

Jans: Well, the most direct answer is that you can use JSON to model complex schemas and then perform complicated joins over your data without writing map-reduce queries. One can approach the JSON objects stored in MongoGraph either as JSON objects (using the MongoDB query language) or as more fine-grained RDF triples that allow for complex models and complex joins (using the SPARQL query language). You can also use all the other advanced features of an RDF database (also known as a triple store). For example, one can query with SPARQL or apply rules using mechanisms like SWRL, RIF, or Prolog.

You can also now link the data structures in your application that you represent as JSON seamlessly with RDF triples in the Linked Open Data Cloud.
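To make the object-to-triple idea concrete, here is a minimal JavaScript sketch of how a nested JSON object could be flattened into subject-predicate-object triples. The function name `toTriples`, the `fr:` predicate prefix, the sample data, and the generated node identifiers are all assumptions made for illustration; the interview does not describe the mapping MongoGraph actually uses.

```javascript
// Hypothetical sketch: flatten a JSON object into [subject, predicate, object] triples.
// This is NOT MongoGraph's real mapping, only an illustration of the idea.
function toTriples(subject, obj, prefix = "fr:") {
  const triples = [];
  let counter = 0;

  function walk(subj, value) {
    for (const [key, val] of Object.entries(value)) {
      // Treat a single value and an array of values the same way.
      const items = Array.isArray(val) ? val : [val];
      for (const item of items) {
        if (item !== null && typeof item === "object") {
          // Nested object: give it its own node id and link to it.
          counter += 1;
          const child = `${subject}/node${counter}`;
          triples.push([subj, prefix + key, child]);
          walk(child, item);
        } else {
          // Scalar value: store it directly as the triple's object.
          triples.push([subj, prefix + key, item]);
        }
      }
    }
  }

  walk(subject, obj);
  return triples;
}

// Made-up sample document.
const author = {
  firstName: "Jans",
  lastName: "Aasman",
  authorOf: [{ title: "Example Book", hasPublisher: "Example Press" }],
};

console.log(toTriples("author1", author));
```

Each top-level field becomes one triple, and each nested object becomes a new node linked from its parent, which is what lets a graph query follow the relationship later.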

InfoQ: How do you access the data stored in a MongoGraph type of database?

Jans: A MongoGraph query like the example below will return all author records matching 'Jans' 'Aasman' as JSON objects.

db.authors.find({firstName: 'Jans', lastName: 'Aasman'})

But, assuming that we have a collection of authors, books, publishers and stores, one could also write a join heavy SPARQL query like:

# The fr: prefix is assumed to be declared for the application's namespace.
select * where {
  ?x fr:firstName 'Jans' ; fr:lastName 'Aasman' ; fr:authorOf ?book .
  ?book fr:hasPublisher ?publisher .
  ?store fr:outletFor ?publisher ; fr:located 'San Francisco' .
}
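The join that this SPARQL query expresses can be sketched in plain JavaScript over an in-memory array of triples. The sample data, the `match` and `join` helpers, and the `fr:` predicate names are hypothetical and chosen only to mirror the query; a real triple store would evaluate such patterns with indexes instead of array scans.

```javascript
// Hypothetical sample data as [subject, predicate, object] triples.
const triples = [
  ["author1", "fr:firstName", "Jans"],
  ["author1", "fr:lastName", "Aasman"],
  ["author1", "fr:authorOf", "book1"],
  ["book1", "fr:hasPublisher", "pub1"],
  ["store1", "fr:outletFor", "pub1"],
  ["store1", "fr:located", "San Francisco"],
  ["store2", "fr:outletFor", "pub1"],
  ["store2", "fr:located", "Oakland"],
];

// Terms starting with "?" are variables; anything else must match exactly.
const isVar = (term) => typeof term === "string" && term.startsWith("?");

// Try to extend one set of bindings so that `pattern` matches `triple`.
// Returns the extended bindings, or null if the triple does not fit.
function match(pattern, triple, bindings) {
  const result = { ...bindings };
  for (let i = 0; i < 3; i++) {
    const term = pattern[i];
    if (isVar(term)) {
      if (term in result && result[term] !== triple[i]) return null;
      result[term] = triple[i];
    } else if (term !== triple[i]) {
      return null;
    }
  }
  return result;
}

// Evaluate the patterns left to right, joining on shared variables.
function join(patterns, data) {
  let solutions = [{}];
  for (const pattern of patterns) {
    const next = [];
    for (const bindings of solutions) {
      for (const triple of data) {
        const extended = match(pattern, triple, bindings);
        if (extended) next.push(extended);
      }
    }
    solutions = next;
  }
  return solutions;
}

const solutions = join(
  [
    ["?x", "fr:firstName", "Jans"],
    ["?x", "fr:lastName", "Aasman"],
    ["?x", "fr:authorOf", "?book"],
    ["?book", "fr:hasPublisher", "?publisher"],
    ["?store", "fr:outletFor", "?publisher"],
    ["?store", "fr:located", "San Francisco"],
  ],
  triples
);

console.log(solutions);
```

With this sample data, only `store1` survives the final pattern, because `store2` sells the same publisher's books but is not located in San Francisco. The shared variables `?book`, `?publisher`, and `?store` are what link authors, books, publishers, and stores without any map-reduce step.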

InfoQ: What are the limitations of using a solution like this?

Jans: Currently we implement 90% of the MongoDB API. However, we do not emulate the clustering mechanisms of MongoDB. For this capability we rely on the clustering mechanisms built into AllegroGraph.

InfoQ: What are the emerging trends in combining the NoSQL data stores?

Jans: From the perspective of a Semantic Web graph database vendor, we see that nearly all graph databases now perform their text indexing with Lucene-based engines (Solr or Elasticsearch), and I wouldn't be surprised if most vendors soon allow JSON objects as first-class objects in graph databases. It was surprisingly straightforward to mix the JSON and triple/graph paradigms. We are also experimenting with key-value stores to see how they mix with the triple/graph paradigm.

InfoQ: What best practices and architecture patterns should the developers and architects consider when using a solution like this one in their software applications?

Jans: If your application requires simple, straight joins and your schema hardly changes, then any RDBMS will do.

If your application is mostly document based, where a document can be looked at as a pre-joined nested tree (think a Facebook page, think a nested JSON object), and you don't want to be limited by a relational schema, then key-value stores and document stores like MongoDB are a good alternative.

If you want what is described in the previous paragraph but also have to perform complex joins or apply graph algorithms, then the MongoGraph approach might be a viable solution.

Srini Penchikala currently works as a Security Architect and has 17 years of experience in software product management.

  • This article is part of a featured topic series on NoSQL
