Nuxeo Introduces fise Semantic Engine

Nuxeo's employee blog recently introduced fise (Furtwangen IKS Semantic Engine), an open source RESTful semantic engine to which Nuxeo has contributed. The goal of fise is to "help bring new and trendy semantic features to CMS by giving developers a stack of reusable HTTP semantic services to build upon." fise is part of a larger effort, IKS (Interactive Knowledge Stack), which aims to enhance CMS offerings with Semantic Web capabilities.

A 'semantic engine' takes unstructured input (e.g. text files) and produces what amount to searchable indices and concordances as a means of extracting the "meaning" of that input. For example, semantic engines can typically categorize documents (e.g. by language or topic), suggest tags, or extract known entities (e.g. names, places, dates). Using this kind of classification information, the engines can also sort and link related documents and extract assertions (e.g. "company x bought company y on this date for this amount of money").

A content management system is primarily concerned with the creation, persistence, and organization of texts (multimedia texts in many cases), so integrating a semantic engine provides obvious advantages for search and for the organization of content. A content management system might be designed and used primarily to keep track of documents generated and used within an enterprise, or it might be used to organize and manage all of the 'documents' (web pages) that comprise a sophisticated site. One aspect of the effort to create a "Semantic Web" is for every Web page to incorporate the kind of classification, indexing, and concordance data generated by a semantic engine.
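
To make the idea of entity extraction concrete, here is a deliberately tiny sketch in Python. It is not how fise or any production engine works: the gazetteer, the entity names and the `extract_entities` function are all hypothetical, and real engines rely on statistical models and much larger knowledge bases. It only illustrates the shape of the task: unstructured text in, typed annotations out.

```python
import re

# Hypothetical lookup table (gazetteer) mapping known names to entity types.
GAZETTEER = {
    "Jane Doe": "Person",
    "Acme Corp": "Organization",
    "Springfield": "Place",
}

# Simple pattern for ISO-style dates such as 2010-01-15.
DATE_PATTERN = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")


def extract_entities(text):
    """Return a sorted list of (mention, entity_type) pairs found in text."""
    found = set()
    # Known names: whole-word match against the gazetteer.
    for name, kind in GAZETTEER.items():
        if re.search(r"\b" + re.escape(name) + r"\b", text):
            found.add((name, kind))
    # Dates: anything matching the date pattern.
    for match in DATE_PATTERN.finditer(text):
        found.add((match.group(), "Date"))
    return sorted(found)


if __name__ == "__main__":
    sample = "Jane Doe joined Acme Corp in Springfield on 2010-01-15."
    for mention, kind in extract_entities(sample):
        print(kind, mention)
```

The annotations produced this way are the raw material for the sorting, linking and tagging features described above.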

Open Calais, Zemanta and Evri are examples of semantic engines, available via Web APIs, that can be used to semantically annotate web pages and sites. An ancestor of this kind of semantic engine was IZE, developed and marketed back in 1988 by a small Madison, Wisconsin company called Persoft.

Olivier Grisel, author of the Nuxeo blog post, summarizes the rationale for semantic annotation as follows:

Linking content items to semantic entities and topics that are defined in open universal databases (such as DBpedia, freebase or the NY Times database) allows for many content driven applications like online websites or private intranets to share a common conceptual frame and improve findability and interoperability.

Publishers can leverage such technologies to build automatically updated entity hubs that aggregate resources of different types (documents, calendar events, persons, organizations, ...) that are related to a given semantic entity identified by an disambiguated universal identifiers that span all applications.
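
The "entity hub" idea can be sketched in a few lines of Python: once content items are linked to shared entity identifiers, grouping by identifier yields one hub per entity. This is only an illustration under stated assumptions; the `build_entity_hubs` function and the sample resources are hypothetical, and the DBpedia-style URIs stand in for whatever identifiers an engine would actually return.

```python
from collections import defaultdict


def build_entity_hubs(annotations):
    """Group resources by the entity URI they were linked to.

    annotations is an iterable of (resource, entity_uri) pairs, where a
    resource is a (resource_type, name) tuple.
    """
    hubs = defaultdict(list)
    for resource, entity_uri in annotations:
        hubs[entity_uri].append(resource)
    return dict(hubs)


if __name__ == "__main__":
    # Hypothetical annotations coming from different applications.
    annotations = [
        (("document", "travel-guide.pdf"), "http://dbpedia.org/resource/Paris"),
        (("calendar event", "Team offsite"), "http://dbpedia.org/resource/Paris"),
        (("document", "city-profile.html"), "http://dbpedia.org/resource/Madison,_Wisconsin"),
    ]
    for uri, resources in build_entity_hubs(annotations).items():
        print(uri, resources)
```

Because the key is a universal identifier rather than a local tag, resources annotated by different applications land in the same hub.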

fise exposes its services over HTTP. The blog describes them as follows:

fise offers three HTTP endpoints: the engines, the store and the sparql endpoint:
  • the /engines endpoint allows the user to analyse English text content and sends back the results of the analysis without storing anything on the server: this is a stateless HTTP service.
  • the /store endpoint does the same analysis but furthermore stores the results on the fise server: this is a stateful HTTP service. Analysis results are then available for later browsing.
  • the /sparql endpoint provides machine-level access for performing complex graph queries on the enhancements extracted from content items sent to the /store endpoint.
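
A client for these three endpoints might look roughly like the Python sketch below. Only the endpoint paths come from the description above; the base URL, the choice of POSTing plain text, the `Accept` media types and the helper function names are assumptions made for illustration, so check the fise documentation for the actual request format. The functions only build request objects and do not contact a server.

```python
import urllib.parse
import urllib.request

# Assumption: a fise instance running locally; adjust to your deployment.
BASE_URL = "http://localhost:8080"


def build_engines_request(text, base_url=BASE_URL):
    """Stateless analysis: send text to /engines, nothing is stored."""
    return urllib.request.Request(
        url=base_url + "/engines",
        data=text.encode("utf-8"),
        headers={
            "Content-Type": "text/plain; charset=utf-8",
            "Accept": "application/rdf+xml",  # assumed response format
        },
        method="POST",
    )


def build_store_request(text, base_url=BASE_URL):
    """Stateful analysis: same input, but results are kept on the server."""
    return urllib.request.Request(
        url=base_url + "/store",
        data=text.encode("utf-8"),
        headers={"Content-Type": "text/plain; charset=utf-8"},
        method="POST",
    )


def build_sparql_request(query, base_url=BASE_URL):
    """Graph query over stored enhancements, passed as a GET parameter."""
    params = urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url=base_url + "/sparql?" + params,
        headers={"Accept": "application/sparql-results+xml"},
    )


if __name__ == "__main__":
    req = build_engines_request("Nuxeo contributes to fise.")
    print(req.get_method(), req.full_url)
    req = build_sparql_request("SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10")
    print(req.get_method(), req.full_url)
```

To actually send a request, pass it to `urllib.request.urlopen` while a fise server is reachable at the base URL.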

These services can be used in two ways. The first is "a web user interface for human beings who want to test the capabilities of the engines manually and navigate through the results using their browser. This is primarily a demo mode." "The second way to use fise is the RESTful API for machines (e.g. third party ECM applications such as Nuxeo DM and Nuxeo DAM) that will use fise as an HTTP service to enhance the content of their documents."

Organizations and individuals are finding themselves overwhelmed by the sheer volume of information, mostly in the form of unstructured documents, that they must deal with on an ongoing basis. This accounts for the increasing interest in content management systems, and in CMS enhanced with semantic engine technology. Nuxeo is itself a provider of CMS services and has plans to integrate fise with its product line.

Right now fise is a standalone HTTP service with a basic web interface mainly used for demo purposes. To make it really useful some work is needed to integrate it with the Nuxeo platform so that Nuxeo DM, Nuxeo DAM and Nuxeo CMF users will benefit from a seamless semantic experience.

 

To what extent are you and your organization using CMS and what value are you finding in adding semantic annotations to your content?
