Sirius: an Open Source Competitor to Siri, Cortana, Google Now

Sirius is an open source, customizable system that can be commanded through voice input. Built by University of Michigan researchers, it is similar to Apple Siri, Microsoft Cortana, and Google Now. According to the University of Michigan, Sirius “is designed to spark a new generation of intelligent personal assistants” for wearables and other devices.

Sirius consists of two parts:

a collection of services that implement a ready-to-deploy intelligent personal assistant (IPA);
a set of components that power the IPA and are collectively known as Sirius-suite.

Sirius provides core IPA functionalities such as speech recognition, image matching, and natural language processing, including question-and-answer capabilities. It accepts queries in the form of speech or images and returns results in natural language.

According to Jason Mars, co-director of Clarity Lab, thanks to Sirius, “instead of making an app to run on the Apple Watch, for example, maybe I could make my own watch.” This could revolutionize the wearable industry much as Linux did in the server computing space, he says. Another area where Sirius will be key, according to Mars, is research into cloud-based services that process voice commands and into the way those services scale. In particular, he suggests, this research could show that cloud platforms need to be redesigned to specifically support voice-based workloads.

Once Sirius has been built locally, its three services can be started and tested independently, providing ready-to-use solutions for speech recognition, image matching, and question answering.

Sirius’ foundation is Sirius-suite, a collection of kernels that power Sirius’ three distinct capabilities and that is also available independently. More precisely, Sirius-suite kernels provide the following algorithms:

Gaussian Mixture Model (GMM) and Deep Neural Network (DNN) Scoring, used for automatic speech recognition (ASR).
Feature Extraction/Feature Description, which can be used to build an image matching pipeline.
Regular Expressions, Word Stemmer, and Conditional Random Fields, based on the Carnegie Mellon OpenEphyra Q&A system.
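
To give a feel for what the first of these kernels computes, here is a minimal sketch of GMM scoring in plain Python. It is not taken from Sirius-suite: the function names, the diagonal-covariance assumption, and the two-component toy model are all hypothetical, chosen only to show the general idea of scoring a feature vector against a mixture of Gaussians.

```python
import math


def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def gmm_log_score(x, weights, means, variances):
    """Log-likelihood of x under the mixture, using log-sum-exp for stability."""
    terms = [
        math.log(w) + log_gaussian_diag(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    peak = max(terms)
    return peak + math.log(sum(math.exp(t - peak) for t in terms))


# Hypothetical two-component model over 2-dimensional features.
weights = [0.6, 0.4]
means = [[0.0, 0.0], [3.0, 3.0]]
variances = [[1.0, 1.0], [1.0, 1.0]]

near = gmm_log_score([0.1, -0.2], weights, means, variances)
far = gmm_log_score([10.0, 10.0], weights, means, variances)
print(near > far)
```

In a speech recognizer, a score of this kind would typically be computed for each audio frame against many acoustic models, which is why the kernel is a natural target for acceleration; the actual Sirius-suite implementation may differ in every detail.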

Sirius users can address their questions to the Sirius Users Google Group.
