Interview on Wolfram|Alpha, a Computational Knowledge Engine
Wolfram|Alpha, the Computational Knowledge Engine from Wolfram Research, was officially released on May 18. Renowned for its flagship product Mathematica, Wolfram Research has long been one of the most respected suppliers of scientific computation software. The news of the launch of Wolfram|Alpha, a new search engine from Wolfram Research, has thus drawn much attention and hype. A "Google killer", "intelligent search" or "semantic search"? Many buzzwords were associated with the product even before it faced the public. Now, two months after its launch, we are in a better position to evaluate Wolfram|Alpha with our own hands-on experience. It is time to review a few frequently asked questions about Wolfram|Alpha: What is the relationship between Wolfram|Alpha and Google? How would Wolfram|Alpha position itself in the market? To what extent is Wolfram|Alpha a Semantic Web search engine? And how could Wolfram|Alpha make a profit in the market? InfoQ had the chance to invite Xiang Wang, a Business Manager of Wolfram Research Inc. in China, to talk about these questions.
Why is Wolfram|Alpha called a Computational Knowledge Engine? What can we tell from this product's description?
We call Wolfram|Alpha a computational knowledge engine because it generates output by doing computations on its own internal knowledge base, instead of searching the web and returning links.
Wolfram|Alpha is constantly being compared to Google. While Google is a search engine designed for the general public, how would you position Wolfram|Alpha in the market? And what is your viewpoint on the relationship between Wolfram|Alpha and Google?
It's complementary to Google. Search engines give you links to pages that exist on the web. Wolfram|Alpha computes answers to specific questions using its built-in knowledge base and algorithms. Wolfram|Alpha has sidebar links for doing web searches. Its purpose and strength lie not in today's search queries but in enabling a whole new set of questions to be asked. Based on past experience in search, we expect that people's queries will rapidly evolve to be Wolfram|Alpha queries once they see the capabilities. The usage pattern also differs from search: people will use it more systematically, asking the same question with different parameters.
Technically speaking, what are the essential differences between Wolfram|Alpha and related efforts like Ask Jeeves, Google Base, Powerset and the Cyc project?
Wolfram|Alpha is different from them. Ask Jeeves and Powerset are search engines that give you links to pages that exist on the web, while Wolfram|Alpha provides information derived from computation.
Since Wolfram|Alpha is dubbed 'smarter' than traditional search engines, I wonder how many AI techniques are actually employed in the system. How is inference done? What is the provenance of each fact or claim? And what if there is a disagreement? For example, how would it represent information about the Israel/Palestine area?
It's much more an engineered artifact than a humanlike artificial intelligence. Some of what it does, especially in language understanding, may be similar to what humans do. But its primary objective is to do directed computations, not to act as a general intelligence. Wolfram|Alpha uses established scientific or other models as the basis for its computations. Whenever it does new computations, it's effectively deriving new facts. As for the controversial data you asked about, we deal in different ways with numerical data and with particular issues. For numerical data, Wolfram|Alpha curators typically assign a range of values that are then carried through computations. For issues such as the interpretation of particular names or terms, like the Israel/Palestine issue mentioned in your question, Wolfram|Alpha typically prompts users to choose the assumption they want. We spend considerable effort on automated testing, expert review, and checking the external data that we use to ensure the results are correct. But with trillions of pieces of data, it's inevitable that there are still errors out there. If you ever see a problem, please report it.
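Editorial illustration: the idea of assigning a range of values and carrying it through computations can be sketched with simple interval arithmetic. This is not Wolfram|Alpha's actual implementation; the `Interval` class and the population and area figures below are hypothetical, chosen only to show how an uncertain input produces a range in the output.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """A closed range [lo, hi] standing in for an uncertain value (hypothetical helper)."""
    lo: float
    hi: float

    def __post_init__(self):
        if self.lo > self.hi:
            raise ValueError("lo must not exceed hi")

    def __add__(self, other: "Interval") -> "Interval":
        # The sum of two ranges spans from the sum of the lows to the sum of the highs.
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __mul__(self, other: "Interval") -> "Interval":
        # With possible negative bounds, any pair of endpoints can be extreme.
        products = [a * b for a in (self.lo, self.hi) for b in (other.lo, other.hi)]
        return Interval(min(products), max(products))

    def __truediv__(self, other: "Interval") -> "Interval":
        # Division is only well defined if the divisor range excludes zero.
        if other.lo <= 0 <= other.hi:
            raise ZeroDivisionError("divisor interval contains zero")
        quotients = [a / b for a in (self.lo, self.hi) for b in (other.lo, other.hi)]
        return Interval(min(quotients), max(quotients))


# Made-up figures: sources disagree on both the population and the area.
population = Interval(100.0, 120.0)
area = Interval(4.0, 5.0)

# The derived density is itself a range rather than a single number.
density = population / area
print(density)
```

The point of the sketch is that disagreement between sources is preserved as a range in the result instead of being hidden behind one arbitrarily chosen figure.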
An article by Guardian editor Charles Arthur billed Wolfram|Alpha as 'Semantic Web search'. We would like to know whether Wolfram|Alpha is a Semantic Web search engine. Have you employed any Semantic Web technologies inside, or borrowed any ideas from the Semantic Web?
Wolfram|Alpha does not directly use semantic web technology. Wolfram|Alpha has its own internal knowledge base, with its own extensive internal semantics and ontology.
It is said that the data stores underlying the Wolfram|Alpha system contain over 10 trillion individually tagged pieces of data. We are interested to know what data model lies behind this huge volume of data.
There are many trillions of elements in Wolfram|Alpha, continually growing through a large number of feeds.
The curated data is currently exclusive to Wolfram|Alpha, although it is valuable to the outside world as well. Are there any plans to make it available to the public, for example, in RDF? Or to expose Wolfram|Alpha's functionality as a web service?
Most of the data in Wolfram|Alpha is derived by computations, often based on multiple sources. A list of background sources and references is available via the "Source information" button at the bottom of relevant Wolfram|Alpha results pages.
In addition, some data is already directly available to Mathematica users as load-on-demand computable data. An API is also under development for users to get raw data from Wolfram|Alpha.
Where does the raw data come from? How do you deal with the potential problems of heterogeneity and inconsistency, since the data comes from diverse sources?
The data are from many different sources, combined and curated by the Wolfram|Alpha team. To check the Wolfram|Alpha data, we use a portfolio of automated and manual methods, including statistics, visualization, source cross-checking, and expert review.
Natural Language Understanding is an important part of the system. Do you use any specific techniques or strategies that make the system unique and help it stand out from the crowd?
Wolfram|Alpha introduces many new methods for understanding linguistic inputs. Mostly they're unlike traditional NLP, because Wolfram|Alpha has to deal with linguistic fragments rather than full grammatical sentences.
It has been noted that Wolfram|Alpha bears some relation to Stephen Wolfram's 'A New Kind of Science', or NKS. Can you explain briefly how the ideas of NKS are applied to Wolfram|Alpha?
Wolfram|Alpha makes both conceptual and practical use of the NKS idea of generating rich, complex behavior from simple underlying rules. In many ways, Wolfram|Alpha is the first "killer app" for NKS (see Stephen Wolfram's blog post: http://blog.wolfram.com/2009/05/14/7-years-of-nksand-its-first-killer-app/).
Where is Wolfram|Alpha going: to be more specific, or more general?
Wolfram|Alpha's long-term goal is to make all systematic knowledge immediately computable and accessible to everyone. We aim to collect and curate all objective data; implement every known model, method, and algorithm; and make it possible to compute whatever can be computed about anything. Our goal is to build on the achievements of science and other systematizations of knowledge to provide a single source that can be relied on by everyone for definitive answers to factual queries.
Wolfram|Alpha aims to bring expert-level knowledge and capabilities to the broadest possible range of people, spanning all professions and education levels. Our goal is to accept completely free-form input, and to serve as a knowledge engine that generates powerful results and presents them with maximum clarity.
Wolfram|Alpha is an ambitious, long-term intellectual endeavor that we intend will deliver increasing capabilities over the years and decades to come. With a world-class team and participation from top outside experts in countless fields, our goal is to create something that will stand as a major milestone of 21st century intellectual achievement.
Last question. We know that Wolfram|Alpha is great, but can you give us any idea about the intended profit model? What is your business plan?
We are exploring a number of business models for Wolfram|Alpha, including partnerships with major third-party organizations, sponsorships, and future professional and corporate versions.
Details of the professional version are not yet finalized, but are likely to include the ability to upload your own data onto our servers for inclusion in computations (visible only to the uploading user), the ability to download data as well as graphs, and more CPU time allowed per computation. Further off, the corporate version would be designed to run locally within a company and have direct access to that company's databases. For example, users might ask questions such as "sales of product A/product B" or "sales targets for John Smith".