Wolfram Alpha and Google Squared

by Brad Cohen (@supnah)

Wolfram Alpha and Google Squared have been getting a lot of press lately, and a lot of hype as well. Some of it is deserved, but both technologies still have a long way to go before they start to impact traditional search. But just what are these new search technologies, and what is meant by “structured results” in search?

A current Google search returns unstructured results. The results are organized because Google ranks which pages it thinks are most relevant to the search term that was entered, but the data within those pages - the data in each that represents the potential answer to your query - lives by itself within each page.

The idea of structured data is that the search engine doesn’t just index what a page is about so that it can match it to keyword phrases, it also indexes the data and information within that page as whatever type of data it happens to be.  If a page contains statistical data, then that statistical data is indexed.  Once the data is indexed, the search engine can use the data within Web pages as component parts that can be leveraged against one another computationally.

Consider this example:
“Web page A” contains a table with data on sales figures for “Product A” from 1980 to 1995. “Web page B” contains a table with data on sales figures for “Product B” from 1990 to 2000. A normal Google search would have no way to combine those figures for the searcher. However, in a structured data search you could ask the search engine (like Wolfram Alpha or Google Squared) to show you a comparison of sales between “Product A” and “Product B”. The engine would understand the data it had indexed, and it would understand your question, so it would be able to return a graph of the sales for “Product A” versus “Product B” for the years 1990 to 1995, because those are the only years for which it had indexed data for both products across the two Web pages.
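To make the overlap idea concrete, here is a minimal Python sketch. It is not how Wolfram Alpha or Google Squared actually work; the function name and the sales figures are invented for illustration, and each "indexed table" is assumed to be a simple year-to-sales mapping.

```python
def overlapping_comparison(series_a, series_b):
    """Pair up the values for every year that appears in both series."""
    shared_years = sorted(set(series_a) & set(series_b))
    return {year: (series_a[year], series_b[year]) for year in shared_years}


# Hypothetical figures standing in for the tables on the two Web pages.
product_a = {year: 100 + 5 * (year - 1980) for year in range(1980, 1996)}  # 1980-1995
product_b = {year: 80 + 10 * (year - 1990) for year in range(1990, 2001)}  # 1990-2000

comparison = overlapping_comparison(product_a, product_b)

for year, (sales_a, sales_b) in comparison.items():
    print(year, sales_a, sales_b)
```

Only the years 1990 to 1995 come back, because those are the only years both tables cover. The hard part for a real engine is everything this sketch assumes away: finding the tables, recognizing that both hold yearly sales figures, and understanding the question.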

Wolfram Alpha takes this one step further by claiming the intent to index the data scattered across the Web as component parts that can then be computationally leveraged to perform extremely complex and, in some cases, novel research. This is partially enabled because the programming language in which Wolfram Alpha is written, “Mathematica,” which was also created by Stephen Wolfram, contains within it a vast number of the computational algorithms currently known to man.

Some of the barriers that will keep an average user from beginning to use these technologies are:

  • They require their queries to be input in a more exact manner than a typical Google search. Because you are asking a factual question and requesting a computation, you must ask in a way that the engine understands.
  • They can answer only factual questions, or questions that can be answered computationally. While it is possible that what constitutes a “factual question” will expand over time as these engines become more robust and capable, for the time being this is still a relatively limited portion of the queries performed by users.

If these tools begin to gain significant market share in search, it could change the game in a very real way. These tools index the information out of Web pages, so their goal is not to send the user to the best matching Web page. Rather, the goal is to present the user with an answer computed out of their index of structured data, in which case the user never leaves the search engine - the user never arrives at a Web page owned by a party other than the search engine. This means that there is no content outside of the search engine on which to place advertising, and it means that if the search engine places advertising within its results, it is earning money by leveraging data it essentially pirated from other Web pages. The revenue model will be complicated, to say the least.

But there are some very real examples of just how cool this technology could be. In a podcast interview with Leo Laporte, Stephen Wolfram spoke of a use of Wolfram Alpha that surprised even him. (You can listen to the full interview - it’s at the end of TWiT #195.) Here’s Wolfram’s example in his own words:

“Looking at people’s names, like first names and so on. I had no idea how much structure there was in the popularity of a name as a function of time. And here’s a fun thing that I was just surprised by. Given that you can know, from birth records, you can know how many people named Steven were born in each year for the last hundred years or so, but then you can use mortality data to figure out just how many will be surviving at this time. And from that you can get the distribution of what the expected ages of people will be. And it’s really bizarre because an awful lot of people, particularly with slightly more unusual names, you can basically predict from their name roughly how old they are. I didn’t realize that there was that kind of quantitativeness in that area.”
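The computation Wolfram describes can be sketched in a few lines of Python. This is only an illustration of the idea, not Wolfram Alpha's method: the function names are hypothetical, the birth counts are invented, and the survival curve is a deliberately crude stand-in for real mortality data.

```python
def living_age_distribution(births_by_year, survival_probability, current_year):
    """Estimate the share of living people with a given name at each age.

    births_by_year: {birth_year: number of babies given the name}
    survival_probability: function mapping an age to the chance of surviving to it
    """
    living = {}
    for birth_year, births in births_by_year.items():
        age = current_year - birth_year
        living[age] = births * survival_probability(age)
    total = sum(living.values())
    if total == 0:
        return {}
    return {age: count / total for age, count in living.items()}


def toy_survival(age):
    # Made-up curve (an assumption): survival falls linearly to zero at age 100.
    return max(0.0, 1.0 - age / 100.0)


# Invented birth counts for one name; real figures would come from birth records.
births = {1929: 1000, 1959: 5000, 1989: 2000}

distribution = living_age_distribution(births, toy_survival, current_year=2009)
most_likely_age = max(distribution, key=distribution.get)
print(most_likely_age, distribution)
```

With these made-up numbers, the name peaked in 1959, so a living person with that name is most likely to be about 50. The sharper the peak in a name's popularity, the better the guess.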

So these tools are useful for specific questions based on calculable data. There is a large volume of search queries that these engines will not be able to answer anytime soon, and potentially will never be able to answer. You should not expect these tools to break the current model of Web revenue, but they may change it eventually. Two things are sure: 1) Wolfram Alpha isn’t going to kill Google tomorrow, and 2) This will be fun to watch as it develops.
