In 1994, Jerry Yang and David Filo evolved their directory of websites into something they called “Yet Another Hierarchical Officious Oracle,” which of course acronyms to “YAHOO.” Originally, Yahoo was a human-maintained directory: through the mid-1990s, every site indexed by Yahoo was evaluated and summarized by a human being. A search on Yahoo in 1995 presented a user not only with links (as a modern search engine does), but with a summary of and commentary on each website.
For as long as Yahoo could keep up with the growth of the web, it was an exceptional search engine. Indeed, because of that human element, its output was better than what modern search engines offer today.
Yahoo became less and less effective through the late 1990s as manually indexing the exponentially growing world wide web became impossible. Fully automatic indexing, pioneered by Larry Page and Sergey Brin, was the only viable path forward. Google eclipsed Yahoo because its index actually spanned the entire web. But even today, Google’s search output is less rich than what Yahoo was producing 15 years ago.
In the late 1980s, Apple produced a now-famous film titled “Knowledge Navigator” about computing in the future. In it, a professor verbally instructs his computer to, among other things, plot deforestation patterns in Brazil since 1950. The computer complies. The film is fascinating because of how prescient it was in many respects and how far away we are from its vision in others.
As the film portended, we now live in a world where virtually any information known to mankind is available to us via the world wide web and tools like Google. But we are nowhere near being able to talk to our computers in natural language. Nor can our computers deliver Brazilian deforestation data in a line chart from a single command. We can “Google” Brazilian forestry data and, after some minutes, find that data, download it, format it for our graphing program, and get our line chart.
The next challenge in computer science will be to deliver on requests as arbitrary as “Brazilian deforestation line graph” as easily as Apple envisioned. Interestingly, Google has been ready for this since its very first day on the web. On every Google home page, in every language, there is a button labeled “I’m Feeling Lucky” which takes the user immediately to Google’s best guess of what she is looking for. Of course, no one ever uses this button, because it doesn’t work (yet). As of today, you cannot type “Brazilian deforestation data, 1950 – present, line chart” and press one button to get precisely that. But the existence of that button on every Google home page suggests that founders Larry Page and Sergey Brin “get it,” and always have. They know well what the holy grail of computer science is.
97% of human knowledge is now digitally encoded, and Google has indexed most of it. But for all its computing power and storage capacity, Google’s massive databases are remarkably ignorant of what all that data actually is. Google has almost no metadata. Per site, it has less metadata than Yahoo had 15 years ago.
Until the internet’s index is enriched, the precision answers Apple envisioned 25 years ago will remain impossible. Thus the future of the internet lies in metadata. The internet’s next giant leap will come when search engines actually understand the data they index (à la Yahoo in 1995, and beyond), empowering them to transform that data en route to the user so that it arrives in precisely the format the user needs. In other words, the future of the internet is to do what Quandl is currently working on. I think.