Tammer Kamel

Foot soldier for the open data movement; founder of Quandl.


The Travails of the Closed Data Industry in an Open Data World

Image by Carlos Gotay

For decades, the Reuters corporation famously owned one share of every publicly traded company in America. This was not an investment strategy, of course. It was simply to ensure that they received quarterly and annual reports from every public company in the country. From these documents they meticulously extracted quantitative corporate facts. This exercise empowered them to sell a well-organized compilation of company data to clients.

In the decades before the ascent of the Internet, Reuters provided an invaluable service: they delivered easy access to information that, although public in nature, was difficult to find and use. Reuters built a strong business around a labor-intensive but nonetheless very innovative data pipeline. Even through the first decade of the information age, there were really no alternatives to the Reuters product.

Perhaps even more innovative was the terminal apparatus that Michael Bloomberg invented in the early 1980s. The ideas that underlie the product actually foreshadowed the World Wide Web. The terminal used concepts such as unique resource identifiers that resulted in certain “pages” appearing on your screen. It also offered a secure instant message service 10 years before a few corporate pioneers started using email. There was a status bar for users, live scrolling news, and much else that would not be out of place in a social app built 20 years later. (Ironically, Bloomberg was so far ahead of the world on this stuff that it became a bugaboo for the company in later years: they would spend hundreds of millions to retrofit their product to the ‘modern’ web in the 90s and 00s).

Today the business models and technology that underlie both Reuters and Bloomberg cannot be labeled innovative. In fact, in 2013, both companies rely on commoditized technology and very orthodox business strategies. Bloomberg is, for all intents and purposes, a content web site. All its core technological innovations like networking, messaging and the browser concept are free and widely available in open source versions. The massive moat that once protected that company is but a damp ditch today.

Even worse for Bloomberg, Reuters and their peers is that data—the basis of everything they sell—is not scarce anymore. In fact, it has become readily available. Want to know IBM’s gross revenue last year? There would easily be one hundred web sites publishing that information for free. This makes charging anything, let alone thousands of dollars, for such data an increasingly difficult value proposition.

Hence Bloomberg, Reuters and their peers and emulators are fast becoming yet another case study on the consequences of trying to execute a business model whose efficacy is being diminished every day by the Internet itself. When your core innovation is built on the scarcity of a resource and that resource suddenly becomes ubiquitous, you’ve got a problem. And in the case of “old school” data vendors that problem is quickly escalating to an existential crisis. For these businesses, open data is anathema.

The response from these companies to this new challenge has been anything but innovative. On the contrary, these companies have taken an inelegant brute force approach to this problem: expend more and more resources to fight the dissemination of information. They want to counter open data with tactics they hope will reinstate data scarcity.

This strategy manifests in two ways. The first is to contractually curtail how customers can use the product they offer. For example: you can buy this data from us provided you never share it with anyone else, and if you do, you are liable. That business model is simply not sustainable. It leaves a business having to spend substantial resources to monitor, scold, threaten and in some cases sue its own customers. It eventually wrecks your brand.

The other tactic they are attempting is to take data out of the public domain via proactive copyrighting. The University of Michigan consumer sentiment index is a great example. For decades it was calculated and published altruistically by a government-funded university. Not any more. Today the “Reuters University of Michigan Consumer Sentiment Index” is copyrighted and available to you if and only if you are paying the Thomson Reuters corporation.

This won’t work either. If Reuters insists on making any particular index too difficult to acquire, you can be 100% certain the nascent but vibrant open data community will create an equivalent, freely available version. Worse, based on precedents from the open source movement in software, they will probably do it better and cheaper than Reuters can.

Proactive copyrighting and building walls around data that was previously free is, alas, the sum total of the incumbent response to open data. As crazy as it sounds, the best idea that these once-innovative companies seem to have is to fight the very dissemination of information on the Internet. I’m not certain, but I am pretty sure that no person, no team, no organization, no company, no government, no technology and no legislation has ever achieved this.

There are of course ways to survive open data and indeed thrive in the new regime it portends. But as long as these companies spend millions of dollars on lawyers trying to protect a moribund business model, they are doomed. Every day more and more of the data these companies sell is offered to the world by innovative startups for $0. Fighting this is a futile waste of money.

Data will always be valuable, but the era of monetizing this value is coming to an end. You can no longer charge money for the simple act of making information available. But the era of inferring deep insight from data is just beginning, which offers a path to redemption for any vendor with the courage to divert money from lawyers to data scientists and engineers.




  • Daniel

    “But the era of inferring deep insight from data is just beginning.” Couldn’t agree more with this sentiment.
