Startups: Why to Share Data with Academics

Last week I wrote a bit about how to share data with academics. This is the complementary piece, on why you should invest the time and energy in sharing your data with the academic community.

As I was talking to people about this topic it became clear that there are really two different questions people ask. First, why do this at all? And second, what do I tell my boss?

Let’s start with the second one. This is what you should tell your boss:

  • Academic research based on our work is a great press opportunity and demonstrates that credible people outside of our company find our work interesting.
  • Having researchers work on our data is an easy way to access highly educated brainpower, for free, that in no way competes with us. Who knows what interesting stuff they’ll come up with?
  • Personal relationships with university faculty are the absolute best way to recruit talent. If we invest a little bit of time in building a strong relationship with this professor, she’ll know the kind of people we’re looking for and send us her best students.

All of these points are valid, but they aren’t complete. As a startup, you’re most likely building a product at the intersection of a just-now-possible technology and a most-likely-ready market. The further the research in your field moves, the greater the number of possible futures for your company. Further, the greater the awareness of your type of technology in the community, the larger the market is likely to actually be.

Your company is one piece of a complex system, and the more robust that system becomes, the more possibilities there are for you. Share data, and you make the world a more interesting place in a direction that you’re interested in.


  • http://twitter.com/fhuszar Ferenc Huszar

    I agree, and this is why our company, PeerIndex, invests time, capital and data into developing a healthy academic ecosystem around us. We fund PhD students and coauthor research proposals. Here is our perspective on this:

    In lean/agile startup land you often hear ‘perfect is the enemy of good’, and how you should just quickly release something that is imperfect but works and then innovate in an incremental, agile fashion, based on feedback on what aspects you should concentrate most on. But research and innovation often do not work this way. If you are incremental and agile, you often get stuck in a locally optimal solution, and you won’t be able to develop anything radically different from what you’ve started with. To really address big challenges you have to think outside the box, and you have to have people around you who have the freedom to rethink anything in your product or approach.

    So there is this tension between a startup’s desire to release and innovate quickly and incrementally, and the long-term goal of thinking outside the box and innovating radically. This is where academic collaborations are very useful: the startup can concentrate on day-to-day improvements, improving the product and adding new features incrementally. In the meantime, your academic partner will have the chance to address the big questions underlying your product in a much more flexible way. And if you’re lucky, in one or two years’ time they will come up with that radically new way of doing things that will eventually replace your old, outdated product.

    From a capital investment perspective, working with academia is very cheap, compared to a lot of other things that would result in the same effect. Your company may even benefit from it financially if you co-author certain grants. But it’s not just exploiting academia: they need your data, and they need your high-impact problems to work on. It’s a win-win situation. However, as an ex-PhD-student, I think it’s the companies who win more :)

  • Rodrigo Silva de Oliveira

    If I may, I’d like to add that by publishing your work you also protect the intellectual property of your project or idea. It sounds like a paradox, but it works like a record that you are the one responsible for coming up with this initiative.

  • Brian Dalessandro

    I totally agree on the benefits. I have gone down this path many times as an industrial data scientist. It should be noted that the academic’s incentive is usually to publish, and so the company considering this possibility needs to evaluate the risks associated with letting the ideas go to the public domain. One specific risk is IP protection. On the one hand, no one else can patent your idea after it has been published, but neither can you. This can be solved by filing a non-provisional patent application prior to publication. Another challenge is disentangling IP ownership with the university associated with the academic. Universities earn a lot of royalties on patents, so they tend to be vigilant about it. Sometimes it is worthwhile to pay the academic as a consultant just to maintain clean ownership of the IP.

    I’ll end my comment by proposing another benefit. Having an academic liaison will keep you abreast of the latest and greatest research. When you don’t have time to sift through white papers and conferences, it’s nice to have someone willing to share interesting finds with you.

  • http://twitter.com/FlatMerge FlatMerge

    Great posts! This comment is more about the “how to share data with academics” post… 

    Academics often shy away from startups, corporations, and private industry in general. Mostly they do this because it is difficult to prove the scientific validity of data from those sources. Academics will not use your data if it is not credible and does not at least look scientific. So, along with your API or data dump/stream, you will need some good metadata that *accurately* describes your data. Academics call this a “codebook.”

  • Scott Sokoloff

    There’s also the added benefit of gaining access to highly educated talent that might not otherwise consider working with you. I’ve had several experiences where I’ve shared data with the academic community and the results spawned new work. I was able to hire the person who did the original research to assist with the ongoing effort.