Big Data Fatigue Syndrome (BDFS): a cognitive disorder characterized by feelings of frustration, disbelief, and growing apathy caused by repeated exposure to over-hyped technology concepts. Occasionally accompanied by recurring fantasies of slapping publishers.
In the midst of our mad rush to amass yottabytes of “big data” as the cure-all for health care (see my Twitter feed for example articles), I wonder if it might be possible to pause briefly and ask one question:
What exactly are you going to do with the data?
Don’t get me wrong: I am a huge advocate of the opportunity in big data (I actually believe it could be revolutionary). But it strikes me that health and life sciences has not really mastered “small data,” yet everyone seems excited to discuss big data. I suppose it is no different in other industries — the hype is rolling along, with Gartner estimating 2013 spending of $34B. That’s more than ice cream money.
Yet there are a few lessons from other industries and earlier data efforts that might apply to health care:
1. If you don’t know what you are going to do with data, there is no way you will collect it properly. Hint: EMR implementers, you might want to look into this.
2. “More” and “Better” are two different and often unrelated concepts.
3. “More” increases costs regardless of how it is used (e.g., storage, cleaning, administration, integration architectures, software licenses).
4. “Better”, when used properly, increases return on investment (e.g., increased efficacy, productivity, cost containment and avoidance, revenue maximization).
5. If the “more” is not already inherently “better”, it can only become “better” by incurring additional costs.
“More” is a quantitative assessment: one petabyte is more than 500 terabytes. “Better” is a qualitative assessment: it can only be judged in context. In the world of analytics, that context is the question you are trying to answer. Without that context, “more” can only ever be “more.”
At a conference I spoke at in July, I posed this question to the audience: do we really want “big data,” or should we be focused on “big insights”? Based on the reaction in the room, I think the question resonated with a lot of health executives. If we raised the caliber of the questions we are asking, we would undoubtedly find that big data has a dramatic role to play. For example, I’ve written before that big data presents a new opportunity in the science we practice. What sorts of clinical questions could we answer using this analytics-oriented approach, questions whose answers could offer immediate benefits to patients and physicians? Could we, for example, model a three-factor relationship between disease prevalence, socio-economic status, and geography in order to optimize the design of clinical trials? Could we mine behavioral propensities to look for non-genomic indicators of treatment efficacy? Could we predict (not just detect) epidemiological progressions based on real-time consumer data feeds?
For each of these, the question itself opens up a more meaningful dialogue. How exactly could we analyze that? What data could we potentially use? How much data would we likely need? What would be the limitations of the data, and how might we address those limitations? What other questions would we need to answer in order to feel confident in our findings? These questions put us on the road to delivering real value from our data assets, regardless of their size or source.
The discussions we need to have should center on the insights that could have the biggest impact across health and life sciences. Let’s define “better” before we decide how much “bigger.”