Being a new blogger, I am now always on the lookout for blogs I enjoy, be they geological, technological or the abstract/absurd.* They are rich sources of opinions. Occasionally, they offer information sufficiently analyzed to yield insights. Information and opinion are often confused or conflated, be it in blogs or traditional media.
Quality first-hand information feels like an ever rarer commodity. Nowhere was this more apparent than in what passed for election politics in the recent United States election cycle. The debates focused on character and past indiscretions as proof neither party’s candidate was fit to rule from the perspective of the other. Policy analysis and an understanding of the intended consequences received little mention and even less critique. That led me to ponder: just how important is information to us, the beneficiaries of that wonderment, The Information Age? Now that ‘post-truth’ is a fact of life, what role does information play, if any?
Why do we even need to seek information, or evaluate it, or even worry about where it comes from or its quality? After all, with information so ubiquitous, surely quality, relevance, and reliability are ‘givens’, just like the reliability of the power systems we put in place, right? Sadly, no. If information’s reliability drops to 50/50 (i.e. any given piece of information could be reliable or not, with an accuracy no better than chance) trust will logically become completely absent. If true facts don’t matter anymore, how can we trust anything or anyone?
Reliability of information is never 100%. Even the best data is incomplete or contains errors. Collecting it takes effort. Confirming it takes time. Just ask any author published in a scientific journal, or a mechanic who’s tried to extract data from your car’s computer. Often, information quality assessment includes views on its relevance, its reliability, its applicability and its quanta (resolution, fidelity, granularity and vintage). Arguably a recent inflation discussion misses the point on this front, as the measures which inform it fail to capture quality as a characteristic of goods & services in society and focus only on their prevalence. (It’s a shame he doesn’t have a blog.)
We are now at the point where we collect data (via a plethora of sensors on the IoT) simply for its option value: “let’s just collect a bunch of data and hope in the future we can find something interesting in there to help us do things better.” However, even unreliable information is not ‘free’ and always comes with risks, not the least being that the information may be irrelevant to the situation at hand (the decision frame). That mindset of ‘some possible future optionality’ reflects an implicit view that the information will have relatively high value in a current or future decision-making context. This assumption benefits from a thoughtful challenge: what is the value of information in (your specific) Big Data application?
Modern systems (AI, neural networks, Watson, etc.) all rely on ‘judgment’ about the information (i.e. probability) to some degree; often this is updated from some known or estimated base rate. Knowing the reliability of information underpins a basic learning/state-update model made famous by the Reverend Bayes’ mathematical description. High reliability is a goal of Big Data systems, but the data itself has only some degree of reliability, and so, perhaps more importantly, does the interpretation of the data/analysis (its meaning as applied to the real world). Sometimes this can be quantified; sometimes not. This is a key consideration for any use of information regardless of its source. As we embrace ‘post-truth’, correctly characterizing that reliability becomes ever more important. We must always be mindful that the characterization process is itself subject to the basic rules that govern how our mind works with respect to risk and benefit.
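The update from a base rate can be made concrete with a minimal sketch of Bayes' rule. It assumes a simplified, symmetric source: one that reports correctly with the same probability (`reliability`) whether the underlying claim is true or false. The function name `bayes_update` and the 30% base rate are hypothetical, chosen only for illustration.

```python
def bayes_update(prior: float, reliability: float) -> float:
    """Posterior probability that a claim is true after a source says it is.

    Assumption (not from the post): the source is symmetric, i.e. it
    reports correctly with probability `reliability` whether the claim
    is actually true or actually false.
    """
    # P(source says "true" AND claim is true)
    says_true_and_true = reliability * prior
    # P(source says "true" AND claim is false)
    says_true_and_false = (1.0 - reliability) * (1.0 - prior)
    return says_true_and_true / (says_true_and_true + says_true_and_false)


base_rate = 0.30  # hypothetical base rate, for illustration only
for reliability in (0.5, 0.7, 0.9, 0.99):
    posterior = bayes_update(base_rate, reliability)
    print(f"reliability={reliability:.2f} -> posterior={posterior:.3f}")
```

Under these assumptions, a source with 50/50 reliability leaves the posterior equal to the base rate: the report carries no information at all, which is the 'no better than chance' case described earlier. Only as reliability climbs above 0.5 does the report move the estimate.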
The reliability of any first-hand information is thus slowly (rapidly?) shifting from high quality / high reliability (implying high trust) to being no more informative than chance would dictate.** I find it not a little ironic that trust in first-hand information (and its confusion with opinion) is decreasing rapidly while there is a simultaneous explosion of data generation, storage, and analysis. If these could be seen as extreme poles of information quality, one type is racing to minimum quality while another is racing towards full reliability (to the best of human ability) on high-reliability systems.
An advanced application of the use of information gets short shrift in the Big Data discussion: Value of Information. “Value” seems to focus exclusively on the direct cost of building, running and using the systems necessary to apply advanced data processing and storage. Value of Information analysis seems MIA, or only of interest to academia. Its adoption is easier than ever, provided an appropriate framing is to hand.
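As a rough illustration of what a Value of Information calculation looks like, here is a minimal sketch for a two-state, two-action decision. Everything in it is an assumption made for illustration: the function names, the 'invest'/'hold' actions, the payoffs, the 40% prior, and the symmetric-reliability signal model. It is not a description of any particular Big Data system.

```python
def best_expected_payoff(p_good: float, payoffs: dict) -> float:
    """Expected payoff of the best action, given P(state is good).

    `payoffs` maps an action name to (payoff if good, payoff if bad).
    """
    return max(
        p_good * good + (1.0 - p_good) * bad
        for good, bad in payoffs.values()
    )


def value_of_information(prior: float, reliability: float, payoffs: dict) -> float:
    """Expected gain from seeing a signal before choosing an action.

    Assumption: the signal is symmetric, i.e. it names the true state
    with probability `reliability` in either state.
    """
    without_info = best_expected_payoff(prior, payoffs)

    # Probability of each signal, then the posterior after seeing it.
    p_says_good = reliability * prior + (1.0 - reliability) * (1.0 - prior)
    p_says_bad = 1.0 - p_says_good

    with_info = 0.0
    if p_says_good > 0.0:
        post = reliability * prior / p_says_good
        with_info += p_says_good * best_expected_payoff(post, payoffs)
    if p_says_bad > 0.0:
        post = (1.0 - reliability) * prior / p_says_bad
        with_info += p_says_bad * best_expected_payoff(post, payoffs)

    return with_info - without_info


# Hypothetical decision: invest (gain 100 if good, lose 60 if bad) or hold.
payoffs = {"invest": (100.0, -60.0), "hold": (0.0, 0.0)}
for reliability in (0.5, 0.8, 1.0):
    voi = value_of_information(0.4, reliability, payoffs)
    print(f"reliability={reliability:.1f} -> value of information={voi:.1f}")
```

In this toy framing, the value of a signal is the most one should be willing to pay for it. A signal no better than chance is worth nothing, and even a perfect signal has a finite value set by the decision it informs, not by the volume of data behind it.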
Big Data is an enabler of decision making, just like any other information. Increasingly good curation with sophisticated computing systems adds to the reliability of accessing that information, applying it in real-time and quantifying its reliability or even its applicability; it does not replace the other pillars of quality decision making. I have yet to see a Big Data system advertised as being able to improve Commitment To Action. Perhaps instead of an “impact score” based on citations, all publications should be required to carry a reliability score, taking into account past accuracy on 1) the actual facts, 2) actual statistical significance, and 3) any forecasts they make in relation to those. The score would have to be created by an independent body, preferably fully algorithmized.
Big Data will not end Decision Analysis. It will reduce the cycle time for some decisions. It enables increasingly relevant data/information to be used in routine decision making. Routine decision making lends itself to being algorithmized. That trend will continue. The non-routine, the novel, the non-repetitive (dare I say ‘strategic’) will command ever more of the mental energy of people / businesses / governments. Information reliability under these circumstances will continue to rely on judgment. Supplementing that judgment with statistics has arguably become the permanent mission for frequentists; their raison d’être. The distinction between information volume and information value is blurring, and the two get harder to disentangle daily. Judging the use of information in decision making becomes a commensurately more rarefied art form.