I am often fascinated by the ways in which data is erroneously perceived, collected, stored, retrieved, and reported in organizations. Data of all kinds certainly have limitations. In any given situation, I find that half the people ignore any data that is presented, and half of the remaining people misinterpret the information.
But rather than dwelling on these failures, I believe that a potential solution to data ignorance lies in thinking bigger about data. Data errors would be far less common (though certainly would not be eradicated) if people understood the true nature of data and the relationship of data to the space-time continuum (or some other type of scientific-like thing).
Therefore, I propose the following “Four Laws of Data” to help guide practitioners to understand the interdependencies within the data universe.
Law 1: The amount of data in the world is infinite and always has been infinite.
The universe of potential metadata for any one data point also is infinite.
Data beget data. Data are present in the minds of all sentient beings, and in the timeless gifts of nature (e.g., information frozen within tree trunks ensconced in glacial ice).
Imagine a blue shirt. The color blue on that shirt may be unique. Perhaps the dye was obtained from a rare plant in Taiwan. Maybe that plant has appeared in the background of two movies starring Brad Pitt. Each of these characteristics of the “blue” data point is an additional data point in and of itself. These represent only a couple of data points pertaining to a shirt, but the entire universe of possibilities is as infinite as one’s imagination (and the collective imaginations of others).
Law 2: The knowledge of any one data point is inversely proportional to the availability of all data.
As collective human awareness of accessible data increases, the average knowledge of any one accessible data point diminishes.
Think of a city that has only one building. All residents can see it or experience it in some way on a regular basis. Compare this to New York City. The probability that any one person would have familiarity with any randomly chosen building in the city is minimal. Whereas landmarks like the Empire State Building or the Freedom Tower would be associated with higher levels of awareness, the vast majority of remaining buildings would be recognizable only by smaller groups of people. Accessibility does not breed familiarity.
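One rough way to put numbers on Law 2 is the sketch below. It is an illustration under invented assumptions, not a measurement: suppose each resident can be truly familiar with only a fixed number of buildings (the figure of 50 and the function name `familiarity_probability` are hypothetical). The chance of recognizing a building chosen at random then shrinks in inverse proportion to the size of the city.

```python
from fractions import Fraction


def familiarity_probability(buildings_known: int, total_buildings: int) -> Fraction:
    """Chance that one person recognizes a building picked uniformly at random.

    Assumes (hypothetically) that a person is familiar with a fixed number
    of buildings, no matter how large the city grows.
    """
    if total_buildings <= 0:
        raise ValueError("total_buildings must be positive")
    # Nobody can know more buildings than the city contains.
    known = min(buildings_known, total_buildings)
    return Fraction(known, total_buildings)


# Invented city sizes, from the one-building town to a large metropolis.
for total in (1, 50, 5_000, 500_000):
    probability = familiarity_probability(50, total)
    print(f"{total:>7} buildings: {float(probability):.4%} chance of familiarity")
```

In the one-building town the probability is 1. Once the city outgrows what a single person can know, every tenfold increase in buildings cuts the probability by a factor of ten.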
Law 3: Data are not created. They are observed.
Data that are easily stored and retrieved (and therefore observed) receive disproportionate weight in human decision making. Data availability is not a substitute for truth.
There is an old mind trick that kids play on each other. One will say something like, “Don’t think about a pink polar bear driving a car.” Inevitably those within earshot begin thinking about just that image.
Somewhere, maybe in multiple brains, data exist regarding the “percentage of people currently thinking about pink polar bears.” These data may be hard to measure. After all, the data are present only in each person’s mind. It may be difficult to consolidate data between minds, but the individual mind is still a data storage and retrieval system. Even if you can’t retrieve the data easily, the data are still there.
Even if no one ever mentions pink polar bears again, at any point in time, there is a percentage (maybe even 0 percent) of people thinking about pink polar bears. Even if you as an individual are not perceiving a data point, the data are observable somewhere. Data exist both before and after you observe something.
I often attend talks where people claim that data are being “created” exponentially faster today than in all of human history put together. While there is some disagreement about these claims, what can’t be denied is that the world of data storage and retrieval has changed.
In this light, the issue is not that data existence has changed. According to Law 1, the amount of data is infinite and, in the purest form, data always have been infinite. The crux of data-climate change is the exabytes of data we store (and merge) in digital systems in 2015 compared with the distant past. Prior to the past century, the primary data-transfer systems were books and oral tradition. Nonetheless, the largest modern data processor in the world still doesn’t come close to the computing power of a single human brain. And the single human brain still doesn’t come close to the collective data landscape that has evolved since the beginning of time. The data climate has changed. That doesn’t mean data itself has changed, but the way we perceive, collect, store, retrieve, and analyze data certainly has.
My reason for proposing these laws is to further the conversation on the true nature of data. Too often, people talk about creating data, when all they are doing is making existing data more accessible. Also, when people do understand the data presented, they tend to make decisions based on the available data without questioning what data isn’t available and why. In a world with infinite data, it is often more important to focus on the universe that you can’t see than to limit yourself to what’s in front of your face.
View the Original Version of the Article at GovExec.com