Datafication

“Big data” as a term reminds me of “social media” a few years ago. It is in danger – through misuse and overuse – of losing its currency before many people fully understand its significance. And it is very, very significant indeed.

One of the books I’m reading – at a rapid pace, which is a testament to its usefulness – is Big Data: A Revolution That Will Transform How We Live, Work, and Think, by Kenneth Cukier, The Economist’s data editor, and Viktor Mayer-Schönberger of the Oxford Internet Institute.

One of the problems with the term “big data” is that it is doing too many jobs. Cukier and Mayer-Schönberger offer us a provisional term for the revolution in data that we are living through:

There’s no good term to describe what’s taking place now, but one that helps frame the changes is datafication, a concept that we introduce in Chapter Five. It refers to taking information about all things under the sun—including ones we never used to think of as information at all, such as a person’s location, the vibrations of an engine, or the stress on a bridge—and transforming it into a data format to make it quantified.

Awkward as it is, “datafication” works for me as a description (possibly just because it isn’t “big data”).

And the definition of big data? Try these:

There is no rigorous definition of big data. Initially the idea was that the volume of information had grown so large that the quantity being examined no longer fit into the memory that computers use for processing, so engineers needed to revamp the tools they used for analyzing it all. That is the origin of new processing technologies like Google’s MapReduce and its open-source equivalent, Hadoop, which came out of Yahoo. These let one manage far larger quantities of data than before, and the data—importantly—need not be placed in tidy rows or classic database tables. Other data-crunching technologies that dispense with the rigid hierarchies and homogeneity of yore are also on the horizon.

Or

big data refers to things one can do at a large scale that cannot be done at a smaller one, to extract new insights or create new forms of value, in ways that change markets, organizations, the relationship between citizens and governments, and more.

Before you get too cynical, before your cortex starts rejecting any conversation, content or plan that includes “big data”, I urge you to read this book. It’s a great primer on the issues and opportunities that the era of big data presents us with.

It also quickly introduces some incredibly powerful key concepts, among them the messiness of data and the switch from causation to correlation. It has my brain fizzing in the same way that The Origin of Wealth and Linked did a few years ago on the subjects of networks and complexity.