Big data in a historical context


Excellent stuff from Alan Patrick on his Broadstuff blog, talking about the 70s, 80s and 90s versions of big data – or “data”, as it was called back then…

And you know what – you just cannot simulate the minute, operation-laden details of a shop floor or logistics network reliably. No matter how big your dataset, your computers, or your machine tools’ onboard intelligence, there is just too much variability. Which is why the Just In Time/Lean movement came about as the better approach – the aim was to simplify the problem rather than hit it with huge algorithmic models and simulations so complex that no one fully understood what they were doing anymore (just ask the banks what happens down that route). JiT/Lean aimed to actually reduce the problem’s variability – to get back to Small Data, if you like.

Alan discusses the way that despite fascination with new technology and algorithms, the drumbeat that industry marches to is that of economics – in this case the pendulum swing of offshoring and onshoring, powered by the temporary advantage of emerging economies’ lower labour costs.

[…] It’s back to the future… I suspect they are now using bigger and bigger number-crunching to eke out the last 20% of improvements from the various kaizen projects ongoing, trying to keep the factories in situ as the Big Economics shift yet again.

The rate of change today often feels bewildering at ground level, but by keeping one eye on the forces of history and economics, we see ourselves in the context of slower-moving but more significant trends. In The Second Machine Age – which I’ve been fixated on over the last week (I even look dangerously close to finishing it) – the authors point out that:

  • productivity gains from electric motors took about 30 years to emerge in manufacturing.
  • steam engines unlocked 100 years of productivity gains (and an exponential growth in human population).
  • microprocessors and the IT revolution unlocked meagre productivity gains until the late 1990s.

What drove productivity in these instances was innovation that used the technology better – innovation in products, processes, organisation and management. When we look at new technologies in our lives and workplaces – social computing, big data and the like – it could be decades before their true potential is felt by all bar the early adopters, who are able to see that potential and change their mindsets and ways of working fastest.

Data exhaust trails

Another useful insight from Big Data: A Revolution That Will Transform How We Live, Work and Think:  

A term of art has emerged to describe the digital trail that people leave in their wake: “data exhaust”. It refers to data that is shed as a byproduct of people’s actions and movements in the world. For the Internet, it describes users’ online interactions: where they click, how long they look at a page, where the mouse cursor hovers, what they type, and more. Many companies design their systems so that they can harvest data exhaust and recycle it, to improve an existing service or to develop new ones. Google is the undisputed leader.
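To make the idea concrete, here is an illustrative sketch (the field names are hypothetical, not from the book) of the shape one such “data exhaust” event might take when a site logs it as a byproduct of browsing:

```python
# Illustrative only: one "data exhaust" record a site might emit as a
# byproduct of ordinary browsing (field names are hypothetical).
import json
import time

event = {
    "user": "anon-7f3a",        # pseudonymous visitor id
    "action": "hover",          # click, hover, scroll, keypress...
    "target": "/products/42",   # page or element involved
    "dwell_ms": 1840,           # how long the cursor lingered
    "ts": int(time.time()),     # when it happened
}
print(json.dumps(event))
```

Harvested at scale, millions of records like this are what get “recycled” to improve existing services or build new ones.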

As datafication continues, our data exhaust trails get larger: cameras and other sensors, carried by people and installed in the world around us.

Cisco’s Chief Futurist says shops’ CCTV will become the equivalent of web analytics, examining how shoppers make their choices and allowing shops to optimise their layouts and even their offers in real time…

As video pixel counts increase, retailers will use video surveillance to home in on shoppers with new levels of precision, determining demographic traits like age, sex, and more. In-store activities can also be monitored with video, including display effectiveness, customer traffic patterns, and aisle dwell time. All of this data can be assessed in real time to adjust store operations dynamically. For example, the number of open registers could be increased based on the number of shoppers in the store; heat maps will show which aisles attract the most traffic; and object detection can figure out which items shoppers are interacting with most.
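The heat maps mentioned in the quote are, at heart, simple aggregates over position samples. As a minimal sketch – assuming the video system can already yield (x, y) floor coordinates per shopper sighting, which is the hard part – footfall can be bucketed into grid cells like this:

```python
# Minimal sketch: turn shopper position samples into a footfall
# "heat map" by bucketing (x, y) coordinates into grid cells.
from collections import Counter

def heat_map(positions, cell_size=1.0):
    """Count sightings per grid cell; hotter cells = more traffic."""
    cells = Counter()
    for x, y in positions:
        cells[(int(x // cell_size), int(y // cell_size))] += 1
    return cells

# Illustrative samples in metres; three sightings near aisle cell (3, 0).
samples = [(0.2, 0.8), (0.9, 0.1), (3.4, 0.5), (3.7, 0.6), (3.1, 0.9)]
print(heat_map(samples).most_common(1))  # [((3, 0), 3)]
```

The same aggregation underlies the web-analytics analogy: swap floor cells for page regions and you have a mouse-hover heat map.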

This trend is at once exciting from a business and data strategy point of view and concerning from a personal point of view. How can we manage our web shadows when we aren’t even sure what data we are leaving behind us? 

Datafication

“Big data” as a term reminds me of “social media” a few years ago. It is in danger – through misuse and overuse – of losing its currency before many people fully understand its significance. And it is very, very significant indeed.

One of the books I’m reading – at a rapid pace, which is testament to its usefulness – is Big Data: A Revolution That Will Transform How We Live, Work and Think, by Kenneth Cukier, The Economist’s data editor, and Viktor Mayer-Schonberger of the Oxford Internet Institute.

One of the problems with the term “big data” is that it is doing too many jobs. Cukier and Mayer-Schonberger offer us a provisional term for the revolution in data that we are living through:

There’s no good term to describe what’s taking place now, but one that helps frame the changes is datafication, a concept that we introduce in Chapter Five. It refers to taking information about all things under the sun—including ones we never used to think of as information at all, such as a person’s location, the vibrations of an engine, or the stress on a bridge—and transforming it into a data format to make it quantified.

Awkward as it is, “datafication” works for me as a description (possibly simply because it isn’t “big data”).

And the definition of big data? Try these:

There is no rigorous definition of big data. Initially the idea was that the volume of information had grown so large that the quantity being examined no longer fit into the memory that computers use for processing, so engineers needed to revamp the tools they used for analyzing it all. That is the origin of new processing technologies like Google’s MapReduce and its open-source equivalent, Hadoop, which came out of Yahoo. These let one manage far larger quantities of data than before, and the data—importantly—need not be placed in tidy rows or classic database tables. Other data-crunching technologies that dispense with the rigid hierarchies and homogeneity of yore are also on the horizon.

Or

big data refers to things one can do at a large scale that cannot be done at a smaller one, to extract new insights or create new forms of value, in ways that change markets, organizations, the relationship between citizens and governments, and more.
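The first definition mentions MapReduce, whose core idea can be shown in miniature. This is a toy, single-process sketch of the map and reduce phases – real systems like Hadoop distribute these phases across many machines and shuffle the intermediate pairs between them:

```python
# Toy illustration of the MapReduce pattern: count words across
# documents via a map phase (emit pairs) and a reduce phase (combine).
from collections import defaultdict

def map_phase(documents):
    """Map: emit a (word, 1) pair for every word in every document."""
    for doc in documents:
        for word in doc.split():
            yield (word, 1)

def reduce_phase(pairs):
    """Reduce: sum the emitted counts for each distinct word."""
    totals = defaultdict(int)
    for word, count in pairs:
        totals[word] += count
    return dict(totals)

docs = ["big data big ideas", "data exhaust"]
counts = reduce_phase(map_phase(docs))
print(counts)  # {'big': 2, 'data': 2, 'ideas': 1, 'exhaust': 1}
```

The point of the pattern, as the quote notes, is that neither phase cares whether the input sits in tidy rows or classic database tables.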

Before you get too cynical, before your cortex starts rejecting any conversation, content or plan that includes “big data”, I urge you to read this book. It’s a great primer on the issues and opportunities that the era of big data presents us with.

It also quickly introduces some key concepts that are incredibly powerful – about the messiness of data, the switch from causes to correlation and other ideas. It has my brain fizzing in the same way that The Origin of Wealth and Linked did a few years ago about networks and complexity.

Crowdsensing: mobile data and predictive algorithms

In Pakistan, mobile data has helped the authorities predict where an epidemic will break out:

Researchers working for the Pakistani government developed an early epidemic detection system for their region that looked for telltale signs of a serious outbreak in data gathered by government employees searching for dengue larvae and confirmed cases reported from hospitals. If the system’s algorithms spotted an impending outbreak, government employees would then go to the region to clear mosquito breeding grounds and kill larvae. “Getting early epidemic predictions this year helped us to identify outbreaks early,” says Umar Saif, a computer scientist at the Lahore University of Management Sciences, and a recipient of MIT Technology Review’s Innovators Under 35 award in 2011.
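The article doesn’t detail the Pakistani system’s algorithms, but a minimal, hypothetical version of such an early-warning rule might simply flag a district when this week’s confirmed cases run well above the historical baseline:

```python
# Hedged sketch of a simple early-warning rule (not the actual system's
# algorithm): alert when this week's cases exceed the historical mean
# by more than a chosen number of standard deviations.
from statistics import mean, stdev

def outbreak_alert(history, this_week, sigmas=2.0):
    """Return True if this_week is anomalously high versus history."""
    mu, sd = mean(history), stdev(history)
    return this_week > mu + sigmas * sd

baseline = [4, 6, 5, 7, 5, 6]       # weekly confirmed cases, illustrative
print(outbreak_alert(baseline, 20))  # True  - flag for intervention
print(outbreak_alert(baseline, 7))   # False - within normal variation
```

A real system would fold in the larvae-survey data the passage mentions and adjust for seasonality, but the shape is the same: a statistical trigger that dispatches people to clear breeding grounds early.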

When we think about “mobility” and its potential in business and society, we shouldn’t limit ourselves to the desktop and app paradigm.