Deriving from the work of R. L. Ackoff[i] in 1988, a framework has emerged for distinguishing data, information, understanding, knowledge and wisdom. Interpreting Ackoff, the differences between these levels of description are that data is merely symbolic; information is data arranged so as to gain some kind of meaning; understanding is an appreciation of meaningful information; wisdom is the artful deployment of understanding.
The issue pursued here is that data cannot actually be said to refer to anything in particular until these interpretive secondary uses occur. In other words, data is essentially transformational until operated upon. Ethical issues arise here, as the use of data without consent and out of context can lead to misinterpretations (relative to the intentions, if any, of those who created the data). The translation of data into information is an area of great import and must be examined. The further move from information to knowledge is another problematic step in need of scrutiny.
Data collection is a commonplace, certainly in recent years. It is expected by mobile phone users, although sometimes frowned upon, that their web searches, location, messaging activity, and so on, are collected. This is generally taken to be ‘OK’ because the collected data are thought to go to improving user experience, such as optimising, or personalising, location services and advertising preferences. The data may also be assumed to be depersonalised, or of such little value that the role it plays has only very minor effects. This is not an entirely unproblematic picture, however. And it should always be borne in mind that acceptability of a practice is not necessarily itself an indication of the legitimacy of that practice.
The idea of targeted advertising is a useful starting point for some thoughts here. Between them, Facebook and Google control at least 60% of online advertising. These are also two of the most prolific data gatherers. From the material they gather, profiles are constructed of types of users. This is one mechanism for the kind of ad targeting they do. Through clicks, likes, search terms, locations, spending, and journey habits a profile is constructed that correlates with various market, social, and other variables.
At some level of abstraction, this correlation becomes practically useful to someone with products or services to sell. These people then pay for access to the profiles in order that their wares will be presented to those most likely to buy them. Facebook, for instance, takes payment and positions their offerings in the profile context of their model buyer. Hence that advert in your news feed for premium, comedy-sloganed tea towels.
Maybe this seems reasonable, and an efficient way to do business. But there is an interesting change in principle at work that shifts a preconceived notion of buying and selling.
A company doesn’t need to profile a customer anymore for marketing: they will identify data-driven ‘needs’ in the market and then develop ‘solutions’ that the market will consume. The buyer/seller split has perhaps broken down, with a diagnostic/patch approach in the ascendancy instead. This is a generalisation of the notion of consumer and market. The politics of this are implicit, which is potentially very dangerous: it occurs in an influential, barely escapable public forum, but is not itself regulated by formal public discourse, such as democratic scrutiny.
In the model based on data, sellers approach intermediaries to discover types of buyer who ought to be interested in their stuff. This is not a normal supply/demand model. The seller is encouraged in the revised model to market to the profile, to know the type. And in the transaction both buyer and seller are clients of the data holder. This provides the data holder with a powerful position as they are able to shift both seller and buyer preferences through use of the profiles. This amounts to a political power, beyond marketing.
Political power here emerges from the ability wielded to shift social preferences, and to affect the representation of the same. It must be borne in mind that ‘marketing data’ is used not just by sellers of goods, but by political parties, charities, universities, newspapers, and others who wish to capture something of the zeitgeist at any point in time. Information based on the widespread collection of data is assumed to represent that zeitgeist better than other sources can. But the interpretation of data into information is a crucially overlooked step.
The data collected on the large scale here being discussed stands for nothing until interpreted. Interpretations must be subject to examination in order to understand what information the data can subserve. Data collection poses problems for users in the sense that these users’ every input leaves traces. These traces are often constituents of very large data sets and are typically used in ways not intended by the initial user who made them. Users are often unaware that they are leaving traces and do not invite interpretations of their actions. In the Ackoff scheme above, meaning emerges on interpretation of data, but what does it mean here to use these kinds of sources?
A snapshot of social preference based in this kind of data represents not ‘the people’, but aggregates of general behaviours interpreted through a rationale of some kind. The snapshot acquired like this represents a virtualisation of a possible zeitgeist, based in the generalised collection of data points acquired across multiple platforms, in no reproducible way. Where marketing, political decision, or resource allocation is based on such a snapshot, the outcome is deeply questionable.
Especially in a political context, this kind of scenario foreshortens a notion of ‘government by consent’, as it seems to risk placing the people governed and the governing in the position of the buyer and seller above: each a client of a third party. The third party is in fact the same as in the market scenario, but now the ability to shift preferences becomes quite overtly sinister. It isn’t tea towels at stake anymore, but fundamental expressions of organised mutual interaction, and generally how we ought to get along.
The notion of opting in to data collection has become commonplace, mediated especially in consumer markets and via social media platforms, but the vision of oneself as an aggregate of data points has not been embraced overtly. Perhaps there are contexts in which this is changing, such as the case of the Fitbit and similar technologies that process and present oneself and others as data aggregates. Perhaps this could have a normalising effect over time.
What isn’t clear so far is that there is a widespread understanding of the potential problems surrounding the kinds of tacit interpretations of large data that are becoming routine. These have consequences in important domains that deal with what it means to be who we think we are.
In the emerging context of the Internet of Things, issues such as those mentioned here become more pressing. Not only activities such as browsing the web, texting, and shopping could be subject to data collection, but passivities too: private moments catalogued as such. Absences, as well as presences somewhere, might stand as ‘data’ on someone.
The question of who interprets data and for what purpose will become more pressing as data technologies continue to advance. At stake might be who you are, and whether the wisdom exists to use or discard what the data tells you.
[i] Cf. Rowley, Jennifer, ‘The wisdom hierarchy: representations of the DIKW hierarchy’, Journal of Information Science, April 2007, vol. 33, no. 2, pp. 163-180.