Denaturalizing “The New Oil”: The Ethical Questions of Big Data

June 15, 2017 • By Nick Couldry and Jun Yu

Our last post showed how the general discourses from the world’s leading institutions (e.g. World Economic Forum, OECD, McKinsey) work to legitimate and naturalize the processes of ‘datafication’, as well as the continuous automated extraction of personal data for new economic value that these processes involve. These institutions’ normative vision regards data as a naturally given entity, like oil, that should be used well to our common benefit. As such, datafication is seen as an indispensable feature of the economy and society in the 21st century. What this vision bypasses, however, is any discussion of the ‘price’ involved: privacy, human freedom, and autonomy.

To obtain a deeper understanding of the potential contradiction between the conditions of datafication under which we live and our innate human freedom, we have in the past 12 months continued exploring how those general discourses are qualified, reproduced, and deepened further in two sectors where the promise of data is intensively applied: Health and Education.

Within the health sector, health data is seen both as a critical form of personal information and as a contributor to common goods like the cure of disease. This status provides the health sector’s large-scale financial and business infrastructure with special reasons to override individuals’ concerns about their privacy.

Multiple layers of discourse seek to justify the collection and processing of personal data under the label of “enhancing the quality of human life.” The marketing language (adverts, general puff, reports, etc.) of influential social fitness platforms and devices such as Endomondo, Fitbit, and PatientsLikeMe is one such example: expanded forms of data collection are licensed for these platforms, and their commercial use of collected data is authorized by claims that this collection enhances the service provided and the care offered to individual data subjects.

Such justification of commercial access to data arguably took its most radical form when adopted by genetic data companies like 23andMe, which pressured the public to rethink the nature and rights of individual subjects by drawing on the human race’s genetic commonality to anchor a vision of a new ‘community’ based on exploring the small genetic differences that remain between humans. In this light, data sharing and processing in the health sector becomes a means to make ‘common life’ better. The health industry is concomitantly reorganized, with the process of datafication (and its implementers) as its new center. Individuals are re-positioned as data subjects, assigned an ethical responsibility to contribute to the common life. The potential ‘price’ for the individual, however, remains under-discussed.

In the health domain, strict principles of data confidentiality are heavily ingrained as ethical principles. For this reason, the health sector not only assumes the rightness of data collection and sharing, but also continually defends it by raising the possibility of robust de-identification or anonymization of personal data.

Conversely, in the education sector, such strict ethical principles and force of confidentiality are less heavily ingrained, and the logic of automated data collection thus seems to operate with fewer constraints. Certainly, this is not to suggest that the education domain completely ignores privacy concerns, or is inattentive to individual confidentiality. But the starting-point for datafication in the education sector is very different from that of the health sector—not least because, as education scholar Ben Williamson has recently argued, schools have always been institutions characterized by high levels of (at least intermittent) surveillance of young people.

Pedagogic monitoring is not, therefore, a new idea in the education sector—and perhaps it is this distinctive starting-point that has enabled the recent major intensifications of dataveillance in the education sector. This case study, rather than pointing to a distinctive characteristic of educational data, offers clear evidence of an emerging general model of continuous automated surveillance as the basis for knowledge production and identity formation in social life. In the course of this, the nature and goal of education is, we argue, transformed.

As with health, new educational actors are purveying the infrastructures of surveillance and data-processing with a view to facilitating ‘personalized learning’ as a positive alternative to the educational model of the pre-datafication era, now characterized as an irresponsible ‘factory-model’ in which all children had to receive the same content in the same manner. Since personalized learning cannot exist without continuous surveillance, any consideration of its possibly corrosive effects on the education process is almost entirely missing. Privacy, however, is not totally forgotten! On the contrary, it is even proposed that surveillance can constitute an essential precondition for digital citizenship, because it provides “opportunities for mentorship, teaching, and learning” (Impero Education).

In contrast to the health sector, where the collection of personal data must at least be constantly defended, education’s increasing reliance on surveillance ‘in real time’ is coming to be considered inherent to the educational process itself. As a result, the striking panoptic possibilities of ubiquitous commercial access to personal educational data are neither disguised nor seen as threatening or chilling—but are rather seen as part of a virtuous circle of knowledge production, impossible before datafication.

Many education companies, such as Pearson and Blackboard, explicitly claim personalized learning as a ‘better’ education than the classroom model: better suited, that is, to the skills students require for the job market, but based on continuous surveillance of the entire process whereby the individual student thinks, learns, and grows. While few would oppose the need for measurement in education, such continuous tracking and measurement re-purposes education by focusing on a different sort of individual ‘development’, and a different relation between teacher and taught. As Neil Selwyn noted, the role of teachers may come to resemble more that of an assistant or moderator who sees pupils primarily through data rather than through the mutual experience of interaction in the classroom. In the process, the free space where students develop as people and emerge into responsible, educated citizens has installed within it a continuous process of external surveillance, ensuring continuous measurement and measured adaptation. This has major consequences for the sort of freedom that education has until now been assumed to enhance.

These discourses from general institutions, and from the health and education sectors, together potentially undermine individual freedom and autonomy by treating the continuous automated collection and processing of personal data as a natural and legitimate process that bypasses ethical scrutiny while constituting a basic condition of contemporary life. To challenge such deep naturalization of datafication, we have reviewed two sorts of critical resource: legislative changes in Europe and legal debates in the USA.

The first, widely known as the General Data Protection Regulation (GDPR), will take effect in 2018. This law stems from a distinctive European approach grounded in the normative principle of the right to the ‘free development of the personality’. This right, which has its origins in the German Constitution, is designed to protect the value and dignity of the person based on free self-determination by emphasizing the individual’s capability to control the data and information produced about themselves as a necessary precondition for that individual’s capacity to enjoy a truly ‘free’ life. The aim is to create a ‘free’ sphere in which the individual can feel safe from any external inferences. A recent report by the Dutch government summed it up: “Freedom presupposes distance – a certain amount of social space between the individual and others, including supervising bodies”.

The GDPR regulates not only data collection but also data processing, with an aim to give control (partially) back to the individual. While this aim is pursued through a range of individual rights, the most famous is “the right to be forgotten,” which allows the individual to request the deletion of their digital traces, provided that there are no legitimate grounds for retaining them (e.g. required for public benefit), regardless of this data’s continuing potential value to its controller – be it a governmental or commercial institution. Although sometimes mistaken for a restriction of the freedom of the press, this right instead challenges the idea that data collection and processing is somehow a natural part of human life by preventing corporate purposes (competitive market advantage and concomitant value-generation) from overriding individual freedom.

Another critical resource for challenging datafication’s impacts on the quality of life is the growing debate among legal scholars in the USA, where the negative idea of privacy as a ‘right to be let alone’ has become more prominent. Recently, there has been a move toward making connections between the responsible collection and use of data and a sustainable quality of life, a development that challenges the idea of collected personal data as ‘existing naturally’ for corporate and state use. Notable authors here include Julie Cohen, Philip Agre and Marc Rotenberg, and Neil Richards, all of whom have re-interpreted the right to privacy as involving the protection of the processes of constructing one’s identities or generating ideas from ‘surveillance or interference’. The need for such protection is best encapsulated in Cohen’s argument in her book Configuring the Networked Self: that privacy is a ‘breathing room’ that allows the individual to self-develop and to enjoy an ethical life, which points to the importance of living free of surveillance and monitoring if one is to become a healthy citizen.

However, the availability of these critical resources does not mean that there are no limits to the GDPR’s comprehensive legislative scheme and to the legal debates in the USA. For example, we still need to ask questions like: To what extent will these resources deal with the expansion of surveillance-based data extraction and processing identified in the education sector? Will children be allowed to remove their entire educational history once they have become adults? Will a datafied education sector based on new business models allow this?

These issues remain uncertain. But we hope our project will challenge the prevailing trend of naturalizing data collection, a trend that makes it hard to see the potentially disturbing consequences of personal data processing for human life and its possibilities for autonomy and freedom.