September 25, 2020

GPT-3 Raises Complex Questions for Philosophy and Policy

GPT-3, a powerful, 175-billion-parameter language model developed recently by OpenAI, has been galvanizing public debate and controversy. As the MIT Technology Review puts it: “OpenAI’s new language generator GPT-3 is shockingly good—and completely mindless”. Parts of the technology community hope (and fear) that GPT-3 could bring us one step closer to human-like, highly sophisticated artificial general intelligence (AGI), which remains hypothetical. Meanwhile, others (including OpenAI’s own CEO) have critiqued claims about GPT-3’s ostensible proximity to AGI, arguing that they are vastly overstated.

            Why the hype? GPT-3 is unlike other natural language processing (NLP) systems, which often struggle with what comes comparatively easily to humans: performing entirely new language tasks based on a few simple instructions and examples. Instead, NLP systems usually have to be pre-trained on a large corpus of text, and then fine-tuned in order to successfully perform a specific task. GPT-3, by contrast, does not require fine-tuning of this kind: it seems to be able to perform a whole range of tasks reasonably well, from producing fiction, poetry, and press releases to functioning code, and from music, jokes, and technical manuals to “news articles which human evaluators have difficulty distinguishing from articles written by humans”.

            GPT-3 raises a number of deep questions, which tie into long-standing debates in various subfields of philosophy (from epistemology and the philosophy of mind to aesthetics, and from moral, social, and political philosophy to the philosophy of language). In a recently published discussion symposium, nine philosophers (Amanda Askell, David Chalmers, Justin Khoo, Carlos Montemayor, C. Thi Nguyen, Regina Rini, Henry Shevlin, Shannon Vallor, and myself) explore the philosophical and policy implications of GPT-3.

            As I argue in my essay “If You Can Do Things with Words, You Can Do Things with Algorithms”, GPT-3 is indeed ‘shockingly good’ at performing some tasks, “but on the other hand, GPT-3 is predictably bad in at least one sense: like other forms of AI and machine learning, it reflects patterns of historical bias and inequity. GPT-3 has been trained on us—on a lot of things that we have said and written—and ends up reproducing just that, racial and gender bias included. OpenAI acknowledges this in their own paper on GPT-3, where they contrast the biased words GPT-3 used most frequently to describe men and women, following prompts like ‘He was very…’ and ‘She would be described as…’. The results aren’t great. For men? Lazy. Large. Fantastic. Eccentric. Stable. Protect. Survive. For women? Bubbly, naughty, easy-going, petite, pregnant, gorgeous. This is not purely a tangibly material distributive justice concern: especially in the context of language models like GPT-3, paying attention to other facets of injustice—relational, communicative, representational, ontological—is essential.” As important earlier work on NLP tools has shown, notably that of Aylin Caliskan, Joanna Bryson, and Arvind Narayanan, social norms and practices affect the ways in which linguistic concepts underpinning these tools are defined and operationalized.

            This problem space has important implications for policy-making in this area. I argue that “our aim should be to engineer conceptual categories that mitigate conditions of injustice rather than entrenching them further. We need to deliberate and argue about which social practices and structures—including linguistic ones—are morally and politically valuable before we automate and thereby accelerate them.”

            Relatedly, GPT-3 and similar tools open up regulatory and policy challenges with respect to enabling free speech and informed political discourse, given that language generation tools can facilitate online misinformation at a massive scale. As philosopher of language Justin Khoo points out, “the marketplace [of ideas] is not well-functioning if bots are used to carry out large-scale misinformation campaigns thus resulting in sincere voices being excluded from engaging in the discussion. Furthermore, the use of bots to conduct such campaigns is not relevantly different from spending large amounts of money to spread misinformation via political advertisements. If, as the most ardent defenders of free speech would have it, our aim is to secure a well-functioning marketplace of ideas, then bot-speak and spending on political advertisements ought to be regulated.”

            Ultimately, productive policy-making around GPT-3 and related tools will require a clear-sighted assessment of their abilities and limitations. Philosopher Regina Rini subjects the hype around GPT-3 to critical scrutiny: “GPT-3 is not a mind, but it is also not entirely a machine. It’s something else: a statistically abstracted representation of the contents of millions of minds, as expressed in their writing. Its prose spurts from an inductive funnel that takes in vast quantities of human internet chatter: Reddit posts, Wikipedia articles, news stories. When GPT-3 speaks, it is only us speaking, a refracted parsing of the likeliest semantic paths trodden by human expression.”

            Indeed, as philosopher of consciousness David Chalmers argues: “GPT-3 does not look much like an agent. It does not seem to have goals or preferences beyond completing text, for example. It is more like a chameleon that can take the shape of many different agents. […] The big question is understanding. […] Can a disembodied purely verbal system truly be said to understand? Can it really understand happiness and anger just by making statistical connections? Or is it just making connections among symbols that it does not understand? I suspect GPT-3 and its successors will force us to fragment and re-engineer our concepts of understanding to answer these questions.”

            On this point, philosopher of technology and ethicist Shannon Vallor argues that “understanding is beyond GPT-3’s reach because understanding cannot occur in an isolated behavior, no matter how clever. Understanding is not an act but […] a lifelong social labor. […] This labor does something, without which intelligence fails, in precisely the ways that GPT-3 fails to be intelligent—as will its next, more powerful version. For understanding does more than allow an intelligent agent to skillfully surf, from moment to moment, the causal and associative connections that hold a world of physical, social, and moral meaning together. Understanding tells the agent how to weld new connections that will hold, bearing the weight of the intentions and goals behind our behavior. Predictive and generative models, like GPT-3, cannot accomplish this.”

            Read the full set of philosophical essays on GPT-3 here.