Tuesday, December 8, 2009

A Speculative Post on the Idea of Algorithmic Authority


Jack Balkin invited me to be on a panel yesterday at Yale's Information Society Project conference, Journalism & The New Media Ecology, and I used my remarks to observe that one of the things up for grabs in the current news environment is the nature of authority. In particular, I noted that people trust new classes of aggregators and filters, whether Google or Twitter or Wikipedia (in its 'breaking news' mode).

I called this tendency algorithmic authority. I hadn't used that phrase before yesterday, so it's not well worked out (and I didn't coin it — as Jeff Jarvis noted at the time, Google lists a hundred or so previous occurrences). There's a lot to be said on the subject, but as a placeholder for a well-worked-out post, I wanted to offer a rough and ready definition here.

As this is the first time I've written about this idea, this is a bit of a ramble. I'll take on authority briefly, then add the importance of algorithms.

Khotyn is a small town in Moldova. That is a piece of information about Eastern European geography, and one that could be right or could be wrong. You've probably never heard of Khotyn, so you have to decide if you're going to take my word for it. (The "it" you'd be taking my word for is your belief that Khotyn is a town in Moldova.)

Do you trust me? You don't have much to go on, and you'd probably fall back on social judgment — do other people vouch for my knowledge of European geography and my likelihood to tell the truth? Some of these social judgments might be informal — do other people seem to trust me? — while others might be formal — do I have certification from an institution that will vouch for my knowledge of Eastern Europe? These groups would in turn have to seem trustworthy for you to accept their judgment of me. (It's turtles all the way down.)

The social characteristic of deciding who to trust is a key feature of authority — were you to say "I have it on good authority that Khotyn is a town in Moldova", you'd be saying that you trust me to know and disclose that information accurately, not just because you trust me, but because some other group has vouched, formally or informally, for my trustworthiness.

This is a compressed telling, and swerves around many epistemological potholes, such as information that can't be evaluated independently ("I love you"), information that is correct by definition ("The American Psychiatric Association says there is a mental disorder called psychosis"), or authorities making untestable propositions ("God hates it when you eat shrimp.") Even accepting those limits, though, the assertion that Khotyn is in Moldova provides enough of an illustration here, because it's false. Khotyn is in Ukraine.

And this is where authority begins to work its magic. If you told someone who knew better about the Moldovan town of Khotyn, and they asked where you got that incorrect bit of information, you'd have to say "Some guy on the internet said so." See how silly you'd feel?

Now imagine answering that question "Well, Encyclopedia Britannica said so!" You wouldn't be any less wrong, but you'd feel less silly. (Britannica did indeed wrongly assert, for years, that Khotyn was in Moldova, one of a collection of mistakes discovered in 2005 by a boy in London.) Why would you feel less silly getting the same wrong information from Britannica than from me? Because Britannica is an authoritative source.

Authority thus performs a dual function: looking to authorities is a way of increasing the likelihood of being right, and of reducing the penalty for being wrong. An authoritative source isn't just a source you trust; it's a source you and other members of your reference group trust together. This is the non-lawyer's version of "due diligence"; it's impossible to be right all the time, but it's much better to be wrong on good authority than otherwise, because if you're wrong on good authority, it's not your fault.

(As an aside, the existence of sources everyone accepts can be quite pernicious — in the US, the ratings agencies Moody's, Standard & Poor's, and Fitch did more than any other group of institutions to bring the global financial system to the brink of ruin, by debauching their assertions to investors about the riskiness of synthetic assets. Those investors accepted the judgment of the ratings agencies because everyone else did too. Like everything social, this is not a problem with a solution, just a dilemma with various equilibrium states, each of which in turn has characteristic disadvantages.)

Algorithmic authority is the decision to regard as authoritative an unmanaged process of extracting value from diverse, untrustworthy sources, without any human standing beside the result saying "Trust this because you trust me." This model of authority differs from personal or institutional authority, and has, I think, three critical characteristics.

First, it takes in material from multiple sources, which sources themselves are not universally vetted for their trustworthiness, and it combines those sources in a way that doesn't rely on any human manager to sign off on the results before they are published. This is how Google's PageRank algorithm works; it's how Twitscoop's zeitgeist measurement works; it's how Wikipedia's post hoc peer review works. At this point, it's just an information tool.
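To make that first characteristic concrete, here is a minimal, purely illustrative sketch of PageRank-style aggregation: a toy power iteration over a made-up link graph, not Google's actual implementation. The function name, damping value, and example graph are all assumptions for illustration.

```python
# Toy PageRank via power iteration. Every page's authority is computed
# from the raw link graph; no individual source is vetted, and no human
# signs off on the ranking before it is "published."

def pagerank(links, damping=0.85, iterations=50):
    """links: dict mapping each page to a list of pages it links to."""
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # start from uniform trust

    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outlinks in links.items():
            if not outlinks:
                # Dangling page: redistribute its rank evenly.
                for p in pages:
                    new_rank[p] += damping * rank[page] / n
            else:
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
        rank = new_rank
    return rank

# A made-up web of four pages; none is individually trustworthy,
# but a ranking emerges from how they refer to one another.
toy_web = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
print(sorted(pagerank(toy_web).items(), key=lambda kv: -kv[1]))
```

The point of the sketch is only that the output is produced with no editor at the penultimate step; trust, if it comes, comes later.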

Second, it produces good results, and as a consequence people come to trust it. At this point, it's become a valuable information tool, but not yet anything more.

The third characteristic is when people become aware not just of their own trust but of the trust of others: "I use Wikipedia all the time, and other members of my group do as well." Once everyone in the group has this realization, checking Wikipedia is tantamount to answering the kinds of questions Wikipedia purports to answer, for that group. This is the transition to algorithmic authority.

As the philosopher John Searle describes social facts, they rely on the formulation X counts as Y in C — in this case, Wikipedia comes to count as an acceptable source of answers for a particular group.

There's a spectrum of authority from "Good enough to settle a bar bet" to "Evidence to include in a dissertation defense", and most uses of algorithmic authority right now cluster around the inebriated end of that spectrum. But the important thing is that it is a spectrum, that algorithmic authority is on it, and that current forces seem set to push it further up the spectrum for an increasing number and variety of groups that regard these kinds of sources as authoritative.

There are people horrified by this prospect, but the criticism that Wikipedia, say, is not an "authoritative source" is an attempt to end the debate by hiding the fact that authority is a social agreement, not a culturally independent fact. Authority is as authority does.

It's also worth noting that algorithmic authority isn't tied to digital data or even late-model information tools. Wikileaks, Citizendium, and Apache all use human vetting by actors prized for their expertise as a key part of the process. What seems important is that the decision to trust Google search, say, can't be explained as a simple extension of previous models. (Whereas the old Yahoo directory model was, specifically, an institutional model, and one that failed at scale.)

As more people come to realize that they not only look to unsupervised processes for answers to certain questions, but that their friends do as well, those groups will come to treat those resources as authoritative. Which means that, for those groups, they will be authoritative, since there's no root authority to construct from. (I lied before. It's not turtles all the way down; it's a network of inter-referential turtles.)

Now there are boundary problems with this definition, of course; we trust spreadsheet tools to handle large data sets we can't inspect by eye, and we trust scientific results in part because of the scientific method. Also, although Wikipedia doesn't ask you to trust particular contributors, it is not algorithmic in the same way PageRank is. As a result, the name may eventually be replaced by something better.

But the core of the idea is this: algorithmic authority handles the "Garbage In, Garbage Out" problem by accepting the garbage as an input, rather than trying to clean the data first; it provides the output to the end user without any human supervisor checking it at the penultimate step; and these processes are eroding the previous institutional monopoly on the kind of authority we are used to in a number of public spheres, including the sphere of news.
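As a toy illustration of that "accepting the garbage as an input" move, consider aggregating conflicting claims from unvetted sources and publishing the consensus with no editor in the loop. This is a minimal sketch over invented data; the source names and claims are hypothetical, and real systems weight sources far more carefully than a straight vote.

```python
from collections import Counter

# Toy "garbage in, consensus out" aggregator: every claim from every
# unvetted source is accepted as input, and the most common answer is
# published directly, with no human checking it at the penultimate step.
# All sources and claims below are hypothetical.

claims = [
    ("site-1", "Khotyn", "Ukraine"),
    ("site-2", "Khotyn", "Moldova"),  # wrong, but still accepted as input
    ("site-3", "Khotyn", "Ukraine"),
    ("site-4", "Khotyn", "Ukraine"),
]

def consensus(claims, subject):
    votes = Counter(answer for _source, subj, answer in claims if subj == subject)
    answer, _count = votes.most_common(1)[0]
    return answer  # published as-is; no editor signs off

print(consensus(claims, "Khotyn"))  # -> Ukraine
```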


[link to original | source: Clay Shirky]

