Oldfather, Bockhorst, and Dimmer on Automated Content Analysis of Judicial Opinions

September 8th, 2012

For some time, I have been researching and developing what I call “Assisted Decision Making” (see this presentation I gave at LawTechCamp London, and this article I authored about the future of FantasySCOTUS). Assisted Decision Making is a system in which technology can understand how courts resolve cases and offer advice to potential litigants.

I am very pleased to see that Chad Oldfather, along with Joseph Bockhorst and Brian Dimmer, have authored an article in the Florida Law Review that explores this area. Here is the abstract of Triangulating Judicial Responsiveness: Automated Content Analysis, Judicial Opinions, and the Methodology of Legal Scholarship:

The increasing availability of digital versions of court documents, coupled with increases in the power and sophistication of computational methods of textual analysis, promises to enable both the creation of new avenues of scholarly inquiry and the refinement of old ones. This Article advances that project in three respects. First, it examines the potential for automated content analysis to mitigate one of the methodological problems that afflicts both content analysis and traditional legal scholarship — their acceptance on faith of the proposition that judicial opinions accurately report information about the cases they resolve and courts’ decisional processes. Because automated methods can quickly process large amounts of text, they allow for assessment of the correspondence between opinions and other documents in the case, thereby providing a window into how closely opinions track the information provided by the litigants. Second, it explores one such novel measure — the responsiveness of opinions to briefs — in terms of its connection to both adjudicative theory and existing scholarship on the behavior of courts and judges. Finally, it reports our efforts to test the viability of automated methods for assessing responsiveness on a sample of briefs and opinions from the United States Court of Appeals for the First Circuit. Though we are focused primarily on validating our methodology, rather than on the results it generates, our initial investigation confirms that even basic approaches to automated content analysis provide useful information about responsiveness, and generates intriguing results that suggest avenues for further study.

Much of the article focuses on judicial responsiveness, that is, how courts react to the things that litigants do.

Using a set of briefs and opinions from the First Circuit, we have investigated two automated measures of judicial responsiveness, both of which avoid the practical difficulties associated with manually assessing responsiveness, both of which employ a notion of the similarity between briefs and opinions. The first involves assessing document similarity through analysis of textual content of briefs and opinions. The second utilizes a similar methodology applied to citations to authority; that is, we assessed the extent to which opinions cite to the same legal authorities as relied upon by the parties in their briefs. In order to test the validity of these measures, we also undertook the sort of full-scale assessment of a set of cases outlined in the preceding paragraph, reviewing the briefs and opinions in depth and coding them for responsiveness.
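
To give a concrete (if simplified) sense of the citation-based measure, here is a minimal sketch in Python. The citation pattern covers only a few federal reporters, and the Jaccard-style overlap score is my own illustrative choice, not the authors’ implementation:

```python
# Minimal sketch of a citation-overlap measure between a brief and an opinion.
# The regex and the Jaccard score are illustrative assumptions, not the
# authors' actual method.
import re

# Very rough pattern for a few federal reporter citations, e.g. "550 U.S. 544",
# "132 S. Ct. 2566", "645 F.3d 26". Real citation extraction needs a proper parser.
CITE_PATTERN = re.compile(r"\b\d+\s+(?:U\.S\.|S\. ?Ct\.|F\.[23]d)\s+\d+")

def extract_citations(text):
    # Return the set of reporter citations found in the text.
    return set(CITE_PATTERN.findall(text))

def citation_overlap(brief_text, opinion_text):
    # Share of cited authorities appearing in both documents
    # (intersection over union).
    brief_cites = extract_citations(brief_text)
    opinion_cites = extract_citations(opinion_text)
    union = brief_cites | opinion_cites
    return len(brief_cites & opinion_cites) / len(union) if union else 0.0
```

On this kind of measure, a higher score would suggest the opinion rests on the same authorities the parties briefed, while a score near zero would suggest the court reached its result on grounds the parties did not brief.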

More importantly, the article assesses how this information can benefit litigants.

Measures of judicial responsiveness are potentially valuable in at least four broad respects regardless of one’s preference for judicial passivity . . . . Third, this line of research might yield payoffs to advocates. To the extent that it becomes possible to know specifics about what triggers greater responsiveness—such as, for example, whether the filing of a reply brief has an effect—lawyers will be able to adjust their efforts accordingly. . . .

Finally, this line of research may yield insights that are useful to practicing lawyers, and to those who teach advocacy. One can imagine, for example, large-scale analysis of the relationships among briefs and opinions generating information about the relative utility of briefing practices and approaches. It may tell us something about whether reply briefs matter, or whether response briefs should place relatively greater emphasis on engaging with the opponent’s arguments or developing their own. It could also facilitate quantitative assessment of lawyering skills, such as enabling assessment of the relative quality of public defenders and private counsel in criminal appeals, or comparisons of specialists and non-specialists.

Also, the article reaches a conclusion that I think will rock most of empirical legal studies: that having computers code and predict cases is enormously more efficient and more accurate than having humans do so.

Our final aim, then, is to explore whether computational methods can overcome these barriers. Enlisting computers rather than humans to “read” and code opinions and other documents will enable researchers to analyze large amounts of information in short periods of time, and to do so with no need to worry about consistency from one reader to the next.

You should not fear your legal robotic overlords.

The article describes the methodologies here:

We investigated two types of automated approaches for quantification of responsiveness. These methods differ in the types of evidence considered. One approach uses the textual content of a case’s opinion and briefs. This method estimates responsiveness by the cosine similarity between opinion and brief documents. This widely used document-similarity measure has been successfully applied to document classification, information retrieval, and other natural language processing tasks. The second approach is based on citation patterns in the opinion and briefs. Both methods involve measuring various aspects of the overlap among the documents.
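
To make the text-based measure concrete, here is a rough sketch of a cosine-similarity score computed over simple term-frequency vectors. The tokenizer and the use of raw counts (rather than whatever weighting the authors applied) are simplifying assumptions of mine, not their pipeline:

```python
# Minimal sketch of a cosine-similarity responsiveness score between a brief
# and an opinion, using raw term counts; a simplification, not the authors'
# actual pipeline.
import math
import re
from collections import Counter

def term_vector(text):
    # Crude bag-of-words representation: lowercase word counts.
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine_similarity(vec_a, vec_b):
    # cos(a, b) = (a . b) / (|a| |b|); returns 0.0 if either document is empty.
    dot = sum(count * vec_b[term] for term, count in vec_a.items())
    norm_a = math.sqrt(sum(c * c for c in vec_a.values()))
    norm_b = math.sqrt(sum(c * c for c in vec_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Hypothetical usage with assumed file names:
# brief = open("appellant_brief.txt").read()
# opinion = open("opinion.txt").read()
# print(cosine_similarity(term_vector(brief), term_vector(opinion)))
```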

H/T Legal Informatics Blog