A very cool paper from Dave Hoffman et al. Here is the abstract:
This project empirically explores civil litigation from its inception by examining the content of civil complaints. We utilize spectral cluster analysis on a newly compiled federal district court dataset of causes of action in complaints to illustrate the relationship of legal claims to one another, the broader composition of lawsuits in trial courts, and the breadth of pleading in individual complaints. Our results shed light not only on the networks of legal theories in civil litigation but also on how lawsuits are classified and the strategies that plaintiffs and their attorneys employ when commencing litigation. This approach permits us to lay the foundations for a more precise and useful taxonomy of federal litigation than has been previously available, one that, after the Supreme Court’s recent decisions in Bell Atlantic v. Twombly (2007) and Ashcroft v. Iqbal (2009), has also arguably never been more relevant than it is today.
Dave discusses some of the data analysis in this post:
We gathered a set of 2,500 complaints (from a much larger sample of federal complaints derived through RECAP). The complaints were sampled to be fairly representative of all federal litigation, excluding pro se, social security, and prisoner petition cases. The sample contained 11,500 individual causes of action – around 4.6 causes of action per case. Guided by co-authors at Temple’s Center for Data Analytics, we used spectral clustering to examine the relationship between causes of action. Two years later and presto, we’ve a (draft) paper up on SSRN! The ungainly title is Building a Taxonomy of Litigation: Clusters of Causes of Action in Federal Complaints. I welcome your comments, and your suggestions for a better title. Follow me after the jump for an exploration of our findings.
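For readers who have not met spectral clustering before, here is a rough sketch of the general idea in Python. It is not the authors' pipeline. The seven toy complaints are invented, the function names `cooccurrence` and `spectral_bipartition` are hypothetical, and the sketch only does the simplest two-way split (the paper presumably works with many clusters and its own similarity measure). The idea: treat each cause of action as a node, link two causes whenever they are pleaded in the same complaint, and split the graph using the second eigenvector of its normalized Laplacian.

```python
from itertools import combinations
from math import sqrt

# Invented toy complaints (NOT from the paper's dataset): each is a list of causes of action.
complaints = [
    ["negligence", "product liability", "failure to warn"],
    ["negligence", "product liability"],
    ["product liability", "failure to warn"],
    ["breach of contract", "unjust enrichment", "fraud"],
    ["breach of contract", "unjust enrichment"],
    ["unjust enrichment", "fraud"],
    ["negligence", "breach of contract"],  # one complaint bridging the two groups
]


def cooccurrence(complaints):
    """Count how often each pair of causes of action appears in the same complaint."""
    labels = sorted({cause for complaint in complaints for cause in complaint})
    index = {label: i for i, label in enumerate(labels)}
    n = len(labels)
    W = [[0.0] * n for _ in range(n)]
    for complaint in complaints:
        for a, b in combinations(sorted(set(complaint)), 2):
            W[index[a]][index[b]] += 1.0
            W[index[b]][index[a]] += 1.0
    return labels, W


def spectral_bipartition(W, iterations=200):
    """Split nodes in two by the sign of the normalized Laplacian's second eigenvector.

    Assumes a connected graph in which every node has at least one link.
    Uses plain power iteration on I + D^-1/2 W D^-1/2 with the trivial
    top eigenvector projected out at every step.
    """
    n = len(W)
    deg = [sum(row) for row in W]
    inv_sqrt = [1.0 / sqrt(d) for d in deg]
    M = [[(1.0 if i == j else 0.0) + inv_sqrt[i] * W[i][j] * inv_sqrt[j]
          for j in range(n)] for i in range(n)]
    total = sqrt(sum(deg))
    top = [sqrt(d) / total for d in deg]  # trivial eigenvector, proportional to sqrt(degree)
    v = [float((i + 1) ** 2) for i in range(n)]  # arbitrary non-uniform starting vector
    for _ in range(iterations):
        dot = sum(a * b for a, b in zip(v, top))
        v = [a - dot * b for a, b in zip(v, top)]  # remove the trivial component
        v = [sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]
        length = sqrt(sum(a * a for a in v))
        v = [a / length for a in v]
    return [0 if x < 0 else 1 for x in v]


labels, W = cooccurrence(complaints)
sides = spectral_bipartition(W)
clusters = [sorted(l for l, s in zip(labels, sides) if s == k) for k in (0, 1)]
print(clusters)
```

On this toy data the split separates the tort-flavored claims from the contract-flavored ones, even though one complaint pleads both negligence and breach of contract. Real work would more likely use a library routine and more than two clusters; the hand-rolled power iteration is here only to keep the example dependency-free.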
The figure below lays out a basic descriptive picture of the types of causes of action in our data. As you can see, almost one in three causes of action in federal court sounds in tort. Contract claims are the second most common legal theory advanced. (For more on the details of coding, including a discussion of the troublesome “bare claims for relief,” you’ll have to read the paper.)