Jan 24, 2013


Predictive Coding Identifies 80% of Relevant Documents

Interesting developments from Loudoun County, Virginia.

ABA Journal reports on a WSJ Blog Post ($$$):

The e-discovery process got under way when lawyers coded a sample of 5,000 documents out of 1.3 million as either relevant or irrelevant. The information was then used to develop algorithms for a computer search of the remaining documents. The program turned up about 173,000 documents deemed relevant.

To see how well the computer program worked, the lawyers checked a sample of about 400 documents deemed relevant by the computer program. About 80 percent were indeed relevant. The lawyers then checked a sample of the documents deemed irrelevant. About 2.9 percent were possibly relevant. The statistics mean that about 81 percent of all relevant documents were found.
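The 81 percent figure can be roughly reproduced from the numbers quoted above. Here is a minimal sketch, assuming the sampled rates (80 percent and 2.9 percent) hold across the full piles and ignoring the 5,000-document training sample; the function and parameter names are my own, not anything from the case:

```python
def estimated_recall(total_docs, flagged_docs, precision, miss_rate):
    """Estimate recall from a sampled precision and a sampled miss rate."""
    # Relevant documents the program actually found (true positives).
    found = flagged_docs * precision
    # Relevant documents left in the "irrelevant" pile (false negatives).
    missed = (total_docs - flagged_docs) * miss_rate
    # Recall: share of all relevant documents that were found.
    return found / (found + missed)


recall = estimated_recall(
    total_docs=1_300_000,   # full collection
    flagged_docs=173_000,   # deemed relevant by the program
    precision=0.80,         # from the ~400-document sample
    miss_rate=0.029,        # from the sample of the "irrelevant" pile
)
print(f"Estimated recall: {recall:.1%}")
```

That works out to roughly 138,400 relevant documents found against about 32,700 missed, which is where the "about 81 percent" comes from.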

By comparison, human reviewers generally identify about 60% of relevant documents.

  • http://www.facebook.com/profile.php?id=1344654797 Christopher Baker

    That is pretty awesome. This should save some time, money, and effort. Unfortunately, someone's probably still going to go through those irrelevant ones just in case one of that 2.9% has the gemstone in it (at least I would think they would).
