Predictive coding - what is it and how is it being used?


Predictive coding is a method of technology-assisted review (TAR) used to assess the relevance of high volumes of documents for the purposes of electronic disclosure (e-disclosure).


How does predictive coding work?


Predictive coding uses a combination of keyword search and iterative computer learning to rank the relevance of each individual document.

It starts with a human reviewer (such as a lawyer) who understands how to grade the relevance of a document for the purposes of disclosure. The reviewer searches for relevant documents in an initial ‘seed set’ and grades these manually, giving each a relevance mark (eg on a scale of 1 to 10). The software analyses this manual grading to refine its own relevance model. The process is repeated several times to improve accuracy.

Once the human operator has ‘trained’ the AI, the predictive coding system can be left to trawl through the bulk of documents automatically, with just the occasional verification for accuracy by lawyers. In this way, thousands of documents can be graded for relevance (for purposes of disclosure) in a fraction of the time it would take a team of paralegals.
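
The train-then-rank cycle described above can be illustrated with a short Python sketch. It is a deliberately simplified toy, not how any commercial predictive coding product works: the function names (train, score, rank, review_round), the word-averaging model, the neutral default of 5.5 and the choice to send the least certain documents back for human grading are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

NEUTRAL = 5.5  # assumed midpoint of the 1-10 relevance scale


def tokenize(text):
    """Lower-case the text and keep purely alphabetic words."""
    return [w for w in text.lower().split() if w.isalpha()]


def train(seed_set):
    """Learn a weight per word: the mean grade of the seed documents containing it.

    seed_set is a list of (text, grade) pairs graded by a human reviewer.
    """
    totals = defaultdict(float)
    counts = Counter()
    for text, grade in seed_set:
        for word in set(tokenize(text)):
            totals[word] += grade
            counts[word] += 1
    return {word: totals[word] / counts[word] for word in totals}


def score(weights, text):
    """Predict a relevance grade as the mean weight of the known words."""
    known = [weights[w] for w in set(tokenize(text)) if w in weights]
    return sum(known) / len(known) if known else NEUTRAL


def rank(weights, documents):
    """Order documents from most to least relevant according to the model."""
    return sorted(documents, key=lambda d: score(weights, d), reverse=True)


def review_round(seed_set, unreviewed, grade_fn, batch_size=1):
    """One training iteration: a human grades the documents the model is
    least sure about, and those grades are added to the seed set."""
    weights = train(seed_set)
    batch = sorted(unreviewed, key=lambda d: abs(score(weights, d) - NEUTRAL))[:batch_size]
    new_seed = seed_set + [(d, grade_fn(d)) for d in batch]
    remaining = [d for d in unreviewed if d not in batch]
    return new_seed, remaining


# Hypothetical seed set graded by a lawyer on a 1-10 scale.
seed = [
    ("invoice for disputed shipment contract", 9),
    ("lunch menu for friday", 1),
    ("contract amendment signed", 8),
    ("office party friday", 2),
]
docs = ["friday lunch plans", "shipment contract dispute", "quarterly newsletter"]

weights = train(seed)
for doc in rank(weights, docs):
    print(f"{score(weights, doc):.2f}  {doc}")
```

In this sketch, calling review_round with a function that stands in for the lawyer's judgment adds newly graded documents to the seed set, so each repetition retrains the model on more human decisions before the bulk of the documents is ranked automatically.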


Pros and cons of predictive coding


The primary benefits of predictive coding are speed and overall cost. Traditionally, firms have employed teams of paralegals to sift through vast swathes of documents for the purposes of disclosure in large litigation cases. This can take weeks or months and is often extremely expensive. The digital age, with all its trails of electronic information, has created far more work for e-disclosure teams, increasing the overall time and cost. Because it is automated, a predictive coding system can reduce the time spent on e-disclosure; the software can trawl through documents day and night without taking any rest breaks. And although the licensing costs of the software can be significant, it will generally be cheaper than employing a team of paralegals, potentially levelling the playing field between litigants. Furthermore, it can (theoretically) improve accuracy, identifying relevant documents with more consistency than a human.

But predictive coding software is still relatively new and it does not always work flawlessly. Lawyers still need to spend time training the system and checking the results. If technical problems arise, this can prove very costly for a firm and may even damage its reputation. Adopting it therefore carries more risk than manual review and, since the software does not yet have a high volume of users, the pricing is steep.


Predictive coding in the courts


Predictive coding has been used in the US for many years but took longer to become established in the UK. In 2016, however, the Chancery Division of the High Court sanctioned the use of predictive coding in the case of Pyrrho Investments v MWB Property [2016] EWHC 256 (Ch), laying the groundwork for adoption of the technology in England and Wales.

A few months later, in the case of Brown v BCA Trading and others [2016] EWHC 1464 (Ch), the High Court went further and ordered the use of predictive coding on the basis that it would reduce costs. In the event, BLP, which acted for BCA and had pushed for the use of predictive coding (in the face of opposition from the other side), won the case. Commenting, Oliver Glynn-Jones, partner at BLP at the time (now partner at Goodwin), said: “This is a case where the documents were key and pivotal to the judgment – and that’s what came out of the predictive coding exercise.”


About the author:
Alex Heshmaty is a legal copywriter and journalist with a particular interest in legal technology. He runs Legal Words, a legal copywriting and marketing agency.