In Pyrrho, the High Court approved the use of “predictive coding” (to facilitate the technology- or computer-assisted review of documents) in the disclosure process.
While judicial approval was not actually required – and Pyrrho is not the first case in which the technology has been used for e-disclosure in England and Wales – the judgment will serve as a useful guide and precedent to litigants
considering the use of this technology in future cases.
But what exactly is predictive coding and why did the Court approve its use?
Most e-disclosure is currently conducted by searching digital documents for keywords and then manually reviewing those that contain them. This is inherently time-intensive (and hence expensive).
A particular challenge of this approach is striking a balance between excluding potentially relevant documents (because the keywords are too few or too narrow) and wasting time and/or money reviewing too many irrelevant documents (because the keywords are too many or too wide).
Surfacing potentially relevant documents based on keywords results in a binary output. Either the document contains one or more keywords – or it does not. Actual relevance is not determined until the resultant batch of documents has been manually reviewed.
By contrast, predictive coding seeks to determine the likely relevance of each document, thus automating much, if not most, of the review process itself.
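The binary nature of keyword filtering, and the trade-off between keywords that are too narrow and too wide, can be illustrated with a short sketch. The documents, keyword lists and the `keyword_hit` helper below are invented for illustration only and do not reflect any particular e-disclosure platform.

```python
import re

def keyword_hit(text, keywords):
    """Return True if any keyword appears as a whole word in the text (case-insensitive)."""
    words = set(re.findall(r"[a-z0-9']+", text.lower()))
    return any(k.lower() in words for k in keywords)

# Hypothetical mini document set (illustrative only).
documents = {
    "doc1": "Invoice for the consulting contract signed in March",
    "doc2": "Lunch menu for the office party",
    "doc3": "Agreement terms were breached by the supplier",
}

narrow = ["contract"]                    # too narrow: misses doc3
wide = ["contract", "agreement", "for"]  # too wide: also pulls in doc2

# Each document is simply in or out; nothing here measures how relevant it is.
narrow_hits = [d for d, t in documents.items() if keyword_hit(t, narrow)]
wide_hits = [d for d, t in documents.items() if keyword_hit(t, wide)]

print("narrow:", narrow_hits)
print("wide:", wide_hits)
```

In both cases every hit still has to be read by a reviewer before anyone knows whether it is actually relevant.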
The central part of the process is known as “machine learning” and essentially involves three steps:
1. A batch of documents is selected to form a “seed set”. The parties are free to agree selection criteria (which may include some use of keywords as identifiers) but the process would work equally well with a randomly generated seed set. There are no prescribed parameters around what percentage of the total data set should be included. In Pyrrho, the parties estimated that the agreed seed set would comprise 1,600 to 1,800 documents (from over 3 million in total).
2. The system analyses the characteristics of the documents in the seed set and proceeds to present them to the lawyer conducting the training. The lawyer is asked to indicate whether each document is relevant or not. With each such decision, the system builds an increasingly accurate model of which characteristics of the documents in the seed set lead the lawyer to categorise them as relevant.
3. Having built a model which predicts the likely relevance of any given document in the seed set based on the lawyer’s decisions, the system applies this model to each document in the whole document set. It is then able to rank all of the documents in order of likely relevance.
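The three steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method used in Pyrrho: the judgment as described here does not say which algorithm the software relies on, so a simple naive Bayes word model is swapped in, and the corpus, the seed set and the lawyer's decisions are all invented.

```python
import math
import re
from collections import Counter

def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())

def train(labelled):
    """Build per-class word counts from (text, is_relevant) pairs."""
    word_counts = {True: Counter(), False: Counter()}
    doc_counts = Counter()
    for text, label in labelled:
        doc_counts[label] += 1
        word_counts[label].update(tokenize(text))
    return word_counts, doc_counts

def relevance_score(text, model):
    """Log-odds of relevance under naive Bayes with add-one smoothing."""
    word_counts, doc_counts = model
    vocab = set(word_counts[True]) | set(word_counts[False])
    score = math.log((doc_counts[True] + 1) / (doc_counts[False] + 1))
    for label, sign in ((True, 1), (False, -1)):
        total = sum(word_counts[label].values()) + len(vocab)
        for word in tokenize(text):
            if word in vocab:  # words never seen in training are ignored
                score += sign * math.log((word_counts[label][word] + 1) / total)
    return score

# Hypothetical document set (illustrative only).
corpus = {
    "d1": "breach of the supply contract and damages claim",
    "d2": "office party lunch menu",
    "d3": "contract termination notice sent to supplier",
    "d4": "car park rota for next week",
    "d5": "supplier admits breach of contract terms",
    "d6": "lunch rota for the office next week",
}

# Step 1: select a seed set (fixed here; it could equally be sampled at random).
seed_ids = ["d1", "d2", "d3", "d4"]

# Step 2: the reviewing lawyer marks each seed document relevant or not.
# This dict stands in for those human decisions.
lawyer_decisions = {"d1": True, "d2": False, "d3": True, "d4": False}
model = train([(corpus[i], lawyer_decisions[i]) for i in seed_ids])

# Step 3: apply the model to the whole set and rank by likely relevance.
ranking = sorted(corpus, key=lambda i: relevance_score(corpus[i], model), reverse=True)
print(ranking)
```

Unlike a keyword filter, the output is an ordering rather than a yes/no split, so reviewers can start at the top of the list and decide how far down it is proportionate to go.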
In recognition of the relative novelty of using predictive coding for e-disclosure in this jurisdiction, Master Matthews set out no fewer than ten factors which, in his view, favoured its use in this case.
It was also noted that there were “no factors of any weight pointing in the opposite direction”. As ever, “whether it would be right for approval to be given in other cases will, of course, depend upon the particular circumstances obtaining in them” (see para 34 of the judgment).
In this case, the parties had agreed to use predictive coding and the Court saw clear benefits in their doing so. In a case where the parties disagree on whether predictive coding should be used and/or on the exact mechanics of its use, the Court
will have a tougher challenge to face.
Take the machine learning process used in Pyrrho, which involved three million documents. In 2013, LexisNexis was faced with the task of creating and applying a new taxonomy to all eight million documents held in its database of legal documents. While the project took two years to complete, we now have an AI-based system capable of performing the exercise in just hours. You can read about it here.
The next time you encounter a problem or bottleneck in your business, ask yourself, “Could this be solved or improved with AI?”
If you think the answer might be yes, our platform innovation and product development teams are always up for a challenge.
Previous (free) workshops have ranged from helping to break down big problems into manageable technology chunks, all the way to inspiring proofs of concept for entirely new solutions to challenges that may be shared by your own business or even the entire
If you would like to discuss anything in this article or would like to find out more about a workshop with our platform innovation team, please contact Alex Smith.
0330 161 1234