Dominic Tucker, Senior Consultant at Anexsys Ltd, considers how technology can assist and improve the disclosure process, where previously keyword searching and linear review were adopted as the default approach.
The objective of review in e-discovery is to identify as many relevant documents as possible, while reviewing as few non-relevant documents as possible (Da Silva Moore). This is known as achieving the highest possible recall (proportion of all relevant documents identified during a review) and precision (proportion of relevant documents within the reviewed set).
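As a minimal sketch of the two measures defined above, the following Python snippet computes recall and precision from sets of document IDs. The function name and the figures are illustrative assumptions and are not drawn from any particular review platform or case.

```python
def recall_and_precision(relevant_ids, reviewed_ids):
    """Return (recall, precision) for a review.

    relevant_ids: every relevant document in the whole collection.
    reviewed_ids: the documents actually put in front of reviewers.
    """
    relevant = set(relevant_ids)
    reviewed = set(reviewed_ids)
    found = relevant & reviewed
    # Recall: share of all relevant documents that the review identified.
    recall = len(found) / len(relevant) if relevant else 0.0
    # Precision: share of the reviewed documents that were relevant.
    precision = len(found) / len(reviewed) if reviewed else 0.0
    return recall, precision


# Hypothetical collection: 100 relevant documents (IDs 0 to 99).
relevant_ids = range(100)
# Suppose 200 documents are reviewed: 80 relevant and 120 not relevant.
reviewed_ids = list(range(80)) + list(range(500, 620))

recall, precision = recall_and_precision(relevant_ids, reviewed_ids)
print(f"recall={recall:.2f}, precision={precision:.2f}")
# prints: recall=0.80, precision=0.40
```

In this made-up example the review finds 80 of the 100 relevant documents (recall of 0.80), but only 80 of the 200 documents reviewed were relevant (precision of 0.40).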
Although keywords are inherently biased, and so either exclude a proportion of relevant documents or necessitate the review of increasing volumes of irrelevant documents, lawyers have been relatively slow to adopt alternative approaches.
However, predictive coding, a technology which automates portions of an e-disclosure document review, is now starting to gain popularity in the UK as an approach to disclosure.
Predictive techniques are commonly applied to analyse data in order to assess risk and make future predictions. They are not unique to the legal world. Common everyday uses include credit scoring, fraud identification and risk underwriting, and the techniques have been widely adopted in a variety of industries including accountancy, insurance, banking, financial services, pharmaceuticals and healthcare.
Predictive coding systems apply complex algorithms which, based upon their analysis of review decisions, identify similar documents which are prioritised for review.
In doing so, they aim to limit the review of irrelevant documents and enable relevant documents to be captured as efficiently as possible, thereby improving recall and precision.
A predictive coding exercise typically begins with a senior lawyer training an algorithm by reviewing a ‘seed set’ of example documents.
The algorithm analyses the characteristics of these documents, learns from the lawyer’s decision making and thereafter seeks to identify similar documents and rank them by their likelihood of relevance.
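The idea of learning from a lawyer's decisions and ranking the remaining documents can be sketched in a few lines of Python. This is a deliberately naive word-weighting model chosen for illustration only; commercial predictive coding systems use more sophisticated algorithms, and every name and document here is hypothetical.

```python
import math
from collections import Counter


def train(seed_set):
    """Learn a per-word weight from (text, is_relevant) seed decisions."""
    rel_counts, non_counts = Counter(), Counter()
    for text, is_relevant in seed_set:
        target = rel_counts if is_relevant else non_counts
        target.update(set(text.lower().split()))
    n_rel = sum(1 for _, is_relevant in seed_set if is_relevant)
    n_non = len(seed_set) - n_rel
    vocabulary = set(rel_counts) | set(non_counts)
    # Smoothed log-odds: a positive weight points towards relevance.
    return {
        word: math.log((rel_counts[word] + 1) / (n_rel + 2))
        - math.log((non_counts[word] + 1) / (n_non + 2))
        for word in vocabulary
    }


def score(weights, text):
    """Sum the weights of the known words appearing in a document."""
    return sum(weights.get(word, 0.0) for word in set(text.lower().split()))


def rank(weights, documents):
    """Order documents from most to least likely relevant."""
    return sorted(documents, key=lambda text: score(weights, text), reverse=True)


# Hypothetical seed set coded by the senior lawyer.
seed_set = [
    ("loan agreement breach notice", True),
    ("breach of loan covenant", True),
    ("office party invitation", False),
    ("canteen lunch menu", False),
]
unreviewed = [
    "lunch menu for friday",
    "notice of loan default and breach",
    "party photos",
]

weights = train(seed_set)
ranked = rank(weights, unreviewed)
print(ranked[0])
# prints: notice of loan default and breach
```

The document sharing the most vocabulary with the lawyer's relevant examples rises to the top of the queue, while the lunch menu falls to the bottom.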
The most highly ranked documents can then be prioritised for review. This review continues until the system fails to return any further relevant documents, or until the proportion of relevant documents becomes so low that continuing the review would be disproportionate.
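A stopping rule of this kind can be sketched as follows. The batch size and the 5% cut-off are hypothetical parameters chosen for illustration; in practice the point at which review becomes disproportionate is a matter of judgement for the parties and the court, not a fixed number.

```python
def prioritised_review(ranked_docs, is_relevant, batch_size=50, min_richness=0.05):
    """Review ranked documents in batches until relevance dries up.

    ranked_docs: document IDs, most likely relevant first.
    is_relevant: callable standing in for the human reviewer's decision.
    Stops once the share of relevant documents in a batch falls below
    min_richness (a hypothetical proportionality cut-off).
    """
    relevant_found, reviewed = [], 0
    for start in range(0, len(ranked_docs), batch_size):
        batch = ranked_docs[start:start + batch_size]
        hits = [doc for doc in batch if is_relevant(doc)]
        relevant_found.extend(hits)
        reviewed += len(batch)
        if len(hits) / len(batch) < min_richness:
            break
    return relevant_found, reviewed


# Hypothetical ranking of 1,000 documents in which the relevant material
# (IDs below 120) sits at the top, with two stragglers further down.
ranked_docs = list(range(1000))
truly_relevant = set(range(120)) | {400, 700}

found, reviewed = prioritised_review(ranked_docs, truly_relevant.__contains__)
print(f"reviewed {reviewed} documents, found {len(found)} relevant")
# prints: reviewed 200 documents, found 120 relevant
```

In this made-up example the review stops after 200 of the 1,000 documents, having found 120 of the 122 relevant ones. The two stragglers are missed, which is the trade-off a proportionality cut-off accepts.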
As in any disclosure exercise, a predictive coding methodology should be supported by an appropriate validation and quality checking regime so that decision making can be justified and each stage of a project independently verified.
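One possible quality check, sketched below under stated assumptions, is to draw a random sample from the documents that were never reviewed and have a lawyer check it, so that the number of relevant documents left behind can be estimated. The function name, sample size and data are hypothetical, and the sketch ignores the statistical margin of error that a real validation exercise would need to report.

```python
import random


def estimate_missed(unreviewed_ids, is_relevant, sample_size=500, seed=0):
    """Estimate how many relevant documents remain in the unreviewed set.

    A random sample is checked by a reviewer (is_relevant stands in for
    that decision) and the sample rate is scaled up to the whole set.
    """
    rng = random.Random(seed)  # fixed seed so the sample can be reproduced
    sample = rng.sample(unreviewed_ids, min(sample_size, len(unreviewed_ids)))
    hits = sum(1 for doc in sample if is_relevant(doc))
    rate = hits / len(sample) if sample else 0.0
    return rate, round(rate * len(unreviewed_ids))


# Hypothetical unreviewed set of 10,000 documents, 1% of them relevant.
unreviewed_ids = list(range(10_000))
hidden_relevant = set(range(0, 10_000, 100))

rate, missed = estimate_missed(unreviewed_ids, hidden_relevant.__contains__)
print(f"sample rate {rate:.3f}, estimated relevant documents missed: {missed}")
```

Recording the seed and the sample drawn is one way of allowing this stage of a project to be independently verified later.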
How the seed set should be composed is up for debate, as is the length of time that should be taken to train the algorithm and the extent of any quality control regime that should be adopted in order to validate the process.
In composing the seed set, predictive coding in its most straightforward form will focus upon a randomly generated set of documents. No keywords are run and the system is left to present example documents to the senior lawyer unhindered by bias.
Relying on a randomly generated set of documents as the starting point is a step too far for many lawyers and could be seen as a blind leap of faith in the algorithm. The risk is that it appears more difficult to validate how the algorithm has been trained, and its ability to stand up to scrutiny from project to project is uncertain.
As such, some lawyers prefer to adopt a middle-ground ‘hybrid approach’ where the seed set is made up of a mixture of keyword-responsive documents, documents returned by other searches and randomly selected documents.
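A hybrid seed set of this kind might be assembled along the following lines. This is a sketch only: the function, the keywords and the documents are hypothetical, and it covers only the keyword and random components, leaving out the 'other searches' mentioned above.

```python
import random


def build_hybrid_seed_set(documents, keywords, n_keyword=3, n_random=3, seed=0):
    """Pick a seed set mixing keyword hits with randomly chosen documents.

    documents: mapping of document ID to text.
    keywords: search terms agreed for the matter (hypothetical here).
    """
    rng = random.Random(seed)
    terms = [keyword.lower() for keyword in keywords]
    hits = sorted(
        doc_id for doc_id, text in documents.items()
        if any(term in text.lower() for term in terms)
    )
    keyword_part = rng.sample(hits, min(n_keyword, len(hits)))
    # Draw the random part from everything not already chosen, so the
    # algorithm also sees documents the keywords would never surface.
    remaining = sorted(set(documents) - set(keyword_part))
    random_part = rng.sample(remaining, min(n_random, len(remaining)))
    return keyword_part, random_part


# Hypothetical collection of ten short documents.
documents = {
    1: "Loan facility agreement draft",
    2: "Lunch menu",
    3: "Notice of breach of covenant",
    4: "Holiday rota",
    5: "Board minutes on the loan",
    6: "Car park closure",
    7: "Funds transfer as discussed",
    8: "IT maintenance window",
    9: "Covenant waiver request",
    10: "Quiz night",
}

keyword_part, random_part = build_hybrid_seed_set(documents, ["loan", "covenant"])
print("keyword-responsive:", keyword_part)
print("randomly selected:", random_part)
```

The random component gives the lawyer a chance to code documents such as number 7, which may be relevant but contains none of the keywords.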
Subscribers to Lexis®PSL Dispute Resolution can read Dominic Tucker’s full and more in-depth analysis of predictive coding, including the hybrid approach recently endorsed by the Irish High Court in Irish Bank Resolution Corporation Ltd & ors v Quinn & ors  IEHC 175, here.
Subscribers can also read his analysis of the ways in which Swiss investigators may seek to manage the data collected in their investigations into Fifa’s alleged corruption, and the challenges they may face in doing so.
Sign up for a free trial here if you are not a subscriber and would like to read that full analysis.
Dominic Tucker is a Senior Consultant at Anexsys Ltd, a leading provider of outsourced eDisclosure and litigation support services to law firms, corporations and government departments.