This is the seventh article in a series that explores practical tools and strategies to proactively manage costs and effectively navigate the eDiscovery process for litigation, internal investigations and regulatory matters. The series will provide practical tips on document management, data mapping, discovery planning, custodian interviews, document processing and hosting, and eDiscovery technology, and will explore proposed arbitration rules and alternative dispute resolution.
The discovery process is typically the most time-consuming and expensive aspect of any legal dispute. Fortunately, by adopting appropriate technology and best practices, parties can achieve significant cost savings in the review process. This article will outline functionality that is available and should be used in every discovery project involving electronic evidence.
Email threading
Email threading is now considered a basic tool that should be employed on every discovery project. Email threading groups the conversation that happens back and forth in email and identifies the “end points” or “inclusive emails.” The time and cost savings arising from this analysis are twofold.
First, the non-inclusive emails (the emails that are fully contained in an inclusive email) can be removed from the review as they do not contain any unique content. Second, counsel can group and review email conversations/threads together. Even where only inclusive emails are reviewed, a thread will often still contain multiple inclusive emails.
For example, an underlying email can be forwarded to another recipient, creating a separate “branch” in the email thread, or an earlier email could contain an attachment that is not carried through the entire email thread. Reviewing the inclusive emails in a thread together provides better context around the emails and is more consistent and efficient. In our experience, email threading has reduced the volume of emails requiring review by up to 70 per cent in construction litigation.
Email threading should be discussed with opposing counsel and agreed to in the discovery plan.
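The distinction between inclusive and non-inclusive emails can be illustrated with a minimal Python sketch. It is not how any particular review platform implements threading: the `Email` class, the `is_contained` and `inclusive_emails` functions and the sample thread are all hypothetical, and the sketch assumes that exact duplicates have already been removed and that each email's quoted history has already been split into ordered segments.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Email:
    email_id: str
    segments: tuple                       # messages quoted in this email, oldest first
    attachments: frozenset = frozenset()  # attachment names carried by this email


def is_contained(inner, outer):
    """True if `outer` repeats all of `inner`'s text and attachments."""
    n = len(inner.segments)
    return (
        inner.email_id != outer.email_id
        and outer.segments[:n] == inner.segments
        and inner.attachments <= outer.attachments
    )


def inclusive_emails(thread):
    """Return the ids of emails that no other email in the thread fully contains."""
    return [
        email.email_id
        for email in thread
        if not any(is_contained(email, other) for other in thread)
    ]


# Hypothetical thread: an original email with an attachment, two replies
# that drop the attachment, and a forward that branches off the first reply.
thread = [
    Email("e1", ("original",), frozenset({"plan.pdf"})),
    Email("e2", ("original", "reply 1")),
    Email("e3", ("original", "reply 1", "reply 2")),
    Email("e4", ("original", "reply 1", "forward")),
]

inclusive = inclusive_emails(thread)
print(inclusive)
```

In this sample thread only `e2` drops out of the review set. `e1` stays because its attachment is not carried forward, and `e3` and `e4` are the end points of two separate branches.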
Textual near duplicates
Our last article discussed deduplication. Textual near duplication is a tool that groups together textually similar documents that are not exact duplicates. For example, a PDF of a Word document will not be considered an exact duplicate but can be identified as a textual near duplicate. Another example is an early draft of a contract, which can be grouped with the later versions of that contract.
The function can be set to identify documents that are between 80 and 100 per cent textually similar. Grouping similar documents increases efficiency and accuracy, and allows counsel to easily identify drafts or various versions of a document. Some review software also allows you to generate a blackline to compare the text of two documents.
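To make the similarity threshold concrete, the following Python sketch groups documents whose extracted text is at least 80 per cent similar. It is an illustration only: commercial review platforms use their own similarity measures, and here the standard library's `difflib.SequenceMatcher` is swapped in as a stand-in. The function names, file names and sample text are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(text_a, text_b):
    """Word-level similarity between two texts, from 0.0 to 1.0."""
    return SequenceMatcher(None, text_a.split(), text_b.split(), autojunk=False).ratio()


def group_near_duplicates(docs, threshold=0.8):
    """Group document ids whose text is at least `threshold` similar to a group's first document."""
    groups = []  # each entry is (pivot_id, [member ids])
    for doc_id, text in docs.items():
        for pivot_id, members in groups:
            if similarity(docs[pivot_id], text) >= threshold:
                members.append(doc_id)
                break
        else:
            # No existing group is similar enough, so this document starts a new one.
            groups.append((doc_id, [doc_id]))
    return [members for _, members in groups]


# Hypothetical extracted text: two contract drafts, a PDF of the second draft,
# and an unrelated memo.
docs = {
    "contract_v1.docx": "The supplier shall deliver the goods to the site by 1 March and invoice within 30 days",
    "contract_v2.docx": "The supplier shall deliver the goods to the site by 15 March and invoice within 30 days",
    "contract_v2.pdf": "The supplier shall deliver the goods to the site by 15 March and invoice within 30 days",
    "site_memo.docx": "Meeting notes regarding the crane rental schedule",
}

groups = group_near_duplicates(docs)
print(groups)
```

The two drafts and the PDF land in one group because they differ by a single word, while the memo forms a group of its own. Comparing every document against a group's first member is a simplification chosen for brevity.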
Continuous active learning (CAL)
Continuous active learning (CAL) is often referred to as predictive coding or AI. While technically it is neither of these, it is a form of machine learning that assists in determining the likely relevance/non-relevance of documents.
Using CAL, a team of counsel “trains” the algorithm on examples of relevant and irrelevant documents. The program then ranks the documents in the review set on a scale of one to 100, with a higher rank indicating that a document is more likely to be relevant based on the example documents provided. The algorithm continues to refine the rankings as the team continues to identify documents as relevant or irrelevant. It is in this sense that it is considered to be “continuous active learning”: the algorithm keeps fine-tuning its rankings as it learns from counsel's decisions. The goal of this exercise is to reduce the number of documents requiring manual review. If done well and adequately audited, you can defensibly set aside the unreviewed documents that the algorithm predicts to be irrelevant. In our experience, the use of CAL in combination with other review techniques has reduced review costs by more than 90 per cent as compared to a traditional linear review.
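The train, rank and review cycle described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the algorithm used by any review platform: a simple word-count scorer (`ToyRanker`, a hypothetical name) stands in for the real model, the reviewer is simulated by a function, and the auditing and validation steps that make a CAL review defensible are not shown.

```python
import math
from collections import Counter


class ToyRanker:
    """Toy stand-in for a CAL model: scores text by words seen in coded examples."""

    def __init__(self):
        self.counts = {True: Counter(), False: Counter()}

    def train(self, text, relevant):
        """Record one coding decision (relevant=True or False) for a document."""
        self.counts[relevant].update(set(text.lower().split()))

    def score(self, text):
        """Return a rank from 1 to 100; higher means more likely relevant."""
        log_odds = 0.0
        for word in set(text.lower().split()):
            relevant_hits = self.counts[True][word] + 1     # add-one smoothing
            irrelevant_hits = self.counts[False][word] + 1
            log_odds += math.log(relevant_hits / irrelevant_hits)
        probability = 1 / (1 + math.exp(-log_odds))
        return max(1, round(probability * 100))


def cal_review(seed_examples, unreviewed, ask_reviewer, rounds):
    """Repeatedly review the top-ranked document and retrain on the decision."""
    ranker = ToyRanker()
    for text, relevant in seed_examples:
        ranker.train(text, relevant)
    remaining = dict(unreviewed)
    decisions = {}
    for _ in range(min(rounds, len(remaining))):
        # Re-rank what is left, then send the highest-ranked document to counsel.
        next_id = max(remaining, key=lambda doc_id: ranker.score(remaining[doc_id]))
        text = remaining.pop(next_id)
        decisions[next_id] = ask_reviewer(next_id, text)
        ranker.train(text, decisions[next_id])  # the ranking updates after every decision
    final_ranks = {doc_id: ranker.score(text) for doc_id, text in remaining.items()}
    return decisions, final_ranks


# Hypothetical seed examples and review set.
seed_examples = [
    ("delay claim for crane rental invoice", True),
    ("office holiday party lunch menu", False),
]
unreviewed = {
    "d1": "crane rental delay notice",
    "d2": "lunch menu for friday",
    "d3": "invoice dispute over delay",
    "d4": "holiday schedule for office",
}


def simulated_counsel(doc_id, text):
    """Stand-in for a human reviewer's relevance call."""
    return doc_id in {"d1", "d3"}


decisions, final_ranks = cal_review(seed_examples, unreviewed, simulated_counsel, rounds=2)
print(decisions)
print(final_ranks)
```

After two rounds the reviewer has seen only the two documents the scorer ranked highest, and the two remaining documents carry low ranks. In a real matter those low-ranked documents would be sampled and audited before being set aside.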
Email threading, textual near duplicate identification and CAL are just three examples of the technology that is available to decrease the costs associated with discovery. These tools are most effective when used together by counsel who are trained in navigating the intricacies of the technology and who understand the legal and factual aspects of the claim or investigation.
Candice Chan-Glasgow is director, review services at Heuristica Discovery Counsel LLP. Heuristica has offices in Toronto and Calgary and is the sole national law firm whose practice is limited to eDiscovery and electronic evidence.