Caseopea - the new frontier of automatic e-Discovery


Caseopea offers a next-generation document review and analysis platform that dramatically reduces the overwhelming overhead imposed on lawyers by pre-litigation e-Discovery.

Caseopea combines its advanced Natural Language Understanding technology with its innovative semantic reasoning framework and knowledge-guided search engine to deliver a groundbreaking e-Discovery platform that can identify and extract case-related material from large volumes of electronic data.

The Opportunity

The traditional Discovery process, whereby both parties to a lawsuit share relevant documents with each other, used to involve physically handing over boxes of papers, but no more. The advent of new forms of digital office communication has triggered dramatic changes to civil Discovery. The transformation was accelerated by a series of amendments to the Federal Rules of Civil Procedure (FRCP), introduced by the US Supreme Court and in effect as of December 2006. These amendments have greatly extended the scope of admissible discovery material to include all forms of Electronically Stored Information (ESI), including e-mail, instant messages, chats, electronic office files (documents, presentations, spreadsheets), accounting databases, Web sites, and any other electronically stored information that could be relevant in a lawsuit. The expanded scope of the Discovery process poses extreme challenges and opportunities for attorneys, clients, technology providers, and the courts. Some well-known legal experts have ranked the recent discovery amendments as the most significant change to the legal system in recent decades.

With massive volumes of electronic information being collected, reviewed, produced and consumed, the challenge is enormous and the effects are well noted: big firms that were used to producing half a million documents for a high-stakes case now have to process a hundred times that quantity. Moreover, the problem is not confined to staggering volumes. The new forms of electronic information are also much more complex, intertwined and fluid, making Discovery material, saturated with cross-references, much harder to follow, review and analyze.

It is therefore hardly surprising that the cost of e-Discovery has reached astounding numbers:
A Microsoft executive disclosed in a recent post that the company spends an average of US$20 million on e-Discovery per litigation.

The e-Discovery market is growing rapidly.
Recent IDC research found that the cost of legal discovery and litigation support totaled $12 billion in 2007, up 23% from $9.7 billion in 2006. A recent survey by Socha-Gelbmann (authoritative law technology experts) found even more impressive growth rates: 33% in 2006 and 28% in 2007. The report predicts growth rates well above 20% in the next few years.

The new landscape of e-Discovery has forced some fundamental changes in the industry. Law firms and enterprise legal departments, considered not long ago to be stubborn technophobes, have started investing heavily in e-Discovery and litigation support systems. The market for e-Discovery support technology is booming across the entire technology spectrum: storage and document management systems, review and analysis platforms, collection and preservation, online processing and full-service outsourcing. Within this wide spectrum of offerings, the review and analysis step stands out as the most expensive and risky. Compared to the other steps of e-Discovery, the cost associated with analysis and review is far larger. On average, it costs $1,800 to process and prepare data for analysis, and around $200-$250 per hour to analyze and review it, where review processes are usually measured in man-years. A commonly accepted benchmark sets the cost of reviewing one gigabyte of data at well above $30,000.

In a typical enterprise lawsuit, where adversaries dump hundreds of millions of documents on each other in response to discovery requests, the cost of analysis and review can easily reach tens of millions of dollars.

As an example, during the acquisition of MCI by Verizon, the two companies deployed around 110 lawyers on each side for a period of 4 months in order to conduct a thorough privilege and relevance review of electronic data.

Yet the prohibitive costs are not the only problem. The e-Discovery process is also riddled with human error, which is hardly surprising given that the process is characterized by an extremely low distillation ratio: only 5-10 percent of the data analyzed ends up being relevant. A seminal study from a few years ago found that legal researchers were only 20% accurate in finding case-relevant documents while being convinced they were 80% accurate. Recent research showed that human reviewers are consistently only 40%-50% accurate in their data culling decisions, failing both by missing crucial data and by including irrelevant documents. This typically low accuracy of human review reflects inconsistent judgment, lack of coordination, and fatigue. A failure in this sensitive process can spell disaster, as was highlighted by two recent, highly publicized court rulings:

         In one case, Morgan Stanley lost a $1.45 billion verdict after a Florida court concluded that the company had failed to disclose a small number of crucial emails to its adversary during the discovery phase.

         In the second, a San Diego judge recently ordered Qualcomm to pay nearly $10 million in legal fees to Broadcom for failing to turn over relevant electronic evidence related to their high-stakes patent lawsuit.
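
The two failure modes noted above, missing crucial data and including irrelevant documents, correspond to what information retrieval calls recall and precision. The following Python sketch shows how the two measures are computed for a single review pass. The document IDs and the function name are made up for illustration and do not come from any real case or from the studies cited above.

```python
def precision_recall(selected, relevant):
    """Return (precision, recall) for one review pass.

    selected: documents the reviewer marked as relevant
    relevant: documents that are truly relevant to the case
    """
    selected, relevant = set(selected), set(relevant)
    hits = selected & relevant  # correctly identified documents
    # Precision drops when irrelevant documents are included.
    precision = len(hits) / len(selected) if selected else 0.0
    # Recall drops when crucial documents are missed.
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical review outcome: 4 truly relevant documents, 5 selected.
relevant = {"doc03", "doc07", "doc11", "doc19"}
selected = {"doc03", "doc07", "doc05", "doc08", "doc21"}

precision, recall = precision_recall(selected, relevant)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this made-up pass the reviewer finds only two of the four relevant documents and adds three irrelevant ones, so both measures are low at the same time.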

 The new reality of e-Discovery, combining high costs and high risk, is driving lawyers to realize that supporting software and automation are the only practical answer. In a dramatic paradigm shift, law firms are turning to search and analysis tools. However, though helpful, existing search tools and current text analysis technology provide only a partial answer to the monumental problem of analyzing highly contextual text. Conventional keyword or conceptual search engines are largely inappropriate for the task of unraveling the conversational flow of email threads, following the logic of running memo exchanges, tracking casual chat transcripts and making sense of cross-referencing documents. The shifting context of the analyzed content, combined with the dynamic nature of the underlying communication channels, constitutes a very different search and analysis space, well out of reach of conventional search and analysis tools. Here, eventuality, context and consequence play a key role in understanding the content being analyzed.

 There is a strong need for a different approach and new capabilities that would allow content analysis in a narrative, cumulative manner, while constantly considering context, time and consequence. An effective solution requires an intelligent system that can largely relieve lawyers of the tedious job of sifting through volumes of text in order to find a needle of evidence in a haystack of documents. Caseopea offers this much-needed solution.

Why Caseopea?

 Currently available e-Discovery tools

Current offerings in the area of review and analysis are far from answering the needs described above. They can be divided into the following categories:

                General enterprise document management systems that provide limited search capabilities.

               General enterprise search engines providing enhanced keywords and Boolean search.

               Conceptual text analysis platforms, which automatically classify documents based on repeating patterns or emphasized concepts. These tools usually generate visual document-clustering maps that provide a convenient overview of the document population.

               Meta-data analyzers (mainly for emails), which extract and display document meta-data, such as who issued a specific email, when and to whom. These tools help reviewers to track and follow email threads, memo exchanges or chat sessions.

               Review support systems that facilitate manual review by providing convenient tools for the manual tagging and filtering of documents.

               Heavy-duty research platforms based on statistical linguistic models. These tools require intensive training, tuning and often reprogramming, and are mainly used by project-based outsourcing companies. They usually require intensive involvement of skilled technical staff and linguistic experts.
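
To make the meta-data analyzer category above more concrete, here is a minimal Python sketch of the kind of extraction such tools perform: who sent a message, to whom, when, and which earlier message it replies to. It uses only the Python standard library `email` package. The sample messages, addresses and the `extract_metadata` function name are invented for illustration and do not describe any specific product.

```python
from email import message_from_string
from email.utils import getaddresses, parsedate_to_datetime

# Two made-up messages forming a short thread (illustrative data only).
RAW_MESSAGES = [
    "From: Bob <bob@example.com>\n"
    "To: Alice <alice@example.com>, Carol <carol@example.com>\n"
    "Date: Mon, 05 Mar 2007 10:02:00 +0000\n"
    "Message-ID: <2@example.com>\n"
    "In-Reply-To: <1@example.com>\n"
    "Subject: Re: Contract draft\n"
    "\n"
    "Looks fine to me.\n",
    "From: Alice <alice@example.com>\n"
    "To: Bob <bob@example.com>\n"
    "Date: Mon, 05 Mar 2007 09:15:00 +0000\n"
    "Message-ID: <1@example.com>\n"
    "Subject: Contract draft\n"
    "\n"
    "Please review the attached draft.\n",
]


def extract_metadata(raw):
    """Pull sender, recipients, timestamp and reply link from one message."""
    msg = message_from_string(raw)
    senders = getaddresses(msg.get_all("From", []))
    return {
        "sender": senders[0][1] if senders else None,
        "recipients": [addr for _, addr in getaddresses(msg.get_all("To", []))],
        "sent": parsedate_to_datetime(msg["Date"]),
        "message_id": msg["Message-ID"],
        "in_reply_to": msg["In-Reply-To"],  # None for the first message in a thread
    }


# Order the messages chronologically so a reviewer can follow the thread.
records = sorted((extract_metadata(m) for m in RAW_MESSAGES), key=lambda r: r["sent"])
for rec in records:
    print(rec["sent"], rec["sender"], "->", ", ".join(rec["recipients"]))
```

Meta-data of this kind shows the order and direction of a conversation, but it says nothing about what was actually said in each message.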

 Caseopea Search & Analysis Platform

Caseopea has developed a proven yet groundbreaking text analysis platform that combines linguistics, semantics, logic and automated reasoning. The software both analyzes the text and learns from it. Context and consequence reasoning play an integral part in this dynamic analysis process. Text is analyzed progressively, so facts and concepts extracted during the analysis are incorporated, on the fly, into the contextual framework used for the next discourse or document, very much as the reader of this document uses previously gathered information to interpret this paragraph. This feature is especially crucial when automatically reviewing email threads, where the narrative develops with each message and roles are frequently switched between parties.
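
The idea of carrying context from one message to the next can be illustrated with a deliberately simple Python toy. It is not Caseopea's technology and makes no claim about how the platform works: the `analyze_thread` function, the tiny `TOPICS` vocabulary and the sample thread are all assumptions made for this sketch. It resolves "I" and "you" according to who is writing each message, and resolves "it" using the topic remembered from earlier messages.

```python
import re

TOPICS = ("contract", "invoice")  # hypothetical case vocabulary


def analyze_thread(messages):
    """Rewrite each message using context accumulated from earlier ones.

    messages: list of (sender, recipient, text) tuples in thread order.
    """
    context = {"topic": None, "facts": []}
    for sender, recipient, text in messages:
        # "I" and "you" switch meaning with every message in the thread.
        resolved = re.sub(
            r"\b(I|you)\b",
            lambda m: sender if m.group(0) == "I" else recipient,
            text,
        )
        # "it" is interpreted with the topic carried over from earlier messages.
        if context["topic"]:
            resolved = re.sub(r"\bit\b", f"the {context['topic']}", resolved)
        # Update the context so later messages can build on this one.
        for word in TOPICS:
            if re.search(rf"\b{word}\b", resolved):
                context["topic"] = word
        context["facts"].append(resolved)
    return context["facts"]


thread = [
    ("Alice", "Bob", "I will send you the contract on Friday."),
    ("Bob", "Alice", "I never received it."),
    ("Alice", "Bob", "I sent it to you on Friday."),
]

for fact in analyze_thread(thread):
    print(fact)
```

Read in isolation, the second message says nothing about a contract. It only becomes searchable as a statement about the contract because the topic was carried over from the first message.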

 The Caseopea search engine is based on similar principles and technology: a search query is fully interpreted based on linguistic and semantic knowledge, and then matched against the analyzed documents using many levels of inference (contextual, temporal and consequential), a far cry from keyword or pattern search.

 Case-specific customization can be incorporated dynamically using the Caseopea interactive modeling environment, in which users can finely control the process or tie it to specific workflows, rules, and case-specific knowledge, with no programming or technical skills required. The environment is intuitive yet powerful.

 Caseopea will help medium to large law firms and enterprises with their e-Discovery, saving time and resources on one hand and improving e-Discovery quality on the other.

Caseopea can put lawyers back on top of Legal Discovery.




A query turned into structure.

