Jyothi Vinjumur

Hello and welcome to my web page. My name is Jyothi (pronounced Geo-thi). I am a PhD candidate at the University of Maryland, College Park, where I am appointed as a Graduate Research Assistant. My supervisor is Dr. Doug Oard.

My research interest is information retrieval in the legal domain, in particular the process called electronic discovery, or e-discovery.

In 2006, the Federal Rules of Civil Procedure were amended to make it clear that all forms of electronically stored information, including emails, were within the scope of evidence that could be requested from a counterparty incident to civil litigation in the United States. Thus was born the multibillion-dollar industry that has come to be called electronic discovery, or e-discovery.

The high cost of e-discovery results from two main factors: (1) because the standard for relevance is expansive, large numbers of relevant documents may be found, and (2) producing parties can assert privilege (which exists to foster socially desirable outcomes such as open communication between attorneys and their clients) on some relevant documents in order to withhold confidential content. Thus, in practice, documents that are found to be responsive to a production request are subjected to an exhaustive manual review for privilege, to ensure that material that must be withheld is not inadvertently revealed. Although automation can help keep relevance review within budget to some degree, attorneys have been hesitant to adopt technology for the high-stakes privilege review process.

For this reason, my dissertation focuses on introducing a framework that encourages the use of automation during e-discovery to support both relevance and privilege review. The objectives of our framework are (1) to maximize the benefits obtained from the manual review process and (2) to optimize the cost of the e-discovery process. We propose to build our core framework around what we call a hybrid model, which is neither fully automated nor fully manual. We represent the hybrid model as a cost-sensitive single-label classification problem. The cost-sensitivity is reflected in the document ranking process, which depends on the different types of classification errors. We conclude by gathering relevance judgments, which will be used to evaluate our hybrid model.
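
As a rough illustration of how error costs can drive a document ranking, here is a minimal Python sketch. It is not the dissertation's model. The names (Doc, expected_error_cost, hybrid_review), the probability estimates, the two cost parameters, and the rule of sending the documents with the highest expected error cost to manual review are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    doc_id: str
    # Hypothetical classifier estimate that the document belongs to the
    # positive class (for example, "privileged").
    p_positive: float


def expected_error_cost(p, cost_fp, cost_fn):
    """Expected cost of the cheaper automated decision for one document.

    cost_fp: assumed cost of labeling a negative document as positive.
    cost_fn: assumed cost of labeling a positive document as negative.
    """
    cost_if_predict_positive = (1.0 - p) * cost_fp
    cost_if_predict_negative = p * cost_fn
    return min(cost_if_predict_positive, cost_if_predict_negative)


def hybrid_review(docs, cost_fp, cost_fn, budget):
    """Split documents between manual review and automated labeling.

    Documents are ranked by expected error cost, highest first. The top
    `budget` documents go to manual review; the rest receive the
    automated label with the lower expected cost.
    """
    ranked = sorted(
        docs,
        key=lambda d: expected_error_cost(d.p_positive, cost_fp, cost_fn),
        reverse=True,
    )
    manual = [d.doc_id for d in ranked[:budget]]
    automated = {
        d.doc_id: (1.0 - d.p_positive) * cost_fp <= d.p_positive * cost_fn
        for d in ranked[budget:]
    }
    return manual, automated


# Illustrative values only: a missed positive is assumed to cost ten
# times as much as a false alarm.
docs = [Doc("a", 0.02), Doc("b", 0.10), Doc("c", 0.60)]
manual, automated = hybrid_review(docs, cost_fp=1.0, cost_fn=10.0, budget=1)
```

With these made-up numbers, document "b" is the one routed to manual review: its estimated probability is low, but the asymmetric costs make an automated mistake on it the most expensive in expectation. That is the sense in which a ranking can depend on the different types of classification errors.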