The answer to this question is no, according to Judge Robert L. Miller, Jr. of the U.S. District Court for the Northern District of Indiana. In In re: Biomet M2a Magnum Hip Implant Products Liability Litigation (MDL 2391), No. 3:12-MD-2391, 2013 U.S. Dist. LEXIS 172570 (N.D. Ind. Aug. 21, 2013), the court declined to require a producing party to share the set of documents it used to train its technology-assisted review system.
This case already has significant implications for users of technology-assisted review (TAR). In April 2013, the defendant used keyword searches and deduplication to reduce the initial document pool from 19.5 million to 2.5 million documents. It then applied TAR to the remaining documents to further reduce the pool. The Plaintiffs’ Steering Committee (PSC) objected and asked the defendant to “go back to square one in its document production” and use TAR from the beginning, but the court declined to require the defendant to do so, finding nothing faulty in its hybrid approach.
In the most recent dispute, the PSC asked the court to order Biomet to produce the seed set used to train the TAR engine. The PSC claimed it needed the seed set to “intelligently propose more search terms.” Biomet refused, stating only that the responsive documents from the seed set had already been produced. The court agreed with Biomet that the PSC’s request was “well beyond the scope of any permissible discovery by seeking irrelevant or privileged documents used to tell the algorithm what not to find.” The court could not compel Biomet to disclose information “not made discoverable by the Federal Rules,” specifically referring to the requirement in Rule 26(b)(1) that discoverable material be relevant to a claim or defense. Nevertheless, the court characterized Biomet’s position as “troubling” and urged the defendant to cooperate, issuing a subtle warning: “[a]n unexplained lack of cooperation in discovery can lead a court to question why the uncooperative party is hiding something, and such questions can affect the exercise of discretion.”
The decision is likely to fuel the ongoing debate over whether TAR is a “black box” technology. When parties use techniques other than TAR to cull documents, they are not asked to provide a sampling to ensure the integrity of the process, nor are they asked for the documents used to train human reviewers. The same logic should apply here. Moreover, a request for the seed set arguably requires the production of documents that reflect attorney thought processes. Cooperation is the ideal, but unless there is evidence of a deficient production, parties should be allowed to protect the nonresponsive portions of their seed sets.