DoctorWho wrote: ↑Wed Sep 27, 2023 6:32 am
That whole world was invented after my brief litigation experience -- where way-overpaid associates reviewed documents.
I've heard the history that Microsoft in the 90s (a nearly perfect defendant for charging with patent infringement) got tired of the review model and either set up or helped set up a group dedicated to only document review. How much of that history is true?
PS. Does R = relevant? If so, how does that work in practice for, say, a 50-page document that might have one paragraph of relevant info? Do human eyes still scan through the doc?
R: Relevant/Responsive/Related (depending on which attorney drafts the RFP/where they are from/moon phase).
Brief History of eDoc Review:
Microsoft: I've not heard that particular story, but it makes some sense (I know Google, for example, has (or had) its own contract doc review staff to prevent outside counsel over-billing). My understanding is that tech people were pushing this as the next big thing from the 90s on. But it took until the mid-00s before the court rules (and the total domination of email as the way of doing business) finally came around to ediscovery being essential in civil litigation (the FRCP explicitly recognized emails and computer chat as discoverable documents in '06). However, it was the '08 crash (per what I've read) that really sealed the deal. No one was paying BigLaw rates for people to look at 500-1k emails a day, and it forced even established firms to seek out lower-cost contract labor for the first-line review.
Another factor that was huge was growth of project size. I've heard from old timers that a big project was 50k documents, with a huge project at 200k. Once email/chat exploded it wasn't uncommon for search terms to turn up 1m+ documents to be reviewed for relevance. That's just not possible for a law firm to handle with any # of "real" attorneys.
Plus the courts are, as is their wont, rightfully extremely distrustful of using machine learning/algos for doc review (spoiler alert: they're shit). So you've got email/chat/hard drive growth ('08 was what, 100 GB? less on normal people's computers; now multi-TB drives are common) meaning even more documents, copies of docs, old versions never deleted, chat logs going back a decade, spam emails, and yet you can't really use computers to cull all this out b/c the courts assume you're scamming them.
So it's a perfect storm: more docs, cost-conscious clients, and, even if the client is willing to pay, simply not enough attorneys to do it anyway. Consider a project that takes 120 reviewers a month to do; that'd be 10% of Quinn Emanuel or a similar-sized firm working on 1 project. Just can't do it.
Process/answer to: "If so, how does that work in practice for, say, a 50-page document that might have one paragraph of relevant info? Do human eyes still scan through the doc?"
So right now the "standard" model is something like this. You've got a universe of docs you've culled out of all available data via fighting with the other side on search terms. You either linearly review the documents 1 by 1, or use some kind of assisted review. Either way you set a bunch of highlights up on the documents to make it easier to find key info.
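The culling step above can be sketched in a few lines. This is a toy, not any real platform's logic: actual ediscovery tools support proximity operators, wildcards, and metadata filters, and all the documents and terms below are made up for illustration.

```python
# Toy sketch of culling a document universe with negotiated search terms.
# Every doc ID, text, and term here is illustrative, not from a real matter.
documents = {
    "DOC-001": "Quarterly revenue forecast attached, see projections tab.",
    "DOC-002": "Lunch on Friday? The usual place.",
    "DOC-003": "Per our license agreement, royalty payments are overdue.",
    "DOC-004": "Forwarding the patent infringement analysis from counsel.",
}

# Terms the parties agreed on after fighting over them. Real negotiated
# terms use richer syntax than bare case-insensitive substrings.
search_terms = ["royalty", "license", "patent", "infringement"]

def hits(text, terms):
    """Return which agreed terms appear in a document (case-insensitive)."""
    lowered = text.lower()
    return [t for t in terms if t in lowered]

# A single responsive hit pulls the whole document into the review universe;
# a human then reviews the entire thing, not just the hit paragraph.
review_universe = {
    doc_id: hits(text, search_terms)
    for doc_id, text in documents.items()
    if hits(text, search_terms)
}

for doc_id, matched in sorted(review_universe.items()):
    print(doc_id, matched)
```

Note how DOC-002 never enters the universe at all, while DOC-003 comes in whole even though only part of it hits: that asymmetry is exactly why the search-term fight matters.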
In the linear review model, once there is a single responsive search term hit in the document, an attorney will review the entire document for Relevance and Priv (and sometimes other factors like confidentiality, PII, hotness), PLUS any attached documents (so emails + attachments, powerpoints + linked files, and so on). ALLEGEDLY the attorney doesn't rely on the highlighting and skims all 50 pages of the document for privileged content/responsive content/hot content (such as a dick pic slipped into the middle of a slideshow in a sexual harassment case).
Now, if the document is actually R (and not priv), what happens next is a matter of whatever you've fought out with the other side. 99% of the time this means the entire document + attached documents are produced. Sometimes you can get the most sensitive documents classified as Attorneys' Eyes Only (meaning opposing counsel cannot show them to their client). On very rare occasions you can redact key non-responsive information (I'd say this happens on 1 case every 18 months, but sometimes stuff is so sensitive and so unrelated to the case that the court allows it).
This is why the fight over search terms and the exact wording of RFPs is important: it's hugely powerful in determining how much extraneous info (i.e., ammunition) you're handing over to the other side, b/c one R paragraph buried in a massive email chain means the entire chain + all attachments are going over the wall.
In a computer-assisted model (TAR/CAL are common types) you basically review an initial set of documents, which is used to train the algo; the algo rank-orders all documents, and you review them in that order (which continuously re-trains the model). Eventually the model spits out an estimate of the likelihood that the remaining documents have R content, and once that estimate is low enough the parties agree it's OK not to review them. You run a test sample on the remaining documents, and then you're done (with a bunch of other quibbles, b/c the process is actually very complicated, doesn't work for anything other than text (yet), and also is shit).
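The CAL loop above (seed review, train, rank, review the top of the ranking, re-train, stop when new batches stop turning up R docs) can be sketched as a toy. Real TAR platforms use actual classifiers and statistical stopping criteria with defensible recall estimates; this sketch uses crude word counts and a made-up corpus, with the hidden `truly_relevant` flag standing in for the human reviewer's call.

```python
# Toy sketch of a continuous active learning (CAL) review loop.
# Crude word-count "model" and a fake corpus, purely for illustration.
from collections import Counter

# (doc_id, text, truly_relevant) -- the last field stands in for the
# reviewer's judgment, which in practice is the expensive part.
corpus = [
    ("D1", "merger pricing discussion with the board", True),
    ("D2", "merger timeline and pricing memo", True),
    ("D3", "fantasy football league standings", False),
    ("D4", "lunch order for the team offsite", False),
    ("D5", "board memo on merger negotiation pricing", True),
    ("D6", "office printer is jammed again", False),
]

def train(reviewed):
    """Score each word: +1 per relevant doc it appears in, -1 per non-relevant."""
    weights = Counter()
    for _, text, relevant in reviewed:
        for word in set(text.split()):
            weights[word] += 1 if relevant else -1
    return weights

def score(weights, text):
    return sum(weights[w] for w in set(text.split()))

# Seed set: one relevant and one non-relevant doc already reviewed.
reviewed = [corpus[0], corpus[3]]
remaining = corpus[1:3] + corpus[4:]

while remaining:
    weights = train(reviewed)                       # re-train on all calls so far
    remaining.sort(key=lambda d: score(weights, d[1]), reverse=True)
    batch, remaining = remaining[:2], remaining[2:]  # "human" reviews top batch
    reviewed.extend(batch)
    if not any(rel for _, _, rel in batch):
        break                                        # crude stop: batch with no R docs

found = {doc_id for doc_id, _, rel in reviewed if rel}
print(sorted(found))
```

The ranking pushes the merger/pricing docs to the front, so all three R docs surface before the stopping rule fires; a real workflow would then validate the unreviewed remainder with a statistical sample rather than just stopping.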
In my experience it is extremely, extremely rare for a document to go out the door without human eyes (allegedly) looking at the entire thing, plus it being fed through at least one quality-control workflow. Sometimes it happens (mass coding where counsel believes there can't be priv), but you never know when there's going to be a handwritten note on a scanned document that says "Legal issue here per in-house counsel, must stop program" or something like that, so human eyes stay on it even if it's just an "everything we don't find priv on is going out the door" priv-only review. However, my experience has sampling bias, b/c I'm only brought in once the decision to manually review at least some docs is made; it's possible there are other reviews I'm not involved with/never see that push shit out the door without someone allegedly checking each page of each doc.