Online Legal Research: is there an affirmative duty to use more than one research platform?

Earlier this week, Robert Ambrogi posted “Turns Out Legal Research Services Vary Widely in Results.” Ambrogi, one of the leading commentators on legal technology, wrote:

  • “Call me naive, but I would have thought that entering the identical search query on, say, both Westlaw and Lexis Advance would return fairly similar results, at least among the cases ranked highest for relevance. After all, shouldn’t the cases that are most relevant to the query be largely the same, regardless of the research platform?”

Then, he added:

  • “Turns out, the results they deliver vary widely — not just between Westlaw and Lexis Advance, but among several legal research platforms. In fact, in a comparison of six leading research platforms — Casetext, Fastcase, Google Scholar, Lexis Advance, Ravel and Westlaw — there was hardly any overlap in the cases that appeared in the top-10 results returned by each database.”

Ambrogi’s post referred to Susan Nevelow Mart’s research paper, “The Algorithm as a Human Artifact: Implications for Legal {Re}search.” Mart is the Director of the Law Library and an Associate Professor at the University of Colorado Law School.

In a column that he wrote for Above the Law, Ambrogi dove deeper into Professor Mart’s findings.  Before I talk about the findings, I want to go back to my post “Are Robots Nonlawyer Assistants.”

In my post, I suggested that lawyers who use artificial intelligence to perform “mundane legal tasks” might have an affirmative duty under the Rules of Professional Conduct “to have some sort of understanding of the coder’s qualifications.”  Well, as it turns out, a similar notion underpins Professor Mart’s research.

As Ambrogi reports, several years ago, a senior VP at Westlaw informed Professor Mart that the company’s “algorithms are created by humans.”  Mart then theorized that the choices a human makes in creating an algorithm will necessarily influence the results that the algorithm delivers.  In other words, the coder’s biases & assumptions will find their way into the algorithm and affect the results.  She set out to test her hypothesis.

Mart’s findings are eye-opening.  Using the same query across 6 providers – Casetext, Fastcase, Google Scholar, Lexis Advance, Ravel and Westlaw – she found that among the top 10 cases returned by each:

  • on average, 40% of the cases were returned by only 1 provider;
  • only about 7% of the cases were returned by all 6 providers.
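For readers who want to see how figures like these could be tallied, here is a minimal sketch in Python. It is not Professor Mart’s method, which this post does not describe in detail. The function name `overlap_summary`, the provider labels, and the case names are all made up for illustration, and the sketch assumes that “unique” means a case appeared in exactly one provider’s top results.

```python
from collections import Counter


def overlap_summary(top_results):
    """Return (share of cases unique to one provider, share returned by all).

    top_results maps a provider name to the list of cases in its top results.
    Both shares are fractions of the distinct cases seen across all providers.
    """
    # Count how many providers returned each distinct case.
    counts = Counter(
        case for cases in top_results.values() for case in set(cases)
    )
    total = len(counts)
    if total == 0:
        return 0.0, 0.0
    n_providers = len(top_results)
    unique = sum(1 for n in counts.values() if n == 1)
    shared = sum(1 for n in counts.values() if n == n_providers)
    return unique / total, shared / total


# Hypothetical data: three made-up providers, four results each.
sample = {
    "Provider A": ["case1", "case2", "case3", "case4"],
    "Provider B": ["case1", "case2", "case5", "case6"],
    "Provider C": ["case1", "case7", "case8", "case3"],
}

unique_share, shared_share = overlap_summary(sample)
print(f"unique to one provider: {unique_share:.1%}")
print(f"returned by all providers: {shared_share:.1%}")
```

In this toy data, 5 of the 8 distinct cases come from only one provider and 1 of the 8 comes from all three. A real comparison would need the actual top-10 lists from each service for the same query and jurisdiction.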

I could go on & on.  Here’s the upshot, in an excerpt of the abstract from Professor Mart’s paper:

  • When legal researchers search in online databases for the information they need to solve a legal problem, they need to remember that the algorithms that are returning results to them were designed by humans. The world of legal research is a human-constructed world, and the biases and assumptions the teams of humans that construct the online world bring to the task are imported into the systems we use for research. This article takes a look at what happens when six different teams of humans set out to solve the same problem: how to return results relevant to a searcher’s query in a case database. When comparing the top ten results for the same search entered into the same jurisdictional case database in Casetext, Fastcase, Google Scholar, Lexis Advance, Ravel, and Westlaw, the results are a remarkable testament to the variability of human problem solving. There is hardly any overlap in the cases that appear in the top ten results returned by each database. An average of forty percent of the cases were unique to one database, and only about 7% of the cases were returned in search results in all six databases. It is fair to say that each different set of engineers brought very different biases and assumptions to the creation of each search algorithm. One of the most surprising results was the clustering among the databases in terms of the percentage of relevant results. The oldest database providers, Westlaw and Lexis, had the highest percentages of relevant results, at 67% and 57%, respectively. The newer legal database providers, Fastcase, Google Scholar, Casetext, and Ravel, were also clustered together at a lower relevance rate, returning approximately 40% relevant results.

Most importantly, here’s the ethics hook:  Rules 1.1 & 1.3 require lawyers to provide competent & diligent representation. Knowing that results vary widely by provider, do Rules 1.1 and 1.3 require lawyers to use more than one provider when conducting online legal research?

Although I’ve not yet had my daily requirement of coffee, my initial reaction is that it’d be much easier to argue “yes” than to argue “no.”  Actually, the real answer might be that it’s neither competent nor diligent for a lawyer to limit research to the first 10 results for a single query.

Indeed, in the abstract to her paper, Professor Mart notes:

  • “Legal research has always been an endeavor that required redundancy in searching; one resource does not usually provide a full answer, just as one search will not provide every necessary result. The study clearly demonstrates that the need for redundancy in searches and resources has not faded with the rise of the algorithm. From the law professor seeking to set up a corpus of cases to study, the trial lawyer seeking that one elusive case, the legal research professor showing students the limitations of algorithms, researchers who want full results will need to mine multiple resources with multiple searches.”

Anyhow, I was excited to post this, but now I can’t think of a creative way to wrap it up or to make a point.  I guess my point is this: know that online legal research services aren’t perfect.

Finally, maybe Professor Mart’s findings are a new twist on something that’s been going on forever.   I’m reminded of thinking “what the _____?” when I pulled a case that did not “follow” the case that I’d been thrilled to find, even though Shepard’s had promised me (with an “f”) that it would.  The human who coded it was, in fact, only human.



One thought on “Online Legal Research: is there an affirmative duty to use more than one research platform?”

  1. Good topic. It’s a problem with the law and why I don’t like relying on Restatements without “diligent” research designed to support the particular circumstances of a case. Restatements act as Algorithms and shouldn’t be used as such.

    The legal system needs better means to know itself. Algorithms in law are both tempting and appalling. They provide pre-formed means to resolve specific variable relations, but American law relies on the flexibility of law to adapt to similar fact situations with nuanced or quiet but significant differences.

    One of the problems with law is the methods are not self-checking like in empirical sciences or math. Equations and experiments prove right and wrong. A legal system can fail for long periods before “self-checking” in the form of police action or civil war takes place.

    We should borrow from the social sciences a bit. I ran across an old Law Review article by a Professor from Case Western Reserve. See CWRU Law Review/Vol. 19, Issue 1: “The Relevance of Behavioral Science for Law”

