Essay

Making Doctrinal Work More Rigorous: Lessons from Systematic Reviews

William Baude

Neubauer Family Assistant Professor of Law, The University of Chicago Law School

Adam S. Chilton

Assistant Professor of Law, The University of Chicago Law School

Anup Malani

Lee and Brena Freeman Professor of Law, The University of Chicago Law School

Legal scholars, lawyers, and judges frequently make positive claims about the state of legal doctrine. Yet despite the profligate citation norms of legal writing, these claims are often supported in a somewhat imprecise way—such that the exact evidence is unclear or difficult for others to probe or falsify. In response to similar issues, other disciplines have developed methodological standards for conducting “systematic reviews” that summarize the state of knowledge on a given subject. In this Essay we argue that methods for performing systematic reviews that are specifically tailored to legal analysis should be developed. We propose a simple four-step process that could be used whenever someone is trying to make objective claims about the state of legal doctrine, and we illustrate the value of this method by applying it to doctrinal claims that have been made in recent legal scholarship.

I. The Case for Increased Rigor

We begin by surveying unsystematic claims about the state of legal doctrine, then go on to explain why, even if the claims are true, there are still benefits to more systematic review.

A. Examples

Lawyers regularly make claims about the law, and in particular about case law. Indeed, it might be one of the research tasks that they are most frequently paid to do. And while much legal scholarship is more normative, claims about the law are still common. For example, a civil procedure scholar may argue that a particular rule for class action cases is the increasingly prevailing view in federal courts, or a public law scholar may discuss the administrability problems created by a trend in state constitutional law. Yet those scholars might point to only two or three cases as evidence of the trend, and provide no information about the universe from which they were chosen.

These are not just hypothetical examples—both are from recent law review articles. We stress that in each case, the authors may well be right. Indeed, we have no particular reason to think that these experts in their fields are wrong. And by describing these examples, we do not mean to criticize them for failing to adhere to an existing standard of proof or citation (which is why we do not name them here). In fact, our argument is that these examples are not unique, but instead illustrative of a broad pattern.

To get a better sense of what kind of evidence is provided to establish legal claims, we reviewed every article published in the last completed volume of ten top law reviews.4 For each article, we had a research assistant read the abstract and record any claim about the state of legal doctrine.5 The research assistant then read the article and recorded the evidence that was provided as support for those claims. Finally, we coded the support provided for the doctrinal claim into one of three categories: citing one or zero cases for support; citing multiple cases as support; or conducting some form of a systematic review (that is, defining the entire set of cases that was relevant to the claim and the evidence to support it).

The results of this research are presented in Table 1. The analysis suggests that roughly 44 percent (56 of 127) of articles included a claim about the state of legal doctrine in the abstract. Of these 56 articles, only 25 percent (14 of 56) provided any form of systematic review to support the doctrinal claim. The rest of the articles provided string cites to cases (and occasionally academic articles as well), but did not explain how they identified the universe of cases or whether the cases were representative.

This strikes us as suboptimal. The norms of citation in legal academia ought to be designed to give nonexpert readers a chance to test those claims and a sense of how much confidence those claims deserve. Again, we do not fault anybody for failing to adhere to a norm that does not yet exist. But our suggestion is that it would be good for legal academia to develop a standard that helps legal analysts more rigorously see and more persuasively show what the law is.

Table 1. Support for Doctrinal Claims in Recent Volumes of Ten Major Law Reviews

Although this analysis focused on legal scholarship, we also see the same problems in more formal academic output, such as the Restatements of Law published by the American Law Institute. The Restatements have long been an important and widely cited resource in American law,6 and a recent volume has been given “the highest praise” for its “clear and careful exposition of the law.”7

But that very same volume has proven controversial in the courts. In a recent Supreme Court case, the justices divided over whether to accept a special master’s decision that had relied heavily on the third Restatement of Restitution and Unjust Enrichment.8 The majority adopted the master’s recommendation, repeatedly citing the Restatement,9 while a dissent complained that the Restatement “lacks support in the law,” would “alter the doctrinal landscape of contract law,” and had not been relied on by courts.10

Justice Antonin Scalia wrote separately to criticize the Restatement even more pointedly. “[M]odern Restatements,” he said, “must be used with caution.”11 They “have abandoned the mission of describing the law,” and contain sections “that should be given no weight whatever as to the current state of the law.”12 Hence, he concluded, “it cannot safely be assumed, without further inquiry, that a Restatement provision describes rather than revises current law.”13

The power of these criticisms was exacerbated by the methodological ambiguity of the Restatements: when do they describe the law and when do they aim to revise it? But the American Law Institute may be trying to do better. In 2015, the Institute adopted new principles clarifying the “four principal elements” of the Restatement process, two of which are descriptive (“to ascertain the nature of the majority rule” and “to ascertain trends in the law”) and two of which are normative (determining what rule would produce more coherence or be more desirable overall).14 While “the relative weighing of these considerations” is “art and not science,” the Institute also acknowledges that it “needs to be clear about what it is doing.”15 And more specifically, Professors Oren Bar-Gill, Omri Ben-Shahar, and Florencia Marotta-Wurgler are writing a new Restatement of the Law, Consumer Contracts using principles analogous to the ones we discuss here.16 These are great steps. Our goal is to assist and encourage these approaches.

B. The Value of a More Rigorous Approach

Even if a given claim about legal doctrine is correct, there are benefits to establishing the claim in a more rigorous way. We will briefly mention five.17

First, a more rigorous demonstration of evidence makes it easier for readers to evaluate whether the ultimate claims are true or false. When less comprehensive support is provided, readers instead rely on their outside knowledge or on the author’s credibility as evidence for the validity of the claim. Expecting readers to rely on these proxies is problematic because not everyone will have the same outside knowledge or view of the author’s reputation. Using reputation as a proxy also invites ad hominem attacks on the author’s credibility.

Second, a more rigorous demonstration of evidence makes it easier for readers to assess how much uncertainty is associated with a given claim. For example, it may be true that courts generally agree on a point of law, but it is still valuable to know how many cases have disagreed. Similarly, it is valuable to know whether a trend has been shown only in certain courts, or in certain years. This evidence helps a reader understand the degree of uncertainty associated with a claim, and also know the scope conditions of when that claim is valid.18

Third, providing more complete support for claims can reduce error. Even authors who are fairly confident in their knowledge make mistakes. When authors undertake to demonstrate their work, they will be less likely to make a false statement by mistake. This logic has been one of the reasons that quantitative researchers are increasingly expected to provide their data and code. Simply put, the original researchers will be more careful when they know it will be easier for future researchers to double-check their work.

Fourth, more complete documentation of support increases general progress in the field. Both common-law legal reasoning and research are social enterprises in that they build on work from the past. When authors do not document the support for their claims, however, people trying to answer the same questions in the future have to recreate their work. Because research is a social enterprise, research norms should support this kind of documentation, just as journals and funding agencies increasingly require empirical researchers to publish their data.19

Fifth, providing such demonstrations can help to reduce actual or perceived bias.20 A large body of scholarship has studied the role that political ideology plays in legal decision-making. This literature has consistently found that the political views of judges predict their decisions,21 and more recently has even found that the political views of law professors predict the conclusions they reach in their scholarship.22 One way to help reduce the risk or perception of bias is to provide the evidence that the claim is based on.

II. Systematic Reviews

In this Part, we discuss the history of and justifications for systematic review, explain the steps of systematic reviews, and discuss whether it is an appropriate model for doctrinal work. The last task is the most critical, as systematic review is not a perfect fit for doctrinal work, so only steps that are profitably imported into analysis of case law should be recommended.

A. History and Justification

The sciences, especially the biological and psychological sciences, have long recognized the need for a methodology to synthesize the results of prior research on a scientific question.23 An individual study may have a limited sample and thus limited statistical power to answer a research question. Moreover, its specific conclusions may be bound by the specific circumstances in which it was conducted. By contrast, a review could aggregate the data and contexts from multiple studies to yield both a more precise and a more generalizable study.24 The intellectual challenge of finding a method to combine results from multiple studies has long attracted the attention of leading statisticians, including Professor Karl Pearson and Sir Ronald Fisher in the early twentieth century.25 A famous early example is Pearson’s effort to synthesize a number of studies that examined the value of enteric fever inoculation in 1904.26

Demand for a method for synthesizing studies was initially limited, however, because there were simply too few medical studies conducted to be synthesized and because medical practice was informal and decentralized. As reliable research designs developed—especially the randomized controlled trial—and computing power increased, more and more primary research was conducted.27 Moreover, in the 1970s, a movement emerged that argued that medical practice should be driven by research evidence and not physicians’ idiosyncratic personal experiences or hunches.28

One of the principal products of the evidence-based medicine movement is the Cochrane Collaboration, which promotes the development of a rigorous methodology for synthesis, also known as “systematic reviews,” and hosts an online database of reviews of prior research.29 The need to define best practices for systematic reviews is now embraced widely in the medical literature, which has generated consensus statements on how such reviews ought to be conducted.30

The primary alternative methodology to the systematic review is the narrative review. A narrative review is a mainly qualitative, critical examination of the prior literature on a subject.31 The main criticism of this methodology—and thus the justification for systematic reviews—is that the authors have discretion to select which medical studies they review and how they interpret the studies they select. This discretion can lead to confirmation bias—authors select articles that tend to reinforce the authors’ priors.32 Moreover, the narrative review does little to address the problem of publication bias, which is the tendency for papers with less interesting results—usually results showing no effect, also known as null results—not to be published.33 This omission leads to overestimates of correlations, which often means the reviews will conclude that treatments have effects even when they actually may not.34

B. Steps in a Systematic Review

Systematic reviews address these biases with four basic steps. First, a review’s author clearly defines the question she seeks to answer.35 For example, what is the value of bariatric surgery for reducing obesity?36 This helps ensure that the author stays on target when searching for relevant literature. Although it may be too obvious to need stating, a major cause of bias is an author answering a different question than the one that motivated a review.37

Second, the author conducts an exhaustive search for relevant studies. In order for readers to judge how well the search was done, the author should be explicit about the databases searched, the search terms used, and any inclusion or exclusion criteria applied.38 The latter are criteria that determine whether a study falls within the ambit of her search or is to be dropped because it does not.39 Disclosures about the search methods also allow the reader to judge the potential for bias in the review40 and support the development of “best practices” for search. The literature search step is crucial because an important source of confirmation bias is the omission of relevant studies that may disagree with the author’s prior beliefs about the correct answer to her research question.41

Third, the author appraises the quality of the studies that she has gathered.42 This is different from applying exclusion criteria, which are typically based on explicit criteria like whether the studies look at the right treatment, the target patient population, the intended outcome, etc. The quality appraisal looks instead at things like the methodology employed in the study (for example, whether it was an observational study or a randomized controlled trial,43 or whether it was double-blind, single-blind, or not blinded44). This step is used to increase the weight of methodologically sound studies in the author’s subsequent synthesis of the evidence across studies.

Finally, the author synthesizes the results of the different studies that survive. The author should be explicit about the methodology she uses to synthesize the studies.45 For example, she may use a voting method in which she simply counts the number of studies that find positive impacts of a treatment and those that do not and then reports what the majority of studies find, perhaps with different votes for different classes of studies, with classes defined by the quality of the study.46 She may be even more rigorous and extract the statistical results from each and combine them using meta-analysis, a quantitative methodology for combining summary statistics or even the data from multiple studies.47 The author should also be explicit about how she thinks publication bias may affect the conclusions she is able to draw. Obviously, the better the method of synthesis the author employs, the better the review. However, being explicit about the method is almost as important as the method itself, because transparency allows others to replicate the review author’s work, ensuring that the review was not manipulated and increasing confidence in the review’s conclusions.48
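
As a rough illustration, the two synthesis methods just described can be sketched in Python. Everything specific here is an assumption made for the example: the three studies, their effect sizes and standard errors, the quality classes, and the number of votes per class are hypothetical, and the pooling shown is the simple inverse-variance (fixed-effect) form of meta-analysis rather than any particular method endorsed in the medical literature.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class Study:
    effect: float     # estimated treatment effect (hypothetical units)
    std_error: float  # standard error of that estimate
    quality: str      # class assigned in the quality-appraisal step


def vote_count(studies, votes_per_class):
    """Weighted vote count: share of all votes cast by studies finding a positive effect."""
    positive = sum(votes_per_class[s.quality] for s in studies if s.effect > 0)
    total = sum(votes_per_class[s.quality] for s in studies)
    return positive / total


def fixed_effect_pool(studies):
    """Inverse-variance (fixed-effect) pooled estimate and its standard error."""
    weights = [1.0 / s.std_error ** 2 for s in studies]
    pooled = sum(w * s.effect for w, s in zip(weights, studies)) / sum(weights)
    return pooled, sqrt(1.0 / sum(weights))


# Hypothetical studies; none of these numbers comes from a real review.
studies = [
    Study(effect=0.40, std_error=0.10, quality="rct"),
    Study(effect=0.10, std_error=0.20, quality="observational"),
    Study(effect=-0.05, std_error=0.20, quality="observational"),
]
votes = {"rct": 2, "observational": 1}  # assumed votes per class, stated up front

print("share of votes for a positive effect:", vote_count(studies, votes))
print("pooled effect and standard error:", fixed_effect_pool(studies))
```

The point of the sketch is transparency rather than sophistication: because the votes per class and the pooling rule are written down, a reader can rerun the synthesis with different choices and see whether the conclusion changes.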

C. An Appropriate Model for Doctrinal Work?

Although much of the impetus for development of a methodology for systematic reviews comes from the biological and psychological sciences, it would seem to be of value to any field wherein there is a need for synthesizing the results from multiple inquiries into the same issue. One of the early converts to systematic reviews was the public policy literature, which set up the Campbell Collaboration to support and disseminate such reviews of policy interventions, especially in the fields of education, crime and justice, social welfare, and international development.49 Efforts have also been made to import this methodology into management science50 and even software engineering.51

It would seem that legal research, especially doctrinal work, would be a natural candidate for application of systematic review. As noted above, many scholars make descriptive claims about the law, and that work may be vulnerable to conscious or unconscious bias because the author neglects cases that do not fit.52 Readers of doctrinal work cannot assess any bias from this case selection process, and can compound the problem by citing uncritically the conclusions of the doctrinal analysis in their own legal analysis.

The mere need to synthesize prior work, however, is not sufficient for justifying the wholesale importation of the methodology of systematic reviews. There are important differences between the medical sciences, for which the approach was developed, and doctrinal analysis. First, medical studies are quantitative while legal cases are qualitative. It is more difficult to aggregate or combine qualitative research. Second, medical studies have positive aims (for example, figuring out whether a treatment works or not), while legal analysis often embeds normative aims (for example, arguing that one rule is better than another).53

These differences justify caution when translating elements of systematic reviews to doctrinal work, but do not necessarily justify ignoring entirely the lessons of the methodology. The fact that prior cases are qualitative does not at present prevent lawyers and legal academics from drawing conclusions from prior cases about what courts are likely to do in future cases. The lesson we should learn from systematic reviews is that, even when conducting qualitative synthesis, an author should be clear about which cases made her sample. This will reduce the risk that the author draws incorrect conclusions because her qualitative synthesis ignored certain relevant cases, and allow future researchers to know how to expand on or replicate the author’s claims. She should also be clear about the sorts of logical steps she took when conducting her qualitative synthesis (for example, which cases she valued more because of the judge or because the context was more generalizable).

Likewise, the fact that legal work is often normative54 is not an argument against greater rigor during case selection and transparency about the nature of legal analysis. Indeed it is the opposite. Readers may mistrust a positive argument if they suspect that the author is smuggling in normative analysis, and they may be misled by a normative argument whose positive premises are unclear. Systematic review clarifies the relationship between positive and normative and so helps normative arguments be made more clearly and rigorously.55 As for authors who might wish to be intentionally unclear, our analysis makes it easier for the reader to disentangle unsystematic steps in the author’s analysis.

III. Developing a Method of Systematic Review for Legal Analysis

In this Part we first outline a process for how to conduct a systematic review of legal doctrine and then provide an example of this process for a recent piece of legal scholarship.

A. A Four-Step Process for Conducting a Systematic Review of Legal Doctrine

We propose a four-step process for making claims about the state of legal doctrine: (1) clearly stating the legal question that is being answered; (2) defining the sample of cases that will be used; (3) explaining how the cases in the sample will be weighted; (4) conducting the analysis of the sample of cases and stating the conclusion. We briefly explain each of these four steps below.

1. Stating the question.

The first step in providing the evidence for a legal claim is defining the exact question that the subsequent analysis is trying to answer. There are two things to keep in mind at this stage.

First, the question should be precise. The idea of stating a legal question will obviously be familiar to anyone in the legal profession. Legal questions are asked during Socratic cold calls during law school, are used to motivate legal memos, and guide many forms of legal briefs. These questions, however, are often asked in a fairly broad manner. The key when asking a legal question to motivate a systematic review of legal doctrine is to make sure to state a question that is sufficiently precise as to guide the time frame, jurisdictions, and relevant universe of cases that will be used to answer the question.

Second, it is helpful to think about what evidence is required to establish a given claim. For example, if the question is how courts “typically” decide a particular type of case, answering the question requires knowing, say, the median way that courts have decided the case. Once again, knowing what evidence is required for the question helps to determine exactly what sample of cases is relevant and how to analyze them. Below we provide examples of common kinds of claims and the evidence they require.

Courts generally decide issue X in way Y. This kind of claim can be thought of as calling for the median outcome, or “majority rule,” for a given kind of case. To establish this kind of claim, it is necessary both to establish the universe of relevant cases and to classify the outcomes of those cases in some way.

Courts have increasingly decided issue X in way Y. This kind of claim can be thought of as calling for the correlation of outcomes over time. To establish this kind of claim, it is necessary to establish the universe of relevant cases, to classify the outcomes of those cases, and to make note of when those cases occurred.

There is a split in how courts decide issue X. This kind of claim can be thought of as making a claim about the variance of outcomes. Depending on the scope of this claim, it may be necessary to establish the universe of relevant cases and to classify the outcomes of those cases.

Courts have frequently confronted issue X. This can be thought of as a claim about the size of a given sample. Making this claim thus requires documenting the number of cases that meet the relevant criteria.

At least one court has decided issue X in way Y. This can be thought of as a claim about the existence of a given phenomenon. To establish this claim, it is not necessary to establish the universe of cases. Instead it is simply necessary to find one case that meets a given criterion.
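
The correspondence between these kinds of claims and the evidence each requires can be sketched in Python. This is only an illustration under stated assumptions: the five coded cases, their years, and the outcome labels are hypothetical, and the "increasingly" claim is approximated by comparing the share of outcomes before and after a cutoff year rather than by a formal measure of correlation.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class CodedCase:
    name: str
    year: int
    outcome: str  # how the court decided issue X, e.g. "Y" or "not-Y"


def majority_share(cases, outcome):
    """'Courts generally decide X in way Y': share of the universe with that outcome."""
    return sum(c.outcome == outcome for c in cases) / len(cases)


def share_by_period(cases, outcome, cutoff_year):
    """'Courts have increasingly decided X in way Y': share before vs. from the cutoff.

    Assumes each period contains at least one case.
    """
    early = [c for c in cases if c.year < cutoff_year]
    late = [c for c in cases if c.year >= cutoff_year]
    return majority_share(early, outcome), majority_share(late, outcome)


def outcome_counts(cases):
    """'There is a split': tally of outcomes; len(cases) speaks to 'frequently'."""
    return Counter(c.outcome for c in cases)


def exists(cases, outcome):
    """'At least one court has decided X in way Y': one qualifying case suffices."""
    return any(c.outcome == outcome for c in cases)


# Hypothetical coded cases, invented for the example.
cases = [
    CodedCase("Case A", 2008, "not-Y"),
    CodedCase("Case B", 2010, "not-Y"),
    CodedCase("Case C", 2012, "Y"),
    CodedCase("Case D", 2014, "Y"),
    CodedCase("Case E", 2015, "Y"),
]

print(majority_share(cases, "Y"))
print(share_by_period(cases, "Y", cutoff_year=2012))
print(outcome_counts(cases), len(cases))
print(exists(cases, "Y"))
```

Only the last function can be answered without a defined universe of cases; every other one divides by, or counts over, the full list, which is why the universe has to be established first.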

2. Defining the sample of cases.

After a question has been clearly stated, the next step is to define the relevant sample of cases that were analyzed. There are also two major steps to this process.

First, it is important to establish what process was used to assemble the universe of cases. For example, one might say what courts one searched for cases from, and over what time period. This way it is possible for anyone else to understand exactly the universe of cases that was analyzed as support for a given doctrinal claim.

Second, it is important to state any inclusion or exclusion criteria that were applied to a sample of cases. For example, if the universe includes a large number of cases, it is important to say which cases were analyzed. In some situations, the entire sample of cases may be analyzed, but in others it might be a random sample. Alternatively, it may be the case that certain kinds of cases are excluded from the analysis because they are not relevant (for example, all potentially express preemption cases in an inquiry into field preemption). All of these decisions should be clearly documented.

Finally, in an ideal world (or if a process like ours begins to become more commonplace), one might also hope that analysts would specifically document the technology of their search process. For instance, they might say what databases they searched, what terms they used, and on what dates the search was conducted. This is considered an important step of systematic reviews in the medical literature. But we suspect that there may be more reluctance and resistance to translating it into legal scholarship. This is likely partly for reasons of style and etiquette, but also because the legal research process is more heterogeneous than the research processes in other disciplines. Although it would be beneficial if scholars documented this part of their process as well, it is not as important as clearly defining the universe of cases.
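
One way to make a sample definition reproducible is to write it down as data and apply it mechanically, keeping a reason for every excluded case. The Python sketch below does this for the field preemption example mentioned above. The class names, field names, court labels, and the four cases are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Case:
    name: str
    court: str
    year: int
    issue: str


@dataclass(frozen=True)
class SampleDefinition:
    courts: tuple           # which courts were searched
    first_year: int         # start of the time period (inclusive)
    last_year: int          # end of the time period (inclusive)
    excluded_issues: tuple  # exclusion criteria, stated in advance

    def apply(self, universe):
        """Return (included, excluded); excluded pairs each case with the reason."""
        included, excluded = [], []
        for case in universe:
            if case.court not in self.courts:
                excluded.append((case, "court outside search"))
            elif not self.first_year <= case.year <= self.last_year:
                excluded.append((case, "outside time period"))
            elif case.issue in self.excluded_issues:
                excluded.append((case, "excluded issue"))
            else:
                included.append(case)
        return included, excluded


# Hypothetical definition for an inquiry into field preemption.
definition = SampleDefinition(
    courts=("federal district",),
    first_year=2010,
    last_year=2015,
    excluded_issues=("express preemption",),
)

# Hypothetical search results, invented for the example.
universe = [
    Case("Case A", "federal district", 2011, "field preemption"),
    Case("Case B", "federal district", 2012, "express preemption"),
    Case("Case C", "state supreme", 2013, "field preemption"),
    Case("Case D", "federal district", 2009, "field preemption"),
]

included, excluded = definition.apply(universe)
print([c.name for c in included])
for case, reason in excluded:
    print(case.name, "excluded:", reason)
```

Recording the reason alongside each excluded case is a design choice, not a requirement of the method; it simply lets a later reader check each exclusion instead of taking the final sample on trust.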

3. Explaining the weighting.

Once the sample of cases is established, it is important to state how the cases in the sample will be weighted in the analysis. Just as it may not be appropriate to give all clinical studies equal weight during a systematic review of the medical literature on a given subject if the quality of the studies differs, it may not be appropriate to give all cases the same weight. For instance, it may be appropriate to weight cases more heavily if they are: of greater precedential status; more recent; cited more frequently or written by more frequently cited judges; or engaged in more analysis on the relevant topic. Once again, the key is transparency. Legal analysis need not be the simple sum of equally weighted cases, but the weighting should be explained to readers.
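
A minimal Python sketch shows how an explicit weighting scheme can change the answer and why it should be disclosed. The weights by court level and recency, the attribute names, and the three cases are assumptions invented for the example; the essay does not prescribe any particular numbers.

```python
def case_weight(case, weights):
    """Multiply together the stated weight for each attribute of a case.

    Attributes or values with no stated weight count as 1.0 (equal weighting).
    """
    w = 1.0
    for attribute, value in case["attributes"].items():
        w *= weights.get(attribute, {}).get(value, 1.0)
    return w


def weighted_share(cases, outcome, weights):
    """Share of total weight held by cases that reached the given outcome."""
    total = sum(case_weight(c, weights) for c in cases)
    matching = sum(case_weight(c, weights) for c in cases if c["outcome"] == outcome)
    return matching / total


# Assumed weighting scheme, stated up front so readers can contest it.
weights = {
    "court": {"supreme": 3.0, "appellate": 2.0, "trial": 1.0},
    "recent": {True: 1.5, False: 1.0},
}

# Hypothetical cases, invented for the example.
cases = [
    {"outcome": "Y", "attributes": {"court": "supreme", "recent": False}},
    {"outcome": "not-Y", "attributes": {"court": "trial", "recent": True}},
    {"outcome": "not-Y", "attributes": {"court": "trial", "recent": True}},
]

print("equal weights:", weighted_share(cases, "Y", {}))
print("stated weights:", weighted_share(cases, "Y", weights))
```

With equal weights, outcome Y is the minority position (one case in three); under the stated weights it holds half of the total weight. Neither figure is the correct one in the abstract, which is why the weighting needs to be explained to readers.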

4. Conducting the analysis and stating the conclusion.

The final step is analyzing the sample and answering the question posed. There are three pieces of information that should be provided about this process.

First, one should provide the criteria that were used to analyze the cases. This may be as simple as saying, “I counted any case that mentioned issue X as relevant” or “I counted cases as relevant only if the central issue of the case was X.”

Second, one should say how the cases were analyzed. For example, one approach may be to conduct a keyword search over a set of cases, while another would be to carefully read all of the relevant cases.

Third, a conclusion should be stated that is not broader than what the evidence can support. For example, if only federal district court opinions from 2010 to 2015 were analyzed during this process, the conclusion that follows is that “district court decisions between 2010 and 2015 handle issue X in way Y” and not “courts handle issue X in way Y.” To be sure, scholarship frequently asks readers to make inferences from one set of data points to a broader one: the fact that a certain set of decisions handle issue X in way Y may be argued to imply that other courts do so as well. But once again, a clear analysis should specify what claim is being made about the cases and what the requested inference is.

B. A Sample Review

We hope that this four-step process can serve as a relatively simple way to advance the rigor—and hence the credibility and transparency—of doctrinal analysis. In their own work, Professors Bar-Gill, Ben-Shahar, and Marotta-Wurgler are using a systematic review to write a Restatement,56 and we applaud the effort. We think similar methods can add to the value of legal scholarship, and we will try to demonstrate with a concrete example.

One of us (Baude) previously published an article that investigated whether “originalism” is “our law,” in part through a synthesis of Supreme Court opinions.57 We think that the persuasiveness of that analysis might have been helped by the principles of systematic review. And so in the course of writing this Essay we decided to conduct a systematic review relevant to some of the claims in that article. Below, we describe the steps of that review and its results.

1. Stating the question.

One of the claims in the article was that the Supreme Court’s cases, with no exceptions or relatively few exceptions, were consistent with what Baude described as “inclusive originalism.”58 More specifically, it claimed: “First, in cases where the Court acknowledges a conflict between original meaning or textual meaning and another source of constitutional meaning, the text and original meaning prevail. Second, across the larger run of cases that do not feature an explicit clash of methodologies, the Court never contradicts originalism.”59

To check this claim more systematically, we examined a set of 280 Supreme Court cases60 with the help of a research assistant and answered the following questions for each case: (1) Did the case decide a constitutional question? (2) If so, did the Court either reject the original meaning or say that the original meaning would not matter to its analysis?

2. Defining the sample.

The previous article attempted to focus on Supreme Court cases that reflect our current positive law commitments, which include both modern cases and older decisions that continue to be recognized as “canonical.”61 For purposes of our review, we focused on a subset of these cases and used a media salience metric called the “NYT Measure”: whether a case was listed on the front page of The New York Times.62 We defined the sample to include all 280 cases decided between 1989 and 2009 (the most recent period available) that appeared on the front page of The New York Times. We then excluded the eighty-four cases that did not decide a constitutional issue.

This is of course an incomplete sample, and we note that several important cases discussed in the article63 did not appear in the dataset. But the metric is a “valid, reliable, and unbiased measure of salience,”64 and therefore useful for a systematic review of salient cases.

3. Explaining the weighting.

Our narrow definition of the sample—focusing on only salient cases—means that almost all cases that discussed the original meaning of a constitutional provision could get equal weight. However, depending on the specific question, it could be appropriate to give older cases or cases whose reasoning is partly repudiated or contested less weight in the final analysis.

4. Conducting the analysis.

The results of our analysis are presented in Table 2. Of the 196 constitutional law cases in our sample, our systematic review revealed only one in which the Court seemed to say that the original meaning of the constitutional provision (known or not) did not matter: Lawrence v Texas.65 It is worth noting that this case was discussed at length in the original article.66

Our review also uncovered eight other borderline cases: County of Allegheny v American Civil Liberties Union, Greater Pittsburgh Chapter67 (since implicitly partly overruled by a 2014 decision68); Planned Parenthood of Southeastern Pennsylvania v Casey69 (also discussed at length in the article70); BMW of North America, Inc v Gore;71 Kelo v City of New London;72 and a string of Eighth Amendment cases involving “evolving standards of decency.”73 Each of these borderline cases probably does not reject inclusive originalism,74 but presented a sufficiently close call that our review flagged them as unclear. This demonstrates an additional useful function of the review (identifying cases that might deserve further explanation) in addition to demonstrating one of the article’s claims in a more systematic way.

Table 2. Systematic Review of Originalism in Salient Supreme Court Cases

Conclusion

Although we believe that legal analysis could be improved if methodological standards for analyzing case law were developed, we acknowledge that our process has drawbacks. Most notably, documenting the steps we describe can consume time and space that could be spent on other things. Nor is systematic review appropriate for advocates making normative or prescriptive claims about what legal doctrine should be.

But we hope to convince others of the benefits of this framework when positive claims about legal doctrine are a central part of the analysis in law reviews, Restatements, and judicial opinions. Law review articles provide research for lawyers, judges, and policy-makers to rely on. They would be more useful, and perhaps more likely to be cited, if they provided all the evidence necessary to support their central claims. Systematic reviews could help the reporters of Restatements alleviate the concern that they color their analysis to reach their desired conclusions. Systematic reviews could also help courts by lending credibility to their decisions and reducing any perception of bias.

Even if many authors are reluctant to adopt these techniques directly, we believe their insights can be useful in other ways as well. For instance, for claims that are not central to an analysis, it still may be best to cite secondary sources that did conduct a systematic review. This is because these sources would provide better evidence than articles that may have made the same claim while simply citing other articles or legal materials. And when one is skeptically questioning a doctrinal claim that does not document its methodology, our framework may provide a useful point of departure—it can help critics and skeptics zero in on which part of an argument most needs to be supported and proven.

Finally, we recognize that there are many different ways to incorporate some of the insights of systematic reviews. We do not intend this Essay to be the final statement on the matter, but instead hope to generate debate on how more rigorous methods can be incorporated into traditional legal analysis.

4We set out to analyze the flagship law reviews of the ten highest-ranked schools in the 2017 US News & World Report ranking of law schools. See Best Law Schools (US News & World Report 2016), archived at http://perma.cc/6NJ2-NJGE. Because the flagship journals of two schools—the University of Pennsylvania Law Review and the Virginia Law Review—did not consistently have abstracts for their articles, we skipped these schools in our analysis and moved to the next schools on the list.
5We focused on doctrinal claims made in the abstract because our goal was to identify doctrinal claims that were central to the article’s argument. The abstracts of some articles contained multiple doctrinal claims, and each doctrinal claim was counted independently.
6See Caleb Nelson, The Persistence of General Law, 106 Colum L Rev 503, 510 n 35 (2006) (“[C]ourts continue to treat the Restatements as presumptively accurate summaries of general American jurisprudence.”). See also Bennett Boskey, The American Law Institute: A Glimpse at Its Future, 12 Green Bag 2d 255, 258 (2009) (“It is fair to say that, on the whole (though of course not 100 percent of the time), the Restatement Second became a benign influence that moved the law along progressively and toward greater certainty but without undue disruption.”).
7Ben Kremer, Book Review, 35 Melb U L Rev 1197, 1215 (2011) (praising the Restatement (Third) of Restitution and Unjust Enrichment, which was published in 2011). See also Lionel Smith, Book Review, 57 McGill L J 629, 629, 632–33 (2012) (same).
8See Kansas v Nebraska, 135 S Ct 1042, 1056–58 (2015); id at 1064 (Scalia concurring in part and dissenting in part); id at 1068–69 (Thomas concurring in part and dissenting in part).
9Id at 1056–58.
10Id at 1068–69 (Thomas concurring in part and dissenting in part), quoting Caprice Roberts, Restitutionary Disgorgement for Opportunistic Breach of Contract and Mitigation Damages, 42 Loyola LA L Rev 131, 134 (2008).
11Kansas, 135 S Ct at 1064 (Scalia concurring in part and dissenting in part). Ironically, the one original Restatement that Scalia cited as an example of trustworthy craft—the first Restatement of Conflict of Laws—is one that had been singled out for opprobrium by a recent officer at the American Law Institute. See Boskey, 12 Green Bag 2d at 257 (cited in note 6) (“[T]he judiciary and the bar welcomed the help of most of the Restatement First (possibly excepting the Restatement of the Conflict of Laws, for which the ideologically-imprisoned Professor Joseph H. Beale had been the reporter).”).
12Kansas, 135 S Ct at 1064 (Scalia concurring in part and dissenting in part).
13Id (Scalia concurring in part and dissenting in part).
14Capturing the Voice of The American Law Institute: A Handbook for ALI Reporters and Those Who Review Their Work *5 (ALI 2015), archived at http://perma.cc/6ZY8-MVFW. We thank Professor Richard Revesz for calling the adoption of these principles to our attention.
15Id at *6.
16See Oren Bar-Gill, Omri Ben-Shahar, and Florencia Marotta-Wurgler, Searching for the Common Law: The Quantitative Approach of the Restatement of Consumer Contracts, 84 U Chi L Rev 7, 15–18 (2017).
17These benefits largely parallel the arguments that have been used to motivate the transparency and replication movement that has been taking place in the social sciences. See Lee Epstein and Gary King, The Rules of Inference, 69 U Chi L Rev 1, 38–54 (2002).
18See id at 80–91.
19See id at 38.
20Garg, Hackam, and Tonelli, 3 Clinical J Am Society Nephrology at 253 (cited in note 2).
21See, for example, Cass R. Sunstein and Thomas J. Miles, Depoliticizing Administrative Law, 58 Duke L J 2193, 2199–2208 (2009). See also Thomas J. Miles and Cass R. Sunstein, The New Legal Realism, 75 U Chi L Rev 831, 836–40 (2008) (reviewing the literature).
22See Adam S. Chilton and Eric A. Posner, An Empirical Study of Political Bias in Legal Scholarship, 44 J Legal Stud 277, 286–93 (2015).
23See, for example, Garg, Hackam, and Tonelli, 3 Clinical J Am Society Nephrology at 253–54 (cited in note 2); Valentine, Pigott, and Lau, Systematic Reviewing and Meta-Analysis at 906–09 (cited in note 3).
24See Garg, Hackam, and Tonelli, 3 Clinical J Am Society Nephrology at 253–54 (cited in note 2); Valentine, Pigott, and Lau, Systematic Reviewing and Meta-Analysis at 906 (cited in note 3).
25See generally, for example, Karl Pearson, Report on Certain Enteric Fever Inoculation Statistics, Brit Med J 1243 (Nov 5, 1904) (presenting an early effort to combine results from different sources). See also R.A. Fisher, Statistical Methods for Research Workers 99 (Oliver & Boyd 14th ed 1970) (originally published 1925) (“[I]t sometimes happens that although few or [no statistical tests] can be claimed individually as significant, yet the aggregate gives an impression that the probabilities are on the whole lower than would often have been obtained by chance.”).
26See generally Pearson, Brit Med J 1243 (cited in note 25).
27Garg, Hackam, and Tonelli, 3 Clinical J Am Society Nephrology at 253–54 (cited in note 2).
28See, for example, A.L. Cochrane, Effectiveness and Efficiency: Random Reflections on Health Services 20–22 (Nuffield Provincial Hospitals Trust 1972).
29See About Us (Cochrane Collaboration), archived at http://perma.cc/A8W6-BNVL.
30See generally, for example, David Moher, et al, Preferred Reporting Items for Systematic Reviews and Meta-Analyses: The PRISMA Statement, 151 Annals Internal Med 264 (2009); Donna F. Stroup, et al, Meta-Analysis of Observational Studies in Epidemiology: A Proposal for Reporting, 283 JAMA 2008 (2000); David Moher, et al, Improving the Quality of Reports of Meta-Analyses of Randomised Controlled Trials: The QUOROM Statement, 354 Lancet 1896 (1999).
31See Garg, Hackam, and Tonelli, 3 Clinical J Am Society Nephrology at 253 (cited in note 2).
32See Julia H. Littell, Evidence-Based or Biased? The Quality of Published Reviews of Evidence-Based Practices, 30 Children & Youth Serv Rev 1299, 1300 (2008); Garg, Hackam, and Tonelli, 3 Clinical J Am Society Nephrology at 253 (cited in note 2).
33See Philippa J. Easterbrook, et al, Publication Bias in Clinical Research, 337 Lancet 867, 868–71 (1991); Jerome M. Stern and R. John Simes, Publication Bias: Evidence of Delayed Publication in a Cohort Study of Clinical Research Projects, 315 Brit Med J 640, 642–45 (1997).
34See Lasse M. Schmidt and Peter C. Gøtzsche, Of Mites and Men: Reference Bias in Narrative Review Articles; A Systematic Review, 54 J Fam Prac 334, 336 (2005) (finding that narrative reviews of the studied interventions were overly positive in their assessments of treatments relative to systematic reviews and clinical trials).
35See Valentine, Pigott, and Lau, Systematic Reviewing and Meta-Analysis at 906–07 (cited in note 3); Denise O’Connor, Sally Green, and Julian P.T. Higgins, eds, Defining the Review Question and Developing Criteria for Including Studies, in Julian P.T. Higgins and Sally Green, eds, Cochrane Handbook for Systematic Reviews of Interventions 83, 91–93 (Wiley-Blackwell 2008); Khalid S. Khan, et al, Five Steps to Conducting a Systematic Review, 96 J Royal Society Med 118, 118–19 (2003).
36See Henry Buchwald, et al, Bariatric Surgery: A Systematic Review and Meta-Analysis, 292 JAMA 1724, 1724–25 (2004).
37See Mark Crowther, Wendy Lim, and Mark A. Crowther, Systematic Review and Meta-Analysis Methodology, 116 Blood 3140, 3141 (2010) (“A major cause of bias in a systematic review is answering a different question to that originally asked.”).
38See Carol Lefebvre, Eric Manheimer, and Julie Glanville, Searching for Studies, in Higgins and Green, eds, Cochrane Handbook 95, 95 (cited in note 35); Julian P.T. Higgins and Jonathan J. Deeks, eds, Selecting Studies and Collecting Data, in Higgins and Green, eds, Cochrane Handbook 151, 151 (cited in note 35) (“Methods used for these decisions must be transparent.”); Khan, et al, 96 J Royal Society Med at 119–20 (cited in note 35).
39See, for example, Harriette G.C. Van Spall, et al, Eligibility Criteria of Randomized Controlled Trials Published in High-Impact General Medical Journals: A Systematic Sampling Review, 297 JAMA 1233, 1233–34 (2007).
40Garg, Hackam, and Tonelli, 3 Clinical J Am Society Nephrology at 253 (cited in note 2) (“In a systematic review, all decisions used to compile information are meant to be explicit, allowing the reader to gauge for him- or herself the quality of the review process and the potential for bias.”).
41See Littell, 30 Children & Youth Serv Rev at 1300 (cited in note 32). See also Garg, Hackam, and Tonelli, 3 Clinical J Am Society Nephrology at 256–57 (cited in note 2) (arguing that more comprehensive searches also reduce the risk of publication bias).
42See, for example, Khan, et al, 96 J Royal Society Med at 120–21 (cited in note 35); Julian P.T. Higgins and Douglas G. Altman, eds, Assessing Risk of Bias in Included Studies, in Higgins and Green, eds, Cochrane Handbook 187, 187 (cited in note 35).
43An observational study looks retrospectively at outcomes from treatments that patients chose, while a randomized controlled trial randomly assigns patients to treatment to address selection bias. See Miquel Porta, ed, A Dictionary of Epidemiology 203, 238 (Oxford 6th ed 2014).
44A single blind of the study subject prevents the subject from changing her behavior in response to the treatment, including dropping out of the study. Such behavior introduces selection effects due to either unobservable behavior while on treatment or unraveling of the benefit of random assignment. A single blind of the investigator prevents the investigator from seeing what treatment the patient received in order to limit the measurement error wherein the investigator’s measurement of (especially subjective) outcomes reflects her priors about the value of a treatment. A double-blind study blinds both the subject and the investigator. See id at 27.
45See Khan, et al, 96 J Royal Society Med at 121 (cited in note 35).
46See Philip Davies, The Relevance of Systematic Reviews to Educational Policy and Practice, 26 Oxford Rev Educ 365, 367–68 (2000) (describing the voting method). See also Valentine, Pigott, and Lau, Systematic Reviewing and Meta-Analysis at 908 (cited in note 3) (describing broadly the process of coding and synthesizing sources).
47Professors Gene V. Glass and Mary Lee Smith were among the first researchers to refer to their work as a meta-analysis. See, for example, Gene V. Glass and Mary Lee Smith, Meta-Analysis of Research on Class Size and Achievement, 1 Educ Eval & Pol Analysis 2, 3 (Jan 1979). Details of how to conduct meta-analyses may be found in Jonathan J. Deeks, Julian P.T. Higgins, and Douglas G. Altman, eds, Analysing Data and Undertaking Meta-Analyses, in Higgins and Green, eds, Cochrane Handbook 243 (cited in note 35).
48See Garg, Hackam, and Tonelli, 3 Clinical J Am Society Nephrology at 253 (cited in note 2); Valentine, Pigott, and Lau, Systematic Reviewing and Meta-Analysis at 909 (cited in note 3).
49See About Us (Campbell Collaboration), archived at http://perma.cc/J2XQ-XP4U.
50See generally, for example, David Tranfield, David Denyer, and Palminder Smart, Towards a Methodology for Developing Evidence-Informed Management Knowledge by Means of Systematic Review, 14 Brit J Mgmt 207 (2003).
51See generally, for example, Jorge Biolchini, et al, Systematic Review in Software Engineering (COPPE/UFRJ/PESC, Systems Engineering and Computer Science Department Technical Report RT–ES 679/05, May 2005), archived at http://perma.cc/Q5T9-J3WS.
52See Part I.B.
53Similar arguments have been made against the importation of systematic reviews into management science. See Tranfield, Denyer, and Smart, 14 Brit J Mgmt at 212–14 (cited in note 50).
54See Jack Goldsmith and Adrian Vermeule, Empirical Methodology and Legal Scholarship, 69 U Chi L Rev 153, 155–56 (2002).
55See Eric A. Posner and Adrian Vermeule, Inside or Outside the System?, 80 U Chi L Rev 1743, 1745 n 2 (2013):

Law professors may of course play either the role of the analyst, as when they attempt to explain judicial behavior, or the role of an actor within the system, as when they argue cases or write briefs as amici curiae. The latter activities may blur the difference between roles as a practical matter (and in some cases that blurring is precisely the point). Yet as a conceptual matter, the distinction never blurs. Law professors may switch hats very rapidly, or try to wear two hats at once, but that behavior is irrelevant to the conceptual distinction we draw.

See also id at 1797 (“At a minimum, analysts who speak both as political scientists and as legal theorists must be careful not to switch their hats so rapidly that they end up attempting to wear two hats at the same time.”).
56See generally Bar-Gill, Ben-Shahar, and Marotta-Wurgler, 84 U Chi L Rev 7 (cited in note 16).
57William Baude, Is Originalism Our Law?, 115 Colum L Rev 2349, 2370–86 (2015).
58Id at 2391.
59Id at 2371. This was not the only claim in the article, but it is the one most immediately susceptible to systematic review.
60See Part III.B.2 for how we defined that sample.
61See Baude, 115 Colum L Rev at 2371, 2391 (cited in note 57).
62The metric was developed by Professors Lee Epstein and Jeffrey A. Segal. See Lee Epstein and Jeffrey A. Segal, Measuring Issue Salience, 44 Am J Polit Sci 66, 72–73 (2000). See also Lee Epstein, et al, Table 2-13 Major Decisions of the Supreme Court: New York Times Measure, 1946–2009 Terms (CQ 2012), archived at http://perma.cc/37MT-JBGJ (listing the cases included in the metric).
63See, for example, Baude, 115 Colum L Rev at 2376 (cited in note 57) (discussing Crawford v Washington, 541 US 36 (2004)).
64Epstein and Segal, 44 Am J Polit Sci at 72 (cited in note 62).
65 539 US 558, 571–72 (2003).
66Baude, 115 Colum L Rev at 2381–82 (cited in note 57).
67 492 US 573, 590 (1989).
68See Town of Greece, New York v Galloway, 134 S Ct 1811, 1821 (2014).
69 505 US 833, 847 (1992).
70Baude, 115 Colum L Rev at 2384 (cited in note 57).
71 517 US 559, 599–600 (1996) (Scalia dissenting).
72 545 US 469, 479–80 (2005).
73Kennedy v Louisiana, 554 US 407, 419 (2008); Roper v Simmons, 543 US 551, 561 (2005); Atkins v Virginia, 536 US 304, 311–12 (2002); Hudson v McMillian, 503 US 1, 8 (1992).
74See Baude, 115 Colum L Rev at 2356–57 & n 24 (cited in note 57) (discussing the Eighth Amendment).