Typical law review articles not only clarify what the law is, but also examine the history of the current rules, assess the status quo, and present reform proposals. To make theoretical arguments more plausible, legal scholars frequently use examples: they draw on cases, statutes, political debates, and other sources. But legal scholars often pick their examples unsystematically and explore them armed with only the tools for doctrinal analysis. Unsystematically chosen examples can help develop plausible theories, but they rarely suffice to convince readers that these theories are true, especially when plausible alternative explanations exist. This project presents methodological insights from multiple social science disciplines and from history that could strengthen legal scholarship by improving research design, case selection, and case analysis. We describe qualitative techniques rarely found in law review writing, such as process tracing, theoretically informed sampling, and most similar case design, among others. We provide examples of best practice and illustrate how each technique can be adapted for legal sources and arguments.


For over a century, American legal scholars have participated in the realist project, understanding law not as an autonomous, independent system of rules, akin to geometry, but as the product of heated political, economic, and societal conflicts.1 When interpreting and evaluating the law, American legal scholars rarely limit themselves to doctrinal analysis of legal texts; they draw on diverse historical and contemporary examples to make theoretical claims more plausible. Legal scholars, however, do not usually approach this exercise as an empirical one. Indeed, legal academics often assume empirical techniques are useful only for statistical analyses.

Qualitative empirical methods commonly used across the social sciences are not systematically used to study law.2 This is surprising because qualitative methods are particularly well suited for analyzing the types of evidence, and developing the types of arguments, we typically see in law reviews. Court decisions alone offer unusually extensive and in-depth perspectives on law, on the actions of various stakeholders, and on the societal context in which these operate. Constitutions, statutes, administrative regulations, depositions, and interrogatories are among the many readily available sources lawyers draw from. Moreover, the events embedded within legal processes that produce these pieces of evidence are interconnected. For example, rules of precedent link cases, making the sequence in which cases are decided very important. Qualitative analysis tools are specifically designed to study these interdependencies, and thus are particularly useful for legal scholars. These tools are different from statistical techniques, which often require that an observation’s occurrence does not influence whether another occurs.

Instead of drawing on qualitative techniques, legal scholars depend heavily on doctrinal analysis tools to conduct research. Doctrinal analysis and social science methods often lead scholars to choose and evaluate evidence in conflicting ways. For example, doctrinal tools prompt legal scholars to focus on cases in which the highest national court introduces a significant ruling that breaks from precedent. From a doctrinal analysis standpoint, focusing on such cases makes sense: higher courts can overrule lower courts, and it would be malpractice to ignore major changes in the law. As a result, many books and articles focus on US Supreme Court cases such as Brown v Board of Education of Topeka3 and Roe v Wade.4 However, to make sound generalizations about law and society, emphasizing pathbreaking cases is often inappropriate, because they are idiosyncratic.5

To illustrate our approach, we identify extraordinary law review articles that apply qualitative methodologies effectively. Unfortunately, these articles are rare. A simple search in HeinOnline shows that, while over 84 percent of the articles published in the last fifteen years use the word “example,” only 7 percent reference qualitative and quantitative techniques used in the social sciences.6

Indeed, as Table 1 shows, quantitative methodologies are more commonly referenced in law reviews than are qualitative approaches. For example, while 4,284 articles mention random sampling, a common technique in quantitative work, only 281 articles refer to purposive sampling, a common qualitative technique.7 And even when a methodological technique is referenced, it is often applied incorrectly.8

Such limited use of qualitative methods is surprising because legal scholars are deeply concerned about the problems these methods address. Concerns about cherry-picking evidence, for example, trouble legal academics; however, few use qualitative sampling and case selection techniques. As Table 1 indicates, legal scholars are even more unfamiliar with qualitative techniques used to test and analyze theories. For example, only 136 articles referenced “process tracing,” a common method for testing causal propositions.

Table 1.  Search for Methods Terms in HeinOnline (2000–2015)9

In this Essay, we direct legal scholars to qualitative techniques appropriate for distinct research goals. We draw on Professor Martha Minow’s categorization to identify legal scholarship archetypes.10 A major category of legal projects focuses on doctrine. Some seek to restate doctrine, often by organizing case law and focusing on new developments.11 Others recast doctrine, revealing similarities among seemingly different cases.12 Many doctrinal research projects suffer from selection bias; authors emphasize examples that confirm their typologies, ignoring cases that don’t fit.13 Sampling methods are particularly helpful for these projects and allow legal scholars to generalize beyond the specific cases they analyze in depth.

Another set of legal projects aims to establish causal connections between the law and political, societal, or economic developments. Some use historical analysis to explain developments in the law and legal institutions; others engage in policy analysis, identifying legal problems and proposing solutions.14 These projects are more analogous to social scientific inquiries. To make strong causal claims, legal scholars must systematically identify and eliminate plausible alternative explanations of the outcome. To do so, we recommend two qualitative techniques. First, careful case selection can help legal scholars identify circumstances in which their theories can be effectively tested. Second, careful within-case analysis helps bolster the conclusions. This requires researchers to derive multiple empirical implications from their preferred explanations. If a great number of these implications prove true, then the researcher’s argument becomes more plausible. We describe a variety of case selection and case analysis techniques in the pages that follow.

Figure 1.  Qualitative Methods Appropriate for Different Claims15

Figure 1 above can help legal scholars locate the most appropriate methods for their projects. Doctrinal analysis tools should suffice for scholars who wish only to describe a few cases in depth. When, however, scholars wish to generalize these descriptive claims to a broader population of cases, sampling techniques are needed. And all causal claims require careful thinking about counterfactuals. In forming counterfactuals, scholars imagine plausible, alternative outcomes to the one that occurred, or alternative mechanisms to the one commonly assumed, and identify what factors led to the outcome chosen rather than the alternatives.

In the pages that follow, we start with some thoughts on identifying puzzles. We then discuss sampling and case selection techniques. We detail how scholars can use random and theoretically informed sampling to increase arguments’ generalizability and discuss case selection techniques. We then introduce process tracing, describing the importance of interdependent observations and detailing how to effectively use process tracing when observations are linked temporally and in a path-dependent manner.

I.  Imagining Alternatives and Identifying a Puzzle

“[A]ll you really need to have is an ‘explanandum’—a puzzle, paradox, or conundrum about the social world that in one way or another upsets our expectations, and for which there is no ready answer. But this is not at all a trivial accomplishment.”16

For social scientific research, the starting point—and perhaps half the battle—is identifying a puzzle that cannot be easily solved. Legal advocacy training does not highlight this element of puzzlement. In fact, many masterful legal strategists downplay the novelty of their arguments so that courts can more easily accept them.

To identify a puzzle, one can begin by imagining alternative outcomes to the one that occurred. The sources legal scholars regularly use are superb starting points for this task. The adversarial process inherently offers (at least) two alternative ways of understanding a set of facts—the plaintiff’s and the defendant’s. Amicus briefs and other third-party interventions can also help sketch out alternative options. Additionally, separate opinions from judges, including powerful concurrences and dissents, provide a range of plausible alternative legal outcomes. Furthermore, trial and appellate court judges can offer different answers to the same question, creating legally plausible alternative conclusions. In short, the legal process itself offers a broad range of well-constructed alternatives.

Legal scholars often go beyond these first steps to construct plausible but nonobvious alternative worlds, and draw comparisons across historical periods, legal fields, and jurisdictions. For example, in Pigs and Positivism, Professor Hendrik Hartog constructs a nonobvious but plausible counterfactual by examining a case concerning pig owners’ right to let pigs roam in urban settings.17 Predictably, the prosecution emphasized the risks and nuisances pigs create, while the defense minimized them.18 Drawing on historical and comparative evidence, Hartog spells out a plausible, alternative understanding of the case. Defense lawyers could have argued that pig keepers possess a customary right to let their pigs roam freely because this was a commonly accepted practice historically.19 Despite its plausibility, the defense did not make a claim about custom—why?

By identifying this third plausible alternative, Hartog demonstrates that, while prosecutors and defense attorneys predictably disagree, the terms of disagreement explain the bounds of what is legally acceptable in particular times and places.20 Hartog shows that an argument about custom was just outside the bounds of acceptability in early nineteenth-century New York City, even though it might have been entirely acceptable at a slightly earlier moment, in a more rural American setting, or in contemporary Britain.21

After imagining plausible alternatives, scholars select cases that allow them to effectively explore why a particular path was or should have been chosen rather than its alternative. In the Part that follows, we present useful techniques for scholars to systematically select cases.

II.  Sampling and Case Selection

Concerns about case selection and sampling are widespread among legal scholars, particularly the worry of cherry-picking cases that best fit an argument. What is less well-known is how to create representative samples and select cases to make credible, generalizable causal claims. We introduce some helpful sampling and case selection techniques in the paragraphs that follow.

A.    Sampling

Through sampling, researchers gather a subset of units from which they can make inferences about a broader population. Sampling techniques are useful for scholars pursuing doctrinal projects because the credibility of a generalization about doctrine depends on the representativeness of chosen examples. Sampling also holds important advantages for scholars pursuing causal arguments because it helps eliminate alternative explanations of the outcome. Below, we start with some general considerations about carefully sampling legal cases. We then present two particularly useful sampling techniques: random sampling and theoretically informed sampling. We discuss random sampling to dispel the assumption that it is too complicated to use in qualitative research. We present theoretically informed sampling because it allows scholars who work with few cases to make valid inferences.

Careful sampling requires scholars to clearly define the scope of their generalizations and the population to which their inferences apply. To see careful sampling in practice, we turn to Multiple Disadvantages: An Empirical Test of Intersectionality Theory in EEO Litigation.22 Professors Rachel Best, Lauren Edelman, Linda Krieger, and Scott Eliason sample judicial opinions in equal employment opportunity cases in US federal courts to argue that antidiscrimination lawsuits provide the least protection for plaintiffs with multiple social disadvantages.23 Plaintiffs who allege discrimination based on multiple traits, such as race and gender, are only half as likely to win their cases as other plaintiffs.24

Careful sampling is critical in making this claim persuasive. First, the authors select the appropriate unit in which to test their theory: federal circuit and district court cases.25 Circuit decisions establish precedent, while district courts handle a substantial number of discrimination cases and are thus “the primary federal locale for civil rights dispute resolution.”26 If the authors had used Supreme Court cases as their unit of analysis, it would have been harder to assess whether plaintiff characteristics influence judicial rulings. Supreme Court cases are idiosyncratic; they often involve novel issues and particularly motivated parties. The authors could not draw valid general inferences from these cases.

Second, the authors clearly explain their sample’s limitations and define the scope of their inferences. The authors randomly sampled from relevant district and circuit court opinions available on Westlaw.27 The authors emphasize that they could not include disputes that were resolved before reaching the courts or court opinions that were never published.28 By defining the limits of their sample, the authors strengthen the plausibility of their inferences.

1.   Random sampling and systematic sampling.

Random sampling is widely used in the social sciences. Random sampling involves selecting subjects from a larger population by chance; each subject has an equal probability of being selected. Random sampling has distinct advantages because it eliminates the possibility that the characteristics of selected units influence the outcome. This technique allows scholars with limited information about the universe of cases to draw generalizations efficiently.

Random sampling is critical to Best and her colleagues’ ability to make a general claim about plaintiffs’ success in antidiscrimination lawsuits. The authors collected all relevant district and circuit court opinions between 1965 and 1999 available on Westlaw, from which they randomly chose 2 percent.29 Each district court opinion has unique characteristics that could influence its outcome; moreover, the authors do not possess anywhere near complete knowledge about every district court case. Random sampling allows the authors to make valid generalizations to all published district and circuit court cases despite these challenges.
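The mechanics of a simple random draw are modest. The sketch below is illustrative only: the identifiers are placeholders, the frame of 5,000 opinions is an assumed figure, and the function name is hypothetical. It is not a reconstruction of the authors' actual procedure; it shows only how a 2 percent sample could be drawn reproducibly from a list of opinions.

```python
import random


def draw_random_sample(case_ids, fraction, seed=0):
    """Return a simple random sample containing `fraction` of the cases."""
    rng = random.Random(seed)  # fixed seed so the same draw can be reproduced
    sample_size = max(1, round(len(case_ids) * fraction))
    # rng.sample draws without replacement; every case is equally likely.
    return sorted(rng.sample(case_ids, sample_size))


# Hypothetical sampling frame: 5,000 placeholder opinion identifiers.
opinions = [f"opinion-{i:04d}" for i in range(1, 5001)]
sample = draw_random_sample(opinions, fraction=0.02)
print(len(sample))  # 100
```

Recording the seed alongside the sampling frame lets a reader reconstruct exactly which opinions were selected.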

A related technique—systematic sampling—can also produce credible generalizations. Systematic sampling involves randomly choosing a starting point and then selecting cases based on a fixed interval.30 For example, for his book Habeas Corpus: From England to Empire, Professor Paul Halliday creates a systematic sample of all uses of the writ of habeas corpus issued by the courts of the King’s Bench from 1500 to 1800.31 Starting in 1502, Halliday chooses petitions filed in every fourth year until 1798.32 Creating this systematic sample allows Halliday to identify common case characteristics and make generalizations about how people approached law.33 Systematic sampling also allows scholars to correlate outcomes to variables; this is important for Halliday, who “correlat[es] outcomes to . . . the wrongs for which prisoners were held and the jurisdictions that ordered confinement.”34
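A fixed-interval selection of this kind can be sketched in a few lines. The function below is a hypothetical illustration, not Halliday's own procedure; it assumes the sampling frame is simply the list of years from 1500 to 1800. Passing a start of 2 reproduces the every-fourth-year pattern beginning in 1502, and omitting it picks the starting point at random.

```python
import random


def systematic_sample(units, interval, start=None, seed=0):
    """Select every `interval`-th unit, beginning at index `start`."""
    if start is None:
        # Choose the starting point at random within the first interval.
        start = random.Random(seed).randrange(interval)
    return units[start::interval]


years = list(range(1500, 1801))  # assumed frame: 1500 through 1800
sampled_years = systematic_sample(years, interval=4, start=2)
print(sampled_years[0], sampled_years[-1], len(sampled_years))  # 1502 1798 75
```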

Random sampling has an important limitation: it requires the researcher to select a relatively large number of cases. We turn next to theoretically informed sampling, which is more appropriate for studying smaller numbers of cases.

2.  Theoretically informed sampling.

Theoretically informed sampling holds distinct advantages for producing causal claims and credible generalizations with a small number of cases. First, the researcher identifies theoretically important characteristics that could influence the outcome. The researcher then sorts cases into categories defined by these characteristics and selects cases from each category.35

For example, if a researcher were interested in treaty compliance, she would begin by identifying state characteristics that could delay compliance, such as limited bureaucratic capacity, poverty, or federalism. The researcher would then create categories defined by different combinations of these variables (for example, a wealthy federal state with high bureaucratic capacity) and sort states into each category. She would then select cases from each category, either randomly or based on practical and theoretical concerns. For example, because US treaty ratification behavior is very different from that of other wealthy federal states with high bureaucratic capacity, the researcher might want to include additional wealthy federal states. Ultimately, the researcher should “select[ ] a manageable number of cases that are diverse in terms of theoretically important traits.”36
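The sorting-and-selecting steps just described can be sketched as follows. Everything in the sketch is hypothetical: the states are placeholders, only two traits are used to keep the example small, and the function name is invented for illustration. Selection within each category is random here, although a researcher could equally substitute practical or theoretical criteria at that step.

```python
import random
from collections import defaultdict


def sample_by_category(cases, traits, per_category=1, seed=0):
    """Sort cases into categories defined by `traits`, then pick from each."""
    rng = random.Random(seed)
    categories = defaultdict(list)
    for case in cases:
        key = tuple(case[t] for t in traits)  # e.g. (wealthy, federal)
        categories[key].append(case["name"])
    # Draw up to `per_category` cases from every category that has members.
    return {
        key: sorted(rng.sample(names, min(per_category, len(names))))
        for key, names in sorted(categories.items())
    }


# Hypothetical states described by two theoretically important traits.
states = [
    {"name": "State A", "wealthy": True, "federal": True},
    {"name": "State B", "wealthy": True, "federal": True},
    {"name": "State C", "wealthy": True, "federal": False},
    {"name": "State D", "wealthy": True, "federal": False},
    {"name": "State E", "wealthy": False, "federal": True},
    {"name": "State F", "wealthy": False, "federal": True},
    {"name": "State G", "wealthy": False, "federal": False},
    {"name": "State H", "wealthy": False, "federal": False},
]
selection = sample_by_category(states, traits=("wealthy", "federal"))
for category, chosen in selection.items():
    print(category, chosen)
```

Because one case is drawn from every category, the resulting set of four cases is guaranteed to vary on both traits, which a purely random draw of four cases would not ensure.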

Theoretically informed sampling is more difficult to carry out than random sampling and more likely to lead the researcher to introduce bias into the selection process. Despite these drawbacks, theoretically informed sampling has distinct advantages over random sampling for scholars working with a small number of cases. Random sampling has poor small-sample properties: the chances that a researcher who randomly selects five countries will end up with five developing countries, or five agricultural economies, rather than five diverse states, are surprisingly high. Scholars cannot then make valid generalizations because the cases selected have particular, shared characteristics.37
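A short calculation makes the small-sample problem concrete. The figures below are assumptions chosen purely for illustration (a population of 190 countries, 140 of which share some trait); they are not drawn from any dataset. The function computes the chance that every case in a simple random sample shares that trait.

```python
from math import comb


def prob_all_share_trait(population, with_trait, sample_size):
    """Chance that a simple random sample consists only of cases with the trait."""
    # Ways to pick the whole sample from the trait group, divided by
    # ways to pick the sample from the entire population.
    return comb(with_trait, sample_size) / comb(population, sample_size)


# Assumed, illustrative figures: 190 countries, 140 sharing the trait.
p = prob_all_share_trait(population=190, with_trait=140, sample_size=5)
print(round(p, 3))
```

Under these made-up numbers, a little over one draw in five would yield five countries that all share the trait, leaving the researcher with no variation on it.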

We could not locate exemplary uses of theoretically informed sampling in the legal literature. This makes our description more challenging, yet more likely to be useful. Below is an example that illustrates some of the steps outlined above, but that has important limitations. In Legalizing Gender Inequality: Courts, Markets, and Unequal Pay for Women in America, Professors Robert Nelson and William Bridges investigate “wage differences between jobs held primarily by women and those held primarily by men within the same organization.”38 Although relevant literature argues that market principles produce these differences, Nelson and Bridges argue that organizational processes cause pay differences between typically “male” and “female” jobs.39 Undergirding this argument are four case studies of gender discrimination lawsuits.40

The authors select these cases to capture theoretically important variation across lawsuits.41 The authors define the universe of cases, which includes defendant organizations large enough to have sufficiently differentiated occupations, internal labor markets, and bureaucratic personnel systems.42 Within these parameters, the authors identify firm characteristics that might influence their outcome of interest, the development of gender inequality. The potentially influential characteristics include whether organizations are public or private and the proportion of the workforce with firm-specific skills.43 After creating four categories (for example, public companies requiring firm-specific skills), the authors select cases from each category according to practical considerations, namely, whether evidence was accessible.44 Essentially, the authors select cases based on the values of potentially influential variables because doing so allows them to evaluate whether and how organization type and skill requirements influence the outcome. By demonstrating that these other variables do not fully account for the patterns they observe, the authors strengthen their argument that their independent variable of interest is driving the outcome. As such, by using theoretically informed sampling, researchers can use few cases to assess their independent variable’s effect on the outcome.

Despite their use of theoretically informed sampling, the authors’ selection process raises important questions. For example, they examine only organizations sued for gender discrimination; these organizations may have especially egregious practices, and thus may be unrepresentative.45 The authors try to alleviate this concern by, among other things, comparing employment numbers to similarly sized firms and including statements from employers that the firms sued were not unusual.46

B.    Case Selection Techniques

While sampling techniques strengthen generalizations about the prevalence of certain population characteristics, case selection techniques are used to make structured and focused comparisons across cases, strengthening causal claims. We describe several case selection techniques below.

1.   Most difficult case design.

Selecting cases in which one’s theory is least likely to hold true can offer strong theoretical leverage. These cases, called “least-likely” cases,47 undergird most difficult case design. If a researcher demonstrates that her theory holds true in an unlikely case, the argument is likely to hold in a broader range of cases.48 In The Hollow Hope: Can Courts Bring About Social Change?, Professor Gerald Rosenberg uses two prominent US Supreme Court cases, Roe and Brown, to argue that the US Supreme Court’s influence on public policy is limited.49

Using a least-likely case selection strategy is particularly effective for increasing the causal strength and generalizability of Rosenberg’s argument. The Supreme Court is more visible and influential than any other court in the American political system.50 Roe and Brown are considered prime examples of a court producing significant social reform.51 If Rosenberg’s theory holds true in the cases in which it is most likely to fail, it is plausible that his hypothesis could hold true in other, “easier” cases. If Rosenberg had instead chosen a case from a lower court believed to have little impact on social reform, his claim would have been far less plausible, and would have generated far less interest.

2.   Most similar case design.

In most similar case selection, the researcher chooses cases that have similar values on theoretically important characteristics, but differ on the independent variable of interest.52 This allows the researcher to “hold constant” the other characteristics’ effects.53 In Judicial Comparativism and Judicial Diplomacy, Professor David Law uses a most similar case design to explore why some courts use foreign law more than others.54 Law hypothesizes that a court’s institutional capacity to learn about foreign law, and the emphasis a legal education system places on foreign law, shapes a court’s use of foreign law.55

Law selects the Japanese Supreme Court, the Korean Constitutional Court, and the Taiwanese Constitutional Court because they share characteristics that potentially explain judicial engagement in comparativism.56 These countries are geographically adjacent, are democratic, share security and economic alliances with the United States, train judges similarly, have German-influenced civil law systems, have comparable popular attitudes toward comparativism, and share welcoming attitudes toward foreign law.57

Despite their similarities, these courts differ on the outcome and explanatory variables of interest, namely, the court’s use of foreign law, the court’s institutional capacity for comparativism, and the use of comparativism in legal education. The use of foreign law by Japan’s highest court is minimal relative to Korea’s Constitutional Court, which draws on foreign law in a majority of cases,58 and to Taiwan’s Constitutional Court, which consults foreign constitutional materials almost automatically.59 While neither the Japanese justices nor their clerks conduct foreign legal research routinely,60 the Korean Court has extensive foreign law research mechanisms, including a research institute for comparative constitutional scholarship.61 Moreover, each country’s legal education system emphasizes comparativism differently. In top South Korean and Taiwanese universities, all constitutional law professors studied law abroad, compared to 25 percent to 66 percent in top Japanese universities.62 While law professors regularly work for the Korean Constitutional Court63 and a majority of the Taiwanese Constitutional Court justices are former legal professors, Japanese professors rarely hold seats on Japan’s Supreme Court.64 By using most similar case design, Law effectively isolates important differences between the countries at issue, demonstrating how the highlighted differences influence judicial usage of foreign law.65
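The selection logic of most similar case design can be expressed as a simple filter: keep pairs of cases that match on every control characteristic but differ on the variable of interest. The sketch below is a stylized illustration with placeholder courts and invented attribute values; it does not encode Law's data, and real comparisons involve judgment about degrees of similarity that a strict equality test cannot capture.

```python
from itertools import combinations


def most_similar_pairs(cases, controls, variable_of_interest):
    """Return pairs that match on all controls but differ on the variable."""
    pairs = []
    for a, b in combinations(cases, 2):
        same_controls = all(a[c] == b[c] for c in controls)
        differs = a[variable_of_interest] != b[variable_of_interest]
        if same_controls and differs:
            pairs.append((a["name"], b["name"]))
    return pairs


# Hypothetical courts; all attribute values are invented for illustration.
courts = [
    {"name": "Court A", "legal_tradition": "civil", "regime": "democracy",
     "research_capacity": "high"},
    {"name": "Court B", "legal_tradition": "civil", "regime": "democracy",
     "research_capacity": "low"},
    {"name": "Court C", "legal_tradition": "common", "regime": "democracy",
     "research_capacity": "low"},
    {"name": "Court D", "legal_tradition": "civil", "regime": "democracy",
     "research_capacity": "high"},
]
pairs = most_similar_pairs(
    courts,
    controls=("legal_tradition", "regime"),
    variable_of_interest="research_capacity",
)
print(pairs)  # [('Court A', 'Court B'), ('Court B', 'Court D')]
```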

3.   Variants on most similar case design.

Variants on most similar case design have distinct advantages for assessing claims that are of particular interest to legal scholars, such as whether particular legal devices are necessary or sufficient to produce an outcome of interest. For example, many legal scholars want to know whether particular legal rules are essential for well-functioning markets, effective political participation, or robust environmental protection. Similarly, many legal scholars wonder whether adopting similar laws (for example, a model code) in different jurisdictions will result in largely similar outcomes.

In Private Enforcement of Corporate Law: An Empirical Comparison of the United Kingdom and the United States, Professors John Armour, Bernard Black, Brian Cheffins, and Richard Nolan use a variation of most similar case design to assess whether formal private enforcement of corporate law is necessary for strong securities markets.66 The authors select the United States and the United Kingdom because they share similar values on important characteristics.67 “Both are common-law jurisdictions with strong judiciaries, low levels of government corruption, [ ] highly developed stock markets,” liquid securities markets, and many publicly traded firms.68

The authors argue that, “[i]f private enforcement is [indeed] essential for robust stock markets,” they should observe “vigorous private enforcement of corporate law in both” countries, as these countries are otherwise similar in relevant respects.69 The rate of private enforcement, however, drastically differs. The United States possesses a relatively high frequency of suits brought against directors of public companies. These suits are almost nonexistent in the United Kingdom.70 By selecting cases that share otherwise-similar characteristics and outcomes, Armour and his coauthors trace back from the outcome and determine if the development of strong stock markets depends crucially on the private enforcement of corporate law. By showing that, contrary to expectations, private enforcement is not present in both cases, the authors effectively eliminate this as an essential precondition for strong securities markets.

Variations of most similar case design are also useful for legal scholars evaluating whether similar legal frameworks are used in the same way, or produce similar effects, across contexts. In How Dispute Resolution System Design Matters, Professor Shauhin Talesh examines why California and Vermont consumers receive different protections despite the fact that these states have nearly identical automobile consumer protection laws, or “lemon laws.”71

Starting with nearly identical lemon laws, Talesh identifies differences between the contexts that could influence the implementation of these laws. Talesh finds that California and Vermont vary in terms of public and private control of dispute resolution structures.72 In California, disputes are resolved in forums funded by automobile manufacturers but operated by external third-party organizations.73 In Vermont, consumer disputes are resolved in a state-operated dispute resolution structure.74 These dispute resolution structures filter business and consumer preferences differently, giving similar lemon laws distinct meanings. California’s managerial-justice adjudicatory model stresses business values of efficiency and managerial discretion. Vermont, by contrast, uses a collaborative justice model that reflects consumer values.75

It is not only similarly structured laws, but also identical words, that are interpreted in very different ways. For example, both Vermont and California emphasize impartiality and neutrality in the fact-finding process; however, these words’ meanings differ across states. In California, arbitrators who actively investigate facts “compromise” impartiality and neutrality, while Vermont arbitrators must actively investigate facts to establish impartiality and neutrality.76 This distinction leads California arbitrators to provide advantages for businesses, while Vermont arbitrators favor consumers.77 Ultimately, by selecting cases with similar laws yet different outcomes, Talesh effectively establishes the critical role of varied implementation.78

4.   Most different case design.

In most different case design, researchers select cases that differ on all relevant characteristics except the explanatory variable and outcome.79 As such, most different case designs can suggest that the same variable produces the same effect across extremely different contexts. In The Euro-Crisis and the Courts: Judicial Review and the Political Process in Comparative Perspective, Professor Federico Fabbrini argues that, in response to the European debt crisis (the Euro-crisis) and new legal architecture of the Economic and Monetary Union (EMU), European courts have increased their involvement in the fiscal domain.80

Fabbrini compares high court judicial decisions in Estonia, France, Germany, Ireland, and Portugal, highlighting that these five member states represent the very diverse political, economic, and legal conditions that characterize the European Union (EU).81 These countries vary dramatically: not only in size, wealth, and culture, but also in terms of the length of their EU membership and the power available to their supreme courts to review legislation.82

Drawing from post-Euro-crisis court rulings, Fabbrini identifies a common cause of this increasingly high degree of judicial intervention in fiscal and economic affairs: EU member states’ intergovernmental management of the Euro-crisis.83 As the dominant decision-making bodies, EU member states’ executive branches reformed the EMU architecture via international agreements, allowing courts to influence fiscal reform.84 By using most different case logic, Fabbrini emphasizes the common cause of the increase in judicial involvement in economic affairs, thereby increasing the credibility and generalizability of his argument. However, most different case design has important limitations: when selected cases share more than one relevant similarity, this technique cannot, on its own, help the researcher distinguish between them. More generally, qualitative work requires that case selection be combined with within-case analysis, to which we turn next.

III.  Process Tracing: Developing Multiple Empirical Implications

After imagining alternative plausible outcomes and selecting cases, qualitatively oriented scholars trace the events prior to the outcome, parsing their theory into logically interconnected propositions that explain why the outcome occurred. If a legal scholar attributes an outcome to a particular cause, it is reasonable to think that this cause would produce other “traces,” or implications. Using available evidence, this scholar can see whether these expected implications actually occurred, thereby strengthening (or weakening) her explanation of the outcome. Additionally, scholars can weigh the plausibility of these implications against alternative explanations of the outcome.85

The logic of process tracing should not be unfamiliar to lawyers; similar logic is used to assemble evidence in individual cases. In process tracing, scholars form multiple hypotheses about what caused an outcome, identify implications of each hypothesis, and weigh the hypotheses against available evidence. Similarly, to link a suspect to a crime, a prosecutor identifies a motive and develops a theory connecting a suspect’s motive to the time, place, and method of the crime. The prosecutor examines whether the evidence is more consistent with her theory or alternative theories. Evidence will vary in probative value; for example, eyewitness testimony might be less definitive than DNA evidence.86 Although lawyers “process trace” when composing legal briefs and establishing narrow causal propositions, legal scholars do not use this logic systematically in law review writing. That is, in brief writing, lawyers often assess how diverse facts contribute to their legal arguments, but in academic writing, we often see less effort spent to collect and assess key facts that would make theoretical propositions plausible.

After developing a theoretical explanation of the outcome, scholars using process tracing must assess how diagnostic evidence increases or decreases the probability that this explanation is true. These pieces of diagnostic evidence are called causal process observations (CPOs) because they elucidate the broader causal mechanism linking the variables.87 These pieces of evidence differ from the independent observations used in statistical analyses; they do not add breadth but depth, and are logically connected, rather than independent of one another. Different types of CPOs have varying probative value. In Professor David Collier’s language, “doubly decisive” evidence and “smoking gun” evidence have high probative value: doubly decisive evidence supports one theory and discredits alternatives, while smoking gun evidence supports one theory but does not speak to alternatives.88 In contrast, “straw-in-the-wind” evidence and “hoop” evidence are only mildly helpful.89
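One way to make concrete the idea that evidence of differing probative value shifts the probability of an explanation is Bayes' rule. The following is a minimal sketch, not a method drawn from the works discussed here. The function name `update`, the 0.5 prior, and every likelihood value are hypothetical numbers chosen purely for illustration. The sketch also assumes each piece of evidence can be assessed on its own, which is a simplification given that CPOs are logically connected.

```python
def update(prior, p_evidence_if_true, p_evidence_if_false):
    """Return the probability of an explanation after observing one piece of evidence.

    prior: probability assigned to the explanation before seeing the evidence.
    p_evidence_if_true: chance of seeing this evidence if the explanation is true.
    p_evidence_if_false: chance of seeing it if a rival explanation is true.
    """
    numerator = prior * p_evidence_if_true
    denominator = numerator + (1.0 - prior) * p_evidence_if_false
    return numerator / denominator


# Hypothetical starting point: the scholar is undecided between her
# explanation and its rivals.
prior = 0.5

# Hypothetical "smoking gun": evidence that would rarely turn up unless the
# explanation were true (assumed likelihoods 0.30 versus 0.02).
after_smoking_gun = update(prior, 0.30, 0.02)

# Hypothetical "straw in the wind": evidence almost as likely under rival
# explanations (assumed likelihoods 0.60 versus 0.50).
after_straw = update(prior, 0.60, 0.50)

print(f"after smoking gun evidence: {after_smoking_gun:.2f}")
print(f"after straw-in-the-wind evidence: {after_straw:.2f}")
```

What matters in the sketch is the ratio between the two likelihoods: the further that ratio is from one, the more a single observation moves the scholar's confidence. Evidence that is equally likely under every explanation leaves the prior unchanged.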

Below we provide two applications of process tracing to show how it can assess different types of causal arguments using various legal sources. We distinguish theoretically between (a) testing a theory with multiple empirical implications connected chronologically, and (b) testing a particular type of chronological connection common in legal scholarship—path-dependent processes90—in which early events have unusually large consequences later on.

A.    Process Tracing When Observations Are Linked Temporally

Researchers can effectively use process tracing to evaluate theories with chronologically connected empirical implications. To do so, the researcher breaks down her explanation of an outcome into various sequential, causal propositions, and evaluates these propositions against temporally interlinked observations. In The Strength of a Weak Agency, Professors Nicholas Pedriana and Robin Stryker explain how social movement pressure can expand the capacity of an agency with a small staff, limited budget, and limited jurisdiction.91 Specifically, they highlight how the NAACP and Legal Defense Fund (LDF) pressured the Equal Employment Opportunity Commission (EEOC) to aggressively interpret Title VII,92 thereby expanding the agency’s powers.93 While political leaders and lawyers initially understood Title VII as prohibiting only intentional discrimination, social movement pressure forced an aggressive EEOC litigation strategy, culminating in Griggs v Duke Power Co,94 which prohibited unintentional discrimination.95

Pedriana and Stryker’s first proposition involves social movements flooding the EEOC with complaints to demonstrate that the agency’s existing resources and capacity were insufficient.96 Next, early EEOC leaders disagreed about expanding the agency’s mission, leading the EEOC to pursue interpretations the agency’s leaders understood as very aggressive.97 This set of propositions has relatively distinctive empirical implications, and helps Pedriana and Stryker distinguish their theory from alternative explanations. One possible alternative is that EEOC leadership, seeking to increase their powers, would have pursued an expansive mandate even without social movement pressure.98 Or perhaps the premise that the EEOC had an initial narrow mandate is incorrect.99 Alternatively, perhaps the Supreme Court would have decided Griggs similarly regardless of social movement pressure and EEOC advocacy.100

To reject the alternative explanation that power-seeking bureaucrats drove EEOC expansion, the authors highlight that the first EEOC chairman, Franklin Delano Roosevelt Jr, was yachting during congressional hearings regarding appropriations for his agency.101 Roosevelt focused on public relations because he wanted to run for governor of New York, leaving EEOC senior staff unsure about the agency’s central objectives and how to accomplish them.102

To evaluate their proposition that social movements exposed the EEOC’s ineffectiveness, thereby pressuring the EEOC to adopt an aggressive strategy, Pedriana and Stryker note that the NAACP and the LDF filed mass complaints in the months after Title VII came into force.103 Jack Greenberg, director of the LDF, publicly stated that “the best way to get it amended [Title VII] is to show it doesn’t work.”104 Throughout its initial years, the EEOC was continually handling at least four times the number of complaints it was budgeted to handle due to the unrelenting tide of complaints from the LDF and the NAACP.105 The volume of complaints and social movement leaders’ statements are, in the language of Collier’s classification structure, “smoking gun” evidence. Given this evidence, it would be surprising if the alternate explanation—that social movement pressure had no effect on the EEOC—were true.

To evaluate their proposition that there was a push to expand the EEOC’s mandate, Pedriana and Stryker show that EEOC leadership initially disagreed over whether Title VII covered only intentional discrimination or also discriminatory effects.106 They first follow steps that legal scholars normally use: they draw from the text of Title VII, the legislative history of the statute, and statements made by the nonpartisan Bureau of National Affairs.107 Perhaps recognizing the potential for strategic use of the legislative record, Pedriana and Stryker also draw on EEOC internal communications and staff statements.108 Although the EEOC later (successfully) challenged employment tests as discriminatory based on statistical evidence of their impact on minority applicants, the EEOC’s general counsel initially stated that “if [the EEOC testing guidelines are] intended as a legal position as to what is meant by professionally developed tests then it is very wide off the mark . . . I cannot conceive arguing this position before a District judge.”109 Additionally, EEOC Executive Director Herman Edelsberg said that incorporating disparate impact into the guidelines would make them “too ambitious to be a legal document.”110 Again, this is smoking gun evidence; given this evidence, it would be very surprising if the alternate explanation (that the EEOC’s mandate was unquestionably broad) were true.

Pedriana and Stryker demonstrate how legal scholars can develop temporally linked propositions with distinctive empirical signatures, and how evaluating these propositions against available evidence can substantially increase their persuasiveness. We now turn to path-dependent causal claims and explain how best to substantiate them.

B.    Process Tracing When Observations Are Path Dependent

Legal scholars commonly make claims about path dependence, processes in which early events have large consequences later on. A HeinOnline search showed that 2,662 articles mentioned path dependence explicitly from 2000 to 2015. Legal interpretation techniques, including rules governing precedents, analogical reasoning, and conventions about interpreting similar language systematically, make early judicial decisions crucial. Below we explain why process tracing can help develop path-dependent claims.111

What distinguishes path dependence from other claims about event sequence? First, in path-dependent processes, positive feedback loops make early events have bigger consequences than later ones.112 Second, path-dependent processes have critical junctures, when one option is picked among many; after this choice, it becomes increasingly difficult to return to alternatives.113 The adoption of the QWERTY keyboard effectively illustrates path dependence. While countless ways of arranging letters on a keyboard were initially possible, once the QWERTY sequence was chosen and adopted by millions of typists, it became nearly impossible to switch to another, more efficient arrangement.
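The positive feedback described above can be illustrated with a Pólya urn, a standard toy model in which each new adopter picks an option with probability equal to that option's current share of adopters. The sketch below is an assumption-laden illustration, not a model taken from the works discussed here. The function name `polya_urn`, the two-option setup, the starting counts, and the number of steps are all hypothetical.

```python
import random


def polya_urn(steps, seed):
    """Simulate adoption of two options under positive feedback.

    Returns the final share of adopters who chose the first option.
    """
    rng = random.Random(seed)  # seeded so that each run is reproducible
    counts = [1, 1]  # hypothetical start: one adopter of each option
    for _ in range(steps):
        share_first = counts[0] / (counts[0] + counts[1])
        # Each newcomer follows existing adopters in proportion to their numbers.
        choice = 0 if rng.random() < share_first else 1
        counts[choice] += 1
    return counts[0] / sum(counts)


# Same rules and same starting point; only the sequence of early chance
# events (the seed) differs across the five hypothetical histories.
final_shares = [polya_urn(10_000, seed) for seed in range(5)]
for seed, share in enumerate(final_shares):
    print(f"history {seed}: first option ends with {share:.0%} of adopters")
```

Because every choice changes the odds facing the next adopter, the first few draws carry far more weight than later ones, and the long-run share settles at a level that depends on which early events happened to occur.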

Process-tracing techniques are very useful for identifying feedback loops and critical junctures.114 In The Lost Promise of Civil Rights, Professor Risa Goluboff explains how the NAACP adopted the now-dominant civil rights litigation strategy and why it concentrated on government-imposed segregation rather than challenging abysmal labor conditions, an alternate strategy championed by the Civil Rights Section (CRS) of the Justice Department.115 Goluboff theorizes that early legal victories encouraged similar litigation and subsequent victories, creating a positive feedback loop that institutionalized this litigation strategy, making alternative litigation strategies much harder to pursue later on.116

To establish that an event constitutes a critical juncture, a scholar must demonstrate that there were at least two alternatives available and that, after one alternative was chosen, it became increasingly difficult to return to the other option. Goluboff does this for key decisions in the 1930s and 1940s.117 She also establishes that, once the NAACP chose its litigation strategy, choices about the cases it selected made it difficult, if not impossible, to change. Initially, the NAACP received both racial discrimination complaints from northern industrial workers and labor discrimination complaints from southern agricultural workers.118 While the NAACP originally pursued both types of complaints, by the 1940s, the NAACP fashioned a legal strategy around the racial discrimination claims of industrial workers.119 Multiple factors influenced this decision. The NAACP relied heavily on local counsel, and in the 1940s most black lawyers were in northern cities.120 Additionally, the NAACP found that “sympathetic judges and amenable lawyers” were scarce in the south, making it “easier to win cases” in the north.121

Perhaps the biggest critical juncture was the Supreme Court’s decision in Brown, which vindicated the NAACP’s legal strategy and established equal protection as the dominant civil rights lens.122 Brown is perhaps the most significant US Supreme Court case; the antidiscrimination framework Brown and its progeny represent is common in casebooks and taught across law schools nationally.123 While establishing the antidiscrimination approach’s dominance is easy, it is challenging for legal scholars to imagine that an alternative vision was possible. Goluboff convincingly establishes this alternate vision in a number of ways. She shows that the CRS championed a plausible, alternate legal vision: raw legal material for an alternate vision of civil rights, namely, agricultural workers’ horrific complaints, was ample,124 allowing the CRS to develop a conception of civil rights based on labor and economic discrimination.125 Additionally, she highlights that the Supreme Court overturned its own precedents with unusual frequency throughout the 1930s and 1940s126 and presents comments from prominent civil rights lawyers and casebooks exemplifying their perceptions of ambiguity in civil rights doctrine.127 This is smoking gun evidence because it makes it highly unlikely that the Brown decision was inevitable.

Implications and Conclusions

In place of a conclusion, we speculate on an observation that transformed quantitative research. In a much-cited 1986 piece, Paul Holland argued that some questions can be answered much more easily than others.128 For example, it is very difficult to ascertain why people commit crimes; however, we can more easily determine whether expanding the police force reduces crime rates. Statistical analysis, Holland argued, has distinct advantages for answering the second type of question, which focuses on measuring the effect of a given variable.129 The ease and effectiveness with which statistical analyses can answer “effects-driven” questions have led this method and question type to dominate social science research. More and more, social scientists are asking answerable questions with quantitative methods; however, fewer reflect on whether these questions, while answerable, are interesting and contribute to our understanding of the world.

Legal scholars arguably face the opposite problem. Legal scholarship has no shortage of interesting questions. However, many of these critical questions are never answered; legal scholars rarely defend their preferred theories against plausible alternatives effectively. By showcasing a variety of methodological techniques that are well suited to the types of claims and evidence legal scholars typically work with, we hope to move closer to answering the critically important questions legal scholars pose.

