How to Test the Ship of Theseus

The story of the Ship of Theseus is one of the most venerable conundrums in philosophy. Some philosophers consider it a genuine puzzle. Others deny that it is so. It is, therefore, an open question whether there is a puzzle in the Ship of Theseus story. So, arguably, it makes sense to test empirically whether people perceive the case as a puzzle. Recently, David Rose, Edouard Machery, Stephen Stich and forty-two other researchers from different countries have undertaken that task. We argue that their tests do not provide any evidence that bears on the question of whether the Ship of Theseus case is a genuine puzzle. In our discussion we also address what should be taken into account if one wishes to test whether the Ship of Theseus story is puzzling.


The Test
The story of the Ship of Theseus (SoT from now on) is one of the most venerable conundrums in philosophy. Some philosophers consider it not just a conundrum, but a genuine puzzle (Scaltsas 1980, 152; Wiggins 1980, 97). Others deny that it is so (Smart 1972, 148; 1973, 27). It is, therefore, an open question whether there is a puzzle in the SoT story, and also whether the case is considered puzzling across different cultures. Recently, David Rose, Edouard Machery, Stephen Stich and forty-two other authors from different countries (RMS from now on) have undertaken the task of conducting empirical tests with a view to providing an answer to that open question. 1 According to RMS, a puzzle is a thought experiment fulfilling a "provocative function" (2020, 159), which they characterize in terms of two conditions: ambivalence and universality.
1 RMS's study is part of a larger project made possible through the support of a grant from the Fuller Theological Seminary / Thrive Center in concert with the John Templeton Foundation.
The ambivalence condition is stated as follows: "Readers should feel inclined to assert two prima facie inconsistent propositions" (RMS 2020, 159). As regards universality, RMS point out that a puzzle "[…] must elicit an ambivalent state of mind in readers of all demographic, particularly of all cultural, backgrounds" (2020, 159).
The story of the SoT that RMS presented to participants in their study is adapted from D. Rose (2015), and it contains the usual elements of the story, namely, a ship whose planks are gradually replaced through maintenance until no original plank remains ("Replacement") and the ship that results from putting together the original planks that were preserved ("Original Parts"). The story was translated into 17 languages and presented to 2,426 people in 22 countries. The participants in the experiment were asked to read the story and to answer whether, in their view, Replacement or Original Parts was the original ship. Their degree of confidence was also measured.
Of the participants in the study, 64% thought that Replacement was the original ship. However, RMS note that, although there was a clear majority in favor of Replacement, there was "quite a sizable minority - in the 30%-40% range - who thought that Original Parts was the original ship" (2020, 167), a minority that expressed high confidence in their judgment. In any case, regardless of their answer, participants reported, in general, a high level of confidence. 2 Moreover, with slight differences, the disagreement was universal across countries and cultures.
So, RMS conclude:
"Our results do indeed suggest that the Ship of Theseus case is a puzzle: People across cultures are ambivalent about what to say in response to the case. But they do not suggest it is one that feels unsolvable or that it is 'irreclaimably paradoxical', placing us in a permanent state of indecision. If this were the case, then we should have found that people were divided on whether Replacement or Original Parts was the Ship of Theseus and that they were not very confident in the option they ultimately settled on. But this is not at all what we found. The majority of sites offered a clear verdict and did so quite confidently." (2020, 168)
Ultimately, according to RMS, "the Ship of Theseus is a genuine puzzle but one that people can solve to their satisfaction" (2020, 169).

The Role of Ambivalence
In our view, the experiment conducted by RMS does not warrant any conclusion on the puzzling nature of the SoT story. To see this, let us first reflect on two very different puzzles: the Liar and the Trolley Problem. When we are asked whether the sentence "this sentence is false" is true or false, we soon perceive the circle that leads to contradiction. And when we face the choice of either pulling the lever, thereby killing one person, or refraining from doing anything (thus letting five people die), both choices seem problematic, in spite of the fact that both courses of action are supported by ethical principles that we rely on in ordinary situations.
Indecision and ambivalence are felt when one is confronted with these cases: for different reasons in each case, we simply do not know what to say. Arguably, the psychological reaction, the indecision and ambivalence that each of us can feel, is not what makes a given case a genuine puzzle, although it is a good indicator of the existence of a puzzle. 3 That is why we think it is worthwhile to test, as RMS set out to do, whether people are ambivalent about the story of the SoT. 4 However, there is an important confusion in their procedure. The ambivalence condition, as RMS state it, is ambiguous. The claim "readers should feel inclined to assert two prima facie inconsistent propositions" (RMS 2020, 159) can be understood as requiring interpersonal disagreement (among different readers) or as requiring intrapersonal conflict or indecision (felt by each reader). Only the latter form of clash is arguably a good indicator of the presence of a puzzle. The paradigmatic cases of philosophical puzzles, such as the Liar and the Trolley Problem, do reveal such intrapersonal conflict.
What RMS show is that there is sharp interpersonal disagreement among different readers: 64% of participants thought that Replacement was the original ship whereas 36% thought that Original Parts was the original ship (2020, 163). And the disagreement is indeed sharp because in both cases participants were quite confident in their judgment (2020, 166). But the presence of sharp interpersonal disagreement does not qualify as evidence that we are confronted with a genuine puzzle. 5 If interpersonal disagreement were the mark of a philosophical puzzle then any disagreement that can generate philosophical discussion would constitute a puzzle. But, in general, studies that show that there is interpersonal disagreement about a subject matter are not presented as studies that reveal the puzzling nature of that subject matter.
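The point can be illustrated with a toy sketch. Everything in it is hypothetical (the `Respondent` type, the "inner state" field and the two made-up samples of 100 people); only the 64/36 split comes from RMS's report, and confidence ratings are left out. It merely shows that a two-option question records the same aggregate split whether every respondent is firmly decided or every respondent feels drawn to both answers.

```python
from collections import Counter
from dataclasses import dataclass

REPLACEMENT = "Replacement"
ORIGINAL_PARTS = "Original Parts"
BOTH = frozenset({REPLACEMENT, ORIGINAL_PARTS})


@dataclass(frozen=True)
class Respondent:
    # Hypothetical inner state: the answers this person feels inclined to assert.
    inclined_to: frozenset
    # The single answer that a two-option question actually records.
    forced_choice: str


def forced_choice_shares(sample):
    """Share of each recorded answer (the interpersonal split)."""
    counts = Counter(r.forced_choice for r in sample)
    return {a: counts[a] / len(sample) for a in (REPLACEMENT, ORIGINAL_PARTS)}


def share_ambivalent(sample):
    """Share of respondents inclined to assert both answers (intrapersonal conflict)."""
    return sum(r.inclined_to == BOTH for r in sample) / len(sample)


def make_sample(n_replacement, n_original, torn):
    """Build an illustrative sample; `torn` decides whether everyone feels both pulls."""
    sample = []
    for choice, n in ((REPLACEMENT, n_replacement), (ORIGINAL_PARTS, n_original)):
        inclined = BOTH if torn else frozenset({choice})
        sample.extend(Respondent(inclined, choice) for _ in range(n))
    return sample


decided_sample = make_sample(64, 36, torn=False)  # everyone firmly decided
torn_sample = make_sample(64, 36, torn=True)      # everyone torn, forced to pick

print(forced_choice_shares(decided_sample))
print(forced_choice_shares(torn_sample))
print(share_ambivalent(decided_sample), share_ambivalent(torn_sample))
```

The two samples give identical forced-choice shares, while the share of ambivalent respondents is zero in one and one in the other. The recorded split alone therefore cannot tell the two situations apart.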
5 It might even be argued that RMS's results militate against the conclusion that the SoT story constitutes a genuine puzzle, precisely because the participants reveal a high degree of confidence, incompatible with intrapersonal ambivalence (namely, it is not the case that they do not know what to say). We will address this issue in Section 3.
For instance, Edouard Machery, Ron Mallon, Shaun Nichols and Stephen Stich (2004) conducted an experiment using Kripke's Gödel case (Kripke 1980). The results of that experiment, they argued, show that East Asians are inclined to think that the man who proved incompleteness and was found dead in mysterious circumstances is the referent of the name "Gödel", whereas Westerners were not at all inclined to this response. Subsequently, Edouard Machery, Christopher Olivola and Molly De Blanc (2009) conducted a similar test in different countries that showed divisions within each culture. In each case, the authors did not present their results as providing evidence for the existence of a puzzle. They simply argued that those results constituted proof that substantial segments of the population do not agree with Kripke's intuitions on the Gödel case. 6 These two studies purport to show that there is interpersonal disagreement as to who "Gödel" refers to. 7 And if the authors do not present the disagreements as providing evidence for the existence of a puzzle, it is, we think, precisely because their studies are not designed to show intrapersonal disagreement. 8
6 In fact, the divisions reported by Machery, Olivola and De Blanc in India, Mongolia and France are very similar to those reported in the test of the SoT story. For instance, in Mongolia, 66% lean one way and 34% the other, close to the 64% and 36% reported in the SoT test.
7 There has been a long and lively discussion as to what the studies do show, but the issue is of no relevance for the purposes of this paper.
8 Neither set of authors even asks participants for the degree of confidence in their answers.
The SoT story is often presented as giving rise to a conflict with the transitivity of identity. One feels inclined to say that the SoT is Replacement and also that the SoT is Original Parts, but clearly Original Parts and Replacement are different. In general, showing that some people (perhaps a majority) think that, say, A is B and some other people (a substantial minority) think that A is C does not create any contradiction with the principle of transitivity of identity. Some people think that the author of the bestseller My Brilliant Friend (published under the name or nom de plume "Elena Ferrante") is the contemporary historian Marcella Marmo and some other people think that the author is the writer Domenico Starnone. 9 Both groups have a claim to being right, for there is evidence pointing in both directions. Clearly, Marcella Marmo is not Domenico Starnone, yet no one would conclude that this disagreement threatens the principle of transitivity of identity.
Although these interpersonal disagreements may be part of interesting philosophical discussions, they surely do not indicate the existence of puzzles. Likewise, the evidence that RMS collect as regards the story of the SoT is not an indicator of the presence of a puzzle. Now, the results of RMS's test show that people disagree about the right answer. Indeed, they show that such disagreement occurs with high levels of confidence and without indication of intrapersonal conflict. Thus, one might ask: do RMS show (unbeknownst to them) that the SoT story does not constitute a genuine puzzle after all? Not quite.

The Story and its Presentation
Let us consider what would be a good presentation of the SoT story, the kind that we might easily find discussed in an undergraduate course in Philosophy. Ideally, the discussion proceeds in three steps. First, some story is told that invokes the principle that gradual replacement does not affect the identity of an object. For instance, a wall can have its bricks gradually replaced and still remain the same wall. Second, some other story is told that invokes the principle that disassembling and reassembling an object does not affect its identity. For instance, a watch can be disassembled and reassembled in order to clean it and yet remain the same watch. Third, the SoT story is presented. Readers are then in an adequate position to consider whether their answers to the previous two stories entail that both the gradually replaced ship and the reassembled ship have a claim to being the original ship. Since the two ships are distinct, accepting both claims would violate the transitivity of identity. Pickup (2016) underscores that the three steps are fundamental if one is to see a problem in the story of the SoT: in a situation in which an object is disassembled and reassembled the identity of the object in question seems unproblematic; a situation in which parts of an object are gradually replaced seems entirely unproblematic, too. But then, in a situation that contains the previous two situations as parts, a problem seems to arise.
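One way to spell out the alleged conflict is the following schematic derivation. The labels are ours and are introduced only for illustration: S stands for the original ship, R for Replacement and O for Original Parts.

```latex
\begin{align*}
&\text{(P1)}\quad R = S && \text{gradual replacement preserves identity}\\
&\text{(P2)}\quad O = S && \text{disassembly and reassembly preserve identity}\\
&\text{(P3)}\quad R \neq O && \text{the two ships are distinct}\\
&\text{(C)}\quad\; R = O && \text{from (P1) and (P2), by symmetry and transitivity of identity}
\end{align*}
```

(C) contradicts (P3), so a single reader who accepts (P1), (P2) and (P3) together is under pressure to give one of them up. Note that the derivation needs (P1) and (P2) to be held by the same person; nothing follows if one group accepts only (P1) and another group accepts only (P2).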
We are not claiming that the Ship of Theseus story is a genuine puzzle; in fact, the authors are divided on that issue. 10 Our point is that the SoT story should be told in a way in which the alleged conflict between two principles that justify plausible answers in ordinary cases (a conflict that, if it exists, would make the SoT a puzzle) can come to the surface. Asking the question RMS ask without the three-step presentation does not place the subject in an adequate situation to consider whether preservation of identity under gradual replacement and preservation of identity under disassembly and reassembly conflict.
10 See, for instance, García-Moya (Unpublished).
It might be argued that readers of RMS's vignette will put two and two together and gauge the potential conflict. That may be right. But RMS include no measure to indicate that this is the case, nor an acknowledgment that they are counting on readers making the connections. 11 More importantly, RMS do not allow readers who have gauged the conflict, and feel intrapersonal ambivalence, to express it. The reason is that readers of their vignette have only two options: they have to choose the reassembled ship or the gradually replaced one. But for the reader to be able to express intrapersonal ambivalence, options such as "both", "neither" and "I do not know" should be offered as possible answers as well. 12
11 We are grateful to a referee for urging us to clarify these points.
12 Adding options has been proposed in conversation with Vilius Dranseika. Also, verbal expression is not the only way to capture indecision or ambivalence. Eye-tracking, for instance, has been used in other experimental studies. See Cohnitz and Haukioja (2015) and Shtulman and Valcarcel (2012). We thank Eugene Fisher for bringing that to our attention.
Interestingly, it might be argued that it is an open question whether there is a hierarchical order between the principles that govern identification and reidentification of objects. One might even wonder if such a hierarchy would be sensitive to cultural background. Perhaps, one might argue, this is the reason RMS obtain the result that the majority of people are inclined towards a certain answer and with little hesitation. If the SoT story had been tested in the three-step way suggested here, and if the results had been the same as those RMS obtained (namely, interpersonal disagreement and high levels of confidence), then it could be argued that there is a hierarchy of principles and that people disagree as regards which principle is prior. If that were the case, the SoT story would be interesting and challenging, but perhaps not a genuine puzzle. Yet, it is important to stress that the way RMS tell the story of the SoT is not useful as a test in that regard either. Testing the presence of a hierarchy requires collecting data about whether certain principles are used happily on some occasions and are overridden on other occasions. Both the happy application of principles and the possibly overriding application must be tested. That could be done by testing the story in the step-by-step way suggested here, but it cannot be achieved by the one-step story presented by RMS.

Conclusions
We conclude that RMS's test does not show that the story of the SoT is a puzzle, because the data collected concern interpersonal disagreement, which, unlike intrapersonal conflict, is not a good indicator of the presence of puzzles. In fact, the high level of confidence reported by the participants in the experiment might suggest that the story of the SoT constitutes no puzzle at all. However, the story that RMS present is simply not adequate to test the puzzling nature of the SoT.
Hence, the test conducted by RMS has no bearing on the question as to whether the SoT constitutes a genuine philosophical puzzle, and it does not advance in any way the traditional discussion about this venerable story.
Finally, we think that there is a general lesson to be learned about puzzles and philosophical experiments. A lot of work in experimental philosophy has consisted in highlighting clashes of intuitions between groups of people (e.g. cultures, genders, general public vs. experts). All these studies rely crucially on the existence of interpersonal disagreement, as they should, since their purpose is to highlight disagreements among different people or groups. But RMS take that very same methodology and apply it to test the presence of a puzzle. That is a mistake: testing the presence of a puzzle should focus on intrapersonal conflict and therefore requires a different methodology.*
* Versions of this paper were presented at the 2020 Online Conference in Experimental Philosophy and at the Logos Seminar. We thank the audiences for their comments. We acknowledge the support of the European Union and the Spanish Ministerio de Ciencia e Innovación, through grants FFI2016-80588-R, and 2019PID-107667GB-100.