Date: Fri, 17 Jan 97 15:41 +0200
From:
To: Maya Bar Hillel
Cc: Dror Bar-Natan, Ilya Rips, aumann
Subject: What would convince me?

Shalom Maya,

> Date: Sat, 11 Jan 1997 23:52:14 +0200 (WET)
> From: Maya Bar-Hillel
> To: aumann israel
> Cc: Dror Bar-Natan, msmaya
>
> ... please consider this a written request for suggestions as to
> what kind of results which Dror and I might come up with you
> would find more convincing than those I already told you about
> (conditional, of course, on our rechecking that there is no error
> in the numbers).

1. This is mainly to respond to the above request. Let me say at once that I find your approach interesting, and think it has potential. There are two major (somewhat related) points that I will make below:

a) The statistical analysis, though it seems attractive at first, turns out, on examination, to be unsound. That is to say, the idea is interesting, but is not carried out in an appropriate way. Below I will suggest more appropriate ways to carry it out (these suggestions have NOT been checked out with either Rips or Witztum or, for that matter, with anybody else).

b) A charge of (intentional or unintentional) "cheating" -- such as is inherent in your story -- should be plausible in view of the "uvdot bashetakh;" i.e., in view of things like the chronology, the necessary extent of the circle of conspirators, and so on. You DON'T necessarily have to have a proof that would yield a conviction in a court of law; you DON'T have to believe what the suspects say; but you DO have to relate to these things, there has to be some kind of plausible story, it can't be wildly implausible, you can't just wave the facts away. Dror realizes this.

2. This is being written in a big hurry, as I have a million things to do before leaving for abroad on Sunday. So I will try to be accurate and precise and reasonably complete, and to use appropriate phrasing, but I may slip up here and there. Please forgive me.
I am sending a copy to Rips (who will presumably pass it on to Witztum), so if I misunderstood something they told me, they can correct it.

3. It's important to get the chronology straight. The following chronology is partial in the sense that not all relevant events are included, but as far as it goes, I believe it's correct. I will not use dates -- these can be checked out -- but present the events in their correct time order. (In general, I don't know the chronological order within each of the items below.)

A. Experience with one-dimensional, and later with two-dimensional ELS's.

B. It is noticed (I believe by Rips) that the Rambam appears in close proximity to "Mishne Tora" with a skip of 613. Also that Herzl (the founder of modern Zionism) appears in close proximity to a phrase that includes his BIRTHDAY. There is NO experience with Tora personalities vis-a-vis their dates. (This was told to me by Rips last Friday, January 10, 1997 -- BEFORE I reported to him on the Zarka Ma'in meeting).

C. RIPS suggests checking Tora personalities vis-a-vis their dates in a systematic way. (Confirmed to me by Rips last Friday, January 10, 1997 -- also before my report on the Zarka Ma'in meeting.)

D. The first list is generated. Havlin is approached for the appellations, Urbach for the date forms, the dates are checked out and (where necessary) corrected, spelling forms are determined. The statistics P1 and P2 are defined.

E. The test is performed on the first list. Both P1 and P2 -- which at that time were considered significance levels -- turn out amazingly small.

F. The results are sent to Diaconis. He asks (inter alia) for the same test to be carried out on a fresh list of personalities.

G. The second list is generated and tested, using exactly the same test as for the first list. Again, high "significance."

H. The results are sent to Diaconis.
He is unconvinced, asks for a permutation test, and asks that the first list not be used in a formal test.

I. P3 and P4 are defined.

J. The details of a formal test are agreed between Diaconis and Aumann (I'm trying to avoid pronouns, because they often lead to confusion).

K. The formal test turns out significant at a level of 16 out of a million. (That is, the best result of the four statistics is 4 out of a million, and then Bonferroni.)

3. Now let's get to Maya's tests. The idea is that the WRR (Witztum-Rips-Rosenberg) test involves many arbitrary choices. For each of 13 such choices, Maya looks whether the test statistic comes out better when the choice is made as it was. One might expect that it comes out better in about half the cases, and worse in about half the cases. But Maya finds that in each of the 13 cases, WRR's choice was to their advantage. This seems highly improbable, UNLESS the WRR statistic was observed BEFOREHAND to react favorably to at least some of the choices involved. And of course, if one does this, it is less surprising that one can generate significance at a high level.

4. For this to make sense, clearly the statistic to be calculated in connection with each choice should be the one with which WRR were working at the time that the choice in question was made. HERE IS PROBLEM NO. 1 WITH MAYA'S TESTS: She does NOT do this. The statistic Maya uses is the rank order out of ten million random permutations. But the entire test -- dates, spelling, appellations, date forms, EVERYTHING -- was fixed BEFORE Diaconis suggested the permutations. Using the permutations here is an inadmissible anachronism -- it's like asking why the defenders of Metzada didn't use Uzis. Indeed, many of the choices that Maya examines were made before the FIRST list was tested, i.e., even before Diaconis suggested the second list, and a fortiori before he suggested the permutations.

5.
Of course, it is remotely possible that before testing the first list, WRR foresaw (from Dilugim? :-) ) that Diaconis would ask them for a fresh list of personalities, and also foresaw that still later he would ask them to use a permutation test rather than the test they had been using. Theoretically, one could even raise the possibility that Diaconis himself was part of the conspiracy, though I think that everybody concerned would be willing to rule THAT one out. What I'm suggesting is that in carrying out this investigation, one must stay more or less in the realm of the plausible; one must maintain common sense. And, common sense calls for using the statistics P1 and P2, and NOT the permutation test, to test Maya's hypothesis.

6. Let us now look at each of the 13 choices that Maya examined.

1: When Margaliot had an incorrect date, WRR substituted the correct date -- rather than using Margaliot's incorrect one (first list).

2: Same (second list).

3: When Margaliot had an incorrect date, WRR substituted the correct date -- rather than omitting the item (first list).

4: Same (second list).

5: WRR used birthdays as well as deathdays; they could have used just the death days (first list).

6: Same (second list).

7: WRR used both forms for each of 15 and 16; they could have used just "tet-vav" and "tet-zayin" (first list).

8: Same (second list).

9: The form "be-alef be-tishri" could have been used, and was not (first list).

10: Same (second list).

11: WRR used an incorrect first list, because of incorrect measurement of the length of a column (they claim by mistake, but maybe it was really on purpose). They could have used a correct list.

12 & 13. Same for the second list. In this case Maya examined two rather complicated alternatives, but did NOT (!) examine the alternative of simply using the correct second list.

7. Maya's 13 tests fall naturally into two classes: 1 through 8 and 9 through 13.
The first class (1 through 8) consists of cases where WRR expected beforehand that the results would be improved by their choice; eventually, according to your report, they indeed were.

Tests 1 through 4 are good examples of this. Clearly, if one thinks that dates are important, it's a good idea to get the right dates! That the right dates score better than the wrong ones -- or than none at all -- should be no surprise IF the research hypothesis is correct. Maya, of course, rules out the possibility that the research hypothesis is correct -- but you can't assume that in your analysis!

Similarly for Tests 5 and 6. WRR think that birthdays ARE significant -- remember, the whole idea started with Herzl's birthday (see 3B above)!

And similarly for 7 and 8. One must remember that the reason for using "tet-vav" and "tet-zayin" is to avoid using the Name of the Lord in vain. But IF there IS a code in Bereshit, then that would not be using the Name in vain! After all, Bereshit is full of entirely explicit occurrences of the Name; the hypothesis that the author of Bereshit is willing to use the Name does make some sense. THAT is the reason that Rips suggested at the outset to use both forms. And apparently, it works!

Applying Maya's idea to this kind of phenomenon is a little like saying that in testing whether surgery + radiation works in treating cancer, one must try the treatment without radiation!

8. Tests 9 through 13 are admittedly different. In 9 and 10 there is no apparent reason for having left out the additional form (WRR wrote that they received an expert opinion to this effect, and the expert has meanwhile died; but we'll ignore that). In Tests 11, 12, and 13, WRR admit to measurement errors, which Maya claims are to their advantage. Are they indeed? NO! BOTH in Test 9 AND in Test 10, min(P1,P2) actually becomes SMALLER when the form "be-alef be-tishri" is added. And remember: that -- NOT the permutation -- is the correct statistic to use (see Point 5 above).
So WRR's choice was to their DISADVANTAGE in both Tests 9 and 10.

9. In the case of Test 11, Maya is right! min(P1,P2) does become somewhat larger when the correct list is used.

10. The case of Tests 12 and 13 is somewhat strange. As mentioned above, in this case Maya examined two rather complicated alternatives, but for some reason did NOT (!) examine the simple alternative of just using the correct second list. If one does use this simple alternative, one finds again that min(P1,P2) becomes smaller when the correct list is used. So again, WRR's mistake was to their DISADVANTAGE.

11. Since the second list was the one actually used in the formal test on which Diaconis and Aumann agreed, and that was eventually published in STATISTICAL SCIENCE, it is of some interest to ask how the use of the correct list would affect the true significance level -- that given by the permutation test. That is, quite apart from the question of cheating -- which we have seen is NOT indicated by this mistake -- the question arises: In view of this mistake, does the result in fact remain valid? Answer: YES, very much so. The significance level improves by a FACTOR of 40 -- from 16 in a million to 4 in ten million! And this 4 in ten million is itself only because of Bonferroni -- the true best result is 1 in 10,000,000, and may be even better (only 10,000,000 permutations were examined).

12. Summary: Out of Maya's 13 tests, the first 8 are disqualified on conceptual grounds. Of tests 9, 10, and 11, Maya is right in one, wrong in two. Tests 12 and 13 seem contrived and complicated; if one replaces them by a natural, straightforward test, Maya is again wrong. Final score for Maya: 1 in 13, or 1 in 5, or at the very best, 1 in 3. But no matter how you score it, there's NO indication of cheating.

13. Here's another choice that WRR made that they didn't have to make at all, and that was definitely to their DISADVANTAGE: The addition of the statistics P3 and P4 (see 3I above).
This was rather late in the game -- AFTER the permutation test had been suggested; so if they were cheating, they should have known by that time what they were doing. But it cut down significance by a factor of 1.6 -- from 10 in a million to 16 in a million. Again, it COULD all be a nefarious plot -- a kind of decoy; having foreseen Maya's test as well, they wanted to show how honest they are. But how plausible is that?

14. Let's now consider the matter of "stars," which Maya raised at Zarka Ma'in. This sounds interesting at first, but on consideration, it's not clear that there's anything to it. One must remember two things. First, by all accounts, EVERYTHING had been fixed by the time the permutations were suggested. The matter of stars must therefore be evaluated in the light of P1 and P2, or if you wish, in the light of the distribution of the c(w,w'). The second thing to note is that WRR's contention has been, all along, that there are an unusually large number of unusually small c(w,w') -- i.e., stars (look at the bar graphs on p. 437 of their article). If you look at the construction of P1 and P2, that's what it amounts to. So what's it all about? I'll agree that this bears further looking into. But for the time being, I see nothing there.

15. Burden of Proof: At Zarka Ma'in, Maya said that the burden of proof is now on WRR. I don't see it that way. They've gotten a very high significance level. So far, it's stood up under examination. Asking questions and raising possibilities, which on examination turn out groundless, is not enough.

16. So now, what kind of results that Dror and Maya might come up with would be more convincing than those that Maya already discussed?

17. Before answering this, let me turn the question around and ask YOU, Maya: What kind of results that WRR might come up with would be more convincing to you than those already published? The answer is public knowledge: There are none.
You are on record as saying (in the discussion after my "Rationality on Friday" talk) that NOTHING will convince you. That's OK. Michelson and Morley believed in an ether to the end of their lives. Though the experiment that destroyed the ether was their own, and though they kept refining the experiment and never found any effect at all, they kept believing in the ether. A scientist does not need to keep an open mind on everything; like anybody else, he has a right to his faith.

In this respect, I have an advantage over you. I've always been very, very skeptical about this business. Frankly, I still am. Though I can't say why, I'm far from convinced that they're right. In my bones, I feel that I need more evidence -- lots more. BUT UNLIKE YOU, I DON'T RULE IT OUT COMPLETELY. Though utterly fantastic, it's just barely possible. I'm keeping an open mind, and I'm going to play it by the rules. I really want to find out whether there is a phenomenon there or not. In contrast, you're already sure; you only want to find out HOW they cheated, not IF they cheated.

18. Back to Point 16: First of all, whatever you do, you've got to say BEFOREHAND "I'm going to do this and that and that." You've got to do that BEFORE you actually compute anything. And, you've got to give PRECISE criteria for success and failure. YOU can make them up as you wish, but you've got to tell the world BEFOREHAND what they are. And success or failure, you've got to tell us afterward how your tests came out. So we can keep score.

That's what they did. I didn't believe they would, but they did. And if you want to convince ME, you're going to have to do the same. If at first you don't succeed, you can keep trying. Just tell us BEFOREHAND what you're doing, and what the criteria are, and whether or not this test is going to be definitive, and so on. You can keep it open, or close it, or do what you want. Just tell us. Beforehand.

19. Now to specifics.
As you yourself pointed out, the procedure for calculating the c(w,w') is very complicated. You can ring the changes on it in lots of ways. For example, there are lots of ways of perturbing the ELS's. Or you can use a different distance function. Or you can raise the 8. Lots of possibilities there. I don't have time now, but I'll be able to make some specific suggestions. I haven't checked this out with Ilya and Doron (in fact, I didn't check out anything in this letter with them, except where I cite them explicitly), and they should tell me if they think I'm wrong, and why. But on the face of it, all these things seem to me to be neutral to substance (UNLIKE your tests 1 - 8). So if in all these cases, or in disproportionately many of them, WRR's choice was to their advantage, then I'll sit up and take notice. You'll still have to tell a plausible story. And you'll still have to explain things like Gans. You won't be home yet, but at least you'll be on first base.

(19a. There's a small problem of credibility here. I have complete faith in your honesty. But the honesty of WRR has been impugned, so if one wants to maintain objectivity, one should address this matter for your tests too. I think this can be overcome; there are so many parameters floating around here that, for example, one could choose a test or tests at random from a large number of possibilities.)

20. Appellations and dates: Let's replicate this! I'm sure we can agree on a way of finding an expert; if you wish, he can be anonymous (not known to either side). Dror suggested a letter to "Prof. Zalman Hayadua". That's fine! He ("Prof. Zalman Hayadua") will get agreed-upon written instructions, and he'll produce lists, and we'll be able to run the thing from there.

21. War and Peace: First of all, I've got to see the list of appellations, to be able to judge whether it begins to make any sense. In fact, I (Yisrael) can't judge, but I can try to find out from somebody else.
Here we're back to Point 20, and we can use that kind of procedure to judge the list. IF it does, it IS interesting to have established that by careful cheating, you CAN do this kind of thing. But you've still got to establish that that kind of careful cheating was in fact feasible under the circumstances, you've still got to tell a plausible story. Was Havlin in the conspiracy? How about Gans (who says he started by trying to break the whole business)? If something is just barely possible with careful cheating, that doesn't mean that everybody who did it is a cheater. Maybe it's possible without cheating, too. If you show that by careful work, you can make a good counterfeit $100 bill, that doesn't mean that everybody with a $100 bill is a counterfeiter; you've got to find the printing press, or at least to make it plausible that he had one.

That's about it for now. I'm thinking of sending this to the people who were at Zarka Ma'in. Any objections?

Kol Tuv, Shabbat Shalom,
Yisrael
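
The arithmetic quoted in Points 3K, 11 and 13, and the "about half the cases" expectation behind Maya's 13 choices, can be checked with a short sketch. This is only an illustration under stated assumptions, not WRR's or Maya's actual computation: the function names are made up for this example, the Bonferroni step is taken to be "best result times the number of statistics," and the coin-flip model assumes each choice is independent and equally likely to help or hurt.

```python
from fractions import Fraction
from math import comb


def bonferroni(best_p, n_stats):
    """Bonferroni bound: smallest p-value times the number of statistics tried, capped at 1."""
    return min(Fraction(1), best_p * n_stats)


def at_least_k_favorable(k, n):
    """Chance that at least k of n choices come out favorable, assuming each is
    favorable with probability 1/2, independently (an assumption of this sketch)."""
    return Fraction(sum(comb(n, i) for i in range(k, n + 1)), 2 ** n)


# Point 3K: best of four statistics is 4 in a million, then Bonferroni.
published = bonferroni(Fraction(4, 10**6), 4)      # 16 in a million

# Point 11: with the correct second list the best result is 1 in ten million.
corrected = bonferroni(Fraction(1, 10**7), 4)      # 4 in ten million
factor_correct_list = published / corrected        # the "FACTOR of 40"

# Point 13: adding P3 and P4 moved the level from 10 to 16 in a million.
factor_p3_p4 = Fraction(16, 10**6) / Fraction(10, 10**6)

# Maya's count versus the most generous rescoring in Point 12.
all_13 = at_least_k_favorable(13, 13)              # 13 favorable out of 13
one_of_3 = at_least_k_favorable(1, 3)              # at least 1 favorable out of 3

print(float(published), float(corrected), factor_correct_list)
print(float(factor_p3_p4), all_13, one_of_3)
```

Under the coin-flip assumption, 13 favorable choices out of 13 would be rare, while 1 out of 3 would be entirely ordinary, which is why the scoring of the individual tests matters so much to the argument.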