Hi Matt (through aussieguy).
Though the authors would like readers to believe that our use of NSC was "naïve" (as they write in the article), it was not in fact naïve. (It warrants mention that the statistician, Prof. Witten, who did the statistics on our project, is Tibshirani's former grad student. Tibshirani is the one who invented NSC.) Our use of the algorithm was perfectly legit for what we applied it to. Schaalje misrepresents our objectives in order to create a straw man argument related to the business of a closed set of candidate authors.
Being someone’s graduate student doesn’t guarantee that the person will not naïvely misapply their mentor’s work. Tibshirani invented the method for a very different application, tumor type identification from gene-microarray data, in which the closed-set assumption is usually met and there is nothing like text size to worry about. Go talk to Tibshirani himself.
Your use of NSC was *not* perfectly legit. For one thing, the goodness-of-fit test showed that predictions based on your fitted model did not match your data. Assumptions that are contradicted by the data are bad assumptions. Inferences based on bad assumptions are bad inferences. It’s like saying that we added two numbers when we should have multiplied them. But we felt like adding for this problem, so it’s magically legit.
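To make the point concrete rather than rhetorical: a closed-set classifier will always crown a winner, so you have to check separately whether the winner's word-use profile actually fits the text. Here is a minimal sketch of such a check (the counts and rates below are invented for illustration, and this is a generic chi-square goodness-of-fit test, not the specific test or feature set from either paper):

```python
# Minimal illustration (invented numbers, not the Book of Mormon data):
# even the "best" closed-set candidate can fail a goodness-of-fit check.
import numpy as np
from scipy.stats import chisquare

# Hypothetical relative frequencies of a few marker words for one candidate,
# plus a catch-all "everything else" category so the proportions sum to 1.
candidate_rates = np.array([0.012, 0.008, 0.020, 0.005, 0.955])

# Hypothetical observed counts for the same categories in a disputed text
# of 2,000 words.
observed = np.array([10, 2, 75, 1, 1912])
n_words = observed.sum()

expected = candidate_rates * n_words
stat, p_value = chisquare(observed, f_exp=expected)

print(f"chi-square = {stat:.1f}, p = {p_value:.2g}")
# A tiny p-value says the candidate's profile does not fit the text,
# even if that candidate happens to be the closest one in the closed set.
```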
Burrows, Juola and Koppel (leaders in the authorship attribution field) all worry about mistaking open-set problems for closed-set problems. It’s not a straw man to worry about this issue in this of all cases. Matt, for interest’s sake, how do you justify ignoring the false-positive cutoff for your use of Burrows’s delta?
Schaalje's primary complaint is that we use a closed set of candidate authors. That is a legitimate complaint and it is a point that we acknowledge quite clearly in the paper.
You simply pointed out that you chose to use closed-set classification, as if the choice to do so has no consequences for your inferences. If you add two numbers when you should have multiplied them, you get the wrong answer.
That said, all authorship problems are ultimately closed-set problems. You cannot test for every single person in the entire history of the world. Instead, you have some candidates and wish to figure out which among them is the most likely culprit. . .
. . . or, as Burrows said, determine if it is wiser to look further afield. In other words, you can’t just assume away the possibility that none of them is likely. The open-set technology lets you honestly entertain that possibility. So you’re wrong. Not all authorship problems are closed-set problems. I would say the vast majority are open.
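The difference is easy to state in code. Below is a deliberately toy sketch (the distances and the cutoff are invented, and this simple distance rule is only a stand-in for a real method like NSC): the closed-set rule must return somebody, while the open-set rule is allowed to answer that none of the candidates fits.

```python
# Toy illustration of closed-set vs. open-set attribution.
# Distances and the cutoff are invented; this is a simplified stand-in
# for a real method like NSC, not the actual procedure from either paper.

def closed_set_attribution(distances):
    """Always returns the nearest candidate, no matter how far away it is."""
    return min(distances, key=distances.get)

def open_set_attribution(distances, cutoff):
    """Returns the nearest candidate only if it is acceptably close;
    otherwise admits that none of the candidates fits."""
    best = min(distances, key=distances.get)
    return best if distances[best] <= cutoff else "none of the above"

# Hypothetical distances from a disputed text to each candidate's profile.
distances = {"Candidate A": 14.2, "Candidate B": 15.7, "Candidate C": 19.3}

print(closed_set_attribution(distances))          # -> "Candidate A", regardless of fit
print(open_set_attribution(distances, cutoff=5))  # -> "none of the above"
```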
Our objective was to test *existing* theories of authorship of the Book of Mormon. And that's exactly what we did. We took the list of suspects and tried to rank them in terms of their likelihood.
Read your own paper. You acted as if your authorship probabilities were absolute probabilities rather than probability rankings. When the absolute probabilities are so small, it is highly misleading to build a case on probability rankings.
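Here is that complaint reduced to toy arithmetic (the likelihood values are made up purely for illustration): once you renormalize the candidates' likelihoods to sum to one, the "winner" can carry an impressive-looking probability even though every candidate fits the text terribly.

```python
# Toy illustration (made-up numbers): relative rankings can look decisive
# even when the absolute fit of every candidate is dreadful.

# Hypothetical likelihoods of the disputed text under each candidate's model.
# All of them are astronomically small, i.e. nobody fits.
likelihoods = {"Candidate A": 1e-40, "Candidate B": 1e-43, "Candidate C": 1e-45}

total = sum(likelihoods.values())
closed_set_probs = {name: lik / total for name, lik in likelihoods.items()}

for name, prob in closed_set_probs.items():
    print(f"{name}: closed-set 'probability' = {prob:.4f}")
# Candidate A comes out near 0.999, which reads like near-certainty,
# but that number is only a ranking among candidates who all fit badly.
```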
We point out in the paper that it is possible that the *real* author is not in the closed set. The real author could, for example, be Moroni or Napoleon. The point of our work was to "reassess" prior theories of authorship using machine classification. Our result is only compelling if you first accept the historical evidence and then accept that the candidates we tested represent a good set of candidates.
Even if you tentatively accept the historical evidence, the results are not compelling if the closed-set assumption is empirically untenable. All of your candidates (other than Isaiah) used your marker words very differently than they were used in Book of Mormon texts. You can see that in the principal component plots and the goodness-of-fit tests.
The Schaalje paper creates a straw man argument and then plays fast and loose with the facts of our paper, cherry-picking little bits here and there to make it look like we did something other than what we did, or that we had a goal other than what we had. In my opinion the fundamental conclusion of our work is that it lends additional support to the Spalding-Rigdon theory of authorship.
Please back these statements up with more than bald assertions. What is the straw man? Exactly where did we misrepresent your paper?
Schaalje makes a silly point of testing for Rigdon in the Federalist Papers. No one ever suspected Rigdon of writing the Federalist Papers. And then, to make sure that his results pack the right punch to match his bias, he uses an entirely different feature set from ours and then, voilà, says that Rigdon shows up as the most likely author of some of the Federalist Papers. This is just plain hogwash.
This example dramatically shows how silly attribution results can be when closed-set methods are used inappropriately. But open-set technology rescued the attributions from being absurd even in this silly example.
The use of a different feature set makes no difference. What bias is introduced by using a different feature set? Do you think we chose features that would make Rigdon look similar to Hamilton under closed-set methods and different from Hamilton under open-set methods? Look at the principal components plot. If anything our feature set seems to bias things against the point we are making. I’m completely baffled as to why you can’t see the warning in this example. Most readers of this board immediately see the relevance of this example. Why do you choose not to?
Schaalje tries to suggest that text length is a factor. We checked that there was indeed no correlation between text length in our study and the results we derived. I can't remember if that was included in the final version of our paper.
How did you check this? Exactly what did you correlate with what? Look at Figure 6 of our paper. The variances of the features--the major source of uncertainty in the authorship attributions--depend greatly on text size. But your probability calculations have no adjustment for text size at all. You pretend that you are calculating absolute probabilities, and these most definitely depend on text size.
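The size effect is not subtle. A back-of-the-envelope sketch (assuming, purely for illustration, that occurrences of a marker word behave like independent Bernoulli draws) shows that the noise in a relative word frequency shrinks like one over the square root of the text length:

```python
# Back-of-the-envelope sketch: how the noise in a relative word frequency
# depends on text length, assuming (purely for illustration) that each word
# is an independent Bernoulli draw with true rate p.
import math

p = 0.01  # hypothetical true rate of some marker word (1% of words)

for n_words in (500, 2000, 10000, 50000):
    # Standard error of the observed relative frequency under the binomial model.
    se = math.sqrt(p * (1 - p) / n_words)
    print(f"text of {n_words:>6} words: relative frequency {p:.3f} +/- {se:.4f}")
# The uncertainty for a 500-word text is about ten times that for a
# 50,000-word text, so a probability calculation that ignores text size
# is ignoring its biggest source of noise.
```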
For what it is worth, Schaalje does offer some interesting ideas but ultimately they are constructed out of an entirely different data set and used in an entirely different approach.
I think this gets at the fundamental problem. We adamantly do not believe that historical theories give you license to ignore empirical characteristics of the data.
And all of this they did in the context of statistics and formulas that only a highly trained statistician would understand. In my opinion, they should have first sought to have the statistical methodology reviewed and published in a journal of statistics. They suggest a modification of NSC, and NSC is an approach that has been peer reviewed by statisticians who have the necessary expertise to evaluate it.
We did. The reference (Communications in Statistics – Theory and Methods) is in the bibliography. (By the way, we similarly believe that Matt and Craig should have first sought to have their historical theories reviewed in a history journal. Since we did go to a statistics journal, why not do it now?)
In my opinion, Schaalje played fast and loose with everything in our own work in order to present a critique that would be pointless to try and rebut. Their paper is full of baffling formulas and mathematical sleights of hand. In my opinion, it says virtually nothing.
If you don’t understand the formulas, have your coauthor (Witten) explain them to you before giving an opinion that means exactly nothing. Where are the sleights of hand? Please give just one example.
Unfortunately, it will achieve exactly what the authors hope, namely, it will provide them with a peer-reviewed paper that they can hold up and claim to be a rebuttal of our paper. Anyone who can read and understand it will see that it is nothing of the sort, but few in Schaalje's target audience will, I suspect, actually try to read it.
It’s clear that *you* didn’t try to read it with understanding before forming your opinion. It’s also interesting that you admit to not being able to understand the math, but have no problem firmly concluding that ‘it is nothing of the sort.’ Aside from believing that we have made a real contribution to statistical authorship attribution methodology, our real hope is that future research on statistical authorship attribution will not be as shoddy as the Jockers et al. paper.