The Interpreter; Bayes Theorem; Nephites and Mayans

Post by **_Gadianton** » Thu Jun 20, 2019 4:59 pm

You've computed a bunch of LRs, by some more rigorous means than just guessing, and now you're wondering how excited you should be feeling about them.

K-R suggests that an LR of 2 is barely worth mentioning. That is not a prescription for assigning things an LR of 2 just because you feel that they might be barely worth mentioning.

Arbitrarily insisting that data be scored at one of the three ratios allowed by the Dales rules out

Suppose I'm a defense attorney and I know the prosecution has a smoking gun, a 50. I have six pieces of evidence, and I'm praying to God one of them is a .02, but I run through the LR calculations, and each of them individually comes up to "barely worth a mention" at around .5. I'm depressed about this, until I come to believe each is independent, and then .5^6 = .015625.

Agree or disagree: Either there is something wrong with multiplying LRs, or multiplying LRs is appropriate but it would be a herculean task to find 6 fully independent weak pieces of evidence. Either all of that, OR we need to update our expectations and re-label the scale. If it's not unusual to have 6 barely-worth-mentioning independent pieces of evidence, then the scale is useless. It would be like saying six tremors = a 10 on the Richter scale.
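For anyone who wants to check the arithmetic in the attorney example, here is a minimal sketch. It assumes the six items really are independent, which is the whole point in dispute; the helper name `combined_lr` is made up for illustration.

```python
from math import prod

def combined_lr(lrs):
    """Multiply likelihood ratios; only valid if the items are independent."""
    return prod(lrs)

prosecution = [50.0]   # the single "smoking gun"
defense = [0.5] * 6    # six weak items, each "barely worth a mention"

defense_total = combined_lr(defense)
overall = combined_lr(prosecution + defense)

print(defense_total)  # 0.015625
print(overall)        # 0.78125, below 1, so the defense comes out ahead
```

Under the independence assumption, six LRs of .5 do numerically beat one LR of 50, which is exactly the puzzle being raised.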

Post by **_Res Ipsa** » Thu Jun 20, 2019 5:15 pm

Gadianton wrote:
You've computed a bunch of LRs, by some more rigorous means than just guessing, and now you're wondering how excited you should be feeling about them.

K-R suggests that an LR of 2 is barely worth mentioning. That is not a prescription for assigning things an LR of 2 just because you feel that they might be barely worth mentioning.

Arbitrarily insisting that data be scored at one of the three ratios allowed by the Dales rules out

Suppose I'm a defense attorney and I know the prosecution has a smoking gun, a 50. I have six pieces of evidence, and I'm praying to God one of them is a .02, but I run through the LR calculations, and each of them individually comes up to "barely worth a mention" at around .5. I'm depressed about this, until I come to believe each is independent, and then .5^6 = .015625.

Agree or disagree: Either there is something wrong with multiplying LRs, or multiplying LRs is appropriate but it would be a herculean task to find 6 fully independent weak pieces of evidence. Either all of that, OR we need to update our expectations and re-label the scale. If it's not unusual to have 6 barely-worth-mentioning independent pieces of evidence, then the scale is useless. It would be like saying six tremors = a 10 on the Richter scale.

The fact that the Dales’ methodology allows six pieces of evidence, no matter how weak, to overcome one piece of evidence, no matter how strong, should have been a giant red flag that the methodology was flawed. I view the scale in this case as a red herring, as we aren’t working with measurable likelihoods.

Post by **_Lemmie** » Thu Jun 20, 2019 6:11 pm

Suppose I'm a defense attorney and I know the prosecution has a smoking gun, a 50. I have six pieces of evidence, and I'm praying to God one of them is a .02, but I run through the LR calculations, and each of them individually comes up to "barely worth a mention" at around .5. I'm depressed about this, until I come to believe each is independent, and then .5^6 = .015625.

The Dales use a scale with only 3 allowable, discrete points on it:

2-----10-----50. For convenience, let's identify those as good-----better-----best.

The KR paper, on the other hand, says this:

numbers ranging from 1 to 3: not worth mentioning

numbers ranging from 3 to 20: positive

numbers ranging from 20 to 150: strong

numbers greater than 150: very strong

So every LR from 1 up toward infinity is included. (For evidence favoring the other side, take the reciprocals, so every LR from 1 on down toward zero is included as well.)

So back to your case. Suppose the only 6 items you can come up with are so weak that their true value is 1, 1, 1, 1, 1, and 0.9 (instead of 0.5, 0.5, 0.5, 0.5, 0.5, and 0.5).

If you use a more accurate scale, 50 multiplied by (1 x 1 x 1 x 1 x 1 x 0.9) = 45, which is greater than 1, so the prosecution wins.

So you argue, No, the evidence MUST be given the closest value on the Dales' scale. (notice 1s aren't allowed.)

So you use the bad scale, and now the problem is 50 multiplied by (0.5 to the 6th) = 0.7813, which is less than 1. Your evidence apparently outweighs the 50 from the prosecution, and you win.

gad wrote: Agree or disagree: Either there is something wrong with multiplying LRs, or multiplying LRs is appropriate but it would be a herculean task to find 6 fully independent weak pieces of evidence. Either all of that, OR we need to update our expectations and re-label the scale. If it's not unusual to have 6 barely-worth-mentioning independent pieces of evidence, then the scale is useless. It would be like saying six tremors = a 10 on the Richter scale.

Agreed, the scale is useless because it prohibits any value other than the 3 options, and is subjective.

Now just to mess with you, suppose your 6 pieces of evidence are NOT independent. Suppose each additional piece has a 90% chance of occurring with the ones previously listed.

If you fix both problems (the forced scale and the independence assumption), the true final LR would be:

50 x [(1^5)x0.9x(0.9^5)] = 26.57

Prosecution wins, because your weak evidence was properly evaluated.
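The three totals in this post can be reproduced with a short sketch. The helper `snap_to_dales` is hypothetical: it assumes "closest value on the Dales' scale" means closest on a log scale among the defense-side reciprocals of 2, 10, and 50. The dependence adjustment is simply the extra 0.9^5 factor as written above, not a general method for handling dependent evidence.

```python
from math import log, prod

# Defense-side points of the three-value scale: reciprocals of 2, 10, 50.
DALES_DEFENSE_POINTS = [1 / 2, 1 / 10, 1 / 50]

def snap_to_dales(lr):
    """Return the allowed point closest to lr on a log scale (hypothetical helper)."""
    return min(DALES_DEFENSE_POINTS, key=lambda p: abs(log(lr) - log(p)))

smoking_gun = 50.0
true_lrs = [1.0, 1.0, 1.0, 1.0, 1.0, 0.9]

# Accurate scale: use the true values directly.
accurate = smoking_gun * prod(true_lrs)

# Forced scale: every item gets pushed to 0.5 because 1 is not allowed.
forced = smoking_gun * prod(snap_to_dales(lr) for lr in true_lrs)

# True values plus the post's dependence factor of 0.9**5.
adjusted = smoking_gun * (1.0 ** 5) * 0.9 * (0.9 ** 5)

print(round(accurate, 2))  # 45.0
print(forced)              # 0.78125
print(round(adjusted, 2))  # 26.57
```

Only the forced scale drops the total below 1; with the true values, the smoking gun still dominates.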

Post by **_Physics Guy** » Thu Jun 20, 2019 6:22 pm

I'm not completely sure but I think the K-R scale is okay, as long as you remember that it's a scale for attaching qualitative descriptions to solid LRs, not a rule for assigning LRs to qualitative descriptions.

The difference is crucial, because it means that an LR of 2 (or on the other side 1/2) is not a typical LR for weak evidence, but an upper bound. Conversely an LR of 50 (or 1/50 on the other side) is not typical of all strong evidence; it's the lower bound at which you can start to call your evidence strong. Most weak evidence gives an LR very close to 1—and K-R is not denying that in the least. Real smoking-gun evidence can give an LR much larger than 50 or smaller than 1/50—and K-R is not denying that, either.

So a defence attorney with six pieces of weak evidence against a single smoking gun is probably not looking at six LRs of .5 against one LR of 50. They probably have something more like six LRs of 0.98 against one LR of 1000. Even if the six weak pieces of evidence are independent, they don't outweigh the smoking gun.

If the defence attorney actually is pitting six independent LRs of .5 against one LR of 50, then the accused probably deserves to go free, because although the single piece of adverse evidence is quite strong and the six pieces of exonerating evidence are fairly weak individually, the six of them together really do outweigh the one by enough to yield reasonable doubt.

K-R is a pretty authoritative reference but it provides no warrant whatever for the kind of thing the Dales have done, which is to weight all weak evidence as if it were the strongest evidence that could still be called weak, and all strong evidence as if it were the weakest evidence that could still be called strong. This systematically enhances the impact of weak evidence, potentially by enormous factors, and downplays the impact of strong evidence, also potentially by enormous factors. Nothing in K-R says that it's okay to do that. If K-R fails to say really explicitly that it's not okay, then that's only because K-R was written for people who already know well enough not to do completely stupid things.
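To put rough numbers on that distortion, here is a small sketch comparing the illustrative values from this post (six LRs of 0.98 against one of 1000) with the same evidence scored at the category boundaries (six LRs of 0.5 against one of 50). The function name `total_lr` is an assumption for illustration, and independence is assumed in both cases.

```python
def total_lr(strong_lr, weak_lr, n_weak):
    """One strong LR times n_weak independent weak LRs."""
    return strong_lr * weak_lr ** n_weak

# Illustrative "more like" values from the post.
realistic = total_lr(1000.0, 0.98, 6)

# Same evidence forced to the edges of the categories.
boundary = total_lr(50.0, 0.5, 6)

# How far the boundary scoring moves the final answer.
distortion = realistic / boundary

print(round(realistic, 1))  # 885.8
print(boundary)             # 0.78125
print(round(distortion))    # 1134
```

With these particular numbers, scoring at the boundaries shifts the result by a factor of over a thousand and flips which side it favors.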

Post by **_Lemmie** » Thu Jun 20, 2019 6:35 pm

Exactly my point. They are calling it a "scale" when it is really only a simple measure with three discrete points. It doesn't allow for any range.

But forcing all evidence into one of the three without even allowing a neutral value of 1 exaggerates the error considerably. Every piece of evidence considered is already, at a minimum, assumed to be strong. That skews the analysis irreparably.

Physics Guy wrote: Nothing in K-R says that it's okay to do that. If K-R fails to say really explicitly that it's not okay, then that's only because K-R was written for people who already know well enough not to do completely stupid things.

Post by **_Res Ipsa** » Fri Jun 21, 2019 12:29 am

Lemmie, if we estimated the dependence of the Dales’ correspondences at 20%, would we have to multiply the denominator of their LR by .2^(number of evidences)?

Post by **_Lemmie** » Fri Jun 21, 2019 1:41 am

Res Ipsa wrote:Lemmie, if we estimated the dependence of the Dales’ correspondences at 20%, would we have to multiply the denominator of their LR by .2^(number of evidences)?

Here's a helpful place to start, Res Ipsa: http://www.stat.yale.edu/Courses/1997-9 ... ndprob.htm

Post by **_honorentheos** » Fri Jun 21, 2019 2:10 am

Lemmie wrote:
Suppose I'm a defense attorney and I know the prosecution has a smoking gun, a 50. I have six pieces of evidence, and I'm praying to God one of them is a .02, but I run through the LR calculations, and each of them individually comes up to "barely worth a mention" at around .5. I'm depressed about this, until I come to believe each is independent, and then .5^6 = .015625.

The Dales use a scale with only 3 allowable, discrete points on it:

2-----10-----50. For convenience, let's identify those as good-----better-----best.

The KR paper, on the other hand, says this:

numbers ranging from 1 to 3: not worth mentioning

numbers ranging from 3 to 20: positive

numbers ranging from 20 to 150: strong

numbers greater than 150: very strong

So every LR from 1 up toward infinity is included. (For evidence favoring the other side, take the reciprocals, so every LR from 1 on down toward zero is included as well.)

So back to your case. Suppose the only 6 items you can come up with are so weak that their true value is 1, 1, 1, 1, 1, and 0.9 (instead of 0.5, 0.5, 0.5, 0.5, 0.5, and 0.5).

If you use a more accurate scale, 50 multiplied by (1 x 1 x 1 x 1 x 1 x 0.9) = 45, which is greater than 1, so the prosecution wins.

So you argue, No, the evidence MUST be given the closest value on the Dales' scale. (notice 1s aren't allowed.)

So you use the bad scale, and now the problem is 50 multiplied by (0.5 to the 6th) = 0.7813, which is less than 1. Your evidence apparently outweighs the 50 from the prosecution, and you win.

gad wrote: Agree or disagree: Either there is something wrong with multiplying LRs, or multiplying LRs is appropriate but it would be a herculean task to find 6 fully independent weak pieces of evidence. Either all of that, OR we need to update our expectations and re-label the scale. If it's not unusual to have 6 barely-worth-mentioning independent pieces of evidence, then the scale is useless. It would be like saying six tremors = a 10 on the Richter scale.

Agreed, the scale is useless because it prohibits any value other than the 3 options, and is subjective.

Now just to mess with you, suppose your 6 pieces of evidence are NOT independent. Suppose each additional piece has a 90% chance of occurring with the ones previously listed.

If you fix both problems (the forced scale and the independence assumption), the true final LR would be:

50 x [(1^5)x0.9x(0.9^5)] = 26.57

Prosecution wins, because your weak evidence was properly evaluated.

I sincerely beg you to post this on The Interpreter's comment section.

Post by **_Gadianton** » Fri Jun 21, 2019 2:25 am

Lemmie wrote:Suppose the only 6 items you can come up with are so weak that their true value is 1, 1, 1, 1, 1, and 0.9 (instead of 0.5, 0.5, 0.5, 0.5, 0.5, and 0.5)

Physics Guy wrote:for the kind of thing the Dales have done, which is to weight all weak evidence as if it were the strongest evidence that could still be called weak, and all strong evidence as if it were the weakest evidence that could still be called strong.

Great answers. I also apologize because I remembered something wrong from my weekend reading. There are two scales listed. Lemmie referred to the second, but I had the first in mind due to the range of "not worth a mention" to "Decisive" (and how many weak pieces of evidence are you going to say can overrule a decisive piece of evidence?). "Decisive" carries a little more weight than "very strong," and I *thought* the Dales were taking the category breakdowns from that first table, but now with the page in front of me, I don't see where they got their numbers. So my example was confusing because I thought I was using the breakdowns from the document and temporarily leaving the Dales out of it. I should have put 3.2, 10, and 100 (or 3, 20, 150).

Those are actually worse: 4 substantial "barely"s overpower one barely "Decisive" (3.2^4 is about 105, which beats 100). I guess independence is really just that strong.
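A quick way to check how many boundary-strength weak items it takes to overpower one boundary-strength strong item, for each set of cutoffs mentioned in this thread. This is just a sketch: the function name `pieces_needed` is made up, and it assumes the weak items are independent so their LRs multiply.

```python
def pieces_needed(weak_lr, strong_lr):
    """Smallest n such that weak_lr**n exceeds strong_lr (weak_lr must be > 1)."""
    n = 1
    while weak_lr ** n <= strong_lr:
        n += 1
    return n

# (weak cutoff, strong cutoff) pairs taken from the thread.
first_table = pieces_needed(3.2, 100)   # "substantial" vs "decisive"
second_table = pieces_needed(3, 150)    # "positive" vs "very strong"
dales_points = pieces_needed(2, 50)     # the 2 vs 50 example

print(first_table)   # 4
print(second_table)  # 5
print(dales_points)  # 6
```

So the count ranges from four to six depending on which cutoffs are used, as long as every item sits right at its category boundary.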

Post by **_Gadianton** » Fri Jun 21, 2019 2:33 am

I wonder if this is worth a bare mention. I'm just not happy about 4 substantial barelys overpowering a barely decisive.

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4891395/

The discrepancy with the collapsed-data Bayes factors is large, and this serves to demonstrate that when effect sizes are related across studies—which is reasonable to assume– Bayes factors should not be multiplied (e.g., as was done by Bem et al. (2011) in order to present evidence in favor of extra-sensory perception)*. As explained by Jeffreys (1961, pp. 332–334), Bayes factors may only be multiplied when the prior distributions are properly updated. To clarify, consider two studies, E1 and E2, and a fixed effect. When the two experiments are analyzed simultaneously, the Bayes factor can be denoted BF(E1, E2), and it is obtained by integrating the likelihood over the prior distribution (see Appendix for details). When the two experiments are analyzed sequentially, the same end result should obtain, and this occurs with a Bayes factor multiplication rule based on the definition of conditional probability: BF(E1, E2)=BF(E1)×BF(E2∣E1). Note that the latter term is BF(E2∣E1), indicating that it is obtained by integrating the likelihood over the posterior distribution obtained after observing the first experiment. Thus, multiplying Bayes factors across N related units (participants or studies that show similar effects) is incorrect because the prior is used N times instead of being updated.

*LOL f*n LOL
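The multiplication rule in that passage, BF(E1, E2) = BF(E1) x BF(E2|E1), can be checked numerically on a toy model. The sketch below is not from the linked article: it assumes coin-flip data, a point null of theta = 0.5, and a Beta prior on theta under the alternative, and the counts are invented purely for illustration.

```python
from math import exp, lgamma, log

def log_beta(a, b):
    """Log of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def bf10(k, n, a=1.0, b=1.0):
    """Bayes factor for H1: theta ~ Beta(a, b) against H0: theta = 0.5,
    given k successes in n trials."""
    log_m1 = log_beta(a + k, b + n - k) - log_beta(a, b)  # marginal likelihood under H1
    log_m0 = n * log(0.5)                                 # likelihood under H0
    return exp(log_m1 - log_m0)

# Two made-up studies of the same effect.
k1, n1 = 14, 20
k2, n2 = 13, 20

# Analyze both studies at once.
joint = bf10(k1 + k2, n1 + n2)

# Naive product: the same Beta(1, 1) prior is reused for each study.
naive = bf10(k1, n1) * bf10(k2, n2)

# Sequential product: study 2 uses the posterior from study 1 as its prior.
sequential = bf10(k1, n1) * bf10(k2, n2, a=1.0 + k1, b=1.0 + n1 - k1)

print(joint, naive, sequential)
```

In this toy setup the sequential product matches the joint Bayes factor, while the naive product does not, which is the article's point about reusing the prior N times.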

DiscussMormonism.com
