
Polygraphs: The evidence

But this discussion was never about using polygraphs to weed out spies. It was about drawing a conclusion about 100 tests grouped together.

So this doesn't factor into that?:

roughly 99.6 percent of positives (those failing the test) would be false positives.

That would mean that 99.6 percent of your positives of "real atheists" could very well be false positives.

Does this not register? It really isn't that awful for you to admit you were wrong, for once. I certainly wouldn't hold it against you.

How is a 99.6 percent false positive rate not relevant to a case where you're trying to pick out "fake" atheists from "real" atheists?

If 100 of them showed up as liars, 99.6 of them could easily be false positives. Is this not significant? The only real "fake" atheist would be .4 of a person! You're talking about atheist legs here, not even above the hips! :D
 
For pete's sake, this is a really annoying discussion.

What percent accuracy do you want? What is your estimate? What's your evidence?

Take that then apply it to the group of 100 tests. What do you get?

How do you know it is correct any of the time?

Will you present a paper at TAM?
 
That would mean that 99.6 percent of your positives of "real atheists" could very well be false positives.

Highly unlikely. The stats just don't bear this out.


How is a 99.6 percent false positive rate not relevant to a case where you're trying to pick out "fake" atheists from "real" atheists?

Because we have a priori (by assumption) knowledge of the percentage of "real spies" in the case under discussion: specifically, by assumption, 0.1% of the "employees" are "spies". Another way of looking at it is that in the pool of people who fail the polygraph examination, the percentage of spies is four times higher than in the population at large, which is quite a substantial increase.
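To make the base-rate arithmetic explicit, here is a minimal sketch in Python of the calculation behind these figures, assuming (as in the case under discussion) a 0.1% base rate of spies, an 80% chance of flagging a liar, and a 20% chance of flagging an honest employee:

```python
# A minimal sketch of the base-rate arithmetic described above.
# Assumed figures (all from the case under discussion): 0.1% of
# "employees" are "spies"; the polygraph flags 80% of liars and
# also (incorrectly) flags 20% of truth-tellers.
base_rate = 0.001          # fraction of employees who are spies
sensitivity = 0.80         # chance a spy fails the test
false_alarm_rate = 0.20    # chance an honest employee fails the test

p_fail = base_rate * sensitivity + (1 - base_rate) * false_alarm_rate
ppv = base_rate * sensitivity / p_fail   # fraction of "failures" who really are spies

print(f"Positives that are real spies:  {ppv:.3%}")               # ~0.4%
print(f"Positives that are false:       {1 - ppv:.3%}")           # ~99.6%
print(f"Enrichment over the base rate:  {ppv / base_rate:.1f}x")  # ~4x
```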

If 100 of them showed up as liars, 99.6 of them could easily be false positives.

Not under the circumstances of the test as described.

By assumption, under the test you (and the NAS) described, the polygraph is about 80% accurate, which means that in a pool of 100 genuine atheists, you would expect about 20 of them to be (incorrectly) identified as liars. The form of this distribution is well understood (it is binomial). For example, the chance of getting between 15 and 25 (incorrect) liars is about 80%. The chance of getting between 10 and 30 (incorrect) liars is about 99%. The chance of getting between 5 and 35 (incorrect) liars is better than 99.9%. Alternatively, the chance of getting 31 or more "false positives" is less than 1%.
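For anyone who wants to check interval probabilities like these, here is a minimal sketch in Python, assuming the false positives among 100 truthful subjects behave as independent 20% coin flips (the Binomial(100, 0.2) model the paragraph above describes):

```python
# A minimal sketch of how one might check interval probabilities like those
# above, modelling the false positives among 100 truthful subjects as
# Binomial(n=100, p=0.2), i.e. each truthful subject independently has a
# 20% chance of being flagged as a liar.
from math import comb

n, p = 100, 0.20

def pmf(k):
    """Probability of exactly k false positives."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def p_between(lo, hi):
    """Probability of between lo and hi false positives, inclusive."""
    return sum(pmf(k) for k in range(lo, hi + 1))

print(f"P(15-25 false positives) = {p_between(15, 25):.1%}")
print(f"P(10-30 false positives) = {p_between(10, 30):.1%}")
print(f"P( 5-35 false positives) = {p_between(5, 35):.2%}")
print(f"P(31 or more)            = {p_between(31, n):.2%}")
print(f"P(50 or more)            = {p_between(50, n):.2e}")  # Randi's-million territory
```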

The chance of getting 50 or more "false positives" is sufficiently small that it would qualify for Randi's million. The chance of getting 99.6 "false positives" is sufficiently small as to make Randi's million a sure thing.

The difference? The size of the subject pool, and the rarity of the syndrome under study. With 10,000 subjects, nearly all of them truthful, I will probably find about 2,000 false liars by chance alone, and those will utterly dwarf the roughly ten genuine liars there are to find in the pool. But with 100 truthful subjects, I will probably only find about 20 false liars --- and so if I find fifty liars, I can be confident that about thirty of them are probably genuine.

(I should point out that this is a well-studied problem in biostatistics. There's nothing "woo" about tests with less than perfect sensitivity and specificity; they're a simple fact of life, which is why it's so difficult to do large-scale testing for rare diseases. Some of the best work on this has been done by the various Armed Forces, when they're looking for rare tropical diseases in returning soldiers....)


Is this not significant?

Not in the slightest.

The NAS report is talking about large-scale screening, and identifies, correctly, that any detection device with only moderate accuracy is not very useful for such large-scale screening. Scepticgirl is talking about event-specific, small-scale analysis. Different problems, different issues, different solutions, and different requirements.
 
But this discussion was never about using polygraphs to weed out spies. It was about drawing a conclusion about 100 tests grouped together.

My recommendation:

Stop trying to explain statistical analysis of experimental results to people who don't understand statistical analysis.

Sometimes it is better to throw in the towel to stop yourself getting a headache.
 
Ah yes, the "those ignorant people!" argument and "we can't learn them!"

Yeah, I'm unsubscribing from this thread. Doubt it'll go anywhere.
 
The only problem I see here is that you are all speaking about lie detection and statistics, but all a polygraph does is *at best* try to detect "emotional stress", as shown by various measurements of bodily function, in comparison to a baseline. Sometimes, but not always, a lie will generate (for an unaware/untrained person) a fluctuation above the baseline (that is the sad theory). But so will many other things. And since the baseline is set by the operator, not on valid scientific measurement but more or less on gut feeling, that already makes it doubtful.

So it does not matter if it detects 80% of truths if, along the way, it detects 100% of emotional stress, and thus reacts to loaded questions and questions with emotional baggage. You would have to first separate the emotional response to the lie from the emotional response to the question itself, and for such questions that would be darn near impossible. On the question "did you rape your neighbor's child multiple times?", most people would fail the polygraph test.
 
How do you know it is correct any of the time?

CFLarsen, you need to step back and think about this. Statistical tests can tell you how well an imperfect test works. We do it all the time:

1) A self-proclaimed psychic says "I can psychically read Zener cards while blindfolded." We draw 100 known Zener cards and ask the psychic to guess them. Then we compare each guess to the card. At the end of the test we can say "The psychic got 70% of these guesses right". In the future, if the psychic guesses at an unknown Zener card, we make an inference: "The psychic has a 70% chance of getting this guess right."

2) A nuclear-physics experiment has a detector which makes "blips" in response to neutrons and to alpha particles. Usually neutrons make sharper blips and alphas make broader blips. We expose the detector to a neutron beam and find that 70% of the blips are sharp; then we expose it to an alpha beam and find that 70% of the blips are broad. Later on, when we see an unknown blip in the detector, we check whether it was sharp or broad, and make a statistical inference (with appropriate error bars) about whether that blip came from a neutron or an alpha.

(This does *not* tell you how many false-positives or false-negatives this detector produces: that's a function both of the detector acceptance and rejection (as we call them) and of the actual neutron/alpha ratio encountered in the experiment.)

3) We take 100 people and make them play a psych-experiment-ish game. "Test subject #1, you are now on the Red team. You need to steal a flag from the Blue side; first, sneak past the Blue lie-detector by convincing it that you are Blue, then come back and present yourself to the Red lie-detector." Or whatever. At the end of the experiment, you have 100 examples of people lying (as the experimenter, you *know they're lying*) and 100 examples of people telling the truth (and, as the experimenter, you *know* they're telling the truth.) You can tally up that "70% of liars were correctly identified as liars by the machine", and "25% of truth-tellers were misidentified as liars".

It's just like any other test. You need a "calibration" phase where you provide the machine with known quantities (known neutrons, known Zener cards, known lies), and check how often its reports are correct.
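As a toy illustration of that calibration tally (the data below are invented purely for illustration), counting up the verdicts against known ground truth looks like this:

```python
# A toy illustration of the "calibration" tally described above. The data
# below are invented purely for illustration; in a real calibration you would
# have one row per examination, with the ground truth known by design.
calibration_runs = [
    # (machine_said_lying, subject_was_actually_lying)
    (True, True), (True, True), (False, True),      # three known lies
    (False, False), (True, False), (False, False),  # three known truths
]

liar_verdicts  = [flagged for flagged, lying in calibration_runs if lying]
truth_verdicts = [flagged for flagged, lying in calibration_runs if not lying]

sensitivity = sum(liar_verdicts) / len(liar_verdicts)    # liars correctly flagged
false_alarm = sum(truth_verdicts) / len(truth_verdicts)  # truth-tellers wrongly flagged

print(f"Liars correctly identified as liars:   {sensitivity:.0%}")
print(f"Truth-tellers misidentified as liars:  {false_alarm:.0%}")
```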

So, a lie-detector calibration experiment is a perfectly reasonable thing. Your general objection is mistaken, and if you think about it a bit you ought to withdraw it. You may still (rightly) have a specific objection: "Has an appropriate lie-detector-evaluation actually been done? And what were the results?" But skepticgirl's links are a perfectly acceptable attempt to answer that objection. If you want to fisk the detailed experimental design, please do so.
 
The only problem I see here is that you are all speaking about lie detection and statistics, but all a polygraph does is *at best* try to detect "emotional stress", as shown by various measurements of bodily function, in comparison to a baseline. Sometimes, but not always, a lie will generate (for an unaware/untrained person) a fluctuation above the baseline (that is the sad theory). But so will many other things. And since the baseline is set by the operator, not on valid scientific measurement but more or less on gut feeling, that already makes it doubtful.

I'm afraid that this isn't correct. The baseline is made by asking the subject a number of other questions, including questions that are known to be stressful in and of themselves (as well as questions that are known to not generally be stressful), and calculated on a per-subject basis.


So it does not matter if it detects 80% of truths if, along the way, it detects 100% of emotional stress, and thus reacts to loaded questions and questions with emotional baggage. You would have to first separate the emotional response to the lie from the emotional response to the question itself, and for such questions that would be darn near impossible.

Not at all. It's standard practice.

On the question "did you rape your neighbor's child multiple times?", most people would fail the polygraph test.

One doesn't "fail" or "pass" a polygraph test on individual questions. In this regard, the term "lie detector" is misleading; the results of polygraph tests give the overall probability that the subject is attempting deception, not simple detection of individual lies. That's one reason why the tests are so long: the same question (or very subtle variations) can be asked over and over again --- to lie to one, you will have to lie to all, and therefore you will display an overall pattern of deception. (It also works to reduce the emotional response --- you may be shocked and offended the first time you are asked about raping your neighbor's child, but such shock and offense wears off relatively quickly if you are innocent; much of the emotional stress will vanish by the end of the test, simply because by that time you're responding truthfully to a rather stupid question.)

The rest of it is simply empirical science. It is in fact relatively easy to set up a controlled test where the truth or falsity of the statements is known to the overall investigators (but not to the individual polygraph operators); Mythbusters has done such a test (and the polygraph worked). In the Mythbusters experiment, Kari, Tory, and Grant were handed sealed envelopes at random; two of them were instructed to go into a room and steal a wallet, while Kari was told to go into the room and look around but not touch anything. The polygraph operator was told that a wallet had been stolen and that the three suspects were our three friends.

In this case, "we" (the viewers) know the truth; if the polygraph operator nailed Kari but let the others walk, we would know he had missed, and conversely, if he nailed the guys but let Kari walk, he hit. As it happened, he hit. Not that impressive in a sample size of three, but this is TV, not science.

If you want the science, check out the opening citation. They cite 52 different studies and present ROC curves for all 52. If you've never seen such curves before, they may be difficult to interpret, but the curves themselves are fairly standard (and rather impressive).

"Chance" performance would be the diagonal line from lower left to upper right. No study came close to doing that poorly. All studies had a much higher chance of identifying a deceptive person as deceptive than a non-deceptive one, at any meaningful threshhold.

For example, say that you will accept a 10% "false positive" rate -- a fairly conservative one. In practice, this means that you look at the "baseline" (there's that word again) that 90% of the known truthful people never crossed. Now, the question becomes : how many of the known liars crossed that baseline (and would be detected as liars)?

In the worst-performing study, about 40% of the liars would be caught by a test that caught only 10% false positives. In the best-performing one, 100% of the liars were caught at that level.

So out of a group of 200 people, half liars, we would expect (from a high-performing case) to get about 95 of the 100 liars, and 10 of the truth-tellers; of the 105 people who failed the test, better than 90% were in fact liars. That's pretty good. (Actually, from the best performing case, we'd get all the liars, but ceiling effects enter here and you can't trust that particular number to hold up.)
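A minimal sketch of the arithmetic in that last paragraph, assuming the stated high-performing case (about 95% of liars caught at a 10% false-positive rate):

```python
# Reproducing the arithmetic of the paragraph above: 200 subjects, half of
# them liars, and a high-performing test catching about 95% of liars at a
# 10% false-positive rate.
n_liars, n_truthful = 100, 100
sensitivity, false_alarm_rate = 0.95, 0.10

true_positives = sensitivity * n_liars           # ~95 liars caught
false_positives = false_alarm_rate * n_truthful  # ~10 truth-tellers flagged
ppv = true_positives / (true_positives + false_positives)

print(f"People failing the test:              {true_positives + false_positives:.0f}")  # ~105
print(f"Fraction of failures who really lied: {ppv:.0%}")                               # ~90%
```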
 
So, a lie-detector calibration experiment is a perfectly reasonable thing. Your general objection is mistaken, and if you think about it a bit you ought to withdraw it. You may still (rightly) have a specific objection: "Has an appropriate lie-detector-evaluation actually been done? And what were the results?" But skepticgirl's links are a perfectly acceptable attempt to answer that objection. If you want to fisk the detailed experimental design, please do so.

I understand the concept fine. However, your third example is countered in the study skeptigirl linked to:

Realism of Evidence
The research on polygraph accuracy fails in important ways to reflect critical aspects of field polygraph testing, even for specific-incident investigation. In the laboratory studies focused on specific incidents using mock crimes, the consequences associated with lying or being judged deceptive almost never mirror the seriousness of those in real-world settings in which the polygraph is used. Polygraph practitioners claim that such studies underestimate the accuracy of the polygraph for motivated examinees, but we have found neither a compelling theoretical rationale nor a clear base of empirical evidence to support this claim; in our judgment, these studies overestimate accuracy. Virtually all the observational field studies of the polygraph have been focused on specific incidents and have been plagued by measurement biases that favor over-estimation of accuracy, such as examiner contamination, as well as biases created by the lack of a clear and independent measure of truth.
From skeptigirl's link

Probably the biggest problem with polygraphs is that we cannot distinguish between lying and being emotionally distressed for other reasons.

Polygraphs have all the characteristics of a woo claim: The theory behind it is very weak, the evidence is very weak, there is no standardization, there is no scientific progress, there is no accumulated knowledge or accumulated evidence in favor of it.

skeptigirl claims otherwise. I want to see her defend her position at TAM. As skeptics, we simply cannot sit by and let such a claim go unchallenged.
 
Polygraphs have all the characteristics of a woo claim: The theory behind it is very weak, the evidence is very weak, there is no standardization, there is no scientific progress, there is no accumulated knowledge or accumulated evidence in favor of it.

They have all the characteristics of a woo claim except for one. Unlike woo claims, there is a solid body of peer-reviewed evidence that demonstrates that their performance is substantially better than chance.


Neither the lack of standardization, nor of progress, nor of accumulated knowledge is a sign that a claim is unfounded, only that it is poorly understood. This is something that the NAS makes totally clear in their report.

Despite this, they are very firm that under proper conditions, polygraphs are effective. From the executive summary,

We conclude that in populations of examinees such as those represented in the polygraph research literature, untrained in countermeasures, specific-incident polygraph tests can discriminate lying from truth telling at rates well above chance (emphasis in original)

They are very concerned about the use of polygraphs for security screening, but that's (as has been pointed out) entirely different.

It is a misrepresentation verging on a lie to claim that the evidence that a specific-incident polygraph test can discriminate lying from truth telling at rates substantially better than chance is weak.
 
They have all the characteristics of a woo claim except for one. Unlike woo claims, there is a solid body of peer-reviewed evidence that demonstrates that their performance is substantially better than chance.

The quote you provided does not say "substantially".

Do you think that skeptics like Randi, Shermer and Bob Carroll should change their minds about polygraphs?

Neither the lack of standardization, nor of progress, nor of accumulated knowledge is a sign that a claim is unfounded, only that it is poorly understood. This is something that the NAS makes totally clear in their report.

We are not talking about a new method here. Polygraphs are about 80 years old, but we don't see a heightened understanding of how they work. On the contrary, the more we learn about the psychology and physiology of lying, the less credible polygraphs become.

Despite this, they are very firm that under proper conditions, polygraphs are effective. From the executive summary,

They are very concerned about the use of polygraphs for security screening, but that's (as has been pointed out) entirely different.

It is a misrepresentation verging on a lie to claim that the evidence that a specific-incident polygraph test can discriminate lying from truth telling at rates substantially better than chance is weak.

It is absolutely neither misrepresentation nor a lie. You, OTOH, misrepresented the conclusion by deliberately leaving out the preceding sentence and the following sentence:

Notwithstanding the limitations of the quality of the empirical research and the limited ability to generalize to real-world settings, we conclude that in populations of examinees such as those represented in the polygraph research literature, untrained in countermeasures, specific-incident polygraph tests can discriminate lying from truth telling at rates well above chance, though well below perfection.

The problem is that you can't know in real life if your population has been trained in countermeasures.
 
The quote you provided does not say "substantially".

That is correct. It says "well."

Do you think that skeptics like Randi, Shermer and Bob Carroll should change their minds about polygraphs?

I do not have personal knowledge of their beliefs. To the extent that they believe that polygraphs cannot distinguish lying from truth telling, then yes, they should change their minds.




We are not talking about a new method here. Polygraphs are about 80 years old, but we don't see a heightened understanding of how they work.

Which is not relevant. We didn't understand the mechanism of genetics for 80 years after Darwin (with the Nobel-winning work of Watson and Crick), but lack of understanding of the mechanism did not prevent us from observing the reality.

For that matter, we knew that the orbit of Mercury did not match Newton's predictions for 80 years before Einstein solved the gravity problem.


It is absolutely neither misrepresentation nor a lie. You, OTOH, misrepresented the conclusion by deliberately leaving out the preceding sentence and the following sentence:

The only misrepresentation is on your part.

If your claim is that a technology must be "perfect" to not be woo -- well, then you've just put both penicillin and space flight into the woo category. If you claim that only experiments done under field conditions can demonstrate the viability of a theory, then you've put both biological evolution and electroweak unification into the woo category. If you insist that only a fully-developed technology is not woo --- well, I'm not sure what areas of technology cannot be improved. Perhaps the screwdriver.

Despite the fact that the technology works much better in the lab than in the field, the simple fact that the technology works at all and can be reproduced at will makes it "not woo."





The problem is that you can't know in real life if your population has been trained in countermeasures.
 
That is correct. It says "well."

Why did you overstate the effectiveness of the polygraph?

Which is not relevant. We didn't understand the mechanism of genetics for 80 years after Darwin (with the Nobel-winning work of Watson and Crick), but lack of understanding of the mechanism did not prevent us from observing the reality.

For that matter, we knew that the orbit of Mercury did not match Newton's predictions for 80 years before Einstein solved the gravity problem.

Nonsense. The basic premise of the polygraph - that it says "Lie" whenever you are stressed - is flawed.

The only misrepresentation is on your part.

And yet, you said "substantial".

Did you know that:

2003 National Academy of Sciences Report

The accuracy of the polygraph has been contested almost since the introduction of the device. In 2003, the National Academy of Sciences (NAS) issued a report entitled “The Polygraph and Lie Detection”. The NAS found that the majority of polygraph research was of low quality. After culling through the numerous studies of the accuracy of polygraph detection the NAS identified 57 that had “sufficient scientific rigor”. These studies concluded that a polygraph test regarding a specific incident can discern the truth at “a level greater than chance, yet short of perfection”. The report also concluded that this level of accuracy was probably overstated and the levels of accuracy shown in these studies "are almost certainly higher than actual polygraph accuracy of specific-incident testing in the field.”
Source

?

If your claim is that a technology must be "perfect" to not be woo

It isn't.

If you claim that only experiments done under field conditions can demonstrate the viability of a theory

I don't.

If you insist that only a fully-developed technology is not woo

I don't.

Despite the fact that the technology works much better in the lab than in the field, the simple fact that the technology works at all and can be reproduced at will makes it "not woo."

Can't agree there. The polygraph cannot be used to detect whether people lie or not. Simple as that.

I do not have personal knowledge of their beliefs. To the extent that they believe that polygraphs cannot distinguish lying from truth telling, then yes, they should change their minds.

Argue that at TAM. Write an article and send it to Skeptic Magazine or SkepticReport.
 
Write an article and send it to Skeptic Magazine or SkepticReport.

Not a chance. I don't trust the editorial integrity of the SkepticReport editors. I think they're too strongly biased to give fair reading to articles, and they don't have the necessary background to understand the statistics.
 
Not a chance. I don't trust the editorial integrity of the SkepticReport editors. I think they're too strongly biased to give fair reading to articles, and they don't have the necessary background to understand the statistics.

You don't write for the editor - singularis. You write for the audience.

I'll publish whatever you write as it is, completely unedited.

What about Skeptic Magazine, or TAM?

Why did you overstate the effectiveness of the polygraph?

Did you know that the report also:

concluded that this level of accuracy was probably overstated

?
 
Did you know that the report also:

concluded that this level of accuracy was probably overstated
?

And this is why I won't write for SkepticReport. The editors cherry-pick and misquote, out of context, nearly as badly as creationists do.

What, in context, is "this level of accuracy"?

Your prior quotation, from Wikipedia, suggests that:

After culling through the numerous studies of the accuracy of polygraph detection the NAS identified 57 that had “sufficient scientific rigor”. These studies concluded that a polygraph test regarding a specific incident can discern the truth at “a level greater than chance, yet short of perfection”. The report also concluded that this level of accuracy was probably overstated and the levels of accuracy shown in these studies "are almost certainly higher than actual polygraph accuracy of specific-incident testing in the field.”

This misrepresents the NAS report at several levels.

First, "the studies" did not conclude that polygraphs could discern the truth at "a level greater than chance, yet short of perfection." That conclusion, as well as the quotation, is NAS's. (Report, p. 4) Individual studies, of course, have their own individual estimates, which vary widely, as reported in the appendix.

In that same paragraph, they compare the use of polygraph for two completely different purposes --- the use as a specific-incident investigation tool and the use as a screening tool. The report stated (p. 4) that "because actual screening applications involve considerably more ambiguity for the examinee and in determining truth than arises in specific-incident studies, polygraph accuracy for screening purposes is almost certainly lower than can be achieved by specific-incident polygraph tests in the field."

That says nothing about the accuracy of the polygraph per se, merely that the accuracy to be expected in screening situations is lower than that in specific-incident situations. It's an apples to oranges comparison, with us having a priori reason to believe that oranges are harder. To make the fact that oranges are believed to be harder into an indictment of apples is misrepresentation.

So, in direct answer to your question, I did not know that the report claimed that better-than-chance accuracy for polygraph testing is probably overstated, because it simply did not make that claim.

The closest to this is probably the section on p. 128 :

Theory and basic research give no clear guidance about whether laboratory conditions underestimate or overestimate the accuracy that can be expected in realistic settings.

Available data are inadequate to test these hypotheses. [...]

Evidence from Medical Diagnostic Testing. Substantial experience with clinical diagnostic and screening tests suggests that laboratory models, as well as observational field studies of the type found in the polygraph literature, are likely to overstate true polygraph accuracy. Much information has been obtained by comparing observed accuracy when clinical medical tests are evaluated during development with subsequent accuracy when they become accepted and are widely applied in the field. An important lesson is that medical tests seldom perform as well in general field use as their performance in initial evaluations seems to promise (Ransohoff and Feinstein, 1978; Nierenberg and Feinstein, 1988; Reid, Lachs, and Feinstein, 1995; Fletcher, Fletcher, and Wagner, 1996; Lijmer et al., 1999).

The reasons for the falloff from laboratory and field research settings to performance in general field use are fairly well understood.

In other words, the problem is with the specific numeric claims of accuracy, not with the "better than chance" finding, and is to be expected in any transition from laboratory to field.

This is followed up (p. 129) by
In view of the above issues, we believe that the range of accuracy indexes (A) estimated from the scientifically acceptable laboratory and field studies, with a midrange between 0.81 and 0.91, most likely over-states true polygraph accuracy in field settings involving specific-incident investigations.

Thus, what is specifically overstated are the estimates (from 0.81 to 0.91), not the simple better-than-chance accuracy.

They specifically reiterate this point in their conclusion. After discussing (p. 148) a number of problems with measuring accuracy, including the overestimation problem, which they specifically mention (p. 149: "the accuracy index most likely overestimates performance in realistic field situations due to technical biases in field research designs, the increased variability created by the lack of control of test administration and interpretation in the field, the artificiality of laboratory settings, and possible publication bias."), they nevertheless continue (p. 149) with

Despite these caveats, the empirical data clearly indicate that for several populations of naïve examinees not trained in countermeasures, polygraph tests for event-specific investigation detect deception at rates well above those expected from random guessing. Test performance is far below perfection and highly variable across situations. The studies report accuracy levels comparable to various diagnostic tests used in medicine.

The study goes on to warn
We note, however, that the performance of medical diagnostic tests in widespread field applications generally degrades relative to their performance in validation studies, and this result can also be expected for polygraph testing
but this fact obviously has not reduced the usefulness of medical diagnostic tests "in widespread field applications" despite the marginally degraded accuracy.

It is therefore fair to say that the NAS is largely positive about the ability of polygraphs to distinguish lying from truth-telling under laboratory conditions, with naive subjects, in specific-incident investigations. Indeed, the NAS explicitly makes such a statement at least twice, on page 4 and page 149.

There is no statement in the entire body of the NAS report that suggests that, under such conditions, polygraph testing is no more accurate than chance, or even that polygraph testing is not significantly more accurate than chance. They do suggest that some of the specific claims of laboratory accuracy may not be reproducible under field conditions, but this is neither surprising nor does it make the technology invalid.

I stand by my writing. You, and Wikipedia, have misrepresented the content of the NAS report in a way to make it much less positive about the actual obtainable accuracy of polygraph testing. Given this history of editorial misrepresentation, there is no possible way that I would write for your journal.
 
Oh thank the flying spaghetti monsters some rational people have joined the discussion.

Lonewulf, 3point did not call you ignorant. But you are uninformed or uneducated about some of the principles involved in this discussion. As is Aepe. (I'll get to Claus in the next post).

---

First to Aepe (though I am repeating what has already been said, sometimes it helps to hear things in more than one way), we use many indirect measurements to tell us about something else. For example, I might measure my body temperature to tell me if I have an infection. I am not measuring infection, I am measuring body temperature.

I do need to validate that body temperature is a measure of infection. But once I do that, I may continue to measure body temperature and know that it is a reasonable indicator of infection.

So measuring emotional responses, once correlated with truthfulness, is a valid means of measuring truthfulness. The question becomes not whether you are measuring truthfulness, but whether the emotional reactions you are measuring are a good indicator of truthfulness. While there is great variability in the studies of whether the emotional responses you measure on a polygraph are an indicator of truthfulness, the results are better than chance.

---

Lonewulf, it matters greatly what you use a test for when deciding if that test is useful. I screen people for HIV. The HIV antibody test is not the most accurate test in the world. I'll get to what we do about that next, but I need to discuss something else first.

Now, take the treatment for HIV. Is it very risky? Yes. Is not treating HIV very risky? Yes. So can I afford false positives and is there a better alternative? That also has to be considered if I am deciding to use an inaccurate test.

If I can't afford false positives the test may not be useful to determine who to treat, but it may still be useful to determine who to do more testing on. So even though the HIV antibody test is fairly inaccurate, it is still very useful to screen people and do further, more expensive tests on those who test positive.

In the case of the inaccurate polygraph, sending someone who is innocent to jail, or firing them from their job and labeling them a spy, is not acceptable. And there are alternative means of accomplishing your goals. So the inaccurate polygraph is not useful in court nor in weeding out spies. But that doesn't mean the polygraph, because it is inaccurate, is totally useless.

Now suppose I wanted to use the inaccurate HIV test, not to determine who to treat and who not to treat, or who to test further, but to determine the rate of HIV in a group or the trend upward or downward of new infections? I can take the rate of false positives and do some calculations first.

In a low risk population, just due to statistical factors alone, more of the positives will be false. In a high risk population just due to statistical factors alone, more of the positives will represent true positives. Take this one step further to see the reason. In a population where everyone is infected with HIV, all of the positives will be true positives. And in a population where no one has HIV, all of the positives will be false positives.

With that background information, I can now test a population and analyze the results even with an inaccurate test.

Say I have a low-risk population and a test whose positives are right only 50% of the time. I test the group and get 2% positives. From that I can conclude that about 1% of the population is infected. I can draw a conclusion about the group which I cannot draw about the individual. An inaccurate test is still very useful, depending on what you are using it for.
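For what it's worth, here is a rough sketch in Python of how such group-level corrections are usually done. The first form is the simple scaling described in the post; the second is the standard sensitivity/specificity correction, with purely hypothetical figures:

```python
# A rough sketch of the group-level correction described above. Two variants:
# (a) the simple scaling in the post (multiply the observed positive rate by
#     the fraction of positives known to be true), and
# (b) the standard sensitivity/specificity correction.
# All figures here are illustrative, not taken from any particular study.
observed_positive_rate = 0.02

# (a) if, in this population, only about half of the positives are true positives:
fraction_of_positives_true = 0.50
prevalence_simple = observed_positive_rate * fraction_of_positives_true   # ~1%

# (b) if instead we know the test's sensitivity and specificity (hypothetical values):
sensitivity, specificity = 0.90, 0.985
prevalence_corrected = (observed_positive_rate + specificity - 1) / (
    sensitivity + specificity - 1
)

print(f"Simple PPV-based estimate:               {prevalence_simple:.1%}")
print(f"Sensitivity/specificity-based estimate:  {prevalence_corrected:.1%}")
```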

--

Now, with that said, I do agree that the polygraph is not as reliable as one would think. It did appear, however, that a lot of the inaccuracy depended on the skill of the person tested in thwarting the results. That would make it even less accurate in a court or for weeding out spies, since those are cases where one would expect more people to be purposefully deceptive.

We would need a lot more information about the circumstances of the urban mythical 100 atheist tests to interpret the polygraph results. And since there is no evidence the urban myth is true, it is unlikely we can take this discussion on the polygraph results any further in that thread.
 
It is therefore fair to say that the NAS is largely positive about the ability of polygraphs to distinguish lying from truth-telling under laboratory conditions, with naive subjects, in specific-incident investigations. Indeed, the NAS explicitly makes such a statement at least twice, on page 4 and page 149.

Goodness. I just found a third statement. On p. 178, the report reads (emphasis mine).

The available evidence indicates that in the context of specific-incident investigation and with inexperienced examinees untrained in countermeasures, polygraph tests as currently used have value in distinguishing truthful from deceptive individuals. However, they are far from perfect in that context, and important unanswered questions remain about polygraph accuracy in other important contexts. No alternative techniques are available that perform better, though some show promise for the long term. The limited evidence on screening polygraphs suggests that their accuracy in field use is likely to be somewhat lower than that of specific-incident polygraphs.


There is no suggestion that this "value" is in any way overstated, just the standard note that screening is harder than incident investigation.

In fact, they are very positive about polygraph use in a criminal-investigation setting (p. 184):

Suppose that in a criminal investigation the polygraph is used on suspects who, on other grounds, are estimated to have a 50 percent chance of being guilty. For a test with A = 0.80 and a sensitivity of 50 percent, the false positive index is 0.23 and the positive predictive value is 81 percent. That means that someone identified by this polygraph protocol as deceptive has an 81 percent chance of being so, instead of the 0.4 percent (1 in 250) chance of being so if the same test is used for screening a population with a base rate of 1 in 1,000.

Note that the A value proposed, 0.80, is in line with the estimates from the prior sections; they didn't just pull it out of thin air. But this contrasts sharply with the numbers from the screening problem: "a test that may look attractive for identifying deceptive individuals in a population with a base rate above 10 percent looks very much less attractive for screening a population with a very low base rate of deception. It will create a very large pool of suspect individuals, within which the probability of any specific individual being deceptive is less than 1 percent—and even so, it may not catch all the target individuals in the net. To put this another way, if the polygraph identifies 100 people as indicating deception, but only 1 of them is actually deceptive, the odds that any of these identified examinees is attempting to deceive are quite low, and it would take strong and compelling evidence for a decision maker to conclude on the basis of the test that this particular examinee is that 1 in 100." (p. 184).
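A small sketch reproducing the report's arithmetic for both base rates. The per-truthful-examinee false-alarm rate below is back-calculated from the report's stated false positive index of 0.23 at a 50% base rate; it is an approximation, not a figure quoted in the report:

```python
# A sketch of the arithmetic in the NAS passage quoted above. The report
# gives sensitivity = 50% and a false positive index of 0.23 at a 50% base
# rate; the per-truthful-examinee false-alarm rate below is back-calculated
# from those two figures and is an approximation, not a number quoted in
# the report.
sensitivity = 0.50
false_positive_index = 0.23                             # false positives per true positive, at a 50% base rate
false_alarm_rate = false_positive_index * sensitivity   # ~0.115 per truthful examinee

def positive_predictive_value(base_rate):
    """Chance that someone flagged as deceptive really is deceptive."""
    true_pos = base_rate * sensitivity
    false_pos = (1 - base_rate) * false_alarm_rate
    return true_pos / (true_pos + false_pos)

print(f"Criminal suspects (base rate 50%):  PPV = {positive_predictive_value(0.5):.0%}")    # ~81%
print(f"Screening (base rate 1 in 1,000):   PPV = {positive_predictive_value(0.001):.2%}")  # ~0.4%
```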



Similarly, the report notes on p. 201 that "Although polygraphs clearly have utility in some settings, courts have been unwilling to conclude that utility denotes validity. The value of the test for law enforcement and employee screening is an amalgam of utility and validity, and the two are not easily separated." (emphasis mine.) As the report discusses in some detail, legal admissibility and scientific validity are separate (although related) concepts, and you can't simply say that something is invalid because it is inadmissible.

Finally, on p. 214, they present the ultimate statement of their finding on accuracy (emphasis in original).

Notwithstanding the limitations of the quality of the empirical research and the limited ability to generalize to real-world settings, we conclude that in populations of examinees such as those represented in the polygraph research literature, untrained in countermeasures, specific-incident polygraph tests for event-specific investigations can discriminate lying from truth telling at rates well above chance, though well below perfection.


Given that, at this point, they have formally recognized something like eight times in the report that polygraphs work in lab conditions, I see no reason for you -- or for anyone -- to dispute the basic fact that they work (in lab conditions).

Which makes them decidedly not woo.
 
Hypothetically, say you asked 100 theists if they really believed in god, they all answered yes, and the results showed they were all lying. If the polygraph gives valid results 80% of the time (without anyone trying to fool it, the report said the results were valid close to 100% of the time, but say it was only 80%), then 80 theists would be lying. You could not tell which 80 were lying and which 20 were not, but you could conclude 80 were lying based on the average validity of the polygraph results.

I really don't understand why the two of you don't understand the scientific concept of sensitivity and specificity.

It's an old cynic's trick called raising the bar.

No one will be able to provide a good enough explanation or amount of evidence.
 
Drkitten has done a better job addressing Claus than I could hope to.

I asked you, Claus, to give your estimate of the accuracy of a polygraph. You can even describe the circumstances. Then we can get to the issue of how to interpret the mythical 100 atheist tests.

So with people trying to deceive the polygraph, the reliability appears abysmal.

With a skilled test administrator and a population not trying to fool the test, the results are better than chance and perhaps even as good as 80%. I am not going to attempt to offer an opinion as to how high that number is except to say the range is wide according to the extensive analysis by the GAO source I cited.
 
