The problem in this case is that you know there is at least one liar in the group. You don't know that with the group in Yaffle's case: you could have a group of totally innocent people.
...
You might want to reconsider that one. If its accuracy cannot be reliably estimated, how do you know it is above chance?
These are technical questions, and you will have to pay attention to the answers if you are to understand them. They will actually give you some good ammunition to use against proponents of the polygraph.
In your scenario you would have a group of people with a false-positive rate too far above the general-population rate to be plausibly due to chance. Unless you had some a priori reason to expect this, you would assume that the general-population false-positive rate should apply (having ruled out equipment malfunction, etc.). Of course, you could not be certain, and other evidence could show this assumption to be wrong.
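To make that "too far above to be due to chance" judgment concrete, here is a minimal binomial-tail sketch in Python. The numbers (a 10% population false-positive rate, 20 innocent subjects, 8 flagged) are invented for illustration, not real polygraph figures:

```python
from math import comb

def binom_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p): the chance of seeing k or more
    false positives among n truthful subjects if the population rate is p."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

p_pop, n, k = 0.10, 20, 8   # assumed population rate, group size, flags seen
print(f"P(>= {k} false positives by chance) = {binom_tail(n, k, p_pop):.5f}")
# Prints roughly 0.0004: far too unlikely to be chance alone, so either the
# group differs from the calibration population or not everyone is innocent.
```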
Bear in mind that, again, this problem is completely general to screening tests. In order to 'calibrate' the test, we collect data from what we assume to be a representative sample of the population that will undergo testing (e.g. all males over 60, for a tumour marker). But the actual test population in specific situations will certainly differ, in ways that might be important. For example, the tumour marker test could give very poor results for oriental males, because of different normal marker levels in this population compared with the calibration population. Unless there is the same proportion of oriental males in the test population as in the original calibration, the performance figures are not applicable (and could be completely wrong). And the test would be useless in predominantly oriental populations.
You might say that we should have used different population data for orientals. But how can we know in advance what groupings are significant? Also, if your subdivisions are too fine you can't get enough data in each group for the calibration. This kind of problem occurs all the time in medical screening, and is an important reason why accuracy figures are often much better in the initial studies than in the real-world applications.
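A quick sketch of how badly the headline figures can transfer. All the sensitivity, specificity, and prevalence numbers below are invented purely for illustration; the point is only that the same test yields very different positive predictive values in subgroups the calibration didn't represent:

```python
def ppv(sensitivity, specificity, prevalence):
    """Positive predictive value via Bayes' theorem:
    P(disease | positive test)."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

prevalence = 0.05  # assumed 5% disease rate in both groups

# Hypothetical performance: good in the calibration-like group, poor in a
# subgroup whose normal marker levels differ.
for name, sens, spec in [("calibration-like group", 0.90, 0.90),
                         ("different subgroup",     0.60, 0.70)]:
    print(f"{name}: PPV = {ppv(sens, spec, prevalence):.2f}")
# calibration-like group: PPV = 0.32
# different subgroup:     PPV = 0.10  (the quoted figure does not transfer)
```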
I would say that, with polygraphy, there is every reason to expect large differences in population data between subgroups. In general we can't define these groupings a priori. We might expect biological differences according to ethnicity, but what should the grouping be? Social background, intelligence, criminality, various psychological factors: all of these may be highly relevant, but again we have little idea how to group them in terms of test parameters. Unless we have some confidence that the calibration group is representative of the test group (which we certainly don't), we have no idea of the test's real-world accuracy, which could be very low. This is an extremely strong argument against its use.
Another objection is that the questioning strategies are themselves a kind of 'population'. The calibration data applies to a particular population of question types, which will not be representative of the question types in actual use.
It is quite possible to know that a test has some discriminatory power but be unable to estimate its accuracy. If we can show that polygraphy works better than chance with at least some subgroups (of both people and questions), which we can, then even if it doesn't work at all in others, the overall population accuracy is still better than chance. But in order to calculate this overall accuracy figure, we would have to know the population frequency of the subgroups (which we don't, as we haven't identified the groups), and the accuracy for each group. Any quoted figure for overall accuracy (or for a specific subgroup, unless the calibration data was from that subgroup) is baloney.
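The arithmetic here is just a weighted average, but the weights are exactly what we don't have. A sketch with hypothetical subgroup frequencies and per-group accuracies shows how much the "overall" figure depends on them:

```python
# (frequency in the test population, accuracy within that subgroup);
# every number here is hypothetical.
subgroups = [(0.30, 0.80),   # works well here
             (0.50, 0.55),   # barely better than chance
             (0.20, 0.50)]   # pure chance

overall = sum(freq * acc for freq, acc in subgroups)
print(f"overall accuracy: {overall:.3f}")   # 0.615

# Same per-group accuracies, different population mix, different answer:
other_mix = [(0.10, 0.80), (0.30, 0.55), (0.60, 0.50)]
print(f"with another mix: {sum(f * a for f, a in other_mix):.3f}")   # 0.545
```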
I repeat that all this is general for screening tests, but whether it invalidates the test depends on the possibility of identifying significant groups, the magnitude of population data differences between these groups, and the ease of collecting group-specific data. All these factors are likely to be very unfavourable to polygraphy.
These kinds of detail are difficult to argue and explain, aren't they (especially to people who aren't as logical or numerate as we'd like)? Much easier to fall back on labelling the whole field a 'pseudoscience'. But wrong.
But these people in particular would definitely be trained to beat the polygraph.
Screening is done to find those who are untrustworthy. If the polygraph is widely (or even generally) used, then it would be in everyone's interest to learn how to beat it.
Of course, there will always be some groups who won't or can't learn it, but that only means you are punishing the less intelligent and rewarding the more intelligent.
That is already the case today where the polygraph is used: Aldrich Ames knew how to beat it, with disastrous results.
That's exactly why I said it might be useful in screening lines of investigation but not people (though please note that I am not advocating its use in any circumstances). For instance, if you are going to investigate everyone on the terrorist's contact list, it would make sense to start with the ones that the polygraph tells you are more likely to be accomplices, even if it's only 20% more likely. Or if you only have the resources to check a few locations for the body, you start with the one that gave the most positive polygraph response.
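To show what "only 20% more likely" buys you when ordering leads, here is a small sketch. The contact-list size, the number flagged, the 5% base rate, and the investigation budget are all hypothetical:

```python
contacts = 100                 # size of the contact list
flagged = 30                   # hypothetical number the polygraph flags
budget = 30                    # we can only investigate 30 people

base_rate = 0.05               # assumed prior probability per contact
p_flagged = base_rate * 1.20   # "20% more likely" if flagged

# Rate for unflagged contacts, chosen so the overall rate stays at 5%:
p_clear = (contacts * base_rate - flagged * p_flagged) / (contacts - flagged)
print(f"unflagged contacts drop to about {p_clear:.3f} each")

# Expected accomplices found: investigate the flagged contacts first,
# versus picking the same number of contacts at random.
print(f"prioritised by polygraph: {budget * p_flagged:.2f}")   # 1.80
print(f"picked at random:         {budget * base_rate:.2f}")   # 1.50
# A modest but free improvement in where to start looking; the point is
# ordering the work, not treating a 'positive' as evidence of guilt.
```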