Which replication are you referring to? In the Bem/Palmer/Broughton paper the Cornell students rated eight studies at 7 (maximum standardness) and of these, half were well above chance (the lowest of the four being 37.6%). And of the nine studies with the second-highest rating, 6.67, all but two were well above chance (the lowest of the seven being 36.0%). So this would indicate that the strictest replications get the most highly significant results. Disagree?
Ersby said:
If we are to draw a line in the sand and say that the autoganzfeld work of PRL is the new year zero, then the data looks good, I admit. But not great. The strictest replication of the PRL work came out with chance results.
Yes! Now researchers need to do studies with variant protocols, in an attempt to figure out the conditions under which ganzfeld works and under which it does not. This is supposed to be science, man, not just some kind of "rah, rah, psi" lovefest.
Amherst said:
I'll go back to my original question, what more do you need? Even more studies?
Loki said:
I still don't understand why there is so little focus on individuals in the Ganzfeld research. Do some people *always* do better than chance, or don't they? Do some people *always* do better than other receivers, or don't they? The silence on this issue seems to suggest that such 'exceptional/consistent' individuals do not exist. So what sort of effect are we studying if it isn't 'person' based? Why does the Ganzfeld effect exist only in a statistical sense?
I think the parapsychology community has an unwritten agreement that they won't focus on individuals. Did a lot of that in the '60s and '70s and just got burned (think Geller).
If you had read the original Psychological Bulletin paper you'd have known that there have been some highly successful, and very intriguing, ganzfeld studies of people who research indicates are more likely to be "psychic" than others:
Aussie Thinker said:
Loki,
Loki, you make a great point... again...
No one ever addresses this because…
1. NO individuals show consistent “high” hits
2. Because they don't, it implies there is just a problem with the process.
We all know that if these tests TRULY showed some psi effect, it should manifest itself consistently in particular individuals.
Paul,
Which adds to the "don't focus on THAT.. it clearly shows there is NO effect" myopia of the parapsychology community!
amherst said:
Which replication are you referring to? In the Bem/Palmer/Broughton paper the Cornell students rated eight studies at 7 (maximum standardness) and of these, half were well above chance (the lowest of the four being 37.6%). And of the nine studies with the second-highest rating, 6.67, all but two were well above chance (the lowest of the seven being 36.0%). So this would indicate that the strictest replications get the most highly significant results. Disagree?
amherst
I don't understand why you have a problem with the meta-analysis having a standardness criterion. Like the authors write in the paper, in order to understand psi, experimenters must be willing to risk replication failures by changing the procedure. And so if you want to do a meta-analysis to see if the original experiments (done with the express purpose of demonstrating psi) have been replicated, you need to be sure that the experiments you are grouping together adhere to the standards of that original work.
Ersby said:
The problem I have with the Bem m-a is that it took an existing set of data, added more data, applied a new criterion (new inasmuch as it hadn't been used in previous meta-analyses), and then tried again. This seems to me like having a second roll of the dice.
As for the scores of the strictest replications being highest, look at the figures again. The group of experiments with the best scores are the ones rated from 4 to 5. Together they get a hit rate of 41.8%, so no, I can't necessarily agree that the strictest adherence to "standardness" gives the best results.
I'll say it again, I think a replication is where someone does the same experiment and gets the same results. A meta-analysis is not a replication. It is used to look for patterns in a large database. It is from a meta-analysis that one proceeds to make replications, not the other way round.
amherst said:
Unless you're seeing something I'm not, the question as to whether standard replications achieved better results than the non-standard experiments really isn't disputable:
"This same outcome can be observed by defining as standard the 29 studies whose ratings fell above the midpoint of the scale (4) and defining as non-standard the 9 studies that fell below the midpoint (2 studies fell at the midpoint): The standard studies obtain an overall hit rate of 31.2%, ES = .096, Stouffer Z = 3.49, p = .0002, one-tailed. In contrast, the non-standard studies obtain an overall hit rate of only 24.0%, ES = -.10, Stouffer Z = -1.30, ns. The difference between the standard and non-standard studies is itself significant, U = 190.5, p = .020, one-tailed. Most importantly, the mean effect size of the standard studies falls within the 95% confidence intervals of both the 39 pre-autoganzfeld studies and the 10 autoganzfeld studies summarized by Bem and Honorton (1994). In other words, ganzfeld studies that adhere to the standard ganzfeld protocol continue to replicate with effect sizes comparable to those of previous studies."
Of course a meta-analysis isn't a replication. In this case it is a grouping of replications. And this grouping reveals that most experiments which adhere to the PRL procedure will get results which are significant.
amherst
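For readers who want to see what a pooled figure like "Stouffer Z = 3.49" actually involves, here is a minimal sketch: each study's hit count is turned into a z-score against the 25% four-choice chance rate, and the z-scores are then combined. The (hits, trials) pairs below are invented placeholders, not the studies analysed by Bem, Palmer and Broughton.

```python
# Minimal sketch of pooling ganzfeld studies into an overall hit rate and a
# Stouffer Z. Per-study z-scores use a normal approximation to the binomial
# at the 25% chance rate. The data are hypothetical placeholders.
from math import sqrt
from scipy.stats import norm

CHANCE = 0.25  # one correct target among four candidates

studies = [(12, 40), (9, 30), (15, 50), (7, 32)]  # hypothetical (hits, trials)

z_scores = []
for hits, trials in studies:
    expected = CHANCE * trials
    sd = sqrt(trials * CHANCE * (1 - CHANCE))
    z_scores.append((hits - expected) / sd)

stouffer_z = sum(z_scores) / sqrt(len(z_scores))
p_one_tailed = norm.sf(stouffer_z)  # upper-tail probability

total_hits = sum(h for h, _ in studies)
total_trials = sum(n for _, n in studies)
print(f"pooled hit rate = {total_hits / total_trials:.1%}")
print(f"Stouffer Z = {stouffer_z:.2f}, one-tailed p = {p_one_tailed:.4f}")
```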
The "standardness" ratings of the three raters achieved a Cronbach's alpha of .78. The mean of the three sets of ratings on the 7-point scale was 5.33, where higher ratings correspond to greater adherence to the standard ganzfeld protocol. As hypothesized, the degree to which a replication adheres to the standard ganzfeld protocol is positively and significantly correlated with ES, rs(38) = .31, p = .024, one-tailed.
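To make the rs(38) = .31 figure concrete, here is an illustrative sketch of correlating each study's mean standardness rating with its effect size, using a Spearman rank correlation as a stand-in for whatever rs denotes in the paper. The ratings and effect sizes below are invented for demonstration, not the 40 studies in the database.

```python
# Rank-correlate "standardness" ratings (1-7 scale) with per-study effect sizes.
# All numbers are hypothetical placeholders.
from scipy.stats import spearmanr

standardness = [6.7, 5.3, 4.0, 7.0, 2.3, 6.0, 3.7, 5.0]            # hypothetical mean ratings
effect_size  = [0.25, 0.10, -0.05, 0.30, -0.12, 0.18, 0.02, 0.08]  # hypothetical ES values

rho, p_two_tailed = spearmanr(standardness, effect_size)
# Halving gives a one-tailed p, valid only if rho falls in the predicted (positive) direction.
print(f"rs = {rho:.2f}, one-tailed p = {p_two_tailed / 2:.3f}")
```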
There were two studies which were rated at 4.00. Since 4.00 is the midpoint, these two studies (one with a hit rate of 15.6%, the other 45.1%) were not considered standard or non-standard and therefore had no effect on either group's combined hit rate. Studies rated over 4.00 are standard, so what's the problem?
Ersby said:
So you have no comment about the fact that studies that scored 4-5 get the best results?
Below 4 is the point where standard becomes non-standard, and above 4 is the point where non-standard becomes standard. This is because an equal number of places (1, 2, 3) are non-standard as are standard (5, 6, 7). There is nothing arbitrary about this. 5.33 is the average rating the three raters gave to the 40 experiments. I have no clue as to why you think it should be used as the midpoint for rating standardness.

Take another look at the paper you quoted, in particular the paragraph just before your quote.
This makes no sense. Doing the sums myself, if you use 5.33 as the point where standard becomes non-standard, then the hit rate is larger in the non-standard half (31.1% compared to 30.4%). In fact, as you demonstrate, the paper goes on to talk in more detail about the difference made by using 4 as the point where standard becomes non-standard. They even talk about the difference in hit rates, which they don't do for their calculations with 5.33 as the mean.
Which brings me back to that minor point I raised earlier. Why choose 4 as the cut-off? "Standardness" has no numerical value in itself. And why choose 5.33, for that matter? Neither seems better than the other, except that they give quite different results. It all goes to demonstrate that standardness is just a construct that has no intrinsic meaning other than an attempt to reorder the data to get the hit rate back up towards thirty-something percent.
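For what the "doing the sums myself" exercise looks like in practice, here is a small sketch that pools the hit rates on either side of a chosen standardness cut-off and shows how the choice of cut-off (the scale midpoint 4 versus the sample mean 5.33) can move the comparison. The (rating, hits, trials) triples are made up, not the 40 studies from the paper.

```python
# Pool hit rates above and below a standardness cut-off; compare two cut-offs.
# The study data are hypothetical placeholders.
def pooled_hit_rate(group):
    hits = sum(s[1] for s in group)
    trials = sum(s[2] for s in group)
    return hits / trials if trials else float("nan")

studies = [  # (standardness rating, hits, trials) -- invented values
    (7.0, 18, 40), (6.7, 14, 40), (6.0, 11, 40), (5.3, 9, 40),
    (5.0, 13, 40), (4.3, 16, 40), (3.7, 8, 40), (2.3, 10, 40),
]

for cutoff in (4.0, 5.33):
    standard = [s for s in studies if s[0] > cutoff]
    non_standard = [s for s in studies if s[0] < cutoff]
    print(f"cut-off {cutoff}: standard {pooled_hit_rate(standard):.1%} "
          f"vs non-standard {pooled_hit_rate(non_standard):.1%}")
```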
First off, I don't know why you think the experiments Bierman mentions in his paper negate the entire original ganzfeld database. Bierman says nothing of the sort. He actually writes that "As argued in the results-section the direct scoring rates however do not invalidate previous meta-analysis."

As for adding more studies to the m-a, I've no problem with that. It's the "standardness" criterion which bothers me. It doesn't exist in the first Honorton m-a, so the two can't be compared; but then again, now that we've demonstrated that additional studies reduce the effect size of this dataset to zero, I guess that's by the by.
The laboratory was forced to close (due to lack of funding) even before all of the original sessions scheduled to be conducted with the students had been completed.
Loki said:
amherst,
I'm aware of the "Juilliard Sample" - and it's precisely what I'm referring to. Notice the figures given in your quote:
"10 male and 10 female undergraduates...these students achieved a hit rate of 50% (p = .014)"
"8 were music students, .... The musicians were particularly successful: 6 of the 8 (75%) successfully identified their targets"
Perhaps the most significant term in the quote you provided was (and it's what I'm referring to):
"Each served as the receiver in a single session ..."
Hit rates FAR beyond chance, and yet these people were not tested again? Either as a group, or individually? Why?
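As a side note on the arithmetic, the quoted p = .014 is consistent with an exact binomial tail on the reading that the 20 students each contributed one four-choice session, so the 50% hit rate means 10 hits in 20 trials. A quick sketch, under that assumption:

```python
# Exact binomial tail: probability of at least 10 hits in 20 four-choice
# sessions at the 25% chance rate (assumes 50% hit rate = 10 hits of 20).
from scipy.stats import binom

p_one_tailed = binom.sf(9, 20, 0.25)  # P(X >= 10) with n = 20, p = 0.25
print(f"P(at least 10 hits in 20 trials at chance) = {p_one_tailed:.3f}")  # ~0.014
```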
amherst said:
Below 4 is the point where standard becomes non-standard, and above 4 is the point where non-standard becomes standard. This is because an equal number of places (1, 2, 3) are non-standard as are standard (5, 6, 7). There is nothing arbitrary about this. 5.33 is the average rating the three raters gave to the 40 experiments. I have no clue as to why you think it should be used as the midpoint for rating standardness.
amherst
amherst said:
All the government ESP work was done with gifted subjects. A man by the name of Patrick Price was probably the most talented. Here's a link to one of the targets Price was asked to view, and the drawing he made when given only the geographical coordinates:
http://www.lfr.org/csl/practical/ops_3.html
amherst
amherst said:
There were two studies which were rated at 4.00. Since 4.00 is the midpoint, these two studies (one with a hit rate of 15.6%, the other 45.1%) were not considered standard or non-standard and therefore had no effect on either group's combined hit rate. Studies rated over 4.00 are standard, so what's the problem?
amherst
amherst said:
First off, I don't know why you think the experiments Bierman mentions in his paper negate the entire original ganzfeld database. Bierman says nothing of the sort. He actually writes that "As argued in the results-section the direct scoring rates however do not invalidate previous meta-analysis."
amherst