But it does "work" within the parameters of the study.
Actually, no it doesn't. Look at those p values. Random chance is by far a better explanation.
And taking it up a level, from a design perspective: there were no controls to isolate the placebo arms from a non-treatment comparison, so we don't know if there was any placebo effect at all.
Specifically, there was no non-treatment control. The results could simply be the random fluctuation you would see with no treatment at all. We don't know. The experiment is not designed to detect a placebo effect.
Also from a design perspective: what is the run-to-run standard deviation across the 4-hour sessions? And why 4 hours?
This experimental protocol for this condition is so vulnerable to Simpson's Paradox that it is not advised for this type of inquiry.
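For anyone who hasn't run into Simpson's Paradox in this setting, here is a toy illustration (the numbers are entirely invented, not from the study): when per-patient session counts and baselines differ, the pooled comparison can point the opposite way from every within-patient comparison.

```python
# Toy Simpson's Paradox demo: within each patient, "treatment" sessions score
# WORSE than "control" sessions, yet the pooled averages say the opposite.
# All numbers are invented for illustration -- nothing here comes from the study.
patients = {
    # patient: (treatment session scores, control session scores)
    "A": ([7.0] * 8, [8.0] * 2),   # mild patient, mostly treatment sessions
    "B": ([2.0] * 2, [3.0] * 8),   # severe patient, mostly control sessions
}

def mean(xs):
    return sum(xs) / len(xs)

for name, (tx, ctrl) in patients.items():
    print(f"patient {name}: treatment {mean(tx):.1f} vs control {mean(ctrl):.1f}")

pooled_tx = [s for tx, _ in patients.values() for s in tx]
pooled_ctrl = [s for _, ctrl in patients.values() for s in ctrl]
print(f"pooled:    treatment {mean(pooled_tx):.1f} vs control {mean(pooled_ctrl):.1f}")
# Within each patient, control wins (8.0 > 7.0 and 3.0 > 2.0); pooled, "treatment"
# wins (6.0 > 4.0) purely because of how sessions are distributed across patients.
```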
Like you, I am always suspicious of studies with very small sample sizes. I presented this as interesting rather than earth-shattering.
It's not just the small sample sizes - it's exacerbated by the high minute-to-minute variability of the effects being measured. The patients are effectively a random noise generator.
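To make that concrete, here is a quick null simulation (every parameter is an assumption for illustration, not taken from the paper): two identical arms, no real effect, a handful of patients, and an outcome that swings widely from minute to minute. Sizable gaps between the arms show up routinely from noise alone.

```python
# Null simulation sketch: two identical arms, no true effect, tiny n, noisy outcome.
# Every number below is a made-up assumption -- none of it comes from the study.
import numpy as np

rng = np.random.default_rng(0)

N_PER_ARM = 6        # assumed tiny arm size
MINUTES = 240        # one "4-hour session"
BETWEEN_SD = 1.5     # assumed patient-to-patient spread on a 0-10 symptom scale
WITHIN_SD = 2.5      # assumed minute-to-minute swing within a patient
N_TRIALS = 10_000

def arm_mean():
    """Mean outcome of one arm: each patient = noisy minute readings around a baseline."""
    baselines = rng.normal(5.0, BETWEEN_SD, N_PER_ARM)
    session_means = [rng.normal(b, WITHIN_SD, MINUTES).mean() for b in baselines]
    return float(np.mean(session_means))

big_gaps = sum(abs(arm_mean() - arm_mean()) >= 1.0 for _ in range(N_TRIALS))
print(f"identical arms differ by >= 1.0 point in {big_gaps / N_TRIALS:.0%} of trials")
# With these assumed numbers, a full-point "effect" appears in roughly a quarter
# of experiments in which there is, by construction, nothing to find.
```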
Every experiment is interesting, but some are interesting only as examples of how pretty much anything can get published these days; over the last 20 years, "follow the data" has become more of a recipe for "get misinformed by publication bias." The design does not lead to the conclusions - they do not follow from the experiment; they are an unsupported claim by the authors. We see that all the time.
The interesting question is 'how did this pass peer review in Neurology?'
I anticipate we'll see an analysis on NeuroLogica shortly.
Regarding the 4-hour runs... I suspect the effect disappears over longer timeframes, once the random fluctuations average out and patients' presentation reverts to the mean.
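A rough sketch of that intuition (again, all numbers invented, and there is no treatment anywhere in this simulation): if patients are measured when they happen to look unusually bad, a later measurement drifts back toward their true level with no intervention at all, and the longer or later the window, the less that initial fluctuation matters.

```python
# Regression-to-the-mean sketch: no treatment anywhere, made-up numbers throughout.
import numpy as np

rng = np.random.default_rng(1)

N_PATIENTS = 1_000
TRUE_MEAN = 5.0        # each patient's long-run symptom level on a 0-10 scale
FLUCT_SD = 2.0         # assumed short-window fluctuation around that level

# "Enroll" only patients who happened to look bad during a short screening window.
screening = TRUE_MEAN + rng.normal(0.0, FLUCT_SD, N_PATIENTS)
enrolled = screening >= 7.0

# Re-measure the same patients later, with no intervention at all.
follow_up = TRUE_MEAN + rng.normal(0.0, FLUCT_SD, N_PATIENTS)

print(f"screening mean (enrolled): {screening[enrolled].mean():.2f}")
print(f"follow-up mean (enrolled): {follow_up[enrolled].mean():.2f}")
# The enrolled group's follow-up mean drops back toward 5.0 on its own --
# an apparent "improvement" produced by selection plus noise, not by any effect.
```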