
Streaming statistics

Why would I want to read anything that you wrote, you loathsome troll?

Loathsome troll? Please be more polite.
Replying to this modbox in thread will be off topic  Posted By: tim
 
What other statistics can we make streaming, you ask?

Well? How about the statistic of having to go to the bathroom, causing p-streaming to the power of number 1? :)
 
If it's all well-known stuff, why are we bothering to rehash it here?
 
I actually blogged about something very similar to this a while back. I can't quite figure out if we're talking about the same thing...

Yes, that's exactly the same, and I also do the same with the standard deviation.
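For readers who have not seen it, here is a minimal sketch of a streaming mean and standard deviation. The original blog post is not quoted in this thread, so this is not necessarily the method either poster used; it is Welford's update, a common choice, and the class name `RunningStats` is just a label picked for illustration.

```python
import math

class RunningStats:
    """Streaming mean and standard deviation (Welford's update)."""

    def __init__(self):
        self.n = 0        # number of samples seen so far
        self.mean = 0.0   # running mean
        self.m2 = 0.0     # running sum of squared deviations from the mean

    def update(self, x):
        # Fold one new sample into the state; no earlier samples are kept.
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def stddev(self):
        # Sample standard deviation; undefined for fewer than two samples.
        if self.n < 2:
            return float("nan")
        return math.sqrt(self.m2 / (self.n - 1))

stats = RunningStats()
for x in [2, 4, 4, 4, 5, 5, 7, 9]:
    stats.update(x)
print(stats.mean, stats.stddev())
```

The state is three numbers regardless of how long the stream is, which is the whole point.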

I'm wondering if such things can be done for many other statistics.
 
Why would I want to read anything that you wrote, you loathsome troll?

Loathsome troll? Please be more polite.
Replying to this modbox in thread will be off topic  Posted By: tim

That was polite.


Edited by Darat: 
Removed mod code - do not use the mod code again, that is for mod team use only.
 
My question at the bottom of the page: does anyone know?

Seems fruitful, especially for Internet, genome, and satellite data.

Specifically:
Anything that can be written as a function f(x)g(n), where f and g are polynomials
(and n is your number of samples).

In general:
Anything that can be considered as a Markov chain across your run, e.g. the maximum value of an RNG over n iterations.

Not the median (or indeed any non-trivial percentile). Not the mode.

Use WinBUGS. Some stats can be stored in summary mode (which uses what you term 'streaming'), some only in sample mode (where you have to store the whole lot). It's well known, and kinda useful, but hardly groundbreaking stuff...
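A rough sketch of both cases, under one reading of the f(x)g(n) remark: keep running sums of powers of x, then combine them with n at the end. The class names `PowerSums` and `RunningMax` are made up for illustration, and this has nothing to do with how WinBUGS implements its summary mode.

```python
import random

class PowerSums:
    """Keeps n, sum(x) and sum(x^2); mean and variance are built from these."""

    def __init__(self):
        self.n = 0
        self.s1 = 0.0   # running sum of x
        self.s2 = 0.0   # running sum of x squared

    def update(self, x):
        self.n += 1
        self.s1 += x
        self.s2 += x * x

    def mean(self):
        return self.s1 / self.n

    def variance(self):
        # Population variance: E[x^2] - (E[x])^2.
        m = self.mean()
        return self.s2 / self.n - m * m

class RunningMax:
    """The next state depends only on the current state and the new sample."""

    def __init__(self):
        self.value = None

    def update(self, x):
        if self.value is None or x > self.value:
            self.value = x

sums = PowerSums()
for x in [2, 4, 4, 4, 5, 5, 7, 9]:
    sums.update(x)

rng = random.Random(0)                     # seeded so the run is repeatable
draws = [rng.random() for _ in range(1000)]
running = RunningMax()
for x in draws:
    running.update(x)
```

The list `draws` is kept here only so the result can be checked against `max(draws)`; the running maximum itself never needs it. Note that the E[x^2] - (E[x])^2 form can lose precision badly when the mean is large relative to the spread.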
 
Somebody who has more time than I do should put some effort into discussing the arithmetic accuracy that this kind of old thing needs to work decently in the real world.
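To give a flavour of the accuracy problem, here is a small sketch comparing a plain floating-point running sum with a compensated (Kahan) one. This is one standard remedy, not something proposed in the thread, and `CompensatedMean` is a hypothetical name.

```python
class CompensatedMean:
    """Running mean using Kahan compensated summation."""

    def __init__(self):
        self.n = 0
        self.total = 0.0
        self.comp = 0.0   # low-order bits lost by earlier additions

    def update(self, x):
        self.n += 1
        y = x - self.comp               # re-apply the error carried so far
        t = self.total + y              # low-order bits of y may be lost here
        self.comp = (t - self.total) - y  # recover what was lost
        self.total = t

    def mean(self):
        return self.total / self.n

values = [1e16, 1.0, 1.0]

naive_total = 0.0
for x in values:
    naive_total += x    # each 1.0 is too small to register next to 1e16

km = CompensatedMean()
for x in values:
    km.update(x)

print(naive_total)   # 1e+16
print(km.total)      # 1.0000000000000002e+16
```

With IEEE 754 doubles the plain sum drops both of the small values, while the compensated sum keeps them. Over a long stream the same effect shows up as slow drift in the mean.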
 
Specifically:
Anything that can be written as a function f(x)g(n), where f and g are polynomials
(and n is your number of samples).

In general:
Anything that can be considered as a Markov chain across your run, e.g. the maximum value of an RNG over n iterations.

Not the median (or indeed any non-trivial percentile). Not the mode.

Use WinBUGS. Some stats can be stored in summary mode (which uses what you term 'streaming'), some only in sample mode (where you have to store the whole lot). It's well known, and kinda useful, but hardly groundbreaking stuff...

Thanks Cluster, I'll check it out.
 
Somebody who has more time than I do should put some effort into discussing the arithmetic accuracy that this kind of old thing needs to work decently in the real world.


Well, when extreme accuracy is needed, you can always take my favorite approach: implement a Fraction class (with all the necessary arithmetic operators) that represents each value as a numerator and a denominator (reducing them as necessary), until you ask for a decimal result. Your results will then be EXACTLY accurate, to any arbitrary number of decimal places.

This trick doesn't work for every situation -- if there are roots involved, then you need to convert them to decimal (or create a Root class that does the same sort of thing :). But it should work for the averaging scenario...
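As a sketch of the idea for the averaging case: Python already ships a rational type, `fractions.Fraction`, so there is no need to write the class by hand there. The poster's own Fraction class is not shown, so this only illustrates the approach; `ExactRunningMean` is a hypothetical name.

```python
from fractions import Fraction

class ExactRunningMean:
    """Running mean kept as an exact numerator/denominator pair."""

    def __init__(self):
        self.n = 0
        self.total = Fraction(0)

    def update(self, x):
        # Pass strings or ints: Fraction("0.1") is exactly 1/10, whereas
        # Fraction(0.1) would capture the float's binary rounding error.
        self.n += 1
        self.total += Fraction(x)

    def mean(self):
        return self.total / self.n   # still an exact Fraction

exact = ExactRunningMean()
for x in ["0.1", "0.2", "0.3"]:
    exact.update(x)

print(exact.mean())          # 1/5
print(float(exact.mean()))   # convert to decimal only at the very end
```

The trade-off is that numerators and denominators can grow without bound on a long stream, so the state is no longer fixed-size.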
 
