What is a random variable...

mijopaalmc

and how is it "random"? Or does the phrase "random variable" mean something different from a simple combination of "random" and "variable" (i.e., a variable that is random and can take on random values)?

Note: I do have some knowledge of probability theory beyond what is taught in primary and secondary school (e.g., I know what a "probability space" and a "measurable function" are), but I am having trouble explaining what a "random variable" is without falling back on the fancy-schmancy language of probability theory to capture the concepts accurately and in detail.

Can anyone help me come up with non-jargon-ridden answers to the questions above?
 
The Wikipedia article is fairly clear in its description.

A discrete random variable is a variable that takes on specific values according to an associated probability distribution.
For example, the random variable X for tossing a fair coin takes the value X = 1 (heads) or X = 0 (tails), and the associated probability distribution is p(x) = 0.5 if x = 1, 0.5 if x = 0, and 0 otherwise.
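
As a rough illustration of the coin example, here is a minimal Python sketch. The names `X`, `p`, and `rng`, and the seed, are made up for this illustration and are not from the post; it assumes a fair coin.

```python
import random
from fractions import Fraction

def X(outcome):
    """The random variable: maps a coin outcome to a number."""
    return 1 if outcome == "heads" else 0

def p(x):
    """Probability distribution of X for a fair coin (assumed)."""
    return Fraction(1, 2) if x in (0, 1) else Fraction(0)

# Simulate a few tosses. The seed is arbitrary and only makes reruns repeatable.
rng = random.Random(0)
tosses = [rng.choice(["heads", "tails"]) for _ in range(10)]
values = [X(t) for t in tosses]

print(values)            # a list of ten 0s and 1s
print(p(1), p(0), p(2))  # probabilities of the values 1, 0, and 2
```

The point of the sketch is that the "randomness" lives in the toss, while `X` itself is an ordinary function that turns each outcome into a number.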

A continuous random variable is a variable that can take any value in a continuous range; a probability density function describes how likely it is to fall within any given interval.
 
Simply speaking, a random variable is a variable that can take on a random value - no great surprise. The values are, of course, controlled by the statistical parameters of the collection they are drawn from - a Gaussian distribution with mean zero and standard deviation of 1 will have values clustered within a short distance of 0 usually, with the occasional value farther away from zero.
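
The clustering described here can be checked with a small simulation. This is only a sketch: the sample size and seed are arbitrary choices, and it uses Python's standard `random.gauss` to draw from a Gaussian with mean 0 and standard deviation 1.

```python
import random

rng = random.Random(42)  # arbitrary seed, only for repeatability
samples = [rng.gauss(0.0, 1.0) for _ in range(100_000)]

# Fraction of samples within 1 and within 3 standard deviations of zero.
within_1 = sum(abs(s) <= 1 for s in samples) / len(samples)
within_3 = sum(abs(s) <= 3 for s in samples) / len(samples)

print(f"within 1 of zero: {within_1:.3f}")
print(f"within 3 of zero: {within_3:.3f}")
```

For a standard Gaussian, theory gives roughly 68% of values within 1 of zero and about 99.7% within 3, so the simulated fractions should land close to those figures.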
 
I do have some knowledge of probability theory beyond what is taught in primary and secondary school (e.g., I know what a "probability space" and a "measurable function" are), but I am having trouble explaining what a "random variable" is without falling back on the fancy-schmancy language of probability theory to capture the concepts accurately and in detail.

More context would be useful: To whom are you trying to explain the concept, and how much do they know about probability?

But, basically, a random variable is the sort of thing about which one can meaningfully ask, for any number x, what the probability is that it is less than x. That is, if X is a random variable, the statement "X < 5", for example, is not something definitely true or definitely false, but rather is something that has some probability of being true. (That probability might be 1 or 0, so I don't mean to exclude the possibility that the statement is definitely true or definitely false; I just mean to include other possibilities.)
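
To make the "what is the probability that X < x?" view concrete, here is a small sketch for a fair six-sided die. The die and the names `pmf` and `prob_less_than` are assumptions chosen for illustration, not something from the post.

```python
from fractions import Fraction

# Distribution of X, the face shown by a fair six-sided die (assumed).
pmf = {face: Fraction(1, 6) for face in range(1, 7)}

def prob_less_than(x):
    """Probability that X < x."""
    return sum((p for value, p in pmf.items() if value < x), Fraction(0))

print(prob_less_than(5))    # 2/3: faces 1 to 4 qualify
print(prob_less_than(3.5))  # 1/2: faces 1 to 3 qualify
print(prob_less_than(1))    # 0: "X < 1" is definitely false
print(prob_less_than(7))    # 1: "X < 7" is definitely true
```

The last two lines correspond to the parenthetical remark: statements with probability 0 or 1 are not excluded, they are just the extreme cases.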
 
The easiest way is to narrow it down a little bit.

Just think about a six-sided die. If you roll it 6 million times, you should expect approximately one million 1s, one million 2s, and so on. You would also expect no discernible pattern in the rolls, because each roll is realistically unpredictable.

So, you could define a sequence of random numbers as any sequence of numbers drawn from a set in such a way that all outcomes (1 to 6 in the case of a die) are equally likely and the sequence lacks any discernible pattern. You could, of course, add the caveat that not all outcomes have to be equally likely (as if the die had a weighted side). This is just a quick and dirty explanation that I think gets to the meat of the matter. I'm not an expert in random numbers, though.
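
The die experiment can be sketched in a few lines of Python. This is a scaled-down version (600,000 rolls rather than 6 million, to keep it quick), and the seed, the variable names, and the particular weights for the loaded die are all assumptions made for illustration. Note that the generator here is pseudorandom, so it only imitates an unpredictable die.

```python
import random
from collections import Counter

N_ROLLS = 600_000          # scaled down from 6 million for speed
FACES = [1, 2, 3, 4, 5, 6]
rng = random.Random(1)     # arbitrary seed, only for repeatability

# Fair die: each face should come up about N_ROLLS / 6 times.
fair = Counter(rng.choice(FACES) for _ in range(N_ROLLS))
for face in FACES:
    print(face, fair[face], round(fair[face] / N_ROLLS, 4))

# Weighted die: here face 6 is five times as likely as any other face,
# so it should come up about half the time (5 out of 10 weight units).
weights = [1, 1, 1, 1, 1, 5]
loaded = Counter(rng.choices(FACES, weights=weights, k=N_ROLLS))
print("loaded 6s:", round(loaded[6] / N_ROLLS, 4))
```

Both dice produce sequences that would count as random in the sense above; only the probabilities attached to the outcomes differ.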
 
Depending on who you're aiming at, a bottom-up approach might be more accessible. Start with a concrete scenario: "I lost my cat this morning. Sometimes she goes to the park, and sometimes to the fish shop. What should I do?" Discuss what the variable in the scenario is, what its values are, and how probability fits in. This can lead on to discussing outcomes with zero probability (are they really outcomes?), outcomes with probability 1 (is such a variable really random?), and so on. The concept can be compared with constants and deterministic variables: "She always goes to the park", "She goes to the park during the day, and heads to the fish shop at 5pm when they throw out the old fish."

The example I've given may not be very appropriate for your purposes - but having something concrete to "abstract from" can help those of us who are less mathematically inclined.
 
Well, mathematically a "random variable" is just a (measurable) function on a probability space, normally whichever probability space you happen to be talking about at the time, and including trivial cases such as constant functions. As to why:

The way probability is formalized in maths is to put all the randomness into the probability space. For example, if you are discussing questions about sequences of dice rolls, your probability space would consist of all possible (infinite, say) sequences such as 1435235333462... There's also a "measure" on the space that says what the probabilities of certain subsets are. (Ignoring technicalities, you can think of any subset as having a probability.) A random variable is then something like "the 2nd roll" (here 4), or "the sum of the 2nd and 4th rolls", which is a function taking a sequence (an element of the probability space) and returning a number.

To analyze the behaviour of the random variables you just need to understand the probability space itself. For example, the probability that the sum of the 2nd and 4th rolls is 7 is the probability of a certain subset of the probability space, namely the one where the sum of the 2nd and 4th terms in the sequence is 7.
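
The construction above can be sketched in Python under one simplifying assumption: instead of infinite sequences, the space is truncated to the first four rolls of a fair die, which is enough for the two random variables mentioned. The names `space`, `measure`, `second_roll`, and `sum_2nd_4th` are made up for this sketch.

```python
from fractions import Fraction
from itertools import product

# Probability space: all sequences of 4 fair die rolls (6**4 = 1296 points).
space = list(product(range(1, 7), repeat=4))

def measure(event):
    """Probability of a subset of the space under the uniform measure."""
    return Fraction(len(event), len(space))

# Random variables are plain functions from a sequence to a number.
def second_roll(seq):
    return seq[1]

def sum_2nd_4th(seq):
    return seq[1] + seq[3]

omega = (1, 4, 3, 5)       # first four rolls of the sequence 1435235333462...
print(second_roll(omega))  # 4
print(sum_2nd_4th(omega))  # 9

# The event "sum of the 2nd and 4th rolls is 7" is a subset of the space.
event = [seq for seq in space if sum_2nd_4th(seq) == 7]
print(measure(event))      # 1/6
```

Nothing in the functions themselves is random; the probability of 1/6 comes entirely from measuring a subset of the space.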

As to why it's formalized like this: well, it seems to work!
 
Here is the comment that spurred this thread:

Technically the term you defined was 'random variable' not 'random'.

You inferred that 'random' would have the same sense when it was found in other terms. Generally, people rejected your inference for good reason. While it is okay in common language to separate an adjective from a phrase with a given sense and apply it with that sense to other phrases, this is not acceptable with technical terms. Each technical term has a specific definition that may not follow from the senses of its constituent parts of speech.

Thus, since you used a technical definition of 'random variable', your inference is invalid. The observation that this inference makes all systems random is an example of the odd sorts of conclusions you reach when you make false inferences.

My contention in the thread in which the above comment occurs is that evolution is mathematically random because not every individual of a given phenotype produces reproductively viable offspring by virtue of their possessing that specific phenotype. As I understand it, the above observable fact jibes really well with the concept of a probability measure designating how often an event will occur with respect to all existing events in the sigma-algebra of a probability space.

Now it is also possible that I am misunderstanding evolutionary biology or the application of probability theory to evolutionary biology, but I am primarily interested in checking my factual knowledge of probability theory in and of itself.
 
In several references on probability and statistics I once checked, the term "random" is not defined and is almost never used by itself (e.g. outside phrases like "random variable", which are defined). There is a good reason for that.

As Meridian says above, one defines a "random variable" by putting all the randomness into the probability space. Which more or less begs the question - it gives you a concrete framework to work with, but it does not explain what the connection is to the physical world, or precisely what "random" means, or where the probability space came from.

Those are (in my opinion) very deep questions, to which no one knows the answer. As I have said before, I think the best definition of "random" is something that is fundamentally unpredictable.

Let me try to formulate that more precisely: A random event is an event whose outcome cannot be predicted with a confidence that tends to 1 when the errors in your knowledge of the initial data tend to 0.

Comments?
 
Does your "cannot" mean logically impossible, mathematically impossible, physically impossible, currently impractical or something else?
 
That's a good question. As a physicist I like to define things operationally, so I would go with "physically impossible under any circumstance". You're allowed to collect as much data as you like, have as powerful a computer as you like, and make measurements as precise as is possible, so long as none of those things violate the laws of physics.

The canonical example (of random by my definition) is predicting the results of a measurement of the z-axis spin of an electron prepared in a state of spin up along the x-axis. A slightly less canonical one is determining (from outside) whether an unstable particle which passed into a black hole horizon decays before it hits the singularity.
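
For the spin example, the standard textbook calculation (the Born rule) can be sketched as follows. The variable names are made up for this sketch; the only physics assumed is that a spin-up-along-x state, written in the z basis, has equal amplitudes 1/sqrt(2) for "up" and "down".

```python
import math

# Spin-up along x, expressed in the z basis: (|up_z> + |down_z>) / sqrt(2).
amp_up = 1 / math.sqrt(2)
amp_down = 1 / math.sqrt(2)

# Born rule: outcome probability is the squared magnitude of the amplitude.
p_up = abs(amp_up) ** 2
p_down = abs(amp_down) ** 2

print(round(p_up, 6), round(p_down, 6))
```

Both probabilities come out as one half even though the prepared state is specified exactly, which is why this counts as random under the definition proposed above: better knowledge of the initial data does not raise the prediction confidence toward 1.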
 
Incidentally, one problem with my proposed definition is that it probably either makes everything or nothing in the physical world random (depending on what you think about quantum mechanics).

However it can be modified to address that: one could say that an event is almost non-random (or maybe epsilon-predictable is better) if your confidence in your prediction tends to a number larger than 1-epsilon when the precision goes to infinity (where epsilon is to be specified depending on the application).
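
One possible way to write the two proposals down, offered only as a sketch: the notation c(δ) is hypothetical (it is not from the posts) and stands for the best physically achievable confidence in a prediction when the initial data are known to within an error δ. It also assumes the limit exists.

```latex
% c(\delta): best achievable prediction confidence given initial-data
% error \delta (hypothetical notation; the limit is assumed to exist).
\[
  \text{random:}\quad \lim_{\delta \to 0} c(\delta) < 1
  \qquad\qquad
  \varepsilon\text{-predictable:}\quad \lim_{\delta \to 0} c(\delta) > 1 - \varepsilon
\]
```

In this notation, the electron spin measurement has c(δ) = 1/2 for every δ, so it is random, and it is ε-predictable only for ε greater than 1/2.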
 
A random variable produces an infinite sequence not producible by a finitely describable mechanism.
 
Said sequence (infinite or otherwise) would not be a product of a random variable. It would be a sequence of random variables.
 
So are you saying that cyborg hasn't really defined a random variable? Or that there is just something wrong with how I interpreted what he said?
 
I'm saying he hasn't defined "random variable" correctly.
 