Random means unpredictable. Hence, a random variable is a variable whose future value cannot be predicted with certainty, even when its past values are known. In this article, we will look at the definition of a random variable and its types.
Definition of a Random Variable
A random variable is a variable whose possible values are the numerical outcomes of a random experiment. Therefore, it is a function which associates a unique numerical value with every outcome of an experiment. Further, its value varies with every trial of the experiment.
Since random variables are outcomes of a random experiment, it is important to understand a random experiment as well. A random experiment is a process which leads to an uncertain outcome.
Usually, it is assumed that the experiment can be repeated indefinitely under identical conditions. While the result of a single trial cannot be known in advance, it is always one of a known set of possible outcomes.
For example, when you toss an unbiased coin, the outcome can be a head or a tail. Even if you keep tossing the coin indefinitely, each outcome is one of these two. Also, you can never know the outcome of a toss in advance.
In a random experiment, the outcomes are not always numerical. However, we need numbers as outcomes for calculations. Therefore, we define a random variable as a function which associates a unique numerical value with every outcome of a random experiment.
For example, in the case of the tossing of an unbiased coin, if there are 3 trials, then the number of times a ‘head’ appears can be a random variable. This has values 0, 1, 2, or 3 since, in 3 trials, you can get a minimum of 0 heads and a maximum of 3 heads.
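The three-toss example can be sketched in Python to show that a random variable is simply a function from outcomes to numbers. This is only an illustrative sketch: the names `sample_space` and `number_of_heads` are made up for this example and do not come from any library.

```python
from itertools import product
from collections import Counter

# Sample space of 3 coin tosses: every length-3 sequence of 'H' and 'T'.
sample_space = list(product("HT", repeat=3))

def number_of_heads(outcome):
    """Random variable X: maps one outcome (a tuple of tosses) to a number."""
    return outcome.count("H")

# Count how many outcomes map to each value of X.
value_counts = Counter(number_of_heads(outcome) for outcome in sample_space)
possible_values = sorted(value_counts)

print(possible_values)     # the values X can take
print(dict(value_counts))  # how many of the 8 outcomes give each value
```

The outcomes themselves (such as H, T, H) are not numbers, but the value returned by `number_of_heads` is, which is what makes calculations possible.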
Types of Random Variables
We classify random variables by the kind of values they take and, correspondingly, by how their probability distribution is described. A discrete random variable is described by a probability mass function, while a continuous random variable is described by a probability density function. Therefore, we have two types of random variables: Discrete and Continuous.
Discrete Random Variables
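Discrete random variables take on only a countable number of distinct values. Usually, though not necessarily, these variables are counts. If a random variable can take only a finite number of distinct values, then it is discrete. A variable with countably infinitely many values, such as the number of tosses needed to get the first head, is also discrete.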
Number of members in a family, number of defective light bulbs in a box of 10 bulbs, etc. are some examples of discrete random variables.
The probability distribution of such a variable is a list of the probabilities associated with each of its possible values. It is also called the probability function or the probability mass function.
If a random variable X takes k different values, and the probability that X = x_i is written as P(X = x_i) = p_i, then the probabilities must satisfy the following:
- 0 ≤ p_i ≤ 1 (for each i)
- p_1 + p_2 + p_3 + … + p_k = 1
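As a minimal sketch, the two conditions above (every probability lies between 0 and 1, and all of them sum to 1) can be checked in Python. The helper name `is_valid_pmf` and the fair-die example are assumptions introduced here for illustration. A small tolerance is used because floating-point sums are rarely exact.

```python
import math

def is_valid_pmf(probabilities, tol=1e-9):
    """Return True if the list satisfies both conditions of a probability function."""
    # Condition 1: every probability lies between 0 and 1.
    in_range = all(0.0 <= p <= 1.0 for p in probabilities)
    # Condition 2: the probabilities sum to 1 (up to rounding error).
    sums_to_one = math.isclose(sum(probabilities), 1.0, abs_tol=tol)
    return in_range and sums_to_one

fair_die = [1 / 6] * 6                # six equally likely faces (illustrative)
print(is_valid_pmf(fair_die))         # valid distribution
print(is_valid_pmf([0.5, 0.6]))       # invalid: sums to more than 1
print(is_valid_pmf([1.2, -0.2]))      # invalid: values outside the range 0 to 1
```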
Example of Discrete Random Variables
You toss a coin 10 times. The random variable X is the number of times you get a ‘tail’. X can only take values 0, 1, 2, … , 10. Therefore, X is a discrete random variable. Let’s look at the probability of getting 8 tails.
p_8 (the probability of getting 8 tails) lies between 0 and 1. Also, the probabilities for all possible numbers of tails sum to one: p_0 + p_1 + … + p_10 = 1.
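The text does not say how to compute the probability of 8 tails. As a sketch, assuming the coin is fair and the tosses are independent (assumptions not stated in the example), the number of tails follows a binomial distribution and the values can be computed as below. The function name `prob_tails` is hypothetical.

```python
from math import comb

def prob_tails(k, n=10, p=0.5):
    """P(X = k): probability of exactly k tails in n independent tosses.

    Assumes each toss lands tails with probability p (0.5 for a fair coin).
    """
    return comb(n, k) * p**k * (1 - p) ** (n - k)

p8 = prob_tails(8)                               # 45 / 1024
total = sum(prob_tails(k) for k in range(11))    # p_0 + p_1 + ... + p_10

print(f"P(X = 8) = {p8:.4f}")
print(f"Sum of all probabilities = {total:.4f}")
```

Under these assumptions the probability of 8 tails is 45/1024, roughly 0.044, and the eleven probabilities together add up to 1.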
Continuous Random Variables
Continuous random variables take on an uncountably infinite number of possible values, usually every value in a given range. Typically, these are measurements like weight, height, the time needed to finish a task, etc.
To give you an example, the lifespan of an individual in a community is a continuous random variable. Let’s say that the maximum lifespan of an individual in a community is 110 years.
Therefore, a person can die immediately at birth (where life = 0 years), on reaching the age of 110 years, or at any age in between. Therefore, the variable ‘Age’ can take any value between 0 and 110.
Hence, the possible values of a continuous random variable cannot be listed one by one, since there are infinitely many of them. Also, the probability of any single specific value is zero. Instead, probability is defined over an interval of values and is represented by the area under a curve.
Let’s say that a random variable X takes all values over an interval of real numbers. Then the probability that X falls in a set of outcomes ‘A’ is the area above A and under a curve. The curve representing it must satisfy the following conditions:
- The curve has no negative values (i.e. p(x) ≥ 0 for all x)
- The total area under the curve = 1.
This is a density curve.
Example of Density Curve
You burn a light bulb until it burns out. Let’s say that the life of the bulb lies between 0 and 100 hours (minimum = 0 and maximum = 100). ‘Y’ is a random variable which is the lifetime of the bulb in hours. Since Y can take any real value between 0 and 100, it is a continuous random variable.
As we have seen above, the probability that Y equals any single specific value is zero, so calculating it is not useful. Instead, we calculate the probability that Y lies between two points within the range (0-10, 50-70, less than 20, more than 90, etc.). Further, at any point in the range, p(y) ≥ 0, and the total area under the density curve = 1.
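The example does not specify the shape of the density curve. Purely for illustration, the sketch below assumes that every lifetime between 0 and 100 hours is equally likely (a uniform density), which real light bulbs would not follow. The names `density` and `prob_between` are hypothetical helpers.

```python
LOW, HIGH = 0.0, 100.0   # assumed minimum and maximum lifetime in hours

def density(y):
    """Assumed uniform density curve: constant height on [LOW, HIGH], zero elsewhere."""
    return 1.0 / (HIGH - LOW) if LOW <= y <= HIGH else 0.0

def prob_between(a, b):
    """P(a <= Y <= b): area under the uniform density curve between a and b."""
    a, b = max(a, LOW), min(b, HIGH)       # clip the interval to the valid range
    return max(b - a, 0.0) / (HIGH - LOW)  # width times height 1 / (HIGH - LOW)

print(prob_between(50, 70))    # probability the bulb lasts 50 to 70 hours
print(prob_between(0, 20))     # less than 20 hours
print(prob_between(90, 100))   # more than 90 hours
print(prob_between(30, 30))    # a single point has zero width, so zero probability
print(prob_between(0, 100))    # total area under the curve
```

With this assumed curve, an interval's probability is just its width divided by 100, a single point gets probability 0, and the whole range gets probability 1.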
Q1. What are random variables?
Answer: Random variables are variables whose possible values are the numerical outcomes of a random experiment. Therefore, they are functions which associate a unique numerical value with every outcome of an experiment. Further, their value varies with every trial of the experiment.