Measures of Dispersion:
By now, you have probably come across different measures of central tendency. A measure of central tendency represents an entire mass of data with a single value. Can that single value describe the data wholly and accurately? No, and that is precisely why we need measures of dispersion. For instance, suppose the hourly incomes of professionals in two offices are:
Office A : 30 50 50 65 70 90 100
Office B : 60 60 70 65 65 65 70
Here, the mean of both offices is the same, namely 65. However:
- In office A, the observations are spread far from the mean.
- In office B, almost all the observations are close to the mean.
Clearly, the two offices differ even though their means are the same.
Therefore, we need a way to differentiate between such groups. This calls for an additional measure that captures the scatter (or spread) of the data. Measures of dispersion serve this purpose.
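To make the contrast concrete, here is a minimal Python sketch using the two offices' figures from above. The variable names and the helper `deviations_from_mean` are illustrative choices, not part of any standard method.

```python
# Hourly incomes from the example above.
office_a = [30, 50, 50, 65, 70, 90, 100]
office_b = [60, 60, 70, 65, 65, 65, 70]

def deviations_from_mean(values):
    """Return each observation's signed distance from the arithmetic mean."""
    mean = sum(values) / len(values)
    return [x - mean for x in values]

dev_a = deviations_from_mean(office_a)
dev_b = deviations_from_mean(office_b)

# Both offices share the same mean, but the deviations differ sharply.
print(sum(office_a) / len(office_a), sum(office_b) / len(office_b))
print(dev_a)
print(dev_b)
```

The largest deviation in office A is 35, while no observation in office B is more than 5 away from the mean.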
In simple words, ‘dispersion’ is a lack of uniformity in the sizes or quantities of the items of a group or series. According to Reiglemen, “Dispersion is the extent to which the magnitudes or quantities of the items differ, the degree of diversity.” The word is also used to refer to the spread of the data.
Types of Dispersion
The measures of dispersion can be ‘absolute’ or ‘relative’. Absolute measures of dispersion are stated in the same units in which the original data is expressed. For instance, if a data set records the number of shoes each person in a group owns, an absolute measure of dispersion is also expressed as a number of shoes.
Relative dispersion, on the other hand, is the ratio of a measure of absolute dispersion to an appropriate average. The main benefit of this measure is that two or more series can be compared with each other even if they are expressed in different units.
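The following sketch illustrates why a relative measure allows comparison across units. The heights are made-up values used only for illustration, and the ratio of the standard deviation to the mean is just one possible choice of relative measure.

```python
from statistics import mean, pstdev

# Hypothetical heights of five people, in centimetres and in metres.
heights_cm = [150, 160, 170, 180, 190]
heights_m = [h / 100 for h in heights_cm]

# Absolute dispersion carries the unit of the data, so it changes with the unit.
abs_cm = pstdev(heights_cm)
abs_m = pstdev(heights_m)

# Relative dispersion is a pure ratio, so the unit cancels out.
rel_cm = abs_cm / mean(heights_cm)
rel_m = abs_m / mean(heights_m)

print(abs_cm, abs_m)   # differ by a factor of 100
print(rel_cm, rel_m)   # equal up to floating-point rounding
```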
Methods of Dispersion
Methods of studying dispersion are divided into two types :
(i) Mathematical Methods: We can study the ‘degree’ and ‘extent’ of variation with the use of these methods. The measures of dispersion included in this category are :
(a) Range
(b) Quartile Deviation
(c) Average Deviation
(d) Standard deviation and coefficient of variation.
(ii) Graphic Methods: When only the extent of variation is studied, that is, whether it is higher or lower, a Lorenz curve is used.
- Range
Two sections of 10 students each in class XII in a school were given a common test in Economics (40 maximum marks). The scores of the students are given below:
Section A: 6 9 11 13 15 21 23 28 29 35
Section B: 15 16 16 17 18 19 20 21 23 25
The average score in section A is 19. The average score in section B is 19.
In the above cited example, we observe that:
- the scores of all the students in section A are ranging from 6 to 35;
- the scores of the students in section B are ranging from 15 to 25.
The difference between the largest and the smallest scores in section A is 29 (35 – 6).
The difference between the largest and the smallest scores in section B is 10 (25 – 15).
Thus, the difference between the largest and the smallest value in a data set is termed the range of the distribution. The range does not consider all the values of a series: it takes only the two extreme items into account and ignores the items in between. Therefore, the range alone is not sufficient to describe the character of a distribution. It is nevertheless useful in fields such as quality control and in studying variations in share prices.
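The calculation above can be sketched in a few lines of Python. The function name `value_range` and the `clustered` series are hypothetical; the latter is included only to show that the range ignores everything between the extremes.

```python
section_a = [6, 9, 11, 13, 15, 21, 23, 28, 29, 35]
section_b = [15, 16, 16, 17, 18, 19, 20, 21, 23, 25]

def value_range(values):
    """Range = largest value minus smallest value."""
    return max(values) - min(values)

print(value_range(section_a))  # 29
print(value_range(section_b))  # 10

# Hypothetical series with the same extremes as section A but a very
# different middle: the range cannot tell the two apart.
clustered = [6, 20, 20, 20, 20, 20, 20, 20, 20, 35]
print(value_range(clustered))  # 29
```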
- Quartile Deviation
The quartile deviation is a slightly better measure of absolute dispersion than the range, although it ignores the observations at the ends (tails). It helps in knowing the range within which a certain proportion of the items lie. It considers only the values of the ‘Upper quartile’ (Q3) and the ‘Lower quartile’ (Q1).
Inter-Quartile Range = Q3 – Q1
The Inter-Quartile Range is based on the middle 50% of the values in a distribution and hence is unaffected by extreme values. Half of the Inter-Quartile Range is called the Quartile Deviation (Q.D.).
Thus: Q.D. = (Q3 – Q1)/2
Q.D. is therefore also called the Semi Inter-Quartile Range.
In individual and discrete series, Q1 is the size of the [(n + 1)/4]th value, but in a continuous distribution it is the size of the (n/4)th value. Similarly, for Q3 and the median in a continuous distribution, n is used in place of n + 1.
A relative measure of dispersion based on the quartile deviation is called the coefficient of quartile deviation:
Coefficient of Q.D. = (Q3 – Q1)/(Q3 + Q1)
It is just a number without any units of measurement. It can be used for comparing the dispersion of two or more sets of data.
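As a rough illustration, the sketch below computes the quartiles, the quartile deviation and the coefficient of quartile deviation for the two sections from the range example. It assumes the (n + 1)/4 positional rule for an individual series and linear interpolation for fractional positions (the text does not say how fractional positions are handled), and it takes the coefficient to be (Q3 – Q1)/(Q3 + Q1), the usual textbook form. All function names are illustrative.

```python
def quartile(values, k):
    """k-th quartile (k = 1, 2, 3) of an individual series.

    Uses the k(n + 1)/4 positional rule and interpolates linearly
    when the position falls between two items (an assumption).
    """
    ordered = sorted(values)
    n = len(ordered)
    position = k * (n + 1) / 4      # 1-based position in the ordered series
    whole = int(position)
    fraction = position - whole
    if whole < 1:
        return float(ordered[0])
    if whole >= n:
        return float(ordered[-1])
    lower, upper = ordered[whole - 1], ordered[whole]
    return lower + fraction * (upper - lower)

def quartile_deviation(values):
    """Half of the inter-quartile range."""
    return (quartile(values, 3) - quartile(values, 1)) / 2

def coefficient_of_qd(values):
    """Unit-free ratio (Q3 - Q1) / (Q3 + Q1)."""
    q1, q3 = quartile(values, 1), quartile(values, 3)
    return (q3 - q1) / (q3 + q1)

section_a = [6, 9, 11, 13, 15, 21, 23, 28, 29, 35]
section_b = [15, 16, 16, 17, 18, 19, 20, 21, 23, 25]

print(quartile(section_a, 1), quartile(section_a, 3))  # 10.5 28.25
print(quartile_deviation(section_a))                   # 8.875
print(quartile_deviation(section_b))                   # 2.75
print(coefficient_of_qd(section_a), coefficient_of_qd(section_b))
```

Under these assumptions, section A shows the larger quartile deviation and the larger coefficient, in line with its wider range.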
- Average Deviation
Average deviation (also called mean deviation) is the arithmetic mean of the absolute deviations (ignoring the negative signs) of the various items from the mean, median or mode.
Calculation of mean deviation:
MD = Σ|D| / N (Individual series)
MD = Σf|D| / N (Discrete and continuous series)
where
MD = Mean deviation
|D| = Deviations from the mean or median, ignoring ± signs
N = Number of items (Individual series)
N = Total of the frequencies (Discrete and continuous series)
f = Frequency of each item or class
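A minimal sketch of the calculation, assuming the usual forms MD = Σ|D| / N for an individual series and MD = Σf|D| / N for a discrete series. The function name `mean_deviation` and its parameters are illustrative. The data are the two sections from the range example, with section B also rewritten as a frequency table.

```python
def mean_deviation(values, center, freqs=None):
    """Mean of absolute deviations |D| of `values` from `center`.

    With `freqs`, each |D| is weighted by its frequency and N is the
    total of the frequencies (discrete series).
    """
    if freqs is None:
        freqs = [1] * len(values)
    n = sum(freqs)
    return sum(f * abs(x - center) for x, f in zip(values, freqs)) / n

section_a = [6, 9, 11, 13, 15, 21, 23, 28, 29, 35]
section_b = [15, 16, 16, 17, 18, 19, 20, 21, 23, 25]

# Individual series, deviations taken from the mean (19 in both sections).
md_a = mean_deviation(section_a, 19)
md_b = mean_deviation(section_b, 19)
print(md_a)  # 8.2
print(md_b)  # 2.6

# Section B as a discrete series: the score 16 occurs twice.
values_b = [15, 16, 17, 18, 19, 20, 21, 23, 25]
freqs_b = [1, 2, 1, 1, 1, 1, 1, 1, 1]
md_b_freq = mean_deviation(values_b, 19, freqs_b)
print(md_b_freq)  # 2.6
```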
- Standard Deviation
Standard deviation is one of the best and most widely used measures of dispersion. It is the square root of the arithmetic mean of the squared deviations of the items from their arithmetic mean. The concept of standard deviation, which was introduced by Karl Pearson, is useful in assessing the representativeness of the mean. It is of practical significance because it is free of the problems associated with the range, quartile deviation and average deviation.
It can be calculated by the following methods:
1. Actual Mean Method
2. Assumed Mean Method
3. Step Deviation Method
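The three methods can be sketched as follows. The formulas are the common textbook forms and are an assumption here, since the text names the methods without spelling them out: σ = √(Σx²/N) with x = X – mean for the actual mean method, σ = √(Σd²/N – (Σd/N)²) with d = X – A for the assumed mean method, and the same expression multiplied by the common factor C with d' = (X – A)/C for the step deviation method. The assumed mean 20 and the step 5 are arbitrary illustrative choices.

```python
from math import sqrt

def sd_actual_mean(values):
    """Square root of the mean of squared deviations from the actual mean."""
    n = len(values)
    mean = sum(values) / n
    return sqrt(sum((x - mean) ** 2 for x in values) / n)

def sd_assumed_mean(values, assumed):
    """Deviations d are taken from an assumed mean, then corrected."""
    n = len(values)
    d = [x - assumed for x in values]
    return sqrt(sum(v * v for v in d) / n - (sum(d) / n) ** 2)

def sd_step_deviation(values, assumed, step):
    """Deviations are also divided by a common factor `step`."""
    n = len(values)
    d = [(x - assumed) / step for x in values]
    return step * sqrt(sum(v * v for v in d) / n - (sum(d) / n) ** 2)

section_a = [6, 9, 11, 13, 15, 21, 23, 28, 29, 35]

sd_1 = sd_actual_mean(section_a)
sd_2 = sd_assumed_mean(section_a, assumed=20)
sd_3 = sd_step_deviation(section_a, assumed=20, step=5)

# All three methods should agree up to floating-point rounding.
print(sd_1, sd_2, sd_3)
```

The assumed mean and step deviation methods exist mainly to simplify hand calculation; under the assumptions above they give the same result as the actual mean method.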
Coefficient of variation:
This is the most apt measure when two or more groups of similar data are to be compared with respect to stability (or uniformity, consistency or homogeneity). It is the ratio of the standard deviation to the mean, usually expressed as a percentage:
C.V. = (σ / X) × 100
Where C.V. = Coefficient of variation
σ = Standard deviation
X = Arithmetic mean
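As an illustration, the sketch below compares the two sections from the range example, assuming the coefficient is expressed as a percentage and that σ is the population standard deviation (`statistics.pstdev`). The function name is illustrative.

```python
from statistics import mean, pstdev

def coefficient_of_variation(values):
    """Standard deviation as a percentage of the arithmetic mean."""
    return pstdev(values) / mean(values) * 100

section_a = [6, 9, 11, 13, 15, 21, 23, 28, 29, 35]
section_b = [15, 16, 16, 17, 18, 19, 20, 21, 23, 25]

cv_a = coefficient_of_variation(section_a)
cv_b = coefficient_of_variation(section_b)

# The group with the smaller coefficient is the more consistent one.
print(round(cv_a, 1), round(cv_b, 1))
```

Both sections have the same mean, so the lower coefficient for section B reflects its smaller standard deviation.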
Lorenz Curve: A Lorenz Curve can be defined as a graph on which the cumulative percentage of total national income (or some other variable) is plotted against the cumulative percentage of the corresponding population (ranked in increasing size of share). The extent to which the curve sags below a straight diagonal line indicates the degree of inequality of distribution.
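The points of a Lorenz curve can be computed without any plotting library. The sketch below is a simplified illustration that uses the hourly incomes of the two offices from the opening example instead of national income; the function name `lorenz_points` is hypothetical.

```python
def lorenz_points(values):
    """Return (cumulative population share, cumulative income share) pairs.

    Items are ranked in increasing order, and the curve starts at (0, 0).
    """
    ordered = sorted(values)
    total = sum(ordered)
    n = len(ordered)
    points = [(0.0, 0.0)]
    running = 0
    for i, value in enumerate(ordered, start=1):
        running += value
        points.append((i / n, running / total))
    return points

office_a = [30, 50, 50, 65, 70, 90, 100]
office_b = [60, 60, 70, 65, 65, 65, 70]

curve_a = lorenz_points(office_a)
curve_b = lorenz_points(office_b)

# The further a curve sags below the diagonal y = x, the more unequal
# the distribution. Office A sags more than office B at every point.
for (x, y_a), (_, y_b) in zip(curve_a, curve_b):
    print(round(x, 3), round(y_a, 3), round(y_b, 3))
```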