The Quartile Deviation is a simple way to estimate the spread of a distribution about a measure of its central tendency (usually the median). It tells you how widely the central 50% of your sample data is spread. Based on the quartile deviation, we can also define the Coefficient of Quartile Deviation, which makes it easy to compare the spread of two or more different distributions. Since both of these are built on the concept of quartiles, we’ll first understand how to calculate the quartiles of a dataset before working with the direct formulae.
Quartiles
A median divides a given dataset (which is already sorted) into two equal halves. Similarly, the quartiles divide a given dataset into four equal parts. Therefore, there are three quartiles for a given distribution, and the second quartile is equal to the median itself! We’ll deal with the other two quartiles in this section.
- The first quartile or the lower quartile or the 25th percentile, denoted by Q1, corresponds to the value that lies halfway between the lowest value and the median in terms of position (when the data is sorted in ascending order). Hence, it marks the point below which the first 25% of the data lies.
- Similarly, the third quartile or the upper quartile or the 75th percentile, denoted by Q3, corresponds to the value that lies halfway between the median and the highest value in terms of position (when the data is sorted in ascending order). It, therefore, marks the point below which 75% of the data lies, or above which the last 25% lies.
[Figure: representation of the quartiles for a Gaussian distribution. Source: Wikipedia]
The Quartile Deviation
Formally, the Quartile Deviation is equal to half of the Inter-Quartile Range, and thus we can write it as – $$ Q_d = \frac{Q_3 - Q_1}{2} $$ Therefore, we also call it the Semi Inter-Quartile Range.
- The Quartile Deviation doesn’t take into account the extreme points of the distribution. Thus, the dispersion or the spread of only the central 50% data is considered.
- If the scale of the data is changed, the Qd also changes in the same ratio.
- It is the best measure of dispersion for open-ended distributions (those whose extreme classes are open-ended), since it does not depend on the extreme values.
- Also, it is less affected by sampling fluctuations in the dataset as compared to the range (another measure of dispersion).
- Since it is solely dependent on the central values in the distribution, if in any experiment, these values are abnormal or inaccurate, the result would be affected drastically.
Quartile Deviation Formula
Quartile Deviation = $$ \frac{Q_3 - Q_1}{2} $$
Q1 = lower quartile
Q3 = upper quartile
Q2 is also known as the median.
Quartile Deviation for Ungrouped Data
For ungrouped data, the formulae to calculate the quartiles are:
Q1 = $$ \left[ \frac{n + 1}{4} \right]^{\text{th}} \text{ item} $$
Q2 = $$ \left[ \frac{n + 1}{2} \right]^{\text{th}} \text{ item} $$
Q3 = $$ \left[ \frac{3 (n + 1)}{4} \right]^{\text{th}} \text{ item} $$
Here, n is the total number of observations.
It is important to note here that students need to arrange the given data values in ascending order before estimating the quartiles.
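The position rule above can be sketched in Python. This is only a minimal illustration: the function name `quartile` and the sample values are hypothetical, and it assumes that a fractional position is resolved by interpolating linearly between the two neighbouring items (the same approach the solved example uses for the 2.75th term).

```python
def quartile(values, r):
    """Return the r-th quartile (r = 1, 2 or 3) of ungrouped data
    using the r(n + 1)/4 position rule."""
    data = sorted(values)            # the rule assumes ascending order
    if not data:
        raise ValueError("at least one observation is required")
    n = len(data)
    pos = r * (n + 1) / 4            # 1-based position, may be fractional
    k = int(pos)                     # whole part of the position
    frac = pos - k                   # fractional part of the position
    # Assumption: positions falling outside the data are clamped to the ends.
    if k < 1:
        return data[0]
    if k >= n:
        return data[-1]
    # Interpolate between the k-th and (k + 1)-th items (1-based).
    return data[k - 1] + frac * (data[k] - data[k - 1])


# Hypothetical sample with n = 7, so the positions are 2, 4 and 6.
sample = [7, 3, 9, 5, 1, 11, 13]
q1, q2, q3 = (quartile(sample, r) for r in (1, 2, 3))
print(q1, q2, q3)            # 3.0 7.0 11.0
print((q3 - q1) / 2)         # quartile deviation: 4.0
```

Other conventions for locating quartiles exist, so a different tool may give slightly different values for the same data.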
Quartile Deviation for Grouped Data
For grouped data, the quartiles can be calculated using the following formula:
$$ Q_{r}=l_{1}+\frac{r\left(\frac{N}{4}\right)-c}{f}\left(l_{2}-l_{1}\right) $$
Here,
Qr = rth quartile
l1 = the lower limit of the quartile class
l2 = the upper limit of the quartile class
f = the frequency of the quartile class
c = the cumulative frequency of the class preceding the quartile class
N = Number of observations in the given data set
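As a rough sketch, the grouped-data formula could be coded as below. The function name `grouped_quartile`, the `(lower, upper, frequency)` tuple layout, and the small frequency table are all assumptions made for illustration. The quartile class is taken to be the first class whose cumulative frequency reaches r × N/4.

```python
def grouped_quartile(classes, r):
    """Return Q_r for grouped data.

    classes: list of (lower_limit, upper_limit, frequency) tuples,
             listed in ascending order of the class limits.
    """
    total = sum(freq for _, _, freq in classes)   # N
    target = r * total / 4                        # r(N/4)
    cumulative = 0                                # c, built up class by class
    for lower, upper, freq in classes:
        # The quartile class is the first one whose cumulative
        # frequency reaches the target position.
        if freq > 0 and cumulative + freq >= target:
            return lower + (target - cumulative) / freq * (upper - lower)
        cumulative += freq
    raise ValueError("no quartile class found; check r and the table")


# Hypothetical frequency table with N = 20.
table = [(0, 10, 4), (10, 20, 8), (20, 30, 6), (30, 40, 2)]
print(grouped_quartile(table, 1))   # 11.25
print(grouped_quartile(table, 3))   # 25.0
```

For this table, Q1 falls in the 10-20 class (c = 4, f = 8) and Q3 in the 20-30 class (c = 12, f = 6), matching a hand calculation with the formula.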
The Coefficient of Quartile Deviation
Based on the quartiles, a relative measure of dispersion, known as the Coefficient of Quartile Deviation, can be defined for any distribution. It is formally defined as – $$\text{Coefficient of Quartile Deviation} = \frac{Q_3 - Q_1}{Q_3 + Q_1} \times 100$$
Since it involves a ratio of two quantities of the same dimensions, it is unitless. Thus, it can act as a suitable parameter for comparing two or more different datasets which may or may not involve quantities with the same dimensions.
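To see why a unitless measure helps, here is a small Python sketch that compares two datasets measured in different units. The helper names and the quartile values (heights in cm, weights in kg) are made up purely for illustration.

```python
def quartile_deviation(q1, q3):
    """Semi inter-quartile range: (Q3 - Q1) / 2."""
    return (q3 - q1) / 2


def coefficient_of_qd(q1, q3):
    """Coefficient of Quartile Deviation, expressed as a percentage."""
    if q3 + q1 == 0:
        raise ValueError("Q3 + Q1 must be non-zero")
    return (q3 - q1) / (q3 + q1) * 100


# Hypothetical quartiles: heights in cm and weights in kg.
heights = {"q1": 160, "q3": 176}
weights = {"q1": 54, "q3": 66}

for name, q in (("heights", heights), ("weights", weights)):
    qd = quartile_deviation(q["q1"], q["q3"])
    coeff = coefficient_of_qd(q["q1"], q["q3"])
    print(name, qd, round(coeff, 2))
```

In this made-up case the heights have the larger quartile deviation (8 cm against 6 kg), but the weights have the larger coefficient (10 against roughly 4.76), so the weights are relatively more dispersed.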
So, now let’s go through the solved examples below to get a better idea of how to apply these concepts to various distributions.
Importance of Quartile Deviation
Statistics is a tool that helps us understand data, its frequencies, and the distribution of its trends. The difference between the third quartile and the first quartile of a distribution is known as the interquartile range. It is important because it describes the spread of the central half of the data without being influenced by extreme values, which helps to assess the characteristics of the data. When we divide the interquartile range by two, we get the quartile deviation or semi-interquartile range.
Solved Examples on Quartile Deviation
Question 1: The number of vehicles sold by a major Toyota Showroom in a day was recorded for 10 working days. The data is given as –
Day | Vehicles Sold |
1 | 20 |
2 | 15 |
3 | 18 |
4 | 5 |
5 | 10 |
6 | 17 |
7 | 21 |
8 | 19 |
9 | 25 |
10 | 28 |
Find the Quartile Deviation and its coefficient for the given data.
Solution: We first sort the recorded daily sales in ascending order before calculating the quartiles –
Sorted Data – 5, 10, 15, 17, 18, 19, 20, 21, 25, 28
n(number of data points) = 10
Now, to find the quartiles, we use the logic that the first quartile lies halfway between the lowest value and the median; and the third quartile lies halfway between the median and the largest value.
First Quartile Q1 = \(\frac{n + 1}{4}\)th term.
= \(\frac{10 + 1}{4}\)th term = 2.75th term
= 2nd term + 0.75 × (3rd term – 2nd term)
= 10 + 0.75 × (15 – 10)
= 10 + 3.75
= 13.75
Third Quartile Q3 = \(\frac{3(n + 1)}{4}\)th term.
= \(\frac{3(10 + 1)}{4}\)th term = 8.25th term
= 8th term + 0.25 × (9th term – 8th term)
= 21 + 0.25 × (25 – 21)
= 21 + 1
= 22
Using the values for Q1 and Q3, now we can calculate the Quartile Deviation and its coefficient as follows –
Quartile Deviation = Semi-Inter Quartile Range
= \(\frac{Q_3 - Q_1}{2}\)
= \(\frac{22 - 13.75}{2}\)
= \(\frac{8.25}{2}\)
= 4.125
Coefficient of Quartile Deviation
= \(\frac{Q_3 - Q_1}{Q_3 + Q_1} \times 100\)
= \(\frac{22 - 13.75}{22 + 13.75} \times 100\)
= \(\frac{8.25}{35.75} \times 100\)
≈ 23.08
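The result for Question 1 can be cross-checked with Python's standard library. This sketch assumes Python 3.8 or later, where `statistics.quantiles` is available. Its `"exclusive"` method places the quartiles at the r(n + 1)/4 positions and interpolates linearly, which corresponds to the rule used in the solution. The variable names are illustrative.

```python
from statistics import quantiles

# Daily sales from Question 1 (quantiles() sorts the data itself).
sales = [20, 15, 18, 5, 10, 17, 21, 19, 25, 28]

# n=4 splits the data into quartiles and returns the three cut points.
q1, q2, q3 = quantiles(sales, n=4, method="exclusive")

qd = (q3 - q1) / 2                       # quartile deviation
coeff = (q3 - q1) / (q3 + q1) * 100      # coefficient of quartile deviation

print(q1, q3)                # 13.75 22.0
print(qd, round(coeff, 2))   # 4.125 23.08
```

Note that the `"inclusive"` method of the same function uses a different position rule and would generally give different quartiles for this data.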
Question 2:
For the following grouped data, calculate the Quartile Deviation and its coefficient.
Marks | No. of Students |
0-10 | 10 |
10-20 | 20 |
20-30 | 30 |
30-40 | 50 |
40-50 | 40 |
50-60 | 30 |
Solution: For the case of a grouped-data distribution, we can find the quartiles through the following steps –
⇒ Construct a cumulative frequency table for the given data alongside the given distribution
⇒ From the total number of data values, estimate the groups/classes of the Lower and Upper Quartiles
⇒ Use the following formulae to then calculate the quartiles:
The Lower Quartile Q1 = \(LB + w\frac{\frac{1}{4}n - f_c}{f}\)
The Upper Quartile Q3 = \(LB + w\frac{\frac{3}{4}n - f_c}{f}\)
where, LB – the lower bound of the class in which the respective quartile lies
w – the class width
f_c – the cumulative frequency of the classes preceding that class
f – the frequency corresponding to that particular class
n – the total number of data values
For the given data, we can form the required table with the cumulative frequency as –
Marks | Frequency | Cumulative Frequency |
0-10 | 10 | 10 |
10-20 | 20 | 30 |
20-30 | 30 | 60 |
30-40 | 50 | 110 |
40-50 | 40 | 150 |
50-60 | 30 | 180 |
Since the total number of students is 180, the first quartile must lie at the position of 180/4 = 45th student. Similarly, the third quartile must lie at the position of 180×3/4 = 135th student. By the distribution of our data into groups, we can note that the first quartile will lie in the 20-30 marks range.
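The first two steps (building the cumulative frequencies and locating the quartile classes) can be sketched in Python as follows. The names `cum` and `quartile_class` are illustrative. The sketch assumes the quartile class is the first class whose cumulative frequency reaches the target position.

```python
from bisect import bisect_left
from itertools import accumulate

classes = ["0-10", "10-20", "20-30", "30-40", "40-50", "50-60"]
freqs = [10, 20, 30, 50, 40, 30]

cum = list(accumulate(freqs))   # running totals: [10, 30, 60, 110, 150, 180]
n = cum[-1]                     # total number of students


def quartile_class(r):
    """Index of the class containing the r-th quartile."""
    target = r * n / 4
    # First index whose cumulative frequency is >= target.
    return bisect_left(cum, target)


for r in (1, 3):
    i = quartile_class(r)
    preceding = cum[i - 1] if i > 0 else 0   # cumulative frequency before the class
    print(r, classes[i], freqs[i], preceding)
# 1 20-30 30 30
# 3 40-50 40 110
```

The printed frequency and preceding cumulative frequency are the f and f_c values that go into the quartile formulae.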
Calculation –
Q1 = \(LB + w\frac{\frac{1}{4}n - f_c}{f}\)
Here, LB = 20; w = 10
f_c = 30; f = 30; n = 180
Thus, Q1 = \(20 + 10\frac{\frac{1}{4}\times 180 - 30}{30}\)
= \(20 + \frac{15}{30}\times 10\)
= 25
Similarly, the third quartile will lie in the 40-50 marks range. Calculation –
Q3 = \(LB + w\frac{\frac{3}{4}n - f_c}{f}\)
Here, LB = 40; w = 10
f_c = 110; f = 40; n = 180
Thus, Q3 = \(40 + 10\frac{\frac{3}{4}\times 180 - 110}{40}\)
= \(40 + \frac{25}{40}\times 10\)
= 46.25
Now, using the values for Q1 and Q3, we can calculate the Quartile Deviation and its coefficient as follows –
Quartile Deviation = Semi-Inter Quartile Range
= \(\frac{Q_3 - Q_1}{2}\)
= \(\frac{46.25 - 25}{2}\)
= \(\frac{21.25}{2}\)
= 10.625
Coefficient of Quartile Deviation
= \(\frac{Q_3 - Q_1}{Q_3 + Q_1} \times 100\)
= \(\frac{46.25 - 25}{46.25 + 25} \times 100\)
= \(\frac{21.25}{71.25} \times 100\)
≈ 29.82
This concludes our discussion on this topic.