Measures of Dispersion- Mean Deviation and Standard Deviation

Measures of Dispersion:

The various measures of central tendency capture one important characteristic of a statistical distribution, namely a single representative value (the average). But to judge how reliable that average is, it is equally important to know how the values of the variable cluster around it. Measures of central tendency alone cannot give a complete picture of the items; we can be more definite about the nature of a distribution if we also have some idea of the scatter of the items about an average value.

A measure of the scatter of the items about some average is called a measure of dispersion. It may also be defined as a measurement of the spread of the items about an average value. One may think of dispersion as a lack of uniformity or homogeneity in the sizes of the items of a series.

The requisites of a good measure of dispersion are the same as those for a good measure of central tendency, namely:

It should be easy to compute.
It should be well-defined.
It should be simple to understand.
It should be based on all the items.
It should not be unduly affected by the presence of extreme items.
It should have sampling stability.
It should be capable of further algebraic treatment.

Some of the commonly used measures of dispersion are:

Range.
Mean Deviation.
Variance.
Standard Deviation.

The simplest measure of dispersion is the range, defined as the difference between the highest and the lowest values of the variable in the distribution. As it depends entirely on the two extreme values, it is the most unstable and unreliable measure: deleting or inserting any number of intermediate values does not change it at all. Owing to these drawbacks, it has limited use.

Mean deviation and standard deviation are widely used, and only these two measures (together with the variance, from which the standard deviation is obtained) are discussed here.
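The insensitivity of the range to intermediate values can be seen in a minimal Python sketch. The helper name `value_range` and both lists are illustrative assumptions, not part of the original text.

```python
def value_range(values):
    """Range = highest value minus lowest value."""
    return max(values) - min(values)

extremes_only = [29, 62]
with_intermediates = [29, 31, 33, 39, 44, 49, 62]  # same extremes, five values inserted

print(value_range(extremes_only))       # 33
print(value_range(with_intermediates))  # 33
```

Both calls give the same result because only the two extreme values enter the calculation.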

Mean Deviation (M.D.):

The mean deviation of a series of values of a variable is the arithmetic mean of the absolute deviations of the values from the mean or the median. (The absolute deviation is the deviation taken without regard to its sign.)

The mean deviation of a set of n discrete scores x₁, x₂, x₃, …, x_n about the arithmetic mean x̄ is given by-

M.D. = Σ | x_i – x̄ | / n

If the score x_i occurs f_i times, then M.D. = Σf_i | x_i – x̄ | / Σf_i.

When the values of the variables are given in the form of classes, then their respective class marks (mid-values of the classes) are to be taken as the values of the variables.
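As a rough illustration of the definition, the sketch below computes the mean deviation about the mean for ungrouped scores. The function name `mean_deviation_about_mean` is an assumption made for this example; the seven scores are the ones tabulated in the first worked table later in this article.

```python
def mean_deviation_about_mean(scores):
    """Arithmetic mean of the absolute deviations from the mean."""
    n = len(scores)
    mean = sum(scores) / n
    return sum(abs(x - mean) for x in scores) / n

scores = [39, 31, 62, 29, 33, 49, 44]   # mean is 287/7 = 41
md = mean_deviation_about_mean(scores)
print(round(md, 2))   # 9.14, i.e. 64/7
```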

Coefficient of Mean Deviation:

It is defined to be the ratio of the mean deviation to the arithmetic mean i.e.,

Coefficient of M.D. = M.D./x̄

It is generally used for the comparison of the variability of two or more series.

Merits of Mean Deviation:

It is well-defined.
It is easy to understand.
It is easy to compute.
It is based on all the items.
It can be computed about any average.
It is not much affected by the presence of extreme scores.

Demerits of Mean Deviation:

It ignores the signs of the individual deviations [d_i = (x_i – x̄)] of the items.
It is not suitable for further algebraic treatment, which limits further analysis of the distribution.

Variance and Standard Deviation:

The mean square deviation from any given average A (usually the mean or the median) is defined as the arithmetic mean of the squares of the deviations from the average A. It is generally denoted by S². Thus, the mean square deviation of the scores x₁, x₂, x₃, …, x_n from the average A is given by-

S² = Σ (x_i – A)² / n

If the mean square deviation is calculated from the mean, then it is called the variance and is denoted by σ². If x̄ is the mean of the set of scores x₁, x₂, x₃, …, x_n, then-

σ² = Σ (x_i – x̄)² / n

In practical problems, the mean of a set of scores may be a decimal fraction, and so each of the values x_i – x̄ would also be a decimal fraction. Squaring those values is laborious. To avoid this, the above formula for the variance is rewritten as follows.

σ² = Σ (x_i – x̄)² / n = (Σx_i² / n) – 2x̄ (Σx_i / n) + x̄² = (Σx_i² / n) – x̄² = (Σx_i² / n) – (Σx_i / n)²
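A quick numerical check that the definition σ² = Σ (x_i – x̄)² / n and the rewritten form (Σx_i² / n) – (Σx_i / n)² agree is sketched below. The function names and the five scores are made up for illustration.

```python
def variance_definition(xs):
    """Mean of the squared deviations from the mean."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / n

def variance_shortcut(xs):
    """Mean of the squares minus the square of the mean."""
    n = len(xs)
    return sum(x * x for x in xs) / n - (sum(xs) / n) ** 2

scores = [3, 4, 4, 6, 10]   # made-up scores; the mean is the decimal 5.4
print(round(variance_definition(scores), 2))  # 6.24
print(round(variance_shortcut(scores), 2))    # 6.24
```

The second form needs only Σx_i and Σx_i², so no decimal deviations have to be squared by hand.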

If the scores are repeated, i.e. if the score x₁ occurs f₁ times, x₂ occurs f₂ times, and so on, then the variance of the scores is given by-

σ² = Σf_i (x_i – x̄)² / n = (Σf_ix_i² / n) – (Σf_ix_i / n)², where n = Σf_i is the total number of scores.

To find the variance from a frequency distribution table of a continuous series, the series is first converted into a discrete series by taking the mid-values of the classes as the values of the variable.

If the scores are so large that squaring them is inconvenient, they may be reduced by changing the origin. A change of origin does not alter the variance, as the following discussion shows. We write u_i = x_i – a, where a is the amount by which each score is reduced.

∴ x_i = u_i + a

Taking the mean of both sides gives x̄ = ū + a, so x_i – x̄ = u_i – ū. Hence

σ² = Σf_i (x_i – x̄)² / n = Σf_i (u_i – ū)² / n = (Σf_iu_i² / n) – (Σf_iu_i / n)²

Thus, σ² remains unaffected by replacing x_i with u_i.
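The invariance under a change of origin can be checked numerically. In this minimal sketch the scores, frequencies, and the origin a = 1004 are all assumed values chosen for illustration.

```python
def variance(xs, fs):
    """Variance of scores xs occurring with frequencies fs."""
    n = sum(fs)
    mean = sum(f * x for x, f in zip(xs, fs)) / n
    return sum(f * (x - mean) ** 2 for x, f in zip(xs, fs)) / n

xs = [1002, 1004, 1006, 1008]   # made-up large scores
fs = [3, 5, 1, 1]               # made-up frequencies
a = 1004                        # assumed origin
us = [x - a for x in xs]        # u_i = x_i - a  ->  -2, 0, 2, 4

print(variance(xs, fs), variance(us, fs))   # 3.2 3.2
```

Working with the u_i keeps the numbers small while leaving the variance unchanged.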

If the classes of the frequency distribution table are of equal width, then one can also change the scale (in addition to shifting the origin).

If we take d_i = (x_i – a)/h, where h is the width of the class intervals, then x_i = a + hd_i and x̄ = a + hd̄, so x_i – x̄ = h (d_i – d̄). Hence

σ² = h² [ (Σf_id_i² / n) – (Σf_id_i / n)² ]

The positive square root of the variance is called the standard deviation. It is denoted by σ.

σ = √[ Σf_i (x_i – x̄)² / n ]

This formula computes σ directly from the actual mean of the scores; this way of finding σ is known as the direct method.

In the short-cut method, we do not find the actual mean. Instead, with u_i = x_i – a, we use the formula

σ = √[ (Σf_iu_i² / n) – (Σf_iu_i / n)² ]

In this case, each score has been reduced by the same amount a.

If, in addition to the change of origin, the scale is also reduced, the method is known as the step-deviation method. With d_i = (x_i – a)/h, the formula is-

σ = h √[ (Σf_id_i² / n) – (Σf_id_i / n)² ]
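The direct method and the step-deviation method can be compared in a short sketch. The data are the scores and frequencies of Example 7 in this article; the origin a = 25, the step h = 10, and the function names are assumptions made for illustration.

```python
import math

def sd_direct(xs, fs):
    """Standard deviation computed from the actual mean."""
    n = sum(fs)
    mean = sum(f * x for x, f in zip(xs, fs)) / n
    return math.sqrt(sum(f * (x - mean) ** 2 for x, f in zip(xs, fs)) / n)

def sd_step_deviation(xs, fs, a, h):
    """Standard deviation via d_i = (x_i - a) / h, rescaled by h at the end."""
    n = sum(fs)
    ds = [(x - a) / h for x in xs]
    mean_d = sum(f * d for d, f in zip(ds, fs)) / n
    mean_d2 = sum(f * d * d for d, f in zip(ds, fs)) / n
    return h * math.sqrt(mean_d2 - mean_d ** 2)

xs = [5, 15, 25, 35, 45, 55]
fs = [12, 18, 27, 20, 17, 6]

print(round(sd_direct(xs, fs), 2))                      # 14.11
print(round(sd_step_deviation(xs, fs, a=25, h=10), 2))  # 14.11
```

With a = 25 and h = 10 the d_i are the small integers –2, –1, 0, 1, 2, 3, which is what makes the method convenient for hand calculation.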

Coefficient of Standard Deviation (S.D.):

In order to compare two or more series for variability, the corresponding relative measures are used. The relative measure is known as the coefficient of standard deviation and is defined as-

Coefficient of S.D. = S.D./x̄

Another relative measure of variability is known as the coefficient of variation defined as

C.V. = (S.D./x̄) × 100
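A small sketch of how the coefficient of variation compares the variability of two series measured on different levels follows. Both series and the function name are made up for illustration.

```python
import math

def coefficient_of_variation(xs):
    """C.V. = (S.D. / mean) * 100 for ungrouped scores."""
    n = len(xs)
    mean = sum(xs) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in xs) / n)
    return sd / mean * 100

series_a = [48, 50, 52]   # made-up series with mean 50
series_b = [5, 10, 15]    # made-up series with mean 10

print(round(coefficient_of_variation(series_a), 2))   # 3.27
print(round(coefficient_of_variation(series_b), 2))   # 40.82
```

The series with the larger C.V. is the more variable one relative to its own mean, here series_b.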

Merits of Standard Deviation:

It is well-defined.
It is simple to understand.
It is based on all the items.
It has sampling stability.
In calculating S.D. the signs of deviations of the items are also taken into consideration.
It is capable of further algebraic treatment.

Demerits of Standard Deviation:

It is not easy to calculate.
As the squares of the deviations are involved in the calculation, it is highly affected by the presence of extremely high or extremely low scores.

Note: For a normal or nearly symmetrical distribution, the mean deviation is very nearly equal to 4/5th of the standard deviation.
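The 4/5 figure can be compared with the exact ratio for one specific case. The sketch assumes a normal distribution, for which the ratio M.D./S.D. equals √(2/π), a standard result that is not derived in this article.

```python
import math

# Assumption: normally distributed scores, where M.D. / S.D. = sqrt(2 / pi).
ratio = math.sqrt(2 / math.pi)

print(round(ratio, 4))   # 0.7979
print(4 / 5)             # 0.8
```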

Standard Deviation of Composite Group:

If σ₁ and σ₂ are the standard deviations of two groups containing n₁ and n₂ items respectively, and x̄₁ and x̄₂ are their respective arithmetic means, then x̄, the arithmetic mean of the combined group, is given by-

x̄ = (n₁x̄₁ + n₂x̄₂) / (n₁ + n₂)

and the standard deviation σ of the combined group is obtained from-

σ² = [ { (n₁σ₁² + n₂σ₂²) / (n₁ + n₂) } + { n₁n₂ (x̄₂ – x̄₁)² / (n₁ + n₂)² } ]
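The combined mean x̄ = (n₁x̄₁ + n₂x̄₂)/(n₁ + n₂) and the combined variance σ² = (n₁σ₁² + n₂σ₂²)/(n₁ + n₂) + n₁n₂(x̄₂ – x̄₁)²/(n₁ + n₂)² can be sketched as a small function. The function name `combine_groups` is an assumption; the group sizes, means, and standard deviations are those of Example 8 in this article.

```python
import math

def combine_groups(n1, mean1, sd1, n2, mean2, sd2):
    """Mean and standard deviation of two groups pooled together."""
    n = n1 + n2
    mean = (n1 * mean1 + n2 * mean2) / n
    within = (n1 * sd1 ** 2 + n2 * sd2 ** 2) / n       # weighted group variances
    between = n1 * n2 * (mean2 - mean1) ** 2 / n ** 2  # spread of the group means
    return mean, math.sqrt(within + between)

mean, sd = combine_groups(100, 15, 3, 150, 16, 4)
print(round(mean, 2), round(sd, 3))   # 15.6 3.666
```

The combined variance here is 13.44, matching the composite variance quoted in Example 8.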

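Example 1: Calculate the mean deviation about the mean for the scores 39, 31, 62, 29, 33, 49, 44.

Solution- Here, x̄ = (39 + 31 + 62 + 29 + 33 + 49 + 44)/7 = 287/7 = 41

x_i	| x_i – x̄ |
39	2
31	10
62	21
29	12
33	8
49	8
44	3
	Σ | x_i – x̄ | = 64

Mean deviation about mean = Σ | x_i – x̄ | / n = 64/7 = 9.14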

Example 2: Calculate the mean deviation about the mean for the following data.

Scores	5	10	15	20	25
Frequency	7	4	6	3	5

Solution-

Scores (x_i)	Frequency (f_i)	f_ix_i	| x_i – x̄ |	f_i | x_i – x̄ |
5	7	35	9	63
10	4	40	4	16
15	6	90	1	6
20	3	60	6	18
25	5	125	11	55
	Σf_i = 25	Σf_ix_i = 350		Σf_i | x_i – x̄ | = 158

∴ x̄ = Σf_ix_i / Σf_i
⇒ x̄ = 350/25 = 14

Mean deviation about mean = Σf_i | x_i – x̄ | / Σf_i = 158/25 = 6.32

Example 3: Calculate the mean deviation about the Median for the following distribution.

Score	15	9	7	11	17	5	13
Frequency	12	6	4	8	8	2	10

Solution- Arranging the scores in ascending order:

Score (x_i)	Frequency (f_i)	Cumulative Frequency	| x_i – M |	f_i | x_i – M |
5	2	2	8	16
7	4	6	6	24
9	6	12	4	24
11	8	20	2	16
13	10	30	0	0
15	12	42	2	24
17	8	50	4	32
	Σf_i = 50			Σf_i | x_i – M | = 136

Here, N = Σf_i = 50
∴ N/2 = 50/2 = 25
The first cumulative frequency greater than 25 is 30, which corresponds to the score 13.
∴ Median M = 13

Mean deviation about median = Σf_i | x_i – M | / Σf_i = 136/50 = 2.72
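The cumulative-frequency search for the median, followed by the mean deviation about it, can be sketched as below for the data of Example 3. The function names are assumptions made for this example, and the median rule used (average of the two middle items when the total frequency is even) is one common convention.

```python
def median_discrete(xs, fs):
    """Median of a discrete frequency distribution via cumulative frequencies."""
    pairs = sorted(zip(xs, fs))
    n = sum(fs)

    def item(k):
        """Value of the k-th item (1-indexed) in ascending order."""
        cumulative = 0
        for x, f in pairs:
            cumulative += f
            if cumulative >= k:
                return x

    if n % 2 == 1:
        return item((n + 1) // 2)
    return (item(n // 2) + item(n // 2 + 1)) / 2

def mean_deviation_about(xs, fs, centre):
    """Frequency-weighted mean of the absolute deviations from `centre`."""
    return sum(f * abs(x - centre) for x, f in zip(xs, fs)) / sum(fs)

xs = [15, 9, 7, 11, 17, 5, 13]
fs = [12, 6, 4, 8, 8, 2, 10]

median = median_discrete(xs, fs)
md = mean_deviation_about(xs, fs, median)
print(median, round(md, 2))   # 13.0 2.72
```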

Example 4: Find the mean deviation about the mean for the following distribution.

Class	0-10	10-20	20-30	30-40	40-50	50-60	60-70	70-80
Frequency	5	8	12	15	20	14	12	6

Solution-

Class	Frequency (f_i)	Class mark (x_i)	f_ix_i	| x_i – x̄ |	f_i | x_i – x̄ |
0-10	5	5	25	37.06	185.3
10-20	8	15	120	27.06	216.48
20-30	12	25	300	17.06	204.72
30-40	15	35	525	7.06	105.9
40-50	20	45	900	2.94	58.8
50-60	14	55	770	12.94	181.16
60-70	12	65	780	22.94	275.28
70-80	6	75	450	32.94	197.64
	Σf_i = 92		Σf_ix_i = 3870		Σf_i | x_i – x̄ | = 1425.28

Here, x̄ = Σf_ix_i / Σf_i = 3870/92 = 42.06

Mean Deviation about mean = Σf_i | x_i – x̄ | / Σf_i = 1425.28/92 = 15.49
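For grouped data the same calculation can be sketched with class marks computed from the class limits. The function name is an assumption; the classes and frequencies are those of Example 4. Note that the unrounded mean is about 42.065, which the worked solution takes as 42.06; the mean deviation still comes to about 15.49.

```python
def grouped_mean_deviation(classes, freqs):
    """Mean and mean deviation about the mean, using class marks."""
    marks = [(lower + upper) / 2 for lower, upper in classes]
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(marks, freqs)) / n
    md = sum(f * abs(x - mean) for x, f in zip(marks, freqs)) / n
    return mean, md

classes = [(0, 10), (10, 20), (20, 30), (30, 40),
           (40, 50), (50, 60), (60, 70), (70, 80)]
freqs = [5, 8, 12, 15, 20, 14, 12, 6]

mean, md = grouped_mean_deviation(classes, freqs)
print(round(mean, 3), round(md, 2))   # 42.065 15.49
```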

Example 5: Calculate the mean and variance of the following distribution.

Score	2	4	6	8	10	12	14	16
Frequency	4	4	5	15	8	5	4	5

Solution-

Score (x_i)	Frequency (f_i)	f_ix_i	x_i – x̄	(x_i – x̄)²	f_i (x_i – x̄)²
2	4	8	-7	49	196
4	4	16	-5	25	100
6	5	30	-3	9	45
8	15	120	-1	1	15
10	8	80	1	1	8
12	5	60	3	9	45
14	4	56	5	25	100
16	5	80	7	49	245
	Σf_i = 50	Σf_ix_i = 450			Σf_i (x_i – x̄)² = 754

∴ Mean x̄ = Σf_ix_i / Σf_i = 450/50 = 9

Variance (x) = Σf_i (x_i – x̄)² / n = 754/50 = 15.08

Example 6: Find the variance and standard deviation for the following data.

Variable	10	6	18	14	22	2
Frequency	10	7	7	15	6	5

Solution-

Variable (x_i)	Frequency (f_i)	x_i²	f_ix_i²	f_ix_i
10	10	100	1000	100
6	7	36	252	42
18	7	324	2268	126
14	15	196	2940	210
22	6	484	2904	132
2	5	4	20	10
	Σf_i = 50		Σf_ix_i² = 9384	Σf_ix_i = 620

∴ Variance (x) = (Σf_ix_i² / n) – (Σf_ix_i / n)²
⇒ Variance (x) = (9384/50) – (620/50)² = (4692/25) – (3844/25) = 33.92

Standard Deviation = √var(x) = √33.92 ≈ 5.82

Example 7: Calculate the mean and standard deviation from the following distribution and also find the coefficient of variation.

Score	5	15	25	35	45	55
Frequency	12	18	27	20	17	6

Solution-

Score (x_i)	Frequency (f_i)	f_ix_i	x_i – x̄	(x_i – x̄)²	f_i (x_i – x̄)²
5	12	60	-23	529	6348
15	18	270	-13	169	3042
25	27	675	-3	9	243
35	20	700	7	49	980
45	17	765	17	289	4913
55	6	330	27	729	4374
	Σf_i = 100	Σf_ix_i = 2800			Σf_i (x_i – x̄)² = 19900

∴ Mean x̄ = Σf_ix_i / Σf_i = 2800/100 = 28

Variance (x) = Σf_i (x_i – x̄)² / n = 19900/100 = 199

Standard Deviation = √var(x) = √199 ≈ 14.11

Coefficient of variation = (Standard Deviation / x̄) × 100
⇒ Coefficient of variation = (√199/28) × 100 ≈ 50.38

Example 8: The mean and variance of a group of 100 items are 15 and 9. A second group containing 150 items is mixed with it, and the mean and variance of the composite group are 15.6 and 13.44. Find the mean and standard deviation of the second group.

Solution- Let x̄₁ and x̄₂ be the means of the two groups, and n₁ and n₂ their respective numbers of items.

Let x̄ be the mean of the composite group.

Here, x̄₁ = 15, n₁ = 100
x̄ = 15.6, n₂ = 150
∴ x̄ = (n₁x̄₁ + n₂x̄₂)/(n₁ + n₂)
⇒ 15.6 = [100 (15) + 150 (x̄₂)]/(100 + 150)
⇒ 3900 = 1500 + 150 (x̄₂)
⇒ 2400/150 = x̄₂
⇒ x̄₂ = 16

∴ The mean of the second group is 16

Let σ₁² and σ₂² be the variances of the two groups.

Let σ² be the variance of the composite group.

Here, σ₁² = 9, σ² = 13.44

∴ σ² = [ { (n₁σ₁² + n₂σ₂²) / (n₁ + n₂) } + { n₁n₂ (x̄₂ – x̄₁)² / (n₁ + n₂)² } ]
⇒ 13.44 = [ { (100 × 9 + 150σ₂²) / (100 + 150) } + { 100 × 150 (16 – 15)² / (100 + 150)² } ]
⇒ 13.44 = [ { (900 + 150σ₂²) / 250 } + {15000 / (250)² } ]
⇒ 13.44 – (15000/62500) = (900 + 150σ₂²) / 250
⇒ 13.44 – (6/25) = (900 + 150σ₂²) / 250
⇒ (330/25) = (900 + 150σ₂²) / 250
⇒ 3300 – 900 = 150σ₂²
⇒ σ₂² = 2400/150
⇒ σ₂² = 16
⇒ σ₂ = 4

∴ The standard deviation of the second group is 4.
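The algebra of Example 8 can be retraced in a short sketch that inverts the composite-group formulas to recover the second group. The function name `second_group` and its parameter names are assumptions made for illustration.

```python
import math

def second_group(n1, mean1, var1, n2, mean, var):
    """Recover the mean and S.D. of group 2 from group 1 and the composite group."""
    n = n1 + n2
    # From mean = (n1*mean1 + n2*mean2) / n
    mean2 = (n * mean - n1 * mean1) / n2
    # From var = (n1*var1 + n2*var2)/n + n1*n2*(mean2 - mean1)**2 / n**2
    between = n1 * n2 * (mean2 - mean1) ** 2 / n ** 2
    var2 = ((var - between) * n - n1 * var1) / n2
    return mean2, math.sqrt(var2)

mean2, sd2 = second_group(n1=100, mean1=15, var1=9, n2=150, mean=15.6, var=13.44)
print(round(mean2, 2), round(sd2, 2))   # 16.0 4.0
```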
