Spearman’s Rank Correlation Coefficient in Different Cases

  • Last Updated : 08 Jun, 2022

Spearman’s Rank Correlation Coefficient, also called Spearman’s Rank Difference Method, measures the correlation between two variables from their ranks rather than from their actual values. It was developed by Charles Edward Spearman in 1904. The method is particularly useful for qualitative attributes such as beauty, ability, or honesty, which cannot be measured quantitatively but can be ranked or put in order of preference.

r_k = 1 - \frac{6\sum{D^2}}{N^3 - N}

In the given formula,

r_k = Coefficient of rank correlation

D = Difference between the two ranks of the same item (R1 - R2)

N = Number of pairs of observations

Three different cases of Spearman’s Rank Correlation Coefficient:

Case 1: When Ranks are given

In this case, the ranks of the items in both series are already given, and the coefficient of rank correlation is calculated directly from those ranks. The formula for calculating Spearman’s Rank Correlation is

r_k = 1 - \frac{6\sum{D^2}}{N^3 - N}

Example:

In an art competition, two judges awarded the following ranks to the 10 participants:

Judge X | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10
Judge Y | 6 | 2 | 9 | 7 | 1 | 4 | 8 | 3 | 10 | 5

Calculate the coefficient of rank correlation.

Solution:

Judge X (R1) | Judge Y (R2) | D = R1 - R2 | D²
1 | 6 | -5 | 25
2 | 2 | 0 | 0
3 | 9 | -6 | 36
4 | 7 | -3 | 9
5 | 1 | 4 | 16
6 | 4 | 2 | 4
7 | 8 | -1 | 1
8 | 3 | 5 | 25
9 | 10 | -1 | 1
10 | 5 | 5 | 25
N = 10, ∑D² = 142

r_k = 1 - \frac{6\sum{D^2}}{N^3 - N}

= 1 - \frac{6\times{142}}{10^3 - 10}

= 1 - \frac{852}{990}

= 1 - 0.861

= 0.139 ≈ 0.14

Coefficient of Correlation (rk) = 0.14

As the rank correlation is positive but close to 0, the agreement between the rankings of the two judges is weak.
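The calculation above can be reproduced with a few lines of Python. This is a minimal sketch, assuming untied ranks; the function name `rank_correlation` is chosen here for illustration and does not come from any library.

```python
def rank_correlation(ranks_x, ranks_y):
    """Spearman's r_k = 1 - 6 * sum(D^2) / (N^3 - N), assuming no tied ranks."""
    if len(ranks_x) != len(ranks_y):
        raise ValueError("both rank lists must have the same length")
    n = len(ranks_x)
    # D is the difference between the two ranks of the same participant.
    sum_d2 = sum((r1 - r2) ** 2 for r1, r2 in zip(ranks_x, ranks_y))
    return 1 - (6 * sum_d2) / (n ** 3 - n)


judge_x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
judge_y = [6, 2, 9, 7, 1, 4, 8, 3, 10, 5]

r_k = rank_correlation(judge_x, judge_y)
print(round(r_k, 2))  # 0.14
```

Identical rankings give a coefficient of 1 and exactly reversed rankings give -1, which is a quick way to sanity-check the function.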

Case 2: When Ranks are not given

When the ranks are not given, the values must first be ranked. While ranking, the same procedure has to be followed for both series. For instance, if the 1st rank is given to the lowest value of one series, then the same pattern should be followed for the second series as well. Once the ranks have been assigned, the coefficient of rank correlation is calculated as in the first case. The formula for calculating Spearman’s rank correlation coefficient is

r_k = 1 - \frac{6\sum{D^2}}{N^3 - N}

Example:

Calculate the Spearman’s Rank Correlation for the following data.

Mathematics | 14 | 15 | 17 | 12 | 16 | 11 | 18 | 9 | 10
Accountancy | 4 | 12 | 8 | 10 | 2 | 5 | 9 | 3 | 7

Solution:

In the given case, there are 9 values, and both series, Mathematics (X) and Accountancy (Y), are ranked in ascending order: the 1st rank goes to the lowest value and the 9th rank to the highest value. Therefore, the 1st rank is given to 9 in the X series and 2 in the Y series. Similarly, the 9th rank is given to 18 in the X series and 12 in the Y series.

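Mathematics (X) | Rank R1 | Accountancy (Y) | Rank R2 | D = R1 - R2 | D²
14 | 5 | 4 | 3 | 2 | 4
15 | 6 | 12 | 9 | -3 | 9
17 | 8 | 8 | 6 | 2 | 4
12 | 4 | 10 | 8 | -4 | 16
16 | 7 | 2 | 1 | 6 | 36
11 | 3 | 5 | 4 | -1 | 1
18 | 9 | 9 | 7 | 2 | 4
9 | 1 | 3 | 2 | -1 | 1
10 | 2 | 7 | 5 | -3 | 9
N = 9, ∑D² = 84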

r_k = 1 - \frac{6\sum{D^2}}{N^3 - N}

= 1 - \frac{6\times84}{9^3 - 9}

= 1 - \frac{504}{720}

= 1 – 0.7

= 0.3

Coefficient of Correlation (rk) = 0.3

It means that there is a low-to-moderate positive rank correlation (0.3) between the marks in Mathematics and Accountancy.
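The ranking step and the calculation can be sketched together in Python. The helper `ascending_ranks` is a hypothetical name used only for this illustration; it gives rank 1 to the smallest value and assumes that no value repeats within a series.

```python
def ascending_ranks(values):
    """Rank 1 goes to the smallest value; assumes all values are distinct."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


mathematics = [14, 15, 17, 12, 16, 11, 18, 9, 10]
accountancy = [4, 12, 8, 10, 2, 5, 9, 3, 7]

r1 = ascending_ranks(mathematics)  # [5, 6, 8, 4, 7, 3, 9, 1, 2]
r2 = ascending_ranks(accountancy)  # [3, 9, 6, 8, 1, 4, 7, 2, 5]

n = len(r1)
sum_d2 = sum((a - b) ** 2 for a, b in zip(r1, r2))  # 84
r_k = 1 - (6 * sum_d2) / (n ** 3 - n)
print(round(r_k, 2))  # 0.3
```

Ranking both series in descending order instead would leave every difference D with the opposite sign, so ∑D² and the coefficient would be unchanged. What matters is that both series use the same direction.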

Case 3: When Ranks are equal

When two or more values in a series are equal, each of them is given the average of the ranks they would otherwise have occupied. Because tied ranks affect ∑D², a correction term is added for every group of tied values, and the formula for calculating Spearman’s Rank Correlation Coefficient becomes

r_k = 1 - \frac{6[\sum D^2 + \frac{1}{12}(m_1^3 - m_1) + \frac{1}{12}(m_2^3 - m_2) + ...]}{N^3 - N}

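Here, m1, m2, … are the sizes of the groups of tied values, i.e., the number of times a particular value is repeated in the X or Y series. One correction term \frac{1}{12}(m^3 - m) is added for each group of tied values, in either series.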

Example:

Calculate the coefficient of rank correlation of the scores obtained by 7 students in an essay writing competition by two judges, X and Y.

X | 15 | 12 | 20 | 16 | 18 | 20 | 26
Y | 10 | 15 | 11 | 11 | 25 | 18 | 30

Solution:

In the given case, there are 7 students, and the scores are ranked in descending order: the 1st rank goes to the highest score and the 7th rank to the lowest score. For instance, for the scores given by Judge X, the 1st rank is given to the score of 26, and for the scores given by Judge Y, the 1st rank is given to the score of 30.

X | Rank R1 | Y | Rank R2 | D = R1 - R2 | D²
15 | 6 | 10 | 7 | -1 | 1
12 | 7 | 15 | 4 | 3 | 9
20 | 2.5 | 11 | 5.5 | -3 | 9
16 | 5 | 11 | 5.5 | -0.5 | 0.25
18 | 4 | 25 | 2 | 2 | 4
20 | 2.5 | 18 | 3 | -0.5 | 0.25
26 | 1 | 30 | 1 | 0 | 0
N = 7, ∑D² = 23.5

Judge X has given a score of 20 to two students, who would otherwise occupy the 2nd and 3rd ranks. Therefore, the average of both ranks, i.e., (2+3)/2 = 2.5, has been given to both students.

Similarly, Judge Y has given a score of 11 to two students, who would otherwise occupy the 5th and 6th ranks. Therefore, the average of both ranks, i.e., (5+6)/2 = 5.5, has been given to both students.

Also, in the X series, the score 20 occurs twice, and in the Y series, the score 11 occurs twice. Therefore, m for the X series (m1) is 2 and m for the Y series (m2) is 2.
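The averaging of tied ranks and the counting of repeated scores can be sketched in Python as follows. The function names `average_ranks_desc` and `tie_sizes` are hypothetical helpers written for this example, not part of any library.

```python
from collections import Counter


def average_ranks_desc(scores):
    """Rank 1 goes to the highest score; tied scores share the mean of their positions."""
    ordered = sorted(scores, reverse=True)
    positions = {}
    for position, score in enumerate(ordered, start=1):
        positions.setdefault(score, []).append(position)
    return [sum(positions[s]) / len(positions[s]) for s in scores]


def tie_sizes(scores):
    """Return m for every score that occurs more than once in the series."""
    return [count for count in Counter(scores).values() if count > 1]


x = [15, 12, 20, 16, 18, 20, 26]
y = [10, 15, 11, 11, 25, 18, 30]

print(average_ranks_desc(x))  # [6.0, 7.0, 2.5, 5.0, 4.0, 2.5, 1.0]
print(average_ranks_desc(y))  # [7.0, 4.0, 5.5, 5.5, 2.0, 3.0, 1.0]
print(tie_sizes(x), tie_sizes(y))  # [2] [2]
```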

r_k = 1 - \frac{6[\sum D^2 + \frac{1}{12}(m_1^3 - m_1) + \frac{1}{12}(m_2^3 - m_2) + ...]}{N^3 - N}

= 1 - \frac{6[23.5 + \frac{1}{12}(2^3 - 2) + \frac{1}{12}(2^3 - 2)]}{7^3 - 7}

= 1 - \frac{6[23.5 + \frac{1}{2} + \frac{1}{2}]}{336}

= 1 - \frac{6 \times 24.5}{336}

= 1 - \frac{147}{336}

= 1 – 0.4375

= 0.5625

Coefficient of Correlation (r_k) = 0.56 (approx.)

The positive coefficient of 0.56 indicates a moderate positive association between the rankings of the two judges.
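As a check on the arithmetic, the tie-corrected formula can be written as a short Python function. This is a sketch under the assumption that the averaged ranks and the tie sizes are already known (here they are copied from the table and explanation above); `tied_rank_correlation` is an illustrative name.

```python
def tied_rank_correlation(ranks_x, ranks_y, tie_sizes):
    """r_k = 1 - 6 * [sum(D^2) + sum((m^3 - m) / 12)] / (N^3 - N)."""
    n = len(ranks_x)
    sum_d2 = sum((r1 - r2) ** 2 for r1, r2 in zip(ranks_x, ranks_y))
    # One correction term per group of tied values, from either series.
    correction = sum((m ** 3 - m) / 12 for m in tie_sizes)
    return 1 - 6 * (sum_d2 + correction) / (n ** 3 - n)


r1 = [6, 7, 2.5, 5, 4, 2.5, 1]    # averaged ranks for Judge X
r2 = [7, 4, 5.5, 5.5, 2, 3, 1]    # averaged ranks for Judge Y

r_k = tied_rank_correlation(r1, r2, tie_sizes=[2, 2])
print(r_k)  # 0.5625
```

With an empty `tie_sizes` list the correction is zero and the function reduces to the formula used in Cases 1 and 2.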

 

