Aquileo | Spearman's Rank Correlation

Spearman’s Rank Correlation is a statistical measure used to find the strength and direction of association between two ranked variables. It checks how well the relationship between two variables can be described using a monotonic function.

Works on ranks instead of actual values
Used when data is not normally distributed
Suitable for ordinal data
Measures monotonic relationships, not just linear ones

Mathematical Intuition

Spearman’s Rank Correlation, denoted by ρ (rho), measures the strength and direction of association between two variables based on their ranks instead of their actual values. Its value ranges from -1 to +1:

+1: perfect positive monotonic relationship, i.e. one variable always increases as the other increases
-1: perfect negative monotonic relationship, i.e. one variable always decreases as the other increases
0: no monotonic relationship

A monotonic relationship is one in which the variables move together in a single direction: as one increases, the other either never decreases or never increases, though not necessarily at a constant rate.
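To make the idea of a monotonic relationship concrete, here is a minimal sketch in plain Python. The helper name `is_monotonic_increasing` is made up for this illustration; it simply orders the observations by x and checks that y never decreases.

```python
def is_monotonic_increasing(x, y):
    """Return True if y never decreases as x increases (hypothetical helper)."""
    pairs = sorted(zip(x, y))                # order the observations by x
    ys = [pair[1] for pair in pairs]
    return all(a <= b for a, b in zip(ys, ys[1:]))

x = [1, 2, 3, 4, 5]
cubic = [v ** 3 for v in x]                  # non-linear, but always increasing
parabola = [(v - 3) ** 2 for v in x]         # falls, then rises: not monotonic

print(is_monotonic_increasing(x, cubic))     # True
print(is_monotonic_increasing(x, parabola))  # False
```

The cubic data is not linear, yet it is monotonic, which is the kind of relationship Spearman's correlation is designed to capture.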

Spearman's Correlation formula

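\rho = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)}

where:

d_i: difference between the two ranks of the i-th observation
n: total number of observations

This simplified formula is exact when there are no tied ranks; with ties it is only an approximation.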
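The formula translates almost directly into code. The sketch below assumes the ranks have already been computed and contain no ties; the function name `spearman_simple` is illustrative and not part of any library.

```python
def spearman_simple(rank_x, rank_y):
    """Simplified Spearman formula: 1 - 6 * sum(d^2) / (n * (n^2 - 1))."""
    if len(rank_x) != len(rank_y):
        raise ValueError("rank lists must have the same length")
    n = len(rank_x)
    if n < 2:
        raise ValueError("at least two observations are required")
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

same_order = spearman_simple([1, 2, 3, 4, 5], [1, 2, 3, 4, 5])
reversed_order = spearman_simple([1, 2, 3, 4, 5], [5, 4, 3, 2, 1])
print(same_order, reversed_order)  # 1.0 -1.0
```

Identical rankings give +1 and fully reversed rankings give -1, matching the two ends of the range.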

When to Use Spearman’s Rank Correlation

The relationship is non-linear
The data follows a monotonic trend
The variables contain ranks or scores
The data is not normally distributed
The data is ordinal
The dataset contains outliers

Calculating Spearman’s Rank Correlation

Step 1: Original Data

Number	X1	Y1
1	7	5
2	6	4
3	4	5
4	5	6
5	8	10
6	7	7
7	10	9
8	3	2
9	9	8
10	2	1

Step 2: Convert Data into Ranks

Ranks are assigned by sorting values in ascending order. If values are tied, the average rank is assigned.

Ranking X1:

Sorted X1 values: 2, 3, 4, 5, 6, 7, 7, 8, 9, 10
The two 7s are tied at positions 6 and 7, so both get the average rank (6 + 7) / 2 = 6.5

Ranking Y1:

Ranks are assigned using the same method.
Sorted Y1 values: 1, 2, 4, 5, 5, 6, 7, 8, 9, 10
The two 5s are tied at positions 4 and 5, so both get the average rank (4 + 5) / 2 = 4.5
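The average-rank rule can be sketched in a few lines of plain Python. The helper `average_ranks` is a hypothetical name used only here; it is written for readability rather than speed.

```python
def average_ranks(values):
    """Rank in ascending order; tied values share the mean of their positions."""
    ordered = sorted(values)
    # ordered.index(v) is the 0-based first position of v and ordered.count(v)
    # is the size of its tie group, so the mean 1-based position is:
    return [ordered.index(v) + (ordered.count(v) + 1) / 2 for v in values]

x1 = [7, 6, 4, 5, 8, 7, 10, 3, 9, 2]
y1 = [5, 4, 5, 6, 10, 7, 9, 2, 8, 1]

print(average_ranks(x1))  # [6.5, 5.0, 3.0, 4.0, 8.0, 6.5, 10.0, 2.0, 9.0, 1.0]
print(average_ranks(y1))  # [4.5, 3.0, 4.5, 6.0, 10.0, 7.0, 9.0, 2.0, 8.0, 1.0]
```

Both tied pairs receive the half ranks derived by hand: 6.5 for the two 7s in X1 and 4.5 for the two 5s in Y1.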

Step 3: Rank Table with Differences

No.	Rank X1	Rank Y1	d	d²
1	6.5	4.5	2	4
2	5	3	2	4
3	3	4.5	-1.5	2.25
4	4	6	-2	4
5	8	10	-2	4
6	6.5	7	-0.5	0.25
7	10	9	1	1
8	2	2	0	0
9	9	8	1	1
10	1	1	0	0

\sum d_i^2 = 20.5
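The d and d² columns can be reproduced with a short script. The rank lists below are copied from the table; the variable names are illustrative.

```python
rank_x1 = [6.5, 5, 3, 4, 8, 6.5, 10, 2, 9, 1]
rank_y1 = [4.5, 3, 4.5, 6, 10, 7, 9, 2, 8, 1]

d = [rx - ry for rx, ry in zip(rank_x1, rank_y1)]
d_squared = [diff ** 2 for diff in d]

# Print one row per observation: number, d, d squared
for no, (diff, sq) in enumerate(zip(d, d_squared), start=1):
    print(no, diff, sq)

print("sum of d =", sum(d))            # sum of d = 0.0
print("sum of d^2 =", sum(d_squared))  # sum of d^2 = 20.5
```

Because both sets of ranks add up to the same total, the differences always sum to zero, which is a quick way to catch a ranking mistake before squaring.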

Step 4: Apply the Formula

\rho = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)}

\rho = 1 - \frac{6 \times 20.5}{10(10^2 - 1)}

\rho = 1 - 0.12424

\rho \approx 0.88

This indicates a strong positive monotonic relationship.
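Since this dataset contains tied ranks, it may be worth comparing the simplified formula with the Pearson correlation computed directly on the ranks, which is how Spearman's ρ is defined when ties are present. The sketch below uses only the standard library, and its variable names are illustrative.

```python
from math import sqrt

rank_x1 = [6.5, 5, 3, 4, 8, 6.5, 10, 2, 9, 1]
rank_y1 = [4.5, 3, 4.5, 6, 10, 7, 9, 2, 8, 1]
n = len(rank_x1)

# Simplified formula, as used in the worked example
sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x1, rank_y1))
rho_simple = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

# Pearson correlation of the ranks, which accounts for the ties
mean_x = sum(rank_x1) / n
mean_y = sum(rank_y1) / n
cov = sum((rx - mean_x) * (ry - mean_y) for rx, ry in zip(rank_x1, rank_y1))
var_x = sum((rx - mean_x) ** 2 for rx in rank_x1)
var_y = sum((ry - mean_y) ** 2 for ry in rank_y1)
rho_ranks = cov / sqrt(var_x * var_y)

print(round(rho_simple, 4))  # 0.8758
print(round(rho_ranks, 4))   # 0.875
```

With only one tied pair in each variable the two values differ in the fourth decimal place, so the conclusion of a strong positive monotonic relationship is unchanged.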

Calculating Spearman’s Rank Correlation in Python

Step 1: Sample Data

Python

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Sample data: hours studied and the marks obtained
data = {
    'Hours_Studied': [2, 4, 6, 8, 10],
    'Marks': [50, 55, 65, 70, 90]
}

df = pd.DataFrame(data)
df  # in a notebook, this displays the DataFrame

Output:

	Hours_Studied	Marks
0	2	50
1	4	55
2	6	65
3	8	70
4	10	90

Step 2: Apply Spearman’s Correlation

Python

spearman_corr = df['Hours_Studied'].corr(df['Marks'], method='spearman')
print("Spearman Correlation:", spearman_corr)

Output:

Spearman Correlation: 0.9999999999999999

The data shows a perfect increasing rank order, so the correlation is 1. The printed value 0.9999999999999999 differs from 1 only because of floating-point rounding.
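As a cross-check that does not depend on pandas, the same result can be derived by hand with the simplified formula. This is a minimal sketch; `simple_ranks` is a hypothetical helper that is only valid when the data has no ties, as is the case here.

```python
hours = [2, 4, 6, 8, 10]
marks = [50, 55, 65, 70, 90]

def simple_ranks(values):
    """Ranks 1..n in ascending order; only valid when there are no ties."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]

rank_h = simple_ranks(hours)
rank_m = simple_ranks(marks)

n = len(hours)
sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_h, rank_m))
rho = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

print(rank_h, rank_m)  # [1, 2, 3, 4, 5] [1, 2, 3, 4, 5]
print(rho)             # 1.0
```

Both columns receive identical ranks, so every rank difference is zero and ρ is exactly 1.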

Step 3: Visualizing Spearman Correlation

The heatmap displays the correlation between only the selected columns
It helps visually confirm the strength of the relationship

Python

corr_data = df[['Hours_Studied', 'Marks']].corr(method='spearman')

sns.heatmap(corr_data, annot=True, cmap='coolwarm')
plt.title("Spearman Rank Correlation Heatmap")
plt.show()

Output: [heatmap of the 2 × 2 Spearman correlation matrix for Hours_Studied and Marks]

Step 4: Scatter Plot for Monotonic Relationship

Python

plt.scatter(df['Hours_Studied'], df['Marks'])
plt.xlabel("Hours Studied")
plt.ylabel("Marks")
plt.title("Monotonic Relationship")
plt.show()

Output:

The scatter plot shows a consistent upward trend, confirming the monotonic relationship.

Difference Between Pearson and Spearman Correlation

Let's see the difference between Pearson Correlation and Spearman Correlation:

Aspect	Pearson Correlation	Spearman Correlation
Data type	Continuous	Ordinal or continuous
Uses actual values	Yes	No
Uses ranks	No	Yes
Assumes normal distribution	Yes	No
Relationship type	Linear	Monotonic
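The "Linear" versus "Monotonic" row can be illustrated with data that is perfectly monotonic but not linear. The sketch below uses only the standard library; `pearson` and `ranks_no_ties` are illustrative helpers, and the ranking shortcut assumes the data has no ties.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

def ranks_no_ties(values):
    """Ranks 1..n in ascending order; assumes all values are distinct."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]

x = list(range(1, 11))
y = [v ** 3 for v in x]          # always increasing, but curved

pearson_r = pearson(x, y)
spearman_rho = pearson(ranks_no_ties(x), ranks_no_ties(y))

print(round(pearson_r, 3))       # 0.928
print(round(spearman_rho, 3))    # 1.0
```

Pearson falls below 1 because the points do not lie on a straight line, while Spearman reports a perfect relationship because the rank order of x and y is identical.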

Advantages

Works with non-linear monotonic data
Less affected by outliers
Simple to compute and interpret
No strict distribution assumptions

Limitations

Ignores actual data values
Cannot detect non-monotonic patterns
Less informative for precise numerical relationships
Less accurate for very small datasets
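The limitation on non-monotonic patterns can be seen with a simple U-shaped example: y depends entirely on x, yet the rank correlation is zero. The sketch below is standard-library Python with hypothetical helper names, and it computes Spearman's ρ as the Pearson correlation of average ranks so that ties are handled.

```python
from math import sqrt

def average_ranks(values):
    """Rank in ascending order; tied values share the mean of their positions."""
    ordered = sorted(values)
    return [ordered.index(v) + (ordered.count(v) + 1) / 2 for v in values]

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

x = [-3, -2, -1, 0, 1, 2, 3]
y = [v ** 2 for v in x]          # U-shaped: falls, then rises

rank_x = average_ranks(x)
rank_y = average_ranks(y)
rho = pearson(rank_x, rank_y)

print(rank_y)  # [6.5, 4.5, 2.5, 1.0, 2.5, 4.5, 6.5]
print(rho)     # 0.0
```

A value of zero here means only that there is no monotonic trend, not that the variables are unrelated, so a scatter plot remains a useful companion to the coefficient.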

Applications

Ranking students based on marks
Comparing search result rankings
Measuring customer satisfaction levels
Behavioral and social science studies
Survey data analysis
Feature ranking in ML
Medical research studies
