Pearson Correlation Coefficient

Calculator for the linear relationship between two variables, with formulas and examples

Correlation Coefficient Calculator

What is calculated?

The Pearson correlation coefficient measures the strength and direction of the linear relationship between two variables. Values range from -1 (perfect negative correlation) through 0 (no linear correlation) to +1 (perfect positive correlation).
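The same computation in Python, as a minimal sketch using SciPy (not part of this calculator, but it reproduces what the tool reports):

```python
# Minimal sketch: Pearson r for two equally long lists, using scipy.stats.pearsonr.
from scipy.stats import pearsonr

x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]

r, p_value = pearsonr(x, y)  # coefficient and two-sided p-value
print(f"r = {r:.3f}, p = {p_value:.3g}")  # r = 1.000 for this perfectly linear data
```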

Correlation Info

Properties

Pearson correlation:

  • Range: [-1, +1]
  • +1 = perfect positive correlation
  • 0 = no linear correlation
  • -1 = perfect negative correlation

Linear: Measures only linear relationships, not curved or other nonlinear associations.
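To illustrate the linearity caveat, the following sketch (assuming NumPy is available) shows that a perfect but purely quadratic relationship yields r = 0:

```python
import numpy as np

# y depends perfectly on x, but the relationship is quadratic, not linear.
x = np.array([-2, -1, 0, 1, 2], dtype=float)
y = x ** 2  # [4, 1, 0, 1, 4]

r = np.corrcoef(x, y)[0, 1]
print(round(r, 10))  # 0.0, Pearson r misses this nonlinear association entirely
```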

Interpretation
|r| ≥ 0.7: Strong correlation
0.3 ≤ |r| < 0.7: Moderate correlation
0.1 ≤ |r| < 0.3: Weak correlation
|r| < 0.1: Practically no linear correlation
Related measures

Cosine similarity: Angle-based similarity between vectors
Spearman rank: For monotonic, not necessarily linear, relationships
Kendall tau: Rank-based and more robust to outliers
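For comparison, SciPy also implements the rank-based measures listed above; a quick sketch with monotonic but nonlinear data (chosen only for illustration):

```python
from scipy.stats import pearsonr, spearmanr, kendalltau

# Monotonic but nonlinear data: y grows as x cubed.
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]

print("Pearson :", pearsonr(x, y)[0])    # ≈ 0.94, linearity is imperfect
print("Spearman:", spearmanr(x, y)[0])   # 1.0, the ranks agree perfectly
print("Kendall :", kendalltau(x, y)[0])  # 1.0, all pairs are concordant
```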


Formulas for the Pearson correlation coefficient

Basic formula
\[r = \frac{\sum(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum(x_i - \bar{x})^2 \sum(y_i - \bar{y})^2}}\] Standard Pearson correlation
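A direct translation of the basic formula into Python (a sketch using only the standard library; the function name is illustrative):

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson r via the deviation (basic) formula."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y need the same length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    num = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    den = sqrt(sum((xi - mean_x) ** 2 for xi in x) * sum((yi - mean_y) ** 2 for yi in y))
    return num / den

print(pearson_r([1, 2, 3, 4, 5], [2, 4, 6, 8, 10]))  # 1.0
```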
Covariance form
\[r = \frac{\text{Cov}(X,Y)}{\sigma_X \sigma_Y}\] With covariance and standard deviations
Computational formula
\[r = \frac{n\sum xy - \sum x \sum y}{\sqrt{(n\sum x^2 - (\sum x)^2)(n\sum y^2 - (\sum y)^2)}}\] Single-pass computation from raw sums (can lose precision when values are large relative to their spread)
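The single-pass form translates directly into code as well (a sketch; as noted, it can lose precision for large values):

```python
from math import sqrt

def pearson_r_onepass(x, y):
    """Pearson r from raw sums (computational formula), accumulated in one pass."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    syy = sum(b * b for b in y)
    return (n * sxy - sx * sy) / sqrt((n * sxx - sx ** 2) * (n * syy - sy ** 2))

print(pearson_r_onepass([20, 22, 25, 28, 30], [150, 180, 220, 280, 320]))  # ≈ 0.996
```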
Z-score form
\[r = \frac{1}{n-1}\sum_{i=1}^n z_{x_i} z_{y_i}\] Using standardized values
Coefficient of determination
\[R^2 = r^2\] Proportion of explained variance
Fisher's Z-transform
\[z = \frac{1}{2}\ln\left(\frac{1+r}{1-r}\right)\] For significance testing
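A sketch of how Fisher's transform is typically used for inference, here building an approximate 95% confidence interval for r (the 1/√(n − 3) standard error is the standard large-sample result; the function name is illustrative):

```python
from math import atanh, tanh, sqrt

def fisher_ci(r, n, z_crit=1.96):
    """Approximate 95% confidence interval for r via Fisher's z-transform."""
    z = atanh(r)               # z = 0.5 * ln((1 + r) / (1 - r))
    se = 1.0 / sqrt(n - 3)     # large-sample standard error of z
    lo, hi = z - z_crit * se, z + z_crit * se
    return tanh(lo), tanh(hi)  # transform the interval back to the r scale

print(fisher_ci(r=0.8, n=30))  # roughly (0.62, 0.90)
```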

Detailed calculation example

Example: Correlation([1,2,3,4,5], [2,4,6,8,10])

Given:

  • X = [1, 2, 3, 4, 5]
  • Y = [2, 4, 6, 8, 10]
  • n = 5

Step 1 - Means:

\[\bar{x} = \frac{1+2+3+4+5}{5} = 3\] \[\bar{y} = \frac{2+4+6+8+10}{5} = 6\]

Step 2 - Deviations:

\[\sum(x_i - \bar{x})(y_i - \bar{y}) = 20\] \[\sum(x_i - \bar{x})^2 = 10\] \[\sum(y_i - \bar{y})^2 = 40\]

Step 3 - Correlation:

\[r = \frac{20}{\sqrt{10 \cdot 40}} = \frac{20}{20} = 1.0\]

Interpretation: Perfect positive correlation (r = 1.0), because Y = 2X for all data points.
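The same steps reproduced in Python (a sketch that mirrors the hand calculation above):

```python
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]
n = len(x)

mean_x = sum(x) / n  # 3.0
mean_y = sum(y) / n  # 6.0

sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))  # 20.0
sxx = sum((a - mean_x) ** 2 for a in x)                       # 10.0
syy = sum((b - mean_y) ** 2 for b in y)                       # 40.0

r = sxy / (sxx * syy) ** 0.5
print(r)  # 1.0
```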

Realistic example

Example: Temperature vs. ice cream sales

Data:

Temperature (°C): [20, 22, 25, 28, 30]
Ice cream sales (€): [150, 180, 220, 280, 320]

Calculation:

\[\bar{x} = 25°C, \bar{y} = 230€\] \[\sum(x_i - \bar{x})(y_i - \bar{y}) = 1150, \quad \sum(x_i - \bar{x})^2 = 68, \quad \sum(y_i - \bar{y})^2 = 19600\] \[r = \frac{1150}{\sqrt{68 \cdot 19600}} \approx 0.996\]

Interpretation:

Very strong positive correlation (r ≈ 0.996)
R² ≈ 0.99 → about 99% of the variance in ice cream sales is explained by temperature
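Checking the result with NumPy (a sketch; the arrays are the example data above):

```python
import numpy as np

temperature = np.array([20, 22, 25, 28, 30])   # °C
sales = np.array([150, 180, 220, 280, 320])    # €

r = np.corrcoef(temperature, sales)[0, 1]
print(f"r  = {r:.3f}")     # ≈ 0.996
print(f"R² = {r**2:.3f}")  # ≈ 0.992
```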

Correlation ≠ Causation

Important note: Correlation is not causation

Example - spurious correlation:

Variable A: Number of storks
Variable B: Birth rate
Correlation: r = 0.62 (moderate positive)

Explanation:

Third variable: Rural vs. urban areas
Storks and higher birth rates both occur more often in rural areas.

Conclusion: High correlation does not automatically imply causation. Always consider potential confounders or alternative explanations!

Practical applications

Statistics & research
  • Hypothesis validation
  • Exploratory data analysis
  • Variable selection
  • Check multicollinearity
Finance
  • Portfolio diversification
  • Asset correlations
  • Risk management
  • Hedging strategies
Machine Learning
  • Feature selection
  • Dimensionality reduction
  • Preprocessing step
  • Model evaluation

Mathematical properties

Basic properties
  • Range: -1 ≤ r ≤ +1
  • Symmetry: r(X,Y) = r(Y,X)
  • Linear transformation: |r| is invariant under affine transformations aX + b; a negative scale factor flips the sign (see the sketch after this list)
  • Unitless: Independent of measurement units
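A quick check of the affine-transformation property (a sketch; the synthetic data is only for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=100)
y = 2 * x + rng.normal(size=100)  # linearly related with noise

r_xy      = np.corrcoef(x, y)[0, 1]
r_scaled  = np.corrcoef(5 * x + 3, y)[0, 1]   # positive scale factor: r unchanged
r_flipped = np.corrcoef(-5 * x + 3, y)[0, 1]  # negative scale factor: sign flips

print(np.isclose(r_xy, r_scaled))    # True
print(np.isclose(r_xy, -r_flipped))  # True
```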
Statistical properties
  • Linearity: Only linear relationships
  • Sensitive to outliers: Affected by extreme values
  • Normality: Significance tests assume bivariate normality
  • Effect size: Measure of practical significance
Assumptions

Data type: At least interval-scaled data

Distribution: Bivariate normality is required for significance tests

Interpretation guide

Correlation strength by Cohen (1988)

Positive correlations:

r ≥ 0.7: Strong positive correlation
0.3 ≤ r < 0.7: Moderate positive correlation
0.1 ≤ r < 0.3: Weak positive correlation

Negative correlations:

r ≤ -0.7: Strong negative correlation
-0.7 < r ≤ -0.3: Moderate negative correlation
-0.3 < r ≤ -0.1: Weak negative correlation
|r| < 0.1: Practically no linear correlation

Note: This categorization is context-dependent. Different fields (e.g. psychology) may use other standards.
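For convenience, the thresholds above can be wrapped in a small helper (a sketch; the labels mirror this guide and are not part of any standard library):

```python
def correlation_strength(r: float) -> str:
    """Classify r using the Cohen-style thresholds listed above."""
    magnitude = abs(r)
    if magnitude < 0.1:
        return "practically no linear correlation"
    if magnitude >= 0.7:
        strength = "strong"
    elif magnitude >= 0.3:
        strength = "moderate"
    else:
        strength = "weak"
    direction = "positive" if r > 0 else "negative"
    return f"{strength} {direction} correlation"

print(correlation_strength(0.98))   # strong positive correlation
print(correlation_strength(-0.45))  # moderate negative correlation
print(correlation_strength(0.05))   # practically no linear correlation
```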