Pearson Correlation Coefficient
Calculator for the linear relationship between two variables, with formulas and examples
Correlation Coefficient Calculator
What is calculated?
The Pearson correlation coefficient r measures the strength and direction of the linear relationship between two variables. It ranges from -1 (perfect negative linear relationship) through 0 (no linear relationship) to +1 (perfect positive linear relationship).
Correlation Info
Properties
Pearson correlation:
- Range: [-1, +1]
- +1 = perfect positive correlation
- 0 = no linear correlation
- -1 = perfect negative correlation
Linear: Measures only linear relationships, not curved or other nonlinear associations.
Interpretation
Related measures
Cosine similarity: Equals the Pearson correlation when both variables are mean-centered
Spearman rank: For monotonic relationships, including nonlinear ones
Kendall tau: Robust to outliers
Formulas for the Pearson correlation coefficient
Basic formula
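In its standard textbook form, the coefficient is the sum of cross-products of deviations divided by the root of the summed squared deviations. The notation (x_i, y_i for the observations, \bar{x}, \bar{y} for the sample means, n for the sample size) is assumed here for illustration:

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\;\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
```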
Covariance form
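Equivalently, r is the covariance scaled by both standard deviations. The sketch below assumes the sample (n - 1) versions of covariance and standard deviation; the population (n) versions give the same r because the factors cancel:

```latex
r = \frac{\operatorname{cov}(X, Y)}{s_X \, s_Y},
\qquad
\operatorname{cov}(X, Y) = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}),
\qquad
s_X = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2}
```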
Computational formula
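Expanding the deviations gives a form that needs only running sums, which is convenient for hand calculation. This is the commonly cited version, using the same assumed notation x_i, y_i, n:

```latex
r = \frac{n \sum x_i y_i - \sum x_i \sum y_i}
         {\sqrt{\left( n \sum x_i^2 - \left(\sum x_i\right)^2 \right)
                \left( n \sum y_i^2 - \left(\sum y_i\right)^2 \right)}}
```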
Z-score form
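In terms of standardized values, r is roughly the average product of z-scores. The form below assumes z-scores built from the sample standard deviation (divisor n - 1), which is why the sum is divided by n - 1:

```latex
r = \frac{1}{n-1} \sum_{i=1}^{n} z_{x,i} \, z_{y,i},
\qquad
z_{x,i} = \frac{x_i - \bar{x}}{s_X},
\quad
z_{y,i} = \frac{y_i - \bar{y}}{s_Y}
```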
Coefficient of determination
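For a simple linear regression with a single predictor, the coefficient of determination is the square of r and gives the share of variance in Y accounted for by the linear fit:

```latex
R^2 = r^2, \qquad 0 \le R^2 \le 1
```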
Fisher's Z-transform
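The transform maps r to an approximately normally distributed quantity, which is typically used to build confidence intervals. The standard error shown is the usual large-sample approximation under bivariate normality; z_{\alpha/2} denotes a standard normal quantile (notation assumed here):

```latex
z = \frac{1}{2} \ln\!\left( \frac{1 + r}{1 - r} \right) = \operatorname{artanh}(r),
\qquad
\operatorname{SE}(z) \approx \frac{1}{\sqrt{n - 3}},
\qquad
r_{\text{lower, upper}} = \tanh\!\left( z \mp \frac{z_{\alpha/2}}{\sqrt{n - 3}} \right)
```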
Detailed calculation example
Example: Correlation([1,2,3,4,5], [2,4,6,8,10])
Given:
- X = [1, 2, 3, 4, 5]
- Y = [2, 4, 6, 8, 10]
- n = 5
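Step 1 - Means: x̄ = (1 + 2 + 3 + 4 + 5) / 5 = 3, ȳ = (2 + 4 + 6 + 8 + 10) / 5 = 6
Step 2 - Deviations: x - x̄ = [-2, -1, 0, 1, 2], y - ȳ = [-4, -2, 0, 2, 4]
Step 3 - Correlation: Σ(x - x̄)(y - ȳ) = 8 + 2 + 0 + 2 + 8 = 20, Σ(x - x̄)² = 10, Σ(y - ȳ)² = 40, so r = 20 / √(10 · 40) = 20 / 20 = 1.0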
Interpretation: Perfect positive correlation (r = 1.0), because Y = 2X for all data points.
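The three steps of this example can be sketched in a few lines of Python. The function name `pearson_r` is illustrative only and is not part of the calculator; the code is a minimal sketch of the deviation formula, not a production implementation.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation via the deviation (basic) formula."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)

    # Step 1: means
    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Step 2: deviations from the means
    dx = [xi - mean_x for xi in x]
    dy = [yi - mean_y for yi in y]

    # Step 3: sum of cross-products over the root of the summed squares
    sxy = sum(a * b for a, b in zip(dx, dy))
    sxx = sum(a * a for a in dx)
    syy = sum(b * b for b in dy)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable is constant")
    return sxy / sqrt(sxx * syy)


r = pearson_r([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
print(round(r, 6))  # 1.0
```

The explicit check for a constant variable matters because the denominator would otherwise be zero.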
Realistic example
Example: Temperature vs. ice cream sales
Data:
Temperature (°C): [20, 22, 25, 28, 30]
Ice cream sales (€): [150, 180, 220, 280, 320]
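Calculation:
Means: x̄ = 25 °C, ȳ = 230 €
Σ(x - x̄)(y - ȳ) = 1150, Σ(x - x̄)² = 68, Σ(y - ȳ)² = 19600
r = 1150 / √(68 · 19600) ≈ 0.996
Interpretation:
Very strong positive linear relationship: sales rise with temperature.
R² = r² ≈ 0.99 → about 99% of the variance in ice cream sales is explained by a linear function of temperature (in this sample)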
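As a cross-check, the same data can be run through the sum-based computational formula. The helper name `pearson_r_sums` is hypothetical; this is a small sketch that assumes both lists have equal length and that neither variable is constant.

```python
from math import sqrt


def pearson_r_sums(x, y):
    """Pearson correlation via the computational (running sums) formula."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


temperature = [20, 22, 25, 28, 30]    # °C
sales = [150, 180, 220, 280, 320]     # €

r = pearson_r_sums(temperature, sales)
r_squared = r ** 2
print(f"r = {r:.3f}, R^2 = {r_squared:.3f}")  # r = 0.996, R^2 = 0.992
```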
Correlation ≠ Causation
Important note: Correlation is not causation
Example - spurious correlation:
Variable A: Number of storks
Variable B: Birth rate
Correlation: r = 0.62 (moderate positive)
Explanation:
Third variable: Rural vs. urban areas
Storks and higher birth rates both occur more often in rural areas.
Conclusion: High correlation does not automatically imply causation. Always consider potential confounders or alternative explanations!
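A small simulation can make the role of a third variable concrete. Everything below is synthetic: the variable names, the sample size, and the noise levels are assumptions chosen for illustration and are not the stork data. Both series depend on a shared "rurality" score but not on each other, and the partial correlation (a standard way to control for one variable) shows what remains once that score is accounted for.

```python
import random
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation via the deviation formula."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


random.seed(0)  # fixed seed so the run is repeatable
n = 2000

# Hypothetical confounder: how rural each region is (synthetic, not real data)
rurality = [random.gauss(0, 1) for _ in range(n)]
# Both variables depend on rurality, but neither influences the other
storks = [z + random.gauss(0, 1) for z in rurality]
births = [z + random.gauss(0, 1) for z in rurality]

r_sb = pearson_r(storks, births)
r_sz = pearson_r(storks, rurality)
r_bz = pearson_r(births, rurality)

# Partial correlation of storks and births, controlling for rurality
r_partial = (r_sb - r_sz * r_bz) / sqrt((1 - r_sz ** 2) * (1 - r_bz ** 2))

print(f"raw r = {r_sb:.2f}, partial r = {r_partial:.2f}")
```

With this setup the raw correlation should come out clearly positive while the partial correlation stays near zero, which is the pattern a confounder produces.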
Practical applications
Statistics & research
- Hypothesis validation
- Exploratory data analysis
- Variable selection
- Check multicollinearity
Finance
- Portfolio diversification
- Asset correlations
- Risk management
- Hedging strategies
Machine Learning
- Feature selection
- Dimensionality reduction
- Preprocessing step
- Model evaluation
Mathematical properties
Basic properties
- Range: -1 ≤ r ≤ +1
- Symmetry: r(X,Y) = r(Y,X)
- Linear transformation: Invariant under positive affine transformations (X → aX + b with a > 0); a negative scale factor a flips the sign of r
- Unitless: Independent of measurement units
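These properties are easy to verify numerically. The data values below are arbitrary illustration values, and `pearson_r` is a hypothetical helper, so treat this as a quick sanity check rather than a proof.

```python
from math import isclose, sqrt


def pearson_r(x, y):
    """Pearson correlation via the deviation formula."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


x = [1.0, 2.0, 4.0, 7.0, 11.0]   # arbitrary example values
y = [2.0, 1.0, 5.0, 6.0, 9.0]
r = pearson_r(x, y)

# Symmetry: swapping the arguments does not change r
symmetric = pearson_r(y, x)
# Unit change with a positive scale factor (e.g. Celsius to Fahrenheit)
rescaled = pearson_r([1.8 * a + 32 for a in x], y)
# A negative scale factor flips the sign
flipped = pearson_r([-a for a in x], y)

print(-1.0 <= r <= 1.0)        # True
print(isclose(r, symmetric))   # True
print(isclose(r, rescaled))    # True
print(isclose(r, -flipped))    # True
```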
Statistical properties
- Linearity: Captures only linear relationships
- Sensitive to outliers: A single extreme value can change r substantially
- Normality: Significance tests and confidence intervals for r assume bivariate normality
- Effect size: Measure of practical significance
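Two of these properties can be illustrated with toy data (all values below are made up for the demonstration). A perfect but curved relationship yields r = 0, and a single extreme point is enough to turn a perfect positive correlation negative.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation via the deviation formula."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Linearity: y = x^2 is a perfect relationship, but not a linear one
x_sym = [-2, -1, 0, 1, 2]
y_quad = [v * v for v in x_sym]
r_quad = pearson_r(x_sym, y_quad)

# Outlier sensitivity: start from y = 2x, then append one extreme point
x_clean = [1, 2, 3, 4, 5]
y_clean = [2, 4, 6, 8, 10]
r_clean = pearson_r(x_clean, y_clean)
r_outlier = pearson_r(x_clean + [20], y_clean + [0])

print(f"parabola:     r = {r_quad:.2f}")     # 0.00
print(f"clean data:   r = {r_clean:.2f}")    # 1.00
print(f"with outlier: r = {r_outlier:.2f}")  # -0.49
```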
Assumptions
Data type: At least interval-scaled data
Distribution: Bivariate normality (needed only for significance tests and confidence intervals)
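Under bivariate normality, the usual test of the null hypothesis that the population correlation ρ is zero uses a t statistic with n - 2 degrees of freedom. The symbols ρ and H_0 are notation assumed here:

```latex
t = \frac{r \sqrt{n - 2}}{\sqrt{1 - r^2}},
\qquad
t \sim t_{n-2} \quad \text{under } H_0 : \rho = 0
```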
Interpretation guide
Correlation strength by Cohen (1988)
Positive correlations:
Negative correlations:
Note: This categorization is context-dependent. Different fields (e.g. psychology) may use other standards.
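As a rough sketch, the widely cited Cohen (1988) benchmarks of 0.10 (small), 0.30 (medium), and 0.50 (large) can be turned into a lookup. The function name `cohen_label` and the "negligible" label for |r| below 0.10 are assumptions added here, and the cut-offs should be adapted to the conventions of the field in question.

```python
def cohen_label(r):
    """Classify r using Cohen's conventional benchmarks (0.10 / 0.30 / 0.50)."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie in [-1, 1]")

    size = abs(r)  # strength depends only on the magnitude
    if size >= 0.50:
        strength = "large"
    elif size >= 0.30:
        strength = "medium"
    elif size >= 0.10:
        strength = "small"
    else:
        strength = "negligible"  # label assumed here, not one of Cohen's three

    if r > 0:
        direction = "positive"
    elif r < 0:
        direction = "negative"
    else:
        direction = "none"
    return strength, direction


print(cohen_label(0.35))   # ('medium', 'positive')
print(cohen_label(-0.80))  # ('large', 'negative')
print(cohen_label(0.05))   # ('negligible', 'positive')
```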