Calculate Skewness
Online calculator to compute the asymmetry (skewness) of a data distribution
Skewness
Skewness is a measure of asymmetry of a distribution. It indicates whether the distribution is skewed to the left or right.
Skewness Visualization
[Figure: distributions with left skew (g < 0), symmetry (g = 0), and right skew (g > 0). Skewness shows on which side the "tail" is longer.]
What is Skewness?
Skewness is an important measure in descriptive statistics:
- Definition: Measure of asymmetry of a probability distribution
- Designation: Also called skew or asymmetry measure
- Moment: Based on the third standardized moment
- Property: Shows direction and strength of asymmetry
- Application: Distribution analysis, data quality, model selection
- Interpretation: Negative, zero, or positive
Types of Skewness
Three types of skewness are distinguished based on sign:
Left Skew
g < 0: The left tail of the distribution is longer than the right.
For unimodal distributions, the mode typically lies to the right of the mean.
Example: Age at death in developed countries
Symmetric
g ≈ 0: The distribution is symmetric. For a unimodal distribution, mean, median, and mode coincide.
Example: Normal distribution, body height
Right Skew
g > 0: The right tail of the distribution is longer than the left.
For unimodal distributions, the mode typically lies to the left of the mean.
Example: Income, wealth
Applications of Skewness
Skewness is used in many fields:
Data Analysis and Statistics
- Determine distribution shape
- Normality testing (normal distribution has g = 0)
- Model selection (e.g., log-normal for right skew)
- Outlier detection (extreme skewness can indicate outliers)
Business and Finance
- Income distribution (typically right skewed)
- Stock return distribution
- Risk analysis (asymmetric risk)
- Wealth distribution
Formulas for Calculating Skewness
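Sample Skewness (g)
\[g = \frac{1}{n}\sum_{i=1}^{n}\left(\frac{x_i - \overline{x}}{s}\right)^3\]
Biased estimator - for descriptive statistics
Population Skewness (G) - Corrected
\[G = \frac{n}{(n-1)(n-2)}\sum_{i=1}^{n}\left(\frac{x_i - \overline{x}}{s}\right)^3\]
Bias-corrected estimator - for inferential statistics
In the usual convention, s is computed with divisor n for g and with divisor n - 1 for G.
Alternative Formula with Moments
\[g = \frac{m_3}{s^3}, \qquad m_3 = \frac{1}{n}\sum_{i=1}^{n}\left(x_i - \overline{x}\right)^3\]
m₃ = third central moment
Relationship to Measures of Location
Pearson Skewness (Approximation)
\[\text{Skewness} \approx \frac{3\,(\overline{x} - \text{Median})}{s}\]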
Symbol Explanations
| \(g\) | Sample skewness |
| \(G\) | Population skewness |
| \(x_i\) | Individual data value |
| \(\overline{x}\) | Arithmetic mean |
| \(s\) | Standard deviation |
| \(n\) | Number of values |
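The two estimators can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's actual implementation: the function names `sample_skewness` and `corrected_skewness` are made up for this example, and it assumes that s uses divisor n for g and divisor n - 1 for G, which is the common convention.

```python
import math


def sample_skewness(values):
    """Biased skewness g: mean of the cubed z-scores (s with divisor n)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    mean = sum(values) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in values) / n)
    if s == 0:
        raise ValueError("skewness is undefined when all values are equal")
    return sum(((x - mean) / s) ** 3 for x in values) / n


def corrected_skewness(values):
    """Corrected skewness G: factor n/((n-1)(n-2)), s with divisor n-1."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least three values")
    mean = sum(values) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))
    if s == 0:
        raise ValueError("skewness is undefined when all values are equal")
    factor = n / ((n - 1) * (n - 2))
    return factor * sum(((x - mean) / s) ** 3 for x in values)


# Hypothetical data with a long right tail
data = [2, 4, 4, 5, 12]
print(sample_skewness(data), corrected_skewness(data))
```

For data with a long right tail, such as the hypothetical list above, both functions return a positive value, and G is larger in magnitude than g.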
Example Calculations for Skewness
Example 1: Nearly Symmetric Distribution
Calculate: Skewness of the data 2, 5, 8, 7, 4 (n = 5)
1. Mean & Standard Deviation
Mean:
x̄ = (2 + 5 + 8 + 7 + 4) / 5 = 5.2
Standard deviation (divisor n):
s = √(22.8 / 5) = √4.56 ≈ 2.135
2. Cube Z-values
| z₁³ = ((2-5.2)/s)³ ≈ -3.365 |
| z₂³ = ((5-5.2)/s)³ ≈ -0.001 |
| z₃³ = ((8-5.2)/s)³ ≈ 2.254 |
| z₄³ = ((7-5.2)/s)³ ≈ 0.599 |
| z₅³ = ((4-5.2)/s)³ ≈ -0.177 |
3. Calculate Skewness
Sum: -0.690
\[g = \frac{-0.690}{5} \approx \color{blue}{-0.14}\]
Interpretation:
Nearly symmetric (|g| < 0.5, with a very slight left skew)
Example 2: Right-Skewed Distribution (Income)
Typical for income distributions
Key Statistics
| Mean: | 3.6 |
| Median: | 2 |
| Mode: | 2 |
| Standard Deviation: | 3.29 |
Mode < Median < Mean → Right skew
Skewness Calculation
After calculation:
g ≈ 1.83
Strongly right-skewed!
The value 10 (outlier to the right) pulls the mean far to the right and creates strong right skewness.
Example 3: Left-Skewed Distribution (Exam Results)
Typical for easy exams
Key Statistics
| Mean: | 87.14 |
| Median: | 92 |
| Mode: | - |
| Standard Deviation: | 17.77 |
Mean < Median → Left skew
Skewness Calculation
After calculation:
g ≈ -1.45
Strongly left-skewed!
Most students scored well (85-100), but one outlier with 50 points pulls the distribution to the left. This is typical for easy exams where most students perform well.
Mathematical Foundations of Skewness
Skewness is a fundamental measure to describe the shape of a distribution and is based on the third standardized moment.
Properties of Skewness
Skewness has characteristic mathematical properties:
- Dimensionless: Through standardization, skewness is a pure number without units
- Third Moment: Based on cubic deviations from the mean
- Symmetry: Symmetric distributions have skewness = 0
- Sensitivity: Very sensitive to outliers (cubic weighting)
- Range: Theoretically from -∞ to +∞, practically usually between -3 and +3
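The dimensionless property can be checked numerically. The sketch below is illustrative only: the helper `skewness` (biased form, standard deviation with divisor n) and the height values are assumptions made for this example. Changing the unit or shifting the data leaves the result unchanged, while mirroring the data flips the sign.

```python
import math


def skewness(values):
    """Biased skewness: mean of the cubed z-scores (s with divisor n)."""
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in values) / n)
    return sum(((x - mean) / s) ** 3 for x in values) / n


# Hypothetical body heights in metres, with one tall outlier
heights_m = [1.60, 1.65, 1.70, 1.72, 1.95]

heights_cm = [100 * h for h in heights_m]   # change of unit
shifted = [h + 10 for h in heights_m]       # shift by a constant
mirrored = [-h for h in heights_m]          # reflect around zero

print(skewness(heights_m))
print(skewness(heights_cm))   # same value up to rounding error
print(skewness(shifted))      # same value up to rounding error
print(skewness(mirrored))     # same magnitude, opposite sign
```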
Interpretation of Different Values
g < -1 or g > 1
Highly skewed distribution
Clear asymmetry
Normal distribution unlikely
-1 ≤ g ≤ -0.5 or 0.5 ≤ g ≤ 1
Moderately skewed distribution
Clearly recognizable asymmetry
Deviation from normal distribution
-0.5 < g < 0.5
Nearly symmetric
Low asymmetry
Normal distribution possible
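The thresholds above translate directly into a small lookup. The function name `classify_skewness` and the returned labels are chosen for this sketch; the boundaries ±0.5 and ±1 are the ones listed above.

```python
def classify_skewness(g):
    """Map a skewness value to the rule-of-thumb categories listed above."""
    if g < -1 or g > 1:
        return "highly skewed"
    if g <= -0.5 or g >= 0.5:
        return "moderately skewed"
    return "nearly symmetric"


# Values taken from the three examples on this page
for g in (0.08, 1.83, -1.45):
    print(g, classify_skewness(g))
```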
Relationship to Measures of Location
Pearson's Rule (Rule of Thumb)
For skewed distributions, there is a relationship between mean, median, and mode:
- Right Skew: Mode < Median < Mean
- Symmetric: Mode = Median = Mean
- Left Skew: Mean < Median < Mode
Approximation by Karl Pearson:
Skewness ≈ 3 · (Mean - Median) / Standard Deviation
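Pearson's approximation needs only three summary statistics, so it is easy to sketch in code. The function names below are hypothetical, and the use of the population standard deviation (`statistics.pstdev`) in the data-based helper is an assumption, since the text does not say which standard deviation is meant.

```python
import statistics


def pearson_skewness(mean, median, std_dev):
    """Pearson's approximation: 3 * (mean - median) / standard deviation."""
    if std_dev <= 0:
        raise ValueError("standard deviation must be positive")
    return 3 * (mean - median) / std_dev


def pearson_skewness_from_data(values):
    """Same approximation computed from raw data (assumes divisor-n SD)."""
    return pearson_skewness(
        statistics.mean(values),
        statistics.median(values),
        statistics.pstdev(values),
    )


# Summary statistics quoted in Example 2: mean 3.6, median 2, SD 3.29
approx = pearson_skewness(3.6, 2, 3.29)
print(round(approx, 2))  # 1.46
```

The approximation (about 1.46) has the same sign as the value g ≈ 1.83 reported in Example 2 but a different magnitude, which is expected for a rule of thumb.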
Practical Considerations
When to Analyze Skewness?
- Normality Testing: Prerequisite for many tests
- Model Selection: Choosing appropriate distributions
- Data Transformation: Log transformation for right skew
- Outlier Detection: Extreme skewness as indicator
- Data Quality: Checking data structure
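The effect of a log transformation on right-skewed data can be illustrated with a short sketch. The income figures are invented for this example, and the helper `skewness` uses the biased form with a divisor-n standard deviation; other data or conventions will give different numbers.

```python
import math


def skewness(values):
    """Biased skewness: mean of the cubed z-scores (s with divisor n)."""
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in values) / n)
    return sum(((x - mean) / s) ** 3 for x in values) / n


# Hypothetical monthly incomes with a long right tail
incomes = [1200, 1500, 1800, 2100, 2500, 3200, 4800, 9500]

# The natural log compresses large values more than small ones
log_incomes = [math.log(x) for x in incomes]

print("raw:", round(skewness(incomes), 2))
print("log:", round(skewness(log_incomes), 2))
```

For this data the skewness of the log-transformed values is still positive but clearly smaller than that of the raw values. The log transformation only applies to strictly positive data.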
When to Be Cautious?
- Small Samples: Skewness unstable for n < 30
- Outliers: Cubic weighting amplifies influence
- Multimodal Distributions: Interpretation difficult
- Grouped Data: Information loss
- Categorical Data: Skewness not meaningful
Sample vs. Population Skewness
Sample Skewness (g): Biased estimator whose magnitude tends to be too small for small samples. Used in descriptive statistics to describe the data at hand.
Population Skewness (G): Bias-corrected estimator with factor n/((n-1)(n-2)). Used in inferential statistics to estimate the skewness of the population. For large samples (n > 100), both values are nearly identical.
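How quickly the two values converge can be seen from their ratio. Assuming g is computed with the divisor-n standard deviation and G with the divisor-(n - 1) standard deviation (the usual convention, not stated explicitly on this page), the ratio depends only on the sample size: G/g = √(n(n-1))/(n-2). The function name `correction_factor` is made up for this sketch.

```python
import math


def correction_factor(n):
    """Ratio G/g = sqrt(n * (n - 1)) / (n - 2) under the stated assumption."""
    if n < 3:
        raise ValueError("the corrected skewness needs at least three values")
    return math.sqrt(n * (n - 1)) / (n - 2)


for n in (5, 10, 30, 100, 1000):
    print(n, round(correction_factor(n), 4))
```

The ratio is about 1.49 for n = 5 and about 1.015 for n = 100, so the two estimates differ by roughly 1.5% at n = 100.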
Summary
Skewness is an indispensable tool for describing the asymmetry of distributions. It helps identify non-normal distributions, select appropriate models, and check data quality. Skewness near zero indicates a symmetric distribution, while strongly positive or negative values indicate clear asymmetry. For right-skewed distributions (common in income or wealth data), log transformation can make the distribution more symmetric. Interpretation should always be in the context of the data and in conjunction with other statistics (mean, median, standard deviation) and visual representations (histogram, box plot).