Statistics & Hydrology

Chi-Square Goodness of Fit Calculator

Verify whether your observed hydrological data fits a theoretical distribution (Gumbel, Normal, Log-Pearson III). Full step-by-step derivation, interactive charts, p-value approximation, and export.

Background

Theory & Formula

The Chi-Square (χ²) Goodness-of-Fit Test is a non-parametric statistical test that determines whether observed frequencies differ significantly from expected theoretical frequencies. In engineering hydrology, it validates whether flood peak, rainfall, or streamflow data follows a specific probability distribution before using it for design return-period estimation.

The Core Formula

$$ \chi^2_{calc} = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i} $$

Where $O_i$ = observed frequency in bin $i$, $E_i$ = expected (theoretical) frequency in bin $i$, and $k$ = number of class intervals (bins).
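As a minimal sketch of the formula above in Python (the function name `chi_square_statistic` is illustrative and not part of the calculator), the statistic can be computed from paired lists of observed and expected frequencies:

```python
def chi_square_statistic(observed, expected):
    """Return sum((O_i - E_i)^2 / E_i) over all bins."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same number of bins")
    if any(e <= 0 for e in expected):
        raise ValueError("every expected frequency must be positive")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Frequencies taken from the worked example on this page
chi2 = chi_square_statistic([12, 20, 11, 7], [15, 18, 11, 6])
print(round(chi2, 3))  # 0.989
```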

Degrees of Freedom

$$ df = k - 1 - m $$

Where $m$ = number of distribution parameters estimated from the observed data. For example: Normal distribution requires estimating mean and standard deviation, so $m = 2$. Gumbel EV-I also has two parameters, so $m = 2$.
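The degrees-of-freedom rule is simple arithmetic, but it is worth guarding against too few bins. The following Python sketch (the helper name `degrees_of_freedom` is an assumption for illustration) rejects any combination that leaves df below 1:

```python
def degrees_of_freedom(k, m):
    """df = k - 1 - m for k bins and m parameters estimated from the data."""
    df = k - 1 - m
    if df < 1:
        # With df < 1 the test cannot be carried out; more bins are needed.
        raise ValueError(f"need at least {m + 2} bins when m = {m}; got k = {k}")
    return df


print(degrees_of_freedom(4, 2))  # 1
print(degrees_of_freedom(6, 3))  # 2
```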

Decision Rule

χ²_calc ≤ χ²_critical: Fail to reject H₀. The data are consistent with the assumed distribution at the chosen significance level, so the distribution is acceptable for design use.
χ²_calc > χ²_critical: Reject H₀. The data do not adequately fit the assumed distribution. Try an alternative.

Key Assumptions & Requirements

Each class interval must have an expected frequency ≥ 5. Merge adjacent bins if violated.
Observations must be independent of each other.
Minimum recommended sample size: n ≥ 50.
The test is sensitive to the choice of bin boundaries — results can vary with different binning strategies.
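One possible way to enforce the E ≥ 5 requirement is a single left-to-right sweep that pools adjacent bins until the pooled expected frequency reaches the threshold. This is only a sketch: the function name `merge_small_bins` and the sample frequencies are hypothetical, and the calculator's own merge logic may differ.

```python
def merge_small_bins(observed, expected, min_expected=5.0):
    """Pool adjacent bins left to right until each pooled E >= min_expected."""
    merged_o, merged_e = [], []
    acc_o = acc_e = 0.0
    for o, e in zip(observed, expected):
        acc_o += o
        acc_e += e
        if acc_e >= min_expected:
            merged_o.append(acc_o)
            merged_e.append(acc_e)
            acc_o = acc_e = 0.0
    if acc_e > 0:
        # A leftover tail below the threshold is folded into the last pooled bin.
        if merged_e:
            merged_o[-1] += acc_o
            merged_e[-1] += acc_e
        else:
            merged_o.append(acc_o)
            merged_e.append(acc_e)
    return merged_o, merged_e


# Hypothetical frequencies: the first and last bins have E < 5
obs, exp = merge_small_bins([3, 9, 14, 12, 8, 4], [2.5, 8.0, 15.0, 13.5, 7.5, 3.5])
print(obs)  # [12.0, 14.0, 12.0, 12.0]
print(exp)  # [10.5, 15.0, 13.5, 11.0]
```

Merging reduces k, so the degrees of freedom must be recomputed from the merged bin count before looking up the critical value.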

Hydrological Note: The Chi-Square test is a formal companion to visual methods like probability plotting (Weibull plot, Gringorten plot). Always use both together for distribution selection. For small samples (n < 50), the Kolmogorov-Smirnov or Anderson-Darling test may be preferred.

Common Distributions in Hydrology

Distribution	Parameters (m)	Typical Use	Region/Standard
Gumbel (EV-I)	2 (location μ, scale σ)	Annual flood peaks, max daily rainfall	Global, BS 5400
Normal	2 (mean, SD)	Temperature, moderate rainfall	General
Log-Normal	2 (log-mean, log-SD)	Skewed flood data	General
Log-Pearson III	3 (mean, SD, skew)	Flood frequency analysis	US Bulletin 17C, BIS
Pearson III	3	Flood, rainfall	China, Russia
Exponential	1 (rate λ)	Inter-arrival times, drought	General

Critical Value Table (χ², upper-tail)

df	α = 0.10	α = 0.05	α = 0.01
1	2.706	3.841	6.635
2	4.605	5.991	9.210
3	6.251	7.815	11.345
4	7.779	9.488	13.277
5	9.236	11.070	15.086
6	10.645	12.592	16.812
7	12.017	14.067	18.475
8	13.362	15.507	20.090
9	14.684	16.919	21.666
10	15.987	18.307	23.209
15	22.307	24.996	30.578
20	28.412	31.410	37.566
25	34.382	37.652	44.314
30	40.256	43.773	50.892
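The table lookup and the decision rule can be combined in a few lines. The sketch below hard-codes the first ten rows of the table above; the names `CRITICAL`, `critical_value`, and `decide` are illustrative assumptions.

```python
ALPHAS = (0.10, 0.05, 0.01)

# Upper-tail critical values copied from the table above (df 1 to 10)
CRITICAL = {
    1: (2.706, 3.841, 6.635),
    2: (4.605, 5.991, 9.210),
    3: (6.251, 7.815, 11.345),
    4: (7.779, 9.488, 13.277),
    5: (9.236, 11.070, 15.086),
    6: (10.645, 12.592, 16.812),
    7: (12.017, 14.067, 18.475),
    8: (13.362, 15.507, 20.090),
    9: (14.684, 16.919, 21.666),
    10: (15.987, 18.307, 23.209),
}


def critical_value(df, alpha=0.05):
    """Look up the tabulated critical value; only tabulated df and alpha are supported."""
    if df not in CRITICAL or alpha not in ALPHAS:
        raise ValueError(f"no tabulated value for df={df}, alpha={alpha}")
    return CRITICAL[df][ALPHAS.index(alpha)]


def decide(chi2_calc, df, alpha=0.05):
    """Apply the decision rule: reject H0 only if the statistic exceeds the critical value."""
    return "fail to reject H0" if chi2_calc <= critical_value(df, alpha) else "reject H0"


print(decide(0.989, df=1, alpha=0.05))  # fail to reject H0
```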

Illustration

Worked Example

Problem: 50 years of annual flood peak data grouped into 4 class intervals. Test whether data follows the Gumbel (EV-I) distribution at α = 0.05. Parameters (location & scale) are estimated from the data, so m = 2.

Interval (m³/s)	O_i	E_i	(O−E)²/E
0 – 200	12	15	(12−15)²/15 = 9/15 = 0.600
200 – 400	20	18	(20−18)²/18 = 4/18 = 0.222
400 – 600	11	11	(11−11)²/11 = 0/11 = 0.000
> 600	7	6	(7−6)²/6 = 1/6 = 0.167
Total	50	50	χ² = 0.989

Degrees of Freedom: df = k − 1 − m = 4 − 1 − 2 = 1

Critical Value at α = 0.05, df = 1: χ²_critical = 3.841

Decision: 0.989 < 3.841

✓ Fail to Reject H₀: the flood data are consistent with the Gumbel distribution at the 5% significance level.

FAQ

Frequently Asked Questions

1. What is the Chi-Square goodness-of-fit test in hydrology?

It is a formal statistical test that determines whether observed hydrological data — such as flood peaks, annual rainfall, or streamflow — follows a specified theoretical probability distribution. It quantifies the discrepancy between what was observed and what a theoretical model predicts, giving a statistically defensible basis for distribution selection in frequency analysis.

2. Why is distribution fitting important in flood frequency analysis?

The chosen probability distribution directly determines design flood estimates for critical structures like dams, bridges, and spillways. An incorrect distribution can lead to either dangerous underestimation of design flows or costly over-engineering. The Chi-Square test provides a formal, reproducible criterion for accepting or rejecting a candidate distribution.

3. What are degrees of freedom and how are they calculated?

Degrees of freedom (df) = k − 1 − m, where k is the number of class intervals and m is the number of distribution parameters estimated from the data. Subtracting parameters accounts for the fact that fitting parameters uses some of the data's information, reducing independent variation. For example, a Normal distribution fit has m = 2 (mean and SD), so with 6 bins: df = 6 − 1 − 2 = 3.

4. What happens when a bin's expected frequency is less than 5?

Bins with E < 5 violate a core assumption of the chi-square approximation, causing the test statistic to be unreliable — typically inflating χ², leading to false rejections. Adjacent bins must be merged until all expected frequencies are ≥ 5. This calculator warns you when this condition is violated and includes an auto-merge button.

5. How does the Gumbel distribution relate to extreme flood events?

The Gumbel (Extreme Value Type I) distribution arises as a limiting distribution of the maximum of a large number of independent, identically distributed variables. Annual flood peaks are the maximum of daily flows, so by the extreme value (Fisher–Tippett–Gnedenko) theorem they tend toward one of the extreme value families; the Gumbel is the limit when the parent distribution has an exponential-type upper tail. It has a heavier upper tail than the Normal, reflecting the higher probability of extreme events.

6. What is the Log-Pearson Type III and when is it used?

A variable follows LP-III when its logarithm follows the Pearson Type III (three-parameter Gamma) distribution. It can accommodate the skewness common in flood data. Bulletin 17C, the US federal guideline for flood flow frequency published by the US Geological Survey, recommends it for flood frequency analysis. It has three parameters: mean of log(Q), standard deviation of log(Q), and skewness coefficient of log(Q), so m = 3.

7. When should I use Kolmogorov-Smirnov instead of Chi-Square?

The K-S test works on the continuous CDF without requiring data grouping. It is preferred for: small samples (n < 50), continuous distributions where natural bin boundaries are unclear, or when you want to test at every point in the distribution rather than just within bins. Chi-Square is better when data is naturally grouped or when you have n ≥ 50 with clear class intervals.

8. What does the p-value mean in this test?

The p-value is the probability of obtaining a χ² statistic at least as large as the calculated value if the null hypothesis (good fit) is true. A small p-value (< α) indicates that the observed data would be unlikely under the assumed distribution, which is evidence against it. p > α means the data are consistent with the assumed distribution. This calculator provides an approximation using a gamma-function-based method.
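The calculator's exact approximation is not specified here, so the following is only one way to obtain the upper-tail probability for integer degrees of freedom using Python's standard library. It relies on the regularized upper incomplete gamma function Q(df/2, χ²/2), built up from Q(1/2, h) = erfc(√h) or Q(1, h) = exp(−h) by a standard recurrence. The function name `chi2_sf` is an assumption.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable with integer df >= 1."""
    if df < 1 or df != int(df):
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    h = x / 2.0
    if df % 2 == 0:
        # Even df: Q(n, h) = exp(-h) * sum_{j=0}^{n-1} h^j / j!  with n = df / 2
        term = total = 1.0
        for j in range(1, df // 2):
            term *= h / j
            total += term
        return math.exp(-h) * total
    # Odd df: start at Q(1/2, h) = erfc(sqrt(h)), then apply
    # Q(a + 1, h) = Q(a, h) + h^a * exp(-h) / Gamma(a + 1)
    p = math.erfc(math.sqrt(h))
    a = 0.5
    for _ in range((df - 1) // 2):
        p += h ** a * math.exp(-h) / math.gamma(a + 1.0)
        a += 1.0
    return p


# Worked example on this page: chi-square = 0.989 with df = 1
print(round(chi2_sf(0.989, 1), 2))  # 0.32
```

Since 0.32 is well above α = 0.05, the p-value leads to the same conclusion as the critical-value comparison.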

9. How sensitive is the test to the number and width of bins?

Significantly sensitive. Too few bins reduce power; too many create bins with low expected frequencies. A common rule of thumb is to use k = 1 + 3.3 × log₁₀(n) bins (Sturges' rule). Equal-probability bins (each bin has the same expected frequency) are generally preferred over equal-width bins as they distribute information evenly. Always try multiple binning strategies to check robustness.
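To illustrate both ideas, the sketch below picks k from the rule of thumb above and then places bin edges at Gumbel quantiles so that every bin carries probability 1/k. The function names and the location and scale values are hypothetical; the Gumbel quantile used is x = μ − σ ln(−ln p).

```python
import math


def sturges_bins(n):
    """Bin count from k = 1 + 3.3 * log10(n), rounded up."""
    return math.ceil(1 + 3.3 * math.log10(n))


def gumbel_quantile(p, loc, scale):
    """Inverse of the Gumbel CDF F(x) = exp(-exp(-(x - loc) / scale))."""
    return loc - scale * math.log(-math.log(p))


def equal_probability_edges(k, loc, scale):
    """Interior edges that split a Gumbel distribution into k bins of probability 1/k."""
    return [gumbel_quantile(i / k, loc, scale) for i in range(1, k)]


n = 50
k = sturges_bins(n)
# Hypothetical fitted parameters, for illustration only
edges = equal_probability_edges(k, loc=300.0, scale=150.0)
print(k)  # 7
print([round(e, 1) for e in edges])
```

With equal-probability bins every expected frequency equals n / k (about 7.1 here), so the E ≥ 5 requirement can be checked before any data are binned.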

10. What is the difference between one-tailed and two-tailed chi-square tests?

For goodness-of-fit, only the upper tail matters: a very large χ² indicates a poor fit. A very small χ² (better than expected) can sometimes indicate data manipulation, but standard practice uses only the upper-tail critical value. Chi-square tests of independence (contingency tables) are also upper-tailed. Two-tailed use of the χ² distribution arises in a different application, such as tests and confidence intervals for a single population variance.

11. Can the chi-square test be used for non-hydrological data?

Absolutely. The test is generic and applicable to any frequency data: structural load distributions, wind speed analysis, earthquake magnitude frequency, traffic volume, or any engineering variable where you wish to validate a theoretical model against observations. The principles remain the same regardless of the physical domain.

12. What sample size is needed for reliable results?

A minimum of n = 50 observations is recommended, with each bin having E ≥ 5. Below n = 30, results are unreliable. For very small datasets (n < 30), consider the Anderson-Darling test, which is more powerful for small samples and works on continuous distributions without binning. Larger samples (n > 100) give the chi-square test high statistical power.

13. How do I determine the expected frequencies for a distribution?

Expected frequency for bin i: Eᵢ = n × P(aᵢ ≤ X < bᵢ) = n × [F(bᵢ) − F(aᵢ)], where F is the distribution's CDF. For Gumbel: F(x) = exp(−exp(−y)), where y = (x − μ)/σ is the reduced variate (here μ and σ denote location and scale, not the mean and standard deviation). For Normal: F(x) = Φ((x − μ)/σ), where Φ is the standard normal CDF. The parameters μ and σ are estimated from the data using the method of moments or maximum likelihood.
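A short Python sketch of this calculation follows. The helper names and the Gumbel parameters (location 250, scale 150) are hypothetical and chosen only to show the mechanics; open-ended first and last bins are represented with infinite edges.

```python
import math


def gumbel_cdf(x, loc, scale):
    """F(x) = exp(-exp(-y)) with reduced variate y = (x - loc) / scale."""
    if x == math.inf:
        return 1.0
    y = (x - loc) / scale
    if y < -30.0:
        # Also covers x = -inf; exp(-exp(30)) is numerically zero.
        return 0.0
    return math.exp(-math.exp(-y))


def normal_cdf(x, mean, sd):
    """Standard normal CDF evaluated at (x - mean) / sd, via the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def expected_frequencies(edges, n, cdf):
    """E_i = n * (F(b_i) - F(a_i)) for consecutive pairs of bin edges."""
    return [n * (cdf(b) - cdf(a)) for a, b in zip(edges, edges[1:])]


edges = [-math.inf, 200.0, 400.0, 600.0, math.inf]
# Hypothetical Gumbel parameters, for illustration only
expected = expected_frequencies(edges, 50, lambda x: gumbel_cdf(x, 250.0, 150.0))
print([round(e, 2) for e in expected])
print(round(sum(expected), 6))  # 50.0
```

Using infinite outer edges guarantees that the expected frequencies sum to n, which matches the observed total and keeps the k − 1 constraint in the degrees of freedom valid.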

14. What is the relationship between the chi-square test and probability paper plots?

Probability paper plots are graphical methods — you plot data on specially scaled axes so that a fitted distribution appears as a straight line. The chi-square test is the formal numerical counterpart, providing a quantitative decision criterion. Best practice is to use both: plot to visually assess fit and identify outliers, then use chi-square to formally confirm or reject the distribution choice.

15. How do I handle tied observations when grouping into bins?

Tied (identical) values matter mainly when they fall exactly on a bin boundary. In that case, assign them consistently to one bin, for example always to the upper bin by using the half-open convention aᵢ ≤ x < bᵢ. Document your convention. For continuous distributions, the probability of exactly tied values is theoretically zero, so ties often indicate measurement resolution issues.
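A consistent boundary convention is easy to enforce in code. In this sketch (the function name `count_by_bin` and the sample values are hypothetical), `bisect_right` places any value equal to an edge in the upper bin, which implements the half-open rule lower edge ≤ x < upper edge:

```python
from bisect import bisect_right


def count_by_bin(values, edges):
    """Count values per bin; edges are the sorted interior boundaries.

    Bin 0 holds x < edges[0], the last bin holds x >= edges[-1], and a value
    equal to an edge is always counted in the bin above that edge.
    """
    counts = [0] * (len(edges) + 1)
    for x in values:
        counts[bisect_right(edges, x)] += 1
    return counts


# Two observations sit exactly on the 200 boundary and one on the 400 boundary
print(count_by_bin([150, 200, 200, 399.9, 400, 650], [200, 400, 600]))  # [1, 3, 1, 1]
```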
