Hypothesis Testing Calculator - Z-Test, T-Test & P-Value

Perform Z-tests and T-tests for means and proportions. Enter your sample data to compute the test statistic, p-value, and critical value in seconds.

Select the test type and alternative hypothesis, enter your data, and click Calculate to determine whether to reject the null hypothesis.

About the Hypothesis Testing Calculator

Hypothesis testing is the backbone of inferential statistics. It provides a principled, probabilistic framework for deciding whether the data you have collected are consistent with a theoretical claim (the null hypothesis) or whether the evidence is strong enough to reject that claim in favour of an alternative. Experiments in medicine, psychology, economics, engineering quality control, and A/B website testing routinely rely on some form of hypothesis test.

The null hypothesis (H₀) is the default assumption: nothing happened, the treatment has no effect, the process is on target, or the proportions are unchanged. The alternative hypothesis (H₁) is what you are trying to detect: the mean has shifted, the proportion has changed, or one treatment is better than another. The significance level α, usually 0.05 or 0.01, is the probability of incorrectly rejecting H₀ when it is actually true (a Type I error). If the p-value returned by the test is less than α, you reject H₀.

The Z-test for means is appropriate when the population standard deviation σ is known and either the sample is large (n ≥ 30) or the population is normally distributed. The test statistic is Z = (x̄ − μ₀) / (σ / √n). Under H₀ the statistic follows the standard normal distribution exactly when the population is normal, and approximately (by the central limit theorem) when the sample is large, so the p-value is computed from the standard normal distribution.

The T-test for means applies when σ is unknown, which is the realistic situation in most real-world research. The sample standard deviation s is used instead, and the test statistic T = (x̄ − μ₀) / (s / √n) follows a t-distribution with df = n − 1 degrees of freedom. With small samples the t-distribution has heavier tails than the normal, making it harder to reach significance: a sensible penalty for the extra uncertainty about σ.

The Z-test for proportions tests whether an observed sample proportion p̂ is consistent with a hypothesised population proportion p₀. The standard error is SE = √(p₀(1 − p₀) / n) and the test statistic is Z = (p̂ − p₀) / SE. This test is widely used in A/B testing, clinical trial primary endpoints, and quality-control fraction-defective charts.

For a two-tailed test you reject H₀ when |statistic| > critical value, capturing deviations in either direction. For a one-tailed test (left or right) you specify the direction in advance; this gives more power to detect a shift in that direction but cannot flag an unexpected shift the other way. The critical value displayed is for the right-tail boundary; for a left-tailed test the relevant boundary is its negative.

The p-value is the probability of observing a test statistic at least as extreme as the one computed, assuming H₀ is true. A p-value of 0.03 does not mean there is a 3% chance the null is true; it means that if H₀ were true, there would be only a 3% chance of seeing data this extreme or more extreme by random sampling alone. Statistical significance is not the same as practical significance: a tiny effect can be highly significant with a large n, while a large effect may fail to reach significance with a small n. Always pair the p-value with an effect size and a confidence interval.
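As a minimal sketch of the Z-test for a mean described above, the following Python uses only the standard library's statistics.NormalDist. The function name z_test_mean, the tail labels, and the sample inputs are assumptions made for illustration; this is not the calculator's actual implementation.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def z_test_mean(sample_mean, mu0, sigma, n, alpha=0.05, tail="two"):
    """One-sample Z-test for a mean with known population sigma.

    tail is "two", "left", or "right" (labels chosen for this sketch).
    """
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    if tail == "two":
        p_value = 2 * (1 - STD_NORMAL.cdf(abs(z)))
        critical = STD_NORMAL.inv_cdf(1 - alpha / 2)
        reject = abs(z) > critical
    elif tail == "right":
        p_value = 1 - STD_NORMAL.cdf(z)
        critical = STD_NORMAL.inv_cdf(1 - alpha)
        reject = z > critical
    elif tail == "left":
        p_value = STD_NORMAL.cdf(z)
        # Reported as a positive number; the left-tail boundary is -critical.
        critical = STD_NORMAL.inv_cdf(1 - alpha)
        reject = z < -critical
    else:
        raise ValueError("tail must be 'two', 'left', or 'right'")
    return {"z": z, "p_value": p_value, "critical": critical, "reject": reject}


# Hypothetical inputs, for illustration only.
result = z_test_mean(sample_mean=52.0, mu0=50.0, sigma=8.0, n=64, tail="two")
print(result)
```

Comparing the p-value with α and comparing the statistic with the critical value are two views of the same decision rule, so the sketch returns both.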

Hypothesis Testing Examples

Real-world scenarios illustrating each test type and tail direction.

Scenario | Result | Interpretation
Quality control: x̄ = 10.01 mm, μ₀ = 10 mm, σ = 0.03 mm, n = 50, α = 0.05, two-tailed Z-test | Z = 2.357, p = 0.0184 → Reject H₀ | The mean bolt diameter has shifted significantly from the 10 mm target; the process needs adjustment.
Drug trial: x̄ = 12 mmHg, μ₀ = 10 mmHg, s = 3 mmHg, n = 30, α = 0.05, right-tailed T-test | T = 3.651, df = 29, p = 0.0005 → Reject H₀ | Strong evidence that the drug reduces blood pressure by more than 10 mmHg on average.
A/B test: p̂ = 0.095, p₀ = 0.08, n = 1000, α = 0.05, right-tailed Z-test (proportion) | Z = 1.748, p = 0.0402 → Reject H₀ | The new button design significantly increases the click-through rate above the baseline 8%.
Fuel efficiency: x̄ = 29 mpg, μ₀ = 30 mpg, σ = 2 mpg, n = 40, α = 0.01, left-tailed Z-test | Z = −3.162, p = 0.0008 → Reject H₀ | Evidence at the 1% level that the car model's fuel efficiency is below the advertised 30 mpg.
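The drug-trial row above can be reproduced approximately with the sketch below. Python's standard library has no t-distribution, so the upper-tail probability is obtained here by integrating the t density numerically with Simpson's rule. The helper names t_pdf and t_upper_tail are invented for this example, and the approach is an assumption rather than how the calculator computes its p-value; with SciPy installed, scipy.stats.t.sf would be the usual choice.

```python
from math import exp, lgamma, log, pi, sqrt


def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_norm = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_norm - (df + 1) / 2 * log(1 + x * x / df))


def t_upper_tail(t, df, steps=20000):
    """P(T > t) for t >= 0, using Simpson's rule on [0, t] (steps is even)."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_0_to_t = total * h / 3
    # By symmetry, half of the probability mass lies above zero.
    return 0.5 - area_0_to_t


# Drug-trial row: sample mean 12, null value 10, s = 3, n = 30, right-tailed.
x_bar, mu0, s, n = 12.0, 10.0, 3.0, 30
df = n - 1
t_stat = (x_bar - mu0) / (s / sqrt(n))
p_right = t_upper_tail(t_stat, df)

print(f"T = {t_stat:.3f}, df = {df}, p = {p_right:.4f}")
```

For a left-tailed test with a negative statistic, the same helper can be applied to the absolute value of the statistic; a two-tailed p-value is twice the upper-tail probability of |T|.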

How to Use the Hypothesis Testing Calculator

  1. Choose the Test Type: Z-Test (Mean) if σ is known, T-Test (Mean) if σ is unknown and you have a sample standard deviation, or Z-Test (Proportion) for categorical outcomes.
  2. Select the Alternative Hypothesis direction — Two-Tailed to detect any change, Left-Tailed to detect a decrease, or Right-Tailed to detect an increase.
  3. Enter the Null Hypothesis Value (μ₀ for mean tests or p₀ for proportion tests), your chosen Significance Level α (typically 0.05), and the Sample Size n.
  4. Fill in the remaining field: Sample Mean x̄ and Population Std Dev σ for Z-Test (Mean); Sample Mean x̄ and Sample Std Dev s for T-Test; or Sample Proportion p̂ for Z-Test (Proportion).
  5. Click Calculate. The tool displays the test statistic, degrees of freedom (T-test only), p-value, critical value, and the reject/fail-to-reject decision.

Hypothesis Testing FAQ

What is the difference between a Z-test and a T-test?
A Z-test is used when the population standard deviation σ is known, which allows the use of the standard normal distribution to compute exact p-values. A T-test is used when σ is unknown and must be estimated from the sample standard deviation s; the resulting test statistic follows a t-distribution with n−1 degrees of freedom, which has heavier tails than the normal to account for the added uncertainty. As the sample size grows, the t-distribution converges to the normal, so the distinction matters most for small samples (roughly n < 30).
What does the p-value actually mean?
The p-value is the probability of obtaining a test statistic at least as extreme as the one observed, assuming the null hypothesis is true. It is not the probability that H₀ is true, nor the probability that your result happened by chance. A p-value below α (commonly 0.05) means the observed data would be surprising if H₀ were true, so you reject H₀. A p-value above α means the data are consistent with H₀, so you fail to reject it — but this does not prove H₀ is correct.
When should I use a one-tailed versus a two-tailed test?
Use a two-tailed test when a difference in either direction is scientifically meaningful and you have no strong prior reason to expect a specific direction. Use a one-tailed test when theory or prior evidence clearly specifies the direction of the effect before data collection begins. Switching to a one-tailed test after seeing the data in order to achieve significance is a form of p-hacking and invalidates the stated α. A one-tailed test at α=0.05 uses the same critical value as a two-tailed test at α=0.10, but it can only reject H₀ in the pre-specified direction.
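A short sketch can make the relationship between the two kinds of test concrete. It uses the standard normal distribution from Python's standard library, and the observed statistic z = 1.80 is a made-up value chosen to fall in the predicted (right-hand) direction.

```python
from statistics import NormalDist

std_normal = NormalDist()

z = 1.80  # hypothetical observed statistic in the predicted direction

p_right = 1 - std_normal.cdf(z)           # one-tailed (right) p-value
p_two = 2 * (1 - std_normal.cdf(abs(z)))  # two-tailed p-value

# Rejection cutoffs for the statistic at different settings.
crit_one_05 = std_normal.inv_cdf(1 - 0.05)      # one-tailed, alpha = 0.05
crit_two_10 = std_normal.inv_cdf(1 - 0.10 / 2)  # two-tailed, alpha = 0.10
crit_two_05 = std_normal.inv_cdf(1 - 0.05 / 2)  # two-tailed, alpha = 0.05

print(f"one-tailed p = {p_right:.4f}, two-tailed p = {p_two:.4f}")
print(f"one-tailed 0.05 cutoff = {crit_one_05:.3f}")
print(f"two-tailed 0.10 cutoff = {crit_two_10:.3f}")
print(f"two-tailed 0.05 cutoff = {crit_two_05:.3f}")
```

With this statistic the one-tailed p-value falls below 0.05 while the two-tailed p-value does not, which is why the direction has to be fixed before the data are seen.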
What is the significance level α and how do I choose it?
The significance level α is the maximum acceptable probability of a Type I error, that is, of incorrectly rejecting a true null hypothesis. The conventional choice is 0.05 (5%), but 0.01 is used when false positives are particularly costly (medical diagnostics, safety-critical systems). Some fields now recommend reporting exact p-values rather than relying on a fixed threshold, and combining them with confidence intervals and effect sizes for a fuller picture.
What are Type I and Type II errors?
A Type I error (false positive) occurs when you reject H₀ even though it is true; its probability is α. A Type II error (false negative) occurs when you fail to reject H₀ even though it is false; its probability is β, and statistical power is 1−β. At a fixed sample size, reducing α tightens the criterion for rejection, which lowers the Type I error rate but raises the Type II error rate. Increasing the sample size is the most direct way to lower β without raising α, and so to keep both error rates small.
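The trade-off can be illustrated with a small power calculation for a right-tailed Z-test. This is a sketch under stated assumptions: σ is known, the true mean is μ₀ + delta, and the function name power_right_tailed_z and the numbers in the loop are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

std_normal = NormalDist()


def power_right_tailed_z(delta, sigma, n, alpha):
    """Power (1 - beta) of a right-tailed Z-test when the true mean is mu0 + delta."""
    z_crit = std_normal.inv_cdf(1 - alpha)
    shift = delta * sqrt(n) / sigma  # mean of Z under the alternative
    return 1 - std_normal.cdf(z_crit - shift)


# Hypothetical scenario: true shift of 1 unit, sigma = 4.
for n in (25, 50, 100, 200):
    for alpha in (0.05, 0.01):
        power = power_right_tailed_z(delta=1.0, sigma=4.0, n=n, alpha=alpha)
        print(f"n={n:3d}  alpha={alpha:.2f}  power={power:.3f}  beta={1 - power:.3f}")
```

Reading across each pair of rows shows β rising when α is tightened at a fixed n; reading down shows β falling as n grows at a fixed α.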
Can I use this calculator for proportions from a survey?
Yes, the Z-Test for Proportion mode is designed for exactly this. Enter the hypothesised population proportion p₀ (your baseline or theoretical value), your sample size n, and the observed sample proportion p̂ (successes divided by n). The calculator applies the standard formula Z = (p̂ − p₀) / √(p₀(1−p₀)/n). The normal approximation is generally considered reliable when both n·p₀ and n·(1−p₀) are at least 10; some textbooks accept a looser threshold of 5.
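A minimal sketch of that calculation, including the rule-of-thumb check on the normal approximation, might look like the following. The function name z_test_proportion, the min_count parameter, and the survey figures are assumptions for illustration only.

```python
from math import sqrt
from statistics import NormalDist


def z_test_proportion(p_hat, p0, n, tail="two", min_count=10):
    """One-sample Z-test for a proportion using the normal approximation."""
    # Rule-of-thumb check; min_count=10 is the stricter common choice.
    approx_ok = n * p0 >= min_count and n * (1 - p0) >= min_count
    se = sqrt(p0 * (1 - p0) / n)  # standard error under H0
    z = (p_hat - p0) / se
    cdf = NormalDist().cdf
    if tail == "two":
        p_value = 2 * (1 - cdf(abs(z)))
    elif tail == "right":
        p_value = 1 - cdf(z)
    elif tail == "left":
        p_value = cdf(z)
    else:
        raise ValueError("tail must be 'two', 'left', or 'right'")
    return z, p_value, approx_ok


# Hypothetical survey: 230 of 400 respondents agree; test against p0 = 0.5.
z, p_value, approx_ok = z_test_proportion(p_hat=230 / 400, p0=0.5, n=400)
print(f"Z = {z:.3f}, p = {p_value:.4f}, normal approximation ok: {approx_ok}")
```

When approx_ok is False, an exact binomial test is usually a safer choice than the Z-test.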