How Subgroup Size Changes X-Bar Control Limit Sensitivity

Doubling a subgroup from n = 3 to n = 6 does not "add data" — it tightens the X-bar control limits by roughly 30% and changes which mean shifts the chart can catch. Get the subgroup size wrong and the chart is quietly miscalibrated: too small and it misses the 0.5σ drift you deployed it to catch; too large, or built from subgroups that mix assignable causes, and it fires on noise while inflating short-term capability. This how-to belongs to the X-Bar R chart implementation workflow inside the wider SPC fundamentals and control chart taxonomy: it gives a reproducible Python method to measure exactly how n moves your limits, confirm the estimator routing is correct for that n, and detect the subgroup-size drift that corrupts limits after an ETL change.

Subgroup size is a design parameter, not an ingestion convenience. The steps below treat it as a locked value validated at runtime — because a fixed-n limit routine reading the wrong constant row produces limits that are subtly, invisibly wrong rather than crashing.

Prerequisites

Confirm these are in place before running the sensitivity analysis:

Python 3.10+ with pandas >= 2.0, numpy >= 1.24, and scipy >= 1.10 installed (pip install "pandas>=2.0" "numpy>=1.24" "scipy>=1.10")
A tidy long-format pd.DataFrame with one row per measurement and a column identifying the rational subgroup (lot, cycle, or tooling run) — not a wide array pre-averaged into means
Rational subgroups already established: each subgroup must capture common-cause variation under identical short-term conditions, or the sensitivity numbers below are meaningless
Timestamps aligned and subgroups formed on real process boundaries via the time-series alignment pipeline, not arbitrary clock slices
Missing measurements resolved by the missing-value policy for quality data so a dropout does not silently turn an n = 5 subgroup into n = 4
The target chart chosen against the X-Bar R vs X-Bar S decision criteria; n = 1 streams route instead to an I-MR chart
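
If the feed arrives wide (one row per subgroup, one column per reading), it can be reshaped into the long format the steps below expect. A minimal sketch, assuming hypothetical column names (`lot`, `x1` to `x3`, `value`) that do not come from any real schema:

```python
import numpy as np
import pandas as pd

# Hypothetical wide feed: one row per subgroup, one column per reading.
wide = pd.DataFrame({
    "lot": ["A", "B", "C"],
    "x1": [10.1, 9.8, 10.0],
    "x2": [10.3, np.nan, 9.9],   # a dropout in lot B
    "x3": [9.9, 10.2, 10.1],
})

# Reshape to one row per measurement, keeping the subgroup identifier.
long_df = wide.melt(id_vars="lot", var_name="position", value_name="value")

# Count non-missing readings per subgroup; a dropout shows up as a short count.
sizes = long_df.groupby("lot")["value"].count()
print(sizes.to_dict())
```

Keeping the raw readings rather than pre-averaged means is what lets a short subgroup be seen at all: once the row is collapsed to a mean, the missing reading is no longer visible.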

Why n Sets the Sensitivity

Before the code, fix the mechanism, because it drives every step. The X-bar chart plots subgroup means, and the standard error of a mean is $\sigma/\sqrt{n}$. The 3σ limits are therefore:

$$\text{UCL}/\text{LCL} = \bar{\bar{X}} \pm 3\,\frac{\sigma}{\sqrt{n}} = \bar{\bar{X}} \pm A_2 \bar{R}$$

The half-width shrinks as $1/\sqrt{n}$: going from n = 2 to n = 8 halves it, but going from n = 8 to n = 16 only shrinks it another 29% — diminishing returns that are exactly why the useful range for range-based charts stops near n = 9. A tighter band raises the chart's power to detect a small sustained mean shift (0.5σ–1.0σ), but only under one assumption: within-subgroup variation is common cause only. When a subgroup spans a tool change or a lot boundary, an assignable cause is averaged into the mean and folded into $\bar{R}$, widening limits until the chart never signals — or, worse, the between-subgroup drift masquerades as within-subgroup noise and both the limits and the capability estimate are corrupted.
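
The two forms of the limit equation agree because $\hat\sigma = \bar{R}/d_2$, which implies $A_2 = 3/(d_2\sqrt{n})$. The sketch below checks that identity against the commonly published three-decimal A2 values and reproduces the halving and 29% figures. The helper names `a2` and `half_width_ratio` are illustrative, not part of any library:

```python
import math

# d2 constants for n = 2..9 (the same values as the D2 table used in Step 2).
D2 = {2: 1.128, 3: 1.693, 4: 2.059, 5: 2.326,
      6: 2.534, 7: 2.704, 8: 2.847, 9: 2.970}

# Commonly published A2 values, rounded to three decimals, for comparison.
A2_PUBLISHED = {2: 1.880, 3: 1.023, 4: 0.729, 5: 0.577,
                6: 0.483, 7: 0.419, 8: 0.373, 9: 0.337}


def a2(n: int) -> float:
    """A2 implied by sigma-hat = R-bar / d2, i.e. 3 / (d2 * sqrt(n))."""
    return 3.0 / (D2[n] * math.sqrt(n))


def half_width_ratio(n_old: int, n_new: int) -> float:
    """Factor by which the 3-sigma half-width changes when n changes."""
    return math.sqrt(n_old / n_new)


for n in D2:
    print(n, f"{a2(n):.4f}", A2_PUBLISHED[n])

print(half_width_ratio(2, 8))             # the band halves
print(1.0 - half_width_ratio(8, 16))      # further fractional shrink, about 0.29
```

The computed values match the published column to within the rounding of the three-decimal d2 inputs, so a small last-digit difference (for example at n = 2) is expected rather than a sign of error.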

Sensitivity also depends on estimator efficiency, which is why n gates chart selection, not just limit width:

Subgroup size n	Estimator of σ	Relative efficiency	Route to
1	Moving range	— (no within-subgroup spread)	I-MR chart
2–5	Range R / d₂	~0.99 → 0.95	X-Bar R
6–9	Range R / d₂	~0.93 → 0.85	X-Bar R (watch efficiency)
≥ 10	Std dev S / c₄	S is materially more efficient	X-Bar S chart
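
The efficiency column can be sanity-checked by simulation. The sketch below assumes "relative efficiency" means the variance of the S/c₄ estimator divided by the variance of the R/d₂ estimator on normal data; that reading of the table is an assumption, as is the d₂ value for n = 10 (3.078, taken from standard published tables rather than from this article):

```python
import math
import numpy as np

# d2 constants; the n = 10 entry is an assumption from standard tables.
D2 = {5: 2.326, 10: 3.078}


def c4(n: int) -> float:
    """Unbiasing constant for the sample standard deviation."""
    return math.sqrt(2.0 / (n - 1)) * math.exp(
        math.lgamma(n / 2.0) - math.lgamma((n - 1) / 2.0)
    )


def relative_efficiency(n: int, k: int = 20000, seed: int = 0) -> float:
    """Var(S/c4) / Var(R/d2) over k simulated normal subgroups of size n."""
    rng = np.random.default_rng(seed)
    x = rng.normal(0.0, 1.0, size=(k, n))
    sigma_r = (x.max(axis=1) - x.min(axis=1)) / D2[n]   # range-based estimate
    sigma_s = x.std(axis=1, ddof=1) / c4(n)             # std-dev-based estimate
    return float(sigma_s.var(ddof=1) / sigma_r.var(ddof=1))


eff_5 = relative_efficiency(5)
eff_10 = relative_efficiency(10)
print(f"n=5: {eff_5:.2f}, n=10: {eff_10:.2f}")
```

A value below 1 means the range estimator is noisier than the standard-deviation estimator at that n, which is the reason the routing switches estimators as subgroups grow.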

Step-by-Step Implementation

Step 1 — Measure actual subgroup size and lock it

Never assume n from the recipe. Count it from the data, and refuse to proceed if it varies — a mixed-n frame silently biases $\bar{R}$ and breaks constant lookup. This is the single most common cause of drifted limits after an ETL change.

import pandas as pd


def observed_subgroup_size(df: pd.DataFrame, subgroup_col: str, metric_col: str) -> int:
    """Return the fixed subgroup size, or raise if n is not constant."""
    sizes = df.groupby(subgroup_col)[metric_col].count()
    if sizes.nunique() != 1:
        raise ValueError(
            f"Subgroup size is not fixed: {sorted(sizes.unique().tolist())}. "
            "Sensitivity and A2/d2 constants are only valid for constant n."
        )
    return int(sizes.iloc[0])

Verify in isolation: pass a frame where one subgroup is short a reading and assert observed_subgroup_size raises ValueError. A routine that averages over ragged subgroups produces limits that look plausible and are wrong.
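
One way to write that isolation check, as a sketch. The fixture frames are hypothetical, and the function is repeated (with a shortened message) only so the block runs on its own:

```python
import numpy as np
import pandas as pd


def observed_subgroup_size(df, subgroup_col, metric_col):
    """Copy of the Step 1 function so this sketch runs standalone."""
    sizes = df.groupby(subgroup_col)[metric_col].count()
    if sizes.nunique() != 1:
        raise ValueError(
            f"Subgroup size is not fixed: {sorted(sizes.unique().tolist())}."
        )
    return int(sizes.iloc[0])


def rejects(df) -> bool:
    """True if the frame is refused with ValueError."""
    try:
        observed_subgroup_size(df, "subgroup", "x")
    except ValueError:
        return True
    return False


# Hypothetical fixture: three subgroups of four readings each.
clean = pd.DataFrame({
    "subgroup": [0] * 4 + [1] * 4 + [2] * 4,
    "x": [float(i) for i in range(12)],
})

ragged = clean.drop(index=5)      # subgroup 1 loses a row entirely
with_nan = clean.copy()
with_nan.loc[5, "x"] = np.nan     # row kept, reading missing

assert observed_subgroup_size(clean, "subgroup", "x") == 4
assert rejects(ragged)
assert rejects(with_nan)          # count() skips NaN, so this is caught too
print("ragged and NaN fixtures rejected")
```

Because `count()` ignores missing values, a dropout is caught whether the ETL step deletes the row or leaves it in place with a null reading.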

Step 2 — Compute the limit half-width as a function of n

Quantify the compression directly. Given an estimate of process σ, the X-bar half-width is $3\sigma/\sqrt{n}$; expressing it against a range of candidate n makes the sensitivity curve explicit for chart-design reviews.

import numpy as np

D2 = {2: 1.128, 3: 1.693, 4: 2.059, 5: 2.326,
      6: 2.534, 7: 2.704, 8: 2.847, 9: 2.970}


def sigma_from_range(df, subgroup_col, metric_col, n):
    """Within-subgroup sigma estimate: R-bar / d2(n)."""
    g = df.groupby(subgroup_col)[metric_col]
    r_bar = (g.max() - g.min()).mean()
    return r_bar / D2[n]


def half_width_vs_n(sigma_hat: float, candidate_n: range) -> pd.Series:
    """3-sigma X-bar limit half-width for each candidate subgroup size."""
    return pd.Series(
        {n: 3.0 * sigma_hat / np.sqrt(n) for n in candidate_n},
        name="half_width",
    )

Verify: half_width_vs_n(1.0, range(2, 10)) should decay monotonically, and the ratio of the n = 2 to the n = 8 value should equal np.sqrt(8 / 2) == 2.0. That factor-of-two is the sensitivity you gain — or lose — by changing n.

Step 3 — Confirm the estimator routing matches n

A tighter band is only trustworthy if σ is estimated efficiently. Range-based σ degrades past n = 9, so route to the standard-deviation estimator with the c₄ bias correction once the subgroup grows. Bake the routing into the limit function so it cannot silently apply an out-of-range constant.

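from scipy.special import gammaln


def c4(n: int) -> float:
    """Unbiasing constant for the sample standard deviation."""
    # Log-gamma avoids the overflow that gamma(n / 2) hits for very large n.
    log_ratio = gammaln(n / 2.0) - gammaln((n - 1.0) / 2.0)
    return float(np.sqrt(2.0 / (n - 1.0)) * np.exp(log_ratio))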


def sigma_hat_routed(df, subgroup_col, metric_col, n):
    """Estimate sigma via R/d2 for small n, S/c4 for n >= 10."""
    g = df.groupby(subgroup_col)[metric_col]
    if n == 1:
        raise ValueError("n = 1: route to an I-MR chart, not X-bar.")
    if n >= 10:
        return g.std(ddof=1).mean() / c4(n)     # X-bar S path
    return (g.max() - g.min()).mean() / D2[n]   # X-bar R path

Verify: for a clean normal fixture, sigma_hat_routed at n = 8 and at n = 12 should each land within a few percent of the true σ, and therefore of each other. A large gap between the two paths signals that between-subgroup drift is contaminating the range estimate.
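
A sketch of that check on a synthetic stream. The fixture size (24,000 readings), the seed, and the 5% tolerance are assumptions chosen so that sampling noise stays well inside the tolerance; `c4` here uses `math.lgamma` as a standard-library stand-in for the scipy call:

```python
import math
import numpy as np
import pandas as pd

D2 = {2: 1.128, 3: 1.693, 4: 2.059, 5: 2.326,
      6: 2.534, 7: 2.704, 8: 2.847, 9: 2.970}


def c4(n):
    """Unbiasing constant for S, computed with the standard-library lgamma."""
    return math.sqrt(2.0 / (n - 1)) * math.exp(
        math.lgamma(n / 2.0) - math.lgamma((n - 1) / 2.0))


def sigma_hat_routed(df, subgroup_col, metric_col, n):
    """Same routing as Step 3: R/d2 below n = 10, S/c4 from n = 10 up."""
    g = df.groupby(subgroup_col)[metric_col]
    if n == 1:
        raise ValueError("n = 1: route to an I-MR chart, not X-bar.")
    if n >= 10:
        return g.std(ddof=1).mean() / c4(n)
    return (g.max() - g.min()).mean() / D2[n]


def framed(values, n):
    """Reshape a flat stream into k complete subgroups of size n."""
    k = len(values) // n
    return pd.DataFrame({"subgroup": np.repeat(np.arange(k), n),
                         "x": values[: k * n]})


TRUE_SIGMA = 2.0
rng = np.random.default_rng(1)
values = rng.normal(50.0, TRUE_SIGMA, size=24000)   # stable, common cause only

s8 = sigma_hat_routed(framed(values, 8), "subgroup", "x", 8)      # R/d2 path
s12 = sigma_hat_routed(framed(values, 12), "subgroup", "x", 12)   # S/c4 path
print(f"n=8: {s8:.3f}, n=12: {s12:.3f}")

for est in (s8, s12):
    assert abs(est - TRUE_SIGMA) / TRUE_SIGMA < 0.05
```

On real data there is no known true σ, so the useful comparison is between the two paths: if they disagree by much more than sampling noise, the subgrouping deserves a second look before the limits are trusted.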

Step 4 — Emit limits plus the sensitivity metadata

Return the limits and the n-dependent context together so the charting and audit layers can reason about sensitivity, not just plot bands. Include the theoretical shift the chart can detect at ~50% power on the next point, $\delta = 3/\sqrt{n}$ in σ units, which makes the sensitivity trade-off explicit in the record.

def xbar_limits_with_sensitivity(df, subgroup_col, metric_col) -> dict:
    """X-bar limits plus the subgroup-size sensitivity context."""
    n = observed_subgroup_size(df, subgroup_col, metric_col)
    sigma_hat = sigma_hat_routed(df, subgroup_col, metric_col, n)
    grand_mean = df.groupby(subgroup_col)[metric_col].mean().mean()

    se = sigma_hat / np.sqrt(n)
    return {
        "n": n,
        "estimator": "S/c4" if n >= 10 else "R/d2",
        "centerline": round(grand_mean, 6),
        "ucl": round(grand_mean + 3.0 * se, 6),
        "lcl": round(grand_mean - 3.0 * se, 6),
        "half_width": round(3.0 * se, 6),
        # Shift (in sigma) detectable at ~50% power on the next single point.
        "detectable_shift_sigma": round(3.0 / np.sqrt(n), 3),
    }
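
The ~50% figure behind detectable_shift_sigma follows from the normal model: a shift of exactly 3/√n process sigmas moves the mean of the plotted statistic onto a control limit, so about half of the next subgroup means fall beyond it. A minimal sketch, assuming normally distributed subgroup means and a known σ; `detection_power` is an illustrative name, not an existing API:

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()  # standard normal


def detection_power(shift_sigma: float, n: int, L: float = 3.0) -> float:
    """P(one subgroup mean falls outside +/- L standard errors) after the
    process mean shifts by `shift_sigma` process standard deviations."""
    d = shift_sigma * sqrt(n)                  # shift in standard-error units
    return (1.0 - Z.cdf(L - d)) + Z.cdf(-L - d)


for n in (4, 9):
    threshold = 3.0 / sqrt(n)                  # same value as detectable_shift_sigma
    print(n,
          round(detection_power(threshold, n), 3),   # power at the threshold shift
          round(detection_power(0.5, n), 3))         # power for a 0.5-sigma drift
```

Under these assumptions the single-point power for a 0.5σ drift stays well below 10% at both n = 4 and n = 9, which is why the 3σ rule alone is slow on small shifts.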

Verification

Confirm the $1/\sqrt{n}$ law and the routing with a minimal synthetic fixture; no live data is required. Build a stable normal process, form it into subgroups at two different sizes, and assert that the larger subgroup yields the tighter limit by the expected factor:

import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
values = rng.normal(50.0, 2.0, size=1200)   # true sigma = 2.0


def framed(values, n):
    """Reshape a flat stream into k subgroups of size n."""
    k = len(values) // n
    v = values[: k * n]
    return pd.DataFrame({
        "subgroup": np.repeat(np.arange(k), n),
        "x": v,
    })


small = xbar_limits_with_sensitivity(framed(values, 4), "subgroup", "x")
large = xbar_limits_with_sensitivity(framed(values, 9), "subgroup", "x")

# Larger subgroup => tighter band, by ~sqrt(9/4) = 1.5x.
ratio = small["half_width"] / large["half_width"]
assert abs(ratio - np.sqrt(9 / 4)) < 0.15, f"1/sqrt(n) law violated: {ratio:.2f}"
assert large["detectable_shift_sigma"] < small["detectable_shift_sigma"]
print(f"n=4 half-width {small['half_width']}, n=9 half-width {large['half_width']}")

Expected: the n = 9 half-width is roughly two-thirds of the n = 4 half-width, and the reported detectable_shift_sigma falls from about 1.5σ to about 1.0σ. If the ratio is far from 1.5, the subgroups are not homogeneous and $\bar{R}$ is absorbing between-subgroup variation — the exact failure that inflates capability downstream.
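
To see what a failed ratio looks like, the sketch below builds a deliberately non-homogeneous stream: every lot of four readings gets its own mean offset, so subgroups of 4 are rational but subgroups of 8 straddle a lot boundary. The lot structure, sigmas, and seed are all hypothetical:

```python
import numpy as np

D2 = {4: 2.059, 8: 2.847}

rng = np.random.default_rng(2)
n_lots, lot_size = 600, 4
within_sigma, lot_sigma = 1.0, 2.0   # hypothetical: lot-to-lot offsets dominate

# Each lot gets its own mean offset; readings inside a lot share it.
lot_means = rng.normal(0.0, lot_sigma, size=n_lots)
noise = rng.normal(0.0, within_sigma, size=(n_lots, lot_size))
values = (lot_means[:, None] + noise).ravel()   # lots stay consecutive


def sigma_from_range(values, n):
    """R-bar / d2 over consecutive subgroups of size n."""
    x = values[: (len(values) // n) * n].reshape(-1, n)
    return float((x.max(axis=1) - x.min(axis=1)).mean() / D2[n])


s4 = sigma_from_range(values, 4)   # subgroup == one lot: rational
s8 = sigma_from_range(values, 8)   # subgroup spans two lots: contaminated

ratio = (3 * s4 / np.sqrt(4)) / (3 * s8 / np.sqrt(8))
print(f"sigma_hat n=4: {s4:.2f}, n=8: {s8:.2f}, "
      f"half-width ratio: {ratio:.2f} (homogeneous case: {np.sqrt(2):.2f})")
```

The n = 4 estimate recovers the within-lot sigma, while the n = 8 estimate is inflated by the lot offsets, so the half-width ratio lands well below the √2 a homogeneous stream would give.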

Root-Cause Table

Symptom	Cause	Fix
Limits far wider than expected for the chosen n	Subgroups straddle tool changes/lots, so assignable cause inflates R̄	Re-form subgroups on real process boundaries before charting (Prerequisites, Step 1)
Limits shifted a fraction of a percent after an ETL change	A dropout turned an n = 5 subgroup into n = 4 and the wrong A₂/d₂ row was read	Count n from the data and raise on mixed sizes every run (Step 1)
Cpk looks inflated with no real improvement	Oversized subgroups processed with range stats compress σ_within	Route n ≥ 10 to S/c₄ and reconcile Cpk against Ppk (Step 3)
Chart never signals a known 0.5σ drift	Subgroup too small, so 3/√n detection threshold sits above the drift	Increase n toward the 6–9 range, or add run-rule detection for small shifts
`KeyError` or garbage σ at n ≥ 10	Range/d₂ estimator applied outside its valid range	Enforce the routing guard so n ≥ 10 uses S/c₄, n = 1 routes to I-MR (Step 3)

Lock subgroup size at the ingestion layer, validate the estimator route at runtime, and record n alongside every limit. When the ratio of short-term (Cpk) to long-term (Ppk) capability rises past ~1.3, treat it as between-subgroup instability to investigate, not a subgrouping tweak to bury. The constant tables and their precision requirements are governed by the AIAG SPC Reference Manual (ch. II) and ASTM E2587, and stair-step limits from variable-n attribute data are handled per ISO 7870-2.

FAQ

Does a larger subgroup always give a "better" chart?

No. A larger n tightens the limits and raises power to catch small shifts, but only while within-subgroup variation stays pure common cause. Past n ≈ 9 the range estimator loses efficiency, and any subgroup wide enough to span a tool change or lot boundary folds assignable cause into R̄ — widening limits and desensitizing the chart. Beyond the diminishing 1/√n returns, the practical ceiling is set by how long you can hold conditions constant within one subgroup.

How much does going from n = 4 to n = 5 actually change my limits?

The half-width scales as 1/√n, so n = 4 → 5 tightens it by a factor of √(4/5) ≈ 0.89, an 11% reduction. The corresponding A₂ constant drops further, from 0.729 to 0.577 (about 21%), because A₂ = 3/(d₂√n) also carries the d₂ change. The two figures do not compound: the expected R̄ grows by the same d₂ ratio (2.326/2.059 ≈ 1.13), so the net half-width A₂R̄ still shrinks by about 11%. Always recompute limits from the observed n rather than scaling old limits by hand.

Why does oversized subgrouping inflate Cpk?

Short-term capability uses σ_within estimated from R̄/d₂ (or S/c₄). When an oversized subgroup averages an assignable cause into its mean, that variation moves from within-subgroup to between-subgroup, so σ_within is underestimated and Cpk rises with no real process change. Ppk, computed from the overall standard deviation, is unaffected — which is why a Cpk/Ppk ratio well above 1.3 is a reliable flag that the subgrouping, not the process, improved.
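
The flag itself is cheap to compute. Because Cpk and Ppk share the same numerator, their ratio reduces to σ_overall / σ_within. A minimal sketch on a hypothetical process with subgroup-to-subgroup drift; the spec limits, sigmas, and seed are invented for illustration:

```python
import numpy as np

D2_4 = 2.059                    # d2 for n = 4
LSL, USL = -9.0, 9.0            # hypothetical spec limits

rng = np.random.default_rng(3)
# Hypothetical process: common-cause sigma of 1.0 plus a per-subgroup drift.
drift = rng.normal(0.0, 1.5, size=(500, 1))
x = drift + rng.normal(0.0, 1.0, size=(500, 4))     # 500 subgroups of 4

mean = x.mean()
sigma_within = (x.max(axis=1) - x.min(axis=1)).mean() / D2_4   # R-bar / d2
sigma_overall = x.std(ddof=1)                                  # all readings pooled


def capability(sigma):
    """min(USL - mean, mean - LSL) / (3 * sigma)."""
    return min(USL - mean, mean - LSL) / (3.0 * sigma)


cpk, ppk = capability(sigma_within), capability(sigma_overall)
print(f"Cpk {cpk:.2f}, Ppk {ppk:.2f}, ratio {cpk / ppk:.2f}")
```

Here the drift never appears inside a subgroup, so σ_within sees only the common-cause noise while σ_overall sees both, and the ratio clears the ~1.3 flag comfortably.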

What do I do when subgroup size varies in the feed?

For variable-count attribute charts (p, u) the limits legitimately stair-step with nᵢ, and you recompute them per subgroup or apply an average-n band per ISO 7870-2. For variable-count variables data (X-bar), variable n is almost always a data defect, not a design: fix it upstream in the missing-value policy and validation gate rather than averaging across ragged subgroups, because a mixed-n frame silently biases both the limits and σ_within.
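
For contrast, this is what legitimate stair-step limits look like on a p chart, using the textbook 3σ binomial formula p̄ ± 3√(p̄(1 − p̄)/nᵢ). The counts are hypothetical, and the sketch is not a rendering of any specific clause of ISO 7870-2:

```python
import numpy as np

# Hypothetical inspection data: subgroup sizes vary from lot to lot.
n_i = np.array([120, 80, 200, 150, 60])
defectives = np.array([6, 5, 9, 8, 4])

p_bar = defectives.sum() / n_i.sum()              # pooled fraction nonconforming
se_i = np.sqrt(p_bar * (1.0 - p_bar) / n_i)       # per-subgroup standard error

ucl = p_bar + 3.0 * se_i
lcl = np.clip(p_bar - 3.0 * se_i, 0.0, None)      # a proportion cannot go below 0

for n, lo, hi in zip(n_i, lcl, ucl):
    print(f"n_i={n:3d}  LCL={lo:.4f}  UCL={hi:.4f}")
```

The band is widest for the smallest nᵢ and narrowest for the largest, which is the intended behavior for attribute data and the opposite of what a fixed-n X-bar routine should ever be asked to tolerate.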
