Step-by-Step I-MR Chart Setup for Batch Processes

When a batch reactor, lyophilizer, or blend vessel yields exactly one release measurement per run, there is no within-batch sample to average — the process signal lives entirely in how one batch differs from the next. This is the canonical n=1 case that the Individual Moving Range (I-MR) chart exists to monitor, and forcing batch telemetry into a subgroup-based layout instead only inflates within-subgroup variance and masks the batch-to-batch drift you most need to see. This how-to walks the exact setup: flatten records to one chronological sequence, compute lag-1 moving range limits, screen for the autocorrelation that batch processes routinely carry, and emit capability numbers an auditor will accept. For where this chart sits among the alternatives, start from the SPC Fundamentals & Control Chart Taxonomy.

The pivot to individuals tracking is a rational-subgrouping decision, not a preference. Where an X-Bar R chart or, for larger samples, an X-Bar S chart estimates variation from within-subgroup spread, a batch that produces a single critical quality attribute (CQA) has no such spread to draw on, so the moving range between consecutive batches becomes the only available estimator of short-term process sigma.

Prerequisites

Confirm these are in place before building the chart:

Python 3.10+ with pandas >= 2.0 and numpy >= 1.24 (pip install "pandas>=2.0" "numpy>=1.24"); scipy >= 1.10 if you run the normality test in the capability step
One row per batch, each carrying a single CQA value and a batch completion timestamp — not multiple in-process readings per batch
Timestamps already de-duplicated and monotonic; sort and clean upstream with the time-series alignment pipeline if your MES export is unordered
Sentinel values and non-numeric payloads mapped to NaN by the batch data validation gate before they reach the moving range
Reworked or scrapped batches identifiable by a status column, so they can be excluded from the baseline with a documented assignable cause
At least 20–30 stable batches available for Phase I baseline estimation; fewer than that yields unstable limits

Why batch data breaks subgroup charts

Batch telemetry is frequently forced into subgroup-based architectures that quietly violate rational-subgrouping assumptions, and the failure is worth seeing before the code because it drives every step below.

Within-batch readings are not a rational subgroup. Multiple sensor reads inside one batch are dominated by fixed recipe parameters — temperature setpoint, hold time, charge mass — so their spread reflects instrument noise, not process capability. Averaging them into an X-bar subgroup suppresses the real signal, which is the shift between batches, and produces control limits far too tight around a meaningless within-batch variance.
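To see how tight those limits can get, here is a small simulation sketch. The batch count, the noise levels, and the subgroup size of 4 are illustrative assumptions rather than values from a real process; A2 = 0.729 is the standard X-bar R constant for n = 4.

```python
import numpy as np

rng = np.random.default_rng(0)
n_batches, reads_per_batch = 40, 4

# Assumed model: batch means vary with sigma = 1.0, while reads inside one
# batch differ only by instrument noise with sigma = 0.1.
batch_means = rng.normal(50.0, 1.0, size=n_batches)
noise = rng.normal(0.0, 0.1, size=(n_batches, reads_per_batch))
reads = batch_means[:, None] + noise

# X-bar R limits built from within-batch ranges (A2 = 0.729 for n = 4).
A2 = 0.729
xbar = reads.mean(axis=1)
r_bar = (reads.max(axis=1) - reads.min(axis=1)).mean()
ucl = xbar.mean() + A2 * r_bar
lcl = xbar.mean() - A2 * r_bar

# Share of batch means that fall outside limits built from instrument noise.
outside = float(((xbar > ucl) | (xbar < lcl)).mean())
print(f"limit half-width = {A2 * r_bar:.3f}, batches outside = {outside:.0%}")
```

Under these assumptions the limit half-width is a fraction of the batch-to-batch sigma, so most batches plot outside the limits even though nothing has changed in the process.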

Chronology, not batch ID, defines the sequence. The moving range is a lag-1 statistic: it only means anything if consecutive rows are truly consecutive in time. Sorting by batch ID instead of production timestamp scrambles the differences and fabricates or hides shifts. This is why the first step below sorts strictly by completion timestamp and validates the deltas.
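A toy illustration of the ordering problem, assuming a hypothetical export in which batch IDs were issued at scheduling time and so do not match completion order. The column names and values are made up for the example.

```python
import pandas as pd

# Hypothetical export: the CQA drifts upward by 1.0 per completed batch,
# but batch IDs do not follow the completion order.
df = pd.DataFrame({
    "batch_id": ["B001", "B002", "B003", "B004", "B005"],
    "completed_at": pd.to_datetime(
        ["2026-06-01", "2026-06-04", "2026-06-02", "2026-06-05", "2026-06-03"]
    ),
    "cqa": [50.0, 53.0, 51.0, 54.0, 52.0],
})

mr_by_id = df.sort_values("batch_id")["cqa"].diff().abs().dropna().tolist()
mr_by_time = df.sort_values("completed_at")["cqa"].diff().abs().dropna().tolist()

print("MR sorted by batch ID:  ", mr_by_id)    # [3.0, 2.0, 3.0, 2.0]
print("MR sorted by timestamp: ", mr_by_time)  # [1.0, 1.0, 1.0, 1.0]
```

Sorted by ID, the steady drift looks like large random jumps and the average moving range more than doubles; sorted by completion time, the same data shows the small, consistent step it really has.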

Step-by-Step Implementation

Step 1 — Flatten to one chronological sequence and validate ordering

Reduce the export to one row per batch at a fixed process stage — the final in-process control (IPC) checkpoint or the release-test timestamp — then sort strictly by production time. A single mis-ordered or duplicated record silently corrupts every downstream moving range, so validate the time deltas explicitly rather than trusting the source order.

import pandas as pd
import numpy as np


def prepare_batch_sequence(df: pd.DataFrame, value_col: str, time_col: str) -> pd.DataFrame:
    """Flatten to one row per batch, sorted chronologically, with validated deltas."""
    clean = df.dropna(subset=[value_col, time_col]).copy()
    clean[time_col] = pd.to_datetime(clean[time_col])
    clean = clean.sort_values(time_col).reset_index(drop=True)

    if len(clean) < 2:
        raise ValueError("I-MR charts require at least 2 consecutive batches.")

    # A non-positive delta means a duplicate or a sort failure, not real data.
    deltas = clean[time_col].diff().dropna()
    if (deltas <= pd.Timedelta(0)).any():
        raise ValueError("Duplicate or non-monotonic timestamps: deduplicate upstream.")
    return clean

Verify in isolation: hand this a deliberately shuffled frame and assert the returned index is chronological, and feed it a duplicated timestamp to confirm it raises rather than charting corrupt deltas.

Step 2 — Compute the lag-1 moving range and baseline statistics

The moving range for batch-to-batch monitoring is the absolute difference between consecutive CQA values, $\text{MR}_i = |x_i - x_{i-1}|$, with span 2. Exclude any batch flagged as reworked or scrapped from the baseline mean so a known assignable cause does not inflate $\overline{\text{MR}}$ and widen every limit.

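def baseline_stats(clean: pd.DataFrame, value_col: str, exclude_mask=None) -> dict:
    """Grand mean and average moving range from Phase I (in-control) batches only.

    exclude_mask, if given, must be a boolean Series aligned to clean's sorted index.
    """
    values = clean[value_col]
    # Blank excluded batches to NaN rather than dropping the rows, so a moving
    # range is never formed across the gap between non-consecutive batches.
    baseline = values.where(~exclude_mask) if exclude_mask is not None else values

    mr = baseline.diff().abs()          # lag-1 absolute difference, span = 2
    x_bar = baseline.mean()             # NaN (excluded) values are skipped
    mr_bar = mr.dropna().mean()         # only consecutive, non-missing pairs
    return {"x_bar": x_bar, "mr_bar": mr_bar, "n_baseline": int(baseline.notna().sum())}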

Verify: on a constant series $\overline{\text{MR}}$ must be 0.0; on [10, 12, 11, 14] it must equal the mean of [2, 1, 3], i.e. 2.0.
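One subtlety worth checking when an exclusion mask is in play: dropping excluded rows before differencing pairs up the batches on either side of the gap, which are no longer consecutive. The sketch below contrasts that with blanking the excluded value to NaN, so any moving range touching it drops out. The series and the mask are made-up illustration values.

```python
import pandas as pd

values = pd.Series([10.0, 12.0, 30.0, 11.0, 14.0])      # index 2 is a reworked batch
exclude = pd.Series([False, False, True, False, False])

# Dropping the row: 12 and 11 become neighbours although a batch ran between them.
mr_dropped = values[~exclude].diff().abs().dropna().tolist()

# Blanking the value: every moving range that touches the excluded batch is NaN.
mr_masked = values.where(~exclude).diff().abs().dropna().tolist()

print(mr_dropped)  # [2.0, 1.0, 3.0]
print(mr_masked)   # [2.0, 3.0]
```

Which treatment fits is a judgment call for the process owner; the masked form is the more conservative of the two because it only ever uses truly adjacent batches.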

Step 3 — Derive the control limits from MR̄

Limits come from $\overline{\text{MR}}$, not the raw standard deviation, because the moving-range estimate reflects short-term variation and is much less inflated by shifts or trends in the baseline than the overall standard deviation. For a moving range of span 2 the constant $d_2 = 1.128$, so the individuals limits use $3/d_2 \approx 2.66$ and the range chart uses $D_4 = 3.267$ with $D_3 = 0$:

Individuals (X) chart: $\text{UCL}_I = \overline{X} + 2.66\,\overline{\text{MR}}$; $\text{LCL}_I = \overline{X} - 2.66\,\overline{\text{MR}}$; $\text{CL} = \overline{X}$
Moving Range (MR) chart: $\text{UCL}_{MR} = 3.267\,\overline{\text{MR}}$; $\text{LCL}_{MR} = 0$

D2, D3, D4 = 1.128, 0.0, 3.267   # span-2 SPC constants (AIAG/ASTM)


def control_limits(x_bar: float, mr_bar: float) -> dict:
    """Phase I I-MR limits; 2.66 = 3 / d2 = 3 / 1.128."""
    sigma_hat = mr_bar / D2
    return {
        "x_ucl": x_bar + 3.0 * sigma_hat,   # equivalently x_bar + 2.66 * mr_bar
        "x_center": x_bar,
        "x_lcl": x_bar - 3.0 * sigma_hat,
        "mr_ucl": D4 * mr_bar,
        "mr_center": mr_bar,
        "mr_lcl": D3 * mr_bar,
        "sigma_hat": sigma_hat,
    }

A negative $\text{LCL}_I$ is mathematically valid — do not force it to zero unless negative CQA values are physically impossible (for example a concentration or a count).
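As a worked check of the arithmetic, the sketch below plugs assumed Phase I summary values (a grand mean of 50.0 and an average moving range of 1.2, both invented for illustration) into the same formulas. The `apply_physical_floor` helper is hypothetical and only shows one way to make the clamping decision explicit.

```python
from typing import Optional

D2, D4 = 1.128, 3.267            # span-2 constants, as used in this step

x_bar, mr_bar = 50.0, 1.2        # assumed Phase I summary values

sigma_hat = mr_bar / D2
x_ucl = x_bar + 3.0 * sigma_hat
x_lcl = x_bar - 3.0 * sigma_hat
mr_ucl = D4 * mr_bar

print(f"sigma_hat = {sigma_hat:.3f}")            # sigma_hat = 1.064
print(f"I chart:  [{x_lcl:.2f}, {x_ucl:.2f}]")   # I chart:  [46.81, 53.19]
print(f"MR chart: [0.00, {mr_ucl:.2f}]")         # MR chart: [0.00, 3.92]


def apply_physical_floor(lcl: float, physical_min: Optional[float] = None) -> float:
    """Raise the LCL to a physical bound only when one is explicitly supplied."""
    return lcl if physical_min is None else max(lcl, physical_min)
```

Passing `physical_min=None` leaves a negative limit untouched, so clamping happens only where someone has deliberately declared a bound for that CQA.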

Step 4 — Screen for lag-1 autocorrelation before trusting the limits

Batch processes routinely carry lag-1 autocorrelation from shared raw-material lots, thermal carryover, or equipment warm-up. Standard I-MR limits assume independent observations; when positive autocorrelation is present the moving range underestimates true variation, tightens the limits, and floods the chart with false alarms. Compute the ACF at lag 1 and gate on it before publishing limits.

def lag1_autocorrelation(clean: pd.DataFrame, value_col: str) -> float:
    """Pearson autocorrelation at lag 1 of the Individuals series."""
    return clean[value_col].autocorr(lag=1)


def guard_independence(rho1: float, threshold: float = 0.25) -> None:
    if abs(rho1) > threshold:
        raise ValueError(
            f"lag-1 autocorrelation {rho1:.2f} exceeds ±{threshold}: "
            "standard I-MR limits will over-alarm. Apply an autocorrelation-"
            "corrected d2, or switch to an EWMA/ARIMA scheme."
        )

If the lag-1 coefficient exceeds ±0.25, either correct the moving range denominator with an autocorrelation-adjusted $d_2$ or move to an EWMA scheme with $\lambda \approx 0.2$–$0.3$. The NIST/SEMATECH e-Handbook §6.3.2.1 gives validated methods for detecting and compensating serial dependence in Phase I studies.
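For orientation, a minimal EWMA chart sketch follows, using the textbook recursion $z_i = \lambda x_i + (1-\lambda) z_{i-1}$ with time-varying limits. The function name, the limit multiplier `L = 3`, and the input series are assumptions for illustration; `sigma` is passed in, and how best to estimate it under serial dependence is left open here.

```python
import numpy as np


def ewma_chart(x, target: float, sigma: float, lam: float = 0.2, L: float = 3.0):
    """Return the EWMA statistic and its upper/lower limits for a 1-D series."""
    x = np.asarray(x, dtype=float)
    z = np.empty_like(x)
    prev = target                       # z_0 starts at the target / Phase I mean
    for i, xi in enumerate(x):
        prev = lam * xi + (1.0 - lam) * prev
        z[i] = prev

    # Limits widen with the sample number i and settle at the asymptotic width.
    i = np.arange(1, len(x) + 1)
    half_width = L * sigma * np.sqrt(lam / (2.0 - lam) * (1.0 - (1.0 - lam) ** (2 * i)))
    return z, target + half_width, target - half_width


# Illustrative series: five on-target batches, then a sustained 1.5-sigma shift.
x = [50.0] * 5 + [51.5] * 10
z, ucl, lcl = ewma_chart(x, target=50.0, sigma=1.0, lam=0.2)
first_signal = int(np.argmax(z > ucl)) if (z > ucl).any() else None
print(first_signal)   # 9 -> the fifth shifted batch (0-based index)
```

In this made-up series no single value comes near an individuals UCL of 53, yet the EWMA statistic crosses its limit on the fifth shifted batch, which is the kind of small sustained shift the scheme is meant to catch.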

Step 5 — Flag out-of-control batches and assemble the annotated frame

With independence confirmed, flag each batch against the locked limits. Keep flagged rows — never delete them — so the sequence stays intact for the Western Electric run-rule evaluation described in the I-MR chart implementation guide and so any capability recalculation can trace which points were excluded and why.

def build_imr_chart(df: pd.DataFrame, value_col: str, time_col: str,
                    exclude_mask=None) -> dict:
    """End-to-end Phase I I-MR build for one measurement per batch."""
    clean = prepare_batch_sequence(df, value_col, time_col)
    guard_independence(lag1_autocorrelation(clean, value_col))

    stats = baseline_stats(clean, value_col, exclude_mask)
    limits = control_limits(stats["x_bar"], stats["mr_bar"])

    clean["MR"] = clean[value_col].diff().abs()
    clean["I_OOC"] = (clean[value_col] > limits["x_ucl"]) | (clean[value_col] < limits["x_lcl"])
    clean["MR_OOC"] = clean["MR"] > limits["mr_ucl"]
    return {"df": clean, "limits": limits, "stats": stats}

For production, wrap this in a Phase I / Phase II toggle: Phase I derives limits from the historical baseline, then serializes and version-controls them; Phase II evaluates streaming batches against the frozen limits. Recomputing limits on every new batch introduces artificial drift and inflates the false-alarm rate — the same trap the rolling-window limit recalibration workflow exists to manage deliberately.
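One possible shape for that toggle is sketched below, assuming the limits are stored as a JSON file that lives in version control. The function names, the file name, the version label, and the numeric limits are all hypothetical.

```python
import json
import tempfile
from pathlib import Path


def freeze_limits(limits: dict, path: Path, version: str) -> None:
    """Phase I: write the limits to a JSON file that can be committed."""
    payload = {"version": version, "limits": {k: float(v) for k, v in limits.items()}}
    path.write_text(json.dumps(payload, indent=2, sort_keys=True))


def evaluate_batch(value: float, prev_value: float, path: Path) -> dict:
    """Phase II: check one new batch against the frozen limits only."""
    limits = json.loads(path.read_text())["limits"]
    mr = abs(value - prev_value)
    return {
        "I_OOC": value > limits["x_ucl"] or value < limits["x_lcl"],
        "MR_OOC": mr > limits["mr_ucl"],
    }


# Assumed Phase I result, frozen once and then reused for every new batch.
phase1 = {"x_ucl": 53.19, "x_center": 50.0, "x_lcl": 46.81, "mr_ucl": 3.92}
path = Path(tempfile.gettempdir()) / "imr_limits_v1.json"
freeze_limits(phase1, path, version="2026-06-baseline")

print(evaluate_batch(54.0, prev_value=50.5, path=path))
# {'I_OOC': True, 'MR_OOC': False}
```

Nothing in `evaluate_batch` touches the baseline, so the limits can only change when someone writes a new file under a new version label.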

Step 6 — Report capability with the sigma estimator you actually used

Capability for individuals requires two distinct sigma estimates, and audits (AIAG SPC Reference Manual; ISO 22514) expect both plus a normality assessment. Use the short-term within estimate for Cpk and the overall sample deviation for Ppk:

Short-term (Cpk): $\hat\sigma_{\text{within}} = \overline{\text{MR}} / d_2$, with $d_2 = 1.128$
Long-term (Ppk): $\hat\sigma_{\text{overall}} = s$, the sample standard deviation (ddof=1)

from scipy import stats as sps


def capability(clean: pd.DataFrame, value_col: str, limits: dict,
               lsl: float, usl: float) -> dict:
    """Cpk (within/MR-based) and Ppk (overall/s-based) with a normality check."""
    x = clean[value_col].dropna()
    sigma_within = limits["sigma_hat"]           # MR_bar / d2
    sigma_overall = x.std(ddof=1)
    mean = x.mean()

    cpk = min(usl - mean, mean - lsl) / (3 * sigma_within)
    ppk = min(usl - mean, mean - lsl) / (3 * sigma_overall)
    ad_stat = sps.anderson(x).statistic       # A-D statistic, not a p-value
    return {"cpk": cpk, "ppk": ppk, "ad_stat": ad_stat,
            "n_baseline": int(x.notna().sum())}

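If Anderson-Darling or Shapiro-Wilk rejects normality at the 5% level (for scipy's `anderson`, a statistic above the 5% critical value; for Shapiro-Wilk, $p < 0.05$), transform the CQA with Box-Cox or Johnson before quoting Cpk/Ppk, and apply the same transform to the specification limits. Always record the sigma estimator, the baseline batch count, and every excluded out-of-control point so the report is reproducible under audit.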
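A minimal sketch of the test-then-transform path, using Shapiro-Wilk because `scipy.stats.shapiro` returns a p-value directly. The data are synthetic lognormal values generated only for the example, and `boxcox_limit` is a hypothetical helper for carrying a positive specification limit onto the transformed scale.

```python
import math

import numpy as np
from scipy import stats as sps

rng = np.random.default_rng(11)
x = rng.lognormal(mean=1.0, sigma=0.8, size=60)   # skewed, strictly positive

_, p_raw = sps.shapiro(x)
x_bc, lmbda = sps.boxcox(x)                        # lambda fitted by maximum likelihood
_, p_bc = sps.shapiro(x_bc)

print(f"raw p = {p_raw:.4f}, lambda = {lmbda:.2f}, transformed p = {p_bc:.4f}")


def boxcox_limit(limit: float, lmbda: float) -> float:
    """Map a positive spec limit onto the Box-Cox scale with the fitted lambda."""
    if lmbda == 0.0:
        return math.log(limit)
    return (limit ** lmbda - 1.0) / lmbda
```

Box-Cox only applies to strictly positive data, and the capability indices then have to be computed entirely on the transformed scale, with `boxcox_limit(lsl, lmbda)` and `boxcox_limit(usl, lmbda)` standing in for the raw limits.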

Verification

Confirm the full build on a minimal synthetic fixture; no live data is required. Construct a stable batch series, inject one out-of-control batch, and assert the individuals chart flags that batch while the limits match the closed-form 2.66·MR̄ rule:

import numpy as np
import pandas as pd

rng = np.random.default_rng(7)
idx = pd.date_range("2026-06-01", periods=30, freq="D")
values = rng.normal(50.0, 1.0, size=30)
values[20] = 60.0                     # one grossly out-of-control batch

df = pd.DataFrame({"cqa": values, "completed_at": idx})
result = build_imr_chart(df, value_col="cqa", time_col="completed_at")

lim = result["limits"]
# UCL must match x_bar + 2.66 * mr_bar; 2.66 is 3/1.128 rounded, hence the loose tolerance.
expected_ucl = result["stats"]["x_bar"] + 2.66 * result["stats"]["mr_bar"]
assert abs(lim["x_ucl"] - expected_ucl) < 1e-2, "UCL formula mismatch"

flagged = result["df"].index[result["df"]["I_OOC"]].tolist()
assert 20 in flagged, "injected out-of-control batch was not flagged"
print(f"flagged batches: {flagged}, UCL={lim['x_ucl']:.2f}")

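Expected output resembles flagged batches: [20] followed by the UCL, whose exact value depends on the seed. Because the injected batch is not excluded from the baseline here, its two large moving ranges inflate $\overline{\text{MR}}$ and push the UCL above the roughly 53 that a clean N(50, 1) series would give. The formula assertion is load-bearing: if it fails, sigma_hat was derived from the wrong $d_2$ or the moving range was computed within batches instead of between them.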

Root-Cause Table

Symptom	Cause	Fix
UCL and LCL collapse toward the centerline	All MR values near zero — identical batch outputs or sensor rounding	Increase measurement resolution, verify calibration, and set a minimum detectable difference threshold (Steps 2–3)
Two adjacent MR-chart flags and wider-than-expected limits	One catastrophic batch failure or a data-entry error produces two large moving ranges and inflates $\overline{\text{MR}}$	Investigate that batch, exclude it via the baseline mask with a documented cause, recompute (Steps 2, 5)
Limits drift over time	Limits are being recomputed as batches arrive, so they absorb a real mean shift or a raw-material lot change	Lock Phase I limits; update the baseline only after a formal engineering change order, not per batch (Step 5)
Chart floods with alarms despite a stable process	Lag-1 autocorrelation shrinks the moving range and tightens limits	Run the ACF guard; apply an autocorrelation-corrected $d_2$ or switch to EWMA λ ≈ 0.2–0.3 (Step 4)
Every consecutive delta reads as a shift	Sorted by batch ID instead of production timestamp	Sort strictly by completion time and validate the deltas before charting (Step 1)

A pre-flight validation should assert monotonic timestamps, minimum sample size (≥ 20–30 batches), and physical constraint boundaries before any limit is published to a production dashboard. For teams pushing results back through the plant floor, carry the flags and exclusions through the same audit path used when connecting Python to MES and SCADA systems.
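Such a pre-flight check might look like the sketch below. The function name, its parameters, and the default of 20 batches are assumptions chosen to mirror the prerequisites; it returns a list of problems instead of raising so a dashboard job can log all of them at once.

```python
import pandas as pd


def preflight(df: pd.DataFrame, value_col: str, time_col: str,
              min_batches: int = 20, physical_min=None, physical_max=None) -> list:
    """Return a list of human-readable problems; an empty list means publishable."""
    problems = []
    ts = pd.to_datetime(df[time_col])
    if not (ts.is_monotonic_increasing and ts.is_unique):
        problems.append("timestamps are not strictly increasing")
    n = int(df[value_col].notna().sum())
    if n < min_batches:
        problems.append(f"only {n} batches; need at least {min_batches}")
    if physical_min is not None and (df[value_col] < physical_min).any():
        problems.append(f"values below physical minimum {physical_min}")
    if physical_max is not None and (df[value_col] > physical_max).any():
        problems.append(f"values above physical maximum {physical_max}")
    return problems


# Made-up frame: too short for a baseline and containing an impossible reading.
demo = pd.DataFrame({
    "completed_at": pd.date_range("2026-06-01", periods=5, freq="D"),
    "cqa": [50.1, 49.8, -0.2, 50.4, 50.0],
})
print(preflight(demo, "cqa", "completed_at", physical_min=0.0))
# ['only 5 batches; need at least 20', 'values below physical minimum 0.0']
```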

FAQ

Why is the moving range computed between batches, not within a batch?

Because a batch that yields one release measurement has no within-batch spread that reflects process capability — multiple reads inside one run are dominated by fixed recipe parameters and instrument noise. The real process signal is how one batch differs from the next, so the lag-1 moving range across consecutive batches is the correct, unbiased estimator of sigma. Computing the range within a batch produces limits that are far too tight around a meaningless variance.

How many batches do I need before locking Phase I limits?

Use at least 20–30 stable batches representing normal operating conditions with no known assignable causes. Fewer than that makes $\overline{\text{MR}}$ and therefore the limits unstable — a single unusual early batch dominates the estimate. Once locked, serialize the limits and version-control them alongside the code; Phase II then evaluates new batches against those frozen values rather than recomputing on every arrival.

What do I do when lag-1 autocorrelation exceeds 0.25?

Standard I-MR limits assume independent observations, and positive autocorrelation makes the moving range underestimate true variation, tightening limits until the chart over-alarms. Either correct the denominator with an autocorrelation-adjusted $d_2$, or move to an EWMA scheme with λ around 0.2–0.3 that models the serial dependence directly. Shared raw-material lots and thermal carryover are the usual sources in batch plants, so screen for it before trusting any alarm.

Should I force a negative lower control limit to zero?

Only when negative values are physically impossible — a concentration, an impurity count, or a yield that cannot go below zero. Otherwise a negative $\text{LCL}_I$ is mathematically valid and clamping it to zero hides legitimate low-side excursions. Decide per CQA based on the physics of the measurement, and document the choice so an auditor can see it was deliberate rather than a coding accident.
