Parameter Estimation (EM)¶

The knowledgespaces.estimation module estimates BLIM parameters from observed response data using the Expectation-Maximization algorithm.

What it estimates¶

Given a knowledge structure and a matrix of student responses, the EM algorithm estimates:

$\beta_q$ (slip per item): $P(\text{incorrect} \mid q \text{ mastered})$
$\eta_q$ (guess per item): $P(\text{correct} \mid q \text{ not mastered})$
$\pi_K$ (state prior): $P(\text{student is in state } K)$

High-level API¶

import knowledgespaces as ks

structure = ks.space_from_prerequisites(
    ["add", "sub", "mul"],
    [("add", "sub"), ("sub", "mul")],
)

result = ks.fit_blim(
    structure,
    items=["add", "sub", "mul"],
    responses=[[1,1,1], [1,1,0], [1,0,0], [0,0,0]],
    counts=[45, 30, 20, 5],  # optional: pattern frequencies
)

print(result["converged"])           # True
print(result["n_iterations"])        # number of EM iterations
print(result["beta"])                # slip per item (dict)
print(result["eta"])                 # guess per item (dict)
print(result["log_likelihood"])      # final log-likelihood
print(result["degenerate_items"])    # items with beta + eta >= 1 (see below)
print(result["gof"]["BIC"])          # Schwarz BIC (primary model-selection criterion)
print(result["gof"]["BIC_npatterns"])# pks-compatible variant (see below)
print(result["gof"]["AIC"])          # Akaike information criterion
print(result["gof"]["G2"])           # likelihood ratio statistic

Low-level API¶

For full control:

from knowledgespaces.estimation import estimate_blim, ResponseMatrix
import numpy as np

data = ResponseMatrix(
    items=["add", "sub", "mul"],
    patterns=np.array([[1,1,1], [1,1,0], [1,0,0], [0,0,0]]),
    counts=np.array([45, 30, 20, 5]),
)

result = estimate_blim(
    structure, data,
    max_iter=500,
    tol=1e-6,
    beta_init=np.array([0.05, 0.1, 0.15]),  # per-item initialization
    eta_init=0.1,                          # or global scalar
)

print(result.beta_for("add"))
print(result.eta_for("mul"))
print(result.pi)  # state prior distribution

How it works¶

The EM algorithm alternates:

E-step: For each response pattern, compute the posterior probability of each knowledge state.
M-step: Re-estimate $\beta_q$, $\eta_q$, and $\pi_K$ from the weighted sufficient statistics.

The log-likelihood is guaranteed to increase at each iteration. Convergence is declared when the change in log-likelihood falls below tol.

The M-step clips each $\beta_q$ and $\eta_q$ independently into $(0, 1)$, as in pks::blim(). The joint condition $\beta_q + \eta_q < 1$ is an informativeness requirement on the knowledge structure, not part of the parameter space (Falmagne & Doignon, 2011, §11) — enforcing it inside the loop would violate EM monotonicity (Dempster, Laird & Rubin, 1977).

Random restarts¶

EM can converge to local optima. To mitigate this, use estimate_blim_restarts, which runs EM several times from random initial values and returns the best fit:

from knowledgespaces.estimation import estimate_blim_restarts

result = estimate_blim_restarts(
    structure, data,
    n_restarts=10,
    seed=42,              # reproducibility
    init_strategy="uniform",  # or "pks" to mirror pks::blim(..., randinit=TRUE)
)

Degenerate items¶

Items whose final $\beta_q + \eta_q \geq 1 - 10^{-3}$ are non-informative under the current knowledge structure: a mastering respondent is no more likely to answer correctly than a non-mastering one. The estimator surfaces them via result["degenerate_items"] and emits a ConvergenceWarning. The recommended remedy is structural — drop the item or revise the structure (Spoto, Stefanutti & Vidotto, 2013, Behavior Research Methods, 45, 1197–1211).

Goodness of fit and model selection¶

result["gof"] collects standard diagnostics:

G2, df, p_value — likelihood-ratio $G^2$ test against the saturated multinomial, with degrees of freedom $\min(2^Q - 1, N)
- \mathrm{npar}$ (`pks` convention; caps at $N$ when the sample is smaller than the full pattern space).
`AIC = -2,\mathrm{LL} + 2,\mathrm{npar}$.
BIC = -2\,\mathrm{LL} + \log(N)\,\mathrm{npar} — the asymptotically consistent Schwarz (1978) form, based on the total number of respondents $N$. This is the recommended primary criterion for model selection.
BIC_npatterns = -2\,\mathrm{LL} + \log(n_\text{patterns})\, \mathrm{npar} — the variant returned by R pks::blim() via its nobs.blim override. Exposed for cross-package replication only; it is not asymptotically consistent because $n_\text{patterns}$ is bounded above by $2^Q$.