Parameter Estimation (EM)

The knowledgespaces.estimation module estimates BLIM parameters from observed response data using the Expectation-Maximization algorithm.

What it estimates

Given a knowledge structure and a matrix of student responses, the EM algorithm estimates:

  • \(\beta_q\) (slip per item): \(P(\text{incorrect} \mid q \text{ mastered})\)

  • \(\eta_q\) (guess per item): \(P(\text{correct} \mid q \text{ not mastered})\)

  • \(\pi_K\) (state prior): \(P(\text{student is in state } K)\)

High-level API

import knowledgespaces as ks

structure = ks.space_from_prerequisites(
    ["add", "sub", "mul"],
    [("add", "sub"), ("sub", "mul")],
)

result = ks.fit_blim(
    structure,
    items=["add", "sub", "mul"],
    responses=[[1,1,1], [1,1,0], [1,0,0], [0,0,0]],
    counts=[45, 30, 20, 5],  # optional: pattern frequencies
)

print(result["converged"])           # True
print(result["n_iterations"])        # number of EM iterations
print(result["beta"])                # slip per item (dict)
print(result["eta"])                 # guess per item (dict)
print(result["log_likelihood"])      # final log-likelihood
print(result["degenerate_items"])    # items with beta + eta >= 1 (see below)
print(result["gof"]["BIC"])          # Schwarz BIC (primary model-selection criterion)
print(result["gof"]["BIC_npatterns"])# pks-compatible variant (see below)
print(result["gof"]["AIC"])          # Akaike information criterion
print(result["gof"]["G2"])           # likelihood ratio statistic

Low-level API

For full control:

from knowledgespaces.estimation import estimate_blim, ResponseMatrix
import numpy as np

data = ResponseMatrix(
    items=["add", "sub", "mul"],
    patterns=np.array([[1,1,1], [1,1,0], [1,0,0], [0,0,0]]),
    counts=np.array([45, 30, 20, 5]),
)

result = estimate_blim(
    structure, data,
    max_iter=500,
    tol=1e-6,
    beta_init=np.array([0.05, 0.1, 0.15]),  # per-item initialization
    eta_init=0.1,                          # or global scalar
)

print(result.beta_for("add"))
print(result.eta_for("mul"))
print(result.pi)  # state prior distribution

How it works

The EM algorithm alternates:

  1. E-step: For each response pattern, compute the posterior probability of each knowledge state.

  2. M-step: Re-estimate \(\beta_q\), \(\eta_q\), and \(\pi_K\) from the weighted sufficient statistics.

The log-likelihood is guaranteed to increase at each iteration. Convergence is declared when the change in log-likelihood falls below tol.

The M-step clips each \(\beta_q\) and \(\eta_q\) independently into \((0, 1)\), as in pks::blim(). The joint condition \(\beta_q + \eta_q < 1\) is an informativeness requirement on the knowledge structure, not part of the parameter space (Falmagne & Doignon, 2011, §11) — enforcing it inside the loop would violate EM monotonicity (Dempster, Laird & Rubin, 1977).

Random restarts

EM can converge to local optima. To mitigate this, use estimate_blim_restarts, which runs EM several times from random initial values and returns the best fit:

from knowledgespaces.estimation import estimate_blim_restarts

result = estimate_blim_restarts(
    structure, data,
    n_restarts=10,
    seed=42,              # reproducibility
    init_strategy="uniform",  # or "pks" to mirror pks::blim(..., randinit=TRUE)
)

Degenerate items

Items whose final \(\beta_q + \eta_q \geq 1 - 10^{-3}\) are non-informative under the current knowledge structure: a mastering respondent is no more likely to answer correctly than a non-mastering one. The estimator surfaces them via result["degenerate_items"] and emits a ConvergenceWarning. The recommended remedy is structural — drop the item or revise the structure (Spoto, Stefanutti & Vidotto, 2013, Behavior Research Methods, 45, 1197–1211).

Goodness of fit and model selection

result["gof"] collects standard diagnostics:

  • G2, df, p_value — likelihood-ratio \(G^2\) test against the saturated multinomial, with degrees of freedom $\min(2^Q - 1, N)

    • \mathrm{npar}\( (`pks` convention; caps at \)N$ when the sample is smaller than the full pattern space).

  • `AIC = -2,\mathrm{LL} + 2,\mathrm{npar}$.

  • BIC = -2\,\mathrm{LL} + \log(N)\,\mathrm{npar} — the asymptotically consistent Schwarz (1978) form, based on the total number of respondents \(N\). This is the recommended primary criterion for model selection.

  • BIC_npatterns = -2\,\mathrm{LL} + \log(n_\text{patterns})\, \mathrm{npar} — the variant returned by R pks::blim() via its nobs.blim override. Exposed for cross-package replication only; it is not asymptotically consistent because \(n_\text{patterns}\) is bounded above by \(2^Q\).