Parameter Estimation (EM)¶
The knowledgespaces.estimation module estimates BLIM parameters from observed
response data using the Expectation-Maximization algorithm.
What it estimates¶
Given a knowledge structure and a matrix of student responses, the EM algorithm estimates:

- \(\beta_q\) (slip per item): \(P(\text{incorrect} \mid q \text{ mastered})\)
- \(\eta_q\) (guess per item): \(P(\text{correct} \mid q \text{ not mastered})\)
- \(\pi_K\) (state prior): \(P(\text{student is in state } K)\)
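Concretely, the BLIM assumes responses are conditionally independent given the state (local independence). Under that standard assumption, the probability of a binary response pattern \(R\) given state \(K\) is

\[
P(R \mid K) = \prod_{q \in K} \beta_q^{\,1 - R_q} (1 - \beta_q)^{R_q}
\prod_{q \notin K} \eta_q^{\,R_q} (1 - \eta_q)^{1 - R_q},
\qquad
P(R) = \sum_{K} \pi_K \, P(R \mid K),
\]

where \(R_q \in \{0, 1\}\) is the response to item \(q\). EM maximizes the log-likelihood \(\sum_R n_R \log P(R)\) over the observed pattern counts \(n_R\).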
High-level API¶
import knowledgespaces as ks

structure = ks.space_from_prerequisites(
    ["add", "sub", "mul"],
    [("add", "sub"), ("sub", "mul")],
)

result = ks.fit_blim(
    structure,
    items=["add", "sub", "mul"],
    responses=[[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]],
    counts=[45, 30, 20, 5],  # optional: pattern frequencies
)

print(result["converged"])       # True
print(result["n_iterations"])    # number of EM iterations
print(result["beta"])            # slip per item (dict)
print(result["eta"])             # guess per item (dict)
print(result["log_likelihood"])  # final log-likelihood
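As a concrete illustration of what result["log_likelihood"] measures, here is a plain-NumPy sketch (independent of the library) that evaluates the BLIM marginal log-likelihood for the toy chain above; the state list and parameter values are illustrative assumptions:

```python
import numpy as np

# Knowledge states of the add -> sub -> mul chain (columns: add, sub, mul).
states = np.array([[0, 0, 0], [1, 0, 0], [1, 1, 0], [1, 1, 1]])
patterns = np.array([[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]])
counts = np.array([45, 30, 20, 5])

beta = np.array([0.05, 0.05, 0.05])  # illustrative slip probabilities
eta = np.array([0.10, 0.10, 0.10])   # illustrative guess probabilities
pi = np.full(len(states), 0.25)      # illustrative uniform state prior

def pattern_given_state(R, K):
    """P(R | K): slip on mastered items, guess on unmastered ones."""
    p_correct = np.where(K == 1, 1 - beta, eta)  # per-item P(correct | K)
    return float(np.prod(np.where(R == 1, p_correct, 1 - p_correct)))

# Marginal probability of each pattern: P(R) = sum_K pi_K * P(R | K)
p_r = np.array([sum(pi[k] * pattern_given_state(R, states[k])
                    for k in range(len(states)))
                for R in patterns])
log_likelihood = float(counts @ np.log(p_r))
print(round(log_likelihood, 2))
```

A fitted model should assign higher \(P(R)\) to the frequent patterns, yielding a larger (less negative) log-likelihood than these hand-picked values.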
Low-level API¶
For full control:
from knowledgespaces.estimation import estimate_blim, ResponseMatrix
import numpy as np

data = ResponseMatrix(
    items=["add", "sub", "mul"],
    patterns=np.array([[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]),
    counts=np.array([45, 30, 20, 5]),
)

result = estimate_blim(
    structure,  # the knowledge structure built in the example above
    data,
    max_iter=500,
    tol=1e-6,
    beta_init=np.array([0.05, 0.1, 0.15]),  # per-item initialization
    eta_init=0.1,                           # or a global scalar
)

print(result.beta_for("add"))
print(result.eta_for("mul"))
print(result.pi)  # state prior distribution
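The state prior can be turned into per-item mastery estimates. A plain-NumPy sketch, assuming the prior is ordered to match a states array (the ordering convention and the numbers below are assumptions for illustration):

```python
import numpy as np

# States of the add -> sub -> mul chain, in an assumed fixed order.
states = np.array([[0, 0, 0], [1, 0, 0], [1, 1, 0], [1, 1, 1]])
pi = np.array([0.05, 0.20, 0.30, 0.45])  # illustrative fitted state prior

# P(item mastered) = sum of pi_K over the states K that contain the item.
mastery = pi @ states
print({item: float(m) for item, m in zip(["add", "sub", "mul"], mastery)})
```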
How it works¶
The EM algorithm alternates two steps:

- E-step: for each response pattern, compute the posterior probability of each knowledge state.
- M-step: re-estimate \(\beta_q\), \(\eta_q\), and \(\pi_K\) from the weighted sufficient statistics.
Each iteration is guaranteed not to decrease the log-likelihood. Convergence is declared when the change in log-likelihood between iterations falls below tol.
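The alternation above can be sketched in plain NumPy for the toy chain example. Everything here (state list, initial values, tolerance) is an illustrative assumption, not the library's implementation:

```python
import numpy as np

# Toy setup matching the earlier example.
states = np.array([[0, 0, 0], [1, 0, 0], [1, 1, 0], [1, 1, 1]])
patterns = np.array([[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]])
counts = np.array([45.0, 30.0, 20.0, 5.0])

beta = np.full(3, 0.1)           # initial slip per item
eta = np.full(3, 0.1)            # initial guess per item
pi = np.full(len(states), 0.25)  # initial uniform state prior

def cond_probs(beta, eta):
    """P(R | K) for every pattern/state pair -> (n_patterns, n_states)."""
    pc = np.where(states == 1, 1 - beta, eta)             # P(correct | K) per item
    lik = np.where(patterns[:, None, :] == 1, pc, 1 - pc)
    return lik.prod(axis=2)

prev_ll, tol = -np.inf, 1e-6
for _ in range(500):
    # E-step: posterior over states for each observed pattern.
    joint = cond_probs(beta, eta) * pi   # (n_patterns, n_states)
    p_r = joint.sum(axis=1)
    post = joint / p_r[:, None]
    ll = float(counts @ np.log(p_r))
    if ll - prev_ll < tol:
        break
    prev_ll = ll

    # M-step: re-estimate from posterior-weighted counts.
    w = post * counts[:, None]           # expected count per (pattern, state)
    slips = np.einsum('rk,kq,rq->q', w, states, 1 - patterns)
    mastered = np.einsum('rk,kq->q', w, states)
    guesses = np.einsum('rk,kq,rq->q', w, 1 - states, patterns)
    unmastered = np.einsum('rk,kq->q', w, 1 - states)
    beta = slips / mastered              # expected slips among masterers
    eta = guesses / unmastered           # expected lucky guesses among non-masterers
    pi = w.sum(axis=0) / counts.sum()    # expected share of each state

print(round(ll, 2))
```

Because each M-step maximizes the expected complete-data log-likelihood, ll is non-decreasing across iterations, which is what makes the tol-based stopping rule safe.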