Knowledge Tracing experiments on the MIAAM/NeurIPS dataset using pyKT. The dataset is available on Hugging Face at GAIMHE/Neurips.
Run notebooks/pykt_processing.ipynb until "4) Construct train/test dataset for other baselines". The notebook first filters out exercises that have no screenshot in data/screenshots/compressed/, so all retained exercises have visual content available. It then produces two files in miaam_pykt_data:
data.txt— student sequences in pyKT's formatgkt_graph_transition.npz— activity-level adjacency matrix for graph-aware models (e.g. GKT), indexed byactivity_id_int
Get pykt-toolkit using git clone https://github.com/pykt-team/pykt-toolkit.git. Install it with:
pip install -e pykt-toolkitCopy the data and preprocessing script into pykt-toolkit:
mkdir pykt-toolkit/data/miaamv1
cp miaam_pykt_data/* pykt-toolkit/data/miaamv1
cp miaam_pykt_data/miaamv1_preprocess.py pykt-toolkit/pykt/preprocess/Register the dataset path in pykt-toolkit/examples/data_preprocess.py:
dname2paths = {
"miaamv1": "../data/miaamv1/",
...
}Register the preprocessing script in pykt-toolkit/pykt/preprocess/data_preprocess.py:
elif dataset_name == "miaamv1":
from .miaamv1_preprocess import read_data_from_csvcd pykt-toolkit/examples
python data_preprocess.py --dataset_name=miaamv1Run the section "4) Construct train/test dataset for other baselines" from notebooks/pykt_processing.ipynb.
This reads the student IDs from pyKT's split files and filters the enriched maths_data_filtered.parquet dataframe accordingly, producing interactions_train.parquet and interactions_test.parquet at the project root. Both files include the integer-encoded columns (user_id_int, exercise_id_int, activity_id_int) added in section 2, making them directly comparable with pyKT's results.
Zero-shot knowledge tracing with an OpenRouter-served LLM (open or closed-weight). Two tasks are supported:
- prob — predict the probability the next attempt is correct (compared against pyKT's AUC / accuracy / Brier).
- answer — on multiple-choice items, predict the option index the student will pick (top-1 accuracy against random and "always-correct" baselines).
See evaluation_llm/README.md for the full pipeline (build windows → run
eval → score), modality / expert-knowledge knobs, and SLURM submission.
Quick start once interactions_test.parquet and the dataset are in place:
export OPENROUTER_API_KEY=sk-or-v1-...
jupyter notebook notebooks/run_eval_openrouter.ipynbThe evaluation scripts in this repository are released under the MIT License.
Note that running these scripts requires access to the MIAAM dataset, which is released under CC-BY-NC-SA 4.0 and hosted on Hugging Face. Usage of the dataset is subject to its own licensing terms.