MIAAM V1.0 evaluations

Knowledge Tracing experiments on the MIAAM/NeurIPS dataset using pyKT. The dataset is available on Hugging Face at GAIMHE/Neurips.

PyKT

1. Process the dataset

Run notebooks/pykt_processing.ipynb until "4) Construct train/test dataset for other baselines". The notebook first filters out exercises that have no screenshot in data/screenshots/compressed/, so all retained exercises have visual content available. It then produces two files in miaam_pykt_data:

data.txt — student sequences in pyKT's format
gkt_graph_transition.npz — activity-level adjacency matrix for graph-aware models (e.g. GKT), indexed by activity_id_int

2. Install pyKT

Get pykt-toolkit using git clone https://github.com/pykt-team/pykt-toolkit.git. Install it with:

pip install -e pykt-toolkit

3. Add MIAAM to pyKT

Copy the data and preprocessing script into pykt-toolkit:

mkdir pykt-toolkit/data/miaamv1
cp miaam_pykt_data/* pykt-toolkit/data/miaamv1
cp miaam_pykt_data/miaamv1_preprocess.py pykt-toolkit/pykt/preprocess/

Register the dataset path in pykt-toolkit/examples/data_preprocess.py:

dname2paths = {
    "miaamv1": "../data/miaamv1/",
    ...
}

Register the preprocessing script in pykt-toolkit/pykt/preprocess/data_preprocess.py:

elif dataset_name == "miaamv1":
    from .miaamv1_preprocess import read_data_from_csv

4. Run pyKT's data preprocessing

cd pykt-toolkit/examples
python data_preprocess.py --dataset_name=miaamv1

5. Break down the original dataset into train/test following PyKT's split

Run the section "4) Construct train/test dataset for other baselines" from notebooks/pykt_processing.ipynb. This reads the student IDs from pyKT's split files and filters the enriched maths_data_filtered.parquet dataframe accordingly, producing interactions_train.parquet and interactions_test.parquet at the project root. Both files include the integer-encoded columns (user_id_int, exercise_id_int, activity_id_int) added in section 2, making them directly comparable with pyKT's results.

LLM / VLM evaluation

Zero-shot knowledge tracing with an OpenRouter-served LLM (open or closed-weight). Two tasks are supported:

prob — predict the probability the next attempt is correct (compared against pyKT's AUC / accuracy / Brier).
answer — on multiple-choice items, predict the option index the student will pick (top-1 accuracy against random and "always-correct" baselines).

See evaluation_llm/README.md for the full pipeline (build windows → run eval → score), modality / expert-knowledge knobs, and SLURM submission. Quick start once interactions_test.parquet and the dataset are in place:

export OPENROUTER_API_KEY=sk-or-v1-...
jupyter notebook notebooks/run_eval_openrouter.ipynb

Licensing

The evaluation scripts in this repository are released under the MIT License.

Note that running these scripts requires access to the MIAAM dataset, which is released under CC-BY-NC-SA 4.0 and hosted on Hugging Face. Usage of the dataset is subject to its own licensing terms.

Name		Name	Last commit message	Last commit date
Latest commit History 2 Commits
evaluation_llm		evaluation_llm
miaam_pykt_data		miaam_pykt_data
notebooks		notebooks
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

MIAAM V1.0 evaluations

PyKT

1. Process the dataset

2. Install pyKT

3. Add MIAAM to pyKT

4. Run pyKT's data preprocessing

5. Break down the original dataset into train/test following PyKT's split

LLM / VLM evaluation

Licensing

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

MIAAM V1.0 evaluations

PyKT

1. Process the dataset

2. Install pyKT

3. Add MIAAM to pyKT

4. Run pyKT's data preprocessing

5. Break down the original dataset into train/test following PyKT's split

LLM / VLM evaluation

Licensing

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages