Difficulty Model - Kenquête Admin

Training Data

Model

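In-sample R² approaches 1 when training data is thin, which is textbook overfitting rather than evidence of a good model. A trustworthy fit estimate requires the number of completions N to be much larger than the feature count (N ≫ feature count).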
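The effect is easy to reproduce on toy data. The sketch below is illustrative only: it uses plain least squares on a single made-up feature rather than the admin's ridge fit, and every number in it is invented. With two parameters and two observations the line passes through both points, so in-sample R² is exactly 1 while held-out R² collapses.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return slope, mean_y - slope * mean_x


# Two parameters, two observations: N equals the parameter count.
train_x, train_y = [1.0, 2.0], [2.0, 5.0]
slope, intercept = fit_line(train_x, train_y)
in_sample = r_squared(train_y, [slope * x + intercept for x in train_x])

# Invented held-out observations that the fitted line never saw.
test_x, test_y = [3.0, 4.0], [4.0, 4.5]
held_out = r_squared(test_y, [slope * x + intercept for x in test_x])

print(in_sample, held_out)
```

A negative held-out R² means the fitted line predicts worse than simply guessing the mean of the held-out values.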

Retrain reads every completion, fits a ridge regression in-process, persists the coefficients to $DATA_DIR/difficulty_model.json, and rescores all stored puzzles. Because the coefficients are persisted, the active model survives restarts.
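A minimal sketch of the fit-and-persist step, assuming a closed-form ridge solution. The function names, the penalty value, the absence of an intercept, and the JSON layout ({"coefficients": [...]}) are all hypothetical; only the file name difficulty_model.json comes from the description above.

```python
import json
import os
import tempfile


def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ridge(rows, targets, lam=1.0):
    """Closed-form ridge: solve (X^T X + lam * I) w = X^T y.

    No intercept term, for brevity. With lam > 0 the system is always solvable.
    """
    k = len(rows[0])
    xtx = [
        [sum(r[i] * r[j] for r in rows) + (lam if i == j else 0.0) for j in range(k)]
        for i in range(k)
    ]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    return solve_linear(xtx, xty)


def save_model(coefficients, data_dir):
    """Write coefficients to <data_dir>/difficulty_model.json (hypothetical layout)."""
    path = os.path.join(data_dir, "difficulty_model.json")
    with open(path, "w", encoding="utf-8") as fh:
        json.dump({"coefficients": coefficients}, fh)
    return path


def load_model(data_dir):
    """Read the coefficients back, as a restart would."""
    path = os.path.join(data_dir, "difficulty_model.json")
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)["coefficients"]


# Invented feature rows and log-time targets, one per completion.
rows = [[1.0, 0.0], [0.0, 1.0]]
targets = [2.0, 3.0]
weights = fit_ridge(rows, targets, lam=1.0)

# A temporary directory stands in for $DATA_DIR.
with tempfile.TemporaryDirectory() as data_dir:
    save_model(weights, data_dir)
    restored = load_model(data_dir)
```

The penalty shrinks the weights toward zero: an unpenalized fit on this toy data would return the targets themselves.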

Score distribution by requested tier

Each row is a histogram of stored difficulty_score for puzzles requested as that tier. Ideally, tiers separate cleanly with little overlap. Heavy overlap means the structural tier boundaries don't match perceived difficulty.
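One way to build such rows is sketched below. The tier names, the scores, and the bin width are invented for illustration, and the real score scale may differ.

```python
import math
from collections import defaultdict


def histogram_by_tier(puzzles, bin_width=1.0):
    """Map each tier to {bin_index: count}, where bin_index = floor(score / bin_width)."""
    rows = defaultdict(lambda: defaultdict(int))
    for tier, score in puzzles:
        rows[tier][math.floor(score / bin_width)] += 1
    return {tier: dict(sorted(bins.items())) for tier, bins in rows.items()}


# (requested_tier, difficulty_score) pairs; values are made up.
puzzles = [
    ("easy", 0.4),
    ("easy", 0.7),
    ("easy", 1.2),
    ("hard", 1.1),
    ("hard", 2.5),
]
histograms = histogram_by_tier(puzzles)

# Bins populated by both tiers indicate overlap between them.
overlap = set(histograms["easy"]) & set(histograms["hard"])
```

Here both tiers land in bin 1, which is the kind of overlap the chart is meant to expose.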

Predicted score vs actual log time

Each dot is one completion: x = stored difficulty_score, y = ln(active_time in seconds). A good model produces a positive trend; tight bands per size mean the model captures size-dependent timing.
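The point mapping, plus a quick numeric check of the trend, could look like the sketch below. The Pearson correlation is an addition for illustration and is not something the page claims to compute. The sample completions are invented.

```python
import math


def scatter_points(completions):
    """Turn (difficulty_score, active_seconds) pairs into (x, y) = (score, ln seconds)."""
    return [(score, math.log(seconds)) for score, seconds in completions if seconds > 0]


def pearson(points):
    """Pearson correlation coefficient of the x and y coordinates."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in points)
    var_x = sum((x - mean_x) ** 2 for x, _ in points)
    var_y = sum((y - mean_y) ** 2 for _, y in points)
    return cov / math.sqrt(var_x * var_y)


# Invented completions whose log time grows exactly linearly with the score.
completions = [(1.0, math.exp(1.0)), (2.0, math.exp(2.0)), (3.0, math.exp(3.0))]
points = scatter_points(completions)
trend = pearson(points)
```

A value near +1 corresponds to the positive trend described above. Values near 0 would suggest the score carries little information about solve time.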

Completions per puzzle

How many times each puzzle has been solved. Most production puzzles sit at zero; a long tail to the right means a few popular puzzles dominate the training signal.
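A sketch of the underlying count, assuming completions reference puzzles by an id. The ids and data are hypothetical. The key detail is that puzzles with no completions must be included explicitly, otherwise the zero bar disappears.

```python
from collections import Counter


def completions_per_puzzle(puzzle_ids, completion_puzzle_ids):
    """Count completions for every stored puzzle, including those never solved."""
    counts = Counter(completion_puzzle_ids)
    return {pid: counts.get(pid, 0) for pid in puzzle_ids}


def distribution(per_puzzle):
    """Map completion count -> number of puzzles with that count, sorted by count."""
    return dict(sorted(Counter(per_puzzle.values()).items()))


# Invented ids: four stored puzzles, four completions concentrated on one puzzle.
puzzle_ids = ["p1", "p2", "p3", "p4"]
completion_puzzle_ids = ["p1", "p1", "p1", "p3"]

per_puzzle = completions_per_puzzle(puzzle_ids, completion_puzzle_ids)
dist = distribution(per_puzzle)
```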

Coverage by size × tier

Number of scored puzzles in each (size, requested-tier) bucket and the median stored score. Sparse cells highlight combos where the generator struggles.
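The bucket summary can be sketched as a group-by followed by a count and a median. The sizes, tier names, and scores below are invented.

```python
from collections import defaultdict
from statistics import median


def coverage(puzzles):
    """Map (size, tier) -> (puzzle count, median difficulty_score)."""
    buckets = defaultdict(list)
    for size, tier, score in puzzles:
        buckets[(size, tier)].append(score)
    return {key: (len(scores), median(scores)) for key, scores in buckets.items()}


# (size, requested_tier, difficulty_score) triples; values are made up.
puzzles = [
    (4, "easy", 1.0),
    (4, "easy", 3.0),
    (4, "easy", 2.0),
    (6, "hard", 5.0),
]
table = coverage(puzzles)
```

Combinations that never appear in the input are simply absent from the result, so a full grid needs the missing cells filled in with a count of zero.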