io-chess
UCI chess engine
The Training module is a PyTorch pipeline that trains the Factorized Mixture-of-Experts (MoE) neural network from the binary datasets produced by the Preprocessing module. It uses memory-mapped dataset loaders, so training can run over hundreds of millions of positions without loading the whole dataset into RAM.
Training is organised into three phases, designed to prevent expert collapse and to ensure that each expert genuinely specialises in its assigned position type.
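
As an illustration of the memory-mapped loading described above, here is a minimal sketch that uses only the Python standard library. The record layout (`RECORD`), the class name `MmapPositions`, and the helper `write_demo_file` are hypothetical and chosen for the example; they are not the format or API of `dataset.py`. The sketch also shows one plausible reading of "chunked random sampling": chunks are visited in random order and shuffled internally, which keeps reads mostly local on disk.

```python
import mmap
import os
import random
import struct
import tempfile

# Hypothetical record layout (an assumption, NOT the project's real format):
# 64 bytes of piece codes, int16 eval in centipawns, uint8 result (0=loss, 1=draw, 2=win).
RECORD = struct.Struct("<64shB")


class MmapPositions:
    """Read fixed-size records from a binary file without loading it all into RAM."""

    def __init__(self, path):
        self._file = open(path, "rb")
        self._map = mmap.mmap(self._file.fileno(), 0, access=mmap.ACCESS_READ)
        self._count = len(self._map) // RECORD.size

    def __len__(self):
        return self._count

    def __getitem__(self, index):
        if not 0 <= index < self._count:
            raise IndexError(index)
        # unpack_from reads directly from the mapping; the OS pages data in on demand.
        return RECORD.unpack_from(self._map, index * RECORD.size)

    def chunked_indices(self, chunk_size, rng):
        """Yield every index once: chunks in random order, shuffled within each chunk."""
        starts = list(range(0, self._count, chunk_size))
        rng.shuffle(starts)
        for start in starts:
            chunk = list(range(start, min(start + chunk_size, self._count)))
            rng.shuffle(chunk)
            yield from chunk

    def close(self):
        self._map.close()
        self._file.close()


def write_demo_file(path, n):
    """Write n synthetic records so the sketch can be run end to end."""
    with open(path, "wb") as f:
        for i in range(n):
            f.write(RECORD.pack(bytes([i % 13]) * 64, i - 5, i % 3))


if __name__ == "__main__":
    demo_path = os.path.join(tempfile.mkdtemp(), "demo.bin")
    write_demo_file(demo_path, 10)
    positions = MmapPositions(demo_path)
    for idx in positions.chunked_indices(4, random.Random(0)):
        board, eval_cp, result = positions[idx]
        print(idx, eval_cp, result)
    positions.close()
```

In a PyTorch pipeline, a class like this would typically be wrapped in a `torch.utils.data.Dataset` that converts each record to tensors.
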
Detailed documentation:
| File | Purpose |
|---|---|
| train.py | Main training loop with multi-phase curriculum |
| model.py | Network architecture: Factorized MoE with 1×1 mixer and expert bodies |
| dataset.py | Memory-mapped binary dataset with chunked random sampling |
| loss.py | Custom loss functions: WDL (win/draw/loss) cross-entropy and evaluation MSE |
| export.py | Converts PyTorch state dict to the engine's flat binary weight format |
| train.sh | End-to-end shell script that runs training phases sequentially |
| utils.py | Logging, colour-coded console output, and training utilities |
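
To make the two loss terms listed for `loss.py` concrete, the sketch below computes a WDL cross-entropy and an evaluation MSE for a single position in plain Python. How the project actually combines them is not stated here, so the weighted sum and the `eval_weight` parameter are assumptions, as are all function names and the choice of evaluation scale. A PyTorch version would normally use `torch.nn.functional.cross_entropy` and `torch.nn.functional.mse_loss` on batched tensors instead.

```python
import math


def wdl_cross_entropy(logits, target):
    """Cross-entropy between softmax(logits) and a target (win, draw, loss) distribution."""
    # Log-sum-exp with the maximum subtracted for numerical stability.
    peak = max(logits)
    log_norm = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return -sum(t * (z - log_norm) for z, t in zip(logits, target))


def eval_mse(predicted, target):
    """Squared error between predicted and target evaluation for one position."""
    return (predicted - target) ** 2


def combined_loss(logits, wdl_target, eval_pred, eval_target, eval_weight=0.5):
    """Assumed combination: WDL cross-entropy plus a weighted eval MSE term."""
    return wdl_cross_entropy(logits, wdl_target) + eval_weight * eval_mse(eval_pred, eval_target)


if __name__ == "__main__":
    # Uniform logits against a certain win, with evals on an assumed 0..1 scale.
    print(combined_loss([0.0, 0.0, 0.0], (1.0, 0.0, 0.0), 0.25, 0.75))
```

With uniform logits and a certain-win target, the cross-entropy term equals ln 3, which is a quick sanity check for any implementation of this loss.
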