The ChessAIThon project (2025-1-ES01-KA220-VET-000354329) is co-funded by the European Union. The views and opinions expressed in this publication are those of the author(s) only and do not necessarily reflect those of the European Union or the Spanish Service for the Internationalisation of Education (SEPIE). Neither the European Union nor the National Agency SEPIE can be held responsible for them.
Introduce students to the challenge of how Large Language Models (LLMs) interact with chess data. Explain that LLMs are designed to process human language as sequences of tokens (words or word pieces), not structured game data.
The Translation Layer
The key teaching point is that before an LLM can analyze a game or suggest a move, the structured chess representation (such as FEN or our custom 77x8x8 array) must be presented as a sequential, text-based input that the LLM can process. This is typically done by serializing the board state into a string, which the model then splits into tokens.
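The translation step can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual pipeline: the function name `fen_to_tokens`, the one-token-per-square scheme, and the use of "." for empty squares are all assumptions made for teaching purposes.

```python
from typing import List


def fen_to_tokens(fen: str) -> List[str]:
    """Turn a FEN string into one token per square, plus the side to move.

    Hypothetical scheme for illustration: piece letters are kept as-is,
    empty squares become ".", and ranks are separated by "/".
    """
    placement, side_to_move = fen.split()[:2]
    tokens: List[str] = []
    for rank in placement.split("/"):
        for ch in rank:
            if ch.isdigit():
                # In FEN, a digit stands for that many consecutive empty squares.
                tokens.extend(["."] * int(ch))
            else:
                tokens.append(ch)
        tokens.append("/")  # rank separator
    tokens.pop()  # drop the separator after the last rank
    tokens.append(side_to_move)  # "w" or "b"
    return tokens


START_FEN = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"
tokens = fen_to_tokens(START_FEN)
prompt = " ".join(tokens)  # sequential text that can be placed in an LLM prompt
```

Students can print `prompt` and compare it with the original FEN string to see that both carry the same board, only at different levels of compression.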
Data Diversity in AI Architectures
For example, a FEN string is already a concise text input, and even our numerical 77x8x8 representation can be linearized and fed to a model. Students learn that their initial data engineering work (converting moves to 0-4096 indices and positions to a 77-layer tensor) is crucial for training the chess model, while the separate LLM may need only a simpler text prompt to function.
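The contrast between the two input styles can be made concrete with a short sketch. Everything below is an assumption for illustration: the meaning of each of the 77 planes is not specified here, so the tensor is filled with zeros plus one made-up entry, and the move decoding assumes `index = from_square * 64 + to_square` with a1 = 0, which may differ from the project's actual encoding.

```python
from typing import List

PLANES, RANKS, FILES = 77, 8, 8  # shape named in the text


def linearize(tensor: List[List[List[int]]]) -> str:
    """Flatten a planes x ranks x files array into one space-separated string."""
    return " ".join(str(v) for plane in tensor for row in plane for v in row)


def square_name(square: int) -> str:
    """Map 0..63 to algebraic notation, assuming a1 = 0 and h8 = 63."""
    return "abcdefgh"[square % 8] + str(square // 8 + 1)


def index_to_move(index: int) -> str:
    """Decode a move index under the ASSUMED from*64+to scheme."""
    from_sq, to_sq = divmod(index, 64)
    return square_name(from_sq) + square_name(to_sq)


# Placeholder tensor: all zeros, with one hypothetical "piece present" flag.
tensor = [[[0] * FILES for _ in range(RANKS)] for _ in range(PLANES)]
tensor[0][1][4] = 1

text = linearize(tensor)
value_count = len(text.split())  # 77 * 8 * 8 values
move_text = index_to_move(12 * 64 + 28)  # e2 is square 12, e4 is square 28
```

Comparing `value_count` with the length of a FEN string gives students a feel for why the numerical tensor suits the trained chess model while a compact text form is usually the more practical choice for an LLM prompt.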
This illustrates the diverse data needs of different AI architectures and shows how: