The ChessAIThon project (2025-1-ES01-KA220-VET-000354329) is co-funded by the European Union. The views and opinions expressed in this publication are those of the author(s) only and do not necessarily reflect those of the European Union or the Spanish Service for the Internationalisation of Education (SEPIE). Neither the European Union nor the National Agency SEPIE can be held responsible for them.
Teaching the data pipeline for AI training, from raw data to a Parquet file, gives VET students a concrete look at how real-world data science is done.
The ETL Process in Chess AI
Start by explaining that AI models learn efficiently only when their data is in a specific numerical format. Chess data often starts out in human-readable CSV files, so it must go through an ETL (Extract, Transform, Load) process before training. Use a Google Colab or Kaggle notebook as the ETL tool, with Python code demonstrating the "Transform" step.
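The three ETL stages can be sketched in a single notebook cell. This is a minimal illustration, not the project's actual pipeline: the column names `fen` and `best_move`, the sample rows, and the integer piece codes are all assumptions chosen for the example. It assumes positions are stored in FEN notation and encodes each board as 64 integers.

```python
import csv
import io

# Hypothetical raw data: the column names "fen" and "best_move" are
# assumptions for this sketch, not a schema defined by the project.
RAW_CSV = """fen,best_move
rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1,e2e4
rnbqkbnr/pppppppp/8/8/4P3/8/PPPP1PPP/RNBQKBNR b KQkq e3 0 1,c7c5
"""

# Assumed encoding: positive codes for White, negative for Black, 0 for empty.
PIECE_CODES = {"P": 1, "N": 2, "B": 3, "R": 4, "Q": 5, "K": 6}


def extract(csv_text):
    """Extract: read CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def encode_board(fen):
    """Transform one FEN string into 64 integers, rank 8 first, file a to h."""
    placement = fen.split()[0]  # first FEN field holds the piece placement
    squares = []
    for ch in placement:
        if ch == "/":
            continue  # rank separator carries no square
        if ch.isdigit():
            squares.extend([0] * int(ch))  # a digit means that many empty squares
        else:
            code = PIECE_CODES[ch.upper()]
            squares.append(code if ch.isupper() else -code)
    if len(squares) != 64:
        raise ValueError(f"expected 64 squares, got {len(squares)}")
    return squares


def transform(rows):
    """Transform: replace the human-readable FEN with its numeric encoding."""
    return [
        {"board": encode_board(row["fen"]), "best_move": row["best_move"]}
        for row in rows
    ]


def load(records, path):
    """Load: write the records to a Parquet file.

    Assumes pandas and a Parquet engine such as pyarrow are installed,
    as they typically are in Colab and Kaggle notebooks.
    """
    import pandas as pd

    pd.DataFrame(records).to_parquet(path, index=False)


records = transform(extract(RAW_CSV))
```

In a notebook, calling `load(records, "positions.parquet")` (the filename is only an example) would complete the pipeline. Keeping the three stages as separate functions lets students inspect the data after each step and see exactly where the text becomes numbers.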