Project title Data processing and analysis of large chess games data
Duration  10 weeks
Description

The project is a multifaceted analysis of a large dataset of chess games (over 880GB), in which each observation is a move-by-move evaluation. The data are stored on a remote server and can be accessed with SQL queries.

  • The first objective is to filter the relational database and then export or convert it to one or more text files, such as .csv files, that can be processed with statistical software. This will require knowledge of SQL queries.
  • If this is successful, the second objective will be to propose testable hypotheses about the data and to test them with statistical software (such as Stata, Python or R).
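
As a rough illustration of the first objective, the sketch below filters a table with a SQL query and streams the result into CSV in chunks, so that the full result set never has to sit in memory at once. Everything specific here is an assumption made for the example: the actual database engine, schema and connection details are not described in this listing, so an in-memory SQLite database stands in for the remote server, and the table `moves`, its columns (`game_id`, `ply`, `eval_cp`, `clock_seconds`) and the helper `export_query_to_csv` are hypothetical.

```python
import csv
import io
import sqlite3


def export_query_to_csv(conn, query, params, out, chunk_size=10_000):
    """Stream the rows of a SQL query into a CSV writer, chunk by chunk.

    Returns the number of data rows written (header excluded).
    """
    cur = conn.cursor()
    cur.execute(query, params)
    writer = csv.writer(out)
    # Column names come from the cursor's description of the result set.
    writer.writerow([col[0] for col in cur.description])
    n_rows = 0
    while True:
        rows = cur.fetchmany(chunk_size)
        if not rows:
            break
        writer.writerows(rows)
        n_rows += len(rows)
    return n_rows


# Toy stand-in for the remote server (assumption): an in-memory SQLite
# database with a hypothetical `moves` table and four made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE moves ("
    "game_id INTEGER, ply INTEGER, eval_cp INTEGER, clock_seconds REAL)"
)
conn.executemany(
    "INSERT INTO moves VALUES (?, ?, ?, ?)",
    [(1, 1, 20, 300.0), (1, 2, 15, 295.5), (2, 1, 30, 12.0), (2, 2, -140, 4.5)],
)

# Filter step: keep only moves played with fewer than 30 seconds on the clock.
buffer = io.StringIO()
written = export_query_to_csv(
    conn,
    "SELECT game_id, ply, eval_cp, clock_seconds "
    "FROM moves WHERE clock_seconds < ? ORDER BY game_id, ply",
    (30.0,),
    buffer,
    chunk_size=1,  # deliberately tiny so the chunking loop is exercised
)
csv_text = buffer.getvalue()
```

To write to disk instead of a string buffer, the same function can be passed a file opened with `open(path, "w", newline="")`. With a dataset of this size, pushing the filter into the `WHERE` clause, so that only the needed rows leave the server, is likely to matter more than the choice of export tool.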

The student is not required to have these skills already for the second objective; we have several hypotheses in mind and are happy to teach the student how to test them econometrically. These include testing the impact of time pressure on performance, and understanding which strategies, in terms of risk-taking, are most successful in lopsided contests. However, the student will also be free to propose their own research ideas on this very rich dataset.

Chess databases are ideal environments for testing questions about biases and decision-making. They provide huge amounts of clean, well-identified data from a competitive environment in which the usually hidden variable of ability is controlled for. The results of the data analysis may motivate the design of an experiment on chess players to identify these effects further.

Expected outcomes & deliverables
  1. Create text-file datasets from a large relational database using SQL queries.
  2. Design testable hypotheses on the text-file data.
Student qualities

Applicants should already be familiar with SQL queries. For the second objective, familiarity with hypothesis testing and/or statistical or applied economics techniques for causal analysis is desirable, but not required.

Primary supervisor

Dr David Smerdon

Further information

Contact Dr David Smerdon