Lead supervisor: Dr Peter Macgregor
Application deadline: 1 March 2025
Project description:
Modern data science and machine learning applications involve datasets with millions of data points and hundreds of dimensions. For example, deep learning pipelines produce massive vector datasets representing text, image, audio and other data types. The analysis of such datasets with classical algorithms often requires significant time and/or computational resources which may not be available in many applications.
This motivates the development of a new generation of fast algorithms for data analysis, running in linear or sub-linear time and often producing an approximate result rather than an exact one. Moreover, the dataset may change over time, requiring dynamic algorithms which handle updates efficiently.
This project will tackle aspects of the design, analysis, and implementation of algorithms for processing large dynamic datasets, with the aim of developing new algorithms with state-of-the-art practical performance and/or theoretical guarantees. This could involve performing new analysis of existing algorithms, designing new algorithms with provable guarantees, or implementing heuristic algorithms with state-of-the-art empirical performance.
Possible Directions
Potential areas of research, depending on the interests of the candidate, include:
- Developing improved nearest-neighbour search algorithms (e.g., based on kd-trees, HNSW, locality-sensitive hashing).
- Exploring any connection between hierarchical clustering algorithms and nearest-neighbour search algorithms.
- Creating new dynamic or hierarchical clustering algorithms (e.g., based on spectral clustering or DBSCAN).
- Creating dynamic algorithms for numerical linear algebra. For example, maintaining the PCA of a dynamically changing dataset.
- Any other project in the area of algorithmic data science and machine learning.
Applicants should have a strong interest in the mathematical analysis of algorithms, knowledge of topics in discrete mathematics and linear algebra, and some familiarity with existing algorithms for data analysis and machine learning. Strong programming skills would also be desirable.
The scholarship:
We have one fully-funded scholarship available, starting in September 2025. The scholarship covers all tuition fees irrespective of country of origin and includes a stipend valued at £19,705 per annum. More details of the scholarship can be found here: https://blogs.cs.st-andrews.ac.uk/csblog/2024/10/24/phd-studentships-available-for-2025-entry/ Please note that this project has a different application deadline (1 March 2025).
Eligibility criteria:
We are looking for highly motivated research students keen to be part of a diverse and supportive research community. Applicants must hold a good Bachelor’s or Master’s degree in Computer Science, or a related area appropriate for the topic of this PhD.
International applications are welcome. We especially encourage female applicants and underrepresented minorities to apply. The School of Computer Science was awarded the Athena SWAN Silver award for its sustained progression in advancing equality and representation, and we welcome applications for our postgraduate research programmes from those suitably qualified from all genders, all races, ethnicities and nationalities, LGBT+, all or no religion, all social class backgrounds, and all family structures.
To apply:
Interested applicants can contact Peter Macgregor with an outline proposal.
Full instructions for the formal application process
The deadline for applications is 1 March 2025.