PhD Viva Success: Yasir Alguwaifli

Please join me in congratulating Yasir Alguwaifli, who has just passed his PhD viva subject to minor corrections.

Yasir, who is supervised by Christopher Brown, has provided his thesis abstract below.

Thanks to Özgür Akgün for serving as internal examiner and Prof Christoph Kessler from Linköping University for serving as the external examiner.

Controlling energy consumption has always been a necessity in many computing contexts, as the resources that provide that energy are limited, be it a battery supplying power to a Single Board Computer (SBC)/System-on-a-Chip (SoC), an embedded system, a drone, a phone, or another low/limited-energy device, or a large cluster of machines processing extensive computations that require multiple resources, such as a Non-Uniform Memory Access (NUMA) system. The ability to accurately predict the energy consumption of such devices is crucial in many fields. Furthermore, different types of languages, e.g. Haskell and C/C++, exhibit different behavioural properties, such as strict vs. lazy evaluation, garbage collection vs. manual memory management, and different parallel runtime behaviours. In addition, most software developers do not write software with energy consumption as a goal, largely because of the lack of generalised tooling to help them optimise and predict the energy consumption of their software. Therefore, a generalised way to predict energy consumption across different types of languages, one that does not rely on specific program properties, is needed. We construct several statistical models based on parallel benchmarks using regression techniques such as Non-negative Least Squares (NNLS), Random Forests, and Lasso and Elastic-Net Regularized Generalized Linear Models (GLMNET), drawing on two different programming paradigms, namely Haskell and C/C++. Furthermore, the statistical models are assessed over a complete set of benchmarks that behave similarly in both Haskell and C/C++. In addition to assessing the statistical models, we develop meta-heuristic algorithms to predict the energy consumed by parallel benchmarks from Haskell's Nofib and C/C++'s Princeton Application Repository for Shared-Memory Computers (PARSEC) suites, for a range of implementations in PThreads, OpenMP and Intel's Threading Building Blocks (TBB).
The results show that benchmarks with high scalability and performance in parallel execution can have their energy consumption predicted, and even optimised by selecting the best configuration for the desired results. We also observe that even for benchmarks with degraded performance, high-core-count execution can still be predicted to the nearest configuration that produces the lowest energy sample. Additionally, the meta-heuristic technique can be employed as a language- and architecture-agnostic approach to energy consumption prediction, rather than requiring hand-tuned models for specific architectures and/or benchmarks. Although meta-heuristic sampling provided acceptable levels of accuracy, the combination of the statistical models with the meta-heuristic algorithms proved challenging to optimise: apart from the Genetic Algorithm, which achieved low to medium accuracy, the combined meta-heuristics demonstrated limited to poor accuracy.
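For illustration only, the configuration-search idea behind the meta-heuristic side of the thesis can be sketched as a simple hill-climbing loop over core counts. The analytical energy model, configuration space, and function names below are invented for this sketch and are not from the thesis, which instead fits statistical models to measured benchmark data:

```python
import random

# Hypothetical energy model (joules) for a fixed workload as a function of
# core count: runtime shrinks sub-linearly with more cores, while power
# draw grows, so total energy has a sweet spot somewhere in between.
def energy(cores: int) -> float:
    runtime = 100.0 / (cores ** 0.8)  # seconds, imperfect parallel scaling
    power = 10.0 + 4.0 * cores        # watts, static plus per-core cost
    return runtime * power

def hill_climb(max_cores: int = 32, iters: int = 200, seed: int = 0) -> int:
    """Randomised hill climbing over core counts, keeping the lowest-energy one."""
    rng = random.Random(seed)
    best = rng.randint(1, max_cores)
    for _ in range(iters):
        step = rng.choice([-2, -1, 1, 2])
        candidate = min(max(best + step, 1), max_cores)
        if energy(candidate) < energy(best):
            best = candidate
    return best

# The core count this toy model predicts as most energy-efficient:
print(hill_climb())
```

Because the toy energy curve is unimodal, the greedy search converges to the same configuration an exhaustive sweep would find; on real measured data, meta-heuristics play the same role without assuming any analytical model.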

Seminar Talk from a SICSA visitor (Daniel Garijo) Friday 10 June, 11.00am

Accelerating Research Software Understandability Through Knowledge Capture

Daniel Garijo

Summary: Research software is key to understanding, reproducing and reusing existing work in many disciplines, ranging from Geosciences to Astronomy or Artificial Intelligence. However, research software is usually difficult to find, reuse, compare and understand due to its disconnected documentation (dispersed across manuals, readme files, web sites, and code comments) and a lack of structured metadata to describe it. These problems affect not only researchers, but also students who aim to compare published findings and policy makers seeking clarity on a scientific result. In this talk I will present the main research challenges and our recent efforts towards facilitating software understanding by automatically capturing Knowledge Graphs from software documentation and code.

Short bio: Dr. Daniel Garijo Verdejo is a Distinguished Researcher at the Ontology Engineering Group of Universidad Politécnica de Madrid (UPM). Previously, he held a Research Computer Scientist position at the Information Sciences Institute of the University of Southern California, in Los Angeles. Daniel’s research activities focus on e-Science and Knowledge Capture, specifically on how to increase the understandability of research software and scientific workflows by creating Knowledge Graphs from their documentation and provenance (i.e., steps, outputs, inputs, intermediate results).

For this talk we will use a hybrid approach: in person (Jack Cole, 1.33) and online via Teams.

If you wish to attend, it would be helpful if you could register on Eventbrite to let us know whether you intend to attend in person or online.

All Welcome!

Systems Research Group seminars

The Systems Research Group is restarting its seminar series from 6th May 2022. Seminars will take place every two weeks on Fridays at 1pm. From May to July the seminars will be online (SRG Teams), while from September onwards we aim to move them to a hybrid format. More information on the schedule can be found on the seminars page of the Systems Research Group site.

Learning to Describe: A New Approach to Computer Vision Based Ancient Coin Analysis

The work on deep learning based understanding of ancient coins by Jessica Cooper, a Research Assistant and part-time PhD student supervised by Oggie Arandjelovic and David Harrison, has been chosen as a featured “title story” article by the journal Sci, where it was published in the Special Issue Machine Learning and Vision for Cultural Heritage.

Open source contributors sought for an interview

MANAGING OPEN SOURCE PROJECTS ON GITHUB — SUCCESS FACTORS AND PERFORMANCE INDICATORS

As part of my (Julia Seeger’s) MSc dissertation in the School of Computer Science at the University of St Andrews, I am looking for volunteers for an interview. This interview is part of a research project focussed on success factors and performance indicators of managing open source projects hosted on GitHub.

I am looking for core contributors to open source projects hosted on GitHub. Ideally, the project should have Travis CI configured and in use, and should have a history of pull requests from before and after the configuration of Travis CI.

I would firstly be interested in your opinion about success factors and performance indicators that I have identified by analysing the public GitHub repository of your project with the help of the GitHub API. I will ask if, as a core contributor of the project, you would agree or disagree with my findings. Secondly, I am interested in your personal experience in managing a repository of an open source project on GitHub, and the factors and managing techniques you identified to be important for a successful project.

The interview will take the form of a video or audio call via Skype for Business or Microsoft Teams. It will take place during July 2020, consist of six questions, and last around 25 minutes. If you agree to participate, the questions will be given to you at least three days in advance.

If you are willing to participate, please get in touch using the contact details below. You will then be given a Participant Information Sheet that further details my research, and will have the opportunity to ask questions, before being asked whether you consent to participate.

Contact Details

Researcher: Julia Seeger
js433@st-andrews.ac.uk

Supervisor: Dr. Alexander Konovalov
alexander.konovalov@st-andrews.ac.uk

Leverhulme Early Career Fellowship for Nguyen Dang

Congratulations to Dr Nguyen Dang, who has been awarded a Leverhulme Trust Early Career Fellowship. The three-year Fellowships are intended to assist those at an early stage of their academic careers to undertake a significant piece of publishable work. Nguyen will be researching Constraint-based automated generation of synthetic benchmark instances.

Abstract summary: “Combinatorial problems such as routing or timetabling are ubiquitous in society, industry, and academia. In the quest to develop algorithms to solve these problems effectively, we need benchmark instances. An instance is an example of the problems at hand for testing how well an algorithm performs. Having rich benchmarks of instances is essential for algorithm developers to gain understanding about the strengths and weaknesses of their approaches, and ensure successful applications in practice. This fellowship will provide a fully automated system for generating valid and useful synthetic benchmark instances based on a constraint modelling pipeline that supports several algorithmic techniques.”

Winnability of Klondike Solitaire research features in Major Nelson’s video podcast

Research carried out by Charlie Blake and Ian Gent to compute the approximate odds of winning any version of solitaire features in Major Nelson’s Video Podcast [Interview with Ian and Charlie starts 23:56] for Xbox news today.

Today is National Solitaire Day and the 30th anniversary of the game. The celebrations include an invitation to participate in a record-breaking attempt at the most games of Microsoft Solitaire completed in one day. You can download the collection free or play it through your browser.

The Klondike Solitaire research also featured in the New Scientist last year.
Link to the full paper on arXiv: https://arxiv.org/abs/1906.12314

Online article published in Technology Nov 17th 2019: https://www.newscientist.com/article/2223643-we-finally-know-the-odds-of-winning-a-game-of-solitaire/

Modern practices of sharing computational research

As part of Love Data Week, Alexander Konovalov will give a talk on Tuesday 11 February at 3pm in Physics Lecture Theatre C.

Abstract: Have you been frustrated trying to use someone else’s code that is non-trivial to install? Have you tried to make the supplementary code for your paper easily accessible to the reader? If so, you certainly know that this may require non-trivial effort. I will demonstrate some tools that can help to create reproducible computational experiments, and will explain which skills are needed to use these tools. The talk will demonstrate examples in Python and R runnable in Jupyter notebooks. You are welcome to bring your laptop to try these examples online. No prior knowledge of programming is required.

Links:

  • Templates for reproducible experiments in GAP, Python and R
  • Code4REF guidance on recording research software in Pure

Event details

  • When: 11th February 2020 15:00 - 16:00
  • Where: Phys Theatre C
  • Format: Talk