Marina Romanchikova (NPL): How good are our data? Measuring the data quality at National Physical Laboratory (School Seminar)

Abstract:

From mapping the spread of disease to monitoring climate change, data holds the key to solving some of the world’s biggest challenges. Dependable decisions rely on understanding the provenance and reliability of data. Historically, only a small fraction of the generated data was shared and re-used, while the majority of data were used once and then erased or archived. At NPL Data Science we are defining best practice in measurement data reuse and traceability by developing metadata standards and data storage structures to locate and interpret datasets and make them available for sharing, publication and data mining.
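
As a purely illustrative aside (not an NPL standard), the sketch below shows the kind of minimal metadata record, with hypothetical field names, that makes a measurement dataset locatable and traceable:

    from dataclasses import dataclass, field
    from typing import List

    @dataclass
    class DatasetRecord:
        """Hypothetical minimal metadata record for a measurement dataset."""
        identifier: str                 # persistent identifier, e.g. a DOI
        title: str
        instrument: str                 # instrument or sensor that produced the data
        units: str                      # measurement units
        uncertainty: float              # stated measurement uncertainty
        provenance: List[str] = field(default_factory=list)  # processing steps applied so far

        def add_step(self, step: str) -> None:
            # Record each processing step so the dataset remains traceable.
            self.provenance.append(step)

    record = DatasetRecord("10.5281/example", "CT phantom scan", "CT scanner", "HU", 0.5)
    record.add_step("converted from DICOM to NIfTI")
    print(record)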

The talk will shed light on the most pressing issues in scientific data management, illustrating them with examples from industrial and academic practice. It will present several NPL Data Science projects that focus on delivering confidence in data obtained from life science imaging, medicine, geosciences and fundamental physics.

Speaker Bio:

Dr Marina Romanchikova joined the NPL Data Science team in 2017 to work on data quality and metadata standards. She obtained an MSc in Medical Informatics at the University of Heidelberg, Germany, where she specialised in medical image processing and in the management of hospital information systems. In 2010 she received a PhD on Monte Carlo dosimetry for targeted radionuclide therapy from the Institute of Cancer Research in Sutton, UK. Marina worked for six years as a radiotherapy research physicist at Cambridge University Hospitals, where she developed methods for the curation and analysis of medical images.

Current interests

– Quantitative quality assessment of medical images and medical image segmentation
– Harmonisation of medical and healthcare data from heterogeneous sources
– Applications of machine learning in healthcare
– Automated data quality assurance

Event details

  • When: 12th March 2019 14:00 - 15:00
  • Where: Cole 1.33a
  • Series: School Seminar Series
  • Format: Seminar

Lauren Roberts & Peter Michalák (Newcastle): Automating the Placement of Time Series Models for IoT Healthcare Applications (School Seminar)

Abstract:

There has been a dramatic growth in the number and range of Internet of Things (IoT) sensors that generate healthcare data. These sensors stream high-dimensional time series data that must be analysed in order to provide insights into medical conditions that can improve patient healthcare. This raises both statistical and computational challenges, including where to deploy the streaming data analytics, given that a typical healthcare IoT system will combine a highly diverse set of components with very varied computational characteristics, e.g. sensors, mobile phones and clouds. Different partitionings of the analytics across these components can dramatically affect key factors such as the battery life of the sensors and the overall performance. In this work we describe a method for automatically partitioning stream processing across a set of components in order to optimise for a range of factors including sensor battery life and communications bandwidth. We illustrate this using our implementation of a statistical model predicting the glucose levels of type II diabetes patients in order to reduce the risk of hyperglycaemia.
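
As a purely illustrative aside, the sketch below shows the flavour of the placement problem: a hypothetical three-stage pipeline is assigned to sensor, phone or cloud by exhaustively scoring each feasible assignment with a made-up weighted cost of sensor energy and bandwidth. The stage names, costs and weights are all assumptions, not the authors' method:

    from itertools import product

    # Hypothetical three-stage analytics pipeline and the devices it can run on.
    STAGES = ["filter", "features", "glucose_model"]
    DEVICES = ["sensor", "phone", "cloud"]
    ORDER = {"sensor": 0, "phone": 1, "cloud": 2}

    # Illustrative figures only: running a stage on the sensor drains its battery,
    # and the data rate leaving the sensor depends on how much is reduced on it.
    ENERGY_PER_STAGE_ON_SENSOR = {"filter": 2.0, "features": 5.0, "glucose_model": 9.0}
    DATA_RATE_AFTER_STAGE = {"raw": 100.0, "filter": 40.0, "features": 5.0, "glucose_model": 1.0}

    def cost(placement, w_energy=1.0, w_bandwidth=0.1):
        """Weighted cost of one assignment of stages to devices."""
        energy = sum(ENERGY_PER_STAGE_ON_SENSOR[s]
                     for s, d in zip(STAGES, placement) if d == "sensor")
        last_on_sensor = "raw"
        for s, d in zip(STAGES, placement):
            if d == "sensor":
                last_on_sensor = s
        bandwidth = DATA_RATE_AFTER_STAGE[last_on_sensor]
        return w_energy * energy + w_bandwidth * bandwidth

    def feasible(placement):
        # A later stage must not run "upstream" of an earlier one.
        return all(ORDER[a] <= ORDER[b] for a, b in zip(placement, placement[1:]))

    best = min((p for p in product(DEVICES, repeat=len(STAGES)) if feasible(p)), key=cost)
    print(dict(zip(STAGES, best)), cost(best))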

Speaker Bios:

Lauren and Peter are final-year PhD students at the CDT in Cloud Computing for Big Data at Newcastle University. Peter has a background in Computer Engineering from the University of Žilina, Slovakia, and a double degree in Computer Software Engineering from JAMK University of Applied Sciences, Jyväskylä, Finland. His research interests are in distributed event processing, edge computing and the Internet of Things, with a special focus on energy and bandwidth constraints. Lauren has an MMath degree from Newcastle University and her research interests lie in statistical modelling of time series data.

Event details

  • When: 26th February 2019 14:00 - 15:00
  • Where: Cole 1.33a
  • Series: School Seminar Series
  • Format: Seminar

Quintin Cutts (Glasgow): Re-imagining software engineering education through the apprenticeship lens (School Seminar)

Abstract:

Apprenticeship degrees have sprung up so fast that there has been little time for us all to reflect on how this apparently new form of education, to universities at least, could significantly affect our educational offerings. The University of Glasgow has been undertaking some preparatory work for Skills Development Scotland prior to running its apprenticeship degree in software engineering, and this has afforded us some time to see what others nationally and internationally have been doing, to consider relevant aspects of the literature, and to consult with industry. One view that we are developing of these degrees is as a true evolution of typical, largely campus-based software engineering degrees towards a full-blown professional degree such as medicine, where universities and hospitals are in real partnership over the training of doctors. In this talk, I will outline our thinking and raise a number of issues for discussion. In suggesting a closer relationship with industry in a talk at St Andrews, I do not of course miss the irony that industry accreditation was never (I believe) something that St Andrews was particularly bothered about – noting that my 1988 BSc (Hons) is not accredited!

Event details

  • When: 19th February 2019 14:00 - 15:00
  • Where: Cole 1.33a
  • Series: School Seminar Series
  • Format: Seminar

Lewis McMillan (St Andrews): Parallel Computer Simulations of Light-Tissue Interactions for Applications in Medicine, Cosmetics Industry and Biophotonics Research (School Seminar)

Abstract:

Tissue ablation is a widely used treatment in both the cosmetic and medical sectors, for treating various diseases or improving cosmetic outcomes. We present our tissue ablation model, which can predict the depth of ablation and the surrounding thermal damage caused by the laser during ablation.

“Non-diffracting” beams have a multitude of uses in physics, from optical manipulation to improved microscopy light sources. For the first time we show that these beams can be modelled using the Monte Carlo radiation transport method, allowing better insight into how these beams propagate in a turbid medium.

Both of these projects use the Monte Carlo radiation transport (MCRT) method to simulate light transport. MCRT is a powerful numerical method that can solve light transport through heavily scattering and absorbing media, such as biological tissue. The method is extremely flexible and can model arbitrary geometries and light sources. MCRT can also model the various micro-physics of the simulated medium, such as polarisation, fluorescence, and Raman scattering. This talk will give an overview of our group’s work, with particular focus on simulating tissue ablation and modelling “non-diffracting” beams.
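
To give a flavour of how an MCRT code works, the sketch below is a generic, textbook-style photon random walk through a uniform slab (illustrative coefficients, isotropic scattering); it is not the group's code:

    import math
    import random

    MU_S, MU_A = 10.0, 1.0       # illustrative scattering/absorption coefficients (1/cm)
    MU_T = MU_S + MU_A           # total interaction coefficient
    ALBEDO = MU_S / MU_T         # probability that an interaction is a scattering event
    SLAB_DEPTH = 1.0             # slab thickness (cm)

    def run_photon():
        """Random walk of one photon through a uniform scattering/absorbing slab."""
        x = y = z = 0.0
        ux, uy, uz = 0.0, 0.0, 1.0                           # launched straight into the slab
        while True:
            step = -math.log(1.0 - random.random()) / MU_T   # free path from the Beer-Lambert law
            x, y, z = x + ux * step, y + uy * step, z + uz * step
            if z > SLAB_DEPTH:
                return "transmitted"
            if z < 0.0:
                return "reflected"
            if random.random() > ALBEDO:
                return "absorbed"
            # Isotropic scattering; real tissue codes use the Henyey-Greenstein phase function.
            cos_t = 2.0 * random.random() - 1.0
            sin_t = math.sqrt(1.0 - cos_t * cos_t)
            phi = 2.0 * math.pi * random.random()
            ux, uy, uz = sin_t * math.cos(phi), sin_t * math.sin(phi), cos_t

    results = [run_photon() for _ in range(100_000)]
    print({k: results.count(k) / len(results) for k in ("absorbed", "transmitted", "reflected")})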

Speaker Bio:

Lewis McMillan is a final-year physics PhD student at the University of St Andrews. His research interests are in using the Monte Carlo radiation transport method for various applications within medicine and biophotonics.

Event details

  • When: 23rd April 2019 14:00 - 15:00
  • Where: Cole 1.33a
  • Series: School Seminar Series
  • Format: Seminar

Ian Gent (St Andrews): The Winnability of Klondike and Many Other Single-Player Card Games (School Seminar)

This is joint work with Charlie Blake.

Abstract:

The most famous single-player card game is ‘Klondike’, but our ignorance of its winnability percentage has been called “one of the embarrassments of applied mathematics”. Klondike is just one of many single-player card games, generically called ‘solitaire’ or ‘patience’ games, for which players have long wanted to know how likely a particular game is to be winnable for a random deal. A number of different games have been studied empirically in the academic literature and by non-academic enthusiasts.

Here we show that a single general purpose Artificial Intelligence program, called “Solvitaire”, can be used to determine the winnability percentage of approximately 30 different single-player card games with a 95% confidence interval of ± 0.1% or better. For example, we report the winnability of Klondike to within 0.10% (in the ‘thoughtful’ variant where the player knows the location of all cards). This is a 30-fold reduction in confidence interval, and almost all our results are either entirely new or represent significant improvements on previous knowledge.
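
As a rough illustration of what a ± 0.1% confidence interval entails (not the paper's exact methodology), the normal approximation for a binomial proportion shows how many random deals must be solved; the 80% winnability figure used here is purely illustrative:

    import math

    def half_width(p_hat, n, z=1.96):
        """Half-width of a 95% normal-approximation confidence interval for a proportion."""
        return z * math.sqrt(p_hat * (1.0 - p_hat) / n)

    def deals_needed(p_hat, target=0.001, z=1.96):
        """Number of solved random deals needed to reach a +/- `target` half-width."""
        return math.ceil((z / target) ** 2 * p_hat * (1.0 - p_hat))

    # Using an illustrative winnability of 80% for thoughtful Klondike:
    print(deals_needed(0.8))             # roughly 615,000 deals for +/- 0.1%
    print(half_width(0.8, 1_000_000))    # half-width achieved with one million deals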

Speaker Bio:

Ian Gent is Professor of Computer Science at the University of St Andrews. His mother taught him to play patience and herself showed endless patience when he “helped” her by taking complete control of the game. A program to play a patience game was one of the programs he wrote on his 1982 Sinclair Spectrum, which is now on the wall outside his office.

Event details

  • When: 5th February 2019 14:00 - 15:00
  • Where: Cole 1.33a
  • Series: School Seminar Series
  • Format: Seminar

Emanuele Trucco (Dundee): Retinal image analysis and beyond in Scotland: the VAMPIRE project (School Seminar)

Abstract:

This talk is an overview of the VAMPIRE (Vessel Assessment and Measurement Platform for Images of the REtina) project, an international and interdisciplinary research initiative created and led by the Universities of Dundee and Edinburgh in Scotland, UK, since the early 2000s. VAMPIRE research focuses on the eye as a source of biomarkers for systemic diseases (e.g. cardiovascular disease, diabetes, dementia) and cognitive decline, as well as on eye-specific diseases. VAMPIRE is highly interdisciplinary, bringing together medical image analysis, machine learning and data analysis, medical research, and data governance and management at scale. The talk concisely introduces the aims, structure and current results of VAMPIRE, the current vision for effective translation to society, and the non-technical factors, complementing the technical research, that are needed to achieve effective translation.

Speaker Bio:

Emanuele (Manuel) Trucco, MSc, PhD, FRSA, FIAPR, is the NRP Chair of Computational Vision in Computing, School of Science and Engineering, at the University of Dundee, and an Honorary Clinical Researcher of NHS Tayside. He has been active since 1984 in computer vision and since 2002 in medical image analysis, publishing more than 270 refereed papers and 2 textbooks, and serving on the organizing or program committees of major international and UK conferences. Manuel is co-director of VAMPIRE (Vessel Assessment and Measurement Platform for Images of the REtina), an international research initiative led by the Universities of Dundee and Edinburgh (co-director Dr Tom MacGillivray), and part of the UK Biobank Eye and Vision Consortium. VAMPIRE develops software tools for efficient data and image analysis with a focus on multi-modal retinal images. VAMPIRE has been used in UK and international biomarker studies on cardiovascular risk, stroke, dementia, diabetes and complications, cognitive performance, neurodegenerative diseases, and genetics.

Event details

  • When: 29th January 2019 14:00 - 15:00
  • Where: Cole 1.33a
  • Series: School Seminar Series
  • Format: Seminar

SRG Seminar: “Large-Scale Hierarchical k-means for Heterogeneous Many-Core Supercomputers” by Teng Yu

We present a novel design and implementation of the k-means clustering algorithm targeting supercomputers with heterogeneous many-core processors. This work introduces a multi-level parallel partition approach that partitions not only by dataflow and centroid, but also by dimension. Our multi-level (nkd) approach unlocks the potential of the hierarchical parallelism in the SW26010 heterogeneous many-core processor and of the system architecture of the supercomputer.
Our design is able to process large-scale clustering problems with up to 196,608 dimensions and over 160,000 target centroids, while maintaining high performance and high scalability, significantly improving the capability of k-means over previous approaches. The evaluation shows that our implementation achieves less than 18 seconds per iteration for a large-scale clustering case with 196,608 data dimensions and 2,000 centroids using 4,096 nodes (1,064,496 cores) in parallel, making k-means a more feasible solution for complex scenarios.
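
A hedged sketch of the dimension-partitioning idea is given below: it is a generic NumPy illustration, not the SW26010 implementation. Each slice of dimensions could be handled by a different worker, with the partial squared distances summed before the assignment step:

    import numpy as np

    def assign_points(points, centroids, n_dim_slices=4):
        """Assign each point to its nearest centroid, accumulating squared
        distances one dimension-slice at a time (mimicking per-worker partial sums)."""
        n, d = points.shape
        k = centroids.shape[0]
        dist2 = np.zeros((n, k))
        for dims in np.array_split(np.arange(d), n_dim_slices):
            # Each slice of dimensions could live on a different core/node;
            # here the "reduction" is simply the running sum over slices.
            diff = points[:, dims][:, None, :] - centroids[:, dims][None, :, :]
            dist2 += np.sum(diff * diff, axis=2)
        return np.argmin(dist2, axis=1)

    rng = np.random.default_rng(0)
    pts = rng.normal(size=(1000, 64))        # toy data: 1000 points, 64 dimensions
    cents = rng.normal(size=(8, 64))         # 8 centroids
    print(assign_points(pts, cents)[:10])
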
This work is to be presented at the International Conference for High Performance Computing, Networking, Storage, and Analysis (SC18).

Event details

  • When: 1st November 2018 13:00 - 14:00
  • Where: Cole 1.33b
  • Series: Systems Seminars Series
  • Format: Seminar, Talk

SRG Seminar: “Using Metric Space Indexing for Complete and Efficient Record Linkage” by Özgür Akgün

Record linkage is the process of identifying records that refer to the same real-world entities in situations where entity identifiers are unavailable. Records are linked on the basis of similarity between common attributes, with every pair being classified as a link or non-link depending on their degree of similarity. Record linkage is usually performed in a three-step process: first, groups of similar candidate records are identified using indexing; pairs within the same group are then compared in more detail; and finally the pairs are classified. Even state-of-the-art indexing techniques, such as Locality Sensitive Hashing, have potential drawbacks. They may fail to group together some truly matching records with high similarity. Conversely, they may group records with low similarity, leading to high computational overhead. We propose using metric space indexing to perform complete record linkage, resulting in a parameter-free process that combines indexing, comparison and classification into a single step and delivers complete and efficient record linkage. Our experimental evaluation on real-world datasets from several domains shows that linkage using metric space indexing can yield better quality than current indexing techniques, with similar execution cost, without the need for domain knowledge or trial and error to configure the process.
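
As a generic illustration of the metric space idea (pivot-based filtering with the triangle inequality, not the authors' actual index structure), the sketch below skips record comparisons that cannot possibly fall within the similarity threshold; the example records, pivot and threshold are made up:

    def edit_distance(a, b):
        """Classic dynamic-programming Levenshtein distance (a metric)."""
        prev = list(range(len(b) + 1))
        for i, ca in enumerate(a, 1):
            cur = [i]
            for j, cb in enumerate(b, 1):
                cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
            prev = cur
        return prev[-1]

    def link_with_pivot(queries, records, pivot, threshold):
        """Return (query, record) pairs within `threshold` edit distance,
        using a pivot and the triangle inequality to skip hopeless comparisons."""
        rec_to_pivot = [(r, edit_distance(r, pivot)) for r in records]   # built once
        links = []
        for q in queries:
            dq = edit_distance(q, pivot)
            for r, dr in rec_to_pivot:
                # |d(q,pivot) - d(r,pivot)| is a lower bound on d(q,r).
                if abs(dq - dr) > threshold:
                    continue
                if edit_distance(q, r) <= threshold:
                    links.append((q, r))
        return links

    records = ["john smith", "jon smith", "jane smyth", "peter jones"]
    print(link_with_pivot(["john smyth"], records, pivot="john smith", threshold=2))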

Event details

  • When: 18th October 2018 13:00 - 14:00
  • Where: Cole 1.33b
  • Series: Systems Seminars Series
  • Format: Seminar

DLS: Scalable Intelligent Systems by 2025 (Carl Hewitt)

Venue: The Old Course Hotel (Hall of Champions)

Timetable:

9:30 Lecture 1
10:30 Break with Coffee
11:15 Lecture 2
12:15 Break for Lunch (not provided)
14:15 Lecture 3
15:15 Discussion

Lecture 1: Introduction to Scalable Intelligent Systems

Lecture 2: Foundations for Scalable Intelligent Systems

Lecture 3: Implications of Scalable Intelligent Systems

Speaker Bio:

Professor Carl Hewitt is the creator (together with his students and other colleagues) of the Actor Model of computation, which influenced the development of the Scheme programming language and the π-calculus, and inspired several other systems and programming languages. The Actor Model is in widespread industrial use, including at eBay, Microsoft, and Twitter. For his doctoral thesis, he designed Planner, the first programming language based on pattern-invoked procedural plans.

Professor Hewitt’s recent research centers on the area of Inconsistency Robustness, i.e., system performance in the face of continual, pervasive inconsistencies (a shift from the previously dominant paradigms of inconsistency denial and inconsistency elimination, i.e., to sweep inconsistencies under the rug). ActorScript and the Actor Model on which it is based can play an important role in the implementation of more inconsistency-robust information systems. Hewitt is an advocate in the emerging campaign against mandatory installation of backdoors in the Internet of Things.

Hewitt is Board Chair of iRobust™, an international scientific society for the promotion of the field of Inconsistency Robustness. He is also Board Chair of Standard IoT™, an international standards organization for the Internet of Things, which is using the Actor Model to unify and generalize emerging standards for IoT. He has been a Visiting Professor at Stanford University and Keio University and is Emeritus in the EECS department at MIT.

Abstract:

A project to build the technology stack outlined in these lectures can bring Scalable Intelligent Systems to fruition by 2025. Scalable Intelligent Systems have the following characteristics:

  • Interactively acquire information from video, Web pages, hologlasses, online databases, sensors, articles, human speech and gestures, etc.
  • Real-time integration of massive, pervasively inconsistent information
  • Scalability in all important dimensions, meaning that there are no hard barriers to continual improvement in the above areas
  • Close human collaboration with hologlasses for secure mobile interaction; computers alone cannot implement the above capabilities
  • No closed-form algorithmic solution is possible to implement the above capabilities

The technology stack for Scalable Intelligent Systems is outlined below:

  • Experiences: Hologlasses (Collaboration, Gestures, Animations, Video)
  • Matrix: Discourse, Rhetoric, and Narration
  • Citadels: No single point of failure
  • Massive Inconsistency Robust Ontology: Propositions, Goals, Plans, Descriptions, Statistics, Narratives
  • Actor Services: Hardware and Software
  • Actor Many Cores: Non-sequential, Every-word-tagged, Faraday cage Crypto, Stacked Carbon Nanotube

For example, pain management could greatly benefit from Scalable Intelligent Systems. The complexities of dealing with pain have led to the current opioid crisis. According to Eric Rodgers, PhD, director of the VA’s Office of Evidence Based Practice:

“The use of opioids has changed tremendously since the 1990s, when we first started formulating a plan for guidelines. The concept then was that opioid therapy was an underused strategy for helping our patients and we were trying to get our providers to use this type of therapy more. But as time went on, we became more aware of the harms of opioid therapy and the development of pill mills. The problems got worse.

It’s now become routine for providers to check the state databases to see if there’s multi-sourcing — getting prescriptions from other providers. Providers are also now supposed to use urine drug screenings and, if there are unusual results, to do a confirmation. [For every death from an opioid overdose] there are 10 people who have a problem with opioid use disorder or addiction. And for every addicted person, we have another 10 who are misusing their medication.”

Pain management requires much more than just prescribing opioids, which are often critical for short-term use and, less often, for longer-term use. [Coker 2015; Friedberg 2012; Holt 2017; Marchant 2017; McKinney 2015; Spiegel 2018; Tedesco et al. 2017; White 2017] Organizational aspects play an important role in pain management. [Fagerhaugh and Strauss 1977]

Event details

  • When: 13th November 2018 09:30 - 15:30
  • Series: Distinguished Lectures Series
  • Format: Distinguished lecture

SRG Seminar: “Efficient Cross-architecture Hardware Virtualisation” by Tom Spink

Virtualisation is a powerful tool used for the isolation, partitioning, and sharing of physical computing resources. Employed heavily in data centres, becoming increasingly popular in industrial settings, and used by home users for running alternative operating systems, hardware virtualisation has seen a lot of attention from hardware and software developers over the last ten to fifteen years.

From the hardware side, this takes the form of so-called hardware-assisted virtualisation, and appears in technologies such as Intel VT, AMD-V and the ARM Virtualization Extensions. However, most forms of hardware virtualisation are same-architecture virtualisation, where virtual versions of the host physical machine are created, providing very fast isolated instances of the physical machine in which entire operating systems can be booted. But there is a distinct lack of hardware support for cross-architecture virtualisation, where the guest machine architecture is different from the host.

I will talk about my research in this area and describe Captive, a cross-architecture virtualisation hypervisor that can boot unmodified guest operating systems, compiled for one architecture, in the virtual machine of another.

I will talk about the challenges of full system simulation (such as memory, instruction, and device emulation), our approaches to these, and how we can efficiently map guest behaviour to host behaviour.
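
To give a flavour of what mapping guest behaviour onto host behaviour involves, the sketch below is a toy decode-and-dispatch interpreter for a made-up guest instruction set; it is far simpler than the techniques used in Captive and is purely illustrative:

    # Toy guest machine state: a handful of registers and a flat byte-addressed memory.
    regs = {f"r{i}": 0 for i in range(4)}
    memory = bytearray(256)

    def execute(program):
        """Decode-and-dispatch loop: each guest instruction is mapped onto host operations."""
        pc = 0
        while pc < len(program):
            op, *args = program[pc]
            if op == "movi":            # movi rd, imm      -> host assignment
                regs[args[0]] = args[1]
            elif op == "add":           # add rd, ra, rb    -> host integer add
                regs[args[0]] = (regs[args[1]] + regs[args[2]]) & 0xFFFFFFFF
            elif op == "store":         # store ra, addr    -> host memory write
                memory[args[1]] = regs[args[0]] & 0xFF
            elif op == "halt":
                break
            pc += 1

    execute([("movi", "r0", 40), ("movi", "r1", 2), ("add", "r2", "r0", "r1"),
             ("store", "r2", 0), ("halt",)])
    print(regs["r2"], memory[0])        # 42 42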

Finally, I will discuss our plans for open-sourcing the hypervisor, the work we are currently doing and what future work we have planned.

Event details

  • When: 11th October 2018 13:00 - 14:00
  • Where: Cole 1.33b
  • Series: Systems Seminars Series
  • Format: Seminar, Talk