SRG Seminar: “Evaluating Data Linkage: Creation and use of synthetic data for comprehensive linkage evaluation” by Tom Dalton and “Container orchestration” by Uchechukwu Awada

The abstract of Tom’s talk:

“Data linkage approaches are often evaluated with small or few data sets. If a linkage approach is to be used widely, quantifying its performance with varying data sets would be beneficial. In addition, given a data set needs to be linked, the true links are by definition unknown. The success of a linkage approach is thus difficult to comprehensively evaluate.

This talk focuses on the use of many synthetic data sets for the evaluation of linkage quality achieved by automatic linkage algorithms in the domain of population reconstruction. It presents an evaluation approach which considers linkage quality when characteristics of the population are varied. We envisage a sequence of experiments where a set of populations are generated to consider how linkage quality varies across different populations: with the same characteristics, with differing characteristics, and with differing types and levels of corruption. The performance of an approach at scale is also considered.

The approach to generate synthetic populations with varying characteristics on demand will also be addressed. The use of synthetic populations has the advantage that all the true links are known, thus allowing evaluation as if with real-world ‘gold-standard’ linked data sets.

Given the large number of data sets evaluated against, we also consider how to present these findings. The ability to assess variations in linkage quality across many data sets will assist in the development of new linkage approaches and in identifying areas where existing linkage approaches may be more widely applied.”
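Because the true links in a synthetic population are known, linkage quality can be scored directly. The sketch below is a minimal illustration, not the evaluation code from the talk: the function name `linkage_quality` and the record identifiers are hypothetical, and precision, recall and F-measure are assumed here as the quality metrics.

```python
def linkage_quality(predicted_links, true_links):
    """Score a set of predicted links against the known true links.

    Each link is an unordered pair of record identifiers, stored as a
    frozenset so that (a, b) and (b, a) count as the same link.
    """
    predicted = {frozenset(link) for link in predicted_links}
    truth = {frozenset(link) for link in true_links}

    true_positives = len(predicted & truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    if precision + recall > 0:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    return precision, recall, f_measure


# Hypothetical example: birth records (b*) linked to marriage records (m*).
true_links = [("b1", "m1"), ("b2", "m2"), ("b3", "m3"), ("b4", "m4")]
predicted_links = [("m1", "b1"), ("b2", "m2"), ("b3", "m9")]

precision, recall, f_measure = linkage_quality(predicted_links, true_links)
print(f"precision={precision:.2f} recall={recall:.2f} F={f_measure:.2f}")
```

Running the same scoring over many generated populations, each with different characteristics or corruption levels, would give one set of scores per data set to compare.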

The abstract of Awada’s talk:

“Over the years, there has been rapid development in the area of software development. A recent innovation in software or application deployment and execution is the use of containers. Containers provide a lightweight, isolated and well-defined execution environment. Application containers, like Docker, wrap up a piece of software in a complete file system that contains everything it needs to run: code, runtime, system tools, system libraries, etc. To support and simplify large-scale deployment, cloud computing providers (e.g., AWS, Google, Microsoft) have recently introduced Container Service Platforms (CSPs), which support automated and flexible orchestration of containerised applications on container-instances (virtual machines).

Existing CSP frameworks do not offer any form of intelligent resource scheduling: applications are usually scheduled individually, rather than taking a holistic view of all registered applications and available resources in the cloud. This can result in increased execution times for applications and resource wastage through underutilised container-instances, but also in a reduction in the number of applications that can be deployed, given the available resources. In addition, current CSP frameworks do not support the deployment and scaling of containers across multiple regions at the same time, or the merging of containers into a multi-container unit in order to achieve higher cluster utilisation and reduced execution times.

Our research aims to extend the existing system by adding a cloud-based Container Management Service (CMS) framework that offers increased deployment density, scalability and resource efficiency. CMS provides additional functionality for orchestrating containerised applications through joint optimisation of sets of containerised applications and the resource pool in multiple (geographically distributed) cloud regions. We evaluated CMS on a cloud-based CSP, namely Amazon EC2 Container Service (ECS), and conducted extensive experiments using sets of CPU- and memory-intensive containerised applications against the custom deployment strategy of Amazon ECS. The results show that CMS achieves up to 25% higher cluster utilisation and up to 70% reduction in execution times.”
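The difference between scheduling applications one at a time and taking a holistic view can be illustrated with a toy bin-packing example. This is only a sketch under stated assumptions: it uses the classic first-fit and first-fit-decreasing heuristics as stand-ins, not the CMS algorithm from the talk, and the function names, instance capacity and resource demands are all hypothetical.

```python
def _first_fit(demands, capacity):
    """Place each demand on the first instance that still has room."""
    instances = []  # each entry lists the demands placed on one instance
    for demand in demands:
        if demand > capacity:
            raise ValueError(f"demand {demand} exceeds capacity {capacity}")
        for placed in instances:
            if sum(placed) + demand <= capacity:
                placed.append(demand)
                break
        else:
            # No existing instance had room, so start a new one.
            instances.append([demand])
    return instances


def schedule_individually(demands, capacity):
    """Schedule applications one by one, in arrival order."""
    return _first_fit(demands, capacity)


def schedule_holistically(demands, capacity):
    """Look at all applications together and place the largest first."""
    return _first_fit(sorted(demands, reverse=True), capacity)


# Hypothetical resource demands (arbitrary units) on instances of size 10.
demands = [4, 4, 4, 6, 6, 6]
individual = schedule_individually(demands, capacity=10)
holistic = schedule_holistically(demands, capacity=10)
print(len(individual), "instances when scheduled individually")
print(len(holistic), "instances when scheduled holistically")
```

In this example the arrival-order placement needs four instances while the holistic placement needs three, which is the kind of utilisation gap the abstract describes.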

Event details

  • When: 20th April 2017 13:00 - 14:00
  • Where: Cole 1.33b
  • Series: Systems Seminars Series
  • Format: Seminar

Distinguished Lecture Series 2017: Dr David Manlove

On March 31st, Dr David Manlove from the University of Glasgow delivered the semester two distinguished lectures in Lower and Upper College Hall. The overall title was “Algorithms for healthcare-related matching problems”.

David started with an overview of complexity theory and solving hard problems. He gave examples of this in practice, such as how researchers constructed a best-possible tour around the best 20,000 pubs in the UK. The second lecture focussed on how to assign junior doctors to hospitals in the best way, a very practical problem but with interesting complexity issues. The final lecture focussed on the life-changing topic of how to set up exchanges of kidneys between healthy donors and patients needing transplants. David talked about how his expertise in algorithms has been translated into regularly finding the best possible matches, which then result in real transplants taking place.

David is pictured above at various stages of the distinguished lecture series and outside College Hall with Head of School, Prof Steve Linton, Prof Ian Gent and Dr Ishbel Duncan.

Videos from the DLS can be accessed on Vimeo –
Lecture 1: https://vimeo.com/211633740
Lecture 2: https://vimeo.com/211634119
Lecture 3: https://vimeo.com/211634923

Images courtesy of Ryo Yanagida.

Seeing the Wood for the Trees – Essential Structure in Model-based Search by Prof. John McCall

Problem structure, or linkage, refers to the interaction between variables in a black-box fitness function. Discovering structure is a feature of a range of search algorithms that use structural models at each iteration to determine the trajectory of the search. Examples include Information Geometry Optimisation (IGO), Covariance Matrix Adaptation Evolution Strategy (CMA-ES), Bayesian Evolutionary Learning (BEL) and Estimation of Distribution Algorithms (EDA).

In particular, EDAs use probabilistic graphical models to represent structure learned from evaluated solutions. Various EDA approaches using trees, directed acyclic graphs and undirected graphs have been developed and evaluated on a range of benchmarks with a variety of representations.
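To make the idea of an EDA concrete, the sketch below implements the simplest variant, the Univariate Marginal Distribution Algorithm (UMDA), which models every variable independently and therefore captures no structure at all. It is an illustration only, not code from the talk: the OneMax benchmark (fitness is the number of 1 bits), the function name `umda_onemax` and all parameter values are assumptions. The tree- and graph-based EDAs mentioned above replace the independent probabilities with a structured model.

```python
import random


def umda_onemax(n_bits=20, population=50, selected=25, generations=30, seed=0):
    """Run a univariate EDA on OneMax and return (best fitness, final model)."""
    rng = random.Random(seed)
    probs = [0.5] * n_bits  # the model: one independent probability per bit
    margin = 1.0 / n_bits   # keep probabilities away from 0 and 1
    best = 0
    for _ in range(generations):
        # Sample a population from the current model.
        samples = [
            [1 if rng.random() < p else 0 for p in probs]
            for _ in range(population)
        ]
        # Select the fittest solutions.
        samples.sort(key=sum, reverse=True)
        elite = samples[:selected]
        best = max(best, sum(elite[0]))
        # Re-estimate the model from the selected solutions.
        probs = [
            min(1 - margin, max(margin, sum(s[i] for s in elite) / selected))
            for i in range(n_bits)
        ]
    return best, probs


best, probs = umda_onemax()
print("best fitness found:", best)
```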

Event details

  • When: 4th April 2017 14:00 - 15:00
  • Where: Cole 1.33
  • Series: School Seminar Series
  • Format: Seminar

SRG Seminar: nMANET, the Name-based Data Network (NDN) for Mobile Ad-hoc Networks (MANETs) by Percy Perez Aruni

The aim of this talk is to introduce nMANET, the Name-based Data Network (NDN) approach for Mobile Ad-hoc Networks (MANETs). nMANET is an alternative perspective on utilising the characteristics of NDN to solve the limitations of MANETs, such as mobility and energy consumption. NDN, which is an instance of Information Centric Networking (ICN), provides an alternative architecture for the future Internet. In contrast with traditional TCP/IP networks, NDN enables content addressing instead of host-based communication. NDN secures the content instead of securing the communication channel between hosts; therefore, the content can be obtained from intermediate caches or final producers. Although NDN has proven to be an effective design in wired networks, it does not perfectly address challenges arising in MANETs. This shortcoming is due to the high mobility of mobile devices and their inherent resource constraints, such as remaining energy in batteries.

The implementation of nMANET, the Java-based NDN Forwarder Daemon (JNFD), aims to fill this gap and provide a Mobile Name-based Ad-hoc Network prototype compatible with NDN implementations. JNFD was designed for Android mobile devices and offers a set of energy-efficient forwarding strategies to distribute content in a dynamic topology where consumers, producers and forwarders have high mobility and may join or leave the network at unpredictable times. nMANET evaluates JNFD through benchmarking to estimate efficiency, which is defined as high rates of reliability, throughput and responsiveness with low energy consumption.
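The idea that named content can be served by an intermediate cache rather than the final producer can be sketched in a few lines. This is a toy model and not the JNFD API: the class names, method names and the content name are hypothetical, and it leaves out everything else a real NDN forwarder does, such as pending-interest tracking, forwarding tables, content signatures and energy-aware strategies.

```python
class Producer:
    """Final producer of named content."""

    def __init__(self, content):
        self.content = dict(content)
        self.requests_served = 0

    def on_interest(self, name):
        self.requests_served += 1
        return self.content.get(name)


class Forwarder:
    """Intermediate node that caches data by content name."""

    def __init__(self, upstream):
        self.upstream = upstream
        self.content_store = {}

    def on_interest(self, name):
        # Serve from the local cache when the name is already known.
        if name in self.content_store:
            return self.content_store[name]
        # Otherwise ask upstream and cache whatever comes back.
        data = self.upstream.on_interest(name)
        if data is not None:
            self.content_store[name] = data
        return data


producer = Producer({"/example/seminar/slides": b"slide data"})
forwarder = Forwarder(producer)

first = forwarder.on_interest("/example/seminar/slides")
second = forwarder.on_interest("/example/seminar/slides")
print(first == second, producer.requests_served)
```

The second request is answered from the forwarder's cache, so the producer is contacted only once even though the content is delivered twice.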

Event details

  • When: 6th April 2017 13:00 - 14:00
  • Where: Cole 1.33b
  • Series: Systems Seminars Series
  • Format: Seminar

DLS: Algorithms for healthcare-related matching problems

Algorithms for healthcare-related matching problems

Distinguished Lecture Series, Semester 2, 2016-7

David Manlove

School of Computing Science, University of Glasgow

Lower College Hall (with overflow simulcast in Upper College Hall)

Abstract:

Algorithms arise in numerous everyday applications – in this series of lectures I will describe how algorithms can be used to solve matching problems having applications in healthcare settings. I will begin by outlining how algorithms can be designed to cope with computationally hard problems. I will then describe algorithms developed at the University of Glasgow that have been used by the NHS to solve two particular matching problems. These problems correspond to the annual assignment of junior doctors to Scottish hospitals, and finding “kidney exchanges” between kidney patients and their incompatible donors in the UK.

Event details

  • When: 31st March 2017 09:15 - 15:30
  • Where: Lower College Hall
  • Series: Distinguished Lectures Series
  • Format: Distinguished lecture

SRG Seminar: Managing Shared Mutable Data in a Distributed Environment (Simone Conte)

Title: Managing Shared Mutable Data in a Distributed Environment

Abstract: Managing data is central to our digital lives. The average user owns multiple devices and uses a large variety of applications, services and tools. In an ideal world storage is infinite, data is easy to share and version, and available irrespective of where it is stored, and users can protect and exert control over the data arbitrarily.

In the real world, however, achieving such properties is very hard. File systems provide abstractions that no longer satisfy all the needs of our daily lives. Many applications now abstract data management away from users, but do so within their own silos. Cloud services each provide their own storage abstraction, adding more fragmentation to the overall system.

The work presented in this talk is about engineering a system that usefully approximates the ideal world. We present the Sea Of Stuff, a model in which users can operate over distributed storage as if using their local storage, organise and version data in a distributed manner, and automatically exert policies about how to store content.

Event details

  • When: 23rd March 2017 13:00 - 14:00
  • Where: Cole 1.33b
  • Series: Systems Seminars Series
  • Format: Seminar

SRG Seminar: Cloud scheduling algorithms by Long Thai

“Thanks to cloud computing, access to a virtualised computing cluster has become not only easy but also feasible for organisations, especially small and medium-sized ones. First of all, it does not require an upfront investment in building data centres or a constant expense for managing them. Instead, users pay only for the resources that they actually use. Secondly, cloud providers offer a resource provisioning mechanism which allows users to add or remove resources from their cluster easily and quickly in order to accommodate a workload which changes dynamically in real time. The flexibility of users’ computing clusters is further increased as they are able to select one or a combination of different virtual machine types, each of which has a different hardware specification.

Nevertheless, the users of cloud computing have to face challenges that they have never encountered before. The monetary cost changes dynamically based on the amount of resources used by the clients, which means it is no longer cost-effective to adopt a greedy approach which acquires as many resources as possible. Instead, careful consideration is required before making any decision about acquiring resources. Moreover, the users of cloud computing have to face the paradox of choice resulting from the high number of hardware specification options offered by cloud providers. As a result, finding a suitable machine type for an application can be difficult. It is even more challenging when a user owns many applications, each of which performs differently. Finally, addressing all the above challenges while ensuring that a user receives the desired performance further increases the difficulty of using cloud computing resources effectively.

In this research, we investigate and propose an approach that aims to solve the challenge of optimising the usage of cloud computing resources by constructing a heterogeneous cloud cluster which changes dynamically based on the workload. Our proposed approach consists of two processes. The first one, named execution scheduling, aims to determine the number of virtual machines and the allocation of workload to each machine in order to achieve the desired performance at minimum cost. The second process, named execution management, monitors the execution during runtime, and detects and handles unexpected events. The proposed approach has been thoroughly evaluated by both simulated and real-world experiments. The results have shown that our approach is able not only to achieve the desired performance while minimising the monetary cost but also to reduce, or even completely prevent, negative results caused by unexpected events at runtime.”
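The trade-off behind execution scheduling, meeting a performance target at minimum cost with a mix of machine types, can be shown with a small exhaustive search. This is a sketch under stated assumptions and not the algorithm from the talk: the machine types, prices and throughputs are invented, every machine is assumed to be billed for the whole deadline, and the workload is assumed to divide perfectly across machines.

```python
from itertools import product

# Hypothetical machine types: (hourly price in dollars, tasks per hour).
VM_TYPES = {
    "small": (0.10, 40),
    "large": (0.35, 160),
}


def cheapest_cluster(total_tasks, deadline_hours, max_per_type=10):
    """Search every mix of machines and return (cost, mix) for the cheapest
    cluster that can finish the workload before the deadline, or None."""
    names = list(VM_TYPES)
    best = None
    for counts in product(range(max_per_type + 1), repeat=len(names)):
        throughput = sum(n * VM_TYPES[name][1] for n, name in zip(counts, names))
        if throughput * deadline_hours < total_tasks:
            continue  # this cluster cannot finish in time
        hourly_cost = sum(n * VM_TYPES[name][0] for n, name in zip(counts, names))
        cost = hourly_cost * deadline_hours
        if best is None or cost < best[0]:
            best = (cost, dict(zip(names, counts)))
    return best


cost, mix = cheapest_cluster(total_tasks=1000, deadline_hours=2)
print(f"cheapest mix: {mix} at ${cost:.2f}")
```

With these made-up numbers the cheapest feasible cluster is heterogeneous: three large machines plus one small one cost less than either four large machines or any mix with fewer large ones.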

Event details

  • When: 9th March 2017 13:00 - 14:00
  • Where: Cole 1.33b
  • Series: Systems Seminars Series
  • Format: Seminar

Implementing Event-Driven Microservices Architecture using Functional programming

*PLEASE NOTE THIS TALK WILL TAKE PLACE IN BMS BUILDING – SEMINAR ROOM 113*

BIO: Nikhil Barthwal is a polyglot programmer currently working as a Senior Software Engineer at Jet.com, an e-commerce startup recently acquired by Walmart. He works in the Tools & Productivity team with the aim of making developers more productive, as well as improving the quality of the code. Outside of work, he is involved with local meetups in New York City, where he gives talks on various topics related to technology. He holds a Master’s in Computer Science with special focus on Distributed Systems and a Bachelor’s in Electrical Engineering.

ABSTRACT: Web services are typically stateless entities that need to operate at large scale. The functional paradigm can be used to model how these web services work, and offers several benefits such as scalability, productivity, and correctness.

This talk describes how Jet.com implemented their Event-Driven Microservices using F#. It covers topics such as their Microservices, Event-Sourcing, Kafka, and their Build & Deployment pipeline. The objective of the talk is to show how to create a scalable and highly distributed web service in F#, and to demonstrate how various characteristics of the functional paradigm capture the behavior of such service architectures very naturally.
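One reason event sourcing fits the functional paradigm is that the current state is simply a fold of a pure function over the event log. The sketch below shows that idea in Python rather than F#, purely for illustration: the event types, the shopping-cart state and the function names are hypothetical and are not taken from Jet.com's system.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass(frozen=True)
class ItemAdded:
    sku: str
    quantity: int


@dataclass(frozen=True)
class ItemRemoved:
    sku: str


def apply_event(cart, event):
    """Pure function: return the next cart state without mutating the old one."""
    if isinstance(event, ItemAdded):
        return {**cart, event.sku: cart.get(event.sku, 0) + event.quantity}
    if isinstance(event, ItemRemoved):
        return {sku: qty for sku, qty in cart.items() if sku != event.sku}
    raise TypeError(f"unknown event: {event!r}")


def replay(events):
    """Rebuild the current state by folding over the event log."""
    return reduce(apply_event, events, {})


log = [
    ItemAdded("book", 1),
    ItemAdded("pen", 3),
    ItemAdded("book", 2),
    ItemRemoved("pen"),
]
cart = replay(log)
print(cart)
```

Because `apply_event` has no side effects, the same log always replays to the same state, which is what allows a stateless service to be rebuilt or scaled out from its stored events.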

Event details

  • When: 8th March 2017 15:00 - 16:00
  • Where: TBA
  • Series: CS Colloquia Series
  • Format: Colloquium, Seminar

Seminar: The technology driving the evolution of internet advertising, targeted advertising or intrusive surveillance?

“The technology driving the evolution of internet advertising, targeted advertising or intrusive surveillance?”

 Tim Palmer, Senior Partner, Digiterre (http://www.digiterre.com)

 

Event details

  • When: 27th February 2017 14:00 - 15:00
  • Where: Cole 1.33a
  • Series: CS Colloquia Series
  • Format: Seminar

Distinguished Lecture Series 2016: Prof. Julie McCann

Earlier this month Professor Julie McCann from Imperial College London delivered the next set of distinguished lectures for 2016 in Lower and Upper College Hall. The three topical, well-attended and interesting lectures centred on Distributed Systems and Sensing. They discussed how sensor networks are being used today, how other sciences will impact the research area and how such systems are programmed, and finished by introducing ongoing challenges in terms of scalability, resilience and security.

Professor McCann is pictured below at various stages of the distinguished lecture series, and with Director of Research, Professor Simon Dobson and Dean of Science, Professor Alan Dearle.

Videos from the DLS can be accessed on Vimeo –
Lecture 1: https://vimeo.com/192134381
Lecture 2: https://vimeo.com/192135351
Lecture 3: https://vimeo.com/192137007

Images courtesy of Saleem Bhatti