“Ambient intelligence with sensor networks” by Lucas Amos and “Location, Location, Location: Exploring Amazon EC2 Spot Instance Pricing Across Geographical Regions” by Nnamdi Ekwe-Ekwe

Lucas’s abstract

“Indoor environment quality has a significant effect on worker productivity through a complex interplay of factors such as temperature, humidity and levels of Volatile Organic Compounds (VOCs).

In this talk I will discuss my Master’s project, which used off-the-shelf sensors and Raspberry Pis to collect environmental readings at one-minute intervals throughout the Computer Science buildings. The prevalence of erroneous readings due to sensor failure and the strategy used to identify and correct such faults will be presented. Identifiable correlations between environmental variables and attempts to model these relationships will also be discussed.

Past studies identifying the ideal environmental conditions for human comfort and productivity allow for the objective assessment of indoor environmental conditions. An adaptation of Frešer’s environment rating system will be presented, showing how VOC levels can be incorporated into assessments of environment quality and how this can be communicated to building users.”
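
The abstract leaves the fault-handling strategy open; as a rough illustration of one common approach (range checks followed by interpolation over flagged gaps), not necessarily the strategy used in the project:

```python
# Sketch: flag out-of-range readings, then patch gaps by linear interpolation
# between the nearest valid neighbours. Thresholds are illustrative
# assumptions, not values from the talk.
def clean_series(readings, low=-10.0, high=50.0):
    """readings: floats sampled at one-minute intervals; None marks a dropout."""
    cleaned = [r if r is not None and low <= r <= high else None for r in readings]
    for i, value in enumerate(cleaned):
        if value is None:
            prev = next((j for j in range(i - 1, -1, -1) if cleaned[j] is not None), None)
            nxt = next((j for j in range(i + 1, len(cleaned)) if cleaned[j] is not None), None)
            if prev is not None and nxt is not None:
                frac = (i - prev) / (nxt - prev)
                cleaned[i] = cleaned[prev] + frac * (cleaned[nxt] - cleaned[prev])
    return cleaned

print(clean_series([20.1, 20.3, 99.9, 20.7]))  # spike at index 2 -> 20.5
```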

Nnamdi’s abstract

“Cloud computing is becoming an almost ubiquitous part of the computing landscape. For many companies today, moving their entire infrastructure and workloads to the cloud reduces complexity and time to deployment, and saves money. Spot Instances, a subset of Amazon’s cloud computing infrastructure (EC2), expand on this. They allow a user to bid on spare compute capacity in Amazon’s data centres at heavily discounted prices. If demand ever increases such that the user’s maximum bid is exceeded, their compute instance is terminated.

In this work, we conduct one of the first detailed analyses of how location affects the overall cost of deploying a spot instance. We simultaneously examine the reliability of a spot instance’s pricing data, and whether a user can be confident that their instance has a low risk of termination.

We analyse spot pricing data across all available Amazon Web Services regions for 60 days on a variety of instance types. We find that location does play a critical role in spot instance pricing and also that pricing differs depending on the granularity of the location – from a more coarse-grained AWS region to a more fine-grained Availability Zone within a region. We relate the pricing differences we find to the price’s stability, confirming whether we can be confident in the bid prices we make.

We conclude by showing that it is possible to run workloads on Spot Instances that achieve both a very low risk of termination and a very low hourly cost.”
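
For readers who want to explore the same kind of data, a minimal sketch of pulling spot price history with boto3 and summarising it per Availability Zone; the instance type, region and window are placeholders, and this is not the authors’ analysis pipeline:

```python
# Sketch: fetch spot price history for one instance type in one region and
# summarise per Availability Zone. Placeholders, not the paper's pipeline.
from collections import defaultdict
from datetime import datetime, timedelta
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")
paginator = ec2.get_paginator("describe_spot_price_history")

prices = defaultdict(list)
for page in paginator.paginate(
    InstanceTypes=["m4.large"],
    ProductDescriptions=["Linux/UNIX"],
    StartTime=datetime.utcnow() - timedelta(days=60),
    EndTime=datetime.utcnow(),
):
    for point in page["SpotPriceHistory"]:
        prices[point["AvailabilityZone"]].append(float(point["SpotPrice"]))

for zone, series in sorted(prices.items()):
    mean = sum(series) / len(series)
    print(f"{zone}: mean ${mean:.4f}/h, min ${min(series):.4f}, max ${max(series):.4f}")
```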

Event details

  • When: 9th November 2017 13:00 - 14:00
  • Where: Cole 1.33a
  • Series: Systems Seminars Series
  • Format: Seminar

“A Decentralised Multimodal Integration of Social Signals: A Bio-Inspired Approach” by Esma Benssassi and “Plug and Play Bench: Simplifying Big Data Benchmarking Using Containers” by Sheriffo Ceesay

Esma’s abstract

The ability to integrate information from different sensory modalities in a social context is crucial for understanding social cues and for useful social interaction and experience. Recent research has focused on multi-modal integration of social signals from visual, auditory, haptic or physiological data. Different data fusion techniques have been designed and developed; however, the majority have not achieved significant accuracy improvements in recognising social cues compared to uni-modal social signal recognition. One possible limitation is that these existing approaches lack sufficient capacity to model the various types of interactions between different modalities, and have not been able to leverage the advantages of multi-modal signals by treating each as complementary to the others. We introduce ideas for a decentralised model for social signal integration, inspired by computational models of multi-sensory integration in neuroscience and the perception of social signals in the human brain.
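
For context, a minimal sketch of the kind of conventional late-fusion baseline such decentralised models aim to improve on (the modalities, weights and scores below are made up):

```python
# Conventional late fusion: each modality emits class probabilities
# independently and the results are (weighted-)averaged. The decentralised
# approach discussed in the talk is an alternative to this pattern.
def late_fusion(modality_scores, weights=None):
    """modality_scores: dict of modality -> list of class probabilities."""
    n_classes = len(next(iter(modality_scores.values())))
    weights = weights or {m: 1.0 for m in modality_scores}
    total = sum(weights.values())
    fused = [0.0] * n_classes
    for modality, scores in modality_scores.items():
        for i, p in enumerate(scores):
            fused[i] += weights[modality] * p / total
    return fused

scores = {"visual": [0.7, 0.3], "audio": [0.4, 0.6], "haptic": [0.5, 0.5]}
print(late_fusion(scores))  # [0.533..., 0.466...]
```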

Sheriffo’s abstract

The recent boom of big data, coupled with the challenges of its processing and storage, gave rise to distributed data processing and storage paradigms like MapReduce, Spark, and NoSQL databases. With the advent of cloud computing, processing and storing such massive datasets on clusters of machines is now feasible. However, there are few tools and approaches that users can rely on to gauge and understand the performance of their big data applications deployed locally on clusters or in the cloud. Researchers have started exploring this area by providing benchmarking suites suitable for big data applications. However, many of these tools are fragmented, complex to deploy and manage, and do not provide transparency with respect to the monetary cost of benchmarking an application.

In this talk, I will present Plug and Play Bench (PAPB, https://github.com/sneceesay77/papb): an infrastructure-aware abstraction built to integrate and simplify the process of big data benchmarking. PAPB automates the tedious process of installing, configuring and executing common big data benchmark workloads by containerising the tools and settings based on the underlying cluster deployment framework. Our proof-of-concept implementation uses HiBench as the benchmark suite, HDP as the cluster deployment framework and Azure as the cloud platform. The talk will further illustrate the inclusion of cost metrics based on the underlying Microsoft Azure cloud platform.
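
PAPB’s own code lives at the link above; purely as an illustration of the containerised-benchmark pattern, a sketch using the Docker SDK for Python, with a hypothetical image, command and environment:

```python
# Illustration of the general containerised-benchmark pattern (not PAPB's
# actual code): launch a benchmark image against a cluster endpoint and
# capture its output. Image name, command and env vars are hypothetical.
import docker

client = docker.from_env()
logs = client.containers.run(
    image="example/hibench-runner:latest",       # hypothetical image
    command="run.sh wordcount large",            # hypothetical entry point
    environment={"HDFS_MASTER": "hdfs://namenode:8020"},
    remove=True,                                 # clean up the container afterwards
)
print(logs.decode())
```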

Event details

  • When: 26th October 2017 13:00 - 14:00
  • Where: Cole 1.33a
  • Series: Systems Seminars Series
  • Format: Seminar

SRG Seminar: “Adaptive Multisite Computation Offloading in Mobile Clouds” by Dawand Sulaiman and “Topological Ranking-Based Resource Scheduling for Multi-Accelerator Systems” by Teng Yu

Dawand’s abstract

The concept of using cloud-hosted infrastructure to overcome the resource constraints of mobile devices is known as Mobile Cloud Computing (MCC): applications run partially on the device and partially on a remote cloud instance, thereby overcoming device-specific resource constraints. However, as smartphones and tablets gain more CPU power and longer battery life, the meaning of MCC is gradually changing. Instead of being fully dependent on the cloud, a number of nearby devices can coordinate and distribute content and resources in a decentralised manner; this is known as Mobile Ad hoc Cloud Computing. Mobile devices with less computational power and shorter battery life can then leverage nearby mobile devices to run resource-intensive applications. More efficient and reliable methodologies therefore need to be explored for resource-hungry and real-time applications such as face recognition, augmented reality and other data-intensive mobile applications.
We present a unified framework which allows each mobile device within the shared environment to intelligently offload its computation to other external platforms. For each mobile device, it is important to make the offloading decision based on network conditions, the load of other machines, and the device’s own constraints (e.g., mobility and battery). Moreover, to achieve a globally optimal completion time for tasks from all the mobile devices, it is necessary to devise a task scheduling solution that schedules offloaded tasks in real time. The offloading decision engine needs to adapt to dynamic changes in both the host device and the connected nearby and remote devices.
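
As an illustration of the kind of cost model such a decision engine might weigh (the model, weights and inputs are assumptions, not the framework’s actual engine):

```python
# Sketch of an offloading decision: estimate completion time locally versus on
# each candidate device, penalising low local battery. All numbers illustrative.
def offload_target(task_cycles, candidates, local_speed, local_battery):
    """candidates: list of (name, speed_cycles_per_s, bandwidth_Bps, load, task_bytes)."""
    best, best_cost = "local", task_cycles / local_speed / max(local_battery, 0.05)
    for name, speed, bandwidth, load, size in candidates:
        transfer = size / bandwidth                     # time to ship code/data
        compute = task_cycles / (speed * (1.0 - load))  # slower when loaded
        cost = transfer + compute
        if cost < best_cost:
            best, best_cost = name, cost
    return best, best_cost

print(offload_target(5e9, [("edge-pi", 2e9, 1e7, 0.5, 2e7)],
                     local_speed=1e9, local_battery=0.2))  # -> ('edge-pi', 7.0)
```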

Teng’s abstract

Accelerators are becoming increasingly prevalent in distributed computation. FPGAs have been shown to be fast and power-efficient for particular tasks, yet scheduling on multi-accelerator systems is challenging when workloads vary significantly in granularity, in terms of task size and/or the number of computational units required.
We present a novel approach for dynamically scheduling tasks on networked multi-accelerator systems which maintains high performance, even in the presence of irregular jobs. Our topological ranking-based scheduling allows realistic irregular workloads to be processed while maintaining a significantly higher level of performance than existing schedulers.
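
The topological ranking itself is the talk’s contribution; as a generic illustration of rank-then-assign (list) scheduling only, not the proposed algorithm:

```python
# Generic list-scheduling sketch, *not* the talk's topological ranking:
# order tasks by a rank, then place each on the accelerator that can
# finish it earliest.
import heapq

def schedule(tasks, n_accelerators):
    """tasks: list of (task_id, estimated_runtime). Returns makespan, placement."""
    ready = [(0.0, acc) for acc in range(n_accelerators)]  # (available_at, id)
    heapq.heapify(ready)
    placement = {}
    for task_id, runtime in sorted(tasks, key=lambda t: -t[1]):  # rank: big first
        available_at, acc = heapq.heappop(ready)
        placement[task_id] = acc
        heapq.heappush(ready, (available_at + runtime, acc))
    return max(t for t, _ in ready), placement

print(schedule([("a", 4), ("b", 3), ("c", 2), ("d", 2)], n_accelerators=2))
```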

Event details

  • When: 12th October 2017 13:00 - 14:00
  • Where: Cole 1.33b
  • Series: Systems Seminars Series
  • Format: Seminar

SRG Seminar: “Simulating a pulmonary tuberculosis infection using a network-based metapopulation model” by Michael Pitcher and “A Fake City of People: Modeling the Co-evolution of City and Citizens” by Xue Guo

Michael Pitcher’s abstract

Tuberculosis (TB) is one of the world’s most deadly infectious diseases, claiming over 1.4 million lives every year. TB infections typically affect the lungs and treatment regimens are long and arduous, requiring at least 6 months of daily chemotherapy. Previous investigations have shown TB to have unique localisations within the lung at varying stages of infection. The initial implant and the primary lesion which arises from it can occur anywhere in the lungs, with a greater probability of occurrence in the lower to middle regions of the lung. However, reactivation of a previously latent form of disease always involves cavitation of the tissue at the apical regions. This difference in spatial location of TB infections suggests two important factors: i) bacteria are able to disseminate across the lung in some manner, and ii) the environment at the top of the lung has some properties that make it preferential for TB replication.

In this project, we aim to build a whole-organ model of the lung and surrounding lymphatics which incorporates both bacterial dissemination possibilities and lung tissue spatial heterogeneity, in order to understand their impact on TB. We develop ComMeN (Compartmentalised Metapopulation Network), a Python framework designed to allow the easy creation of complex network-based metapopulations with spatial heterogeneity, upon which interaction dynamics can be applied, with discrete event modelling using the Gillespie algorithm. We then extend this framework to create a TB-specific model, PTBComMeN, which models a TB infection occurring over lung tissue divided into patches, each of which contains spatial attributes appropriate to its position in the lung, such as ventilation, perfusion and oxygen tension. Events dictate the interactions between cells and bacteria and their interaction with the environment, with dissemination occurring along the edges joining patches in the lung network. This model allows experimentation into the effects that spatial heterogeneities and bacterial dissemination may have on the progression of disease, and is designed to provide insight into the factors that result in long treatment times for TB.
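
As a flavour of the underlying machinery, a minimal Gillespie sketch with two patches and toy rates (an illustration of the algorithm, not PTBComMeN):

```python
# Minimal Gillespie (stochastic simulation algorithm) sketch: bacteria
# replicate within two lung patches and disseminate along the edge between
# them. Rates and the two-patch network are toy assumptions.
import math, random

def gillespie(t_end=20.0, replication=0.1, migration=0.02):
    counts = [10, 0]   # bacterial load in patch 0 (lower lung) and patch 1 (apex)
    t = 0.0
    while t < t_end:
        # Propensities: replication in each patch, dissemination 0 -> 1.
        rates = [replication * counts[0], replication * counts[1],
                 migration * counts[0]]
        total = sum(rates)
        if total == 0:
            break
        t += -math.log(1.0 - random.random()) / total  # exponential waiting time
        event = random.choices((0, 1, 2), weights=rates)[0]
        if event == 0:
            counts[0] += 1
        elif event == 1:
            counts[1] += 1
        else:
            counts[0] -= 1
            counts[1] += 1
    return t, counts

print(gillespie())
```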

Xue Guo’s abstract

By 2050, the global urban population is projected to grow by 2.5 billion. While the fast pace of urbanisation initially brings improved quality of life, the surging population will inevitably lead to unique urban issues. Emerging research fields, with the aim of creating smarter cities, plan to counteract these problems. To facilitate this research, we need solid models to generate ‘fake cities’, which cannot easily be produced by existing random graph algorithms because of spatial constraints. We therefore propose a new model for the co-evolution of city and population, which can show how street networks form, how population spreads, and how settlements emerge and diminish. The new model will be a random city generator, which could be used to backtrack the history and predict the future of a city, or to act as test cases for the validation and evaluation of urban optimisation algorithms.
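
As a toy illustration of why spatial constraints matter for such generators (not the proposed model), a sketch in which each new settlement connects to its nearest existing one, so streets form a spatial tree rather than an arbitrary random graph:

```python
# Toy spatially constrained growth: new nodes attach to their nearest
# existing node. Parameters are illustrative.
import math, random

def grow_city(n_nodes=10, size=100.0, seed=1):
    random.seed(seed)
    nodes = [(size / 2, size / 2)]          # founding settlement at the centre
    streets = []
    for _ in range(n_nodes - 1):
        p = (random.uniform(0, size), random.uniform(0, size))
        nearest = min(nodes, key=lambda q: math.dist(p, q))
        nodes.append(p)
        streets.append((nearest, p))        # spatial constraint: connect locally
    return nodes, streets

nodes, streets = grow_city()
print(f"{len(nodes)} settlements, {len(streets)} street segments")
```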

Event details

  • When: 28th September 2017 13:00 - 14:00
  • Where: Cole 1.33b
  • Series: School Seminar Series, Systems Seminars Series
  • Format: Seminar

SRG Seminar: Evaluation Techniques for Detection Model Performance in Anomaly Network Intrusion Detection System by Amjad Al Tobi

Everyday advancements in technology bring with them novel challenges and threats. Such advancement imposes greater risks than ever on systems and services, including individual privacy information. Relying on intrusion specialists to come up with new signatures to detect new types of attack does not seem to scale with the rapid growth in traffic. Anomaly-based detection therefore provides a promising solution for this problem area.

Anomaly-based IDSs apply machine learning, data mining and/or artificial intelligence, along with many other methods, to solve this problem. Currently, these solutions seem not to be tractable for real production environments due to their high false alarm rates. This might be a result of such systems being unable to determine the point at which an update is required: it is not clear how detection models will behave over time, once traffic behaviour has changed since the model was last regenerated.
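
As one plausible illustration of an update trigger (not the talk’s method), a sketch that flags a model for regeneration when the observed alarm rate drifts far from its post-training baseline:

```python
# Sketch: track the alarm rate over a sliding window and flag the detection
# model for regeneration when it drifts far beyond the rate observed just
# after training. Window size and tolerance are illustrative assumptions.
from collections import deque

class DriftMonitor:
    def __init__(self, baseline_alarm_rate, window=1000, tolerance=3.0):
        self.baseline = baseline_alarm_rate
        self.recent = deque(maxlen=window)
        self.tolerance = tolerance            # allowed ratio vs. baseline

    def observe(self, is_alarm):
        self.recent.append(1 if is_alarm else 0)
        if len(self.recent) < self.recent.maxlen:
            return False                      # not enough evidence yet
        rate = sum(self.recent) / len(self.recent)
        return rate > self.baseline * self.tolerance  # time to regenerate?

monitor = DriftMonitor(baseline_alarm_rate=0.01, window=100)
decisions = [monitor.observe(is_alarm=(i % 5 == 0)) for i in range(200)]
print("regenerate?", decisions[-1])   # 20% alarms >> 1% baseline -> True
```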

Event details

  • When: 1st June 2017 13:00 - 14:00
  • Where: Cole 1.33b
  • Series: Systems Seminars Series
  • Format: Seminar

SRG Seminar: New Network Functionality using ILNPv6 and DNS by Khawar Shehzad

This research deals with the introduction of new network functionality based on the Identifier-Locator Network Protocol version 6 (ILNPv6) and the Domain Name System (DNS). The chosen area of concern is security, and specifically the mitigation of Distributed Denial of Service (DDoS) attacks. The functionality proposed and tested deals with the issues of vulnerability testing, probing and scanning, which directly lead to a successful DDoS attack. The solutions presented can be used as a reactive measure to these security issues. DDoS is chosen because, in recent years, DDoS attacks have become among the most common and hardest to defend against; they target the availability of a system or site. There are multiple solutions in the literature, but none is based on ILNPv6, and most are complex in nature. Existing solutions either require modification of providers’ networks, or are complex if they are purely site-based. Most are based on the IPv6 protocol and do not use the concept of naming as proposed by ILNPv6. A short sketch of ILNPv6’s DNS-based identifier-locator resolution is shown after the objectives below.

The prime objectives of this research are:

  • to defend against DDoS attacks with the use of naming and DNS
  • to increase the attacker’s effort, and to reduce vulnerability testing and random probing by attackers
  • to practically demonstrate the effectiveness of the ILNPv6-based solution for security
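
As background, ILNPv6 publishes a node’s identifier and locator in DNS as NID and L64 records (RFC 6742); a site under attack can change locators without changing node identifiers. A minimal lookup sketch, assuming dnspython’s RFC 6742 rdata support and a placeholder host name:

```python
# Sketch: resolve ILNP identifier (NID) and locator (L64) records for a host.
# The host name is a placeholder; requires dnspython (2.x).
import dns.resolver

host = "host.example.com."
for rtype in ("NID", "L64"):
    try:
        for rdata in dns.resolver.resolve(host, rtype):
            print(rtype, rdata)              # preference + identifier/locator
    except dns.resolver.NoAnswer:
        print(f"no {rtype} records for {host}")
```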

Event details

  • When: 18th May 2017 13:00 - 14:00
  • Where: Cole 1.33b
  • Series: Systems Seminars Series
  • Format: Seminar

SRG Seminar: Investigation of Virtual Network Isolation Security in Cloud Computing Using Penetration Testing by Haifa Al Nasseri

Software Defined Networking (SDN) and Virtual Networks (VNs) are required for cloud tenants to meet fluctuating demand. However, multi-tenancy can be compromised without proper isolation. Much research has been conducted into VN isolation, but much of it does not tackle security aspects or check whether the isolation evaluation is complete. Data leakage therefore remains a major security concern in the cloud.

This research uses an OpenStack VN and an OpenStack tenant network to test multi-tenancy features. We evaluate the relationship between the isolation methods used in cloud VNs and the amount of data leaked, using penetration tests. These tests will be used to identify the vulnerabilities that cause cloud VN data leakage, and to investigate how the vulnerabilities, and the leaked data, can compromise tenant Virtual Networks.
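
As a generic illustration of the simplest kind of isolation probe (not the project’s test suite), a sketch that checks whether addresses that should be private to another tenant’s network are reachable at all:

```python
# Sketch: from a VM in tenant A, attempt TCP connections to addresses that
# should be reachable only inside tenant B. Any successful connect suggests
# a leak. Addresses and port are placeholders.
import socket

def probe(targets, port=22, timeout=1.0):
    leaks = []
    for addr in targets:
        try:
            with socket.create_connection((addr, port), timeout=timeout):
                leaks.append(addr)            # cross-tenant reachability!
        except OSError:
            pass                              # isolation held for this address
    return leaks

print(probe(["10.1.0.5", "10.1.0.6"]))       # placeholder tenant-B addresses
```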

Event details

  • When: 4th May 2017 13:00 - 14:00
  • Where: Cole 1.33b
  • Series: Systems Seminars Series
  • Format: Talk

SRG Seminar: “Evaluating Data Linkage: Creation and use of synthetic data for comprehensive linkage evaluation” by Tom Dalton and “Container orchestration” by Uchechukwu Awada

The abstract of Tom’s talk:

“Data linkage approaches are often evaluated with small or few data sets. If a linkage approach is to be used widely, quantifying its performance across varying data sets would be beneficial. In addition, given a data set that needs to be linked, the true links are by definition unknown. The success of a linkage approach is thus difficult to evaluate comprehensively.

This talk focuses on the use of many synthetic data sets for the evaluation of linkage quality achieved by automatic linkage algorithms in the domain of population reconstruction. It presents an evaluation approach which considers linkage quality when characteristics of the population are varied. We envisage a sequence of experiments where a set of populations are generated to consider how linkage quality varies across different populations: with the same characteristics, with differing characteristics, and with differing types and levels of corruption. The performance of an approach at scale is also considered.

The approach to generate synthetic populations with varying characteristics on demand will also be addressed. The use of synthetic populations has the advantage that all the true links are known, thus allowing evaluation as if with real-world ‘gold-standard’ linked data sets.

Given the large number of data sets evaluated against, we also consider how best to present these findings. The ability to assess variations in linkage quality across many data sets will assist in the development of new linkage approaches and in identifying areas where existing linkage approaches may be more widely applied.”
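
A sketch of the evaluation that known true links make possible: with synthetic populations the truth set is available by construction, so linkage quality reduces to set comparisons (the link pairs below are made up):

```python
# Sketch: standard linkage-quality metrics against a known truth set.
def linkage_quality(predicted, truth):
    predicted, truth = set(predicted), set(truth)
    tp = len(predicted & truth)
    precision = tp / len(predicted) if predicted else 1.0
    recall = tp / len(truth) if truth else 1.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

truth = {("rec1", "rec9"), ("rec2", "rec7"), ("rec3", "rec8")}
predicted = {("rec1", "rec9"), ("rec2", "rec7"), ("rec4", "rec8")}
print(linkage_quality(predicted, truth))      # (0.666..., 0.666..., 0.666...)
```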

The abstract of Awada’s talk:

“Over the years, there has been rapid change in how software is built and deployed. A recent innovation in application deployment and execution is the use of containers. Containers provide a lightweight, isolated and well-defined execution environment. Application containers, like Docker, wrap up a piece of software in a complete filesystem that contains everything it needs to run: code, runtime, system tools, system libraries, etc. To support and simplify large-scale deployment, cloud computing providers (e.g., AWS, Google, Microsoft) have recently introduced Container Service Platforms (CSPs), which support automated and flexible orchestration of containerised applications on container-instances (virtual machines).

Existing CSP frameworks do not offer any form of intelligent resource scheduling: applications are usually scheduled individually, rather than taking a holistic view of all registered applications and available resources in the cloud. This can result in increased execution times for applications and resource wastage through under-utilised container-instances, as well as a reduction in the number of applications that can be deployed given the available resources. In addition, CSP frameworks do not currently support the deployment and scaling of containers across multiple regions at the same time, or the merging of containers into a multi-container unit in order to achieve higher cluster utilisation and reduced execution times.

Our research aims to extend the existing systems by adding a cloud-based Container Management Service (CMS) framework that offers increased deployment density, scalability and resource efficiency. CMS provides additional functionality for orchestrating containerised applications by jointly optimising sets of containerised applications and the resource pool across multiple (geographically distributed) cloud regions. We evaluate CMS on a cloud-based CSP, Amazon EC2 Container Service (ECS), and conduct extensive experiments using sets of CPU- and memory-intensive containerised applications against the custom deployment strategy of Amazon ECS. The results show that CMS achieves up to 25% higher cluster utilisation and up to 70% lower execution times.”
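
To illustrate the holistic-versus-individual scheduling point only (a toy first-fit-decreasing packer, not the CMS algorithm): packing all registered applications together needs fewer container-instances than scheduling one per application.

```python
# Toy first-fit-decreasing bin packing over normalised CPU demand.
def pack(app_demands, instance_capacity):
    instances = []                            # remaining capacity per instance
    for demand in sorted(app_demands, reverse=True):
        for i, free in enumerate(instances):
            if demand <= free:
                instances[i] -= demand
                break
        else:
            instances.append(instance_capacity - demand)  # start a new instance
    return len(instances)

apps = [0.6, 0.5, 0.4, 0.3, 0.2]              # CPU demand per application
print(pack(apps, instance_capacity=1.0),
      "instances vs", len(apps), "if scheduled individually")  # 2 vs 5
```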

Event details

  • When: 20th April 2017 13:00 - 14:00
  • Where: Cole 1.33b
  • Series: Systems Seminars Series
  • Format: Seminar

SRG Seminar: nMANET, Named Data Networking (NDN) for Mobile Ad-hoc Networks (MANETs) by Percy Perez Aruni

The aim of this talk is to introduce nMANET, a Named Data Networking (NDN) approach for Mobile Ad-hoc Networks (MANETs). nMANET is an alternative perspective on using the characteristics of NDN to address the limitations of MANETs, such as mobility and energy consumption. NDN, an instance of Information Centric Networking (ICN), provides an alternative architecture for the future Internet. In contrast to traditional TCP/IP networks, NDN enables content addressing instead of host-based communication. NDN secures the content instead of securing the communication channel between hosts, so content can be obtained from intermediate caches as well as from the original producers. Although NDN has proven to be an effective design in wired networks, it does not perfectly address the challenges arising in MANETs. This shortcoming is due to the high mobility of mobile devices and their inherent resource constraints, such as the remaining energy in their batteries.

The implementation of nMANET, the Java-based NDN Forwarder Daemon (JNFD), aims to fill this gap and provide a mobile name-based ad-hoc network prototype compatible with NDN implementations. JNFD was designed for Android mobile devices and offers a set of energy-efficient forwarding strategies to distribute content in a dynamic topology where consumers, producers and forwarders are highly mobile and may join or leave the network at unpredictable times. nMANET evaluates JNFD through benchmarking to estimate efficiency, defined as high reliability, throughput and responsiveness with low energy consumption.
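
For readers new to NDN, a minimal sketch of the forwarding state every NDN node keeps (Content Store, PIT and FIB); this illustrates the general NDN pattern, not JNFD itself:

```python
# Sketch: an Interest is satisfied from the Content Store if cached, otherwise
# recorded in the PIT and forwarded via the FIB; returning Data follows the
# PIT back to the requesting faces. Longest-prefix matching is elided.
class Forwarder:
    def __init__(self, fib):
        self.cs, self.pit, self.fib = {}, {}, fib  # Content Store, PIT, FIB

    def on_interest(self, name, from_face):
        if name in self.cs:
            return ("data", self.cs[name], from_face)    # cache hit
        self.pit.setdefault(name, set()).add(from_face)  # remember requester
        return ("forward", name, self.fib.get(name))

    def on_data(self, name, content):
        self.cs[name] = content                          # cache for later
        return self.pit.pop(name, set())                 # faces to satisfy

fwd = Forwarder(fib={"/video/clip1": "face-2"})
print(fwd.on_interest("/video/clip1", "face-1"))
print(fwd.on_data("/video/clip1", b"chunk"))
```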

Event details

  • When: 6th April 2017 13:00 - 14:00
  • Where: Cole 1.33b
  • Series: Systems Seminars Series
  • Format: Seminar

SRG Seminar: Managing Shared Mutable Data in a Distributed Environment (Simone Conte)

Managing data is central to our digital lives. The average user owns multiple devices and uses a large variety of applications, services and tools. In an ideal world, storage is infinite; data is easy to share and version, and is available irrespective of where it is stored; and users can protect and exert control over their data arbitrarily.

In the real world, however, achieving such properties is very hard. File systems provide abstractions that no longer satisfy all the needs of our daily lives. Many applications now hide data management from users, but do so within their own silos. Cloud services each provide their own storage abstraction, adding further fragmentation to the overall system.

The work presented in this talk is about engineering a system that usefully approximates this ideal world. We present the Sea Of Stuff, a model in which users can operate over distributed storage as if it were local storage, organise and version data in a distributed manner, and automatically apply policies about how content is stored.
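
As a sketch of one building block such a system might use (an assumption on our part, not the Sea Of Stuff design): content-addressed storage with a version chain, so data is named by its hash and each version links to its predecessor.

```python
# Sketch: content-addressed version chain. Identical content + history
# yields an identical identifier, which helps sharing and deduplication.
import hashlib, json

store = {}

def put_version(content, previous=None):
    record = {"content": content.hex(), "previous": previous}
    digest = hashlib.sha256(json.dumps(record, sort_keys=True).encode()).hexdigest()
    store[digest] = record
    return digest

v1 = put_version(b"draft")
v2 = put_version(b"final", previous=v1)
print(v2, "->", store[v2]["previous"])   # walk the version chain back to v1
```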

Event details

  • When: 23rd March 2017 13:00 - 14:00
  • Where: Cole 1.33b
  • Series: Systems Seminars Series
  • Format: Seminar