Containers for HPC environments

Rethinking High Performance Computing Platforms: Challenges, Opportunities and Recommendations, co-authored by Adam Barker and a team (Ole Weidner, Malcolm Atkinson, Rosa Filgueira Vicente) in the School of Informatics, University of Edinburgh, was recently featured in the Communications of the ACM and HPCwire.

The paper focuses on container technology and argues that a number of “second generation” high-performance computing applications with heterogeneous, dynamic and data-intensive properties have an extended set of requirements, which are not met by the current production HPC platform models and policies. These applications (and users) require a new approach to supporting infrastructure, which draws on container-like technology and services. The paper then goes on to describe cHPC: an early prototype of an implementation based on Linux Containers (LXC).

Ali Khajeh-Hosseini, co-founder of AbarCloud and former co-founder of ShopForCloud (acquired by RightScale as PlanForCloud), said of this research, “Containers have helped speed up the development and deployment of applications in heterogeneous environments found in larger enterprises. It’s interesting to investigate their applications in similar types of environments in newer HPC applications.”

SRG Seminar: Evaluation Techniques for Detection Model Performance in Anomaly Network Intrusion Detection System by Amjad Al Tobi

Event details

  • When: 1st June 2017 13:00 - 14:00
  • Where: Cole 1.33b
  • Series: Systems Seminars Series
  • Format: Seminar

Everyday advancements in technology bring with them novel challenges and threats. Such advancements impose greater risks than ever on systems and services, including individuals’ private information. Relying on intrusion specialists to come up with new signatures to detect different types of new attacks does not seem to scale with excessive traffic growth. Therefore, anomaly-based detection provides a promising solution for this problem area.

Anomaly-based IDS applies machine learning, data mining and/or artificial intelligence, along with many other methods, to solve this problem. Currently, these solutions do not seem to be tractable for real production environments due to their high false alarm rate. This might be a result of such systems not being able to determine the point at which an update is required. It is not clear how detection models will behave over time, when traffic behaviour has changed since the last time the model was re-generated.

SRG Seminar: New Network Functionality using ILNPv6 and DNS by Khawar Shehzad

Event details

  • When: 18th May 2017 13:00 - 14:00
  • Where: Cole 1.33b
  • Series: Systems Seminars Series
  • Format: Seminar

This research introduces new network functionality based on the Identifier-Locator Network Protocol version 6 (ILNPv6) and the Domain Name System (DNS). The chosen area of concern is security, specifically the mitigation of Distributed Denial of Service (DDoS) attacks. The functionality proposed and tested deals with vulnerability testing, probing and scanning, which directly lead to a successful DDoS attack. The solutions presented can be used as a reactive measure to these security issues. DDoS was chosen because in recent years DDoS attacks have become the most common attacks and the hardest to defend against. These attacks target the availability of a system or site. There are multiple solutions in the literature, but none of them is based on ILNPv6, and they are complex in nature. Similarly, the solutions in the literature either require modification of the providers’ networks or, if they are purely site-based, are complex. Most of these solutions are based on the IPv6 protocol and do not use the concept of naming, as proposed by ILNPv6.

The prime objectives of this research are:

  • to defend against DDoS attacks with the use of naming and DNS
  • to increase the attacker’s effort and reduce vulnerability testing and random probing by attackers
  • to practically demonstrate the effectiveness of the ILNPv6-based solution for security

Prizes for Haifa Al Nasseri

At the Cyber Academy’s International Conference on Big Data in Cyber Security on 10th May 2017 at Edinburgh Napier’s Craiglockhart Campus, PhD student Haifa Al Nasseri won two third prizes. One was for her research poster on Cloud Virtual Network Isolation Security and the other was for her team’s efforts in the Splunk Hackathon.

SRG Seminar: Investigation of Virtual Network Isolation Security in Cloud Computing Using Penetration Testing by Haifa Al Nasseri

Event details

  • When: 4th May 2017 13:00 - 14:00
  • Where: Cole 1.33b
  • Series: Systems Seminars Series
  • Format: Talk

Software Defined Networking (SDN) or Virtual Networks (VNs) are required for cloud tenants to meet their demands. However, multi-tenancy can be compromised without proper isolation. Much research has been conducted into VN isolation, but many researchers do not tackle security aspects or check whether their isolation evaluation is complete. Therefore, data leakage is a major security concern in the cloud in general.

This research uses an OpenStack VN and OpenStack Tenant Network to test multi-tenancy features. We are using penetration tests to evaluate the relationship between the isolation methods used in cloud VNs and the amount of data being leaked. These tests will be used to identify the vulnerabilities causing cloud VN data leakage and to investigate how the vulnerabilities, and the leaked data, can compromise the tenant Virtual Networks.

SRG Seminar: “Evaluating Data Linkage: Creation and use of synthetic data for comprehensive linkage evaluation” by Tom Dalton and “Container orchestration” by Uchechukwu Awada

Event details

  • When: 20th April 2017 13:00 - 14:00
  • Where: Cole 1.33b
  • Series: Systems Seminars Series
  • Format: Seminar

The abstract of Tom’s talk:

“Data linkage approaches are often evaluated with small or few data sets. If a linkage approach is to be used widely, quantifying its performance with varying data sets would be beneficial. In addition, given that a data set needs to be linked, the true links are by definition unknown. The success of a linkage approach is thus difficult to comprehensively evaluate.

This talk focuses on the use of many synthetic data sets for the evaluation of linkage quality achieved by automatic linkage algorithms in the domain of population reconstruction. It presents an evaluation approach which considers linkage quality when characteristics of the population are varied. We envisage a sequence of experiments where a set of populations are generated to consider how linkage quality varies across different populations: with the same characteristics, with differing characteristics, and with differing types and levels of corruption. The performance of an approach at scale is also considered.

The approach to generate synthetic populations with varying characteristics on demand will also be addressed. The use of synthetic populations has the advantage that all the true links are known, thus allowing evaluation as if with real-world ‘gold-standard’ linked data sets.

Given the large number of data sets evaluated against, we also consider how to present these findings. The ability to assess variations in linkage quality across many data sets will assist in the development of new linkage approaches and in identifying areas where existing linkage approaches may be more widely applied.”
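
The advantage of knowing every true link can be illustrated with a minimal sketch. It assumes that linkage quality is summarised by precision, recall and F-measure, which are common choices but are not named in the abstract; the record identifiers and the function name are hypothetical.

```python
from typing import Dict, Set, Tuple

Link = Tuple[str, str]  # a pair of record identifiers (hypothetical format)


def linkage_quality(true_links: Set[Link], found_links: Set[Link]) -> Dict[str, float]:
    """Compare the links proposed by an algorithm with the known true links."""
    true_positives = len(true_links & found_links)
    precision = true_positives / len(found_links) if found_links else 0.0
    recall = true_positives / len(true_links) if true_links else 0.0
    if precision + recall == 0:
        f_measure = 0.0
    else:
        f_measure = 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f_measure": f_measure}


# Illustrative data only: in a synthetic population every true link is known.
true_links = {("b1", "d1"), ("b2", "d2"), ("b3", "d3"), ("b4", "d4")}
found_links = {("b1", "d1"), ("b2", "d2"), ("b3", "d9")}

quality = linkage_quality(true_links, found_links)
print(quality)
```

Running the same function over many generated populations would give one quality record per data set, which is the kind of result set whose presentation the abstract raises as a question.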

The abstract of Awada’s talk:

“Over the years, there has been rapid development in the area of software development. A recent innovation in software or application deployment and execution is the use of containers. Containers provide a lightweight, isolated and well-defined execution environment. Application containers like Docker wrap up a piece of software in a complete file system that contains everything it needs to run: code, runtime, system tools, system libraries, etc. To support and simplify large-scale deployment, cloud computing providers (e.g., AWS, Google, Microsoft) have recently introduced Container Service Platforms (CSPs), which support automated and flexible orchestration of containerised applications on container-instances (virtual machines).

Existing CSP frameworks do not offer any form of intelligent resource scheduling: applications are usually scheduled individually, rather than taking a holistic view of all registered applications and available resources in the cloud. This can result in increased execution times for applications and resource wastage through under-utilised container-instances, but also in a reduction in the number of applications that can be deployed, given the available resources. In addition, current CSP frameworks do not support the deployment and scaling of containers across multiple regions at the same time, or merging containers into a multi-container unit in order to achieve higher cluster utilisation and reduced execution times.

Our research aims to extend the existing system by adding a cloud-based Container Management Service (CMS) framework that offers increased deployment density, scalability and resource efficiency. CMS provides additional functionality for orchestrating containerised applications by joint optimisation of sets of containerised applications and resource pools in multiple (geographically distributed) cloud regions. We evaluated CMS on a cloud-based CSP, namely Amazon EC2 Container Service (ECS), and conducted extensive experiments using sets of CPU- and memory-intensive containerised applications against the custom deployment strategy of Amazon ECS. The results show that CMS achieves up to 25% higher cluster utilisation and up to 70% reduction in execution times.”
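
To illustrate why scheduling applications individually can waste resources, the sketch below compares one container per container-instance with a generic first-fit-decreasing packing. This is a textbook heuristic used for illustration, not the CMS algorithm described in the abstract; the CPU demands, the instance capacity and the function names are all hypothetical.

```python
from typing import List


def first_fit_decreasing(demands: List[int], capacity: int) -> List[List[int]]:
    """Pack container CPU demands onto as few instances as the heuristic finds."""
    instances: List[List[int]] = []
    for demand in sorted(demands, reverse=True):
        if demand > capacity:
            raise ValueError(f"demand {demand} exceeds instance capacity {capacity}")
        for instance in instances:
            if sum(instance) + demand <= capacity:
                instance.append(demand)  # fits on an instance that is already running
                break
        else:
            instances.append([demand])  # no room anywhere: start a new instance
    return instances


def utilisation(instances: List[List[int]], capacity: int) -> float:
    """Fraction of the provisioned capacity that is actually used."""
    return sum(sum(instance) for instance in instances) / (len(instances) * capacity)


# Illustrative values only: CPU cores requested per container, 4 cores per instance.
demands = [2, 1, 3, 2, 1, 3]
capacity = 4

one_per_instance = [[demand] for demand in demands]
packed = first_fit_decreasing(demands, capacity)

print(len(one_per_instance), utilisation(one_per_instance, capacity))
print(len(packed), utilisation(packed, capacity))
```

With these made-up numbers, looking at all containers together halves the number of instances needed. A real scheduler would also have to consider memory, multiple regions and execution time, which this sketch ignores.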

SRG Seminar: nMANET, the Name-based Data Network (NDN) for Mobile Ad-hoc Networks (MANETs) by Percy Perez Aruni

Event details

  • When: 6th April 2017 13:00 - 14:00
  • Where: Cole 1.33b
  • Series: Systems Seminars Series
  • Format: Seminar

The aim of this talk is to introduce nMANET, the Name-based Data Network (NDN) for Mobile Ad-hoc Networks (MANETs) approach. nMANET is an alternative perspective on utilising the characteristics of NDN to solve the limitations of MANETs, such as mobility and energy consumption. NDN, which is an instance of Information Centric Networking (ICN), provides an alternative architecture for the future Internet. In contrast with traditional TCP/IP networks, NDN enables content addressing instead of host-based communication. NDN secures the content instead of securing the communication channel between hosts; therefore, the content can be obtained from intermediate caches or final producers. Although NDN has proven to be an effective design in wired networks, it does not perfectly address challenges arising in MANETs. This shortcoming is due to the high mobility of mobile devices and their inherent resource constraints, such as remaining energy in batteries.

The implementation of nMANET, the Java-based NDN Forwarder Daemon (JNFD), aims to fill this gap and provide a Mobile Name-based Ad-hoc Network prototype compatible with NDN implementations. JNFD was designed for Android mobile devices and offers a set of energy-efficient forwarding strategies to distribute content in a dynamic topology where consumers, producers and forwarders have high mobility and may join or leave the network at unpredictable times. nMANET evaluates JNFD through benchmarking to estimate efficiency, which is defined as high rates of reliability, throughput and responsiveness with low energy consumption.

SRG Seminar: Managing Shared Mutable Data in a Distributed Environment (Simone Conte)

Event details

  • When: 23rd March 2017 13:00 - 14:00
  • Where: Cole 1.33b
  • Series: Systems Seminars Series
  • Format: Seminar

Title: Managing Shared Mutable Data in a Distributed Environment

Abstract: Managing data is central to our digital lives. The average user owns multiple devices and uses a large variety of applications, services and tools. In an ideal world storage is infinite, data is easy to share and version, and available irrespective of where it is stored, and users can protect and exert control over the data arbitrarily.

In the real world, however, achieving such properties is very hard. File systems provide abstractions that do not satisfy all the needs of our daily lives anymore. Many applications now abstract data management to users but do so within their own silos. Cloud services each provide their own storage abstraction, adding more fragmentation to the overall system.

The work presented in this talk is about engineering a system that usefully approximates the ideal world. We present the Sea Of Stuff, a model where users can operate over distributed storage as if using their local storage, organise and version data in a distributed manner, and automatically exert policies about how to store content.

SRG Seminar: Cloud scheduling algorithms by Long Thai

Event details

  • When: 9th March 2017 13:00 - 14:00
  • Where: Cole 1.33b
  • Series: Systems Seminars Series
  • Format: Seminar

“Thanks to cloud computing, access to a virtualised computing cluster has become not only easy but also feasible for organisations, especially small and medium-sized ones. First of all, it does not require an upfront investment in building data centres or a constant expense for managing them. Instead, users pay only for the amount of resources that they actually use. Secondly, cloud providers offer a resource provisioning mechanism which allows users to add or remove resources from their cluster easily and quickly in order to accommodate a workload which changes dynamically in real time. The flexibility of users’ computing clusters is further increased as they are able to select one or a combination of different virtual machine types, each of which has a different hardware specification.

Nevertheless, the users of cloud computing face challenges that they have never encountered before. The monetary cost changes dynamically based on the amount of resources used by the clients, which means it is no longer cost-effective to adopt a greedy approach which acquires as many resources as possible. Instead, careful consideration is required before making any decision about acquiring resources. Moreover, the users of cloud computing have to face the paradox of choice resulting from the high number of hardware specification options offered by cloud providers. As a result, finding a suitable machine type for an application can be difficult. It is even more challenging when a user owns many applications, each of which performs differently. Finally, addressing all the above challenges while ensuring that a user receives the desired performance further increases the difficulty of using cloud computing resources effectively.

In this research, we investigate and propose an approach that aims to solve the challenge of optimising the usage of cloud computing resources by constructing a heterogeneous cloud cluster which changes dynamically based on the workload. Our proposed approach consists of two processes. The first one, named execution scheduling, aims to determine the number of virtual machines and the allocation of workload to each machine in order to achieve the desired performance at the minimum cost. The second process, named execution management, monitors the execution during runtime and detects and handles unexpected events. The proposed approach has been thoroughly evaluated by both simulated and real-world experiments. The results have shown that our approach is able not only to achieve the desired performance while minimising the monetary cost but also to reduce, or even completely prevent, negative results caused by unexpected events at runtime.”
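
The trade-off behind execution scheduling can be sketched in a few lines. This is a deliberately simplified illustration, not the approach proposed in the talk: it assumes a homogeneous cluster (a single machine type per plan), identical tasks, a fixed deadline and billing for the whole deadline period. The machine types, prices and throughput figures are invented for the example.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class VmType:
    price_cents_per_hour: int  # hypothetical price, in cents to avoid float rounding
    tasks_per_hour: int        # hypothetical throughput of one machine


def cheapest_plan(vm_types: Dict[str, VmType], tasks: int,
                  deadline_hours: int) -> Tuple[str, int, int]:
    """Return (type name, machine count, total cost in cents) meeting the deadline."""
    if not vm_types:
        raise ValueError("at least one VM type is required")
    best = None
    for name, vm in vm_types.items():
        tasks_per_vm = vm.tasks_per_hour * deadline_hours
        count = -(-tasks // tasks_per_vm)  # ceiling division: machines needed
        cost = count * vm.price_cents_per_hour * deadline_hours
        if best is None or cost < best[2]:
            best = (name, count, cost)
    return best


# Illustrative catalogue only; not real provider prices.
vm_types = {
    "small": VmType(price_cents_per_hour=10, tasks_per_hour=10),
    "large": VmType(price_cents_per_hour=35, tasks_per_hour=40),
}

plan = cheapest_plan(vm_types, tasks=1000, deadline_hours=5)
print(plan)
```

In this made-up case the more expensive machine type is the cheaper plan overall, because far fewer machines are needed to meet the deadline. Mixing machine types and reacting to runtime events, as the abstract describes, would require a richer model than this sketch.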