Posts Tagged ‘dbharvester’

SICSA Conference 2010

June 12th, 2010 Angus Macdonald

I’m just back from presenting a paper and poster at the SICSA Conference 2010.

You can find the work that I presented at the conference below, and more information on H2O in general at the project webpage.

Paper

H2O: An Autonomic, Resource-Aware Distributed Database System

Abstract:

This paper presents the design of an autonomic, resource-aware distributed database which enables data to be backed up and shared without complex manual administration. The database, H2O, is designed to make use of unused resources on workstation machines.

Creating and maintaining highly-available, replicated database systems can be difficult for untrained users, and costly for IT departments. H2O reduces the need for manual administration by autonomically replicating data and load-balancing across machines in an enterprise.

Provisioning hardware to run a database system can be unnecessarily costly as most organizations already possess large quantities of idle resources in workstation machines. H2O is designed to utilize this unused capacity by using resource availability information to place data and plan queries over workstation machines that are already being used for other tasks.

This paper discusses the requirements for such a system and presents the design and implementation of H2O.

Poster (pdf)

Presentation (1-up, 6-up)

Designing a Resource-Aware Distributed Database (Presentation)

February 17th, 2010 Angus Macdonald

I recently gave a presentation on my research to Masters students in the department. It’s probably the longest talk I’ve done on my own work (~50 minutes), so it covers a number of areas: why I’m doing what I’m doing, the design issues in this area, and an overview of my implementation. You can find handouts of the slides below:

6-Up Slides

1-Up Slides

Trends in Database Systems Research: Energy Efficiency

July 9th, 2009 Angus Macdonald

Wherever you look nowadays, companies are searching for ways to market themselves as the environmental alternative, both because it makes customers feel good and because it promises to save them money. This is particularly true of large computing companies, given the cost of running data centres. As an example, US data centres alone cost an estimated $2.7 billion to run and account for 1.2% of total national energy consumption (ref).

The challenge is in finding ways to reduce the waste.

Broadly speaking, there are two complementary approaches to reducing energy consumption: making hardware more efficient, and making software less resource-intensive. Database research is beginning to appear on the latter approach, but I think a lot of the work on the hardware side is just as interesting.

Hardware Optimization

In The Case for Energy-Proportional Computing, Barroso and Hölzle look at the energy efficiency of typical data centres. They show that a server’s power consumption is not proportional to its utilization: for example, a server running at near 0% utilization still draws around 50% of the power it uses at peak utilization. Ideally, servers should use no power when not in use, and power only in proportion to their utilization when they are. The authors call for future hardware design to aim for better energy proportionality, so that machines that are doing little cost little.
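The gap between power draw and utilization can be made concrete with a small sketch. It assumes a simple linear power model in which an idle server draws 50% of its peak power; the linear shape, the function names and the `idle_fraction` parameter are illustrative assumptions rather than anything taken from the paper.

```python
def relative_power(utilization, idle_fraction=0.5):
    """Power draw as a fraction of peak, under an assumed linear model."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    return idle_fraction + (1.0 - idle_fraction) * utilization


def relative_efficiency(utilization, idle_fraction=0.5):
    """Work done per unit of power, normalised so that peak utilization gives 1.0."""
    if utilization == 0.0:
        return 0.0  # no work is done, whatever the power draw
    return utilization / relative_power(utilization, idle_fraction)


for u in (0.0, 0.1, 0.3, 0.5, 1.0):
    print(f"utilization {u:.0%}: power {relative_power(u):.0%} of peak, "
          f"efficiency {relative_efficiency(u):.2f}")
```

Under these assumptions a server at 10% utilization still draws 55% of its peak power, so it does less than a fifth of the work per unit of power that it would at full load. Setting `idle_fraction=0.0` models the energy-proportional ideal, where efficiency stays at 1.0 for any non-zero utilization.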

Software Optimization

Two recent CIDR papers look at reducing energy consumption in database systems.

Energy Efficiency: The New Holy Grail of Data Management Systems Research looks at areas where software optimizations can be made, and more generally provides a number of approaches to reducing energy waste. It’s a worthwhile read if you want to get a good feel for the area.

Towards Eco-friendly Database Management Systems proposes that energy consumption be considered a first-class performance goal when planning and processing queries. The authors give details of two optimizations which can help to reduce the energy consumption of a database system.

The first uses the ability of modern processors to run at a lower voltage and frequency: their database can explicitly instruct the processor to operate at a lower voltage when such a change is desirable.
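A rough back-of-the-envelope sketch shows why lowering voltage and frequency can save energy. It uses the common approximation that dynamic CPU power scales with capacitance × voltage² × frequency and ignores static (leakage) power. The function names and operating points are made-up illustrations, not figures or code from the paper.

```python
def dynamic_power(capacitance, voltage, frequency):
    """Approximate dynamic CPU power: P = C * V^2 * f (leakage ignored)."""
    return capacitance * voltage ** 2 * frequency


def energy_for_work(cycles, capacitance, voltage, frequency):
    """Energy needed to execute a fixed number of cycles at one operating point."""
    seconds = cycles / frequency
    return dynamic_power(capacitance, voltage, frequency) * seconds


# Illustrative operating points (assumed values, not taken from the paper).
baseline = energy_for_work(cycles=1e9, capacitance=1e-9, voltage=1.2, frequency=2.0e9)
scaled = energy_for_work(cycles=1e9, capacitance=1e-9, voltage=1.08, frequency=1.8e9)

saving = 1 - scaled / baseline
slowdown = 2.0e9 / 1.8e9 - 1
print(f"energy saving: {saving:.0%}, slowdown: {slowdown:.0%}")
```

Under this model, reducing both voltage and frequency by 10% cuts the energy for the same work by roughly 19%, while the work takes about 11% longer. Real processors will differ, since leakage power and the rest of the system are not modelled here.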

Their second technique is to queue queries where possible so that query aggregation can be used more often, reducing the number of repeat queries to the database. In some evaluations these approaches yield a 49% reduction in energy consumption against only a 3% increase in response time.
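The queuing idea can be illustrated with a deliberately simplified sketch: queries are held in a queue, and when the queue is flushed each distinct query is executed only once, however many times it was submitted. This is not the paper's algorithm; `QueryBatcher`, `fake_execute` and the flush-on-demand policy are hypothetical names and choices made for illustration.

```python
class QueryBatcher:
    """Queues queries and runs each distinct query once per flush (hypothetical helper)."""

    def __init__(self, execute):
        self._execute = execute   # callable: query text -> result
        self._pending = {}        # query text -> number of waiting requests
        self.executions = 0       # how many queries actually reached the database

    def submit(self, query):
        """Queue a query instead of running it immediately."""
        self._pending[query] = self._pending.get(query, 0) + 1

    def flush(self):
        """Run every distinct queued query once and return results keyed by query text."""
        results = {}
        for query in self._pending:
            results[query] = self._execute(query)
            self.executions += 1
        self._pending.clear()
        return results


def fake_execute(query):
    # Stand-in for a real database call.
    return f"result of {query}"


batcher = QueryBatcher(fake_execute)
for q in ["SELECT 1", "SELECT 2", "SELECT 1"]:
    batcher.submit(q)
results = batcher.flush()
print(f"{batcher.executions} executions for 3 submitted queries")
```

The trade-off is visible even in this toy version: the database does less work, but every query waits until the next flush, which is where the increase in response time comes from.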

While these kinds of solutions are not the whole answer (in many cases any reduction in performance would be unacceptable), they at the very least provide an interesting perspective.

Existing Resources

One of the most interesting statistics from this work is the 50% energy consumption of servers doing nothing at all. Essentially, machines doing nothing are still doing something. If we assume that we can’t reduce the energy consumption of these machines to zero, then the question becomes: how can these machines be used?

When it comes to user workstations, various volunteer computing projects (see SETI@home, Folding@home) strive to make use of unused capacity. But within an enterprise there is a paucity of software able to take advantage of unused resources, which, as this article points out, are still costing companies money.

This is one of the motivations behind the H2O project which I am currently involved in.

SICSA Conference Talk

June 4th, 2009 Angus Macdonald

I recently gave a talk at the inaugural SICSA postgraduate conference here in St Andrews. The talk itself is similar to one I gave previously, but with more added on the pressures concept we’ve been thinking about. These slides contain quite a bit of animation, so I’ve put up a link to the PowerPoint version as well as the PDF.

PDF Version

PPTX Version

Photo of Myself During the Talk

Categories: Photos, Presentations, Work

DBHarvester: A Resource Harvesting Distributed Database (Presentation)

January 22nd, 2009 Angus Macdonald

I recently gave a talk on the project on a PhD away-day to Lochearnhead, following on from my DBHarvester poster presentation. These trips are always interesting because it’s often the only time you see what your peers in other research groups are doing, and feedback from people working in other areas is always useful.

After the talk we visited the Famous Grouse distillery and went for a brief walk; a good break from typical St Andrews life.

You can download the slides for the talk here: DBHarvester Talk (PDF)

Photos of the Event

DBHarvester: A Resource Harvesting Distributed Database (Poster)

December 9th, 2008 Angus Macdonald

I created this poster for display in the school’s annual poster session. It is a good summary of the focus of my research as of December 2008. In short, it details a plan to build a database running over lab machines, and gives a few reasons why this is interesting.

Update: I won the best overall poster award at the poster session (the prize: a 16GB memory stick)!

DBHarvester Poster

You can download the poster here.

You can view a Silverlight Deep Zoom version of the poster here.

Full poster abstract:
The School of Computer Science runs hundreds of computers. The University runs thousands. For large periods of time in each day these computers go unused, wasting processing cycles whilst expensive servers perform tasks such as data warehousing and storage. This project aims to harvest – to identify and use – these in-house resources for use in a distributed database system. The challenges of creating such a system and a proposed solution are discussed in this poster.

Categories: Posters, Work