Author Archive

Setting up PGCluster (Guide)

June 5th, 2011 Angus Macdonald 2 comments

This post is a guide on how to install, configure, and run PGCluster 1.9rc7 (a synchronous replication tool for PostgreSQL) on CentOS.

It assumes that you are going to be running PGCluster on at least three distinct machines, and that you have a general understanding of its architecture. Its key contribution is a set of scripts that automate the task of modifying config files for each of the three types of cluster node.

Categories: Work

Charting Thesis Progress

April 7th, 2011 Angus Macdonald 2 comments

Someone told me it’d be a good idea to literally chart the progress I’ve made writing my thesis, so I created a graph showing my word count over time. Exciting, I know!

Hopefully the line levelling off is a positive sign…

Update (02/05/11): Updated with more detail and more words. The line isn’t levelling off any more, which is maybe worrying!

Categories: Discussion

CrossRef++ (A Microsoft Word Add-in)

March 11th, 2011 Angus Macdonald 3 comments

This is a Word 2010 add-in that replaces Word’s in-built ‘add cross-reference’ tool. Why? Because the in-built tool has annoyed me greatly whilst writing my thesis!

Installer (EXE Version).

Source code as ZIP, on Github.

Please note that this only works in Word 2010, not on any earlier versions.

Why this is ‘needed’

The standard Word tool (pictured below) quickly becomes tedious to use in large documents for a number of reasons:

  • You have to switch between references for things (like figures and numbered items) constantly, and you have to use a drop-down box to do this.
  • It doesn’t remember your last referenced item, so if you’re constantly referencing a figure that is two pages down the list, you constantly have to scroll down to that reference.
  • It doesn’t remember the size of your reference box, so even if you have a massive monitor, you can only ever use a tiny fraction of it to search for references.

The Standard Word Cross-reference Tool

What CrossRef++ Does Differently

  • References are displayed in a task pane, which typically stretches the length of a screen, but can be moved around as well.
  • It remembers (roughly) where your last used reference was for each type (figure, numbered item, etc.), so you don’t have to scroll as much as before.
  • It provides a few big buttons at the top to change reference type, which makes switching quicker.

The CrossRef++ Tool

What it Doesn’t Yet Do

  • It doesn’t support all types of references (for endnotes and other things you need to use the old tool).
  • It doesn’t handle re-sizing the task pane well.
  • It doesn’t allow you to search through references, though I’d like to do that eventually.

Categories: Side Projects

Tutorial Notes (on Data Structures and Algorithms)

December 15th, 2010 Angus Macdonald 1 comment

This year marked the first time I’d tutored on a second-year course, Foundations of Computation. For some of the topics covered, I produced notes and code to help explain lists, search algorithms, and trees. I’ve included them below in the hope they may be useful.

Algorithms for finding cycles in Linked Lists (GitHub repository). The included code runs various algorithms to find cycles, and graphs the efficiency of each algorithm.
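As a rough illustration of the kind of algorithm involved, here is a minimal Python sketch of one well-known approach, Floyd's "tortoise and hare" cycle detection. It is not taken from the repository; the `Node` class and the `build_list` helper are hypothetical names introduced only for this example.

```python
class Node:
    """A singly linked list node (hypothetical, for illustration only)."""

    def __init__(self, value):
        self.value = value
        self.next = None


def has_cycle(head):
    """Return True if the list starting at head loops back on itself.

    Floyd's algorithm: advance one pointer by one step and another by
    two steps. If there is a cycle, the fast pointer eventually lands
    on the slow one; otherwise it runs off the end of the list.
    """
    slow = fast = head
    while fast is not None and fast.next is not None:
        slow = slow.next
        fast = fast.next.next
        if slow is fast:
            return True
    return False


def build_list(values, cycle_to=None):
    """Build a list from values; optionally link the tail to index cycle_to."""
    nodes = [Node(v) for v in values]
    for current, following in zip(nodes, nodes[1:]):
        current.next = following
    if cycle_to is not None and nodes:
        nodes[-1].next = nodes[cycle_to]
    return nodes[0] if nodes else None


if __name__ == "__main__":
    print(has_cycle(build_list([1, 2, 3, 4], cycle_to=1)))
    print(has_cycle(build_list([1, 2, 3, 4])))
```

This version uses constant extra memory, which is the usual reason for preferring it over remembering every visited node in a set.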

Search Algorithms Comparison (GitHub repository). The included code implements three sorting algorithms (Selection, Insertion, and Merge Sort) and includes various levels of debug output to show the process taken by each algorithm and to count the number of comparisons and swaps involved.
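To give a flavour of what counting comparisons and swaps looks like, here is a minimal Python sketch of insertion sort with both counters. It is an illustration written for this page rather than the repository's code, and the function name and return shape are assumptions.

```python
def insertion_sort(items):
    """Sort a copy of items, returning (sorted_list, comparisons, swaps).

    Each element is moved left by swapping with its neighbour until the
    neighbour is no larger. Every neighbour test counts as a comparison.
    """
    data = list(items)  # work on a copy so the caller's list is untouched
    comparisons = 0
    swaps = 0
    for i in range(1, len(data)):
        j = i
        while j > 0:
            comparisons += 1
            if data[j - 1] <= data[j]:
                break  # element is in place relative to its left neighbour
            data[j - 1], data[j] = data[j], data[j - 1]
            swaps += 1
            j -= 1
    return data, comparisons, swaps


if __name__ == "__main__":
    for sample in ([1, 2, 3], [3, 1, 2], [3, 2, 1]):
        print(sample, insertion_sort(sample))
```

Running it on already-sorted and reverse-sorted input shows the gap between the best and worst cases: the sorted list needs no swaps at all, while the reversed list needs one swap per comparison.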

Cheat sheet for Balancing AVL Trees (Google Docs). A very brief guide explaining what operations must be performed to balance an AVL tree.
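For readers who prefer code, the standard textbook rebalancing cases (single and double rotations) can be sketched in Python as below. This is not a transcription of the cheat sheet; the `AVLNode` class and function names are hypothetical, and heights are recomputed on demand for brevity, whereas a real implementation would store them in each node.

```python
class AVLNode:
    """A binary tree node (hypothetical, for illustration only)."""

    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def height(node):
    """Height of a subtree; an empty subtree has height 0."""
    if node is None:
        return 0
    return 1 + max(height(node.left), height(node.right))


def balance_factor(node):
    """Left height minus right height; AVL requires -1, 0, or 1."""
    return height(node.left) - height(node.right)


def rotate_right(node):
    """Lift the left child above node; returns the new subtree root."""
    pivot = node.left
    node.left = pivot.right
    pivot.right = node
    return pivot


def rotate_left(node):
    """Lift the right child above node; returns the new subtree root."""
    pivot = node.right
    node.right = pivot.left
    pivot.left = node
    return pivot


def rebalance(node):
    """Apply the rotation(s) needed at node and return the new root."""
    factor = balance_factor(node)
    if factor > 1:  # left-heavy
        if balance_factor(node.left) < 0:  # left-right case
            node.left = rotate_left(node.left)
        return rotate_right(node)  # left-left case
    if factor < -1:  # right-heavy
        if balance_factor(node.right) > 0:  # right-left case
            node.right = rotate_right(node.right)
        return rotate_left(node)  # right-right case
    return node  # already balanced
```

For example, calling `rebalance` on a root of 3 whose left child is 1 with a right child of 2 (the left-right case) first rotates left around 1 and then right around 3, leaving 2 at the root.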

Categories: Teaching

Current Trends in Distributed Database Systems (Talk)

December 13th, 2010 Angus Macdonald 2 comments

I recently gave a talk to our Masters Databases class entitled Current Trends in Distributed Database Systems.

The talk (available here) covers some of the more innovative designs in database systems over the last few years, from Vertica and VoltDB, to larger-scale datastores such as Amazon’s Dynamo.

Major aside: I tried and failed to come up with a more entertaining title for the talk. The suggestions I received on Twitter were better, but less relevant (one of the suggestions is on my title slide).

So, if you think you can do better and come up with something that is both relevant and witty/entertaining, there’ll be some form of prize in it for you!

Demo of H2O at SICSA DEMOfest (2nd November)

October 29th, 2010 Angus Macdonald No comments

Next Tuesday (2nd November 2010) we’ll be demoing H2O at the SICSA DEMOfest in Edinburgh.

We’ll be debuting our new H2O visualization tool plus the new, occasionally colourful, posters below:

Categories: Posters, Presentations, Work

How do you teach Software Quality?

September 27th, 2010 Angus Macdonald No comments

I was recently asked to give a talk on ‘Software Quality’ to our incoming Junior Honours class, which made me realise one thing – you can’t teach software quality!

For a start, it’s so subjective. The narrator in Zen and the Art of Motorcycle Maintenance says much the same thing about creative writing. Quality is difficult to define. If you ask students to rank essays (or programs) from best to worst then they will probably be able to reach a consensus – they have an intuitive understanding that one essay has more quality than another – but it’s much more difficult to identify the parts of the essay that give it quality. Over time, with practice and reading, a student develops their ability to recognize that quality. It is the practice, then, that helps them develop this understanding and an ability to create it.

To me this seems similar to the food tasters Malcolm Gladwell discusses in Blink. The average person can taste Coke and Pepsi and give their preference for one or the other, but may find it difficult to explain why they have this preference. On the other hand, an experienced food taster is able to identify the multitude of flavours in each drink and categorize them by their taste and strength. The food taster, with their well-developed analytical ability, is able to put words to the feelings and intuition that the untrained person has.

With programming classes, students are lectured on the techniques and design patterns that help to produce quality software, but they can only begin to understand where these techniques should be used (and, possibly more importantly, the extent to which they should be used) with experience. It’s almost as easy to write bad code by over-applying design patterns and ‘good practice’ techniques as it is by not using them at all. There is hardly ever one right answer.

I don’t think it necessarily helps that the focus is on writing code rather than analysing the work of other people. As soon as I started working on group projects I think I developed a far greater appreciation for the quality (or lack of it) of my own code.

Refactoring as an Afterthought

A more obvious problem relating to software quality is the treatment of refactoring as an added extra. When I write a paper I tend to write out the first draft as a long brain-dump saying everything I think needs to be said. That draft is then constantly refactored until I have a cohesive piece of work. In programming (particularly with University assignments) the first draft is often finished when it meets the general requirements of the task, so there is little immediate need to redraft or refactor the work into a better state.

I’m not sure it’s possible to realise the danger of this until you’ve worked on a larger project where poor quality code can lose you days in debugging and refactoring. This is probably the way most people learn of the need for ‘quality’ software anyway.

What Can You Teach?

Beyond teaching students the design patterns and architectures that can help to improve the structure of their code, I think the only thing you can do is emphasize the importance of taking an analytical view of code. The best programmers I’ve worked with are constantly unhappy at the state of their own work, continually looking for ways to improve its clarity and structure. Ultimately you can’t really teach that, only motivate the need for it.

There are a few books I’ve read that help in this respect. I particularly liked:

There are probably many more, but I haven’t read them yet!

Other books are just good at giving you extra ways to think about and structure problems:

  • Design Patterns (Gang of Four or Head First – pick your poison).
  • Extreme Programming Explained. You might not agree with all of it, but principles like incremental design and constant refactoring work for me.
  • Programming Pearls. I started reading this last week, so I can’t give a complete review, but the first few chapters are great examples of clever programming.

I think my talk will focus on motivating the need for quality with examples of things I’ve learned the hard way not to do. Ultimately, there’s only so much you can say in 45 minutes!

Categories: Discussion, Teaching

Eclipse Cheat Sheet

September 14th, 2010 Angus Macdonald 2 comments

I created a cheat sheet for the Eclipse IDE which you can find here: eclipse cheat sheet [Google Docs]

It’s intended as an accompaniment to a refresher talk I’m giving to the incoming JH class on using Eclipse. They’ve been using it on and off for two years now, so the aim is to cover some of the features they might not know about rather than the basics.

Categories: Teaching

SICSA Conference 2010

June 12th, 2010 Angus Macdonald No comments

I’m just back from presenting a paper and poster at the SICSA Conference 2010.

You can find the work that I presented at the conference below, and more information on H2O in general at the project webpage.

Paper

H2O: An Autonomic, Resource-Aware Distributed Database System

Abstract:

This paper presents the design of an autonomic, resource-aware distributed database which enables data to be backed up and shared without complex manual administration. The database, H2O, is designed to make use of unused resources on workstation machines.

Creating and maintaining highly-available, replicated database systems can be difficult for untrained users, and costly for IT departments. H2O reduces the need for manual administration by autonomically replicating data and load-balancing across machines in an enterprise.

Provisioning hardware to run a database system can be unnecessarily costly as most organizations already possess large quantities of idle resources in workstation machines. H2O is designed to utilize this unused capacity by using resource availability information to place data and plan queries over workstation machines that are already being used for other tasks.

This paper discusses the requirements for such a system and presents the design and implementation of H2O.

Poster (pdf)

Presentation

(1-up, 6-up)

An Approach to Ad-hoc Cloud Computing

February 23rd, 2010 Angus Macdonald No comments

You can find our recent technical report on ad-hoc cloud computing here. The abstract is reprinted below.

Abstract:

We consider how underused computing resources within an enterprise may be harnessed to improve utilization and create an elastic computing infrastructure. Most current cloud provision involves a data center model, in which clusters of machines are dedicated to running cloud infrastructure software. We propose an additional model, the ad hoc cloud, in which infrastructure software is distributed over resources harvested from machines already in existence within an enterprise. In contrast to the data center cloud model, resource levels are not established a priori, nor are resources dedicated exclusively to the cloud while in use. A participating machine is not dedicated to the cloud, but has some other primary purpose such as running interactive processes for a particular user. We outline the major implementation challenges and one approach to tackling them.