<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Angus Macdonald</title>
	<atom:link href="http://blogs.cs.st-andrews.ac.uk/angus/feed/" rel="self" type="application/rss+xml" />
	<link>http://blogs.cs.st-andrews.ac.uk/angus</link>
	<description>Computer Science at the University of St Andrews</description>
	<lastBuildDate>Wed, 29 Jun 2011 15:43:28 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.4</generator>
		<item>
		<title>Setting up PGCluster (Guide)</title>
		<link>http://blogs.cs.st-andrews.ac.uk/angus/2011/06/setting-up-pgcluster-guide/</link>
		<comments>http://blogs.cs.st-andrews.ac.uk/angus/2011/06/setting-up-pgcluster-guide/#comments</comments>
		<pubDate>Sun, 05 Jun 2011 15:37:44 +0000</pubDate>
		<dc:creator>Angus Macdonald</dc:creator>
				<category><![CDATA[Work]]></category>
		<category><![CDATA[database]]></category>
		<category><![CDATA[install-guide]]></category>
		<category><![CDATA[postgresql]]></category>

		<guid isPermaLink="false">http://blogs.cs.st-andrews.ac.uk/angus/?p=1071</guid>
		<description><![CDATA[This post is a guide on how to install, configure, and run PGCluster 1.9rc7 (a synchronous replication tool for PostgreSQL) on CentOS. It assumes that you are going to be running PGCluster on at least three distinct machines, and that you have a general understanding of its architecture. Its key contribution is some scripts which automate [...]]]></description>
			<content:encoded><![CDATA[<p>This post is a guide on how to install, configure, and run <em>PGCluster 1.9rc7</em> (a synchronous replication tool for <em>PostgreSQL</em>) on <em>CentOS</em>.</p>
<p>It assumes that you are going to be running <em>PGCluster</em> on at least three distinct machines, and that you have a general understanding of its architecture. Its key contribution is <a href="https://github.com/angusmacdonald/PGCluster-Configuration-Scripts">some scripts</a> which automate the task of modifying config files for each of the three types of cluster node.</p>
<p><span id="more-1071"></span></p>
<p><strong>Step 1: Download, Compile, and Install</strong></p>
<p>Run the following commands to download and extract the <em>PGCluster </em>source:</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">wget</span> http:<span style="color: #000000; font-weight: bold;">//</span>pgfoundry.org<span style="color: #000000; font-weight: bold;">/</span>frs<span style="color: #000000; font-weight: bold;">/</span>download.php<span style="color: #000000; font-weight: bold;">/</span><span style="color: #000000;">2408</span><span style="color: #000000; font-weight: bold;">/</span>pgcluster-1.9.0rc7.tar.gz
&nbsp;
<span style="color: #c20cb9; font-weight: bold;">tar</span> <span style="color: #660033;">-zxvf</span> pgcluster-1.9.0rc7.tar.gz</pre></div></div>

<p>Once you&#8217;ve done this, go into the extracted directory and run the following commands (as root/with sudo):</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;">.<span style="color: #000000; font-weight: bold;">/</span>configure
&nbsp;
<span style="color: #c20cb9; font-weight: bold;">gmake</span>
<span style="color: #c20cb9; font-weight: bold;">gmake</span> <span style="color: #c20cb9; font-weight: bold;">install</span>
&nbsp;
<span style="color: #c20cb9; font-weight: bold;">mkdir</span> <span style="color: #000000; font-weight: bold;">/</span>usr<span style="color: #000000; font-weight: bold;">/</span>local<span style="color: #000000; font-weight: bold;">/</span>pgsql<span style="color: #000000; font-weight: bold;">/</span>data
adduser postgres
<span style="color: #c20cb9; font-weight: bold;">chown</span> <span style="color: #660033;">-R</span> postgres <span style="color: #000000; font-weight: bold;">/</span>usr<span style="color: #000000; font-weight: bold;">/</span>local<span style="color: #000000; font-weight: bold;">/</span>pgsql;
&nbsp;
<span style="color: #c20cb9; font-weight: bold;">su</span> - postgres
<span style="color: #000000; font-weight: bold;">/</span>usr<span style="color: #000000; font-weight: bold;">/</span>local<span style="color: #000000; font-weight: bold;">/</span>pgsql<span style="color: #000000; font-weight: bold;">/</span>bin<span style="color: #000000; font-weight: bold;">/</span>initdb <span style="color: #660033;">-D</span> <span style="color: #000000; font-weight: bold;">/</span>usr<span style="color: #000000; font-weight: bold;">/</span>local<span style="color: #000000; font-weight: bold;">/</span>pgsql<span style="color: #000000; font-weight: bold;">/</span>data</pre></div></div>

<p><strong>Step 2: Create Config Files</strong></p>
<p>I created a couple of scripts to generate the config files required on each cluster node. To use these, follow the instructions below (or do it manually as per <a href="http://odyssi.blogspot.com/2007/08/postgresql-replication-with-pgcluster.html">this guide</a>):</p>
<ol>
<li>Download the scripts from <a href="https://github.com/angusmacdonald/PGCluster-Configuration-Scripts/zipball/master">here</a>.</li>
<li>Unzip.</li>
<li>Edit the <a href="https://github.com/angusmacdonald/PGCluster-Configuration-Scripts/blob/master/config.cfg">config.cfg</a> file with the parameters for your system. You must add your own values for the following variables:
<ol>
<li><em>IP_ADDRESSES</em>: IP addresses of all machines in the cluster. This is used to update PostgreSQL&#8217;s settings specifying what machines incoming connections will be allowed from.</li>
<li><em>LOAD_BALANCER_NODE</em>: The hostname (or IP address) of the node which will run the load balancer.</li>
<li><em>REPLICATION_SERVER_NODE</em>: The hostname (or IP address) of the node which will run the replication server.</li>
<li><em>DATA_NODES</em>: The hostnames (or IP addresses) of the nodes which will run data servers.</li>
</ol>
</li>
<li>Copy this config file (and the rest of the scripts) onto each node in the cluster.</li>
<li>On the node that is to be the load balancer, run <a href="https://github.com/angusmacdonald/PGCluster-Configuration-Scripts/blob/master/load-balancer-setup.sh">load-balancer-setup.sh</a>.</li>
<li>On the node that is to be the replication server, run <a href="https://github.com/angusmacdonald/PGCluster-Configuration-Scripts/blob/master/replication-node-setup.sh">replication-node-setup.sh</a>.</li>
<li>On the node(s) that are to be the data servers, run <a href="https://github.com/angusmacdonald/PGCluster-Configuration-Scripts/blob/master/data-node-setup.sh">data-node-setup.sh</a>.</li>
</ol>
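<p>To make the role of <em>IP_ADDRESSES</em> more concrete, here is a minimal sketch of how a list of addresses could be turned into <em>pg_hba.conf</em> entries. It is not taken from the scripts above: the function name <em>hba_lines</em>, the example addresses, the /32 mask, and the <em>trust</em> authentication method are all assumptions made for illustration, so check what the real scripts write before relying on it.</p>

```shell
#!/bin/sh
# Sketch only: print one pg_hba.conf "host" line per cluster IP address.
# The /32 mask and the "trust" method are assumptions for illustration.
hba_lines() {
    for ip in "$@"; do
        printf 'host    all    all    %s/32    trust\n' "$ip"
    done
}

# Hypothetical addresses standing in for the IP_ADDRESSES value.
hba_lines 192.168.0.10 192.168.0.11 192.168.0.12
```

<p>Each printed line allows connections to every database, for every user, from exactly one machine. In a real cluster you would probably want a stricter authentication method than <em>trust</em>.</p>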
<p><strong>Step 3: Start PGCluster</strong></p>
<p>Before running any of the following commands, make sure that you are logged in as the database user (type &#8216;<em>su postgres</em>&#8217; if you aren&#8217;t).</p>
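<p>If you want to guard against starting a service as the wrong user, a small check along these lines may help. This is only a sketch: the function name <em>is_db_user</em> is made up for illustration, and <em>postgres</em> is assumed to be the account created in Step 1.</p>

```shell
#!/bin/sh
# Sketch: report whether the current user matches the expected database user.
# "postgres" is the account created in Step 1; pass another name if yours differs.
is_db_user() {
    expected="${1:-postgres}"
    [ "$(id -un)" = "$expected" ]
}

if is_db_user postgres; then
    echo "running as postgres"
else
    echo "not running as postgres; try: su - postgres"
fi
```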
<p>To start each data server:</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #000000; font-weight: bold;">/</span>usr<span style="color: #000000; font-weight: bold;">/</span>local<span style="color: #000000; font-weight: bold;">/</span>pgsql<span style="color: #000000; font-weight: bold;">/</span>bin<span style="color: #000000; font-weight: bold;">/</span>pg_ctl <span style="color: #660033;">-D</span> <span style="color: #000000; font-weight: bold;">/</span>usr<span style="color: #000000; font-weight: bold;">/</span>local<span style="color: #000000; font-weight: bold;">/</span>pgsql<span style="color: #000000; font-weight: bold;">/</span>data start</pre></div></div>

<p>To start the replication server:</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #000000; font-weight: bold;">/</span>usr<span style="color: #000000; font-weight: bold;">/</span>local<span style="color: #000000; font-weight: bold;">/</span>pgsql<span style="color: #000000; font-weight: bold;">/</span>bin<span style="color: #000000; font-weight: bold;">/</span>pgreplicate <span style="color: #660033;">-D</span> <span style="color: #000000; font-weight: bold;">/</span>usr<span style="color: #000000; font-weight: bold;">/</span>local<span style="color: #000000; font-weight: bold;">/</span>pgsql<span style="color: #000000; font-weight: bold;">/</span>data start</pre></div></div>

<p>To start the load balancer:</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #000000; font-weight: bold;">/</span>usr<span style="color: #000000; font-weight: bold;">/</span>local<span style="color: #000000; font-weight: bold;">/</span>pgsql<span style="color: #000000; font-weight: bold;">/</span>bin<span style="color: #000000; font-weight: bold;">/</span>pglb <span style="color: #660033;">-D</span> <span style="color: #000000; font-weight: bold;">/</span>usr<span style="color: #000000; font-weight: bold;">/</span>local<span style="color: #000000; font-weight: bold;">/</span>pgsql<span style="color: #000000; font-weight: bold;">/</span>data start</pre></div></div>

<p><div class="note"><div class="noteclassic">I&#8217;ve read other instructions that say you should start the replication server before the data servers. This didn&#8217;t work for me, but if things start going wrong I&#8217;d try starting the replication server first.</div></div><br />
To get more informative output from the replication server or load balancer, add the arguments <em>-n -v</em> for verbose debugging. If at any time you want to stop these processes, you can use the following commands:</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #000000; font-weight: bold;">/</span>usr<span style="color: #000000; font-weight: bold;">/</span>local<span style="color: #000000; font-weight: bold;">/</span>pgsql<span style="color: #000000; font-weight: bold;">/</span>bin<span style="color: #000000; font-weight: bold;">/</span>pgreplicate <span style="color: #660033;">-D</span> <span style="color: #000000; font-weight: bold;">/</span>usr<span style="color: #000000; font-weight: bold;">/</span>local<span style="color: #000000; font-weight: bold;">/</span>pgsql<span style="color: #000000; font-weight: bold;">/</span>data stop
<span style="color: #000000; font-weight: bold;">/</span>usr<span style="color: #000000; font-weight: bold;">/</span>local<span style="color: #000000; font-weight: bold;">/</span>pgsql<span style="color: #000000; font-weight: bold;">/</span>bin<span style="color: #000000; font-weight: bold;">/</span>pg_ctl <span style="color: #660033;">-D</span> <span style="color: #000000; font-weight: bold;">/</span>usr<span style="color: #000000; font-weight: bold;">/</span>local<span style="color: #000000; font-weight: bold;">/</span>pgsql<span style="color: #000000; font-weight: bold;">/</span>data stop
<span style="color: #000000; font-weight: bold;">/</span>usr<span style="color: #000000; font-weight: bold;">/</span>local<span style="color: #000000; font-weight: bold;">/</span>pgsql<span style="color: #000000; font-weight: bold;">/</span>bin<span style="color: #000000; font-weight: bold;">/</span>pglb <span style="color: #660033;">-D</span> <span style="color: #000000; font-weight: bold;">/</span>usr<span style="color: #000000; font-weight: bold;">/</span>local<span style="color: #000000; font-weight: bold;">/</span>pgsql<span style="color: #000000; font-weight: bold;">/</span>data stop</pre></div></div>

<p><strong>Step 4: Start Querying!</strong></p>
<p>Provided there were no errors during the last step, you should now be able to start using the cluster. You can send queries to the <em>data nodes</em> directly without breaking replication, but the system is designed for you to send queries to the <em>load balancer</em>.</p>
<p>The simplest way of testing that the system is working is to create a new database using the createdb command:</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #000000; font-weight: bold;">/</span>usr<span style="color: #000000; font-weight: bold;">/</span>local<span style="color: #000000; font-weight: bold;">/</span>pgsql<span style="color: #000000; font-weight: bold;">/</span>bin<span style="color: #000000; font-weight: bold;">/</span>createdb my_test_database</pre></div></div>

<p>If this works you can log in to the interactive console and start creating tables and issuing queries:</p>

<div class="wp_syntax"><div class="code"><pre class="bash" style="font-family:monospace;"><span style="color: #000000; font-weight: bold;">/</span>usr<span style="color: #000000; font-weight: bold;">/</span>local<span style="color: #000000; font-weight: bold;">/</span>pgsql<span style="color: #000000; font-weight: bold;">/</span>bin<span style="color: #000000; font-weight: bold;">/</span>psql my_test_database</pre></div></div>

<p><strong>What to do when nothing else works&#8230;</strong></p>
<p>I found <a href="http://odyssi.blogspot.com/2007/08/postgresql-replication-with-pgcluster.html">this blog post</a> on setting up PGCluster very useful, though the advice on postgresql.conf doesn&#8217;t seem to work anymore (my scripts do something different). The instructions I&#8217;ve given here follow the same approach, but I have included scripts in step 2 to automatically create the necessary config files.</p>
<p>The official install instructions contain a number of mistakes (specifically, in describing where config files should be placed and what config files are needed).</p>
<p>I came across five main sources of error when working through this:</p>
<ol>
<li>The replication server stating that it couldn&#8217;t connect to either of the data servers. This was due to an incorrectly configured pg_hba.conf file, which specifies what incoming connections are allowed.</li>
<li><em>Postgresql FATAL: user &#8220;root&#8221; does not exist</em>. This happens if you try to run one of the services as the wrong user. You need to be logged in as the user which owns the database folder.</li>
<li><em>ERROR:load_balance_main():PGR_Create_Recv_Socket failed</em>. This happened when, after seeing the error in no. 2, I logged in as the correct user and tried starting one of the services. The service I&#8217;d tried to start as the wrong user was still running, and already listening on the PostgreSQL port.</li>
<li><em>PGRget_Pgrp_Conf_Data error</em>. This happens when the process can&#8217;t find the configuration file it needs when starting up, either because you specified the wrong folder for the configuration file or because the database user doesn&#8217;t have permission to open it.</li>
<li><em>WARNING:  This query is not permitted without running replication server</em>. This happens when the data node has not been listed in the replication server&#8217;s configuration file.</li>
</ol>
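<p>For the <em>PGRget_Pgrp_Conf_Data</em> case in particular, it can be quicker to check the file by hand than to keep restarting the service. The following is a sketch rather than part of PGCluster: <em>conf_readable</em> is a made-up helper, and the file name in the example call (<em>pgreplicate.conf</em>) is an assumption about what the replication server expects to find in its data directory.</p>

```shell
#!/bin/sh
# Sketch: check that a config file exists in the given directory and is
# readable by the current user. The file name is supplied by the caller.
conf_readable() {
    dir="$1"
    file="$2"
    if [ ! -e "$dir/$file" ]; then
        echo "missing: $dir/$file"
        return 1
    fi
    if [ ! -r "$dir/$file" ]; then
        echo "not readable by $(id -un): $dir/$file"
        return 2
    fi
    echo "ok: $dir/$file"
}

# Example call; pgreplicate.conf is an assumed file name (see above).
conf_readable /usr/local/pgsql/data pgreplicate.conf || true
```

<p>Run it as the database user, since a file that root can read may still be unreadable to <em>postgres</em>.</p>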
]]></content:encoded>
			<wfw:commentRss>http://blogs.cs.st-andrews.ac.uk/angus/2011/06/setting-up-pgcluster-guide/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	<saved_date_time>2011-06-29T15:43:28+00:00</saved_date_time><news_archive_date></news_archive_date>	</item>
		<item>
		<title>Charting Thesis Progress</title>
		<link>http://blogs.cs.st-andrews.ac.uk/angus/2011/04/counting-words/</link>
		<comments>http://blogs.cs.st-andrews.ac.uk/angus/2011/04/counting-words/#comments</comments>
		<pubDate>Thu, 07 Apr 2011 20:44:03 +0000</pubDate>
		<dc:creator>Angus Macdonald</dc:creator>
				<category><![CDATA[Discussion]]></category>
		<category><![CDATA[thesis]]></category>

		<guid isPermaLink="false">http://blogs.cs.st-andrews.ac.uk/angus/?p=1033</guid>
		<description><![CDATA[Someone told me it&#8217;d be a good idea to literally chart the progress I&#8217;ve made writing my thesis, so I created a graph showing my word count over time. Exciting, I know! Hopefully the line levelling off is a positive sign&#8230; Update (02/05/11): Updated with more detail and more words. The line isn&#8217;t levelling off [...]]]></description>
			<content:encoded><![CDATA[<p style="text-align: left;">Someone told me it&#8217;d be a good idea to literally chart the progress I&#8217;ve made writing my thesis, so I created a graph showing my word count over time. Exciting, I know!</p>
<p style="text-align: left;">Hopefully the line levelling off is a positive sign&#8230;</p>
<p style="text-align: left;"><a href="http://blogs.cs.st-andrews.ac.uk/angus/files/2011/04/word-count-2.png"><img class="aligncenter size-full wp-image-1068" title="word-count" src="http://blogs.cs.st-andrews.ac.uk/angus/files/2011/04/word-count-2.png" alt="" width="641" height="290" /></a></p>
<p style="text-align: left;">Update (02/05/11): Updated with more detail and more words. The line isn&#8217;t levelling off any more, which is maybe worrying!</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.cs.st-andrews.ac.uk/angus/2011/04/counting-words/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	<saved_date_time></saved_date_time><news_archive_date></news_archive_date>	</item>
		<item>
		<title>CrossRef++ (A Microsoft Word Add-in)</title>
		<link>http://blogs.cs.st-andrews.ac.uk/angus/2011/03/crossref-plus-a-microsoft-word-add-in/</link>
		<comments>http://blogs.cs.st-andrews.ac.uk/angus/2011/03/crossref-plus-a-microsoft-word-add-in/#comments</comments>
		<pubDate>Fri, 11 Mar 2011 07:30:55 +0000</pubDate>
		<dc:creator>Angus Macdonald</dc:creator>
				<category><![CDATA[Side Projects]]></category>
		<category><![CDATA[add-in]]></category>
		<category><![CDATA[c#]]></category>
		<category><![CDATA[vsto]]></category>

		<guid isPermaLink="false">http://blogs.cs.st-andrews.ac.uk/angus/?p=1004</guid>
		<description><![CDATA[This is a Word 2010 add-in that replaces Word&#8217;s in-built &#8216;add cross-reference&#8217; tool. Why? Because it has annoyed me greatly whilst writing my thesis! Installer (EXE Version). Source code as ZIP, on Github. Please note that this only works in Word 2010, not on any earlier versions. Why this is &#8216;needed&#8217; The standard Word tool [...]]]></description>
			<content:encoded><![CDATA[<p>This is a Word 2010 add-in that replaces Word&#8217;s in-built &#8216;add cross-reference&#8217; tool. Why? Because it has annoyed me greatly whilst writing my thesis!</p>
<p><a href="https://github.com/downloads/angusmacdonald/CrossRefPlusPlus/CrossRefPlusPlus.zip">Installer (EXE Version)</a>.</p>
<p>Source code as <a href="https://github.com/angusmacdonald/CrossRefPlusPlus/zipball/master">ZIP</a>, on <a href="https://github.com/angusmacdonald/CrossRefPlusPlus">Github</a>.</p>
<p>Please note that this only works in Word 2010, not on any earlier versions.</p>
<p><strong>Why this is &#8216;needed&#8217;</strong></p>
<p>The standard Word tool (pictured below) quickly becomes tedious to use in large documents for a number of reasons:</p>
<ul>
<li>You have to switch between references for things (like figures and numbered items) constantly, and you have to use a drop-down box to do this.</li>
<li>It doesn&#8217;t remember your last referenced item, so if you&#8217;re constantly referencing a figure that is two pages down the list, you constantly have to scroll down to that reference.</li>
<li>It doesn&#8217;t remember the size of your reference box, so even if you have a massive monitor, you can only ever use a tiny fraction of it to search for references.</li>
</ul>
<div id="attachment_1005" class="wp-caption aligncenter" style="width: 478px"><a href="http://blogs.cs.st-andrews.ac.uk/angus/files/2011/03/old-cross-reference.png"><img class="size-full wp-image-1005 " title="old-cross-reference" src="http://blogs.cs.st-andrews.ac.uk/angus/files/2011/03/old-cross-reference.png" alt="" width="468" height="285" /></a><p class="wp-caption-text">The Standard Word Cross-reference Tool</p></div>
<p><strong>What CrossRef++ Does Differently</strong></p>
<ul>
<li>References are displayed in a task pane, which typically stretches the length of a screen, but can be moved around as well.</li>
<li>It remembers (roughly) where your last used reference was for each type (figure, numbered item, etc.), so you don&#8217;t have to scroll as much as before.</li>
<li>It provides a few big buttons at the top to change reference type, which makes it quicker to change.</li>
</ul>
<div id="attachment_1006" class="wp-caption aligncenter" style="width: 375px"><a href="http://blogs.cs.st-andrews.ac.uk/angus/files/2011/03/new-cross-reference.png"><img class="size-full wp-image-1006" title="new-cross-reference" src="http://blogs.cs.st-andrews.ac.uk/angus/files/2011/03/new-cross-reference.png" alt="" width="365" height="206" /></a><p class="wp-caption-text">The CrossRef++ Tool</p></div>
<p><strong>What it Doesn&#8217;t Yet Do</strong></p>
<ul>
<li>It doesn&#8217;t support all types of references (for endnotes and other things you need to use the old tool).</li>
<li>It doesn&#8217;t handle re-sizing the task pane well.</li>
<li>It doesn&#8217;t allow you to search through references, though I&#8217;d like to do that eventually.</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://blogs.cs.st-andrews.ac.uk/angus/2011/03/crossref-plus-a-microsoft-word-add-in/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
	<saved_date_time></saved_date_time><news_archive_date></news_archive_date>	</item>
		<item>
		<title>Tutorial Notes (on Data Structures and Algorithms)</title>
		<link>http://blogs.cs.st-andrews.ac.uk/angus/2010/12/tutorial-notes-on-data-structures-and-algorithms/</link>
		<comments>http://blogs.cs.st-andrews.ac.uk/angus/2010/12/tutorial-notes-on-data-structures-and-algorithms/#comments</comments>
		<pubDate>Wed, 15 Dec 2010 14:24:04 +0000</pubDate>
		<dc:creator>Angus Macdonald</dc:creator>
				<category><![CDATA[Teaching]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[tutorial]]></category>

		<guid isPermaLink="false">http://blogs.cs.st-andrews.ac.uk/angus/?p=967</guid>
		<description><![CDATA[This year marked the first time I&#8217;d tutored on a second-year course, Foundations of Computation. Of the topics covered I produced notes and code to help explain lists, sorting algorithms, and trees. I&#8217;ve included them below in the hope they may be useful. Algorithms for finding cycles in Linked Lists (GitHub repository). The included [...]]]></description>
			<content:encoded><![CDATA[<p>This year marked the first time I&#8217;d tutored on a second-year course, Foundations of Computation. Of the topics covered I produced notes and code to help explain lists, sorting algorithms, and trees. I&#8217;ve included them below in the hope they may be useful.</p>
<p><strong>Algorithms for finding cycles in Linked Lists</strong> (<a href="https://github.com/angusmacdonald/Linked-List-Cycle-Detection-Analysis">GitHub repository</a>). The included code runs various algorithms to find cycles, and graphs the efficiency of each algorithm.</p>
<p><strong>Sorting Algorithms Comparison</strong> (<a href="https://github.com/angusmacdonald/Sorting-Algorithm-Analysis">GitHub repository</a>). The included code implements three sorting algorithms (Selection, Insertion, and Merge Sort) and includes various levels of debug output to show the process taken by each algorithm and to count the number of comparisons and swaps involved.</p>
<p><strong>Cheat sheet for Balancing AVL Trees</strong> (<a href="https://docs.google.com/viewer?a=v&amp;pid=explorer&amp;chrome=true&amp;srcid=0ByFkMXEHUcbLZmMxYmQxMzAtYmExNS00YTYwLWEzNWMtYTUzMDlmM2VjZTNk&amp;hl=en_GB">Google Docs</a>). A very brief guide explaining what operations must be performed to balance an AVL tree.</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.cs.st-andrews.ac.uk/angus/2010/12/tutorial-notes-on-data-structures-and-algorithms/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	<saved_date_time></saved_date_time><news_archive_date></news_archive_date>	</item>
		<item>
		<title>Current Trends in Distributed Database Systems (Talk)</title>
		<link>http://blogs.cs.st-andrews.ac.uk/angus/2010/12/current-trends-in-distributed-database-systems-talk/</link>
		<comments>http://blogs.cs.st-andrews.ac.uk/angus/2010/12/current-trends-in-distributed-database-systems-talk/#comments</comments>
		<pubDate>Mon, 13 Dec 2010 12:07:19 +0000</pubDate>
		<dc:creator>Angus Macdonald</dc:creator>
				<category><![CDATA[Presentations]]></category>
		<category><![CDATA[Teaching]]></category>
		<category><![CDATA[Work]]></category>
		<category><![CDATA[c-store]]></category>
		<category><![CDATA[database]]></category>
		<category><![CDATA[database system]]></category>
		<category><![CDATA[h-store]]></category>
		<category><![CDATA[h2o]]></category>

		<guid isPermaLink="false">http://blogs.cs.st-andrews.ac.uk/angus/?p=956</guid>
		<description><![CDATA[I recently gave a talk to our Masters Databases class entitled Current Trends in Distributed Database Systems. The talk (available here) covers some of the more innovative designs in database systems over the last few years, from Vertica and VoltDB, to larger-scale datastores such as Amazon&#8217;s Dynamo. Major aside: I tried and failed to come [...]]]></description>
			<content:encoded><![CDATA[<p>I recently gave a talk to our Masters Databases class entitled <em><a href="https://docs.google.com/viewer?a=v&amp;pid=explorer&amp;chrome=true&amp;srcid=0ByFkMXEHUcbLNDhiMTZjZjUtY2RjMi00NjllLTg0M2YtODg4ODgyNTk3MWQ0&amp;hl=en_GB">Current Trends in Distributed Database Systems</a></em>.</p>
<p>The talk (<a href="http://docs.google.com/viewer?a=v&amp;q=cache:o0g6ONeGemcJ:www.erlang-factory.com/upload/presentations/31/EugeneLetuchy-ErlangatFacebook.pdf+architecture+of+facebook&amp;hl=en&amp;gl=sg&amp;pid=bl&amp;srcid=ADGEESiw63ZXSvf3kF4c-tTuKgPrHYucnpVY2FNqzV3vwY12fTxebQwEQMhf9PLmeEVVWioJiFDnY2aiaAnQaqL4E65nkHUleUIuZj2Ythq34IDRx5gVhajUSKjOUmLUVxkV_3GFC7M2&amp;sig=AHIEtbThewthIAMefblxc9U9NWkD6yYY6g&amp;pli=1">available here</a>) covers some of the more innovative designs in database systems over the last few years, from Vertica and VoltDB, to larger-scale datastores such as Amazon&#8217;s Dynamo.</p>
<p><a href="https://docs.google.com/viewer?a=v&amp;pid=explorer&amp;chrome=true&amp;srcid=0ByFkMXEHUcbLNDhiMTZjZjUtY2RjMi00NjllLTg0M2YtODg4ODgyNTk3MWQ0&amp;hl=en_GB"><img class="aligncenter size-full wp-image-964" title="title-slide-small" src="http://blogs.cs.st-andrews.ac.uk/angus/files/2010/12/title-slide-small.png" alt="" width="522" height="391" /></a></p>
<p><em>Major aside:</em> I tried and failed to come up with a more entertaining title for the talk. The suggestions I received on Twitter were better, but less relevant (one of the suggestions is on my title slide).</p>
<p>So, if you think you can do better and come up with something that is both relevant and witty/entertaining, there&#8217;ll be some form of prize in it for you!</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.cs.st-andrews.ac.uk/angus/2010/12/current-trends-in-distributed-database-systems-talk/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	<saved_date_time></saved_date_time><news_archive_date></news_archive_date>	</item>
		<item>
		<title>Demo of H2O at SICSA DEMOfest  (2nd November)</title>
		<link>http://blogs.cs.st-andrews.ac.uk/angus/2010/10/demo-of-h2o-at-sicsa-demofest-2nd-november/</link>
		<comments>http://blogs.cs.st-andrews.ac.uk/angus/2010/10/demo-of-h2o-at-sicsa-demofest-2nd-november/#comments</comments>
		<pubDate>Fri, 29 Oct 2010 15:49:36 +0000</pubDate>
		<dc:creator>Angus Macdonald</dc:creator>
				<category><![CDATA[Posters]]></category>
		<category><![CDATA[Presentations]]></category>
		<category><![CDATA[Work]]></category>
		<category><![CDATA[database]]></category>
		<category><![CDATA[demo]]></category>
		<category><![CDATA[poster]]></category>

		<guid isPermaLink="false">http://blogs.cs.st-andrews.ac.uk/angus/?p=928</guid>
		<description><![CDATA[Next Tuesday (2nd November 2010) we&#8217;ll be demoing H2O at the SICSA DEMOfest in Edinburgh. We&#8217;ll be debuting our new H2O visualization tool plus the new, occasionally colourful, posters below:]]></description>
			<content:encoded><![CDATA[<p>Next Tuesday (2nd November 2010) we&#8217;ll be demoing H2O at the <a href="http://www.sicsa.ac.uk/events/sicsa-demofest">SICSA DEMOfest</a> in Edinburgh.</p>
<p>We&#8217;ll be debuting our new H2O visualization tool plus the new, occasionally colourful, posters below:</p>
<p><a href="http://blogs.cs.st-andrews.ac.uk/angus/files/2010/10/DemoFest-MotivationPoster.pdf"><img class="aligncenter size-full wp-image-935" title="motivation" src="http://blogs.cs.st-andrews.ac.uk/angus/files/2010/10/motivation.png" alt="" width="568" height="404" /></a></p>
<p><a href="http://blogs.cs.st-andrews.ac.uk/angus/files/2010/10/DemoFest-DesignPoster.pdf"><img class="aligncenter size-full wp-image-934" title="design" src="http://blogs.cs.st-andrews.ac.uk/angus/files/2010/10/design.png" alt="" width="569" height="403" /></a></p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.cs.st-andrews.ac.uk/angus/2010/10/demo-of-h2o-at-sicsa-demofest-2nd-november/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	<saved_date_time></saved_date_time><news_archive_date></news_archive_date>	</item>
		<item>
		<title>How do you teach Software Quality?</title>
		<link>http://blogs.cs.st-andrews.ac.uk/angus/2010/09/how-do-you-teach-software-quality/</link>
		<comments>http://blogs.cs.st-andrews.ac.uk/angus/2010/09/how-do-you-teach-software-quality/#comments</comments>
		<pubDate>Sun, 26 Sep 2010 23:00:51 +0000</pubDate>
		<dc:creator>Angus Macdonald</dc:creator>
				<category><![CDATA[Discussion]]></category>
		<category><![CDATA[Teaching]]></category>
		<category><![CDATA[lectures]]></category>
		<category><![CDATA[thoughts]]></category>

		<guid isPermaLink="false">http://blogs.cs.st-andrews.ac.uk/angus/?p=873</guid>
		<description><![CDATA[I was recently asked to give a talk on ‘Software Quality’ to our incoming Junior Honours class which made me realise one thing – you can’t teach software quality! For a start, it’s so subjective. The narrator in Zen and the Art of Motorcycle Maintenance says much the same things about creative writing. Quality is [...]]]></description>
			<content:encoded><![CDATA[<p>I was recently asked to give a talk on ‘Software Quality’ to our incoming Junior Honours class which made me realise one thing – you can’t teach software quality!</p>
<p>For a start, it’s so subjective. The narrator in <em><a href="http://en.wikipedia.org/wiki/Zen_and_the_Art_of_Motorcycle_Maintenance">Zen and the Art of Motorcycle Maintenance</a></em> says much the same thing about creative writing. Quality is difficult to define. If you ask students to rank essays (or programs) from best to worst then they will probably be able to reach a consensus – they have an intuitive understanding that one essay has more quality than another – but it’s much more difficult to identify the parts of the essay that give it quality. Over time, with practice and reading, a student develops their ability to recognize that quality. It is the practice, then, that helps them develop this understanding and an ability to create it.</p>
<p>To me this seems similar to the food tasters Malcolm Gladwell discusses in <em><a href="http://www.gladwell.com/blink/index.html">Blink</a></em>. The average person can taste Coke and Pepsi and give their preference for one or the other, but may find it difficult to explain why they have this preference. On the other hand, an experienced food taster is able to identify the multitude of flavours in each drink and categorize them by their taste and strength. The food taster, with their well-developed analytical ability, is able to put words to the feelings and intuition that the untrained person has.</p>
<p>With programming classes, students are lectured on the techniques and design patterns that help to produce quality software, but they can only begin to understand where these techniques should be used (and possibly more importantly, the extent to which they should be used) with experience. It&#8217;s almost as easy to write bad code by over-applying design patterns and &#8216;good practice&#8217; techniques as it is by not using them at all. There is hardly ever one right answer.</p>
<p>I don’t think it necessarily helps that the focus is on writing code rather than analysing the work of other people. As soon as I started working on group projects I think I developed a far greater appreciation for the quality (or lack of it) of my own code.</p>
<p style="text-align: left;"><strong>Refactoring as an Afterthought</strong></p>
<p>A more obvious problem relating to software quality is the treatment of refactoring as an added extra. When I write a paper I tend to write out the first draft as a long brain-dump saying everything I think needs to be said. That draft is then constantly refactored until I have a cohesive piece of work. In programming (particularly with University assignments) the first draft is often finished when it meets the general requirements of the task, so there is little immediate need to redraft or refactor the work into a better state.</p>
<p>I’m not sure it’s possible to realise the danger of this until you’ve worked on a larger project where poor quality code can lose you days in debugging and refactoring. This is probably the way most people learn of the need for ‘quality’ software anyway.</p>
<p><strong>What Can You Teach?</strong></p>
<p>Beyond teaching students the design patterns and architectures that can help to improve the structure of their code, I think the only thing you can do is emphasize the importance of taking an analytical view of code. The best programmers I’ve worked with are constantly unhappy with the state of their own work, continually looking for ways to improve its clarity and structure. Ultimately you can’t really teach that, only motivate the need for it.</p>
<p>There are a few books I&#8217;ve read that help in this respect. I particularly liked:</p>
<ul>
<li><a href="http://www.pragprog.com/the-pragmatic-programmer">The Pragmatic Programmer</a>. It&#8217;s good at motivating the need for various techniques. Their <a href="http://www.pragprog.com/the-pragmatic-programmer/extracts/tips">list of tips</a> is good if you haven&#8217;t already read this book.</li>
<li><a href="http://programmer.97things.oreilly.com/wiki/index.php/97_Things_Every_Programmer_Should_Know">97 Things Every Programmer Should Know</a> covers a lot of programming lessons. It&#8217;s funny how many of these are basic common-sense stories, but they still need to be told.</li>
</ul>
<p>There are probably many more, but I haven&#8217;t read them yet!</p>
<p>Other books are just good at giving you extra ways to think about and structure problems:</p>
<ul>
<li>Design Patterns (<a href="http://www.amazon.co.uk/Design-patterns-elements-reusable-object-oriented/dp/0201633612">Gang of Four</a> or <a href="http://www.amazon.co.uk/Head-First-Design-Patterns-Freeman/dp/0596007124/ref=sr_1_1?s=books&amp;ie=UTF8&amp;qid=1284479860&amp;sr=1-1">Head First</a> &#8211; pick your poison).</li>
<li><a href="http://www.amazon.co.uk/Extreme-Programming-Explained-Embrace-Change/dp/0201616416">Extreme Programming Explained</a>. You might not agree with all of it, but principles like incremental design and constant refactoring work for me.</li>
<li>Programming Pearls. I started reading this last week, so I can&#8217;t give a complete review, but the first few chapters are great examples of clever programming.</li>
</ul>
<p>I think my talk will focus on motivating the need for quality, using examples of things I&#8217;ve learned the hard way not to do. Ultimately, there&#8217;s only so much you can say in 45 minutes!</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.cs.st-andrews.ac.uk/angus/2010/09/how-do-you-teach-software-quality/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	</item>
		<item>
		<title>Eclipse Cheat Sheet</title>
		<link>http://blogs.cs.st-andrews.ac.uk/angus/2010/09/eclipse-cheat-sheet/</link>
		<comments>http://blogs.cs.st-andrews.ac.uk/angus/2010/09/eclipse-cheat-sheet/#comments</comments>
		<pubDate>Tue, 14 Sep 2010 09:00:49 +0000</pubDate>
		<dc:creator>Angus Macdonald</dc:creator>
				<category><![CDATA[Teaching]]></category>
		<category><![CDATA[eclipse]]></category>

		<guid isPermaLink="false">http://blogs.cs.st-andrews.ac.uk/angus/?p=890</guid>
		<description><![CDATA[I created a cheat sheet for the Eclipse IDE which you can find here: eclipse cheat sheet [Google Docs] It&#8217;s intended as an accompaniment to a refresher talk I&#8217;m giving to the incoming JH class on using Eclipse. They&#8217;ve been using it on and off for two years now, so the aim is to cover [...]]]></description>
			<content:encoded><![CDATA[<p><a href="https://docs.google.com/viewer?a=v&#038;pid=explorer&#038;chrome=true&#038;srcid=0ByFkMXEHUcbLODgwMDlmYzUtMjBiOS00ZTEzLTlkYzgtNTMzOGQ0NWM0NmQ1&#038;hl=en_GB"><img class="aligncenter size-full wp-image-898" title="cheatsheet-cover" src="http://blogs.cs.st-andrews.ac.uk/angus/files/2010/09/cheatsheet-cover.png" alt="" width="611" height="430" /></a></p>
<p>I created a cheat sheet for the Eclipse IDE which you can find here: <a href="https://docs.google.com/viewer?a=v&#038;pid=explorer&#038;chrome=true&#038;srcid=0ByFkMXEHUcbLODgwMDlmYzUtMjBiOS00ZTEzLTlkYzgtNTMzOGQ0NWM0NmQ1&#038;hl=en_GB">eclipse cheat sheet [Google Docs]</a></p>
<p>It&#8217;s intended as an accompaniment to a refresher talk I&#8217;m giving to the incoming JH class on using Eclipse. They&#8217;ve been using it on and off for two years now, so the aim is to cover some of the features they might not know about rather than the basics.</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.cs.st-andrews.ac.uk/angus/2010/09/eclipse-cheat-sheet/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	</item>
		<item>
		<title>SICSA Conference 2010</title>
		<link>http://blogs.cs.st-andrews.ac.uk/angus/2010/06/sicsa-conference-2010/</link>
		<comments>http://blogs.cs.st-andrews.ac.uk/angus/2010/06/sicsa-conference-2010/#comments</comments>
		<pubDate>Sat, 12 Jun 2010 15:04:14 +0000</pubDate>
		<dc:creator>Angus Macdonald</dc:creator>
				<category><![CDATA[Papers]]></category>
		<category><![CDATA[Posters]]></category>
		<category><![CDATA[Presentations]]></category>
		<category><![CDATA[Work]]></category>
		<category><![CDATA[conference]]></category>
		<category><![CDATA[dbharvester]]></category>
		<category><![CDATA[sicsa]]></category>

		<guid isPermaLink="false">http://blogs.cs.st-andrews.ac.uk/angus/?p=804</guid>
		<description><![CDATA[I&#8217;m just back from presenting a paper and poster at the SICSA Conference 2010. You can find the work that I presented at the conference below, and more information on H2O in general at the project webpage. Paper H2O: An Autonomic, Resource-Aware Distributed Database System Abstract: This paper presents the design of an autonomic, resource-aware [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;m just back from presenting a paper and poster at the <a href="http://www.sicsaconf.org/">SICSA Conference 2010</a>.</p>
<p>You can find the work that I presented at the conference below, and more information on H2O in general at the <a href="http://blogs.cs.st-andrews.ac.uk/h2o/">project webpage</a>.</p>
<p><span style="text-decoration: underline;"><strong>Paper</strong></span></p>
<p><a href="http://blogs.cs.st-andrews.ac.uk/angus/files/2010/06/sicsa2010-paper.pdf">H2O: An Autonomic, Resource-Aware Distributed Database System</a></p>
<p>Abstract:</p>
<blockquote><p>This paper presents the design of an autonomic, resource-aware distributed database which enables data to be backed up and shared without complex manual administration. The database, H2O, is designed to make use of unused resources on workstation machines.</p>
<p>Creating and maintaining highly-available, replicated database systems can be difficult for untrained users, and costly for IT departments. H2O reduces the need for manual administration by autonomically replicating data and load-balancing across machines in an enterprise.</p>
<p>Provisioning hardware to run a database system can be unnecessarily costly as most organizations already possess large quantities of idle resources in workstation machines. H2O is designed to utilize this unused capacity by using resource availability information to place data and plan queries over workstation machines that are already being used for other tasks.</p>
<p>This paper discusses the requirements for such a system and presents the design and implementation of H2O.</p></blockquote>
<p><span style="text-decoration: underline;"><strong>Poster</strong></span> <a href="http://blogs.cs.st-andrews.ac.uk/angus/files/2010/06/sicsa2010-poster.pdf">(pdf)</a></p>
<p style="text-align: center;"><a href="http://blogs.cs.st-andrews.ac.uk/angus/files/2010/06/sicsa2010-poster.pdf"><img class="aligncenter size-full wp-image-817" title="poster-final" src="http://blogs.cs.st-andrews.ac.uk/angus/files/2010/06/poster-final.png" alt="" width="448" height="322" /></a></p>
<p><span style="text-decoration: underline;"><strong>Presentation</strong></span></p>
<p style="text-align: center;"><iframe src="http://player.vimeo.com/video/15127199?title=0&amp;byline=0" width="500" height="375" frameborder="0"></iframe></p>
<p>(<a href="http://blogs.cs.st-andrews.ac.uk/angus/files/2010/06/sicsa2010-presentation.pdf">1-up</a>, <a href="http://blogs.cs.st-andrews.ac.uk/angus/files/2010/06/sicsa2010-presentation-6up.pdf">6-up</a>)</p>
<p style="text-align: center;"><a href="http://blogs.cs.st-andrews.ac.uk/angus/files/2010/06/sicsa2010-presentation.pdf"><img class="aligncenter size-full wp-image-818" title="SICSA Presentation" src="http://blogs.cs.st-andrews.ac.uk/angus/files/2010/06/SICSA-Presentation.png" alt="" width="448" height="336" /></a></p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.cs.st-andrews.ac.uk/angus/2010/06/sicsa-conference-2010/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	</item>
		<item>
		<title>An Approach to Ad-hoc Cloud Computing</title>
		<link>http://blogs.cs.st-andrews.ac.uk/angus/2010/02/an-approach-to-ad-hoc-cloud-computing/</link>
		<comments>http://blogs.cs.st-andrews.ac.uk/angus/2010/02/an-approach-to-ad-hoc-cloud-computing/#comments</comments>
		<pubDate>Tue, 23 Feb 2010 11:41:25 +0000</pubDate>
		<dc:creator>Angus Macdonald</dc:creator>
				<category><![CDATA[Papers]]></category>
		<category><![CDATA[Work]]></category>
		<category><![CDATA[cloud computing]]></category>
		<category><![CDATA[technical report]]></category>

		<guid isPermaLink="false">http://blogs.cs.st-andrews.ac.uk/angus/?p=702</guid>
		<description><![CDATA[You can find our recent technical report on ad-hoc cloud computing here. The abstract is reprinted below. Abstract: We consider how underused computing resources within an enterprise may be harnessed to improve utilization and create an elastic computing infrastructure. Most current cloud provision involves a data center model, in which clusters of machines are dedicated [...]]]></description>
			<content:encoded><![CDATA[<p>You can find our recent technical report on ad-hoc cloud computing <a href="http://www.cs.st-andrews.ac.uk/research/output/detail?output=KDM+10a.php">here</a>. The abstract is reprinted below.</p>
<p>Abstract:</p>
<blockquote><p><a href="http://www.cs.st-andrews.ac.uk/files/publications/download/KDM+10a.pdf"><img class="size-full wp-image-705 alignright" title="firstpage" src="http://blogs.cs.st-andrews.ac.uk/angus/files/2010/02/firstpage.png" alt="" width="129" height="168" /></a>We consider how underused computing resources within an enterprise may be harnessed to improve utilization and create an elastic computing infrastructure. Most current cloud provision involves a data center model, in which clusters of machines are dedicated to running cloud infrastructure software. We propose an additional model, the ad hoc cloud, in which infrastructure software is distributed over resources harvested from machines already in existence within an enterprise. In contrast to the data center cloud model, resource levels are not established a priori, nor are resources dedicated exclusively to the cloud while in use. A participating machine is not dedicated to the cloud, but has some other primary purpose such as running interactive processes for a particular user. We outline the major implementation challenges and one approach to tackling them.</p></blockquote>
]]></content:encoded>
			<wfw:commentRss>http://blogs.cs.st-andrews.ac.uk/angus/2010/02/an-approach-to-ad-hoc-cloud-computing/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	</item>
	</channel>
</rss>

