By Olexandr Konovalov, Research Fellow in the Centre for Interdisciplinary Research in Computational Algebra, University of St Andrews (reproduced from the original post here)
This blog post is based on an improvised demo that I gave at the Newcastle University on May 21st, 2015 during a short visit supported by the CoDiMa project.
Let’s consider the following exercise: for a finite group G, calculate the average order of its element (that is, the sum of orders of its elements divided by the order of the group). We begin with a very straightforward approach, iterating over all elements of the group in question:
gap> S:=SymmetricGroup(10);
Sym( [ 1 .. 10 ] )
gap> sum:=0;
0
gap> for g in S do
> sum := sum + Order(g);
> od;
gap> sum/Size(S);
39020911/3628800
Now assume that we would like to save this fragment of the GAP code and later repeat this calculation for some other groups.. We may even reformat it to fit it into one line and use double semicolon to suppress the output of sum
:
sum:=0;; for g in S do sum := sum + Order(g); od; sum/Size(S);
Now we may copy and paste it into the GAP session when we will need it next time. And here we see the first inconvenience: this assumes that the group in question is stored in a variable named S
:
gap> S:=AlternatingGroup(10); Alt( [ 1 .. 10 ] ) gap> sum:=0;; for g in S do sum := sum + Order(g); od; sum/Size(S); 2587393/259200
Indeed, this is NOT the way to do it, whenever you have a one-line command or a large GAP script. This approach is very much error-prone: e.g. incidentally not all the code was copied and pasted, so the input was incomplete and triggered a break loop, or (even worse!), sum
was not reset to zero prior to the new calculation and the output was incorrect! The group in question may have a different variable name, so the code will have to be changed. But there is even more: when GAP code is pasted into the interpreter, it is evaluated line by line. If you have a long file with many commands, and line N has a syntax error, this error will be reported only when GAP will complete evaluation of all preceding lines, and that may be quite time-consuming.
Instead of that, it is a good practice to structure the GAP code and organise it in the form of functions. Functions are parsed first and may be called later. Any syntax errors will be detected in the parsing stage, and not at the time of the call. Also, functions may have local variables, and this prevents them being accidentally overwritten just because of reusing the same name of the variable to store something else.
Following this good practice, we put the code into a function that looks like this:
AvgOrdOfGroup := function (S) local sum, g; sum:=0; for g in S do sum := sum + Order(g); od; return sum/Size(S); end;
Now we can apply it to any group, passing the group as a function argument. The example below also demonstrates time
– this is the variable which stores the CPU time in milliseconds spent by the last command.
gap> S:=SymmetricGroup(10); Sym( [ 1 .. 10 ] ) gap> AvgOrdOfGroup(S); 39020911/3628800 gap> time; 16517 gap> A:=AlternatingGroup(10); Alt( [ 1 .. 10 ] ) gap> AvgOrdOfGroup(A); 2587393/259200 gap> time; 8108
Of course, for any given group the average order of its elements may be calculated only once, as the next time it will return the same value. However, as we see from the runtimes below, each new call of AvgOrdOfGroup
will repeat the same computation again:
gap> A:=AlternatingGroup(10); Alt( [ 1 .. 10 ] ) gap> AvgOrdOfGroup(A); 2587393/259200 gap> time; 8226 gap> AvgOrdOfGroup(A); 2587393/259200 gap> time; 8118
If you need to reuse this value, one option could be to store it in some variable, but then you should be careful about matching such variables with corresponding groups, and the code could become quite convoluted and unreadable. On the other hand, GAP has a notion of attributes which are used to accumulate information that objects learn about themselves during their lifetime. Since we already have a function which does the calculation, the simplest example of turning it into an attribute could look as follows:
DeclareAttribute("AverageOrder", IsGroup); InstallMethod( AverageOrder, [IsGroup], AvgOrdOfGroup);
In this example, first we declared an attribute AverageOrder
for objects in the category IsGroup, and then installed our function AvgOrdOfGroup as a method for this attribute.
Now we may check that subsequent calls of AverageOrder
with the same argument are performed at zero costs:
gap> S:=SymmetricGroup(10); Sym( [ 1 .. 10 ] ) gap> AverageOrder(S); 39020911/3628800 gap> time; 16609 gap> AverageOrder(S); 39020911/3628800 gap> time; 0
Our algorithm to calculate the average order is very straightforward. It’s not efficient, but at least it is correct. Before we will try to improve it, we will create a test file. We will copy and paste the GAP session with GAP prompts, inputs and outputs into a text file, called avgord.tst
and created in the current directory (to avoid typing full path):
gap> S:=SymmetricGroup(9); Sym( [ 1 .. 9 ] ) gap> AverageOrder(S); 3291487/362880
Test files may be tested using the function Test (as documented here):
gap> Test("avgord.tst"); true
Now we have a test which we will run each time we modify the code to check that the calculation is still correct. For the purposes of this example, we are satisfied with testing it only on a single group. In a real-life situation, it would be useful to add more test cases, e.g. to work with groups in different representations and to ensure good code coverage (i.e. that each line of the code is executed at least once), to check not only the correctness of the code, but also its performance, etc.
Now we will present a better implementation. Of course, the order of an element is an invariant of a conjugacy class of elements of a group, so we need only to know the orders of conjugacy classes of elements and their representatives. We would like to have this algorithm as a special case for permutation groups, so we install it as a method for AverageOrder
for objects in IsPermGroup
. Note that there is no need to create a function and then install it as a method – it’s better to combine this in one command:
InstallMethod( AverageOrder, [IsPermGroup], function (S) local cc, sum, c; cc:=ConjugacyClasses(S); sum:=0; for c in cc do sum := sum + Order(Representative(c))*Size(c); od; return sum/Size(S); end);
Now we may run test and check that it works:
gap> Test("avgord.tst"); true
The test will use the newly installed method, since the GAP method selection mechanism will prefer a method for a special case of IsPermGroup
over the generic method for IsGroup
, and the speedup is amazing:
gap> S:=SymmetricGroup(10); Sym( [ 1 .. 10 ] ) gap> AverageOrder(S); 39020911/3628800 gap> time; 21 gap> A:=AlternatingGroup(10); Alt( [ 1 .. 10 ] ) gap> AverageOrder(A); 2587393/259200 gap> time; 23
Next, we would like GAP to display some information about the progress of calculation. A simple idea would be to add some Print
statements to the code, and then comment them out when they are not needed and then uncomment them back when you want them. This is tedious and error-prone, especially if the GAP script is long enough or you would like this script to be used by others. Instead of that, one could use the Info
functionality, documented here. For example, we may declare a new info class, called MyInfo
:
DeclareInfoClass("MyInfo");
and then add the line
Info(MyInfo,1,"Calculated ", Length(cc), " classes");
after the line
cc:=ConjugacyClasses(S);
in the method above. After rereading that method installation, one could turn info messages on and off by changing the info level, for example:
gap> SetInfoLevel(MyInfo,1); gap> S:=SymmetricGroup(10); Sym( [ 1 .. 10 ] ) gap> AverageOrder(S); #I Calculated 42 classes 39020911/3628800
One could have any number of info levels for any info class in order to specify different levels of verbosity: if the current info level is N, GAP will show all messages with info levels less or equal to N, so the bigger is the level, the more verbose is the output.
The test file needs some modification to remember the initial info level before the test and restore it after the test. We will store the info level into a variable with some technical name to avoid name clashes:
gap> _OLDMYINFOLEVEL:=InfoLevel(MyInfo);; gap> SetInfoLevel(MyInfo,0); gap> S:=SymmetricGroup(9); Sym( [ 1 .. 9 ] ) gap> AverageOrder(S); 3291487/362880 gap> SetInfoLevel(MyInfo,_OLDMYINFOLEVEL);
Now we will deliberately introduce an error. The order of the conjugacy class is the index of the centraliser of its representative, so we may try to avoid the final division by the order of the group. Let’s try the following:
InstallMethod( AverageOrder, [IsPermGroup], function (S) local cc, sum, c, r; cc:=ConjugacyClasses(S); Info(MyInfo,1,"Calculated ", Length(cc), " classes"); sum:=0; for c in cc do r := Representative(c); sum := sum + Order(r)*Size(Centraliser(S,r)); od; return sum/Size(S); end);
This method has the same requirements as the previous one, so GAP will select it as the last installed. Running the test, we will detect a discrepancy, or a “diff”:
gap> Test("avgord.tst"); ########> Diff in avgord.tst, line 6: # Input is: AverageOrder(S); # Expected output: 3291487/362880 # But found: 3291487/131681894400 ######## false
Something went wrong! Indeed, we should divide by the order of the centraliser and not multiply! Restart GAP and read the previous definitions and then the corrected version of the method:
InstallMethod( AverageOrder, [IsPermGroup], function (S) local cc, sum, c, r; cc:=ConjugacyClasses(S); sum:=0; Info(MyInfo,1,"Calculated ", Length(cc), " classes"); for c in cc do r := Representative(c); sum := sum + Order(r)/Size(Centraliser(S,r)); od; return sum; end);
Now it works as expected:
gap> Test("avgord.tst"); true gap> S:=SymmetricGroup(10); Sym( [ 1 .. 10 ] ) gap> AverageOrder(S); 39020911/3628800 gap> time; 23 gap> A:=AlternatingGroup(10); Alt( [ 1 .. 10 ] ) gap> AverageOrder(A); 2587393/259200 gap> time; 16
As you may see, at least for the groups from our examples, this does not give us any significant changes in the performance. You may experiment with larger groups and groups in different representations (polycyclic groups, matrix groups) to compare the performance. Will you be able to find a group for which an average order of an element is an integer? Do you need to look among p-groups for such an example?