The Pizza Compiler
The Pizza language is an extension to Java with three new features: |
Pizza Example: Quicksort
|
|
Quicksort is a well-known sorting algorithm, first described by Hoare.
The algorithm is very efficient in the average case, and is quite
frequently used. This problem is a good example because the details
of the algorithm are almost universally taught in computer science
education, because the implementation is large enought to be
interesting without drowning us in details, and (of course!) because
the new features of Pizza improve the possible codings of the
algorithm in a number of ways.
The main advantage of coding the algorithm in Pizza rather than
Java is that the same code can sort arrays of many different types.
Here, the modularity arises from the use of first-class functions and
polymorphism. Our top-level sorting function will take two
parameters: not only the array to be sorted (as usual), but also a
function which compares two elements, returning true if the former is
less than (or equal to) the latter. Without the first-class
functions, one would have either a separate sorting routine for every
different sorting criterion, or else a single routine able to sort
only arrays of Object which have some comparison method. The
disadvantage of the former is clear, as large amounts of code would
need to be duplicated. In the latter, we have two main disadvantages.
The first disadvantage is simple: we cannot cast arrays of
primitive elements into arrays of Object, so we will still
require separate sorters for each primitive type.
The second disadvantage of sorting based on a comparison method of
Object's is that we will casts in the general comparison
method: objects which are (for example) strings must first be cast to
strings, and then compared using the string comparison function.
Casts have two serious problems. First, there will be an additional
time cost associated with the casting: since the comparison method is
called frequently, this penalty is quite serious. The second problem
is a loss of type safety: since the sorting function will accept
arrays of any objects at compile-time, and check that the
objects are of the proper type only at comparison-time, the program
may fail in the middle of execution. Since polymorphism is strongly
typed at compile-time, the Pizza compiler can detect immediately
whether a type error would occur at run-time, leading to a smaller
debugging cycle.
The first obvious departure from the ordinary Java language is the
use of a type parameter in the class
definition (red highlighting). The
type variable A indicates that each instantiation of
a member of the class may use a different type in that position, but
usage within each such object must be consistant. That is, if we
instantiate one object to sort arrays of integers, it will be a
detectable (compile-time) type error to apply that sorter to an array
of (say) characters. In this example, we will use single upper-case
letters as our type variables. Of course it is legal to use any valid
identifier name for the type variables, but the readability of
programs is improved if the identifiers for types and ordinary values
are distinctly written.
An object of the quicksorter class stores the function to be used
for comparing elements of the array under the
private name less. The type variable A is part
of this definition, since the type of this function may differ among
instantiations. However, for every instantiation to a type
M, the value of less will always have type
(M,M)-->boolean.
The main sorting routine shows how we use
the value of the stored functions (green
highlighting). Within the body of the class definition,
less refers to a function, so we can call it just like any
other function or static member. Note that, just like an
uninitialized value, calling a reference to an uninitialized function
causes an error. The remainder of the method is absolutely standard:
the only new aspect of the sorting loop is its use of the
stored function, rather than a single comparison predicate hardcoded
into the source code of the sorter.
The QuickSortTest class shows
how we may use the sorting methods. We use a few ``helper''
functions: less and greater compare two strings in
the usual ways, and showArray prints out the elements of an
array of strings. The main routine sorts strings given as
command-line arguments. We first sort the strings in increasing
order. To do so, we generate a new QuickSorter object
q (purple highlighting) with the
appropriate comparison function. Once we have built the new object,
we can use its unary sort method to sort the argument array.
The next few lines of the main routine
illustrate another way to use the sorting routines. Although we have
separated the task of choosing a sorting function from the task of
sorting the array in the implementation, there is no need to require
that these tasks be separated everytime we wish to sort. The static
method sort of two arguments combines the two steps is the
obvious manner, hiding the details. Here, we sort the array into
decreasing order.
A second type variable (purple
highlighting) appears in the definition of this static sort
method. Since there will not necessarily be an instantiated class
object when we call the static method, the type variable A
which we associate with the whole class will not be resolved. Since
different calls to the static method may of course sort arrays of
different types, we need a local type variable for the static
method. Even though each call to the method may involve arrays of
different types of elements, each call must give an array and function
whose types ``match'': the function must take pairs of values whose
type is the same as the elements of the array. The local
quantification of the type variable is shown by the
<B> tag before the name of the method.
Clearly this static sorting method is useful because of its
brevity, but the final bit of the example main method
shows why the two-step approach is useful as well. To re-sort the
array in increasing order, we can use the same object q
again. The reuse of this closure saves the resources
involved in creating an object and discarding it after the single call
of the static method.
R. Sedgewick, Algorithms in C,
Addison-Wesley, 1990.
The
Problem
The Java
Solution
The Pizza
Solution
This link summons the Pizza code for Quicksort to a new browser
window. You may even download the QuickSorter<A> source code.
Exercise
Rewrite the QuickSorter class so that it does not store the
comparison function: there should be only two methods, a public static
routine with the same argument and result types as the current static
sort method which calls a recursive, private static method
with the additional bounds arguments.
References
C.A.R. Hoare, Quicksort, Comput. J.,
5 (1962), p. 10-15. Any good textbook on data structures will
cover this algorithm; see for example Sedgewick (1990).