The Pizza Compiler
The Pizza language is an extension to Java with three new features: |
Pizza Tutorial
|
|
This section of the tutorial is an introduction to the features the
Pizza language adds to Java 1.1. We assume you know
at least the basics about Java programming.
The Pizza language is fully compatible
with the Java 1.1 language and thus you can compile all your Java
sources with the Pizza compiler without any changes.
We'll look first at the features Pizza adds to Java. Then we'll
start to combine them and see how much easier Java programming becomes
with Pizza.
But first of all you should know that all Pizza source files should
use the extension `.pizza'.
To illustrate a very simple example we'll write a class that stores
an element (source: StoreSomething.pizza):
You might say we could have done it using Java's Object class.
In that case we wouldn't be able to store a variable of Java's basic
types (as int). Furthermore we would be forced to use several
type casts to get the correct types when we use the get() method.
We can extend this class as any other Java class:
Ok, ok the StoreSomething class is not really helpfull. Let's
write a class to store two bits of data:
All the things we did to classes can also be done to interfaces. As an
example we consider this interface to sets of elements of a parameterised
type:
A common implementation of a set is a linked list storing the elements.
To implement the funcionality of a set we'll use two classes: EmptyList
to represent a list without any elements and NonEmptyList. Both
extend this abstract list class (source:
Polymorph.pizza):
As an example for the usage we consider this program which tests whether
its first command line argument is repeated in subsequent arguments.
So we set up an interface to wich we'll bind our parameter types:
Our implementation of trees consists again of an abstract class, a class
for empty trees and a class for non-empty trees (source: Bounded.pizza):
Sometimes, it is neccessary to also have polymorphic static functions.
Or, one wants a dynamic method which is polymorphic independently of any
type parameters. This can be done by prefixing the method return type with
a type-variable section:
With first class functions we can avoid the effort of wrapping classes
to make them implement a special interface. We define all the functions
we need to be passed as parameters to our methods. (Yes, we pass a function
as parameter to a function!)
The syntax for function types is:
Or, if exceptions are thrown by the function:
Any method with a matching signature may be substituted for a function
parameter. So a function boolean doIt(int, String) matches
(int, String) -> boolean or
int read(InputStream) throws IOException matches
(InputStream) throws IOException -> int.
As an example we consider the compareTo() method of the Java
class String. This method compares to Strings
lexically and case sensitve. Now we want to sort
Strings.
We want to change the behaviour of the sorting by providing the
function that compares to Strings:
The use of first class functions becomes much more important if you
are using parametric polymorphism.
So we go back to our tree example. We needed the method less which was
introduced via an interface. We now refine our class Tree to take
a function as parameter to the methods contains and insert:
Pizza's syntax to define an anonymous function is:
As a short example we consider an incrementer which will return
consecutive integer values on each call. We can easily instantiate
several counters of this kind:
We could also write incrementers with a startvalue as parameter:
The following function applies a given function
in infix order to all elements of a binary tree.
But if we want to examine a structure (like our List) from
outside we need to use type tests and type casts (like in the
traverse example in the last section).
Pizza gives us an easy to use syntax to avoid the complicated abstract
class/concrete cases construction: algebraic data types
In Pizza class can contain declarations of these forms:
To instantiate a tree with three elements we can write:
In the matching patterns we can replace parameters we don't want to
access with a single '_':
Parametric Polymorphism
Parametric Polymorphism gives us the possibility to write classes that
operate on data without specifing the data's type. (C++ programmers know
the template mechanism that does something similar but without typechecking.)
We read the source as a normal Java class StoreSomething that
takes the type parameter A. A can be any legal Java (or
Pizza) type. Whenever we write an A the Pizza compiler fills in
the type we specifiy at the point where we instantiate the class:
class StoreSomething<A> {
A something;
StoreSomething(A something) {
this.something = something;
}
void set(A something) {
this.something = something;
}
A get() {
return something;
}
}
Note that when allocating an Object of a parametric type, we don't need
to pass the instance type. The Pizza compiler infers this type from the
constructor arguments.
StoreSomething<String> a = new StoreSomething("I'm a string!");
StoreSomething<int> b = new StoreSomething(17+4);
b.set(9);
int i = b.get();
String s = a.get();
gives us a specialised class that only stores objects of type String.
class StoreString extends StoreSomething<String> {
StoreString(String something) {
super(something);
}
}
This gives us a parameterised class extending the other parameterised class.
class StoreSomething2<A> extends StoreSomething<A> {
StoreSomething2(A something) {
super(something);
}
void print() {
System.out.println(""+something);
}
}
This class becomes very handy, if you need a method to return two values:
public class Pair<A, B> {
public A fst;
public B snd;
public Pair(A fst, B snd) {
this.fst = fst;
this.snd = snd;
}
}
An because we know how handy it is, we integrated it to the standard Pizza
classes as class pizza.Pair!
Pair<String, int> getValues() {
return new Pair(aString, anInt);
}
The method include(A) shows us the sets defined by this interface
are persistent: Every time we include an element we get a new set
(it doesn't change the old one).
interface Set<A> {
boolean contains(A x);
Set<A> include(A x);
}
abstract class LinkedList<A> implements Set<A> {
public abstract boolean contains(A x);
public LinkedList<A> insert(A x) {
if (contains(x)) return this;
else return new NonEmptyList(x, this);
}
public Set<A> include(A x) {
return insert(x);
}
}
class EmptyList<A> extends LinkedList<A> {
public boolean contains(A x) {
return false;
}
}
class NonEmptyList<A> extends LinkedList<A> {
private A elem;
private LinkedList<A> rest;
public NonEmptyList(A elem, LinkedList<A> rest) {
this.elem = elem;
this.rest = rest;
}
public boolean contains(A x) {
return ((Object)elem).equals((Object)x) ||
rest.contains(x);
}
}
If this example doesn't look to handy to you: wait until you learn about
algebraic types! And if the code for method contains
looks a little bit strange to you, wait until you learn about Pizza's
typecasts!
public class TestForDuplicates {
public static void main(String[] args) {
Set<String> set = new EmptyList();
for (int i = 1; i < args.length; i++)
set = set.include(args[i]);
System.out.println(set.contains(args[0]));
}
}
Bounded Polymorphism
Our next task is to implement sets using a binary tree. To implement a
binary tree we have to know how to compare the elements. That's why we
have to set up a constraint on our parameter type - or to use the correct
expression we have to bind the type.
interface Comparable {
boolean equals(Object x);
boolean less(Object x);
}
And we limit the elements of our set to it:
interface Set<A implements Comparable> {
boolean contains(A x);
Set<A> include(A x);
}
This declaration means we can only use types as parameters to the interface
Set that implement the interface Comparable.
abstract class Tree<A implements Comparable>
implements Set<A> {
public abstract boolean contains(A x);
public abstract Tree<A> insert (A x);
public Set<A> include(A x) {
return insert(x);
}
}
class EmptyTree<A implements Comparable> extends Tree<A> {
public boolean contains(A x) {
return false;
}
public Tree<A> insert(A x) {
return new Branch(x, this, this);
}
}
class Branch<A implements Comparable> extends Tree<A> {
A elem;
Tree<A> left;
Tree<A> right;
public Branch(A elem, Tree<A> left, Tree<A> right) {
this.elem = elem;
this.left = left;
this.right = right;
}
public boolean contains(A x) {
if (x.less(elem))
return left.contains(x);
else if (elem.less(x))
return right.contains(x);
else // we assume total ordering, therefore x.equals(elem)
return true;
}
public Tree<A> insert(A x) {
if (x.less(elem))
return new Branch(elem, left.insert(x), right);
else if (elem.less(x))
return new Branch(elem, left, right.insert(x));
else // we assume total ordering, therefore x.equals(elem)
return this;
}
}
When you try to use this tree implementation you'll find out none of Java
classes implements our interface Comparable! To use these classes
with the tree we have to wrap it with a class implementing our interface
(however, Pizza gives us a way to do it without wrapping
classes):
class ComparableString implements Comparable {
private String s;
ComparableString(String s) {
this.s = s;
}
public boolean less(Object other) {
if (other instanceof ComparableString)
return s.compareTo(((ComparableString)other).s) < 0;
else
throw new Error("can't compare to non-strings");
}
}
To be able to include argument strings in sets, we need to wrap them in
a ComparableString. To show a sample usage of the tree impementation for
sets we use again our test-for-duplicates program:
public class TestForDuplicates {
public static void main(String[] args) {
Set<ComparableString> set = new EmptyTree();
for (int i = 1; i < args.length; i++)
set.include(new ComparableString(args[i]));
System.out.println(
set.contains(new ComparableString(args[0])));
}
}
F-bounded Polymorphism
In our implementation of less in the CompareableString
class you find again some nasty type checks that make the code less readable
and may result in exceptions during runtime. To avoid this the Comparable
interface can be refined with a type parameter. We have to refine the Set
interface to make sure these type parameters will be the same
(source: FBounded.pizza):
interface Comparable<A> {
boolean less(A x);
}
interface Set<A implements Comparable<A>> {
boolean contains(A x);
Set<A> include(A x);
}
Now ComparableString can be defined as follows:
Note that the type variable A appears recursively in its own bound in the
class definition of Set.
class ComparableString implements Comparable<ComparableString> {
private String s;
ComparableString(String s) { this.s = s; }
public boolean less(ComparableString other) {
return s.compareTo(other.s) < 0;
}
}
Polymorphic Functions
Type-variables which are type-parameters may only be used in dynamic methods,
blocks, and initializers. The reason is that for static code, there is no current
instance object, and hence no current instance-type for the type parameter.
class TreeUtils {
static <A implements Comparable<A>>
Tree<A> insertArray(Tree<A> tree, A[] elems) {
for (int i = 0; i < elems.length; i++)
tree = tree.insert(elems[i]);
return tree;
}
}
Implementing parametric types
Restrictions on parametric types
First class functions
First class functions let us use functions as special kinds of data types. We can
store a funcion in a variable and access it without any knowledge about
the class it's defined in. (If you know C++ you know about function pointers.)
(argtype, ..., argtype) -> resulttype.
(argtype, ..., argtype) throws exception, ..., exception -> resulttype.
To uses the compareTo method of the class String in
one case and a case insensitive version in another we
can now write:
class PrintSortedStrings {
static void print(String[] args, (String, String) -> int compare) {
....
if (compare(args[i], args[j]) > 0)
....
}
}
int myCompare(String s1, String s2) {
return s1.compareTo(s2);
}
int myCompareNoCase(String s1, String s2) {
return s1.toLowerCase().compareTo(s2.toLowerCase());
}
String[] myStrings = { "These", "Strings", "will", "be", "sorted", "!"};
PrintSortedStrings.print(myStrings, myCompare);
PrintSortedStrings.print(myStrings, myCompareNoCase);
These methods can now use the variable less to use it's function,
as if it was defined locally:
abstract class Tree<A> {
public abstract boolean contains((A,A)->boolean less, A x);
public abstract Tree<A> insert((A,A)->boolean less, A x);
}
public boolean contains((A, A) -> boolean less, A x) {
if (less(x, elem))
return left.contains(less, x);
else if (less(elem, x))
return right.contains(less, x);
else // we assume total ordering, therefore x.equals(elem)
return true;
}
We can now implement a tree class to store elements of all types without
any restrictions. Even Java's basic type can be used again, which was not
possible in the F-bounded version via the interface. (source: FirstClass.pizza)
To get a tree that stores Strings we can now extend the Branch
class and overload the methods contains and insert:
class Branch<A> extends Tree<A> {
A elem;
Tree<A> left;
Tree<A> right;
public Branch(A elem, Tree<A> left, Tree<A> right) {
this.elem = elem;
this.left = left;
this.right = right;
}
public boolean contains((A, A) -> boolean less, A x) {
if (less(x, elem))
return left.contains(less, x);
else if (less(elem, x))
return right.contains(less, x);
else // we assume total ordering, therefore x.equals(elem)
return true;
}
public Tree<A> insert((A, A) -> boolean less, A x) {
if (less(x, elem))
return new Branch(elem, left.insert(less, x), right);
else if (less(elem, x))
return new Branch(elem, left, right.insert(less, x));
else
return this;
}
}
As second example we implement mutable sets. (Our first implementation
had persistent sets.) (source: MutableSet.pizza)
class StringBranch extends Branch<String> {
StringBranch(String elem, Tree<String> left, Tree<String> right) {
super(elem, left, right);
}
private static boolean stringLess(String x, String y) {
return x.compareTo(y) > 0;
}
public boolean contains(String x) {
return contains(stringLess, x);
}
public Tree<String> insert(String x) {
return insert(stringLess, x);
}
}
We don't have to extend a class that uses first class functions as parameters
as we did with the StringTree class. In this example we define
the needed function locally to the use of the Set class (in the
next section you'll learn about a third way!):
interface Set<A> {
boolean contains(A x);
void include(A x);
}
class TreeSet<A> implements Set<A> {
private (A, A) -> boolean less;
Tree<A> elems = new EmptyTree();
public TreeSet((A, A) -> boolean less) {
this.less = less;
}
public boolean contains(A x) {
return elems.contains(less, x);
}
public void include(A x) {
elems = elems.insert(less, x);
}
}
public class TestForDuplicates {
static boolean stringLess(String x, String y) {
return x.compareTo(y) < 0;
}
public static void main(String[] args) {
Set<String> s = new TreeSet(stringLess);
for (int i = 1; i < args.length; i++)
s.include(args[i]);
System.out.println(s.contains(args[0]));
}
}
Anonymous Functions
Instead of defining a method belonging to a class Pizza allows you to
define functions locally. These functions don't have a name that's why
they are known as anonymous functions.
With anonumous functions we can recode our test-for-duplicates
example (same source: MutableSet.pizza):
fun(arguments) -> resulttype { statements }
We can declare inside other functions (even in other anonymous
functions). They may access all variables and methods visible to
normal methods at that point.
public class TestForDuplicatesAnonymous {
public static void main(String[] args) {
Set<String> s =
new TreeSet(
fun(String x, String y) -> boolean {
return x.compareTo(y) < 0;
}
);
for (int i = 1; i < args.length; i++)
s.include(args[i]);
System.out.println(s.contains(args[0]));
}
}
() -> int makeIncr() {
int count = 0;
return fun() -> int {
return count++;
}
}
As you see the return value is a function (with the signature
() -> int). Whenever we need a incrementer we call
this method.
() -> int incr1 = makeIncr();
() -> int incr2 = makeIncr();
System.out.println(
incr1() + " " + incr1() + " " + incr2() + " " + incr1());
This example will print 0 1 0 2.
() -> int makeIncr(int start) {
int count = start;
return fun() -> int {
return count++;
}
}
You have to understand where the parameter acutally goes, it is not a
parameter to the anonymous function!
We can use this method to sum up all elements of a tree of integers:
void <A> traverse(Tree<A> t, (A)->void proc) {
if (t instanceof Branch) {
Branch b = (Branch)t;
traverse(b.left, proc);
proc(b.elem);
traverse(b.right, proc);
}
}
The technique in traverse is one of the most commonly used
benefits of first class functions. You find numerous examples in Pizza's List class.
int sum(Tree<int> t) {
int n = 0;
traverse(t, fun(int x) -> void { n += x; });
return n;
}
Algebraic Data Types
In the list implementation of sets we used one abstract base class and
two classes to represent list fields. We often have this situation
representing a data abstraction by an abstract base class and some
subclasses which represent individual cases. Depending on the type of
the subclass we change the behaviour.
With this syntax our Tree class becomes:
case ident(arguments);
case ident;
This example defines one class Tree which provides the
variable Empty representing empty branches and a method
Branch(A, Tree<A>, Tree<A>) instantiating a
representation for a non-empty branch.
class Tree<A> implements Set<A> {
case Empty;
case Branch(A elem, Tree<A> left, Tree<A> right);
}
To avoid the need of using the classname Tree in every
instantiation Pizza allows us to import all cases of a class. Just as
an import from a Java package we write an import statement in
the file header (source: Algebraic.pizza):
Tree<int> t = Tree.Branch(2,
Tree.Branch(1, Tree.Empty, Tree.Empty),
Tree.Branch(3, Tree.Empty, Tree.Empty));
import Tree.*;
class Tree<A> {
case Empty;
case Branch(A elem, Tree<A> left, Tree<A> right);
}
public class Test {
public static void main(String[] args) {
Tree<int> t = Branch(2,
Branch(1, Empty, Empty),
Branch(3, Empty, Empty));
}
}
Pattern matching
Until now we only know how to create a tree, but we can access none of
the tree's data. Pizza internally generates inner classes to represent
algebraic cases of a class. In these classes the arguments we provide
in the case syntax become variables. We could access the element
stored in a branch this way:
This, again, seems very complicated in standard Java syntax with all
it's typechecks. Pizza provides us a very simple way to access
instances of an algebraic class similar to Java's switch
statement (source: Pattern.pizza).
Tree<String> t = ... ;
String s;
if (t instanceof Branch) {
s = (Branch)t.elem;
} else {
s = "(empty)";
}
class Tree<A> {
case Empty;
case Branch(A elem, Tree<A> left, Tree<A> right);
public boolean contains((A, A) -> boolean less, A x) {
switch (this) {
case Empty:
return false;
case Branch(A elem, Tree<A> l, Tree<A> r):
if (less(x, elem))
return l.contains(less, x);
else if (less(elem, x))
return r.contains(less, x);
else
return true;
}
}
public Tree<A> insert((A, A) -> boolean less, A x) {
switch (this) {
case Empty:
return Branch(x, Empty, Empty);
case Branch(A elem, Tree<A> l, Tree<A> r):
if (less(x, elem))
return Branch(elem, l.insert(less, x), r);
else if (less(elem, x))
return Branch(elem, l, r.insert(less, x));
else
return this;
}
}
}
As we see the switch (var) statement for algebraic types is very
powerful. It does two things for us: It selects the case corresponding
to the case of the instance of var. And it assigns all
variables in a constructed case with the corresponding fields (in our
example l matches left, because it's on the same
position in the argument list).
Tree<A> goRight() {
switch (this) {
case Empty:
return this;
case Branch(_, _, Tree<A> r):
return r.goRight();
}
}
Deep pattern matching
Pattern matching is not restricted to just one level of variables.
If you really want to understand it examine this complicated example:
static <A,B> List<C> mapPairs((A,B)->C f, List<A> xs, List<B> ys) {
List<C> result;
switch (Pair.Pair(xs, ys)) {
case Pair(List.Nil, _):
case Pair(_, List.Nil):
result = List.Nil;
break;
case Pair(List.Cons(A x, List<A> xs1),
List.Cons(B y, List<B> ys1)):
result = List.Cons(f(x, y), mapPairs(f, xs1, ys1));
break;
case _: System.out.println("case 2"); break;
}
return result;
}
Special case: Classes with a single constructor
The algebraic class syntax is so convenient that it's useful even if
there are no sub-cases. An example is the class Pair:
is a shorthand for
class Pair<A,B> {
case Pair(A fst, B snd);
}
If there's only one constructor, no separate class for the case will
be produced. Therefore, it's OK to use the same name Pair for
the class and the case. class Pair<A,B> {
A fst;
B snd;
Pair(A fst, B snd) {
this.fst = fst;
this.snd = snd;
}
}
Special case: Enums
By defining a class with only constant subcases we get enumeration types,
which are missing in Java (compared to C):
This is often preferable to using sets of integer constants since a
typecheck by the compiler is provided and you can't use an undefined
value.
class Color {
case Red;
case Green;
case Blue;
String toString() {
switch (this) {
case Red: return "Red";
case Green: return "Green";
case Blue: return "Blue";
}
}
}
Type casts in Pizza
To use parametric polymorphism even for Java's basic types we
sometimes need the ability to cast between objects and basic
types. Pizza provides typecasts from basic types to their boxing
classes and back:
However, as always typecasts can go wrong during runtime!
int basicX;
Integer integerX;
Object objectX;
basicX = (int)integerX;
basicX = (int)objectX;
integerX = (Integer)basicX;
objectX = (Object)basicX;
Tail Call Recursion
If you use too deep recursion the execution of your program might stop
with an StackOverflowException. The reason is all return
addresses, local variables and method parameters are stored on the
stack. There is a case in which the Pizza compiler can avoid the
enormous stack usage. If the last command in a recursively called
method is the call to itself we call it tail call recursion. In
this case you can tell the Pizza compiler to generate code that does
not use the stack. The method itself has to be marked with
continue and the recursive call to itself has to be done with
the goto statement (source: TailCall.pizza):
However, this approach results in code many times
slower than the normal solution via the stack. So you should use it
very carefully.
continue int veryRecursive(int howDeep) {
if (howDeep == 0) {
System.out.println("I'm there -- finally!");
return -1;
} else return goto veryRecursive(howDeep - 1);
}
Pizza's API Classes
To get into programming with Pizza you should have a closer look to
it's standard API classes
listed here. You'll find the documentation
to all the classes in your Pizza distribution in the directory
pizza/doc. These HTML pages are automatically generated by
Pizzadoc.
List<A>
Store lists of elements and can apply functions to the elements in
many ways. Offer functional-programming-like access to data.
ListBuffer<A>
Buffers to which elements can be appended in constant time, and which
can be converted to lists. Modelled after java.lang.StringBuffer.
Cell<A>
Reference cells holding a single mutable item of type A.
Pair<A,B>
Pairs of values.
Package pizza.util:
Enumeration<A>
An interface for type-safe enumeration, modelled after java.util.Enumeration.
Enumerator<A>
An abstract class which defines additional functionality for enumerations. E.g.
map, filter, fold, concat.
CompositeEnumerator<A>
EmptyEnumerator<A>
ListEnumerator<A>
SingletonEnumerator<A>
MapEnumerator<A>
SetEnumerator<A>Concrete implementations of enumerators.
Hashtable<Key,Value>
Type-safe hashtables, modelled after java.util.Hashtable.
Set<Elem>
(Mutable) sets.
Option<A>
Optional occurence: Either an A or a value for nothing.
Further Information
Pizza implements everything on top of Java. You can translate your
Pizza sources to Java sources by providing the command line parameter
-s to the compiler. All internally generated classes will be
written as Java sources to the class directory. This is the best way
to find out how Pizza creates these additional features.
Tutorial author: Enno Runne.