The simplest form for a toplevel function declaration is
let f (t->s) x -> e
in which the body of a function is formed by a single branch
x -> e of pattern matching. As we have seen in the previous sections, the body of a function may be formed by several branches with complex patterns.
The interface t->s specifies a constraint on the behavior of the function to be checked by the type system: when applied to an argument of type t, the function returns a result of type s.
In general the interface of a function may specify several such constraints, as the names3 example The general form of a toplevel function declaration is indeed:
let f (t1->s1;...;tn->sn) p1 -> e1 | ... | pm -> em
Such a function accepts arguments of type (t1|...|tn); it has all the types ti->si, and, thus, it also has their intersection (t1->s1)&...&(tn->sn)
The use of several arrow types in an interface serves to give the function a more precise type. We can roughly distinguish two different uses of multiple arrow types in an interface:
- when each arrow type specifies the behavior of a different piece
of code forming the body of the function, the compound interface
serves to specify the overloaded behavior of the
function. This is the case for the
let add ( (Int,Int)->Int ; (String,String)->String ) | (x & Int, y & Int) -> x+y | (x & String, y & String) -> x@y
where each arrow type in the interface refers to a different branch of the body.
- when the arrow types specify different behavior for the same code, then the compound interface serves to give a more precise description of the behavior of the function. An example is the function names4 from Section "First functions".
There is no clear separation between these two situations since, in general, an overloaded function has body branches that specify behaviors of different arrow types of the interface but share some common portions of the code.
A more complex example
Let us examine a more complex example. Recall the types used to represent persons that we defined in Section "Getting started" that for the purpose of the example we can simplify as follows:
type Person = FPerson | MPerson type FPerson = <person gender = "F">[ Name Children ] type MPerson = <person gender = "M">[ Name Children ] type Children = <children>[ Person* ] type Name = <name>[ PCDATA ]
We want to transform this representation of persons into a different representation that uses different tags <man> and <woman> instead of the gender attribute and, conversely, that uses an attribute instead of an element for the name. We also want to distinguish the children of a person into two different sequences, one of sons, composed of men (i.e. elements tagged by <man>), and the other of daughters, composed of women. Of course we also want to apply this transformation recursively to the children of a person. In practice, we want to define a function <split> of type Person ->(Man | Woman) where Man and Woman are the types:
type Man = <man name=String>[ Sons Daughters ] type Woman = <woman name=String>[ Sons Daughters ] type Sons = <sons>[ Man* ] type Daughters = <daughters>[ Woman* ]
Here is a possible way to implement such a transformation:
let fun split (MPerson -> Man ; FPerson -> Woman) <person gender=g>[ <name>n <children>[(mc::MPerson | fc::FPerson)*] ] -> (* the above pattern collects all the MPerson in mc, and all the FPerson in fc *) let tag = match g with "F" -> `woman | "M" -> `man in let s = map mc with x -> split x in let d = map fc with x -> split x in <(tag) name=n>[ <sons>s <daughters>d ] ;;
The function split is declared to be an overloaded function that, when applied to a MPerson, returns an element of type Man and that, when applied to a FPerson, returns an element of type Woman. The body is composed of a single pattern matching
<person gender=g>[ <name>n <children>[(mc::MPerson | fc::FPerson)*] ] ->
whose pattern binds four variables: g is bound to the gender of the argument of the function, n is bound to its name, mc is bound to the sequence of all children that are of type MPerson, and fc is bound to the sequence of all children that are of type FPerson.
On the next line we define tag to be `man or `woman according to the value of g.
let tag = match g with "F" -> `woman | "M" -> `man
Then we apply split recursively to the elements of mc and fc.
let s = map mc with x -> split x in let d = map fc with x -> split x in <(tag) name=n>[ <sons>s <daughters>d ] ;;
Here is the use of overloading: since mc is of type [MPerson*], then by the overloaded type of split we can deduce that s is of type [Man*]; similarly we deduce for d the type [Woman*]. From this the type checker deduces that the expressions <sons>s and <daughters> are of type Sons and Daughters, and therefore it returns for the split function the type (MPerson -> Man) & (FPerson -> Woman). Note that the use of overloading here is critical: although split has also type Person ->(Man | Woman) (since split is of type (MPerson->Man) & (FPerson->Woman), which is a subtype), had we declared split of that type, the function would not have type-checked: in the recursive calls we would have been able to deduce for s and for d the type [ (Man | Woman)* ], which is not enough to type-check the result. If, for example, we wanted to define the same transformation in XDuce we would need first to apply a filter (that is our transform) to the children so as to separate male from females (while in CDuce we obtain it simply by a pattern) and then resort to two auxiliary functions that have nearly the same definition and differ only on their type, one being of type MPerson -> Man, the other of type FPerson -> Woman. The same transformation can be elegantly defined in XSLT with a moderate nloc increase, but only at the expense of loosing static type safety and type-based optimizations.