topical media & game development

object-oriented programming

# Type abstraction


In this section we will study type calculi that allow us to express the various forms of polymorphism, including inclusion polymorphism (due to inheritance), parametric polymorphism (due to generics) and intersection types (due to overloading), in a syntactic way, by means of appropriate type expressions. The type calculi are based on the typed lambda calculus originally introduced in  [Ca84] to study the semantics of multiple inheritance. We will first study some simple extensions to the typed lambda calculus and then discuss examples involving universal quantification (defining parametric types), existential quantification (hiding implementation details) and bounded quantification (modeling subtypes derived by inheritance). For those not familiar with the lambda calculus, a very elementary introduction is given below. For each calculus, examples will be given to relate the insights developed to properties of the C++ type system.

#### The lambda calculus

The lambda calculus provides a very concise, yet powerful formalism to reason about computational abstraction. The introduction given here has been taken from  [Barend], which is a standard reference on this subject.

#### Lambda calculus -- very informal

$%l$

• variables, abstractor $%l$, punctuation $\left(,\right)$

#### Lambda terms -- $%L$

• $x \e %L$ (variables)
• $M \e %L => %l x. M \e %L$ (abstraction)
• $M \e %L$ and $N \e %L => M N \e %L$ (application)

slide: The lambda calculus -- terms

Syntactically, lambda terms are built from a very simple syntax, featuring variables, the abstractor $%l$ (that is used to bind variables in an expression), and punctuation symbols. Abstractors may be used to abstract a lambda term M into a function $%l x.M$ with parameter x. The expression $%l x.M$ must be read as denoting the function with body M and formal parameter x. The variable x is called the bound variable, since it is bound by the abstractor $%l$. In addition to function abstraction, we also have (function) application, which is written as the juxtaposition of two lambda terms. See slide lambda-terms.

Behaviorally, lambda terms have a number of properties, as expressed in the laws given in slide lambda-laws.

#### Laws

• $\left( %l x.M\right) N = M \left[ x := N \right]$ (beta conversion)
• $M = N => MZ = NZ$ and $ZM = ZN$
• $M = N => %l x.M = %l x.N$

slide: The lambda calculus -- laws

The most important rule is the beta conversion rule, which describes, in a manner of speaking, how parameter passing is handled. In other words, a function call, that is the application $\left(%l x.M\right)N$, results in the function body M in which N is substituted for x. The two other laws are the so-called extensionality axioms, which express how equality of lambda terms is propagated into application and function abstraction. These laws impose constraints upon the models characterizing the meaning of lambda terms.

#### Substitution

• $x\left[x:=N\right] == N$
• $y\left[x:=N\right] == y$ if $x != y$
• $\left(%l y.M\right)\left[x:=N\right] == %l y.\left(M\left[x:=N\right]\right)$
• $\left(M_1 M_2 \right) \left[x:=N\right]$ $==$ $\left(M_1 \left[x:=N\right]\right) \left(M_2\left[x:=N\right]\right)$

slide: The lambda calculus -- substitution

Substitution is defined by induction on the structure of lambda terms. A variable y is replaced by N (for a substitution $\left[x:=N\right]$) if y is x and remains y otherwise. A substitution $\left[x:=N\right]$ performed on an abstraction $%l y.M$ results in substituting N for x in M if x is not y. If x is identical to y, then y must first be replaced by a fresh variable (not occurring in M). A substitution performed on an application simply results in applying the substitution to both components of the application. See slide lambda-substitution. Some examples of beta conversion are given in slide lambda-examples. In the examples, for simplicity we employ ordinary arithmetical values and operators. This does not perturb the underlying $%l$-theory, since both values and operations may be expressed as proper $%l$-terms.
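The inductive definition above translates almost directly into code. The following sketch (in modern C++; the names Term, subst and the factory functions are ours, not from the text) implements the four substitution cases on a small lambda-term representation. The fresh-variable renaming for a clashing bound variable is omitted for brevity.

```cpp
#include <memory>
#include <string>

// A minimal lambda-term AST: variables, abstractions and applications.
struct Term {
    enum Kind { Var, Abs, App } kind;
    std::string name;                  // Var: the variable; Abs: the bound variable
    std::shared_ptr<Term> left, right; // Abs: body in left; App: function and argument

    static std::shared_ptr<Term> var(const std::string& x) {
        return std::make_shared<Term>(Term{Var, x, nullptr, nullptr});
    }
    static std::shared_ptr<Term> abs(const std::string& x, std::shared_ptr<Term> body) {
        return std::make_shared<Term>(Term{Abs, x, body, nullptr});
    }
    static std::shared_ptr<Term> app(std::shared_ptr<Term> f, std::shared_ptr<Term> a) {
        return std::make_shared<Term>(Term{App, "", f, a});
    }
};

// M[x := N], following the four cases of the slide; renaming of a
// clashing bound variable (the fresh-variable case) is not handled.
std::shared_ptr<Term> subst(const std::shared_ptr<Term>& m,
                            const std::string& x,
                            const std::shared_ptr<Term>& n) {
    switch (m->kind) {
    case Term::Var:
        return m->name == x ? n : m;                      // x[x:=N] == N, y[x:=N] == y
    case Term::Abs:
        if (m->name == x) return m;                       // x is re-bound: stop here
        return Term::abs(m->name, subst(m->left, x, n));  // push into the body
    case Term::App:
        return Term::app(subst(m->left, x, n),            // distribute over both parts
                         subst(m->right, x, n));
    }
    return m;
}
```

Applying subst to the term $x \left(%l x.x\right)$ replaces only the free occurrence of x, leaving the bound occurrence untouched.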

#### Examples


• $\left( %l x.x \right) 1 = x[x:=1] = 1$
• $\left( %l x.x+1 \right) 2 = (x+1)[x:=2] = 2 + 1$
• $\left( %l x.x+y+1 \right) 3 = (x+y+1)[x:=3] = 3+y+1$
• $\left( %l y.\left( %l x.x+y+1 \right) 3 \right) 4 = \left( \left( %l x.x+y+1 \right) 3 \right)[y:=4] = 3 + 4 + 1$


slide: Beta conversion -- examples

Note that the result of a substitution may still contain free variables (as in the third example) that may be bound in the surrounding environment (as in the fourth example). Lambda calculus may be used to state properties of functions (and other programming constructs) in a general way.
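The conversions of the slide can be replayed with present-day C++ lambda expressions (a feature not available in the C++ dialect discussed in this text); application is juxtaposition there and a call here. The free variable y of the third example is modeled as a parameter.

```cpp
// (\x.x) 1 = 1
inline int ex1() { return [](int x) { return x; }(1); }

// (\x.x+1) 2 = 2 + 1
inline int ex2() { return [](int x) { return x + 1; }(2); }

// (\x.x+y+1) 3 = 3 + y + 1 -- y is free, modeled here as a parameter
inline int ex3(int y) { return [y](int x) { return x + y + 1; }(3); }

// (\y.(\x.x+y+1) 3) 4 = 3 + 4 + 1 -- the surrounding binder supplies y
inline int ex4() {
    return [](int y) { return [y](int x) { return x + y + 1; }(3); }(4);
}
```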

#### Properties

• $\A M. \left(%l x.x\right) M = M$ (identity)
• $\A F \E X. F X = X$ (fixed point)

Proof: take $W = %l x.F \left( xx \right)$ and $X = WW$, then


$X = WW = \left( %l x.F \left( xx \right) \right) W = F \left( WW \right) = FX$


slide: The lambda calculus -- properties

Consider, for example, the statement that the identity function works for each lambda term as expected. The quantification with respect to M indicates that in each possible model for the lambda calculus (that respects the extensionality axioms given above) the identity $\left(%l x.x\right)M = M$ holds. See slide lambda-properties. As another example, consider the statement that each function F has a fixed point, that is a value X for which $FX = X$. The proof given above, however, does not give us any information concerning the actual contents of the fixed point, but merely proves its existence. In the following (see section self-reference) we will write $\Y\left(F\right)$ for the fixed point of a function F. In  [Barend], an extensive account is given of how to construct mathematical models for the lambda calculus. A semantics of our type calculus may be given in terms of such models; however we will not pursue this any further here.
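The fixed-point property can be made concrete in C++ with std::function. The sketch below (fix, factF and the specialization to int -> int are our assumptions) yields an X with $FX = X$ by unfolding F one step per call, and uses it to obtain factorial from a non-recursive functional.

```cpp
#include <functional>

typedef std::function<int(int)> Fun;

// fix(F) returns an X with the fixed-point property X = F(X);
// each call unfolds F one step, so the unfolding stays lazy.
Fun fix(std::function<Fun(Fun)> F) {
    return [F](int n) { return F(fix(F))(n); };
}

// factorial as the fixed point of a functional that is itself
// not recursive: the recursive call goes through the argument self
Fun factF() {
    return fix([](Fun self) -> Fun {
        return [self](int n) { return n <= 1 ? 1 : n * self(n - 1); };
    });
}
```

Note that the strict Y combinator of the proof would diverge in an eagerly evaluated language; wrapping the unfolding in a lambda postpones it, which is the usual workaround.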

## A simple type calculus

In our first version of a type calculus we will restrict ourselves to a given set of basic types (indicated by the letter $%r$) and function types (written $%s -> %t$, where $%s$ stands for the domain and $%t$ for the range or codomain). This version of the typed lambda calculus (with subtyping) is called $%l_{<=}$ in  [Pierce93], from which most of the material is taken. The $%l_{<=}$ calculus is a first order calculus, since it does not involve quantification over types. See slide 9-c-subtypes. The structure of type expressions is given by the definition

$%t ::= %r | %t_1 -> %t_2$

where we use $%t$ as a type identifier and $%r$ as a meta variable for basic types. The expressions of our language, which we indicate with the letter e, are similar to lambda terms, except for the typing of the abstraction variable in $%l x: %t.e$.

#### A simple type calculus -- subtypes

$%l_{<=}$

• $%t ::= %r | %t_1 -> %t_2$
• $e ::= x | %l x : %t . e | e_1 e_2$

#### Type assignment

• \ffrac{$%G |- x : %s$ \hspace{1cm} $%G |- e : %t$}{ $%G |- %l x : %s . e \e %s -> %t$ }
• \ffrac{$%G |- e_1 : %s -> %t, e_2 : %s$}{ $%G |- e_1 e_2 \e %t$ }

#### Refinement

• \ffrac{ $%G |- e : %s$ \hspace{1cm} $%G |- %s <= %t$}{ $%G |- e : %t$ }

slide: The subtype calculus

To determine whether an expression e is correctly typed (with some type expression $%t$) we need type assignment rules, as given above. Typing is usually based on a collection of assumptions $%G$, which contains the typing of expressions occurring in the expression for which we are determining the type. In the type assignment rules and the (subtyping) refinement rules, the phrase $%G |- e : %t$ means that the expression e has type $%t$, under the assumption that the type assignments in $%G$ are valid. When $%G$ is empty, as in $|- e : %t$, the type assignment holds unconditionally. Occasionally, we write $%G |- e \e %t$ instead of $%G |- e : %t$ for readability. These two expressions have identical meaning.

The premises of a type assignment rule are given above the line; the type assignment below the line states the assignment that may be made on the basis of these premises. For example, the first type assignment rule states that, assuming $%G |- x : %s$ (x has type $%s$) and $%G |- e : %t$ (e has type $%t$), then $%G |- %l x : %s. e \e %s -> %t$; in other words, the abstraction $%l x:%s.e$ may be validly typed as $%s -> %t$. Similarly, the second type assignment rule states that applying a function $e_1 : %s -> %t$ to an expression $e_2$ of type $%s$ results in an (application) expression $e_1 e_2$ of type $%t$.

We may assume the basic types denoted by $%r$ to include (integer) subranges, records and variants. As a consequence, we may employ the subtyping rules given in section subtypes to determine the subtyping relation between these types. The (subtyping) refinement rule repeated here expresses the substitutability property of subtypes, which allows us to consider an expression e of type $%s$, with $%s <= %t$, as being of type $%t$. In slide 9-ex-subtypes, some examples are given illustrating the assignment of types to expressions.

Type assignment may to a certain extent be done automatically, by type inference, as for example in ML, see  [ML90]. However, in general, typing is not decidable when we include the more powerful type expressions treated later. In those cases the programmer is required to provide sufficient type information to enable the type checker to determine the types.

#### Examples

• $S = %l x:Int. x + 1$
$S : Int -> Int$
• $twice = %l f:Int -> Int. %l y : Int . f \left( f \left( y \right) \right)$ $twice : \left( Int -> Int \right) -> Int -> Int$

#### Application

• $S 0 = 1 \e Int$
• $twice \left( S \right) = %l x. S \left( S x \right) \e Int -> Int$

slide: Subtypes -- examples

When we define the successor function S as $%l x : Int. x + 1$ then we may type S straightforwardly as being of type $Int -> Int$. Similarly, we may type the (higher order) function twice as being of type $\left(Int -> Int\right) -> Int -> Int$. Note that the first argument to twice must be a function. Applying twice to just a function argument results in a function; when applied to S it results in a function of type $Int -> Int$ that applies S twice to its (integer) argument. The subtyping rules (partly imported from section subtypes) work as expected. We may define, for example, a function $+ : Real \* Real -> Int$ as a subtype of $+ : Int \* Int -> Int$ (according to the contra-variant subtyping rule for functions).
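The contravariant rule itself can be coded over a small representation of types. The following is a sketch, with the single assumed basic-type rule $Int <= Real$; the names Type, basic, fun and subtype are ours.

```cpp
#include <memory>
#include <string>

// Types of the calculus: basic types (by name) or function types s -> t.
struct Type {
    std::string base;               // name of a basic type, or "" for a function type
    std::shared_ptr<Type> dom, rng; // set only for a function type
};
typedef std::shared_ptr<Type> TypePtr;

TypePtr basic(const std::string& n) { return TypePtr(new Type{n, nullptr, nullptr}); }
TypePtr fun(TypePtr s, TypePtr t)   { return TypePtr(new Type{"", s, t}); }

// s <= t : reflexive on basic types, plus the assumed rule Int <= Real,
// and for function types: contravariant domain, covariant range.
bool subtype(const TypePtr& s, const TypePtr& t) {
    if (!s->base.empty() && !t->base.empty())
        return s->base == t->base || (s->base == "Int" && t->base == "Real");
    if (s->dom && t->dom)
        return subtype(t->dom, s->dom) && subtype(s->rng, t->rng);
    return false;
}
```

With these definitions, $Real -> Int <= Int -> Int$ holds (the domain check runs in the reverse direction), mirroring the single-argument analogue of the example above.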

#### Subtyping in C++

Subtyping is supported in C++ only to a very limited extent. Function subtypes are completely absent. However, class subtypes due to derivation by inheritance may be employed. Also, built-in conversions are provided, some of which are in accordance with the subtyping requirements, and some of which, unfortunately, violate the subtyping requirements. Built-in conversions exist, for example, between double and int, in both directions. However, whereas the conversion from int to double is safe, the conversion the other way around may cause loss of information by truncation. The type system sketched in slide 9-c-subtypes is quite easily mapped to a C++ context. For example, we may mimic the functions S and twice as given in slide 9-ex-subtypes in C++ as:


int S(int x) { return x+1; }
int twice(int f(int), int y) { return f(f(y)); }
int twice_S(int y) { return twice(S,y); }


slide: Types in C++

Nevertheless, the type system of C++ imposes some severe restrictions. For example, functions may not be returned as a value from functions. (We may provide a workaround, however, by employing the $operator\left(\right)$ function for objects.) The absence of function subtyping becomes clear when, for example, we call the function twice with the function SD, which is defined as:


int SD(double x) { return x+1; } // twice(SD) rejected


slide: SD example

According to the subtyping rules, and in accordance with the substitutability requirement, we may employ SD wherever we may employ S. Not so in C++, however. We run into similar limitations when we try to refine an object class description following the object subtype refinement rules.
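The workaround mentioned above, employing the $operator\left(\right)$ function for objects, can be sketched as follows; the names Adder and make_S are ours.

```cpp
// A function object: instances behave as functions through operator().
class Adder {
public:
    Adder(int n) : _n(n) {}
    int operator()(int x) const { return x + _n; }
private:
    int _n;
};

// "returning a function from a function": S = \x.x+1 as an object
Adder make_S() { return Adder(1); }
```

An instance obtained from make_S may then be called as if it were the function S itself, even though a genuine function value cannot be returned.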

class C; // forward declaration

class P {    // P
public:
    P(P* s = 0) : _self(s) {}
    virtual P* self() {
        return _self ? _self->self() : this;
    }
    virtual void attach(C* p);  // defined below, once C is known
private:
    P* _self;
};

class C : public P {    // C <= P
public:
    C() : P(this), _self(0) {}
    C* self() {  // ANSI/ISO
        return _self ? _self->self() : this;
    }
    void attach(P* p) {  // rejected
        p->attach(self());
    }
    void redirect(C* c) { _self = c; }
private:
    C* _self;
};

void P::attach(C* p) { _self = p; }


slide: Subtyping in C++

Suppose we have a parent class P which offers the member functions self and attach, as in slide 9-cc-sub. The meaning of the function self is that it de-references the _self variable if it is non-zero and delivers this otherwise. (See section hush for an example of its use.) The function attach may be used to connect an instance of C to the _self variable. The class C in its turn inherits from P and redefines self and attach. Syntactically, both refinements are allowed, due to the function subtype refinement rules. The function self is redefined to deliver a more tightly specified result, and the attach function is allowed to take a wider range of arguments.

In a number of compilers for C++, both redefinitions are considered illegal. However, in the ANSI/ISO standard of C++, redefining a member function to deliver a subtype (that is, derived class) pointer is allowed. Redefining attach, as has been done for C, is probably not a wise thing to do, since it changes the semantics of attach as defined for the parent class P. In effect, it allows us to write $c->attach\left(p\right)$ instead of $p->attach\left(c->self\left(\right)\right)$, for $P* p$ and $C* c$. Nevertheless, from a type theoretical perspective, there seem to be no grounds for forbidding it.

## Intersection types

We define our second version of the typed lambda calculus ($%l_{/\}$) as an extension of the first version ($%l_{<=}$), an extension which provides facilities for (ad hoc) overloading polymorphism. Our extension consists of adding a type expression $\bigwedge \left[ %t_1,...,%t_n \right]$ which denotes a finite conjunction of types. Such a conjunction of types, which we will also write as $%t_1 /\ ... /\ %t_n$, is called an intersection type. The idea is that an expression e of type $\bigwedge \left[ %t_1,...,%t_n \right]$ is correctly typed if $e : %t_i$ for some i in $1..n$. This is expressed in the type assignment rule given in slide 9-c-intersection.

$%l_{/\}$

• $%t ::= %r | %t_1 -> %t_2 | \bigwedge \left[ %t_1 .. %t_n \right]$

#### Type assignment

• \ffrac{$%G |- e : %t_i$ $\left(i \e 1..n\right)$}{ $%G |- e : \bigwedge \left[ %t_1 .. %t_n \right]$ }

#### Refinement

• \ffrac{ $%G |- %s <= %t_i$ $\left(i \e 1..n \right)$}{ $%G |- %s <= \bigwedge \left[ %t_1 .. %t_n \right]$ }
• $\bigwedge \left[ %t_1 .. %t_n \right] <= %t_i$
• $%G |- \bigwedge \left[ %s -> %t_1 .. %s -> %t_n \right] <= %s -> \bigwedge \left[ %t_1 .. %t_n \right]$

slide: The intersection type calculus

The subtyping rule for intersection types states that any common subtype of the types occurring in the intersection type $\bigwedge \left[ %t_1,..., %t_n \right]$ is itself a subtype of the intersection type. In addition we have two subtyping rules without premises, the first of which says that the intersection type itself may be regarded as a subtype of any of its components. In other words, from a typing perspective an intersection type is equal to (hence may be replaced by) any of its component types. Also, we may refine a function with domain $%s$, which has an intersection type $\bigwedge \left[ %t_1,..., %t_n \right]$ as its range, into an intersection type consisting of functions $%s -> %t_i$ for $i = 1..n$.

Intersection types allow us to express a limited form of overloading, by enumerating a finite collection of possible types. Since the collection of types comprising an intersection type is finite, we do not need a higher order calculus here, although we might have used type abstraction to characterize intersection types.

#### Examples

• $+ : \bigwedge \left[ Int \* Int -> Int, Real \* Real -> Real\right]$
• $Int -> Int <= \bigwedge \left[ Int -> Int, Real -> Real \right]$
• $Msg -> Obj1 /\ Msg -> Obj2 <= Msg -> \bigwedge \left[ Obj1, Obj2 \right]$

slide: Intersection types -- examples

A typical example of an intersection type is presented by the addition operator, overloaded for integers and reals, which we may define as

$+ : \bigwedge \left[ Int \* Int -> Int, Real \* Real -> Real \right]$

According to our refinement rule, we may specialize an intersection type into any of its components. For example, when we have an intersection type defining a mapping for integers and a mapping for reals, we may choose the one that fits our purposes best. This example illustrates that intersection types may be an important tool for realizing optimizations that depend upon (dynamic) typing. Similarly, we may refine a generic function working on objects into a collection of (specialized) functions by dividing out the range type. See slide 9-ex-intersection. The resulting intersection type itself may subsequently be specialized into one of the component functions. In  [CGL93], a similar kind of type is used to model the overloading of methods in objects, that may but need not necessarily be related by inheritance. The idea is to regard message passing to objects as calling a polymorphic function that dispatches on its first argument. When the type of the first argument is compatible with multiple functions (which may happen for methods that are refined in the inheritance hierarchy) the most specific function component is chosen, that is the method with the minimal object type. A similar idea is encountered in CLOS, which allows for the definition of multi-methods for which dynamic dispatching takes place for all arguments. (A problem that occurs in modeling methods as overloaded functions is that the subtyping relation between methods no longer holds, due to the domain contravariance requirement. See  [CGL93] for a possible solution.)

Although C++ does not provide full support for subtyping, it does provide extensive support for function overloading. Given a collection of functions (overloading a particular function name), C++ employs a system of matches to select the function that is most appropriate for a particular call.

C++

• [1] no or unavoidable conversions -- $array->pointer$, $T -> const T$
• [2] integral promotion -- $char->int$, $short->int$, $float->double$
• [3] standard conversions -- $int->double$, $double->int$, $derived* -> base*$
• [4] user-defined conversions -- constructors and operators
• [5] ellipsis in function declaration -- ...

#### Multiple arguments -- intersect rule

• better match for at least one argument and at least as good a match for every other argument

Matches may involve built-in or user-defined conversions. The general rule underlying the application of conversions is that {\em conversions that are considered less error-prone and surprising are to be preferred over the others}. This rule is reflected in the ordering of the C++ overloading selection rules depicted in slide 9-cc-over.

According to the rules, the absence of conversions is to be preferred. For compatibility with C, array-to-pointer conversions are applied automatically, and T to $const T$ conversions are also considered unproblematic. Next, we have the integral promotion rules, allowing for the conversion of char to int and short to int, for example. These conversions are also directly inherited from C, and are safe in the sense that no information loss occurs. Further, we have the standard conversions such as int to double and $derived*$ to $base*$, user-defined conversions (as determined by the definition of one-argument constructors and conversion operators), and the ... ellipsis notation, which allows us to bypass type checking in an arbitrary manner.

For selecting the proper function from a collection of overloaded functions with multiple arguments, the so-called intersect rule is used, which selects the function with a better match for at least one argument and at least as good a match for every other argument. When no winner can be found because multiple candidate functions match equally well, the compiler issues an error, as in the example below:


void f(int, double);
void f(double, int);

f(1,2.0); // f(int, double);
f(2.0,1); // f(double,int);
f(1,1); // error: ambiguous


slide: example

The reason that C++ employs a system of matches based on declarations and actual parameters of functions is that the graph of built-in conversions (as inherited from C) contains cycles. For example, implicit conversions exist from int to double and double to int (although in the latter case the C++ compiler gives a warning). Theoretically, however, the selection of the best function according to the subtype relation would be preferable. However, the notion of best is not unproblematic in itself. For example, consider the definition of the overloaded function f and the classes P and C in slide 9-cc-best.


class P;
class C;

void f(P* p) { cout << "f(P*)"; } // (1)
void f(C* c) { cout << "f(C*)"; } // (2)

class P {
public:
    virtual void f() { cout << "P::f"; }  // (3)
};

class C : public P {
public:
    virtual void f() { cout << "C::f"; }  // (4)
};


slide: Static versus dynamic selection

What must be considered the best function f, given a choice between (1), (2), (3) and (4)?


P* p = new P; // static and dynamic P*
C* c = new C; // static and dynamic C*
P* pc = new C; // static P*, dynamic C*

f(p); // f(P*)
f(c); // f(C*)
f(pc); // f(P*)

p->f(); // P::f
c->f(); // C::f
pc->f(); // C::f

In the example given above, we see that for the functions f (corresponding to (1) and (2)) the choice is determined by the static type of the argument, whereas for the member functions f (corresponding to (3) and (4)) the choice is determined by the dynamic type. We have a dilemma. When we base the choice of functions on the dynamic type of the argument, the function subtype refinement rule is violated. On the other hand, adhering to the domain contravariance property seems to lead to ignoring the potentially useful information captured by the dynamic type of the argument.

## Bounded polymorphism

Our next extension, which we call $F_{<=}$, involves (bounded) universal quantification. For technical reasons we need to introduce a primitive type Top, which may be considered as the supertype of all types (including itself). Also we need type abstraction variables, which we will write as $%a$ and $%b$. Our notation for a universally quantified (bounded) type is $\A %a <= %s . %t$, which denotes the type $%t$ with the type variable $%a$ replaced by any subtype $%s\text{'}$ of $%s$. In a number of cases, we will simply write $\A %a. %t$, which must be read as $\A %a <= Top. %t$. Recall that any type is a subtype of Top. Observe that, in contrast to $%l_{<=}$ and $%l_{/\}$, the calculus $F_{<=}$ is second order (due to the quantification over types).

In addition to the (value) expressions found in the two previous calculi, $F_{<=}$ introduces a type abstraction expression of the form $%L %a <= %t.e$ and a type instantiation expression of the form $e\left[%t\right]$. The type abstraction expression $%L %a <= %t.e$ is used in a similar way as the function abstraction expression, although the abstraction involves types and not values. Similar to the corresponding type expression, we write $%L %a.e$ as an abbreviation for $%L %a <= Top.e$. The (complementary) type instantiation statement is written as $e\left[%t\right]$, which denotes the expression e in which the type identifier $%t$ is substituted for the type variable bound by the first type abstractor.

#### Bounded polymorphism -- abstraction

$F_{<=}$

• $%t ::= Top | %a | %r | %t_1 -> %t_2 | \A %a <= %t_1. %t_2$
• $e ::= x | %l x:%t.e | e_1 e_2 | %L %a <= %t.e | e \left[ %t \right]$

#### Type assignment

• \ffrac{$%G, %a <= %s |- e : %t$ }{ $%G |- %L %a <= %s. e \e \A %a <= %s . %t$ }
• \ffrac{$%G, e : \A %a <= %s . %t$ \hspace{1cm} $%G |- %s\text{'} <= %s$ }{ $%G |- e \left[ %s\text{'} \right] \e %t \left[ %a := %s\text{'} \right]$ }

#### Refinement

• \ffrac{ $%G |- %s <= %s\text{'}$ \hspace{1cm} $%G |- %t\text{'} <= %t$}{ $%G |- \A %a <= %s\text{'}.%t\text{'} <= \A %a <= %s.%t$ }

slide: The bounded type calculus

The type assignment rule for type abstraction states that, when we may type an expression e as being of type $%t$ (under the assumption that $%a <= %s$), then we may type $%L %a <= %s.e$ as being of type $\A %a <= %s.%t$. The type assignment rule for type instantiation characterizes the relation between type instantiation and substitution (which is notationally very similar). When we have an expression e of type $\A %a <= %s.%t$ and we have that $%s\text{'} <= %s$, then $e\left[%s\text{'}\right]$ is of type $%t \left[ %a := %s\text{'} \right]$, which is $%t$ with $%s\text{'}$ substituted for $%a$. See slide 9-c-bounded.

The refinement rule for bounded types states the subtyping relation between two bounded types. We have that $\A %a <= %s\text{'} . %t\text{'}$ is a subtype of $\A %a <= %s.%t$ whenever $%s <= %s\text{'}$ and $%t\text{'} <= %t$. Notice that the relation is contravariant with respect to the types bounding the abstraction, just as for the domains of function subtypes in the function subtyping rule.

In contrast to the polymorphism due to object type extensions and overloading, bounded polymorphism (employing type quantifiers) is an example of what we have called parametric polymorphism. In effect, this means that we must explicitly give a type parameter to instantiate an object or function of a bounded (parametric) type, similar to the way we use a template in C++. The examples given in slide 9-ex-parameters illustrate how we may define and subsequently type parametric functions. In these examples, we employ the convention that in the absence of a bounding type we assume Top as an upper limit. The examples are taken from  [CW85].

#### Examples

• $id = %L %a. %l x:%a.x$ $id : \A %a. %a -> %a$
• $twice1 = %L %a.%l f: \left( \A %b. %b -> %b \right). %l x:%a. f\left[%a\right]\left(f\left[%a\right]\left(x\right)\right)$
$twice1 : \A %a. \left( \A %b. %b -> %b \right) -> %a -> %a$
• $twice2 = %L %a.%l f: %a -> %a. %l x:%a. f\left(f\left(x\right)\right)$
$twice2 : \A %a. \left(%a -> %a\right) -> %a -> %a$

#### Applications

• $id \left[ Int \right] \left( 3 \right) = 3$
• $twice1 \left[ Int \right] \left( id \right)\left( 3 \right) = 3$
• $twice1 \left[ Int \right] \left( S \right) = illegal$
• $twice2 \left[ Int \right] \left( S \right)\left( 3 \right) = 5$

slide: Parametrized types -- examples

The (generic) identity function id is defined as $%L %a.%l x:%a.x$, which states that when we supply a particular type, say Int, we obtain the function $%l x:Int. x$. Since the actual type used to instantiate id is not important, we may type id as being of type $\A %a.%a -> %a$. In a similar way, we may define and type the two (generic) variants of the function twice. Notice the difference between the two definitions of twice. The first variant requires the function argument itself to be of a generic type, and hence fails (is incorrectly typed) for the successor function S, which is (non-generically) of type $Int -> Int$. In contrast, the second variant accepts S, and we may rely on the conversion of $id : \A %a. %a -> %a$ to $id \left[ Int \right] : Int -> Int$ (based on the second type assignment rule) to have it accept id as well.

The interplay between parametric and inclusion polymorphism is illustrated in the examples presented in slide 9-ex-quantification. Recall that inclusion polymorphism is based on the subtyping relation between records (which states that refinement of a record type involves the addition of components and/or refinement of components that already belong to the supertype).
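In C++, templates play the role of the type abstractor $%L$, and explicit instantiation such as id<int> mirrors the type application $id \left[ Int \right]$. A sketch of id and the second variant of twice (the names are ours):

```cpp
// The template parameter A plays the role of the type variable;
// id<int> corresponds to the instantiation id[Int].
template<class A> A id(A x) { return x; }

// twice2: the function argument has the already instantiated type A -> A
template<class A> A twice(A f(A), A x) { return f(f(x)); }

// the successor function S of type Int -> Int
int S(int x) { return x + 1; }
```

As in the calculus, twice accepts the non-generic S directly, and accepts id once it has been instantiated, as in twice<int>(id<int>, 3).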

#### Bounded quantification

• $g = %L %a <= \left\{ one : Int \right\}. %l x: %a. \left(x.one\right)$
$g : \A %a <= \left\{ one : Int \right\}. %a -> Int$
• $g\text{'} = %L %b . %L %a <= \left\{ one : %b \right\}. %l x: %a. \left(x.one\right)$
$g\text{'} : \A %b . \A %a <= \left\{ one : %b \right\}. %a -> %b$
• $move = %L %a <= Point. %l p:%a. %l d : Int .\left(p.x := p.x + d\right); p$
$move : \A %a <= Point. %a -> Int -> %a$

#### Application

• $g\text{'} \left[ Int \right]\left[ \left\{ one:Int, two : Bool \right\}\right]\left(\left\{ one = 3, two = true \right\}\right) = 3$
• $move \left[\left\{ x : Int, y : Int \right\}\right]\left( \left\{ x = 0, y = 0 \right\} \right)\left(1\right) = \left\{ x = 1, y = 0 \right\}$

slide: Bounded quantification -- examples

The first example defines a function g that works on a record with at least one component one and delivers as a result the value of the component one of the argument record. The function $g\text{'}$ is a generalized version of g that abstracts from the particular type of the one component. Notice that both g and $g\text{'}$ may be applied to any record that conforms to the requirement stated in the bound, such as the record $\left\{ one = 3, two = true \right\}$. As another example of employing bounds to impose requirements, look at the function move that is defined for subtypes of Point (which we assume to be a record containing x and y coordinates). It expects a record (that is similar to or extends Point) and an (integer) distance, and as a result delivers the modified record.
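Under the assumption that we render records as structs, the bound of move may be expressed structurally with a template: any type substituted for the parameter must offer an x member, the analogue of $%a <= Point$. The extension Point3 is our own illustration.

```cpp
struct Point  { int x, y; };
struct Point3 : Point { int z; };  // an assumed extension: Point3 <= Point

// A template rendering of move : forall a <= Point. a -> Int -> a;
// instantiation fails for any type lacking an x member (the bound).
template<class T> T* move(T* p, int d) { p->x += d; return p; }
```

Calling move on a Point3 returns a Point3 again, just as the bounded type promises that the result type equals the argument type, not merely Point.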

#### Discussion

Parametric polymorphism is an important means to incorporate subtyping in a coherent fashion. Apart from  [Pierce93], from which we have taken most of the material presented here, we may mention  [PA93] as a reference for further study. In  [Pierce93], a calculus $F_{/\}$ is also introduced, in which intersection polymorphism is expressed by means of an explicit type variable. The resulting type may be written as $\A %a \e \left\{ ... \right\}$, where $\left\{ ... \right\}$ denotes a finite collection of types. As already mentioned, intersection types may also be used to model inclusion polymorphism (see  [CGL93]). It is an interesting research issue to explore the relation between parametric polymorphism and inclusion polymorphism further along this line. However, we will not pursue this line here.

Instead, in the next section we will look at another application of parametric polymorphism, namely existential types, which allow us to abstract from hidden component types. This treatment is based on  [CW85]. In the last section of this chapter, we will look in more detail at the role of self-reference in defining (recursive) object types, following  [CoHC90]. We will conclude this chapter with some observations concerning the relevance of such type theories for actual programming languages. In particular, we will show that Eiffel is not type consistent.

#### Type abstraction in C++

Type abstraction in C++ may occur in various guises. One important means of type abstraction is to employ what we have called polymorphic base class hierarchies. For example, the function move, which was somewhat loosely characterized in slide 9-ex-quantification, may be defined in C++ as follows:


Point* move(Point* p, int d); // requires int Point::x
Point* move(Point* p, int d) { p->x += d; return p; }


slide: example move

In effect, the function move accepts a pointer to an instance of Point, or any class derived from Point, satisfying the requirement that it has a public integer data member x. Similar restrictions generally hold when instantiating a template class, but in contrast to base class subtyping requirements, these restrictions will only be verified at link time.


template< class T > // requires T::value()
class P {
public:
    P(T& r) : t(r) {}
    int operator==(P& p) {
        return t.value() == p.t.value();
    }
private:
    T& t;
};


slide: Type abstraction in C++

Consider the template class definition given in slide 9-cc-abs. Evidently, for the comparison function to operate properly, each instantiation type substituted for the type parameter T must satisfy the requirement that it has a public member function value.


template< class T >
class A {    // A<T>
public:
    virtual T value() = 0;
};

class Int : public A<int> {    // Int <= A<int>
public:
    Int(int n = 0) : _n(n) {}
    int value() { return _n; }
private:
    int _n;
};


slide: Type instantiation

Such a requirement may also be expressed by defining an abstract class A that declares a pure virtual member function value. See slide 9-cc-abs-2. The restrictions on instantiating P may then be stated informally as the requirement that each instantiation type T must be a subtype of A<X> for some type X. The class Int is an example of a type complying with the implicit requirements imposed by the definition of P. An example of using P is given below.


Int i1, i2;
P<Int> p1(i1), p2(i2);
if ( p1 == p2 ) cout << "OK" << endl; // prints OK


slide: Example P

Note, however, that the derivation from A<int> is by no means necessary, nor in any way enforced by C++.

(C) Æliens 04/09/2009

You may not copy or print any of this material without explicit permission of the author or the publisher. In case of other copyright issues, contact the author.