## Codata as dual of data

### Data

When we define a new type in Haskell, we use the `data`

keyword:

```
data Foo
= Foo1 Int String
| Foo2 Bool
```

This is essentially equivalent as defining two data constructors (they are legit functions):

```
Foo1 :: Int -> String -> Foo
Foo2 :: Bool -> Foo
```

This type of construct is called “data” by some people.

### Usage of a data

Here’s an example of an inductively defined data structure.

```
data Tree = Leaf | Node Tree Int Tree
-- i.e.
Leaf :: Tree
Node :: Tree -> Int -> Tree -> Tree
```

What does it mean to *use* such data? It means we need to destruct the data by matching on all its constructors:

```
height :: Tree -> Int
Leaf = 0
height Node left _val right) = 1 + max (height left) (height right) height (
```

This types of function is called an “eliminator”.

### Data eliminator and initial algebras

A data is an initial algebra we previously learned about. Take the above `Tree`

as an example, it’s defined as the initial algebra on the following functor.

```
data TreeF a = Leaf | Node a Int a
type Tree = Fix TreeF
```

You can review the my previous talk on initial algebra.

Then above defintion of `height`

can be expressed as catamorphism on an algebra of `TreeF`

.

```
heightAlg :: TreeF Int -> Int
Leaf = 0
heightAlg Node left _val right) = 1 + (max left right)
heightAlg (
height :: Fix TreeF -> Int
= cata heightAlg height
```

### Codata

If we revert the arrows that constructs a data, we get two functions dual to the previous data constructors that have the shapes like this:

```
Bar1 :: Bar -> (Int, String)
Bar2 :: Bar -> Bool
```

Intuitively, `Bar1`

and `Bar2`

, instead of two ways to construct a `Bar`

from their components, they specifies two ways to use the `Bar`

. This is what we called codata.

### Construction of codata

Here’s another example of useful codata:

```
type Stream = (Int, Stream)
head :: Stream -> Int
tail :: Stream -> Stream
```

Here’s an example on how to construct this codata.

```
startFrom :: Int -> Stream
= (n, startFrom (n + 1)) startFrom n
```

### Terminal coalgebra and codata constructor

Above codata `Stream`

can be expressed as terminal coalgebra (fix-point) on the following

```
type StreamF a = (Int, a)
type Stream = Fix StreamF
```

Then the `startFrom`

constructor for `Stream`

will be an anamorphism on some coalgebra:

```
startFromCoalg :: Int -> StreamF Int
= (n, n + 1)
startFromCoalg n
startFrom :: Int -> Fix StreamF
= ana fromCoalg startFrom
```

## General purpose codata

### Church encoding of data types

One intriguing topic in functional programming is Church encoding. Church encoding shows us ways to encode any data and control constructs as purely lambda functions.

```
-> x -- 0
\f x -> f x -- 1
\f x -> f (f x) -- 2
\f x -> \f x -> f (n f x) -- succ :: Nat -> Nat
\n
-> x -- true
\x y -> y -- false
\x y -> \a b -> p b a -- not :: Bool -> Bool
\p -> p a b -- if :: Bool -> a -> a -> a
\p a b
-> z x y -- pair :: a -> b -> pair
\x y z -> p (\x y -> x) -- fst :: pair -> a \p
```

### General eliminator of Bool

Let’s take a look in an eliminator for `Bool`

.

```
data Bool = True | False
type BoolC a = (a, a) -> a
elimBool :: Bool -> BoolC a
True (a, b) = a
elimBool False (a, b) = b elimBool
```

You may recognize that the `BoolC`

for `elimBool`

is equivalent to the Church encoding for `Bool`

. We will show their equivalence in next section.

You may also recognize that that `elimBool`

is the most general eliminator for type `Bool`

. In other words, every valid eliminator can be derived from this eliminator.

In fact, the general eliminator `elimBool`

is the catamorphism of type `Bool`

.

### Isomorphism between Church encoding and the data

We will demonstrate that `Bool`

and `BoolC`

are indeed isomorphic:

```
from :: Bool -> BoolC a
= elimBool
from
to :: BoolC a -> Bool
= f (True, False) to f
```

It’s easy to prove that `from . to = id`

and `to . from = id`

so I’ll elaborate. So far we have shown that `BoolC`

is indeed a Church encoding for `Bool`

.

### General eliminator for Tree

Let’s look at a more complex type:

```
data Tree = Leaf | Node Tree Int Tree
type TreeC a = a -> ((a, Int, a) -> a) -> a
= Tree -> TreeC a
elimTree Leaf f g = f
elimTree Node left n right) f g = g left' n right'
elimTree (where left' = elimTree f g left
= elimTree f g right right'
```

You may recognize that `elimTree`

is the catamorphism for `Tree`

. Also `TreeC`

is a legit Church encoding for Tree.

### Visitor pattern on Tree

Now we learned how to find Church encoding and shown how Church encoding is isomorphic as the represented data type.

We can discover a pattern such that we can extract of all the ways we can eliminate a `Tree`

into a single entity. We will call this entity “TreeVisitor”.

```
type TreeVisitor a = (a, (a, Int, a) -> a)
visitLeaf :: TreeVisitor a -> a
visitNode :: TreeVisitor a -> (a, Int, a) -> a
```

As the name suggests, this pattern is just the *Visitor pattern* in OOP. Here I am using a pair to represent the type for `TreeVisitor`

, but the exact way to implement it doesn’t really matter.

The point of `TreeVisitor a`

being a codata type because the only thing we care about it is to be able to derive the two methods `visitLeaf`

and `visitNode`

.

### Tree as Codata

A Tree can then be defined as the all possible `TreeVisitor a -> a`

instances (i.e. `TreeC a`

), as we already proved by showing the isomorphism between `Tree`

and `TreeC`

.

`walk :: Tree -> (forall a. TreeVisitor a -> a)`

This representation of Tree is also a codata because the actual underlying data structure of Tree is hidden from the outside, and the TreeVisitors already defined all the ways to access it.

## References

This article is inspired by Codata in action, or how to connect Functional Programming and Object Oriente…

Codata in Action, by Paul Downen, Zachary Sullivan, et al. (Programming Languages and Systems)