Codata as dual of data
When we define a new type in Haskell, we use the
data Foo = Foo1 Int String | Foo2 Bool
This is essentially equivalent as defining two data constructors (they are legit functions):
Foo1 :: Int -> String -> Foo Foo2 :: Bool -> Foo
This type of construct is called “data” by some people.
Usage of a data
Here’s an example of an inductively defined data structure.
data Tree = Leaf | Node Tree Int Tree -- i.e. Leaf :: Tree Node :: Tree -> Int -> Tree -> Tree
What does it mean to use such data? It means we need to destruct the data by matching on all its constructors:
height :: Tree -> Int Leaf = 0 height Node left _val right) = 1 + max (height left) (height right)height (
This types of function is called an “eliminator”.
Data eliminator and initial algebras
A data is an initial algebra we previously learned about. Take the above
Tree as an example, it’s defined as the initial algebra on the following functor.
data TreeF a = Leaf | Node a Int a type Tree = Fix TreeF
You can review the my previous talk on initial algebra.
Then above defintion of
height can be expressed as catamorphism on an algebra of
heightAlg :: TreeF Int -> Int Leaf = 0 heightAlg Node left _val right) = 1 + (max left right) heightAlg ( height :: Fix TreeF -> Int = cata heightAlgheight
If we revert the arrows that constructs a data, we get two functions dual to the previous data constructors that have the shapes like this:
Bar1 :: Bar -> (Int, String) Bar2 :: Bar -> Bool
Bar2, instead of two ways to construct a
Bar from their components, they specifies two ways to use the
Bar. This is what we called codata.
Construction of codata
Here’s another example of useful codata:
type Stream = (Int, Stream) head :: Stream -> Int tail :: Stream -> Stream
Here’s an example on how to construct this codata.
startFrom :: Int -> Stream = (n, startFrom (n + 1))startFrom n
Terminal coalgebra and codata constructor
Stream can be expressed as terminal coalgebra (fix-point) on the following
type StreamF a = (Int, a) type Stream = Fix StreamF
startFrom constructor for
Stream will be an anamorphism on some coalgebra:
startFromCoalg :: Int -> StreamF Int = (n, n + 1) startFromCoalg n startFrom :: Int -> Fix StreamF = ana fromCoalgstartFrom
General purpose codata
Church encoding of data types
One intriguing topic in functional programming is Church encoding. Church encoding shows us ways to encode any data and control constructs as purely lambda functions.
-> x -- 0 \f x -> f x -- 1 \f x -> f (f x) -- 2 \f x -> \f x -> f (n f x) -- succ :: Nat -> Nat \n -> x -- true \x y -> y -- false \x y -> \a b -> p b a -- not :: Bool -> Bool \p -> p a b -- if :: Bool -> a -> a -> a \p a b -> z x y -- pair :: a -> b -> pair \x y z -> p (\x y -> x) -- fst :: pair -> a\p
General eliminator of Bool
Let’s take a look in an eliminator for
data Bool = True | False type BoolC a = (a, a) -> a elimBool :: Bool -> BoolC a True (a, b) = a elimBool False (a, b) = belimBool
You may recognize that the
elimBool is equivalent to the Church encoding for
Bool. We will show their equivalence in next section.
You may also recognize that that
elimBool is the most general eliminator for type
Bool. In other words, every valid eliminator can be derived from this eliminator.
In fact, the general eliminator
elimBool is the catamorphism of type
Isomorphism between Church encoding and the data
We will demonstrate that
BoolC are indeed isomorphic:
from :: Bool -> BoolC a = elimBool from to :: BoolC a -> Bool = f (True, False)to f
It’s easy to prove that
from . to = id and
to . from = id so I’ll elaborate. So far we have shown that
BoolC is indeed a Church encoding for
General eliminator for Tree
Let’s look at a more complex type:
data Tree = Leaf | Node Tree Int Tree type TreeC a = a -> ((a, Int, a) -> a) -> a = Tree -> TreeC a elimTree Leaf f g = f elimTree Node left n right) f g = g left' n right' elimTree (where left' = elimTree f g left = elimTree f g right right'
You may recognize that
elimTree is the catamorphism for
TreeC is a legit Church encoding for Tree.
Visitor pattern on Tree
Now we learned how to find Church encoding and shown how Church encoding is isomorphic as the represented data type.
We can discover a pattern such that we can extract of all the ways we can eliminate a
Tree into a single entity. We will call this entity “TreeVisitor”.
type TreeVisitor a = (a, (a, Int, a) -> a) visitLeaf :: TreeVisitor a -> a visitNode :: TreeVisitor a -> (a, Int, a) -> a
As the name suggests, this pattern is just the Visitor pattern in OOP. Here I am using a pair to represent the type for
TreeVisitor, but the exact way to implement it doesn’t really matter.
The point of
TreeVisitor a being a codata type because the only thing we care about it is to be able to derive the two methods
Tree as Codata
A Tree can then be defined as the all possible
TreeVisitor a -> a instances (i.e.
TreeC a), as we already proved by showing the isomorphism between
walk :: Tree -> (forall a. TreeVisitor a -> a)
This representation of Tree is also a codata because the actual underlying data structure of Tree is hidden from the outside, and the TreeVisitors already defined all the ways to access it.
This article is inspired by Codata in action, or how to connect Functional Programming and Object Oriente…
Codata in Action, by Paul Downen, Zachary Sullivan, et al. (Programming Languages and Systems)