Codata as dual of data
Data
When we define a new type in Haskell, we use the data
keyword:
data Foo
= Foo1 Int String
| Foo2 Bool
This is essentially equivalent as defining two data constructors (they are legit functions):
Foo1 :: Int -> String -> Foo
Foo2 :: Bool -> Foo
This type of construct is called “data” by some people.
Usage of a data
Here’s an example of an inductively defined data structure.
data Tree = Leaf | Node Tree Int Tree
-- i.e.
Leaf :: Tree
Node :: Tree -> Int -> Tree -> Tree
What does it mean to use such data? It means we need to destruct the data by matching on all its constructors:
height :: Tree -> Int
Leaf = 0
height Node left _val right) = 1 + max (height left) (height right) height (
This types of function is called an “eliminator”.
Data eliminator and initial algebras
A data is an initial algebra we previously learned about. Take the above Tree
as an example, it’s defined as the initial algebra on the following functor.
data TreeF a = Leaf | Node a Int a
type Tree = Fix TreeF
You can review the my previous talk on initial algebra.
Then above defintion of height
can be expressed as catamorphism on an algebra of TreeF
.
heightAlg :: TreeF Int -> Int
Leaf = 0
heightAlg Node left _val right) = 1 + (max left right)
heightAlg (
height :: Fix TreeF -> Int
= cata heightAlg height
Codata
If we revert the arrows that constructs a data, we get two functions dual to the previous data constructors that have the shapes like this:
Bar1 :: Bar -> (Int, String)
Bar2 :: Bar -> Bool
Intuitively, Bar1
and Bar2
, instead of two ways to construct a Bar
from their components, they specifies two ways to use the Bar
. This is what we called codata.
Construction of codata
Here’s another example of useful codata:
type Stream = (Int, Stream)
head :: Stream -> Int
tail :: Stream -> Stream
Here’s an example on how to construct this codata.
startFrom :: Int -> Stream
= (n, startFrom (n + 1)) startFrom n
Terminal coalgebra and codata constructor
Above codata Stream
can be expressed as terminal coalgebra (fix-point) on the following
type StreamF a = (Int, a)
type Stream = Fix StreamF
Then the startFrom
constructor for Stream
will be an anamorphism on some coalgebra:
startFromCoalg :: Int -> StreamF Int
= (n, n + 1)
startFromCoalg n
startFrom :: Int -> Fix StreamF
= ana fromCoalg startFrom
General purpose codata
Church encoding of data types
One intriguing topic in functional programming is Church encoding. Church encoding shows us ways to encode any data and control constructs as purely lambda functions.
-> x -- 0
\f x -> f x -- 1
\f x -> f (f x) -- 2
\f x -> \f x -> f (n f x) -- succ :: Nat -> Nat
\n
-> x -- true
\x y -> y -- false
\x y -> \a b -> p b a -- not :: Bool -> Bool
\p -> p a b -- if :: Bool -> a -> a -> a
\p a b
-> z x y -- pair :: a -> b -> pair
\x y z -> p (\x y -> x) -- fst :: pair -> a \p
General eliminator of Bool
Let’s take a look in an eliminator for Bool
.
data Bool = True | False
type BoolC a = (a, a) -> a
elimBool :: Bool -> BoolC a
True (a, b) = a
elimBool False (a, b) = b elimBool
You may recognize that the BoolC
for elimBool
is equivalent to the Church encoding for Bool
. We will show their equivalence in next section.
You may also recognize that that elimBool
is the most general eliminator for type Bool
. In other words, every valid eliminator can be derived from this eliminator.
In fact, the general eliminator elimBool
is the catamorphism of type Bool
.
Isomorphism between Church encoding and the data
We will demonstrate that Bool
and BoolC
are indeed isomorphic:
from :: Bool -> BoolC a
= elimBool
from
to :: BoolC a -> Bool
= f (True, False) to f
It’s easy to prove that from . to = id
and to . from = id
so I’ll elaborate. So far we have shown that BoolC
is indeed a Church encoding for Bool
.
General eliminator for Tree
Let’s look at a more complex type:
data Tree = Leaf | Node Tree Int Tree
type TreeC a = a -> ((a, Int, a) -> a) -> a
= Tree -> TreeC a
elimTree Leaf f g = f
elimTree Node left n right) f g = g left' n right'
elimTree (where left' = elimTree f g left
= elimTree f g right right'
You may recognize that elimTree
is the catamorphism for Tree
. Also TreeC
is a legit Church encoding for Tree.
Visitor pattern on Tree
Now we learned how to find Church encoding and shown how Church encoding is isomorphic as the represented data type.
We can discover a pattern such that we can extract of all the ways we can eliminate a Tree
into a single entity. We will call this entity “TreeVisitor”.
type TreeVisitor a = (a, (a, Int, a) -> a)
visitLeaf :: TreeVisitor a -> a
visitNode :: TreeVisitor a -> (a, Int, a) -> a
As the name suggests, this pattern is just the Visitor pattern in OOP. Here I am using a pair to represent the type for TreeVisitor
, but the exact way to implement it doesn’t really matter.
The point of TreeVisitor a
being a codata type because the only thing we care about it is to be able to derive the two methods visitLeaf
and visitNode
.
Tree as Codata
A Tree can then be defined as the all possible TreeVisitor a -> a
instances (i.e. TreeC a
), as we already proved by showing the isomorphism between Tree
and TreeC
.
walk :: Tree -> (forall a. TreeVisitor a -> a)
This representation of Tree is also a codata because the actual underlying data structure of Tree is hidden from the outside, and the TreeVisitors already defined all the ways to access it.
References
This article is inspired by Codata in action, or how to connect Functional Programming and Object Oriente…
Codata in Action, by Paul Downen, Zachary Sullivan, et al. (Programming Languages and Systems)