# Fast Fourier transform: An ingenious way to multiply polynomials

https://log.lain.li/blog/fft/index.html
Posted on January 16, 2021
Tags: explanation note Categories: math

## Background

I used to hear many people say that the fast Fourier transform (FFT) is one of the most beautiful algorithms in the engineering world, but I never dove deep into it.

Then a few days ago I stumbled upon a wonderful YouTube video about FFT created by @Reducible. I took some notes while learning from that video, and figured they were probably fit for public consumption, so I decided to publish them.

If you haven’t watched the video yet, I highly recommend it:

Let’s begin the journey!

## Polynomial multiplication problem

How do we multiply two polynomials? A method we all learned in high school is to use the distributive law to expand the terms, and then combine the terms with the same exponent. Here’s an example:

$\begin{equation} \begin{split} (5x^2+4)(2x^2+x+1) & = 5x^2(2x^2+x+1) + 4(2x^2+x+1) \\ & = (10x^4+5x^3+5x^2) + (8x^2+4x+4) \\ & = 10x^4+5x^3+13x^2+4x+4 \\ \end{split} \end{equation}$
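The distributive-law expansion above is just a double loop over all coefficient pairs. Here is a minimal sketch (the helper name `poly_mult_naive` is mine), with coefficients stored lowest power first:

```python
def poly_mult_naive(p1, p2):
    # p[i] holds the coefficient of x**i (lowest power first)
    out = [0] * (len(p1) + len(p2) - 1)
    for i, a in enumerate(p1):
        for j, b in enumerate(p2):
            out[i + j] += a * b  # x**i * x**j contributes to x**(i+j)
    return out

# (5x^2 + 4)(2x^2 + x + 1) = 10x^4 + 5x^3 + 13x^2 + 4x + 4
print(poly_mult_naive([4, 0, 5], [1, 1, 2]))  # [4, 4, 13, 5, 10]
```

The nested loop over all term pairs is exactly what makes this quadratic in the degree.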

## Polynomial representation

If we want to implement the multiplication in a program, we first need to find a way to represent the polynomial.

For a polynomial of degree $$d$$, there are two common representations we can use:

1. a list of $$d+1$$ coefficients of the powers: $$[p_0, p_1, ..., p_d]$$ (coefficient representation)
2. a set of $$d+1$$ distinct points: $$\{(x_0, P(x_0)), (x_1, P(x_1)), ..., (x_d, P(x_d))\}$$ (value representation)

The value representation works because a polynomial of degree $$N$$ can be determined by $$N+1$$ distinct interpolation points. For example, two points determine a line, three points determine a parabola.

The coefficient representation is what we used in the multiplication example above. It’s straightforward and familiar. However, it’s not performant for polynomial multiplication: the time complexity is $$O(N^2)$$, where $$N$$ is the degree of the polynomial, because we need to multiply all possible pairs of terms.

The value representation, on the other hand, is much faster for multiplying polynomials. Assuming the two representations share the same $$\mathbf{x}$$ values, we can just pointwise multiply the $$P(\mathbf{x})$$ values, so the multiplication itself takes only $$O(N)$$ time.
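For instance, multiplying $$P(x)=2x+1$$ by $$Q(x)=4x+3$$ in value representation is a single pointwise pass. The shared sample points 0, 1, 2 below are an arbitrary choice for illustration; the product has degree 2, so three points suffice:

```python
xs = [0, 1, 2]                       # shared sample points
p = [2*x + 1 for x in xs]            # values of P(x) = 2x + 1
q = [4*x + 3 for x in xs]            # values of Q(x) = 4x + 3
pq = [a * b for a, b in zip(p, q)]   # values of (P*Q)(x): one O(N) pass
print(pq)  # [3, 21, 55]
```

These three points pin down the degree-2 product $$8x^2 + 10x + 3$$ uniquely.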

## Flowchart for polynomial multiplication

Now we can draw a diagram of how to perform fast multiplication for polynomials. The step that converts the coefficient representation into the value representation is where the fast Fourier transform (FFT) kicks in.

## Coefficient representation conversion

The most naive method to convert from the coefficient representation is just to plug some arbitrary values into the polynomial.

However, this brings us back to the original complexity: we need to evaluate each term of the polynomial at each $$x$$ value, which is $$O(N^2)$$ time with $$N$$ being the degree of the polynomial. Not good.
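The naive conversion can be sketched in a few lines (the helper name `evaluate_naive` is mine): each of the $$N$$ points costs $$O(N)$$ term evaluations, hence $$O(N^2)$$ overall:

```python
def evaluate_naive(coeffs, xs):
    # O(N) work per point, over N points: O(N^2) in total
    return [sum(c * x**k for k, c in enumerate(coeffs)) for x in xs]

# P(x) = 1 + 2x + 3x^2 + 4x^3 at three arbitrary points
print(evaluate_naive([1, 2, 3, 4], [0, 1, -1]))  # [1, 10, -2]
```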

This is where the fast Fourier transform shines: the FFT algorithm can do the same conversion in just $$O(N \log N)$$ time!

We will learn about the tricks FFT is based on and see if we can derive FFT ourselves.

## Polynomial evaluation

First, let’s see how we can improve the naive algorithm. Suppose we have a polynomial term of even power, $$5x^2$$ for example. If we carefully pick the $$\mathbf{x}$$ points to be symmetrical about the $$y$$ axis, we only need to perform half the computation, since once we know the term’s value at $$x$$, we automatically know its value at $$-x$$: $$P(-x) = P(x)$$. Similarly, for individual odd-degree terms, we have $$P(-x) = -P(x)$$.

With this trick, we can cut the number of computations needed by half. For each individual point $$x_i$$, to compute $$P(x_i)$$, we can split the calculation into $$P_e(x_i^2)$$ and $$x_i P_o(x_i^2)$$ which represent the even and odd degree terms, respectively.

For example, $$P(x) = 5x^5 + 6x^4 + 7x^3 + 8x^2 + 9x + 10$$ evaluated at $$x_i$$ can be written as $$P_e(x_i^2) + x_i P_o(x_i^2)$$, where $$P_e(x^2) = 6x^4 + 8x^2 + 10$$ and $$x P_o(x^2) = 5x^5 + 7x^3 + 9x$$, i.e. $$P_o(x^2) = 5x^4 + 7x^2 + 9$$.

Then we have:

$\begin{equation} \begin{cases} P( x_i) &= P_e(x_i^2) + x_i P_o(x_i^2) \\ P(-x_i) &= P_e(x_i^2) - x_i P_o(x_i^2) \end{cases} \end{equation}$

In which $$P_e(x_i^2)$$ and $$P_o(x_i^2)$$ can be calculated only once and used twice.
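One level of this split can be sketched as follows (the helper name `eval_pair` is mine); the even/odd sums are computed once and reused for both $$P(x_i)$$ and $$P(-x_i)$$:

```python
def eval_pair(coeffs, x):
    # One split level: P(x) = P_e(x^2) + x * P_o(x^2), shared between x and -x.
    pe = sum(c * x**(2*k) for k, c in enumerate(coeffs[0::2]))  # P_e(x^2)
    po = sum(c * x**(2*k) for k, c in enumerate(coeffs[1::2]))  # P_o(x^2)
    return pe + x * po, pe - x * po  # (P(x), P(-x))

# P(x) = 5x^5 + 6x^4 + 7x^3 + 8x^2 + 9x + 10, lowest power first
print(eval_pair([10, 9, 8, 7, 6, 5], 2))  # (372, -96)
```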

From here comes the recursive step. $$P_e$$ and $$P_o$$ are each a polynomial of degree about $$d/2$$, with only half of the terms. We can apply the algorithm recursively, halving the size each time, until we reach the base case $$d=0$$. There we have just one term - the constant term - which we can return as the evaluation result.

Except there is a blocker in the recursion step. The “divide” step relies on the assumption that we can always choose two sets of points $$\mathbf{x_i}$$ and $$-\mathbf{x_i}$$. This only holds for the first iteration: in the second iteration we get $$x_i^2$$ and $$(-x_i)^2$$, which are normally both positive, so we can no longer split the polynomial further - well, unless we bring complex numbers into play.

## Constraints for the sample points

Let’s list out the requirements on the $$N$$ sampling points for a polynomial of degree $$N-1$$. From now on, for simplicity, we will only talk about polynomials of degree $$2^k-1$$ where $$k\in\mathbb{Z}^{*}$$.

In the first iteration, we have $$N$$ points, half of them need to be negation of the other half. Let’s note down the conditions.

$\label{cond1} x_{\frac{N}{2}+i} = -x_i\quad\text{for } i \in 0,\ldots,\frac{N}{2}-1$

Note that we only need to compute half of the point set.

Then in the next iteration, we will be passing the squared values $$x_i^2$$ to $$P_e$$ and $$P_o$$, and half of those must again be negations of the other half:

$\label{cond2} x_{\frac{N}{4}+i}^2 = -x_i^2\quad\text{for }i \in 0,\ldots,\frac{N}{4}-1$

Starting from these two conditions, we can inductively deduce the restrictions on all the $$x$$ values.

The base case for a polynomial of degree $$N-1$$ is $$x_0^{N}$$, which we can assign to be $$x_0^N = 1$$. This value specifies $$x_0^{\frac{N}{2}}$$ and $$x_1^{\frac{N}{2}}$$ to be the two square roots of $$x_0^N$$, which are negations of each other. In the next iteration, each of these two values in turn specifies four more values, $$x_i^{\frac{N}{4}}$$ for $$i\in \{0,1,2,3\}$$, and so on, until we hit the case $$\frac{N}{2^k} = 1$$ and obtain all the plain $$x_i^{\frac{N}{2^k}}=x_i$$ values.

Look at the second iteration, where we acquired the constraint that $$x_0^{\frac{N}{2}}$$ and $$x_1^{\frac{N}{2}}$$ are the square roots of $$x_0^N=1$$: one must be $$1$$ and the other $$-1$$, but they can come in either order. The same kind of choice appears in all later iterations.

The trick is to take $$x_i$$ to be the $$i$$-th of the $$N$$-th roots of unity:

$\begin{equation} x_i=e^{2\pi j\frac{i}{N}} \end{equation}$

where $$j=\sqrt{-1}$$. We can verify that these points satisfy the constraints we wanted: $$x_0^N=e^{0}=1$$; then $$x_0^{N/2} = e^{0} = 1$$ and $$x_1^{N/2}=e^{\pi j} = -1$$ are the square roots of $$x_0^N$$; and so on.

## Symmetrical properties of sample points

If we plot the points $$e^{2\pi j \frac{i}{N}}$$ for all $$i$$ on the complex plane, we find that they reside on a circle, equally spaced apart. Here’s a graph for $$N=8$$. The $$x$$ points are arranged in counter-clockwise order, starting from $$x_0=1$$.

We can see that these points are symmetrical about the origin - that is to say, every point $$x_i$$ has a counterpart $$x_{N/2+i}=-x_i$$, which is the reflection of $$x_i$$ about the origin. This property is exactly what we wanted.

In the following iterations, we square half of the $$x_i$$ values. Squaring a unit complex number is the same as doubling its angle counter-clockwise. So each iteration essentially re-fills the circle with half as many points, spaced twice as far apart as before. In the last iteration, the result is just the single point $$1$$.
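Both properties are easy to check numerically; here is a quick sanity sketch for $$N=8$$:

```python
import cmath

N = 8
xs = [cmath.exp(2j * cmath.pi * i / N) for i in range(N)]

# symmetry about the origin: x_{N/2+i} = -x_i
assert all(abs(xs[N//2 + i] + xs[i]) < 1e-9 for i in range(N//2))

# squaring the first half yields the (N/2)-th roots of unity,
# i.e. the sample points for the next recursion level
half = [cmath.exp(2j * cmath.pi * i / (N//2)) for i in range(N//2)]
assert all(abs(x**2 - h) < 1e-9 for x, h in zip(xs, half))
print("both symmetry properties hold for N = 8")
```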

## Constructing the algorithm

Now that we have learned how to pick the sample points, let’s formalize the algorithm.

The FFT algorithm should take two arguments, a list of coefficients representing the polynomial, and a list of sampling points. The output is a list of $$P(x_i)$$ values corresponding to each point $$x_i$$. $$N$$ is represented by the length of the coefficient list (or the number of points, which is the same anyway).

The simplest case is $$N=1$$, where we have to calculate $$\operatorname{FFT}(P=[c_0], X=[1])$$, which is the same as evaluating $$P(x)=c_0$$ at $$x=1$$. The result is trivial - we just return $$Y=[c_0]$$.

The next simplest case is $$N=2$$, where we have $$\operatorname{FFT}(P=[c_0, c_1], X=[1, -1])$$. This polynomial is easy to calculate on its own. Although the recursive algorithm applies to this case, it’s not very representative for demonstration purposes, so we will skip it and assume it works.

The next one is $$N=4$$, where we have $$\operatorname{FFT}(P=[c_0, c_1, c_2, c_3], X=[1, j, -1, -j])$$, that is, evaluating $$c_0 + c_1 x + c_2 x^2 + c_3 x^3$$ at the given $$x_i$$ points.

We first split $$P(x)$$ into even and odd components: $$P_e(x^2)+xP_o(x^2) = (c_0+c_2 x^2) + x(c_1+c_3 x^2)$$. This gives us two smaller polynomials for the recursion step: $$P_e=[c_0, c_2]$$ and $$P_o = [c_1, c_3]$$.

Now let’s see what $$x_i$$ values we need to provide in the recursion step. The whole point of the algorithm is to save half of the calculation by exploiting the even/odd polynomials, so we only need to evaluate these polynomials for the first half of the points, $$[1, j]$$. Note that their arguments are not $$x_i$$ but $$x_i^2$$, so we actually pass $$X=[1^2, j^2]=[1, -1]$$ to both the even and odd polynomials. We are left with evaluating two expressions, $$Y_e = \operatorname{FFT}(P=[c_0, c_2], X=[1, -1])$$ and $$Y_o=\operatorname{FFT}(P=[c_1, c_3], X=[1, -1])$$. This reduces our problem of size $$N=4$$ to two problems of size $$N=2$$.

Now comes the final part: after computing $$Y_e$$ and $$Y_o$$, we need to combine them into the final $$Y$$ values. Given that $$x_{N/2+i}=-x_i$$ for $$i \in 0,\ldots,\frac{N}{2}-1$$, and the nature of even/odd polynomials, we know that $$P_e(x_{N/2+i}^2)=P_e(x_i^2)$$ and $$x_{N/2+i} P_o(x_{N/2+i}^2) = -x_i P_o(x_i^2)$$. Also, $$P(x_i)=P_e(x_i^2) + x_i P_o(x_i^2)$$. This gives us the way to combine $$Y_e$$ and $$Y_o$$: $$y_i = y_{e,i} + x_i y_{o,i}$$ and $$y_{N/2+i} = y_{e,i} - x_i y_{o,i}$$ for $$i \in 0,\ldots,\frac{N}{2}-1$$.

Now we can observe another invariant: for every recursion step, the $$X$$ values are fixed. In other words, the values of $$X$$ depend only on $$N$$, which can be deduced from the length of the coefficient list. For $$N=1$$, we always have $$X=[1]$$; for $$N=2$$, we always have $$X=[1,-1]$$; for $$N=4$$, we always have $$X=[1,j,-1,-j]$$. This follows from our reasoning in the last section about squaring roots of unity. As a result, we no longer need to pass this argument to the FFT procedure explicitly.

## Implementing in code

To implement the algorithm in code, we basically just transcribe the steps described above. Note that we represent $$x_i=e^{2 \pi j \frac{i}{N}}$$ as $$w^i$$ where $$w=e^{\frac{2 \pi j}{N}}$$.

import math
import cmath

def fft(p):
    n = len(p)
    if n == 1:
        return p
    w = cmath.exp(2*math.pi*1j/n)
    y_e = fft(p[0::2])
    y_o = fft(p[1::2])
    y = [None] * n
    for i in range(n//2):
        y[i]      = y_e[i] + y_o[i] * w**i
        y[i+n//2] = y_e[i] - y_o[i] * w**i
    return y

Let’s verify if the result is correct.

def p(x):
    return 1 + 2*x + 3*x**2 + 4*x**3

print(fft([1,2,3,4]))
print([p(1), p(1j), p(-1), p(-1j)])
[(10+0j), (-2-2j), (-2+0j), (-1.9999999999999998+2j)]
[10, (-2-2j), -2, (-2+2j)]


Ignoring the round-off error, we can see the two results are the same.

## FFT operation as a matrix

The naive method of evaluating a polynomial of degree $$N-1$$ at $$N$$ sampling points is to calculate the $$N$$ values $$P(x_i)$$ for $$i = 0, 1, \ldots, N-1$$. To calculate each $$P(x_i)$$, we sum up the values of all the terms:

$\begin{equation} P(x_i) = \sum_{k=0}^{N-1} x_i^kc_k \end{equation}$

where $$c_k$$ is the k-th coefficient of the polynomial. This means we can construct an $$N$$ by $$N$$ matrix with each element to be $$W_{i,k} = x_i^k$$.

$\begin{equation} W = \begin{pmatrix} 1 & x_0 & x_0^2 & \cdots & x_0^m \\ 1 & x_1 & x_1^2 & \cdots & x_1^m \\ 1 & x_2 & x_2^2 & \cdots & x_2^m \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & x_m & x_m^2 & \cdots & x_m^m \\ \end{pmatrix} \end{equation}$

where $$m = N-1$$. If we plug in our chosen sampling points $$x_i = w^i$$, where $$w = e^{\frac{2\pi j}{N}}$$, we get this matrix:

$\begin{equation} W = \begin{pmatrix} 1 & 1 & 1 & \cdots & 1 \\ 1 & w & w^2 & \cdots & w^m \\ 1 & w^2 & w^4 & \cdots & w^{2m} \\ 1 & w^3 & w^6 & \cdots & w^{3m} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & w^m & w^{2m} & \cdots & w^{m^2} \\ \end{pmatrix} \end{equation}$

Given a vector of coefficient $$P=[c_0, c_1, \ldots, c_m]$$, the result of $$Y = WP$$ is just the sampled values we wanted.

In fact, this matrix is called the discrete Fourier transform matrix (DFT matrix) for this exact reason. The FFT algorithm above is just a more efficient way to perform the multiplication with this matrix.
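We can check this correspondence directly by building the DFT matrix and multiplying it with the coefficient vector (a plain-Python sketch; the helper names `dft_matrix` and `dft` are mine):

```python
import cmath

def dft_matrix(n):
    w = cmath.exp(2j * cmath.pi / n)
    return [[w**(i * k) for k in range(n)] for i in range(n)]

def dft(coeffs):
    # Y = W P: a plain O(N^2) matrix-vector product
    return [sum(wik * c for wik, c in zip(row, coeffs))
            for row in dft_matrix(len(coeffs))]

print(dft([1, 2, 3, 4]))  # matches fft([1, 2, 3, 4]) up to round-off
```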

Correspondingly, the technique to calculate the inverse fourier transform is to find the inverse matrix. And interestingly, the inverse DFT matrix looks very similar to the DFT matrix!

$\begin{equation} W^{-1} = \frac{1}{N}\begin{pmatrix} 1 & 1 & 1 & \cdots & 1 \\ 1 & w^{-1} & w^{-2} & \cdots & w^{-m} \\ 1 & w^{-2} & w^{-4} & \cdots & w^{-2m} \\ 1 & w^{-3} & w^{-6} & \cdots & w^{-3m} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & w^{-m} & w^{-2m} & \cdots & w^{-m^2} \\ \end{pmatrix} \end{equation}$

The fact that the inverse DFT matrix is so similar to the DFT matrix gives us a way to compute the inverse fast Fourier transform. To convert our FFT algorithm into an inverse FFT algorithm, we only need two changes:

1. replace all occurrences of $$w$$ with $$w^{-1}$$;
2. multiply the final result by $$\frac{1}{N}$$.

## Implementation of Inverse FFT algorithm

And - here we have it.

def _ifft(p):
    n = len(p)
    if n == 1:
        return p
    w = cmath.exp(-2*math.pi*1j/n)
    y_e = _ifft(p[0::2])
    y_o = _ifft(p[1::2])
    y = [None] * n
    for i in range(n//2):
        y[i]      = y_e[i] + y_o[i] * w**i
        y[i+n//2] = y_e[i] - y_o[i] * w**i
    return y

def ifft(p):
    n = len(p)
    return [x/n for x in _ifft(p)]

Now let’s check if it’s indeed the inverse of the fft function.

print(ifft(fft([1,2,3,4])))
print(fft(ifft([1j,2j+1])))
[(1+0j), (2-5.721188726109833e-18j), (3+0j), (4+5.721188726109833e-18j)]
[1j, (1+2j)]


The inverse does seem to work, ignoring the round-off errors.

I tried to understand the IFFT algorithm in a similar way, by exploiting the symmetry of even and odd polynomials, but I couldn’t find a way to make sense of it. The algorithm itself still works like magic - I can’t explain it with a deeper understanding. I’m sorry if you were expecting that.

## Polynomial multiplication

Now that we have all the components in the flowchart, we can finally implement fast polynomial multiplication.

def pad_radix2(xs, n):
    b = n.bit_length() - 1
    if n & (n-1):  # n is not a power of 2: round up
        b += 1
    n = 2 ** b
    return xs + [0] * (n - len(xs))

def poly_mult(p1, p2):
    max_n = max(len(p1), len(p2)) * 2
    y1, y2 = fft(pad_radix2(p1, max_n)), fft(pad_radix2(p2, max_n))
    y3 = [a * b for a, b in zip(y1, y2)]
    return ifft(y3)

# calculate (2x+1)(4x+3)
print(poly_mult([1,2], [3,4]))
[(3+8.881784197001252e-16j), (10-3.599721149882556e-16j), (8-8.881784197001252e-16j), 3.599721149882556e-16j]


3,10,8 - we just calculated that $$(2x+1)(4x+3)=8x^2 + 10x + 3$$. The algorithm worked!

## Summary

In this article we studied how to make use of FFT algorithm to compute polynomial multiplication in $$O(N \log N)$$ time. We mainly studied how the ingenious tricks work together to make FFT algorithm concise and elegant, and finally implemented the multiplication algorithm in code.

This article is mainly my personal note on Reducible’s fantastic video The Fast Fourier Transform (FFT): Most Ingenious Algorithm Ever? on Youtube. Huge thanks to Reducible for the presentation.

Some other sources that are helpful for my understanding:

• Source code of fft function in sympy. I learned how to quickly pad up a list to radix-2 length.
• jakevdp’s post “Understanding the FFT Algorithm”. I learned the trick numpy uses to make it much faster than my implementation.
• This math stackexchange answer. It resolved my confusion on the discrepancy of the formula on Wikipedia and the formula used in the video.
# What I learned about Codata

https://log.lain.li/blog/codata/index.html
Posted on December 9, 2020
Tags: explanation Categories: category theory haskell

## Codata as dual of data

### Data

When we define a new type in Haskell, we use the data keyword:

data Foo
  = Foo1 Int String
  | Foo2 Bool

This is essentially equivalent to defining two data constructors (which are legitimate functions):

Foo1 :: Int -> String -> Foo
Foo2 :: Bool -> Foo

This type of construct is called “data” by some people.

### Usage of a data

Here’s an example of an inductively defined data structure.

data Tree = Leaf | Node Tree Int Tree
-- i.e.
Leaf :: Tree
Node :: Tree -> Int -> Tree -> Tree

What does it mean to use such data? It means we need to destruct the data by matching on all its constructors:

height :: Tree -> Int
height Leaf = 0
height (Node left _val right) = 1 + max (height left) (height right)

This type of function is called an “eliminator”.

### Data eliminator and initial algebras

A data type is an initial algebra, which we learned about previously. Take the above Tree as an example: it’s defined as the initial algebra of the following functor.

data TreeF a = Leaf | Node a Int a
type Tree = Fix TreeF

You can review my previous talk on initial algebras.

Then the above definition of height can be expressed as a catamorphism on an algebra of TreeF.

heightAlg :: TreeF Int -> Int
heightAlg Leaf = 0
heightAlg (Node left _val right) = 1 + (max left right)

height :: Fix TreeF -> Int
height = cata heightAlg

### Codata

If we reverse the arrows that construct a data type, we get two functions, dual to the previous data constructors, with shapes like this:

Bar1 :: Bar -> (Int, String)
Bar2 :: Bar -> Bool

Intuitively, Bar1 and Bar2, instead of being two ways to construct a Bar from components, specify two ways to use a Bar. This is what we call codata.

### Construction of codata

Here’s another example of useful codata:

type Stream = (Int, Stream)

head :: Stream -> Int
tail :: Stream -> Stream

Here’s an example on how to construct this codata.

startFrom :: Int -> Stream
startFrom n = (n, startFrom (n + 1))

### Terminal coalgebra and codata constructor

The codata Stream above can be expressed as the terminal coalgebra (fix-point) of the following functor:

type StreamF a = (Int, a)
type Stream = Fix StreamF

Then the startFrom constructor for Stream will be an anamorphism on some coalgebra:

startFromCoalg :: Int -> StreamF Int
startFromCoalg n = (n, n + 1)

startFrom :: Int -> Fix StreamF
startFrom = ana startFromCoalg

## General purpose codata

### Church encoding of data types

One intriguing topic in functional programming is Church encoding, which shows us how to encode any data and control construct purely as lambda functions.

\f x -> x               -- 0
\f x -> f x             -- 1
\f x -> f (f x)         -- 2
\n -> \f x -> f (n f x) -- succ :: Nat -> Nat

\x y -> x               -- true
\x y -> y               -- false
\p -> \a b -> p b a     -- not :: Bool -> Bool
\p a b -> p a b         -- if :: Bool -> a -> a -> a

\x y z -> z x y         -- pair :: a -> b -> pair
\p -> p (\x y -> x)     -- fst :: pair -> a

### General eliminator of Bool

Let’s take a look at an eliminator for Bool.

data Bool = True | False
type BoolC a = (a, a) -> a

elimBool :: Bool -> BoolC a
elimBool True  (a, b) = a
elimBool False (a, b) = b

You may recognize that BoolC is equivalent to the Church encoding of Bool. We will show their equivalence in the next section.

You may also recognize that elimBool is the most general eliminator for the type Bool. In other words, every valid eliminator can be derived from it.

In fact, the general eliminator elimBool is the catamorphism of type Bool.

### Isomorphism between Church encoding and the data

We will demonstrate that Bool and BoolC are indeed isomorphic:

from :: Bool -> BoolC a
from = elimBool

to :: BoolC a -> Bool
to f = f (True, False)

It’s easy to prove that from . to = id and to . from = id, so I won’t elaborate. So far we have shown that BoolC is indeed a Church encoding for Bool.

### General eliminator for Tree

Let’s look at a more complex type:

data Tree = Leaf | Node Tree Int Tree

type TreeC a = a -> ((a, Int, a) -> a) -> a

elimTree :: Tree -> TreeC a
elimTree Leaf                f g = f
elimTree (Node left n right) f g = g (left', n, right')
  where left'  = elimTree left f g
        right' = elimTree right f g

You may recognize that elimTree is the catamorphism for Tree. Also TreeC is a legit Church encoding for Tree.

### Visitor pattern on Tree

Now we have learned how to find a Church encoding and shown that a Church encoding is isomorphic to the data type it represents.

We can discover a pattern: we can extract all the ways to eliminate a Tree into a single entity. We will call this entity “TreeVisitor”.

type TreeVisitor a = (a, (a, Int, a) -> a)

visitLeaf :: TreeVisitor a -> a
visitNode :: TreeVisitor a -> (a, Int, a) -> a

As the name suggests, this pattern is just the Visitor pattern in OOP. Here I am using a pair to represent the type for TreeVisitor, but the exact way to implement it doesn’t really matter.

The point is that TreeVisitor a is a codata type: the only thing we care about is being able to derive the two methods visitLeaf and visitNode.

### Tree as Codata

A Tree can then be defined as all possible TreeVisitor a -> a functions (i.e. TreeC a), as we already showed via the isomorphism between Tree and TreeC.

walk :: Tree -> (forall a. TreeVisitor a -> a)

This representation of Tree is also codata, because the actual underlying data structure of Tree is hidden from the outside, and the TreeVisitors already define all the ways to access it.

# Adjunctions and Eilenberg-Moore category

https://log.lain.li/blog/adjunctions-and-eilenberg-moore-category/index.html
Posted on November 16, 2018
Tags: explanation Categories: category theory

This post is the sequel to yesterday’s post, Algebras compatible with list monad gives rise to a monoid. It illustrates how the Eilenberg-Moore category can be constructed from algebras on monads.

As we learned that an adjunction gives rise to a monad and a comonad, a question arises: does the converse hold as well? Is there a natural way to get an adjunction from a monad? The answer is yes. The intuition is that given a monad $$T$$, we have plenty of choices for the category $$C$$; the point is to find the two functors $$L: T\to C$$ and $$R: C\to T$$ that satisfy the triangle identities.

I’ll first review horizontal composition of natural transformations; then I’ll review the definition of adjunctions and how an adjunction gives rise to a monad; finally, I’ll talk about how an Eilenberg-Moore category can be constructed and why its free and forgetful functors form an adjunction.

## Horizontal composition between natural transformations

Natural transformations, of course, can be composed, and in two different ways: horizontally and vertically. Vertical composition is just plain morphism composition, with little surprise. Horizontal composition is where it gets more interesting. Here I’ll denote the identity natural transformation on $$F$$ simply by $$F$$ when there is no ambiguity.

Let’s see what happens if we compose $$\alpha\circ F$$ in the following diagram (called a “whiskering”, because of its shape):

$\xymatrix { E & & D \ar@/_2pc/[ll]^{G'}="a" \ar@/^2pc/[ll]_{G}="b" & C \ar[l]^{F} \ar @2{->}_{\alpha} "b";"a" }$

Now we take $$a:C$$, and see what we get on $$(\alpha \circ F)_a$$.

$\xymatrix { G'(F a) \\ & F a \ar[ul]^{G'} \ar[dl]_{G} & a\ar[l]^F \\ G(F a) \ar[uu]_{\alpha_{F a}} }$

From the diagram we can see $$(\alpha \circ F)_a = \alpha_{F a}$$. Now let’s see another version of whiskering:

$\xymatrix { E & D \ar[l]^{G} & & C \ar@/_2pc/[ll]^{F'}="a" \ar@/^2pc/[ll]_{F}="b" \ar @2{->}_{\alpha} "b";"a" }$

Similarly, take $$a: C$$, let’s see what we get by composing $$(G \circ \alpha)_a$$.

$\xymatrix { G (F' a) & F' a \ar[dd]_{\alpha_a}\ar[l]^G & \\ & & a \ar[ul]^{F'}\ar[dl]^{F} \\ G (F a)\ar[uu]_{G \alpha_a} & F a \ar[l]^G & \\ }$

Thus $$(G\circ \alpha)_a = G \alpha_a$$, not exactly symmetric to the left whiskering version.

In the rest of the article, I’ll use the results from this section a lot without getting into the details.

## Adjunction

Given two categories $$C$$ and $$D$$ and two functors between them, $$L: D\to C$$ and $$R: C\to D$$, we say $$L$$ is left adjoint to $$R$$ when the following condition holds for all $$a:C$$, $$b:D$$.

$C(L b, a) \simeq D(b, R a)$

By replacing $$a$$ with $$L b$$, we get

$C(L b, L b) \simeq D(b, R(L b))$

On the left we have $$id_{L b}$$; by the isomorphism of hom-sets, this morphism selects a morphism on the right, $$b \to R (L b)$$. This is a natural transformation called the unit, denoted $$\eta: 1_D \to R\circ L$$.

Similarly by replacing $$b$$ with $$R a$$, we can get the dual of unit – the counit – on the left of the hom-set isomorphism: $$\epsilon: L \circ R \to 1_C$$.

An adjunction defined by unit and counit has to satisfy the triangle identities, namely:

$\xymatrix { L \ar[r]^{L\circ \eta} \ar@2{-}[rd]^{} & LRL \ar[d]^{\epsilon \circ L} \\ & L \\ }$

and:

$\xymatrix { R \ar[r]^{\eta \circ R} \ar@2{-}[rd]^{} & RLR \ar[d]^{R\circ\epsilon} \\ & R \\ }$

Adjunction defines a very loose equivalence relation between the functors $$L$$ and $$R$$: one can think of $$L$$ as introducing some structure, and $$R$$ as removing it. A typical example of an adjunction is the free-forgetful adjunction.

## Monad from adjunction

In the last post I mentioned that an adjunction can give rise to a monad; here’s how.

Given an adjunction $$L \dashv R$$, where $$L: D\to C$$ and $$R: C\to D$$, we let the functor $$T = R \circ L$$ be the underlying functor of the monad. Let $$\mu = R \circ \epsilon \circ L$$ be the monad multiplication, and let the monad unit $$\eta$$ be the same as the unit of the adjunction.

We first check the types: $$\mu: RLRL \to RL = T^2 \to T$$, check; $$\eta: 1 \to RL = 1 \to T$$, also check. Then we need to check that $$\mu$$ and $$\eta$$ satisfy the monad laws.

First the unit laws:

$\xymatrix { T \ar[d]_{T\circ\eta} \ar[r]^{\eta\circ T} & T^2 \ar[d]^{\mu} \\ T^2 \ar[r]^{\mu} & T }$

This law can be derived from the triangle identities by left-composing the first identity with $$R$$ and right-composing the second identity with $$L$$. The following diagram automatically commutes thanks to the triangle identities.

$\xymatrix { RL \ar[d]_{RL \circ \eta} \ar[r]^{\eta \circ RL} \ar@2{-}[rd]^{} & RLRL \ar[d]^{R\circ\epsilon\circ L} \\ RLRL \ar[r]_{R\circ\epsilon\circ L} & RL \\ }$

Now the multiplication law (or associativity law):

$\xymatrix { T^3 \ar[r]^{\mu \circ T} \ar[d]_{T\circ\mu} & T^2\ar[d]^\mu \\ T^2 \ar[r]^\mu & T }$

Or,

$\xymatrix { RLRLRL \ar[rr]^{R\circ\epsilon\circ L \circ RL} \ar[dd]_{RL\circ R\circ\epsilon\circ L} & & RLRL\ar[dd]^{R\circ\epsilon\circ L} \\\ & \\ RLRL \ar[rr]^{R\circ\epsilon\circ L} & & RL }$

This is the naturality square of $$\epsilon$$ left composed with $$R$$! Let’s remove the $$R$$ on the left and see what we get:

$\xymatrix { LRLRL \ar[rr]^{\epsilon_{LRL}} \ar[dd]_{LR \circ f} & & LRL\ar[dd]^{f} \\\ & \\ LRL \ar[rr]^{\epsilon_L} & & L }$

This is exactly the naturality square of $$\epsilon: LR\to 1$$ (with $$f = \epsilon_L$$). Thus we have proved that the monad associativity law holds simply because $$\epsilon$$ is a natural transformation.

In fact, an adjunction also gives rise to a comonad on $$LR$$ in a very similar fashion, by defining the counit (extract) to be $$\epsilon$$ and the duplication $$\delta$$ to be $$L\circ\eta\circ R$$.

## Monad algebra category

Now let’s get back to the monad algebras we discussed extensively in the last post. Last time we proved that an algebra compatible with the list monad is a monoid - a rather surprising finding.

Let’s now revisit the definition of monad compatible algebra. Given a monad $$(T: C \to C, \eta: 1 \to T, \mu: T^2 \to T)$$ and an algebra $$(a: C, \sigma: T a \to a)$$, we define the coherence conditions as follows:

• $$\sigma \circ \eta_a = 1_a$$
• $$\sigma \circ \mu_a = \sigma \circ T \sigma$$

Algebras that satisfy these conditions are compatible with the given monad, and we call them monad algebras.

Not very surprisingly, the monad algebras for a monad $$(T, \eta, \mu)$$ form a category. The objects are $$(a:C, \sigma:T a\to a)$$, and the morphisms are the same morphisms as in $$C$$. This category is called the monad algebra category, or Eilenberg-Moore category, denoted $$\textbf{mAlg}_T$$ or $$C^T$$.

The identity morphism on $$(a,\sigma)$$ in $$C^T$$ is just the identity morphism on $$a$$ in $$C$$, and morphism composition also works the same way as in $$C$$.

I just missed one thing: before we can claim $$(T a, \mu_a)$$ is a monad algebra (as we will in the next section), we need to check that this algebra meets the coherence conditions.

First, $$\mu \circ \eta_{T a} = 1_{T a}$$. This one is just the right identity law for monad, so it automatically holds. Then we check $$\mu\circ \mu_{T a}=\mu\circ T \mu$$. It’s just the monad associativity (multiplication) law! See how these all fits so perfectly. Just because of monad laws, $$\mu_a$$ will always be a compatible algebra on $$T a$$.

## Free monad algebra

The canonical functor from $$C^T$$ to $$C$$ is the forgetful functor that “forgets” the algebra part, i.e. $$U: (a,\sigma) \mapsto a$$. In addition, $$U: f \mapsto f$$ on morphisms.

It’s much trickier to define the free functor $$F: C\to C^T$$: we need to find a valid algebra for every $$a$$. Fortunately, it turns out we already have a very good candidate from the monad - $$\mu_a: T (T a) \to T a$$. Since $$\mu$$ is a natural transformation, it works on any $$a$$, so for every $$a$$ we have an algebra from $$T (T a)$$ to $$T a$$.

Now we can define the free functor $$F: C \to C^T$$ as $$a \mapsto (T a, \mu_a)$$, which is kind of neat. Of course, $$F$$ should also map $$f: a \to b$$ to $$F f: T a \to T b$$, which is just $$T f$$.

## Monad algebra adjunctions

Now that we have a pair of free and forgetful functors, it’s time to prove they are really adjoint: $$F$$ is left adjoint to $$U$$, or $$F \dashv U$$.

To play with adjunction, we need to first define our pair of natural transformations $$\eta: 1_C \to U\circ F$$ and $$\epsilon: F\circ U \to 1_{C^T}$$.

Let’s write down $$\eta$$ in component form: $$\eta_a: a \to U (F a)$$, and we know $$U (F a) = U (T a, \mu_a) = T a$$. We can just use the unit $$\eta$$ from the monad!

What about $$\epsilon$$? $$\epsilon_{(a,\sigma)}: F (U (a,\sigma)) \to (a,\sigma)$$, where $$F (U (a, \sigma)) = F a = (T a, \mu_a)$$. We need a map from $$(T a, \mu_a)$$ to $$(a, \sigma)$$, and we can just use the forgotten evaluator $$\sigma: T a\to a$$.

In order for the unit and counit to form an adjunction, we need to check the triangle laws.

$\xymatrix { (T a, \mu_a) \ar[r]^{F\circ \eta_a} \ar@2{-}[rd]^{} & (T(T a), \mu_{T a}) \ar[d]^{\mu_{a}} \\ & (T a, \mu_a) \\ }$

We can check the algebras’ carrier types.

$\xymatrix { T a \ar[r]^\eta & T (T a) \ar[r]^\mu & T a }$

This is essentially just $$\mu \circ \eta$$, which equals the identity by the right unit law of the monad. The evaluator part follows automatically, because these are just regular morphisms in $$C$$. Then we check the other triangle identity:

$\xymatrix { a \ar[r]^{\eta_{U (a, \sigma_a)}} \ar@2{-}[rd]^{} & T a \ar[d]^{U\circ\sigma_a} \\ & a \\ }$

Omitting the unimportant part, we get:

$\xymatrix { a \ar[r]^{\eta_a} & T a \ar[r]^{\sigma_a} & a }$

Looking familiar? Yes, $$\sigma \circ \eta = 1$$ must hold: it is one of the coherence conditions for $$\sigma$$ to be compatible with our monad.

Therefore we have shown that the free functor $$F: C \to C^T$$ into the Eilenberg-Moore category is left adjoint to the forgetful functor $$U: C^T \to C$$. We have discovered another free-forgetful functor pair that are adjoint to each other, and what makes it so fascinating is that this is an adjunction we get from ANY monad.

Algebras in list monad https://log.lain.li/blog/algebras-in-list-monad/index.html 2018-11-15T00:00:00Z 2018-11-15T00:00:00Z
Posted on November 15, 2018
Tags: explanation Categories: category theory

I'm currently watching Dr. Bartosz Milewski's video lecture series on category theory on YouTube. In his lecture Category Theory III 4.2, Monad algebras part 3, he stated an interesting fact: an algebra that is compatible with the list monad is a monoid.

In the video he explained it only briefly and drew the conclusion before I could follow along and convince myself of the fact. Afterwards, in the comments, I saw that someone else had similar questions about the ambiguous notation Dr. Milewski uses. So I derived the theorem myself to clarify my understanding. I think the outcome is interesting enough to be worth a post.

I want to write this post in a very beginner-friendly manner, explaining the concepts to people who know only some basic category theory. Hopefully it will also help me clear things up for myself.

## Monoid

A monoid captures the generalized idea of multiplication. A monoid on a set $$S$$ consists of an element $$\eta\in S$$ called the unit and a binary operation $$\mu: S \times S \to S$$ called multiplication, satisfying these laws:

• left identity law: $$\mu(\eta, a) = a$$, for $$a \in S$$
• right identity law: $$\mu(a, \eta) = a$$, for $$a \in S$$
• associativity law: $$\mu(a,\mu(b,c)) = \mu(\mu(a,b),c)$$, for $$a,b,c\in S$$

Here are some common examples of monoids:

• the list monoid, where $$\eta$$ is the empty list and $$\mu$$ is the append operator (++ in Haskell)
• the additive monoid on integers, where $$\eta$$ is $$0$$ and $$\mu$$ is $$+$$
• the multiplicative monoid on integers, where $$\eta$$ is $$1$$ and $$\mu$$ is $$\times$$
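The laws are easy to spot-check in Haskell; a minimal sketch for the additive monoid (the names eta and mu are mine, mirroring the notation above):

```haskell
-- Additive monoid on integers: eta is the unit 0, mu is (+).
eta :: Integer
eta = 0

mu :: Integer -> Integer -> Integer
mu = (+)

main :: IO ()
main = do
  print (mu eta 42 == 42)                -- left identity
  print (mu 42 eta == 42)                -- right identity
  print (mu 1 (mu 2 3) == mu (mu 1 2) 3) -- associativity
```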

## Monad

A monad is defined as an endofunctor $$T$$ along with two natural transformations, $$\eta: a \to T a$$ called the unit and $$\mu: T^2 a\to T a$$ called multiplication, that satisfy these laws:

• identity law:

$\begin{CD} T a @>\eta>> T^2 a \\ @VVT\eta V @V\mu VV \\ T^2 a @>\mu >> T a \end{CD}$

where the $$T a$$ at the top-left is equal to the $$T a$$ at the bottom-right, i.e. both composite paths are the identity.

• multiplication law:

$\begin{CD} T^3 a @>T \mu>> T^2 a \\ @VV\mu V @VV\mu V \\ T^2 a @>\mu >> T a \end{CD}$

These laws are essentially just the monoid laws (left/right identity and associativity) in the category of endofunctors.

## List monad

The list functor is a monad with $$\eta$$ and $$\mu$$ defined as follows:

η x = [x]
μ xs = concat xs

where $$\eta$$ sends a value to a singleton list containing that value, and $$\mu$$ is the concatenation function on a list of lists.

It's easy to show that the monad laws hold for the list monad; since it's not today's focus, I'll skip it.
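Though the proof is skipped, the laws are easy to spot-check on examples; a sketch:

```haskell
-- eta and mu for the list monad, as defined above.
eta :: a -> [a]
eta x = [x]

mu :: [[a]] -> [a]
mu = concat

main :: IO ()
main = do
  -- identity laws: both mu . eta and mu . fmap eta are the identity
  print (mu (eta [1, 2, 3]) == ([1, 2, 3] :: [Int]))
  print (mu (fmap eta [1, 2, 3]) == ([1, 2, 3] :: [Int]))
  -- multiplication law: mu . mu == mu . fmap mu
  let xsss = [[[1], [2]], [[3]]] :: [[[Int]]]
  print (mu (mu xsss) == mu (fmap mu xsss))
```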

## Algebra

An algebra on an endofunctor $$F: C\to C$$ is given by a tuple $$(a, \sigma)$$, where $$a$$ is an object in $$C$$ and $$\sigma$$ is a morphism $$F a \to a$$. It's worth noting that an algebra is not a natural transformation, despite how it looks.

A natural transformation has no knowledge of its components, and therefore must be a polymorphic function. This restriction, however, is not required for an algebra. In an algebra, $$\sigma$$ is bound to a specific object $$a$$, so it can do transformations specific to $$a$$, or even produce an $$a$$ out of nowhere.

An algebra can be viewed as a map that evaluates a functor over values (e.g. an algebraic expression) into a single value. Here are some examples of algebras:

• sum on list of additive numbers
• length on polymorphic list
• foo (x:_) = x; foo [] = 1 on list of integers
• eval: ExprF a -> a on an expression of type a

In an algebra, the functor plays the role of forming an expression, and the $$\sigma$$ function evaluates it.
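A couple of the examples above, written out for the list functor (a sketch; the names are mine):

```haskell
-- Algebras sigma :: F a -> a for the list functor, all with carrier Int:
sumAlg :: [Int] -> Int -- "sum on list of additive numbers"
sumAlg = sum

lenAlg :: [Int] -> Int -- "length", which ignores the elements entirely
lenAlg = length

fooAlg :: [Int] -> Int -- the foo example: head, with a default for []
fooAlg (x : _) = x
fooAlg []      = 1

main :: IO ()
main = do
  print (sumAlg [1, 2, 3]) -- 6
  print (lenAlg [1, 2, 3]) -- 3
  print (fooAlg [])        -- 1
```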

## Category of algebras

Algebras on a given endofunctor $$F:C\to C$$ form a category, where the objects are the algebras $$(a, \sigma)$$, and a morphism from $$(a,\sigma)$$ to $$(b,\tau)$$ is a morphism $$f \in C(a,b)$$ that makes the following square commute:

$\begin{CD} F a @>F f>> F b \\ @VV\sigma V @VV\tau V \\ a @>f>> b \end{CD}$

Such morphisms compose: pasting two commuting squares side by side yields another commuting square, so this indeed forms a category.

## Monad algebra

Given an endofunctor $$T$$, a monad algebra on $$T$$ is a monad on $$T$$ along with a compatible algebra on $$T$$. A monad algebra contains all the operations from its monad part and its algebra part:

• $$\eta: a \to T a$$
• $$\mu: T^2 a \to T a$$
• $$\sigma: T a \to a$$

Note that each specific algebra comes with its own specific $$a$$.

To make the algebra compatible with the monad, we need to impose these two conditions:

• with unit, $$(\sigma \circ \eta) a = a$$
• with multiplication, the diagram below should commute

$\begin{CD} T^2 a @>\mu>> T a \\ @VV T\sigma V @VV\sigma V \\ T a @>\sigma>> a \end{CD}$

These two conditions are strong: not every algebra on $$T$$ is compatible with a given monad on $$T$$. For example, for the list monad on integers, the unit condition requires σ [x] = x; this eliminates all algebras that don't satisfy this property, like the length algebra.
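We can see the elimination concretely; a sketch checking the unit condition $$(\sigma \circ \eta)\,x = x$$ for two algebras (sum passes, length fails):

```haskell
-- Unit condition for a list-monad algebra: sigma [x] == x.
sigmaSum :: [Int] -> Int
sigmaSum = sum

sigmaLen :: [Int] -> Int
sigmaLen = length

main :: IO ()
main = do
  print (sigmaSum [7] == 7) -- True: sum is a candidate monad algebra
  print (sigmaLen [7] == 7) -- False: length [7] is 1, so length is ruled out
```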

## Algebra on list monad

Now we finally get to the interesting part. There are many monad-compatible algebras on lists, for example: sum, product, concat, etc. These algebras perform various operations, but they have one thing in common: they all seem to be related to some monoid. In fact, they are, and we will now prove it.

First let's see what properties algebras on the list monad satisfy. By the compatibility conditions we discussed earlier, we always have:

• σ [x] = x and,
• (σ∘Tσ) x = (σ∘μ) x, where $$\mu$$ is the concat operator for lists

Let σ [] = e and σ [x,y] = x <> y; we now show that e is a unit and <> is the multiplication operator of a monoid.

We now prove the left identity law of the monoid by evaluating (σ∘Tσ) [[], [x]] in two ways. Directly, (σ∘Tσ) [[], [x]] = σ [e, x] = e <> x. Via compatibility, (σ∘Tσ) [[], [x]] = (σ∘μ) [[], [x]] = σ [x] = x. This shows e <> x = x. The right identity law can be proved in a similar fashion.

Now the associativity law. First, (σ∘Tσ) [[x,y],[z]] = σ [x <> y, z] = (x <> y) <> z, and (σ∘Tσ) [[x],[y,z]] = σ [x, y <> z] = x <> (y <> z). We also know that both equal (σ∘μ) [[x,y],[z]] = (σ∘μ) [[x],[y,z]] = σ [x,y,z]. For consistency, σ [x,y,z] must be definable as either (x <> y) <> z or x <> (y <> z), so the two are equal.

So we have proved that the algebra must give rise to a monoid, with $$\sigma$$ being the mconcat function.
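Conversely, starting from a monoid, mconcat gives back a compatible algebra; a sketch using Sum from Data.Monoid:

```haskell
import Data.Monoid (Sum (..))

-- sigma = mconcat for the additive monoid Sum Int.
sigma :: [Sum Int] -> Sum Int
sigma = mconcat

main :: IO ()
main = do
  -- unit condition: sigma [x] == x
  print (sigma [Sum 7] == Sum 7)
  -- multiplication condition: sigma . fmap sigma == sigma . concat
  let xss = [[Sum 1, Sum 2], [Sum 3]]
  print (sigma (fmap sigma xss) == sigma (concat xss))
```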


Posted on November 24, 2017
Tags: translation Categories: rust

## Part 1

### A brief overview of futures

Futures can be understood as peculiar functions that do not execute immediately; instead, they execute at some point in the future (hence the name). There are many reasons to use a future instead of an ordinary function: performance, elegance, composability, and so on. The downside of futures is that they are a bit hard to write. Okay, they are hard. If you don't even know when a function will execute, how can you know its causes and effects?

### Futures in Rust

The Rust community moves fast, and so does the implementation of futures in Rust. So, as always, a disclaimer: what you learn here may look outdated after a while, so bear that in mind.

Rust futures are essentially Results: that is, you need to specify the expected return type and the error type.

fn my_fn() -> Result<u32, Box<Error>> {
Ok(100)
}

fn my_fut() -> impl Future<Item = u32, Error = Box<Error>> {
ok(100)
}

let retval = my_fn().unwrap();
println!("{:?}", retval);

A future, on the other hand, returns before it actually executes (or, more precisely, we return the code that is to be executed later), so we need a way to run it. For that we can use a Reactor: create a Reactor and call its run method to execute the future. For our example above, that looks like this:

let mut reactor = Core::new().unwrap();

let retval = reactor.run(my_fut()).unwrap();
println!("{:?}", retval);

### Chaining methods

fn my_fn_squared(i: u32) -> Result<u32, Box<Error>> {
Ok(i * i)
}

fn my_fut_squared(i: u32) -> impl Future<Item = u32, Error = Box<Error>> {
ok(i * i)
}

let retval = my_fn().unwrap();
println!("{:?}", retval);

let retval2 = my_fn_squared(retval).unwrap();
println!("{:?}", retval2);

let mut reactor = Core::new().unwrap();

let retval = reactor.run(my_fut()).unwrap();
println!("{:?}", retval);

let retval2 = reactor.run(my_fut_squared(retval)).unwrap();
println!("{:?}", retval2);

let chained_future = my_fut().and_then(|retval| my_fn_squared(retval));
let retval2 = reactor.run(chained_future).unwrap();
println!("{:?}", retval2);

1. Schedule my_fut() for execution.
2. When my_fut() finishes, take its result and store it in a variable called retval.
3. After that, schedule my_fn_squared(i: u32) for execution, passing retval as the argument.
4. Package this instruction flow into a single future called chained_future.

### Mixing futures and plain functions

fn fn_plain(i: u32) -> u32 {
i - 50
}

let chained_future = my_fut().and_then(|retval| {
let retval2 = fn_plain(retval);
my_fut_squared(retval2)
});
let retval3 = reactor.run(chained_future).unwrap();
println!("{:?}", retval3);

let chained_future = my_fut().and_then(|retval| {
done(my_fn_squared(retval)).and_then(|retval2| my_fut_squared(retval2))
});
let retval3 = reactor.run(chained_future).unwrap();
println!("{:?}", retval3);

let chained_future = my_fut().and_then(|retval| {
my_fn_squared(retval).and_then(|retval2| my_fut_squared(retval2))
});
let retval3 = reactor.run(chained_future).unwrap();
println!("{:?}", retval3);

   Compiling tst_fut2 v0.1.0 (file:///home/MINDFLAVOR/mindflavor/src/rust/tst_future_2)
error[E0308]: mismatched types
--> src/main.rs:136:50
|
136 |         my_fn_squared(retval).and_then(|retval2| my_fut_squared(retval2))
|                                                  ^^^^^^^^^^^^^^^^^^^^^^^ expected enum std::result::Result, found anonymized type
|
= note: expected type std::result::Result<_, std::boxed::Box<std::error::Error>>
found type impl futures::Future

error: aborting due to previous error

error: Could not compile tst_fut2.

### Generics

fn fut_generic_own<A>(a1: A, a2: A) -> impl Future<Item = A, Error = Box<Error>>
where
A: std::cmp::PartialOrd,
{
if a1 < a2 {
ok(a1)
} else {
ok(a2)
}
}

let future = fut_generic_own("Sampdoria", "Juventus");
let retval = reactor.run(future).unwrap();
println!("fut_generic_own == {}", retval);

### Notes

[dependencies]
futures="*"
tokio-core="*"
futures-await = { git = 'https://github.com/alexcrichton/futures-await' }

#![feature(conservative_impl_trait, proc_macro, generators)]

extern crate futures_await as futures;
extern crate tokio_core;

use futures::done;
use futures::prelude::*;
use futures::future::{err, ok};
use tokio_core::reactor::Core;
use std::error::Error;

(End of Part 1)

## Part 2

### Introduction

### The troublesome Error

#[derive(Debug, Default)]
pub struct ErrorA {}

impl fmt::Display for ErrorA {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
write!(f, "ErrorA!")
}
}

impl error::Error for ErrorA {
fn description(&self) -> &str {
"Description for ErrorA"
}

fn cause(&self) -> Option<&error::Error> {
None
}
}

The same goes for ErrorB:

#[derive(Debug, Default)]
pub struct ErrorB {}

impl fmt::Display for ErrorB {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
write!(f, "ErrorB!")
}
}

impl error::Error for ErrorB {
fn description(&self) -> &str {
"Description for ErrorB"
}

fn cause(&self) -> Option<&error::Error> {
None
}
}

fn fut_error_a() -> impl Future<Item = (), Error = ErrorA> {
err(ErrorA {})
}

fn fut_error_b() -> impl Future<Item = (), Error = ErrorB> {
err(ErrorB {})
}

let retval = reactor.run(fut_error_a()).unwrap_err();
println!("fut_error_a == {:?}", retval);

let retval = reactor.run(fut_error_b()).unwrap_err();
println!("fut_error_b == {:?}", retval);

fut_error_a == ErrorA
fut_error_b == ErrorB

let future = fut_error_a().and_then(|_| fut_error_b());

Compiling tst_fut2 v0.1.0 (file:///home/MINDFLAVOR/mindflavor/src/rust/tst_future_2)
error[E0271]: type mismatch resolving <impl futures::Future as futures::IntoFuture>::Error == errors::ErrorA
--> src/main.rs:166:32
|
166 |     let future = fut_error_a().and_then(|_| fut_error_b());
|                                ^^^^^^^^ expected struct errors::ErrorB, found struct errors::ErrorA
|
= note: expected type errors::ErrorB
found type errors::ErrorA

let future = fut_error_a()
.map_err(|e| {
println!("mapping {:?} into ErrorB", e);
ErrorB::default()
})
.and_then(|_| fut_error_b());

let retval = reactor.run(future).unwrap_err();
println!("error chain == {:?}", retval);

mapping ErrorA into ErrorB
error chain == ErrorB

let future = fut_error_a()
.and_then(|_| fut_error_b())
.and_then(|_| fut_error_a());

let future = fut_error_a()
.map_err(|_| ErrorB::default())
.and_then(|_| fut_error_b())
.map_err(|_| ErrorA::default())
.and_then(|_| fut_error_a());

### From to the rescue

impl From<ErrorB> for ErrorA {
fn from(e: ErrorB) -> ErrorA {
ErrorA::default()
}
}

impl From<ErrorA> for ErrorB {
fn from(e: ErrorA) -> ErrorB {
ErrorB::default()
}
}

let future = fut_error_a()
.from_err()
.and_then(|_| fut_error_b())
.from_err()
.and_then(|_| fut_error_a());

The futures crate is smart: the from_err code is only invoked when an error actually occurs, so none of this has any runtime overhead either.

### Lifetimes

Rust has another feature called explicit lifetime annotation of references. Rust supports lifetime elision, so most of the time we can omit the explicit annotations. For example, say we want to write a function that takes a string reference as an argument and, if nothing goes wrong, returns the same string reference:

fn my_fn_ref<'a>(s: &'a str) -> Result<&'a str, Box<Error>> {
Ok(s)
}

fn my_fn_ref(s: &str) -> Result<&str, Box<Error>> {
Ok(s)
}

fn my_fut_ref_implicit(s: &str) -> impl Future<Item = &str, Error = Box<Error>> {
ok(s)
}

   Compiling tst_fut2 v0.1.0 (file:///home/MINDFLAVOR/mindflavor/src/rust/tst_future_2)
error: internal compiler error: /checkout/src/librustc_typeck/check/mod.rs:633: escaping regions in predicate Obligation(predicate=Binder(ProjectionPredicate(ProjectionTy { substs: Slice([_]), item_def_id: DefId { krate: CrateNum(15), index: DefIndex(0:330) => futures[59aa]::future::Future::Item } }, &str)),depth=0)
--> src/main.rs:39:36
|
39 | fn my_fut_ref_implicit(s: &str) -> impl Future<Item = &str, Error = Box<Error>> {
|                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

note: the compiler unexpectedly panicked. this is a bug.

note: we would appreciate a bug report: https://github.com/rust-lang/rust/blob/master/CONTRIBUTING.md#bug-reports

note: rustc 1.23.0-nightly (2be4cc040 2017-11-01) running on x86_64-unknown-linux-gnu

thread 'rustc' panicked at 'Box<Any>', /checkout/src/librustc_errors/lib.rs:450:8
note: Run with RUST_BACKTRACE=1 for a backtrace.

fn my_fut_ref<'a>(s: &'a str) -> impl Future<Item = &'a str, Error = Box<Error>> {
ok(s)
}

### impl Future with lifetimes

fn my_fut_ref_chained<'a>(s: &'a str) -> impl Future<Item = String, Error = Box<Error>> {
my_fut_ref(s).and_then(|s| ok(format!("received == {}", s)))
}

error[E0564]: only named lifetimes are allowed in impl Trait, but  was found in the type futures::AndThen<impl futures::Future, futures::FutureResult<std::string::String, std::boxed::Box<std::error::Error + 'static>>, [closure@src/main.rs:44:28: 44:64]>
--> src/main.rs:43:42
|
43 | fn my_fut_ref_chained<'a>(s: &'a str) -> impl Future<Item = String, Error = Box<Error>> {
|                                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

fn my_fut_ref_chained<'a>(s: &'a str) -> impl Future<Item = String, Error = Box<Error>> + 'a {
my_fut_ref(s).and_then(|s| ok(format!("received == {}", s)))
}

let retval = reactor
.run(my_fut_ref_chained("str with lifetime"))
.unwrap();
println!("my_fut_ref_chained == {}", retval);

my_fut_ref_chained == received == str with lifetime

(End of Part 2)

## Part 3

### Reactor? Loops?

Simply put, a reactor is a loop. To explain it, here's an analogy: suppose you sent an email inviting a girl/boy on a date (okay, I know this is a bit old-fashioned). You want an answer, so you check, and check, and check your mailbox again and again, until you finally get the reply.

Rust's reactor works somewhat like that. You give it a future, and it keeps checking the future's state until the future completes or errors. It does this through a function called poll, unsurprisingly. Implementors of the Future type must implement the poll function themselves, and all they have to do is return a value of type Poll<T, E> (see the Poll documentation). Well, in fact the reactor doesn't really keep calling your poll function, but let's not dig into the details for now. Let's start from this example.

### Implementing a future from scratch

#[derive(Debug)]
struct WaitForIt {
message: String,
until: DateTime<Utc>,
polls: u64,
}

impl WaitForIt {
pub fn new(message: String, delay: Duration) -> WaitForIt {
WaitForIt {
polls: 0,
message: message,
until: Utc::now() + delay,
}
}
}

impl Future for WaitForIt {
type Item = String;
type Error = Box<Error>;

fn poll(&mut self) -> Poll<Self::Item, Self::Error> {
let now = Utc::now();
if self.until < now {
Ok(Async::Ready(
format!("{} after {} polls!", self.message, self.polls),
))
} else {
self.polls += 1;

println!("not ready yet --> {:?}", self);
Ok(Async::NotReady)
}
}
}

    type Item = String;
type Error = Box<Error>;

    fn poll(&mut self) -> Poll<Self::Item, Self::Error> {

let now = Utc::now();
if self.until < now {
// Tell reactor we are ready!
} else {
// Tell reactor we are not ready! Come back later!
}

impl Future for WaitForIt {
type Item = String;
type Error = Box<Error>;

fn poll(&mut self) -> Poll<Self::Item, Self::Error> {
let now = Utc::now();
if self.until < now {
Ok(Async::Ready(
format!("{} after {} polls!", self.message, self.polls),
))
} else {
self.polls += 1;

println!("not ready yet --> {:?}", self);
Ok(Async::NotReady)
}
}
}

fn main() {
let mut reactor = Core::new().unwrap();

let wfi_1 = WaitForIt::new("I'm done:".to_owned(), Duration::seconds(1));
println!("wfi_1 == {:?}", wfi_1);

let ret = reactor.run(wfi_1).unwrap();
println!("ret == {:?}", ret);
}

Running target/debug/tst_fut_create
wfi_1 == WaitForIt { message: "I\'m done:", until: 2017-11-07T16:07:06.382232234Z, polls: 0 }
not ready yet --> WaitForIt { message: "I\'m done:", until: 2017-11-07T16:07:06.382232234Z, polls: 1 }

### Un-parking

futures::task::current().notify();

impl Future for WaitForIt {
type Item = String;
type Error = Box<Error>;

fn poll(&mut self) -> Poll<Self::Item, Self::Error> {
let now = Utc::now();
if self.until < now {
Ok(Async::Ready(
format!("{} after {} polls!", self.message, self.polls),
))
} else {
self.polls += 1;

println!("not ready yet --> {:?}", self);
futures::task::current().notify();
Ok(Async::NotReady)
}
}
}

### Joining

A useful feature of the reactor is that it can run multiple futures concurrently. The way to achieve efficient concurrency on a single thread is this: while one future is parked, the other futures get a chance to run.

let wfi_1 = WaitForIt::new("I'm done:".to_owned(), Duration::seconds(1));
println!("wfi_1 == {:?}", wfi_1);
let wfi_2 = WaitForIt::new("I'm done too:".to_owned(), Duration::seconds(1));
println!("wfi_2 == {:?}", wfi_2);

let v = vec![wfi_1, wfi_2];

fn main() {
let mut reactor = Core::new().unwrap();

let wfi_1 = WaitForIt::new("I'm done:".to_owned(), Duration::seconds(1));
println!("wfi_1 == {:?}", wfi_1);
let wfi_2 = WaitForIt::new("I'm done too:".to_owned(), Duration::seconds(1));
println!("wfi_2 == {:?}", wfi_2);

let v = vec![wfi_1, wfi_2];

let sel = join_all(v);

let ret = reactor.run(sel).unwrap();
println!("ret == {:?}", ret);
}

### Select

There are many other functions for futures; another interesting one is called select. The select function runs two futures at the same time and returns the one that finishes first. It is well suited for implementing timeouts. Here's our example:

fn main() {
let mut reactor = Core::new().unwrap();

let wfi_1 = WaitForIt::new("I'm done:".to_owned(), Duration::seconds(1));
println!("wfi_1 == {:?}", wfi_1);
let wfi_2 = WaitForIt::new("I'm done too:".to_owned(), Duration::seconds(2));
println!("wfi_2 == {:?}", wfi_2);

let v = vec![wfi_1, wfi_2];

let sel = select_all(v);

let ret = reactor.run(sel).unwrap();
println!("ret == {:?}", ret);
}

## Part 4


Posted on October 4, 2016
Tags: security docker Categories: unix

## Background and cause

shipyard finally ran. I played with it for a while and, pleased with myself, forgot about the warning above and the ports I had exposed with my own hands. Then I went off to tinker with Jenkins and the like, having a great time.

## Discovering the intrusion

(screenshot of ui-for-docker)

(Both containers were still running at the time)

## Investigating the intrusion

• chrisfosterelli/rootplease was started at 2016-10-02T14:04:15.86229242Z, 46 hours earlier
• CentOS was started at 2016-10-03T14:44:35.271194859Z, 21 hours earlier

• chrisfosterelli/rootplease was run with -v /:/hostOS, which means the intruder got a root shell
• CentOS only mounted -v /root/.ssh:/mnt, which means the goal was to modify /root/.ssh/authorized_keys to gain ssh access to the root account. Even if that wasn't the goal, the scope of damage would be limited, since Docker is a sandbox after all

• /root/.bash_history inside chrisfosterelli/rootplease is actually the host's /root/.bash_history, and its contents are all my own command history. To be safe, I additionally verified that /root/.bash_history does not exist in the container's own filesystem image
• /root/.bash_history in the CentOS container does not exist either

# cat /root/.ssh/authorized_keys
root@ubuntu:~# cat .ssh/authorized_keys
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDRrvCaHsH8nZ6YrTsZaTFeKW3aPzUvlK+h+KT8rT4w6EGJgl8LVANHUsl5BF3RVGjFKFnBkX6jd6tWt+435h9vrEhxynoI69ljiiP9lD8GWgp0axmupqrcU3+OBiAmQ1OrOsMeNBdlw3GjAGPLI+ACd2WPPfKlWyQqDYrtzUPm5cz7HmI5Xo10KDcAS8gRJolH1AzBLfb8gPv8X9c9pKlpkUeST7j6MWLg3QQTShqbDB5j3IvL92KPhFmsOtJFd+efRyTiFhKsiQDY1h2er4gWcAn95GgLG6ci4D3d/kCoYRwIjRRrk5/4pRRq3wpp7/anI8qqJ6pPbdV9HvA/AEOp root@localhost.localdomain

## Post-intrusion security checks

$ w
 13:06:55 up 108 days,  4:26,  6 users,  load average: 0.01, 0.03, 0.05
USER     TTY      FROM             LOGIN@   IDLE     JCPU   PCPU  WHAT
shou     ttyS0                     18Jun16  108days  3.73s  2.98s [redacted]
shou     pts/1    tmux(5178).%0    28Sep16  6days    1.92s  1.42s [redacted]
shou     pts/2    tmux(5178).%1    28Sep16  5days    1:27m  1:27m [redacted]
shou     pts/3    tmux(5178).%2    28Sep16  5days    4.68s  4.68s -zsh
shou     pts/11   tmux(5178).%3    28Sep16  6days    0.67s  0.67s -zsh
shou     pts/14   [redacted IP]    11:16    1.00s    3.84s  0.00s sshd: shou [priv]

Then the login records, also nothing unusual:

$ last
shou     pts/14       [連入 IP 打碼]     Tue Oct  4 11:16    gone - no logout
shou     pts/4        [連入 IP 打碼]     Mon Oct  3 16:26 - 19:23  (02:57)
shou     pts/0        [連入 IP 打碼]     Mon Oct  3 10:45 - 13:07  (02:22)
shou     pts/0        [連入 IP 打碼]     Mon Oct  3 02:29 - 07:13  (04:44)
shou     pts/0        [連入 IP 打碼]     Sun Oct  2 05:32 - 20:00  (14:27)
[...]

$ sudo less /var/log/auth.log

(or sudo journalctl -u ssh)

[...]

The contents were normal; there was no trace of failed root login attempts. The intruder using CentOS + Bash seems to have hit a wall: he had no way to access the host's /etc/ssh/sshd_config, so he couldn't know what security measures I had taken. Besides, I have a firewall, and connections to TCP/22 are dropped outright, so the absence of SSH login records is no surprise.

So let's investigate what the first intruder might have done. First, check the systemd startup units:

$ systemctl list-unit-files | grep enabled
accounts-daemon.service                    enabled
cron.service                               enabled
deploy_daemon.service                      enabled
docker.service                             enabled
[...]

$ systemctl status accounts-daemon.service
$ systemctl status cron.service
$ [...]

Fortunately, everything was fine. Without going through a systemd service, the only way to set up an auto-starting program would be to inject it into startup scripts like /etc/profile or /etc/{rc*.d/*,rc.local}, so I took a quick look at the modification dates of those files:

$ stat /etc/profile
File: '/etc/profile'
Size: 575       	Blocks: 8          IO Block: 4096   regular file
Device: 800h/2048d	Inode: 326         Links: 1
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2016-04-21 19:25:20.949283213 +0000
Modify: 2015-10-22 17:15:21.000000000 +0000
Change: 2016-04-22 00:09:57.362266110 +0000
Birth: -

$ sudo netstat -lptn | grep 0.0.0.0
[...]
tcp   0   0 0.0.0.0:[redacted]   0.0.0.0:*   LISTEN   29878/[a known service]
tcp   0   0 0.0.0.0:[redacted]   0.0.0.0:*   LISTEN   23577/sshd
[...]

Besides TCP, UDP is no exception:

$ sudo netstat -lpun | grep 0.0.0.0
[...]

## Lessons

• Configuring the firewall properly is important
• Configuring ssh security properly is important
• Think about the possible consequences before adding a rule to the firewall whitelist, especially for processes running as root
• When exposing a port, always change the default port
banana algebra https://log.lain.li/blog/banana-algebra/index.html 2015-06-08T00:00:00Z 2015-06-08T00:00:00Z
Posted on June 8, 2015
Tags: math Categories: category theory haskell

Abstract: In this post I'll be talking about the F-algebra category and related applications of F-algebras in functional programming.

## F-Algebra

So first of all, an algebra over a type $$a$$ is a function that evaluates an algebraic structure $$\rm{F}\,a$$ into $$a$$. An algebra consists of:

• An algebra structure: $$\rm{F}$$
• A carrier type: $$\rm{a}$$
• A total function: $$\rm{F(a)} \to \rm{a}$$

An example of an algebra looks like:

• Algebra struct: data F1 a = Zero | One | Plus a a
• A carrier type: any concrete type for a, e.g. Int, String, etc.
• A total function:
f1 :: F1 Int -> Int
f1 Zero       = 0
f1 One        = 1
f1 (Plus a b) = a + b

Or we can have:

f1' :: F1 String -> String
f1' Zero       = ""
f1' One        = "1"
f1' (Plus a b) = a ++ b

## F-Algebra Arrows

All algebras for an algebra structure $$\rm{F}$$ form a category $$\cal{C}$$. The objects are, of course, the algebras, while the arrows are morphisms between pairs of algebras that transform the carrier type: $$\hom_{\cal{C}}(\rm{Alg}(\rm{F},\rm{a}), \rm{Alg}(\rm{F},\rm{b}))$$.

       Alg(F,a)
F a --------------> a
|
|
| <- hom(Alg(F,a), Alg(F,b))
|
v
F b --------------> b
Alg(F,b)

For an arrow in the F-algebra category, we need a morphism on the carriers, from a to b, that makes the diagram above commute.
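A concrete sketch of such an arrow between the two algebras f1 and f1' above: the carrier map m n = replicate n '1' (a unary encoding of my own choosing, valid for non-negative n) makes the square commute:

```haskell
data F1 a = Zero | One | Plus a a

instance Functor F1 where
  fmap _ Zero       = Zero
  fmap _ One        = One
  fmap g (Plus a b) = Plus (g a) (g b)

f1 :: F1 Int -> Int
f1 Zero       = 0
f1 One        = 1
f1 (Plus a b) = a + b

f1' :: F1 String -> String
f1' Zero       = ""
f1' One        = "1"
f1' (Plus a b) = a ++ b

-- m is an algebra homomorphism: m . f1 == f1' . fmap m (for n >= 0)
m :: Int -> String
m n = replicate n '1'

main :: IO ()
main = print (m (f1 (Plus 2 3)) == f1' (fmap m (Plus 2 3))) -- True
```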


Posted on May 8, 2015
Tags: poem Categories: think

host said: where there is great joy, temper it with sorrow.


Posted on May 3, 2015
Tags: poem Categories: think

Undeepening Continuations https://log.lain.li/blog/undeepening-continuations/index.html 2015-03-13T00:00:00Z
Posted on March 13, 2015
Tags: tutorial Categories: haskell

Continuations had always been a mystery to me since I started learning monads.

My friend Javran once sent me a link to 'the mother of all monads', which got me to rethink the significance of the continuation monad. That article, to me as a novice, wasn't explanatory enough to comprehend. Therefore, to demystify the continuation monad, I spent several hours last night reading through articles about it. In the end, I finally carried my understanding of continuations out into code and achieved enlightenment. (not really)

So let's get started. First of all, we need to know, or at least have a primitive concept of, what continuations are used for. So here is my understanding:

A continuation is an intermediate state that allows computation to run at that point.

You don't need to try hard to comprehend the statement above. By going through this tutorial you will form your own understanding of it.

## Computation with Holes

As we said above, a continuation is an intermediate state from which we can delay the computation into the future. Let's see what these functions look like:

transformOne = ??? 1

doSomethingOnOne = do
putStrLn "before execution"
??? 1
putStrLn "after execution"

So we're defining two functions with this feature. The first one is rather easy to see: on the blank (???) there should be a function that takes an integer and yields whatever it likes. In doSomethingOnOne there should also be a function like the first, except that it is constrained to return an IO a.

Here’s the modified finished functions and their corresponding types:

transformOne :: (Int -> r) -> r
transformOne f = f 1

doSomethingOnOne :: (Int -> IO a) -> IO ()
doSomethingOnOne f = do
putStrLn "before execution"
val <- f 1
putStrLn ("we get " ++ show val)

If you have some experience with Control.Monad.Cont, you might recognize the pattern in these types. Strictly speaking, the latter one isn't what the continuation monad looks like, because IO a is not the same as IO (). Well, this is normal in real-life programming, but we can't call them continuations because they're not composable. And one of the main purposes of generalizing continuations is to make them composable so they can be an instance of Monad. Why? Well, a monad is just a MONOID in the category of endofunctors; monoid implies composability, what's the problem? But why? Well, a monoid is just a SEMIGROUP combined with an identity object, what's the problem? But why? Well, that's how semigroup is defined, what's the problem?

Anyway, we'll see that if we restrict continuations to the type (a -> r) -> r, we make our life easier. For now, just remember that (a -> r) -> r is the type of a continuation. This means we can't keep the second piece of code above unless we add a return val at the end of the do block. It also means you can't have a continuation like foo f = show (f 1), which modifies the result.

These functions are what we call continuations. It's intuitive to think of them as functions with a hole to be filled. To fill the hole in a continuation, we feed it a function that takes the passed-in value and returns something needed from outside the hole. I call the functions fed to a continuation hole-filler functions. The argument a hole-filler takes I call the seed, and its return value I simply call the yield. I made up these names just to refer to the components more intuitively.

## How to Fill in the Holes

We have now seen what (a -> r) -> r tastes like. You might think continuations are weak, since we are even forbidden from modifying the hole-filler's yield. Well, yes, we cannot modify the yield. But continuations are not as weak as you might think, because modifying the yield is not the correct way to use a continuation.

If you just want to modify the result of a passed-in function and carry on the computation, you want a Monad instead of a continuation. Yet while you cannot modify the yield, you are free to modify the seed.

The significance of continuations is, in my understanding, just hole-filling. If we're playing with continuations, our aim is to give out a seed to an incoming hole-filler, about which we know nothing. In other words, we pass the result of our computation IN instead of returning it OUT.

If we compose a bunch of continuations together, getting a deeply nested continuation, how can we take the result out, given that the computations always throw their results inwards via the seeds? Can we acquire the seed from outside the continuation? Just think of the simplest example: transformOne f = f 1. The answer is id (id x = x). If we feed the continuation the id function as hole-filler, it will yield the seed without any modification, and then we can acquire it.

Once we take the result out, we can do whatever we want with it. We can also take actions directly on the result: think of feeding a continuation a print hole-filler, and we will see the seed printed out as expected.

Now let's think: how do we generate a continuation from a single value, such that we can take the result out by feeding the continuation id? The solution isn't very hard:

genCont :: a -> (a -> r) -> r
genCont val f = f val

or, more point-freely:

genCont = flip ($)

So far, we have gained a basic idea of how continuations are created and composed together.

## Functor Property of Continuations

We now look at how to transform the seed with a function. If we have a function f :: a -> b, we hope we can use it to transform a continuation of type (a -> r) -> r into one of type (b -> r) -> r. What does that mean? If I have a continuation whose seed has type Int, how can I feed it a hole-filler that eats Strings? The answer is to have a transforming function of type Int -> String. Here's how these functions might look:

toStr :: Int -> String
toStr = show

foo :: (Int -> r) -> r
foo f = f 1

bar :: (String -> r) -> r
bar f = f "1"

If we now have toStr and foo, how should we combine them to form bar? Some might recognize that what we're trying to implement is just fmap, if we make continuations an instance of Functor. We can do that, so let's try. First we wrap functions of this pattern into a newtype, call it Cont:

newtype Cont r a = Cont { runCont :: (a -> r) -> r }

The reason I write Cont r a instead of Cont a r is that the varying type of a continuation is the type of its seed, rather than that of its yield. So now we try:

instance Functor (Cont r) where
  fmap f (Cont cnt) = ???

The return value of fmap should be a continuation with seed type b. So first of all, we should have a continuation that takes a hole-filler of type b -> r:

fmap :: (a -> b) -> (Cont r a) -> (Cont r b)
fmap f (Cont cnt) = Cont $ \(hf :: b -> r) -> (??? :: r)

We need an r as the result, which should be generated by the passed-in hf. And hf takes a value of type b. Obviously, this b should be transformed by f from an a, i.e. f (??? :: a). So now we would have:

fmap :: (a -> b) -> (Cont r a) -> (Cont r b)
fmap f (Cont cnt) = Cont $ \hf -> hf (f (??? :: a))

Now the problem has been converted to: how can we extract an a from the passed-in continuation cnt? The word extract implies that we have to get INTO it for what we want, just as we did above with id. (cnt id) will give us an a, pretty cool. For a reason you’ll see soon, here I will expand the definition of id, which is \x -> x:

fmap :: (a -> b) -> (Cont r a) -> (Cont r b)
fmap f (Cont cnt) = Cont $ \hf -> hf (f (cnt (\a -> a)))

Looks good; it seems this way we can extract the a out of cnt. But wait, it doesn’t typecheck? Let’s see what happens. Without the type constraint above, the code compiles, but when we query the type of fmap, it comes out as (a -> b) -> (Cont a a) -> (Cont r b).

This is not what we want. The return type of cnt shouldn’t be restricted to a as it is here; rather, it should share the same r with the one in the Cont r b that fmap finally returns. So let’s see how we can achieve that.

fmap :: (a -> b) -> (Cont r a) -> (Cont r b)
fmap f (Cont cnt) = Cont $ \hf -> hf (f (cnt (\a -> (??? :: r))))

However, f expects an argument of type a, doesn’t it? How can we have the a in the context of f while keeping the return type of cnt as r? Okay, we look for where an r is needed in the context, and we easily find it:

fmap :: (a -> b) -> (Cont r a) -> (Cont r b)
fmap f (Cont cnt) = Cont $ \hf -> ((hf (f a)) :: r)

That means, if we have an a to feed into f, then we get a b to feed into the hole-filler hf, and the hole-filler will yield an r, which is what we wanted. But we don’t have an a for f at this point. The solution is to wrap the extraction of a around hf (f a):

fmap :: (a -> b) -> (Cont r a) -> (Cont r b)
fmap f (Cont cnt) = Cont $ \hf -> cnt (\a -> hf (f a))

It typechecks, so it should be correct. Try it out:

runCont (fmap (+1) $ Cont (genCont 3)) id     -- => 4

Good.
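Putting the pieces together, here is a self-contained sketch of the Functor instance with a small check (the show-based demo is my own example):

```haskell
newtype Cont r a = Cont { runCont :: (a -> r) -> r }

genCont :: a -> (a -> r) -> r
genCont val f = f val

instance Functor (Cont r) where
  fmap f (Cont cnt) = Cont $ \hf -> cnt (\a -> hf (f a))

-- Turn an Int-seeded continuation into a String-seeded one via show:
demo :: String
demo = runCont (fmap show (Cont (genCont (3 :: Int)))) id   -- => "3"
```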

## Monadic Continuations

We now make continuation an instance of a monad:

instance Monad (Cont r) where
  return           = ???
  (Cont cnt) >>= f = ???

The type of return is a -> Cont r a. Unwrapping the Cont, we have a -> (a -> r) -> r. Looks familiar? Yup, it is just the genCont we had above. The aim of return is to create a continuation from a value, the same as what genCont does. So this part is easy:

return x = Cont $ \hf -> hf x
-- or point-lessly:
-- return = Cont . flip ($)
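As a quick sanity check (a standalone sketch; ret is my own name, used to avoid colliding with the Prelude’s return):

```haskell
newtype Cont r a = Cont { runCont :: (a -> r) -> r }

-- The same definition as return above, under a standalone name:
ret :: a -> Cont r a
ret x = Cont $ \hf -> hf x

-- Feed the created continuation a (+1) hole-filler:
check :: Int
check = runCont (ret 41) (+1)   -- => 42
```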

Actually, after deducing fmap, it is much easier to grasp >>=. The key is again to take the value out of the continuation supplied as the first argument of >>=. The way to do this is similar to fmap.

-- cnt     :: (a -> r) -> r
-- f       :: a -> Cont r b
-- (>>=)   :: (Cont r a) -> (a -> Cont r b) -> Cont r b
-- hf      :: b -> r
-- runCont :: (Cont r b) -> ((b -> r) -> r)

(Cont cnt) >>= f = Cont $ \hf -> cnt (\x -> runCont (f x) hf)

This solution typechecks. Let’s look into what it does. First of all, a hole-filler hf of type b -> r is taken in. We then feed cnt a function that receives the seed x, applies f to obtain the next continuation of type Cont r b, and runs that continuation with hf. In other words, the result of the first computation flows into f, and the final hole-filler is handed over to whatever continuation f produces.
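The full instance can be sketched as one compilable piece. Note that modern GHC requires Functor and Applicative instances alongside Monad, so they are included here; the do-notation pipeline is my own example:

```haskell
newtype Cont r a = Cont { runCont :: (a -> r) -> r }

instance Functor (Cont r) where
  fmap f (Cont cnt) = Cont $ \hf -> cnt (hf . f)

instance Applicative (Cont r) where
  pure x = Cont ($ x)
  Cont cf <*> Cont cv = Cont $ \hf -> cf (\f -> cv (hf . f))

instance Monad (Cont r) where
  return = pure
  Cont cnt >>= f = Cont $ \hf -> cnt (\x -> runCont (f x) hf)

-- Chain continuations with do-notation; each step's seed feeds the next.
pipeline :: Cont r Int
pipeline = do
  x <- return 3
  y <- return (x * 2)
  return (y + 1)

result :: Int
result = runCont pipeline id    -- => 7
```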

## Why monad?

Disclaimer: DO NOT READ THIS SECTION. This section is purely my OWN understanding of the continuation monad. I try to write correct things, but my thoughts are idiosyncratic and could be quite misleading to beginners. If you think you haven’t understood continuations yet, don’t read this section, because it will muddle you up once again. Otherwise, if you have understood continuations well, you don’t need my primitive, partial, and inaccurate opinion. Anyway, don’t read it.

A continuation is just a kind of computation. A continuation can generate a dependent output that relies on a value, either a plain value or the output of another continuation; we call this the composability property. (Although I use some general terms here, the way continuations take input is still very different from that of functions.) On the other hand, we know we can produce a continuation that will generate a specified plain output. Therefore a continuation is a monad, whose ‘bind’ operation is the ‘compose’ operation of continuations.

‘Composing’ two computations means to collapse them into one. On the other hand, computations are the arrows in the category of inputs and outputs. On the third hand, we know we can always create a computation that gives a fixed output no matter what the input is.

We’re now entering the domain of monads, so we need to make use of our knowledge of how to implement (>>=).
