# Squinting at fusion

This being my first blog post and all, I’ll try to maximise boredom and minimise readability by writing as few lines of text as possible. Here we go…

As we know, recursive data types are fixpoints of non-recursive ones. So, for instance, the standard list data type:

data [a] = [] | a : [a]

is just the fixpoint of this guy:

data PreList a s = Nil | Cons a s

with these simple injection/projection functions:

inject :: PreList a [a] -> [a] inject Nil = [] inject (Cons x xs) = x : xs project :: [a] -> PreList a [a] project [] = Nil project (x:xs) = Cons x xs

Of course, `PreList` is also a functor:

instance Functor (PreList a) where fmap f Nil = Nil fmap f (Cons x s) = Cons x (f s)

Now, our goal is to mutilate the standard short cut fusion rules by using `PreList` as much as possible. Let's do `destroy/unfoldr` first:

destroy :: (forall s. (s -> PreList a s) -> s -> t) -> [a] -> t destroy g = g project unfoldr :: (s -> PreList a s) -> s -> [a] unfoldr f = inject . fmap (unfoldr f) . f

The fusion rule is

destroy g (unfoldr f s) = g f s

Of course, `unfoldr` is just the list anamorphism but what's so special about `destroy`? If we squint hard enough at its type we might realise that

forall t. (forall s. (s -> PreList a s) -> s -> t) -> t

is, in fact, isomorphic to

exists s. (s -> PreList a s, s)

This being Haskell, we have to introduce a separate data type for the existential:

data Unfolding a = forall s. Unfolding (s -> PreList a s) s

This makes the signatures of our two functions a lot simpler:

destroy :: [a] -> Unfolding a destroy xs = Unfolding project xs unfoldr :: Unfolding a -> [a] unfoldr (Unfolding f s) = ana s where ana = inject . fmap ana . f

And the fusion rule is a bit nicer, too:

destroy (unfoldr s) = s

In fact, what we have here is almost but not quite stream fusion: `destroy` is equivalent to `stream`, `unfoldr` to `unstream` and `Unfolding` to `Stream`. The only difference is that `PreList` (which corresponds to the `Step` data type in the paper) is missing the `Skip` constructor which, unfortunately, is crucial for making the whole thing work.

This part was clear to me back when we were doing stream fusion but the next bit I didn't understand until recently. The question is: can we do something similar with `foldr/build`? Again, let's get rid of as many funny types as possible and use `PreList` instead:

foldr :: (PreList a s -> s) -> [a] -> s foldr f = f . fmap (foldr f) . project build :: (forall s. (PreList a s -> s) -> s) -> [a] build g = g inject

If you are wondering where these signatures come from, here is a little hint:

(a -> s -> s) -> s -> t ~ ((a,s) -> s) -> (() -> s) -> t ~ ((a,s)+() -> s) -> t ~ (PreList a s -> s) -> t

Now, by squinting at the types we might just see that it makes sense to introduce an abstraction:

data Folding a = Folding (forall s. (PreList a s -> s) -> s)

and use it:

foldr :: [a] -> Folding a foldr xs = Folding (\f -> let cata = f . fmap cata . project in cata xs) build :: Folding a -> [a] build (Folding g) = g inject

VoilĂ ! We now have a shiny new `foldr/build` fusion rule:

foldr (build s) = s

It's probably worth pointing out that `Folding` does, in fact, support many useful list operations. Here is an example:

instance Functor Folding where fmap f (Folding g) = Folding (\h -> g (h . emap)) where emap Nil = Nil emap (Cons x s) = Cons (f x) s

So is there a point to all this? Well, I find it quite interesting that `foldr/build` fusion can be rewritten in this way. I'm even more intrigued by what we get if we squint at the list hylomorphism:

refold :: Unfolding a -> Folding a refold (Unfolding g s) = Folding (\f -> let hylo = f . fmap hylo . g in hylo s)

Could we have both stream fusion and `foldr/build` in one framework? Would that be useful? Is there a way of working on fusion without irreparably damaging my eyes?

Could we have both stream fusion and foldr/build in one framework?I don’t see any particular reason why not. I mean, both destroy/unfold and foldr/build are performing the same game, the only difference is in their duality. The real trick is the problem of interoperability, i.e. choosing to use Folding or Unfolding representations and when to switch. Most fusion frameworks try to avoid the conversion between them, and so they pick one or the other as the canonical representation. I think the correct way forward would be to have the fused structure take hylomorphic form. Both anamorphisms and catamorphisms are simplifications of hylomorphisms, and the root of fusion is turning them into their underlying hylomorphisms. So far, I don’t know of any system which takes the hylo-first approach though.

Of course, both varieties of fusion can be applied to any recursive type, not just lists. (And I’ve written up a type class for doing generic ana/build fusion for arbitrary mu-recursive types.) Lists are just special because they’re linear chains and so certain operations can be fused for lists which can’t be fused in general (i.e. in a manner agnostic to the particular shape of the recursive type). The prime example here is fusing tail concatenation (which GHC calls “augment”). Augmentation only works for lists because a) there is only a single tail due to non-branchiness, and b) there is only one base case for how that tail can terminate. If we relax either constraint then the single augment function turns into many functions depending on where and what we augment in the structure (or alternatively, the type of the augment function can no longer be generic in the fusion form of the open-recursive functor).