# Generics for small, fixed-size vectors

Recently, I decided to clean up and release a small library which I hacked together several months ago and then all but forgot about. I find it quite amusing; perhaps you will, too.

The library implements a framework for computing with small, fixed-size vectors such as complex numbers or coordinates. My goal was to be as generic and efficient as possible. In particular, it should be easy to generically define common functions such as dot product or magnitude for vectors of arbitrary arity and to add new vector types and operations. Equally importantly, there shouldn’t be any run-time overhead – all operations should be as fast as if they were written by hand.

There are a number of nice libraries on Hackage which provide all this to varying degrees. However, I thought it would be fun to write my own and perhaps I could come up with a design that is minimal in some sense. In the end, I’m not sure about minimal but I sure did have quite a bit of fun.

So what is a small, fixed-size vector? Essentially, it should be completely defined by two operations:

construct :: a -> ... -> a -> v a inspect :: v a -> (a -> ... -> a -> b) -> b

That is, we can construct a vector from a bunch of coordinates and inspect it with a function that takes a bunch of coordinates and produces some result. The number of coordinates is the arity of the vector and is known at compile time. Ideally, the framework should work for all types that define these two operations and nothing else. The question is: can we just put them in a type class and implement everything else on top of it?

The two signatures suggest (unsurprisingly) that functions of type `a -> ... -> a -> b` play a rather central role here: `construct` is such a function and `inspect` takes such a function as an argument. Nowadays, it is fairly straightforward to encode them in the type system. First, we define type-level natural numbers which represent the arity:

data Z -- zero data S n -- successor of n

Now, we can define such functions as a type family parametrised by the arity `n`, the type of arguments `a` and the result type `b`:

type family Fn n a b type instance Fn Z a b = b type instance Fn (S n) a b = a -> Fn n a b

For example, functions or arity 2 are represented by `Fn (S (S n)) a b = a -> a -> b`.

To define a generic vector class, we also have to associate vectors with their arity. This is easy:

type family Dim (v :: * -> *)

One design decision here is that vectors are type constructors such as `Complex` which must be applied to an element type. This makes certain things easier but means that tuples like `(Int,Int,Int)` aren’t vectors. An alternative design in the style of the vector-space package is also possible and easily supported by the framework. But that’s a topic for another post.

Anyway, a generic vector class should be easy now:

class Vector v a where construct :: Fn (Dim v) a (v a) inspect :: v a -> Fn (Dim v) a b -> b

Alas, this fails quite miserably. The problem is that `a` and `v` are not fixed in `construct`. `Fn` is a non-injective type family and so there is no way to find outwhat `v` and `a` are from `Fn (Dim v) a (v a)`.

To fix that, it is enough to introduce an injective type constructor which captures `Fn`:

newtype Fun n a b = Fun (Fn n a b)

Now, `Fn` is a type function which captures functions from `n` `a`s to `b` and `Fun` is a type constructor which does the same but is injective.

As an aside, it is well possible to make `Fun` an injective data type family or even a GADT and forgo `Fn` altogether:

data family Fun n a b newtype instance Fun Z a b = FunZ b newtype instance Fun (S n) a b = FunS (a -> Fun n a b)

However, this really messes up GHC’s optimiser and leads to very inefficient code. The slightly more complex setup with `Fn` and `Fun` is much better in this respect and not that hard to understand. Just keep in mind that `Fn n a b` doesn’t fix the types `n`, `a` and `b` and `Fun` does.

The final definition of `Vector` looks like this:

class Arity (Dim n) => Vector v a where construct :: Fun (Dim v) a (v a) inspect :: v a -> Fun (Dim v) a b -> b

Here is a simple instance:

type instance Dim Complex = S (S Z) instance RealFloat a => Vector Complex a where construct = Fun Complex inspect (Complex x y) (Fun f) = f x y

This is nice but rather pointless unless we can do something useful with these vectors. And we can! All necessary functionality is folded into the `Arity` type class which is a superclass of `Vector`. Here is its definition and instances for `Z` and `S`:

class Arity n where accum :: (forall m. t (S m) -> a -> t m) -> (t Z -> b) -> t n -> Fn n a b apply :: (forall m. t (S m) -> (a, t m)) -> t n -> Fn n a b -> b instance Arity Z where accum f g t = g t apply f t h = h instance Arity n => Arity (S n) where accum f g t = \a -> accum f g (f t a) apply f t h = case f t of (a,u) -> apply f u (h a)

This might look a bit scary but really isn’t. At least not very. The two methods allow us to define (`accum`) and apply n-ary functions generically. Let’s look at the latter first. In `apply f t h`, `t` is a seed which can generate `n` values of type `a`, where `n` is statically encoded in its type. From this seed, the function `f` produces one a value and a new seed which now embeds `n-1` values. It can do this for any `n` which means that we can apply it repeatedly to obtain all values stored in the seed. These values are then passed to `h`, which ultimately yields a result of type `b`. Here is an example of how to use `apply`:

data T_replicate n = T_replicate replicateF :: forall n a b. Arity n => a -> Fun n a b -> b replicateF x (Fun h) = apply (\T_replicate -> (x, T_replicate)) (T_replicate :: T_replicate n) h

This function applies an n-ary `Fun` to `n` copies of `x`. The seed `T_replicate` doesn’t carry any useful information, it just fixes `n`. When asked to produce a new a, we simply return `x`; the new seed is `T_replicate` with a different `n`.

Now, it is easy to define a generic replicate for arbitrary vectors:

replicate :: Vector v a => a -> v a replicate x = replicateF x $ construct

To understand how it all works, Let’s see how the term `replicate 0 :: Complex Double` is evaluated:

replicate 0 :: Complex Double = replicateF x (construct :: Fun (S (S Z)) Double (Complex Double)) = let f :: forall m. T_replicate (S m) -> (Double, T_replicate m) f T_replicate = (x, T_replicate) in apply f (T_replicate :: T_replicate (S (S Z))) Complex = let f = ... in case f (T_replicate :: T_replicate (S (S Z))) of (x,u) -> apply f u (Complex x) = let f = ... in apply f (T_replicate :: T_replicate (S Z)) (Complex x) = let f = ... in case f (T_replicate :: T_replicate (S Z)) of (x,u) -> apply f u (Complex x) = let f = ... in apply f (T_replicate :: Z) (Complex x x) = Complex x x

The interesting part is, of course, that replicate works for vectors of arbitrary arities. Equally importantly, everything is evaluated at compile time – GHC really compiles the term to `(0 :+ 0)`.

The other method of `Arity`, `accum`, is slightly more involved. The call `accum f g t` produces a function which accumulates its arguments (of type `a`) by repeatedly applying `f` to an accumulator value (starting with `t`) and an `a` argument. The number of values that can be accumulated into `t` is, again, determined by its type and decreases with each application of `f`. After accumulating all arguments in this way, we end up with a value of type `t Z` from which `g` produces the final result.

Here are two examples of how to use accumulate:

newtype T_foldl b n = T_foldl b foldlF :: forall n a b. Arity n => (b -> a -> b) -> b -> Fun n a b foldlF f b = Fun $ accum (\(T_foldl b) a -> T_foldl (f b a)) (\(T_foldl b) -> b) (T_foldl b :: T_foldl b n) newtype T_map b c n = T_map (Fn n b c) mapF :: forall n a b c. Arity n => (a -> b) -> Fun n b c -> Fun n a c mapF f (Fun h) = Fun $ accum (\(T_map h) a -> T_map (h (f a))) (\(T_map h) -> h) (T_map h :: T_map b c n)

Their semantics shouldn’t be surprising:

foldlF f z = \a1 ... an -> z `f` a1 `f` ... `f` an mapF f h = \a1 ... an -> h (f a1) ... (f an)

There are, of course, more useful functions which can be defined in this way. Of these, the most important is probably `zipWithF`. I’ll leave its implementation as an exercise for the interested reader (should there be any) for now. It is a bit elaborate but not that difficult to come up with. Took me only a few hours or so.

Now that we have all these nice functions, it is quite easy to implement a lot of useful stuff for our vectors. Here are a few examples:

map :: (Vector v a, Vector v b) => (a -> b) -> v a -> v b map f v = inspect v $ mapF f $ construct zipWith :: (Vector v a, Vector v b, Vector v c) => (a -> b -> c) -> v a -> v b -> v c zipWith f v w = inspect w $ inspect v $ zipWithF f $ construct fold :: Vector v a => (b -> a -> b) -> b -> v a -> b fold f z v = inspect v $ foldF f z eq :: (Vector v a, Eq a) => v a -> v a -> Bool eq v w = inspect w $ inspect v $ zipWithF (==) $ foldF (&&) True

The structure of these functions reflects the fact that we are essentially programming in continuation-passing style.

All this is nicely generic but what about efficiency? With appropriate `INLINE` pragmas, GHC’s mighty simplifier is perfectly capable of eliminating all intermediate data and functions and produce code that is equivalent to what we would write by hand. For example, here is a small function:

isZero :: Complex Double -> Bool isZero x = V.eq x (V.replicate 0)

The code generated by GHC is perfect:

\ (x_agj :: Complex Double) -> case x_agj of _ { :+ x1_XxL y_XxO -> case x1_XxL of _ { D# x2_awT -> case ==## x2_awT 0.0 of _ { False -> False; True -> case y_XxO of _ { D# x3_Xy6 -> ==## x3_Xy6 0.0 } } } }

There is a fairly obvious connection between `accum` and `apply` and well-known list operations. Consider that `Fun n a b` is, in a sense, equivalent to a partial function of type `[a] -> b`. The `Arity` methods then correspond to the following functions:

accumList :: (t -> a -> t) -> (t -> b) -> [a] -> b applyList :: (t -> (a,t)) -> t -> ([a] -> b) -> b

Of course, `accumList` is just a disguised `foldl` and `applyList` is almost exactly `unfoldr`. It would be nicer if `accum` corresponded to `foldr` instead of `foldl` but that would lead to much uglier code.

Anyway, this is it for now. I’m still cleaning up the library (which doesn’t have any comments whatsoever at the moment) but a first version should appear on Hackage in the next week or so.

Could this be an element type for Repa or Vector?

That depends on the actual vector type. But yes, the library will include vectors suitable for this.

Why does the newtype instances formulation mess up the optimizer? It feels like there is a trivial transformation between the type family + newtype wrapper and the newtype instances formulation.

I don’t think there is any particularly deep reason for this. It just isn’t a very common case and getting it right is probably quite a bit of work which nobody has done so far.