Last week I described functional error handling in Elixir based on Chapter 4 of Functional Programming in Scala. This week I’ll discuss the first part of Chapter 5 on “Strictness and Laziness”.
Also, this week’s post has come a day early; tomorrow I’ll be traveling to Elixir Conf 2014!
Source Code
As usual, I’ve uploaded my tests and exercises to github. If you want to work through the exercises for yourself, then follow the directions in the repo’s README before reading the rest of this post.
Strict Arguments
In Scala, arguments to a function can be strict (as in strictly evaluated) or non-strict. Strict arguments are the default, so for example if we call square(41.0 + 1.0)
then 41.0 + 1.0
will be evaluated before the call to square. A non-strict argument will not evaluate the argument until it is referenced. It does this by wrapping the argument in an anonymous function called a thunk. The function can then lazily evaluate its argument by calling the thunk.
Elixir doesn’t have non-strict arguments for functions. Though it does for macros. Throughout the exercises below I will use explicit thunks in order to allow functions to lazily evaluate arguments.
Lazy Streams
I had a bit of difficulty getting started with this set of exercises. The theory in the text made sense but getting the translation from Scala to Elixir right wasn’t easy for me. In fact, I’m still not sure that its correct. I may need to make more changes to the stream implementation for next week’s post on “Infitnite Streams and Corecursion”.
The problems I had stemmed from the fact that Scala provides some syntactic sugar for non-strict function arguments which Elixir lacks. That is, in Scala, the compiler will automatically generate a function wrapping an argument, called a thunk, based on type inference using the function’s signature. In Elixir, function arguments are always strictly evaluated, while macro arguments are non-strict.
cons
The text builds up lazily evaluated streams using cons
similarly to the way it built up lists in Chapter 3. However, the head and tail of the cons
are lazily evaluated. I was able to use an Elixir macro to build up cons
like this:
defmodule Cons do
defstruct head: nil, tail: nil
end
# Use a macro for lazy evaluation
defmacro cons(h, t) do
quote do
%Cons{ head: fn () -> unquote(h) end, tail: fn () -> unquote(t) end }
end
end
The cons
macro wraps both h
and t
in anonymous functions and creates a Cons
struct to hold them.
When I initially implemented the cons
macro I neglected to wrap t
in an annonymous function. Even with this oversight the implementation was working acceptably up until exercise 7 when I had extensively use foldr
. I ended up fixing cons
and starting over with new implementations of all the functions.
Testing Approach
Streams are supposed to be evaluated lazily, that is items are not evaluated unless they are going to be consumed. As we will see in the exercises there are functions like take
which take only the first few elements of a stream and ignore the rest. I wanted to be able to test that when calling take(stream, 2)
elements beyond 2 are not evaluated in the stream. To do this, I wrote two helper functions to build test streams:
def build_stream do
L.cons(1, L.cons(2, L.cons(3, [])))
end
def build_stream_with_sentinel do
L.cons(1, L.cons(2, L.cons(3, L.cons(raise("sentinel reached"), []))))
end
I use build_stream
in tests that will evaluate the entire stream. I use build_stream_with_sentinel
in tests that should not evaluate the entire stream. The raise
in the last cons will trigger an error when the element is evaluated thus causing the test to fail. Consider the test for take
test "take" do
assert L.take(build_stream_with_sentinel, 2) |> L.to_list == [1, 2]
end
The test calls build_stream_with_sentinel
and then takes only the first two elements, without evaluating them, and creates a new stream. Then to_list
is used to produce a result that can be easily confirmed. to_list
evaluates the elements. If take
were to take too many items or evaluate the original stream then the exception would be raised and the test would fail.
Examples - to_list and take
I’ll go through the first few exercises as examples of how to use the Cons
struct.
The first exercise was to implement a to_list
function that can create a standard Elixir list from our Stream.
def to_list([]), do: []
def to_list(%Cons{ head: h, tail: t}), do: [ h.() | to_list(t.()) ]
The base case here is []
because, for better or worse, I choose to represent the end of a Stream with []
. Looking back at it now I think this choice is somewhat arbitrary. The text creates a specific empty
value though I’m not sure how I would do that with Elixir’s pattern matching. Maybe an atom like, :end_of_stream
would be more appropriate.
The recursive case matches against a Cons
struct and binds variables h
and t
to the head and tail of the Cons
respectively. Recall that h
and t
are functions, or thunks, that allow the values to be evaluated lazily. In building a list we must evaluate and the head of the list is evaluated as h.()
, i.e. the result of calling the thunk h
. The tail of the list is computed recursively by calling to_list
on the tail. Again, to evaluate the tail we must call through the thunk t
.
In exercise 2 the text asked me to implement the take
function that I described above. Here’s my implementation:
# Exercise 2
# Build up a new cons consisting of the elements to take
def take([], _n), do: []
def take(%Cons{head: h, tail: t}, n) when n <= 1, do: %Cons{head: h, tail: terminal}
def take(%Cons{head: h, tail: t}, n) when n > 1, do: %Cons{head: h, tail: ld( take(t.(), n - 1) )}
Again, the base case matches against []
, the terminator for the stream. In the case of an empty stream we return another empty stream, regardless of the value of n
. Next, is a second base case, where there is a general Cons
struct and n <= 1
. In this case we build a new terminal Cons
struct to go at the end of the new stream - there is no need to recurse here. Also, note the use of the helpers terminal
and ld
(lamda) which are simply
defmacrop ld(x) do
quote do
fn -> unquote(x) end
end
end
def terminal, do: fn -> [] end
Aside: there is an interesting discussion on syntax for producing anonymous functions like ld
which can simplify some Elixir code
Finally, the recursive case matches a general Cons
struct and any value of n > 1
. This case builds up a new Cons
struct and recursively calls take
to build up the tail of the stream.
I initially wrote take
using only patterns. The single recursive case used an if
statement. Then I realized this would be a good place to try out guard clauses so I rewrote the example.
I could have also written take
like this:
# Build up a new cons consisting of the elements to take
def take([], _n), do: []
def take(%Cons{head: h, tail: t}, n) when n <= 1, do: cons(h.(), [])
def take(%Cons{head: h, tail: t}, n) when n > 1, do: cons(h.(), take(t.(), n - 1))
reusing the cons
macro to build up the Cons
structs. One difference between the two implementations is that using the cons
macro creates a new anonymous function. I don’t know how costly anonymous functions are or if this is even a concern. I believe, the only other differences are stylistic. Using cons
is more compact and, perhaps, easier to read. Using %Cons{...}
while more verbose may be clearer in that it does include the term h.()
. Neither case actually evaluates h
but this fact may be obscured in cons(h.(), ...)
since the reader must remember that cons
is a macro and is non-strict in it’s arguments.
foldr
The text gave an implementation in Scala of foldRight
which I’ve translated to the following
def foldr([], acc, _f), do: acc.()
def foldr(%Cons{head: h, tail: t}, acc, f) do
f.(h.(), fn -> foldr(t.(), acc, f) end)
end
Note that this foldr
takes a stream, a lazily evaluated accumulator (acc is a function that returns the initial accumulated value), and a function to apply. I’ve written a recursive definition. The base cases matches an empty stream, denoted by [], and just returns the initial value, acc.()
. The recursive case matches a general Cons
struct and calls the function f
as usual passing it the (eavluated) head element, h.()
and the result of folding the tail of the stream, t.()
.
It took me several tries through trial and error to write a correct implementation. At first I didn’t realize that acc
needs to be a function for lazy evaluation. Second, I didn’t realize I needed to evaluate acc.()
in the base case. Strangely, I was able to implement several functions including for_all
and map
using an incorrect implementation of foldr
. I only ran into trouble when trying to implement filter
and append
. I guess these two problems are symmetric in that a strict acc
wouldn’t need special evaluation in the base case, it would already be bound to the necessary value. It was only once I got to filter
and append
that I required evaluation of acc
.
I won’t go into detail on the implementations of map
, filter
, append
, and flat_map
. Given a working foldr
they look much like the List
versions. The only tricky parts are in how to represent the lazy accumulator and when to evaluate items in the stream.
I do want to discuss some of the techniques I ended up using to debug the problems I had with foldr
. I started out trying to use IO.inspect/2
but eventually turned to iex
.
Here’s the first issue I looked at with append
$ iex -S mix
Erlang/OTP 17 [erts-6.0] [source-07b8f44] [64-bit] [smp:8:8] [async-threads:10] [hipe] [kernel-poll:false] [dtrace]
Interactive Elixir (0.14.3) - press Ctrl+C to exit (type h() ENTER for help)
iex(1)> require Laziness
nil
iex(2)> alias Laziness, as: L
nil
iex(3)> a = L.cons(1, L.cons(2, L.cons(3, [])))
%Laziness.Cons{head: #Function<20.106461118/0 in :erl_eval.expr/5>,
tail: #Function<20.106461118/0 in :erl_eval.expr/5>}
iex(4)> L.to_list(a)
[1, 2, 3]
iex(5)> b = L.cons(1, L.cons(2, L.cons(3, [])))
%Laziness.Cons{head: #Function<20.106461118/0 in :erl_eval.expr/5>,
tail: #Function<20.106461118/0 in :erl_eval.expr/5>}
iex(6)> L.to_list(b)
[1, 2, 3]
iex(8)> c = L.append(a, b)
%Laziness.Cons{head: #Function<10.13241963/0 in Laziness.append/2>,
tail: #Function<11.13241963/0 in Laziness.append/2>}
iex(9)> L.to_list(c)
[1, 2, 3, 1, 2, 3]
Here, in the iex
session I get the expected result, but my test fails? I finally realized I wasn’t actually doing the same thing that I was doing in the test. The text says that append
is non-strict in its second argument so my test says:
test "append" do
assert L.append(build_stream, fn -> build_stream end) |> L.to_list == [1, 2, 3, 1, 2, 3]
end
In iex
I didn’t wrap the second argument in a function! Thinking about it I realized the problem was that I hadn’t defined Cons
and cons
correctly and that the tail needed to be a function. In fact, I believe this is the reason why the second argument to append
needs to be a function. Once this was fixed I was able to pass the append
test. I never really used irb
with ruby but this encouraging session whet my appetite for iex
so I went on to debug filter
.
iex(9)> b = L.cons(1, L.cons(2, L.cons(3, L.cons(raise("sentinel reached"), []))))
...
iex(31)> r = Laziness.filter(b, fn (x) -> x < 3 end) |> L.take(2)
%Laziness.Cons{head: #Function<17.21518084/0 in Laziness.filter/2>,
tail: #Function<9.21518084/0 in Laziness.take/2>}
iex(32)> r.head
#Function<17.21518084/0 in Laziness.filter/2>
iex(33)> r.head.()
1
iex(34)> r.tail
#Function<9.21518084/0 in Laziness.take/2>
iex(35)> r.tail.()
%Laziness.Cons{head: #Function<17.21518084/0 in Laziness.filter/2>,
tail: #Function<9.21518084/0 in Laziness.take/2>}
iex(36)> r.tail.().head.()
2
iex(37)> r.tail.().tail.()
** (RuntimeError) sentinel reached
(laziness) lib/laziness.ex:48: Laziness.fold_right/3
(laziness) lib/laziness.ex:30: anonymous fn/2 in Laziness.take/2
I was expecting to get []
at the end but somehow the result contains the sentinel. This made me think there was a problem with take
running to far through stream b
. I played around with take
in iex
but wasn’t able to see the problem. I went back to filter
iex(52)> r = Laziness.filter(b, fn (x) -> x < 3 end)
%Laziness.Cons{head: #Function<17.21518084/0 in Laziness.filter/2>,
tail: #Function<18.21518084/0 in Laziness.filter/2>}
iex(53)> r.head.()
1
iex(54)> r.tail.()
%Laziness.Cons{head: #Function<17.21518084/0 in Laziness.filter/2>,
tail: #Function<18.21518084/0 in Laziness.filter/2>}
iex(55)> r.tail.().head.()
2
iex(56)> r.tail.().tail.().head.()
** (RuntimeError) sentinel reached
(laziness) lib/laziness.ex:48: Laziness.fold_right/3
So I find that filter contains [1, 2, sentinel] which means that filter
is doing the right thing and removing 3. This still points to a problem with take
. I keep poking around in iex
trying to find that problem. Finally, I noticed something peculiar. Looking back to iex(31)
and iex(35)
I see that in both cases we have:
tail: #Function<9.21518084/0 in Laziness.take/2>
This anonymous function in Laziness.take/2
is the lambda used in to wrap the recursion.
ld( take(t.(), n - 1) )
But, there should have been no recursive tail in iex(35)
as we should have been at the end of the stream. This led me to realize that there was an off-by-one error in take
such that I did not terminate the recursion soon enough. Above I showed you the correct solution for take, but at the time I had:
def take(%Cons{head: h, tail: t}, n) do
if (n > 1), do: %Cons{head: h, tail: ld( take(t.(), n - 1) )}, else: []
end
I rewrote it as:
def take(%Cons{head: h, tail: t}, n) do
if (n > 1) do
%Cons{head: h, tail: ld( take(t.(), n - 1) )}
else
%Cons{head: h, tail: ld([])}
end
end
to fix the problem. Later, I refactored this into the two functions with guard clauses matching on different values of n
as I showed above.
Coming up
This has become a pretty long post but I’ve learned quite a bit this week and am glad to be able to share it. Before I go, I want to take a moment to describe the next few posts that you can expect to see here on Learning Elixir.
There will likely be two posts next week, in the first I will tell you all about ElixirConf.
In the later part of next week I plan to continue with the exercises in Functional Programming in Scala, section 5.4 “Infinite Streams and Corecursion”.
Two weeks from now I want to take a quick break from the text to compare what I’ve learned in chapter 5 with the official Streams implementation for Elixir. I’ll to go through the Streams module source and compare and contrast it with what we’ve done here.
After that I’ll return to the text and continue with chapter 6.