Strictness and Laziness in Elixir - Part 1

Last week I described functional error handling in Elixir based on Chapter 4 of Functional Programming in Scala. This week I’ll discuss the first part of Chapter 5 on “Strictness and Laziness”.

Also, this week’s post has come a day early; tomorrow I’ll be traveling to Elixir Conf 2014!

Source Code

As usual, I’ve uploaded my tests and exercises to github. If you want to work through the exercises for yourself, then follow the directions in the repo’s README before reading the rest of this post.

Strict Arguments

In Scala, arguments to a function can be strict (as in strictly evaluated) or non-strict. Strict arguments are the default, so for example if we call square(41.0 + 1.0) then 41.0 + 1.0 will be evaluated before the call to square. A non-strict argument will not evaluate the argument until it is referenced. It does this by wrapping the argument in an anonymous function called a thunk. The function can then lazily evaluate its argument by calling the thunk.

Elixir doesn’t have non-strict arguments for functions. Though it does for macros. Throughout the exercises below I will use explicit thunks in order to allow functions to lazily evaluate arguments.

Lazy Streams

I had a bit of difficulty getting started with this set of exercises. The theory in the text made sense but getting the translation from Scala to Elixir right wasn’t easy for me. In fact, I’m still not sure that its correct. I may need to make more changes to the stream implementation for next week’s post on “Infitnite Streams and Corecursion”.

The problems I had stemmed from the fact that Scala provides some syntactic sugar for non-strict function arguments which Elixir lacks. That is, in Scala, the compiler will automatically generate a function wrapping an argument, called a thunk, based on type inference using the function’s signature. In Elixir, function arguments are always strictly evaluated, while macro arguments are non-strict.

cons

The text builds up lazily evaluated streams using cons similarly to the way it built up lists in Chapter 3. However, the head and tail of the cons are lazily evaluated. I was able to use an Elixir macro to build up cons like this:

defmodule Cons do
  defstruct head: nil, tail: nil
end

# Use a macro for lazy evaluation
defmacro cons(h, t) do
  quote do
    %Cons{ head: fn () -> unquote(h) end, tail: fn () -> unquote(t) end }
  end
end

The cons macro wraps both h and t in anonymous functions and creates a Cons struct to hold them.

When I initially implemented the cons macro I neglected to wrap t in an annonymous function. Even with this oversight the implementation was working acceptably up until exercise 7 when I had extensively use foldr. I ended up fixing cons and starting over with new implementations of all the functions.

Testing Approach

Streams are supposed to be evaluated lazily, that is items are not evaluated unless they are going to be consumed. As we will see in the exercises there are functions like take which take only the first few elements of a stream and ignore the rest. I wanted to be able to test that when calling take(stream, 2) elements beyond 2 are not evaluated in the stream. To do this, I wrote two helper functions to build test streams:

def build_stream do
  L.cons(1, L.cons(2, L.cons(3, [])))
end

def build_stream_with_sentinel do
  L.cons(1, L.cons(2, L.cons(3, L.cons(raise("sentinel reached"), []))))
end

I use build_stream in tests that will evaluate the entire stream. I use build_stream_with_sentinel in tests that should not evaluate the entire stream. The raise in the last cons will trigger an error when the element is evaluated thus causing the test to fail. Consider the test for take

test "take" do
  assert L.take(build_stream_with_sentinel, 2) |>  L.to_list == [1, 2]
end

The test calls build_stream_with_sentinel and then takes only the first two elements, without evaluating them, and creates a new stream. Then to_list is used to produce a result that can be easily confirmed. to_list evaluates the elements. If take were to take too many items or evaluate the original stream then the exception would be raised and the test would fail.

Examples - to_list and take

I’ll go through the first few exercises as examples of how to use the Cons struct.

The first exercise was to implement a to_list function that can create a standard Elixir list from our Stream.

def to_list([]), do: []
def to_list(%Cons{ head: h, tail: t}), do: [ h.() | to_list(t.()) ]

The base case here is [] because, for better or worse, I choose to represent the end of a Stream with []. Looking back at it now I think this choice is somewhat arbitrary. The text creates a specific empty value though I’m not sure how I would do that with Elixir’s pattern matching. Maybe an atom like, :end_of_stream would be more appropriate.

The recursive case matches against a Cons struct and binds variables h and t to the head and tail of the Cons respectively. Recall that h and t are functions, or thunks, that allow the values to be evaluated lazily. In building a list we must evaluate and the head of the list is evaluated as h.(), i.e. the result of calling the thunk h. The tail of the list is computed recursively by calling to_list on the tail. Again, to evaluate the tail we must call through the thunk t.

In exercise 2 the text asked me to implement the take function that I described above. Here’s my implementation:

# Exercise 2
# Build up a new cons consisting of the elements to take
def take([], _n), do: []
def take(%Cons{head: h, tail: t}, n) when n <= 1, do: %Cons{head: h, tail: terminal}
def take(%Cons{head: h, tail: t}, n) when n > 1, do: %Cons{head: h, tail: ld( take(t.(), n - 1) )}

Again, the base case matches against [], the terminator for the stream. In the case of an empty stream we return another empty stream, regardless of the value of n. Next, is a second base case, where there is a general Cons struct and n <= 1. In this case we build a new terminal Cons struct to go at the end of the new stream - there is no need to recurse here. Also, note the use of the helpers terminal and ld (lamda) which are simply

defmacrop ld(x) do
  quote do
    fn -> unquote(x) end
  end
end

def terminal, do: fn -> [] end

Aside: there is an interesting discussion on syntax for producing anonymous functions like ld which can simplify some Elixir code

Finally, the recursive case matches a general Cons struct and any value of n > 1. This case builds up a new Cons struct and recursively calls take to build up the tail of the stream.

I initially wrote take using only patterns. The single recursive case used an if statement. Then I realized this would be a good place to try out guard clauses so I rewrote the example.

I could have also written take like this:

# Build up a new cons consisting of the elements to take
def take([], _n), do: []
def take(%Cons{head: h, tail: t}, n) when n <= 1, do: cons(h.(), [])
def take(%Cons{head: h, tail: t}, n) when n > 1, do: cons(h.(), take(t.(), n - 1))

reusing the cons macro to build up the Cons structs. One difference between the two implementations is that using the cons macro creates a new anonymous function. I don’t know how costly anonymous functions are or if this is even a concern. I believe, the only other differences are stylistic. Using cons is more compact and, perhaps, easier to read. Using %Cons{...} while more verbose may be clearer in that it does include the term h.(). Neither case actually evaluates h but this fact may be obscured in cons(h.(), ...) since the reader must remember that cons is a macro and is non-strict in it’s arguments.

foldr

The text gave an implementation in Scala of foldRight which I’ve translated to the following

def foldr([], acc, _f), do: acc.()
def foldr(%Cons{head: h, tail: t}, acc, f) do
  f.(h.(), fn -> foldr(t.(), acc, f) end)
end

Note that this foldr takes a stream, a lazily evaluated accumulator (acc is a function that returns the initial accumulated value), and a function to apply. I’ve written a recursive definition. The base cases matches an empty stream, denoted by [], and just returns the initial value, acc.(). The recursive case matches a general Cons struct and calls the function f as usual passing it the (eavluated) head element, h.() and the result of folding the tail of the stream, t.().

It took me several tries through trial and error to write a correct implementation. At first I didn’t realize that acc needs to be a function for lazy evaluation. Second, I didn’t realize I needed to evaluate acc.() in the base case. Strangely, I was able to implement several functions including for_all and map using an incorrect implementation of foldr. I only ran into trouble when trying to implement filter and append. I guess these two problems are symmetric in that a strict acc wouldn’t need special evaluation in the base case, it would already be bound to the necessary value. It was only once I got to filter and append that I required evaluation of acc.

I won’t go into detail on the implementations of map, filter, append, and flat_map. Given a working foldr they look much like the List versions. The only tricky parts are in how to represent the lazy accumulator and when to evaluate items in the stream.

I do want to discuss some of the techniques I ended up using to debug the problems I had with foldr. I started out trying to use IO.inspect/2 but eventually turned to iex.

Here’s the first issue I looked at with append

$ iex -S mix
Erlang/OTP 17 [erts-6.0] [source-07b8f44] [64-bit] [smp:8:8] [async-threads:10] [hipe] [kernel-poll:false] [dtrace]

Interactive Elixir (0.14.3) - press Ctrl+C to exit (type h() ENTER for help)
iex(1)> require Laziness
nil
iex(2)> alias Laziness, as: L
nil
iex(3)> a = L.cons(1, L.cons(2, L.cons(3, [])))
%Laziness.Cons{head: #Function<20.106461118/0 in :erl_eval.expr/5>,
 tail: #Function<20.106461118/0 in :erl_eval.expr/5>}
iex(4)> L.to_list(a)
[1, 2, 3]
iex(5)> b = L.cons(1, L.cons(2, L.cons(3, [])))
%Laziness.Cons{head: #Function<20.106461118/0 in :erl_eval.expr/5>,
 tail: #Function<20.106461118/0 in :erl_eval.expr/5>}
iex(6)> L.to_list(b)
[1, 2, 3]
iex(8)> c = L.append(a, b)
%Laziness.Cons{head: #Function<10.13241963/0 in Laziness.append/2>,
 tail: #Function<11.13241963/0 in Laziness.append/2>}
iex(9)> L.to_list(c)
[1, 2, 3, 1, 2, 3]

Here, in the iex session I get the expected result, but my test fails? I finally realized I wasn’t actually doing the same thing that I was doing in the test. The text says that append is non-strict in its second argument so my test says:

test "append" do
  assert L.append(build_stream, fn -> build_stream end) |> L.to_list == [1, 2, 3, 1, 2, 3]
end

In iex I didn’t wrap the second argument in a function! Thinking about it I realized the problem was that I hadn’t defined Cons and cons correctly and that the tail needed to be a function. In fact, I believe this is the reason why the second argument to append needs to be a function. Once this was fixed I was able to pass the append test. I never really used irb with ruby but this encouraging session whet my appetite for iex so I went on to debug filter.

iex(9)> b = L.cons(1, L.cons(2, L.cons(3, L.cons(raise("sentinel reached"), []))))
...
iex(31)> r = Laziness.filter(b, fn (x) -> x < 3 end)  |> L.take(2)
%Laziness.Cons{head: #Function<17.21518084/0 in Laziness.filter/2>,
 tail: #Function<9.21518084/0 in Laziness.take/2>}
iex(32)> r.head
#Function<17.21518084/0 in Laziness.filter/2>
iex(33)> r.head.()
1
iex(34)> r.tail
#Function<9.21518084/0 in Laziness.take/2>
iex(35)> r.tail.()
%Laziness.Cons{head: #Function<17.21518084/0 in Laziness.filter/2>,
 tail: #Function<9.21518084/0 in Laziness.take/2>}
iex(36)> r.tail.().head.()
2
iex(37)> r.tail.().tail.()
** (RuntimeError) sentinel reached
    (laziness) lib/laziness.ex:48: Laziness.fold_right/3
    (laziness) lib/laziness.ex:30: anonymous fn/2 in Laziness.take/2

I was expecting to get [] at the end but somehow the result contains the sentinel. This made me think there was a problem with take running to far through stream b. I played around with take in iex but wasn’t able to see the problem. I went back to filter

iex(52)> r = Laziness.filter(b, fn (x) -> x < 3 end)
%Laziness.Cons{head: #Function<17.21518084/0 in Laziness.filter/2>,
 tail: #Function<18.21518084/0 in Laziness.filter/2>}
iex(53)> r.head.()
1
iex(54)> r.tail.()
%Laziness.Cons{head: #Function<17.21518084/0 in Laziness.filter/2>,
 tail: #Function<18.21518084/0 in Laziness.filter/2>}
iex(55)> r.tail.().head.()
2
iex(56)> r.tail.().tail.().head.()
** (RuntimeError) sentinel reached
    (laziness) lib/laziness.ex:48: Laziness.fold_right/3

So I find that filter contains [1, 2, sentinel] which means that filter is doing the right thing and removing 3. This still points to a problem with take. I keep poking around in iex trying to find that problem. Finally, I noticed something peculiar. Looking back to iex(31) and iex(35) I see that in both cases we have:

tail: #Function<9.21518084/0 in Laziness.take/2>

This anonymous function in Laziness.take/2 is the lambda used in to wrap the recursion.

ld( take(t.(), n - 1) )

But, there should have been no recursive tail in iex(35) as we should have been at the end of the stream. This led me to realize that there was an off-by-one error in take such that I did not terminate the recursion soon enough. Above I showed you the correct solution for take, but at the time I had:

def take(%Cons{head: h, tail: t}, n) do
  if (n > 1), do: %Cons{head: h, tail: ld( take(t.(), n - 1) )}, else: []
end

I rewrote it as:

def take(%Cons{head: h, tail: t}, n) do
  if (n > 1) do
    %Cons{head: h, tail: ld( take(t.(), n - 1) )}
  else
    %Cons{head: h, tail: ld([])}
  end
end

to fix the problem. Later, I refactored this into the two functions with guard clauses matching on different values of n as I showed above.

Coming up

This has become a pretty long post but I’ve learned quite a bit this week and am glad to be able to share it. Before I go, I want to take a moment to describe the next few posts that you can expect to see here on Learning Elixir.

There will likely be two posts next week, in the first I will tell you all about ElixirConf.

In the later part of next week I plan to continue with the exercises in Functional Programming in Scala, section 5.4 “Infinite Streams and Corecursion”.

Two weeks from now I want to take a quick break from the text to compare what I’ve learned in chapter 5 with the official Streams implementation for Elixir. I’ll to go through the Streams module source and compare and contrast it with what we’ve done here.

After that I’ll return to the text and continue with chapter 6.

Joseph Kain