I’ve written about Streams before in my series on functional programming in Elixir, but that was a largely theoretical post. This week I took a more practical look at my own code and other people's code to see what patterns are used with Streams in Elixir. However, while looking at several projects, I found that Streams don’t seem to be in wide use. I also tweeted to #elixirlang to see if I could get any suggestions:
How do you use streams in #elixirlang? I'd like to hear about it.
— Joseph Kain (@Joseph_Kain) February 28, 2015
but have not received any replies.
I will discuss the examples I found and hopefully this post will encourage people to explore streams and use them where appropriate.
Bmark
My own Bmark benchmarking tool uses the Stream module to run benchmarks:
# From lib/bmark/server.ex
defp collect_single_benchmark(entry) do
  {
    entry.module,
    entry.name,
    Stream.repeatedly(fn -> time(entry) end) |> Stream.take(entry.runs)
  }
end
I like this pattern of using Stream.repeatedly/1 to run a function multiple times with Stream.take/2 to control the number of times. It works like a for loop construct without needing a mutable counter. This of course only works with Stream because neither the List module nor the Enum module has a repeatedly function.
I feel like I should use this pattern more often, but thinking about it, it may not always be that useful. The essence of the pattern
Stream.repeatedly(f) |> Stream.take(count)
implies that the function f is not referentially transparent. That is, f does not give the same result each time it is called with the same input (though in this case f has no input). If f were referentially transparent, it could be memoized so that only the first call would be necessary. The fact that I want to call f multiple times suggests that I believe each call will have a different result.
The time function uses :timer.tc/3 to measure the time of a real event, so it will not always give the same result due to dynamic conditions on the system. This is why the pattern is useful for Bmark. But in general, it may not be so useful.
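To make the point about referential transparency concrete, here is a minimal sketch. The Agent-backed counter and the name next are hypothetical stand-ins for Bmark's time function, chosen only because they return a different value on every call. The sketch also shows that building the stream runs nothing; the function is only called once the stream is consumed.

```elixir
# Hypothetical stand-in for a function that is not referentially
# transparent: every call returns a new value.
{:ok, counter} = Agent.start_link(fn -> 0 end)
next = fn -> Agent.get_and_update(counter, fn n -> {n + 1, n + 1} end) end

# Building the stream does not call `next`; streams are lazy.
stream = Stream.repeatedly(next) |> Stream.take(3)
calls_before = Agent.get(counter, fn n -> n end)

# Enum.to_list/1 forces the stream, which is when `next` actually runs.
results = Enum.to_list(stream)

IO.inspect(calls_before) # 0
IO.inspect(results)      # [1, 2, 3]
```

If next were a pure function, all three results would be identical and a single call would have been enough.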
For this post I spent a little time thinking about ways I could write this without using streams. While experimenting to get the syntax right and to confirm correctness, I wrote the alternatives as tests and pushed them to GitHub. You can see the code here or you can just read on.
My first thought was to use ranges and map like this:
Enum.map(1..count, f)
except that this doesn’t work: the function passed to map has to have arity 1, but f has arity 0. I need to write:
Enum.map(1..count, fn _ -> f.() end)
Having to wrap f this way decreases readability a lot. I also think the 1.. doesn’t contribute much to readability. Next, I tried a comprehension:
for _ <- 1..count, do: f.()
Now, this is quite compact and fairly readable, even if it is a bit imperative. Again, the 1.. doesn’t contribute much, but overall this version is cleaner than my original Stream.repeatedly(f) |> Stream.take(count). I’ll probably use this version from now on.
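The three variants can be compared side by side. This is only a sketch: f here is a hypothetical zero-arity function built on make_ref/0 so that every call returns a distinct value, and Enum.take/2 is used to consume the stream eagerly so that all three results are plain lists.

```elixir
# Hypothetical zero-arity function; each call returns a unique reference.
f = fn -> make_ref() end
count = 3

# The original pattern, consumed eagerly with Enum.take/2.
with_stream = Stream.repeatedly(f) |> Enum.take(count)

# Range plus map: f has to be wrapped to accept the ignored element.
with_map = Enum.map(1..count, fn _ -> f.() end)

# Comprehension over the same range.
with_for = for _ <- 1..count, do: f.()

IO.inspect(length(with_stream)) # 3
IO.inspect(length(with_map))    # 3
IO.inspect(length(with_for))    # 3

# Caveat for the range versions: 1..0 counts down rather than being empty.
IO.inspect(Enum.to_list(1..0))  # [1, 0]
```

The last line matters when count can be 0: the stream version yields an empty list, while both range versions would call f twice. On Elixir 1.12 and later, writing the range as 1..count//1 avoids this.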
ExUnit
ExUnit uses Stream in two functions in the ExUnit.DocTest module. The first depends on the second, so I’ll start by looking at filter_by_opts:
defp filter_by_opts(tests, opts) do
  only = opts[:only] || []
  except = opts[:except] || []

  Stream.filter(tests, fn(test) ->
    fa = test.fun_arity
    Enum.all?(except, &(&1 != fa)) and Enum.all?(only, &(&1 == fa))
  end)
end
This function takes a list of tests and passes it to Stream.filter. The filter function is pretty involved. Looking at the history of the code, it looks like this was originally Enum.filter.
filter_by_opts is called by __doctests__:
def __doctests__(module, opts) do
  do_import = Keyword.get(opts, :import, false)

  extract(module)
  |> filter_by_opts(opts)
  |> Stream.with_index
  |> Enum.map(fn {test, acc} ->
    compile_test(test, module, do_import, acc + 1)
  end)
end
This function has a pipeline with four stages. These stages do the following:
- extract - Extracts all the tests from the module and returns a List of tests.
- filter_by_opts - Filters the List using Stream.filter and returns a Stream.
  - This is a finite Stream as it is a filter over a finite List.
- Stream.with_index - Just numbers the elements in the previous stream.
- Enum.map - Builds a List of compiled tests.
So this pattern boils down to:
Stream.filter(list, f1) |> Stream.with_index |> Enum.map(f2)
Given the use of Enum.map, which eagerly consumes its argument, I don’t understand the need to use Stream.filter and Stream.with_index over their Enum counterparts here. That is, I think this is really the same as:
Enum.filter(list, f1) |> Enum.with_index |> Enum.map(f2)
If you have any thoughts on this please share them in the comments.
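A quick sketch can check that the two pipelines produce the same list. The input list and the functions keep and label are hypothetical stand-ins for ExUnit's tests, filter, and compile step.

```elixir
# Hypothetical stand-ins for the extracted tests and the two functions.
list = [:a, :b, :c, :d]
keep = fn item -> item != :b end
label = fn {item, index} -> {index + 1, item} end

# Lazy stages, forced by the final Enum.map/2.
lazy = Stream.filter(list, keep) |> Stream.with_index() |> Enum.map(label)

# The same pipeline with eager stages throughout.
eager = Enum.filter(list, keep) |> Enum.with_index() |> Enum.map(label)

IO.inspect(lazy)          # [{1, :a}, {2, :c}, {3, :d}]
IO.inspect(lazy == eager) # true
```

The results are the same. One difference that may matter is that the Enum version builds an intermediate list after the filter and another after with_index, while the Stream version passes each element through all the stages and builds only the final list. Whether that was the motivation in ExUnit is a guess on my part.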
Mix
elixir/lib/mix/lib/mix/utils.ex has a nice use of Streams:
@doc """
Returns `true` if any of the `sources` are stale
compared to the given `targets`.
"""
def stale?(sources, targets) do
  Enum.any? stale_stream(sources, targets)
end

def extract_stale(sources, targets) do
  stale_stream(sources, targets) |> Enum.to_list
end

defp stale_stream(sources, targets) do
  modified_target = targets |> Enum.map(&last_modified(&1)) |> Enum.min

  Stream.filter(sources, fn(source) ->
    last_modified(source) > modified_target
  end)
end
stale_stream is used in two public functions, stale? and extract_stale. Because of extract_stale, the Stream must be able to produce the entire list of stale files. But stale? uses Enum.any?, which can stop evaluation of the Stream as soon as it finds a match. This pattern can be boiled down to
Stream.filter(collection, f) |> Enum.any?
This pattern leverages the lazy evaluation of Streams to save work. It only needs to evaluate elements until it finds one that is truthy. It can avoid evaluating any elements after the match is found.
In Mix’s case this can potentially save a lot of work, as the filter used in stale_stream calls last_modified, which has to look at the filesystem to get the file’s last modified timestamp.
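The saving can be observed with a small sketch. Everything here is hypothetical: the integer sources and the Agent that records which ones were checked stand in for real files and for the expensive last_modified call.

```elixir
# Hypothetical recorder of which sources the check has looked at.
{:ok, checked} = Agent.start_link(fn -> [] end)

# Stand-in for an expensive check such as reading a file's timestamp.
stale_check = fn source ->
  Agent.update(checked, fn seen -> [source | seen] end)
  source > 2
end

sources = [1, 2, 3, 4, 5]

# Lazy filter: Enum.any?/1 halts at the first element that passes.
any_lazy = sources |> Stream.filter(stale_check) |> Enum.any?()
checked_lazy = Agent.get(checked, &Enum.reverse/1)

# Eager filter: every source is checked before Enum.any?/1 runs.
Agent.update(checked, fn _ -> [] end)
any_eager = sources |> Enum.filter(stale_check) |> Enum.any?()
checked_eager = Agent.get(checked, &Enum.reverse/1)

IO.inspect({any_lazy, checked_lazy})   # {true, [1, 2, 3]}
IO.inspect({any_eager, checked_eager}) # {true, [1, 2, 3, 4, 5]}
```

Both versions give the same answer, but the Stream version stops checking after the first stale source.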
Conclusion
I hope you’ve learned something about using Streams in Elixir. I know I have. I think the most important thing to take away from this post is that Elixir has a rich syntax and there are many ways to express your ideas. Experiment with the syntax, find the most expressive way to write your code, and have fun with it. If you come up with interesting ways to use streams, or to not use streams, I’d love to hear about them.