Joseph Kain

Last week I continued working toward the release of my bmark tool. For the most part, I worked on making the number of runs of a benchmark configurable. Recall the to-do list from the last post:

TODO List

  • Clean up the code
  • Configurable number of runs
  • Use builtin Elixir server module for Bmark.Server?
  • Look at the ExActor implementation to see how to implement `bmark :name, runs: 5 do`
  • Add documentation
  • Add a readme
  • Add a license
  • Create a hex package
  • Test
  • Release

In this post I’ll focus on the documentation.

Bmark Readme

I’ll start with the easy standalone documents like the README, Contributing, and a License document. For the README I’ll tackle three different sections: Purpose, Usage, and Examples.

Purpose

I started with this:

Bmark is a tool for benchmarking and comparing benchmark runs of Elixir applications. It provides a simple DSL for specifying benchmarks. Benchmark results are compared using hypothesis testing to give a statistical confidence in the comparison.

But then I thought to hold off on the statistics, as it might put people off. I expanded the section a bit, adapting some text from a previous blog post:

Bmark is a benchmarking tool for Elixir. It allows easy creation of benchmarks of Elixir functions. It also supports comparing sets of benchmarking results.

Comparing benchmarking results is a topic that I have struggled with for years. I run a benchmark several times and get varying results. Then, I make a change to my program and I want to decide if the change causes an improvement in the benchmark score. I rerun the benchmark several times and again get varying results. How do I compare these results? I can compare average score, but is that accurate? How do I tell if the mean of the second run is large enough to be meaningful? How do I know if it is “in the noise?”

Bmark answers these questions using statistical hypothesis testing. Given two sets of benchmark runs, bmark can show:

RunA:                                 RunB:
24274268                              6426990
24563751                              6416149
24492221                              6507946
24516553                              6453309
24335224                              6491314
24158102                              6405073
24357174                              6504260
24213098                              6449789
24466586                              6532929
24289248                              6509800

24366622.5 -> 6469755.9 (-73.45%) with p < 0.0005
t = 391.56626146910503, 18 degrees of freedom

This shows that RunA ran in an average of 24366622.5 ms, that RunB ran in an average of 6469755.9 ms, and that the runtime improved by 73.45%, which is statistically meaningful with a confidence level of 99.95%.
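The comparison above can be reproduced with a pooled two-sample (Student's) t-test; the 18 degrees of freedom (10 + 10 - 2) suggest that is the test bmark applies, though that is my inference rather than something stated here. A minimal Python sketch over the numbers shown:

```python
# Sketch of the comparison, assuming a pooled two-sample Student's t-test.
from statistics import mean, variance

run_a = [24274268, 24563751, 24492221, 24516553, 24335224,
         24158102, 24357174, 24213098, 24466586, 24289248]
run_b = [6426990, 6416149, 6507946, 6453309, 6491314,
         6405073, 6504260, 6449789, 6532929, 6509800]

def compare(a, b):
    """Return (mean_a, mean_b, percent change, t statistic, degrees of freedom)."""
    n_a, n_b = len(a), len(b)
    m_a, m_b = mean(a), mean(b)
    # statistics.variance is the sample variance (divides by n - 1),
    # so the pooled variance weights each sample by n - 1.
    pooled = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / (n_a + n_b - 2)
    t = (m_a - m_b) / (pooled * (1 / n_a + 1 / n_b)) ** 0.5
    percent = (m_b - m_a) / m_a * 100
    return m_a, m_b, percent, t, n_a + n_b - 2

m_a, m_b, percent, t, df = compare(run_a, run_b)
print(f"{m_a} -> {m_b} ({percent:.2f}%)")
print(f"t = {t}, {df} degrees of freedom")
```

Running this reproduces the means, the -73.45% change, and a t value matching the one in the report.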

Usage

I can write up a single example of a bmark function and then examples of how to run it. I’ll use one of the test bmarks as the sample.

I also ended up using the same results from above in the Usage section and described the results. I described it this way:

  1. The first section contains the raw result data presented side by side. This is the same data you would get by looking at RunA.results and RunB.results.
  2. The next line shows the change in mean (average) between the two runs. Next, it shows the percentage change and, finally, the confidence value. You can interpret this as saying there is 1 - p, or greater than 99.95%, confidence that the change in means is statistically significant. That is, the smaller the value of p, the more confident you can be in the change in performance.
  3. The final line shows the t value and degrees of freedom. This is the raw statistical data used to compute the confidence value.
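To connect the raw t value to the reported confidence: the p bound comes from comparing t against a t-table critical value. A small sketch, where the 3.922 critical value for 18 degrees of freedom at p = 0.0005 is a standard table value I am supplying, not something from this post:

```python
# Critical value from a standard t table (my assumption, not from the post):
# df = 18, p = 0.0005. Any t beyond it implies p < 0.0005.
T_CRITICAL = 3.922
t = 391.56626146910503      # the t value reported above

if t > T_CRITICAL:
    p_bound = 0.0005
    print(f"p < {p_bound}, confidence > {1 - p_bound:.2%}")
```

Here t is so far past the critical value that the p < 0.0005 bound, and hence the 99.95% confidence, follows immediately.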

But I think referring to “the first section” and “the next line” isn’t clear enough. First I thought to create an image with labels for the different sections. But then I decided that I didn’t want to include the image in the git repo itself, and hosting it elsewhere would be a little cumbersome. Instead, I decided to just break up the example blocks and describe each section.

Examples

Hmm, actually I used all the examples in the Usage section. But, reading over the final version of the README, I realized I should have a section on how to contribute. GitHub handles this in a separate document, called CONTRIBUTING.md, and links to it from the pull-request page. So, I’ll write up a CONTRIBUTING.md file and link to it from the end of the README.

  • Add a contributing guide

Installation

I need to describe how to install bmark, but I will save that until after the Hex package is put together.

  • Add a readme
  • Add installation instructions to readme.

Contributing

I basically just need a pull request and for all tests to pass.

  • Add a contributing guide

License

I’ll just use the MIT License.

  • Add a license

Elixir @doc and @moduledoc

Mix Tasks

I need to document the mix tasks so that users can read the documentation and understand how to use them. I wrote simple, short documentation for mix bmark and mix bmark.cmp to describe their usage. Since there are no options for the tasks, the usage description is simple.

However, at some point I should at least add support for running a subset of benchmarks. I won’t do it now, and it doesn’t belong in the to-do list since it isn’t documentation specific. So, I’ll file my first GitHub issue for bmark.

Hmm, I didn’t know this, but documenting private functions triggers a warning:

lib/mix/tasks/bmark.ex:42: warning: function setup_exit_handler/0 is private, @doc's are always discarded for private functions
lib/mix/tasks/bmark.ex:56: warning: function report_single_bmark/1 is private, @doc's are always discarded for private functions
lib/mix/tasks/bmark_cmp.ex:42: warning: function parse_args/1 is private, @doc's are always discarded for private functions
lib/mix/tasks/bmark_cmp.ex:64: warning: function filename_path_to_header/1 is private, @doc's are always discarded for private functions

I’ll convert these @docs to comments.

Inch CI

I wanted to be able to track my progress, so I set up Inch CI. This was simply a matter of following the instructions. And the best part is that I get a new badge:

Inline docs

I can also use Inch CI to guide further documentation. First, it notes that I have good documentation for the tasks. So that’s done. Inch CI has 12 suggestions.

Bmark

I certainly need to document the module and the bmark macro. It is the only macro or function from the project that is expected to be called from outside. This is my public interface.

Done, now I have only 11 suggestions left.

Bmark.ComparisonFormatter

I need to document the module and the public function Bmark.ComparisonFormatter.format/2. In order to fully document this function I need to add type specifications, but I think that should be a task for the future. For now I will just describe what the function does.

Down to 10 suggestions, but I expected 9. Inch CI says that Bmark.ComparisonFormatter.format/2 would have even better documentation if it described the arguments. It took me some work to get the syntax right. I needed this:

@doc """
This function formats a comparison report with the headers and results side by side. It accepts
two pairs of values.

`list`  [left, right] are the left and right headers.
`list2` [lresults, rresults] are the left and right results lists.
"""
def format([left, right], [lresults, rresults]) do
  header(left, right) <>
  side_by_side_results(lresults, rresults, alignment(left))      
end

It is also a little cumbersome to push this to GitHub every time I want to try different documentation. I can run

MIX_ENV=docs mix inch

to generate a report locally. But I still don’t know how to get the detailed suggestions that I get on the website, like

Suggestions:
Describe the parameter "list1".

If you have some tips on how to do this, I’d like to hear them in the comments.

Anyway, I’m down to 9.

Keep Documenting

I document the Bmark.Distribution module and its public function t/2 and am down to 7.

Mix.Tasks.Bmark.Cmp.percent_increase/2 doesn’t need to be public -> down to 6.

I document the Bmark.Server module, its 3 public functions, and the BmarkEntry struct -> down to 2.

The last 2 suggestions are to document the arguments to the Mix tasks. I’m going to forgo these for now because they are documented in the module’s usage information.

  • Add documentation

Next steps

Well, I’ve learned a lot about Elixir documentation in this post. It was a good experience to start using @doc and @moduledoc, and it has improved the documentation of my project. I also really liked the integration between Mix and @moduledoc. It’s nice to be able to use one write-up to document both the code and the command-line tool usage. While looking at example documentation, I saw that type specifications also integrate tightly with the documentation, in addition to enabling type checking. Now I’m really starting to appreciate Elixir’s “first-class” documentation. The integration of the documentation with the rest of the system is simply amazing.

I wasn’t expecting this week’s post on documentation to be so interesting, but I’m really glad I wrote it. I’d like to see your thoughts, in the comments section, on how Elixir integrates documentation.

Next week I’ll build a hex package. I’m excited to be so close to releasing Bmark.

TODO List

  • Clean up the code
  • Configurable number of runs
  • Use builtin Elixir server module for Bmark.Server?
  • Look at the ExActor implementation to see how to implement `bmark :name, runs: 5 do`
  • Add documentation
  • Add a readme
  • Add a contributing guide
  • Add a license
  • Create a hex package
  • Add installation instructions to readme.
  • Test
  • Release